THE METABOLIC PATHWAY ENGINEERING HANDBOOK Fundamentals
The Metabolic Pathway Engineering Handbook, 1st Edition
The Metabolic Pathway Engineering Handbook: Fundamentals
The Metabolic Pathway Engineering Handbook: Tools and Applications
Edited by
Christina D. Smolke
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2010 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1
International Standard Book Number-13: 978-1-4398-0296-0 (Hardcover)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
The metabolic pathway engineering handbook : fundamentals / editor, Christina D. Smolke.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-4398-0296-0 (hardcover : alk. paper)
1. Genetic engineering--Handbooks, manuals, etc. 2. Biosynthesis--Handbooks, manuals, etc. I. Smolke, Christina D. II. Title.
[DNLM: 1. Genetic Engineering--methods. 2. Metabolic Networks and Pathways. 3. Biological Products--metabolism. 4. Biotechnology--methods. 5. Models, Biological. QU 450 M5871 2010]
TP248.6.M478 2010
660.6’5--dc22 2008051635
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com
Contents

Introduction
Editor
Contributors

Section I Cellular Metabolism
Andy Ekins and Vincent J.J. Martin
1 Solute Transport Processes in the Cell
Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset
2 Catabolism and Metabolic Fueling Processes
Olubolaji Akinterinwa and Patrick C. Cirino
3 Biosynthesis of Cellular Building Blocks: The Prerequisites of Life
Zachary L. Fowler, Effendi Leonard, and Mattheos Koffas
4 Polymerization of Building Blocks to Macromolecules: Polyhydroxyalkanoates as an Example
Si Jae Park, Soon Ho Hong, and Sang Yup Lee
5 Rare Metabolic Conversions—Harvesting Diversity through Nature
Manuel Ferrer and Peter N. Golyshin

Section II Balances and Reaction Models
Walter M. van Gulik
6 Growth Nutrients and Diversity
Joseph J. Heijnen
7 Mass Balances, Rates, and Experiments
Joseph J. Heijnen
8 Data Reconciliation and Error Detection
Peter J.T. Verheijen
9 Black Box Models for Growth and Product Formation
Joseph J. Heijnen
10 Metabolic Models for Growth and Product Formation
Walter M. van Gulik
11 A Thermodynamic Description of Microbial Growth and Product Formation
Joseph J. Heijnen

Section III Bacterial Transcriptional Regulation of Metabolism
James C. Liao
12 Transcribing Metabolism Genes: Lessons from a Feral Promoter
Alan J. Wolfe
13 Regulation of Secondary Metabolism in Bacteria
Wenjun Zhang, Joshua P. Ferreira, and Yi Tang
14 A Synthetic Approach to Transcriptional Regulatory Engineering
Wilson W. Wong and James C. Liao

Section IV Modeling Tools for Metabolic Engineering
Costas D. Maranas
15 Metabolic Flux Analysis
Maria I. Klapa
16 Metabolic Control Analysis
Joseph J. Heijnen
17 Structure and Flux Analysis of Metabolic Networks
Kiran Raosaheb Patil, Prashant Madhusudan Bapat, and Jens Nielsen
18 Constraint-Based Genome-Scale Models of Cellular Metabolism
Radhakrishnan Mahadevan
19 Multiscale Modeling of Metabolic Regulation
C.A. Leclerc and Jeffrey D. Varner
20 Validation of Metabolic Models
Sang Yup Lee, Hyohak Song, Tae Yong Kim, and Sung Bum Sohn

Section V Developing Appropriate Hosts for Metabolic Engineering
Jens Nielsen
21 Escherichia coli as a Well-Developed Host for Metabolic Engineering
Eva Nordberg Karlsson, Louise Johansson, Olle Holst, and Gunnar Lidén
22 Metabolic Engineering in Yeast
Maurizio Bettiga, Marie F. Gorwa-Grauslund, and Bärbel Hahn-Hägerdal
23 Metabolic Engineering of Bacillus subtilis
John Perkins, Markus Wyss, Hans-Peter Hohmann, and Uwe Sauer
24 Metabolic Engineering of Streptomyces
Irina Borodina, Anna Eliasson, and Jens Nielsen
25 Metabolic Engineering of Filamentous Fungi
Mikael Rørdam Andersen, Kanchana Rucksomtawin, Gerald Hofmann, and Jens Nielsen
26 Metabolic Engineering of Mammalian Cells
Lake-Ee Quek and Lars Keld Nielsen

Index
Introduction
Progression of Biological Synthesis Methods toward Commercial Relevance
The advent of recombinant DNA in the 1970s brought transformative technologies for the synthesis and manipulation of artificial genetic material. The ability to amplify, cut, and piece together fragments of DNA outside of a cell and to introduce (or transform) that DNA into a cell of interest resulted in a set of molecular cloning tools that enabled the field of genetic engineering. In genetic engineering, foreign DNA that encodes new or altered functions or traits is inserted into an organism of interest. Many early applications of recombinant DNA technology focused on heterologous protein production in microbial hosts. The first medicine made through recombinant DNA technology to be approved by the U.S. Food and Drug Administration was synthetic “human” insulin produced in Escherichia coli. This was an important early application of recombinant DNA technology, as the success of producing a safe and effective synthetic hormone in a bacterium led to the widespread acceptance of the technology and to significant resources and funding being directed to its support and advancement.
As the technologies in support of synthesizing and manipulating artificial DNA matured and advanced, so did the applications to which they were applied. The early successful applications of recombinant DNA technology resulted in alternative routes to the synthesis of medicines (such as insulin, human growth factor, and erythropoietin), vaccines, and even genetically modified organisms, including crops that exhibit more desirable traits. Technologies were developed for the manipulation of artificial DNA in both prokaryotic and eukaryotic host organisms, including mammalian and plant cells.
In addition, inspired by the diversity of natural products, chemicals, and materials synthesized by biological systems in the natural world, researchers began to look beyond applications that were limited to the synthesis of a single heterologous protein product in a cellular host to more complicated engineering feats. In particular, these new applications focused on the manipulation of sets or combinations of proteins, or enzymes, that act in conjunction in a cell, within metabolic pathways, to convert energy and precursor chemicals into desired natural and non-natural products.
The production of chemicals, materials, and energy through biology presents an alternative to traditional chemical synthesis routes. While the development of chemical synthesis methods for the production of valuable chemicals and small molecule pharmaceuticals is a more mature field and has demonstrated significant successes, many chemicals remain difficult to synthesize through such strategies, particularly those with many chiral centers. Biological catalysts, or enzymes, have demonstrated remarkable adeptness at the synthesis of very complex molecules. In addition, cellular biosynthesis strategies offer several advantages over traditional chemical synthesis strategies in that the former are often conducted under less harsh conditions, thereby enabling “green” synthesis strategies that are associated with the production of fewer toxic by-products. In addition, cellular biosynthesis takes advantage of the cell’s natural ability to replenish enzymes and cofactors and to provide precursors from often inexpensive and renewable starting materials. Such advantages are particularly compelling in light of the global challenges we face today in energy, the environment, and sustainability. However, new challenges are presented when manipulating the metabolic pathways in cellular hosts that link energy sources and starting materials to products of commercial interest. The unique challenges faced in engineering metabolic pathways, when compared to the early genetic engineering applications of heterologous protein production, require the development of new enabling technologies, spanning experimental and analytical techniques and computational tools.
The Field of Metabolic Engineering
Metabolic engineering is a field that includes the construction, redirection, and manipulation of cellular metabolism through the alteration of endogenous and/or heterologous enzyme activities and levels to achieve the biosynthesis or biocatalysis of desired compounds. Researchers in metabolic engineering often view the biological system as a chemical factory that converts starting materials to different value-added products. Because the yield or productivity of the process is linked to its commercial viability, the ability to precisely regulate the flow of energy and materials through different cellular pathways becomes critical to the optimization of the overall process, drawing parallels to the more traditional engineering discipline of chemical process design.
The basic tenet of metabolic engineering, the use of biology as a technology for the conversion of energy, chemicals, and materials to value-added products, has a long history. Early applications can be cited, even prior to the development of recombinant DNA technology, in the food and beverage industry, where more traditional methods of strain development based on evolution, mating, and selection strategies were used to develop more desirable production hosts for particular applications. However, recombinant DNA technology made it possible to introduce new enzymatic activities and pathways into production hosts, allowing access to different energy resources and starting materials and to the production of different chemicals and materials. Such technologies support the forward design of more complex synthetic pathways in host organisms or the targeted manipulation of endogenous pathways, enabling more directed manipulation of the cellular host. Current metabolic engineering efforts are focused on the synthesis of products such as chemical commodities, small molecule drugs, and alternative energy sources including biofuels.
In addition, significant effort is directed to the engineering of host metabolisms to utilize renewable, low-cost energy resources.
Many of the challenges faced in metabolic engineering are related to the engineering of energy and material flow within complex systems. More specifically, metabolic pathways make up complex interconnected networks in cells, which can rarely be manipulated in isolation from the rest of the network. Highlighting the interconnections between cellular metabolites is the fact that all metabolites are made from a set of 12 common precursors. In addition, the flow of metabolites through a network of enzymes, and in the background of other cellular enzymes that may exhibit activity on these metabolites, is often controlled through layered processes that act at different time scales, implement dynamic feedback control, and utilize localization and transport.
Metabolic engineering requires a breadth of skill sets to tackle different points of system design and as a result has developed into a very interdisciplinary field. Researchers with expertise spanning a variety of disciplines, including chemical engineering, biological engineering, environmental engineering, biochemistry, molecular biology, cell biology, bioinformatics, and control theory, are working in different areas of metabolic engineering. However, as an academic endeavor, metabolic engineering has remained an interdisciplinary research discipline, with courses covering different aspects of the field depending on the expertise of the department in which they are taught.
As it has matured, metabolic engineering has gained greater industrial significance. Initial industrial interest, largely from groups within larger chemical companies, was directed to the synthesis of chemical commodities in microorganisms. However, many smaller startup companies have developed in recent years that are focused on the synthesis of specialty chemicals such as pharmaceuticals and biofuels, on the development of computational and modeling programs to direct metabolic engineering efforts, and on the discovery and development of new enzyme activities in support of engineering new synthetic pathways into host organisms. The intersection of metabolic engineering with the emerging areas of systems and synthetic biology presents exciting opportunities to develop solutions to many of the global challenges we face in energy, the environment, health and medicine, resources, and sustainability, and will likely continue to fuel a significant sector of the biotechnology industry in future years.
An Overview of The Metabolic Pathway Engineering Handbook
The purpose of The Metabolic Pathway Engineering Handbook is to provide a thorough overview of the field of metabolic engineering. Each section, written by experts in the area, provides an overview of different aspects of a topic that is a central component of the field. Sections are introduced by section editors to provide a perspective on the topic and a description of how the chapters in that section link together to form an integrated overview of that particular topic. The sections are split into two books: the first book focuses on “fundamentals,” or basic principles of metabolic engineering, and the second book focuses on “tools and applications” in metabolic engineering. Due to its organization, the handbook can be used as a reference book and read for individual sections or chapters, or it can be used as a book for advanced courses in metabolic engineering.
Section I in The Metabolic Pathway Engineering Handbook: Fundamentals provides an overview of the basic processes that support cellular metabolism. The boundary of a cell is defined by its cellular membrane, which acts to separate cellular constituents from the environment. Metabolism begins with systems that allow the import of nutrients and starting materials across the cellular membrane, and efforts to engineer transport systems for particular chemicals have been important strategies in enabling cells to convert those chemicals to desired products. Once inside the cell, nutrients are broken down into common precursors for metabolic syntheses, a process that also provides the energy and reducing power necessary for cell survival. In addition, precursors are channeled into the synthesis of important building blocks that the cell then utilizes to build larger macromolecules, including lipids, nucleic acids, and proteins.
An understanding of the central metabolic pathways and the general flow of metabolism through a small number of common precursors and carriers is critical to being able to effectively link new synthetic nutrient or product pathways to endogenous metabolisms. Finally, the wealth of untapped diversity in nature, particularly in the microbial biosphere, provides significant opportunities in harvesting new enzymatic activities from nature that can be applied to the production of new chemicals and materials in engineered hosts.
Section II provides an overview of mass balances and reaction models applied to predicting product formation and microbial growth in fermentation processes. Various models have been proposed and utilized in the field that exhibit varying levels of detail to provide predictions of product yield and cell growth. Conversion rates are calculated from mass balances and rate equations that take into account the basic nutrients and constituents of cellular systems. Different models, such as those based on thermodynamic or metabolic network constraints, can be utilized to predict product yield and cell growth in fermentation processes. Different models may be more or less appropriate based on the specifics of the fermentation. The application of such models to experimental systems can allow minimization of error in detection strategies, resulting in optimized control schemes for fermentations based on such experimental measurements.
Section III provides an overview of transcriptional regulation of metabolic pathways in bacterial systems. Bacterial cells use a variety of mechanisms to regulate the transcription of enzymes involved in primary and secondary metabolisms. Transcriptional regulatory strategies exist that regulate a small set of genes in response to specific environmental chemicals, such as operon-specific regulation and two-component systems.
However, other strategies exist that regulate larger sets of genes in response to significant environmental changes, such as heat shock or nitrogen starvation, through sigma factors and global transcriptional factors. An understanding of the strategies used to regulate the expression of enzymes in a cellular host is critical in metabolic engineering, both for developing effective strategies to alter the expression of endogenous enzymes and for designing synthetic systems that exhibit more sophisticated regulatory schemes to balance and coordinate the expression of multiple enzymes and ultimately optimize flux through desired pathways.
Section IV is an overview of modeling tools that have been developed for metabolic engineering applications. Earlier modeling and computation efforts that resulted in tools for metabolic flux analysis (MFA) and metabolic control analysis (MCA) have been very powerful for the elucidation of fluxes and control strategies in metabolic networks given partial sets of data. Computational tools based on network and graph concepts have enabled structure and flux analyses that provide optimization tools for metabolic engineering. In addition, metabolic network reconstruction and modeling efforts have resulted in genome-scale models of cellular metabolism for specific organisms based on sets of constraints that enable prediction of flux distributions under different conditions. Multi-scale modeling tools are extending current predictive capabilities by integrating stoichiometry, kinetics, and regulatory and control responses in metabolic networks; metabolic engineers can use such tools to predict the dynamic metabolic response.
Section V provides an overview of common cellular hosts that are used in metabolic engineering applications. In particular, the bacterial hosts Escherichia coli, Bacillus subtilis, and Streptomyces have been utilized in various metabolic engineering applications, with E. coli being the best-developed and most widely utilized host, largely due to the genetic tools available for manipulating pathways in this host organism.
In addition, two lower eukaryotic hosts, yeast and filamentous fungi, have been utilized in various metabolic engineering applications for the production of natural products or for pathway enzymes that are more readily expressed in functional forms in eukaryotic organisms. Finally, much effort has also been put toward the development of mammalian cell culture hosts for the production of metabolites and products that are more readily produced in mammalian cells. Each host may present advantages and disadvantages in the synthesis of a desired chemical based on the genetic tools available for manipulating pathways and the endogenous metabolism and processing pathways present in that organism, such that the selection of a suitable host is driven largely by the properties of the pathway of interest.
Section I in The Metabolic Pathway Engineering Handbook: Tools and Applications provides an overview of the evolutionary tools widely in use in the engineering of metabolic enzymes and networks. Evolutionary strategies have been traditionally used in metabolic engineering to select for desired phenotypes in host organisms. As biological organisms naturally undergo processes of evolution and selection, design strategies that integrate evolutionary engineering objectives with metabolic engineering objectives may result in a more robustly performing engineered cellular system. Directed evolution is a laboratory tool that is used to mimic the evolutionary process in a test tube, by generating diversity in cellular components and then screening or selecting through this diversity for optimized component properties. Various experimental strategies have been utilized for generating and screening through component diversity. In addition, computational tools have been developed that optimize the design of laboratory evolution strategies.
These experimental and computational tools have been applied to the directed evolution of enzymes, regulatory systems, pathways, and whole genomes for the optimization of flux through targeted metabolic pathways.
Section II provides an overview of gene expression tools that have been utilized in metabolic engineering applications. Various tools have been developed that regulate DNA copy number and enable chromosomal engineering in host organisms. In addition, a variety of other genetic tools have been developed that precisely regulate gene expression levels through post-transcriptional and translational mechanisms. Still other tools have been developed that regulate the activity of enzymes through post-translational engineering strategies. The application of the tools described in this section is critical to balancing the expression of multiple enzymes, such that individual conversion steps do not limit product yield, toxic intermediates do not accumulate, and cellular resources and energy are efficiently utilized by the host cell. Several examples exist of engineered systems that have utilized such genetic tools for the optimization of flux through metabolic pathways.
Section III provides an overview of emerging technologies and their application to metabolic engineering. Genome-wide technologies that allow global profiling of cellular transcripts, proteins, metabolites, and phenotypes are critical for efficient troubleshooting and debugging of engineered systems. Bioinformatics tools that allow for management and analysis of the vast amounts of data collected from these techniques are also critical. As these technologies mature and become more available, their implementation as standard techniques in metabolic engineering will improve our understanding of the engineered system response and result in efficient troubleshooting and optimization strategies.
Section IV provides an overview of key future prospects in metabolic engineering. The integration of new computational tools, such as genome-scale models, and new technologies for analyzing and understanding complex systems, such as systems biology, with metabolic engineering is rapidly advancing the success with which metabolic networks can be forward engineered. In addition, alternative strategies to cellular biosynthesis that remove complications associated with engineering living, evolving systems, such as cell-free synthesis systems, have demonstrated impressive successes. Finally, the modeling and optimization of engineered metabolic pathways in silico, prior to construction and characterization, will significantly transform the field of metabolic engineering and integrate advances in computational modeling, systems biology, and engineering design.
Section V provides an overview of common tools that are utilized to determine flux through metabolic pathways. Various types of isotope flux labeling strategies have been widely used to monitor flux through metabolic pathways, where the data from such experiments are typically integrated into the modeling tools described in Section IV of the Fundamentals volume.
In addition, various analytical strategies are utilized to profile cellular metabolites, where current and future efforts have been focused on developing strategies to profile and quantify global metabolite levels.
Section VI provides an overview of various metabolic engineering application areas. One broad application area is focused on the engineering and regulation of the energy state, cofactor supply, and redox balance of cellular hosts. This is a challenge that affects most if not all metabolic engineering applications, where the introduction of new pathways or the manipulation of endogenous pathways can result in imbalances in cellular pathways and stress responses. Metabolic engineering applications are generally directed toward the synthesis of commercially relevant molecules including specialty or commodity chemicals, small molecule drugs, or alternative energy sources. Each of these application areas of metabolic engineering presents distinct challenges that must be addressed in the process design based on chemical and pathway complexity, market cost of the product, volume demand of the product, end use of the product, and purity requirements.
Metabolic Engineering: Looking toward the Future
Metabolic engineering as a field has evolved significantly over the past 10 to 15 years, in large part due to the scientific and technological advances made during this time frame in support of this application area. The future prospects of metabolic engineering are extremely exciting, and as other supporting scientific and engineering fields mature, the field is likely to see transformative advances that direct it further toward an engineering discipline. There are several key supporting fields that will aid in directing this transformation.
First, enzyme engineering and enzyme discovery will be critical to expanding the diversity of natural and non-natural products that can be produced in engineered organisms. Much of the living world has not been cultured and characterized. Even for those organisms that have been cultured, we may not have genome sequence information, may not have mapped functions to many of the sequenced genes, or may not have characterized many of the enzyme activities in these organisms. For example, many pathways in plants responsible for the synthesis of diverse pharmacologically relevant molecules have not been elucidated, although many of these activities and their corresponding genes are currently present in large expressed sequence tag (EST) libraries. Because we cannot forward design enzymes to exhibit specific catalytic activities, the existing limitations in characterized enzyme activities severely limit the pathways that we can reconstruct in organisms. In addition, programs that will allow us to predict and design enzyme function from sequence will be critically enabling for the design of new activities that have not been recovered from natural systems.
Second, because metabolic engineering is largely a systems engineering challenge, continued advances in systems biology will provide important insights into the function of biological systems that will inform engineering design and strategies directed at manipulating metabolic pathways. Many analytic techniques in support of systems biology, including strategies that allow global profiling of transcript, protein, and metabolite levels, are providing vast amounts of information regarding levels of cellular constituents under different conditions. In addition, computational tools are being developed to process the vast amounts of data coming from these techniques. Newer and future efforts in systems biology must focus on taking the information coming from these techniques and abstracting from it the organizing principles governing cellular metabolism and regulation. An understanding of how cells generally layer metabolic pathways with different regulatory strategies will allow engineers to design more robustly performing synthetic pathways that are better integrated with the endogenous metabolic pathways. In addition, such understanding will allow better identification of manipulation points in endogenous networks to alter flux through pathways.
Third, the integration of information theory and control theory with systems biology and metabolic engineering will likely have a significant impact on our understanding of biological systems.
Such tools will enable a deeper understanding of architectures and properties of complex networks that support robustness, evolvability, and fragility of the system, providing a conceptual framework to systems biology. In addition, such tools will allow researchers to more quantitatively examine models of control schemes around metabolic pathways to better elucidate the design principles around regulating flux through metabolic pathways. Such tools can also be used to examine synthetic network and control scheme designs and guide the more effective design of engineered systems. Finally, metabolic engineering is seeing a transformation with the emerging field of synthetic biology. Synthetic biology is the design, construction, and characterization of biological systems using engineering design principles. To support a framework for engineering biology, synthetic biology is rooted in foundational technologies that enable the construction of more complex, heterologous networks in living systems. With advances in DNA sequencing and synthesis it is becoming common practice to synthesize entire genes and pathways from scratch, no longer limiting researchers to the physical DNA that they obtain from natural organisms. In addition, abstraction frameworks have been proposed to enable rapid assembling and reassembling of basic biological components (or parts) into larger networks (or devices) and systems, supporting the rapid prototyping and troubleshooting and reliable construction of complex metabolic pathways in cellular hosts (or chassis). An example of a synthetic biology approach to the rapid prototyping of a metabolic pathway in Escherichia coli was recently described (http://parts.mit.edu/wiki/index.php/MIT_2006). There are also efforts directed to the engineering of specific chassis, or cellular hosts, optimized for metabolic engineering applications.
Finally, enabling genetically encoded technologies are being developed for use in precise and quantitative manipulation of pathway components such as enzymes.
Christina D. Smolke Editor-in-Chief
Editor
Christina Smolke is an assistant professor in the Department of Bioengineering at Stanford University. She graduated with a BS in chemical engineering with a minor in biology from the University of Southern California in 1997. She conducted her graduate training as a National Science Foundation Fellow in the Chemical Engineering Department at the University of California at Berkeley and earned her PhD in 2001. Christina conducted her postdoctoral training as a National Institutes of Health Fellow in cell biology at UC Berkeley. She started her independent research program as an assistant professor in the Division of Chemistry and Chemical Engineering at the California Institute of Technology from 2003– 2008. She has pioneered a research program in developing foundational technologies for the design and construction of engineered ligand-responsive RNA-based regulatory molecules, their integration into molecular computation and signal integration strategies, and their reliable implementation into diverse cellular engineering applications. These technologies are resulting in scaleable platforms for the construction of molecular tools that work across many cellular systems and allow regulation of targeted gene expression levels in response to diverse endogenous or exogenous molecular ligands. Her research is rapidly advancing current capabilities of noninvasive detection of cellular state and programming cellular function. In particular, her laboratory is examining the application of these tools to the optimization of metabolic pathway engineering strategies in organisms such as yeast. Dr. Smolke’s innovative research program has recently been recognized with the receipt of a National Science Foundation CAREER Award, a Beckman Young Investigator Award, an Alfred P. Sloan Research Fellowship, and the listing of Dr. Smolke as one of Technology Review’s Top 100 Young Innovators in the World. 
She is also a member and adjunct faculty of the Comprehensive Cancer Center’s Cancer Immunotherapeutics Program at the City of Hope, where she has several translationally oriented collaborative projects exploring the clinical applications of these technologies. She is the inventor of over nine patents and serves on the Scientific Advisory Board of Codon Devices. Dr. Smolke is currently serving as the President of the Institute of Biological Engineering. She is a member of AIChE, ACS, the RNA Society, and IBE.
Contributors
Olubolaji Akinterinwa
Andy Ekins
Peter N. Golyshin
Mikael Rørdam Andersen
Anna Eliasson
Marie F. Gorwa-Grauslund
Department of Chemical Engineering Pennsylvania State University University Park, Pennsylvania
Department of Biology Centre for Structural and Functional Genomics Concordia University Montreal, Quebec, Canada
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Prashant Madhusudan Bapat
Adelfo Escalante
Maurizio Bettiga
Joshua P. Ferreira
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Department of Applied Microbiology Lund University Lund, Sweden
Irina Borodina
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Patrick C. Cirino
Department of Chemical Engineering Pennsylvania State University University Park, Pennsylvania
Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico Department of Chemical and Biomolecular Engineering University of California Los Angeles, California
Manuel Ferrer
Department of Biocatalysis Institute of Catalysis Consejo Superior de Investigaciones Científicas Madrid, Spain
Zachary L. Fowler
Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York
Department of Environmental Microbiology HZI-Helmholtz Centre for Infection Research Braunschweig, Germany
Department of Applied Microbiology Lund University Lund, Sweden
Guillermo Gosset
Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico
Bärbel Hahn-Hägerdal Department of Applied Microbiology Lund University Lund, Sweden
Joseph J. Heijnen
Bioprocess Technology Group Department of Biotechnology Delft University of Technology Delft, the Netherlands
Gerald Hofmann
Center for Microbial Biotechnology, BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Hans-Peter Hohmann
DSM Nutritional Products Ltd Basel, Switzerland
Olle Holst
Department of Biotechnology Lund University Lund, Sweden
C.A. Leclerc
Department of Chemical Engineering McGill University Montreal, Quebec, Canada
Department of Chemical Engineering and Bioengineering University of Ulsan Ulsan, Republic of Korea
Louise Johansson
Effendi Leonard
Department of Chemical Engineering Lund University Lund, Sweden
Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York
Eva Nordberg Karlsson Department of Biotechnology Lund University Lund, Sweden
Tae Yong Kim
Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea
Maria I. Klapa
James C. Liao
Chemical and Biomolecular Engineering Department University of California Los Angeles, California
Gunnar Lidén
Department of Chemical Engineering Lund University Lund, Sweden
Radhakrishnan Mahadevan
Department of Chemical and Biomolecular Engineering Institute of Chemical Engineering and High-Temperature Chemical Processes Foundation for Research and Technology-Hellas Patras, Greece
Department of Chemical Engineering and Applied Chemistry Institute of Biomaterials and Biomedical Engineering University of Toronto Toronto, Ontario, Canada
Mattheos Koffas
Costas D. Maranas
Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York
Department of Biology Centre for Structural and Functional Genomics Concordia University Montreal, Quebec, Canada
Sang Yup Lee
Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea
Soon Ho Hong
Vincent J.J. Martin
Department of Chemical Engineering Pennsylvania State University Fenske Laboratory University Park, Pennsylvania
Alfredo Martínez
Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico
Jens Nielsen
Systems Biology Department of Chemical and Biological Engineering Chalmers University of Technology Gothenburg, Sweden and Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Lars Keld Nielsen
Australian Institute for Bioengineering and Nanotechnology The University of Queensland Brisbane, Australia
Si Jae Park
Corporate R&D LG Chem, Ltd. Daejeon, Republic of Korea
Kiran Raosaheb Patil
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
John Perkins
DSM Nutritional Products Ltd Basel, Switzerland
Lake-Ee Quek
Australian Institute for Bioengineering and Nanotechnology The University of Queensland Brisbane, Australia
Manuel Rivera
Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico
Kanchana Rueksomtawin
Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark
Uwe Sauer
Institute for Molecular Systems Biology ETH Zürich Zürich, Switzerland
Christina D. Smolke
Jeffrey D. Varner
Seung Bum Sohn
Peter J.T. Verheijen
Division of Chemistry and Chemical Engineering California Institute of Technology Pasadena, California
Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea
Hyohak Song
Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea
Department of Chemical and Biomolecular Engineering Cornell University Ithaca, New York
Department of Biotechnology Delft University of Technology Delft, the Netherlands
Alan J. Wolfe
Department of Microbiology and Immunology Loyola University at Chicago Stritch School of Medicine Maywood, Illinois
Wilson W. Wong
Chemical and Biomolecular Engineering Department University of California Los Angeles, California
Yi Tang
Department of Chemical and Biomolecular Engineering University of California Los Angeles, California
Walter M. van Gulik
Bioprocess Technology Group Department of Biotechnology Delft University of Technology Delft, the Netherlands
Markus Wyss
DSM Nutritional Products Ltd Basel, Switzerland
Wenjun Zhang
Department of Chemical and Biomolecular Engineering University of California Los Angeles, California
I Cellular Metabolism
Andy Ekins and Vincent J.J. Martin Concordia University
1 Solute Transport Processes in the Cell Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset
Introduction • Structure and Function of the Bacterial Membrane • The Transporter Classification (TC) System
2 Catabolism and Metabolic Fueling Processes Olubolaji Akinterinwa and Patrick C. Cirino
Introduction • Classification of Organisms • Thermodynamics of Fueling Processes • Products of Fueling Processes • Redox Potentials and Mobile Electron Carriers • Examples of Catabolic Processes in Different Organisms • Concluding Remarks
3 Biosynthesis of Cellular Building Blocks: The Prerequisites of Life Zachary L. Fowler, Effendi Leonard, and Mattheos Koffas
Introduction • Amino Acid Biosynthesis • Nucleotides as Building Blocks • Synthesis of Carbohydrates for Building Cells • Cell Synthesis of Lipids
4 Polymerization of Building Blocks to Macromolecules: Polyhydroxyalkanoates as an Example Si Jae Park, Soon Ho Hong, and Sang Yup Lee
Introduction • PHAs • PHA Synthases • Metabolic Engineering of Microorganisms for PHA Production • Conclusion
5 Rare Metabolic Conversions—Harvesting Diversity through Nature Manuel Ferrer and Peter N. Golyshin
Introduction • How Diverse Are Functional Groups? • Diversity of Enzymes and Current Frontiers for Bioconversions • Main Chemical Conversions Mediated by Enzymes: Putative “Rare” Conversions • How Can New Catalytic Functions Be Achieved? • Recent Advances in Metagenomics: The Untapped Reservoir of Proteins from Unculturable Microbes
The field of metabolic engineering has advanced over time and will undoubtedly continue to do so, based on a solid scientific foundation and continued research and innovation. Of critical importance is a solid understanding of the metabolism of the cell, which will ultimately be manipulated through various techniques to produce a desired end product or, alternatively, to remove or break down an undesirable one. This section describes the well-established knowledge of how cells, Escherichia coli in particular, transport a variety of nutrients and use them, through a variety of metabolic pathways, to derive energy and synthesize the wide spectrum of cellular components required for the maintenance of a cell. In some instances, the alteration of the metabolic pathways of a particular cell may lead to the production of a desired product, while in others the synthesis of the desired product may require the introduction of foreign genes, isolated and characterized from other organisms, in order to allow synthesis to proceed. Furthermore, with the advent of metagenomics, one can sift out genes that allow unique metabolic conversions not yet described in cultured microorganisms.
While the organism of choice for many metabolic engineering studies is E. coli, all cells are, by definition, enveloped by a membrane that separates cellular components from the extracellular environment. Highly efficient transport and export systems have evolved to allow exchange across this barrier. A sound knowledge of the transport systems that import the nutrients required for cell growth and drive the metabolism of the cell is essential to ensure that the desired metabolic pathway receives the precursors and energy required for the production of a selected compound. As the outer membrane of E.
coli is only capable of allowing the passive diffusion of molecules with a molecular weight less than approximately 600 Da, it is of utmost importance to determine if the import of precursors, for instance, can cause a bottleneck in the synthesis of a particular compound. Additionally, the type of transport system present in the cell can have an impact on the carbon flux within a cell. As an example, modification of the E. coli phosphotransferase system has been a strategy successfully applied to metabolically engineered strains (Gosset, 2005). Manipulation of the transport systems can increase the diversity of nutrients imported while manipulation of the regulatory systems of the cell can allow the simultaneous import and use of multiple carbon sources, as is the case for carbon catabolite repression mutants for example (Dien et al., 2002). As nutrients are catabolized, precursors for metabolic syntheses are generated along with the energy and reducing power required to drive the synthesis of all the components required by the cell. Solid knowledge of the metabolic pathways within the cell allows one to ensure that the proper precursors, reducing power and energy are in ample supply in order to produce the molecule of interest. Furthermore, culture components and conditions can be altered to enhance the efficiency of a particular metabolic conversion. In some instances, one may wish to overproduce a compound in an organism that does not naturally produce said compound. One such example is the production of polyhydroxyalkanoates (PHAs) in E. coli, an organism that does not naturally produce PHAs. It is therefore necessary to express PHA synthase genes isolated from foreign organisms. 
Additionally, in order to increase production of the desired PHA, it is crucial to evaluate the metabolic flux of the host organism and perhaps amplify certain endogenous pathways to increase the availability of precursors and reducing power, without decreasing the overall health of the expressing organism. In the case of poly(3-hydroxybutyrate) [P(3HB)] production, it was found that overexpression of glucose-6-phosphate dehydrogenase and 6-phosphogluconate dehydrogenase led to an increase in the NADPH/NADP+ ratio and a subsequent rise in the concentration of P(3HB). There was, however, a detrimental effect on the producing cells, and the observed increase in P(3HB) production was due to a lower cell concentration (Lim et al., 2002). In another experiment, 2-D gel electrophoresis and metabolic flux analysis were performed on E. coli producing P(3HB), revealing an increase in certain glycolytic pathway enzymes. Subsequently, amplification of the glycolytic pathway enzymes led to increased production of acetyl-CoA, which could in turn be used to increase yields of P(3HB) (Park and Lee, unpublished results).
The pursuit of rare conversions has, up to this point, focused on the ability of cultured organisms and their enzymes to perform such functions. The products of such rare conversions are invaluable to a variety of industries, spanning the agricultural, pharmaceutical, food additive, and bioremediation fields, among others. While many advances have been made in the realm of protein engineering using techniques such as directed evolution, it is reasonable to assume that an even greater diversity exists within the genomes of unculturable microorganisms. The diversity of the cultured microbial world has led to the discovery of many rare conversions; there remains, however, a large pool of untapped genetic material within the many “unculturable” microorganisms that are currently estimated to represent close to 99% of the microbial world (Fütterer et al., 2004). Given that each sequenced microbial genome yields on average 30–50% of genes with unknown function (Bode and Müller, 2005), and that the recent shotgun sequencing of DNA isolated from the Sargasso Sea revealed more than 1.2 million genes of unknown function (Venter et al., 2004), it appears reasonable to assume that there exists a vast pool of untapped genetic resources that can be applied to all realms of biotechnology.
References
Bode, H.B., and Müller, R. The impact of bacterial genomics on natural product research. Angew. Chem. Int. Ed. Engl., 44, 6828, 2005.
Dien, B.S., Nichols, N.N., and Bothast, R.J. Fermentation of sugar mixtures using Escherichia coli catabolite repression mutants engineered for production of L-lactic acid. J. Ind. Microbiol. Biotechnol., 29, 221, 2002.
Fütterer, O., et al. Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc. Natl. Acad. Sci. USA, 101, 9091, 2004.
Gosset, G. Improvement of Escherichia coli production strains by modification of the phosphoenolpyruvate:sugar phosphotransferase system. Microb. Cell Fact., 4, 14, 2005.
Lim, S.J., et al. Amplification of the NADPH-related genes zwf and gnd for the oddball biosynthesis of PHB in an E. coli transformant harboring a cloned phbCAB operon. J. Biosci. Bioeng., 93, 543, 2002.
Venter, J.C., et al. Environmental genome shotgun sequencing of the Sargasso Sea. Science, 304, 66, 2004.
1 Solute Transport Processes in the Cell
Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset National Autonomous University of Mexico
1.1 Introduction
1.2 Structure and Function of the Bacterial Membrane
Structure of the Cellular Membrane • Functions • Kinetics of Transport Processes
1.3 The Transporter Classification (TC) System
Channels and Pores • Electrochemical Potential-Driven Transporters • Primary Active Transporters • Group Translocators • Transmembrane Electron Flow Systems
References
1.1 Introduction
The cell membrane constitutes a hydrophobic barrier that isolates the cytoplasm from the external medium. The entry and exit of most of the nutrients required for cell growth and the byproducts generated by metabolism are highly restricted by this cellular structure. However, to sustain high growth rates, microbes require a high rate of nutrient import. The presence of specialized transport proteins in the membrane allows the cell to circumvent the permeability restrictions imposed by this barrier. Analyses of microbial genomes have revealed that approximately 10% of the genes encode proteins involved in transport [1]. These transport systems participate in the import and export of different classes of molecules and also in other important cellular functions. They allow the entry of nutrients to sustain metabolism and ion species to maintain concentration gradients leading to membrane potential and energy generation. Transporters allow the excretion of metabolite by-products and other toxic substances, like drugs or certain metal ions. Transport systems also participate in the secretion of lipids, carbohydrates, and proteins into membrane(s) or the external medium. They enable the transfer of nucleic acids between organisms, contributing to microbial diversity. Finally, transporters participate in the uptake of different types of signaling molecules like alarmones and hormones, among others, thus allowing cellular communication [2].
Solute transport and metabolism are linked processes in the cell. Genetic organization in bacteria frequently reflects this functional coupling by the clustering of genes encoding both transport and metabolic activities in transcriptional units. This association is generally observed in operons encoding catabolic pathways for carbon sources [3]. Transport and regulatory systems participate in the process whereby the bacterial cell can select from a mixture of nutrients those that afford the highest growth rate [4].
In addition, the differential expression of genes encoding distinct transporters for a specific substrate allows the cell to select the transport mechanism according to the physiological state and environmental conditions [5]. Transport systems are potential targets for modification with the aim of improving microbial production strains. Metabolic engineering efforts usually focus on modifying metabolic enzyme activities.
However, it can be envisioned that high performance production strains will also require the modification of other cellular functions, including transport. Modification of transport systems can result in the improvement of several cellular properties including: (a) increasing the range of carbon source utilization [6]; (b) increasing metabolic precursor availability for the synthesis of amino acids, shikimate pathway intermediates, TCA cycle intermediates, and fermentation products like ethanol [7–10]; (c) increasing the efficiency in sugar mixture utilization by partial disruption of catabolic repression [11]; and (d) controlling overflow metabolism, thus reducing acetate production [12].
1.2 Structure and Function of the Bacterial Membrane
1.2.1 Structure of the Cellular Membrane
The cell membrane, also known as the cytoplasmic membrane, plasma membrane, or cell surface membrane, is a thin structure that surrounds the cell. It is the barrier that defines the boundaries of the cell, separating the cytoplasm from its environment. If the membrane is damaged, the integrity of the cell is altered and the cytoplasm leaks into the environment, causing cell death. The general structure of prokaryotic and eukaryotic cell membranes (and the outer membranes of Gram-negative bacteria) is a bilayer composed of phospholipids, which contain both hydrophobic (fatty acid) and hydrophilic (glycerol-phosphate) components. Phospholipids can exist in many chemical forms as a consequence of the diversity of compounds attached to the glycerol backbone. As phospholipids aggregate in aqueous solution they spontaneously organize to form two parallel rows, known as a lipid bilayer. Phospholipid molecules align with the fatty acids pointing inward toward each other to form a hydrophobic environment, whereas the hydrophilic portions face both the external side and the internal or cytoplasmic side of the membrane. The bilayer structure represents the most stable arrangement of the lipid molecules in an aqueous environment. The whole structure of the plasma membrane is stabilized by hydrogen bonds and hydrophobic interactions. In addition, cations such as Mg2+ and Ca2+ help to stabilize the membrane through ionic interactions with the negative charges of the phospholipids. A model of the structures of the cell membranes of Gram-positive and Gram-negative bacteria is shown in Figure 1.1 [13–15]. A substantial amount of protein and other material is partially or completely embedded in the membrane layer. A typical bacterial membrane contains up to 200 different kinds of proteins (approximately 75% of the mass of the membrane).
Protein molecules in the membrane are arranged in a variety of ways. Some proteins are fully embedded in the membrane and are thus called integral or transmembrane proteins. They can be removed from the membrane only after disrupting the lipid bilayer. Some of these proteins are channels that have a pore, through which substances enter and exit the cell. Other proteins, called peripheral proteins, are firmly associated with the inner or outer surface of the membrane but, unlike integral proteins, can be removed from it by mild treatments. They may function as enzymes that catalyze chemical reactions, as scaffolds for support of cell components, and as mediators of changes in membrane shape during movement. Some peripheral membrane proteins contain a lipid tail on the amino terminus that anchors the protein to the membrane. These proteins are called lipoproteins and interact directly with integral proteins in important cellular processes such as energy metabolism. Many proteins and some of the lipids on the outer surface of the plasma membrane have carbohydrates attached to them. These structures are known as glycoproteins and glycolipids, respectively. Both of these structures help to protect the cell and are involved in cell-to-cell interactions. Sterols and related molecules are present in eukaryotic membranes. They are rigid and planar molecules, whereas fatty acids are flexible; their presence stabilizes membranes and makes them less flexible. Sterols are absent in prokaryotic cellular membranes, except for those of methanotrophs and mycoplasmas. Polycyclic compounds known as hopanes (derivatives of pentacyclic triterpenoids) are widely distributed among bacteria, and it is proposed that they may play a role in maintaining membrane rigidity (Figure 1.2). One widely distributed hopane is the C30 hopanoid diploptene. Hopanes are not present in species of Archaea [16].
Figure 1.1 Cell membranes of Gram-positive and Gram-negative bacteria. Schematic representation of the inner and outer membrane lipid bilayers of Gram-negative bacteria (upper panel) and Gram-positives (lower panel). Several structures associated with cell membranes, such as porins, integral or transmembrane and peripheral proteins, and cell wall components, are shown. Labeled structures: O-specific side chains, lipopolysaccharide, outer membrane, murein lipoprotein, periplasmic space and cell wall, murein, phospholipids, cytoplasmic membrane, peripheral proteins, and transmembrane proteins.
Figure 1.2 Structure of membrane sterols and hopanoids. (a) Structure of the cholesterol molecule, a typical sterol present in cell membranes of eukaryotic cells, methanotrophic bacteria, and mycoplasmas. (b) Structure of a hopane, a polyterpenoid present in prokaryotic cell membranes.
Membranes have a viscosity similar to that of light-grade oil. Experimental evidence has demonstrated that, at temperatures that permit growth, membrane molecules are not static but move quite freely within the membrane surface. Individual lipid molecules are also generally free to exchange places with other lipids in the membrane, so that the membrane resembles a two-dimensional fluid. It is proposed that this movement is most probably associated with the functions of the plasma membrane. This dynamic arrangement of phospholipids and proteins is known as the fluid mosaic model [17]. However, it is also proposed that some membrane regions have considerable order, because some lipid molecules are not free to move owing to their association with specific membrane proteins and other components [18]. The phospholipids of bacterial cell membranes contain ester linkages bonding the fatty acids to glycerol, whereas the membranes of Archaea lack fatty acids (Figure 1.3). Instead, their side chains are composed of repeating units of the five-carbon hydrocarbon isoprene, linked to glycerol by an ether bond; however, the overall architecture of the cytoplasmic membrane of Archaea, with inner and outer hydrophilic surfaces and a hydrophobic interior, is the same as in bacteria. Glycerol diethers and glycerol tetraethers are the major lipids present in membranes from Archaea. In the tetraether molecule, the phytanyl side chains (each composed of four linked isoprenes) from each glycerol molecule are covalently bonded together (Figure 1.4), leading to a lipid monolayer instead of a bilayer cytoplasmic membrane. This structure is widely distributed among hyperthermophilic Archaea, where it helps to maintain the membrane architecture at high temperatures [19–20].
1.2.2 Functions The most important function of the cell membrane is to serve as a selective barrier through which material enters and exits the cell. The cytoplasm consists of an aqueous solution containing salts, sugars, amino acids, nucleotides, vitamins, coenzymes, proteins, and a variety of other soluble materials. The hydrophobic nature of the internal region of the plasma membrane constitutes a tight diffusion barrier with selective permeability, allowing certain molecules and ions to pass through and blocking passage to others. Some smaller molecules, such as water, oxygen, carbon dioxide, and some simple sugars, usually pass freely through the membrane by diffusion (Table 1.1). This is also the case for molecules that are dissolved easily in lipids (oxygen, carbon dioxide, and nonpolar organic molecules). In contrast, hydrophilic and small charged molecules such as the hydrogen ion (H+) do not pass through the membrane but instead must be specifically transported. Water is a molecule that freely crosses the membrane, because it is sufficiently small to pass through the phospholipid bilayer. However, water transport through the membrane can be accelerated
Figure 1.3 Chemical diversity of lipidic bonds in cell membranes. (a) An ester linkage found in lipids of bacteria and eukaryotic cells. (b) An ether linkage found in lipids of the cell membrane of Archaea. (c) Structure of isoprene, the parent structure of the hydrophobic side chains present in Archaea.

Figure 1.4 Structures of the Archaea cell membranes. (a) Schematic representation of bilayers of isoprenoids (glycerol diether with phytanyl side chains) linked to glycerol by ether bonds. (b) Structure of the monolayers of the isoprenoid biphytanyl glycerol ether (diglycerol tetraether).
Table 1.1 Comparison of Diffusion-Controlled and Carrier-Mediated Solute Fluxes across Bacterial Plasma Membranes
Typical transfer rate [µmol min⁻¹ (g dry mass)⁻¹]
Transported solutes: potassium ion (K⁺), glutamate, glucose, isoleucine, phenylalanine, urea
Columns: diffusion-controlled at a concentration difference of 10 µM; diffusion-controlled at a concentration difference of 10 mM; carrier-mediated (Vmax)
0.00002
0.02
100
<0.00005 0.001 0.0015 0.08 0.04
<0.05 1.0 1.5 8.0 40.0
50 30 1 5
25
by specific transport proteins called aquaporins. The movement of most hydrophilic solutes across plasma membranes depends on transporter molecules, which will be described in the following sections [21].
1.2.3 Kinetics of Transport Processes

A solute transport system consists of integral membrane proteins that can be regarded as membrane-bound enzymes. However, instead of catalyzing the conversion of substrates to products, they mediate
the transfer of solutes between compartments separated by a membrane. Each transport system displays an affinity and specificity toward particular substrate(s). It is not uncommon to find in one organism more than one transporter for a solute, each having a different affinity and specificity, a situation analogous to that of isozymes in metabolism. In addition, there is considerable diversity with regard to the energetic coupling mechanisms that drive active transport. Although transport processes are not fully understood, some models are helpful for understanding the basic molecular events [22,23]. The function of a transporter can be defined in three basic steps, analogous to those of enzyme activity: binding, translocation, and release of solute. The translocation step can involve a major conformational change of the transporter protein, thus defining either of two functional conformations.

Diffusion plays an important role in solute transport across the lipid bilayer membrane. This process can occur either in the absence of specific protein transporters or mediated by them. By measuring the transport rate of a solute at different concentrations, it is possible to determine whether the diffusion process is transporter mediated. In a transporter-independent process, the rate of diffusion increases linearly with solute concentration, as shown in Figure 1.5. In contrast, in transporter-mediated diffusion, a maximum value is reached. This response is explained by the transporter proteins becoming saturated once the solute reaches a sufficiently high concentration.

The processes that govern the simple diffusion of an electrically neutral molecule were studied by Adolf Eugen Fick in the 1800s and can be applied to the special case of a cell membrane. Measuring the concentrations of a solute Sx outside and inside a cell allows us to predict whether it is in equilibrium across the cell membrane or whether Sx would tend to move passively into or out of the cell.
As long as the movement of Sx is not coupled to the movement of another substance or to some biochemical reaction, the only factor determining the direction of net transport is the difference in concentration. The ability to predict the movement of Sx is independent of any detailed knowledge of the actual pathway mediating its passive transport. When the external concentration of Sx ([Sx]o) is greater than its internal concentration ([Sx]i), the net movement of Sx will be into the cell. The movement of Sx is described by its flux (JSx), namely, the number of moles of Sx crossing a unit area of membrane (typically 1 cm²) per unit of time (s), i.e., mol/(cm²·s). A higher solubility of Sx in the membrane lipids (a higher lipid–water partition coefficient of Sx) corresponds to a higher flux through the membrane barrier. The flux of Sx will also be greater if Sx moves more readily once it is in the membrane (higher diffusion coefficient) and if the distance that it must traverse is short (membrane thickness). These three factors are combined in a parameter called the permeability coefficient of Sx (PSx). Finally, the flux of Sx will be greater as the difference in [Sx] between the two sides of the membrane increases (gradient). All these concepts are integrated in the equation known as Fick's law:
$$J_{S_x} = P_{S_x}\left([S_x]_o - [S_x]_i\right) \qquad (1.1)$$

Figure 1.5 Comparison of solute transport kinetics in the presence and absence of a transporter. The rate of diffusion is plotted against substrate concentration (S) for diffusion, carrier-mediated transport, and carrier-mediated transport plus diffusion.
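As a rough numerical illustration of Equation 1.1, the following Python sketch computes the net flux as the difference between a unidirectional influx and a unidirectional efflux that share the same permeability coefficient. The function name `net_flux`, the unit choices, and all numerical values are assumptions made for illustration; they are not taken from measured data.

```python
def net_flux(permeability_cm_per_s, conc_out_mol_per_cm3, conc_in_mol_per_cm3):
    """Net flux J = P * ([S]o - [S]i) in mol/(cm^2*s); positive means net entry."""
    influx = permeability_cm_per_s * conc_out_mol_per_cm3  # proportional to [S]o
    efflux = permeability_cm_per_s * conc_in_mol_per_cm3   # proportional to [S]i
    return influx - efflux


MM = 1e-6  # 1 mM expressed in mol/cm^3

# Hypothetical case: P = 1e-6 cm/s, 10 mM outside, 1 mM inside.
j = net_flux(1e-6, 10 * MM, 1 * MM)
print(f"net flux = {j:.2e} mol/(cm^2*s)")
```

With these assumed values the net flux is about 9 × 10⁻¹² mol/(cm²·s), directed into the cell; equal concentrations on both sides give zero net flux, and a higher internal concentration gives a negative (outward) value.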
The net flux of Sx can be decomposed into a unidirectional influx (JSx, o→i) and a unidirectional efflux (JSx, i→o). The net flux of Sx, as shown in Equation 1.1, is simply the difference between the unidirectional fluxes. Thus, the unidirectional influx is proportional to the outside concentration, the unidirectional efflux is proportional to the inside concentration, and the net flux is proportional to the concentration difference; in all cases the proportionality constant is PSx. The following equation allows the calculation of the electrochemical energy difference, or electrochemical potential difference, of Sx (ΔµSx). This parameter combines the concentration and voltage gradients across a membrane:
$$\Delta\mu_{S_x} = RT \ln\frac{[S_x]_i}{[S_x]_o} + z_X F\left(\psi_i - \psi_o\right) \qquad (1.2)$$
where R is the gas constant, T is the absolute temperature, zX is the charge (valence) of Sx, and F is the Faraday constant. The term RT ln([Sx]i/[Sx]o) describes the energy change (joules/mole) as Sx moves across the membrane; it is a measure of the chemical energy difference. The term zXF(ψi–ψo) describes the energy change as a mole of charged particles (each with a valence of zX) moves across the membrane; this part of the equation defines the electrical energy difference. The difference (ψi–ψo) is the voltage difference across the membrane (membrane potential difference, Vm). By definition, Sx is at equilibrium when the electrochemical potential difference for Sx across the membrane is zero (ΔµSx = 0). When ΔµSx is not zero, its value represents the net driving force causing Sx to either enter or leave the cell, provided that a pathway exists for it to cross the membrane.

It is important to consider two special cases of the equilibrium condition for Equation 1.2. The first case is when either the chemical or the electrical term in this equation is zero. For instance, when Sx is uncharged (zX = 0), as in the case of glucose, equilibrium occurs only when [Sx] is equal on both sides of the membrane. Alternatively, when Sx is charged, as in the case of Na+, but the electrical potential difference Vm is zero, equilibrium likewise occurs only when [Sx] is equal on both sides of the membrane. The second case is when neither the chemical nor the electrical term in Equation 1.2 is zero; equilibrium then occurs when the two terms are equal in magnitude but of opposite sign. This relationship is the Nernst equation, obtained by setting ΔµSx = 0 in Equation 1.2:
$$V_m = E_X = -\frac{RT}{z_X F} \ln\frac{[S_x]_i}{[S_x]_o} \qquad (1.3)$$
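Equations 1.2 and 1.3 can be checked against each other numerically. The sketch below is only an illustration: the function names, the temperature of 310.15 K, and the ion concentrations (140 mM inside, 5 mM outside for a monovalent cation) are assumptions, not values from the text. It evaluates the Nernst potential and then confirms that the electrochemical potential difference vanishes when the membrane potential equals that value.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def electrochemical_potential_difference(c_in, c_out, z, v_m, temp_k=310.15):
    """Equation 1.2 in J/mol; v_m = psi_i - psi_o in volts."""
    chemical_term = R * temp_k * math.log(c_in / c_out)
    electrical_term = z * F * v_m
    return chemical_term + electrical_term


def nernst_potential(c_in, c_out, z, temp_k=310.15):
    """Equation 1.3 in volts; undefined for uncharged solutes (z = 0)."""
    if z == 0:
        raise ValueError("the Nernst potential is undefined for z = 0")
    return -(R * temp_k / (z * F)) * math.log(c_in / c_out)


# Hypothetical monovalent cation: 140 mM inside, 5 mM outside.
e_x = nernst_potential(140.0, 5.0, z=1)
residual = electrochemical_potential_difference(140.0, 5.0, z=1, v_m=e_x)
print(f"E_X = {e_x * 1000:.1f} mV, residual driving force = {residual:.2e} J/mol")
```

Only the concentration ratio enters the logarithm, so any consistent concentration unit can be used. For an uncharged solute the electrical term drops out and the driving force is zero only when both concentrations are equal, which is the first special case discussed above.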
Hence, the Nernst equation describes the conditions under which an ion is in equilibrium across a membrane. Given values for [Sx]i and [Sx]o, Sx can be in equilibrium only when the voltage difference across the membrane equals the equilibrium potential (EX), also known as the Nernst potential [24].

Transport processes can be studied by applying some of the tools employed in enzyme kinetics. Transport can be regarded as a reaction in which the location of the substrate, and not its structure, is changed. Mathematical and graphical analyses of experimental data, similar to those used in enzyme kinetics, can therefore be performed. Plots of velocity versus substrate concentration (Michaelis and Menten model), as shown in Figure 1.6 and Equation 1.4, and reciprocal plots (Lineweaver–Burk) are used to characterize the kinetics and the type of inhibition.
$$v = \frac{V_{\max}\,[S_x]}{K_m + [S_x]} \qquad (1.4)$$
The values for Km can vary considerably from one transporter to another, and for a particular transporter with different solutes. For transport studies, Km is defined as the solute concentration that results in half-maximal velocity for the transport reaction. An equivalent way of stating this is that Km represents the substrate concentration at which half of the transporter is occupied by solute molecules in steady state. Hence, the constant Km can be used as a relative measure of substrate binding affinity.

Figure 1.6 Michaelis and Menten kinetics of a transport-mediated solute diffusion process. The transport rate is plotted against substrate concentration (S); Km is the concentration at which the rate reaches ½ Vmax.
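The Michaelis and Menten relationship and the reciprocal (Lineweaver–Burk) plot mentioned above can be sketched in a few lines of Python. This is a minimal illustration, assuming noise-free synthetic data; the function names and the parameter values (Vmax = 50, Km = 0.2, in arbitrary units) are hypothetical. The fit uses an ordinary least-squares line through the points (1/[S], 1/v), whose intercept is 1/Vmax and whose slope is Km/Vmax.

```python
def transport_rate(s, v_max, k_m):
    """Equation 1.4: carrier-mediated transport rate at solute concentration s."""
    return v_max * s / (k_m + s)


def lineweaver_burk_fit(concs, rates):
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax; return (v_max, k_m)."""
    xs = [1.0 / s for s in concs]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    v_max = 1.0 / intercept
    k_m = slope * v_max
    return v_max, k_m


# Synthetic, noise-free data generated from assumed parameters.
true_v_max, true_k_m = 50.0, 0.2
concs = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
rates = [transport_rate(s, true_v_max, true_k_m) for s in concs]

v_max_est, k_m_est = lineweaver_burk_fit(concs, rates)
print(f"estimated Vmax = {v_max_est:.3f}, estimated Km = {k_m_est:.3f}")
```

With real measurements the reciprocal transformation amplifies the error of the low-concentration points, so direct nonlinear fitting of Equation 1.4 is often preferred; the linear form is shown here only because it mirrors the reciprocal plot described in the text.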
1.3 The Transporter Classification (TC) System

The phylogenetic and functional analyses of transporter proteins from many different organisms have provided the basis for the development of a transporter classification (TC) system [25]. In this system, permeases are classified according to both function and phylogeny. The TC system is a system of nomenclature for transport protein classification approved by the International Union of Biochemistry and Molecular Biology (IUBMB). The systematic classification of solute transporters is based on several criteria, including mode of transport, energy-coupling source, and molecular phylogeny. A specific five-component TC number classifies each transporter according to five criteria. The first component is a number referring to the class, which defines the mode of transport and energy-coupling mechanism; the second is a letter indicating the subclass, as defined by the type of transporter and the energy-coupling mechanism; the third is a number indicating the superfamily or family; the fourth is a number indicating a phylogenetic cluster within a family or a family within a superfamily; and the last is a number indicating the substrate or range of substrates transported and the polarity of transport. For example, the TC number for the galactose:H+ symporter from E. coli (GalP) is 2.A.1.1.1. This number indicates that GalP is a member of the 2.A.1.1 sugar porter family from the 2.A.1 major facilitator superfamily of the 2.A porters subclass of the 2 electrochemical potential-driven class of transporters.

The functional and phylogenetic taxonomy of the TC system can be accessed in the transporter classification database (TCDB). This is a Web-accessible (http://www.tcdb.org) relational database containing a wealth of information about transport systems. It is a curated database that is continuously updated and that classifies structural, functional, evolutionary, and sequence information.
The TCDB compiles information from over 10,000 references and includes approximately 3,000 different proteins organized in approximately 400 transporter families. The Web interface provides several methods for accessing data and searching the database. It also includes several bioinformatic tools designed to analyze transport proteins [26]. Table 1.2 shows an outline of the TC system according to permease type and energy source. There are four known classes of transporters: (1) channels, (2) porters, (3) primary active transporters, and (4) group translocators. An additional class (8) includes auxiliary transport proteins, and class (9) includes
uncharacterized sequences that are homologous to transporters and characterized transporters whose sequences are not known. Categories 6–7 are reserved for yet to be discovered novel types of transporters. Figure 1.7 shows the known classes and mechanisms of solute transporters found in bacteria, which include: (a) channel and pore-mediated passive diffusion; (b) carrier-mediated solute-H+ symport; (c) carrier-mediated solute-H+ symport with an external solute-recognition receptor; (d) primary active uptake ABC transporter driven by ATP hydrolysis; and (e) group translocating permease of the phosphoenolpyruvate:sugar phosphotransferase system (PTS). Each of these classes will be presented in detail in the following sections.

Table 1.2 Classes and Subclasses of Transporters According to the TC Systemᵃ

1. Channels and Pores
   1.A α-Type channels
   1.B β-Barrel porins
   1.C Pore-forming toxins (proteins and peptides)
   1.D Nonribosomally synthesized channels
2. Electrochemical Potential-Driven Transporters
   2.A Porters (uniporters, symporters, antiporters)
   2.B Nonribosomally synthesized porters
   2.C Ion-gradient-driven energizers
3. Primary Active Transporters
   3.A P-P-bond hydrolysis-driven transporters
   3.B Decarboxylation-driven transporters
   3.C Methyltransfer-driven transporters
   3.D Oxidoreduction-driven transporters
   3.E Light absorption-driven transporters
4. Group Translocators
   4.A Phosphotransfer-driven group translocators
5. Transmembrane Electron Flow Systems
   5.A Two-electron carriers
   5.B One-electron carriers
8. Accessory Factors Involved in Transport
   8.A Auxiliary transport proteins
9. Incompletely Characterized Transport Systems
   9.A Recognized transporters of unknown biochemical mechanism
   9.B Putative but uncharacterized transport proteins
   9.C Functionally characterized transporters lacking identified sequences

ᵃ Categories 6–7 are reserved for yet to be discovered novel types of transporters.
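The structure of a TC number lends itself to a small parsing example. The sketch below splits a TC number such as 2.A.1.1.1 into its five components and looks up the class name from Table 1.2. The names `TCNumber`, `parse_tc_number`, and `TC_CLASSES` are hypothetical helpers written for this illustration; they are not part of the TCDB or of any published library.

```python
from dataclasses import dataclass

# Class names as listed in Table 1.2 (categories 6 and 7 are reserved).
TC_CLASSES = {
    1: "Channels and pores",
    2: "Electrochemical potential-driven transporters",
    3: "Primary active transporters",
    4: "Group translocators",
    5: "Transmembrane electron flow systems",
    8: "Accessory factors involved in transport",
    9: "Incompletely characterized transport systems",
}


@dataclass(frozen=True)
class TCNumber:
    tc_class: int    # mode of transport and energy-coupling mechanism
    subclass: str    # letter: type of transporter
    family: int      # superfamily or family
    subfamily: int   # phylogenetic cluster, or family within a superfamily
    system: int      # substrate range and polarity of transport

    @property
    def family_id(self) -> str:
        return f"{self.tc_class}.{self.subclass}.{self.family}"

    @property
    def subfamily_id(self) -> str:
        return f"{self.family_id}.{self.subfamily}"


def parse_tc_number(text: str) -> TCNumber:
    """Parse a five-component TC number; raise ValueError if malformed."""
    parts = text.strip().rstrip(".").split(".")
    if len(parts) != 5:
        raise ValueError(f"expected five components, got {len(parts)}: {text!r}")
    cls, sub, fam, subfam, system = parts
    if not (len(sub) == 1 and sub.isalpha()):
        raise ValueError(f"subclass must be a single letter: {sub!r}")
    return TCNumber(int(cls), sub.upper(), int(fam), int(subfam), int(system))


galp = parse_tc_number("2.A.1.1.1")
print(TC_CLASSES[galp.tc_class], galp.family_id, galp.subfamily_id)
```

For the GalP example from the text, the parsed number reports class 2 (electrochemical potential-driven transporters), superfamily 2.A.1, and family 2.A.1.1.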
Figure 1.7 Classes and mechanisms of solute transporters. (a) Channel and pore-mediated passive diffusion. (b) Carrier-mediated solute-H+ symport. (c) Carrier-mediated solute-H+ symport with an external solute-recognition receptor. (d) Primary active uptake ABC transporter driven by ATP hydrolysis. (e) Group translocating permease of the phosphoenolpyruvate:sugar phosphotransferase system.

1.3.1 Channels and Pores

The channel or pore is one of the simplest structures for the transport of solutes (Figure 1.7a). In this case, facilitated diffusion is not coupled to the use of metabolic energy, and hence it cannot generate concentration gradients of the transported substrate across the membrane. In this class of transporters, the solute passes by a diffusion-limited process from one side of the membrane to the other via a channel or pore that is lined by aminoacyl residues of the constituent protein(s), which recognize hydrophilic, hydrophobic, or amphipathic substrates. In Gram-negative bacteria, some small solutes that in principle can cross the membrane relatively freely may need an additional transport system to sustain a flux high enough for physiological processes. Channel-type proteins called porins mediate the passive transfer of molecules such as carbohydrates, amino acids, and simple ions across the outer membrane. A bacterium like E. coli can have up to 10⁵ copies of the porins OmpF, OmpC, or PhoE,
which tend to form a barrel of antiparallel β-sheets, consisting of trimeric complexes of identical subunits of ca. 35 kDa, with an approximate pore diameter of 1 nm that allows the passage of molecules up to a mass of about 600 Da.

1.3.1.1 α-Type Channels

This class of channel proteins is composed mainly of α-helical spanners (sometimes β-strands play a part in the channel) that catalyze the movement of solutes through transmembrane aqueous pores by an energy-independent mechanism. They are found in bacteria and eukaryotes.

1.3.1.2 β-Barrel Porins

These porin proteins are mainly β-barrels constituted by β-strands in the transmembrane region, which also allow the transport of solutes by an energy-independent mechanism. They are found in the outer membranes of Gram-negative bacteria, mitochondria, and plastids.
1.3.1.3 Pore-Forming Toxins (Proteins and Peptides)

This category of transmembrane pores includes proteins or peptides ribosomally synthesized by one cell and secreted for insertion into the membrane of another cell. They are composed mainly of β-strands. The pores formed by these proteins allow the free flow of electrolytes and other small molecules across the membrane of the host, or they may allow entry into the target cell cytoplasm of a toxin protein that ultimately kills or controls the cell.

1.3.1.4 Nonribosomally Synthesized Channels

These oligomeric transmembrane ion channels are chains of L- and D-amino acids, or small polymers of hydroxylactate or β-hydroxybutyrate, in which the arrangement of the pore structure is stimulated by voltage changes. They are made by bacteria and fungi as agents of biological warfare that confer a selective advantage.
1.3.2 Electrochemical Potential-Driven Transporters

Chemical, light, or electrochemical energy is used by cells to transport and accumulate solutes inside the cell. Several transport phenomena are driven by electrochemical potential gradients, such as proton and sodium gradients. Concentration gradients of solutes power transporter-mediated facilitated diffusion, which does not require the expenditure of metabolic energy in the form of ATP. Transport driven by electrochemical potential gradients permits the transfer of solutes across a membrane against a concentration gradient. This type of transport is also known as secondary transport and is catalyzed by uniporters, symporters, and antiporters (Figure 1.7b and c). In contrast to transporters driven by ATP, electrochemical potential-driven transporters are relatively simple in composition, generally consisting of a single protein that traverses the membrane with several loops.

1.3.2.1 Porters (Uniporters, Symporters, and Antiporters)

Some sugars, amino acids, nucleosides, and small molecules, such as Na+, are transported by uniporter proteins, which move one solute across a membrane down a concentration gradient, from an area of greater concentration to one of lesser concentration. Interactions between the solute and the uniporter induce selective conformational changes that enable the uniporter to transport the solute across the cytoplasmic membrane. Some sugars, amino acids, and ions (e.g., sulfate and phosphate) are co-transported by symporter proteins, which use the proton motive force to move a solute against a concentration gradient; i.e., symporters simultaneously transport two solutes across the membrane in the same direction. Antiport involves a tightly coupled process in which two or more solutes are transported in opposite directions. Na+ can be transported by antiporter proteins. In this instance, a gradient of protons provides the potential energy, and simultaneously the solute (Na+) is transported through the membrane in the opposite direction, against a concentration gradient.

1.3.2.2 Nonribosomally Synthesized Porters

These transmembrane porters are peptides or small polymers of nonpeptide nature. They complex solutes (such as a cation) in their hydrophilic interior and make possible the translocation of the complex across the membrane by exposing their hydrophobic exterior and moving from one side of the membrane bilayer to the other. Transport is electrophoretic if the porter in the uncomplexed form can cross the membrane, and electroneutral (one charged solute is exchanged for another) if only the complexed form can cross the membrane.

1.3.2.3 Ion-Gradient-Driven Energizers

This is a family of auxiliary proteins, like the TonB family, that mediate active transport using an outer membrane receptor, which is energized to accumulate its solutes inside the periplasm against large
concentration gradients. These energizers make use of proton or sodium ion fluxes (i.e., the proton motive force) through themselves to energize outer membrane receptors or porins. Conformational changes of receptors allow electrophoretic transport of protons.

Figure 1.8 The phosphoenolpyruvate:sugar phosphotransferase system from Escherichia coli. General energy-coupling proteins and some sugar transporting complexes are shown. Glc, glucose; Mal, maltose; Gal, galactosamine; Fru, fructose; Man, mannose; and Mtl, mannitol.
1.3.3 Primary Active Transporters

These systems consist of transporters that use a primary source of energy to drive the active transport of a solute against a concentration gradient. Primary energy sources known to be coupled to transport include chemical, electrical, and light energy.

1.3.3.1 P-P-Bond-Hydrolysis-Driven Transporters

These transport systems hydrolyze the diphosphate bond of inorganic pyrophosphate, ADP, ATP, or another nucleoside triphosphate to drive the active uptake and/or extrusion of a solute or solutes. The transport protein may or may not be transiently phosphorylated, but the substrate is not phosphorylated during the process. This subclass is the most abundant of the primary active transporters and comprises three superfamilies and 18 families distributed among all prokaryotic and eukaryotic organisms (Table 1.3).

Table 1.3 Overview of the Superfamilies and Families of the P-P-Bond-Hydrolysis-Driven Transporters

Each superfamily or family is followed by its mechanism and the solute(s) transported.

3.A.1 ATP-binding cassette (ABC) superfamily
   Bacterial and archaeal ABC-type uptake permeases: carbohydrates, polar and hydrophobic amino acids, peptide/opine/nickel, sulfate, phosphate, molybdate, phosphonate, ferric iron, polyamine/opine/phosphonate, quaternary amine, vitamin B12, iron chelate, manganese/zinc/iron chelate, nitrate/nitrite/chelate, taurine, cobalt, thiamin, iron, Fe3+, nickel.
   Bacterial ABC-type efflux (exporter) permeases: capsular polysaccharide, lipooligosaccharide, lipopolysaccharide, teichoic acid, drug, lipid, heme, β-glucan, protein, peptide, glycolipid, Na+, microcin B17, drug/siderophore, drug resistance ATPase.
   Eukaryotic ABC-type efflux (exporter) permeases: multidrug and pleiotropic drug resistance, cystic fibrosis transmembrane conductance, peroxisomal fatty acyl-CoA, eye pigment precursor, a-factor sex pheromone, conjugate transporter, MHC peptide, heavy metal, cholesterol/phospholipid/retinal, mitochondrial peptide.
3.A.2 and 3.A.3 F- and P-type ATPase superfamilies
   H+- or Na+-translocation.
3.A.4 to 3.A.18 families
   Arsenite-antimonite efflux; diverse secretory pathways; mitochondrial-, chloroplast envelope protein-, H+- and septal DNA-translocase systems; bacterial competence-related DNA transformation transporter; filamentous phage-, fimbrilin- and the nuclear mRNA exporters; the endoplasmic reticular retrotranslocon; the phage T7 injectisome.

1.3.3.2 Decarboxylation-Driven Transporters

This subclass includes transport systems that drive ion uptake or extrusion by decarboxylation of a cytoplasmic substrate. The only family included in this subclass is the Na+-transporting carboxylic acid decarboxylase (NaT-DC) family, whose members catalyze the decarboxylation of carboxylic substrates such as oxaloacetate, methylmalonyl-CoA, glutaconyl-CoA, and malonate and use the energy released to drive the extrusion of one or two sodium ions (Na+) from the cell cytoplasm. These transporters are biotin-containing multisubunit enzymes and are currently thought to be restricted to bacteria. However, on the basis of the homology of the α-δ-subunits to proteins encoded by archaeal genomes, it is proposed that NaT-DC family members may also be present in this kingdom.

1.3.3.3 Methyltransfer-Driven Transporters

The Na+-transporting methyltetrahydromethanopterin:coenzyme M methyltransferase (NaT-MMM) of the archaeon Methanobacterium thermoautotrophicum is the only characterized
multisubunit protein family of this subclass. Analysis of the coding genes revealed eight contiguous genes organized in an operon. The majority of the encoded subunits are probably integral membrane proteins.

1.3.3.4 Oxidoreduction-Driven Transporters

This subclass includes those transport systems that drive the transport of a solute (proton or ion) energized by the exothermic flow of electrons from a reduced substrate to an oxidized substrate. Transporters included in this subclass comprise two superfamilies and seven families (Table 1.4). These transporters are found in bacteria, Archaea, eukaryotic mitochondria, and chloroplasts; however, some families are restricted to particular groups.

1.3.3.5 Light Absorption-Driven Transporters

The members of these families catalyze light-driven ion translocation across microbial cytoplasmic membranes or serve as light receptors. Two families are included: the 3.E.1 ion-translocating microbial rhodopsin (MR) family and the 3.E.2 photosynthetic reaction center (PRC) family.
Table 1.4 General Characteristics of the Oxidoreduction-Driven Transporter Families

Each family or superfamily is followed by its generalized transport reaction.

3.D.1 Proton-translocating NADH dehydrogenase (NDH) family
   NADH + ubiquinone + 4H+ (in) → NAD+ + ubiquinol + 4H+ (out), or NADH + ubiquinone + 2Na+ (in) → NAD+ + ubiquinol + 2H+ (out)
3.D.2 Proton-translocating transhydrogenase (PTH) family
   NADPH + NAD+ + H+ (in) ⇌ NADP+ + NADH + H+ (out)
3.D.3 Proton-translocating quinol:cytochrome c reductase (QCR) superfamily
   quinol (QH2) + 2 cytochrome c (ox) + 2H+ (in) → quinone (Q) + 2 cytochrome c (red) + 4H+ (out)
3.D.4 Proton-translocating cytochrome oxidase (COX) superfamily
   Reaction catalyzed by cytochrome c (Cyt c) oxidases: 2Cyt c (red) + 1/2 O2 + 6H+ (in) → 2Cyt c (ox) + H2O + 4H+ (out)
   Reaction catalyzed by quinol oxidases: quinol + 1/2 O2 + 4H+ (in) → quinone + H2O + 4H+ (out)
3.D.5 Na+-translocating NADH:quinone dehydrogenase (Na-NDH) family
   NADH + quinone + nNa+ (in) → NAD+ + quinol + nNa+ (out)
3.D.6 Putative ion (H+ or Na+)-translocating NADH:ferredoxin oxidoreductase (NFO) family
   NADH + oxidized ferredoxin + n(H+ or Na+) (out) → NAD+ + reduced ferredoxin + n(H+ or Na+) (in)
3.D.7 H2:heterodisulfide oxidoreductase (HHO) family
   Reactions that are coupled to H+ translocation (extrusion):
   (1) H2 + 2-hydroxyphenazine + H+ (in) → dihydro-2-hydroxyphenazine + H+ (out)
   (2) dihydro-2-hydroxyphenazine + CoM-S-S-CoB + H+ (in) → 2-hydroxyphenazine + HS-CoM + HS-CoB + H+ (out)
3.D.8 Na+- or H+-pumping formyl methanofuran dehydrogenase (FMF-DH) family
   Formyl-MF + 2Na+ or 2H+ (in) ⇌ CO2 + MF + H2 + 2Na+ or 2H+ (out)
3.D.9 H+-translocating F420H2 dehydrogenase (F420H2DH) family
   Reduced donor (2e–; F420H2 cofactor) + 1H+ (in) ⇌ oxidized acceptor (2e–) + 1H+ (out)

1.3.4 Group Translocators

1.3.4.1 Phosphotransfer-Driven Group Translocators

This class of transporters includes the phosphoenolpyruvate:sugar PTS (Figure 1.7e). This protein system is widespread in bacteria but absent in Archaea and eukaryotic organisms [27]. The PTS transport mechanism involves the transport and concomitant phosphorylation of several sugars. This is a unique characteristic of the group translocator class of transporters. The product of the reaction is a sugar-phosphate that subsequently enters a catabolic pathway. The system includes the soluble and non-sugar-specific protein components Enzyme I (EI) and the heat-stable or phosphohistidine carrier protein (HPr). The first step of the PTS mechanism involves autophosphorylation of EI by PEP at a histidine residue. In the second step, EI transfers the phosphoryl group to a histidine residue in HPr. Then, HPr transfers the phosphoryl group to proteins called enzymes IIA and IIB, which are part of each sugar-specific PTS complex. The final component of this system, IIC (in some cases also IID), is an integral membrane protein permease that recognizes and
transports the sugar molecule, which is phosphorylated by component IIB (Figure 1.8). The PTS of each organism usually includes several different enzyme II complexes, each one usually specific for a single sugar substrate. In the case of E. coli, 26 different enzyme II complexes have been identified. It should be noted that there is significant diversity with regard to the domain composition of the sugar-specific complexes. As can be seen in Figure 1.8, the maltose/glucose PTS complex from E. coli lacks the IIA domain. Presumably, the IIA protein from the glucose complex (IIAGlc) is involved in the phosphorylation of the IIBMal domain. Also evident in Figure 1.8 is the frequent occurrence of protein fusions among enzyme II domains. Fusions including IIA-IIB, IIB-IIC, and IIA-IIB-IIC, among others, have been observed in different organisms. Analyses of bacterial genomes have revealed a high diversity with regard to PTS composition. Some bacteria have a single enzyme II complex, whereas others have several dozen. These studies have also revealed that some bacterial species completely lack genes encoding PTS components, whereas others have EI and/or HPr proteins but none of the enzyme II complexes. It is possible that in these cases the PTS proteins have a regulatory role [28].

PEP-dependent phosphorylation by the PTS results in a tight linkage between sugar transport and its subsequent metabolism (Figure 1.9). PEP can be considered a link between the Embden–Meyerhof–Parnas (EMP) glycolytic pathway and the PTS, together forming a phosphorylation circuit. PEP is a highly connected compound in the central metabolic network of all organisms. This metabolite is a precursor in several biosynthetic pathways and also participates directly in energy-generating reactions such as substrate-level phosphorylation of ADP or indirectly as a precursor of acetyl-CoA. The relative carbon flux originating from the PEP node into the different metabolic pathways has been determined experimentally.
Figure 1.9 Central pathways related to glucose transport and metabolism in Escherichia coli. Dashed lines indicate more than one biochemical reaction.

When E. coli grows in minimal medium containing glucose as the
carbon source, the PTS consumes 50% of the available PEP, whereas the reactions leading to the synthesis of oxaloacetate, pyruvate, cell wall components, and aromatic compounds consume approximately 16%, 15%, 16%, and 3%, respectively [29,30]. From these data, it is evident that PTS is a cellular activity having a major influence on the PEP/PYR ratio and the carbon flux distribution originating from these two metabolic nodes. It can be expected, then, that modification or elimination of PTS components should have a significant impact on central metabolism. PTS modification or its functional replacement has been a strategy applied in E. coli to improve the properties of production strains. Expression of PEP-independent uptake and phosphorylation activities in strains devoid of PTS activity has resulted in significant improvements in productivity and product yield for several classes of metabolites [31]. A common feature of most free-living bacteria is their capacity to select, from a mixture of carbon sources, the one that affords the highest growth rate. This response is known as carbon catabolite
repression (CCR), and it is characterized by the inhibition of sugar transport capacity, enzyme activities, and gene expression in the presence of a rapidly metabolizable carbon source [32]. Glucose repression is another name usually employed to describe this phenomenon, although other PTS sugars can also exert CCR. This phenomenon is responsible for diauxic growth, observed when bacteria are cultured in media containing more than one carbon source.

In the case of E. coli, the IIAGlc protein has a central role in CCR. When glucose is present in the medium, EI, HPr, and IIAGlc are present mainly in a nonphosphorylated state, since the phosphoryl group from PEP is transferred, via IIBCGlc, to glucose. In this condition, IIAGlc binds to various non-PTS permeases, inhibiting the uptake of non-PTS sugars (Figure 1.10a). Dephosphorylated IIAGlc can also bind to the enzyme glycerol kinase (GK), inhibiting its activity [33]. Proteins EI and HPr also have regulatory roles that are controlled by phosphorylation. Dephosphorylated EI has been shown to bind to the chemotaxis protein CheA, inhibiting its autophosphorylating activity and thus causing smooth swimming behavior [34]. When it is dephosphorylated, HPr activates the enzyme glycogen phosphorylase (GP) [35]. The protein IIBCGlc plays a role as a regulator of the PTS and thus, indirectly, in CCR. This protein interacts with the transcriptional repressor Mlc, which regulates the genes ptsHI, ptsG, mlc, manXYZ, and malT. The presence of glucose causes dephosphorylation of IIBGlc; under this condition, it binds Mlc, thus relieving repression of the genes under its control. The net effect of this response is an increased expression of the general PTS enzymes and of those involved in glucose, mannose, and maltose transport [36]. If glucose is absent from the medium, IIAGlc and IIBGlc will be mainly in their phosphorylated states (Figure 1.10b). In this condition, IIAGlc~P binds to the enzyme adenylate cyclase (AC), activating its cAMP biosynthetic capacity.
Therefore, the cAMP concentration in the cell increases; cAMP binds to the cAMP receptor protein (CRP), causing the induction of catabolite-repressed genes [37]. It should be noted that AC displays a basal level of activity even in the absence of IIAGlc~P activation. Therefore, a low level of cAMP is present in the cell when growing on glucose. In the absence of glucose, IIBGlc~P loses its capacity to bind Mlc. Thus, this regulatory protein is free to bind to its target operator sequences, causing repression of genes involved in glucose uptake. When Hpr is phosphorylated, it binds to and activates BglG, a transcriptional activator of the bgl operon that encodes proteins involved in the uptake and utilization of β-glucosidic sugars [38].
PTS forms part of a complex regulatory network involved in coordinating cellular processes related to the cell's capacity to select, transport, and metabolize a large number of carbon sources [39]. Therefore, it can be expected that direct modification of PTS components will have wide-ranging effects on the cell's physiology.
As a result of CCR, E. coli displays sequential sugar consumption when it is grown in media containing sugar mixtures, like those generated from lignocellulose hydrolysates. The simultaneous consumption of sugars present in a mixture would be advantageous in a fermentative production process, since this would eliminate diauxic growth, resulting in reduced operating time and increased productivity. Modification of components from the glucose PTS complex in E. coli has resulted in disruption of CCR. This strategy has been applied to improve strains for the production of ethanol and L-lactic acid using mixtures of glucose, arabinose, and xylose as carbon sources [11,40,41]. It is expected that similar strategies should also improve the characteristics of production strains derived from other bacterial species that have PTS.
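The regulatory logic described above can be summarized as a simple two-state lookup. The following is only a minimal sketch: it assumes the PTS proteins are either fully phosphorylated or fully dephosphorylated (the text says "mainly"), and the function name and dictionary keys are hypothetical labels chosen for illustration.

```python
def pts_regulatory_state(glucose_present: bool) -> dict:
    """Qualitative summary of PTS-mediated regulation in E. coli.

    Simplifying assumption: phosphorylation is treated as all-or-nothing,
    whereas in the cell the proteins are only *mainly* in one state.
    """
    # Without glucose, the phosphoryl group from PEP is not passed on,
    # so EI, Hpr, IIAGlc, and IIBGlc accumulate in the phosphorylated form.
    phosphorylated = not glucose_present
    return {
        "PTS proteins phosphorylated": phosphorylated,
        # Effects of the dephosphorylated proteins (glucose present)
        "non-PTS permeases inhibited by IIAGlc": not phosphorylated,
        "glycerol kinase inhibited by IIAGlc": not phosphorylated,
        "CheA autophosphorylation inhibited by EI": not phosphorylated,
        "glycogen phosphorylase activated by Hpr": not phosphorylated,
        "Mlc bound by IIBGlc": not phosphorylated,
        "Mlc-repressed genes expressed": not phosphorylated,
        # Effects of the phosphorylated proteins (glucose absent)
        "adenylate cyclase activated by IIAGlc~P": phosphorylated,
        "cAMP level": "high" if phosphorylated else "basal",
        "catabolite-repressed genes induced": phosphorylated,
        "BglG activated by Hpr~P": phosphorylated,
    }


for condition in (True, False):
    print("glucose present:", condition)
    for effect, value in pts_regulatory_state(condition).items():
        print("   ", effect, "->", value)
```

A binary model of this kind cannot capture intermediate phosphorylation levels or sugars other than glucose; it is meant only as a compact restatement of Figure 1.10.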
1.3.5 Transmembrane Electron Flow Systems
These systems are involved in the catalytic flow of electrons across a biological membrane, from donors localized to one side of the membrane to acceptors localized on the other side. These systems contribute to or subtract from the membrane potential, depending on the direction of electron flow, and are important elements in cellular energetics. According to the TCDB, members of this class are grouped in two subclasses: (1) transmembrane two-electron transfer carriers and (2) transmembrane one-electron transfer carriers (see Table 1.5).
Figure 1.10 Carbon catabolic repression mechanisms in Escherichia coli. Regulatory interactions of PTS components in the presence (a) or absence (b) of glucose.
Table 1.5 Overview of the Subclasses of the Transmembrane Electron Flow Systems
Each entry lists the class/subclass/family, followed by its mechanism and the solute(s) transported.
Transmembrane two-electron transfer carriers
5.A.1. Disulfide bond oxidoreductase D (DsbD)
Catalyzes an essentially irreversible reaction due to the fact that electrons flow down their electrochemical gradient from inside the bacterial cell (negative inside) to outside the cell (positive outside). In order to reverse the reaction, electrons are transferred from dithiol proteins in the periplasm to an electron acceptor in the cytoplasm. The overall vectorial electron transfer reaction catalyzed by DsbD is 2 e− (cytoplasm) → 2 e− (periplasm)
5.A.2. Disulfide bond oxidoreductase B (DsbB)
The DsbB family transfers electrons across the bacterial cytoplasmic membrane from periplasmic dithiol proteins to quinones in the membrane. The reduced quinones then reduce cytoplasmic electron acceptors such as O2 via cytochrome oxidases (cytochrome d and cytochrome o, respectively, in E. coli) as well as nitrate via nitrate reductase and fumarate via fumarate reductase. The generalized vectorial electron transfer reaction catalyzed by DsbB is 2 e− (cytoplasm) → 2 e− (periplasm)
5.A.3. Prokaryotic molybdopterin-containing oxidoreductase (PMO)
The membrane-bound nitrate reductase-A (NR-A) (contains Mo-molybdopterin guanine dinucleotide) and formate dehydrogenase (FDH) employ a redox loop to couple quinol oxidation to the equivalent of proton translocation, thereby generating a proton motive force (pmf) during anaerobic respiration. The net transmembrane electron transfer reactions for NR-A and FDH, and probably other homologous enzymes, are a. 2e– (out)→2e– (in) (NR-A) b. 2e– (in)→2e– (out) (FDH)
Transmembrane one-electron transfer carriers
5.B.1 gp91phox phagocyte NADPH oxidase-associated cytochrome b558 (Phox)
The human phagocyte cytochrome b558 is a heterodimeric complex consisting of a heavy (β) chain (gp91phox) and a light (α) chain (p22phox) as well as several auxiliary subunits. gp91phox is the terminal component of a respiratory chain that transfers single electrons from cytoplasmic NADPH to O2 on the external side of the plasma membrane generating superoxide. Its activity is electrogenic, causing depolarization of the membrane potential on the negative inside. The overall electron transfer reactions catalyzed by gp91phox and some of its homologues are: 1. electron (in)→electron (out) 2. O2 (out) + electron (in)→O2– (superoxide) (out)
5.B.2 Eukaryotic cytochrome b561 (Cytb561)
Bovine cytochrome b561 is a six TMS protein present in secretory vesicles (i.e., adrenal chromaffin granules) that contain catecholamines and amidated peptides. It supplies electrons from cytoplasmic L-ascorbate to intravesicular enzymes such as dopamine β-hydroxylase and α-peptide amidase. Transmembrane electron flow reactions catalyzed by Cytb561 family members are 1. L-ascorbate (cytosol) + enzyme (oxidized) (synaptic vesicle)→dehydroascorbate (cytosol) + enzyme (reduced) (synaptic vesicle) 2. L-ascorbate (cytosol) + 2Fe3+ (intestinal lumen)→dehydroascorbate (cytosol) + 2Fe2+ (intestinal lumen).
5.B.3 Geobacter nanowire electron transfer (G-NET)
Geobacter and other bacteria from diverse kingdoms have the capacity to transfer electrons from cytoplasmic electron donors to extracellular substances. Geobacter utilizes "nanowires" distantly related to type IV pili (fimbriae) to transfer electrons to iron oxide (Fe2O3; rust) to generate soluble Fe2+ and solid magnetite. An electron pair is transferred from NADH to menaquinone, and then single electrons are transferred through the chain to iron oxide via an inner membrane (IM), periplasm, outer membrane (OM) pathway as follows: NADH → NADHDH (3.D.1.4.1) → menaquinone → MacA (IM) → PpcA (periplasm) → OmcB (a trans OM protein) → OmcE (outer surface of the OM) → OmcS (outer surface of the OM) → pilin → Fe2O3.
5.B.4 Plant photosystem I supercomplex (PSI)
Water, the electron donor for photosynthesis, is oxidized to O2 and four protons by PSII. The electrons that have been extracted from water are shuttled through a quinone pool and the cytochrome b6f complex to plastocyanin—a small, soluble, copper-containing protein. Solar energy that has been absorbed by PSI induces the translocation of an electron from plastocyanin at the inner face of the membrane (thylakoid lumen) to ferredoxin on the opposite side (stroma).
References
1. Paulsen, I. T., Sliwinski, M. K., and Saier, M. H. Jr. Microbial genome analyses: global comparisons of transport capabilities based on phylogenies, bioenergetics and substrate specificities. J. Mol. Biol., 277, 573, 1998.
2. Saier, M. H. Jr. Computer-aided analyses of transport protein sequences: gleaning evidence concerning function, structure, biogenesis, and evolution. Microbiol. Rev., 58, 71, 1994.
3. Díaz, E. et al. Biodegradation of aromatic compounds by Escherichia coli. Microbiol. Mol. Biol. Rev., 65, 523, 2001.
4. Bruckner, R. and Titgemeyer, F. Carbon catabolite repression in bacteria: choice of the carbon source and autoregulatory limitation of sugar utilization. FEMS Microbiol. Lett., 209, 141, 2002.
5. Ferenci, T. Adaptation to life at micromolar nutrient levels: the regulation of Escherichia coli glucose transport by endoinduction and cAMP. FEMS Microbiol. Rev., 18, 301, 1996.
6. Hahn-Hägerdal, B. et al. Metabolic engineering of Saccharomyces cerevisiae for xylose utilization. Adv. Biochem. Eng. Biotechnol., 73, 53, 2001.
7. Flores, N. et al. Pathway engineering for the production of aromatic compounds in Escherichia coli. Nat. Biotechnol., 14, 620, 1996.
8. Yi, J. et al. Altered glucose transport and shikimate pathway product yields in E. coli. Biotechnol. Prog., 19, 1450, 2003.
9. Lin, H., Bennett, G. N., and San, K. Y. Metabolic engineering of aerobic succinate production systems in Escherichia coli to improve process productivity and achieve the maximum theoretical succinate yield. Metab. Eng., 7, 116, 2005.
10. Hernández-Montalvo, V. et al. Expression of galP and glk in a Escherichia coli PTS mutant restores glucose transport and increases glycolytic flux to fermentation products. Biotechnol. Bioeng., 83, 687, 2003.
11. Dien, B. S., Nichols, N. N., and Bothast, R. J. Fermentation of sugar mixtures using Escherichia coli catabolite repression mutants engineered for production of L-lactic acid. J. Ind. Microbiol. Biotechnol., 29, 221, 2002.
12. de Anda, R. et al. Replacement of the glucose phosphotransferase transport system by galactose permease reduces acetate accumulation and improves process performance of Escherichia coli for recombinant protein production without impairment of growth rate. Metab. Eng., 8, 281, 2006.
13. Palsdottir, H. and Hunte, C. Lipids in membrane protein structures. Biochim. Biophys. Acta, 1666, 2, 2004.
14. Cabeen, M. T. and Jacobs-Wagner, C. Bacterial cell shape. Nat. Rev. Microbiol., 3, 601, 2005.
15. Desvaux, M. et al. Protein cell surface display in Gram-positive bacteria: from single protein to macromolecular protein structure. FEMS Microbiol. Lett., 256, 1, 2006.
16. Volkman, J. K. Sterols in microorganisms. Appl. Microbiol. Biotechnol., 60, 495, 2003.
17. Singer, S. J. and Nicolson, G. L. The fluid mosaic model of the structure of cell membranes. Science, 175, 720, 1972.
18. Matsumoto, K. et al. Lipid domains in bacterial membranes. Mol. Microbiol., 6, 1110, 2006.
19. Heathcock, C. H. et al. Stereostructure of the archaebacterial C40 diol. Science, 229, 862, 1985.
20. Tornabene, T. G. and Langworthy, T. A. Diphytanyl and dibiphytanyl glycerol ether lipids of methanogenic archaebacteria. Science, 203, 51, 1979.
21. Moat, A. G., Foster, J. W., and Spector, M. P. Microbial Physiology. Wiley-Liss, New York, 2002, chapter 7, Cell structure and function.
22. Fu, D. et al. Structure/function relationships in OxlT, the oxalate-formate transporter of Oxalobacter formigenes. J. Biol. Chem., 276, 8753, 2001.
23. Ames, G. F.-L. Bacterial periplasmic transport systems: structure, mechanism, and evolution. Annu. Rev. Biochem., 55, 397, 1986.
24. Suzuki, H. et al. Probing the transmembrane potential of bacterial cells by voltage-sensitive dyes. Anal. Sci., 19, 1239, 2003.
25. Saier, M. H. Jr. A functional-phylogenetic classification system for transmembrane solute transporters. Microbiol. Mol. Biol. Rev., 64, 354, 2000.
26. Saier, M. H. Jr., Tran, C. V., and Barabote, R. D. TCDB: the Transporter Classification Database for membrane transport protein analyses and information. Nucleic Acids Res., 34, D181, 2006.
27. Postma, P. W., Lengeler, J. W., and Jacobson, G. R. Phosphoenolpyruvate: carbohydrate phosphotransferase systems. In Escherichia coli and Salmonella. Cellular and Molecular Biology, Neidhardt, F. C., Ed. American Society for Microbiology, Washington, DC, 1996.
28. Barabote, R. D. and Saier, M. H. Jr. Comparative genomic analyses of the bacterial phosphotransferase system. Microbiol. Mol. Biol. Rev., 69, 608, 2005.
29. Valle, F. et al. Basic and applied aspects of metabolic diversity: the phosphoenolpyruvate node. J. Ind. Microbiol., 17, 458, 1996.
30. Flores, S. et al. Analysis of carbon metabolism in Escherichia coli strains with an inactive phosphotransferase system by 13C labeling and NMR spectroscopy. Metab. Eng., 4, 124, 2002.
31. Gosset, G. Improvement of Escherichia coli production strains by modification of the phosphoenolpyruvate:sugar phosphotransferase system. Microb. Cell Fact., 4, 14, 2005.
32. Saier, M. H., Ramseier, T. M., and Reizer, J. Regulation of carbon utilization. In Escherichia coli and Salmonella. Cellular and Molecular Biology, Neidhardt, F. C., Ed. American Society for Microbiology, Washington, DC, 1996.
33. Novotny, M. J. et al. Allosteric regulation of glycerol kinase by enzyme IIIglc of the phosphotransferase system in Escherichia coli and Salmonella typhimurium. J. Bacteriol., 162, 816, 1985.
34. Lux, R. et al. Coupling the phosphotransferase system and the methyl-accepting chemotaxis protein-dependent chemotaxis signaling pathways of Escherichia coli. Proc. Natl. Acad. Sci., 92, 11583, 1995.
35. Seok, Y. J. et al. Regulation of E. coli glycogen phosphorylase activity by HPr. J. Mol. Microbiol. Biotechnol., 3, 385, 2001.
36. Plumbridge, J. Regulation of gene expression in the PTS in Escherichia coli: the role and interactions of Mlc. Curr. Opin. Microbiol., 5, 187, 2002.
37. Korner, H., Sofia, H. J., and Zumft, W. G. Phylogeny of the bacterial superfamily of Crp-Fnr transcription regulators: exploiting the metabolic spectrum by controlling alternative gene programs. FEMS Microbiol. Rev., 27, 559, 2003.
38. Görke, B. and Rak, B. Catabolite control of Escherichia coli regulatory protein BglG activity by antagonistically acting phosphorylations. EMBO J., 18, 3370, 1999.
39. Saier, M. H. Jr. and Reizer, J. The bacterial phosphotransferase system: new frontiers 30 years later. Mol. Microbiol., 13, 755, 1994.
40. Lindsay, S. E., Bothast, R. J., and Ingram, L. O. Improved strains of recombinant Escherichia coli for ethanol production from sugar mixtures. Appl. Microbiol. Biotechnol., 43, 70, 1995.
41. Hernández-Montalvo, V. et al. Characterization of sugar mixtures by an Escherichia coli mutant devoid of the phosphotransferase system. Appl. Microbiol. Biotechnol., 57, 186, 2001.
2 Catabolism and Metabolic Fueling Processes
Olubolaji Akinterinwa and Patrick C. Cirino, Pennsylvania State University
2.1 Introduction
2.2 Classification of Organisms
2.3 Thermodynamics of Fueling Processes
2.4 Products of Fueling Processes
Precursor Metabolites • ATP • NADPH
2.5 Redox Potentials and Mobile Electron Carriers
Redox Potentials • Mobile Electron Carriers • Quinones • Flavoproteins • Iron-Sulfur Proteins • Cytochromes
2.6 Examples of Catabolic Processes in Different Organisms
Photosynthesis • The Central Pathways • Anaplerotic and Peripheral Reactions • Fermentation • Oxidative Phosphorylation • Key Parameters Influencing Regulation of Catabolic Pathways
2.7 Concluding Remarks
Acknowledgment
References
2.1 Introduction
Comparative DNA and ribosomal RNA sequencing provides evidence of three phylogenetically distinct domains of organisms: bacteria, archaea, and eukarya [1]. Although all domains are believed to have originated from a common ancestor [1,2], there is an amazing diversity of mechanisms by which these different life forms harness energy from their environment to fuel essential metabolic processes that enable survival. This metabolic diversity is a testament to the power of evolution and the adaptability of life. Metabolism can be defined as the sum of processes involved in energy conversions in the cell. These processes regulate cellular conditions such that a state of metabolic homeostasis, i.e., a stable supply of energy and metabolites, is maintained. Metabolic processes are organized into complex sequences of controlled chemical reactions referred to as metabolic pathways, and many different pathways are responsible for nutrient processing, energy acquisition, and energy conversion in the cell. These metabolic processes can be broadly categorized as catabolic and anabolic. Catabolic processes are responsible for the production of energy, reducing power, and precursor molecules. They usually involve the breakdown of complex molecules into simpler molecules. In contrast, anabolic processes consume energy, reducing power, and precursor molecules for the synthesis of complex biomolecules such as proteins, nucleic acids, and membranes. This chapter provides an overview of the predominant catabolic/fueling metabolic processes found in nature and particularly in microorganisms. Fueling reactions collectively represent the most diverse of all processes within the cell [3,4]. Since catabolic mechanisms vary widely from one organism to another, we emphasize major pathways and products, energy yields, and efficiency of energy conversion. The well-studied metabolism of Escherichia coli serves to demonstrate many of the catabolic paradigms described.
2.2 Classification of Organisms
Catabolic processes require an initial source of energy in the form of light or chemical energy. Organisms that utilize light energy directly are called phototrophs while those that utilize chemical energy (via chemical reactions) are called chemotrophs. An overall mutualistic relationship exists between phototrophs and chemotrophs such that chemotrophs primarily derive chemical energy from substrates ultimately originating from phototrophic metabolism via photosynthesis. During photosynthesis, light energy (usually solar energy) is converted to chemical energy while CO2 is captured. In addition to requiring energy, catabolic processes also require reducing power (electrons) and carbon. The sources of electrons and carbon are used as parameters to further classify organisms as lithotrophs or organotrophs and autotrophs or heterotrophs, respectively. Lithotrophs obtain reducing equivalents from inorganic sources while organotrophs obtain reducing equivalents from organic sources. Autotrophs utilize CO2 as a carbon source or initial substrate while heterotrophs utilize carbon compounds other than CO2. Sugars such as glucose, xylose, and maltose are examples of organic sources of electrons and carbon. Examples of inorganic electron sources include H2, H2O, H2S, NH3, S, S2O3^2−, and Fe. A variety of combinations of energy, electron, and carbon sources is used by organisms, as shown in Table 2.1. Organisms can also be classified based on their ability to utilize oxygen for respiration or their sensitivity to oxygen. Strict/obligate aerobes require oxygen for respiration and cannot survive in the absence of oxygen. Strict/obligate anaerobes such as Clostridium spp. are sensitive to oxygen or reactive oxygen species, cannot grow in the presence of oxygen, and instead grow fermentatively or respire anaerobically.
Facultative aerobes and anaerobes are more flexible with their oxygen requirements and can survive in the absence or presence of oxygen. Examples of facultative aerobes include E. coli and Bacillus anthracis. Aerotolerant anaerobes do not respire on oxygen but are able to grow in its presence.
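The combined designations mentioned in the note to Table 2.1 are built by concatenating the prefixes for energy, electron, and carbon source. The sketch below illustrates that naming convention only; the function and dictionary names are hypothetical, and mixotrophs (which draw on both light and chemical energy) are deliberately left out.

```python
# Prefixes for each basis of classification (see Table 2.1).
ENERGY_PREFIX = {"light": "photo", "chemical": "chemo"}
ELECTRON_PREFIX = {"inorganic": "litho", "organic": "organo"}
CARBON_PREFIX = {"CO2": "auto", "organic": "hetero"}


def classify(energy: str, electrons: str, carbon: str) -> str:
    """Build a combined trophic designation from the three source types."""
    return (
        ENERGY_PREFIX[energy]
        + ELECTRON_PREFIX[electrons]
        + CARBON_PREFIX[carbon]
        + "troph"
    )


# A glucose-grown organism that uses glucose for energy, electrons, and carbon:
print(classify("chemical", "organic", "organic"))  # chemoorganoheterotroph
# An organism using light, an inorganic electron donor such as H2O, and CO2:
print(classify("light", "inorganic", "CO2"))       # photolithoautotroph
```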
2.3 Thermodynamics of Fueling Processes
Fueling reactions essentially convert energy extracted from a substrate into a form that is more readily available and useful for driving anabolic processes. The energy yield of a fueling process depends on the substrate being decomposed and is quantified by the change in Gibbs free energy (G) for the conversions involved in the process:

$$\Delta G = G_{\text{products}} - G_{\text{reactants}} \tag{2.1}$$
Fueling reactions which are exergonic (energy releasing) have negative ∆G values while those which are endergonic (energy consuming) have positive ∆G values. An estimate of ∆G for a chemical reaction can be obtained from the standard Gibbs free energies (G'0) of substrates, which have been reported in the literature. The standard Gibbs free energy change (∆G'0) applies to reactions occurring at standard temperature (T = 298 K) and pH = 7, with starting substrate concentration(s) of 1 M. The relationship between ∆G'0 and the actual ∆G of a reaction is:

$$\Delta G = \Delta G'^{0} + RT \ln Q \tag{2.2}$$

where Q is the mass action ratio, defined as the ratio of activities of products to reactants, and in the case of dilute aqueous solutions is often estimated as the ratio of concentrations of products to reactants [5]. R is the gas constant (R = 8.314 J/(mole·K)) and T is the absolute temperature (K). For reactions at equilibrium, ∆G = 0, so that:

$$\Delta G'^{0} = -RT \ln K_{eq} \tag{2.3}$$

where Keq is the equilibrium constant.

Table 2.1 Common Classifications of Organisms

| Basis of Classification | Designation | Description |
|---|---|---|
| Source of carbon | i. Autotroph | Utilizes CO2 as source of carbon |
| | ii. Heterotroph | Utilizes organic molecules such as sugars as source of carbon |
| Source of electrons | i. Organotroph | Obtains electrons from organic compounds such as glucose, maltose |
| | ii. Lithotroph | Obtains electrons from inorganic compounds such as H2, H2O, H2S, NH3, and S |
| Source of energy | i. Phototroph | Obtains energy from light (usually solar energy) |
| | ii. Chemotroph | Obtains energy from oxidation of chemical compounds |
| | iii. Mixotroph | Obtains energy from light and chemical reactions |

Note: Further classification involves combinations of those listed, e.g., photoheterotrophs, chemoheterotrophs, photoautotrophs, and chemoautotrophs.

Fueling processes are characterized by high-energy group transfer reactions which promote the energetic favorability of substrate decomposition. These high-energy group transfers are typically in the form of phosphorylation, transacylation, condensation, or oxidation-reduction (redox) reactions. Phosphorylation reactions are used to initiate fueling processes since they produce substrate derivatives which cannot diffuse out of the cell (and energy is therefore not required to retain the substrate inside the cell). Phosphorylation of a substrate additionally places it in a more labile form for metabolism such that the activation energy of subsequent enzyme-catalyzed conversions is lowered and the specificity of enzymatic reactions is promoted [5]. A good illustration is the initial step of glycolysis, where glucose is phosphorylated to glucose-6-phosphate in the two coupled half-reactions shown below:
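i. Nonspontaneous/endergonic half-reaction:

$$\text{P}_i + \text{Glucose} \leftrightarrow \text{Glucose-6-phosphate} + \text{H}_2\text{O} \qquad \Delta G'^{0}_{1} = +14\ \text{kJ/mole} \tag{2.4}$$

ii. Spontaneous/exergonic half-reaction:

$$\text{ATP} + \text{H}_2\text{O} \leftrightarrow \text{ADP} + \text{P}_i \qquad \Delta G'^{0}_{2} = -31\ \text{kJ/mole} \tag{2.5}$$

where Pi is inorganic phosphate. The overall reaction is:

$$\text{ATP} + \text{Glucose} \rightarrow \text{ADP} + \text{Glucose-6-phosphate} \qquad \Delta G'^{0}_{3} = \Delta G'^{0}_{1} + \Delta G'^{0}_{2} = -17\ \text{kJ/mole} \tag{2.6}$$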
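The additivity of the half-reactions and the relationships in Equations 2.2 and 2.3 can be checked numerically. The following is a minimal sketch using the rounded ∆G'0 values quoted above and T = 298 K; the function names are illustrative, not part of any library.

```python
import math

R = 8.314e-3   # gas constant in kJ/(mole·K)
T = 298.0      # standard temperature in K


def delta_g(delta_g0: float, q: float, temperature: float = T) -> float:
    """Equation 2.2: actual free energy change (kJ/mole) for mass action ratio q."""
    return delta_g0 + R * temperature * math.log(q)


def k_eq(delta_g0: float, temperature: float = T) -> float:
    """Equation 2.3 rearranged: equilibrium constant from the standard free energy change."""
    return math.exp(-delta_g0 / (R * temperature))


dg1 = 14.0        # glucose phosphorylation by Pi (Equation 2.4)
dg2 = -31.0       # ATP hydrolysis (Equation 2.5)
dg3 = dg1 + dg2   # coupled reaction (Equation 2.6)

print(dg3)        # -17.0
print(k_eq(dg1))  # much less than 1: the uncoupled reaction favors reactants
print(k_eq(dg3))  # on the order of 10^3: coupling to ATP favors products
# At equilibrium (Q = Keq) the actual free energy change vanishes:
print(delta_g(dg3, k_eq(dg3)))
```

Coupling shifts the equilibrium constant by several orders of magnitude, which is the quantitative content of the statement that ATP hydrolysis "drives" the phosphorylation of glucose.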
Energy obtained from fueling reactions is efficiently captured in the chemical bonds of compounds such as nicotinamide adenine dinucleotide (NADH), nicotinamide adenine dinucleotide phosphate (NADPH), and adenosine triphosphate (ATP) via reversible redox and phosphorylation reactions. The reaction step in the "pay-off" phase of glycolysis in which glyceraldehyde-3-phosphate is converted to 3-phosphoglycerate is an example of a fueling reaction that supplies ATP through substrate-level phosphorylation in many organisms (refer to Section 2.6.4). The overall reaction is written as:

$$\text{Glyceraldehyde-3-phosphate} + \text{ADP} + \text{P}_i + \text{NAD}^{+} \leftrightarrow \text{3-Phosphoglycerate} + \text{ATP} + \text{NADH} + \text{H}^{+} \qquad \Delta G'^{0} = -12.5\ \text{kJ/mole} \tag{2.7}$$

Coupled reactions are very common in catabolic processes. Endergonic reactions are coupled to exergonic reactions such as ATP hydrolysis or NAD(P)H oxidation, and their free energy changes are additive. By coupling reactions, energy from the exergonic reaction shifts the endergonic reaction away from equilibrium and toward product formation. Bioenergetics of catabolic and energy-producing pathways is discussed in greater detail in Section 2.6.5.
2.4 Products of Fueling Processes
Fueling processes produce precursor metabolites, high-energy group-transfer bonds, and reducing power (primarily NADPH) which are used in anabolic processes. For example, the requirements for production of 1 g of E. coli include approximately 13.4 mmole of precursor metabolites in the specific proportions needed by the cell (refer to Table 2.2), 41.2 mmole of high-energy phosphate bonds (denoted as ~P), and 18.5 mmole of NADPH [4]. The productivity of a fueling process depends on the nature and availability of substrates, intracellular and extracellular conditions, and the redox potentials of available electron donors and acceptors. Precursor metabolites, ATP, NADH, and NADPH are described below.
2.4.1 Precursor Metabolites
The precursor metabolites obtained from fueling reactions serve as the major link between catabolic and anabolic processes. They are used in the synthesis of macromolecular subunits such as amino acids, lipids, enzymes, and nucleotides. They can alternatively be oxidized for ATP synthesis. Central metabolism produces 12 precursor metabolites that are essential to biosynthesis [6]. These central pathways include the Embden–Meyerhof–Parnas pathway (EMP, also known as glycolysis), the Entner–Doudoroff pathway (ED), the pentose phosphate pathway (PPP), and the tricarboxylic acid (TCA) cycle. More information on the central pathways is presented in Section 2.6.2. Table 2.2 lists 12 important precursor metabolites, their main sources, and the required proportions for synthesis of 1 g of E. coli.
Table 2.2 Twelve Major Precursor Metabolites and the Primary Pathway Used for Their Production

| Precursor Metabolite | Metabolic Pathway | Amount Required for Biosynthesis of E. coli (µmol/g cell) |
|---|---|---|
| Glucose-6-phosphate | Glycolysis | 205 |
| Fructose-6-phosphate | Glycolysis | 71 |
| Triose phosphate | Glycolysis | 129 |
| 3-Phosphoglycerate | Glycolysis | 1,496 |
| Phosphoenolpyruvate | Glycolysis | 519 |
| Pyruvate | Glycolysis | 2,833 |
| Ribose-5-phosphate | PPP | 898 |
| Erythrose-4-phosphate | PPP | 361 |
| Acetyl-CoA | TCA cycle | 3,748 |
| α-Ketoglutarate | TCA cycle | 1,079 |
| Oxaloacetate | TCA cycle | 1,787 |
| Succinyl-CoA (a) | TCA cycle | – |

Source: Ingraham, J.L., Maaloe, O., and F.C. Neidhardt. Growth of the Bacterial Cell, Sinauer Associates, Inc, Sunderland, MA, 1983. With permission.
Note: Also included are the specific proportions of these metabolites required in the synthesis of E. coli.
(a) Succinyl-CoA is a precursor metabolite in the synthesis of tetrapyrroles.

2.4.2 ATP
During metabolism, high-energy intermediate compounds spontaneously transfer specific chemical groups, such as phosphoryl or acyl groups, to other compounds, and hence are said to have high "group-transfer" potential. Hydrolysis of ATP and other high-energy compounds yields a strongly negative ∆G (usually more negative than −25 kJ/mole [5]), and is usually coupled to endergonic reactions. Table 2.3 lists the free energy of hydrolysis of some high-energy compounds.

Table 2.3 Free Energy of Hydrolysis of Important High-Energy Compounds in Cellular Metabolism

| Compound | ∆G'0 of Hydrolysis (kJ/mol) |
|---|---|
| Acetyl-CoA | −31.4 |
| Phosphoenolpyruvate (PEP) | −61.9 |
| Phosphocreatine | −43.0 |
| Pyrophosphate | −19.2 |
| ATP (to ADP) | −30.5 |
| ATP (to AMP) | −45.6 |
| ADP (to AMP) | −32.8 |
| AMP (to Adenosine + Pi) | −14.2 |
| Glucose-6-phosphate | −13.8 |
| Fructose-6-phosphate | −15.9 |
| Glycerol-1-phosphate | −9.2 |

Source: Nelson D.L. and M.M. Cox, Lehninger Principles of Biochemistry, 4th ed., W.H. Freeman and Company, New York, 2005. With permission.

ATP (shown in Figure 2.1) is the most commonly used high-energy compound in the cell. The symbol "~" is used to denote high-energy bonds, hence ATP is also written as A–P~P~P. ATP is generated within the cell by phosphorylation of ADP through substrate-level phosphorylation and oxidative phosphorylation (refer to Sections 2.6.4 and 2.6.5). The intermediate ∆G for ATP hydrolysis, along with its high phosphate group-transfer potential, makes it the universal energy currency in metabolic reactions. ATP hydrolysis involves the removal of one or two of its phosphates as shown:
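
$$\text{ATP} + \text{H}_2\text{O} \leftrightarrow \text{ADP} + \text{P}_i + \text{Energy} \qquad \Delta G'^{0} = -30.5\ \text{kJ/mole} \tag{2.8}$$

$$\text{ADP} + \text{H}_2\text{O} \leftrightarrow \text{AMP} + \text{P}_i \qquad \Delta G'^{0} = -32.8\ \text{kJ/mole} \tag{2.9}$$

$$\text{ATP} + \text{H}_2\text{O} \leftrightarrow \text{AMP} + \text{PP}_i \qquad \Delta G'^{0} = -45.6\ \text{kJ/mole} \tag{2.10}$$

$$\text{AMP} + \text{H}_2\text{O} \leftrightarrow \text{Adenosine} + \text{P}_i \qquad \Delta G'^{0} = -14.2\ \text{kJ/mole} \tag{2.11}$$

$$\text{PP}_i \leftrightarrow 2\,\text{P}_i \qquad \Delta G'^{0} = -19.2\ \text{kJ/mole} \tag{2.12}$$
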
where Pi is a phosphate group and PPi is a pyrophosphate group. All values in Equations 2.8–2.12 were obtained from Ref. [5]. Structures of ATP, ADP, and AMP are shown in Figure 2.1. Hydrolysis of phosphoanhydride bonds in ATP is strongly exergonic because the resonance stabilization of hydrolysis products exceeds the resonance stabilization of the original compound. Also, the electrostatic repulsions between negatively charged phosphate oxygen atoms favor their separation. ATP hydrolysis is slow in the absence of an enzyme catalyst, and AMP and ADP levels regulate the hydrolysis of ATP in the cell (i.e., ATP hydrolysis is most favorable at low concentrations of ADP and AMP). The actual free energy change of ATP hydrolysis under physiological conditions is different from the standard free energy change due to the unequal cellular concentrations of ATP, ADP, and Pi, their ionization states at cellular pH, and their formation of complexes with Mg2+:

$$\text{HATP}^{3-} \leftrightarrow \text{ATP}^{4-} + \text{H}^{+} \qquad pK'_{1} = 6.95 \tag{2.13}$$

$$\text{HADP}^{2-} \leftrightarrow \text{ADP}^{3-} + \text{H}^{+} \qquad pK'_{2} = 6.88 \tag{2.14}$$

$$\text{H}_2\text{PO}_4^{-} \leftrightarrow \text{HPO}_4^{2-} + \text{H}^{+} \qquad pK'_{3} = 7.20 \tag{2.15}$$

$$\text{Mg}^{2+} + \text{ATP}^{4-} \leftrightarrow \text{MgATP}^{2-} \tag{2.16}$$

$$\text{Mg}^{2+} + \text{ADP}^{3-} \leftrightarrow \text{MgADP}^{-} \tag{2.17}$$

$$\text{Mg}^{2+} + \text{HPO}_4^{2-} \leftrightarrow \text{MgHPO}_4 \tag{2.18}$$

Figure 2.1 Structures of adenosine monophosphate (AMP), adenosine diphosphate (ADP), and adenosine triphosphate (ATP). Adenosine is linked to phosphate through an ester bond. Other phosphate groups are linked by phosphoanhydride bonds.
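As a rough numerical illustration of the concentration and ionization effects described above, Equation 2.2 can be applied to ATP hydrolysis (Equation 2.8), and the pK' values of Equations 2.13 through 2.15 can be converted into the fraction of each species that is deprotonated at a given pH. The concentrations below are hypothetical placeholders, not measured cellular values, and Mg2+ complexation (Equations 2.16 through 2.18) is ignored in this sketch.

```python
import math

R = 8.314e-3      # gas constant in kJ/(mole·K)
T = 298.0         # temperature in K
DG0_ATP = -30.5   # standard free energy of ATP hydrolysis, kJ/mole (Equation 2.8)


def atp_hydrolysis_dg(atp: float, adp: float, pi: float, temperature: float = T) -> float:
    """Actual free energy change (kJ/mole) of ATP -> ADP + Pi; concentrations in M."""
    q = (adp * pi) / atp   # mass action ratio, water activity taken as 1
    return DG0_ATP + R * temperature * math.log(q)


def fraction_deprotonated(pk: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of an acid present in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


# Hypothetical concentrations chosen only for illustration (M).
dg = atp_hydrolysis_dg(atp=3e-3, adp=3e-4, pi=5e-3)
print(f"dG of ATP hydrolysis: {dg:.1f} kJ/mole")

# Fraction of each species in its more highly charged form at pH 7.0.
for name, pk in (("ATP4-", 6.95), ("ADP3-", 6.88), ("HPO4 2-", 7.20)):
    print(name, round(fraction_deprotonated(pk, 7.0), 3))
```

With these placeholder concentrations the actual free energy change is considerably more negative than the standard value, because the mass action ratio is far below 1.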
Other high-energy compounds found in organisms include cytidine triphosphate (CTP), uridine triphosphate (UTP), guanosine triphosphate (GTP), phosphoenolpyruvate (PEP), acyl thioesters such as acetyl-coenzyme A (AcCoA), phosphocreatine in vertebrates, and phosphoarginine in invertebrates. High-energy compounds other than ATP, with higher or lower ∆Gs of hydrolysis and lower phosphate group-transfer potential, are used only in specific metabolic reactions. For example, CTP is used mainly in phospholipid synthesis, UTP is used for complex carbohydrate synthesis (such as glycogen and cellulose), and phosphocreatine is used mainly in skeletal muscle to provide chemical energy for intense work and to ensure a constant supply of ATP.
2.4.3 NADPH
Nicotinamide adenine dinucleotide (NAD+, NADH) and nicotinamide adenine dinucleotide phosphate (NADP+, NADPH) belong to a class of molecules called mobile electron carriers (refer to Section 2.5) [5]. These water-soluble molecules migrate between enzymes and serve as coenzymes or enzyme cofactors in biological redox reactions. NADP+ is similar to NAD+, with the addition of a phosphate group at the adenosine ribosyl C2 position. Reducing power extracted from substrates during oxidative fueling reactions is commonly stored in the nicotinamide ring of NAD(P)+ and the reduction is represented by the following reactions:

$$\text{NAD}^{+} + 2\text{H}^{+} + 2e^{-} \leftrightarrow \text{NADH} + \text{H}^{+} \qquad \Delta G'^{0} \approx +61.9\ \text{kJ} \tag{2.19}$$

$$\text{NADP}^{+} + 2\text{H}^{+} + 2e^{-} \leftrightarrow \text{NADPH} + \text{H}^{+} \qquad \Delta G'^{0} \approx +62.5\ \text{kJ} \tag{2.20}$$
Two hydrogen atoms are removed from the substrate, with one hydride ion (H:−) transferred to the nicotinamide ring and the other hydrogen released into solution as a free proton (H+). Figure 2.2 shows the structures of NAD(P)(H). NADP+ reduction occurs most commonly through the oxidative pentose phosphate cycle and the TCA cycle, while NADH is generated from NAD+ through glycolysis and the TCA cycle. In many organisms, reducing equivalents can be transferred between cofactors by the transhydrogenase (THD)-catalyzed reaction:
NADPH + NAD+ ←[transhydrogenase]→ NADP+ + NADH    (2.21)
The above reaction (Keq ~ 1) is catalyzed by a cytosolic THD. Another membrane-bound, proton-translocating THD complex couples a proton gradient to the hydride transfer reaction, effectively shifting the equilibrium ratios of oxidized and reduced cofactors [7]:
[H+]in + NADPH + NAD+ ←[transhydrogenase]→ NADP+ + NADH + [H+]out    (2.22)
Microbes often contain one of the two THD isoforms, and the membrane version is commonly found in mitochondria. Only the Enterobacteriaceae are known to contain both THDs. Information on the structural, catalytic and kinetic properties of transhydrogenases is available [7–10]. In prokaryotes (especially Enterobacteriaceae), the physiological role of transhydrogenases remains an intriguing area of study [11] and the importance of these enzymes in engineering cofactor-dependent metabolic pathways cannot be overstated (several noteworthy examples include Refs. [11–15]). Cofactor-dependent enzymes are typically specific for either NAD(H) or NADP(H), although some redox enzymes (e.g., many aldo-keto reductases) are capable of utilizing both cofactors. NAD+ serves as an intermediate electron acceptor in catabolic reactions and as an electron shuttle in the respiratory chain. In contrast, NADPH typically serves as a hydride donor for polymerization and biosynthetic (anabolic) reactions such as fatty acid biosynthesis, steroid biosynthesis, and nucleic acid biosynthesis. NADPH additionally serves as a direct antioxidant that helps to protect eukaryotic cells from the damaging effects of oxygen radicals and thereby fight against diseases including atherosclerosis, cancer, and neurodegenerative disorders [5,16].
Figure 2.2 Structures of NAD(P)(H). Nicotinamide adenine dinucleotides serve as electron-shuttling coenzymes in metabolic reactions. NADP+ is similar to NAD+, with the addition of a phosphate group at the adenosine ribosyl C2 position.
2.5 Redox Potentials and Mobile Electron Carriers
2.5.1 Redox Potentials
The redox potential of a reaction is an important measure of its electron donating ability. Half-reactions involving molecules with low (usually negative) reduction potentials will donate electrons to those with high (positive) reduction potentials. In general, molecules with large negative reduction potentials are good electron donors and those with large positive reduction potentials are good electron acceptors. Table 2.4 lists some half-reactions occurring within cells, and their corresponding standard redox potentials. The relationship between the standard redox potential (E′0) and the actual redox potential (E) is:
$$E = E'_0 + \frac{RT}{nF}\,\ln\frac{[\text{Electron acceptor}]}{[\text{Electron donor}]} \qquad (2.23)$$
where R is the gas constant (8.314 J/(mole·K)), T is the absolute temperature, n is the number of electrons transferred per molecule, and F is the Faraday constant (F = 96,485 Coulombs/mole). For a redox reaction,
$$\Delta E'_0 = E'_{0,\,\text{acceptor}} - E'_{0,\,\text{donor}} \qquad (2.24)$$
∆E′0 is proportional to the free energy change (∆G′0) as follows:
$$\Delta G'_0 = -nF\,\Delta E'_0 \ \ \text{(J)} \qquad (2.25)$$
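Equation 2.23 can be sketched numerically as follows. This is a minimal illustration, assuming ideal behavior at 298 K; the function name and the concentration ratios used in the demonstration are hypothetical, and the NAD+/NADH standard potential (−320 mV) is taken from Table 2.4.

```python
import math

R = 8.314    # gas constant, J/(mol K)
F = 96485.0  # Faraday constant, C/mol


def actual_redox_potential_mv(e0_mv, acceptor, donor, n, temp_k=298.0):
    """Equation 2.23: E = E'0 + (RT/nF) ln([acceptor]/[donor]), in mV.

    acceptor and donor are concentrations of the oxidized and reduced
    forms (any consistent unit, since only their ratio matters).
    """
    shift_v = (R * temp_k / (n * F)) * math.log(acceptor / donor)
    return e0_mv + 1000.0 * shift_v


if __name__ == "__main__":
    # Hypothetical ratios of [NAD+]/[NADH]; E'0 = -320 mV (Table 2.4), n = 2.
    for ratio in (0.1, 1.0, 10.0):
        e = actual_redox_potential_mv(-320.0, ratio, 1.0, n=2)
        print(f"[NAD+]/[NADH] = {ratio:>4}: E = {e:.1f} mV")
```

For a two-electron couple at 298 K, each tenfold change in the acceptor-to-donor ratio shifts E by roughly 30 mV, so the actual potential in the cell can differ appreciably from the tabulated standard value.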
An example calculation is given below, for the reduction of molecular oxygen (higher electrode potential) by FADH2 (lower electrode potential):
Half-reactions:
FAD + 2H+ + 2e− → FADH2    E1′0 = −219 mV    (2.26)
½O2 + 2H+ + 2e− → H2O    E2′0 = +816 mV    (2.27)
Overall reaction:
FADH2 + ½O2 → FAD + H2O    ∆E′0 = +1035 mV    (2.28)
The ∆G′0 of the overall reaction is:
$$\Delta G'_0 = -2 \times 96{,}485 \times 1.035\ \text{J} \approx -200\ \text{kJ} \qquad (2.29)$$
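The worked example above (Equations 2.24 to 2.29) can be reproduced with a few lines of code. This is a sketch only; the function name is an assumption made for illustration, and potentials are entered in mV as they appear in the text.

```python
F = 96485.0  # Faraday constant, C/mol


def delta_g0_kj(e0_acceptor_mv, e0_donor_mv, n):
    """Standard free energy change (kJ/mol) from standard redox potentials.

    Applies Equation 2.24 (delta E'0 = E'0 acceptor - E'0 donor) and
    Equation 2.25 (delta G'0 = -n F delta E'0).
    """
    delta_e_v = (e0_acceptor_mv - e0_donor_mv) / 1000.0  # mV -> V
    return -n * F * delta_e_v / 1000.0                   # J -> kJ


if __name__ == "__main__":
    # FADH2 oxidized by O2, using the half-reaction values in Eqs 2.26 and 2.27.
    print(f"FADH2 -> O2: {delta_g0_kj(816.0, -219.0, n=2):.1f} kJ/mol")
```

A negative result indicates that electron transfer from the donor couple to the acceptor couple is favorable under standard conditions; swapping the two arguments gives the same magnitude with a positive sign.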
2.5.2 Mobile Electron Carriers
Enzyme-catalyzed redox reactions are at the heart of catabolic processes, facilitating the conversion of substrate chemical energy to cellular energy currency. Hydrogen atoms and electrons obtained from redox reactions are transported in the cell by mobile electron carriers which include the flavoproteins, quinones, iron-sulfur proteins, cytochromes, and NAD(P)(H). The mobile electron carrier used in a particular redox reaction is dependent on the reduction potential and the chemical structure of the carrier.
Table 2.4 Standard Redox Potentials of Some Cellular Reactions

| Couple | E′0 (mV) |
| --- | --- |
| Ferredoxin(Fe3+)/ferredoxin(Fe2+) (spinach) | −432 |
| CO2/formate | −432 |
| H+/H2 (pH = 7) | −410ᵃ |
| Ferredoxin (oxidized/reduced) in Clostridium | −410 |
| Acetoacetate/hydroxybutyrate | −346 |
| NADP+/NADPH | −324 |
| NAD+/NADH | −320 |
| FeS (oxidized/reduced) in mitochondria | −305 |
| Lipoic acid/dihydrolipoic acid | −290 |
| S0/H2S | −270 |
| FAD/FADH2 | −220ᵇ |
| Acetaldehyde/ethanol | −197 |
| FMN/FMNH2 | −190 |
| Pyruvate/lactate | −185 |
| Oxaloacetate/malate | −170 |
| Menaquinone (oxidized/reduced) | −74 |
| Cyt b558 (oxidized/reduced) | −75 to −43 |
| Crotonyl-CoA/butyryl-CoA | −15 |
| Fumarate/succinate | +33 |
| Ubiquinone (oxidized/reduced) | +40 to +100 |
| Cytochrome b556 (oxidized/reduced) | +46 to +129 |
| Cytochrome b562 (oxidized/reduced) | +125 to +260 |
| Cytochrome d (oxidized/reduced) | +260 to +280 |
| Cytochrome c (oxidized/reduced) | +250 |
| FeS (oxidized/reduced) in mitochondria | +280 |
| Cytochrome a (oxidized/reduced) | +290 |
| O2/H2O2 | +295 |
| Cytochrome c555 (oxidized/reduced) | +355 |
| Cytochrome a3 (oxidized/reduced) in mitochondria | +385 |
| NO3−/NO2− | +421 |
| Fe3+/Fe2+ | +771 |
| O2 (1 atm)/H2O | +815 |

Source: White, D. “Standard Electrode Potentials” from The Physiology and Biochemistry of Prokaryotes, Oxford University Press, New York, table 4.1, pg. 104, 2000. With permission. Some data also obtained from Nelson, D. and Cox, M. Lehninger Principles of Biochemistry. W.H. Freeman and Company, New York, 2005.
Note: In developing the values for the standard redox potentials, a hydrogen electrode is arbitrarily designated as the reference, with [H+] = 1 M. The standard redox potential (E′0) is based on assumptions of 1 M of electron acceptor and electron donor, pH = 7 and standard conditions of temperature and pressure (298 K and 1 atm).
ᵃ When this half reaction occurs at pH = 0, E′0 is 0.0.
ᵇ This reaction has different E′0 depending on whether FAD is free or if it is bound tightly to a flavoprotein. The value given is for free FAD.
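The rule that electrons flow from couples with lower standard potentials to couples with higher ones can be illustrated with a handful of entries from Table 2.4. The sketch below is illustrative only: the dictionary holds a small subset of the table, the function names are hypothetical, and the comparison applies to standard conditions rather than to actual intracellular concentrations.

```python
# Subset of Table 2.4 (E'0 in mV); single values only, ranges omitted.
E0_MV = {
    "NAD+/NADH": -320,
    "FAD/FADH2": -220,
    "Fumarate/succinate": 33,
    "Cytochrome c": 250,
    "O2/H2O": 815,
}


def can_donate(donor_couple, acceptor_couple, table=E0_MV):
    """True if electron transfer donor -> acceptor is favorable at standard state."""
    return table[donor_couple] < table[acceptor_couple]


def order_by_potential(table=E0_MV):
    """Couples sorted from best electron donor to best electron acceptor."""
    return sorted(table, key=table.get)


if __name__ == "__main__":
    print(order_by_potential())
    print(can_donate("NAD+/NADH", "O2/H2O"))
    print(can_donate("Fumarate/succinate", "NAD+/NADH"))
```

Sorting the couples this way reproduces the general ordering of carriers in a respiratory chain, with NADH at the donor end and O2 at the acceptor end.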
2.5.3 Quinones
Quinones serve as electron acceptors in the electron transport chains (ETCs) of photosynthetic photosystems I and II and of aerobic and anaerobic respiration [6]. These hydrophobic, lipid-soluble electron carriers transport hydrogen and electrons to and from membrane-bound protein complexes. Three types of quinones are commonly found in ETCs:
1. Ubiquinone (coenzyme Q/CoQ): found in the mitochondrial and bacterial ETCs and characterized by two methoxy groups, as shown in Figure 2.3.
2. Menaquinone: found in the bacterial electron transport chain (ETC). It is characterized by a benzene ring replacing the two methoxy groups present in ubiquinones. Menaquinones have a lower electrode potential (E′0 = −74 mV) than ubiquinones (E′0 = +40 to +100 mV) and are more commonly used during anaerobic respiration [6].
3. Plastoquinone: found in the ETCs of photosynthetic organisms such as plants and cyanobacteria. It is characterized by methyl groups which replace the two methoxy groups found in ubiquinone.
Figure 2.3 Reduction of ubiquinone. Reduction of ubiquinone occurs in two electron transfers through a ubisemiquinone radical intermediate. The value n can vary from 4 to 10.
Figure 2.3 shows the reduction of ubiquinone to ubiquinol. An example of a quinone-mediated electron shuttling process is shown below:
CoQH2 + 2 Fe³⁺-cytochrome c → CoQ + 2 Fe²⁺-cytochrome c + 2H+    (2.30)
2.5.4 Flavoproteins
Flavoproteins serve as enzyme catalysts in the redox reactions of oxidative phosphorylation and photophosphorylation. They employ either flavin mononucleotide (FMN or riboflavin-5′-phosphate) or flavin adenine dinucleotide (FAD) as their coenzyme and electron carrier. These coenzymes are depicted in Figure 2.4. The flavin nucleotides are tightly bound or in some cases covalently bound prosthetic groups of the flavoproteins. FAD and FMN can accept one or two electrons, with the fully reduced forms written as FADH2 and FMNH2.
2.5.5 Iron-Sulfur Proteins
Iron-sulfur proteins (also called ferredoxins) are a group of electron carriers containing a 1:1 ratio of sulfide ions and nonheme iron ions which exist in a variety of oxidation states. The iron-sulfur proteins cover a very wide range of potentials, from approximately −400 to +350 mV; therefore, they are involved in redox reactions at the low and high ends of the potential range in the ETC [6]. The low potential ferredoxins can be either Fe2S2 ferredoxins or Fe4S4 ferredoxins [17–19] and they are found in plant, mitochondrial, and bacterial electron transport systems. Specific examples include adrenodoxin in mitochondrial monooxygenase systems [20], thioredoxin-type ferredoxins in Clostridium pasteurianum [21,22], and putidaredoxin and terpredoxin in other bacterial systems. The high potential ferredoxins (HiPIP) are mainly Fe4S4 ferredoxins [17–19] found in anaerobic ETCs, and are common in photosynthetic bacteria [23]. High potential Fe4S4 ferredoxins have higher oxidation states than low potential Fe4S4 ferredoxins and are reduced as follows:
[Fe4S4]³⁺ ←[e−]→ [Fe4S4]²⁺ ←[e−]→ [Fe4S4]⁺    (2.31)
(first couple: high potential ferredoxin; second couple: low potential ferredoxin)
2.5.6 Cytochromes
Cytochrome proteins catalyze redox reactions in the ETC. They are ubiquitous in nature, found in plants, photosynthetic microorganisms, bacteria, the mitochondrial inner membrane and the endoplasmic reticulum of eukaryotes. They have an iron-containing heme prosthetic group, in which the iron is held by a porphyrin (four pyrrole rings with substituted side chains, joined by methine bridges), and this heme group is tightly bound or covalently linked to its associated protein [5]. Cytochromes are single electron carriers, and the iron ion in the heme transitions between the Fe²⁺ and Fe³⁺ states during the electron-transfer process. There are five different classes of hemes that distinguish the cytochromes: hemes a, b, c, d, and o. A cytochrome can have more than one heme group. For example, bacterial cytochrome bd contains hemes b and d, and it is also called cytochrome d. Similarly, bacterial cytochrome bo contains hemes b and o, and it is also called cytochrome o or cytochrome bo3 (with the subscript 3 indicating that it is an O2-binding heme). Hemes d and o are found only in the prokaryotic cytochrome oxidases [6,24].
Figure 2.4 Reduction of FMN and FAD. Flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) are derivatives of riboflavin. Phosphorylation of riboflavin at the ribityl 5′-OH produces FMN and adenylation of FMN produces FAD. Their reduction chemistry is illustrated, using FMN as an example. Complete reduction of FMN and FAD requires a two-step electron transfer.
2.6 Examples of Catabolic Processes in Different Organisms
2.6.1 Photosynthesis
Phototrophic organisms use light to generate chemical free energy, which is stored in the bonds of ATP and NAD(P)H in the catabolic phase of photosynthesis. In the anabolic phase of photosynthesis, ATP and NADPH are utilized in the synthesis of carbohydrates and other organic compounds from CO2 and H2O. Examples of phototrophic organisms include green plants, algae, and photosynthetic bacteria. These organisms carry out either oxygenic (O2-producing) or anoxygenic (non-O2-producing) photosynthesis.
2.6.1.1 Oxygenic Photosynthesis
Oxygenic photosynthesis is carried out by green plants, algae, the cyanobacteria, and photosynthetic bacteria of the genera Prochloron, Prochlorothrix and Prochlorococcus [6]. Oxygenic photosynthesis involves a “light phase” and a “dark phase.” The light phase is necessary in oxygenic photosynthesis because H2O, an inherently poor electron donor, is used, and the transfer of electrons from H2O (converted to O2, E′0 = +820 mV) to the reaction center chlorophyll, P680 (E′0 = +1.1 V), is driven by light energy [6]. The main light-absorbing pigment found in green plants and cyanobacteria is chlorophyll a, and in the prochlorophytes it is chlorophyll b. The light-dependent reactions generate NADPH and a proton gradient (∆p) which is dissipated for ATP synthesis (discussed in Section 2.6.5). Two light reactions occur at two different reaction centers: Reaction center I (Photosystem (PS) I), which is energized by light at wavelengths of about 700 nm or higher, and reaction center II (Photosystem (PS) II), which is energized by light at wavelengths of about 680 nm. The activities of PS I and II are coupled and are connected by a cyclic or noncyclic ETC through which electrons are transferred from the final electron acceptor in PS II to the PS I chlorophyll, with the concomitant generation of a proton gradient.
For every mole of O2 formed, 12 protons move across the thylakoid membrane and about 200 kJ of energy is stored in the resulting proton gradient. This energy is used to synthesize about three moles of ATP [5]. The reactions of the light phase are summarized below:
2H2O + 2NADP+ + 8 photons + ~3ADP + ~3Pi → O2 + 2NADPH + ~3ATP    (2.32)
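To put the wavelengths quoted for the two photosystems on an energy scale, the Planck relation E = hc/λ can be evaluated per mole of photons. This is a back-of-the-envelope sketch: the physical constants are rounded standard values, the function name is hypothetical, and no claim is made about how much of this energy is actually captured by the photosystems.

```python
H = 6.626e-34   # Planck constant, J s
C = 2.998e8     # speed of light, m/s
N_A = 6.022e23  # Avogadro constant, 1/mol


def photon_energy_kj_per_mol(wavelength_nm):
    """Energy carried by one mole of photons (kJ) at the given wavelength."""
    wavelength_m = wavelength_nm * 1e-9
    return N_A * H * C / wavelength_m / 1000.0


if __name__ == "__main__":
    for nm in (680, 700):
        print(f"{nm} nm: {photon_energy_kj_per_mol(nm):.1f} kJ per mole of photons")
```

Comparing these values with the roughly 200 kJ stored in the proton gradient per mole of O2 gives a feel for how much of the absorbed light energy ends up in that particular form.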
2.6.1.2 Anoxygenic Photosynthesis
Anoxygenic photosynthesis is carried out by most photosynthetic bacteria. The major classes of anoxygenic phototrophs are Rhodospirillaceae (purple nonsulfur bacteria), Chromatiaceae (purple sulfur bacteria), Chlorobiaceae (green sulfur bacteria), Chloroflexaceae (green nonsulfur bacteria/green gliding bacteria), and heliobacteria [6,25–27]. The main light-absorbing pigment in anoxygenic phototrophs is bacteriochlorophyll. There are seven types of bacteriochlorophyll and they are distinguished by their respective absorption spectra [28]. Inorganic sources of electrons such as H2S, S0, S2O3²⁻, and H2 are utilized, and the oxidation of these compounds is not necessarily light-dependent. Only one reaction center is necessary for anoxygenic photosynthesis, and electron flow can be cyclic or noncyclic [6,26,28]. The photosynthetic mechanisms in the purple bacteria are similar to those of PS II in oxygenic photosynthesis, while photosynthesis in green bacteria and heliobacteria resembles that of PS I [6]. Below is an example of a light phase reaction during anoxygenic photosynthesis in green sulfur bacteria [6]:
H2S + NAD+ + ADP + Pi -[light, bacteriochlorophyll]→ S0 + NADH + H+ + ATP    (2.33)
Products of oxygenic and anoxygenic photosynthesis light phase reactions are employed in the dark phase for the reduction of CO2 to triose phosphate in a process called carbon fixation. Carbon fixation commonly occurs through the Calvin cycle. In plants and algae, Calvin cycle reactions take place in the stroma of chloroplasts, while in photosynthetic bacteria they take place in the cytosol [6]. For every mole of triose phosphate produced, three moles of CO2, six moles of NADPH, and nine moles of ATP are utilized. The triose phosphates are further converted to starch and hexoses in the stroma and cytosol, respectively. Carbon fixation is also accomplished through pathways other than the Calvin cycle. A reductive or reverse TCA cycle is used to fix CO2 by Chlorobiaceae (a class of green sulfur bacteria), Desulfobacter (anaerobic sulfate-reducing bacterium), Hydrogenobacter, and archaea such as Thermoproteus neutrophilus (note that additional enzymes are required for a functional, reverse TCA cycle) [6,29–31]. The 3-hydroxypropionate cycle is a recently discovered carbon fixation pathway demonstrated in bacteria and archaea such as Chloroflexus aurantiacus (a green nonsulfur bacterium) and Metallosphaera sedula (a hyperthermophilic crenarchaeon) [31,32]. Finally, a noncyclic acetyl-CoA pathway fixes CO2 in methanogens, acetogenic bacteria, and most autotrophic sulfate-reducing bacteria [33].
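The Calvin cycle stoichiometry quoted above (three CO2, six NADPH, and nine ATP per triose phosphate) scales linearly with the amount of product. The short sketch below makes that bookkeeping explicit; the function name is hypothetical, and the hexose example simply assumes that one hexose is assembled from two triose phosphates.

```python
# Requirements per mole of triose phosphate, as stated in the text.
PER_TRIOSE = {"CO2": 3, "NADPH": 6, "ATP": 9}


def calvin_requirements(mol_triose):
    """Moles of CO2, NADPH, and ATP consumed to make mol_triose of triose phosphate."""
    return {species: amount * mol_triose for species, amount in PER_TRIOSE.items()}


if __name__ == "__main__":
    # One hexose equivalent, assuming two triose phosphates per hexose.
    print(calvin_requirements(2))
```

Expressed per carbon fixed, the cycle consumes two NADPH and three ATP, which is a convenient figure when comparing it with the alternative fixation pathways mentioned in this section.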
2.6.2 The Central Pathways
The central pathways are used in the metabolism of carbohydrates and carboxylic acids. Central pathways produce precursor metabolites (carbon skeleton) and sometimes ATP and NAD(P)H (energy), in which case they are called amphibolic pathways. Examples of amphibolic pathways are: the EMP pathway (glycolysis), the ED pathway, the PPP, and the TCA cycle (shown in Figures 2.5 and 2.6). The EMP, PPP and ED pathways are generally considered in the context of glucose oxidation to pyruvate, with each pathway employing different enzymes/reactions to synthesize the intermediate glyceraldehyde-3-phosphate. However, in some bacteria including E. coli, the ED pathway is used primarily for gluconate metabolism rather than glucose metabolism [4].
2.6.2.1 Glycolysis
Glycolysis comprises the initial reactions required for carbohydrate metabolism. The principal functions of glycolysis are: oxidation of glucose (a six carbon molecule) to pyruvate (a three carbon molecule), production of cellular energy sources (NADH and ATP), and supply of six of the precursor metabolites used in biosynthesis (refer to Table 2.2). The reactions of glycolysis are summarized as:
Glucose + 2NAD+ + 2ADP + 2Pi → 2Pyruvate + 2NADH + 2H+ + 2ATP + 2H2O    ∆G′0 = −85 kJ/mole    (2.34)
Glucose oxidation by glycolysis is incomplete and the final product (two pyruvates) still contains the bulk of the total energy initially in glucose. Complete oxidation of one mole of glucose yields about 2,840 kJ of energy, but 94.8% of this energy is contained in two moles of pyruvate [5]. Over 60% of the energy released from glycolysis is stored as ATP [5]. During glycolysis, ATP is produced only by substrate-level phosphorylation, whereby energy released from exergonic reactions is used to drive the phosphorylation of ADP [25]. For example:
2Phosphoenolpyruvate + 2ADP -[pyruvate kinase]→ 2Pyruvate + 2ATP    ∆G′0 = −31.4 kJ/mole    (2.35)
Glycolysis is divided into two stages: the preparatory phase and the pay-off phase. The preparatory phase consists of the first five steps, which incorporate isomerization, group-transfer and cleavage reactions. Two moles of ATP are utilized and two moles of glyceraldehyde-3-phosphate are produced per mole of glucose. The pay-off phase comprises redox reactions which oxidize glyceraldehyde-3-phosphate to pyruvate. The reactions in this phase yield four moles of ATP and two moles of NADH per mole of glucose. The reactions involved in glycolysis are summarized in Table 2.5. Note that several of these reactions are considered irreversible.
Figure 2.5 The central pathways. The glycolysis/Embden-Meyerhof-Parnas (EMP) pathway converts glucose-6-phosphate to pyruvate. The PPP and the ED pathway branch from 6-phosphogluconate and re-enter glycolysis at different locations. In the PPP, the oxidation-decarboxylation reactions producing ribulose-5-phosphate occur in four steps: oxidation of glucose-6-phosphate to 6-phosphogluconolactone, hydrolysis of 6-phosphogluconolactone to 6-phosphogluconate, oxidation of 6-phosphogluconate to 3-keto-6-phosphogluconate and decarboxylation of 3-keto-6-phosphogluconate. These reactions form the oxidative arm of the PPP. The remaining reactions form the reductive arm of the PPP and consist of isomerizations and sugar rearrangements. The ED pathway produces one mole of glyceraldehyde-3-phosphate per mole of glucose-6-phosphate, so the net ATP yield up to pyruvate synthesis is only one in this pathway (instead of two, as in EMP). The enzymes that catalyze the reactions are: 1, hexokinase; 2, phosphohexose isomerase; 3, phosphofructokinase-1; 4, aldolase; 5, triose phosphate isomerase; 6, glyceraldehyde-3-phosphate dehydrogenase; 7, phosphoglycerate kinase; 8, phosphoglycerate mutase; 9, enolase; 10, pyruvate kinase; 11, glucose-6-phosphate dehydrogenase + gluconolactonase; 12, 6-phosphogluconate dehydrogenase; 13, ribose-5-phosphate isomerase; 14, ribulose-5-phosphate epimerase; 15, transketolase; 16, transaldolase; 17, transketolase; 18, 6-phosphogluconate dehydratase; 19, 2-keto-3-deoxy-6-phosphogluconate aldolase.
2.6.2.2 The PPP
The PPP, also called the phosphogluconate pathway or the hexose monophosphate pathway, is responsible for the oxidation of glucose-6-phosphate to pentose phosphates such as ribose-5-phosphate and xylulose-5-phosphate. These pentose phosphates are used in the synthesis of RNA, DNA, and coenzymes such as ATP, coenzyme A, NADH, and FADH2. Additionally, the PPP produces glycolytic intermediates by converting five-carbon sugars (more accurately their phosphates) to six-carbon and three-carbon sugars, and it also produces two of the precursor metabolites used in biosynthesis (refer to Table 2.2).
Figure 2.6 The tricarboxylic acid (TCA/citric acid/Krebs) cycle. Acetyl-CoA is oxidized to CO2 and energy is conserved as NAD(P)H, FADH2 and ATP. Enzymes involved: 1, citrate synthase; 2a, aconitase; 2b, aconitase; 3, isocitrate dehydrogenase; 4, α-ketoglutarate dehydrogenase complex; 5, succinyl-CoA synthetase; 6, succinate dehydrogenase (in oxidative direction) or fumarate reductase (in reductive direction); 7, fumarase; 8, malate dehydrogenase; 9, isocitrate lyase; 10, malate synthase. In eukaryotes, the enzymes of the citric acid cycle are found in the mitochondrial matrix while in prokaryotes they are cytosolic (succinate dehydrogenase interacts with the membrane). Depending on the isoform, isocitrate dehydrogenase requires either NADP+ or NAD+. Different isoforms of succinyl-CoA synthetase generate either ATP or GTP through substrate-level phosphorylation.
Table 2.5 Reactions of Glycolysis and Their Standard Free Energies

| Step | Reaction | ∆G′0 (kJ/mole) |
| --- | --- | --- |
| A. Preparatory Phase | | |
| (1) | Glucose → Glucose-6-phosphate | −16.7 |
| (2) | Glucose-6-phosphate ↔ Fructose-6-phosphate | 1.7 |
| (3) | Fructose-6-phosphate → Fructose-1,6-bisphosphate | −14.2 |
| (4) | Fructose-1,6-bisphosphate ↔ Glyceraldehyde-3-phosphate + Dihydroxyacetone phosphate | 23.8 |
| (5) | Dihydroxyacetone phosphate ↔ Glyceraldehyde-3-phosphate | 7.5 |
| B. Pay-off Phase | | |
| (6) | Glyceraldehyde-3-phosphate ↔ 1,3-Bisphosphoglycerate | 6.3 |
| (7) | 1,3-Bisphosphoglycerate ↔ 3-Phosphoglycerate | −18.5 |
| (8) | 3-Phosphoglycerate ↔ 2-Phosphoglycerate | 4.4 |
| (9) | 2-Phosphoglycerate ↔ Phosphoenolpyruvate | 7.5 |
| (10) | Phosphoenolpyruvate → Pyruvate | −31.4 |

Source: Nelson, D. and Cox, M. Lehninger Principles of Biochemistry. W.H. Freeman and Company, New York, 2005. With permission.
The PPP is the main source of NADPH, which serves as an electron donor in most biosynthesis reactions. The reactions of the PPP are divided into one oxidative phase and two reductive phases [6]. Several of the reactions in the reductive phase of the PPP are similar to those used by autotrophic organisms to fix CO2 via the Calvin cycle.
Stage 1: Oxidation–decarboxylation reactions
3Glucose-6-phosphate + 3H2O -[6NADP+ → 6NADPH]→ 3Ribulose-5-phosphate + 3CO2    (2.36)
Stage 2: The isomerization reactions
3Ribulose-5-phosphate → 2Xylulose-5-phosphate + Ribose-5-phosphate
(2.37)
Stage 3: The sugar rearrangement reactions
i. Transketolase-catalyzed reaction
Xylulose-5-phosphate + Ribose-5-phosphate → Sedoheptulose-7-phosphate + Glyceraldehyde-3-phosphate    (2.38)
ii. Transaldolase-catalyzed reaction
Sedoheptulose-7-phosphate + Glyceraldehyde-3-phosphate → Fructose-6-phosphate + Erythrose-4-phosphate    (2.39)
iii. Transketolase-catalyzed reaction
Xylulose-5-phosphate + Erythrose-4-phosphate → Fructose-6-phosphate + Glyceraldehyde-3-phosphate    (2.40)
Two moles of fructose-6-phosphate produced by reactions 2.39 and 2.40 can be isomerized back to two moles of glucose-6-phosphate. The net reaction of the PPP is thus:
Glucose-6-phosphate + 6NADP+ + 3H2O → Glyceraldehyde-3-phosphate + 3CO2 + 6NADPH + 6H+    (2.41)
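A quick way to check the stage reactions and the net reaction is to confirm that carbon is conserved in each of Equations 2.36 through 2.41. The sketch below does only that; the abbreviations (G6P, Ru5P, and so on) and function names are shorthand introduced for this example, and cofactors and water are left out because they carry no substrate carbon.

```python
# Carbon atoms per molecule for the sugar phosphates and CO2.
CARBONS = {
    "G6P": 6, "Ru5P": 5, "Xu5P": 5, "R5P": 5,
    "S7P": 7, "GAP": 3, "F6P": 6, "E4P": 4, "CO2": 1,
}

# (reactants, products) with stoichiometric coefficients, keyed by equation number.
REACTIONS = {
    "2.36": ({"G6P": 3}, {"Ru5P": 3, "CO2": 3}),
    "2.37": ({"Ru5P": 3}, {"Xu5P": 2, "R5P": 1}),
    "2.38": ({"Xu5P": 1, "R5P": 1}, {"S7P": 1, "GAP": 1}),
    "2.39": ({"S7P": 1, "GAP": 1}, {"F6P": 1, "E4P": 1}),
    "2.40": ({"Xu5P": 1, "E4P": 1}, {"F6P": 1, "GAP": 1}),
    "2.41": ({"G6P": 1}, {"GAP": 1, "CO2": 3}),
}


def carbon_count(side):
    """Total carbon atoms on one side of a reaction."""
    return sum(coeff * CARBONS[name] for name, coeff in side.items())


def is_carbon_balanced(equation):
    reactants, products = REACTIONS[equation]
    return carbon_count(reactants) == carbon_count(products)


if __name__ == "__main__":
    for eq in REACTIONS:
        print(eq, "balanced" if is_carbon_balanced(eq) else "NOT balanced")
```

The net reaction (2.41) balances because three of the six carbons of glucose-6-phosphate leave as CO2 and the remaining three end up in glyceraldehyde-3-phosphate.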
2.6.2.3 The ED Pathway
The ED pathway is primarily found in prokaryotes but has been reported in Entamoeba histolytica and the fungi Aspergillus niger and Penicillium notatum [6,34]. In the ED pathway, one mole of glucose-6-phosphate is oxidized to one mole of glyceraldehyde-3-phosphate and one mole of pyruvate. Glyceraldehyde-3-phosphate is further oxidized through the EMP pathway to produce another mole of pyruvate. The ED pathway is summarized as:
Glucose + NADP+ + NAD+ + ADP + Pi → 2Pyruvate + NADPH + 2H+ + NADH + ATP    (2.42)
When the ED pathway is used for glucose oxidation, the net ATP yield per mole of glucose is one. Based on the lower energy yield of the ED pathway, anaerobes which lack respiratory chains use the EMP pathway (net ATP yield per mole of glucose is two) for glucose oxidation since this is more “economical” [25]. However the ED pathway is useful in those organisms that do not have a complete EMP pathway and also for the degradation of aldonic acids such as gluconate [6].
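The yields of the two routes from glucose to pyruvate can be placed side by side using the summary reactions (Equations 2.34 and 2.42). The sketch below is simple bookkeeping; the dictionary layout and function name are assumptions made for illustration.

```python
# Net yields per mole of glucose converted to two moles of pyruvate.
YIELDS = {
    "EMP": {"ATP": 2, "NADH": 2, "NADPH": 0},  # Equation 2.34
    "ED":  {"ATP": 1, "NADH": 1, "NADPH": 1},  # Equation 2.42
}


def reduced_cofactors(pathway):
    """Total moles of NADH plus NADPH formed per mole of glucose."""
    y = YIELDS[pathway]
    return y["NADH"] + y["NADPH"]


if __name__ == "__main__":
    for name, y in YIELDS.items():
        print(name, y, "reduced cofactors:", reduced_cofactors(name))
```

Both routes deliver the same number of reducing equivalents per glucose; the ED pathway trades one ATP for having one of them as NADPH rather than NADH.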
The distribution of EMP and ED pathways in various bacteria is summarized by White [6, chapter 8]. Flux ratios through these “upper” metabolic pathways vary considerably between organisms and growth conditions. Metabolic engineers are often concerned with these fluxes and altering them through genetic modifications. Continued advances in flux analysis are enabling novel insights into cellular metabolism and improved strain engineering (for examples of flux analysis methods and applications refer to Refs [35–42], as well as section IV of this handbook).
2.6.2.4 Fates of Pyruvate
Pyruvate is obtained from the breakdown of nearly all energy sources and its metabolic fate is highly dependent on growth conditions. Pyruvate can be siphoned-off for use as a precursor metabolite in the synthesis of the amino acids alanine, valine, and leucine [5] and it can be utilized by anaplerotic reactions (discussed in Section 2.6.3) involved in replenishing citric acid cycle intermediates. Alternatively, pyruvate can be converted into the high-energy intermediate acetyl-CoA, which plays an important role in connecting glycolysis to the citric acid cycle. Depending on the organism, conversion of pyruvate to acetyl-CoA can be catalyzed by the pyruvate dehydrogenase enzyme complex (PDHc), pyruvate formate lyase (PFL), or pyruvate:ferredoxin oxidoreductase (POR) and fluxes through these enzymes depend on whether metabolism proceeds via the oxidative route (citric acid cycle and respiration) or fermentative route [43]. PDHc catalyzes the NAD+-linked oxidative decarboxylation of pyruvate to acetyl-CoA (primarily considered an aerobic reaction) and PFL catalyzes the anaerobic conversion of pyruvate to acetyl-CoA [5,44–48], as shown:
Pyruvate + CoA + NAD+ -[PDHc]→ Acetyl-CoA + NADH + H+ + CO2    (2.43)
Pyruvate + CoA -[PFL]→ Formate + Acetyl-CoA    (2.44)
Thus, when cells are able to generate energy through respiration, pyruvate oxidation provides NADH and fully oxidized CO2, while under fermentative conditions NADH production is avoided by the production of formate. In E. coli, PDHc is generally considered to be minimally active under anaerobic or at least non-respiratory conditions [43] due to a relatively high ratio of NADH/NAD + and high concentration of acetyl-CoA, both of which allosterically inhibit PDHc activity [43,49,50]. However, when nitrate serves as an anaerobic terminal electron acceptor, PDHc activity in E. coli can be substantial [43,48]. Expression of PDHc is additionally repressed by the ArcA/B global regulator [51] under low oxygen conditions (or more accurately, low oxidation-reduction potentials) [47,51–54]. While PDHc activity is extremely redox sensitive in many organisms (especially Enterobacteriaceae), pyruvate dehydrogenases from some organisms such as Enterococcus faecalis and Azotobacter vinelandii are less sensitive to elevated levels of intracellular NADH [50,55]. PFL is enzymatically interconverted between active and inactive forms during anaerobic-aerobic transitions since its activity involves a glycyl radical that is not stable in the presence of oxygen [43,46,48]. At the genetic level, expression of PFL in E. coli is regulated by the FNR and ArcA/B global regulators [6,43,51]. In general, catabolic pathways are sensitive to and tightly regulated by a number of related signals that reflect the redox potential and energy charge of the cell (refer to Section 2.6.6 for details on regulation of catabolic pathways). Pyruvate can alternatively be converted directly to acetate and CO2 by pyruvate oxidase under aerobic conditions, in which case reducing equivalents directly enter electron transport through ubiquinone. Acetate can in turn be converted to acetyl-CoA through the ATP-dependent acetokinase-phosphotransacetylase (ACK-PTA) and acetyl-CoA synthetase (ACS) pathways [56,57].
Acetate + ATP ←[ACK]→ Acetyl-P + ADP    (2.45)
Acetyl-P + CoA ←[PTA]→ Acetyl-CoA + Pi    (2.46)
CoA + Acetate + ATP ←[ACS]→ Acetyl-CoA + AMP + PPi    (2.47)
Another fate of pyruvate is nonoxidative decarboxylation to acetaldehyde by the fermentative enzyme pyruvate decarboxylase (PDC). PDC from Zymomonas mobilis has received much attention in the metabolic engineering of fermentation pathways for ethanol production [58,59]. 2.6.2.5 The TCA Cycle The TCA/citric acid cycle is used in the metabolism of sugars, fatty acids, and some amino acids. Its reactions oxidize acetyl-CoA to CO2, and reducing equivalents are extracted in the form of NAD(P)H and FADH2. One mole of ATP is also produced per mole of acetyl-CoA through substrate-level phosphorylation. The citric acid cycle operates concurrently with the respiratory chain, through which NAD(P)H and FADH2 are reoxidized and a proton motive force is generated for ATP synthesis [6] (refer to Section 2.6.5). The overall reactions of the TCA cycle can be summarized as:
Acetyl-CoA + 2H2O + ADP + Pi + FAD + NADP+ + 2NAD+ → 2CO2 + ATP + FADH2 + NADPH + 2NADH + 3H+ + CoA   (2.48)
Although typically associated with aerobic respiration, the citric acid cycle can be active during anaerobic respiration in the presence of an external electron acceptor [60]. For example, Pseudomonas stutzeri completely oxidizes glucose to CO2 with nitrate as the electron acceptor [61,62] and the hyperthermophilic archaea, Thermoproteus tenax and Pyrobaculum islandicum, express a functional TCA cycle with sulfur or thiosulfate as the terminal electron acceptor [6,63]. Regardless of whether the TCA cycle is operational, enzymes of the cycle are used to generate precursor metabolites for biosynthesis [64] (refer to Table 2.2). For example, succinyl-CoA is a precursor to the synthesis of L-lysine, L-methionine, and tetrapyrroles of cytochromes and chlorophylls [6]. Under fermentative conditions the TCA cycle enzyme α-ketoglutarate dehydrogenase has little or no activity [51,64–66] and therefore cannot synthesize sufficient succinyl-CoA. For continued supply of metabolic precursors the TCA cycle enzymes then operate in two separate branches. α-ketoglutarate is supplied from citrate through the oxidative branch (as in the normal cycle), while succinyl-CoA is produced via reduction of oxaloacetate through the reverse, “reductive” branch, in which the enzyme fumarate reductase replaces succinate dehydrogenase.
2.6.3 Anaplerotic and Peripheral Reactions
A variety of important reactions facilitate the operations of catabolic and anabolic pathways by connecting and replenishing their intermediates. Peripheral reactions convert substrates (carbon sources) which are non-central pathway intermediates into central pathway intermediates so that they can be subsequently metabolized. These reactions enhance the metabolic flexibility of organisms by enabling them to grow on a broader range of carbon compounds. For example, glycerol is converted to dihydroxyacetone phosphate (a glycolytic intermediate) in the two-step feeder pathway involving the enzymes glycerol kinase and glycerol-3-phosphate dehydrogenase. Anaplerotic reactions are responsible for replenishing central pathway intermediates which have been depleted through biosynthesis, and are most commonly referred to in the context of restoring TCA cycle intermediates. In a similar context, cataplerotic reactions refer to those which deplete excess TCA cycle intermediates, and play an important role in gluconeogenesis [67]. Anaplerosis derives carbons from amino acids, carboxylic acids, lipids, and CO2. The conversion of pyruvate to oxaloacetate by the
pyruvate carboxylase enzyme is an example of an anaplerotic reaction which replenishes a citric acid cycle intermediate:
Pyruvate + HCO3− + ATP ←pyruvate carboxylase→ Oxaloacetate + ADP + Pi   (2.49)
When oxaloacetate nears depletion and the citric acid cycle is unable to effectively process acetyl-CoA, excess acetyl-CoA accumulates in the cell and acts as a positive allosteric modulator of the pyruvate carboxylase enzyme, stimulating oxaloacetate production [5]. Oxaloacetate is also produced in the anaplerotic reaction catalyzed by PEP carboxylase:
PEP + HCO3− ←PEP carboxylase→ Oxaloacetate + Pi   (2.50)
Depending on the organism, PEP carboxylase is regulated by metabolic intermediates such as acetyl-CoA [6] or fructose-1,6-bisphosphate [5]. A similar reaction is catalyzed by the enzyme PEP carboxykinase, although its primary physiological role is in the reverse direction, i.e., PEP synthesis from oxaloacetate:
Oxaloacetate + ATP/GTP ←PEP carboxykinase→ PEP + CO2 + ADP/GDP   (2.51)
Nonglycolytic PEP is also synthesized by the enzymes PEP synthetase and pyruvate-phosphate dikinase, both using pyruvate as substrate and requiring hydrolysis of ATP to AMP:
Pyruvate + H2O + ATP → PEP + AMP + Pi   (catalyzed by PEP synthetase)   (2.52)
Pyruvate + ATP + Pi → PEP + AMP + PPi   (catalyzed by pyruvate-phosphate dikinase)   (2.53)
Another important anaplerotic reaction is the synthesis of pyruvate from malate by the malic enzyme:
L-malate + NAD(P)+ ←malic enzyme→ Pyruvate + NAD(P)H + CO2   (2.54)
This is a critical reaction to support growth on TCA cycle intermediates, and the reverse reaction has been used to improve succinate production by E. coli [68]. Anaplerotic reactions play an important role in the overproduction of metabolites derived from TCA cycle intermediates and are a common target in metabolic engineering (refer to Refs. [69–78]).

2.6.3.1 The Glyoxylate Bypass
In some aerobic organisms, acetyl-CoA can be oxidized to succinate or other four-carbon intermediates of the citric acid cycle through the glyoxylate bypass. The glyoxylate bypass is an anaplerotic pathway [79] in which the two decarboxylation steps of the citric acid cycle are bypassed [6,64] and isocitrate is converted to succinate and glyoxylate by the enzyme isocitrate lyase. Glyoxylate condenses with acetyl-CoA to form malate in a reaction catalyzed by malate synthase, and malate undergoes an NAD+-dependent oxidation to form oxaloacetate. The reactions are summarized as:
2Acetyl-CoA + NAD+ + 2H2O → Succinate + 2CoA + NADH + H+   (2.55)
The glyoxylate cycle is present in plants, some microorganisms such as E. coli, and some invertebrates, allowing these organisms to grow on acetate and convert fatty acids into metabolic intermediates [5,64,79]. Upregulation of the glyoxylate bypass in E. coli was shown to reduce overflow metabolism/acetate secretion (refer to Section 2.6.6) [80] and has been implemented for the overproduction of succinate [81].
2.6.4 Fermentation
Cell fermentation requires no external electron acceptors and no respiratory pathways. Instead, internal metabolites such as pyruvate and acetyl-CoA serve as substrates of reduction reactions, allowing for NAD+ regeneration with concomitant secretion of (reduced) acids and alcohols (e.g., lactate and ethanol). Fermentation is a poor energy-yielding process with little assimilation of substrate carbon into biomass. The driving force for fermentation is glycolytic ATP production through substrate-level phosphorylation, which can only continue when coupled to NADH oxidation. Additional ATP is sometimes produced in the subsequent fermentative pathways (e.g., during acetate production in mixed-acid fermentation). Fermentation products vary depending on the organism and degree of reduction of the growth substrate. For example, lactic acid bacteria such as Lactobacillus plantarum produce lactate, yeasts and some fungi such as Trichocladium canadense [82] produce ethanol, and members of Clostridium produce butyrate, butanol, and acetone [6]. E. coli and other Enterobacteriaceae carry out mixed-acid fermentation [28], secreting mixtures of lactate, ethanol, acetate, succinate, and formate. Recently, E. coli has been the target of many metabolic engineering efforts aimed at generation of strains which exclusively overproduce a single fermentation product (e.g., [81,83–89]). Figure 2.7 shows common fermentation pathways found in different organisms.
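The requirement that glycolysis stay coupled to NADH oxidation can be illustrated with a small redox bookkeeping sketch. The NADH counts below are assumptions read from the standard routes drawn in Figure 2.7 (two NADH formed per glucose in glycolysis; one NADH consumed per lactate; two per ethanol made from acetyl-CoA via pyruvate-formate lyase; one per ethanol made via pyruvate decarboxylase; none for acetate). The function and dictionary names are hypothetical and chosen only for illustration.

```python
# NADH formed when one glucose is converted to two pyruvate by glycolysis.
NADH_FROM_GLYCOLYSIS = 2

# NADH consumed per pyruvate routed to each product (assumed standard routes).
NADH_CONSUMED_PER_PYRUVATE = {
    "lactate": 1,          # lactate dehydrogenase
    "ethanol_via_pfl": 2,  # acetyl-CoA -> acetaldehyde -> ethanol
    "ethanol_via_pdc": 1,  # acetaldehyde -> ethanol only
    "acetate": 0,          # phosphotransacetylase + acetate kinase
}


def net_nadh(products):
    """Return net NADH per glucose for a product split that uses two pyruvate.

    `products` maps a product name to the number of pyruvate molecules sent
    to it. A result of zero means the fermentation is redox balanced.
    """
    if sum(products.values()) != 2:
        raise ValueError("the product split must account for two pyruvate")
    consumed = sum(NADH_CONSUMED_PER_PYRUVATE[name] * count
                   for name, count in products.items())
    return NADH_FROM_GLYCOLYSIS - consumed


homolactic = net_nadh({"lactate": 2})
mixed_acid = net_nadh({"ethanol_via_pfl": 1, "acetate": 1})
acetate_only = net_nadh({"acetate": 2})
```

Under these assumptions the homolactic split and the one-ethanol-plus-one-acetate split both return zero, while an acetate-only split leaves two NADH unoxidized, which is one way to see why acetate is secreted together with more reduced products in mixed-acid fermentation.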
2.6.5 Oxidative Phosphorylation
The central concept in oxidative phosphorylation is the chemiosmotic theory, developed by the biochemist Peter Mitchell [90]. Cells use ion gradients to couple exergonic reactions which occur during electron transport to endergonic reactions such as ATP synthesis. During oxidative phosphorylation, electrons are ferried from NADH/FADH2 to terminal electron acceptors through protein complexes and mobile electron carriers in the ETC. Free energy becomes available during the “downhill” passage of electrons to oxygen:
NADH → NAD+ + H+ + 2e−   ∆G′0 = −61.9 kJ   (2.56)
½O2 + 2H+ + 2e− → H2O   ∆G′0 = −158.2 kJ   (2.57)
Overall reaction:
NADH + ½O2 + H+ → NAD+ + H2O   ∆G′0 ≈ −220 kJ   (2.58)
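The overall free energy change above can be reproduced from standard reduction potentials using the relation ∆G′0 = −nF∆E′0. The sketch below is a minimal illustration; the potentials (−0.320 V for NAD+/NADH and +0.816 V for ½O2/H2O) are commonly tabulated textbook values assumed here rather than taken from this section, and the function name is hypothetical.

```python
FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant in kJ per volt per mole of electrons


def delta_g_from_potentials(e_acceptor_v, e_donor_v, n_electrons=2):
    """Standard free energy change (kJ/mol) for electron transfer.

    Uses delta_G = -n * F * delta_E, where delta_E is the acceptor
    reduction potential minus the donor reduction potential.
    """
    delta_e = e_acceptor_v - e_donor_v
    return -n_electrons * FARADAY_KJ_PER_V_MOL * delta_e


E_NAD_V = -0.320  # NAD+/NADH couple (assumed tabulated value)
E_O2_V = 0.816    # 1/2 O2 / H2O couple (assumed tabulated value)

dg_overall_kj = delta_g_from_potentials(E_O2_V, E_NAD_V)
```

With these inputs the result falls within about 1 kJ of the rounded −220 kJ quoted for the overall reaction.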
Refer to Equations 2.26 through 2.28 for FADH2 oxidation reactions. This free energy is conserved as electrochemical energy/proton motive force (pmf)/proton potential (∆p) as a result of proton translocation during electron transport. Electrochemical energy is a function of both the membrane potential (∆Ψ, electrical energy) and the difference in concentrations between solutions separated by the membrane (chemical energy). For n moles of protons that cross a membrane,
Electrochemical energy (∆μ) = Electrical energy + Chemical energy   (2.59)
Electrical energy = nF∆Ψ (J)   (2.60)
[Figure 2.7: pathway diagram leading from glucose through PEP, pyruvate, and acetyl-CoA to the fermentation products, with enzyme steps numbered 1 through 24.]
Figure 2.7 Common fermentation pathways. Various fermentative products are produced from pyruvate in different organisms. Common products include succinate, lactate, formate, ethanol, acetate, acetone, butanol, isopropanol, and butyrate. Enzymes involved: 1, glycolytic enzymes; 2, PEP carboxylase; 3, malate dehydrogenase; 4, fumarase; 5, fumarate reductase; 6, pyruvate kinase; 7, lactate dehydrogenase; 8, pyruvate-formate lyase; 9, formate-hydrogen lyase; 10, phosphotransacetylase; 11, acetate kinase; 12, acetaldehyde dehydrogenase; 13, alcohol dehydrogenase; 14, acetyl-CoA-acetyl transferase; 15, L(+)-β-hydroxybutyryl-CoA dehydrogenase; 16, 1,3hydroxy-acyl-CoA hydrolase; 17, butyryl-CoA dehydrogenase; 18, butyraldehyde dehydrogenase; 19, butanol dehydrogenase; 20, phosphotransbutyrylase; 21, butyrate kinase; 22, CoA transferase; 23, acetoacetate decarboxylase; 24, isopropanol dehydrogenase. *Pyruvate can be nonoxidatively decarboxylated directly to acetaldehyde by the enzyme pyruvate decarboxylase.
where F is Faraday’s constant and ∆Ψ is the potential difference over which the charge is moved. Also,
Chemical energy = RT ln([H+]in / [H+]out) (J)   (2.61)
where [H+]in is the concentration of protons inside the cell and [H+]out is the concentration of protons outside the cell. The electrochemical energy or proton motive force is hence written as:
∆μ = nF∆Ψ + RT ln([H+]in / [H+]out) (J)   (2.62)
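As a rough numerical illustration of Equation 2.62, the sketch below evaluates the electrical and chemical terms for one mole of protons. The membrane potential (−0.15 V, inside negative) and the pH values (7.8 inside, 7.0 outside) are illustrative assumptions, not figures from this chapter. The function name is hypothetical, and scaling the chemical term by n is an additional assumption; for n = 1 the expression reduces to the equation as written.

```python
import math

FARADAY = 96485.0  # C/mol
GAS_CONSTANT = 8.314  # J/(mol K)


def electrochemical_energy(delta_psi_v, h_in, h_out, n=1.0, temp_k=298.0):
    """Electrochemical energy (J) for n moles of protons crossing a membrane.

    delta_psi_v: membrane potential in volts (inside minus outside).
    h_in, h_out: proton concentrations inside and outside (same units).
    """
    electrical = n * FARADAY * delta_psi_v
    chemical = n * GAS_CONSTANT * temp_k * math.log(h_in / h_out)
    return electrical + chemical


# Illustrative (assumed) values: inside is negative and more alkaline.
energy_j = electrochemical_energy(
    delta_psi_v=-0.15,
    h_in=10 ** -7.8,
    h_out=10 ** -7.0,
)
```

Both terms are negative for these inputs, so proton re-entry is energetically favorable, with the electrical term contributing roughly three quarters of the total.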
Electrochemical energy conserved in the proton gradient is expended when extruded protons flow back into the matrix to power ATP synthesis through a membrane-spanning multisubunit protein called the F1F0 ATPase (ATP synthase). Electron transfer coupled to electrogenic translocation of protons through a membrane is referred to as vectorial translocation, and is achieved by electron-carrying proton pumps [6]. Examples of proton pumps include the cytochrome c oxidase complex in the mitochondrial ETC and the NDH-1 (NADH:ubiquinone oxidoreductase complex) in bacteria. In scalar translocation, protons are transferred from one side of the membrane to the other due to switches in electron carriers involved in the electron transport process [6]. For example, in the “Q cycle,” in which quinol (QH2) is oxidized to quinone (Q), electron carriers alternate between those that carry hydrogen and those that do not, such that protons are carried across the membrane and released when this switch occurs [6]. The “Q cycle” is illustrated in Figure 2.8. Complexes in the ETC where both electron transfer and proton translocation occur are called coupling sites.

2.6.5.1 Electron Transport in Mitochondria
In eukaryotes, oxidative phosphorylation takes place in the mitochondria. The mitochondrial matrix contains high concentrations of electron transport enzymes, coenzymes, substrates, and inorganic ions. In well-functioning mitochondria, oxidation of substrates and phosphorylation of ADP are tightly coupled and proceed simultaneously. The mitochondrial ETC is comprised of four complexes: NADH dehydrogenase/NADH:ubiquinone oxidoreductase (complex I), succinate dehydrogenase (complex II), ubiquinol:cytochrome c oxidoreductase (complex III), and cytochrome aa3 oxidase (complex IV).

2.6.5.1.1 Complex I
NADH dehydrogenase is a large enzyme complex consisting of 42 different polypeptide chains and FMN and iron-sulfur proteins as its prosthetic groups [5].
This complex catalyzes the exergonic transfer of electrons from NADH to ubiquinone (“the Q pool”). Complex I is one of the three coupling sites in the mitochondria, translocating four protons (H+ ) across the membrane for every two electrons transferred [91].
NADH + Q + H+ → NAD+ + QH2   ∆E′0 = 360 mV   ∆G′0 = −69.5 kJ/mole   (2.63)
2.6.5.1.2 Complex II
FAD and iron-sulfur proteins are prosthetic groups in this complex and they funnel electrons derived from succinate oxidation into the Q pool via FADH2. No proton pumping occurs in this complex and free energy release is insufficient for ATP synthesis.
[Figure 2.8: diagram of the Q cycle across the inner mitochondrial membrane, showing QH2, Q, and semiquinone species exchanging electrons and protons between the matrix and the intermembrane space.]
Figure 2.8 Diagram of the mitochondrial Q cycle. In complex III, four protons are translocated across the inner mitochondrial membrane for every two electrons delivered from complex I in the form of a single reduced coenzyme Q (QH2). Two QH2 molecules diffuse to the outside of the membrane and each delivers a single electron to cytochrome c1 (via an iron-sulfur protein), forming two Q.− semiquinones and releasing a total of four protons to the intermembrane space. The two available Q.− electrons are delivered to hemes within the membrane, forming two oxidized coenzyme Q molecules which diffuse back to the matrix. Two matrix protons and two heme electrons are subsequently used to reduce one Q, regenerating one QH2.
Succinate + FAD → Fumarate + FADH2   (2.64)
FADH2 + Q → FAD + QH2   ∆E′0 = 85 mV   ∆G′0 = −16.4 kJ/mole   (2.65)
2.6.5.1.3 Complex III
Complex III is the second coupling site in the mitochondria. In this complex, oxidation of ubiquinol (QH2) by cytochrome c is coupled to the translocation of four protons. Electron carriers involved in the redox reactions include cytochrome b, cytochrome c1, and an iron-sulfur protein. Ubiquinol is a two-electron carrier while cytochrome c is a one-electron carrier, and this switch is accommodated by the Q cycle (depicted in Figure 2.8). The overall reaction in complex III is:
QH2 + 2cyt c1(oxidized) + 2H+(matrix) → Q + 2cyt c1(reduced) + 4H+(intermembrane space)   ∆E′0 = 190 mV   ∆G′0 = −36.7 kJ/mole   (2.66)
2.6.5.1.4 Complex IV
Complex IV transfers two electrons from cytochrome c to oxygen, while two protons are translocated across the membrane. Complex IV contains hemes a and a3 and three copper ions, which are used in electron transfer. Heme a3 is an oxygen-binding heme. The overall reaction in this complex is:
2cyt c(reduced) + 4H+(matrix) + ½O2 → 2cyt c(oxidized) + 2H+(intermembrane space) + H2O   ∆E′0 = 580 mV   ∆G′0 = −112 kJ/mole   (2.67)
Figure 2.9 is a schematic of the ETC in mitochondria. NADH oxidation in an intact mitochondrial ETC is accompanied by the transmembrane extrusion of a maximum of ten protons per mole of NADH or per oxygen atom reduced, while FADH2 oxidation is accompanied by the transmembrane extrusion of six protons per mole of FADH2 or per oxygen atom [6]. This ratio of protons extruded per oxygen atom reduced is referred to as H+/O.

Figure 2.9 Electron transport in mitochondria. [Schematic labels: complex I, ∆E′0 = 360 mV, ∆G′0 = −69.5 kJ/mole; complex II, ∆E′0 = 85 mV, ∆G′0 = −16.4 kJ/mole; complex III, ∆E′0 = 190 mV, ∆G′0 = −36.7 kJ/mole; complex IV, ∆E′0 = 580 mV, ∆G′0 = −112 kJ/mole.]
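The H+/O values quoted for NADH and FADH2 follow from adding up the protons translocated at each complex, and dividing by the protons needed per ATP gives a P/O estimate. The sketch below tallies the per-complex counts stated in Sections 2.6.5.1.1 through 2.6.5.1.4 and assumes four protons per ATP (three for the ATP synthase plus one for transport, the consensus value cited in this section). The dictionary and function names are hypothetical.

```python
# Protons translocated per two electrons at each complex, as stated in the text.
PROTONS_PER_TWO_ELECTRONS = {
    "complex I": 4,
    "complex II": 0,   # no proton pumping
    "complex III": 4,
    "complex IV": 2,
}

# Complexes traversed by electrons from each donor.
ROUTES = {
    "NADH": ["complex I", "complex III", "complex IV"],
    "FADH2": ["complex II", "complex III", "complex IV"],
}

PROTONS_PER_ATP = 4  # assumed: 3 H+ for ATP synthase + 1 H+ for transport


def h_per_o(donor):
    """Protons extruded per oxygen atom reduced for the given donor."""
    return sum(PROTONS_PER_TWO_ELECTRONS[c] for c in ROUTES[donor])


def p_per_o(donor):
    """ATP synthesized per oxygen atom reduced, given PROTONS_PER_ATP."""
    return h_per_o(donor) / PROTONS_PER_ATP


ratios = {donor: (h_per_o(donor), p_per_o(donor)) for donor in ROUTES}
```

Changing `PROTONS_PER_ATP` or removing a coupling site from a route shows directly how the P/O ratio depends on the number of coupling sites, which is the reason a single value is hard to assign to branched bacterial chains.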
The number of moles of ATP (cytosolic) synthesized per oxygen atom reduced (P/O ratio) in the mitochondria has been reported to range from ~0.96 to 2.9 for NADH oxidation and from ~0.88 to 1.9 for succinate oxidation. There is evidence, however, that these ratios are close to 1.0, 1.0, and 0.5 for the three proton pumping sites [92]. Therefore the P/O ratio is 2.5 for NADH-linked substrates and 1.5 for succinate [91,92]. This is based on the consensus value of four protons for ATP synthesis: three protons driving the ATP synthase per ATP (3H+/ATP) and one for transport of Pi, ADP, and ATP [5,91].

2.6.5.2 Electron Transport in Bacteria
The ETC in bacteria is found in the cytoplasmic membrane and is similar to, but more complex than, the mitochondrial ETC. The complexity of the bacterial ETC is attributed to the ability of bacteria to alter their electron transport mechanisms based on growth conditions or available electron donors and acceptors [6]. Bacterial ETCs can be divided into quinone-reducing branches and quinol-oxidizing branches. The reductive branches are responsible for electron transfer from electron donors to quinone and consist of the dehydrogenases, while the oxidative branches are redox systems involved in the oxidation of quinols, cytochromes, and terminal oxidoreductases [93]. Figure 2.10 depicts a generalized ETC in bacteria. Depending on the organism, electrons entering the bacterial ETC can originate from a wide variety of donors including NADH, FADH2, organic substrates, H2, NH3, NO2−, sulfur, sulfide, and ferrous iron. The entrance point of electrons into the ETC depends on the reduction potential of the donor, with more negative potentials entering at higher “levels” of the ETC (NADH dehydrogenase being the highest level). Bacterial terminal electron acceptors include oxygen, nitrate, nitrite, fumarate, sulfate, carbon dioxide, and dimethylsulfoxide.
Of these compounds, oxygen has the largest (positive) reduction potential and its reduction yields the largest Gibbs free energy. The well-studied ETC of E. coli is a good illustration of the complex and branched nature of bacterial ETCs. Proton translocation occurs during NADH oxidation (2H+/e−) via NADH:ubiquinone oxidoreductase (NDH-1), while a second NADH dehydrogenase (NDH-2) does not pump protons [6]. Several different quinol-oxidizing branches connect to terminal reductases or oxidases. Under aerobic conditions the cytochrome complexes bo and bd are formed for quinol oxidation [6]. Cytochrome bo has a lower affinity for oxygen compared to cytochrome bd, and their relative activities depend on the oxygen tension. Cytochrome bo translocates 2H+/e− (one scalar and one vectorial) while cytochrome bd translocates 1H+/e− (one scalar). During anaerobic growth, reductase complexes reduce electron acceptors. For example, nitrate reductase is synthesized for nitrate reduction and fumarate reductase for fumarate reduction. The complexity of the bacterial ETC makes it difficult to postulate a specific value for the P/O ratio, as it depends on the branches involved and the number of coupling sites. P/O ratios are higher in bacterial ETCs compared to mitochondrial ETCs because protons are not required to bring Pi into the cell [6].

Figure 2.10 Generalized bacterial electron transport chain. (a) Electrons from donors enter the ETC through dehydrogenase complexes, the quinone pool or the cytochromes. The ETC can branch at the quinones (quinols) and cytochromes. Under anaerobic conditions, electrons are transferred from the quinols to reductase complexes which reduce the final electron acceptor. Under aerobic conditions, electrons are transferred from quinols to terminal oxidases via multiple routes. (b) Examples of different electron transfer routes employed during aerobic respiration. Cytochromes o and aa3 are terminal oxidases.

2.6.5.3 Efficiency of Energy Conversion
The efficiency of a process is defined as the ratio of energy output to energy input. The efficiency of electron transport in converting free energy to ATP is given as:
ξ = (N × E_ATP / E_react) × 100   (2.68)
where ξ is the efficiency, N is the number of moles of ATP produced, E_ATP is the energy in one high-energy phosphoanhydride bond in ATP or the free energy from ATP hydrolysis, and E_react is the energy released from the transport process.
Example 1: Efficiency of Oxidative Phosphorylation
Oxidation of one mole of NADH in mitochondria under standard biochemical conditions (298 K, 1 atm, pH 7.0) has a theoretical yield of 220 kJ of free energy per mole of NADH. If the mitochondrial P/O ratio is taken as ~2.5, the efficiency of oxidative phosphorylation in mitochondria under standard biochemical conditions is calculated as:
NADH + ½O2 + H+ → NAD+ + H2O   ∆G′0 ≈ −220 kJ/mole   (2.69)
ADP + Pi ↔ ATP + H2O   ∆G′0 = +30.5 kJ/mole   (2.70)
ξ = (2.5 × 30.5 / 220) × 100 ≈ 35%   (2.71)
Example 2: Efficiency of Glucose Catabolism
Under standard biochemical conditions, complete oxidation of glucose to CO2 has a theoretical maximum yield of 2,840 kJ of energy per mole of glucose [5]. If 32 moles of ATP are produced by substrate-level phosphorylation and oxidative phosphorylation, then the efficiency of glucose oxidation is:
ξ = (32 × 30.5 / 2840) × 100 ≈ 34%   (2.72)
Actual values for efficiency differ from these calculated values since cells do not operate under standard biochemical conditions.
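The two worked examples can be checked with a few lines of code. This is a minimal sketch of Equation 2.68 using the magnitudes of the energies given in the examples; the function name is hypothetical.

```python
def efficiency_percent(n_atp, e_atp_kj, e_react_kj):
    """Percent efficiency: energy captured in ATP over energy released.

    n_atp: moles of ATP produced.
    e_atp_kj: free energy of ATP hydrolysis (kJ/mol), taken as a magnitude.
    e_react_kj: free energy released by the process (kJ/mol); the sign is
        ignored so that a negative delta G can be passed directly.
    """
    return n_atp * abs(e_atp_kj) / abs(e_react_kj) * 100


# Example 1: NADH oxidation with a P/O ratio of 2.5.
oxphos = efficiency_percent(2.5, 30.5, -220)

# Example 2: complete glucose oxidation yielding 32 ATP.
glucose = efficiency_percent(32, 30.5, -2840)
```

Rounded to whole percentages, the two results match the 35% and 34% given in the examples.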
2.6.6 Key Parameters Influencing Regulation of Catabolic Pathways
Metabolic activities are regulated to ensure a balance between catabolism and anabolic demands. Catabolic pathways are sensitive to and tightly regulated in response to a number of related signals that reflect the redox potential and energy charge of the cell. This includes the presence of oxygen or other terminal electron acceptors and whether a growth substrate is limiting or in excess. As was described in Section 2.6.2 for the regulation of PDHc and PFL enzymes (involved in pyruvate metabolism), key “control” enzymes of the central pathways are allosterically regulated by central pathway intermediates. While the effectors vary between isozymes and organisms, some general examples include: inhibition of phosphofructokinase by ATP and PEP, activation of pyruvate kinase by fructose-1,6-bisphosphate, inhibition of citrate synthase and isocitrate dehydrogenase by ATP and NADH, and inhibition of α-ketoglutarate dehydrogenase by its products NADH and succinyl-CoA. Transcription factors responsive to the intracellular redox potential also play an important role in regulating catabolic pathway genes. The most characterized are the fumarate nitrate reductase (FNR) and the ArcA/B global regulatory systems in E. coli (reviewed in Ref. [6]). FNR is minimally functional during aerobic growth, but under anaerobic or very low oxidation-reduction potential conditions, FNR acts as a positive regulator of many genes including PFL and fumarate reductase. FNR additionally represses many other aerobically expressed genes such as succinate dehydrogenase (in the TCA cycle) and respiratory cytochromes.
Transcriptional control exerted by the Arc system also includes induction as well as repression and has pronounced effects on central metabolism under microaerobic conditions (where FNR may still be relatively inactive) in addition to anaerobic conditions, in coordination with FNR (many genes are regulated by both FNR and Arc) [94–96]. As examples, the Arc system is responsible (at least partly) for repression of PDHc and TCA cycle genes and induction of PFL. “Catabolite repression” in microbes refers to attenuated activity of inducible enzymes involved in the catabolism of a specific carbon source (i.e., the inducer, such as xylose or arabinose) when there is a surplus of a more “favorable” substrate such as glucose [97,98]. Cyclic AMP (cAMP) acts as an effector of the “catabolite activator protein” (CAP) transcription factor (also called cAMP-receptor protein, or CRP), which regulates transcription of many (or most) catabolic operons [97,99–102]. Glucose metabolism lowers intracellular cAMP concentrations [103] and as a result glucose is preferentially metabolized when present with other substrates whose catabolic genes are under CAP control. Microorganisms thus often exhibit “diauxie” or “diauxic growth” in the presence of sugar mixtures, a phenomenon in which multiple carbon sources are not metabolized simultaneously but rather utilized sequentially [98,104– 106]. Elimination of diauxic growth has been achieved in engineered E. coli strains by elevating the cAMP pool [105,107] and expressing a constitutive CRP mutant [108]. 
In the presence of excess nutrients, organisms often exhibit “overflow metabolism” in which the carbon substrate is assimilated rapidly but incompletely oxidized, resulting in production of some ATP and biomass as well as secretion of overflow metabolites such as ethanol and acetate (even in the presence of ample oxygen) [109–112]. Respiration essentially does not “keep up” with glycolytic flux and glycolysis is not attenuated, so metabolites are secreted to maintain redox balance and ATP production (much like fermentation) [35,111,113–116]. When the high-energy nutrients have been exhausted, cells switch to a survival agenda in which they assimilate the previously excreted organic compounds. Examples of organisms which exhibit overflow metabolism in the presence of excess glucose include E. coli (acetate accumulates and is later metabolized in what is referred to as the “acetate switch”) [117], Saccharomyces cerevisiae (ethanol accumulates) [111,118,119], and Klebsiella aerogenes (excretes 1,3-propanediol) [120]. As noted, the TCA cycle is subject to multi-tiered regulation at the levels of transcription and enzyme activity [64,65,121,122]. Under anaerobic conditions enzyme activities are 10- to 20-fold lower than those during aerobic growth [51,64,66,121,123]. High catabolic flux caused by excess glucose results in repression of TCA cycle/respiratory pathways (the “Crabtree” effect), influenced by a combination of factors including reduced gene expression (catabolite repression) and inhibition by elevated levels of “energetic” metabolites [64,113,122,124]. From a metabolic engineering perspective, it is often desirable
to alleviate this repression and increase flux through the TCA cycle or specific TCA cycle enzymes. This has been demonstrated through deletion of the redox-sensitive regulators (ArcA and FNR) [61,113] and through use of a constitutive CRP mutant [108].
2.7 Concluding Remarks
While not comprehensive, this overview of catabolism and metabolic fueling processes describes the pathways of primary concern to metabolic engineers. Although innumerable variations are found throughout nature, the basic energy conservation mechanisms described here are used by all organisms. For example, while we have not described the roles of nonproton ion gradients in cellular energetics/ATP synthesis (a crucial component of methanogen metabolism), or a variety of unique features of archaeal respiratory chains [125], very similar biomolecular components and energy coupling mechanisms exist, and the same thermodynamic relationships apply. An understanding of these fundamental, naturally devised processes that harness energy for life should help to facilitate approaches to engineer strains with improved properties (e.g., to overcome growth deficits in engineered strains or redirect high-energy intermediates toward energy-dependent pathways of interest). Knowledge of the metabolic couplings between growth, ATP production, and redox balancing is additionally necessary for constraining metabolic networks through genetic engineering, to develop genetic selections that guide the design of strains with unique phenotypes. An ever-growing need for sustainability through renewable and biologically derived products brings metabolic engineering to the technology forefront. In this context, bioprocess efficiency is of central importance; with regard to distribution of carbon and energy, it is not difficult to envision scenarios in which an optimal tradeoff exists between cell growth and product formation. Vital to our success in this arena is an ability to restructure catabolic pathways, made possible by the genomic revolution and the limitless insights and tools that it provides.
Acknowledgment
The authors thank the National Science Foundation for financial support (grant No. BES0519516).
References
1. Woese, C.R., O. Kandler, and M.L. Wheelis. Towards a natural system of organisms: proposal for the domains archaea, bacteria, and eucarya. PNAS, 87 (12), 4576–4579, 1990.
2. Purves, W.K., et al. Life: The Science of Biology. 6th ed. Sunderland, MA: Sinauer Associates Inc., 2001.
3. Todar, K. The diversity of metabolism in prokaryotes. In Todar’s Online Textbook of Bacteriology, 2004. www.textbookofbacteriology.net/metabolism.html
4. Ingraham, J.L., O. Maaloe, and F.C. Neidhardt. Growth of the Bacterial Cell. Sunderland, MA: Sinauer Associates Inc., 1983.
5. Nelson, D. and M. Cox. Lehninger Principles of Biochemistry. 4th ed. New York: W.H. Freeman and Company, 2005.
6. White, D. The Physiology and Biochemistry of Prokaryotes. 2nd ed. New York: Oxford University Press, 2000.
7. Jackson, J.B. The proton-translocating nicotinamide adenine dinucleotide transhydrogenase. J. Bioenerg. Biomembr., 23 (5), 715–741, 1991.
8. Bizouarn, T., et al. The involvement of NADP(H) binding and release in energy transduction by proton-translocating nicotinamide nucleotide transhydrogenase from Escherichia-coli. Biochim. Biophys. Acta Bioenergetics, 1229 (1), 49–58, 1995.
9. Hutton, M., et al. Kinetic resolution of the reaction catalyzed by proton-translocating transhydrogenase from Escherichia coli as revealed by experiments with analogs of the nucleotide substrates. Eur. J. Biochem., 219 (3), 1041–1051, 1994.
10. Jackson, J.B., et al. Proton-translocating transhydrogenase in bacteria. Biochem. Soc. Trans., 21 (4), 1010–1013, 1993.
11. Sauer, U., et al. The soluble and membrane-bound transhydrogenases UdhA and PntAB have divergent functions in NADPH metabolism of Escherichia coli. J. Biol. Chem., 279 (8), 6613–6619, 2004.
12. Weckbecker, A. and W. Hummel. Improved synthesis of chiral alcohols with Escherichia coli cells co-expressing pyridine nucleotide transhydrogenase, NADP(+)-dependent alcohol dehydrogenase and NAD(+)-dependent formate dehydrogenase. Biotechnol. Lett., 26 (22), 1739–1744, 2004.
13. Canonaco, F., et al. Metabolic flux response to phosphoglucose isomerase knock-out in Escherichia coli and impact of overexpression of the soluble transhydrogenase UdhA. FEMS Microbiol. Lett., 204 (2), 247–252, 2001.
14. Boonstra, B., et al. Cofactor regeneration by a soluble pyridine nucleotide transhydrogenase for biological production of hydromorphone. Appl. Environ. Microbiol., 66 (12), 5161–5166, 2000.
15. Sanchez, A.M., et al. Effect of overexpression of a soluble pyridine nucleotide transhydrogenase (UdhA) on the production of poly(3-hydroxybutyrate) in Escherichia coli. Biotechnol. Prog., 22 (2), 420–425, 2006.
16. Kirsch, M. and H. De Groot. NAD(P)H, a directly operating antioxidant? FASEB J., 15 (9), 1569–1574, 2001.
17. Mortenson, L.E., J.E. Carnahan, and R.C. Valentine. An electron transport factor from Clostridium pasteurianum. Biochem. Biophys. Res. Comm., 7 (6), 448–452, 1962.
18. Moulis, J.M., et al. Crystal structure of the 2[4Fe-4S] ferredoxin from Chromatium vinosum: Evolutionary and mechanistic inferences for [3/4Fe-4S] ferredoxins. Protein Sci., 5 (9), 1765–1775, 1996.
19. Matsubara, H. and K. Saeki. Structural and functional diversity of ferredoxins and related proteins. Adv. Inorg. Chem., 38, 223–280, 1992.
20. Grinberg, A.V., et al. Adrenodoxin: Structure, stability, and electron transfer properties. Prot. Struct. Func. Genet., 40 (4), 590–612, 2000.
21. Yeh, A.P., et al. Structure of a thioredoxin-like [2Fe-2S] ferredoxin from Aquifex aeolicus. J. Mol. Biol., 300 (3), 587–595, 2000.
22. Meyer, J. Ferredoxins of the third kind. FEBS Lett., 509 (1), 1–5, 2001.
23. Ciurli, S. and F. Musiani. High potential iron-sulfur proteins and their role as soluble electron carriers in bacterial photosynthesis: tale of a discovery. Photosynth. Res., 85 (1), 115–131, 2005.
24. Puustinen, A. and M. Wikstrom. The heme groups of cytochrome o from Escherichia coli. PNAS, 88 (14), 6122–6126, 1991.
25. Gottschalk, G. Bacterial Metabolism. 2nd ed. New York: Springer-Verlag, 1986.
26. Milano, F. The influence of chemical environment on the final electronic acceptors in Bacterial Reaction Centres. Universita Degli Studi Di Bari, 2001.
27. Blankenship, R.E., M.T. Madigan, and C. Bauer. Anoxygenic Photosynthetic Bacteria. Dordrecht, the Netherlands: Kluwer Academic, 1995.
28. Madigan, M.T. and J.M. Martinko. Brock Biology of Microorganisms. 11th ed. Upper Saddle River, NJ: Pearson Prentice Hall, 2006.
29. Schonheit, P. and T. Schafer. Metabolism of hyperthermophiles. World J. Microbiol. Biotechnol., 11 (1), 26–57, 1995.
30. Amesz, J. Green photosynthetic bacteria and heliobacteria. In Variations in Autotrophic Life. ed. J.M. Shively and L.L. Barton. New York: Academic Press, 1991.
31. van der Meer, M.T.J., et al. Stable carbon isotope fractionations of the hyperthermophilic crenarchaeon Metallosphaera sedula. FEMS Microbiol. Lett., 196 (1), 67–70, 2001.
32. Schouten, S., et al. Stable carbon isotopic fractionations associated with inorganic carbon fixation by anaerobic ammonium-oxidizing bacteria. Appl. Environ. Microbiol., 70 (6), 3785–3788, 2004.
33. Wood, H.G. and L.G. Ljungdahl. Autotrophic character of the acetogenic bacteria. In Variations in Autotrophic Life. ed. J.M. Shively and L.L. Barton. New York: Academic Press, 1991.
2-32
Cellular Metabolism
34. Conway, T. The Entner-Doudoroff Pathway—History, physiology and molecular-biology. FEMS Microbiol. Rev., 103 (1), 1–28, 1992. 35. Holms, H. Flux analysis: A basic tool of microbial physiology. Adv. Microbial Phys., 45, 271–340, 2001. 36. Holms, H. Flux analysis and control of the central metabolic pathways in Escherichia coli. FEMS Microbiol. Rev., 19 (2), 85–116, 1996. 37. Sauer, U. High-throughput phenomics: experimental methods for mapping fluxomes. Curr. Opin. Biotechnol., 15 (1), 58–63, 2004. 38. Sauer, U., et al. Metabolic flux ratio analysis of genetic and environmental modulations of Escherichia coli central carbon metabolism. J. Bacteriol., 181 (21), 6679–6688, 1999. 39. Emmerling, M., et al. Metabolic flux responses to pyruvate kinase knockout in Escherichia coli. J. Bacteriol., 184 (1), 152–164, 2002. 40. Sahm, H., L. Eggeling, and A.A. de Graaf. Pathway analysis and metabolic engineering in Corynebacterium glutamicum. Biol. Chem., 381 (9–10), 899–910, 2000. 41. de Graaf, A.A., et al. Metabolic state of Zymomonas mobilis in glucose-, fructose-, and xylose-fed continuous cultures as analysed by C-13- and P-31-NMR spectroscopy. Arch. Microbiol, 171 (6), 371–385, 1999. 42. Yang, C., Q. Hua, and K. Shimizu. Quantitative analysis of intracellular metabolic fluxes using GC-MS and two-dimensional NMR spectroscopy. J. Biosci. Bioeng., 93 (1), 78–87, 2002. 43. de Graef, M.R., et al. The steady-state internal redox state (NADH/NAD) reflects the external redox state and is correlated with catabolic adaptation in Escherichia coli. J. Bacteriol., 181 (8), 2351–2357, 1999. 44. Cassey, B., J.R. Guest, and M.M. Attwood. Environmental control of pyruvate dehydrogenase complex expression in Escherichia coli. FEMS Microbiol. Lett., 159 (2), 325–329, 1998. 45. Guest, J.R., et al. Regulatory and other aspects of pyruvate dehydrogenase complex synthesis in Escherichia coli. In Biochemistry and Physiology of Thiamine Diphosphate Enzymes. eds. H. 
Bisswanger and Schellenberger, A. Prien, Germany: Intemann, 1996. 46. Kessler, D. and Knappe, J. Anaerobic dissimilation of pyruvate. In Escherichia coli and Salmonella: Cellular and Molecular Biology. ed. F.C. Neidhardt, et al. Washington, DC: ASM Press, 1996. 47. Quail, M.A., D.J. Haydon, and J.R. Guest. The PdhR-Acef-Ipd operon of Escherichia-Coli expresses the pyruvate-dehydrogenase complex. Mol. Microbiol., 12 (1), 95–104, 1994. 48. Kaiser, M. and G. Sawers. Pyruvate formate-lyase is not essential for nitrate respiration by Escherichia coli. FEMS Microbiol. Lett., 117 (2), 163–168, 1994. 49. Hansen, R.G. and U. Henning. Regulation of pyruvate dehydrogenase activity in Escherichia Coli K-12. Biochim. Biophys. Acta, 122 (2), 355–358, 1966. 50. Snoep, J.L., et al. Differences in sensitivity to NADH of purified pyruvate dehydrogenase complexes of Enterococcus faecalis, Lactococcus lactis, Azotobacter vinelandii and Escherichia coli - Implications for their activity in-vivo. FEMS Microbiol. Lett., 114 (3), 279–283, 1993. 51. Iuchi, S. and E.C.C. Lin. ArcA (Dye), a global regulatory gene in Escherichia coli mediating repression of enzymes in aerobic pathways. PNAS, 85 (6), 1888–1892, 1988. 52. Malpica, R., et al. Identification of a quinone-sensitive redox switch in the ArcB sensor kinase. PNAS, 101 (36), 13318–13323, 2004. 53. Nystrom, T., C. Larsson, and L. Gustafsson. Bacterial defense against aging: Role of the Escherichia coli ArcA regulator in gene expression, readjusted energy flux and survival during stasis. EMBO J., 15 (13), 3219–3228, 1996. 54. Malpica, R., et al. Signaling by the arc two-component system provides a link between the redox state of the quinone pool and gene expression. Antioxidants & Redox Signaling, 8 (5–6), 781–795, 2006. 55. Snoep, J.L., et al. Involvement of pyruvate dehydrogenase in product formation in pyruvate limited anaerobic chemostat cultures of Enterococcus faecalis NCTC-775. Arch. Microbiol, 154 (1), 50–55, 1990.
Catabolism and Metabolic Fueling Processes
2-33
56. Abdel-Hamid, A.M., M.M. Attwood, and J.R. Guest. Pyruvate oxidase contributes to the aerobic growth efficiency of Escherichia coli. Microbiol.-Sgm, 147, 1483–1498, 2001. 57. Brown, T.D.K., M.C. Jonesmortimer, and H.L. Kornberg. Enzymic interconversion of acetate and acetyl-coenzyme A in Escherichia coli. J. Gen. Appl. Microbiol., 102 (October), 327–336, 1977. 58. Ingram, L.O., et al. Metabolic engineering of bacteria for ethanol production. Biotechnol. Bioeng., 58 (2–3), 204–214, 1998. 59. Aristidou, A. and M. Penttila. Metabolic engineering applications to renewable resource utilization. Curr. Opin. Biotechnol., 11 (2), 187–198, 2000. 60. Miller, S.L. and D. Smithmagowan. The thermodynamics of the Krebs cycle and related-compounds. J. Phys. Chem. Ref. Data, 19 (4), 1049–1073, 1990. 61. Prohl, C., et al. Functional citric acid cycle in an arcA mutant of Escherichia coli during growth with nitrate under anoxic conditions. Arch. Microbiol, 170 (1), 1–7, 1998. 62. Spangler, W.J. and C.M. Gilmour. Biochemistry of nitrate respiration in Pseudomonas stutzeri. I. Aerobic and nitrate respiration routes of carbohydrate catabolism. J. Bacteriol., 91 (1), 245–250, 1966. 63. Selig, M. and P. Schonheit. Oxidation of organic compounds to carbon dioxide with sulfur or thiosulfate as electron acceptor in the anaerobic hyperthermophilic archaea Thermoproteus tenax and Pyrobaculum islandicum proceeds via the citric acid cycle. Arch. Microbiol, 162 (4), 286–294, 1994. 64. Cronan, J.E. and D. Laporte. Tricarboxylic acid cycle and glyoxylate bypass. In Escherichia coli and Salmonella: Cellular and Molecular Biology. ed. F.C. Neidhardt, et al. Washington, DC: ASM Press, 1996. 65. Amarasin, C.R. and B.D. Davis. Regulation of alpha-Ketoglutarate dehydrogenase formation In Escherichia coli. J. Biol. Chem., 240 (9), 36642–3668, 1965. 66. Smith, M.W. and F.C. Neidhardt. 2-oxoacid dehydrogenase complexes of Escherichia coli - Cellular amounts and patterns of synthesis. J. 
Bacteriol., 156 (1), 81–88, 1983. 67. Owen, O.E., S.C. Kalhan, and R.W. Hanson. The key role of anaplerosis and cataplerosis for citric acid cycle function. J. Biol. Chem., 277 (34), 30409–30412, 2002. 68. Stols, L. and M.I. Donnelly. Production of succinic acid through overexpression of NAD( + )-dependent malic enzyme in an Escherichia coli mutant. Appl. Environ. Microbiol., 63 (7), 2695–2701, 1997. 69. March, J.C., M.A. Eiteman, and E. Altman. Expression of an anaplerotic enzyme, pyruvate carboxylase, improves recombinant protein production in Escherichia coli. Appl. Environ. Microbiol., 68 (11), 5620–5624, 2002. 70. Yang, C., et al. Analysis of Escherichia coli anaplerotic metabolism and its regulation mechanisms from the metabolic responses to altered dilution rates and phosphoenolpyruvate carboxykinase knockout. Biotechnol. Bioeng., 84 (2), 129–144, 2003. 71. Petersen, S., et al. In vivo quantification of parallel and bidirectional fluxes in the anaplerosis of Corynebacterium glutamicum. J. Biol. Chem., 275 (46), 35932–35941, 2000. 72. Lin, H., et al. Increasing the acetyl-CoA pool in the presence of overexpressed phosphoenolpyruvate carboxylase or pyruvate carboxylase enhances succinate production in Escherichia coli. Biotechnol. Prog., 20 (5), 1599–1604, 2004. 73. Riedel, C., et al. Characterization of the phosphoenolpyruvate carboxykinase gene from Corynebacterium glutamicum and significance of the enzyme for growth and amino acid production. J. Mol. Microbiol, 3 (4), 573–583, 2001. 74. Petersen, S., et al. Metabolic consequences of altered phosphoenolpyruvate carboxykinase activity in Corynebacterium glutamicum reveal anaplerotic regulation mechanisms in vivo. Metab. Eng., 3 (4), 344–361, 2001. 75. Gokarn, R.R., M.A. Eiteman, and E. Altman. Metabolic analysis of Escherichia coli in the presence and absence of the carboxylating enzymes phosphoenolpyruvate carboxylase and pyruvate carboxylase. Appl. Environ. Microbiol., 66 (5), 1844–1850, 2000. 76. 
Millard, C.S., et al. Enhanced production of succinic acid by overexpression of phosphoenolpyruvate carboxylase in Escherichia coli. Appl. Environ. Microbiol., 62 (5), 1808–1810, 1996.
2-34
Cellular Metabolism
77. Sanchez, A.M., G.N. Bennett, and K.Y. San. Efficient succinic acid production from glucose through overexpression of pyruvate carboxylase in an Escherichia coli alcohol dehydrogenase and lactate dehydrogenase mutant. Biotechnol. Prog., 21 (2), 358–365, 2005. 78. Flores, N., et al. Pathway engineering for the production of aromatic compounds in Escherichia coli. Nature Biotechnol., 14 (5), 620–623, 1996. 79. Clark, D.P. and J. E. Cronan. Two-carbon compounds and fatty acids as carbon sources. In Escherichia coli and Salmonella: Cellular and Molecular Biology. ed. F.C. Neidhardt, et al. Washington, DC: ASM Press, 1996. 80. Farmer, W.R. and J.C. Liao. Reduction of aerobic acetate production by Escherichia coli. Appl. Environ. Microbiol., 63 (8), 3205–3210, 1997. 81. Lin, H., G.N. Bennett, and K.Y. San. Metabolic engineering of aerobic succinate production systems in Escherichia coli to improve process productivity and achieve the maximum theoretical succinate yield. Metab. Eng., 7 (2), 116–127, 2005. 82. Pavarina, E.C. and L.R. Durrant. Growth of lignocellulosic-fermenting fungi on different substrates under low oxygenation conditions. Appl. Biochem. Biotechnol., 98, 663–677, 2002. 83. Shukla, V.B., et al. Production of D(-)-lactate from sucrose and molasses, Biotechnol. Lett., 26 (9), 689–693, 2004. 84. Zhou, S.D., et al. Production of optically pure D-lactic acid in mineral salts medium by metabolically engineered Escherichia coli W3110. Appl. Environ. Microbiol., 69 (1), 399–407, 2003. 85. Alterthum, F. and L.O. Ingram. Efficient ethanol-production from glucose, lactose, and xylose by recombinant Escherichia-coli. Appl. Environ. Microbiol., 55 (8), 1943–1948, 1989. 86. Ingram, L.O., et al. Genetic engineering of ethanol production in Escherichia coli. Appl. Environ. Microbiol., 53 (10), 2420–2425, 1987. 87. Lin, H., G.N. Bennett, and K.Y. San. 
Genetic reconstruction of the aerobic central metabolism in Escherichia coli for the absolute aerobic production of succinate. Biotechnol. Bioeng., 89 (2), 148–156, 2005. 88. Chang, D.E., et al. Homofermentative production of D- or L-lactate in metabolically engineered Escherichia coli RR1. Appl. Environ. Microbiol., 65 (4), 1384–1389, 1999. 89. Vemuri, G.N., M.A. Eiteman, and E. Altman. Succinate production in dual-phase Escherichia coli fermentations depends on the time of transition from aerobic to anaerobic conditions. J. Ind. Microbiol. Biotechnol., 28 (6), 325–332, 2002. 90. Mitchell, P. Coupling of phosphorylation to electron and hydrogen transfer by a chemi-osmotic type of mechanism. Nature, 191 (478), 144–148, 1961. 91. von Jagow, G., B.M. Geier, and T.A. Link. The mammalian mitochondrial respiratory chain. In Bioenergetics. ed. P. Graber and G. Milazzo. Basel, Switzerland: Birkhauser Verlag, 1997. 92. Hinkle, P.C. P/O ratios of mitochondrial oxidative phosphorylation. Biochim. Biophys. Acta-Bioenergetics, 1706 (1–2), 1–11, 2005. 93. Thony-Meyer, L. Biogenesis of respiratory cytochromes in bacteria. Microbiol. Mol. Biol. Rev., 61 (3), 337–376, 1997. 94. Alexeeva, S., et al. Effects of limited aeration and of the ArcAB system on intermediary pyruvate catabolism in Escherichia coli. J. Bacteriol., 182 (17), 4934–4940, 2000. 95. Alexeeva, S., K.J. Hellingwerf, and M.J.T. de Mattos. Requirement of ArcA for redox regulation in Escherichia coli under microaerobic but not anaerobic or aerobic conditions. J. Bacteriol., 185 (1), 204–209, 2003. 96. Levanon, S.S., K.Y. San, and G.N. Bennett. Effect of oxygen on the Escherichia coli ArcA and FNR regulation systems and metabolic responses. Biotechnol. Bioeng., 89 (5), 556–564, 2005. 97. Todar, K. Regulation and control of metabolic activity. Todar’s Online Textbook of Bacteriology, 2004. www.textbookofbacteriology.net/regulation.html 98. Stulke, J. and W. Hillen. Carbon catabolite repression in bacteria. Curr. Opin. 
Microbiol., 2 (2), 195–201, 1999.
Catabolism and Metabolic Fueling Processes
2-35
99. Pastan, I. and S. Adhya. Cyclic adenosine 5’-monophosphate in Escherichia coli. Bacteriol. Rev., 40 (3), 527–551, 1976. 100. Botsford, J.L. and J.G. Harman. Cyclic-Amp in prokaryotes. Microbiol. Rev., 56 (1), 100–122, 1992. 101. Kolb, A., et al. Transcriptional regulation by cAMP and its receptor protein. Ann. Rev. Biochem., 62, 749–795, 1993. 102. Saier, M.H., T. M., Ramseier, and Reizer, J. Regulation of carbon utilization. In Escherichia coli and Salmonella: Cellular and Molecular Biology. ed. F.C. Neidhardt, et al. Washington, DC: ASM Press, 1996. 103. Saier, M.H., T. M., Ramseier, and Reizer, J. Regulation of carbon utilization. In Escherichia coli and Salmonella: Cellular and Molecular Biology. 2nd ed. ed. F.C. Neidhardt, et al. Washington, DC: ASM Press, 1996. 104. Monod, J. Recherches sur la croissance des cultures Bactériennes. Actualite’s Scientifique et Industrielles, 911, 1–215, 1942. 105. Hernandez-Montalvo, V., et al. Characterization of sugar mixtures utilization by an Escherichia coli mutant devoid of the phosphotransferase system. Appl. Biochem. Biotechnol., 57 (1–2), 186–191, 2001. 106. Epstein, W., L.B. Rothmandenes, and J. Hesse. Adenosine 3’-5’-cyclic monophosphate as mediator of catabolite repression in Escherichia-coli. PNAS, 72 (6), 2300–2304, 1975. 107. Dien, B.S., N.N. Nichols, and R.J. Bothast. Fermentation of sugar mixtures using Escherichia coli catabolite repression mutants engineered for production of L-lactic acid. J. Ind. Microbiol. Biotechnol., 29 (5), 221–227, 2002. 108. Cirino, P.C., J.W. Chin, and L.O. Ingram. Engineering Escherichia coli for xylitol production from glucose-xylose mixtures. Biotechnol. Bioeng., 95 (6), 1167–1176, 2006. 109. Hollywood, N. and H.W. Doelle. Effect of specific growth-rate and glucose concentration on growth and glucose-metabolism of Escherichia coli K-12. Microbios, 17 (67), 23–33, 1976. 110. Andersen, K.B. and K. Vonmeyenburg. 
Are growth-rates of Escherichia coli in batch cultures limited by respiration. J. Bacteriol., 144 (1), 114–123, 1980. 111. Kaeppeli, O., Sonnleitner, B. Regulation of sugar metabolism in Saccharomyces-type yeast: Experimental and conceptual considerations. CRC Crit. Rev. Biotechnol., 4, (3), 299–326, 1986. 112. Meyer, H.P., C. Leist, and A. Fiechter. Acetate formation in continuous culture of Escherichia coli K12-D1 on defined and complex media. J. Biotechnol, 1 (5–6), 355–358, 1984. 113. Vemuri, G.N., et al. Overflow metabolism in Escherichia coli during steady-state growth: Transcriptional regulation and effect of the redox ratio. Appl. Environ. Microbiol., 72 (5), 3653–3661, 2006. 114. El-Mansi, E.M.T. and Holms, W.H. Control of carbon flux to acetate excretion during growth of Escherichia coli in batch and continuous cultures. J. Gen. Microbiol., 135, 2875–2883, 1989. 115. Andersen, K.B. and K.V. Meyenburg. Charges of nicotinamide adenine-nucleotides and adenylate energy-charge as regulatory parameters of metabolism in Escherichia coli. J. Biol. Chem., 252 (12), 4151–4156, 1977. 116. Majewski, R.A. and M.M. Domach. Simple constrained-optimization view of acetate overflow in Escherichia coli. Biotechnol. Bioeng., 35 (7), 732–738, 1990. 117. Wolfe, A.J. The acetate switch. Microbiol. Mol. Biol. Rev., 69 (1), 12–50, 2005. 118. Pham, H.T.B., G. Larsson, and S.O. Enfors. Modelling of aerobic growth of Saccharomyces cerevisiae in a pH-auxostat. Bioprocess Eng., 20 (6), 537–544, 1999. 119. Pham, H.T.B., G. Larsson, and S.O. Enfors. Precultivation technique for studies of microorganisms exhibiting overflow metabolism. Biotechnol. Tech., 13 (1), 75–80, 1999. 120. Streekstra, H., et al. Overflow metabolism during anaerobic growth of Klebsiella aerogenes NCTC 418 on glycerol and dihydroxyacetone in chemostat culture. Arch. Microbiol, 147 (3), 268–275, 1987. 121. Gray, C.T., J.W. Wimpenny, and M.R. Mossman. Regulation of metabolism in facultative bacteria. 2. 
Effects of aerobiosis anaerobiosis and nutrition on formation of krebs cycle enzymes in Escherichia coli. Biochim. Biophys. Acta, 117 (1), 33–41, 1966.
2-36
Cellular Metabolism
122. Spencer, M.E. and J.R. Guest. Regulation of citric acid cycle genes in facultative bacteria. Microbiol. Sci., 4 (6), 164–168, 1987. 123. Guest, J.R. and G.C. Russell. Complexes and complexities of the citric acid cycle in Escherichia coli. Curr. Top. Cell. Regul., 33, 231–247, 1992. 124. Miles, J.S. and J.R. Guest. Molecular genetic aspects of the citric acid cycle of Escherichia coli. Biochem. Soc. Symp., 54, 45–65, 1989. 125. Schafer, G., M. Engelhard, and V. Muller. Bioenergetics of the archaea. Microbiol. Mol. Biol. Rev., 63 (3), 570–620, 1999.
3 Biosynthesis of Cellular Building Blocks: The Prerequisites of Life
Zachary L. Fowler, Effendi Leonard, and Mattheos Koffas
State University of New York at Buffalo
3.1 Introduction
Incorporation of Biological Atoms into the Cells • Cellular Building Blocks Are Derived from Central Precursors
3.2 Amino Acid Biosynthesis
The 2-Oxoglutarate Derived Amino Acids • The Aspartate Family and Branch-Chained Amino Acids • Serine-Glycine Family of Amino Acids • Aromatic Nonpolar Amino Acids • Histidine Biosynthesis
3.3 Nucleotides as Building Blocks
Structure and Organization of Nucleotides • Biosynthesis of Nucleotides • Nucleotide Metabolic Conditions
3.4 Synthesis of Carbohydrates for Building Cells
Introduction to Carbohydrates • Interconversion of Carbohydrates • Other Import Enzymes of Carbohydrate Synthesis
3.5 Cell Synthesis of Lipids
Introduction to Lipids and Fatty Acids • Sources of Metabolites for Lipogenesis • Synthesis of Long-Chain Fatty Acids • Synthesis of Neutral Lipids
References
3.1 Introduction
The vast array of secondary metabolites critical for the production of commercial pharmaceuticals, food products, and other consumer goods is synthesized from the most basic cellular building blocks. This chapter covers the fundamentals of their biosynthesis. Most of the biosynthetic routes described here were established in studies of well-characterized prokaryotes, especially the enteric bacterium Escherichia coli. However, particularly for amino acid and nucleotide biosynthesis, cellular biochemistry is largely universal, so the fundamental (primary) biosynthetic pathways can be assumed to be similar, to some extent, in all organisms.
Amino acids, nucleotides, sugars, and fatty acids make up the major building blocks used in the synthesis of macromolecular cellular components, including protein polymers, nucleic acids (deoxyribonucleic acid, DNA, and ribonucleic acid, RNA), cellular structures, and lipids. Cellular building blocks are typically assembled from degradation products of central precursor metabolites, specifically acetyl-CoA, pyruvate, oxaloacetate, 2-oxoglutarate, phosphoenolpyruvate (PEP), and the hexose phosphates, which are themselves derived from organic, simple inorganic, low-molecular-weight, and macromolecular molecules. Cells constantly shift between anabolic and catabolic states, generating and consuming the building blocks in an effort to maintain the delicate equilibrium essential to proper cellular functioning while growing on environmental substrates. Because most growth substrates eventually lose nitrogen, phosphorus, and sulfur during degradation, the central precursor metabolites contain only carbon, hydrogen, oxygen, and, in some cases, phosphate. As a result, cells also require compounds rich in nitrogen, phosphate, sulfate, and one-carbon units, as well as energy (mostly in the form of phosphoric acid anhydride bonds in adenosine triphosphate, ATP) and a reductant in the form of the hydride of NADPH.
3.1.1 Incorporation of Biological Atoms into the Cells
3.1.1.1 Nitrogen
Nitrogen enters a cell's metabolism in several oxidation states, including NH4⁺, NO3⁻, and N2, as well as through urea, amino acids, and purine and pyrimidine metabolites; however, the most common route for utilization of inorganic nitrogen is generally ammonia. Reactions involving nitrogen assimilation in all organisms lead to the formation of central nitrogen carriers, primarily glutamate and glutamine, and to a lesser extent the secondary carriers aspartate and carbamoyl phosphate. Carbamoyl phosphate is unique in that it is utilized only for the biosynthesis of arginine, urea, and pyrimidine nucleotides. As such, most nitrogen transfer from ammonia to the amino acids and other nitrogen-containing molecules is conducted through the two primary nitrogen carriers, glutamate and glutamine.
By and large, the process from substrate uptake to cellular component synthesis using nitrogen is carried out in three steps. First, and only after nitrogen sources from the environment are transported into the cells, ammonia is formed through the reduction of nitrite or N2 or by hydrolytic release from nitrogen-containing organic substrates. Second, the ammonia is transferred to form glutamate and glutamine. Glutamine is the direct nitrogen donor for most nitrogen-containing building blocks, except for the α-amino group of the amino acids, for which glutamine is only an indirect donor; glutamate, on the other hand, is the universal amino donor for α-amino acids. In E. coli, ammonia assimilation proceeds through glutamine synthesis mediated by L-glutamine synthetase when the cell is not under energy limitation, or through a glutamate dehydrogenase (GDH) pathway when energy in the cell is at a premium and phosphate levels are high. The GDH pathway is the primary ammonia assimilation route in fungi and yeast. In the third step, nitrogen is finally delivered by transferring the α-amino group of L-glutamate to 2-oxo acids, the direct precursors of α-amino acids, by aminotransferase enzymes. In general, most cell types possess a variety of aminotransferases, each of which is responsible for the amination of a particular group of 2-oxo acids.
3.1.1.2 Sulfur
Elemental sulfur, which can be formed by microbial or air oxidation, has a low solubility; therefore, it is used by only a few prokaryotes as an electron acceptor in anaerobic respiration. On the other hand, sulfate (SO4²⁻) and thiosulfate (S2O3²⁻) are the principal sulfur sources for prokaryotes. Under aerobic respiration, sulfur sources are commonly transported into the cells by an ATP-dependent, high-affinity, binding-protein-dependent (ABC) transport system, while the low-affinity transport system is typically inactive even if present. Anaerobic respiration results in sulfate reduction to H2S by a variety of sulfate-reducing enzymes, a process that has become highly evolved in geothermal bacteria. In many cases, cellular metabolism finds itself in a condition between aerobic and anaerobic, where thiosulfate and H2S (the preferred substrates of anaerobic bacteria) are both present. H2S is highly toxic to most cells; therefore, it is immediately incorporated into the amino acid cysteine by the enzyme O-acetylserine sulfhydrylase. Intracellular sulfur-containing compounds, for instance methionine, coenzyme A, lipoic acid, thiamine pyrophosphate, and glutathione, ultimately obtain their sulfur from cysteine. Through a similar enzyme, thiosulfate is also incorporated into cysteine, forming cysteine thiosulfate. While the transport of sulfate and the synthesis of cysteine are not directly linked, both are controlled by the complex cys regulon: excess cysteine causes feedback inhibition of O-acetylserine formation, and O-acetylserine is a critical inducer of expression of the sulfate-reducing enzymes.
3.1.1.3 Phosphate
Since it is readily oxidized by air, phosphorus exists mostly in the form of phosphate, which is the main inorganic phosphorus source of prokaryotes and does not need further reduction. The two phosphate forms that exist in neutral pH environments, H2PO4⁻ and HPO4²⁻, are readily transported into the cells by distinct transport systems. At high concentrations (>0.1 mM), a constitutively expressed, relatively unspecific, low-affinity system driven by proton motive force through proton symport operates with high capacity to relieve the cell of excess phosphorus levels. When phosphate is present at low concentrations (<10 µM), transport into the cells is governed by a well-regulated, high-affinity, binding-protein-dependent system driven by the hydrolysis of ATP. Once transported into the cells, phosphate is transferred to ADP by ATP synthase to form ATP, the universal phosphate carrier and cellular energy source.
3.1.1.4 One-Carbon Units and Oxygen
The term one-carbon (C1) unit refers to carbon atoms separated from other carbon atoms by heteroatoms such as oxygen, nitrogen, and sulfur. Biosynthesis of C1 units originates from coenzymes containing C1 units bound to a nitrogen or sulfur atom. In most cases, C1 units are generated from serine or glycine by serine hydroxymethyltransferase or glycine decarboxylase, respectively, and transferred to tetrahydrofolate (H4F) to form methylene-H4F. Methylene-H4F is either reduced to methyl-H4F or successively oxidized to the formyl level; these C1-H4F compounds serve as C1 donors in a number of reactions, for example purine and methionine biosynthesis. In general, molecular oxygen is not required in biosynthetic reactions because essentially all cellular oxygen is obtained from water. However, O2 is required in a few hydroxylation reactions and in the introduction of double bonds performed by some obligately aerobic bacteria to synthesize ubiquinone or monounsaturated and polyunsaturated fatty acids.
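The observation in Section 3.1.1.3 that both H2PO4⁻ and HPO4²⁻ are present at neutral pH follows from the Henderson-Hasselbalch relation. The sketch below is illustrative only: the pKa2 of about 7.2 is an assumed textbook value that does not appear in this chapter, and the function names are hypothetical. It also encodes the two concentration thresholds quoted in that section; the text does not say which system dominates between 10 µM and 0.1 mM, so that range is reported as unspecified.

```python
# Sketch: phosphate speciation near neutral pH and the uptake regimes
# described in Section 3.1.1.3.
# Assumption (not from the chapter): pKa2 of phosphoric acid is about 7.2.

PKA2_PHOSPHATE = 7.2  # assumed value for H2PO4- <-> HPO4(2-) + H+


def phosphate_fractions(ph, pka2=PKA2_PHOSPHATE):
    """Return (fraction as H2PO4-, fraction as HPO4 2-) at the given pH.

    Uses the Henderson-Hasselbalch relation and ignores H3PO4 and PO4(3-),
    which are negligible near neutral pH.
    """
    ratio = 10 ** (ph - pka2)  # [HPO4 2-] / [H2PO4-]
    dihydrogen = 1.0 / (1.0 + ratio)
    return dihydrogen, 1.0 - dihydrogen


def transport_regime(conc_molar):
    """Classify an external phosphate concentration (mol/L) using the
    thresholds quoted in the text: >0.1 mM and <10 uM."""
    if conc_molar > 1e-4:
        return "low-affinity proton symport"
    if conc_molar < 1e-5:
        return "high-affinity ATP-driven system"
    return "unspecified in the text"


if __name__ == "__main__":
    for ph in (6.5, 7.0, 7.5):
        h2po4, hpo4 = phosphate_fractions(ph)
        print(f"pH {ph}: H2PO4- {h2po4:.2f}, HPO4(2-) {hpo4:.2f}")
    print(transport_regime(1e-3))
    print(transport_regime(1e-6))
```

Under the assumed pKa2, the two species are equimolar at pH 7.2, and neither falls to a negligible fraction anywhere in the pH 6.5 to 7.5 range, which is consistent with both forms being available for transport.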
3.1.2 Cellular Building Blocks Are Derived from Central Precursors
The three major metabolic pathways that supply central precursor metabolites for the synthesis of building blocks are glycolysis, the citric acid cycle, and the pentose phosphate pathway, each of which has multiple, tightly regulated steps that will not be discussed here. Glycolysis and the reverse process, gluconeogenesis, generate the C6 carbon backbones, with functional phosphates attached, for many of the secondary metabolites. The citric acid cycle (also known as the tricarboxylic acid (TCA) cycle) is the energy pump for aerobic respiration: the complete oxidation of pyruvate to CO2 and H2O by the TCA cycle produces 18 ATP equivalents. The pentose phosphate pathway, which is closely related to the Calvin cycle of plants, converts C6 sugars into C5 sugars such as ribose 5-phosphate, as well as other C3, C4, C5, and C7 sugars. In some bacteria, alternative metabolic pathways, for example the Entner–Doudoroff pathway, provide the essential central precursor metabolites. The synthesis of cellular building blocks through the degradation of central precursor metabolites is strictly regulated and involves anaplerotic reactions to replenish the metabolic constituents.
3.2 Amino Acid Biosynthesis
All enzymatic metabolic functions are carried out by proteins, making them the major cell constituents synthesized within the cell. As such, the generation of the 20 natural L-amino acid building blocks of proteins from only a few central precursor metabolites is a critical process (Figure 3.1). Additionally, L-amino acids serve as precursors for the synthesis of nucleotides. Nitrogen-containing metabolites can be formed directly from a single nitrogen source, such as ammonia or nitrate, in many bacteria and most plants; however, preformed amino acids are often the substrates for other nitrogen-containing metabolites in many organisms. Mammals synthesize only half of the 20 amino acids required for growth and maintenance; these are termed "nonessential amino acids," and the remainder are the "essential amino acids." Generally, nonessential amino acids are readily synthesized from metabolites derived from central metabolism, such as glycolysis or the citric acid cycle. Essential amino acids, on the other hand, generally have complex structures such as aromatic rings and hydrocarbon side chains and must be provided by the diet. Amino acids can be categorized into five biosynthetic families plus histidine, based on their central metabolic precursors.
Figure 3.1 Interconnections in the formation of building blocks. The figure highlights the interconnectivity between metabolic pathways producing the critical cellular constituents needed to maintain cell growth. Only those intermediates critical to the synthesis of building blocks are noted in glycolysis, the TCA cycle, and the pentose phosphate pathway.
[Figure 3.1 diagram: pathways shown are glycolysis/gluconeogenesis, the pentose phosphate pathway, and the citrate (TCA) cycle; labeled intermediates are glucose 6-phosphate, fructose 6-phosphate, ribose 5-phosphate, erythrose 4-phosphate, glyceraldehyde 3-phosphate, 3-phosphoglycerate, phosphoenolpyruvate, pyruvate, acetyl-CoA, oxaloacetate, 2-oxoglutarate, and succinate; products are amino acids, proteins, nucleotides, nucleic acids, glycerol, fatty acids, lipids, and mono/polysaccharides.]
3.2.1 The 2-Oxoglutarate Derived Amino Acids Degradation of the metabolite 2-oxoglutarate leads to the synthesis of glutamate, glutamine, proline, and arginine in which two primary reactions, amidation, and reduction, are utilized for the synthesis of this amino acid class (Figure 3.2). As mentioned before, glutamate is the universal
Figure 3.2 Glutamate family amino acids biosynthetic pathways. Glutamate (Glu) is formed by reductive amination of 2-oxoglutarate. Glutamine (Gln) is then formed using: 1, glutamine synthetase. Synthesis of arginine (Arg) requires: 2, N-acetylglutamate synthase; 3, N-acetylglutamate kinase; 4, N-acetylglutamyl-phosphate reductase; 5, N-acetylornithine-δ-transaminase; 6, N-acetylornithinase; 7, ornithine carbamoyl-transferase (or isozymes); 8, argininosuccinate synthetase; 9, argininosuccinase. For the path to proline (Pro): 10, γ-glutamyl kinase; 11, glutamate-γ-semialdehyde dehydrogenase; 12, Δ1-pyrroline-5-carboxylate reductase.
nitrogen donor for α-amino groups in amino acids. Its synthesis from 2-oxoglutarate through reversible reductive amination is catalyzed by the enzyme glutamate dehydrogenase (GDH), which is the primary route in bacteria utilizing ammonia as the sole nitrogen source. However, in animal cells, the primary role of GDH (located in mitochondria) is the synthesis of 2-oxoglutarate for the citric acid cycle and energy generation. Moreover, the enzyme is allosterically regulated: 2-oxoglutarate synthesis is activated by ADP and GDP and inhibited by ATP and GTP. In this case, glutamate is synthesized from 2-oxoglutarate and glutamine by glutamate synthase. In general, glutamate synthase serves as the primary enzyme for glutamate synthesis in most cells, since GDH has a relatively low affinity for ammonia.

Glutamine is synthesized by the attachment of a second ammonia moiety to glutamate by the enzyme glutamine synthetase; the reaction requires ATP-dependent formation of glutamyl phosphate to eliminate the −OH group and drive the amidation forward. Glutamine is a universal nitrogen carrier with multiple key roles: its amide nitrogen is used for the synthesis of glutamate, tryptophan, histidine, nucleotides, and amino sugars. For this reason, its biosynthesis is under tight regulation. In E. coli, glutamine synthetase is regulated by two interdependent mechanisms. The first is cumulative feedback inhibition: the enzyme has eight binding sites for eight inhibitor metabolites derived from glutamine metabolism, i.e., glucosamine-6-phosphate, carbamoyl phosphate, CTP, AMP, and the amino acids alanine, tryptophan, histidine, and glycine. The degree of inhibition depends on the number of metabolites bound to the recognition sites, so a single inhibitor is not sufficient to halt glutamine biosynthesis.
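Cumulative feedback inhibition of this kind is often pictured with a simple multiplicative model, in which each bound inhibitor removes a fixed fraction of whatever activity remains. The sketch below assumes that model; it is not taken from the chapter, the function name `residual_activity` is hypothetical, and the fractional inhibition values are made-up numbers used only to show that no single inhibitor shuts the enzyme off.

```python
from math import prod


def residual_activity(fractional_inhibitions):
    """Remaining enzyme activity (0..1) under a multiplicative model.

    Each entry is the fraction of the remaining activity removed by one
    bound inhibitor (an assumed, illustrative parameter).
    """
    fractions = list(fractional_inhibitions)
    for f in fractions:
        if not 0.0 <= f <= 1.0:
            raise ValueError("each fractional inhibition must lie in [0, 1]")
    return float(prod(1.0 - f for f in fractions))


# Hypothetical case: each of the eight inhibitors alone removes 25% of activity.
one_bound = residual_activity([0.25])
all_bound = residual_activity([0.25] * 8)
```

Under these assumed numbers a single inhibitor leaves three quarters of the activity, whereas activity falls much further only when all eight sites are occupied.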
The other mechanism controls enzyme activity through covalent modification by adenylation. Glutamine synthetase has 12 adenylation sites, located adjacent to the catalytic sites. At each site, inactivation is achieved when a specific tyrosine residue reacts with ATP to form an ester; in a manner similar to the allosteric control, partial adenylation results in partial inactivation of the protein. Adenylation and deadenylation are further mediated by a complex regulatory system, with both reactions catalyzed by an enzyme complex of adenylyl transferase and the regulatory protein PII. The cascade provides a feedback loop: when glutamine is present in high concentration, its biosynthesis is shut down, whereas at low glutamine concentration glutamine synthetase is turned on by the accumulation of 2-oxoglutarate in the presence of ATP.

The biosynthesis of proline from glutamate proceeds in several steps. In the first step, glutamate is phosphorylated to γ-glutamyl phosphate by the enzyme γ-glutamyl kinase. Next, γ-glutamyl phosphate is reduced to glutamate γ-semialdehyde by the enzyme glutamate-γ-semialdehyde dehydrogenase. Glutamate γ-semialdehyde is then spontaneously converted into Δ1-pyrroline-5-carboxylic acid through an intramolecular Schiff base reaction, and further reduction by NADPH results in proline. Proline synthesis is regulated by negative feedback inhibition of γ-glutamyl kinase.

Similar to proline, arginine is synthesized from glutamate through a series of biochemical reactions, involving the enzymes N-acetylglutamate synthase, N-acetylglutamate kinase, N-acetylglutamyl-phosphate reductase, N-acetylornithine-δ-transaminase, N-acetylornithinase, ornithine carbamoyl-transferase, argininosuccinate synthetase, and argininosuccinase. Like proline, arginine modulates its own synthesis from glutamate by exerting negative feedback inhibition on the enzyme N-acetylglutamate kinase.
3.2.2 The Aspartate Family and Branched-Chain Amino Acids

Oxaloacetate serves as the central metabolic precursor for the synthesis of the amino acids making up the aspartate family. Oxaloacetate undergoes a transamination reaction with glutamate to form aspartate, the central amino acid of the family (Figure 3.3). The subsequent conversion of aspartate to asparagine is catalyzed by L-asparagine synthetase, which exhibits a mode of action similar to that of glutamine synthetase. To initiate the pathway for the synthesis of lysine, threonine, methionine, and isoleucine, aspartate kinases I, II, and III first convert aspartate to β-aspartyl phosphate. Next, β-aspartyl phosphate
Figure 3.3 Biosynthesis of the aspartate family of amino acids. Transamination of oxaloacetate leads to aspartate (Asp). The common pathway to threonine (Thr), methionine (Met), and lysine (Lys) includes: 1a/b/c, aspartate kinases I, II, and III; 2, aspartate semialdehyde dehydrogenase; 3a/b, homoserine dehydrogenase I, II (bifunctional enzymes). Methionine branch: 4, homoserine succinyltransferase; 5, cystathionine γ-synthase; 6, cystathionine β-lyase; 7, homocysteine methylase. Threonine branch: 8, homoserine kinase; 9, threonine synthase. Asparagine (Asn) synthesis uses 10a/b, asparagine synthetases. Lysine branch: 11, dihydrodipicolinate synthase; 12, dihydrodipicolinate reductase; 13, tetrahydrodipicolinate succinylase; 14, succinyl diaminopimelate aminotransferase; 15, succinyl diaminopimelate desuccinylase; 16a/b, diaminopimelate epimerase/decarboxylase.
is transformed into aspartate semialdehyde by the enzyme aspartate semialdehyde dehydrogenase. Homoserine dehydrogenase I, II (bifunctional with aspartate kinases I, II) next reduces aspartate semialdehyde to the alcohol homoserine. Homoserine is then phosphorylated by homoserine kinase, and threonine synthase converts the phosphorylated intermediate into threonine. The biosynthesis of threonine is regulated through the inhibition of aspartate kinase I by threonine itself.

Lysine is formed from the condensation (Schiff's base formation) of aspartate semialdehyde and another central precursor metabolite, pyruvate, by the enzyme dihydrodipicolinate synthase to form dihydrodipicolinate. Further reduction catalyzed by dihydrodipicolinate reductase gives tetrahydrodipicolinate. Subsequent reactions catalyzed by tetrahydrodipicolinate succinylase, succinyl diaminopimelate aminotransferase, succinyl diaminopimelate desuccinylase, diaminopimelate epimerase, and diaminopimelate decarboxylase lead to the synthesis of lysine. Lysine also serves as a negative feedback inhibitor by acting on aspartate kinase III.

Methionine biosynthesis branches from homoserine through the action of homoserine succinyltransferase, which synthesizes O-succinyl homoserine. Cystathionine formation then follows in a reaction catalyzed by cystathionine γ-synthase, a pyridoxal-phosphate enzyme, which also releases succinate, an intermediate in the TCA cycle. Subsequent reactions involve cystathionine β-lyase, which synthesizes homocysteine, and homocysteine methylase, which transfers the methyl group of methyl-H4F, a C1 carrier, to homocysteine to generate methionine. Methionine is a feedback inhibitor of two enzymes: homoserine succinyltransferase in its own biosynthetic branch, and aspartate kinase II earlier in the aspartate family pathway.
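The division of labor among the three aspartate kinase isozymes can be captured in a small table. The following sketch treats inhibition as all-or-none, which is a deliberate simplification; the names `ASPARTATE_KINASE_INHIBITORS` and `active_isozymes` are hypothetical.

```python
# Feedback inhibitor of each aspartate kinase isozyme, as described in the text.
ASPARTATE_KINASE_INHIBITORS = {
    "AK I": "Thr",
    "AK II": "Met",
    "AK III": "Lys",
}


def active_isozymes(abundant_amino_acids):
    """Return isozymes whose inhibitor is not in the abundant set.

    Simplifying assumption: an abundant end product fully inhibits its
    isozyme, and a scarce one does not inhibit it at all.
    """
    abundant = set(abundant_amino_acids)
    return sorted(
        isozyme
        for isozyme, inhibitor in ASPARTATE_KINASE_INHIBITORS.items()
        if inhibitor not in abundant
    )
```

With only lysine abundant, `active_isozymes({"Lys"})` leaves isozymes I and II available, so aspartyl phosphate can still be supplied for threonine and methionine; the shared first step stops only when all three end products accumulate.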
While alanine is derived from the direct transamination of pyruvate, the biosynthesis of the branched-chain amino acids isoleucine, valine, and leucine is more complicated, although the paths share many common principles (Figure 3.4). Pyruvate, as activated acetaldehyde, is condensed with a second pyruvate or with 2-oxobutyrate (derived from threonine deamination) by acetohydroxyacid synthases in a thiamine pyrophosphate-dependent reaction. The resulting α-aceto-α-hydroxy acids are then converted to 2,3-dihydroxy compounds by acetohydroxyacid isomeroreductase, followed by the elimination of a water molecule by dihydroxyacid dehydratase to yield 2-oxo compounds. Further transamination results in the formation of valine and isoleucine. The formation of α-isopropylmalate from acetyl-CoA and 2-oxoisovalerate, an intermediate of the valine pathway, is catalyzed by isopropylmalate synthase; this reaction initiates the biosynthesis of leucine. Isopropylmalate isomerase and β-isopropylmalate dehydrogenase then form 2-oxoisocaproate, and a final transamination yields leucine. The biosynthesis of leucine, valine, and isoleucine is regulated by negative feedback inhibition of the acetohydroxyacid synthases by valine and isoleucine.
3.2.3 Serine-Glycine Family of Amino Acids

The biosynthesis of the serine-glycine family (Figure 3.5) from 3-phosphoglycerate begins with the oxidation of the free hydroxyl group by the enzyme 3-phosphoglycerate dehydrogenase. Subsequent transamination and phosphate removal by 3-phosphoserine aminotransferase and 3-phosphoserine phosphatase yield serine. From here the biosynthetic pathway branches in two directions, one to cysteine by incorporation of hydrogen sulfide (HS−) and the other to glycine. Biosynthesis of cysteine occurs through the action of two enzymes, serine transacetylase and O-acetylserine sulfhydrylase. Serine transacetylase catalyzes the condensation of serine with acetyl-CoA to give β-O-acetylserine, into which HS− is then directly incorporated by O-acetylserine sulfhydrylase to give cysteine. Moreover, the biosynthesis of cysteine is also regulated by the cys regulon. Mammals derive cysteine from the diet or from the metabolism of methionine because, under normal physiological conditions, cysteine does not exist in cells as a free amino acid due to quick reduction by glutathione. Here, the route of cysteine biosynthesis proceeds through the reverse of the methionine pathway. More specifically, serine and homocysteine are condensed by the enzyme cystathionine synthase to form cystathionine, from which cysteine is then formed by cleavage. On the
Figure 3.4 Biosynthesis of the branched-chain amino acids. Parallel path for synthesis of isoleucine (Ile) and valine (Val): 1, threonine deaminase; 2a/b/c, acetohydroxyacid synthases (a/b are end-product inhibited); 3, acetohydroxyacid isomeroreductase; 4, dihydroxyacid dehydratase; 5a/b, transaminases; 5c/d, transaminases; 6, isopropylmalate synthase; 7, isopropylmalate isomerase; 8, β-isopropylmalate dehydrogenase.
Figure 3.5 Serine-glycine family amino acids biosynthetic pathways. Serine (Ser) is formed using: 1, 3-phosphoglycerate dehydrogenase; 2, 3-phosphoserine aminotransferase; 3, 3-phosphoserine phosphatase. Synthesis of cysteine (Cys) requires: 4, serine transacetylase; 5, O-acetylserine sulfhydrylase. Glycine (Gly) is formed using 6, serine hydroxymethyltransferase and decomposed by 7, glycine cleavage enzymes.
other hand, in plants and microorganisms, cysteine is synthesized by direct incorporation of HS− and therefore serves as the universal sulfur carrier from which most sulfur-containing metabolites derive their sulfur, directly or indirectly. Glycine is synthesized by the enzyme serine hydroxymethyltransferase, in which tetrahydrofolate (H4F) accepts the hydroxymethyl carbon of serine, forming methylene-H4F and water. Enzyme expression in this pathway is repressed by several products of C1 metabolism, i.e., serine, glycine, methionine, purines, and thymine.
3.2.4 Aromatic Nonpolar Amino Acids

Erythrose 4-phosphate and phosphoenolpyruvate (PEP) are the central precursor metabolites for the formation of the aromatic ring in the amino acids phenylalanine, tyrosine, and tryptophan (Figure 3.6). The C3 side-chain is also derived from PEP or, in the case of tryptophan, from serine. In general, the biosynthesis of all three aromatic amino acids follows the shikimate pathway. It begins with the condensation of erythrose 4-phosphate and PEP into 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP), catalyzed by three isozymes of DAHP synthase, each of which is inhibited by a corresponding aromatic amino acid. Through the action of dehydroquinate synthase, DAHP is cyclized into 3-dehydroquinate,
Figure 3.6 Aromatic amino acids biosynthetic pathways. Common pathway is the shikimate pathway: 1a/b/c, 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) synthases; 2, dehydroquinate synthase; 3, dehydroquinate dehydratase; 4, shikimate dehydrogenase; 5, shikimate kinases; 6, 5-enolpyruvylshikimate-3-phosphate synthase; 7, chorismate synthase. Tryptophan (Trp) branch: 8, anthranilate synthase; 9, anthranilate-phosphoribosyl transferase; 10, phosphoribosyl-anthranilate isomerase; 11, indoleglycerolphosphate synthetase; 12, tryptophan synthase. Tyrosine (Tyr) branch: 13a/b, chorismate mutase/prephenate dehydrogenase (Tyr-bifunctional enzyme). Phenylalanine (Phe) branch: 14a/b, chorismate mutase/prephenate dehydratase (Phe-bifunctional enzyme); 15, tyrosine aminotransferase (used in both Tyr and Phe branches).
and converted into chorismate through a series of reactions catalyzed by dehydroquinate dehydratase, shikimate dehydrogenase, shikimate kinase, 5-enolpyruvylshikimate-3-phosphate synthase, and chorismate synthase. Conversion of chorismate is the first committed step towards any one of the three branching pathways leading to the aromatic amino acids. For the path leading to tyrosine, chorismate is converted into prephenate by an intramolecular group transfer, and subsequently into 4-hydroxyphenylpyruvate, by the bifunctional enzyme chorismate mutase/prephenate dehydrogenase. Next, the incorporation of an amino group from glutamate by the enzyme tyrosine aminotransferase results in the production of tyrosine. Phenylalanine biosynthesis begins in the same way, with the conversion of chorismate into prephenate by chorismate mutase, but prephenate is then converted into phenylpyruvate by prephenate dehydratase. Finally, incorporation of an amino group from glutamate into phenylpyruvate by tyrosine aminotransferase results in phenylalanine formation. Tryptophan biosynthesis, on the other hand, starts with the formation of anthranilate from chorismate, catalyzed by anthranilate synthase through the introduction of an amino group and the release of pyruvate. A series of reactions catalyzed by anthranilate-phosphoribosyl transferase, phosphoribosyl-anthranilate isomerase, indoleglycerolphosphate synthetase, and tryptophan synthase finally leads to tryptophan.
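The branching after chorismate can be represented as a small directed graph and searched for the route to each end product. This is a minimal sketch: the dictionary `AROMATIC_BRANCHES` and the function `route` are hypothetical names, and the tryptophan branch is collapsed to a single edge from anthranilate, so the intermediate steps named above are omitted.

```python
from collections import deque

# Directed edges downstream of chorismate (tryptophan branch abbreviated).
AROMATIC_BRANCHES = {
    "chorismate": ["prephenate", "anthranilate"],
    "prephenate": ["4-hydroxyphenylpyruvate", "phenylpyruvate"],
    "4-hydroxyphenylpyruvate": ["tyrosine"],
    "phenylpyruvate": ["phenylalanine"],
    "anthranilate": ["tryptophan"],
}


def route(start, goal):
    """Breadth-first search; returns the list of metabolites or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in AROMATIC_BRANCHES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

For example, `route("chorismate", "tyrosine")` passes through prephenate and 4-hydroxyphenylpyruvate, while the reverse query returns `None` because the edges are one-directional in this sketch.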
3.2.5 Histidine Biosynthesis

Histidine is synthesized through a unique pathway involving ten biochemical steps, starting with an unusual condensation of ATP and 5-phosphoribosyl 1-pyrophosphate (PRPP) (Figure 3.7). PRPP forms the C3 side-chain and the two adjacent carbon atoms −C=CH− of the heterocyclic ring. The purine base of ATP contributes the atoms of the −N=CH− group. The conjugation of ATP with PRPP, followed by the release of pyrophosphate and incorporation of a water molecule, results in an intermediate in which the ribose ring is opened by phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide isomerase. Subsequently an amide nitrogen is transferred from glutamine, followed by cleavage and ring closure to synthesize imidazole glycerol phosphate; the other product, 5-aminoimidazole-4-carboxamide ribonucleotide, is an intermediate in the synthesis of purines (see Section 3.3.2.1). Next, a dehydration reaction catalyzed by imidazole-glycerolphosphate dehydratase (in some cases a bifunctional enzyme with histidinol-phosphate phosphatase) yields imidazole acetol phosphate. Transamination, in which the amino group is derived from glutamate, followed by dephosphorylation and dehydrogenation, yields histidine. In enteric bacteria, the genes encoding the histidine biosynthetic enzymes form an operon, arranged in the order of the reactions in the pathway.
3.3 Nucleotides as Building Blocks

3.3.1 Structure and Organization of Nucleotides

The organizational blueprint for cellular machinery is made up of nucleotides. When these monomers are chained together to form DNA and RNA strands, they act as the repositories and transmitters of genetic information for every cell, tissue, and organism. The monomer units are classified into two main structural groups, purines and pyrimidines. Nucleotides also serve additional functions throughout the cell, involving a number of different metabolic objectives, and account for up to 20% of the cell's dry weight in prokaryotes.

3.3.1.1 Primary Structure of the Nucleic Acids

Nucleotides are composed of three main components: (1) the ribose or deoxyribose sugar, (2) the phosphate linker, and (3) the nucleobase. When the ribose sugar is unphosphorylated at the 5′ carbon and carries only a nucleobase attached to the 1′ carbon, the molecule is known as a nucleoside.
Figure 3.7 Histidine biosynthesis. Histidine (His) is synthesized via a complex series of reactions. Numbered enzymes are: 1, phosphoribosyl-ATP pyrophosphorylase; 2, phosphoribosyl-ATP pyrophosphohydrolase; 3, phosphoribosyl-AMP cyclohydrolase; 4, phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide isomerase; 5, phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide; 6, cyclase; 7, imidazole-glycerolphosphate dehydratase; 8, histidinol-phosphate transaminase; 9, histidinol-phosphate phosphatase (as a bifunctional enzyme with 7 or separate); 10, histidinol dehydrogenase.
On the other hand, when the ribose sugar is phosphorylated at the 5′ carbon and has a base attached, the monomer is termed a nucleotide. The nucleobase is always connected to the 1′ position by a glycosidic bond, while successive ribose sugars are connected by 3′–5′ phosphodiester bonds (Figure 3.8). These bonds form the backbone of the nucleic acid molecule. The nucleotide chains are arranged in one direction, which is established by the bonds formed between the 5′-phosphate and the 3′-hydroxyl group. The specific nucleotide sequence, referred to as the primary structure of the DNA, stores the genetic information for protein synthesis and is read from the 5′ to the 3′ end.

3.3.1.3 The Basic Components of Nucleotides

Nucleosides are classified according to their nucleobases into purines (nine-membered bicyclic ring systems), which include adenosine (A) and guanosine (G), and pyrimidines (six-membered rings), which are cytidine (C), uridine (U), and thymidine (T). Instead of thymidine, uridine (U) is used for RNA assembly (Figure 3.9). Moreover, RNA is assembled from ribose sugar molecules, and not from the deoxyribose form used for DNA formation. The labels of the ribose atoms generally include a prime (′) to distinguish them from the labels of the base atoms. As mentioned earlier, the two strands of DNA are linked through hydrogen bonds between the complementary nucleotide pairs A-T/U and G-C.
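The pairing rules and the 5′ to 3′ reading direction can be illustrated with a short sketch. The function names `reverse_complement` and `to_rna` are hypothetical, sequences are assumed to be written 5′ to 3′ and to contain only the letters A, C, G, and T, and the T to U substitution is a simplification of how an RNA copy differs from the DNA strand with the same sequence.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def to_rna(dna):
    """Rewrite a DNA sequence as RNA by replacing thymine with uracil."""
    return dna.upper().replace("T", "U")
```

Because the two strands run in opposite directions, the complement must be reversed before it can be read 5′ to 3′; `reverse_complement("ATGC")` therefore gives `"GCAT"` rather than `"TACG"`.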
Figure 3.8 Nucleotide backbone connections. The phosphodiester and glycosidic bonds are evident in the connection of consecutive nucleotides. Hydrogen bond between strands causes a rotation of the strands resulting in the helix structure so often depicted. 3′ and 5′ locations have been identified.

Figure 3.9 Origin of each atom making up the nucleotide precursors. (a) IMP, the first synthesized purine, and (b) UMP, the first synthesized pyrimidine.
The phosphate linker can be a monophosphate, diphosphate, or triphosphate group attached to the 5′ carbon of the ribose. The monophosphate forms AMP, CMP, GMP, UMP, and TMP are used as energy sources in cellular metabolism. Because the nucleoside monophosphates (NMPs) and diphosphates (NDPs) are used as energy sources and also help to carry phosphates around the cell, they have a very high rate of formation. Both are also repeatedly phosphorylated with ATP by nucleotide-specific NMP and NDP kinases that can act on both the deoxyribose and ribose forms.
3.3.2 Biosynthesis of Nucleotides

Both purine and pyrimidine biosynthesis are feedback inhibited so that production matches physiological demand within the cell. Production is regulated by intracellular mechanisms that sense and control the availability of nucleoside triphosphates (NTPs) as they accumulate and dissipate during growth and tissue regeneration. In general, the regulation of nucleotide synthesis is complex; a detailed treatment is beyond the scope of this chapter and can be found elsewhere. The 2′-deoxyribonucleotides are formed by ribonucleotide reductase, which reduces a ribonucleotide to the corresponding deoxyribonucleotide. The nucleoside triphosphates are all, with the exception of CTP, synthesized from the corresponding monophosphate forms. The pentose portion of both purines and pyrimidines is derived from PRPP, but at different points in the synthesis. In addition, NDP-sugars can also be synthesized directly from a few central precursor metabolites (Figure 3.10). In purine synthesis a number of different groups are added to PRPP and eventually form the heterocyclic ring, while in pyrimidine synthesis PRPP is introduced after the ring has already been formed.

3.3.2.1 de novo Purine Synthesis

In humans, only a limited amount of the nucleic acids derived from the diet is incorporated into cells, and thus nucleotides must be synthesized de novo. There are two major routes of purine biosynthesis: the first is synthesis from amphibolic intermediates, more commonly known as synthesis de novo, and the second is through the use of salvage pathways (see Section 3.2.4). In de novo synthesis, the active ribose 5-phosphate, PRPP, is converted into inosine monophosphate (IMP) through a series of ten enzymatic reactions in which functional groups are added in a stepwise manner. IMP is then used to form the final purines adenosine monophosphate (AMP) or guanosine monophosphate (GMP) through modification of its base (Figure 3.11).
Figure 3.10 Activation pathways for hexose 1-phosphates from central precursor metabolites. While some can be activated by nucleotidyl transfer from the corresponding triphosphate, most other activation mechanisms occur by using a diphosphate form. UDP-galactose can be formed directly from galactose by UDP activation.

Figure 3.11 Purine synthesis from the precursor 5-phosphoribosyl 1-pyrophosphate (PRPP). IMP is formed in a ten-step irreversible pathway using: 1, PRPP amidotransferase; 2, phosphoribosyl-glycineamide synthetase; 3, phosphoribosyl-glycineamide formyltransferase; 4, phosphoribosyl-formylglycineamidine synthetase; 5, phosphoribosyl-aminoimidazole synthetase; 6, phosphoribosyl-aminoimidazole carboxylase; 7, phosphoribosyl-aminoimidazole succinocarboxamide synthetase; 8, adenylosuccinate lyase (bifunctional enzyme); 9, phosphoribosyl-aminoimidazole carboxamide formyltransferase; 10, IMP cyclohydrolase. GMP is then formed using 11, IMP dehydrogenase and 12, GMP synthetase, while AMP is synthesized using 13, adenylosuccinate synthetase and the bifunctional enzyme 8.
To begin purine synthesis de novo, an imidazole ring is formed by the introduction of glutamine and glycine in five reaction steps, which is followed by the completion and closure of the pyrimidine ring using aspartate. Overall, the ring atoms are contributed by glycine, by amino groups from glutamine and aspartate, by single carbons from methenyl- or formyl-containing compounds, and by a final carbon from carbon dioxide. The first step involves the formation of PRPP by PRPP synthase using a free ribose 5-phosphate and ATP. The 1′ carbon is then aminated by PRPP glutamyl amidotransferase using glutamine and water, followed by the addition of glycine to the newly added amino group. Glycine contributes the C4 and C5 carbons as well as the N7 nitrogen of the imidazole ring. Reaction with a tetrahydrofolate derivative adds the C8 carbon, and the product is acted on by a series of synthetases (VI and VII) that add the N3 of the pyrimidine ring and close the imidazole ring. The subsequent five reactions, beginning with a carboxylation, complete the purine structure, forming IMP. AMP is then formed by aminating IMP at the C6 position, via adenylosuccinate, through the action of adenylosuccinate synthetase and adenylosuccinase. In forming GMP, the C2 position of IMP is first oxidized to form xanthosine monophosphate (XMP, with the nucleobase xanthine) and then aminated at the same C2 position to create GMP. Later phosphorylation leads to the diphosphates ADP and GDP and then to the triphosphate forms ATP and GTP, respectively.

The steps of synthesis de novo are the same in prokaryotes and eukaryotes, but there are major differences in the functionality of the enzymes that catalyze them. While in prokaryotic cells each step is catalyzed by its own unique enzyme, in purine-producing eukaryotic cells the enzymes contain multiple catalytic activities, facilitating the successive conversions through adjacent catalytic sites.
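The origin of the purine ring atoms can be tabulated as follows. This is a sketch based on the assignments given in the text and in Figure 3.9(a); the donor labels for N1 and N9 and the generic label "C1-H4F" for the one-carbon tetrahydrofolate derivatives are assumptions made here for illustration, and the names `PURINE_ATOM_DONORS` and `atoms_per_donor` are hypothetical.

```python
from collections import Counter

# Donor of each atom in the purine ring system of IMP (illustrative labels).
PURINE_ATOM_DONORS = {
    "N1": "aspartate",
    "C2": "C1-H4F",      # one-carbon unit carried by tetrahydrofolate
    "N3": "glutamine",
    "C4": "glycine",
    "C5": "glycine",
    "C6": "CO2",
    "N7": "glycine",
    "C8": "C1-H4F",      # one-carbon unit carried by tetrahydrofolate
    "N9": "glutamine",
}


def atoms_per_donor():
    """Count how many ring atoms each precursor contributes."""
    return Counter(PURINE_ATOM_DONORS.values())
```

Counting the entries shows glycine supplying three of the nine ring atoms and glutamine two, which is one way to see why the pathway draws on several precursor pools at once.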
For instance, the addition of glycine, the addition of the C8 carbon, and the closing of the imidazole ring are performed at different catalytic sites on the same enzyme in human liver cells. Since a number of precursor metabolites (glycine, glutamine, ATP, aspartate, and tetrahydrofolate) are incorporated during nucleotide biosynthesis, it is vital that the metabolism of nucleotides is tightly regulated so as not to induce further cell stress. For example, PRPP synthase is sensitive to the concentrations of AMP, ADP, GMP, and GDP, where high concentrations result in feedback inhibition of the enzyme. AMP also limits its own formation by inhibiting adenylosuccinate synthase, as does GMP by inhibiting IMP dehydrogenase. There is also cross regulation of GMP and AMP formation in IMP metabolism, in which the synthesis of one decreases when the other is deficient.

3.3.2.2 Stepwise Production of Pyrimidines

The de novo biosynthetic pathway of pyrimidines occurs in a disjointed manner, since the active ribose phosphate is not incorporated until after the pyrimidine ring is formed. Pyrimidine synthesis is regulated through allosteric control of the three enzymes CTP synthase, carbamoyl phosphate synthase, and aspartate carbamoyltransferase (Figure 3.12). To start the biosynthetic pathway, the enzyme carbamoyl phosphate synthase (CPS) forms carbamoyl phosphate, which aspartate carbamoyltransferase then condenses with aspartate to form carbamoyl aspartic acid (CAA). In eukaryotic species, there are two forms of the CPS enzyme, a mitochondrial (Type I) and a cytosolic (Type II) protein. Only CPS II is responsible for the biosynthesis of pyrimidines, from its independent pool of carbamoyl phosphate. In prokaryotes, there also exist multiple forms of the CPS enzyme, though they are not typically classified as Type I or II, and all are functionally active.
After formation of CAA, ring closure occurs to form dihydroorotic acid, which is then converted to orotic acid by dihydroorotate dehydrogenase. Next, orotic acid is attached to PRPP by orotate phosphoribosyltransferase and then decarboxylated to form UMP. Before forming either TMP or CTP, uridine monophosphate is always phosphorylated to UDP. Another phosphorylation generates UTP, which is converted to CTP by CTP synthase using ATP and glutamine. Alternatively, UDP is acted on by ribonucleotide reductase to form deoxyuridine diphosphate (dUDP), which after dephosphorylation to dUMP is converted by thymidylate synthase to dTMP, the final pyrimidine building block (also written TMP).
3.3.2.3 Formation of 2′-Deoxyribonucleotides
In order to form 2′-deoxyribonucleotides, ribonucleotides must first be synthesized through the steps outlined in the previous sections, and then after phosphorylation to the diphosphate and triphosphate
[Figure 3.12 scheme: aspartate + carbamoyl phosphate →(1) carbamoyl aspartate →(2) 4,5-dihydroorotate →(3) orotate →(4, with PRPP) orotidine monophosphate →(5) uridine monophosphate (UMP) →(6) uridine diphosphate (UDP) →(7) uridine triphosphate (UTP) →(8) cytidine triphosphate (CTP)]
Figure 3.12 Synthesis of pyrimidine nucleotides via orotate. The enzymes used are: 1, aspartate transcarbamoylase; 2, dihydroorotase; 3, dihydroorotate dehydrogenase; 4, orotate phosphoribosyltransferase; 5, orotidine 5′-phosphate decarboxylase; 6, nucleoside-monophosphate kinase; 7, nucleoside-diphosphate kinase; 8, CTP synthase.
forms, ribonucleotide reductase can catalyze the reduction to the deoxy forms. This highly evolved and highly regulated enzyme acts through a radical donor mechanism involving coenzyme B12, an iron ion, or, in non-iron-dependent enzymes, another radical source. The basic mechanism for ribonucleoside diphosphates involves the donation of electrons, generally by reduced thioredoxin, which is thereby oxidized and later returned to its reduced state by the cofactor NADPH. The deoxyribonucleoside diphosphates are then phosphorylated by ATP to triphosphates. The reduction products deoxyuridine triphosphate (dUTP) and deoxyuridine diphosphate (dUDP) are not DNA building blocks themselves but serve as the main precursors of the thymidine deoxynucleotides. dUTP and dUDP are dephosphorylated to dUMP by the enzymes dUTP pyrophosphatase and dUDP phosphatase, respectively. dUMP is then converted into deoxythymidine monophosphate (dTMP) through the action of the enzyme thymidylate synthase. In the final reaction to form dTMP, methylene tetrahydrofolate donates the methylene group, which is simultaneously reduced to the methyl level by oxidizing the carrier tetrahydrofolate to dihydrofolate; the dihydrofolate is later reduced back to tetrahydrofolate. dTTP is thought to be formed through a different mechanism, since production levels seem to be independent of activity through dUTP pyrophosphatase.
3.3.2.4 Production through Salvage Reactions
As mentioned before, de novo synthesis is very tightly regulated, resulting in instances where production levels cannot meet cellular demands. As such, and especially in prokaryotes, there exists a collection of salvage enzymes that help sustain production levels in periods of high demand. Salvage reactions generally require less energy than the primary synthesis routes, since the major pieces of the nucleotide molecules have already been formed and only a few more steps and metabolic precursors are needed to complete nucleotide biosynthesis. The salvage enzymes in prokaryotes have the primary function of allowing nucleotide synthesis from free nucleosides and nucleobases that permeate into the cell, while also recycling signaling molecules, such as cAMP, and any nucleobases or nucleosides produced from nucleotide turnover. In addition, salvage pathways provide carbon, energy, and nitrogen by deconstructing nucleotides that have entered the cell following dephosphorylation by periplasmic phosphatases. Salvage reactions occur by a number of mechanisms and variations; however, there are five typical mechanisms: (1) A free purine and PRPP form a purine 5′-mononucleotide and pyrophosphate, a reaction in which two phosphoribosyl transferases attach the free purine to the C1 position of PRPP. This is the most common method for recycling nucleobases, as its equilibrium strongly favors the assembled nucleotide.
(2) Nucleoside kinases transfer a phosphoryl group from nucleoside triphosphates to nucleosides to form mononucleotides. This reaction does not occur in all bacterial species, owing to the lack of metabolic precursors needed in the reaction mechanism, but does occur in most mammalian cell lines. (3) Interconversion between nucleobases and nucleosides can occur through nucleoside phosphorylases, which replace the N-glycosidic linkage with a phosphate O-glycosidic linkage. (4) The N-glycosidic linkage of monophosphates can be hydrolyzed by NMP glycosylases for formation of dNMPs. (5) Since the nucleotide synthesis steps are not reversible, mechanisms for the interconversion of nucleobases exist, such as the conversion of adenine to hypoxanthine (the nucleobase of IMP) and the deamination of GMP to create IMP. In general, salvage reactions play a major role in replenishing the nucleotide pool and are the major method of nucleotide generation in humans. More specifically, the human liver is the primary site for generating the purines and purine nucleosides used in the salvage pathways. This generation is critical for other tissues in the body as well. For instance, the human brain has been found to produce only small quantities of PRPP synthase, the enzyme responsible for synthesis of PRPP, and therefore it must rely on these exogenous purines from the liver. Similarly, erythrocytes and polymorphonuclear leukocytes cannot synthesize 5-phosphoribosylamine, a precursor synthesized by PRPP glutamyl amidotransferase.
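The phosphoribosyltransferase reaction of salvage mechanism (1) can be checked for mass balance with a few lines of code. The sketch below is only an illustration: it assumes hypoxanthine as the free purine (so that the product is IMP), uses neutral, fully protonated molecular formulas, and the helper names `parse_formula` and `is_balanced` are hypothetical, not part of any established library.

```python
import re
from collections import Counter

def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple molecular formula such as 'C5H13O14P3'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def is_balanced(substrates, products) -> bool:
    """True when both sides of a reaction contain the same atoms."""
    left = sum((parse_formula(f) for f in substrates), Counter())
    right = sum((parse_formula(f) for f in products), Counter())
    return left == right

# Neutral formulas (assumed protonation state, for illustration only).
FORMULAS = {
    "hypoxanthine": "C5H4N4O",
    "PRPP": "C5H13O14P3",
    "IMP": "C10H13N4O8P",
    "PPi": "H4P2O7",
}

# Salvage mechanism (1): free purine + PRPP -> purine 5'-mononucleotide + PPi
balanced = is_balanced(
    [FORMULAS["hypoxanthine"], FORMULAS["PRPP"]],
    [FORMULAS["IMP"], FORMULAS["PPi"]],
)
print("hypoxanthine + PRPP -> IMP + PPi balanced:", balanced)
```

The same check can be reused for other salvage reactions by adding the relevant formulas to `FORMULAS`; it verifies atom conservation only and says nothing about the reaction equilibrium.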
3.3.3 Nucleotide Metabolic Conditions
Abnormal nucleotide catabolism can result in a number of clinical disorders in humans, ranging from mild to fatal; the majority of nucleotide-related diseases result from problems in purine catabolism. Gout is a clinical condition caused by the excess accumulation of uric acid, a by-product of purine degradation, and typically results in severe inflammation of the joints, or arthritis, caused by the precipitation of sodium urate crystals. Readers can find more information in the review by Nyhan [25]. The most severe of the purine metabolism related diseases are Lesch–Nyhan syndrome and Severe Combined Immunodeficiency Disease (SCID). Lesch–Nyhan syndrome is caused by an inherited defect on the X chromosome that results in loss of function of the HGPRT gene (which encodes a nucleotide salvage enzyme). In the most serious cases, patients have not only severe symptoms of gout but also a severe malfunction of the nervous system, usually resulting in death before the age of 20 [24]. SCID results from a deficiency in the adenosine deaminase enzyme, which converts adenosine to inosine; the deficiency selectively leads to the destruction of the B and T lymphocytes responsible for mounting the body’s immune response [2,3]. Conditions of purine deficiency are often seen in cancer patients undergoing chemotherapy, owing to treatment methods that inhibit the formation of tetrahydrofolate, a major precursor metabolite of purine synthesis. These treatments are designed to limit tumor growth; however, since the initial carbons used to form the purine rings are derived from tetrahydrofolate derivatives, these deficiency states result. Typically this is diagnosed by monitoring the levels of folic acid in the body, the metabolite responsible for the pool of tetrahydrofolate. While there are only a few disorders caused by pyrimidine metabolism, they are still prevalent in the population. The two main disorders are inherited disorders affecting the bifunctional enzyme that catalyzes UMP synthesis, which carries the orotate phosphoribosyl transferase and OMP decarboxylase activities. Deficiencies in these activities result in orotic aciduria, causing retarded growth and conditions of hypochromic erythrocytes and megaloblastic bone marrow. Both conditions are prevalent in children and lead to severe anemia [25].
3.4 Synthesis of Carbohydrates for Building Cells
3.4.1 Introduction to Carbohydrates
Carbohydrates, also commonly referred to as saccharides, are found attached to all forms of natural molecules such as proteins, lipids, hormones, and antibiotics. As such, they are critically important building blocks for proper cell functioning, both providing energy for metabolism through oxidation and serving as structural components of a variety of cellular membranes. Understanding such roles in cellular function requires elucidation of the mechanisms by which interconversions between carbohydrates take place, the role they play as building block precursors, and the modes in which they are introduced to pathways by activation. While monosaccharides and their derivatives make up the majority of the carbohydrates found in cells, polysaccharides, oligosaccharides, and several interlinked carbohydrate monomers are also highly prevalent. Originally the term carbohydrate was reserved for molecules with the traditional Cn(H2O)n structure (hydrates of carbon), but it is now also used to include molecules having similar physical properties, such as the polyhydroxy aldehydes, alcohols, ketones, and acids. In plants, animals, and microorganisms, the carbohydrate glucose is of chief importance since it is used as an energy source, primarily through the main oxidative pathway of glycolysis. Glucose is generated from a variety of metabolic routes including simple transport, oxidation, and reduction. For example, in plants glucose is derived from the energy-storage polymer starch, produced as a result of photosynthesis, in which water and carbon dioxide are converted into carbohydrates and oxygen with sunlight as the energy source. The primary form synthesized by plants is D-glucose, which is used to generate plant polymers such as cellulose and can even be converted back to starch for storage.
In humans, carbohydrates are primarily derived from plant fiber intake, which is broken down into water and energy. A similar process is used by most bacteria, in which glucose and other simple sugars are imported into the cell through various transport mechanisms. Carbohydrates are generally separated into two main groups: simple and complex sugars. Simple sugars are generally monosaccharides containing only the carbohydrate structure itself. This class of sugars includes mannose, fructose, and ribose along with their various derivatives. Monosaccharides are identifiable through the number of carbon atoms present (e.g., pentoses have five carbon atoms, whereas hexoses have six), the configuration of the hydroxyl groups, and the position of the carbonyl group, and they are joined together by acetal linkages to form oligosaccharides and polysaccharides. In many instances, monosaccharides such as glucose exist in different isomeric forms (D and L forms), of which the D sugars are the most naturally abundant. As such, most enzymes in mammals are selective for the D sugars.
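The original "hydrate of carbon" definition and the naming of monosaccharides by carbon count can be expressed as a short check. The following is a minimal sketch; the function and dictionary names are hypothetical, and the three example sugars were chosen only to show one molecule that fits the Cn(H2O)n pattern in two sizes and one (2-deoxyribose) that does not, which is why the broader modern definition is needed.

```python
def fits_hydrate_of_carbon(c: int, h: int, o: int) -> bool:
    """True when the formula C_c H_h O_o can be written as Cn(H2O)n."""
    return c > 0 and h == 2 * c and o == c

# Size classes named by the number of carbon atoms.
SIZE_NAMES = {3: "triose", 4: "tetrose", 5: "pentose", 6: "hexose", 7: "heptose"}

# (carbon, hydrogen, oxygen) counts for a few familiar sugars.
SUGARS = {
    "glucose": (6, 12, 6),
    "ribose": (5, 10, 5),
    "2-deoxyribose": (5, 10, 4),
}

for name, (c, h, o) in SUGARS.items():
    size = SIZE_NAMES.get(c, "unnamed size")
    print(f"{name}: {size}, fits Cn(H2O)n = {fits_hydrate_of_carbon(c, h, o)}")
```

The check looks only at elemental composition; it cannot distinguish D from L forms or aldoses from ketoses, which depend on configuration rather than formula.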
The most stable forms of monosaccharides occur as ring structures similar to pyran and furan and may also exhibit alpha and beta anomers, especially when the compounds are exposed to different environmental conditions such as pH. Additionally, several other varieties can occur in sugar molecules, including epimers, different configurations of hydroxyl groups, aldose-ketose isomers, and structural conformations. Because of the important modifying influence of these different isomeric forms, in addition to their attachment to many lipid and benzenoid structures, numerous enzymes, including glycosyltransferases and glycosidases, have evolved for the attachment and removal of sugars. The disaccharides, such as maltose, sucrose, and lactose, are also critical carbohydrates prevalent in many organisms. Many humans suffer from the inability to metabolize lactose, the disaccharide most prevalent in milk, which can lead to diarrhea, flatulence, and other serious metabolic conditions. Maltose, or malt sugar, is produced by the hydrolysis of starch, which itself is a carbohydrate polymer. Many polysaccharides have a major physiological role in cellular function, specifically the homopolymer starch, the storage molecule glycogen, and the related molecules glycosaminoglycans and glycoproteins. Complex carbohydrates make up a broad spectrum of molecules in which carbohydrate molecules are covalently bonded to a lipid, protein, or other larger biomolecular structure; examples include glycosides, glycolipids, antibiotics, glycoproteins and proteoglycans, peptidoglycans, and lipopolysaccharides. Peptidoglycans are the highly cross-linked macromolecules in bacterial cell walls that provide mechanical strength; they are made up of linear polysaccharide strands cross-linked by oligopeptide units.
Gram-negative bacteria use the highly complex lipopolysaccharides in their outer cell envelope, where these molecules sometimes serve as receptors for infecting bacteriophages. One of the most important groups of complex carbohydrates is the antibiotics, since they not only serve as simple solutions to some serious health problems but are also essential tools in the latest research biochemistry techniques. This group of compounds is excreted by microorganisms to inhibit the growth of other organisms, generally through interference with DNA transcription, RNA translation, or protein synthesis. A subclass called nucleoside antibiotics generally contains a normal base attached to a carbohydrate other than the D-ribofuranose sugar. The first nucleoside antibiotic discovered was puromycin, a potent antibiotic that causes premature chain release during translational chain elongation by mimicking the 3′-end of tRNA. Aromatic antibiotics contain one or two sugars attached to an aromatic molecule having a polycyclic character, for example the cancer cell killing drug daunomycin [30]. Another potent antibiotic is erythromycin, in which the sugar moiety is attached to a larger lactone ring that contains the sugar activation sites. Interestingly, this macrolide antibiotic loses its function completely when the sugar moiety is removed from the lactone ring. Because of its natural chemotherapeutic properties, as well as its stability in biochemistry protocols, one of the most widely used antibiotics today is streptomycin. It is a member of the aminoglycoside antibiotics, which have an aminocyclitol attached to amino sugars that are themselves linked to other sugars [28]. In glycolipids, the carbohydrate occupies a minor part of the overall molecular structure, and these molecules commonly occur with lipids associated with outer cell surfaces.
Glycolipids can inhibit the biological activity of toxins, aid in molecular transport across cell membranes, and are also known to be involved in the biosynthesis of glycoproteins and proteoglycans. Glycoproteins and proteoglycans are similar in that both are proteins containing chains of carbohydrates linked by N- or O-glycosidic bonds, but they differ in the configuration of the carbohydrates attached to the chain. In glycoproteins, the carbohydrates are short, branched chains of complex monomer units, whereas in proteoglycans they are long, simple linear strands of carbohydrate monomers. Glycoprotein chains usually comprise 15–20 monosaccharide units, typically D-glucose, D-galactose, deoxy-D-galactose, D-mannose, and L-fucose. The saccharide chains of proteoglycans are approximately 100 monomers in length, with disaccharide repeating units usually consisting of a hexosamine and a hexuronic acid with a sulfate group attached. The addition of the sulfate makes these molecules highly charged and thus naturally very reactive species. Nevertheless, both of these large molecules are included in structures such as hormones, enzymes, and collagens, among others, and are evenly dispersed through cells of higher animals, plants, and microorganisms.
3.4.2 Interconversion of Carbohydrates
The monosaccharides, as principal molecules of metabolism and cell function, are synthesized through a variety of routes by an array of enzymes responsible for maintaining their physiological levels. Sugars containing three and six carbon atoms are generated through glycolysis/gluconeogenesis; however, the other central precursor sugars, having four and five carbons, are synthesized through separate metabolic channels. The four major routes leading to these carbohydrate building blocks are: (1) The oxidative pentose-phosphate pathway, which is essential to all organisms for the interconversion of many sugar phosphates, producing the central precursor metabolites for the synthesis of some amino acids, nucleotides, and other important building blocks. Most bacteria are capable of using the main carbohydrate degradation pathway as well as the oxidative pentose-phosphate pathway or the Entner–Doudoroff pathway. (2) Formation of NDP-sugars from fructose 6-phosphate in gluconeogenesis. (3) Transfer reactions used to activate the (NDP)-sugars. Such reactions include glycosyl transfer, pyrophosphoryl transfer, or nucleotidyl transfer; examples are the activation of sugar 1-phosphates with NTP via nucleotidyl transfer and the activation of ribose 5-phosphate to PRPP, which itself acts as a mode of synthesis for nucleotides. (4) Synthesis of activated sugars. Sugars must be metabolized further into usable building blocks through the formation of glycosidic bonds with various acceptor groups, or other activated forms.
3.4.2.1 Pentose Phosphate Pathway
The primary function of the pentose phosphate pathway is to efficiently replenish the pool of NADPH, but it also serves to keep carbon flowing towards the primary metabolic targets at the end of glycolysis (Figure 3.13). This pathway is also the main mode of interconversion from hexose 6-phosphates into triose 3-phosphates and other sugar phosphates.
Pathway initiation begins with glucose 6-phosphate (G6P), which is converted into gluconolactone 6-phosphate by the reversible action of G6P dehydrogenase, an enzyme inhibited by high levels of NADPH. In a competing reaction, G6P is converted to fructose 6-phosphate by phosphoglucose isomerase (PGI), an important enzyme in glycolysis universal to plants, animals,
[Figure 3.13 scheme: glucose 6-phosphate →(1, NADP+ → NADPH + H+) gluconolactone 6-phosphate →(2, with H2O) 6-phosphogluconate →(3, NADP+ → NADPH + H+) ribulose 5-phosphate; ribulose 5-phosphate →(4) ribose 5-phosphate and →(5) xylulose 5-phosphate; transketolase (6) and transaldolase (7) interconvert these with sedoheptulose 7-phosphate, glyceraldehyde 3-phosphate, erythrose 4-phosphate, and fructose 6-phosphate, which feed glycolysis; ribose 5-phosphate + ATP →(8) PRPP + AMP, leading to nucleotide synthesis]
Figure 3.13 Pentose phosphate pathway’s oxidative routes for NADPH and pentose formation. Three hexose molecules cycle through the action of: 1, glucose 6-phosphate dehydrogenase; 2, lactonase; 3, 6-phosphogluconate dehydrogenase; 4, ribose 5-phosphate isomerase (can proceed in the reverse direction); 5, ribulose 5-phosphate 3-epimerase; 6, transketolase; 7, transaldolase; 8, PRPP synthetase.
and bacteria. PGI strongly favors the formation of fructose 6-phosphate under both aerobic and anaerobic conditions in E. coli [34], unless a cellular demand exists for NADH or NADPH. E. coli mutants lacking the gene for PGI grow more slowly on glucose and other sugars, since the pentose phosphate and Entner–Doudoroff pathways are left to make up for the lack of metabolites downstream of PGI [14]. Continuing the pentose phosphate pathway, gluconolactone 6-phosphate undergoes two irreversible steps leading to the formation of ribulose 5-phosphate through oxidation and the generation of NADPH. Ribulose 5-phosphate is then processed by phosphoribulose epimerase to form xylulose 5-phosphate or by phosphopentose isomerase to form ribose 5-phosphate. These two precursors are then interconverted to form other sugar phosphates, such as erythrose 4-phosphate and sedoheptulose 7-phosphate, that play important roles in phosphorylation, coenzymes, and in transport and storage. Xylulose 5-phosphate and ribose 5-phosphate are converted by transketolase (TK) to form the metabolites glyceraldehyde 3-phosphate and sedoheptulose 7-phosphate. Transaldolase (TA) then converts these to form fructose 6-phosphate and erythrose 4-phosphate. TK then transfers a two-carbon unit from xylulose 5-phosphate to erythrose 4-phosphate to form fructose 6-phosphate and glyceraldehyde 3-phosphate. The yeast TK has been investigated for a number of industrial synthesis purposes and has been shown to have a high level of selectivity for diastereomers, those with the C2 stereochemistry [8]. On the other hand, no widespread study has yet been performed on the substrate specificity exhibited by TA. The complex network of the pentose phosphate pathway allows for the generation of these glycolysis intermediates, enabling bacteria to grow on a variety of different carbon sources and thus allowing a high degree of metabolic flexibility.
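The transketolase and transaldolase exchanges described above can be followed by simple carbon bookkeeping. The sketch below is a hedged illustration rather than a flux model: the abbreviations (Ru5P, X5P, R5P, S7P, E4P, GAP) and helper names are assumptions introduced here, the oxidative branch is collapsed into a single step that releases one CO2, and the overall balance assumes the usual stoichiometry of two NADPH per G6P oxidized when three hexose molecules cycle through the pathway as in Figure 3.13.

```python
# Carbon atoms per metabolite.
CARBONS = {
    "G6P": 6, "F6P": 6, "Ru5P": 5, "X5P": 5, "R5P": 5,
    "S7P": 7, "E4P": 4, "GAP": 3, "CO2": 1,
}

# (label, substrates, products); each side maps metabolite -> copies.
REACTIONS = [
    ("oxidative branch (lumped)", {"G6P": 1}, {"Ru5P": 1, "CO2": 1}),
    ("transketolase, first pass", {"X5P": 1, "R5P": 1}, {"GAP": 1, "S7P": 1}),
    ("transaldolase", {"S7P": 1, "GAP": 1}, {"F6P": 1, "E4P": 1}),
    ("transketolase, second pass", {"X5P": 1, "E4P": 1}, {"F6P": 1, "GAP": 1}),
]

def carbon_total(side):
    """Total number of carbon atoms on one side of a reaction."""
    return sum(CARBONS[name] * copies for name, copies in side.items())

def unbalanced(reactions):
    """Labels of reactions whose two sides carry different carbon totals."""
    return [
        label for label, subs, prods in reactions
        if carbon_total(subs) != carbon_total(prods)
    ]

# Net result of cycling three hexose phosphates through the pathway.
overall_in = {"G6P": 3}
overall_out = {"CO2": 3, "F6P": 2, "GAP": 1}
nadph_formed = 2 * overall_in["G6P"]  # assumed: two NADPH per G6P oxidized

print("unbalanced steps:", unbalanced(REACTIONS))
print("carbons in/out:", carbon_total(overall_in), carbon_total(overall_out))
print("NADPH formed:", nadph_formed)
```

Tracking carbon alone is enough to show why transketolase must move a two-carbon unit and transaldolase a three-carbon unit for each step to balance.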
In particular, when glycolysis is inhibited even under cell states of high NADPH, the pentose phosphate interconversions allow for the synthesis of fructose 6-phosphate, and the excess NADPH can be converted to energy through oxidative phosphorylation routes.
3.4.2.2 NDP-Sugars as Building Blocks
NDP-sugars are formed from central precursors through a few simple enzymatic reactions (see Figure 3.10), beginning with the isomerization of ribose to ribulose (as mentioned above). Isomerases interconverting an aldo group to a keto group in the synthesis of carbohydrates have been observed to exhibit substrate-specific reactivity towards hexoses, pentoses, and nonphosphorylated pentoses. The isomerization of esters has long been considered an important part of glycolysis and as such is recognized as vital to pentose metabolism. Mutases are then employed to manipulate the phosphate groups, for example that of 6-phosphohexoses, which is relocated to the C1 carbon to form 1-phosphohexoses. Alternatively, amino sugars are formed by transamination of fructose 6-phosphate to glucosamine 6-phosphate, with acetyl-CoA used as the acyl donor for the formation of the N-acetylated sugar. The sugar 1-phosphates are then converted to activated sugars in an irreversible manner. Activation can be accomplished via three methods: glycosyl transfer, pyrophosphoryl transfer, or nucleotidyl transfer. Glycosyl transfer is primarily used in the synthesis of complex carbohydrates. In pyrophosphoryl transfer, a pyrophosphate group is transferred to pentose 5-phosphate from ATP, which yields the activated pentose 5-phosphate and AMP. The enzyme PRPP synthetase acts on ribose 5-phosphate to form PRPP through this mechanism. Nucleotidyl transfer involves the activation of aldohexoses by adding the nucleotide to the sugar, releasing pyrophosphate in the process.
An example of this is the conversion of glucose 1-phosphate to UDP-glucose by the NTP-hexose-1-phosphate nucleotidyl transferase (a similar reaction is used to create NDP-mannose from mannose 1-phosphate). It is important to note that the transferases used are specific to the sugar moiety and the type of NTP involved, thus predefining the destination of the sugar, and are conveniently subject to strict control by feedback inhibition. The reaction equilibrium favors the formation of the NDP-sugars, creating excess free energy for the synthesis of nucleotides and other sugar-containing polymers. The activated sugars can then be used as building blocks or further converted to other activated sugars. NDP-sugars are incorporated during a carbohydrate polymerization process that forms critical polymers used throughout the cell, such as: (a) disaccharides and polysaccharides including glycogen, (b) lipid diphosphate sugar intermediates, and (c) glycosides. NDP-sugars can also play functional roles in the synthesis of complex carbohydrates. However, before polymerization occurs, the NDP-sugars must be activated, modified for specific function, and then transferred to the appropriate receptors. In prokaryotes, NDP-sugars are the major precursors leading to the formation of most of the activated building blocks used in the formation of carbohydrate-containing polymers, specifically acting as the glycosyl donors. Additionally, they are known to act as regulatory mediators during the polymerization process. Since the nucleotidyl transferases are specific to the NTP and sugar molecules, the transferases act as built-in metabolic regulators by controlling the channeling of carbon into separate pathways leading to different sugar-containing polymers. Frequent interconversion between NDP-sugars also takes place by a variety of alternative mechanisms. UDP-glucose can undergo oxidation to form a ketosugar and then is reduced to form uronic acids with the retention or inversion of hydroxyl group configurations. The oxidation is performed by NDP-sugar dehydrogenases using NAD+ as an electron acceptor. UDP-glucose can also undergo epimerization to form UDP-galactose via UDP-glucose 4-epimerase or to form UDP-mannose by UDP-glucose 2-epimerase. Finally, the deoxy-NTP sugars can undergo a series of reductions, oxidations, and epimerizations to form the 6-deoxy hexoses such as dTDP-6-deoxy L-talose and dTDP-L-rhamnose. Another way in which the flow of carbohydrate is regulated is during the transport of NDP-activated sugars across membranes, where the pathway used is dictated by the type of sugar. The transport of NDP-sugars across the hydrophobic lipid layer of the cytoplasmic membrane is regulated by lipid carriers that bind carrier-specific sugars for transit across the membrane. These carriers are highly important since all the NDP-activated sugars making up extracellular polymers are water soluble, whereas the carriers are not, thus stabilizing the polymer. For example, the carrier undecaprenyl phosphate aids in the synthesis of peptidoglycan and other glycans, whereas dolichol phosphates are utilized during synthesis of glycoproteins.
3.4.3 Other Important Enzymes of Carbohydrate Synthesis
3.4.3.1 Critical Kinases
Kinases, which transfer phosphates from ATP to carbon structures, are responsible for the variety of carbohydrates found in microorganisms. The formation of kinases, either constitutively or in response to added inducers, directly leads to the fermentation of carbohydrates and their dissimilation via phosphate esters. The most important classes of kinases for phosphorylation are the aldokinases and ketokinases. Hexokinase, one of the most widely studied aldokinases, is responsible for the phosphorylation of glucose, fructose, or mannose, initiating glycolysis. While hexokinase has been found in animals, plants, and bacteria, the similarly functioning enzyme glucokinase, which acts only on glucose, has been found only in animals and bacteria. Glucokinase is specific for the conversion of glucose to glucose 6-phosphate and has been reported in a variety of organisms functioning in both aerobic and anaerobic conditions. As can be verified by comparing the Km values of hexokinase and glucokinase, glucokinase has the higher activity under high glucose concentrations and is thus the primary mechanism for glucose assimilation under those conditions. When galactose concentrations are high, hexokinase is inefficient in carbohydrate phosphorylation; a dedicated enzyme, galactokinase, is therefore used to phosphorylate the C1 carbon, forming galactose 1-phosphate. This is followed by an epimerization of galactose 1-phosphate to glucose 1-phosphate, which is then metabolized into glucose phosphates for further conversion into other carbohydrate intermediates. Fructokinase exists in animal and bacterial cells, where its activity requires the presence of magnesium and potassium ions. The product of fructose phosphorylation by fructokinase is fructose 6-phosphate, a central metabolite of carbon flow feeding important pathways such as the lower half of glycolysis and amino sugar synthesis.
Another critical group of kinases are referred to as pentokinases because they actively phosphorylate pentoses specifically at the fifth carbon and have been widely studied in a variety of organisms [4–6,16,22,23,26]. Ribokinase is one of the most important pentokinases since the product of the phosphorylation is ribose 5-phosphate, an important molecule in the pentose phosphate pathway that leads to synthesis of purines, pyrimidines, and histidine.
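The comparison of Km values for hexokinase and glucokinase mentioned in this section can be made concrete with the Michaelis–Menten rate law, v = Vmax·[S]/(Km + [S]). The sketch below is illustrative only: the Km and Vmax numbers are assumed round values chosen to represent a low-Km hexokinase and a high-Km glucokinase (a pattern commonly described for mammalian enzymes), not measurements from this chapter or its references.

```python
def michaelis_menten(substrate_mm: float, vmax: float, km_mm: float) -> float:
    """Reaction rate from the Michaelis-Menten equation."""
    return vmax * substrate_mm / (km_mm + substrate_mm)

# Assumed, illustrative parameters (Km in mM; Vmax in arbitrary units).
# Glucokinase is given the larger Vmax to represent a high-capacity enzyme.
ENZYMES = {
    "hexokinase": {"vmax": 1.0, "km_mm": 0.1},
    "glucokinase": {"vmax": 5.0, "km_mm": 10.0},
}

for glucose_mm in (0.5, 5.0, 20.0):
    rates = {
        name: michaelis_menten(glucose_mm, p["vmax"], p["km_mm"])
        for name, p in ENZYMES.items()
    }
    summary = ", ".join(f"{name} v = {rate:.2f}" for name, rate in rates.items())
    print(f"glucose {glucose_mm} mM: {summary}")
```

With parameters of this kind, the low-Km enzyme is already close to saturation at low glucose, while the high-Km enzyme keeps responding as glucose rises, which is one way to read the statement that glucokinase dominates glucose assimilation at high glucose concentrations.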
3.4.3.2 Aldolases for Monosaccharides
Aldolases are abundant enzymes used throughout the metabolic network for synthesizing and breaking down metabolites for the generation of carbohydrates as well as energy, nucleic acids, amino acids, and other vital metabolic components. There are two types of aldolases (Type I and Type II); each performs the same condensation reaction between an aldehyde acceptor and a ketone donor, but each enzyme tends to have specificity toward a unique donor molecule. Type I aldolases are primarily found in plants and higher-order animals. They act through a mechanism that forms a covalently bound intermediate, in which the donor molecule forms a Schiff base with a lysine residue of the enzyme. In contrast, Type II aldolases use metal cofactors, specifically Zn2+, as Lewis acids in the enzyme’s active site to help facilitate the acceptor–donor transfer, and thus no true covalent intermediate is formed. While exceptions are known, Type II aldolases are typically found only in microorganisms. Fructose 1,6-diphosphate (FDP) aldolase is one of the most important enzymes in the metabolism of monosaccharides since it lies in the center of glycolysis. The aldolase catalyzes the reaction of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (DHAP) to form FDP, the central metabolite leading to pyruvate metabolism and the TCA cycle. FDP aldolase is found in both Type I and Type II forms, with Type I generally tetrameric and Type II dimeric [31]. Type I FDP aldolases have a high degree of overall sequence homology (greater than 50%), and the active site is highly conserved throughout the evolution of the enzyme [29]. On the other hand, Type II FDP aldolases have a much lower sequence homology, yet studies still suggest a range of substrate specificity similar to that of the corresponding Type I enzymes.
Similar DHAP-dependent aldolases include the Type II aldolases for fuculose 1-phosphate and rhamnulose 1-phosphate. Both are highly common Type II aldolases found in microorganisms and function using a similar mechanism in which DHAP is the donor for the aldol reaction. These aldolases have been used in whole-cell fermentation systems synthesizing carbohydrates for industrial production. It has also been noted that the use of rhamnulose 1-phosphate can form L-rhamnose, a rare metabolite, by rhamnose isomerization in the cell after the initial aldol reaction forms L-rhamnulose from rhamnulose 1-phosphate [11]. DHAP-dependent aldolases are critical to the availability of particular isomers for later metabolic processes, and the in vivo availability of each enzyme is regulated by the simple presence of FDP, fuculose 1-phosphate, or rhamnulose 1-phosphate in the cell. A more unusual and widely studied aldolase is 2-deoxyribose 5-phosphate aldolase (DERA), which has been characterized in both function and structure [9,15]. It is unique in that no other known aldolase uses an aldehyde as the donor in the aldol reaction. The DERA enzyme from E. coli has been found capable of accepting a wide range of donor substrates, although the rate of reaction for more exotic substrates is much slower [1]. Sialic acids are generated solely by N-acetylneuraminate aldolase, a Type I aldolase found in both bacteria and animals. The aldolase catalyzes the condensation of its only known acceptable donor, pyruvate, with acetylmannosamine to form sialic acid [17]. This class of acids and their derivatives, found in bacteria and at the terminal positions of mammalian glycoconjugates, plays important roles in biochemical recognition within the cell [27]. Polysialic acids are present in bacterial cells and in mammalian cell tissues, where they are believed to be involved in cell adhesion and cell–cell communication [18].
3.5 Cell Synthesis of Lipids
3.5.1 Introduction to Lipids and Fatty Acids
The biosynthesis of lipids and fatty acids is a complex interaction of multiple pathways that begins with the generation of free palmitate. In subsequent steps, palmitate undergoes separate elongation and/or desaturation to yield the array of fatty acid molecules. Lipids in general are classified, like carbohydrates, as simple or complex, based on the physical properties of the molecule. Lipids play indispensable roles ranging from cell proliferation and cell differentiation to organ morphogenesis, all of which are processes that can be considered intimately associated with cell cycle progression [10]. Apart from serving as
[Figure 3.14: structural diagrams of sucrose, lactose, and maltose]
Figure 3.14 Major disaccharides involved in carbohydrate metabolism. Sucrose, consisting of D-glucose and D-fructose monomers, and maltose, consisting of two D-glucose monomers, both have carbon rings linked by C1 α anomer bonds. The D-galactose and D-glucose monomers of lactose have a β linkage.
signaling molecules to regulate developmental processes and defensive mechanisms [12,13], fatty acids and lipids function primarily as energy storage and provide essential components for membrane assembly. Figure 3.14 shows several important disaccharides used in energy storage and lipid synthesis. The main route for de novo generation of fatty acids occurs in the cytosol, starting with the molecule acetyl-CoA and the enzyme acetyl-CoA carboxylase (ACC).
3.5.1.1 Classification of Lipids
The heterogeneous group of lipids includes fats, oils, steroids, waxes, and related compounds that all have similar physical properties even though they vary in chemical structure and composition. Lipids, for the most part, tend to be soluble in nonpolar solvents such as chloroform and ether and are relatively insoluble in water. In many animals they are essential dietary molecules because they are not only a source of energy but also supply the fat-soluble vitamins and essential fatty acids found in natural foods. The basic lipid structure classification can be broken into three primary groups. The first group is the simple lipids, the fats and waxes, which are generally esters of fatty acids with glycerol and with other alcohols, respectively. The second group is the complex lipids, which are simple lipids containing additional functional groups. For example, the phospholipids are fats in which the molecule includes a phosphoric acid residue, sometimes carrying a nitrogen-containing base. Another complex lipid, the glycolipid, carries a carbohydrate group. The third category is the derived lipids. This diverse class of molecules includes steroids, hydrocarbons, hormones, and fat-soluble vitamins, among others.
3.5.1.2 Diversity of Lipids in Microorganisms
Comparisons within a single phylogenetic branch have shown lipid composition to be similar across species yet unique to each one, which makes lipids highly useful as marker molecules, especially in chemotaxonomy.
The differences in lipid composition are attributed to variations in growth temperature, environmental nutrient gradients, and other factors in the surrounding habitat of each microorganism, the effects of which manifest in cellular structures. One example is the presence of neutral lipids in membranes: Archaea contain small amounts of squalene and bacteria contain hopanoids, yet both groups contain varying amounts of carotenoids. Additionally, Gram-negative bacteria show variation within their own outer membrane, where each layer differs in composition. For example, the E. coli outer membrane consists of two layers, both of which contain lipoproteins. However, the outer layer is composed of a varying amount of lipopolysaccharides, while the inner layer has a phospholipid composition similar to that of the cytoplasmic membrane. In general, lipids account for about 10% of the cellular building blocks in microorganisms, with roles in energy storage, membrane structure, metabolite transport, and other metabolic processes. The fact that lipids vary in both synthesis route and structure is directly attributable to the different precursor groups used by the various prokaryotes. The basic components of most cellular lipids are
1. Glycerol phosphate, which is formed from DHAP through a reduction reaction.
2. Small hydrocarbon chain molecules from the elongation of acetyl-CoA. The chain molecules are usually 14–18 carbon fatty acids in bacteria and 20 carbon isoprene alcohols in Archaea. In both, the small acyl carrier protein (ACP) is used to activate these molecules via thioesterification.
3. A polar head group, which is attached to the long hydrocarbon chain. The two major head groups are glycerol phosphate and ethanolamine phosphate, while others include serine phosphate and glucose.
3.5.2 Sources of Metabolites for Lipogenesis
The availability of acetyl-CoA and NADPH plays a vital role in the biosynthesis of fatty acids. Acetyl-CoA is generated through carbohydrate oxidation to pyruvate, which is then converted to acetyl-CoA by one of two routes. In eukaryotes, pyruvate is first transported into the mitochondria and converted into citrate, which is exported to the cytosol. Next, using ATP and CoA, citrate is converted into cytosolic acetyl-CoA by ATP-citrate lyase, and this acetyl-CoA can then serve as the substrate for lipogenesis. Importantly, catabolism in the mitochondria also leads to the formation of acetyl-CoA. However, because the mitochondrial membrane has no transport mechanism to move acetyl-CoA into the cytosol, this pool is unusable for fatty acid biosynthesis. Prokaryotes, on the other hand, lack mitochondria, and thus the acetyl-CoA synthesized from pyruvate by pyruvate dehydrogenase can be used directly for fatty acid synthesis. The cofactor NADPH acts as the hydrogen donor for the reductions of the 3-ketoacyl and 2,3-unsaturated acyl derivatives in fatty acid biosynthesis. The oxidative reactions of the pentose phosphate pathway (see Section 3.4.2.1) are the principal source of the hydrogen used for the reductive synthesis of fatty acids. It can be noted that animal tissues that specialize in fatty acid synthesis, e.g., the liver, also possess highly active pentose phosphate pathways. Also of note, for all organisms, is that both pathways are located in the cytosol of the cell, which eliminates any permeability barrier that would complicate NADPH transport from one pathway to the other. NADPH is also derived from other sources, including the malic enzyme (NADP-dependent malate dehydrogenase), which converts malate to pyruvate.
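Two of the reactions named in this section can be written out explicitly. The equations below are a sketch based on the standard textbook forms of these reactions, not on the handbook's own notation; protonation states and charges are omitted for simplicity.

```latex
\begin{align*}
\text{citrate} + \text{ATP} + \text{CoA-SH}
  &\rightarrow \text{acetyl-CoA} + \text{oxaloacetate} + \text{ADP} + \text{P}_i
  && \text{(ATP-citrate lyase, cytosol)} \\
\text{malate} + \text{NADP}^{+}
  &\rightarrow \text{pyruvate} + \text{CO}_2 + \text{NADPH}
  && \text{(malic enzyme)}
\end{align*}
```

The first reaction shows why citrate export works as an acetyl-CoA shuttle: the two-carbon unit crosses the mitochondrial membrane as part of citrate and is released in the cytosol at the cost of one ATP.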
3.5.3 Synthesis of Long-Chain Fatty Acids
From a chemistry perspective, the main oxidative pathway appears to be the reverse of the route by which fatty acids are synthesized. However, the synthesis of fatty acids takes place only in the cytoplasm and uses the cofactor NADPH, whereas oxidation occurs in the cytoplasm of prokaryotes and within the mitochondria of eukaryotes and uses the cofactors FAD and NAD+. The Type II, or dissociated, fatty acid synthase is used for microbial fatty acid synthesis in much the same way that polypeptide synthesis occurs in bacteria. Separate proteins catalyze each of the individual reactions, and each protein is encoded by a separate gene. This is in stark contrast to the multifunctional protein complexes used
in yeast, animals, and birds, in which separate protein domains execute the reaction steps, a mode referred to as Type I fatty acid synthesis. Regardless of the type of synthesis, there are similarities in synthesis build-up, initiation, elongation, and radical formation.
3.5.3.1 Entrance to the Fatty Acid Pathway
Production of malonyl-CoA is the initial step leading to fatty acid synthesis. In addition to NADPH, lipogenesis requires the cofactors ATP, biotin, manganese ions, and bicarbonate for proper functioning of the enzymes. Specifically, the carboxylation reaction performed by ACC, in the presence of ATP, converts acetyl-CoA into malonyl-CoA, using bicarbonate as the source of CO2. ACC in E. coli is a multisubunit enzyme, encoded by accABCD, consisting of biotin carboxylase, biotin carboxyl carrier protein (BCCP), and transcarboxylase, together with a regulatory site. Activation of ACC requires the attachment of biotin to a specific lysine residue of the BCCP subunit (accB) by the biotin ligase BirA. The A and D subunits (accAD) can then perform the carboxyl transfer to acetyl-CoA; this can also act as a means of competitive regulation. ACP is then incorporated into the active enzyme complex, replacing the CoA. It is important to note that malonyl-CoA is a pivotal metabolite in plant biosynthesis, because carbon flow is distributed from it into two branches: (1) the synthesis of fatty acids and (2) the synthesis of flavonoids, the multifunctional secondary metabolites that confer plant coloration, antimicrobial protection, and UV protection. In humans, flavonoids are believed to have high medicinal value as cancer-preventive, antipathogenic, and antidiabetic agents. The increasing attention to flavonoids as nutraceuticals has driven many biotechnological endeavors to produce these metabolites in E. coli and S. cerevisiae, using these organisms’ native malonyl-CoA as the starter unit for flavonoid synthesis [19–21,32,33].
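The carboxylation described above proceeds in two half-reactions on the biotin carried by BCCP. The equations below are a sketch in the standard textbook form; the assignment of each half-reaction to a subunit follows the usual description of the E. coli enzyme and is not spelled out at this level in the text.

```latex
\begin{align*}
\text{BCCP-biotin} + \text{HCO}_3^{-} + \text{ATP}
  &\rightarrow \text{BCCP-biotin-CO}_2^{-} + \text{ADP} + \text{P}_i
  && \text{(biotin carboxylase)} \\
\text{BCCP-biotin-CO}_2^{-} + \text{acetyl-CoA}
  &\rightarrow \text{BCCP-biotin} + \text{malonyl-CoA}
  && \text{(transcarboxylase)} \\[4pt]
\text{acetyl-CoA} + \text{HCO}_3^{-} + \text{ATP}
  &\rightarrow \text{malonyl-CoA} + \text{ADP} + \text{P}_i
  && \text{(net reaction)}
\end{align*}
```

The net reaction makes the cost explicit: each malonyl-CoA entering fatty acid synthesis consumes one ATP and one bicarbonate.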
3.5.3.2 Initiation, Radical Formation, and Liberation
Following malonyl-CoA formation, a transacylation of malonyl-CoA by malonyl-CoA:ACP transacylase occurs as a first step toward fatty acid synthesis (Figure 3.15). The malonyl moiety reacts with a free ACP, forming malonyl-ACP. The free CoA byproduct is then recycled back into the metabolic network for later use. Some malonyl-ACP molecules can then be recycled back to acetyl-CoA via acetyl-ACP (through enzymes encoded by fabB and fabF in E. coli), whereas the remaining pool enters a multistep cycle in which two carbon atoms are added per cycle, building up the fatty acid chain. The initiation of fatty acid biosynthesis is marked by one of two committed steps: (1) the formation of acetyl-ACP from malonyl-ACP and acetyl-CoA or (2) the formation of acetoacetyl-ACP from malonyl-ACP and acetyl-ACP. These molecules, acetyl-ACP and acetoacetyl-ACP, are sometimes referred to as the primer molecules, since the elongation cycle builds the hydrocarbon chain from them. Branched-chain fatty acids are synthesized using a primer molecule formed by either of two methods. In one, malonyl-ACP is joined with an aldehyde generated from intermediates of branched-chain amino acid synthesis. The other method uses an alternative primer molecule formed by one or more specialized transacylases that react a free ACP with a lone acyl-CoA. The final long-chain radical is formed through elongation of the prosthetic group attached to the ACP by a series of reduction, dehydration, and condensation processes. The four-step cycle of elongation begins with the condensation of the acetyl group on the primer molecule, releasing carbon dioxide and ACP, to form 3-ketoacyl-ACP. The 3-ketoacyl-ACP synthase isoenzymes are responsible for catalyzing this irreversible reaction to elongate the carbon chain. In E. coli, the enzyme responsible is acetyl-CoA:ACP transacylase (fabH).
Next, the 3-oxo group is reduced to a D-β-hydroxyl group by 3-ketoacyl-CoA reductase, forming 3-hydroxyacyl-CoA using the cofactor NADPH; in bacteria this is generally performed by 3-oxoacyl-ACP reductase (fabG). The first reduction is followed by a dehydration via 3-hydroxyacyl-CoA dehydratase, or β-hydroxyacyl-ACP dehydratase (fabZ) in E. coli, which yields 2-trans-enoyl-CoA. Finally, a second reduction completes the elongation cycle to form another saturated acyl-ACP that has been lengthened by two carbon atoms. Enoyl-ACP reductase (fabI), a
[Figure 3.15: reaction scheme in three panels. (a) Acetyl-ACP and malonyl-ACP forming the primer acetoacetyl-ACP. (b) Elongation cycle with malonyl-ACP, releasing ACP-SH and CO2. (c) Branching from 10:0 β-hydroxydecanoyl-ACP through 10:1 Δ3 cis-3-decenoyl-ACP or 10:1 Δ2 trans-2-decenoyl-ACP to 16:1 Δ9 palmitoleoyl-ACP, palmitoyl-ACP, and 18:1 Δ11 cis-vaccenoyl-ACP.]
Figure 3.15 Fatty acid biosynthesis cycle for long-chain fatty acids. (a) Initiation begins with the formation of a primer molecule via a number of different 1, β-ketoacyl-ACP synthetases. (b) The elongation cycle used to add malonyl moieties consists of reduction by 2, β-ketoacyl-ACP reductase; dehydration by 3, β-hydroxyacyl-ACP dehydrase; reduction by the NADPH-dependent 4, enoyl-ACP reductase; and condensation of the next malonyl moiety by 5, a β-ketoacyl-ACP synthetase. (c) Distribution into different classes of long-chain fatty acids is performed by 6, β-hydroxydecanoyl-ACP dehydrase (HDD), catalyzing the dehydration and an isomerization to the 3-cis isomer; 7, β-ketoacyl-ACP synthase isoenzyme I; and 8, β-ketoacyl-ACP synthase isoenzyme II.
2-trans-enoyl-CoA reductase, with the reducing cofactor NADPH, forms the saturated carbon chain that can then serve as the substrate for the next elongation cycle. For small saturated fatty acid chains (<10 carbons in length), the process of condensation, reduction, and dehydration continues until the chain reaches a length of ten carbons. There are two routes to elongate the fatty acid chain further: β-hydroxydecanoyl-ACP dehydrase (HDD) can catalyze either a dehydration alone or a dehydration together with an isomerization reaction. The isomerization acts only on the 2-trans form, converting the chain to its 3-cis isomer and thus creating the unsaturated form of the fatty acid. When only the dehydration reaction occurs, ketoacyl-ACP synthase (KAS) I or its isoenzyme KAS II continues to grow the chain until free palmitate, a 16-carbon fatty acid, is formed. The overall synthesis equation from acetyl-CoA and malonyl-CoA to palmitate is
CH3CO.S.CoA + 7 HOOC.CH2CO.S.CoA + 14 NADPH + 14 H+ → CH3(CH2)14COOH + 7 CO2 + 6 H2O + 8 CoA.SH + 14 NADP+
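The coefficients in this overall equation follow from counting elongation cycles. The sketch below is illustrative only: the function names are hypothetical, and the generalization to other even chain lengths (one acetyl primer, one malonyl unit per cycle, two NADPH per cycle, one water consumed to release the free acid) is an assumption extrapolated from the palmitate case. The ATP entry reflects the one ATP used by ACC per malonyl-CoA.

```python
def saturated_fa_stoichiometry(n_carbons: int) -> dict:
    """Overall coefficients for one even-chain saturated fatty acid.

    Assumes one acetyl-CoA primer followed by (n/2 - 1) elongation cycles,
    each consuming one malonyl-CoA and two NADPH.
    """
    if n_carbons < 4 or n_carbons % 2 != 0:
        raise ValueError("expected an even chain length of at least 4 carbons")
    cycles = n_carbons // 2 - 1
    return {
        "acetyl_CoA": 1,
        "malonyl_CoA": cycles,          # one per condensation
        "NADPH": 2 * cycles,            # two reductions per cycle
        "CO2": cycles,                  # one decarboxylation per condensation
        "H2O": cycles - 1,              # one dehydration per cycle, minus one
                                        # water used to release the free acid
        "CoA_SH": cycles + 1,           # primer CoA plus one per malonyl-CoA
        "ATP_for_malonyl_CoA": cycles,  # one ATP per ACC carboxylation
    }


def acyl_carbon_balance(n_carbons: int) -> tuple:
    """Carbons entering as acetyl (2 C) and malonyl (3 C) groups versus
    carbons leaving as fatty acid and CO2. CoA carbons are ignored because
    CoA is recovered unchanged."""
    s = saturated_fa_stoichiometry(n_carbons)
    carbons_in = 2 * s["acetyl_CoA"] + 3 * s["malonyl_CoA"]
    carbons_out = n_carbons + s["CO2"]
    return carbons_in, carbons_out


if __name__ == "__main__":
    print(saturated_fa_stoichiometry(16))
    print(acyl_carbon_balance(16))
```

For palmitate (16 carbons) the function gives 7 malonyl-CoA, 14 NADPH, 7 CO2, 6 H2O, and 8 CoA-SH, matching the equation, and the acyl carbons balance at 23 on each side.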
In the case of isomerization, 2-trans-enoyl-CoA reductase is not capable of performing the reduction reaction, which forces further elongation toward unsaturated fatty acids. For the synthesis of unsaturated fatty acids, the C10-ACP intermediate is used as the precursor, and elongation continues by KAS I alone until the chain has reached sixteen carbons (C16:1Δ9 cis) in length. From there, KAS II actively elongates the unsaturated fatty acid to a chain length of 18 carbons (C18:1Δ11 cis). This so-called anaerobic pathway is used for the synthesis of unsaturated fatty acids in anaerobic microbes and in many aerobic microbes. In many Gram-positive aerobes, a so-called aerobic pathway is used for the generation of polyunsaturated fatty acids. In this pathway the unsaturation occurs not at 10 carbons but at 16 carbons in length. A fatty acid desaturase, with the help of molecular oxygen, introduces a double bond in the middle of C16 acyl-ACP to form C16:1Δ9 cis (palmitoleoyl-ACP).
3.5.3.3 Completion of Lipid Synthesis
As the acyl chains are formed through the elongation process, the acyl group of acyl-ACP is transferred into membrane phospholipids by glycerol phosphate acyltransferase (encoded by plsB), which yields the phosphatidic acids. Different acyltransferases prefer different acyl chain lengths and cis-unsaturated or branched-chain acids, thus ensuring an even distribution of different fatty acids in the membrane. The addition of polar head groups follows the insertion of the fatty acid molecule into the membrane by a variety of transfer mechanisms to produce a phospholipid or a member of one of the other major classes. The most prevalent phosphatidic acid, diacylglycerol 3-phosphate, is activated by nucleotidyl transfer from CTP by CDP-diglyceride synthase to form CDP-diacylglycerol. Subsequent reactions by a variety of specific enzymes lead to the array of phospholipids. Diacylglycerol 3-phosphate can also be activated by glycosyl transfer after a phosphorylation to give the precursor for glycolipids.
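The double-bond positions quoted for the anaerobic pathway in Section 3.5.3.2 (Δ3 at C10, Δ9 at C16, Δ11 at C18) follow from simple bookkeeping: each elongation cycle adds two carbons at the carboxyl end, so a bond counted from the carboxyl carbon moves two positions further along with every cycle. The sketch below is a hypothetical illustration of that arithmetic; the function name is invented, and the enzyme labels simply restate the KAS I and KAS II division given in the text.

```python
def anaerobic_unsaturated_route(start_carbons=10, start_delta=3, final_carbons=18):
    """Track chain length and cis double-bond position (counted from the
    carboxyl carbon) as an unsaturated acyl-ACP is elongated two carbons
    at a time."""
    if final_carbons < start_carbons or (final_carbons - start_carbons) % 2:
        raise ValueError("final length must exceed start by a multiple of two")
    route = [(start_carbons, start_delta, "HDD (dehydration + isomerization)")]
    carbons, delta = start_carbons, start_delta
    while carbons < final_carbons:
        carbons += 2  # two carbons added at the carboxyl end
        delta += 2    # so the existing double bond shifts by two positions
        # Per the text: KAS I elongates up to C16, KAS II takes C16 to C18.
        enzyme = "KAS I" if carbons <= 16 else "KAS II"
        route.append((carbons, delta, enzyme))
    return route


if __name__ == "__main__":
    for carbons, delta, enzyme in anaerobic_unsaturated_route():
        print(f"C{carbons}:1 delta-{delta} cis  via {enzyme}")
```

Starting from the cis-3 C10 intermediate, three cycles give the Δ9 C16 product and a fourth gives the Δ11 C18 product, which is why the anaerobic route yields cis-vaccenate rather than a Δ9 C18 acid.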
3.5.4 Synthesis of Neutral Lipids
Neutral lipids are a group of lipophilic compounds (e.g., carotenoids, squalene, C40 isoprenoids) derived from acetyl-CoA via the addition of C5 isoprene units. This different synthesis route gives neutral lipids a structure and function distinct from those of conventionally formed fatty acids. Like the long-chain saturated and unsaturated fatty acids that make up the bulk of cellular lipids, neutral lipids are essential structural and functional elements of membranes. These polymers are known to play roles in membrane stability, electron transport, sugar and oligosaccharide transfer through membranes, and even light absorption, either for protection from light or to harvest light energy (specifically in bacteria). Much of the current literature shows the importance of polar lipids, e.g., phospholipids, while comparatively little attention has been paid to the metabolism of neutral lipids. An excellent review of neutral lipids has been assembled by Coppens and Vielemeyer that includes a discussion of new pharmacological targets from pathways generating neutral lipids [7].
As a first step in their biosynthetic pathway, two acetyl-CoA molecules are condensed to form a single acetoacetyl-CoA. A third molecule of acetyl-CoA is then quickly condensed on the carbon of the second ketone group, creating 3-hydroxy-3-methyl-glutaryl-CoA (HMG-CoA). A reduction of the CoA-activated acid to an alcohol forms mevalonic acid, which in turn is diphosphorylated to form a single isoprenyl unit, isopentenyl diphosphate. From here one isoprenyl unit is isomerized, forming a prenyl unit, onto which a second isoprenyl unit is condensed, thus creating a chained molecule via a simple polymerization chain reaction. This process can continue to create very large molecules, as is the case for C55 isoprenoids and quinones. The polymerization process is generally described by reference to the head and tail units of each molecule, where the product is stabilized by hydrogen ion elimination. The pyrophosphate leaving group pulls out the oxygen atom during the activation process to leave behind a reactive carbenium cation, referred to as the head. This head can easily add to the tail, that is, the double bond of the isopentenyl unit, of an adjacent molecule. The condensation reaction can occur in head-to-tail (as explained above), head-to-head, or tail-to-tail conformations to form the variety of polyprenyl compounds containing two, three, four, or more isoprenoid units. Those with more than four units are generally derived by the union of two sesquiterpene units (giving six units in total) or two diterpene units (giving eight units in total).
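The unit counts in this paragraph reduce to multiples of the five-carbon isoprene unit. The sketch below is a hypothetical illustration: the names are invented, and the class sizes used (sesquiterpene as three units, diterpene as four) are the standard terpene definitions rather than values stated in the text.

```python
ISOPRENE_UNIT_CARBONS = 5

# Standard terpene classes, given as number of C5 isoprene units
# (an assumption from general nomenclature, not taken from the text).
TERPENE_UNITS = {"monoterpene": 2, "sesquiterpene": 3, "diterpene": 4}


def carbon_count(units: int) -> int:
    """Carbon atoms in a polyprenyl chain built from `units` isoprene units."""
    if units < 1:
        raise ValueError("need at least one isoprene unit")
    return ISOPRENE_UNIT_CARBONS * units


def dimerize(terpene_class: str) -> tuple:
    """Join two identical terpene chains; return (units, carbons) of the product."""
    units = 2 * TERPENE_UNITS[terpene_class]
    return units, carbon_count(units)


if __name__ == "__main__":
    print(dimerize("sesquiterpene"))  # six units, e.g. the C30 squalene skeleton
    print(dimerize("diterpene"))      # eight units, e.g. C40 carotenoid skeletons
    print(carbon_count(11))           # the C55 isoprenoids mentioned in the text
```

Joining two sesquiterpene chains gives six units (C30) and joining two diterpene chains gives eight units (C40), which matches the squalene and C40 isoprenoid examples named at the start of Section 3.5.4.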
References
1. Barbas, C. F., Y. F. Wang, and C. H. Wong. 1990. Deoxyribose-5-phosphate aldolase as a synthetic catalyst. J. Am. Chem. Soc., 112:2013–14.
2. Buckley, R. H. 2002. Primary cellular immunodeficiencies. J. Allergy Clin. Immunol., 109:747–57.
3. Buckley, R. H. 2002. Primary immunodeficiency diseases: dissectors of the immune system. Immunol. Rev., 185:206–19.
4. Cabrera, R., A. Caniuguir, A. L. Ambrosio, V. Guixe, R. C. Garratt, and J. Babul. 2006. Crystallization and preliminary crystallographic analysis of the tetrameric form of phosphofructokinase-2 from Escherichia coli, a member of the ribokinase family. Acta Crystallograph Sect. F Struct. Biol. Cryst. Commun., 62:935–37.
5. Cao, P., Y. Gong, L. Tang, Y. C. Leung, and T. Jiang. 2006. Crystal structure of human pyridoxal kinase. J. Struct. Biol., 154:327–32.
6. Chuvikovsky, D. V., R. S. Esipov, Y. S. Skoblov, L. A. Chupova, T. I. Muravyova, A. I. Miroshnikov, S. Lapinjoki, and I. A. Mikhailopulo. 2006. Ribokinase from E. coli: expression, purification, and substrate specificity. Bioorg. Med. Chem., 14:6327–32.
7. Coppens, I. and O. Vielemeyer. 2005. Insights into unique physiological features of neutral lipids in Apicomplexa: from storage to potential mediation in parasite metabolic activities. Int. J. Parasitol., 35:597–615.
8. Demuynck, C., F. Fisson, I. Bennanibaiti, H. Samaki, and J. C. Mani. 1990. Immunoaffinity purification of transketolases from yeast and spinach leaves. Agri. Biol. Chem., 54:3073–78.
9. DeSantis, G., J. Liu, D. P. Clark, A. Heine, I. A. Wilson, and C. H. Wong. 2003. Structure-based mutagenesis approaches toward expanding the substrate specificity of D-2-deoxyribose-5-phosphate aldolase. Bioorg. Med. Chem., 11:43–52.
10. Donnelly, P. M., D. Bonetta, H. Tsukaya, R. E. Dengler, and N. G. Dengler. 1999. Cell cycling and cell enlargement in developing leaves of Arabidopsis. Dev. Biol., 215:407–19.
11. Drueckhammer, D. G., J. R. Durrwachter, R. L. Pederson, D. C. Crans, L. Daniels, and C.-H. Wong. 1989. Reversible and in situ formation of organic arsenates and vanadates as organic phosphate mimics in enzymatic reactions: mechanistic investigation of aldol reactions and synthetic applications. J. Org. Chem., 54:70–77.
12. Farmer, E. E. 1994. Fatty acid signalling in plants and their associated microorganisms. Plant Mol. Biol., 26:1423–37.
13. Farmer, E. E., H. Weber, and S. Vollenweider. 1998. Fatty acid signaling in Arabidopsis. Planta, 206:167–74.
14. Fraenkel, D. G. 1986. Mutants in glucose metabolism. Annu. Rev. Biochem., 55:317–37.
15. Horinouchi, N., J. Ogawa, T. Sakai, T. Kawano, S. Matsumoto, M. Sasaki, Y. Mikami, and S. Shimizu. 2003. Construction of deoxyriboaldolase-overexpressing Escherichia coli and its application to 2-deoxyribose 5-phosphate synthesis from glucose and acetaldehyde for 2´-deoxyribonucleoside production. Appl. Environ. Microbiol., 69:3791–97.
16. Jeffries, T. W. and Y. S. Jin. 2004. Metabolic engineering for improved fermentation of pentoses by yeasts. Appl. Microbiol. Biotechnol., 63:495–509.
17. Kim, M.-J., J. Hennen, H. M. Sweers, and C.-H. Wong. 1988. Enzymes in carbohydrate synthesis: N-acetylneuraminic acid aldolase catalyzed reactions and preparation of N-acetyl-2-deoxy-D-neuraminic acid derivatives. J. Am. Chem. Soc., 110:6481–86.
18. Lasky, L. A. 1992. Selectins: interpreters of cell-specific carbohydrate information during inflammation. Science, 258:964–69.
19. Leonard, E., J. Chemler, K. H. Lim, and M. A. Koffas. 2006. Expression of a soluble flavone synthase allows the biosynthesis of phytoestrogen derivatives in Escherichia coli. Appl. Microbiol. Biotechnol., 70:85–91.
20. Leonard, E., Y. Yan, and M. A. Koffas. 2006. Functional expression of a P450 flavonoid hydroxylase for the biosynthesis of plant-specific hydroxylated flavonols in Escherichia coli. Metab. Eng., 8:172–81.
21. Leonard, E., Y. Yan, K. H. Lim, and M. A. Koffas. 2005. Investigation of two distinct flavone synthases for plant-specific flavone biosynthesis in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 71:8241–48.
22. Long, M. C., V. Escuyer, and W. B. Parker. 2003. Identification and characterization of a unique adenosine kinase from Mycobacterium tuberculosis. J. Bacteriol., 185:6548–55.
23. Maugeri, D. A., J. J. Cazzulo, R. J. Burchmore, M. P. Barrett, and P. O. Ogbunude. 2003. Pentose phosphate metabolism in Leishmania mexicana. Mol. Biochem. Parasitol., 130:117–25.
24. McCarthy, G. 2004. Medical diagnosis, management and treatment of Lesch Nyhan disease. Nucleosides Nucleotides Nucleic Acids, 23:1147–52.
25. Nyhan, W. L. 2005. Disorders of purine and pyrimidine metabolism. Mol. Genet. Metab., 86:25–33.
26. Parducci, R. E., R. Cabrera, M. Baez, and V. Guixe. 2006. Evidence for a catalytic Mg2+ ion and effect of phosphate on the activity of Escherichia coli phosphofructokinase-2: regulatory properties of a ribokinase family member. Biochemistry, 45:9291–99.
27. Paulson, J. C. 1989. Glycoproteins: what are the sugar chains for? Trends Biochem. Sci., 14:272–76.
28. Suami, T. 1979. Modifications of aminocyclitol antibiotics. Jpn. J. Antibiot., 32 Suppl:S91–102.
29. Sygusch, J., D. Beaudry, and M. Allaire. 1987. Molecular architecture of rabbit skeletal muscle aldolase at 2.7-Å resolution. Proc. Natl. Acad. Sci. USA, 84:7846–50.
30. Tsukada, Y., W. K. D. Bischof, N. Hibi, H. Hirai, E. Hurwitz, and M. Sela. 1982. Effect of a conjugate of daunomycin and antibodies to rat alpha-fetoprotein on the growth of alpha-fetoprotein-producing tumor-cells. Proc. Natl. Acad. Sci. USA, 79:621–25.
31. Vonderosten, C. H., A. J. Sinskey, C. F. Barbas, R. L. Pederson, Y. F. Wang, and C. H. Wong. 1989. Use of a recombinant bacterial fructose-1,6-diphosphate aldolase in aldol reactions: preparative syntheses of 1-deoxynojirimycin, 1-deoxymannojirimycin, 1,4-dideoxy-1,4-imino-D-arabinitol, and fagomine. J. Am. Chem. Soc., 111:3924–27.
32. Yan, Y., J. Chemler, L. Huang, S. Martens, and M. A. Koffas. 2005. Metabolic engineering of anthocyanin biosynthesis in Escherichia coli. Appl. Environ. Microbiol., 71:3617–23.
33. Yan, Y., A. Kohli, and M. A. Koffas. 2005. Biosynthesis of natural flavanones in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 71:5610–13.
34. Zhao, J., T. Baba, H. Mori, and K. Shimizu. 2004. Global metabolic response of Escherichia coli to gnd or zwf gene-knockout, based on 13C-labeling experiments and the measurement of enzyme activities. Appl. Microbiol. Biotechnol., 64:91–98.
4 Polymerization of Building Blocks to Macromolecules: Polyhydroxyalkanoates as an Example
Si Jae Park, LG Chem Ltd.
Soon Ho Hong, University of Ulsan
Sang Yup Lee, Korea Advanced Institute of Science and Technology
4.1 Introduction
4.2 PHAs
4.3 PHA Synthases
Evolution of PHA Synthases
4.4 Metabolic Engineering of Microorganisms for PHA Production
SCL-PHA • MCL-PHA and SCL-MCL-PHA
4.5 Conclusion
Acknowledgments
References
4.1 Introduction
Microorganisms synthesize many building blocks that are mainly used for the synthesis of macromolecules such as proteins, DNA, RNA, and lipids. Because these macromolecules are indispensable for living organisms, metabolism is optimized to provide enough precursors to synthesize them. A polymer is made by covalently linking together simple small molecules called monomers. The general term for the process that leads to the formation of a polymer is polymerization. It should be noted that there are many ways to polymerize monomers to synthesize various macromolecules. Biopolymers are among the most sophisticated and complex polymers on earth, and it is important to understand how monomers or building blocks can assemble covalently into life-enabling polymers. Many different biopolymers have been studied in detail with respect to their biosynthetic mechanisms, the enzymes and precursors involved, and metabolic engineering strategies. For detailed information on various biopolymers, readers are referred to the excellent book on biopolymers recently published [1]. Since this book is about metabolic engineering and only limited space is available, we focus on only one family of biopolymers for detailed description. Polyhydroxyalkanoates (PHAs) are an interesting family of carbon and energy storage polymers, which are formed by polymerization of monomers derived from central carbon metabolism [2]. There have been many advances in our understanding of the metabolism of PHA biosynthesis and the molecular characteristics of the genes and enzymes involved. A number of interesting metabolic engineering studies have been performed toward enhanced production of PHAs, production of novel PHAs, and utilization of inexpensive carbon sources. We will start with what PHAs are and how they are
synthesized. Following a description of the metabolism and molecular characteristics of the genes and enzymes involved, we will review the metabolic engineering examples. Even though the polymerization processes are different for other biopolymers, the general strategies described in this chapter should be valid for metabolic engineering toward enhanced production of biopolymers and for the production of novel and/or tailor-made biopolymers.
4.2 PHAs
Among the various biopolymers produced by microorganisms, PHAs have attracted much attention for commercial application because of their complete biodegradability and mechanical properties similar to those of general-purpose polymers. More than 150 hydroxyalkanoic acids have been identified to date as monomer constituents of natural PHAs [3]. (R)-3-hydroxyalkanoic acids are the most frequently used monomers of PHAs. PHAs show a broad range of polymer characteristics, from thermoplastics to rubberlike polymers, depending on the types and amounts of monomer constituents [4]. Thermoplastic short chain length (SCL) PHAs are composed of monomers with three to five carbon atoms, and rubberlike medium chain length (MCL) PHAs are composed of monomers with 6 to 14 carbon atoms [4]. Recently, PHA copolymers containing both SCL and MCL monomers (SCL-MCL-PHA) were identified in several bacteria. The material properties of these copolymers depend on the amount of SCL or MCL monomers. For example, incorporation of small amounts of MCL monomer units into a poly(3-hydroxybutyrate) [P(3HB)] backbone makes the properties of the PHA material similar to those of low density polyethylene (LDPE), which opens new opportunities for commercial applications of PHAs [5]. PHA synthesis in microorganisms can be divided into two major processes: (R)-hydroxyacyl-CoAs (RHA-CoAs) are generated from various metabolic pathways, and then RHA-CoAs are polymerized into PHA. In a broad sense, all metabolic pathways of the microorganism are involved in these processes, providing the RHA-CoAs that are finally used up for PHA synthesis. However, when we consider the enzymes directly involved in PHA biosynthesis, they fall into three classes according to their function. The first group consists of the PHA synthases, which have the key role in PHA biosynthesis, the polymerization of RHA-CoAs (Figure 4.1; Table 4.1).
The wide substrate specificity of PHA synthases results in the incorporation of a variety of monomer constituents into PHAs. The second group consists of the enzymes that generate RHA-CoAs from the intermediates of metabolism, such as β-ketothiolase, 3-ketoacyl-CoA reductase, and enoyl-CoA hydratase (Table 4.2). Finally, the third group comprises proteins involved in the regulation of PHA synthesis at the transcription and translation level and in the stabilization of PHA granules in bacteria, such as PhaR, PhaF, and the phasins (PhaI and PhaP).
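The SCL and MCL definitions given in this section amount to a simple rule on monomer carbon count. The sketch below is a hypothetical illustration: the function names are invented, the carbon ranges (three to five for SCL, 6 to 14 for MCL) are taken from the text, and the example monomers are assumed to be 3-hydroxybutyrate (4 carbons) and 3-hydroxyhexanoate (6 carbons).

```python
def monomer_class(carbons: int) -> str:
    """Classify a hydroxyalkanoate monomer by its number of carbon atoms."""
    if 3 <= carbons <= 5:
        return "SCL"
    if 6 <= carbons <= 14:
        return "MCL"
    raise ValueError(f"{carbons} carbons is outside the SCL/MCL ranges")


def pha_class(monomer_carbons) -> str:
    """Classify a PHA from the carbon counts of its monomer constituents."""
    classes = {monomer_class(c) for c in monomer_carbons}
    if not classes:
        raise ValueError("a polymer needs at least one monomer type")
    if classes == {"SCL"}:
        return "SCL-PHA"
    if classes == {"MCL"}:
        return "MCL-PHA"
    return "SCL-MCL-PHA"


if __name__ == "__main__":
    print(pha_class([4]))      # P(3HB) homopolymer
    print(pha_class([6, 8]))   # copolymer of two MCL monomers
    print(pha_class([4, 6]))   # 3HB with a small MCL fraction
```

The rule says nothing about composition ratios, which the text notes are what actually determine material properties; a fuller model would carry mole fractions alongside the carbon counts.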
4.3 PHA Synthases
To synthesize PHA in the cytoplasm of microorganisms, the combined action of RHA-CoA generating enzymes and the RHA-CoA polymerizing PHA synthase is critical. PHA synthase carries out the key role in PHA synthesis, the polymerization of RHA-CoAs into PHA granules. To date, over 60 PHA synthases have been obtained and characterized at a molecular level [6]. PHA synthases can be classified into four major groups depending on their substrate specificity, subunit composition, and the genetic organization of the PHA synthesis operon (Figure 4.1) [6]. Class I PHA synthases have high substrate specificity toward RHA-CoAs having three to five carbon atoms and consist of only one type of subunit, PhaC, with a molecular weight of around 60–73 kDa. The Ralstonia eutropha and Alcaligenes latus PHA synthases belong to the Class I group [7,8]. In R. eutropha, the PHA synthase gene (phaC) forms an operon (phaCAB) with the phaA and phaB genes, which encode β-ketothiolase and NADPH-dependent acetoacetyl-CoA reductase, respectively. A different genetic order of phaC, phaA, and phaB in the PHA operon is found in some bacteria, but the phaC gene is often found to be colinearized with the phaA and phaB genes. Class II PHA synthases are also composed of one type of subunit, PhaC, with a molecular weight of around 60–65 kDa, but have higher specificity for RHA-CoAs having six to 12 carbon atoms. Class II PHA
Figure 4.1 Genetic organization of PHA synthases and their subunits. Class I: Ralstonia eutropha, PhaC (60–73 kDa). Class II: Pseudomonas aeruginosa, PhaC (60–65 kDa). Class III: Allochromatium vinosum, PhaC (40 kDa) and PhaE (40 kDa). Class IV: Bacillus megaterium, PhaR (22 kDa) and PhaC (40 kDa).
synthases are found mainly in Pseudomonas sp., for example Pseudomonas aeruginosa [9] and P. putida. Two PHA synthase genes (phaC1 and phaC2) form the operon, in which phaC1 and phaC2 are separated by the phaZ gene encoding an intracellular PHA depolymerase. The two PHA synthases show high amino acid sequence homology and have similar substrate specificity. Genes encoding regulatory proteins such as PhaD, PhaI, and PhaF are located downstream of the phaC1-phaZ-phaC2 operon. Class III PHA synthases are often found in cyanobacteria and were first cloned and characterized from Allochromatium vinosum [10]. They consist of two different types of subunits, PhaC and PhaE, each with a molecular weight of ca. 40 kDa. PhaC has an amino acid sequence homology of 21–28% with Class I and II PHA synthases, but PhaE shows no sequence similarity to PHA synthases. In A. vinosum, the phaCE genes and phaAB genes are oriented in opposite directions. These PHA synthases prefer RHA-CoAs with three to five carbon atoms; however, they were recently found to have a rather broad substrate specificity, accepting RHA-CoAs with three to ten carbon atoms. Class IV PHA synthases occur in Bacillus sp. and are composed of two different types of subunits, PhaR and PhaC [11]. The molecular weights of PhaR and PhaC are 20 kDa and 40 kDa, respectively. The phaP, phaQ, phaR, phaB, and phaC genes are colocalized in the chromosome, with the phaPQ genes and phaRBC genes oriented in opposite directions. These PHA synthases preferentially use RHA-CoAs with three to five carbon atoms.

Substrate specificity with respect to the chain length of RHA-CoAs is one of the key criteria in classifying PHA synthases. Recently, however, PHA synthases with rather broad substrate specificity were found in some bacteria, including Thiocapsa pfennigii [12], Pseudomonas sp. 61-3 [13], and Aeromonas caviae [14]. T. pfennigii has a Class III PHA synthase with substrate specificity for RHA-CoAs of three to 12 carbon atoms. The PHA synthases of Pseudomonas sp. 61-3 belong to Class II; however, PhaC1 and PhaC2 from this strain polymerize PHAs consisting of 3-hydroxybutyrate (3HB) and MCL-RHAs. The A. caviae PHA synthase, which shows high similarity to Class I PHA synthases (ca. 45%), synthesizes PHAs consisting of 3HB and 3-hydroxyhexanoate (3HHx). Further studies revealed that the A. caviae PHA synthase can even accept 3-hydroxyheptanoyl-CoA (3HHp-CoA) [15]. Furthermore, a typical Class I enzyme, the R. eutropha PHA synthase, was found to produce a copolyester consisting of 3HB and MCL-RHAs when the metabolic pathways of the host strain were optimized to supply MCL-RHA-CoAs [16,17]. Taken together, these findings indicate that the substrate specificity of PHA synthases is broader and more versatile than the class definitions suggest.
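The four-class scheme described in Section 4.3 can be summarized as a small lookup table. The sketch below is illustrative only: the data structure and function names are hypothetical, the example organisms are those named in the text and in Figure 4.1, and the carbon ranges encode the stated preferences of each class, not hard limits (the text itself lists several synthases with broader specificity).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthaseClass:
    name: str
    example_organism: str
    subunits: tuple
    preferred_carbons: range  # preferred RHA-CoA chain lengths


# Values transcribed from Section 4.3; preferences, not strict limits.
SYNTHASE_CLASSES = (
    SynthaseClass("I", "Ralstonia eutropha", ("PhaC",), range(3, 6)),
    SynthaseClass("II", "Pseudomonas aeruginosa", ("PhaC",), range(6, 13)),
    SynthaseClass("III", "Allochromatium vinosum", ("PhaC", "PhaE"), range(3, 6)),
    SynthaseClass("IV", "Bacillus megaterium", ("PhaR", "PhaC"), range(3, 6)),
)


def classes_preferring(carbons: int) -> list:
    """Return the synthase classes whose preferred range covers this chain length."""
    return [c.name for c in SYNTHASE_CLASSES if carbons in c.preferred_carbons]


print(classes_preferring(4))  # ['I', 'III', 'IV']
print(classes_preferring(8))  # ['II']
```

A lookup like this is useful only as a first filter; as the section notes, enzymes such as the A. caviae and Pseudomonas sp. 61-3 synthases accept monomers outside the nominal range of their class.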
4.3.1 Evolution of PHA Synthases
Biosynthesis of PHA consists of two major steps: the supply of monomer substrates and the polymerization of monomer units by PHA synthase. PHA productivity is therefore a function of the activities of both the monomer-supplying enzymes and the polymerizing enzyme. Efficient monomer supply is often considered the more critical step, because polymerization generally proceeds only when the substrate specificity and activity of the PHA synthase are well matched to the monomers supplied. Nevertheless, PHA synthase is a key factor in determining PHA productivity as well as polymer material properties, including molecular weight, polydispersity, and the composition and distribution of monomers in copolymers. PHA synthase has therefore always been a central topic in PHA research. Studies of the amino acid sequences of PHA synthases from various microorganisms, and especially of the polymerization mechanisms of the PHA synthases from R. eutropha and A. vinosum, have provided clues to the amino acid residues that are important for polymerization. Protein engineering by site-directed mutagenesis is generally guided by a tertiary structure determined by X-ray crystallography, so the lack of tertiary structure information for the polymerase hampers in-depth protein engineering of PHA synthases. Even though the exact tertiary structure of PHA synthases is not presently available, bioinformatic tools allow the structure of the polymerase to be approximated, which guides the design of experiments to identify the amino acid residues important for PHA biosynthesis.

Since the first report of the nucleotide sequence of the R. eutropha PHA synthase, all PHA synthases have been found to contain a lipase box (G-X-[S/C]-X-G) in which the essential active-site serine of lipases is replaced by cysteine [7]. Furthermore, a BLAST amino acid sequence homology search using A. vinosum PhaC as the query showed that this cysteine aligns with the active-site serine of bacterial lipases from P. cepacia, Pseudomonas sp. KW1-56, and P. luteola, and revealed that a span of 45 residues (131–175) of A. vinosum PhaC, which includes the lipase box, shows 42% sequence identity to the P. cepacia lipase [18]. Because the crystal structure of the P. cepacia lipase is known, this structural information is helpful in investigating PHA synthase: PhaC is likely to adopt a similar structure in this region. Using the tertiary structure of the P. cepacia lipase as a guide, the Sinskey and Stubbe group reported a molecular mechanism for polymerization by the A. vinosum PHA synthase. Mutagenesis of residues C149, H331, H303, D302, and C130 of the A. vinosum PHA synthase revealed that H331 is the general base catalyst that activates the nucleophile, C149, for covalent catalysis [18]. These studies also corrected the previously proposed role of C130 by showing that it is not directly involved in catalysis. The authors suggested that D302 functions as a general base catalyst in activating 3HB-CoA for nucleophilic attack on the covalently linked thiol ester intermediate, and thereby affects chain elongation during polymerization [18].

Because structural information on PHA synthase is lacking, mutant PHA synthases generated by random mutagenesis have been screened and characterized to investigate the molecular characteristics of the polymerase. Mutant PHA synthases from R. eutropha, A. caviae, Pseudomonas sp. 61-3, and P. putida have been generated by random mutagenesis using error-prone PCR or a mutator strain [19–31]. Mutants were screened by selecting cell colonies showing an enhanced level of biosynthesis of P(3HB), the best characterized member of the PHA family. Once PHA synthases showing dramatic changes in P(3HB) accumulation had been identified, site-directed saturation mutagenesis of the target amino acids was carried out to identify other mutations conferring similarly enhanced characteristics. Representative results of in vitro mutagenesis with significant effects on PHA biosynthesis in various polymerases are summarized in Table 4.1.
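The lipase box motif G-X-[S/C]-X-G mentioned in this section maps directly onto a regular expression over one-letter amino acid codes. The sketch below shows one way such a scan might be written; the function name is hypothetical and the test sequence is an invented toy string, not a real PhaC sequence.

```python
import re

# G-X-[S/C]-X-G as a regular expression; the lookahead allows
# overlapping occurrences of the motif to be reported.
LIPASE_BOX = re.compile(r"(?=(G.[SC].G))")


def find_lipase_boxes(sequence: str) -> list:
    """Return (1-based position of the central S/C, matched motif) pairs."""
    hits = []
    for match in LIPASE_BOX.finditer(sequence.upper()):
        # match.start() is the 0-based index of the first G, so the
        # central residue sits at 1-based position start + 3.
        hits.append((match.start() + 3, match.group(1)))
    return hits


# Toy sequence (hypothetical) containing one cysteine-type lipase box.
print(find_lipase_boxes("MKTAGYCLGGTLLAA"))  # [(7, 'GYCLG')]
```

A motif hit only flags a candidate active-site residue. As the mutagenesis studies cited above illustrate, confirming a catalytic role requires experimental evidence.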
4.4 Metabolic Engineering of Microorganisms for PHA Production
4.4.1 SCL-PHA
The substrates of PHA synthases, RHA-CoAs, are generated either by activation of hydroxyfatty acids using a CoA transferase or CoA synthetase, or by conversion of inherent intermediates of host metabolism. The first approach is typically used to investigate newly found enzymes, especially PHA synthases from novel PHA-producing bacteria, for their activity in PHA biosynthesis, and to synthesize PHAs consisting of novel monomer units. Hydroxyfatty acids are usually difficult to use for PHA biosynthesis on a large scale because they are expensive and toxic to the cells. They are therefore employed in the small-scale production of novel PHAs to confirm the feasibility of a process. Recombinant microorganisms expressing a CoA transferase or CoA synthetase that activates the corresponding hydroxyfatty acid have successfully synthesized PHAs containing SCL monomers such as 3HB, 3-hydroxypropionate (3HP), 4-hydroxybutyrate (4HB), and 3-hydroxyvalerate (3HV). Coexpression of the Clostridium kluyveri 4-hydroxybutyryl-CoA:CoA transferase gene (cat2) and the R. eutropha PHA synthase gene in E. coli led to the production of P(4HB) homopolymer from glucose and 4-hydroxybutyric acid. In the absence of glucose, P(3HB-co-4HB) containing 72 mol% of 3HB was accumulated, which suggests that E. coli operates a hitherto unknown pathway for converting 4HB to 3HB that is suppressed in the presence of glucose [32]. 4HB can also be activated to 4HB-CoA by simultaneous expression of the Clostridium acetobutylicum butyrate kinase (buk) and phosphotransbutyrylase (ptb) genes in E. coli [33]. Growing an E. coli strain expressing the phaC, ptb, and buk genes in medium supplemented with 3-hydroxybutyric acid, 4-hydroxybutyric acid, or 4-hydroxyvaleric acid resulted in the formation of homopolymers of 3HB, 4HB, and 4HV, respectively.
Table 4.1 Representative Mutagenesis to Change Characteristics of PHA Synthase
Enzymes | Mutation | Characteristics | Methods | Reference
Aeromonas punctata PhaC | F518I; F362I, F518I; D459V, A513C | Increase of specific activity of PHA synthase; increase of molecular weight of PHA | Mutator strain | [29]
Aeromonas caviae PhaC | D171G; N149S | Increase of PHA content; increase of 3HHx fraction in copolymer | Error-prone PCR | [26]
Ralstonia eutropha PhaC | F420S; G4D; E267K; Y445F; L446K | Increase of specific activity of PHA synthase/decrease of molecular weight of PHA; increase of molecular weight of PHA/increase of expression level of PhaC; decrease of specific activity of PHA synthase; decrease of specific activity of PHA synthase/conferring substrate specificity toward 3HO to PhaC; decrease of specific activity of PHA synthase | Error-prone PCR/saturated mutagenesis | [20,21,27,28]
Pseudomonas sp. 61-3 PhaC1 | E130D; Q418K; S325C; Q481M | Increase of PHA synthesis activity/increase of 3HB fraction in copolymer; increase of molecular weight of PHA; increase of 3HB fraction in copolymer; increase of 3HB fraction in copolymer | Error-prone PCR | [19,22,23,25]
Pseudomonas Gpo1 PhaC1 and Pseudomonas aeruginosa PhaC1 | Several point mutations in α/β hydrolase region/site-specific mutation | Change of substrate specificity of PhaC1 to use 3HB as a monomer | Site-specific or semi-random mutagenesis in specific region | [30,31]
Pathways for the synthesis of a copolymer of 3HP and 3HB were recently developed in recombinant E. coli. A PHA copolymer containing 3HP and 3HB was found to have reduced crystallinity compared with P(3HB) homopolymer. Salmonella enterica serovar Typhimurium propionyl-CoA synthetase (PrpE) was found to convert 3-hydroxypropionic acid to 3HP-CoA, a substrate of the R. eutropha PHA synthase, in recombinant E. coli [34]. When 3-hydroxypropionic acid was present in the medium, a recombinant E. coli strain expressing the prpE gene and the P(3HB) biosynthesis genes produced a copolymer of 3HP and 3HB.

P(3HB), the most ubiquitous member of the PHAs, is synthesized from acetyl-CoA by sequential reactions catalyzed by three enzymes: β-ketothiolase (PhaA) condenses two acetyl-CoA molecules to acetoacetyl-CoA, which is reduced to 3HB-CoA by acetoacetyl-CoA reductase (PhaB) using NADPH as a cofactor, and finally PHA synthase incorporates 3HB-CoA into the growing P(3HB) chain. Because acetyl-CoA and NADPH are consumed in the synthesis of P(3HB), metabolic engineering strategies have focused on modifying the inherent metabolic pathways of E. coli to make acetyl-CoA and reducing power more available to the P(3HB) biosynthesis pathway. When glucose is used as a carbon source, acetyl-CoA and NADPH can be efficiently provided by the glycolysis and pentose phosphate (PP) pathways. As a strategy to increase the cellular NADPH level, overexpression of the zwf and gnd genes, which encode glucose-6-phosphate dehydrogenase and 6-phosphogluconate dehydrogenase, respectively, was examined for P(3HB) biosynthesis [35]. When these enzymes were amplified, the NADPH/NADP+ ratio increased sixfold, and the P(3HB) content increased from 23 to 41%. However, a closer evaluation suggested that the increase in P(3HB) content was due not to enhanced P(3HB) biosynthesis but to a decreased cell concentration.
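The point that polymer content can rise without any gain in polymer production follows from how content is usually computed. The sketch below assumes the common definition of P(3HB) content as P(3HB) concentration divided by dry cell weight, with dry cell weight taken as residual biomass plus P(3HB). The concentrations are invented for illustration (only the 23% and 41% targets come from the text), and the function name is hypothetical.

```python
def phb_content_percent(phb_g_per_l: float, residual_biomass_g_per_l: float) -> float:
    """P(3HB) content (wt%) = P(3HB) / (residual biomass + P(3HB)) * 100."""
    dry_cell_weight = phb_g_per_l + residual_biomass_g_per_l
    if dry_cell_weight <= 0:
        raise ValueError("dry cell weight must be positive")
    return 100.0 * phb_g_per_l / dry_cell_weight


# Hypothetical numbers: the same 1.0 g/L of P(3HB) in both cultures,
# but less residual biomass in the strain with amplified zwf and gnd.
control = phb_content_percent(1.0, 3.35)
amplified = phb_content_percent(1.0, 1.44)
print(round(control), round(amplified))  # 23 41
```

With the polymer concentration held constant, the content nearly doubles solely because the denominator shrinks, which is why concentration and content should be reported together.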
It is therefore important to create a cellular state suitable for P(3HB) biosynthesis without reducing cell growth (metabolic activity) in order to achieve a high concentration of P(3HB). Various metabolic and fermentation strategies, including host strain selection, use of plasmids of different copy numbers, filamentation suppression, use of different PHA biosynthesis genes, and plasmid stabilization, have been employed to enhance PHA biosynthesis in recombinant E. coli [36–38]. These strategies have yielded superior strains with high PHA biosynthesis activity, but they remain fundamentally trial-and-error procedures. The whole metabolic state of E. coli during P(3HB) biosynthesis was therefore analyzed at the systems level using two-dimensional gel electrophoresis (2DE) and metabolic flux analysis (MFA) based on an in silico E. coli metabolic network model [39–41]. Proteome analyses of the soluble proteins of recombinant E. coli during P(3HB) biosynthesis revealed characteristic patterns such as high-level expression of general heat shock proteins including GroES, GroEL, DnaK, and IbpAB [39]. The importance of acetyl-CoA and NADPH availability for P(3HB) production was also confirmed by proteome analysis. The spot intensities of fructose-bisphosphate aldolase (FbaA) and triosephosphate isomerase (TpiA) in the glycolytic pathway increased during P(3HB) accumulation by recombinant E. coli XL1-Blue harboring plasmid pJC4, which contains the A. latus PHA biosynthesis genes. The increased expression of FbaA and TpiA may reflect a shift in metabolic fluxes that enlarges the glyceraldehyde-3-phosphate pool, which is eventually used for P(3HB) synthesis. Based on these proteome results, we examined the effects of amplifying TpiA and FbaA on P(3HB) biosynthesis in recombinant E. coli W3110 harboring a plasmid containing the Ralstonia eutropha PHA biosynthesis genes (unpublished results, Park SJ and Lee SY). Amplification of TpiA and FbaA significantly increased the P(3HB) content and concentration. With TpiA amplification, the P(3HB) concentration increased nearly sevenfold compared with the control strain (from 0.89 to 5.98 g/L). These results show that amplification of glycolytic pathway enzymes can be a good metabolic engineering strategy for efficient P(3HB) production, because the increased glycolytic flux makes more acetyl-CoA available for P(3HB) biosynthesis. Proteome analysis also revealed up-regulation of 2-keto-3-deoxy-6-phosphogluconate aldolase (Eda), which catalyzes the final step of the Entner–Doudoroff (ED) pathway, during P(3HB) biosynthesis [39]. Even though the ED pathway has generally been considered to be inactive under normal growth
Table 4.2 Enzymes for PHA Biosynthesis Used in Recombinant E. coli and Salmonella
Enzymes | Representative Sources | Monomer Composition of PHA | Reference
Use of inherent metabolic intermediates
PhaJ | Aeromonads | 3HB, 3HHx | [48,51]
ePhaJ (a) | A. caviae | 3HB, 3HHx, 3HO, 3HD | [52]
PhaJ1, PhaJ2, PhaJ3, PhaJ4 | P. aeruginosa | 3HHx, 3HO, 3HD | [49,50]
MaoC | E. coli | 3HHx, 3HO, 3HD | [61]
YfcX | E. coli | 3HHx, 3HO, 3HD | [69]
PaaG, PaaF, YdbU | E. coli | 3HHx, 3HO, 3HD | [70]
FabG | E. coli | 3HHx, 3HO, 3HD | [54–56]
RhlG | P. aeruginosa | 3HHx, 3HO, 3HD | [55]
PhaG | P. putida | 3HHx, 3HO, 3HD | [57–60]
FabH, FabD | E. coli | 3HB | [64]
eFabH (b) | E. coli | 3HHx, 3HO, 3HD | [65]
PhaA | R. eutropha | 3HB, 3HV | [7]
PhaB | R. eutropha | 3HB, 3HV, 3HHx | [7]
Sbm, YgfG | E. coli | 3HB, 3HV | [46]
TesA | E. coli | 3HHx, 3HO, 3HD | [75]
Activation of hydroxyfatty acid to HA-CoA
Cat2 | C. kluyveri | 3HB, 4HB | [32]
PrpE | Salmonella | 3HP | [34]
Ptb, Buk | C. acetobutylicum | 4HB | [33]
(a) Engineered A. caviae PhaJ. (b) Engineered E. coli FabH.
Metabolism (as listed for the enzymes using inherent intermediates): fatty acid β-oxidation, fatty acid biosynthesis, glycolysis, TCA cycle.
conditions when glucose is used as a carbon source, it appears to become active to satisfy the increased demand for NADPH and acetyl-CoA during P(3HB) production. Activation of the ED pathway during P(3HB) biosynthesis was also supported by MFA and by a comparison of P(3HB) synthesis in wild-type and eda mutant E. coli strains [40]. An in silico E. coli metabolic network was constructed with 154 reversible and 156 irreversible reactions and 295 metabolites, and the metabolic flux distributions in wild-type E. coli and in recombinant E. coli producing P(3HB) were compared using this model. The MFA results showed that the ED pathway flux increased significantly under P(3HB)-accumulating conditions. These MFA and proteome results led us to compare P(3HB) synthesis in E. coli KS272 and its eda mutant. Both the P(3HB) concentration and the P(3HB) content were lower in the eda mutant than in the wild-type strain. Because different E. coli strains have been suggested to differ in P(3HB) biosynthesis activity, the conclusion that the ED pathway plays an important role in P(3HB) biosynthesis holds at least for E. coli KS272.

Analysis of the P(3HB) granule-associated proteome of recombinant E. coli revealed aspects of protein expression that were not apparent from the soluble proteome of P(3HB)-producing recombinant E. coli [41]. Because E. coli does not naturally synthesize P(3HB), it has no system for protecting intracellular biomolecules such as DNA, RNA, and proteins from direct contact with hydrophobic P(3HB) granules. Bacteria that naturally accumulate PHA have amphiphilic phasin proteins that cover the surface of the granules and thus protect intracellular biomolecules. Soluble and insoluble proteome analyses of P(3HB)-producing recombinant E. coli showed the negative effect of accumulated P(3HB) on intracellular proteins, for example, on EF-Tu, which is critically involved in protein synthesis [39]. On the basis of the soluble proteome alone, the total expression level of EF-Tu appeared to be reduced by the unfavorable conditions generated by P(3HB) biosynthesis [39]. However, insoluble proteome analysis revealed that the level of EF-Tu synthesis is not reduced; rather, most of the EF-Tu synthesized is associated with P(3HB) granules [41]. Insoluble proteome analysis also showed that IbpAB, small heat shock proteins that associate with inclusion bodies when E. coli is exposed to external stresses such as heat shock, are associated with P(3HB) granules and function to protect intracellular biomolecules as phasin proteins do [41]. Phasin expression in E. coli has been reported to have no critical effect on P(3HB) biosynthesis, but it increases the number of P(3HB) granules and reduces their diameter. The granule-associated proteome analysis thus identified IbpAB on the surface of P(3HB) granules and suggested a role for IbpAB during P(3HB) biosynthesis: promoting P(3HB) accumulation as a larger number of granules of smaller diameter [41].

MFA is a useful tool for evaluating the cellular metabolic state and for designing engineering strategies to increase product yield under a given condition. Metabolic and fermentation strategies that have been employed to increase P(3HB) biosynthesis in recombinant E. coli, such as oxygen-limited fermentation and increased availability of acetyl-CoA and NADPH, were validated by MFA [42,43]. Under oxygen-limiting conditions, additional carbon flux through pyruvate formate lyase increased without any change in the flux through pyruvate dehydrogenase, which is used under normal growth conditions; as a result, twice as much acetyl-CoA accumulated and was subsequently used for P(3HB) biosynthesis.

This section has mainly discussed the importance of acetyl-CoA and NADPH. However, P(3HB) biosynthesis is not controlled by a single factor but is affected by the fluxes of many central metabolic pathways. To design a host strain with high P(3HB) production activity, it will be important to optimize the flux distribution so that the flux toward P(3HB) is maximized while the fluxes toward by-products are minimized. Fermentation strategies should also be carefully optimized, because accumulating P(3HB) too early results in a low final cell concentration (due to growth inhibition), whereas accumulating it too late results in a low P(3HB) content; both result in low P(3HB) productivity.

Metabolic engineering strategies for the production of PHAs containing RHA monomers other than 3HB have been developed using inherent metabolic intermediates in recombinant E. coli (Figure 4.2).
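The joint dependence on acetyl-CoA and NADPH discussed in this section can be made concrete with the pathway stoichiometry given earlier: each 3HB-CoA requires two acetyl-CoA (PhaA) and one NADPH (PhaB). The sketch below computes the resulting upper bound on 3HB-CoA formation. It is a deliberately simplified illustration, not an MFA model: the function name and the flux values are hypothetical, units are arbitrary, and all competing demands for both precursors are ignored.

```python
def max_3hb_coa_flux(acetyl_coa_supply: float, nadph_supply: float):
    """Upper bound on 3HB-CoA formation and the precursor that sets it.

    Stoichiometry from the text: 2 acetyl-CoA + 1 NADPH -> 1 3HB-CoA.
    """
    if acetyl_coa_supply < 0 or nadph_supply < 0:
        raise ValueError("supplies must be non-negative")
    by_acetyl_coa = acetyl_coa_supply / 2.0
    by_nadph = nadph_supply / 1.0
    if by_acetyl_coa < by_nadph:
        return by_acetyl_coa, "acetyl-CoA"
    if by_nadph < by_acetyl_coa:
        return by_nadph, "NADPH"
    return by_nadph, "balanced"


# Hypothetical supplies in arbitrary flux units.
print(max_3hb_coa_flux(10.0, 3.0))  # (3.0, 'NADPH')
print(max_3hb_coa_flux(4.0, 3.0))   # (2.0, 'acetyl-CoA')
```

Under these assumptions, raising the acetyl-CoA supply helps only while NADPH is not limiting, and the reverse, which is one way to read the statement that P(3HB) biosynthesis is not controlled by a single factor.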
Figure 4.2 Metabolic engineering strategies for the production of various PHAs from inherent intermediates of E. coli and Salmonella. Abbreviations: Buk, butyrate kinase from Clostridium acetobutylicum; FadB, enoyl-CoA hydratase; FadE, acyl-CoA dehydrogenase; GabT, glutamate:succinic semialdehyde transaminase from E. coli; GadA, glutamate decarboxylase from E. coli; 4HbD, 4-hydroxybutyrate dehydrogenase from Ralstonia eutropha; SucD, succinic semialdehyde dehydrogenase; PhaJ, (R)-specific enoyl-CoA hydratase; PrpE, propionyl-CoA synthetase from Salmonella enterica serovar Typhimurium; Pta, phosphotransacetylase; Ptb, phosphotransbutyrylase from C. acetobutylicum.
4HB-CoA can be produced from succinyl-CoA in recombinant E. coli by three enzymes: succinic semialdehyde dehydrogenase (SucD), 4-hydroxybutyrate dehydrogenase (4HbD), and 4-hydroxybutyryl-CoA:CoA transferase (Cat2) from Clostridium kluyveri [44]. Using this pathway, recombinant E. coli incorporated 4HB monomers derived from glucose into P(3HB-co-4HB) at up to 2.8 mol%. To construct a metabolic pathway in E. coli that generates 4HB from glutamate, glutamate decarboxylase from Arabidopsis thaliana or E. coli, glutamate:succinic semialdehyde transaminase from E. coli, 4-hydroxybutyrate dehydrogenase from R. eutropha, and 4-hydroxybutyryl-CoA:CoA transferase from C. kluyveri were employed [45]. When the R. eutropha PHA biosynthesis genes were coexpressed with this engineered pathway, a copolymer of 3HB and 4HB was produced. However, the extent of 4HB incorporation into the copolymer was low (up to 3.5 mol%).

3HV-CoA, the substrate for the synthesis of PHAs containing 3HV such as P(3HB-co-3HV), is synthesized through condensation of propionyl-CoA with acetyl-CoA by β-ketothiolase (PhaA). Usually, propionyl-CoA is generated by activation of propionic acid added to the medium. The mole fraction of 3HV in the copolymer depends on the amount of propionic acid supplied to the cells, but because propionic acid is toxic at excess concentration, its concentration must be kept at a level that supports good cell growth as well as the desired 3HV composition. Developing a metabolic pathway that generates propionyl-CoA from inherent intermediates of the host strain is a good way to avoid the toxicity of propionic acid during the synthesis of PHAs containing 3HV [46]. This was demonstrated by expressing the E. coli sbm and ygfG genes, which encode a novel (2R)-methylmalonyl-CoA mutase and a (2R)-methylmalonyl-CoA decarboxylase, respectively, in a prpC strain of Salmonella enterica serovar Typhimurium. Recombinant S. enterica expressing the sbm and ygfG genes converted succinyl-CoA, derived from the tricarboxylic acid cycle, to propionyl-CoA and synthesized P(3HB-co-3HV) from glycerol. The 3HV fraction in the copolymer could be increased up to 15 mol%, a level usually obtained by direct addition of propionic acid to the medium.

To date, there is no report of a pathway for synthesizing PHA containing 3HP in which 3HP-CoA is generated from inherent intermediates of the host strain. Recently, Cargill proposed a 3HP synthesis pathway from glucose via 3HP-CoA, opening the possibility of producing PHA containing 3HP from metabolic intermediates of the host strain [47]. Three metabolic pathways for the production of 3HP from glucose have been proposed. The first employs enzymes used during the fermentation of alanine in Clostridium propionicum. Lactate is activated to lactyl-CoA by a CoA synthetase or CoA transferase and then converted to acrylyl-CoA by lactyl-CoA dehydratase. 3-Hydroxypropionyl-CoA dehydratase adds one molecule of water to acrylyl-CoA to form 3HP-CoA, which is then hydrolyzed by a CoA hydrolase to release 3HP. The second uses enzymes of the 3-hydroxypropionate cycle of autotrophic CO2 fixation in the phototrophic green non-sulfur bacterium Chloroflexus aurantiacus. In this pathway, malonyl-CoA is converted directly into 3HP by the bifunctional enzyme malonyl-CoA reductase. In the third, β-alanine is converted into 3HP via malonate semialdehyde by 4-aminobutyrate aminotransferase and 3-HP dehydrogenase. At present, only small amounts of 3HP can be produced using these pathways.
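The engineered routes described in this section can be viewed as paths through a small reaction graph. The sketch below encodes only reactions named in the text (with GABA as the intermediate between glutamate and succinic semialdehyde, as labeled in Figure 4.2) and searches for a route between two metabolites. The data structure and function name are hypothetical, and the graph ignores cofactors, reversibility, and host background metabolism.

```python
from collections import deque

# (substrate, product) -> enzyme, as named in Section 4.4.1.
REACTIONS = {
    ("succinyl-CoA", "succinic semialdehyde"): "SucD",
    ("glutamate", "GABA"): "GadA",
    ("GABA", "succinic semialdehyde"): "GabT",
    ("succinic semialdehyde", "4HB"): "4HbD",
    ("4HB", "4HB-CoA"): "Cat2",
    ("lactate", "lactyl-CoA"): "CoA transferase or CoA synthetase",
    ("lactyl-CoA", "acrylyl-CoA"): "lactyl-CoA dehydratase",
    ("acrylyl-CoA", "3HP-CoA"): "3-hydroxypropionyl-CoA dehydratase",
}


def find_route(start: str, target: str):
    """Breadth-first search; returns [(substrate, enzyme, product), ...] or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        metabolite, route = queue.popleft()
        if metabolite == target:
            return route
        for (substrate, product), enzyme in REACTIONS.items():
            if substrate == metabolite and product not in seen:
                seen.add(product)
                queue.append((product, route + [(substrate, enzyme, product)]))
    return None


for step in find_route("glutamate", "4HB-CoA"):
    print(" -> ".join(step))
```

Run as written, the loop prints the four steps of the glutamate route (GadA, GabT, 4HbD, Cat2). A search from lactate to 4HB-CoA returns None because the two branches share no intermediate in this simplified graph.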
4.4.2 MCL-PHA and SCL-MCL-PHA
MCL-PHA monomers are mainly provided from the intermediates of fatty acid metabolism, because such metabolites can be readily converted into (R)-3-hydroxyacyl-CoAs (R3HA-CoAs), the most favorable substrates for PHA synthase. Enoyl-CoA, 3-ketoacyl-CoA, (S)-3-hydroxyacyl-CoA, and 3-hydroxyacyl-ACP are the major precursors of PHAs (Figure 4.2). In natural PHA-producing bacteria, MCL-PHAs are efficiently synthesized from the intermediates of the fatty acid β-oxidation and de novo fatty acid biosynthesis pathways. In recombinant E. coli, however, efficient production of PHAs with precise control of monomer composition from fatty acid metabolism has not yet been achieved, because the metabolic pathways involved in fatty acid metabolism are not optimized for PHA synthesis. Enzymes that supply R3HA-CoAs from these fatty acid metabolism intermediates have been identified and cloned from natural PHA-producing bacteria. They include (R)-specific enoyl-CoA hydratase (PhaJ) [48–53], 3-ketoacyl-ACP reductase (FabG) [54–56], and 3-hydroxydecanoyl-ACP:CoA transferase (PhaG) [57–60]. These enzymes have been characterized at the molecular level and used to develop PHA biosynthesis pathways in recombinant E. coli from intermediates of fatty acid metabolism.

There is no specific metabolic engineering strategy for directing natural PHA-producing bacteria to produce PHA from intermediates of fatty acid metabolism; when nutrient limitation is imposed on the cells, these bacteria naturally adjust their physiological state to accumulate PHA. Because E. coli cannot synthesize PHA naturally, it is necessary to amplify specific genes directly involved in PHA biosynthesis, obtained from natural PHA-producing bacteria, and further to create metabolic conditions favorable to PHA biosynthesis through genetic and environmental modification. Engineering of the β-oxidation and de novo fatty acid biosynthesis pathways follows this strategy.

First, enzymes that directly convert intermediates of fatty acid metabolism to R3HA-CoAs, such as enoyl-CoA hydratase (PhaJ), 3-ketoacyl-ACP reductase (FabG), and 3-hydroxydecanoyl-ACP:CoA transferase (PhaG), have been amplified to construct PHA biosynthesis pathways. The existence of (R)-specific enoyl-CoA hydratase was revealed during characterization of the PHA biosynthesis genes of A. caviae, which can accumulate PHA consisting of 3HB and 3HHx from oils. In A. caviae, the phaP, phaC, and phaJ genes, which encode phasin, PHA synthase, and enoyl-CoA hydratase, respectively, constitute the PHA biosynthesis operon. Based on the nucleotide sequence of the A. caviae phaJ gene, various (R)-specific enoyl-CoA hydratases with different substrate specificities have been cloned and characterized from E. coli and other microorganisms that can produce PHA from fatty acids [48–53,61]. The idea that 3-ketoacyl-ACP reductase can mediate PHA biosynthesis from the β-oxidation intermediate 3-ketoacyl-CoA originated from the amino acid sequence homology between 3-ketoacyl-ACP reductase and acetoacetyl-CoA reductase (PhaB) [54,56]. Amplification of 3-ketoacyl-ACP reductases from E. coli and P. aeruginosa made it possible to develop an MCL-PHA biosynthesis pathway from fatty acids. Because enoyl-CoA hydratase and 3-ketoacyl-ACP reductase compete with the inherent β-oxidation enzymes for intermediates, it was suggested that high-level expression of these genes is necessary to construct the metabolic link between the β-oxidation and PHA biosynthetic pathways [62].

Pseudomonads belonging to rRNA homology group I synthesize MCL-PHA from glucose, indicating that the de novo fatty acid biosynthesis pathway is connected to the PHA biosynthesis pathway. Recently, the phaG gene encoding 3-hydroxydecanoyl-ACP:CoA transferase was cloned from several Pseudomonas sp. and characterized at the molecular level for its role in linking the fatty acid biosynthesis pathway to the PHA biosynthesis pathway [57–60]. However, expression of the phaG gene alone could not support MCL-PHA production from glucose in recombinant E. coli [57,63]. Therefore, E. coli FabH, which might have 3-ketoacyl-ACP:CoA transferase activity, was examined for its ability to support a PHA biosynthesis pathway from intermediates of the fatty acid biosynthesis pathway [64]. E. coli FabH was found to have a narrow activity, supporting the synthesis of only 3HB-CoA. To broaden its substrate specificity, E. coli FabH was engineered by in vitro saturation mutagenesis of residue Phe87 by PCR [65].
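Saturation mutagenesis at a single residue, as applied to Phe87 of E. coli FabH, means sampling every other amino acid at that position. The sketch below simply enumerates the resulting variant names in the wild-type/position/substitution notation used in Table 4.1 (for example, F420S). The function name is hypothetical, and the code says nothing about which variants were actually obtained or active in the cited study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def saturation_variants(wild_type: str, position: int) -> list:
    """List all single-residue substitutions at one position, e.g. 'F87A'."""
    if len(wild_type) != 1 or wild_type not in AMINO_ACIDS:
        raise ValueError("wild_type must be a one-letter amino acid code")
    if position < 1:
        raise ValueError("position is 1-based and must be positive")
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


variants = saturation_variants("F", 87)
print(len(variants))  # 19
print(variants[:3])   # ['F87A', 'F87C', 'F87D']
```

In practice the library is generated with degenerate codons and then screened, so the 19 names above describe the design space and not a guaranteed set of recovered clones.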
Phe87 amino acid was chosen as a target for mutagenesis on the basis of the crystal structure of Mycobacterium tuberculosis FabH that has specificity for substrates with C10–C16 carbon atoms. Several FabH mutant showed broader substrate specificity to generate R3HA-CoA having carbon atoms up to C12. The MCLPHA content obtained by expression of wild and engineered E. coli FabH was quite low (< 1%) and even Pseudomonas PhaG could not support PHA biosynthesis. This can be due to that the fatty acid biosynthesis pathway in E. coli was unable to supply enough PHA precursors, unlike the pathway in pseudomonads. Because degradation of fatty acid through β-oxidation is processed by multifunctional enzyme complex, FadBA, in which FadB has function of enoyl-CoA hydratase, 3-hydroxyacyl-CoA dehydrogenase, and 3-hydroxyacyl-CoA epimerase and FadA is 3-ketoacyl-CoA thiolase, major precursors for PHA biosynthesis, enoyl-CoA, (S)-3-hydroxyacyl-CoA, and 3-ketoacyl-CoA, are not easily accessible to PHA
biosynthesis pathway in E. coli in which β-oxidation is active. Therefore, as a second strategy, inhibition of enzymes involved in fatty acid metabolism has been used to develop a PHA biosynthesis pathway from β-oxidation. This strategy was later extended to construct a PHA biosynthesis pathway from the de novo fatty acid biosynthesis pathway. E. coli fadB and/or fadA mutants were found to be good hosts for synthesizing MCL-PHA from fatty acids when an MCL-PHA synthase was expressed [66,67]. Chemical inhibition of FadA by acrylic acid also resulted in MCL-PHA biosynthesis [68]. In MCL-PHA obtained from β-oxidation-deficient E. coli, the incorporated monomers have the same carbon number as the supplied fatty acid, or a carbon number reduced by 2 or 4 or increased by 2 [66]. For example, MCL-PHA consisting of 2.5 mol% 3-hydroxyhexanoate (3HHx), 20 mol% 3-hydroxyoctanoate (3HO), 72.5 mol% 3-hydroxydecanoate (3HD), and 5 mol% 3-hydroxydodecanoate (3HDD) was synthesized from sodium decanoate in fadB mutant E. coli expressing the P. aeruginosa phaC1 gene [66]. Based on these results, it can be assumed that an unidentified enzyme substitutes for FadB in the fadB mutant E. coli strain, actively generating R3HA-CoA and shortening fatty acids by two carbon atoms. Molecular characterization of the enzymes that generate R3HA-CoA in the absence of FadB and/or FadA showed that various enzymes, including YfcX [69], the enoyl-CoA hydratase MaoC [61], and crotonase superfamily enzymes such as YdbU, PaaG, and PaaF [70], convert enoyl-CoA to R3HA-CoA in fadB mutant E. coli. Deletion of the chromosomal genes encoding YfcX and MaoC in fadB mutant E. coli significantly decreased PHA synthesis from fatty acids. PHA biosynthesis from fatty acids was restored in the fadB yfcX and fadB maoC double mutants by plasmid-based expression of YfcX and MaoC [61,69].
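The decanoate example reported in [66] can be checked with a few lines of arithmetic. The sketch below tabulates each monomer's carbon number relative to the C10 substrate and computes the mole-weighted mean carbon number of the polymer. The mol% values are those quoted in the text; the function and variable names are illustrative only.

```python
# Monomer composition quoted in the text for MCL-PHA made from sodium decanoate [66].
SUBSTRATE_CARBONS = 10  # decanoate is a C10 fatty acid

# monomer name -> (carbon number, mol%)
composition = {
    "3HHx": (6, 2.5),    # 3-hydroxyhexanoate
    "3HO": (8, 20.0),    # 3-hydroxyoctanoate
    "3HD": (10, 72.5),   # 3-hydroxydecanoate
    "3HDD": (12, 5.0),   # 3-hydroxydodecanoate
}


def carbon_offsets(comp, substrate_carbons):
    """Carbon-number difference of each monomer relative to the fed fatty acid."""
    return {name: carbons - substrate_carbons for name, (carbons, _) in comp.items()}


def mean_carbon_number(comp):
    """Mole-weighted mean carbon number of the incorporated monomers."""
    total = sum(fraction for _, fraction in comp.values())
    return sum(carbons * fraction for carbons, fraction in comp.values()) / total


total_mol_percent = sum(fraction for _, fraction in composition.values())
offsets = carbon_offsets(composition, SUBSTRATE_CARBONS)
mean_c = mean_carbon_number(composition)

print("total mol%:", total_mol_percent)
for name, delta in offsets.items():
    print(f"{name}: {delta:+d} carbons relative to substrate")
print(f"mean monomer carbon number: {mean_c:.2f}")
```

The offsets come out as -4, -2, 0, and +2, matching the pattern described in the text, and the fractions sum to 100 mol%.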
YfcX and YfcY, which are homologous to FadB and FadA, respectively, were found to be components of another β-oxidation pathway that operates under anaerobic conditions with nitrate as the terminal respiratory electron acceptor [71]. These enzymes were shown to function normally in mutant strains lacking the aerobic β-oxidation enzymes. The substrate specificities of YfcXY are similar to those of FadBA. Based on these results, it can be concluded that YfcX is involved in PHA biosynthesis and in the degradation of acyl-CoAs in fadB mutant E. coli. YfcX, the crotonase superfamily enzymes, and the enoyl-CoA hydratase MaoC differ in the PHA biosynthesis activities obtained when they are amplified in normal and fadB mutant E. coli strains. When YfcX and the crotonase superfamily enzymes were amplified in a normal E. coli strain, no PHA was synthesized from fatty acids [61,70]. When these enzymes were amplified in fadB mutant E. coli, MCL-PHA content increased without much change in the monomer composition [61,70]. In contrast, MaoC amplification resulted in the accumulation of MCL-PHA from fatty acids in a normal E. coli strain and in PHA with altered monomer composition in the fadB mutant strain [61]. The change in monomer composition appears to follow the substrate specificity of MaoC. Similar results, which allow comparison of the enzymes involved in PHA biosynthesis in fadB mutant E. coli, were obtained with the P. oleovorans FadBA protein [53] and other (R)-specific enoyl-CoA hydratases [62]. As discussed above, although high-level expression is required, additional expression of (R)-specific enoyl-CoA hydratases allows PHA synthesis from fatty acids in a normal E. coli strain with a functional β-oxidation pathway. However, P. oleovorans FadBA was not able to establish a PHA biosynthesis pathway in a normal E. 
coli strain even though FadBA was amplified using a plasmid-based expression system. Considering that FadBA is a multifunctional enzyme complex, this seems to be due not to a failure of FadBA to hydrate enoyl-CoA predominantly in an (R)-specific manner, but to its not releasing R3HA-CoA, which leaves R3HA-CoA hardly accessible to PHA synthase during β-oxidation of fatty acids. The strategy of promoting PHA biosynthesis by inhibiting enzymes of fatty acid β-oxidation was also applied to constructing a metabolic link between fatty acid biosynthesis and PHA biosynthesis. Chemicals that specifically inhibit enzymes of fatty acid biosynthesis, such as cerulenin for FabB and FabF and triclosan for FabI, were examined for their ability to supply PHA precursors [63]. Only triclosan led to MCL-PHA synthesis from gluconate, up to 2–3% of dry cell weight, in recombinant E. coli
expressing the P. putida phaG and the P. aeruginosa phaC1 genes. Unlike engineering of β-oxidation, engineering of fatty acid biosynthesis in E. coli could not successfully support MCL-PHA biosynthesis. It was suggested that the weak capacity of E. coli to generate fatty acid biosynthesis intermediates is responsible for the weak PHA biosynthesis, based on reports that expression of the phaG gene alone did not mediate MCL-PHA biosynthesis from glucose and that E. coli synthesized only very small amounts of MCL-PHA when phaG expression was combined with chemical inhibition of the fatty acid biosynthesis pathway. However, it was recently reported that E. coli expressing the phaG gene without a PHA synthase efficiently excreted 3-hydroxydecanoic acid, generated from intermediates of fatty acid biosynthesis, into the growth medium [72–74]. Addition of triclosan increased 3-hydroxydecanoic acid production by 20–30%. Expressed relative to cell concentration, the amount of excreted 3-hydroxydecanoic acid reached nearly 50% [73]. This interesting result shows that E. coli has sufficient metabolic capacity to supply fatty acid biosynthesis intermediates for PHA biosynthesis. Thioesterase II (TesB) of E. coli might inhibit MCL-PHA biosynthesis from fatty acid biosynthesis intermediates by rapidly removing CoA from R3HA-CoA, resulting in the excretion of the overproduced (R)-3-hydroxyalkanoic acid [74]. When a tesB mutant E. coli strain expressing the phaG gene was used for the excretory production of 3-hydroxydecanoic acid from glucose, production decreased to 72 mg/L, only 10% of the concentration achieved with the control strain. It was also revealed that expression of the tesB gene was triggered by accumulation of 3HD-CoA synthesized by PhaG.
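The tesB figures above imply a control titer that the text does not state directly. As a rough worked example, and taking "only 10%" literally (an assumption, since the original value may be rounded), the control strain would have produced about 720 mg/L of 3-hydroxydecanoic acid. The helper names below are illustrative, and the baseline passed to `relative_increase_range` is an arbitrary value of 100 used only to show what a 20–30% increase means; it is not a measured titer.

```python
def implied_control_titer(mutant_titer_mg_per_l, fraction_of_control):
    """Back-calculate the control titer from a mutant titer and its fraction of control."""
    if not 0 < fraction_of_control <= 1:
        raise ValueError("fraction_of_control must be in (0, 1]")
    return mutant_titer_mg_per_l / fraction_of_control


def relative_increase_range(baseline, low_percent, high_percent):
    """Range of values after a relative increase of low_percent to high_percent."""
    return (
        baseline * (1 + low_percent / 100),
        baseline * (1 + high_percent / 100),
    )


# 72 mg/L in the tesB mutant, stated to be 10% of the control strain [74].
control_titer = implied_control_titer(72.0, 0.10)
fold_reduction = control_titer / 72.0

# Triclosan effect (20-30% increase) applied to an ARBITRARY baseline of 100 units.
triclosan_low, triclosan_high = relative_increase_range(100.0, 20, 30)

print(f"implied control titer: {control_titer:.0f} mg/L")
print(f"fold reduction in tesB mutant: {fold_reduction:.0f}x")
print(f"100 units with triclosan: {triclosan_low:.0f} to {triclosan_high:.0f} units")
```

The two calculations are kept separate on purpose: the text does not say that the triclosan experiment and the tesB experiment share a baseline, so the percentages should not be combined.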
The exact role of TesB in MCL-PHA biosynthesis from fatty acid biosynthesis intermediates remains to be resolved by examining MCL-PHA production in tesB mutant E. coli expressing the phaG and phaC genes. Interestingly, thioesterase has been used to link fatty acid biosynthesis to the fatty acid β-oxidation pathway in order to construct an MCL-PHA biosynthesis pathway from glucose in fad mutant E. coli, which constitutively operates an impaired β-oxidation pathway [75,76]. A fadR fadB mutant E. coli produced MCL-PHA from gluconate up to 2.3% of dry cell weight when expressing E. coli thioesterase I and P. oleovorans PHA synthase [75]. Expression of the acyl-ACP thioesterase from the plant Umbellularia californica also established MCL-PHA biosynthesis from gluconate in recombinant E. coli [76]. Laurate accumulated through the action of the U. californica acyl-ACP thioesterase was converted to PHA precursors by the β-oxidation pathway in fadB mutant E. coli and then polymerized into MCL-PHA by the P. aeruginosa MCL-PHA synthase. Based on these results obtained with thioesterases, we can suggest two possible strategies for producing MCL-PHA from glucose in recombinant E. coli: (I) amplification of the phaG and phaC genes in a tesB mutant E. coli strain, and (II) amplification of the phaG, tesB, and phaC genes in a fadR fadA and/or fadB mutant.
4.5 Conclusion
PHA biosynthesis is a good model system for examining metabolic engineering strategies for efficient biopolymer production. As reviewed in this chapter, successful metabolic engineering of microorganisms for enhanced production of biopolymers and/or production of novel biopolymers requires a priori knowledge: a detailed understanding of the metabolic pathways leading to the synthesis of a biopolymer, as well as of the genes and enzymes involved, is essential. Even though successful metabolic engineering can be performed using this information, it is better to optimize the upstream and downstream processes at the same time; metabolic engineering of strains should be performed after considering the fermentation and recovery/purification processes. As biopolymers like PHAs should be relatively inexpensive, the entire process should be optimized with economical production in mind. The carbon substrate can be a cost-determining factor in the production of bulk biopolymers, and thus the production strain should be selected or engineered to utilize an inexpensive carbon source available at the production site. Recent advances in omics and systems biology are opening new possibilities for the systematic optimization of strains and bioprocesses. Systems biotechnology [77,78] by
integrating genome, transcriptome, proteome, metabolome, and computational analyses is allowing us to identify new targets and new strategies for the metabolic engineering of microorganisms for the overproduction of bioproducts and the production of novel products. This is also true for the production of biopolymers. The rapid rise of oil prices is reminding us of the finite nature of fossil fuels. Various chemicals and polymers that we use every day will need to be produced from renewable resources in the future. As the efficiencies of bioprocesses for producing these chemicals and biopolymers by microbial fermentation are generally low, we need to improve the performance of the strains to be employed. The metabolic engineering strategies described in this chapter, with PHA as an example, should be useful for designing strains for other biopolymers. Furthermore, better metabolic engineering strategies are expected to become available through systems biotechnology.
Acknowledgments
Our work described in this chapter was supported by the Korean Systems Biology Research Program (M10309020000-03B5002-00000) of the Ministry of Science and Technology. Further support from the LG Chem Chair Professorship and from KOSEF through the CUPS is appreciated.
References
1. Steinbüchel, A. and Doi, Y. Polyesters I. In Biopolymers. Wiley-VCH, Weinheim, 2001.
2. Anderson, A.J. and Dawes, E.A. Occurrence, metabolism, metabolic role, and industrial uses of bacterial polyhydroxyalkanoates. Microbiol. Rev., 54, 450, 1990.
3. Steinbüchel, A. and Valentin, H.E. Diversity of bacterial polyhydroxyalkanoic acid. FEMS Microbiol. Lett., 128, 219, 1995.
4. Doi, Y. Microbial Polyesters. VCH, New York, 1990.
5. Matsusaki, H., Abe, H., and Doi, Y. Biosynthesis and properties of poly(3-hydroxybutyrate-co-3-hydroxyalkanoates) by recombinant strains of Pseudomonas sp. 61-3. Biomacromolecules, 1, 17, 2000.
6. Rehm, B.H.A. Polyester synthases: natural catalysts for plastics. Biochem. J., 376, 15, 2003.
7. Schubert, P., Steinbüchel, A., and Schlegel, H.G. Cloning of the Alcaligenes eutrophus genes for synthesis of poly-beta-hydroxybutyric acid (PHB) and synthesis of PHB in Escherichia coli. J. Bacteriol., 170, 5837, 1988.
8. Choi, J., Lee, S.Y., and Han, K. Cloning of the Alcaligenes latus polyhydroxyalkanoate biosynthesis genes and use of these genes for the enhanced production of poly(3-hydroxybutyrate) in Escherichia coli. Appl. Environ. Microbiol., 64, 4897, 1998.
9. Timm, A. and Steinbüchel, A. Cloning and molecular analysis of the poly(3-hydroxyalkanoic acid) gene locus of Pseudomonas aeruginosa PAO1. Eur. J. Biochem., 209, 15, 1992.
10. Liebergesell, M. and Steinbüchel, A. Cloning and nucleotide sequences of genes relevant for biosynthesis of poly(3-hydroxybutyric acid) in Chromatium vinosum strain D. Eur. J. Biochem., 209, 135, 1992.
11. McCool, J. and Cannon, M.C. Polyhydroxyalkanoate inclusion body-associated proteins and coding region in Bacillus megaterium. J. Bacteriol., 181, 585, 1999.
12. Liebergesell, M., Rahalker, S., and Steinbüchel, A. Analysis of the Thiocapsa pfennigii polyhydroxyalkanoate synthase: subcloning, molecular characterization and generation of hybrid synthases with the corresponding Chromatium vinosum enzyme. Appl. Microbiol. Biotechnol., 54, 186, 2000.
13. Matsusaki, H. et al. Cloning and molecular analysis of the Poly(3-hydroxybutyrate) and Poly(3-hydroxybutyrate-co-3-hydroxyalkanoate) biosynthesis genes in Pseudomonas sp. strain 61-3. J. Bacteriol., 180, 6459, 1998.
14. Fukui, T. and Doi, Y. Cloning and analysis of the poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) biosynthesis genes of Aeromonas caviae. J. Bacteriol., 179, 4821, 1997.
15. Fukui, T. et al. Biosynthesis of poly(3-hydroxybutyrate-co-3-hydroxyvalerate-co-3-hydroxyheptanoate) terpolymers by recombinant Alcaligenes eutrophus. Biotechnol. Lett., 19, 1093, 1997.
16. Dennis, D. et al. Formation of poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) by PHA synthase from Ralstonia eutropha. J. Biotechnol., 64, 177, 1998.
17. Antonio, R.V., Steinbüchel, A., and Rehm, B.H.A. Analysis of in vivo substrate specificity of the PHA synthase from Ralstonia eutropha: formation of novel copolyesters in recombinant Escherichia coli. FEMS Microbiol. Lett., 182, 111, 2000.
18. Jia, Y. et al. Lipases provide a new mechanistic model for polyhydroxybutyrate (PHB) synthases: characterization of the functional residues in Chromatium vinosum PHB synthase. Biochemistry, 39, 3927, 2000.
19. Takase, K. et al. Alteration of substrate chain-length specificity of type II synthase for polyhydroxyalkanoate biosynthesis by in vitro evolution: in vivo and in vitro enzyme assays. Biomacromolecules, 5, 480, 2004.
20. Normi, Y.M. et al. Site-directed saturation mutagenesis at residue F420 and recombination with another beneficial mutation of Ralstonia eutropha polyhydroxyalkanoate synthase. Biotechnol. Lett., 27, 705, 2005.
21. Normi, Y.M. et al. Characterization and properties of G4X mutants of Ralstonia eutropha PHA synthase for poly(3-hydroxybutyrate) biosynthesis in Escherichia coli. Macromol. Biosci., 5, 197, 2005.
22. Matsumoto, K. et al. Synergistic effects of Glu130Asp substitution in the type II polyhydroxyalkanoate (PHA) synthase: enhancement of PHA production and alteration of polymer molecular weight. Biomacromolecules, 6, 99, 2005.
23. Takase, K. et al. Alteration of substrate chain-length specificity of type II synthase for polyhydroxyalkanoate biosynthesis by in vitro evolution: in vivo and in vitro enzyme assays. Biomacromolecules, 5, 480, 2004.
24. Takase, K., Taguchi, S., and Doi, Y. Enhanced synthesis of poly(3-hydroxybutyrate) in recombinant Escherichia coli by means of error-prone PCR mutagenesis, saturation mutagenesis, and in vitro recombination of the type II polyhydroxyalkanoate synthase gene. J. Biochem. (Tokyo), 133, 139, 2003.
25. Taguchi, S. et al. In vitro evolution of a polyhydroxybutyrate synthase by intragenic suppression-type mutagenesis. J. Biochem. (Tokyo), 131, 801, 2002.
26. Kichise, T., Taguchi, S., and Doi, Y. Enhanced accumulation and changed monomer composition in polyhydroxyalkanoate (PHA) copolyester by in vitro evolution of Aeromonas caviae PHA synthase. Appl. Environ. Microbiol., 68, 2411, 2002.
27. Taguchi, S. et al. Analysis of mutational effects of a polyhydroxybutyrate (PHB) polymerase on bacterial PHB accumulation using an in vivo assay system. FEMS Microbiol. Lett., 198, 65, 2001.
28. Rehm, B.H.A. et al. Molecular characterization of the poly(3-hydroxybutyrate) (PHB) synthase from Ralstonia eutropha: in vitro evolution, site-specific mutagenesis and development of a PHB synthase protein model. Biochim. Biophys. Acta., 1594, 178, 2002.
29. Amara, A.A., Steinbüchel, A., and Rehm, B.H.A. In vivo evolution of the Aeromonas punctata polyhydroxyalkanoate (PHA) synthase: isolation and characterization of modified PHA synthases with enhanced activity. Appl. Microbiol. Biotechnol., 59, 477, 2002.
30. Amara, A.A. and Rehm, B.H.A. Replacement of the catalytic nucleophile cysteine-296 by serine in class II polyhydroxyalkanoate synthase from Pseudomonas aeruginosa mediated synthesis of a new polyester: identification of catalytic residues. Biochem. J., 374, 413, 2003.
31. Sheu, D.S. and Lee, C.Y. Altering the substrate specificity of polyhydroxyalkanoate synthase 1 derived from Pseudomonas putida GPo1 by localized semirandom mutagenesis. J. Bacteriol., 186, 4177, 2004.
32. Hein, S. et al. Biosynthesis of poly(4-hydroxybutyric acid) by recombinant strains of Escherichia coli. FEMS Microbiol. Lett., 153, 411, 1997.
33. Liu, S.J. and Steinbüchel, A. A novel genetically engineered pathway for synthesis of poly(hydroxyalkanoic acids) in Escherichia coli. Appl. Environ. Microbiol., 66, 739, 2000.
34. Valentin, H.E. et al. Application of a propionyl coenzyme A synthetase for poly(3-hydroxypropionate-co-3-hydroxybutyrate) accumulation in recombinant Escherichia coli. Appl. Environ. Microbiol., 66, 5253, 2000.
35. Lim, S.J. et al. Amplification of the NADPH-related genes zwf and gnd for the oddball biosynthesis of PHB in an E. coli transformant harboring a cloned phbCAB operon. J. Biosci. Bioeng., 93, 543, 2002.
36. Lee, S.Y. Bacterial polyhydroxyalkanoates. Biotechnol. Bioeng., 49, 1, 1996.
37. Lee, S.Y. Plastic bacteria? Progress and prospects for polyhydroxyalkanoate production in bacteria. Trends Biotechnol., 14, 431, 1996.
38. Madison, L.L. and Huisman, G.W. Metabolic engineering of poly(3-hydroxyalkanoates): from DNA to plastic. Microbiol. Mol. Biol. Rev., 63, 21, 1999.
39. Han, M.J., Yoon, S.S., and Lee, S.Y. Proteome analysis of metabolically engineered Escherichia coli producing poly(3-hydroxybutyrate). J. Bacteriol., 183, 301, 2001.
40. Hong, S.H. et al. In silico prediction and validation of the importance of Entner-Doudoroff pathway in poly(3-hydroxybutyrate) production by metabolically engineered Escherichia coli. Biotechnol. Bioeng., 83, 854, 2003.
41. Han, M.-J. et al. Analysis of poly(3-hydroxybutyrate) granule-associated proteome in recombinant Escherichia coli. J. Microbiol. Biotechnol., 16, 901, 2006.
42. Wong, H.H. et al. Metabolic analysis of poly(3-hydroxybutyrate) production by recombinant Escherichia coli. J. Microbiol. Biotechnol., 9, 593, 1999.
43. Van Wegen, R.J., Lee, S.Y., and Middelberg, A. Metabolic and kinetic analysis of poly(3-hydroxybutyrate) production by recombinant Escherichia coli. Biotechnol. Bioeng., 74, 70, 2001.
44. Valentin, H.E. and Dennis, D. Production of poly(3-hydroxybutyrate-co-4-hydroxybutyrate) in recombinant Escherichia coli grown on glucose. J. Biotechnol., 58, 33, 1997.
45. Valentin, H.E., Reiser, S., and Gruys, K.J. Poly(3-hydroxybutyrate-co-4-hydroxybutyrate) formation from gamma-aminobutyrate and glutamate. Biotechnol. Bioeng., 67, 291, 2000.
46. Aldor, I.S. et al. Metabolic engineering of a novel propionate-independent pathway for the production of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) in recombinant Salmonella enterica serovar typhimurium. Appl. Environ. Microbiol., 68, 3848, 2002.
47. Gokarn, R.R. et al. PCT patent WO0242418.
48. Fukui, T. and Doi, Y. Expression and characterization of (R)-specific enoyl Coenzyme A hydratase involved in polyhydroxyalkanoate biosynthesis by Aeromonas caviae. J. Bacteriol., 180, 667, 1998.
49. Tsuge, T. et al. Molecular cloning of two (R)-specific enoyl-CoA hydratase genes from Pseudomonas aeruginosa and their use for polyhydroxyalkanoate synthesis. FEMS Microbiol. Lett., 184, 193, 2000.
50. Tsuge, T. et al. Molecular characterization and properties of (R)-specific enoyl-CoA hydratases from Pseudomonas aeruginosa: metabolic tools for synthesis of polyhydroxyalkanoates via fatty acid β-oxidation. Int. J. Biol. Macromol., 31, 195, 2003.
51. Park, S.J. et al. Production of poly(3-hydroxybutyrate-co-3-hydroxyhexanoate) by metabolically engineered Escherichia coli strains. Biomacromolecules, 2, 248, 2001.
52. Tsuge, T. et al. Alteration of chain length substrate specificity of Aeromonas caviae R-enantiomer-specific enoyl-coenzyme A hydratase through site-directed mutagenesis. Appl. Environ. Microbiol., 69, 4830, 2003.
53. Fiedler, S., Steinbüchel, A., and Rehm, B.H.A. The role of the fatty acid beta-oxidation multienzyme complex from Pseudomonas oleovorans in polyhydroxyalkanoate biosynthesis: molecular characterization of the fadBA operon from P. oleovorans and of the enoyl-CoA hydratase genes phaJ from P. oleovorans and Pseudomonas putida. Arch. Microbiol., 178, 149, 2002.
54. Ren, Q. et al. FabG, an NADPH-dependent 3-ketoacyl reductase of Pseudomonas aeruginosa, provides precursors for medium-chain-length poly-3-hydroxyalkanoate biosynthesis in Escherichia coli. J. Bacteriol., 182, 2978, 2000.
55. Park, S.J., Park, J.P., and Lee, S.Y. Metabolic engineering of Escherichia coli for the production of medium-chain-length polyhydroxyalkanoates rich in specific monomers. FEMS Microbiol. Lett., 214, 217, 2002.
56. Taguchi, K. et al. Co-expression of 3-ketoacyl-ACP reductase and polyhydroxyalkanoate synthase genes induces PHA production in Escherichia coli HB101 strain. FEMS Microbiol. Lett., 176, 183, 1999.
57. Rehm, B.H.A., Kruger, N., and Steinbüchel, A. A new metabolic link between fatty acid de novo synthesis and polyhydroxyalkanoic acid synthesis. The phaG gene from Pseudomonas putida KT2440 encodes a 3-hydroxyacyl-acyl carrier protein-coenzyme A transferase. J. Biol. Chem., 273, 24044, 1998.
58. Hoffmann, N., Steinbüchel, A., and Rehm, B.H.A. The Pseudomonas aeruginosa phaG gene product is involved in the synthesis of polyhydroxyalkanoic acid consisting of medium-chain-length constituents from non-related carbon sources. FEMS Microbiol. Lett., 184, 253, 2000.
59. Hoffmann, N., Steinbüchel, A., and Rehm, B.H.A. Homologous functional expression of cryptic phaG from Pseudomonas oleovorans establishes the transacylase-mediated polyhydroxyalkanoate biosynthetic pathway. Appl. Microbiol. Biotechnol., 54, 665, 2000.
60. Matsumoto, K. et al. Cloning and characterization of the Pseudomonas sp. 61-3 phaG gene involved in polyhydroxyalkanoate biosynthesis. Biomacromolecules, 2, 142, 2001.
61. Park, S.J. and Lee, S.Y. Identification and characterization of a new enoyl coenzyme A hydratase involved in biosynthesis of medium-chain-length polyhydroxyalkanoates in recombinant Escherichia coli. J. Bacteriol., 185, 5391, 2003.
62. Fukui, T. et al. Co-expression of polyhydroxyalkanoate synthase and (R)-enoyl-CoA hydratase genes of Aeromonas caviae establishes copolyester biosynthesis pathway in Escherichia coli. FEMS Microbiol. Lett., 170, 69, 1999.
63. Rehm, B.H.A., Mitsky, T.A., and Steinbüchel, A. Role of fatty acid de novo biosynthesis in polyhydroxyalkanoic acid (PHA) and rhamnolipid synthesis by pseudomonads: establishment of the transacylase (PhaG)-mediated pathway for PHA biosynthesis in Escherichia coli. Appl. Environ. Microbiol., 67, 3102, 2001.
64. Taguchi, K. et al. Over-expression of 3-ketoacyl-ACP synthase III or malonyl-CoA-ACP transacylase gene induces monomer supply for the polyhydroxybutyrate production in Escherichia coli HB101. Biotechnol. Lett., 21, 579, 1999.
65. Nomura, C.T. et al. Coexpression of genetically engineered 3-ketoacyl-ACP synthase III (fabH) and polyhydroxyalkanoate synthase (phaC) genes leads to short-chain-length-medium-chain-length polyhydroxyalkanoate copolymer production from glucose in Escherichia coli JM109. Appl. Environ. Microbiol., 70, 999, 2004.
66. Langenbach, S., Rehm, B.H.A., and Steinbüchel, A. Functional expression of the PHA synthase gene phaC1 from Pseudomonas aeruginosa in Escherichia coli results in poly(3-hydroxyalkanoate) synthesis. FEMS Microbiol. Lett., 150, 303, 1997.
67. Qi, Q., Rehm, B.H.A., and Steinbüchel, A. Synthesis of poly(3-hydroxyalkanoates) in Escherichia coli expressing the PHA synthase gene phaC2 from Pseudomonas aeruginosa: comparison of PhaC1 and PhaC2. FEMS Microbiol. Lett., 157, 155, 1997.
68. Qi, Q., Rehm, B.H.A., and Steinbüchel, A. Metabolic routing towards polyhydroxyalkanoic acid synthesis in recombinant Escherichia coli (fadR): inhibition of fatty acid beta-oxidation by acrylic acid. FEMS Microbiol. Lett., 167, 89, 1998.
69. Snell, K.D. et al. YfcX enables medium-chain-length poly(3-hydroxyalkanoate) formation from fatty acids in recombinant Escherichia coli fadB strains. J. Bacteriol., 184, 5696, 2002.
70. Park, S.J. and Lee, S.Y. New FadB homologous enzymes and their use in enhanced biosynthesis of medium-chain-length polyhydroxyalkanoates in fadB mutant Escherichia coli. Biotechnol. Bioeng., 86, 681, 2004.
71. Campbell, J.W., Morgan-Kiss, R.M., and Cronan, J.E., Jr. A new Escherichia coli metabolic competency: growth on fatty acids by a novel anaerobic β-oxidation pathway. Mol. Microbiol., 47, 793, 2003.
72. Zhao, K. et al. Production of D-(-)-3-hydroxyalkanoic acid by recombinant Escherichia coli. FEMS Microbiol. Lett., 218, 59, 2003.
73. Zheng, Z. et al. Production of 3-hydroxydecanoic acid by recombinant Escherichia coli HB101 harboring phaG gene. Antonie Van Leeuwenhoek, 85, 93, 2004.
74. Zheng, Z. et al. Thioesterase II of Escherichia coli plays an important role in 3-hydroxydecanoic acid production. Appl. Environ. Microbiol., 70, 3807, 2004.
75. Klinke, S. et al. Production of medium-chain-length poly(3-hydroxyalkanoates) from gluconate by recombinant Escherichia coli. Appl. Environ. Microbiol., 65, 540, 1999.
76. Rehm, B.H.A. and Steinbüchel, A. Heterologous expression of the acyl-acyl carrier protein thioesterase gene from the plant Umbellularia californica mediates polyhydroxyalkanoate biosynthesis in recombinant Escherichia coli. Appl. Microbiol. Biotechnol., 55, 205, 2001.
77. Lee, S.Y., Lee, D.Y., and Kim, T.Y. Systems biotechnology for strain improvement. Trends Biotechnol., 23, 331, 2005.
78. Park, S.J. et al. Global physiological understanding and metabolic engineering of microorganisms based on omics studies. Appl. Microbiol. Biotechnol., 68, 567, 2005.
5 Rare Metabolic Conversions—Harvesting Diversity through Nature

Manuel Ferrer, Institute of Catalysis
Peter N. Golyshin, HZI-Helmholtz Centre for Infection Research

5.1 Introduction
5.2 How Diverse Are Functional Groups?
5.3 Diversity of Enzymes and Current Frontiers for Bioconversions
5.4 Main Chemical Conversions Mediated by Enzymes: Putative “Rare” Conversions
    Hydrolytic Enzymes in Enantio-Transformations • Rare Conversions Using Carbohydrate Modifying Enzymes: Production of Rare Carbohydrates • Oxidative Enzymes for the Conversion of Organic Pollutants in Industrial Effluents to Useful Products • Rare Conversions Using Fatty Acids: Production of Unique Complex Lipids and Essential Fatty Acids • Rare Conversions Using Aliphatic Groups: Incorporation of Chlorine, Fluorine, Bromide, and Iodine to Peptides-Polyketides • Rare Conversions Using Amino Acids: Production of Unnatural Amino Acids • Rare Conversion with Flavonoids and Steroids: Drug Synthesis • Rare Conversion of Xenobiotics: Potential Biodegradation Pathways • Rare Routes for Synthesis of Hormones and Vitamins: Rare Carboxylations and Hydroxylations • Cofactor-Dependent Enzymes and Cofactor Regeneration • Enantiomerically Pure Compounds by Rare Domino and MCRs
5.5 How Can New Catalytic Functions Be Achieved?
5.6 Recent Advances in Metagenomics: The Untapped Reservoir of Proteins from Unculturable Microbes
Acknowledgments
References
5.1 Introduction
At the fundamental level, cellular metabolism is mediated by enzymes that catalyze biotransformation reactions. The specific reactions that transform a particular substrate in an experimental assay (or in an in vivo system) may or may not be of practical interest. However, when research focuses on biotransformations and their biotechnological applications, knowledge of the whole genetic circuit, of the particular pathway, and of the individual enzymes comprising the pathway is needed in most cases. Traditionally, screening for a specific biocatalytic reaction was performed by growing a particular microorganism and detecting the disappearance of the starting compound (the reaction substrate) and/or the occurrence and accumulation of
corresponding metabolites [1,2]. The structures of the metabolites and the stoichiometry of their formation (determined either experimentally or in silico) give important hints about the specificity of an enzyme and its probable usefulness for the application [3]. In either case, the reactions of an apparently new pathway are routinely elucidated by molecular biological methods, typically through molecular cloning and expression of the protein in a surrogate host, such as Escherichia coli, and subsequent characterization of its enzymatic activity (Table 5.1).

Table 5.1 Classification of Enzymes and EC Numbers
1. Oxidoreductases
   1.1 Acting on the CH-OH group of donors
   1.2 Acting on the aldehyde or oxo group of donors
   1.3 Acting on the CH-CH group of donors
   1.4 Acting on the CH-NH2 group of donors
   1.5 Acting on the CH-NH group of donors
   1.6 Acting on NADH or NADPH
   1.7 Acting on other nitrogenous compounds as donors
   1.8 Acting on a sulfur group of donors
   1.9 Acting on a heme group of donors
   1.10 Acting on diphenols and related substances as donors
   1.11 Acting on a peroxide as acceptor
   1.12 Acting on hydrogen as donor
   1.13 Acting on single donors with incorporation of molecular oxygen (oxygenases)
   1.14 Acting on paired donors with incorporation of molecular oxygen
   1.15 Acting on superoxide as acceptor
   1.16 Oxidizing metal ions
   1.17 Acting on CH or CH2 groups
   1.18 Acting on iron-sulfur proteins as donors
   1.19 Acting on reduced flavodoxin as donor
   1.20 Acting on phosphorus or arsenic in donors
   1.21 Acting on X-H and Y-H to form an X-Y bond
   1.97 Other oxidoreductases
   1.- Other oxidoreductases
2. Transferases
   2.1 Transferring one-carbon groups
   2.2 Transferring aldehyde or ketone residues
   2.3 Acyltransferases
   2.4 Glycosyltransferases
   2.5 Transferring alkyl or aryl groups, other than methyl groups
   2.6 Transferring nitrogenous groups
   2.7 Transferring phosphorus-containing groups
   2.8 Transferring sulfur-containing groups
   2.9 Transferring selenium-containing groups
   2.- Transferring selenium-containing groups
3. Hydrolases
   3.1 Acting on ester bonds
   3.2 Glycosidases
   3.3 Acting on ether bonds
   3.4 Acting on peptide bonds (peptidases)
   3.5 Acting on carbon–nitrogen bonds, other than peptide bonds
   3.6 Acting on acid anhydrides
   3.7 Acting on carbon–carbon bonds
   3.8 Acting on halide bonds
   3.9 Acting on phosphorus–nitrogen bonds
   3.10 Acting on sulfur–nitrogen bonds
   3.11 Acting on carbon–phosphorus bonds
   3.12 Acting on sulfur–sulfur bonds
   3.13 Acting on carbon–sulfur bonds
   3.- Acting on carbon–sulfur bonds
4. Lyases
   4.1 Carbon–carbon lyases
   4.2 Carbon–oxygen lyases
   4.3 Carbon–nitrogen lyases
   4.4 Carbon–sulfur lyases
   4.5 Carbon–halide lyases
   4.6 Phosphorus–oxygen lyases
   4.99 Other lyases
5. Isomerases
   5.1 Racemases and epimerases
   5.2 cis-trans-Isomerases
   5.3 Intramolecular oxidoreductases
   5.4 Intramolecular transferases (mutases)
   5.5 Intramolecular lyases
   5.99 Other isomerases
   5.- Other isomerases
6. Ligases
   6.1 Forming carbon–oxygen bonds
   6.2 Forming carbon–sulfur bonds
   6.3 Forming carbon–nitrogen bonds
   6.4 Forming carbon–carbon bonds
   6.5 Forming phosphoric ester bonds
   6.6 Forming nitrogen–metal bonds
   6.- Forming nitrogen–metal bonds

To obtain a desired biocatalyst, a known pathway or reaction might be taken. The goal is then to obtain an ideal enzyme that is highly active with a given substrate or under a specific set of conditions [4]. In this case, the screening method is important, as it might be necessary to examine hundreds or thousands of enzymes in the first rounds of screening. A wide range of screening methods is described in numerous publications, the most relevant being those by the groups of Omura and Cutler [5,6]. Moreover, fast, efficient, high-throughput screening methods that can test vast numbers of enzymes for specific applications have been developed in recent decades, bringing made-to-order enzymes a step closer [7]. Recently, a screening system based on fluorescence-activated cell sorting (FACS), a technology that enables the identification of biological activity within a single cell, was developed by Diversa Co. (CA). This system incorporates a laser with multiple wavelength capabilities and can screen up to 50,000 clones per second, or over 1 billion clones per day. Another approach is to test existing isolates of microorganisms to determine whether they have an enzyme that shows high activity with the substrate of interest. The pure
cultures of microorganisms have stood the test of time: they have been used successfully for over 100 years. However, will the advent of genome-wide high-throughput techniques for enzymatic screening and in silico prediction [8] make cultivation-based enzyme discovery redundant? Based on genome sequencing data, it is currently possible to make a metabolic reconstruction in which all known biochemical conversions can be identified. For example, a comprehensive metabolic reconstruction of aspergilli revealed up to 14,400 ORFs, 335 reactions, and 284 metabolites distributed over the intra- and extracellular compartments [9]. Moreover, the genomes of extremophilic bacteria and archaea could be even more sophisticated: those of Thermotoga, Sulfolobus, and Picrophilus spp., which contain 1,500–4,000 ORFs, may encode more complex biochemistries in terms of unique proteins, metabolites, and enzymes that function under the extreme physicochemical conditions of their natural environments [10–12]. Even though the genomic information of individual organisms is a good starting point, the cultivation of microbes often represents a bottleneck. The biological diversity on our planet consists mostly of microorganisms, in terms of absolute numbers of cells, biomass, and numbers of species [13]. It has recently been reported that up to 25,000 different genotypes were found in a millilitre of seawater from some marine ecosystems [14]. However, the majority of microbes in nature will never be cultured, and the new “metagenomic” approach is used to access this great, inexhaustible genetic and metabolic diversity [15–21].
This approach involves harvesting DNA from an environmental sample (or from an enrichment), archiving it in metagenomic libraries in appropriate hosts, and either screening these libraries for a gene of interest or expressing the DNA and screening for the enzymatic activities of interest [17]. Alternatively, these libraries are subjected to high-throughput shotgun sequencing and automated annotation, which often produces thousands of new genes whose functions remain unknown. This has been shown recently by the group of Venter [22], whose shotgun sequencing of environmental DNA from the Sargasso Sea yielded 1.045 billion base pairs derived from 1,800 genomic species and revealed more than 1.2 million previously unknown genes. However, the work also reflects the difficulties in genome assembly and other challenges in the interpretation of large data sets. Metagenomics may result in the development of a number of new products and provide the tools for the resolution of rare conversions that are not amenable to the existing biocatalysts. Below we focus on the current challenges for enzymatic conversions and on how natural biodiversity may help to address them.
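The sequence-based branch of the screening described above (searching a library for a gene of interest) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the motif chosen here, the G-x-S-x-G pentapeptide that surrounds the catalytic serine in many lipases and esterases, is an illustrative choice, and the clone names and protein sequences are invented for the example.

```python
import re

# Illustrative motif only (an assumption of this sketch): the G-x-S-x-G
# pentapeptide found around the catalytic serine of many lipases/esterases.
GXSXG = re.compile(r"G.S.G")

def screen_library(proteins, motif=GXSXG):
    """Return {clone_id: [start positions]} for predicted proteins that
    contain the motif. Positions are 0-based, non-overlapping matches."""
    hits = {}
    for clone_id, seq in proteins.items():
        positions = [m.start() for m in motif.finditer(seq.upper())]
        if positions:
            hits[clone_id] = positions
    return hits

# Hypothetical clone names and made-up sequences, for demonstration only.
library = {
    "clone_001": "MKTLLAVGHSMGGALA",
    "clone_002": "MSTNPKPQRKTKRNTN",
    "clone_003": "MAGDSAGGNLAHHGYSQGG",
}

hits = screen_library(library)
for clone_id, positions in sorted(hits.items()):
    print(clone_id, positions)
```

A motif match of this kind only nominates candidates. As the text notes, activity-based screening of expressed clones is the complementary route, and it can find enzymes that no known motif would predict.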
5.2 How Diverse Are Functional Groups?
The compounds to be considered in the context of conversions are the known organic molecules, an ever-expanding set of over 10 million compounds, both natural and artificial, which are divided into a variety of chemical functional groups with a definable set of chemical reactivities. However, it is to be expected that in the natural biological world a huge number of undiscovered natural products, and therefore of “exotic-rare” chemical reactions for their synthesis and/or conversion, await discovery. This conclusion is supported by the enormous taxonomic diversity of microorganisms and thus the corresponding unmined riches of unexplored catalytic diversity [12]. Furthermore, for enzymes catalyzing sequential reactions in a biosynthetic pathway, new catalytic functions could evolve after gene duplication, with the binding capacity for a common ligand retained and the chemistry of catalysis changed [23,24]. Alternatively, for enzymes catalyzing chemically related reactions in different biosynthetic pathways, the chemistry of catalysis could be retained after gene duplication while the substrate specificity changes [25]. In addition, some divergently evolved enzymes could have been derived from a largely conserved active site architecture that was modified to catalyze different reactions that are unrelated with respect to both chemical mechanism and substrate structure [26,27]. As the number and diversity of enzymes are practically unlimited (in terms of amino acid combinations), important work remains to be done in categorizing the known and unknown reactions via a functional group approach and, experimentally, in uncovering new enzymatic reactions useful for the biosynthesis and catabolism of previously unstudied functional groups.
Figure 5.1 Challenges for biotechnology and possible points for implementation of rare enzymes and rare conversions. [Figure not reproduced. Panels arranged around biotechnology and its selective and promiscuous biocatalysts: food/feed (products that improve our diet and health; genetic variants of plants that resist plagues and scarcity of water and nutrients and improve yield; agricultural chemicals to improve yields); material science (biopolymers: replacement of tissues, organs, bones, etc.); chemical synthesis (preparation of chiral synthons; new ways to produce known or new products, e.g., anticancer agents); energy and sustainable development (bioenergy: biofuels, bioethanol, biofuel cells; hydrogenases and laccases that are more stable and less inhibited by CO); terrorism issues (detection of chemical weapons and biological agents such as viruses, toxins, and infectious bacteria); recycling of materials and waste elimination.]
The bibliographic record of biotechnology research over the past few decades, especially regarding new enzymatic conversions, shows that this emerging field may provide a clear alternative to chemical synthetic industrial methods. Biocatalysts show regio-, chemo-, and stereoselectivity along with high catalytic efficiency, and offer a great diversity of applications. Although microbial cultivation generates huge collections of microorganisms and their genes, it has been demonstrated that laboratory protein evolution, combined with the expression-based and high-throughput screening technologies used for the optimization of enzymes and other proteins, also generates large and potentially valuable gene libraries and enzyme variants useful for particular conversions [4]. However, some processes remain to be resolved (see examples in Figure 5.1), and these will require new biocatalysts (Figure 5.2).
5.3 Diversity of Enzymes and Current Frontiers for Bioconversions
Like all catalysts, enzymes accelerate the rates of reactions while normally undergoing no permanent chemical modification in the course of catalysis. Enzymes can accelerate, often by several orders of magnitude, reactions that under the mild conditions of cellular concentrations, temperature, pH, and pressure would proceed imperceptibly (or not at all) in the absence of the enzyme. Consonant with their role as biological catalysts, enzymes show considerable selectivity for the molecules (substrates) upon which they act. This makes enzyme and whole-cell biocatalysts a greener alternative to traditional organic synthesis: they offer appropriate tools for the industrial transformation of natural or synthetic materials under mild reaction conditions [28,29], have low energy requirements, and minimize the problems of isomerization and rearrangement [30,31]. One of the key findings of the last 15 years was the discovery that enzymes may be active in organic solvents, and that this influences enzyme properties [32,33]. This is of special interest for enzymes such as penicillin acylases, esterases, lipases, proteases, nitrilases, hydantoinases, oxidoreductases, and transferases, which have been successfully applied to the synthesis of numerous bioactive compounds [34].

Figure 5.2 Alternatives for engineering or finding new enzymes and conversions. [Figure not reproduced. Panels: reaction engineering (modulating the microenvironment of the enzyme); biocatalyst engineering and protein engineering (using known biocatalysts and trying to improve their activities and properties); new or better enzymes found by studying microbial diversity with cultivation-independent methods.]

Table 5.2 Enzymes Commonly Used in Chemical Conversions
Enzymes                                  Conversion
Esterases, lipases                       Ester hydrolysis and formation
Amidases, proteases, and acylases        Amide hydrolysis and formation
Dehydrogenases                           Oxidoreduction of alcohols and ketones
Oxidases (mono- and dioxygenases)        Oxidation
Peroxidases                              Oxidation, epoxidation, and halohydration
Kinases                                  Phosphorylation
Aldolases and transketolases             Aldol reaction (C-C bond)
Glycosidases and glycosyltransferases    Glycosidic bond formation
Phosphorylases and phosphatases          Formation and hydrolysis of phosphate
Sulphotransferases                       Formation of sulphate esters
Transaminases                            Amino acid synthesis (C-N bond)
Hydrolases                               Hydrolysis
Isomerases, lyases, and hydratases       Isomerization, addition, elimination, and replacement

However, most enzymes react with only a small group of closely related chemical compounds; many demonstrate absolute specificity, having only one substrate molecule that is appropriate for reaction (see Table 5.2 for specific conversions and the classification of enzymes according to them). Over 3,000 enzymes have so far been identified, and this number may be greatly augmented in the wake of in silico data mining and proteomic and (meta)genomic research [20,34]. The last strategy is of special interest, as we know that the microbial world is characterized by its extreme biochemical diversity
with more than 10 million species and a considerable divergence of gene sequences within and among species. Moreover, a large diversity of microbes needs to be studied to gain sufficient understanding of their functional roles. Although the existing experimental and genome data have revealed a battery of enzymes whose physiological roles and biotechnological potential are well established in the laboratory and in industry, we strongly believe that enzymology is still in its infancy and that it will expand greatly during the forthcoming search for novel and versatile enzymes able to catalyze reactions that are difficult to perform by chemical methods (Figure 5.2) [20]. Biotransformation processes are far more diverse than one would imagine from simply counting all existing protein families [35]. Thousands of different strains and enzymes are required to exploit the selective biotransformation potential for the conversion of a myriad of different substrates into the desired products, especially new optically active derivatives. This may offer the chance to discover many novel bioconversions, including stereo- and regiospecific conversions of a range of substrates such as cyclic sugars, polyalcohols, and steroids. This is especially significant given the move toward developing “cleaner” synthetic routes based, where possible, on “white,” or more environmentally friendly, biotechnologies [28,36,37]. Compressed timelines in the development cycle of pharmaceuticals, in combination with the lack of a broad choice of strains and enzymes, mean that conversions typically represent the second-generation process choice in the manufacturing of small-molecule pharmaceuticals. As the enzyme is the first and foremost functional element in the biotransformation of small molecules, novel biocatalysts, especially oxidoreductases and lyases, are needed (other enzymes that have been exploited for organic synthesis, as well as the types of reaction catalyzed, are summarized in Table 5.2).
Good examples of the replacement of traditional organic processes by a greener biocatalytic alternative include the thermolysin-catalyzed synthesis of the low-calorie sweetener aspartame, the production of acrylamide and nicotinamide (assisted by nitrile hydratases), the synthesis of the noncariogenic sweetener isomaltulose by sucrose mutases, the production of biopolymers such as polylactic acid, the synthesis of semisynthetic penicillins and cephalosporins, the transformation of natural and synthetic fibers, the bleaching of kraft pulp and recycling of paper, and the multistep synthesis of polyketide and glycopeptide antibiotics [29,37–39]. To overcome the problem of competing chemical reactions, the traditional frontiers for enzymatic conversions have been protein activation and stabilization, and reaction specificity [40]. For example, using protein engineering and high-throughput screening it is currently possible to create stereoselective enzymes with broadly extended substrate specificities that are useful in organic synthesis, stable hydroxynitrile hydrates for the synthesis of, e.g., substituted R-mandelic acids, and hydrolases capable of enantioselective carbon–carbon bond formation or selective oxidation processes [27,41]. Moreover, in synthetic reactions that involve oxidoreductases or expensive redox cofactors, or that require whole cells, biotechnological developments, from upstream (strain, cell, and organism development) and midstream (fermentation and other unit operations) to downstream processes, will significantly benefit from the application of white biotechnology in chemical transformations [42]. Additionally, combinatorial immobilization techniques are providing effective methods for optimizing operational performance, in addition to aiding the recovery and reuse of biocatalysts [40].
It should also be taken into consideration that, as drug targets are so variable and complex, industrial emphasis has been directed to the development and exploration of new domino and multicomponent reactions (MCRs) for the biosynthesis of medically important unusual antibiotics. Therefore, new catalytic synthetic methods in organic chemistry that satisfy increasingly stringent environmental constraints are in great demand by the pharmaceutical and chemical industries [38,43,44]. In addition, novel catalytic procedures are necessary to produce the emerging classes of compounds that are becoming the targets of molecular and biomedical research. Independently of the application, an ideal biocatalyst requires high specific activity and stability, minimal substrate and product inhibition, and, very importantly in terms of synthetic utility, high stereospecificity. Although many enzymes have now been proven useful for synthesis, one still cannot use enzymes for the formation of every desired linkage or for the resolution of any racemic mixture. Moreover, although many enzymes have been well characterized with regard to substrate specificity and stereoselectivity, they may still exhibit some unpredictable
features with unnatural substrates. As we will discuss later, the success of metagenomics in finding new enzymatic activities may play a key role in the generation of new products and processes that were until recently hidden from us [20].
5.4 Main Chemical Conversions Mediated by Enzymes: Putative “Rare” Conversions
It is likely that ongoing enzyme discovery will further enhance the catalytic potential of biotechnology toward synthetic conversions and ultimately contribute to sustainable and environmentally friendly practices. Below we discuss the most important enzymes and conversions on which future research will be focused.
5.4.1 Hydrolytic Enzymes in Enantio-Transformations
Introduction of chirality and/or synthesis-resolution of enantiomerically pure intermediates and products are of practical importance for the pharmaceutical, agrochemical, and other industries (Figure 5.3a). Of particular interest are esterases and lipases, which are considered to be a third important group of industrial enzymes suitable for commercial exploitation. Both types of enzymes belong to the family of carboxylic ester hydrolases (EC 3.1.1), which catalyze the cleavage of ester bonds of lipids and other organic compounds and are divided into 79 enzymatic families according to the specific bond, moiety, and substrate they hydrolyze [45]. Although their primary natural function is the hydrolysis of acylglycerols, the main interest in their application stems from their high activity and stability in nonaqueous systems, which allow important biotransformations such as the preparation of enantiopure compounds from racemic pairs, prochiral (precursors to chiral) or meso compounds, or diastereomeric mixtures [32,33].

Figure 5.3 A few examples of preparative and industrial applications of enzymes. (a) Esterase-lipase mediated resolution of chiral synthons. (b) Enzymatic synthesis of oligosaccharides by glycosidases, transglycosidases, and glycosyltransferases. (c) Regiospecific hydroxylation and enantiospecific reduction by mono-oxygenases and alcohol dehydrogenases. [Figure not reproduced: reaction schemes; panel (b) also shows a chromatogram (signal in mV against retention time in min) with peaks labeled glucose, maltose, and maltotriose.]

Hydrolytic enzymes also effectively catalyze enantiocomplementary reverse hydrolysis (esterification, transesterification, aminolysis, or amidation), providing access to both enantiomers of a desired product [46]. Moreover, in order to produce enantiopure chiral amines and alcohols as well as phospho-, glyco-, and lipopeptide conjugates, lipase/esterase-catalyzed enantioselective reactions for the temporary protection and deprotection of alcohols and amines (using appropriate protecting groups) have recently been reported [47–49]. Further, the regiospecific esterification of carbohydrates by enzyme-catalyzed transesterification of sugars may be achieved with an appropriate selection of lipases [46]. Besides esterases and lipases, two other important groups of hydrolytic enzymes are proteases and penicillin acylases. Proteases keep their status as important catalysts for peptide synthesis in thermodynamically controlled processes or kinetically controlled aminolysis. Like other hydrolases such as esterases and lipases, proteases not only serve as hydrolytic enzymes, hydrolyzing peptide bonds in vivo, but also facilitate amide- or ester-bond-forming reactions under specific conditions in vitro, and are particularly interesting for the formation of unnatural amino acids through peptide bond formation. Moreover, apart from lipases and carboxylesterases, serine proteases, especially those of the subtilisin family, have also been successfully employed in sugar acylation processes, even with long-chain fatty acids [46]. The last example of hydrolytic enzymes applied in synthetic industrial processes is the penicillin acylases.
Although they are broadly appreciated for the synthesis of medically important antibiotics through β-lactam cleavage, they have a few practical limitations, such as a narrow pH profile (they are active only within a narrow pH window) and the fact that the resolution of antibiotics is kinetically controlled, giving a maximum yield of 50% [50]. Although in silico analysis, modeling techniques, protein engineering, and medium development can aid the “correct” stereochemical course of the reaction, improve thermal and/or solvent stability, and provide insight into potential substrate or enzyme modifications that may increase selectivity [4], these are less attractive than the discovery of new enzymes containing unnatural amino acids or functionalities that yield new structures that cannot be obtained through existing technologies.
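The selectivity of a kinetic resolution such as those discussed in this section is commonly summarized by the enantiomeric ratio E. The sketch below is a hedged illustration, not a method taken from this chapter: it assumes the widely used relation of Chen and co-workers for an irreversible resolution, in which E is computed from the conversion c and the enantiomeric excess of the remaining substrate, and it uses the mass-balance relation between substrate and product ee. The function names and the numerical example are invented for illustration.

```python
import math

def product_ee(c, ee_s):
    """Enantiomeric excess of the product from mass balance:
    c = ee_s / (ee_s + ee_p)  =>  ee_p = ee_s * (1 - c) / c."""
    return ee_s * (1.0 - c) / c

def enantiomeric_ratio(c, ee_s):
    """Enantiomeric ratio E for an irreversible kinetic resolution
    (assumed relation): E = ln[(1-c)(1-ee_s)] / ln[(1-c)(1+ee_s)].
    c and ee_s are fractions strictly between 0 and 1."""
    if not (0.0 < c < 1.0) or not (0.0 < ee_s < 1.0):
        raise ValueError("c and ee_s must lie strictly between 0 and 1")
    denominator = math.log((1.0 - c) * (1.0 + ee_s))
    if denominator >= 0.0:
        # Physically inconsistent input: ee_s cannot be this high at this c.
        raise ValueError("ee_s is too high for the given conversion")
    return math.log((1.0 - c) * (1.0 - ee_s)) / denominator

# Illustrative numbers: 50% conversion, remaining substrate at 90% ee.
E = enantiomeric_ratio(0.5, 0.9)
ee_p = product_ee(0.5, 0.9)
print(f"E = {E:.1f}, product ee = {ee_p:.2f}")
```

Because each enantiomer makes up half of a racemate, even an infinitely selective enzyme can deliver at most 50% of the starting material as a single enantiomer, which is the yield ceiling mentioned above.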
5.4.2 Rare Conversions Using Carbohydrate Modifying Enzymes: Production of Rare Carbohydrates
An oligosaccharide is commonly defined as a carbohydrate consisting of two to ten monosaccharide residues linked by O-glycosidic bonds [51,52]. The development of efficient and scalable synthetic processes for producing oligosaccharides for the food (prebiotics, sweeteners, stabilizers, bulking agents, etc.) and pharmaceutical (e.g., therapeutics for the prevention of infection, neutralization of toxins, and immunotherapies) industries is of great interest [53]. Considering the structural diversity of oligosaccharides (three different amino acids would allow the synthesis of only six different peptides, whereas three different hexopyranose moieties would yield up to 720 trisaccharides), the stereo- and regioselectivity of enzymes makes them a valuable alternative to chemical synthesis, which needs complex protection and deprotection steps, for the preparation of structurally well-defined oligosaccharides (Figure 5.3b). Indeed, enzymatic processes are the preferred choice in the food industry for the production of the most important oligosaccharides. Biotransformation of carbohydrates is a classical example of the application of the regiospecificity of enzymes [54]. In vivo synthesis of glycosidic bonds is performed by glycosyltransferases (EC 2.4) [55,56]. These enzymes catalyze the coupling (by a transfer reaction) of a glycosyl donor to an acceptor molecule, forming a new glycosidic bond with regio- and stereoselectivity. According to the nature of the sugar residue being transferred, glycosyltransferases are divided into hexosyltransferases (EC 2.4.1), pentosyltransferases (EC 2.4.2), and those transferring other glycosyl groups (EC 2.4.99).
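The hierarchical EC numbers used above (class, subclass, sub-subclass) lend themselves to a simple lookup. The following sketch is illustrative only: the class names and the few subclass entries are those given in Table 5.1 and in the preceding paragraph, the dictionaries are deliberately incomplete, and the function names are hypothetical.

```python
# Top-level EC classes as listed in Table 5.1.
EC_CLASSES = {
    1: "Oxidoreductases", 2: "Transferases", 3: "Hydrolases",
    4: "Lyases", 5: "Isomerases", 6: "Ligases",
}
# A deliberately partial selection of subclasses mentioned in the text.
EC_SUBCLASSES = {
    (2, 4): "Glycosyltransferases",
    (3, 1): "Acting on ester bonds",
    (3, 2): "Glycosidases",
}
EC_SUBSUBCLASSES = {
    (2, 4, 1): "Hexosyltransferases",
    (2, 4, 2): "Pentosyltransferases",
    (2, 4, 99): "Transferring other glycosyl groups",
}

def parse_ec(ec):
    """Turn 'EC 2.4.1.', '2.4.1.-' or '3.1.1' into a tuple of ints,
    stopping at the first non-numeric field such as '-' or 'x'."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:]
    fields = []
    for part in text.strip().strip(".").split("."):
        part = part.strip()
        if not part.isdigit():
            break
        fields.append(int(part))
    return tuple(fields)

def describe_ec(ec):
    """Return the known names from class down to sub-subclass."""
    fields = parse_ec(ec)
    if not fields or fields[0] not in EC_CLASSES:
        raise ValueError(f"unrecognized EC number: {ec!r}")
    names = [EC_CLASSES[fields[0]]]
    for depth, table in ((2, EC_SUBCLASSES), (3, EC_SUBSUBCLASSES)):
        key = fields[:depth]
        if len(key) == depth and key in table:
            names.append(table[key])
        else:
            break
    return names

print(describe_ec("EC 2.4.99."))
print(describe_ec("3.1.1.-"))
```

The lookup stops as soon as a level is missing from the partial tables, so an entry such as EC 3.1.1 is reported only down to its subclass.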
Depending on the nature of the donor molecule, glycosyltransferases are classified into three main mechanistic groups: (1) Leloir-type glycosyltransferases, which require sugar nucleotides (e.g., UDP-glucosyltransferase); (2) non-Leloir glycosyltransferases, which use sugar-1-phosphates (e.g., phosphorylases); and (3) transglycosidases,
which employ nonactivated oligo- or polysaccharides (e.g., sucrose or starch) as glycosyl donors. A distinctive feature of transglycosidases, compared with Leloir and non-Leloir glycosyltransferases, is that they also display some hydrolytic activity, which can be regarded as a transfer of a glycosyl group from the donor to water. It is noteworthy that, in terms of reaction mechanism, transglycosidases belong to the same group as glycosidases (EC 3.2), a group of hydrolases that catalyze the hydrolysis of glycosidic bonds in oligo- and polysaccharides with exquisite stereoselectivity. According to Henrissat's classification, which is based on amino acid sequence comparisons, transglycosidases and glycosidases constitute the “glycoside hydrolase family (GH family),” with more than 2,500 enzymes [57]. In vitro oligosaccharide synthesis can be performed with glycosyltransferases and glycosidases. There are several problems associated with the use of glycosyltransferases of the Leloir and non-Leloir type: (1) the requirement for sugar nucleotides or sugar phosphates as substrates, whose synthesis is rather difficult and expensive; (2) the inhibitory effect of the nucleotide phosphate released; and (3) the limited availability of these enzymes [58]. Glycosidases are widely employed for oligosaccharide synthesis, as under appropriate conditions the normal hydrolytic reaction can be reversed toward glycosidic bond synthesis [54]. This can be achieved by thermodynamic control (using low water concentrations) or by kinetic control (using activated glycosyl donors at high concentrations). Despite the broad specificity of glycosidases and their availability, the application of these catalysts is often limited by low yields and poor regioselectivity.
5.4.2.1 Production of Rare Monosaccharides
Along with the production of other sugars, the enzymatic transformation of monosaccharides has become an important bioprocess.
In the past few years, the medical applications of L-carbohydrates and derived nucleosides have greatly increased [59]. Several nonmodified and modified L-sugars, e.g., L-sorbose, L-arabinose, 2-deoxy-2-fluoro-5-methyl-β-L-arabinofuranosyl uracil (L-FMAU), and L-arabinose condensed with cyanohydrin and nitromethane, have been shown to be potent antiviral agents (against hepatitis B virus (HBV), human immunodeficiency virus (HIV), and Epstein-Barr virus (EBV)) and antitumor agents, to be useful in antigenic therapies, and to serve as precursors for the facile synthesis of the potent glycosidase inhibitor 1-deoxygalactonojirimycin [59]. Rare sugars are usually as sweet as the natural sugars but, unlike them, are either not metabolized by the body or metabolized to a lesser extent. Because of these features, rare sugars are desirable as low-calorie sweeteners and are well tolerated by diabetics. It was also found that L-monosaccharides have antineoplastic characteristics useful in combination with all major forms of cancer therapy, including surgery, biological, chemical, and radiation therapies, and hyperthermia. Another advantage of rare sugars is the absence of an objectionable aftertaste, commonly experienced with artificial sweeteners such as saccharin or cyclamates. However, in spite of the demand for these rare sugars, their commercial availability, application, and usefulness are negligible, as they are expensive to prepare and unavailable in nature. Apart from their intrinsic applications, one of the key challenges of recent decades has been the asymmetric synthesis of highly functionalized monosaccharides that bear several centers of chirality [60]. Subsequently, these sugars (i.e., monosaccharides) need to be connected through glycosidic linkages to carbohydrates or other biomolecules, such as lipids, proteins, or metabolites [49].
However, the selective formation and hydrolysis of glycosidic linkages, although chemically difficult, is a common reaction in biological systems: it is estimated that 1–2% of an organism’s genes are dedicated to glycoside hydrolases and glycosyltransferases. The application of biocatalysis to carbohydrates is, therefore, a rich field that has now been adopted, to various degrees, by many researchers involved in carbohydrate synthesis. The key enzymes for the synthesis of monosaccharides are readily available aldolases, which catalyze the selective formation of the central C–C bonds, e.g., galactose oxidase, rhamnulose-1-phosphate aldolase, sialyltransferase. These enzymes have been successfully applied to the synthesis of a variety of mono- and sialooligosaccharides [34,61]. The formation of C–C bonds by aldolases for the production of monosaccharides with complete stereochemical control is of obvious importance in organic synthesis, and enzyme-catalyzed aldol addition reactions (typically between an aldehyde and a ketone) have made important contributions in this regard [62]. In aldolase-catalyzed
reactions, the enzyme generally controls the configuration of the newly formed stereogenic centres. Aldolases are highly specific for the donor substrate (that is, the nucleophilic enolate) but relatively flexible with respect to the acceptor (electrophilic) group. Exploiting this enzymatic characteristic through a judicious choice of acceptor substrate and aldolase has led to the preparation of numerous carbohydrates and mimics thereof. These molecules have further served as intermediates in the synthesis of complex bioactive molecules, such as glycosyltransferase and glycosidase inhibitors. Characteristic synthetic examples are the use of fructose diphosphate (FDP) aldolase for the construction of cyclic imine sugars and of the acetaldehyde-dependent 2-deoxyribose-5-phosphate aldolase (DERA). The latter is the only known aldolase that catalyzes condensation between two aldehydes, and it has been used in the synthesis of epothilones, a new class of anticancer agents of interest to the pharmaceutical industry [34]. Other synthetically useful enzymes catalyzing C–C bond formation include transaldolases, transketolases, cyanohydrin synthetases (also called oxynitrilases), and enzymes for acyloin condensation, acyl transfer, isoprenoid and steroid assembly, β-replacement of amino acids, and many B12-dependent reactions, which will not be discussed here. Apart from aldolases for the production of monosaccharides, other enzymes have been successfully applied to the formation of glycosidic linkages, namely for the production of complex sugars with prebiotic and bioactive properties (see below).
5.4.2.2 Rare Oligosaccharides: Prebiotics and Bioactives
Transglycosidases are the ideal biocatalysts for oligosaccharide synthesis in vitro, since they do not require specially activated substrates but directly employ the free energy of cleavage of disaccharides (e.g., sucrose) or polysaccharides (e.g., starch) [63].
For this reason, they are the most convenient enzymes for the production of prebiotic oligosaccharides as food ingredients that are potentially beneficial to human or animal health (prebiotics escape enzymatic digestion in the upper gastrointestinal tract and enter the colon without change to their structure) [40]. Transglycosidases exhibit the same mechanism as retaining glycosidases, resulting in net retention of anomeric configuration. The active site contains two carboxylic acid residues located approximately 5.5 Å apart: one acts as a nucleophile and the other as an acid/base catalyst. The reaction proceeds by a double-displacement mechanism in which a covalent glycosyl-enzyme intermediate is formed by the attack of the deprotonated carboxylate on the anomeric centre of the carbohydrate, with concomitant C-O cleavage of the scissile glycosidic bond [56,60]. This step is assisted by the other carboxylic residue, acting as a general acid. The second step is the attack of a nucleophile on the glycosyl-enzyme intermediate, which is assisted by the conjugate base of the second carboxyl residue. The nucleophilic H2O and the acceptor (normally a carbohydrate) compete for the glycosyl-enzyme intermediate. When the nucleophile is H2O, the enzyme acts as a hydrolase; when the sugar is the nucleophile, the enzyme acts as a transferase. The transferase/hydrolase ratio depends on two main factors: (1) the concentration of acceptor (high concentrations must be used to enhance glycosyl transfer), and (2) the intrinsic enzyme properties. A transglycosidase is considered efficient if it possesses a significant ability to bind the acceptor and to exclude H2O. Along with transglycosidases, the glycosyltransferases are the biocatalysts responsible for the formation of many of the glycosyl bonds in nature.
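The competition between water and the acceptor for the glycosyl-enzyme intermediate, described above, can be illustrated with a deliberately simple partition model. This is a sketch under stated assumptions rather than a model taken from the cited work: both nucleophiles are assumed to react with the intermediate at rates proportional to their concentrations, the "selectivity" parameter (the ratio of the transfer and hydrolysis rate constants) is hypothetical, and secondary hydrolysis of the product and acceptor saturation are ignored.

```python
WATER_MOLARITY = 55.5  # mol/L, approximate value for dilute aqueous solution

def transfer_fraction(acceptor_molar, selectivity, water_molar=WATER_MOLARITY):
    """Fraction of glycosyl-enzyme intermediate captured by the acceptor.

    Assumed model: rate_transfer = k_T * [acceptor],
                   rate_hydrolysis = k_H * [water],
    with selectivity = k_T / k_H (a hypothetical enzyme property)."""
    if acceptor_molar < 0 or selectivity < 0 or water_molar <= 0:
        raise ValueError("concentrations and selectivity must be non-negative")
    transfer = selectivity * acceptor_molar
    return transfer / (transfer + water_molar)

# Illustrative numbers only: an enzyme that prefers the acceptor 100-fold.
for acceptor in (0.1, 0.5, 2.0):
    fraction = transfer_fraction(acceptor, selectivity=100.0)
    print(f"[acceptor] = {acceptor:.1f} M -> transfer fraction = {fraction:.2f}")
```

In this picture the two factors named in the text appear explicitly: raising the acceptor concentration and improving the enzyme's intrinsic preference for the acceptor over water both push the partition toward transfer.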
However, these enzymes use either sugar nucleotides or glycosyl phosphates as their activated sugar donors and are often highly selective for the second substrate, the glycosyl acceptor. Although their limitations in biocatalysis were described above, they have proved to be useful biocatalysts, e.g., for the glycosylation of natural products, such as the glycopeptide antibiotics vancomycin and teicoplanin, or the angucycline antitumor drugs urdamycin A and B, by using glycorandomization strategies [60,64]. In some cases, however, these enzymes require new donor substrates (glycosyl fluorides, oxazolines, and 6-oxo-glycosides) for synthetic transglycosylation reactions. Encouragingly, some of the glycosyltransferases involved in the biosynthesis of the above natural products have been purified and have demonstrated different degrees of enzymatic promiscuity toward their substrates. Particularly important targets for glycosylation are glycoproteins with defined and homogeneous glycosylation patterns. Biocatalytic approaches have involved the enzymatic glycosylation of neoglycoproteins after site-selective chemical glycosylation
or re-engineering of the glycosylation machinery of Pichia pastoris to produce human-like N-linked glycosylation on proteins. As a consequence of progress in understanding the structures and catalytic mechanisms involved in the enzymatic synthesis of glycosidic bonds, a group of novel, specifically mutated glycosidases called glycosynthases was developed [65]. The glycosynthase concept was established in 1998 by the Withers group using an exo-glucosidase as a model [66], and was further extended to endo-glycosidases by the Planas group [67]. A glycosynthase is a specifically mutated retaining glycosidase in which substitution of the catalytic carboxyl nucleophile by a noncatalytic residue (Ala, Gly, or Ser) renders the enzyme hydrolytically inactive but still able to catalyze the transglycosylation of activated glycosyl fluoride donors (which have the opposite anomeric configuration to that of the normal substrates of the parental wild-type enzyme). The yield obtained with glycosynthases reaches 95–98% in some cases [58]. The impressive number of glycosidases available clearly indicates that the potential biodiversity of glycosynthases is still largely unexplored, and new applications of these enzymes will emerge in the near future [68]. Among the most challenging oligosaccharide targets for synthesis are those of the proteoglycans, such as heparin and heparan sulfate. These are negatively charged, linear polysaccharides that interact with a variety of proteins at the cell surface. An important target has been the antithrombin III binding pentasaccharide, which promotes inhibition of blood coagulation and hence has important therapeutic applications. The chemical synthesis of the pentasaccharide has been reported in 60 steps, with less than 0.5% overall yield [34]. The challenge for rare conversions of carbohydrates is to develop robust screening or selection systems, and in this context, a recent report by Mayer et al.
[69] describes the development of a novel agar plate-based coupled-enzyme screen to select natural glycosynthases from a library of mutants. Such developments may open the way to finding wild-type enzymes, derived from environmental DNA, that show a high ratio of synthesis to hydrolysis of glycosidic bonds.
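The competition between H2O and the acceptor for the glycosyl-enzyme intermediate, described earlier in this section, can be illustrated with a minimal kinetic sketch. It assumes (this is not stated in the text) that both nucleophiles attack the intermediate in simple second-order steps, so that the transferase/hydrolase ratio is k_T[acceptor] / (k_H[H2O]). The function names and rate constants are hypothetical and serve only to show why a high acceptor concentration and a high intrinsic preference for the acceptor both favour transfer.

```python
# Minimal sketch of the partitioning of a glycosyl-enzyme intermediate.
# Assumption: both nucleophiles react in simple second-order steps.
# k_transfer and k_hydrolysis are hypothetical rate constants (M^-1 s^-1).

WATER_MOLAR = 55.5  # approximate molar concentration of water in dilute solution


def transfer_to_hydrolysis_ratio(acceptor_molar, k_transfer, k_hydrolysis,
                                 water_molar=WATER_MOLAR):
    """Return rate(transfer) / rate(hydrolysis) for the shared intermediate."""
    if acceptor_molar < 0 or k_transfer < 0 or k_hydrolysis <= 0:
        raise ValueError("concentrations and rate constants must be positive")
    return (k_transfer * acceptor_molar) / (k_hydrolysis * water_molar)


def transfer_fraction(acceptor_molar, k_transfer, k_hydrolysis,
                      water_molar=WATER_MOLAR):
    """Return the fraction of intermediate that ends up as transfer product."""
    ratio = transfer_to_hydrolysis_ratio(
        acceptor_molar, k_transfer, k_hydrolysis, water_molar)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Hypothetical enzyme that prefers the acceptor 100-fold over water.
    for acceptor in (0.05, 0.5, 2.0):
        share = transfer_fraction(acceptor, k_transfer=100.0, k_hydrolysis=1.0)
        print(f"[acceptor] = {acceptor:4.2f} M -> transfer fraction = {share:.2f}")
```

In this simplified picture, the acceptor concentration is the process variable and the ratio k_transfer/k_hydrolysis stands in for the "intrinsic enzyme properties" mentioned in the text.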
5.4.3 Oxidative Enzymes for the Conversion of Organic Pollutants in Industrial Effluents to Useful Products
Research involving enzymes such as peroxidases, laccases, and polyphenol oxidases from plants, fungi, and bacteria aims to develop bioremediation systems and, in addition, to use the waste organics found in industrial wastewaters by converting them into economically valuable products [37]. At present, wastes from the wine, olive, and petrochemical industries are used as sources of raw materials for the bioconversions. Potential products include antioxidants suitable for use as food additives or nutraceuticals. A number of oxidase enzymes are being investigated for their ability to act as biocatalysts in the hydroxylation and oxidation of aromatics to produce compounds of value to the pharmaceutical and food industries (Figure 5.3c), namely ferulic acid and certain flavonoids and steroids.
5.4.4 Rare Conversions Using Fatty Acids: Production of Unique Complex Lipids and Essential Fatty Acids
It is now clear that the identification of novel biochemical conversions for carboxylic acid activation may reveal new potential drug and human health targets. More generally, the identification of novel classes of enzymes and their role in the biosynthesis of complex hybrid metabolites may also reveal unknown mechanisms with which we should be able to generate metabolite diversity: fatty acid synthases (Fad) and polyketide synthases (PKSs) in combination have been demonstrated to be powerful catalysts for generating hybrid metabolites, although it is still not clear how the covalently sequestered biosynthetic intermediates are transferred from one enzymatic complex to another [37,70,71]. Long-chain polyunsaturated fatty acids (PUFAs) are vital for human health [72]. Although, until now, fish and other marine oils have been the only source of PUFAs, new developments have been achieved for the biosynthesis of ω-3 and
ω-6 families of PUFAs. Synthesis of these essential fatty acids requires the introduction of several genes responsible for the conversion of linoleic acid into arachidonic acid, or of α-linolenic acid into eicosapentaenoic acid and docosahexaenoic acid. Although the three major steps in the synthesis of PUFAs are all well studied and involve, sequentially, a ∆6-desaturation, a chain elongation, and a ∆5-desaturation through the action of fatty acid synthases, PKSs, and desaturases, it remains unclear how efficient other enzymes could be for carboxylic acid activation [72].
5.4.5 Rare Conversions Using Aliphatic Groups: Incorporation of Chlorine, Fluorine, Bromine, and Iodine into Peptides and Polyketides
Enzymatic incorporation of chlorine, bromine, or iodine atoms occurs during the biosynthesis of several thousand natural products [73]. Halogenation may have significant consequences for the bioactivity of these products, so there is great interest in understanding the biological catalysts that perform these reactions. With the exception of the well-characterized haloperoxidases [74], most of the biosynthetic enzymes and mechanisms responsible for halogenation have remained elusive. The crystal structures of two functionally diverse halogenases have recently been solved, providing new and exciting mechanistic detail [75,76]. This new insight has the potential to be used both in the development of biomimetic halogenation catalysts and in engineering halogenases, and related enzymes, to halogenate new substrates. Interestingly, these new structures also illustrate how the evolution of these enzymes mirrors that of the monooxygenases, where the cofactor is selected for its ability to generate a powerful oxygenating species [74]. Since their discovery, halogenated metabolites have been something of a biological peculiarity, and it is only now that we are beginning to realize the full extent of their medicinal value. However, only a few studies report on enzymes that halogenate unactivated aliphatic groups of peptides and polyketides. A good example of efficient enzymes catalysing rare conversions such as the iodination, bromination, and chlorination of a wide range of substrates is the chloroperoxidases (CPO) [74]. Besides H2O2-dependent halogenation reactions, the enzyme catalyzes dehydrogenation reactions. These reactions can be performed in a stereoselective manner by CPO, which may also act as a catalase, facilitating the decomposition of hydrogen peroxide to oxygen and water.
Furthermore, CPO catalyzes P450-like oxygen insertion reactions, e.g., enantioselective epoxidation/sulfoxidation of alkanes. Although CPOs have been investigated extensively during the last four decades, finding novel CPO-like enzymes will be crucial both for the conversion of nonactivated compounds such as alkanes and for a better understanding of the catalytic mechanism of this class of enzymes. This is of practical importance because, with the exception of the well-characterized haloperoxidases, most of the biosynthetic enzymes and mechanisms responsible for halogenation have remained elusive [74,76]. Alternatively, the isolation of novel halogenases may open new perspectives for realizing the full extent of their medical value [73]. We are convinced that natural evolution can efficiently select particular structures and properties to enhance or modify enzyme catalysis. Although fluorine in the form of fluoride minerals is the most abundant halogen in the Earth’s crust, only 12 naturally occurring organofluorine compounds have so far been found, and how these are biosynthesized remains a mystery [1]. The rarity of natural fluorinated products contrasts with the identification of about 3,500 naturally occurring halogenated compounds. The available fluoride is largely insoluble (e.g., sea water contains 1.3 ppm fluoride and 19,000 ppm chloride), which may help to explain why fluorine’s biochemistry has hardly evolved. The toxin fluoroacetate is the most ubiquitous of the small class of organofluorine compounds and has been identified in more than 40 plant species from all of the continents apart from Antarctica, but its biosynthetic fluorination pathway has not been clearly defined.
Fluoroacetate is also produced by the bacterium Streptomyces cattleya when it is grown in culture medium supplemented with fluoride ions. In a recent study, an enzymatic reaction has been
described that occurs in the bacterium S. cattleya and that catalyzes the conversion of fluoride ion and S-adenosylmethionine (SAM) to 5′-fluoro-5′-deoxyadenosine (5′-FDA). This study revealed a fluorinase enzyme for the first time. The discovery opened up new biotechnological opportunities for the preparation of organofluorine compounds and showed the value of continuing the search for new halogenases for both degradation and synthetic purposes [75].
5.4.6 Rare Conversions Using Amino Acids: Production of Unnatural Amino Acids
The products of such biocatalytic systems have significant value as precursors of antibiotics and other high-value pharmaceuticals, particularly because of the stereospecificity of the enzymes involved and hence the chirality of the products. Enantioselective amino acid syntheses using biocatalytic methods, specifically those involving proteases and hydantoin-hydrolysing enzyme systems, are therefore broadly appreciated [77,78]. Unnatural amino acids are a growing class of intermediates required for pharmaceuticals, agrochemicals, and other industrial products [79]. However, no single enzyme has proven efficient enough to prepare these compounds broadly at scale. To address this need, new enzymes must be isolated that prepare enantiomerically pure L- and D-amino acids in high yield by deracemization of racemic starting materials (enantiomeric excess and product yields over 99%).
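The target of "enantiomeric excess over 99%" can be made concrete with a short calculation. The sketch below uses the standard definition ee = |major - minor| / (major + minor); the function names are hypothetical and introduced only for illustration. Note also that a classical kinetic resolution of a racemate cannot exceed a 50% yield of one enantiomer, whereas deracemization can in principle approach 100%, which is why the text asks for both high ee and high yield.

```python
# Minimal sketch: enantiomeric excess (ee) from the amounts of two enantiomers.
# Function names are hypothetical; the definition of ee is the standard one.


def enantiomeric_excess(major_mol: float, minor_mol: float) -> float:
    """Return the ee in percent from the amounts of the two enantiomers."""
    total = major_mol + minor_mol
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(major_mol - minor_mol) / total


def major_fraction_for_ee(ee_percent: float) -> float:
    """Return the mole fraction of the major enantiomer that gives this ee."""
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must lie between 0 and 100 percent")
    return (1.0 + ee_percent / 100.0) / 2.0


if __name__ == "__main__":
    # A product that is 99.5% L and 0.5% D.
    print(enantiomeric_excess(99.5, 0.5))
    # Purity of the major enantiomer needed to reach 99% ee.
    print(major_fraction_for_ee(99.0))
```

An ee of 99% therefore corresponds to a product containing about 99.5% of the desired enantiomer.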
5.4.7 Rare Conversion with Flavonoids and Steroids: Drug Synthesis
Flavonoids are among the most ubiquitous phenolic compounds found in nature [80]. They are well-known antioxidants and metal ion-chelators. As integral constituents of the diet, flavonoids may exert a wide range of beneficial effects on human health, including protection against cardiovascular diseases and certain forms of cancer. Recent studies have shown diverse physiological and pharmacological activities of these natural compounds, such as estrogenic, antilipoperoxidant, antitumor, antiplatelet, antiviral, antifungal, antibacterial, antimicrobial, antihemolytic, anti-ischemic, antiallergic, and anti-inflammatory effects. The biological effect is mediated by their free radical-scavenging antioxidative activities and metal ion-chelating abilities. These properties may facilitate the inhibition of certain enzymes such as lipoxygenases, cyclooxygenases, monooxygenases, xanthine oxidases, aldose reductase, etc. It is therefore important to be able to increase or decrease the biological activities of flavonoids by enzymatic transformations. Biotransformations of numerous flavonoids, catalyzed mainly by microbes and a few plant enzymes, have been described in four different flavonoid classes: chalcones, isoflavones, catechins, and flavones. These biotransformations represent a variety of reactions including condensation, cyclization, hydroxylation, dehydroxylation, alkylation, O-dealkylation, halogenation, reduction, deglycosidation, methylation, dehydrogenation, double-bond reduction, carbonyl reduction, glycosylation, sulfation, dimerization, and different types of ring degradations, conjugations, and reductions [80].
Future advances should be made in the discovery of enzymes able to efficiently separate racemic flavones and/or introduce chiral centers or functional groups, so as to obtain derivatives that may be used as valuable synthons. Candidate enzymes include benzaldehyde lyase (BAL, EC 4.1.2.38), biphenyl dioxygenase, dihydrodiol dehydrogenase, phenylalanine/tyrosine ammonia lyase, 4-coumarate:coenzyme A (CoA) ligase, chalcone synthase, cinnamate hydroxylase, chalcone isomerase, methyltransferase, biphenyl-2,3-dihydrodiol-2,3-dehydrogenase, O-demethylase, tyrosinase, and horseradish peroxidase [80]. Steroids, another important class of compounds, are commonly used to treat ailments such as arthritis, skin disorders, and adrenal insufficiency. The therapeutic value of steroids provides the impetus to search for ways of producing them that are more efficient and cost effective than traditional chemical synthesis. One of the critical conversions is stereo- and regiospecific steroid hydroxylation [81]. These hydroxylase biotransformations are catalyzed by members of the cytochrome P450 (CYP) superfamily. It has been reported that in microbes, plants,
and mammals, various CYPs are also associated with steroid biosynthesis and conversions. In the future many efforts should be directed to the development of new routes for the synthesis of novel steroid/flavonoid drugs.
5.4.8 Rare Conversion of Xenobiotics: Potential Biodegradation Pathways
In the last few years, enzymatic bioremediation has emerged as an attractive alternative to support the biotreatment techniques currently available, since enzymes are simpler systems than a whole organism [37]. Most xenobiotics can be subjected to enzymatic bioremediation, e.g., polycyclic aromatic hydrocarbons, polynitrated aromatic compounds, pesticides such as organochlorine insecticides, bleach-plant effluent, synthetic dyes, polymers, and wood preservatives (creosote, pentachlorophenol). It is a matter of looking for microorganisms able to feed on a particular pollutant and then focusing the effort on finding out which enzyme or enzymes are behind the particular biodegrading phenotype. Historically, the most studied enzymes in bioremediation are bacterial mono- or di-oxygenases, reductases, dehalogenases, cytochrome P450 monooxygenases, enzymes involved in lignin metabolism (basically, laccases, lignin peroxidases, and manganese peroxidases from white-rot fungi), and bacterial phosphotriesterases [37,82]. Although many compounds can be converted efficiently to nontoxic derivatives, for other chemicals no degrading enzyme has yet been found. This is probably because the existing enzymes have not had enough time to evolve to tackle the non-natural or poorly degraded xenobiotics, including insecticides, herbicides, fungicides, and mycotoxins, that have been introduced into the environment in the last decades [75]. The degradation of these novel compounds may represent an example of rare conversions. The inability of natural microorganisms to fully mineralize many synthetic chemicals is mainly caused by the lack of enzymes that carry out just one or two critical steps in a catabolic pathway. This is a critical issue in the mineralization of low-molecular-weight halogenated compounds.
Bacterial dehalogenases catalyze the cleavage of carbon-halogen bonds, which is a key step in the mineralization pathways of many halogenated compounds that occur as environmental pollutants. To date, no enzymes have been found that convert environmental chemicals such as chloroform, trichloroethylene, 1,1,1-trichloroethane, 1,2-dichloropropane, and 1,2,3-trichloropropane. Another example is the herbicide mesotrione [2]; to date, no bacteria have been described that biotransform this compound. These are just a few examples, and other artificial compounds newly introduced into the environment will surely pose a similar challenge, since nature may not have had enough time to evolve novel enzymes to tackle them. However, the low substrate specificity of enzymes that accept compounds similar to the “new” pollutants leaves a good chance that these compounds may be fully mineralized under given environmental conditions.
5.4.9 Rare Routes for Synthesis of Hormones and Vitamins: Rare Carboxylations and Hydroxylations
Lipid hormones represent chemically distinct classes of molecules mediating a multitude of essential effects in vertebrates and mammals, including control of development, metabolism, reproduction, electrolyte balance, cardiovascular tone, regulation of the immune system, and inflammatory response [81]. The most important compounds belong to the steroids, thyroid hormones, and retinoids. Hormones, steroids, glucocorticoids and mineralocorticoids, androgens and estrogens, progestins, neurosteroids, vitamins, retinoids, thyroid hormones, and arachidonic acid derivatives have long been known to act as signaling molecules. A relevant conversion to be applied to such compounds is the regiospecific introduction of carbonyl or hydroxyl groups that are responsible for the synthesis of different enantiomers and stereoisomers. A common strategy for obtaining enzymes for the regio- and enantiospecific oxygenation of hydrocarbons is the evaluation of biodegradative pathways in microbial isolates obtained by selective enrichment. The conversion of the majority of lipid and steroid hormones involves short-chain dehydrogenases/reductases or aldo-keto
reductases. Additional enzymes belong to different classes of oxidoreductases, e.g., cytochrome P450. However, many enzymes and isoforms involved in the conversion of these and other non-classical signaling molecules are waiting to be identified, and it would not be surprising if the principle of a new enzymatic modulation of local effects were extended to further classes of signaling molecules.
5.4.10 Cofactor-Dependent Enzymes and Cofactor Regeneration
When performing an enzymatic conversion, it is necessary to ensure that the enzyme is kept under optimal conditions during the operation. Enzyme production must be low in cost (e.g., through efficient heterologous expression), and the catalyst should preferably exhibit a high substrate affinity (Km in the micromolar range) and support thousands of substrate turnovers per second. At the same time, the enzymes should display a certain robustness toward external factors and a low dependency on expensive redox cofactors (i.e., NAD(P)H), which would be prohibitive in a commercial setting [83]. Enzymatic oxidoreductions remain important synthetic processes, and the development of cofactor regeneration systems may help to solve the problem of enzyme applicability at preparative or industrial scale and ultimately lead to the design of new oxidative-reductive conversions. Cofactor regeneration is also synthetically advantageous, as it drives the reaction to completion, prevents the accumulation of inhibitory cofactor by-products, simplifies the reaction work-up, and increases enantioselectivity [83]. Several cofactors can be recycled effectively, including nucleoside triphosphates such as ATP in phosphoryl transfer reactions, nicotinamide adenine dinucleotide and its 2′-phosphate (NAD and NADP) in oxidoreductions, acetyl-CoA in acyl transfer reactions, 3′-phosphoadenosine-5′-phosphosulphate (PAPS) in the formation of sulphate esters, and sugar nucleotides in glycosyl transfer reactions; others, however, require a combination of regeneration systems in which the conversion of a secondary product leads to cofactor regeneration (i.e., sugar nucleotides, such as thymidine diphosphate, and NAD(P)H for oxygenases) [34].
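The requirement for a Km in the micromolar range can be illustrated with the Michaelis-Menten rate law, v = kcat[E][S] / (Km + [S]). The chapter does not state this equation; it is assumed here as the standard single-substrate model, and the numerical values below are hypothetical. The sketch shows how close an enzyme runs to its maximal rate at a given substrate concentration.

```python
# Minimal sketch of single-substrate Michaelis-Menten kinetics.
# Assumption: the enzyme follows v = kcat * [E] * [S] / (Km + [S]).
# All numerical values used below are hypothetical illustrations.


def michaelis_menten_rate(substrate_molar, km_molar, kcat_per_s, enzyme_molar):
    """Return the reaction rate in mol per litre per second."""
    if substrate_molar < 0 or km_molar <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return kcat_per_s * enzyme_molar * substrate_molar / (km_molar + substrate_molar)


def fraction_of_vmax(substrate_molar, km_molar):
    """Return v / Vmax, which depends only on [S] and Km."""
    if substrate_molar < 0 or km_molar <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_molar / (km_molar + substrate_molar)


if __name__ == "__main__":
    substrate = 1e-3  # 1 mM substrate, hypothetical process concentration
    for km in (10e-6, 1e-3, 10e-3):  # Km of 10 uM, 1 mM and 10 mM
        print(f"Km = {km:.0e} M -> v/Vmax = {fraction_of_vmax(substrate, km):.2f}")
```

Under these assumptions, an enzyme with a micromolar Km stays near its maximal rate even as the substrate is depleted, whereas one with a millimolar Km slows down well before the conversion is complete.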
5.4.11 Enantiomerically Pure Compounds by Rare Domino and MCRs
An important field of research on rare conversions is the exploration and development of new domino reactions and MCRs [43,84]. Unlike the usual stepwise formation of individual bonds in a target molecule, the most attractive attribute of MCRs is the inherent formation of several bonds in one operation (optimally, using a single promiscuous enzyme) without changing reaction conditions or adding reaction intermediates. This allows waste production and energy use to be minimized. An example of this kind of reaction is the enzyme-catalyzed kinetic resolution, which yields the corresponding enantiomers with high enantioselectivity [43]. Another example may be the conversion of rare penicillins for the biosynthesis of medically important antibiotics, i.e., cephalosporins [85]. It will be important to identify bifunctional enzymes mediating oxidative, ring expansion, and hydroxylation reactions in the presence of appropriate substrates. Here, the “catalytic plasticity” of enzymes such as deacetoxycephalosporin/deacetylcephalosporin C synthase and oxygenase enzymes will be found more productively through metagenome screening than by site-directed mutagenesis, DNA shuffling, or error-prone PCR. The latter strategies allow only a limited number of amino acid substitutions, whereas metagenomes deliver proteins with a low level of sequence similarity to known enzymes and thus different reaction mechanisms, including substrate and reaction selectivity.
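The "high enantioselectivity" of an enzyme-catalyzed kinetic resolution mentioned above is commonly expressed as the enantiomeric ratio E. The chapter does not give a formula; the sketch below assumes the widely used relation for an irreversible resolution, E = ln[(1 - c)(1 - ee_s)] / ln[(1 - c)(1 + ee_s)], where c is the conversion and ee_s the enantiomeric excess of the remaining substrate (both as fractions). The function name and the example values are hypothetical.

```python
# Minimal sketch: enantiomeric ratio E of an irreversible kinetic resolution,
# computed from conversion c and substrate ee (both given as fractions).
# Assumption: E = ln[(1 - c)(1 - ee_s)] / ln[(1 - c)(1 + ee_s)].
import math


def e_value_from_substrate(conversion: float, ee_substrate: float) -> float:
    """Return E from the conversion and the ee of the unreacted substrate."""
    if not 0.0 < conversion < 1.0:
        raise ValueError("conversion must lie strictly between 0 and 1")
    if not 0.0 <= ee_substrate < 1.0:
        raise ValueError("ee_substrate must be at least 0 and below 1")
    denominator_arg = (1.0 - conversion) * (1.0 + ee_substrate)
    if denominator_arg >= 1.0:
        # Such a data pair is not consistent with an irreversible resolution.
        raise ValueError("ee_substrate is too high for this conversion")
    numerator = math.log((1.0 - conversion) * (1.0 - ee_substrate))
    return numerator / math.log(denominator_arg)


if __name__ == "__main__":
    # Hypothetical measurement: 50% conversion, remaining substrate at 90% ee.
    print(round(e_value_from_substrate(0.50, 0.90), 1))
```

An unselective catalyst gives E = 1 (the substrate ee stays at zero at any conversion), and larger values indicate better discrimination between the enantiomers.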
5.5 How Can New Catalytic Functions be Achieved?
The overview of the conversions achieved to date prompts the question: are we ready to implement the ideal enzyme for rare conversions? We conclude that many of the necessary molecular and screening technologies are in place and that there is evidence that a successful acquisition of new mono- and
[Figure 5.4: image not reproduced. Recoverable panel labels: DNA extraction, DNA digestion, ligation, and transformation into an expression host (e.g., E. coli); mRNA; enzymes; purification; 99% uncultured; fast growth; low growth. Plot: possible expansion of enzyme-based technologies in chemical synthesis (10% and 20% marked) against year (2005 and 2010 marked).]
Figure 5.4 Principal steps in the construction of metagenomic libraries and enzyme screening and putative progression of enzymic conversion in industry.
multifunctional biocatalysts for fully optimized “rare” conversions may be available in the near future. In this respect, metagenomics has revolutionized the possibilities of biocatalysis, since we now have access to the genomes, genes, and encoded enzymatic activities of unculturable microorganisms (see Figure 5.4) [20]. This molecular technique gives access to the biodiversity of the environment, with the objective of retrieving novel enzymes for bioconversions. The success of metagenomics in finding new enzymatic activities has unequivocally demonstrated the power of the approach: metagenomics is no longer a future opportunity but a current reality for the discovery and development of products and processes that were, until recently, hidden from us. Moreover, many studies in past years have focused on finding bioproducts, particularly enzymes, from extremophiles. Extreme environments are populated with microbes that have been physically isolated from other habitats on the planet for thousands of years. This isolation may have resulted in the selection of unusual organisms and probably prevented their dispersal, so these environments are expected to yield novel microbial diversity and unknown cellular gene products with interesting properties and new catalytic activities. Metagenomics of the microbial communities inhabiting these extreme environments may therefore be important for understanding global biogeochemical cycles and their potential for biotechnology [86–88]. Independently of, and in parallel to, metagenomics, protein engineering based on site-directed mutagenesis and directed evolution has contributed significantly to our understanding of enzyme catalysis during the past 20 years and has led to the development of enzyme variants with modified properties for synthetic transformations [4].
Although directed evolution can yield new enzymes with altered substrate specificity, enantioselectivity, protein topology, thermal stability, and tolerance to organic solvents, this strategy has a few hindrances: the requirement for highly efficient high-throughput screening assays, the very large numbers of possible mutant variants, and the
problems of selecting the best enzyme for the next mutagenesis/panning round, as the accumulation of subtle changes at the genetic level does not always lead to the best enzymatic fitness. Routinely, to obtain a new product, one starts with an enzyme known to catalyze a specific type of reaction, optimizes the reaction conditions, and further improves the catalyst through directed evolution and the protein engineering cycle. The future, in contrast, lies in providing new biocatalytic methods by direct screening for new enzymes and reactions, which can be pursued if the reaction is sufficiently important.
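The "very large numbers of possible mutant variants" that hinder directed evolution can be quantified with simple combinatorics. For a protein of N residues, the number of variants carrying exactly k amino acid substitutions is C(N, k) x 19^k, assuming 20 proteinogenic amino acids at every position. The sketch below is an illustration only; the function name and the 300-residue example are hypothetical and not taken from the chapter.

```python
# Minimal sketch: size of the sequence space explored by directed evolution.
# Assumption: 20 possible amino acids per position, so 19 alternatives each.
from math import comb


def variants_with_k_substitutions(length: int, k: int, alphabet: int = 20) -> int:
    """Return the number of variants with exactly k substituted positions."""
    if length < 0 or k < 0 or k > length:
        raise ValueError("k must lie between 0 and the protein length")
    # Choose which k positions change, then one of (alphabet - 1) residues each.
    return comb(length, k) * (alphabet - 1) ** k


if __name__ == "__main__":
    protein_length = 300  # hypothetical enzyme of 300 residues
    for k in range(1, 5):
        count = variants_with_k_substitutions(protein_length, k)
        print(f"{k} substitution(s): {count:.3e} variants")
```

Even for two or three simultaneous substitutions, the count exceeds what most plate-based assays can cover, which is why the screening assay and the choice of parent for each round become the limiting factors.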
5.6 Recent Advances in Metagenomics: The Untapped Reservoir of Proteins from Unculturable Microbes
Natural environments contain vast numbers of microorganisms. As an example, a gram of soil contains up to 10^9 bacteria, with perhaps 10,000 different species [17,18,89]. The difficulties of cultivating microorganisms from those environments exclude the majority of the microbial community from a functional analysis of their genes and from subsequent use of the microbial gene products, e.g., proteins, enzymes, etc. [90]. Considering the estimate that >99% of the microorganisms in most environments are not amenable to culturing [12], does this enormous phylogenetic diversity really translate into new biomolecules (biocatalysts) with novel underlying mechanisms? The answer is “yes,” because there is a high probability of finding novel microbial products, such as antibiotics and enzymes, in uncultivable microbes [20]. For instance, in every newly sequenced genome of an individual microorganism, up to 30–50% of the genes code for proteins of as yet unknown function [39]. Furthermore, environmental genomes yield an unprecedented number of genes coding for peptides with low sequence similarity to known proteins, with new structures and new catalytic properties [91]. Thus, to access this tremendous metabolic space of uncultured organisms, a tool is required to harvest and functionally analyse the genetic diversity of the environment. Metagenomics appears to be a suitable tool for this.
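To give a sense of scale for harvesting this diversity in a clone library, the number of clones needed can be estimated with the Clarke-Carbon relation, N = ln(1 - P) / ln(1 - I/G), where I is the insert size, G the total size of the (meta)genome, and P the probability that a given sequence is present. This relation is not given in the chapter. The insert size, the genome size, and the assumption that all species are equally abundant are hypothetical simplifications used only for illustration.

```python
# Minimal sketch: Clarke-Carbon estimate of the library size needed to
# contain a given sequence with probability P.
# Assumptions (hypothetical): 40 kb inserts, 4 Mb genomes, and equal
# abundance of all species in the sample.
import math


def clones_needed(insert_bp: float, genome_bp: float, probability: float) -> int:
    """Return the number of clones N = ln(1 - P) / ln(1 - I/G), rounded up."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if not 0.0 < insert_bp < genome_bp:
        raise ValueError("insert must be smaller than the genome")
    fraction = insert_bp / genome_bp
    # log1p keeps precision when the insert is a tiny fraction of the genome.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


if __name__ == "__main__":
    insert = 40_000        # hypothetical insert size in base pairs
    one_genome = 4_000_000  # hypothetical average bacterial genome size
    print("single genome:", clones_needed(insert, one_genome, 0.99))
    # 10,000 species (figure quoted in the text), equal abundance assumed.
    print("soil metagenome:", clones_needed(insert, 10_000 * one_genome, 0.99))
```

Because real communities are far from evenly distributed, the true number of clones needed to capture genes from rare members would be considerably larger than this estimate.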
Although future advances in metagenomics will require new screening techniques and the development of new expression systems [20], it has been shown that the existing screening techniques allow the discovery of various biocatalysts, among them lipases/esterases, β-lactamases, proteases, nitrilases, polysaccharide-modifying enzymes (including agarases, cellulases, α-amylases, xylanases, 1,4-α-glucan branching enzymes, and pectate lyases), oxidoreductases and dehydrogenases, enzymes involved in the biosynthesis of antibiotics and vitamins and, more recently, enzymes involved in the catabolism of aromatic hydrocarbons [20,92–102]. Moreover, recent developments have demonstrated the efficiency of mining metagenome libraries in the search for novel microbial bioactives, anticancer drugs, and novel antibiotics, as well as secondary metabolites and the genes responsible for them (i.e., polyketide synthases) [103–106]. The generation and analysis of (meta)genomic libraries is thus a powerful approach to harvesting and archiving environmental genetic resources for answering the questions “what organisms are there?”, “what are they doing?”, and finally “how can their genetic information be beneficial to humans?” Whatever the case, the natural diversity of enzymes makes their further optimization for specific processes a real possibility [20]. One of the main approaches to detecting novel biocatalysts in metagenome libraries is functional screening. This requires gene expression in a heterologous host, usually Escherichia coli. This common expression host has been predicted to be able to express successfully up to 40% of the genes whose sequences are available in the public databases, which is a surprisingly high number [107].
Alternative hosts for library construction and screening that have different expression capabilities, such as Pseudomonas putida, Bacillus sp. (Firmicutes), Streptomyces lividans (Actinobacteria), or Rhizobium leguminosarum (Alphaproteobacteria), are currently under development [108,109]. Independently of the host used for cloning and expression of metagenomic DNA, function-based screening for a particular conversion seems to be the best option for finding new enzymes and the corresponding reactions. This is of special interest because proteins and/or enzymes belonging to a superfamily are evolutionarily related and share more than 50% sequence similarity with each other; the sequence similarity suggests common structural features and sometimes functional similarities [35]. For this reason, the identification of functional motifs by sequence comparison and the use of these motifs for PCR amplification and
gene probing in DNA libraries may not allow the discovery of unusual catalysts that possess novel structural and catabolic features (an important issue for discovering enzymes involved in rare conversions). Moreover, the retrieval of DNA from environmental samples provides access to the genomes of not yet cultured bacteria and archaea that possibly perform specific rare conversions. Taken together, the advances in (meta)genomics and the development of screening techniques and post-(meta)genomic techniques make it possible to find new genes encoding metabolic pathways or rare conversions that were until now hidden from us [110]. The high success rate of activity-based mining of metagenomes for new enzymatic activities has unequivocally demonstrated the importance of microbial diversity in the discovery of new enzymes and underlines the necessity of mining different environments to capture new activities potentially applicable in biotechnology. Good examples are the metagenomic studies performed on marine microbial communities (such as those of the subtropical Pacific, Antarctic waters, the Sargasso Sea, and the Mediterranean Sea), the Eastern Snake River Plain aquifer, 10,000-year-old cold-seep sediments of the Edison seamount, aquatic thermal environments, alkaline loessian soil, groundwater contaminated with hydrocarbons, the unusual extreme environments represented by the deep-sea hypersaline anoxic basins (DHABs) of the Eastern Mediterranean, soil communities, waste water treatment plants, and microbial communities associated with sponges, forest, agricultural, grassland, and alpine soils, freshwater, sediments, and deep-sea waters, as well as those from tissues and digestive tracts of terrestrial and marine animals [95,96,98,100,102,105,106,111–115]. Of special interest are extreme environments where the enzymatic repertoire is the result of natural selection under the existing extreme conditions.
This will not only lead to the discovery of as yet unknown enzymatic activities and reveal novel molecular structures and biochemistries, but will also provide an understanding of the mechanistic basis of life under the most hostile conditions on Earth. The analysis of these environments through metagenomics has revealed some interesting data. For example, through the metagenome approach we have recently discovered enzymes with novel adaptive tertiary-quaternary structures that maintain functionality over the steep physico-chemical gradients that characterize the DHABs (Eastern Mediterranean Sea), and that have high activities, enantioselectivity, and an unusual tolerance of polar solvents and reducing agents, which make them interesting for applications (Figure 5.5a) [113]. Therefore, one can suggest that if these enzymes are a proxy for other enzymes and other metabolic activities in the DHABs, then considerable new microbial diversity and novel biological activities and mechanisms may be discovered in these fascinating habitats. In addition, the combination of metagenomics and enzyme evolution has been successfully applied to the discovery of the first lipase acting on the sn-2 positions of triglycerides (Figure 5.5b) [114,116]. This unusual and rare conversion may open new perspectives for the creation of essential triglycerides containing long-chain PUFAs that are vital for human health. Currently, the “ideal” process for the synthesis of triglycerides with ω-3 and ω-6 families of PUFAs in the sn-2 position is close to being implemented in industry. We also recently reported the retrieval, from a metagenome expression library of bovine rumen, and the characterization of a new polyphenol oxidase with laccase activity [91].
This laccase is unusual in two respects: (1) it lacks any sequence relatedness to the known laccases but belongs to a large protein family of domain of unknown function (DUF152), which so far contains about 750 database entries, and it thus represents the first functionally characterized member of this new laccase family; (2) it exhibits much higher activity and substrate affinities than the laccases described thus far (Figure 5.5c). Another good example is the discovery, using a metagenomic approach, of enzymes in uncultured archaea associated with rice fields. In a recent study, a unique set of enzymes has been described with capabilities for carbohydrate metabolism and assimilatory sulfate reduction that are unknown among methanogens, together with a set of antioxidant enzymes and oxygen-insensitive enzymes that provide a selective advantage over those from other methanogens [117]. The combination of enzyme activity and biochemical properties may be interpreted from an ecological point of view, explaining the prevalence of one organism in a particular environment. Metagenomics is thus a multidisciplinary approach requiring the expertise of microbiologists, enzymologists, molecular biologists, engineers, ecologists, etc. Since the initial implementation of
[Figure 5.5, panels (a) to (c): enzyme structures with labeled active-site residues and reaction schemes. Column headings: Structural features; Potential in bioconversions. Labels include: thio-esterase domain; carboxyl-esterase domain; E: 126 ± 3; eep: 99%; conversion: 48%.]
Figure 5.5 Examples of enzymes isolated from metagenomic libraries that belong to known protein families and possess novel structural and biocatalytic features. (a) Esterase isolated from DHABs that contains three catalytic serines mediating different activities in two domains and is a good candidate for the synthesis of chiral synthons (e.g., enantiomeric resolution of (±)-solketal) (From Ferrer, M., et al., Chem. Biol., 12, 895, 2005a). (b) Conversion of a true carboxyl-esterase from rumen into a highly efficient triacylglycerol lipase with sn-2 positional specificity and high potential for the synthesis of structured lipids (From Ferrer, M., et al., Env. Microbiol., 7, 1996, 2005b and Gill, S.R., et al., Science, 312, 1355, 2006). (c) Polyphenol oxidase mined from rumen: this laccase is unusual in that it lacks any sequence relatedness to the known laccases and exhibits much higher activity and substrate affinities (left side products) than the laccases described thus far. (From Beloqui, A., et al. J. Biol. Chem., 281, 22933, 2006. With permission.)
metagenomics, novel enzymes have been recovered from many environments, suggesting that a great deal of new enzymatic diversity is still awaiting discovery. Are we ready to implement the ideal enzyme to resolve rare conversions? We conclude that many of the necessary molecular and screening technologies are in place, and there is evidence that new mono- and multifunctional biocatalysts for fully optimized “rare” conversions may become available in the near future by harvesting diversity through Nature.
Acknowledgments
This research was supported by European Community Projects EVK3-2000-00042 “BIODEEP”, EVK3-2002-00077 “COMMODE”, and MERG-CT-2004-505242 “BIOMELI”. P.G. thanks the BMBF GenoMik initiative. M.F. thanks the Spanish Ministerio de Ciencia y Tecnología for a Ramón y Cajal contract. The authors also thank ViaLactia Biosciences Ltd. (New Zealand).
References
1. O’Hagan, D., et al. Biosynthesis of an organofluorine molecule. Nature, 416, 279, 2002.
2. Durand, S., et al. First isolation and characterization of a bacterial strain that biotransforms the herbicide mesotrione. Lett. Appl. Microbiol., 43, 222, 2006.
3. Urbanczik, R. SNA – a toolbox for the stoichiometric analysis of metabolic networks. BMC Bioinformatics, 7, 129, 2006.
4. Burton, S.G., Cowan, A., and Woodley, J.M. The search for the ideal biocatalyst. Nature Biotechnol., 20, 37, 2002.
5. Omura, S. The Search for Bioactive Compounds from Microorganisms. Springer-Verlag, New York, NY, 1992.
6. Cutler, H.G. and Cutler, S.J. Biologically Active Natural Products: Agrochemicals. CRC Press, Boca Raton, FL, 1999.
7. Çelik, A., Speight, R.E., and Turner, N. Identification of broad specificity P450CAM variants by primary screening against indole as substrate. Chem. Commun., 3652, 2005.
8. Ettema, T.J., de Vos, W.M., and van der Oost, J. Discovering novel biology by in silico archaeology. Nature Rev., 3, 859, 2005.
9. David, H., Akesson, M., and Nielsen, J. Reconstruction of the central metabolisms of Aspergillus niger. Eur. J. Biochem., 270, 4243, 2003.
10. Nelson, K.E., et al. Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature, 399, 323, 1999.
11. She, Q., et al. The genome sequence of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. USA, 98, 7835, 2001.
12. Fütterer, O., et al. Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc. Natl. Acad. Sci. USA, 101, 9091, 2004.
13. Bull, A.T., Goodfellow, M., and Slater, J.H. Biodiversity as a source of innovation in biotechnology. Ann. Rev. Microbiol., 46, 219, 1992.
14. Sogin, M.L., et al. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc. Natl. Acad. Sci. USA, 103, 12115, 2006.
15. Sebat, J.L., Colwell, F.S., and Crawford, R.L. Metagenomics profiling, microarray analysis of an environmental library. Appl. Environ. Microbiol., 69, 4927, 2003.
16. Schloss, P.D. and Handelsman, J. Biotechnological prospects from metagenomics. Curr. Opin. Biotechnol., 14, 303, 2003.
17. Cowan, D.A., et al. Metagenomics, gene discovery and the ideal biocatalyst. Biochem. Soc. Trans., 32, 298, 2004.
18. Handelsman, J. Metagenomics: applications of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev., 68, 669, 2004.
19. Riesenfeld, C.S., Schloss, P.D., and Handelsman, J. Metagenomics, genomic analysis of microbial communities. Annu. Rev. Genet., 38, 525, 2004.
20. Ferrer, M., Martínez-Abarca, F., and Golyshin, P. Genome and “metagenome” mining for novel catalysts. Curr. Opin. Biotechnol., 16, 588, 2005c.
21. Yun, J. and Ryu, S. Screening for novel enzymes from metagenomes and SIGEX, as a way to improve it. Microb. Cell Fact., 25, 8, 2005.
22. Venter, J.C., et al. Environmental genome shotgun sequencing of the Sargasso Sea. Science, 304, 66, 2004.
23. Horowitz, N.J. On the evolution of biochemical synthesis. Proc. Natl. Acad. Sci. USA, 31, 153, 1945.
24. Leopoldseder, S., Claren, J., Jürgens, C., and Sterner, R. Interconverting the catalytic activities of (βα)-barrel enzymes from different metabolic pathways: sequence requirements and molecular analysis. J. Mol. Biol., 337, 871, 2004.
25. Jensen, R.A. Enzyme recruitment in evolution of new function. Annu. Rev. Microbiol., 30, 409, 1976.
26. Wise, E. Homologous (β/α)8-barrel enzymes that catalyze unrelated reactions: orotidine 5′-monophosphate decarboxylase and 3-keto-l-gulonate 6-phosphate decarboxylase. Biochemistry, 41, 3861, 2002.
27. Bornscheuer, U.T. and Kazlauskas, R.J. Catalytic promiscuity in biocatalysis: using old enzymes to form new bonds and follow new pathways. Angew. Chem. Int. Ed. Engl., 43, 6032, 2004.
28. Armor, J.N. Striving for catalytically green processes in the 21st century. Appl. Catal. A: Gen., 189, 153, 1999.
29. Sheldon, R.A. and van Rantwijk, F. Biocatalysis for sustainable organic synthesis. Aust. J. Chem., 57, 281, 2004.
30. Azerad, R. Chemical biotechnology – better enzymes for green chemistry. Curr. Opin. Biotechnol., 12, 533, 2001.
31. Anastas, P. and Williamson, T. Green Chemistry. Theory and Practice. Oxford University Press, Oxford, 1998.
32. Davis, B.G. and Boyer, V. Biocatalysis and enzymes in organic synthesis. Nat. Prod. Rep., 18, 618, 2001.
33. Klibanov, A.M. Improving enzymes by using them in organic solvents. Nature, 409, 241, 2001.
34. Koeller, K.M. and Wong, C.H. Enzymes for chemical synthesis. Nature, 409, 232, 2001.
35. Babbitt, P.C. and Gerlt, J.A. Advances in Protein Chemistry. Academic Press, San Diego, CA, 2001, 55, 1.
36. Leonardo, E.J., et al. Green chemistry – the 12 principles of green chemistry and its insertion in the teach and research activities. Quim. Nova, 26, 123, 2003.
37. Alcalde, M., Ferrer, M., Plou, F.J., and Ballesteros, A. Environmental biocatalysis: from remediation with enzymes to novel green processes. Trends Biotechnol., 24, 281, 2006.
38. Schmid, A., et al. Industrial biocatalysis today and tomorrow. Nature, 409, 258, 2001.
39. Bode, H.B. and Müller, R. The impact of bacterial genomics on natural product research. Angew. Chem. Int. Ed. Engl., 44, 6828, 2005.
40. Cao, L. Immobilised enzymes: science or art. Curr. Opin. Chem. Biol., 9, 217, 2005.
41. Glieder, A., et al. Comprehensive step-by-step engineering of an (R)-hydroxynitrile lyase for large-scale asymmetric synthesis. Angew. Chem. Int. Ed. Engl., 42, 4815, 2003.
42. Lee, S.Y., et al. System biotechnology for strain improvement. Trends Biotechnol., 23, 349, 2005.
43. Strübing, D., et al. Synthesis of enantiomerically pure cyclohex-2-en-1-ols, development of novel multicomponent reactions. Chem. Eur. J., 11, 4210, 2005.
44. Tietze, L.F. Domino reactions in organic synthesis. Chem. Rev., 96, 115, 1996.
45. Arpigny, J.L. and Jaeger, K.E. Bacterial lipolytic enzymes: classification and properties. Biochem. J., 343, 177, 1999.
46. Ballesteros, A., et al. Enzymatic synthesis of sugar esters and oligosaccharides from renewable resources. In Biocatalysis in the Pharmaceutical and Biotechnology Industries, R.N. Patel, Ed., CRC Press, Boca Raton, 463, 2007.
47. Phatak, T. and Waldman, H. Enzymes and protecting group chemistry. Curr. Opin. Chem. Biol., 2, 112, 1998.
48. Takayama, S., Lee, S.T., Hung, S-C., and Wong, C-H. Designing enzymatic resolution of amines. Chem. Commun., 127, 1999.
49. Bader, B., et al. Bioorganic synthesis of lipid-modified proteins for the study of signal transduction. Nature, 403, 223, 2000.
50. Sonawane, V.C. Enzymatic modifications of cephalosporins by cephalosporin acylase and other enzymes. Crit. Rev. Biotechnol., 26, 95, 2006.
51. McNaught, A.D. International Union of Pure and Applied Chemistry and International Union of Biochemistry and Molecular Biology—Joint Commission on Biochemical Nomenclature—Nomenclature of carbohydrates (Recommendations 1996) (Reprinted from Pure Appl. Chem., 68, 1919, 1996), 43, 1997.
52. Eggleston, G. and Cote, G.L. Oligosaccharides in food and agriculture. In Oligosaccharides in Food and Agriculture, Eggleston, G., and Cote, G.L., Eds. American Chemical Society, Washington, DC, 2003.
53. Kren, V. and Thiem, J. Glycosylation employing bio-systems: from enzymes to whole cells. Chem. Soc. Rev., 26, 463, 1997.
54. Ajisaka, K. and Yamamoto, Y. Control of the regioselectivity in the enzymatic syntheses of oligosaccharides using glycosidases. Trends Glycosci. Glycotechnol., 14, 1, 2002.
55. Ichikawa, Y., et al. Synthesis of oligosaccharides using glycosyltransferases. J. Synth. Org. Chem. Jpn., 50, 441, 1992.
56. Crout, D.H. and Vic, G. Glycosidases and glycosyl transferases in glycoside and oligosaccharide synthesis. Curr. Opin. Chem. Biol., 2, 98, 1998.
57. Coutinho, P.M. and Henrissat, B. Carbohydrate-active enzymes server at http://afmb.cnrs-mrs.fr/CAZY/, 1999.
58. Planas, A. and Faijes, M. Glycosidases and glycosynthases in enzymatic synthesis of oligosaccharides. An overview. Afinidad, 59, 295, 2002.
59. Ahmed, Z. Production of natural and rare pentoses using microorganisms and their enzymes. J. Biotechnol., 4, 103, 2001.
60. Daines, A.M., Maltman, B.A., and Flitch, S.L. Synthesis and modifications of carbohydrates, using biotransformations. Curr. Opin. Chem. Biol., 8, 106, 2004.
61. Hsu, C-C., et al. Directed evolution of D-sialic acid aldolase to L-3-deoxy-manno-2-octulonic acid (L-KDO) aldolase. Proc. Natl. Acad. Sci. USA, 102, 9122, 2005.
62. Machajewski, T.D. and Wong, C.H. The catalytic asymmetric aldol reaction. Angew. Chem. Int. Ed. Engl., 39, 1352, 2000.
63. Plou, F.J., et al. Glucosyltransferases acting on starch or sucrose for the synthesis of oligosaccharides. Can. J. Chem., 80, 743, 2002.
64. Griffith, B.R., Langenhan, J.M., and Thorson, J.S. ‘Sweetening’ natural products via glycorandomization. Curr. Opin. Biotechnol., 16, 622, 2005.
65. Davies, G.J., Charnock, S.J., and Henrissat, B. The enzymatic synthesis of glycosidic bonds: “Glycosynthases” and glycosyltransferases. Trends Glycosci. Glycotechnol., 13, 105, 2001.
66. Mackenzie, L.F., et al. Glycosynthases: Mutant glycosidases for oligosaccharide synthesis. J. Am. Chem. Soc., 120, 5583, 1998.
67. Malet, C. and Planas, A. From β-glucanase to beta-glucansynthase: glycosyl transfer to alpha-glycosyl fluorides catalyzed by a mutant endoglucanase lacking its catalytic nucleophile. FEBS Lett., 440, 208, 1998.
68. Perugino, G., et al. Recent advances in the oligosaccharide synthesis promoted by catalytically engineered glycosidases. Adv. Synth. Catal., 347, 941, 2005.
69. Mayer, C., et al. Directed evolution of new glycosynthases from Agrobacterium beta-glucosidase: a general screen to detect enzymes for oligosaccharide synthesis. Chem. Biol., 8, 437, 2001.
70. Green, A.G. From alpha to omega-producing essential fatty acids in plants. Nat. Biotechnol., 22, 680, 2004.
71. Trivedi, O.A., et al. Enzymic activation and transfer of fatty acids as acyl-adenylates in mycobacteria. Nature, 428, 441, 2004.
72. Fleith, M. and Clandinin, M.T. Dietary PUFA for preterm and term infants: review of clinical studies. Crit. Rev. Food Sci. Nutr., 45, 205, 2005.
73. Vaillancourt, F.H., et al. Cryptic chlorination by a non-haem iron enzyme during cyclopropyl amino acid biosynthesis. Nature, 436, 1191, 2005.
74. Kühnel, K., Blankenfeldt, W., Terner, J., and Schlichting, I. Crystal structure of chloroperoxidase with its bound substrate and complexed with formate, acetate and nitrate. J. Biol. Chem., 281, 23990, 2006.
75. Janssen, D.B., Dinkla, I.J.T., Poelarends, G.J., and Terpstra, P. Bacterial degradation of xenobiotic compounds: evolution and distribution of novel enzyme activities. Environ. Microbiol., 7, 1868, 2005.
76. Anderson, J.L. and Chapman, S.K. Molecular mechanisms of enzyme-catalysed halogenation. Mol. Biosyst., 2, 350, 2006.
77. Fernandez, M.M., et al. Enzymatic synthesis of peptides containing unnatural amino acids. Enzyme Microb. Technol., 17, 964, 1995.
78. Nam, S-H., Park, H-S., and Kim, H-S. Evolutionary relationship and application of a superfamily of cyclic amidohydrolase enzymes. Chem. Rec., 5, 298, 2005.
79. Fotheringham, I., et al. Preparative deracemization of unnatural amino acids. Biochem. Soc. Trans., 34, 287, 2006.
80. Das, S. and Rosazza, J.P.N. Microbial and enzymatic transformation of flavonoids. J. Nat. Prod., 69, 499, 2006.
81. Nobel, S., Abrahmsen, L., and Oppermann, U. Metabolic conversion as a pre-receptor control mechanism for lipophilic hormones. Eur. J. Biochem., 268, 4113, 2001.
82. Pieper, D.H., Martins Dos Santos, V.A., and Golyshin, P.N. Genomic and mechanistic insights into the biodegradation of organic pollutants. Curr. Opin. Biotechnol., 15, 215, 2004.
83. Wong, C-H. and Whitesides, G.M. Enzymes in Synthetic Organic Chemistry. Pergamon, Oxford, 1994.
84. Dömling, A. and Ugi, I. Multicomponent reactions with isocyanides. Angew. Chem., 39, 3168, 2000.
85. Lloyd, M.D., et al. Controlling the substrate selectivity of deacetoxycephalosporin/deacetylcephalosporin C synthase. J. Biol. Chem., 279, 15420, 2004.
86. Rothschild, L.J. and Mancinelli, R.L. Life in extreme environments. Nature, 409, 1092, 2001.
87. Cavicchioli, R., Siddiqui, K.S., Andrews, D., and Sowers, K.R. Low-temperature extremophiles and their applications. Curr. Opin. Biotechnol., 13, 253, 2002.
88. Nicol, G.W. and Schleper, C. Ammonia-oxidising Crenarchaeote: important players in the nitrogen cycle? Trends Microbiol., 14, 207, 2006.
89. Steele, H.L. and Streit, W.R. Metagenomics: advances in ecology and biotechnology. FEMS Microbiol. Lett., 247, 105, 2005.
90. Boucher, Y. and Doolittle, W.F. Something new under the sea. Nature, 417, 27, 2002.
91. Beloqui, A., et al. Novel polyphenol oxidase mined from a metagenome expression library of bovine rumen: Biochemical properties, structural analysis and phylogenetic relationships. J. Biol. Chem., 281, 22933, 2006.
92. Knietsch, A., et al. Identification and characterization of coenzyme B12-dependent glycerol dehydratase and diol dehydratase-encoding genes from metagenomic DNA libraries derived from enrichment cultures. Appl. Environ. Microbiol., 69, 3048, 2003.
93. Lee, S.W., et al. Screening for novel lipolytic enzymes from uncultured soil microorganisms. Appl. Environ. Microbiol., 65, 720, 2004.
94. Cottrell, M.T., Yu, L., and Kirchman, D.L. Sequence and expression analysis of Cytophaga-like hydrolases in a Western arctic metagenomic library and the Sargasso Sea. Appl. Environ. Microbiol., 71, 8506, 2005.
95. Erwin, D.P., et al. Diversity of oxygenase genes from methane- and ammonia-oxidizing bacteria in the eastern Snake river plain aquifer. Appl. Environ. Microbiol., 71, 2016, 2005.
96. Byun, J.S., et al. Crystallization and preliminary X-ray crystallographic analysis of EstE1, a new and thermostable esterase cloned from a metagenomic library. Acta Crystallograph. Sect. F Struct. Biol. Cryst. Commun., 62, 145, 2006.
97. Rhee, J.K., Ahn, D.G., Kim, Y.G., and Oh, J.W. New thermophilic and thermostable esterase with sequence similarity to the hormone-sensitive lipase family, cloned from a metagenomic library. Appl. Environ. Microbiol., 71, 817, 2005.
98. Shima, S. and Thauer, R.K. Methyl-coenzyme M reductase and the anaerobic oxidation of methane in methanotrophic archaea. Curr. Opin. Microbiol., 8, 643, 2005.
99. Uchiyama, T., Abe, T., Ikemura, T., and Watanabe, K. Substrate-induced gene-expression screening of environmental metagenome libraries for isolation of catabolic genes. Nat. Biotechnol., 23, 88, 2005.
100. Wexler, M., Bond, P.L., Richardson, D.J., and Johnston, A.W.B. A wide host-range metagenomic library from a waste water treatment plant yields a novel alcohol/aldehyde dehydrogenase. Environ. Microbiol., 7, 1917, 2005.
101. Lim, H.K., et al. Characterization of a forest soil metagenome clone that confers indirubin and indigo production on Escherichia coli. Appl. Environ. Microbiol., 71, 7768, 2005.
102. Dumont, M.G., et al. Identification of a complete methane monooxygenase operon from soil by combining stable isotope probing and metagenomic analysis. Environ. Microbiol., 8, 1240, 2006.
103. Ginolhac, A., et al. Phylogenetic analysis of polyketide synthase I domains from soil metagenomic libraries allows selection of promising clones. Appl. Environ. Microbiol., 70, 5522, 2004.
104. Piel, J., Hui, D., Fusetani, N., and Matsunaga, S. Targeting modular polyketide synthases with iteratively acting acyltransferases from metagenomes of uncultured bacterial consortia. Environ. Microbiol., 6, 921, 2004.
105. Schirmer, A., et al. Metagenomic analysis reveals diverse polyketide synthase gene clusters in microorganisms associated with the marine sponge Discodermia dissoluta. Appl. Environ. Microbiol., 71, 4840, 2005.
106. Kim, T.K. and Fuerst, J.A. Diversity of polyketide synthase genes from bacteria associated with the marine sponge Pseudoceratina clavata: culture-dependent and culture-independent approaches. Environ. Microbiol., 8, 1460, 2006.
107. Gabor, E.M., Alkema, W.D.L., and Janssen, D.B. Quantifying the accessibility of the metagenome by random expression cloning techniques. Environ. Microbiol., 6, 879, 2004.
108. Kaneko, S., Akioka, M., Tsuge, K., and Itaya, M. DNA shuttling between plasmid vectors and a genome vector: systematic conversion and preservation of DNA libraries using the Bacillus subtilis genome (BGM) vector. J. Mol. Biol., 349, 1036, 2005.
109. Martinez, A., et al. Genetically modified bacterial strains and novel bacterial artificial chromosome shuttle vectors for constructing environmental libraries and detecting heterologous natural products in multiple expression hosts. Appl. Environ. Microbiol., 70, 2452, 2004.
110. Boubakri, H., Beuf, M., Simonet, P., and Vogel, T.M. Development of metagenomic DNA shuffling for the construction of a xenobiotic gene. Gene, 21, 87, 2006.
111. Gabor, E.M., de Vries, E.J., and Janssen, D.B. Construction, characterization, and use of small-insert gene banks of DNA isolated from soil and enrichment cultures for the recovery of novel amidases. Environ. Microbiol., 6, 948, 2004.
112. Yun, J., et al. Characterization of a novel amylolytic enzyme encoded by a gene from a soil-derived metagenomic library. Appl. Environ. Microbiol., 70, 7229, 2004.
113. Ferrer, M., et al. Novel microbial enzymes mined from the Urania deep-sea hypersaline anoxic basin. Chem. Biol., 12, 895, 2005a.
114. Ferrer, M., et al. Novel hydrolase diversity retrieved from a metagenome library of bovine rumen microflora. Env. Microbiol., 7, 1996, 2005b.
115. Gill, S.R., et al. Metagenomic analysis of the human distal gut microbiome. Science, 312, 1355, 2006.
116. Reyes-Duarte, D., et al. Conversion of a carboxylesterase into a triacylglycerol lipase by a random mutation. Angew. Chem. Int. Ed., 44, 7553, 2005.
117. Erkel, C., Kube, M., Reinhardt, R., and Liesack, W. Genome of rice I archaea – the key methane producers in the rice rhizosphere. Science, 313, 370, 2006.
Balances and Reaction Models
II
Walter M. van Gulik Delft University of Technology
6 Growth Nutrients and Diversity, Joseph J. Heijnen
Introduction • Cell Composition • The Elemental Composition of Biomass Can Be Expressed as a 1 C-mol Formula • Medium Composition • Conclusions
7 Mass Balances, Rates, and Experiments, Joseph J. Heijnen
Introduction • Rates and Mass Balances • Biomass Specific Conversion Rates (q-Rates) • Mathematical Models for the Batch Experiment from Mass Balances and q-Based Kinetics: The Modeling Cycle • Conclusions
8 Data Reconciliation and Error Detection, Peter J.T. Verheijen
Introduction • Statistical Framework—Theory • Optimization Formulation • Linear Constraints • Nonlinear Constraints • Deviations—Reality • Computations and Reconciliation • Biotechnology and Reconciliation • Discussion and Conclusion
9 Black Box Models for Growth and Product Formation, Joseph J. Heijnen
Introduction • Kinetic Simplicity: Achievement of a Single Nutrient Limited Condition by Medium Design • Fermentor Transport Mechanisms as a Tool to Control Extracellular Concentrations and therewith Control the q-Rates: The Chemostat • Black Box Kinetic Functions for qs, qp, µ under Single Nutrient (Substrate) Limited Conditions • The Herbert–Pirt Substrate Distribution Equation • Kinetics of Product Formation • Estimation of the Parameters of the Kinetic Model from Chemostat Experiments • Operational Yields • Conclusions
10 Metabolic Models for Growth and Product Formation, Walter M. van Gulik
Introduction • Modular Approach • Detailed Stoichiometric Models • Conclusions
11 A Thermodynamic Description of Microbial Growth and Product Formation, Joseph J. Heijnen
Introduction • Thermodynamics of Microbial Growth Stoichiometry, max • Thermodynamics of Maintenance • Calculation of the Operational Stoichiometry YDX of a Growth Process at Different Growth Rates, Including Heat Using the Herbert–Pirt Relation for Electron Donor • A Correlation to Estimate the Maximum Specific Growth Rate, µmax • Thermodynamic Prediction of Minimal Concentration Electron Donor and Maximal Concentration of Catabolic Product • Thermodynamics and Stoichiometry of Product Formation • Conclusions
The efficient use of microorganisms for the production of a compound of interest requires not only a high-producing organism but also a properly designed fermentation process for large-scale cultivation of this organism. The design and development of such a fermentation process, as well as the evaluation, troubleshooting, and further improvement of the process after implementation at full scale, require a quantitative approach. In the design phase of the process it is important to be able to make reliable estimates of the maximum yields of biomass and product on the supplied substrate, and of the expected rates of oxygen consumption, carbon dioxide production, and heat dissipation. In the development phase, laboratory and pilot plant scale experiments have to be designed, carried out, and evaluated. Finally, a translation has to be made to a full-scale process, for which the equipment has to be designed and the economic viability has to be assessed. It has been shown that fermentation processes, although they deal with living and self-reproducing organisms, can be approached in a way that is remarkably similar to the way a chemical engineer would approach a chemical process. For fermentation processes too, the engineering approach, i.e., the use of mass and heat balances, mathematical modeling, and process simulation, has proved to be very useful, as it allows a rational design and improvement of the process. For a long time the development and optimization of fermentation processes focused only on the rational improvement of the process conditions, while optimization of the applied microorganism was done through random mutagenesis and selection of high-producing mutants.
Presently, because biochemical and genetic knowledge as well as the available genetic tools have advanced to the point that precise changes can be made to the microbial metabolic network, the improvement of fermentation processes focuses much more on the rational improvement of the microorganism applied in the process. Here too, engineering principles have been advocated in order to arrive at a rational design of microorganisms to serve a specific purpose, as in the exciting research area of metabolic engineering. The chapters in this section of the handbook are mainly devoted to the way engineering principles can be applied to fermentation processes. In Chapter 6 it is shown how the composition of a proper cultivation medium can be obtained in a rational way from the elemental and biochemical composition of the cells and the catabolic reaction used by the microorganism to generate energy. Chapter 7 is devoted to the calculation of conversion rates, which cannot be measured directly, from measurements of concentrations and flows for different cultivation systems. Furthermore, a first attempt is made to set up a mathematical model for a fermentation process by combining the proper mass balances and simple kinetic relations. In Chapter 8 the application of data reconciliation and gross error detection to fermentation processes is treated, with the aim of showing that these techniques are very valuable for checking the consistency of the experimental data, detecting erroneous measurements, and obtaining as much information as possible from the available measurements. In Chapter 9 the basic concepts of black box modeling of fermentation processes are introduced. It is shown that a relatively simple approach, which does not require detailed information on the metabolism of the applied microorganism, is already very useful for the design and optimization of fermentation processes. In Chapter 10 some applications of metabolic stoichiometric models are introduced.
It is shown in this chapter that metabolic models of moderate complexity can be successfully applied to provide a fairly accurate description of changes in the metabolic network structure of the microorganism as a result of changes in growth conditions. It is also shown that such models can be applied, in combination with experimental results, to estimate the ATP stoichiometry of oxidative phosphorylation and the maintenance requirements for a certain microorganism. Incorporating the estimated ATP stoichiometry in such a model allows the prediction of maximum yields of biomass and products for different substrates, substrate mixtures, and metabolic network topologies. In Chapter 11 the application of thermodynamic principles to living systems is introduced. It is shown that thermodynamics and only three correlations can provide quantitative information on the stoichiometry, rates, and thresholds of microbial growth and product formation.
6 Growth Nutrients and Diversity
Joseph J. Heijnen, Delft University of Technology
6.1 Introduction
6.2 Cell Composition
6.3 The Elemental Composition of Biomass Can Be Expressed as a 1 C-mol Formula
6.4 Medium Composition: Organic Compounds • Inorganic Compounds • The Anabolic and Catabolic Reaction Networks • Signaling Molecules
6.5 Conclusions
Further Reading
6.1 Introduction
Living cells such as microorganisms and animal cells have been used for many decades for the production of a multitude of products that cover various application sectors and different sectors of biotechnology. The product volume can vary widely, with orders of magnitude ranging from 1 kg year⁻¹ to 10¹² kg year⁻¹. The factories where these products are made range from small (e.g., a tabletop factory) to extremely large (containing vessels of 100–30,000 m³). In many fermentation processes, a pure culture of a suitable microorganism is used to produce a desired product. This technology is becoming a major production tool in many different application sectors (food, fine and bulk chemicals, pharmaceuticals). In fermentation processes pure microorganisms or even cells of higher organisms (obtained from plants or from certain human or animal tissues) are used to produce a valuable product. The cells themselves are seldom the main product; an exception is, e.g., baker's yeast. An important aspect of fermentation processes is the cultivation medium used. In many cases undefined media are used, often containing waste material from, e.g., the sugar industry (cane or beet molasses), cheese manufacturing (whey), or wet milling of maize (corn steep liquor), and other ingredients like soybean meal, vegetable oils, yeast extract, etc. The advantage of these complex or undefined media is that they are relatively cheap, which is important because the cultivation medium can be a significant cost factor in fermentation processes. The disadvantage is that the exact composition of the ingredients of complex media is often not known, while differences in composition may occur between different batches. This hampers careful attempts to optimize the fermentation process in order to increase the productivity and yield of the desired product.
For this reason several industries have switched to defined cultivation media for certain fermentation processes. The composition of defined cultivation media varies widely and depends strongly on the microorganism or cell culture used, the way the process is operated, and the desired product. A good starting point for the design of a cultivation medium for a certain process is to analyze the biochemical composition of the microorganism or cell culture which will be applied and which components have to be supplied to the medium
because the cells are not able to synthesize these themselves, e.g., amino acids, vitamins, etc. Defined media for the cultivation of microorganisms, and also of plant cells, often contain relatively few compounds (on the order of tens), whereas defined media for the cultivation of animal cells may contain up to hundreds of different compounds. Such complicated media are much more expensive than the simpler ones.
6.2 Cell Composition
All living cells contain water, polymers, and small molecules. The water content of a cell is typically about 70%; the remaining part consists of organic and inorganic molecules. The organic compounds in living cells are mainly present as polymers such as lipids (e.g., in membranes), RNA/DNA (in ribosomes, cell nucleus), proteins (in ribosomes, enzymes, transport proteins, regulatory proteins), and carbohydrates (storage carbohydrates, cell wall). A typical composition (in % of cell organic dry mass) is 1% DNA, 7% RNA, 55% protein, 5% lipid, and 27% carbohydrates, which adds up to 95% of cell organic dry mass. The remaining 5% consists of small organic molecules which are needed as reaction intermediates (e.g., the glycolytic intermediates G6P, F6P, PEP), as cofactors (ATP, ADP, NAD(P)(H), CoA, etc.), and as regulatory factors (hormones, peptides). A very important second class of small organic molecules present in cells are vitamins. Vitamins are molecules which are needed in the active sites of certain classes of enzymes (e.g., biotin in carboxylating enzymes, pyridoxine in transaminating enzymes). Vitamins are also part of cofactor molecules (pantothenate in CoA, nicotinic acid in NAD(P), riboflavin in FAD). The polymers and small molecules contain C, H, O, N, S, and P as major elements.
Question: Which elements are present in each polymer present in biomass?
Answer:
Proteins contain the elements C, H, O, N, S
Lipids contain C, H, O
DNA/RNA contain C, H, O, N, P
Carbohydrates contain C, H, O
Quantification of growth
Growth can be quantified in many different ways, e.g., by measuring total cell volume, cell number, cell wet mass, cell dry mass, or cell organic dry mass. Total cell volume is usually measured as the volume of packed cells.
The packed cell volume can be obtained by centrifugation of a sample of cell broth in a calibrated centrifuge tube and subsequently measuring the volume of the cell pellet. Cell wet mass is obtained by weighing the packed cells. Sometimes the extracellular water is excluded, e.g., by subtracting the estimated volume of water present between the cells. Cell dry mass is obtained by drying a certain amount of wet biomass in an oven at 105°C, which removes both extra- and intracellular water, for as long as is needed to reach a constant weight. Cell organic dry mass is obtained by subtracting the mass of ash from cell dry mass. The mass of ash is obtained by burning the organic fraction of the dry mass in an oven at ≈500°C.
As an example consider a culture of baker’s yeast. The yeast is cultivated in a fermentor as a suspension in the growth medium. This mixture of organisms and aqueous solution is called broth. Typically the packed volume of yeast cells can be as much as 300 ml per liter of broth. This 300 ml of packed cell volume contains the actual wet yeast cells (200 ml) plus approximately 100 ml of aqueous medium in between the cells (similar to the filled and open volume in a dense sphere packing). The 200 ml of yeast cells contains about 140 ml intracellular water (70%) and 60 gram yeast dry weight. The dry weight contains 5 gram ash, leading to an organic yeast mass of about 55 gram (all per liter of broth). The ash contains all inorganic matter present in the cell.
Apart from organic compounds all living cells contain inorganic compounds as well. A significant part of this is S and HPO42- present in protein and DNA/RNA. In addition K+, Mg2+, and Na+ are present, which serve as counter ions for the phosphate of nucleic acids and of the carboxylic acid
groups in proteins. Finally there is a multitude of small amounts of different so-called trace metals (Fe2+, Cu2+, Mn2+, Zn2+, W, Mo). These metals are needed in the active site of specific enzymes, where they are essential for the catalytic mechanism.
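The baker's yeast example in this section is simple bookkeeping, and a short Python sketch can make the chain of subtractions explicit. The variable names are illustrative only, and the wet-cell density of 1 g/ml used to convert cell volume into mass is an assumption that the example implies but does not state.

```python
# Baker's yeast example, all values per liter of broth.
packed_cell_volume_ml = 300.0   # volume of the centrifuged cell pellet
interstitial_medium_ml = 100.0  # medium trapped between the packed cells
water_fraction = 0.70           # intracellular water as a fraction of wet cells
ash_g = 5.0                     # residue after burning at about 500 degrees C
cell_density_g_per_ml = 1.0     # ASSUMPTION: wet cells weigh about 1 g per ml

# Wet cells are what remains of the pellet after removing trapped medium.
wet_cells_ml = packed_cell_volume_ml - interstitial_medium_ml
intracellular_water_ml = water_fraction * wet_cells_ml

# Dry mass = wet mass minus intracellular water (water taken as 1 g per ml).
wet_mass_g = wet_cells_ml * cell_density_g_per_ml
dry_mass_g = wet_mass_g - intracellular_water_ml * 1.0

# Organic dry mass = dry mass minus ash.
organic_dry_mass_g = dry_mass_g - ash_g

print(f"wet cells:           {wet_cells_ml:.0f} ml")
print(f"intracellular water: {intracellular_water_ml:.0f} ml")
print(f"dry mass:            {dry_mass_g:.0f} g")
print(f"organic dry mass:    {organic_dry_mass_g:.0f} g")
```

Under these assumptions the sketch reproduces the 200 ml, 140 ml, 60 gram, and 55 gram figures quoted for the yeast culture.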
6.3 The Elemental Composition of Biomass can be Expressed as a 1 C-mol Formula
Although a living cell contains several polymers, many small organic molecules, several metals in larger amounts, and many metals in very small amounts (see next paragraph), it is also useful to study the elemental composition of biomass. This elemental composition refers to the fraction of the biomass which remains after drying. This so-called dry mass can subsequently be analyzed for its elemental composition. Experience shows that the four most abundant elements, in decreasing amount C, O, N, and H, make up 90–95% of the dry mass. These elements are present almost exclusively in the four main organic polymers (proteins, lipids, carbohydrates, RNA/DNA) and the small organic monomeric molecules (biochemical pathway intermediates). It is therefore common practice to distinguish the following fractions of the dry biomass:
Organic fraction: This fraction contains nearly all C, H, O, N present in the living cells. It is about 90–95% of the total dry mass.
Inorganic fraction: This fraction contains nearly all other elements (H2PO4-, P, S, Mg2+, Na+, K+, trace metals). The inorganic mass is determined experimentally by incineration (500°C) of the organics present in the dry biomass. The residue, called “ash,” represents the inorganic mass and is usually about 5–10% of the total dry mass.
Surprisingly, many different organisms have a rather similar elemental composition. It is common practice to determine in most cases only the contents of the four most abundant elements (C, H, O, and N), which are considered to represent the organic mass. The results can be represented in a 1 C-mol formula of the biomass containing these four elements.
Biomass = C1H1.8O0.5N0.2
The note below explains how the elemental composition of biomass can be obtained experimentally.
Experimental determination of the elemental composition of biomass
In order to carry out stoichiometric calculations the elemental composition of the biomass needs to be known. This can be obtained experimentally by elemental analysis of the produced biomass. Assume, as an example, that a sample of dried biomass is weighed to be 7.803 gram. The ash amount is determined (after combustion in a 500°C oven) to be 507.2 mg (from which it can be calculated that the ash content equals 6.5%). Furthermore it was measured that the amounts of carbon, oxygen, hydrogen, and nitrogen are 3502 mg, 2408 mg, 511 mg, and 875 mg, respectively. Using the atomic masses of C, O, H, and N (12, 16, 1, 14) the amounts of these elements (expressed in mol) present in 7.803 gram dry biomass can be calculated as:
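\[
\begin{aligned}
\text{C:}\quad & \frac{3502}{12000} = 0.2918\ \text{mol} \\
\text{O:}\quad & \frac{2408}{16000} = 0.1505\ \text{mol} \\
\text{H:}\quad & \frac{511}{1000} = 0.511\ \text{mol} \\
\text{N:}\quad & \frac{875}{14000} = 0.0625\ \text{mol}
\end{aligned}
\]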
The 1 C-mol formula can subsequently be calculated by normalizing to 1 C-mol to be C1H1.751O0.516N0.214 where the H-coefficient follows from 0.511/0.2918, etc. The mass of 1 C-mol of the organic biomass fraction can now be calculated to equal 25.003 gram, the mass including ash can be calculated to equal 26.74 gram. The 1 C-mol formula can be easily extended to include other elements such as P, S, K, Mg by analyzing the biomass for its P, S, K, Mg, and other metal contents.
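The normalization described in this note can be sketched in a few lines of Python. The function and variable names below are illustrative, not part of the original text; the atomic masses are the rounded values (12, 16, 1, 14) used in the note, and the mass including ash is obtained by dividing the organic C-mol mass by the organic fraction of the dry mass.

```python
# Rounded atomic masses as used in the note (g/mol).
ATOMIC_MASS = {"C": 12.0, "H": 1.0, "O": 16.0, "N": 14.0}


def c_mol_formula(element_mg):
    """Normalize measured element amounts (mg) to coefficients per 1 mol of C."""
    mol = {el: mg / 1000.0 / ATOMIC_MASS[el] for el, mg in element_mg.items()}
    return {el: n / mol["C"] for el, n in mol.items()}


def c_mol_mass(formula):
    """Mass (g) of 1 C-mol of the organic fraction."""
    return sum(coef * ATOMIC_MASS[el] for el, coef in formula.items())


# Measurements from the example: 7.803 g dry biomass, 507.2 mg ash.
sample_mg = {"C": 3502.0, "H": 511.0, "O": 2408.0, "N": 875.0}
dry_mass_mg = 7803.0
ash_mg = 507.2

formula = c_mol_formula(sample_mg)
organic_mass = c_mol_mass(formula)
ash_fraction = ash_mg / dry_mass_mg
mass_including_ash = organic_mass / (1.0 - ash_fraction)

print({el: round(coef, 3) for el, coef in formula.items()})
print(f"organic C-mol mass:  {organic_mass:.2f} g")
print(f"C-mol mass with ash: {mass_including_ash:.2f} g")
```

The sketch gives coefficients close to C1H1.751O0.516N0.214 and C-mol masses of about 25.0 gram (organic) and 26.7 gram (including ash); small differences in the last digit relative to the text come from rounding the coefficients before summing.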
6.4 Medium Composition
For growth of cells both inorganic and organic compounds must be supplied by the growth medium, which requires a well-designed medium composition. Each of the medium compounds can be added as a pure compound to water. The mixture is then completely specified and is therefore called a defined medium. Such pure compounds are however expensive, and therefore cheap undefined mixtures of organic and inorganic compounds are often applied in the growth medium. This is called a complex medium. Examples are molasses for microbial fermentation processes and calf fetal serum (CFS) for cell tissue cultures. Molasses is a waste product from the sugar industry, which contains K+, Mg2+, sugars, trace metals, and vitamins as desirable compounds. However, apart from the sugar content, the amount of these compounds can vary strongly depending on supplier and storage time. In addition molasses contains thousands of other, undesirable, compounds (e.g., pesticides). All these variations can have a significant impact on growth. CFS contains protein-type growth factors and other desirable proteins, but may also contain undesirable proteins such as prions (causing, e.g., Creutzfeldt–Jakob disease). Undesired compounds can affect cell growth negatively, but more importantly they can contaminate the purified final product, such as an injectable pharmaceutical compound. Complex media are therefore increasingly avoided in industrial fermentation processes, because process performance may become too variable and because regulatory authorities, such as the Food and Drug Administration (FDA), increasingly demand defined media to guarantee product quality.
Many microorganisms can synthesize all cell components (proteins, polymers, vitamins, and hormones) from one organic compound plus a number of inorganic compounds (see above).
The medium then only needs one organic compound (as C-source) and the relevant inorganic compounds. This is called a defined minimal medium. If the organism can use CO2 as source of carbon the medium only contains inorganic compounds and is called a defined mineral medium.
6.4.1 Organic Compounds
Microorganisms require in the first place carbon-containing compounds for synthesizing new cells. In the case of autotrophic growth the carbon source is inorganic, i.e., CO2, but in most cases microorganisms grow heterotrophically and require organic compounds as C-source. The required organic carbon source depends on the synthetic capacities of the cell, which in turn depend on the available genes encoding the enzymes of the biosynthesis reactions performed by the cell. In some cases the genes, and therefore the biosynthetic capacity to synthesize essential compounds from the available C-source, are lacking. Examples are vitamins, hormones, growth factors, and essential amino acids. These compounds then have to be supplied via the growth medium. However, many microorganisms possess all genetic information to synthesize all (thousands of) cell compounds from a single organic compound (e.g., glucose, ethanol, or acetate) and inorganic compounds as sources of nitrogen, phosphorus, sulfur, trace metals, etc. In such a situation only one organic compound is needed in the growth medium.
6.4.2 Inorganic Compounds
In microbial cultures the nitrogen source is in most cases supplied as ammonia and sometimes as nitrate, nitrite, or urea. Some microorganisms can also use N2 gas (N2 fixation). For cultivation of animal cells amino acids are usually supplied as N-source. Phosphorus is supplied in the form of orthophosphate. The sulfur source is in most cases sulfate and sometimes H2S. Metals which are used as counter ions, such as sodium, potassium, and magnesium, are required in significant amounts. Metal ions which are required in the active sites of enzymes, and thus have to be supplied only in small amounts, are for example Fe2+, Cu2+, Mn2+, Zn2+, W, and Mo. These metals are called trace elements. A special inorganic compound which is essential for all living cells is CO2, often present in the form of bicarbonate (HCO3-). Only in this form can it be consumed in carboxylation reactions. However, in most cases CO2 does not need to be supplied to the growth medium, because the cells produce it themselves. To maintain a sufficient intracellular bicarbonate concentration the cells contain the enzyme carbonic anhydrase, which converts CO2 into bicarbonate. Another important aspect is that the cultivation medium should be able to buffer the pH. Here HCO3-/CO32- and H2PO4-/HPO42- are the most used buffering couples.
The amount of Fe3+ needed for growth
Assume that from biochemical information it is found that 5% of all enzymes contain Fe3+ in their active site. Suppose that 1 kg cell dry mass contains 95% organic matter, which contains 55% protein. Assume that 50% of the proteins are enzymes and that the average molar mass of an enzyme is 50,000 g/mol. In 1,000 gram cells the amount of active sites containing Fe3+ equals
\[
\frac{1000 \times 0.95 \times 0.55 \times 0.50 \times 0.05}{50\,000} = 0.26 \times 10^{-3}\ \text{mol}
\]
which is equivalent to 0.26 × 10^-6 mol Fe3+ per gram biomass dry matter. The Fe3+ requirement for the synthesis of 1,000 g dry cells is then about 15 mg. Generally, metals required for enzymes are only needed in very small amounts (micromoles to nanomoles, 10^-6 to 10^-9 mol, per gram dry biomass). Chemical analysis of biomass for the amount of trace metals present yields direct information on the required amounts.
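The iron estimate can be reproduced with a short Python sketch. The function name and parameter names are hypothetical. Two assumptions are made explicit that the text leaves implicit: each iron-containing enzyme carries exactly one Fe3+ ion, and the atomic mass of iron is taken as 55.85 g/mol.

```python
FE_ATOMIC_MASS = 55.85  # g/mol; ASSUMPTION, not given in the text


def iron_requirement(dry_mass_g,
                     organic_fraction=0.95,
                     protein_fraction=0.55,
                     enzyme_fraction=0.50,
                     fe_enzyme_fraction=0.05,
                     enzyme_molar_mass=50_000.0):
    """Return (mol Fe3+, mg Fe3+) needed to build dry_mass_g of cells.

    ASSUMPTION: one Fe3+ ion per iron-containing enzyme molecule.
    """
    fe_enzyme_g = (dry_mass_g * organic_fraction * protein_fraction
                   * enzyme_fraction * fe_enzyme_fraction)
    fe_mol = fe_enzyme_g / enzyme_molar_mass
    fe_mg = fe_mol * FE_ATOMIC_MASS * 1000.0
    return fe_mol, fe_mg


fe_mol, fe_mg = iron_requirement(1000.0)
print(f"{fe_mol:.2e} mol Fe3+ per kg dry cells")
print(f"{fe_mol / 1000.0:.2e} mol Fe3+ per g dry cells")
print(f"{fe_mg:.1f} mg Fe3+ per kg dry cells")
```

With the default fractions the sketch returns about 0.26 × 10^-3 mol and roughly 15 mg of Fe3+ per kg of dry cells, in line with the estimate above. Changing any single fraction scales the result proportionally, which makes the sensitivity of the estimate easy to explore.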
Question: A cell requires several vitamins and growth factors for proper growth. A complex medium is not allowed and a defined medium is too expensive. Which solution would you suggest?
Answer: A solution could be to apply rec-DNA technology to introduce the genes of the enzymes which allow the cell to produce the required vitamins/hormones by itself.
Medium preparation
Media must be sterilized before use to eliminate undesired microorganisms. Usually the medium components are dissolved in water and thereafter heat sterilized (e.g., 1 hour at 120°C). However, a problem with heat sterilization is that some medium compounds may precipitate (e.g., trace elements) and/or degrade (e.g., vitamins and hormones). Heat-induced chemical reactions between medium compounds can also occur, leading to the formation of unspecified compounds which could affect growth. Precipitation easily occurs for trace metals, which form insoluble carbonates or phosphate salts, especially at pH > 7. Usually a chelating agent (EDTA or citrate) is added to the trace elements solution and a low pH is maintained. A well-known undesired sterilization reaction is the Maillard reaction, a type of nonenzymatic browning which involves the reaction of simple sugars (carbonyl groups) and amino acids (free amino groups). To avoid these problems it
is common practice to sterilize these compounds separately and combine them after heat sterilization. This avoids undesirable reactions, precipitation, and/or degradation of medium components. A good alternative to heat sterilization is filter sterilization. With this method the complete medium can be sterilized at room temperature, which avoids degradation of heat-labile compounds and precipitation of salts. The medium is passed through a filter with pores so small (0.2 µm) that microorganisms or spores cannot pass.
Medium design and genomics
In many cases it is unknown which vitamins/hormones/growth factors can be synthesized by the cells. A practical approach is then to use a complex medium where all compounds are present, so biosynthetic deficiencies are always covered. But we have seen from the above that complex media have definite disadvantages. If a defined medium is desired, a considerable experimental effort is required to test the many possible medium compositions. However, if a completely annotated genome is available for a certain organism, one can immediately inspect the available biosynthetic routes and deduce which biomass molecules cannot be synthesized by the cells and therefore need to be supplied in the growth medium.
6.4.3 The Anabolic and Catabolic Reaction Networks
The combined organic and inorganic compounds present in the medium are taken up by the cell through specific transport proteins present in the cell membrane and are used in a very complex, enzyme-catalyzed, biosynthetic network of thousands of reactions, which produces each of the thousands of compounds present in the cell. This biosynthetic network is called anabolism or assimilation (the anabolic or assimilatory network). However, the anabolic network cannot function without energy, which is produced in catabolism.
The organic and inorganic medium compounds discussed above are all required in anabolism, and most of them (N-, S-, O-, P-containing compounds, vitamins, hormones, macrometals, and trace metals) are often completely incorporated in the newly formed cells. However, the thermodynamic nature of the enzyme-catalyzed reactions in anabolism is such that large amounts of energy are required. This is easily understood. In many reactions molecules are coupled, e.g., glucose is phosphorylated to produce glucose 6-phosphate, NH3 is coupled to α-ketoglutarate to produce glutamate, amino acids are coupled to each other to form peptides, etc. These coupling reactions are often thermodynamically not possible as such. Therefore a net input of energy is needed to drive these reactions in the right direction. It is well known that ATP is used as the carrier to deliver this energy, which raises the question of where the ATP (hence energy) comes from. All living cells contain a network of biochemical reactions, called catabolism, which generates this energy and conserves it in the form of ATP. The overall reaction which is performed by the catabolic network is nothing more than a redox reaction in which an electron donor and an electron acceptor are consumed.
The electron donor and electron acceptor are taken up from the medium and are processed through the catabolic network leading to the production of energy (in the form of ATP) which is consumed in the anabolic reactions and other energy requiring processes in the cells. The oxidized electron donor and the reduced electron acceptor are secreted from the cell to the aqueous solution outside the cell. Examples of secreted catabolic products are CO2, ethanol, Fe3+, S0, SO42- , N2, NO2- , H2O, H+, etc. From the above it is clear that the growth medium should always contain an electron donor and an electron acceptor. Furthermore it should be realized that cell growth leads to changes in the cellular environment due to the secreted catabolic products (reduced acceptor and oxidized donor). The performed overall redox reaction is called the catabolic reaction, examples of which are shown in Table 6.1.
Table 6.1 Some Examples of Catabolic Reactions Performed by Living Cells

Compounds Consumed    Compounds Secreted      Electron Donor Couple    Electron Acceptor Couple
Glucose + 6O2         6CO2 + 6H2O             Glucose/CO2              O2/H2O
Ethanol + 3O2         2CO2 + 3H2O             Ethanol/CO2              O2/H2O
Fe2+ + ¼O2 + H+       Fe3+ + ½H2O             Fe2+/Fe3+                O2/H2O
H2S + ½O2             S0 + H2O                H2S/S0                   O2/H2O
H2S + 2O2             SO42- + 2H+             H2S/SO42-                O2/H2O
Glucose               2 ethanol + 2CO2        Glucose/CO2              CO2/ethanol
8HNO3 + 5H2S          4N2 + 5H2SO4 + 4H2O     H2S/H2SO4                HNO3/N2
NH4+ + 1½O2           NO2- + 2H+ + H2O        NH4+/NO2-                O2/H2O
Catabolism and redox half reactions
The overall catabolic reactions can be obtained by considering the two constituting redox half reactions. To set up these redox half reactions for a certain case the electron donor couple and electron acceptor couple must be known. Consider the next-to-last reaction in Table 6.1. The electron donor couple is H2S/H2SO4 (an electron donor must contain electrons which can be donated, and hence must be a reduced compound). The electron acceptor couple (in which the acceptor must be an oxidized compound) is HNO3/½N2. The following half reactions apply:
Electron donor couple:
\[
\mathrm{H_2S + 4\,H_2O \rightarrow H_2SO_4 + 8\,e^- + 8\,H^+}
\]
Electron acceptor couple:
\[
\mathrm{HNO_3 + 5\,e^- + 5\,H^+ \rightarrow \tfrac{1}{2}\,N_2 + 3\,H_2O}
\]
Multiplying the first and the second reaction by 5 and 8, respectively, and adding them leads to the catabolic reaction in the example (8HNO3 + 5H2S → 4N2 + 5H2SO4 + 4H2O).
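The procedure of scaling two half reactions until the electrons cancel can be sketched in Python. The representation below is an illustrative choice, not something prescribed by the text: each half reaction is a dictionary of stoichiometric coefficients, negative for consumed and positive for produced species, and the function name is hypothetical. The sketch assumes integer electron counts in both half reactions.

```python
from fractions import Fraction
from math import gcd

# Half reactions as {species: coefficient}; negative = consumed, positive = produced.
DONOR = {"H2S": -1, "H2O": -4, "H2SO4": 1, "e-": 8, "H+": 8}
ACCEPTOR = {"HNO3": -1, "e-": -5, "H+": -5, "N2": Fraction(1, 2), "H2O": 3}


def combine_half_reactions(donor, acceptor):
    """Scale both half reactions so that the electrons cancel, then add them."""
    electrons_released = donor["e-"]       # produced by the donor couple
    electrons_accepted = -acceptor["e-"]   # consumed by the acceptor couple
    common = (electrons_released * electrons_accepted
              // gcd(electrons_released, electrons_accepted))
    donor_factor = common // electrons_released
    acceptor_factor = common // electrons_accepted

    overall = {}
    for half_reaction, factor in ((donor, donor_factor), (acceptor, acceptor_factor)):
        for species, coefficient in half_reaction.items():
            overall[species] = overall.get(species, 0) + factor * Fraction(coefficient)

    # Species whose coefficients cancel (electrons, here also H+) drop out.
    overall = {s: c for s, c in overall.items() if c != 0}
    return overall, donor_factor, acceptor_factor


overall, donor_factor, acceptor_factor = combine_half_reactions(DONOR, ACCEPTOR)
print("multipliers:", donor_factor, acceptor_factor)
for species, coefficient in overall.items():
    print(species, coefficient)
```

For the H2S/H2SO4 and HNO3/N2 couples the multipliers come out as 5 and 8, and the net coefficients are -5 for H2S, -8 for HNO3, +5 for H2SO4, +4 for N2, and +4 for H2O, matching the overall reaction. Exact fractions are used so that the ½N2 term does not introduce rounding error.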
6.4.4 Signaling Molecules
All biological systems contain signal-transducing complexes (in most cases protein-based receptors) in their cell membranes. In unicellular systems (prokaryotic as well as eukaryotic microorganisms) these receptors function to “sense” the extracellular situation (presence of novel food molecules, presence of other organisms, etc.). Chemotaxis, for example, is based on such receptors. Such receptors are also needed to trigger “differentiation” of cells, such as formation of spores (where organisms become dormant to survive unfavorable conditions), aggregation of cells in multicellular tissues, and differentiation into specialized cell types (e.g., liver cells, heart cells, and muscle cells). Very small amounts of so-called signal molecules are needed in the medium to stimulate the growth of a specific cell type. Examples of such signal molecules are hormones, protein-type growth factors, nitric oxide, NO (the “magic” molecule), and many more. Whereas the need for energy and biosynthetic compounds is well understood, the need for and presence of signaling molecules in the medium is much less understood and is an area of rapid development and application (e.g., growing artificial skin tissue to replace burned skin).
6.5 Conclusions
In this chapter it has been shown that the composition of the growth medium for the cultivation of microorganisms or cells of higher organisms can in principle be derived from the elemental and biochemical composition of the cells and the catabolic reaction used to generate energy. Two main categories of cultivation media exist, namely undefined or complex media, which are composed of mixed nutrient sources and often contain all compounds needed for cellular replication, and defined media, which are prepared from single purified compounds. The main groups of compounds that a growth medium should always contain are listed in Table 6.2. However, the number of individual compounds which should be present in the medium can be quite extensive (tens to hundreds), and the concentration differences are very large (10^-9 mol/l to 1 mol/l).

Table 6.2 General Composition of Cultivation Media
C-source
N-source
Electron donor
Electron acceptor
P-, S-source, K+, Mg2+
HCO3-/CO2
H2O
H+
Signaling molecules
Vitamins/hormones/growth factors/trace metals

For each biological system (tissue, cell, and microorganism) a suitable medium composition must be found. This is similar to providing a diet for humans. In the past this has largely been done empirically by experimentally testing many medium mixtures for their effect on growth. In this chapter it has been shown that medium design can be performed in a much more rational way. A proper sterilization procedure is also very important. Heat sterilization has the disadvantage that heat-labile compounds may be (partly) destroyed, compounds may react with each other (e.g., Maillard reaction) to form undefined products, and salts may precipitate. This can be avoided by sterilizing parts of the medium separately (e.g., vitamins and trace elements by means of filter sterilization) and combining them afterward. Another way to avoid these problems, and to be sure that the medium is not affected by the sterilization procedure, is filter sterilization of the complete medium. It should be realized that the absence of one single component may already lead to absence of growth. For example, the absence of a relevant trace metal leads to absence of growth because the enzyme which needs this trace metal in its active site cannot function. Therefore the reaction catalyzed by this particular enzyme does not occur and the whole anabolic or catabolic reaction network stops. This is similar to removing a small cogwheel from a watch, causing it to stop.
Further Reading
Dubertret L. and Coulomb B., 1994. Reconstruction of human skin in culture. C R Seances Soc. Biol. Fil., 188(3):235–44.
Egli T. and Zinn M., 2003. The concept of multiple-nutrient-limited growth of microorganisms and its application in biotechnological processes. Biotechnol. Adv., 22(1–2):35–43.
Ertola R.J., Giulietti A.M., and Castillo F.J., 1995. Design, formulation, and optimization of media. Bioprocess Technol., 21:89–137.
Froud S.J., 1999. The development, benefits and disadvantages of serum-free media. Dev. Biol. Stand., 99:157–66.
Lange H.C. and Heijnen J.J., 2001. Statistical reconciliation of the elemental and molecular biomass composition of Saccharomyces cerevisiae. Biotechnol. Bioeng., 75(3):334–44.
Lubiniecki A.S., 1999. Elimination of serum from cell culture medium. Dev. Biol. Stand., 99:153–56.
Mather J.P., 1998. Making informed choices: medium, serum, and serum-free medium. How to choose the appropriate medium and culture system for the model you wish to create. Methods Cell Biol., 57:19–30.
Nowruzi K., Elkamel A., Scharer J.M., Cossar D., and Moo-Young M., 2008. Development of a minimal defined medium for recombinant human interleukin-3 production by Streptomyces lividans. Biotechnol. Bioeng., 99(1):214–22.
Taticek R.A., Lee C.W., and Shuler M.L., 1994. Large-scale insect and plant cell culture. Curr. Opin. Biotechnol., 5(2):165–74.
Thomsen M.H., 2005. Complex media from processing of agricultural crops for microbial fermentation. Appl. Microbiol. Biotechnol., 68(5):598–606.
Wlaschin K.F. and Hu W.S., 2006. Fedbatch culture and dynamic nutrient feeding. Adv. Biochem. Eng. Biotechnol., 101:43–74.
7 Mass Balances, Rates, and Experiments
Joseph J. Heijnen
Delft University of Technology

7.1 Introduction
7.2 Rates and Mass Balances
Definition of the Mass Balance • The System Boundary • Setting up the Mass Balances: Some Guidelines • The Calculation of Conversion Rates from a Batch Experiment
7.3 Biomass Specific Conversion Rates (q-Rates)
7.4 Mathematical Models for the Batch Experiment from Mass Balances and q-Based Kinetics: The Modeling Cycle
A Batch Model for Growth and Product Formation • A Simple Relation to Obtain qsmax • Application of the Mathematical Model for Batch Growth for the Estimation of Kinetic Parameters and the Duration of the Batch Culture • Stoichiometric Coefficients • Obtaining q-Rates and Stoichiometry from Batch Experiments: Some Pitfalls
7.5 Conclusions
Further Reading
7.1 Introduction
Living systems, e.g., cultures of microorganisms or cells from higher organisms, transform matter and energy at certain rates. If cultivated cells are applied in a fermentation process to produce a desired product, it is important to know these rates and how they can be changed, e.g., to increase the rate of product formation. In industrial processes these rates determine economy, profits, investments, and equipment design (pumps, stirring power, etc.). In natural environments conversion rates determine, e.g., how fast pollutants are transported, degraded, and accumulated (leading to adverse effects). In the human body, metabolic disorders quickly result from imbalances in the rates of production, transport, or degradation of metabolites or proteins, resulting in too high or too low levels of metabolites and proteins in certain parts of the body, usually with fatal results. In fermentation processes cells or (micro)organisms consume the nutrients provided in the growth medium and produce different compounds at certain rates. Quantification of these rates is important to gain insight into the profit of a biotechnological industrial process, or to understand how long it will take to remove an undesirable compound from a polluted site by using microorganisms. Moreover, it is important to understand how a cell/microorganism changes its rates (e.g., its growth rate, substrate uptake rate, or product formation rate) upon a change in its extracellular environment. Here the environment of the cell is defined as the temperature, pH, concentration of substrate, product, etc. outside
the cell membrane. The study of how to change rates is called kinetics, and for such studies one needs methods to define and to quantify rates. Here the attention will be on the three most important rates (growth, substrate uptake, and product formation). Other rates will be dealt with later.
7.2 Rates and Mass Balances
As shown in Chapter 6, the synthesis of a certain amount of biomass and/or product requires the consumption of nutrients, of which the most important ones are the C-source, an electron donor, an electron acceptor, and sources of nitrogen, phosphorus, and sulfur. This results in the production and subsequent excretion by the cells of various catabolic (H2O, CO2, N2, and others) and noncatabolic products. In a fermentation process each of these compounds is consumed or produced at a certain rate. These rates are usually expressed as mol/time, e.g., glucose consumption rate in mol glucose/hour, oxygen consumption rate in mol O2/hour, biomass growth rate in C-mol biomass/hour, etc. As indicated above it is important to obtain the values of these rates as they occur in fermentation processes, in nature, and in our experiments. One might think that these rates can be measured using a certain sensor, analogous to a thermometer to measure temperature or a pH electrode to measure the pH. It should be clear, however, that sensors to measure a rate of production or consumption do not exist because this is fundamentally impossible. A rate must be calculated using the proper mass balance in combination with proper measurements of flow rates, volumes, concentrations, and time.
7.2.1 Definition of the Mass Balance
For each compound i present in a certain system a mass balance can be formulated. This means that there are as many mass balances to be formulated for a certain system as there are chemical compounds present in this system. However, each of these individual mass balances has the same general structure:
Accumulation of compound i = Ratei (production (+) consumption (-) of compound i) + Rate of all in/out transport processes of compound i
It is important to note that this is a balance of rates and that each term in this balance has the same dimension: amount of compound i/hour. The most practical choice of the unit used to express the amount of compound i is mol of compound i, hence the dimension of the rate will be mol of compound i/time. However, one can also make other choices as kg of compound i/time, etc. It is also obvious that the mass balance of compound i allows us to calculate Ratei of production (or consumption) of compound i if we can quantify the other terms in the mass balance (the accumulation term and the transport terms) using experimental measurements. The mass balance immediately shows which measurements (flows, volumes, concentrations) are needed to obtain Ratei.
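As a minimal sketch of how Ratei follows from the mass balance, the Python function below rearranges the balance to Ratei = accumulation - net transport. The function name, the parameter names, and the numbers in the usage lines are hypothetical. The accumulation term is approximated by a finite difference between two samples, which is an assumption of this sketch; the sign convention (production positive, consumption negative) follows the balance given above.

```python
def conversion_rate(volume_start, conc_start, volume_end, conc_end, dt,
                    transport_in=0.0, transport_out=0.0):
    """Estimate Rate_i (mol/h) of compound i from its mass balance.

    accumulation = Rate_i + (transport_in - transport_out)
    so Rate_i = accumulation - (transport_in - transport_out).

    Volumes in liter, concentrations in mol/l, dt in hours,
    transport terms in mol/h. Positive result = net production,
    negative result = net consumption.
    ASSUMPTION: accumulation is approximated by a finite difference
    of the amount V*C between two sampling times.
    """
    accumulation = (volume_end * conc_end - volume_start * conc_start) / dt
    net_transport = transport_in - transport_out
    return accumulation - net_transport


# Hypothetical batch: 10 l broth, glucose drops from 0.10 to 0.08 mol/l in 2 h.
rate_batch = conversion_rate(10.0, 0.10, 10.0, 0.08, 2.0)

# Same measurements, but with a (hypothetical) glucose feed of 0.05 mol/h.
rate_fed = conversion_rate(10.0, 0.10, 10.0, 0.08, 2.0, transport_in=0.05)

print(f"batch: {rate_batch:.3f} mol glucose/h")
print(f"fed:   {rate_fed:.3f} mol glucose/h")
```

In the batch case the rate equals the accumulation term alone, about -0.1 mol glucose/h. With the feed, the same concentration change implies a larger consumption of about -0.15 mol glucose/h, which illustrates why the transport terms must be measured before a rate can be calculated.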
7.2.2 The System Boundary Before a mass balance can be formulated the system boundary needs to be defined. System boundaries are usually chosen in a practical way. If, e.g., one is interested in uptake rates of cells present in the liquid broth inside a fermentor the logical boundary is the boundary of the liquid phase. However, if one is interested in the transfer of O2 from the gas phase to the liquid phase, both the gas and the liquid phase have to be taken into account. The choice of the system boundary is strongly dependent on the system at
hand. Obvious boundaries are, e.g., different phases (gas/liquid/solid) or compartments. When defining the system for which mass balances are to be formulated, the following has to be defined:
• The system volume
• The system boundaries across which transport occurs
• The concentrations of the relevant compounds inside the system
7.2.3 Setting up the Mass Balances: Some Guidelines For a process involving many chemical compounds and many phases or compartments, many mass balances have to be defined. In principle one mass balance for each compound in each phase can be formulated. Each compound has also its own concentration in each compartment or phase. Furthermore accurate information on the transport processes which operate on each compound i in each phase is needed in order to quantify the transport terms in each balance. The most common balances are those for substrate, biomass, and product in the liquid phase (broth) of the fermentor. The following example shows how to set up these balances.
Example 1: A Proper Mass Balance for NH3
Consider the following process, in which one is interested in the fate of NH3. It is known that a liquid feed containing NH3 enters the reactor, no liquid leaves the reactor, and air is sparged through the reactor. The liquid volume VL in the reactor changes due to the liquid feed and water evaporation. The NH3 is present in the liquid phase in the reactor and is consumed by a nitrifying organism. Furthermore, NH3 is also present in the reactor gas phase due to transport of NH3 between the gas and liquid phases.
Question: Which balances are needed?
Answer: NH3 is present in two phases. Hence we need to formulate two balances:
• The NH3-balance for the liquid phase
• The NH3-balance for the gas phase
Question: Which terms should be present in each balance?
Answer: For the NH3-balance in the liquid phase we need:
• The dynamic term d(VLCnL)/dt, where VL is the volume of the liquid phase and CnL the concentration of NH3 in the liquid phase
• The rate of conversion of NH3 by the nitrifying organism, RateNH3, in mol NH3/h
• The rate of transport of NH3 from the liquid to the gas phase, a transport process called “stripping”
• The rate of transport of NH3 with the liquid feed to the reactor
For the NH3-balance in the gas phase we need the following contributions:
• The dynamic term d(VGCnG)/dt, where VG is the volume of the gas phase and CnG the concentration of NH3 in the gas phase
• No conversion term, because there are no microorganisms in the gas phase
• The transfer of NH3 from the liquid to the gas phase
• The transport of NH3 out of the gas phase by the gas flow that leaves the reactor
In order to set up the proper balances for compound i it is necessary to identify all the transport processes and reactions which operate on molecule i present within the system boundary. If one transport process is forgotten (e.g., the stripping of NH3 from the liquid phase to the gas phase) then the calculated RateNH3 will be erroneous.
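The effect of a forgotten transport term can be illustrated with a small calculation. The sketch below rearranges the liquid-phase NH3 balance of Example 1 to isolate RateNH3. It is only an illustration under stated assumptions: the function name, the argument names, and all numerical values are hypothetical and do not come from the text, and the feed and stripping terms are assumed to have been quantified already in mol NH3/h.

```python
def rate_from_liquid_balance(accumulation, feed_in, stripping_out):
    """Conversion rate (mol NH3/h) from the liquid-phase balance.

    Balance: accumulation = rate + feed_in - stripping_out,
    with every term in mol NH3/h. A negative result means consumption.
    """
    return accumulation - feed_in + stripping_out


# Hypothetical values (mol NH3/h), chosen for illustration only.
accumulation = 0.10   # d(VL*CnL)/dt from volume and concentration data
feed_in = 0.50        # NH3 entering with the liquid feed
stripping_out = 0.05  # NH3 transferred from the liquid to the gas phase

rate_full = rate_from_liquid_balance(accumulation, feed_in, stripping_out)
rate_no_stripping = rate_from_liquid_balance(accumulation, feed_in, 0.0)

print(f"RateNH3 with stripping term:    {rate_full:+.2f} mol/h")
print(f"RateNH3 with stripping omitted: {rate_no_stripping:+.2f} mol/h")
```

With these hypothetical numbers, leaving out the stripping term overestimates the NH3 consumption by exactly the stripping rate.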
A good process is like a good theater play
One can envisage a process as a theater performance in which actors (= compounds) perform on a stage (= process) according to a script (= the specification of each interaction between the compounds, in the form of the transport and conversion mechanisms with their kinetics). The process designer is of course the play’s director. This shows that for each particular process we need to think creatively, using process knowledge. The process knowledge relates to:
• Which compounds are relevant in the process (The Actors).
• Where these compounds are localized in the process (The Theater Stage), e.g., in only one phase or in several phases, in a compartment of a reactor, or in a compartment within the cells (e.g., the mitochondrion). For this we need to know the physical structure of the process (vessels, compartments in the vessel, phases in the vessel, compartments in microorganisms), preferably based on one's own observation rather than on hearsay. Based on this information subsystems are defined (e.g., a gas phase, a liquid phase, a solid phase, cytosolic and mitochondrial compartments in cells, etc.).
• Which mechanisms (transport, conversion, or both) act on each compound i in each subsystem, and what their kinetics are. Subsequently we describe how each compound i takes part in and interacts with the whole process (The Script).
• The process designer is the director of the process performance.
7.2.4 The Calculation of Conversion Rates from a Batch Experiment
A common task is to cultivate cells or organisms and to study, e.g., the rate of growth, the rate of product formation, and/or other rates. The experimenter is usually free to choose the method of cultivation. This choice is called “experimental design.” The simplest cultivation method is “batch cultivation,” which can be executed in microtiter or deep-well plates, shake flasks, bags, stirred vessels (fermentors), or other devices. The characteristic of a true batch cultivation is that, for all compounds which are uniquely present in the liquid phase, no mechanism is present to transport the compounds into or out of the cultivation vessel: for these compounds transport is absent!
Batch cultivation and culture volume
Traditionally, a batch experiment is assumed to imply that the cultivation volume is constant. In practice a constant volume can seldom be realized. Reactions in principle lead to changes in density. In addition, losses due to evaporation often occur (unnoticed), or additions are made (e.g., for pH control). Therefore a constant volume does not characterize a batch condition. The key is the absence of transport, which must be specified for each compound. For example, in an aerated batch fermentation the batch condition applies to the substrate, but not to O2 and CO2. If the pH is controlled with NH4OH solution, the batch condition also does not apply to NH4+. Alternatively, if the pH is controlled with NaOH, the batch condition does apply to NH4+. This shows that in a batch experiment the mass balances of some compounds relate to batch conditions, while for other compounds the batch condition does not apply. The mass balance for such “batch compounds” is then simplified (because the transport terms in and out become zero) and is written as:
Accumulation of compound i = Ratei
Clearly the conversion rate of compound i (Ratei) can be obtained from the mass balance if we are able to experimentally quantify the accumulation term. The accumulation term is the change in time of the total amount (mol) of i (present within the boundary, e.g., in the cultivation vessel). It is by definition mathematically expressed as
$$\text{Accumulation of compound } i = \frac{d(V \cdot C_i)}{dt} \tag{7.1}$$
Here V is the volume of broth (m3) in which compound i is present and Ci is the concentration of compound i in this broth (mol i/m3 broth). It is useful to introduce Mi = V ⋅ Ci, with Mi being the total amount (mol) of compound i present in the system. For batch conditions Ratei now follows from the mass balance for compound i:
$$\mathrm{Rate}_i = \frac{d(V \cdot C_i)}{dt} = \frac{dM_i}{dt} \tag{7.2}$$
Note that by definition the conversion rate Ratei is positive for a produced compound and negative for a consumed compound. It has now become clear that in order to obtain the conversion rate of a batch compound we have to measure V and Ci as a function of time to obtain Mi(t) = V(t) ⋅ Ci(t), which is the changing amount of i in the cultivation vessel as a function of time. The Mi(t) values can then be plotted as a function of time and the slope dMi(t)/dt can be calculated. This slope equals Ratei according to the mass balance for compound i. Usually the slope at time t can be calculated from two values of Mi at two close time points according to:
$$\frac{dM_i}{dt} = \frac{d(V \cdot C_i)}{dt} = \frac{(V \cdot C_i)_{t_2} - (V \cdot C_i)_{t_1}}{t_2 - t_1} = \frac{M_i(t_2) - M_i(t_1)}{t_2 - t_1} = \mathrm{Rate}_i \ (\text{between } t_1 \text{ and } t_2) \tag{7.3}$$
This is the most straightforward and simple method to obtain the slope dMi/dt, but more advanced methods, involving curve fitting through the experimental values of Mi are also possible. It should furthermore be clear that in batch cultivations Ratei is usually a function of time and cannot be considered as constant!! The example below shows the application of the mass balances in combination with experimental measurements to obtain the proper values of Ratei as a function of time from a batch experiment.
Example 2: Batch Fermentation: Mass Balances and Rates
An organism is grown on a cultivation medium containing glucose as the carbon source in a batch fermentor. The broth volume (V), glucose concentration (Cs), and biomass concentration (Cx) are measured as a function of time (Table 7.1). We observe that the volume decreases, which is due to the evaporation of water because of air sparging. The evaporation rate is 2 liters of water/hour.

Table 7.1 Experimental Results from a Batch Experiment
Time (hour)   V (m3)   Cs (mol glucose/m3)   Cx (C-mol biomass/m3)
0             0.100    100.000               100.00
1             0.098    100.296               107.27
2             0.096    100.510               115.125
4             0.092    100.674               132.761
8             0.084     99.536               177.595
16            0.068     86.985               327.279

Question: Explain why between 0 and 4 hours an increasing substrate concentration is observed, in spite of the fact that biomass growth occurs, which can be inferred from the increasing biomass concentration.
Answer: This is due to the volume change (see note below!) from 0.100 m3 to 0.092 m3.
Question: Calculate the total rates of substrate consumption Rates and biomass production Ratex (in mol/h and C-mol/h) as a function of time.
Answer: We need to use the mass balance for glucose to obtain Rates and the mass balance for biomass to obtain Ratex.
Mass balance for substrate (glucose) in batch:
$$\frac{d(V \cdot C_s)}{dt} = \frac{dM_s}{dt} = \mathrm{Rate}_s \tag{7.4}$$
Mass balance for biomass in batch:
$$\frac{d(V \cdot C_x)}{dt} = \frac{dM_x}{dt} = \mathrm{Rate}_x \tag{7.5}$$
The first thing that needs to be done is to calculate Ms and Mx (from the available volume and concentration measurements) to obtain the change of the total amounts of glucose and biomass in time. The results are shown in Table 7.2. From the calculated total amounts shown in Table 7.2 it is now clear that there is indeed growth at the expense of glucose (Mx increases, Ms decreases), as is to be expected. It is also clear that Ms and Mx do not change linearly in time, because the magnitude of their slopes increases. According to the mass balance, Ratei is the slope of the curve of Mi versus time. Ideally one would obtain a continuous interpolation function through the Mi-time points and then obtain the slope as a function of time from the derivative of this function. For reasons of simplicity we can approximate the slope by using the Mi vs. time data points and the definition of the slope of the Mi-time curve in the time interval t1-t2:
$$\frac{dM_i}{dt} = \frac{M_i(t_2) - M_i(t_1)}{t_2 - t_1} \tag{7.6}$$
For the available time intervals we then obtain the values (see Table 7.3) for the slopes in the different time intervals (which are equal to the average Ratei in the respective time interval).

Table 7.2 Calculated Total Amounts of Glucose and Biomass in Time from the Data of Table 7.1
Time (hour)   Ms (= VCs) (mol glucose)   Mx (= VCx) (C-molX)
0             10.000                     10.000
1              9.829                     10.513
2              9.649                     11.052
4              9.262                     12.214
8              8.361                     14.918
16             5.915                     22.255

Table 7.3 Calculated Conversion Rates of Glucose and Biomass Using the Data from Table 7.2
Time interval (t1-t2)   dMs/dt (= Rates(t)) (mol S/hour)   dMx/dt (= Ratex(t)) (C-molX/hour)
0–1 hour                -0.171                             +0.513
1–2 hour                -0.180                             +0.539
2–4 hour                -0.194                             +0.581
4–8 hour                -0.225                             +0.676
8–16 hour               -0.306                             +0.917
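The calculation behind Tables 7.2 and 7.3 can be sketched in a few lines. The block below takes the measurements of Table 7.1, forms Mi = V ⋅ Ci, and applies the two-point slope of Equation 7.6 to each interval. The variable and function names are illustrative choices, not part of the original text. Because it works from the unrounded products, the printed values can differ from the tables in the last digit.

```python
# Measurements from Table 7.1
t = [0, 1, 2, 4, 8, 16]                                    # hour
V = [0.100, 0.098, 0.096, 0.092, 0.084, 0.068]             # m3
Cs = [100.000, 100.296, 100.510, 100.674, 99.536, 86.985]  # mol glucose/m3
Cx = [100.00, 107.27, 115.125, 132.761, 177.595, 327.279]  # C-mol X/m3

# Total amounts Mi = V * Ci (compare Table 7.2)
Ms = [v * c for v, c in zip(V, Cs)]
Mx = [v * c for v, c in zip(V, Cx)]


def interval_rates(M, times):
    """Average rate per interval: (M(t2) - M(t1)) / (t2 - t1), Equation 7.6."""
    return [
        (M[k + 1] - M[k]) / (times[k + 1] - times[k])
        for k in range(len(times) - 1)
    ]


rate_s = interval_rates(Ms, t)  # mol glucose/h, negative = consumption
rate_x = interval_rates(Mx, t)  # C-mol X/h, positive = production

for k, (rs, rx) in enumerate(zip(rate_s, rate_x)):
    print(f"{t[k]:>2}-{t[k + 1]:<2} h  Rate_s = {rs:+.3f}  Rate_x = {rx:+.3f}")
```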
From the calculated conversion rates it is clear that:
• Rates, the substrate conversion rate, is negative, which is logical because substrate is consumed.
• The rate of substrate consumption Rates(t) and the rate of biomass production Ratex(t) are not constant in time.
• The absolute values of both rates increase, while the substrate concentration decreases. At first sight the increase in the magnitude of Rates(t) with a decrease in Cs(t) is counterintuitive! The explanation will follow below, when we introduce the biomass specific conversion rates (q-rates).
A correct calculation of Ratex and Rates depends on a correct evaluation of the accumulation term in their respective mass balances from the available experimental data. The note below shows that this is not a trivial task.
Mistakes in evaluating the accumulation term of a mass balance
The accumulation term of compound i in the mass balance for i can be expanded as:
$$\frac{d(V \cdot C_i)}{dt} = V \cdot \frac{dC_i}{dt} + C_i \cdot \frac{dV}{dt} \tag{7.7}$$
This shows that a change in the amount of compound i present in the growth vessel has two contributions:
1. A change due to the change of the concentration of i: V ⋅ dCi/dt
2. A change due to a change of the culture volume: Ci ⋅ dV/dt
An often-made mistake is that only concentrations are considered and changes in volume are neglected. For example, when Ci is measured to be constant, and thus dCi/dt = 0, the wrong conclusion is drawn that Ratei = 0. This is only true when the culture volume does not change in time. If the volume does change in time, the change in the amount V ⋅ Ci (= Mi) is always needed, which requires the measured changes in both V and Ci.
A second mistake is related to the volume that should be used, which is also not trivial. Suppose the concentration Ci is obtained by taking a broth sample. First the biomass is removed, e.g., by centrifugation, to obtain a stable supernatant (the biomass in the broth sample would otherwise continue to consume and produce compounds, changing the concentrations to be measured). Subsequently the concentration of the compound of interest is measured in the supernatant. The total amount present in the broth sample can now be found by multiplying the measured Ci value (mol i/m3 supernatant) by the volume of the supernatant. It should be realized that this is not the same as the volume of the broth sample (which is supernatant + wet biomass). The difference depends on the amount of biomass present in the broth and can easily be 5–20%!
In the above approach it is assumed that compound i is only present in the supernatant and not inside the biomass. This is not always so; a significant amount of product may be present inside the biomass. Only when the product is completely secreted from the cells by active transport can its concentration inside the biomass be expected to be negligibly low.
In all other situations the amount of product in the biomass must be quantified separately. The accumulation term for product i now becomes:
$$\frac{dM_i}{dt} = \frac{d(M_x \cdot X_i)}{dt} + \frac{d(V_{sup} \cdot C_{i,sup})}{dt} = \mathrm{Rate}_i \tag{7.8}$$
Here Mx is the amount of biomass in the vessel (C-molX), Xi is the amount of product i present inside the biomass (mol i/C-molX), Vsup is the volume of the supernatant, and Ci,sup is the concentration of i in the supernatant. If the product is also present in significant amounts inside the cells, additional measurements are thus needed to obtain the correct value for Ratei.
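The first mistake described in this note can be made concrete with the first interval (0 to 1 hour) of Table 7.1. The sketch below splits the accumulation term into its concentration and volume contributions, using interval-average values of V and Cs in the discrete form of Equation 7.7. The use of midpoint averages and the variable names are assumptions made for this illustration.

```python
# First two rows of Table 7.1 (t = 0 h and t = 1 h)
t1, t2 = 0.0, 1.0          # hour
V1, V2 = 0.100, 0.098      # m3
C1, C2 = 100.000, 100.296  # mol glucose/m3

dt = t2 - t1
V_avg = 0.5 * (V1 + V2)
C_avg = 0.5 * (C1 + C2)

# Discrete form of the two contributions in Equation 7.7
conc_term = V_avg * (C2 - C1) / dt  # V * dC/dt
vol_term = C_avg * (V2 - V1) / dt   # C * dV/dt

# Correct rate from the change in the total amount M = V * C
rate_correct = (V2 * C2 - V1 * C1) / dt

print(f"V*dC/dt (concentration only): {conc_term:+.4f} mol/h")
print(f"C*dV/dt (volume change):      {vol_term:+.4f} mol/h")
print(f"d(V*C)/dt (correct rate):     {rate_correct:+.4f} mol/h")
```

Using the concentration term alone would suggest that glucose is produced in this interval (positive sign), whereas the full accumulation term shows that it is consumed.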
7.3 Biomass Specific Conversion Rates (q-Rates)
In the example in the previous section we have seen how, for a batch cultivation, the rates of substrate consumption and biomass production, Rates and Ratex (both in mol i/hour), can be obtained from the mass balances for substrate and biomass using volume and concentration measurements as a function of time. The obtained results, however, merit some thought. It was shown in the example above that the glucose consumption rate (the absolute value of Rates) increases. This seems strange; at first glance one would expect the consumption rate of glucose to decrease when the glucose concentration decreases. The explanation is that the glucose is consumed by the cells and that, because the amount of cells (V ⋅ Cx = Mx) increases, the total rate of glucose consumption will increase. This shows that apart from Ratei, which is a total rate, a second type of rate is relevant: the biomass specific rate qi, which is defined as:
$$q_i = \frac{\mathrm{Rate}_i}{M_x} \quad \left[\frac{\text{mol } i/\text{h consumed or produced}}{\text{C-molX present in the cultivation vessel}}\right] \tag{7.9}$$
The q-rate is obtained from the total production or consumption rate (Ratei) by dividing by the total amount of biomass (V ⋅ Cx = Mx) present in the culture vessel. It should be realized that because these q-rates are expressed per amount of biomass they characterize the activity of the cells. The qi-rates are influenced by:
• The properties of the cells (as a result of gene expression)
• The environment of the cells, as represented by extracellular conditions such as the concentrations of glucose, O2, and CO2, pH, T, pressure, and other compounds present in the growth medium (vitamins, trace elements, ...)
It is clear from the above definition that to obtain the value of qi both the total Ratei and the total amount of biomass Mx present in the cultivation vessel are required. The dimension of the qi-rates is (mol i/h)/C-molX. A special q-rate is qx, which is the rate of newly produced biomass per hour per amount of biomass present in the cultivation vessel. For historical reasons this rate is called µ and not qx (which would be logical):
$$\mu \ (= q_x) = \frac{\mathrm{Rate}_x}{M_x} \quad \left[\frac{\text{C-molX/h produced}}{\text{C-molX present in the cultivation vessel}}\right] \tag{7.10}$$
Example 3: Calculation of q-Rates from the Batch Experiment
In the previous example on the batch experiment we applied the mass balances for substrate and biomass to calculate the total rates of produced biomass (Ratex) and consumed glucose (Rates). The mass balances showed that, for a batch cultivation, these rates are equal to the time derivative:
$$\frac{d(V \cdot C_i)}{dt} = \frac{dM_i}{dt} \tag{7.11}$$
These time derivatives were quantified from the experimentally obtained (V·Ci) values as function of time for the time intervals between the measurements. Rates and Ratex were therefore expressed as time-interval average rates which in practice slightly change in time within the considered time interval. To obtain q-rates one must divide the total Ratei by the total biomass Mx present in the time interval. However within a time interval Mx increases. A logical proposal for Mx in the interval is then to calculate the average value of Mx in the interval. The results for the calculated rates and average Mx in the time intervals of the batch experiment are shown in Table 7.4.
Table 7.4 Calculated Total and Biomass Specific Conversion Rates for the Batch Experiment
Time interval (hour)   Rates (molS/h)   Ratex (C-molX/h)   Average Mx (C-molX)   qs (mol glucose/h C-molX)   µ (C-molX/h C-molX)
0–1                    -0.171           +0.513             10.256                -0.0167                     +0.050
1–2                    -0.180           +0.539             10.783                -0.0167                     +0.050
2–4                    -0.194           +0.581             11.633                -0.0167                     +0.050
4–8                    -0.225           +0.676             13.566                -0.0166                     +0.050
8–16                   -0.306           +0.917             18.587                -0.0165                     +0.049
The average values for Rates, Ratex, and Mx in each time interval directly allow qs and µ to be calculated. The result is quite revealing: it shows that the biomass specific rates qs and µ are practically constant. This result is often found in batch experiments, where organisms achieve their maximal q-rates, which are constant due to the high concentrations of, e.g., substrate (unlimited growth).
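The q-rate calculation of Example 3 can be reproduced from the tabulated amounts. The sketch below starts from the rounded Ms and Mx values of Table 7.2, computes the interval rates and the interval-average Mx, and divides to obtain qs and µ. The variable names and the tuple layout of `rows` are illustrative choices.

```python
# Total amounts from Table 7.2
t = [0, 1, 2, 4, 8, 16]                                # hour
Ms = [10.000, 9.829, 9.649, 9.262, 8.361, 5.915]       # mol glucose
Mx = [10.000, 10.513, 11.052, 12.214, 14.918, 22.255]  # C-mol X

rows = []
for k in range(len(t) - 1):
    dt = t[k + 1] - t[k]
    rate_s = (Ms[k + 1] - Ms[k]) / dt    # mol glucose/h
    rate_x = (Mx[k + 1] - Mx[k]) / dt    # C-mol X/h
    mx_avg = 0.5 * (Mx[k] + Mx[k + 1])   # average biomass in the interval
    q_s = rate_s / mx_avg                # mol glucose/(C-mol X h)
    mu = rate_x / mx_avg                 # 1/h
    rows.append((t[k], t[k + 1], rate_s, rate_x, mx_avg, q_s, mu))

for t1, t2, rate_s, rate_x, mx_avg, q_s, mu in rows:
    print(f"{t1:>2}-{t2:<2} h  Mx_avg = {mx_avg:7.3f}  "
          f"q_s = {q_s:+.4f}  mu = {mu:+.3f}")
```

The small drift of qs and µ in the longest interval is a consequence of using the arithmetic average of Mx over an interval in which Mx grows exponentially, not a sign of changing kinetics.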
A third way to express conversion rates, ri
Until now we have introduced two ways to express conversion rates:
Ratei = total rate of compound i in mol i/hour
qi = the biomass specific rate = Ratei/Mx
In economic considerations the investment in equipment is related to the total volume (V) of the broth in the cultivation vessels. From this point of view it is important to have information on the volume specific rate, which is named ri:
$$r_i = \frac{\mathrm{Rate}_i}{V} \quad (\text{in amount of } i/\text{hour per m}^3 \text{ broth}) \tag{7.12}$$
Comparing the definitions of qi and ri (and eliminating Ratei) it follows that
$$r_i = q_i \cdot C_x \tag{7.13}$$
Hence the volume specific rate ri and the biomass specific rate qi of compound i are coupled through the biomass concentration Cx.
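As a quick consistency check of the relation between the three rate definitions, the sketch below evaluates the volume specific substrate rate at t = 0 of the batch example in two ways: from Ratei/V and from qi ⋅ Cx. The function name is hypothetical, and the use of qs = -0.0167 at the initial time point is an assumption based on the constant qs found in the batch example.

```python
def volume_specific_rate(q_i, c_x):
    """Equation 7.13: r_i = q_i * Cx, in amount of i per hour per m3 broth."""
    return q_i * c_x


# Conditions at t = 0 of the batch example
V = 0.100       # m3
Cx = 100.0      # C-mol X/m3
q_s = -0.0167   # mol glucose/(C-mol X h), assumed constant

M_x = V * Cx            # C-mol X in the vessel
rate_s = q_s * M_x      # total rate (mol glucose/h), from Equation 7.9
r_from_rate = rate_s / V                    # Equation 7.12
r_from_q = volume_specific_rate(q_s, Cx)    # Equation 7.13

print(f"r_s from Rate_s/V: {r_from_rate:+.3f} mol/(m3 h)")
print(f"r_s from q_s*Cx:   {r_from_q:+.3f} mol/(m3 h)")
```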
7.4 Mathematical Models for the Batch Experiment from Mass Balances and q-Based Kinetics: The Modeling Cycle In the previous example of a batch experiment we have seen that in batch the q-rates of the organisms are constant and are not influenced by changing extracellular substrate concentrations. This leads to very simple rate functions, also called “q-based kinetics”:
$$q_s = q_s^{\max} \quad \text{and} \quad \mu = \mu^{\max}$$
Here qsmax and µmax are kinetic parameters, which have a certain value for a particular organism growing under particular conditions and which must always be obtained from experiments. The above kinetics are the simplest possible (0-order). Let us now assume that for a certain organism the values for qsmax and µmax are known. This essentially means that the q-rate based kinetics are known as well. If we combine this kinetic knowledge with the mass balances we already used before, then the mass balances allow us to calculate the expected total amount Mi(t) of compound i as function of time.
It should be noted that the mass balances in which the specified q-rate functions are substituted are sets of differential equations which can be solved. In fact this set of differential equations can be considered to be a mathematical model of the system (batch fermentor, shake flask, etc.). It should furthermore be noted that:
• If the q-rate functions are not known, a combination of experimental measurements of concentrations and volumes with the proper mass balances is needed to calculate values of qi, which leads to q-based kinetic functions.
• If we do know the q-rate functions, we can combine these functions with the mass balances, integrate the obtained differential equations, and in this way calculate the change of the total amount Mi of each compound as a function of time, which we can compare to available measurements.
Clearly the mass balance always allows us to obtain the missing quantity (qi or Mi(t))! The above approach is called the modeling cycle, which is summarized below.
Step 1 (exp. design): Perform a cultivation experiment and measure the concentrations of the relevant compounds and volumes (flow rates) as a function of time.
Step 2 (evaluation): Use the mass balances to quantify the q-value of each compound under different conditions (q-relations).
Step 3 (kinetics): Use the obtained q-values to construct kinetic functions for each biomass specific conversion rate q, as a function of extracellular conditions (pH, T, concentration).
Step 4 (validation): Combine, for each relevant compound, the obtained kinetic function with the corresponding mass balance. The resulting set of differential equations can then be integrated (in most cases numerically) to obtain the predicted amounts Mi of these compounds as a function of time. These results can be compared with the experimental measurements (to validate the kinetic functions).
Step 5 (prediction): The obtained kinetic functions can also be used to model the cultivation process for different cultivation systems, leading to different mass balances (because of different transport conditions), to predict the expected Mi vs. time.
Step 6 (correction): If the model predictions obtained during steps 4 and 5 deviate significantly from the measurements, one or more kinetic functions are probably not correct (or missing) and a return to step 3 is needed to modify the kinetic functions.
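Step 4 can be illustrated with the batch example. For batch compounds the mass balance is dMi/dt = Ratei = qi ⋅ Mx, so with constant q-rates the model is a pair of ordinary differential equations in Mx and Ms. The sketch below integrates them with a hand-written fourth-order Runge-Kutta scheme and compares the result with the amounts of Table 7.2. The choice of integrator, the step size, and all names are assumptions made for this illustration; in practice a library ODE solver would normally be used.

```python
MU_MAX = 0.050     # 1/h, from the batch example (Table 7.4)
QS_MAX = -0.0167   # mol glucose/(C-mol X h), from the batch example


def derivatives(state):
    """Right-hand sides of the batch balances: dMx/dt and dMs/dt."""
    m_x, _m_s = state
    return (MU_MAX * m_x, QS_MAX * m_x)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivatives(state)
    k2 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivatives(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(m_x0, m_s0, t_end, dt=0.01):
    """Integrate from t = 0 to t_end; returns (Mx, Ms) at t_end."""
    state = (m_x0, m_s0)
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, dt)
    return state


# Amounts from Table 7.2: time (h) -> (Mx, Ms)
measured = {1: (10.513, 9.829), 4: (12.214, 9.262), 16: (22.255, 5.915)}
for t_end, (mx_meas, ms_meas) in measured.items():
    mx_pred, ms_pred = simulate(10.0, 10.0, t_end)
    print(f"t = {t_end:>2} h  Mx {mx_pred:7.3f} vs {mx_meas:7.3f}  "
          f"Ms {ms_pred:6.3f} vs {ms_meas:6.3f}")
```

The small offset in Ms at 16 hours comes from using the rounded value qs = -0.0167.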
The combination of the full set of mass balances with the kinetic functions represents a complete mathematical model in the form of a set of ordinary differential equations. Such a mathematical model contains different variables, which can be grouped into two main categories:
Dependent variables
Examples are V, Cs, Cx, and Cp. The broth volume V follows from a total mass balance for the system; the concentrations Ci of the different compounds present in the cultivation system follow from the compound mass balances.
Independent variables
These are variables which can be manipulated by the experimenter or the process operator, such as the inflow rate of the feed medium (pump adjustment), the inoculum size of the biomass, the substrate concentration of the feed medium, the pH, the total reactor volume, the reactor temperature, etc.
The model furthermore contains parameters, e.g., the kinetic parameters of the rate equations. For the example of the batch cultivation shown above these are, e.g., the maximum specific growth rate µmax and the maximum specific substrate uptake rate qsmax. Mathematical models are highly relevant in interpreting and designing biological processes, as will be illustrated below.
7.4.1 A Batch Model for Growth and Product Formation
In the previous section a batch experiment was described and it was shown how mass balances and experimental information should be combined to obtain the values for the specific growth rate µ and the specific substrate uptake rate qs. For this example the following rate functions apply for biomass growth and substrate uptake:
$$\mu = +0.050 \ \frac{\text{C-molX/h}}{\text{C-molX}} = \mu^{\max} \tag{7.14}$$
$$q_s = -0.0167 \ \frac{\text{mol glucose/h}}{\text{C-molX}} = q_s^{\max} \tag{7.15}$$
Knowing these rate functions allows us to set up the mathematical model to calculate the substrate concentration Cs and the biomass concentration Cx as a function of time by setting up the mass balances for substrate and biomass. For a batch culture these mass balances only contain the accumulation term (d(V ⋅ Ci)/dt) and the total rate term for production or consumption (Ratei = qi ⋅ V ⋅ Cx), because in a batch culture no transport of substrate or biomass to or from the culture system occurs.
Mass balance for biomass (batch culture):
$$\frac{d(V \cdot C_x)}{dt} = \mu \, (V \cdot C_x) \tag{7.16}$$
Mass balance for substrate (batch culture):
$$\frac{d(V \cdot C_s)}{dt} = q_s \, (V \cdot C_x) \tag{7.17}$$
These mass balances are differential equations which need to be solved to obtain Cs and Cx as a function of time. To achieve this:
• We need to know how V changes in time, e.g., due to evaporation.
• We need to know how µ and qs change in time, especially because we expect that these rates will change at some point: Cs decreases with time and will ultimately reach zero. It is therefore logical to expect that Cs has an effect on the substrate uptake rate qs. However, at high substrate concentrations (which occur in batch experiments) it has been shown above that these rate functions are very simple (zero-order kinetics), that is, µ = µmax and qs = qsmax.
In addition it is useful to define two new variables:
$$M_x = V \cdot C_x \tag{7.18}$$
$$M_s = V \cdot C_s \tag{7.19}$$
M x and Ms represent the total amounts of biomass and substrate present in the batch vessel at any time. It should be realized that these total amounts are time dependent in a nonlinear fashion, which will be shown below. Let us first consider the biomass mass balance replacing (V ⋅ Cx) by M x:
$$\frac{dM_x}{dt} = \mu^{\max} \cdot M_x \tag{7.20}$$
Now the biomass mass balance can be solved by separation of variables:
$$\frac{dM_x}{M_x} = \mu^{\max} \, dt \tag{7.21}$$
Integration of both sides (using the relation dMx/Mx = d ln Mx) leads to:
$$\ln(M_x) = \mu^{\max} \cdot t + \text{constant} \tag{7.22}$$
We know that at time t = 0 there is an amount of biomass, called Mx0, present in the batch (which was added during the inoculation procedure). This gives:
$$\ln M_{x0} = \mu^{\max} \cdot 0 + \text{constant}, \quad \text{and thus} \quad \text{constant} = \ln M_{x0} \tag{7.23}$$
which yields:
$$M_x(t) = M_{x0} \exp(\mu^{\max} t) \tag{7.24}$$
The last relation shows that the total amount of biomass Mx in the batch culture changes exponentially in time. This relation is also called the exponential growth relation. If one is interested in the change of the biomass concentration in time, Cx(t), this follows by dividing Mx(t) by the culture volume, i.e.:
$$C_x(t) = \frac{M_x(t)}{V(t)} \tag{7.25}$$
So in order to calculate the biomass concentration as a function of time we need to know whether, and if so how, V(t) changes in time. When the volume V does not change in time, hence V = V0 (= volume at time 0) = constant, we can write
$$C_x(t) = \frac{M_x(t)}{V_0} \tag{7.26}$$
Because also Mx0 = V0 ⋅ Cx0, the exponential growth relation can be rewritten as:
$$C_x(t) = C_{x0} \exp(\mu^{\max} t) \quad (\text{if } V = \text{constant}) \tag{7.27}$$
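The exponential growth relation can be checked directly against the batch example. The sketch below evaluates Equation 7.24 with Mx0 = 10 C-molX and µmax = 0.050 per hour (values taken from Tables 7.2 and 7.4) and prints the result next to the tabulated amounts. The function and variable names are illustrative.

```python
import math


def biomass_amount(t, m_x0, mu_max):
    """Equation 7.24: Mx(t) = Mx0 * exp(mu_max * t)."""
    return m_x0 * math.exp(mu_max * t)


M_X0 = 10.0     # C-mol X at t = 0 (Table 7.2)
MU_MAX = 0.050  # 1/h (Table 7.4)

# Tabulated Mx values from Table 7.2: time (h) -> Mx (C-mol X)
table_7_2 = {0: 10.000, 1: 10.513, 2: 11.052, 4: 12.214, 8: 14.918, 16: 22.255}

for t, tabulated in table_7_2.items():
    predicted = biomass_amount(t, M_X0, MU_MAX)
    print(f"t = {t:>2} h  predicted {predicted:7.3f}  tabulated {tabulated:7.3f}")
```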
From this relation it follows that the biomass concentration changes exponentially in time (note that this only happens at constant volume). However, for the batch culture example shown earlier the volume did change in time (it decreased due to evaporation) according to:
$$V(t) = V_0 - 0.0020 \cdot t \tag{7.28}$$
Here V0 is the initial volume (0.1 m3) and the rate of evaporation is 0.0020 m3/h (2 liters of water per hour). By substituting this relation in the exponential growth function the following expression for the biomass concentration as a function of time is obtained:
$$C_x(t) = \frac{V_0 \cdot C_{x0} \cdot \exp(\mu^{\max} \cdot t)}{V_0 - 0.0020 \cdot t} \tag{7.29}$$
This relation shows that Cx increases in time, and even faster than exponentially, due to the decrease of the culture volume. If the volume V(t) increases in time (e.g., by addition of a pH control agent), then Cx(t) increases less than exponentially! Let us now consider the substrate mass balance:
$$\frac{dM_s}{dt} = q_s^{\max} \cdot M_x \tag{7.30}$$
In this differential equation qsmax is a parameter with a fixed value and Mx changes in time. However, the amount of biomass as a function of time, Mx(t), has already been obtained from the mass balance for biomass. Substitution of this result gives:
$$\frac{dM_s}{dt} = q_s^{\max} \cdot M_{x0} \cdot \exp(\mu^{\max} \cdot t) \tag{7.31}$$
Separation of variables (Ms and time) yields:
$$dM_s = (q_s^{\max} \cdot M_{x0}) \cdot [\exp(\mu^{\max} \cdot t) \, dt] \tag{7.32}$$
We can rewrite the term between the square brackets by bringing the exponential behind the differential according to:
$$\exp(\alpha t) \, dt = \frac{1}{\alpha} \, d(\exp(\alpha t)) \tag{7.33}$$
This yields:
$$dM_s = \frac{q_s^{\max} \cdot M_{x0}}{\mu^{\max}} \cdot d\exp(\mu^{\max} \cdot t) \tag{7.34}$$
Integration between time t = 0 and t = t, defining that at time t = 0, Ms = Ms0 and that at time t the total amount of substrate equals Ms(t), yields:
$$M_s(t) - M_{s0} = \frac{q_s^{\max} \cdot M_{x0}}{\mu^{\max}} \cdot [\exp(\mu^{\max} \cdot t) - 1] \tag{7.35}$$
This equation describes the course of the total amount of substrate as a function of time, Ms(t). A few comments should be made here:
• It should be realized that qsmax is a negative number, hence Ms(t) decreases in time!
• If the values of the kinetic parameters (µmax, qsmax) as well as the experimental conditions at time t = 0 (Ms0, Mx0, V0) are known, this relation yields the total amount of substrate as a function of time.
The substrate concentration as a function of time follows from:
$$C_s(t) = \frac{M_s(t)}{V(t)} \tag{7.36}$$
Substitution of the function for V(t) given earlier (Equation 7.28) shows that Cs(t) depends on time in a complicated way.
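The complete analytical batch model of this section (Equations 7.24, 7.28, 7.29, 7.35, and 7.36) can be evaluated and compared with the measurements of Table 7.1. The sketch below is one way to do this; the function names and constants are illustrative, and the initial values are those of the batch example.

```python
import math

MU_MAX = 0.050    # 1/h (Equation 7.14)
QS_MAX = -0.0167  # mol glucose/(C-mol X h) (Equation 7.15)
V0 = 0.100        # m3, initial volume
EVAP = 0.0020     # m3/h, evaporation rate
CX0 = 100.0       # C-mol X/m3 at t = 0
CS0 = 100.0       # mol glucose/m3 at t = 0


def volume(t):
    """Equation 7.28: V(t) = V0 - 0.0020 * t."""
    return V0 - EVAP * t


def m_x(t):
    """Equation 7.24: Mx(t) = Mx0 * exp(mu_max * t)."""
    return V0 * CX0 * math.exp(MU_MAX * t)


def m_s(t):
    """Equation 7.35: Ms(t) = Ms0 + (qs_max * Mx0 / mu_max) * (exp(mu_max * t) - 1)."""
    return V0 * CS0 + QS_MAX * V0 * CX0 / MU_MAX * (math.exp(MU_MAX * t) - 1.0)


def c_x(t):
    """Equation 7.29: biomass concentration with changing volume."""
    return m_x(t) / volume(t)


def c_s(t):
    """Equation 7.36: substrate concentration with changing volume."""
    return m_s(t) / volume(t)


# Table 7.1: time (h) -> (Cs, Cx)
table_7_1 = {0: (100.000, 100.00), 1: (100.296, 107.27), 4: (100.674, 132.761),
             8: (99.536, 177.595), 16: (86.985, 327.279)}
for t, (cs_meas, cx_meas) in table_7_1.items():
    print(f"t = {t:>2} h  Cs {c_s(t):8.3f} vs {cs_meas:8.3f}  "
          f"Cx {c_x(t):8.3f} vs {cx_meas:8.3f}")
```

The model reproduces the initial rise of Cs even though glucose is being consumed, because the volume shrinks faster than the substrate amount during the first hours. The remaining small differences in Cs stem from the rounded value of qsmax.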
7.4.2 A Simple Relation to Obtain qsmax
The obtained Ms(t) relation (Equation 7.35) contains µmax and qsmax as unknown kinetic parameters. µmax is obtained from plotting ln(Mx/Mx0) vs. time. qsmax can be obtained from the Ms(t) equation by using measured values of Ms(t), Mx0, Ms0, and the µmax value. A more elegant approach is found by rewriting the Ms(t) relation (Equation 7.35), yielding:
$$M_s(t) - M_{s0} = \frac{q_s^{\max}}{\mu^{\max}} \cdot [M_{x0} \cdot \exp(\mu^{\max} \cdot t) - M_{x0}] \tag{7.37}$$
The term Mx0 ⋅ exp(µmax ⋅ t) was already found to equal Mx(t). Elimination of this term yields:
$$M_s(t) - M_{s0} = \frac{q_s^{\max}}{\mu^{\max}} \cdot (M_x(t) - M_{x0}) \tag{7.38}$$
This relation shows that the plot of (Ms(t) - Ms0) vs. (Mx(t) - Mx0) is a straight line, the slope of which equals qsmax/µmax. Given µmax, this slope directly yields qsmax.
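The straight-line relation of Equation 7.38 can be applied to the amounts of Table 7.2. The sketch below estimates the slope qsmax/µmax with a least-squares line forced through the origin and then multiplies by µmax = 0.050 per hour. The regression method and the variable names are assumptions made for this illustration; the text itself only states that the slope of the plot is used.

```python
# Total amounts from Table 7.2
Ms = [10.000, 9.829, 9.649, 9.262, 8.361, 5.915]       # mol glucose
Mx = [10.000, 10.513, 11.052, 12.214, 14.918, 22.255]  # C-mol X
MU_MAX = 0.050                                         # 1/h

# Differences relative to the initial amounts (Equation 7.38)
dx = [m - Mx[0] for m in Mx]  # Mx(t) - Mx0
ds = [m - Ms[0] for m in Ms]  # Ms(t) - Ms0

# Least-squares slope of a line through the origin: sum(x*y) / sum(x*x)
slope = sum(x * y for x, y in zip(dx, ds)) / sum(x * x for x in dx)
qs_max = slope * MU_MAX

print(f"slope qs_max/mu_max = {slope:.4f} mol glucose/C-mol X")
print(f"qs_max = {qs_max:.4f} mol glucose/(C-mol X h)")
```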
7.4.3 Application of the Mathematical Model for Batch Growth for the Estimation of Kinetic Parameters and the Duration of the Batch Culture
Above, a mathematical model (based on mass balances and q-based kinetic functions) that describes a batch fermentation has been presented. It has been shown that the obtained mathematical relations:
• Give the expected biomass and substrate amounts as a function of time when the kinetic parameters (µmax, qsmax) are known.
• Can be used, together with experimental measurements, to obtain qsmax and µmax values.
This is an equivalent method to the basic method applied earlier, whereby no assumptions were made on µ and qs and it was found afterward that they were constant. The mathematical model described above assumes a priori that µ and qs are constant (µmax, qsmax).
7.4.3.1 Evaluation of µmax from Experiments Using the Model
Knowing that in a batch cultivation (where during most of the time the substrate concentration is high) µ = µmax allows us to use the biomass mass balance and solve the differential equation, leading to the exponential growth relation for Mx(t). If it is expected that the zero order rate function (µ = µmax = constant) applies, then the equation
ln(M x/M xo) = µmax t
can directly be applied to experimental measurements. This relation suggests that, in order to obtain the value of µmax from experimental data, a graph of ln(Mx/Mxo) as a function of time should be made. From the slope of this graph µmax can be obtained. This is done for the batch example described earlier (see Table 7.5). It can be inferred from Table 7.5 that plotting ln(Mx/Mxo) vs time t should give a straight line through the origin. The slope equals µmax. Indeed one obtains the slope µmax = (0.80 − 0)/(16 − 0) = 0.050 h⁻¹. This procedure is clearly more elegant than the previous one, where interpolation between time points was applied and for each time interval the slope (dMx/dt) and the average Mx were used to calculate µ = (dMx/dt)/Mx according to the mass balance. However, it should be realized that, when µ is not constant in time, the solution of the differential equation representing the biomass mass balance often cannot be obtained analytically, and thus the interpolation approach has to be applied.
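The slope calculation for the data of Table 7.5 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' procedure: instead of the two-point slope used in the text, it fits a least-squares line through the origin to all points, and the function name `mu_max_from_batch` is hypothetical. It assumes that the first sample is taken at t = 0, so that the first biomass value is Mxo.

```python
import math


def mu_max_from_batch(times, mx):
    """Slope of ln(Mx/Mxo) versus time for a line through the origin."""
    mx0 = mx[0]
    y = [math.log(m / mx0) for m in mx]
    # Least-squares slope of y = mu * t (no intercept): sum(t*y) / sum(t*t)
    return sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)


# Data of Table 7.5 (time in hours, Mx in C-molX)
TIMES = [0, 1, 2, 4, 8, 16]
MX = [10.000, 10.513, 11.052, 12.214, 14.918, 22.255]

if __name__ == "__main__":
    print(f"mu_max = {mu_max_from_batch(TIMES, MX):.4f} 1/h")
```

Using all data points rather than only the first and last one makes the estimate less sensitive to an error in a single measurement.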
Table 7.5 Calculation of ln(Mx/Mxo) for the Batch Example

| Time (hour) | Mx (C-molX) | Mx/Mxo (-) | ln(Mx/Mxo) (-) |
|---|---|---|---|
| 0 | 10.000 | 1 | 0 |
| 1 | 10.513 | 1.0513 | 0.050 |
| 2 | 11.052 | 1.1052 | 0.100 |
| 4 | 12.214 | 1.2214 | 0.200 |
| 8 | 14.918 | 1.4918 | 0.400 |
| 16 | 22.255 | 2.2255 | 0.800 |
Table 7.6 Total Amounts of Substrate Consumed and Biomass Produced for the Batch Example

| Time (hour) | Ms(t)-Mso (mol Glucose) | Mx(t)-Mxo (C-molX) |
|---|---|---|
| 0 | 0 | 0 |
| 1 | -0.171 | +0.513 |
| 2 | -0.351 | +1.052 |
| 4 | -0.738 | +2.214 |
| 8 | -1.639 | +4.918 |
| 16 | -4.085 | +12.255 |
7.4.3.2 Biomass Doubling Time
It should be realized that the maximum specific growth rate µmax is a very characteristic kinetic parameter of growing organisms. It is well known that unicellular organisms grow by increasing their individual mass to a certain level, after which they multiply by division. The time needed for the cells to increase their number by a factor of 2 (which also holds for the total cell mass if an average cell mass is assumed) follows from the exponential growth equation:
2 M xo = M xo ⋅ exp(µmax ⋅ τd)
(7.39)
whereby τd is the biomass doubling time. Division by M xo, taking the natural logarithm of both sides and subsequent rearrangement yields:
$$\tau_d = \frac{\ln 2}{\mu^{max}} = \frac{0.693}{\mu^{max}}$$
(7.40)
Using this equation it can be calculated that µmax = 0.05 h⁻¹ is equivalent to a doubling time of the cells of τd = 0.693/0.05 = 13.9 hours.
The Ms(t) function obtained earlier (Equation 7.38) can be used to obtain a value for qsmax. If at a given time t the substrate concentration and the culture volume have been measured, the total amount of substrate present at time t can be calculated from Ms(t) = V(t) ⋅ Cs(t). If, in addition, the initial amounts of substrate and biomass, Mso and Mxo, as well as the maximum specific growth rate µmax are known, qsmax can be obtained from a plot of the consumed substrate Ms(t) − Mso (which is negative) versus the produced biomass Mx(t) − Mxo. The slope of the line then equals qsmax/µmax. This is done for the data from the batch example, which are shown in Table 7.6.
The corresponding plot is shown in Figure 7.1. It can be seen from this plot that linear regression results in a slope qsmax/µmax equal to −0.333 mol glucose consumed per C-molX produced.
Figure 7.1 Plot of Ms(t)-Mso (mol) against Mx(t)-Mxo (C-mol) for the data from the batch culture example. Fitted regression line: y = -0.3333x - 4E-05.
With µmax = 0.050 C-molX/h per C-molX, the maximum specific substrate uptake rate can be calculated as: qsmax = −0.333 × 0.050 = −0.0166 mol glucose/h per C-molX.
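The regression on the data of Table 7.6 can be reproduced with a short script. The sketch below is an illustration under stated assumptions rather than the authors' calculation: it uses an ordinary least-squares fit with intercept, written out by hand, and the names `linear_fit`, `SLOPE`, and `QS_MAX` are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Data of Table 7.6
DELTA_MX = [0.0, 0.513, 1.052, 2.214, 4.918, 12.255]      # Mx(t) - Mxo, C-molX
DELTA_MS = [0.0, -0.171, -0.351, -0.738, -1.639, -4.085]  # Ms(t) - Mso, mol glucose
MU_MAX = 0.050                                            # 1/h

SLOPE, INTERCEPT = linear_fit(DELTA_MX, DELTA_MS)
QS_MAX = SLOPE * MU_MAX   # qsmax = (qsmax/mumax) * mumax

if __name__ == "__main__":
    print(f"slope = {SLOPE:.4f} mol glucose per C-molX")
    print(f"qs_max = {QS_MAX:.4f} mol glucose per C-molX per hour")
```

The fitted slope should agree with the value of about −0.333 read from Figure 7.1, and the intercept should be close to zero, as expected from Equation 7.38.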
This value was also found before, by using the substrate mass balance calculations for each time interval (interpolation approach). It should be realized that if interpolation between time points is used to calculate rates, only approximate values are obtained. The reason is, for the example, that dMs/dt is approximated by a straight line (linear interpolation) and that Mx in this interval is approximated as the average between two time points, which also introduces errors. Therefore the approach presented here, where the observation that in batch conditions qs and µ are constant is used to solve the mass balance based differential equations, is more accurate. It should be kept in mind that for this example, that is, unlimited batch growth, a mathematical model was available to describe the growth (the exponential growth equation). Therefore the relation between the biomass concentration and time was known (an exponential relation). However, if, e.g., under different conditions, this relation is not known, interpolation between measurements is the only way to obtain the rates.
7.4.3.3 Duration of the Batch Culture
If the values of µmax, qsmax, Mso, and Mxo are known, an explicit relation for Ms as a function of time is obtained by substituting these values into Equation 7.37:
Ms(t) = 10 + (−0.333 × 10)[exp(0.05t) − 1]
(7.41)
This equation allows us to calculate how much time it will take until the substrate is completely consumed (Ms(t) = 0):
0 = 10 − 3.33 [exp(0.05t) − 1]
This yields exp(0.05t) = 4.0, from which it can be calculated that t = 27.73 hours.
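The same calculation can be written in closed form by solving Equation 7.35 for the time at which Ms(t) = 0. The following Python sketch is illustrative only; the function name `batch_duration` is hypothetical, and the rearrangement is shown in the comments.

```python
import math


def batch_duration(ms0, mx0, mu_max, qs_max):
    """Time at which Ms(t) = 0, obtained by solving Equation 7.35 for t."""
    # 0 = Mso + (qs/mu) * Mxo * (exp(mu*t) - 1)
    #   => exp(mu*t) = 1 - Mso * mu / (qs * Mxo)
    ratio = 1.0 - ms0 * mu_max / (qs_max * mx0)
    return math.log(ratio) / mu_max


if __name__ == "__main__":
    # Batch example: Mso = 10 mol, Mxo = 10 C-mol, mu_max = 0.05 1/h,
    # qs_max/mu_max = -1/3 mol glucose per C-molX
    t_end = batch_duration(10.0, 10.0, 0.05, -0.05 / 3.0)
    print(f"substrate is exhausted after about {t_end:.2f} h")
```

Since qsmax is negative, the argument of the logarithm is larger than 1 and the computed duration is positive.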
7.4.4 Stoichiometric Coefficients
The ratio (qsmax/µmax) has the dimension mol substrate used per C-molX produced and is a so-called stoichiometric coefficient. These stoichiometric coefficients are also called operational yields (hence the symbol Y) and are always defined as ratios of conversion rates:
$$Y_{ij} = \frac{q_j}{q_i} = \frac{Rate_j}{Rate_i}$$
(7.42)
A well-known yield is the biomass yield, which is indicated with the symbol Ysx and has the dimension C-molX produced/mol substrate consumed. The biomass yield Ysx is defined as
$$Y_{sx} = \frac{Rate_x}{Rate_s} = \frac{\mu M_x}{q_s M_x} = \frac{q_x}{q_s} = \frac{\mu}{q_s}$$
(7.43)
During unlimited growth in a batch culture the biomass yield is equal to:
$$Y_{sx}^{batch} = \frac{\mu^{max}}{q_s^{max}}$$
(7.44)
This implies that the yield does not change during the period of unlimited growth in batch culture. For the batch experiment of the example it can be calculated that Ysxbatch = 0.05/(−0.0166) = −3.0 C-molX/mol glucose, which implies that 3.0 C-molX are produced per mol glucose consumed. Substituting the above relation for the biomass yield in batch culture in Equation 7.38 yields, after rearrangement:
$$Y_{sx}^{batch} = \frac{dM_X}{dM_S} = \frac{M_X(t) - M_X(0)}{M_S(t) - M_S(0)}$$
(7.45)
Clearly the batch biomass yield follows directly from the changes in the amounts of substrate and biomass.
7.4.5 Obtaining q-Rates and Stoichiometry from Batch Experiments: Some Pitfalls
In the previous sections we have introduced the total rate Ratei (in mol/time), the biomass specific rate qi (in mol/C-mol biomass/time), and yields Yij = qj/qi (in mol/mol), and it has been shown how these rates and yields can be obtained from experimental data and the proper mass balances, using the batch experiment as an example. Below, additional examples will be presented, each of which contains a pitfall, leading to a message.
Example 4: Growth Rates and Yields in a Batch Experiment. Message: do not neglect volume changes!!
During a batch growth experiment the liquid volume and the concentrations of substrate and biomass were measured. The results are shown in Table 7.7. During the experiment the volume has increased slightly (only 5%) due to the continuous addition of an alkaline solution for pH control.
Question: Calculate the biomass yield Ysx.
Answer: First we calculate the total amounts of substrate, Ms, and biomass, Mx, at 0 and 10 hours, using the Cs, Cx, and V measurements (refer to Table 7.8). The produced biomass is then 3.675 − 2.0 = 1.675 kgX and the consumed substrate is 36.75 − 40 = −3.25 kgS, hence the yield of biomass on substrate is calculated as:
Ysx = 1.675/3.25 = 0.515 kgX/kgS.
Table 7.7 Data from a Batch Experiment

| Time (hour) | Cs (kg/m3) | Cx (kg/m3) | V (m3) |
|---|---|---|---|
| 0 | 40 | 2 | 1.00 |
| 10 | 35 | 3.5 | 1.05 |
Table 7.8 Calculated Total Amounts of Substrate and Biomass

| Time (hour) | Ms (kgS) | Mx (kgX) |
|---|---|---|
| 0 | 40 | 2.0 |
| 10 | 36.75 | 3.675 |
Question: Calculate Ysx without taking the change in volume into account.
Answer: Ysx is then calculated from the concentration differences as Ysx = (3.5 − 2.0)/(40 − 35) = 1.5/5 = 0.30 kgX/kgS. Comparison with the result of the correct calculation shows that this is too low by about 40%, which shows the danger of neglecting small volume changes in calculating yields!! So never calculate yields from concentration differences!!
Question: Calculate qs and µ.
Answer: qs is defined as qs = Rates/Mx. From the mass balance for substrate it is found that for the 0–10 hour time interval:
$$Rate_s = \frac{dM_s}{dt} = \frac{M_s(10) - M_s(0)}{10 - 0}$$
Substituting the measurements for Ms at 10 and 0 hours yields: Rates = (36.75 − 40)/10 = −0.325 kgS/hour.
To calculate qs we need the average biomass amount, which is equal to:
(3.675 + 2)/2 = 2.84 kgX.
Subsequently qs can be calculated as: qs = −0.325/2.84 = −0.1144 kgS/kgX/hour. Similarly one can calculate Ratex = (3.675 − 2.0)/10 = 0.1675 kgX/hour and µ = Ratex/Mx = 0.1675/2.84 = 0.0590 kgX/hour per kgX. Ysx can now also be calculated as Ysx = µ/qs = Ratex/Rates = 0.1675/(−0.325) = −0.515 kgX/kgS. This is (apart from the − sign) identical to the answer to the first question.
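The calculations of Example 4 can be collected in a small script, which makes the effect of the volume correction easy to see. This is a hedged sketch using the data of Table 7.7; the function name `example4` and the dictionary keys are hypothetical. Unrounded intermediate values are used, so the last digit can differ slightly from the hand calculation above.

```python
def example4():
    """Yields and specific rates for the data of Table 7.7."""
    dt = 10.0                          # h
    cs0, cs1 = 40.0, 35.0              # kgS/m3
    cx0, cx1 = 2.0, 3.5                # kgX/m3
    v0, v1 = 1.00, 1.05                # m3

    # Total amounts (Table 7.8): M = C * V
    ms0, ms1 = cs0 * v0, cs1 * v1
    mx0, mx1 = cx0 * v0, cx1 * v1

    rate_s = (ms1 - ms0) / dt          # kgS/h, negative (consumption)
    rate_x = (mx1 - mx0) / dt          # kgX/h
    mx_avg = 0.5 * (mx0 + mx1)         # linear interpolation

    return {
        "yield_from_amounts": (mx1 - mx0) / (ms0 - ms1),
        "yield_from_concentrations": (cx1 - cx0) / (cs0 - cs1),
        "qs": rate_s / mx_avg,
        "mu": rate_x / mx_avg,
    }


if __name__ == "__main__":
    for name, value in example4().items():
        print(f"{name}: {value:.4f}")
```

Note that the ratio µ/qs does not depend on the average biomass amount, because the same Mx appears in both specific rates.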
In the above approach we used linear interpolation to obtain average values for Rates, Ratex, and Mx in the time interval 0–10 hours. A more accurate approach is to use our knowledge that in batch µ and qs are constant (µmax and qsmax). Using the exponential relation
$$\ln\frac{M_x(t)}{M_{xo}} = \mu^{max} \cdot t$$
it can be found:
µmax = ln(3.675/2)/10
(7.46)
This gives µmax = 0.061 h⁻¹, which is indeed slightly different from the previous value (0.059 h⁻¹). It can be shown that when the time interval is shorter than 0.1/µmax both procedures give nearly the same result. This would require a time interval between the measurements of less than 0.1/0.061 ≈ 1.6 hours.
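The difference between the interpolation estimate and the exponential estimate of µ can be explored numerically. The sketch below is an illustration, not part of the original text; the function names are hypothetical, and `relative_difference` assumes ideal exponential growth with a known µ.

```python
import math


def mu_interpolated(mx0, mx1, dt):
    """mu from linear interpolation: average rate / average biomass."""
    return ((mx1 - mx0) / dt) / (0.5 * (mx0 + mx1))


def mu_exponential(mx0, mx1, dt):
    """mu from the exponential growth relation ln(Mx1/Mx0) = mu * dt."""
    return math.log(mx1 / mx0) / dt


def relative_difference(mu, dt):
    """Relative deviation of the interpolated mu for ideal exponential growth."""
    mx1 = math.exp(mu * dt)            # biomass after dt, starting from 1
    return 1.0 - mu_interpolated(1.0, mx1, dt) / mu


if __name__ == "__main__":
    print(f"interpolated: {mu_interpolated(2.0, 3.675, 10.0):.4f} 1/h")
    print(f"exponential:  {mu_exponential(2.0, 3.675, 10.0):.4f} 1/h")
    for dt in (1.6, 5.0, 10.0):
        print(f"dt = {dt:4.1f} h   deviation = {relative_difference(0.061, dt):.2%}")
```

For short intervals the deviation shrinks roughly with the square of µ times the interval length, which is consistent with the rule of thumb quoted above.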
Example 5: Batch Experiments on Lab Scale and Production Scale
A microorganism is grown in a batch cultivation carried out in a lab reactor (around 1 liter volume), and growth and product formation are analyzed by measuring the biomass and product concentrations (CX and CP) and the total broth volume (V) as a function of time between 10 and 13 hours. The broth volume decreases strongly due to water evaporation. The measurements are displayed in Table 7.9.
Question: Calculate µ and qp between 10 and 13 hours.
Answer: In order to calculate µ we first need to calculate the average biomass production rate, Ratex, between 10 and 13 hours. This follows from the biomass mass balance as dMx/dt = Ratex = ((9 × 0.787) − (5 × 1.05))/3 = 0.611 g/h. This production is realized by the average biomass amount present in the reactor between 10 and 13 hours, which is ((5 × 1.05) + (9 × 0.787))/2 = (5.25 + 7.08)/2 = 6.165 g. Hence µ = 0.611/6.165 = 0.0991 gX/gX/h. In order to calculate the specific rate of product formation, qp, we first have to calculate the total production rate Ratep, which follows from the mass balance for product as 0.456 gP/h. With the average amount of biomass (6.165 g), qp follows as qP = 0.456/6.165 = 0.074 gP/gX/h.
Question: The previous laboratory batch experiment is also performed on production scale, where the fermentor volume is increased nearly 5000-fold from 1.05 liter to 5.0 m3. The experimental measurements are given in Table 7.10. It can be inferred from Table 7.10 that at t = 10 hours the same concentrations are measured as on lab scale, which is expected. However, at t = 13 hours the biomass and product concentrations are significantly lower at production scale. The following question immediately arises: does the organism grow and produce worse at large scale (a so-called scale-up effect)?
Answer:
At first glance, the concentrations of biomass and product for the fermentation carried out on production scale are much lower at 13 hours, so it would appear that there is a scale-up effect. However, to evaluate the performance of the cells one should not compare product concentrations but specific conversion rates!! Using the proper approach (mass balance based, see before) it can be calculated that µ = 0.093 gX/gX/h and qp = 0.074 gP/gX/h. These values are very close to the values observed in the lab scale experiment.

Table 7.9 Results from a Batch Cultivation Carried Out on Lab Scale

| Time t (hours) | Volume V (L) | CX (kg/m3) | CP (kg/m3) |
|---|---|---|---|
| 10 | 1.050 | 5.0 | 2.0 |
| 13 | 0.787 | 9.0 | 4.407 |
Table 7.10 Results from a Batch Experiment Carried Out on 5 m3 Scale

| Time t (hours) | Volume V (m3) | CX (kgX/m3) | CP (kgP/m3) |
|---|---|---|---|
| 10 | 5.0 | 5.0 | 2.0 |
| 13 | 4.9 | 6.75 | 3.36 |
Table 7.11 Data from a Citric Acid Fermentation

| Time (hours) | V (m3) | CX (kgX/m3) | CS (kgS/m3) | Ccitric acid (kgP/m3) |
|---|---|---|---|---|
| 30.5 | 70.3 | 10.3 | 150 | 2 |
| 37.0 | 68.0 | 10.65 | 120 | 23 |
The main difference between lab and production scale is the change in volume (25% at lab, 2% at production) due to much larger water evaporation on lab scale. Clearly there is no scale up effect for the organism.
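The comparison of Example 5 can be checked with one function applied to both data sets. The following Python sketch is illustrative; the function name `specific_rates` and the constant names are hypothetical, and the data are those of Tables 7.9 and 7.10. Because the specific rates are ratios of amounts, the lab data can stay in liters and the production data in cubic meters.

```python
def specific_rates(dt, v0, v1, cx0, cx1, cp0, cp1):
    """Return (mu, qp) for one interval of a batch culture.

    Total amounts are computed as M = C * V, so that volume changes
    (for example by evaporation) are taken into account.
    """
    mx0, mx1 = cx0 * v0, cx1 * v1
    mp0, mp1 = cp0 * v0, cp1 * v1
    mx_avg = 0.5 * (mx0 + mx1)
    mu = ((mx1 - mx0) / dt) / mx_avg
    qp = ((mp1 - mp0) / dt) / mx_avg
    return mu, qp


# Table 7.9 (lab scale, V in L) and Table 7.10 (production scale, V in m3)
LAB = specific_rates(3.0, 1.050, 0.787, 5.0, 9.0, 2.0, 4.407)
PRODUCTION = specific_rates(3.0, 5.0, 4.9, 5.0, 6.75, 2.0, 3.36)

if __name__ == "__main__":
    print(f"lab:        mu = {LAB[0]:.3f} 1/h, qp = {LAB[1]:.3f} gP/gX/h")
    print(f"production: mu = {PRODUCTION[0]:.3f} 1/h, qp = {PRODUCTION[1]:.3f} gP/gX/h")
```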
Example 6: Product Formation in a Batch Reactor, qp Is the Key Parameter
The fungus Aspergillus produces citric acid (a food preservative) from glucose in an aerobic process. Citric acid (C6H8O7) has a molecular weight (MW) of 192 g/mol. Glucose (C6H12O6) has a MW of 180 g/mol. The citric acid production occurs under conditions of no growth (e.g., because the N-source needed to synthesize new biomass is absent). In a large batch reactor measurements are made at different times; the data are shown in Table 7.11. The volume V decreases due to water evaporation.
Task: Show that there is no biomass growth and also calculate qP and qS. Also provide the proper units of the q-values.
Answer: The concentration of biomass (CX) increases and the volume V decreases, but the total amount of biomass (MX = V ⋅ CX) is constant, being MX = 724.1 kgX. From this it can be concluded that growth is indeed absent!! To calculate qP and qS we first need to calculate the total rates of citrate production (kg/h) and glucose consumption (kg/h). From the mass balance for citrate it follows that the total rate of citrate increase in the cultivation vessel, dMp/dt (kg/h), equals the citrate production rate Ratep = qP ⋅ MX. It can be calculated that the total amount of citrate produced between 30.5 and 37.0 hours equals 68 × 23 − 70.3 × 2 = 1564 − 140.6 = 1423.4 kg. The total citrate production rate is therefore Ratep = 1423.4/6.5 = 219 kg citrate/h. Hence qP can be calculated as Ratep/MX = 219/724.1 = 0.3024 kg citrate/kg biomass per hour. In an analogous way it can be found that the total amount of consumed glucose is equal to 2385 kg in 6.5 hours, from which it can be calculated that qS = −0.5067 kg glucose/kgX/h (negative!!).
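The numbers of Example 6 can be verified with a few lines of code. This is a minimal sketch based on the data of Table 7.11; the function name `example6` and the dictionary keys are hypothetical, and the average of the two (nearly identical) biomass amounts is used as MX.

```python
def example6():
    """No-growth check and specific rates for the data of Table 7.11."""
    t0, t1 = 30.5, 37.0                # h
    v0, v1 = 70.3, 68.0                # m3
    cx0, cx1 = 10.3, 10.65             # kgX/m3
    cs0, cs1 = 150.0, 120.0            # kg glucose/m3
    cp0, cp1 = 2.0, 23.0               # kg citric acid/m3

    mx0, mx1 = v0 * cx0, v1 * cx1      # total biomass, kgX
    mx_avg = 0.5 * (mx0 + mx1)
    dt = t1 - t0

    return {
        "mx": (mx0, mx1),
        "qp": ((v1 * cp1 - v0 * cp0) / dt) / mx_avg,   # kg citrate/kgX/h
        "qs": ((v1 * cs1 - v0 * cs0) / dt) / mx_avg,   # kg glucose/kgX/h
    }


if __name__ == "__main__":
    result = example6()
    print("Mx at start and end (kgX):", result["mx"])
    print(f"qP = {result['qp']:.4f} kg citrate/kgX/h")
    print(f"qS = {result['qs']:.4f} kg glucose/kgX/h")
```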
7.5 Conclusions
The biomass specific conversion rates (q-values) of cells, tissues, or microorganisms cannot be measured directly, but can be calculated from measurements of concentrations, volumes, and flow rates obtained from experiments where the cell, tissue, or microorganism is consuming, growing, and producing (in a flow cell, fermentor, shake flask, etc.). To obtain a qi-value (i = substrate, biomass, product, O2, CO2, N-source, etc.) it is necessary:
• To design a proper experiment
• To set up the proper mass balance for compound i
• To carry out the proper measurements of volume and biomass concentration in order to be able to calculate the total amount of biomass (Mx = V ∙ Cx) which is present in the system as a function of time
• To carry out the proper measurements of in- and outflow rates (if any) and concentrations of compound i as a function of time
• To calculate, from the mass balance of compound i and the measurements, the total conversion rate of compound i, Ratei (mol i/h), as a function of time
• To calculate qi as a function of time from Mx and Ratei as a function of time
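For the batch case treated in this chapter, the steps listed above can be sketched as a single function. This is an illustrative sketch under explicit assumptions: there are no in- or outflows, rates are averaged over each sampling interval by linear interpolation, and the function name `batch_q_rates` and its arguments are hypothetical.

```python
def batch_q_rates(times, volumes, cx, ci):
    """Interval-averaged q_i for a batch culture without in- or outflow.

    times   : sampling times (h)
    volumes : broth volume at each time
    cx, ci  : biomass and compound-i concentrations at each time
    Returns a list of (interval midpoint, q_i) tuples.
    """
    mx = [v * c for v, c in zip(volumes, cx)]   # Mx = V * Cx
    mi = [v * c for v, c in zip(volumes, ci)]   # Mi = V * Ci
    result = []
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        rate_i = (mi[k + 1] - mi[k]) / dt       # batch balance: dMi/dt = Rate_i
        mx_avg = 0.5 * (mx[k] + mx[k + 1])      # linear interpolation
        result.append((0.5 * (times[k] + times[k + 1]), rate_i / mx_avg))
    return result


if __name__ == "__main__":
    # Biomass data of Table 7.5, taking a constant volume of 1 (assumption)
    t = [0, 1, 2, 4, 8, 16]
    mx = [10.000, 10.513, 11.052, 12.214, 14.918, 22.255]
    for t_mid, mu in batch_q_rates(t, [1.0] * len(t), mx, mx):
        print(f"t = {t_mid:5.1f} h   mu = {mu:.4f} 1/h")
```

Passing the biomass concentration as compound i returns µ; passing a substrate or product concentration returns qs or qp with the sign convention used in this chapter.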
The biomass specific conversion rates (q-rates) obtained in this way, measured as a function of time and of the experimental conditions applied, allow us to obtain kinetic relations for qi as a function of the external conditions. These allow us to set up mathematical models by combining the proper mass balances and these kinetic relations. Such mass balance based models can then be used to generate predictions of the culture behavior as a function of the cultivation conditions, and can thus be applied for the design and optimization of fermentation processes. For reasons of simplicity we have limited ourselves in this chapter to batch cultivations, and therefore the kinetic relations used were not very exciting (zero order kinetics). The latter immediately indicates the disadvantage of batch experiments for the study of q-rates, namely that the experimenter has no control over the q-rates, which are constant (the cell has control). Before we approach the question of how the experimenter can manipulate the q-rates, it is important to address another question. Cells and organisms grow (µ), consume substrate (qs), and make product (qp). To do so they also need O2 and produce CO2, heat, etc. The question is: how do we obtain these "other rates"? The answer is to use stoichiometric calculations.
8 Data Reconciliation and Error Detection
Peter J.T. Verheijen, Delft University of Technology
8.1 Introduction
General Sketch • Example Problem • Problem Definition
8.2 Statistical Framework: Theory
Variables and Relations • Optimization Formulation • Linear Constraints • Example Problem Solved Linearly • Nonlinear Constraints • Example Problem Extended
8.3 Deviations: Reality
Missing Data • Testing for Deviations • Gross Errors • Example Problem with Tests
8.4 Computations and Reconciliation
8.5 Biotechnology and Reconciliation
8.6 Discussion and Conclusion
References
8.1 Introduction
8.1.1 General Sketch
In a typical fermentation process many variables are observed and recorded. Often there is a certain redundancy in the measurements: either several replicates are taken for the same observable, or there exists an implicit relation between different measurements. A very obvious one concerns the observations on influents and effluents. It is to be expected that the incoming mass flow equals the outgoing mass flow. However, due to measurement errors there are always differences. Application of the mass conservation law provides an opportunity to find a better estimate of these same observables. This is generalized to the situation where all relations that exist between observables are gathered together and a reconciliation process is carried out in order to find new estimates for these observables that fulfill the required relations. The advantages of this approach are threefold. Firstly, if data reconciliation is applied, any engineering decision for process control and optimization will be based on estimates with less inherent uncertainty. Secondly, any knowledge that one looks for in microbial processes will be made clearer: any subsequent estimation of parameters of internal processes that are being studied will be more easily determined than from the raw data. Thirdly, reconciliation provides an internal check on the validity of the data and can point to significant errors in the experimental procedure. For example, if the effluent mass is significantly lower than the influent mass, it could mean that some components such as off-gas are missed in the measurements.
The reality is of course more complicated. There might be missing data and genuine outliers that are caused by malfunctioning apparatus. Also, the relationships that are used to introduce the redundancy could be unjustified. In the example of the missing off-gas above, the conservation of mass should not be applied. A third, often overlooked, problem is that the statistics behind the standard data reconciliation technology are based on the assumption that measurement errors are normally distributed. Therefore, this technique is very useful, but precautions should be taken to cater for these deviations from the standard theory. The basis of the work presented here is an extension of Van der Heijden et al. (1994a, 1994b). A short overview of the field of data reconciliation is given by Crowe (1996). He also cites the often quoted papers: Kuehn and Davidson (1961, one of the first formulations of the data reconciliation problem), Ripps (1965, measurement elimination and gross errors), Almasy and Sztano (1975, statistical testing), and Crowe et al. (1983, projection method for linear reconciliation). Romagnoli and Sanchez (2000) have compiled a complete textbook on data reconciliation, which covers the whole field and also gives many numerical examples. A recent summary is given by Heyen and Kalitventzeff (2006).
8.1.2 Example Problem
The following data are taken from an aerobic, carbon-limited chemostat culture of E. coli with glucose as the growth-limiting substrate. After reaching a steady state, measurements were performed of the concentrations of glucose in the feed and of glucose and biomass in the effluent, of oxygen and carbon dioxide in the off-gas, and of the medium feed flow and the aeration rate. From these measurements the net rates of glucose consumption, biomass formation, oxygen uptake, CO2 production, and by-product formation were calculated. There is an ammonia-containing substrate, which we will lump into NH3, and water is also a product. Considering only the elements C, H, O, and N, two rates were not measured, namely ammonium consumption and water production. The seven variables are summarized in Table 8.1, all in the same units of mmol per C-mole biomass per hour. Note that feed streams have a negative and product streams a positive sign.

Table 8.1 Rate Data in mmol·Cmol⁻¹·h⁻¹ with Experimental Errors for the Example Problem

| Compound | Variable | Net Conversion Rate |
|---|---|---|
| Glucose | x1 | -38.41 ± 0.85 |
| Oxygen | x2 | -92.3 ± 4.5 |
| NH3 | x3 | Unknown |
| Biomass | x4 | 104.5 ± 1.0 |
| By-product | x5 | 19.01 ± 0.56 |
| Carbon dioxide | x6 | 98.7 ± 2.1 |
| Water | x7 | Unknown |

In this case there are four relations to be fulfilled, namely the conservation relations for each of the atoms carbon, hydrogen, nitrogen, and oxygen. These are represented by the successive rows in the set of equations
$$\begin{pmatrix} 6 & 0 & 0 & 1 & 1 & 1 & 0 \\ 12 & 0 & 3 & 2.0045 & 2.0045 & 0 & 2 \\ 0 & 0 & 1 & 0.2366 & 0.2366 & 0 & 0 \\ 6 & 2 & 0 & 0.4477 & 0.4477 & 2 & 1 \end{pmatrix} x = 0,$$
(8.1)
where in the columns the atomic composition of each of the seven compounds is given. The atomic compositions of biomass and by-product are given for the amount containing 1 mol of carbon. Note that, because the by-product is assumed to be lysed biomass, the atomic compositions of biomass and by-product are identical. In this example there are five measurements and four equality constraints, giving us information for seven rates to be estimated. It would suffice to use three measurements and four equations to calculate the seven rates. Alternatively, five measurements and two equations are sufficient to calculate seven rates. So there is redundancy in this combination of measurements and relations. The question is how to use this redundancy.
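The redundancy can be made tangible with the carbon balance, the only row of Equation 8.1 that involves measured rates exclusively (oxygen, NH3, and water contain no carbon). The following Python sketch is an illustration, not part of the original text. It assumes independent measurement errors, and the names `CARBON_TERMS` and `balance_residual` are hypothetical.

```python
import math

# (carbon atoms per mole, measured rate, standard error) from Table 8.1,
# in mmol per C-mol biomass per hour. Only measured carbon-containing
# compounds enter the carbon balance (first row of Equation 8.1).
CARBON_TERMS = [
    (6, -38.41, 0.85),   # glucose
    (1, 104.5, 1.0),     # the three measured one-carbon products
    (1, 19.01, 0.56),    # (biomass, by-product, carbon dioxide); the balance
    (1, 98.7, 2.1),      # does not depend on which value belongs to which
]


def balance_residual(terms):
    """Residual of one balance row and its standard deviation.

    Assumes independent measurement errors (an assumption of this sketch).
    """
    residual = sum(n * rate for n, rate, _ in terms)
    sigma = math.sqrt(sum((n * err) ** 2 for n, _, err in terms))
    return residual, sigma


if __name__ == "__main__":
    residual, sigma = balance_residual(CARBON_TERMS)
    print(f"carbon balance residual = {residual:.2f} +/- {sigma:.2f}")
```

A residual that is not exactly zero, but of the same order as its propagated error, is what one expects from measurements that are consistent with the balance apart from random errors. Reconciliation uses exactly this kind of mismatch to adjust the estimates.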
8.1.3 Problem Definition
The purpose of data reconciliation is to find new estimates* for observed variables, taking into account the existing data and additional model relations involving those variables, e.g., constraints such as mass or elemental conservation. In order to achieve this, we will first develop the framework for the case that these model relations are linear, and subsequently for the nonlinear case. This will give the basis of the theory in terms of its mathematical formulation. It is extended in the next section, which deals with special conditions such as missing data. Essentially, the data reconciliation problem is translated into an optimization problem or a set of equations, which must be solved numerically. Some relevant numerical aspects of dealing with a large number of variables are given in the next section. The chapter is concluded with a section describing actual applications in the field of biotechnology.
8.2 Statistical Framework: Theory
8.2.1 Variables and Relations
On a given system we will denote the vector of n variables of interest by x, their measurements by y, and their estimates after the reconciliation step by x̂. Here we will not make an explicit distinction between measured and unmeasured variables, because the formulas then remain compact, unlike, e.g., the treatment by Romagnoli and Sanchez (2000). So, x contains all variables to be estimated, including the missing data and any parameters in models that we wish to know but cannot measure directly. In the notation used here, those elements of vector y that are unmeasured can have an arbitrary value. We will assume that the measurement errors are known and given by a vector σ, as they can differ for each of the variables i = 1…n. In some cases the measurements are correlated, and then the variance-covariance matrix of the measurements is given as Var(y). The basic assumption is that the measurements are normally distributed
$$y \sim N(x, \mathrm{diag}(\sigma^2)),$$
(8.2)
and in the case of correlated data
$$y \sim N(x, V), \quad \text{where } V = \mathrm{Var}(y).$$
(8.3)
The rows and columns in V associated with the unmeasured variables, i ∈ {not measured}, are not relevant and we set these to zero, except for the diagonal elements. Where appropriate, in any derivation the limit of these diagonal elements to +∞ is taken.
* There is a tendency to state that the "data are improved." Data are the actual observations, and as such are given once and for all. Only the derived estimates can be improved.
Next to the observables there are the relations that exist between them. From a mathematical point of view all relations are acceptable, but one should distinguish between "hard" relations such as mass balances and "soft" relations such as kinetic equations, which often have an approximate character and could cause a model-related bias in the estimations. In data reconciliation, only the first category is included. These we will denote by the vector functions h and g for the equality and inequality constraints, respectively. In the standard reconciliation these are only functions of the observed variables
$$h(\hat{x}) = 0 \quad \text{and} \quad g(\hat{x}) \le 0.$$
(8.4)
These are the equations that create redundancy. With these equations better estimates can be found for the variables. The above sketches the ideal situation. In practice, there exist added challenges. There are often missing data, outliers, and non-Gaussian distributions on the side of the measurements, while the given equations are sometimes not exactly fulfilled, and a bias could be suspected.
8.2.2 Optimization Formulation
Under the conditions formulated in Equations 8.3 and 8.4 the search for the estimates of the measured observables is a least squares method (LSM) problem. The objective function is the distance between the measurement vector y and the vector of estimates x̂ weighted by the measurement error. In its simplest form it reduces to a sum-of-squares
\[ SS(\hat{x}) = \sum_{i \in \text{measured}} \left( \frac{y_i - \hat{x}_i}{\sigma_i} \right)^2 \tag{8.5} \]
and
\[ \hat{x} = \arg\min SS(\hat{x}) \quad \text{s.t.} \quad h(\hat{x}) = 0,\; g(\hat{x}) \le 0 \tag{8.6} \]
is the estimator that minimizes this distance. The formulation is extended to the case of correlated data by the introduction of the weight matrix
\[ W = \operatorname{Var}(y)^{-1}, \tag{8.7} \]
which is the inverse of the covariance matrix. If there are unmeasured variables, the columns and rows associated with those variables are set to zero. The name weight matrix comes from the fact that relatively large values of W (that is, small values of the related measurement error) allow for large additions to the sum-of-squares, and therefore determine for a large part the minimum distance. In the more amenable vector notation, Equation 8.5 becomes
\[ SS(\hat{x}) = (y - \hat{x})^t \, W \, (y - \hat{x}), \tag{8.8} \]
and the optimization problem in Equation 8.6 remains the same. In this manner, the data reconciliation problem is formulated as a general optimization problem. This has the added advantage that the total optimization framework can easily be applied. Aspects of continuity, convexity, and necessary conditions for an optimum are properly dealt with in optimization theory. Secondly, there exist many software packages to solve this problem offering a variety of algorithms.
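The two forms of the objective function can be compared numerically. The following is a minimal sketch, assuming NumPy is available; the numbers, the function names, and the boolean `measured` mask are hypothetical and serve only to illustrate Equations 8.5, 8.7, and 8.8 for a diagonal covariance matrix with one unmeasured variable.

```python
import numpy as np

def sum_of_squares(y, x_hat, sigma, measured):
    """Equation 8.5: sum over the measured variables of ((y_i - x_hat_i) / sigma_i)^2."""
    r = (y - x_hat)[measured] / sigma[measured]
    return float(np.sum(r ** 2))

def sum_of_squares_matrix(y, x_hat, W):
    """Equation 8.8: (y - x_hat)^t W (y - x_hat)."""
    d = y - x_hat
    return float(d @ W @ d)

# Hypothetical numbers: three variables, the third one unmeasured.
y = np.array([10.2, 5.1, 0.0])         # the unmeasured entry may hold any value
sigma = np.array([0.3, 0.2, 1.0])      # sigma of the unmeasured entry is never used
measured = np.array([True, True, False])
x_hat = np.array([10.0, 5.0, 15.0])    # some candidate estimate

# Weight matrix (Equation 8.7, diagonal case): inverse variances, with the
# rows and columns of unmeasured variables set to zero.
W = np.diag(np.where(measured, 1.0 / sigma ** 2, 0.0))

ss_sum = sum_of_squares(y, x_hat, sigma, measured)
ss_mat = sum_of_squares_matrix(y, x_hat, W)
```

Both expressions give the same value because the zero weight removes the unmeasured variable from the sum.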
From a statistical point of view, the sum-of-squares is the central statistic that can be evaluated. If the data are really normally distributed, the given covariance matrix is a good estimate for the error model, and the active constraints are all independent of each other, then
\[ SS(\hat{x}) \sim \chi^2(m - n_u), \tag{8.9} \]
i.e., it has a chi-square distribution with a number of degrees of freedom equal to the number of active constraint relations, m, that are in operation at the optimum, corrected for the number of unmeasured variables, n_u. This number of degrees of freedom, given by m − n_u, is also called the degree of redundancy.
8.2.3 Linear Constraints
In the case that the constraints are given by linear relations, the optimization problem above can be solved explicitly. To focus the argument, we will assume that the optimum is found and that the inequality constraints can be divided between active ones, which are lumped together with the set of equality constraints, and the remaining ones, which do not need to be considered. This is locally justified. All these relations are lumped into a single set of linear equations
\[ \left. \begin{array}{l} h(\hat{x}) = 0 \\ g_k(\hat{x}) = 0 \ \text{for all active } k \end{array} \right\} \;\rightarrow\; A\hat{x} = b. \tag{8.10} \]
A is an m × n matrix representing the m effective equality constraints working on the n observables to be estimated. Optimization theory shows that the solution to the optimization problem of Equation 8.6 can be found by considering the function
\[ SS(\hat{x}) + \lambda^t (A\hat{x} - b) \tag{8.11} \]
as a function of both the n estimates x̂ and the m introduced parameters λ, which are called the Lagrange multipliers (see e.g., Fletcher, 2000). At the point where the derivatives of this function with respect to each of these n + m variables equal zero, there is a saddle point, which is also the point where the optimum of
\[ SS(\hat{x}) \sim \chi^2(r_A - n_u) \tag{8.12} \]
is reached, subject to all the equality constraints. In this case the number of degrees of freedom equals the rank of matrix A, r_A, with the number of unmeasured variables, n_u, subtracted. The saddle point condition leads to the following set of equations
\[ \begin{pmatrix} W & A^t \\ A & Z_m \end{pmatrix} \begin{pmatrix} \hat{x} \\ \lambda \end{pmatrix} = \begin{pmatrix} W y \\ b \end{pmatrix}, \tag{8.13} \]
where Z_m is an m × m matrix of zeros. This matrix relation is the central set of linear equations that determines the relation between the measured variables y on the right hand side and the new estimates x̂ on the left hand side. For convenience we will introduce the matrix
\[ M = \begin{pmatrix} W & A^t \\ A & Z_m \end{pmatrix}. \tag{8.14} \]
When M is full rank the solution for x̂ is easily found from
\[ \begin{pmatrix} \hat{x} \\ \lambda \end{pmatrix} = M^{-1} \begin{pmatrix} W y \\ b \end{pmatrix}. \tag{8.15} \]
A complete derivation using blockwise matrix inversion leads to an explicit formula, which is often quoted in literature,
\[ \hat{x} = \left( I - V A^t (A V A^t)^{-1} A \right) y + V A^t (A V A^t)^{-1} b. \tag{8.16} \]
However, Equations 8.13 and 8.15 are more transparent. Moreover, Equation 8.15 gives the Lagrange multipliers explicitly. From optimization theory it can be shown that these multipliers have a special meaning. Each of the parameters λ_j shows the sensitivity of the objective SS(x̂) at the optimum to the constant b_j belonging to the j-th linear relation. This is sometimes relevant information. If the matrix M is not of full rank, the solution is given by a particular solution and an arbitrary linear combination of vectors in the null space of M:
\[ \begin{pmatrix} \hat{x} \\ \lambda \end{pmatrix} = M^{\dagger} \begin{pmatrix} W y \\ b \end{pmatrix} + N \mu, \tag{8.17} \]
where M† denotes the pseudo-inverse and the columns of N span the null space of the matrix M. The vector of coefficients, µ, can be freely chosen subject to the conditions applicable to variables x such as non-negativity. It is worthwhile to inspect the matrix N. Each of the top n rows that contains only zeros is associated with an observable that is completely identifiable. The other observables cannot be identified. The statistical properties of the estimates are derived from Equations 8.15 or 8.17. Let us call Q the top left n × n submatrix of M⁻¹. In the case of a singular matrix M, only the identifiable observables need to be considered and the matrix Q consists of the first n rows in M† that are associated with rows of zeros in the matrix N. So, from Equations 8.15 and 8.17 it follows that
\[ \hat{x} = Q W y + \text{a constant term}. \tag{8.18} \]
From this the covariance matrix for the estimates is directly derived,
\[ \operatorname{Var}(\hat{x}) = Q W Q^t. \tag{8.19} \]
In this approach the covariance matrix follows directly from the calculation of M and its inverse. Here, too, a formal derivation is possible, and the result is quoted for completeness' sake:
\[ \operatorname{Var}(\hat{x}) = V - V A^t (A V A^t)^{-1} A V. \tag{8.20} \]
The important point to note here is that all the statistical properties, and also the identifiability of the system to reconcile, are contained in the matrix M of Equation 8.14 and the solution in Equations 8.15 or 8.17. This is summarized in Table 8.2, which gives the procedure to analyse and solve a data reconciliation problem.
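The linear solution can be illustrated with a small numerical sketch, assuming NumPy. The data are hypothetical and are not the example problem of this chapter: two inflows and one outflow obeying x1 + x2 − x3 = 0, all three measured. The sketch solves Equation 8.15 through the matrix M and compares the result with the explicit formulas of Equations 8.16 and 8.20.

```python
import numpy as np

# Hypothetical data: balance x1 + x2 - x3 = 0, all variables measured.
y = np.array([10.2, 5.1, 14.7])
sigma = np.array([0.3, 0.2, 0.4])
A = np.array([[1.0, 1.0, -1.0]])
b = np.array([0.0])
m, n = A.shape

V = np.diag(sigma ** 2)            # covariance matrix of the measurements
W = np.linalg.inv(V)               # weight matrix, Equation 8.7

# Equations 8.13 to 8.15: build M and solve for the estimates and multipliers.
M = np.block([[W, A.T], [A, np.zeros((m, m))]])
rank_M = np.linalg.matrix_rank(M)  # equals n + m when the solution is unique
M_inv = np.linalg.inv(M)
solution = M_inv @ np.concatenate([W @ y, b])
x_hat, lam = solution[:n], solution[n:]

# Equation 8.16: the explicit formula gives the same estimates.
G = V @ A.T @ np.linalg.inv(A @ V @ A.T)
x_hat_explicit = (np.eye(n) - G @ A) @ y + G @ b

# Equations 8.19 and 8.20: covariance matrix of the estimates.
Q = M_inv[:n, :n]
cov_from_Q = Q @ W @ Q.T
cov_explicit = V - G @ A @ V
```

In this sketch the reconciled estimates satisfy the balance exactly, and the diagonal of the covariance matrix is smaller than the original measurement variances.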
8.2.4 Example Problem Solved Linearly
The earlier problem was solved within the linear framework posed above. The obtained sum-of-squares is 3.108, and the estimates with their errors are given in Table 8.3. The degree of redundancy is 2, from four equations minus two unmeasured observables.

Table 8.2 Procedure to Solve a Linear Data Reconciliation Problem
I    Identify observables, x; their measurements, y; and covariance matrix, V = Var(y), which often only has the measurement variances on the diagonal.
II   Determine weight matrix, W = Var(y)⁻¹. Missing measurements are excluded and corresponding rows and columns in W are set to zero.
III  Find the relations between observables and formulate those in terms of Ax = b, where the matrix A and vector b are fixed quantities.
IV   Prepare Equation 8.13. Set up the matrix M and the right hand side vector from A, W and b. Determine rank, r_M, of matrix M.
V    Find estimates x̂ when
       r_M = m + n: Equation 8.15 with a single solution.
       r_M < m + n: Equation 8.17 with multiple solutions characterized by matrix N. Identifiable observables are those with only zeros in associated rows of N.
VI   Determine covariance matrix of x̂ from Equation 8.19.
Table 8.3 Example Problem: Estimates in Linear Case
Compound        Variable  Measured rate   Estimated rate
Glucose         x1        −38.41 ± 0.85   −37.40 ± 0.34
Oxygen          x2        −92.3 ± 4.5     −88.25 ± 1.8
NH3             x3        Unknown         −29.29 ± 0.27
Biomass         x4        104.5 ± 1.0     104.7 ± 1.0
By-product      x5        19.01 ± 0.56    19.08 ± 0.56
Carbon dioxide  x6        98.7 ± 2.1      100.6 ± 1.8
Water           x7        Unknown         144.3 ± 1.8
Rates are given in mmol·Cmol⁻¹·h⁻¹.

Note that the two unknown rates have now been determined and that, where possible, the estimates have a smaller error than the original “pure” measurements.
8.2.5 Nonlinear Constraints
Reality is nonlinear. In the case of fermentation data, net conversion rates are not measured as such but calculated from measured gas and liquid flow rates and concentrations, which are measured separately. The conversion rate of an individual component is then the product of a flow rate and a concentration difference. If the conversion rates are expressed as biomass specific rates, they are divided by the biomass concentration. A component balance then consists of a sum of these products, and is therefore nonlinear in the observables. The case of nonlinear constraints is much more common than the linear case. Its solution as a general optimization problem (Equation 8.6) remains the general approach. The analysis possibilities of the linear case make it attractive to use a local linear approximation. Suppose that the optimum solution has been found; a local linearization of the constraints, h(x), then immediately delivers
\[ A = \left. \frac{\partial h}{\partial x} \right|_{x = \hat{x}}. \tag{8.21} \]
This matrix then is used together with the weight matrix to build the central matrix M of Equation 8.14. This last matrix is the basis of all further analysis. For example, when this matrix is not of full rank, a degenerate solution space around the estimate is found as given by Equation 8.17. The approximate covariance matrix of x̂ follows from Equation 8.19. Whereas in the linear case these results are exactly correct given that all other assumptions have been fulfilled, in the nonlinear case they are only valid within a certain region around x̂. The more data and the higher the accuracy of these data, the more the linear approximation can be used, which is the most common situation.

Table 8.4 Example Problem: Nonlinear Case
Compound        Variable  Calculated rate directly from observables  Estimated rate
Glucose         x9        −38.41 ± 0.85                              −37.36 ± 0.37
Oxygen          x10       −92.3 ± 4.5                                −87.9 ± 1.9
NH3             x11       Unknown                                    −29.31 ± 0.31
Biomass         x12       104.5 ± 1.0                                104.8 ± 1.0
By-product      x13       19.01 ± 0.56                               19.12 ± 0.55
Carbon dioxide  x14       98.7 ± 2.1                                 100.3 ± 1.9
Water           x15       Unknown                                    144.0 ± 1.9
The eight observables are not shown, only the seven rates (in mmol·Cmol⁻¹·h⁻¹).
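As an illustration of the local linearization in Equation 8.21, consider a single hypothetical constraint in which a conversion rate r equals a flow rate F times a concentration difference. The symbols r, F, c_in, and c_out are assumptions chosen for this sketch and are not taken from the example problem.

```latex
% Hypothetical constraint: rate = flow rate times concentration difference
h(x) = r - F\,(c_{\mathrm{in}} - c_{\mathrm{out}}) = 0,
\qquad x = (r,\; F,\; c_{\mathrm{in}},\; c_{\mathrm{out}})^t

% Jacobian of h evaluated at the estimates (Equation 8.21), a 1 x 4 matrix
A = \left. \frac{\partial h}{\partial x} \right|_{x = \hat{x}}
  = \begin{pmatrix}
      1 & -(\hat{c}_{\mathrm{in}} - \hat{c}_{\mathrm{out}}) & -\hat{F} & \hat{F}
    \end{pmatrix}
```

The entries of A depend on the estimates themselves, which is why the linear results hold only in a region around x̂.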
8.2.6 Example Problem Extended
The five “measured” rates in the example problem were actually based on eight different measurements of fluxes and concentrations, e.g., the net O2 uptake and CO2 production are based on the measured gas outflow and concentrations of both molecules. The data and relations will not be given here. The problem now consists of 15 variables (eight observables and seven rates) and nine equations (five rate relations and four component balances). The obtained sum-of-squares is 3.745 with a degree of redundancy of 2, and the estimates for the rates only, with their errors, are given in Table 8.4. In this example there appear to be only minor differences with the linear case, but in general that is not guaranteed.
8.3 Deviations: Reality
8.3.1 Missing Data
Experiments are sometimes disrupted by incidental and momentary equipment failure. Such failures are caused by contamination and clogging of tubes, by incidental exposure to air, by momentary electricity lapses, by computer failure, by communication failure, etc. Very often, a failure only affects a few measurement points in time or just a single sensor, while the other data are valid and therefore valuable. In the presence of extra relations between the observables, as in the case of data reconciliation, it might be possible to obtain proper estimates for the missing observables. As soon as one or more of the data are missing, the weight matrix W and matrix M have to be modified by replacing the associated rows and columns by zeros, forming a new weight matrix, W. In the formulation we have followed so far, this is taken along automatically and it does not make any difference for the calculations, except that the degree of redundancy is reduced by unity for each of the missing data points. The existence of a solution for all unknowns is assured when the matrix M is of full rank, or, in other words, when the set of equations in Equation 8.13 is still solvable. When these equations are not solvable, only the actually measured observables and sometimes a subset of the unobserved quantities can be identified, as described by Equation 8.17.
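The effect of missing data on identifiability can be sketched as follows, assuming NumPy. The example is hypothetical: a single balance x1 + x2 − x3 = 0 in which only x1 is measured, so that M is rank deficient. The pseudo-inverse solution and the null space matrix N of Equation 8.17 are obtained here from a singular value decomposition, and the tolerances are arbitrary choices for this sketch.

```python
import numpy as np

# Hypothetical case: balance x1 + x2 - x3 = 0 with only x1 measured.
y = np.array([10.0, 0.0, 0.0])             # unmeasured entries are arbitrary
sigma = np.array([0.5, 1.0, 1.0])
measured = np.array([True, False, False])
A = np.array([[1.0, 1.0, -1.0]])
b = np.array([0.0])
m, n = A.shape

# Zero weights for the missing measurements.
W = np.diag(np.where(measured, 1.0 / sigma ** 2, 0.0))
M = np.block([[W, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([W @ y, b])

# Rank, null space and pseudo-inverse solution from one SVD.
U, s, Vt = np.linalg.svd(M)
tol = 1e-10 * s.max()
rank = int(np.sum(s > tol))
N = Vt[rank:].T                                        # null space of M
particular = Vt[:rank].T @ ((U[:, :rank].T @ rhs) / s[:rank])

# An observable is identifiable when its row in N contains only zeros.
identifiable = np.all(np.abs(N[:n]) < 1e-8, axis=1)
```

Here the rank of M is 3 instead of n + m = 4. Only x1 is identifiable, because the balance fixes the difference x3 − x2 but not the two values separately.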
8.3.2 Testing for Deviations
Abnormal situations can lead not only to missing data but also to data that are highly perturbed. There is a need to detect those situations. This can be formulated as a statistical test problem. The hypothesis is that the measurements represent a “normal” operating point, and then one needs to identify a statistic that allows testing this hypothesis. Only when this statistic is in a critical region, or its associated probability is very small, is the hypothesis rejected. The assumption behind the data reconciliation procedure is that all measurements are normally distributed around a nominal value, and have a known or estimated measurement error. In order to draw conclusions from the data reconciliation procedure, it is expedient and necessary to validate the data by looking for consistency of the data with these assumptions. In the literature, three possible tests are often used. Firstly, one can compare each of the new estimates with the original measurements, y − x̂. Secondly, one can substitute each of the measurements into the active constraints, h(y) or Ay − b. In these two cases the variances of the respective residuals can be derived easily. Thirdly, the sum-of-squares of the residuals of Equations 8.8 or 8.12 has a known distribution and can therefore be used. These are summarized in Table 8.5. The names of the tests are those commonly used in literature. Each of the statistical tests is performed by evaluating the measure and substituting it into the associated distribution in order to determine the probability that the measure is more extreme than found. This is called the P-value. When the P-value is smaller than a specified threshold α, typically 0.05, for the one-sided global test, the test leads to a rejection of the underlying assumptions. Similarly, this happens when P < α/2 or P > 1 − α/2 for the two-sided measurement and nodal tests. Essentially, the occurrence of that specific P-value is found to be too extreme a probability. The assumptions essentially concern the size of the deviation, the associated standard deviation, or the shape of the distribution.

Table 8.5 Basic Tests in Data Reconciliation
Test              Description                                               Measure  Distribution
Measurement test  Each individual measurement is considered.                y − x̂    N(0, (I − QW)V(I − QW)ᵗ)
Nodal test        Each individual constraint misfit is considered.ᵃ         Ay − b   N(0, AVAᵗ)
Global test       Weighted sum of residuals squared gives an overall view.  SS(x̂)    χ²(r_A − n_u)
ᵃ Here the linear case is given only. In the nonlinear case the measure is h(y) and in the distribution the matrix A represents the Jacobian of h(y).
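The measurement and nodal tests of Table 8.5 can be sketched numerically, assuming NumPy and the Python standard library. The data are hypothetical (one balance x1 + x2 − x3 = 0, all variables measured), and the helper names are illustrative. Each residual is divided by its standard deviation and converted to a cumulative normal probability, which is then compared with α/2 and 1 − α/2 as described above.

```python
import math
import numpy as np

def normal_cdf(z):
    """Cumulative standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def outside_two_sided(z, alpha=0.05):
    """True when the cumulative probability is below alpha/2 or above 1 - alpha/2."""
    p = normal_cdf(float(z))
    return p < alpha / 2 or p > 1 - alpha / 2

# Hypothetical data: balance x1 + x2 - x3 = 0, all variables measured.
y = np.array([10.2, 5.1, 14.7])
sigma = np.array([0.3, 0.2, 0.4])
A = np.array([[1.0, 1.0, -1.0]])
b = np.array([0.0])
m, n = A.shape

V = np.diag(sigma ** 2)
W = np.linalg.inv(V)
M_inv = np.linalg.inv(np.block([[W, A.T], [A, np.zeros((m, m))]]))
Q = M_inv[:n, :n]
x_hat = (M_inv @ np.concatenate([W @ y, b]))[:n]

# Nodal test: A y - b, distributed as N(0, A V A^t).
z_nodal = (A @ y - b) / np.sqrt(np.diag(A @ V @ A.T))

# Measurement test: y - x_hat, distributed as N(0, (I - QW) V (I - QW)^t).
R = np.eye(n) - Q @ W
z_meas = (y - x_hat) / np.sqrt(np.diag(R @ V @ R.T))

flags_nodal = [outside_two_sided(z) for z in z_nodal]
flags_meas = [outside_two_sided(z) for z in z_meas]
```

With a single constraint, every standardized measurement residual has the same magnitude as the standardized nodal residual, so in this sketch the two tests carry the same information.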
The setting of a level of significance means that a priori the probability of rejecting “good data” is given a specific value. Given this specific value it is possible to evaluate the probability of not rejecting an alternative hypothesis, while this alternative hypothesis happens to be present. This is called the power of the test. The higher the power, the higher is the probability of properly identifying the alternative hypothesis while retaining a probability α of rejecting the null hypothesis. In the case of the measurement test and the nodal test the power can be improved by multiplying the residual vectors with a properly chosen matrix. For the measurement test this is only relevant when the weight matrix is nondiagonal, otherwise identical results are obtained. The resulting measures for both tests are given in Table 8.6. The three tests in Table 8.5 and the two tests in Table 8.6 form the major tests that are reported in the literature. Others have been proposed e.g., on the basis of likelihood ratios (Narasimhan and Mah, 1987, which was found to be equivalent to a maximum power test by Crowe, 1992), or with the application of principal component analysis (Tong and Crowe, 1995), but they have found less application. There exists also a Bayesian approach, which takes along prior knowledge about the variables, in order to enhance power of the tests (Narasimhan and Jordache, 2000, Chapter 8). If for a specific data reconciliation problem one or some of the tests give a negative result, it could point to individual measurements or individual constraints that are not fulfilled. Because of the fact that data reconciliation involves a global approach to all observations, a single cause might affect many estimates, and therefore it has been a challenge to identify the single or few outliers and their associated causes.
Table 8.6 Maximum Power Tests in Data Reconciliation
Test                        Description                                         Measure             Distribution
Max power measurement test  Each individual measurement with a weighting        W(y − x̂)            N(0, (I − WQ)W(I − WQ)ᵗ)
Max power nodal test        Each individual constraint misfit with a weighting  (AVAᵗ)⁻¹(Ay − b)    N(0, (AVAᵗ)⁻¹)
8.3.3 Gross Errors
Malfunctioning apparatus normally affects only a single or a small number of observations. The first question is whether they can be detected. The statistical tests in the previous section are the basis for further investigation. The null hypothesis is that there are no gross errors. The measurement test is straightforward and specific: when only one or two measurements lead to a P-value less than the level of significance, α, it is an indication that those measurements are potential gross errors. Some care should be taken when the number of observations, n, is large, because one would then expect about αn observables that are rightfully in the tail of the distribution, but are now flagged as potential gross errors. In order to account for this effect the level of significance is made slightly more conservative, namely
\[ \alpha^{*} = 1 - (1 - \alpha)^{1/n} \approx \frac{\alpha}{n}. \tag{8.22} \]
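The adjusted level of Equation 8.22 is chosen such that, for n independent tests, the probability that none of them flags a good measurement is again 1 − α. A minimal sketch, in which n = 10 is a hypothetical number of measurements:

```python
def adjusted_significance(alpha, n):
    """Equation 8.22: per-measurement level for an overall level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n)

alpha, n = 0.05, 10            # n is a hypothetical number of measurements
alpha_star = adjusted_significance(alpha, n)
approximation = alpha / n      # simple approximation, slightly smaller than alpha_star
```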
With this taken into account, the measurement test immediately pinpoints a single outlier, if present. The situation is often more complicated, and there are several strategies to locate gross errors. One major strategy starts from the global test. When this fails, a single measurement is excluded from the data, and the global test is repeated. The measurement of choice is the one that gives the maximum reduction of the objective function (Equation 8.6). This process is repeated until the global test gives a P-value above the significance level. Of course, the number of eliminated measurements should not exceed the redundancy in the data reconciliation problem. This procedure is referred to as the serial elimination method. Several variants are possible, such as removing more than one measurement in each iteration, or using another objective function such as the P-value derived from the measurement test. Another possibility is to assign a much larger error to the suspected measurements. In the latter case, one essentially accepts the presence of gross errors as observations, which have a distribution as well, but with a much larger variability. In general, the different methods of gross error detection will not guarantee the detection of all gross errors; nor will they avoid false negatives. It is therefore good policy to verify the experimental data logs for each identified potential gross error before it is marked as a real error. Data reconciliation on repeated measurements with the same setup and measurement procedure might indicate systematic biases in the apparatus.
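The serial elimination method can be sketched as follows, assuming NumPy and linear constraints. The example is hypothetical: three meters in series that should read the same flow, the third carrying a gross error. The function names, the chi-square helper, and the choice to stop while one degree of redundancy remains are assumptions of this sketch, not a prescribed implementation.

```python
import math
import numpy as np

def chi2_sf(x, dof):
    """Upper-tail chi-square probability for integer dof >= 1 (recurrence in dof)."""
    half = x / 2.0
    if dof % 2:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    while 2 * a < dof:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

def reconcile(y, sigma, A, b, measured):
    """Solve Equation 8.13 with zero weights for excluded measurements."""
    m, n = A.shape
    w = np.where(measured, 1.0 / sigma ** 2, 0.0)
    M = np.block([[np.diag(w), A.T], [A, np.zeros((m, m))]])
    x_hat = np.linalg.solve(M, np.concatenate([w * y, b]))[:n]
    return x_hat, float(np.sum(w * (y - x_hat) ** 2))

def serial_elimination(y, sigma, A, b, alpha=0.05):
    measured = np.ones(len(y), dtype=bool)
    dof = int(np.linalg.matrix_rank(A))      # no unmeasured variables at the start
    removed = []
    x_hat, ss = reconcile(y, sigma, A, b, measured)
    # Design choice of this sketch: keep at least one degree of redundancy.
    while chi2_sf(ss, dof) < alpha and dof > 1:
        trials = []
        for i in np.flatnonzero(measured):
            trial = measured.copy()
            trial[i] = False
            trials.append((reconcile(y, sigma, A, b, trial)[1], int(i)))
        _, worst = min(trials)               # largest reduction of the objective
        measured[worst] = False
        removed.append(worst)
        dof -= 1
        x_hat, ss = reconcile(y, sigma, A, b, measured)
    return removed, x_hat, ss, dof

# Hypothetical data: x1 = x2 and x2 = x3, third reading is off.
A = np.array([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]])
b = np.zeros(2)
y = np.array([10.0, 10.1, 12.0])
sigma = np.array([0.1, 0.1, 0.1])
removed, x_hat, ss, dof = serial_elimination(y, sigma, A, b)
```

In this sketch the third measurement is eliminated, after which the global test no longer rejects. As noted above, such a flag is only an indication and should be checked against the experimental logs.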
8.3.4 Example Problem with Tests
The tests described above are easily applied to the linear example of which the data are given in Table 8.3. We take α = 0.05. The global test gives P = 0.154. The measurement tests give, respectively, P = 0.10, 0.16, 0.12, 0.12, and 0.04, which should be compared with α* = 0.005. The nodal test can only be applied to the C-balance in Equation 8.1, as that is the only one where all relevant rates had been observed. Here, P = 0.07. None of the tests gives rise to identifying measurements as other than normal.
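The P-value of the global test can be checked by hand when the degree of redundancy is 2, because the upper-tail probability of a chi-square distribution with two degrees of freedom has the closed form exp(−SS/2). The sketch below uses only the Python standard library; the function name is illustrative, and the closed form is valid only for two degrees of freedom.

```python
import math

def global_test_p_value(ss, dof=2):
    """Upper-tail chi-square probability; closed form for dof = 2 only."""
    if dof != 2:
        raise ValueError("this sketch only covers two degrees of freedom")
    return math.exp(-ss / 2.0)

p_linear = global_test_p_value(3.108)     # sum-of-squares quoted in Section 8.2.4
p_extended = global_test_p_value(3.745)   # sum-of-squares quoted in Section 8.2.6
```

Under this assumption the sum-of-squares of 3.745 gives a P-value of about 0.154, and the value of 3.108 gives about 0.21. Both are above α = 0.05, so the global test does not reject in either case.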
8.4 Computations and Reconciliation
The process of solving the basic optimization problem as posed in Equation 8.6 is relatively straightforward. First of all, for the case of linear constraints, the given equations can be solved with any up-to-date linear algebra package. The general case, linear and nonlinear, can be resolved with many optimization methods and algorithms, varying from the robust Nelder-Mead simplex to intricate genetic algorithms. Also the problem of initialization of the optimization is easy, as the measurements are the first estimators for the variables. Smaller data reconciliation problems involving some tens of variables can be solved with a standard spreadsheet. This is good for an initial investigation of a reconciliation problem, but spreadsheets do not supply the covariance information unless specially programmed. A data reconciliation problem as an optimization problem is characterized by the fact that there can be many variables, but that the sparsity of the matrices involved, especially the weight matrix, often allows for using special numerical techniques to make the algorithms more robust. We will assume that the usual precautions are taken, namely that the variables are scaled, and that where possible the derivatives of Equation 8.21 are determined analytically rather than by numerical differentiation. Direct methods such as the Nelder-Mead method of optimization are often time consuming with the number of variables involved. Besides, the Nelder-Mead method is basically not suited for dealing with constraints. The methods of choice in this matter are the large scale sequential quadratic programming (SQP) algorithms. Raghunathan et al. (2003) have combined data reconciliation optimization with an innate cell objective. This leads to a multilevel optimization method.
If the algorithm used does not converge, it is often because a situation has been reached where the measurement set has no redundancy left over, and the degenerate situation of Equation 8.17 applies. A very simple optimization algorithm is to use the Newton-Raphson method of solving the normal Equation 8.13. The idea is as follows. Once there is a k-th estimate, x̂_k, a new estimate can be calculated in analogy with Equation 8.15 from
\[ \hat{x}_{k+1} = \hat{x}_k + M_k^{-1} \begin{pmatrix} W (y - \hat{x}_k) \\ b - A\hat{x}_k \end{pmatrix}. \tag{8.23} \]
Only the first n elements of the matrix-vector product are added to x̂_k; the remaining m elements are the Lagrange multipliers. After each iteration the matrix A is updated with Equation 8.21, and M and M⁻¹ are recalculated. If the degree of redundancy is bigger than zero, this leads to relatively quick convergence for the variables involved.
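The iteration of Equation 8.23 can be sketched for a single nonlinear constraint, assuming NumPy. The constraint (a rate equal to a flow times a concentration), the data, and the function names are hypothetical. For a linearized constraint the term b − A x̂_k equals −h(x̂_k), which is the form used in this sketch.

```python
import numpy as np

def h(x):
    """Hypothetical nonlinear constraint: flow * concentration - rate = 0."""
    flow, conc, rate = x
    return np.array([flow * conc - rate])

def jacobian(x):
    """Analytical derivative of h with respect to x (Equation 8.21)."""
    flow, conc, _ = x
    return np.array([[conc, flow, -1.0]])

def reconcile_newton(y, sigma, n_iter=20):
    n = len(y)
    W = np.diag(1.0 / sigma ** 2)
    x = y.copy()                       # the measurements initialize the iteration
    for _ in range(n_iter):
        A = jacobian(x)                # updated after each iteration
        m = A.shape[0]
        M = np.block([[W, A.T], [A, np.zeros((m, m))]])
        # Linearized constraint: b - A x_k is replaced by -h(x_k).
        rhs = np.concatenate([W @ (y - x), -h(x)])
        step = np.linalg.solve(M, rhs)
        x = x + step[:n]               # the last m entries are the multipliers
    return x, step[n:]

# Hypothetical measurements of flow, concentration and rate.
y = np.array([2.0, 3.1, 6.0])
sigma = np.array([0.05, 0.1, 0.2])
x_hat, lam = reconcile_newton(y, sigma)
```

A fixed number of iterations is used here for brevity; a practical implementation would rather stop when the step size or the constraint residual falls below a tolerance.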
8.5 Biotechnology and Reconciliation
In general the use of systematic methods for data reconciliation in biotechnology is traced to the work by de Kok and Roels (1980) and by Wang and Stephanopoulos (1983). This concerns measurements obtained from steady-state chemostat experiments, where flows and concentrations are measured, while the overall elemental balances give the constraints that need to be fulfilled, as in essence done in the example treated above. An extension to this problem is obtained when one or more parameters are to be estimated as well. Pouliot et al. (2000) have done this, e.g., for the determination of the mass transfer coefficient, KLa, in a yeast fermentation. Besides the balance relations between the directly observed rates there exists also a relation determining the oxygen uptake rate, which involves the unobserved mass transfer coefficient. They found that the KLa thus obtained was more consistent and more accurate than the coefficients obtained with conventional direct calculations. A second extension is to apply data reconciliation to a dynamic system. In that case, equality constraints in the optimization formulation of Equation 8.6 are expressed as differential equations. For example, a balance around a metabolite pool consists of a net inflow and a net outflow, which should balance in the steady state. In the dynamic situation, the pool size itself can change in time, as applied by Liebman et al. (1992). In the implementation, they had to make choices, as the computational complexity for long time series would be unmanageable. Basically, they used a finite horizon and solved the differential equations numerically with orthogonal collocation within this horizon. This interval was then shifted along the whole time series. The input consisted of the time series of all observables, and the constituent equations were the differential equations. Much attention was given to the optimization algorithm. The resulting estimates for the observed quantities were considerably better. A more recent example has been published by Herwig et al. (2001), who applied essentially the same idea to a yeast fermentation. The global test (Table 8.5) was used to detect a change in metabolic state. Before this shift the equations were based on a certain reaction stoichiometry, which changed from an oxidative to an oxido-reductive state. In 1997, Wiechert et al. reported on the application of reconciliation to data obtained from 13C labeling studies. This is a much more extensive problem, as each measured metabolite pool now consists of all possible isotopomers. The basic variables in the system are the fractional labeling states of all carbon pools, the fractional labeling states of all input carbon substrates, and the fluxes between the pools and substrates. Each reaction is described by a transition matrix, which accounts for the placing of each individual C-atom from each of the inputs to each of the products. The isotopomer balance equations around each pool give the constraints that account for the redundancy in the measurements.
8.6 Discussion and Conclusion
The data reconciliation problem has been formulated in Equation 8.6 and one major procedure of solution is outlined in Table 8.2. The implicit assumption is that variance information is always available for all the data. It is either given with the experiment, or homoscedasticity is assumed. The statistical tests given in Table 8.5 or 8.6 are good tools for gross error detection. However, the unsolvable problem here is that these tests can point to potential outliers, but they cannot replace the experimenter in judging whether a genuine event has taken place, or some malfunction from whatever cause. From a computational point of view research is still going on to improve the optimization algorithms, while there are still possibilities to apply more recent techniques. Within the domain of statistics the application of robust estimates rather than least squares estimates has the potential to separate interesting outliers more clearly. Although the bootstrap has become a standard statistical technique, it has not yet become one in data reconciliation. Finally, the use of dynamic data reconciliation applied to fed-batch experiments is still a potential area of development, although the distinction with direct parameter estimation is less obvious here. Data reconciliation and gross error detection have become mature techniques in processing stationary data. However, their actual use still needs some encouragement.
References
Almasy, G.A. and T. Sztano. Checking and correction of measurements on the basis of linear system model. Prob. Control Infor. Theory, 1975, 4 (1), 57–69.
Crowe, C.M., G.Y.A. Garcia Campos, and A. Hrymak. Reconciliation of process flow rates by matrix projection. AIChE J., 1983, 29 (6), 881–888.
Crowe, C.M. Maximum-power test for gross errors in the original constraints in data reconciliation. Can. J. Chem. Eng., 1992, 70 (5), 1030–1036.
Crowe, C.M. Data reconciliation – progress and challenges. J. Proc. Cont., 1996, 6 (2/3), 89–99.
De Kok, H.E. and J.A. Roels. Method for the statistical treatment of elemental and energy balances with application to steady state continuous culture growth of Saccharomyces cerevisiae CBS 426 in the respiratory region. Biotechnol. Bioengin., 1980, 22 (5), 1097–1104.
Fletcher, R. Practical Methods of Optimization, 2nd edition. Wiley, New York, 2000.
Herwig, C., I. Marison, and U. Von Stockar. On-line stoichiometry and identification of metabolic state under dynamic process conditions. Biotechnol. Bioengin., 2001, 75 (3), 345–354.
Heyen, G. and B. Kalitventzeff. Process monitoring and data reconciliation. In Computer Aided Process and Product Engineering (CAPE), L. Puigjaner and G. Heyen (Eds). Wiley-VCH, Weinheim, 2006.
Kuehn, P.M. and H. Davidson. Computer control. II Mathematics of control. Chem. Eng. Progress, 1961, 57 (6), 44–47.
Liebman, M.J., T.F. Edgar, and L.S. Lasdon. Efficient data reconciliation and estimation for dynamic processes using nonlinear programming techniques. Comput. Chem. Engin., 1992, 16 (10-11), 963–986.
Narasimhan, S. and R.S.H. Mah. Generalized likelihood ratio method for gross error identification. AIChE J., 1987, 33 (9), 1514–1521.
Narasimhan, S. and C. Jordache. Data Reconciliation & Gross Error Detection: An Intelligent Use of Process Data. Gulf Publishing Co., Houston, TX, 2000.
Pouliot, K., J. Thibault, A. Garnier, and G.A. Leiva. KLa evaluation during the course of fermentation using data reconciliation techniques. Bioprocess Engin., 2000, 23 (6), 565–573.
Raghunathan, A.U., J.R. Pérez-Correa, and L.T. Biegler. Data reconciliation and parameter estimation in flux-balance analysis. Biotechnol. Bioengin., 2003, 84 (6), 700–709.
Ripps, D.L. Chem. Eng. Prog. Symp. Ser., 1965, 61 (55), 8–13.
Romagnoli, J.A. and M.C. Sanchez. Data Processing and Reconciliation for Chemical Process Operations. Academic Press, London, 2000, pp 270.
Tong, H. and C.M. Crowe. Detection of gross errors in data reconciliation by principal component analysis. AIChE J., 1995, 41 (7), 1712–1722.
Van der Heijden, R.T.J.M., J.J. Heijnen, C. Hellinga, B. Romein, and K.Ch.A.M. Luyben. Linear constraint relations in biochemical reaction systems: I. Classification of the calculability and the balanceability of conversion rates. Biotechnol. Bioengin., 1994a, 43 (1), 3–10.
Van der Heijden, R.T.J.M., J.J. Heijnen, C. Hellinga, B. Romein, and K.Ch.A.M. Luyben. Linear constraint relations in biochemical reaction systems: II. Diagnosis and estimation of gross errors. Biotechnol. Bioengin., 1994b, 43 (1), 11–20.
Wang, N.S. and G. Stephanopoulos. Application of macroscopic balances to the identification of gross measurement errors. Biotechnol. Bioengin., 1983, 25 (9), 2177–2208.
Wiechert, W., C. Siefke, A.A. deGraaf, and A. Marx. Bidirectional reaction steps in metabolic networks. 2. Flux estimation and statistical analysis. Biotechnol. Bioengin., 1997, 55 (1), 118–135.
9 Black Box Models for Growth and Product Formation
Joseph J. Heijnen, Delft University of Technology
9.1 Introduction
9.2 Kinetic Simplicity: Achievement of a Single Nutrient Limited Condition by Medium Design
9.3 Fermentor Transport Mechanisms as a Tool to Control Extracellular Concentrations and therewith Control the q-rates: the Chemostat
Transport Mechanisms can be Applied to Control Extracellular Concentrations • Control of a Constant Extracellular Substrate Concentration using Substrate Transport • Control of a Constant Biomass Concentration using Biomass Transport • Control of a Constant Extra-cellular Product Concentration using Product Transport • Manipulation of Biomass Specific Conversion Rates in a Chemostat
9.4 Black Box Kinetic Functions for qs, qp, µ under Single Nutrient (Substrate) Limited Conditions
Substrate Uptake Rate • Substrate Consumption for Maintenance
9.5 The Herbert–Pirt Substrate Distribution Equation
Distribution of Consumed Substrate • Theoretical Maximum Yields
9.6 Kinetics of Product Formation
The qP(µ) Function • Categories of Product Formation • Kinetics of Growth • A Single Degree of Freedom under Single Nutrient Limited Condition
9.7 Estimation of the Parameters of the Kinetic Model from Chemostat Experiments
Minimal Number of Chemostat Experiments Needed • Chemostat Experiments and Obtaining the Model Parameters • Calculation of the Other q-rates using Three Independent Reactions
9.8 Operational Yields
Operational Yields Depend on µ and Often have an Optimum • Derivation of Yij(µ) Functions • Calculation of the Stoichiometry of the overall Growth plus Product Reaction
9.9 Conclusions
References and Further Reading
9.1 Introduction

It has been shown in Chapter 7 that the kinetic behavior of (micro)organisms can be described by the values of three biomass specific rates, namely the specific rate of substrate consumption, qs:

$$q_s = \frac{\text{mol substrate consumed/hour}}{\text{C-molX present in the cultivation vessel}}$$

The specific growth rate µ:

$$\mu = \frac{\text{C-molX produced/hour}}{\text{C-molX present in the cultivation vessel}}$$

And the specific rate of product formation qp:

$$q_p = \frac{\text{mol product produced/hour}}{\text{C-molX present in the cultivation vessel}}$$
The values of these rates depend in the first place on properties of the organism itself (which are determined by its genome) but also on environmental conditions such as the concentrations of compounds in the extracellular environment (e.g., substrates, products, and oxygen), pH, temperature (T), and pressure. During an experiment several of these factors are usually kept constant (e.g., pH, T, pressure). As has been shown in the previous chapter, in a batch fermentation the extracellular concentrations of the substrate, nutrients, and oxygen (in case of aerobic growth) are so high that the q-rates are independent of the extracellular concentration values: these q-rates do not change in time, and have the values µ = µ^max, qs = qs^max, qp = qp^batch. However, it often happens that some q-rates in batch are unsatisfactory, e.g., the qp-value is very low, which means that there is hardly any product formation. This raises the question of how the values of these biomass specific conversion rates can be changed. As has been said, these rates are determined by the organism itself and the extracellular conditions. It is obvious that a decrease in the extracellular substrate concentration (Cs) will lead, at sufficiently low substrate concentration, to a decrease in the biomass specific substrate uptake rate qs. Because qs and µ are intimately linked (anabolism and catabolism are coupled) it also follows that µ decreases at lower Cs-values. Finally, qp is also expected to change. It is sometimes observed that qp increases at lower Cs. In this case the genetic mechanism of substrate repression plays a role (at high substrate concentration, as in batch, the genes which code for the enzymes of the product forming reactions can be repressed). An excellent example of this behavior is the production of penicillin by Penicillium chrysogenum (Revilla et al., 1984).
In general this means that for a given organism changes in q-rates can only be achieved by changing the extracellular concentrations (and pH, T, pressure). In kinetic studies of (micro)organisms the aim is to find the three algebraic functions (the “kinetic functions”) which describe how qs, q p, and µ depend on the environmental conditions, i.e., extracellular concentrations, pH, T, pressure. These functions will be discussed below.
9.2 Kinetic Simplicity: Achievement of a Single Nutrient Limited Condition by Medium Design

The most straightforward approach to find the kinetic functions is to study the effect of the change of one condition (while all other conditions are kept constant) on qs, qp, and µ.
An example could be to apply different extracellular substrate concentrations Cs (e.g., 5 mg/l, 10 mg/l, 50 mg/l, and 500 mg/l) and to quantify qs, qp, and µ during growth at these different substrate concentrations. Subsequently graphical plots can be made of each of these q-rates versus Cs to obtain an impression of the relationships. Finally, one can try to derive the algebraic functions which properly describe these relationships: qs = ƒ(Cs), µ = ƒ(Cs), and qp = ƒ(Cs). However, such an experiment is in practice not easy to achieve, because fixing the substrate concentration at a certain constant value within a broad range is very difficult. It is much easier to control, e.g., T and pH. However, in all cases it is difficult to keep all other conditions constant, because due to growth O2, NH4+, trace metals, vitamins, etc. are also consumed and CO2 and other metabolic products are produced. This means that all these concentrations will change. It is nearly impossible to install control measures to keep all these concentrations constant.

However, experience shows that for each compound of the growth medium (dissolved O2, CO2, NH4+, H2PO4−, vitamins, trace elements, etc.) an extracellular concentration range can be found within which a change of concentration does not show any kinetic effect on the q-rates. If for only one compound the extracellular concentration is chosen so low that only this compound shows a kinetic effect, a change in the concentration of this so-called growth limiting nutrient will lead to changes in qs, µ, and qp. Therefore, the kinetic effect of this growth limiting compound can be studied without interference of other compounds. It should be realized here that a different choice of the growth limiting nutrient will result in different q-relations.

The kinetic functions are algebraic equations describing how changes in extracellular concentrations, pH, T, and pressure outside the cell change the value of qi:
qi = ƒ(pH, T, pressure, all extracellular nutrient concentrations)
(9.1)
This function is nonlinear and should in principle contain the effects of all different concentrations present in the growth medium on qi. Because the growth medium contains so many (> 30) different compounds (vitamins, hormones, trace metals, signal molecules, electron donor, acceptor, N-source, P-source…) the true kinetic function for each qi would in principle be very complex. Such complexity would prohibit practical use, with respect to the determination of the function itself, the determination of the values of the kinetic parameters, and the application in mathematical models. Fortunately, a general property of biological systems allows considerable simplification. This is best illustrated with the following example. Suppose that a microorganism needs a vitamin for the proper functioning of an enzyme, which plays a role in the synthesis of an amino acid. If there is no vitamin in the medium (Cvitamin = 0), the enzyme has no activity, which means that the amino acid is not produced, hence the growth rate µ = 0. If we add vitamin to the medium, the enzyme becomes active and µ increases. This increase will not continue indefinitely with the increase of the vitamin concentration outside the microorganism. At some point µ will reach a maximum value. So what will be observed if the growth rate µ is plotted against the extracellular vitamin concentration is a typical saturation type of relationship. Such reasoning can be applied to each q-rate and each medium component; hence we can write for the qi of a limiting nutrient j (assuming constant pH, T, pressure):
$$q_i = q_i^{\max}\,\frac{C_j}{K_j + C_j} \tag{9.2}$$

This equation, the famous hyperbolic kinetic equation, has the following properties:
• qimax and Kj are kinetic parameters specific for the used biological system, compound i, and the applied conditions (e.g., pH, T).
• Kj is called the affinity constant for compound j; qimax is the maximum conversion rate of compound i. When Cj = Kj, qi = ½ qimax.
• For Cj = 0, qi = 0.
• For Cj >> Kj, the fraction Cj/(Kj + Cj) approaches 1 and thus qi approaches qimax.
• When Cj is large (Cj >> Kj) a change in Cj has hardly any effect on qi. This is called zero (0) order behavior, and happens in batch conditions.

This last observation is very relevant because it shows that when the concentration of e.g., a trace metal, vitamin, or hormone remains far above its Kj-value, a decrease in its concentration Cj (which will occur because e.g., a trace metal j or vitamin j is consumed during biomass growth) will not significantly change the value of qi. This condition (Cj >> Kj) is called the nonlimiting condition for compound j. This condition should be applied to each medium component which should not have an effect on the q-rates. Furthermore, saturation kinetics offers us a simple and practical kinetic format.

As has been pointed out above, a proper cultivation medium should be designed such that all nutrients, except one, are nonlimiting. The medium should thus be designed in such a way that for each nonlimiting nutrient the concentration during the experiment is at all times much larger than the affinity constant (Cj >> Kj). For the limiting nutrient, however, the concentration must be in the range of its affinity constant.

The choice of the type of nutrient limitation has a very significant effect on cellular behavior. For example, if a vitamin is the limiting nutrient this will limit an enzyme activity which depends on this vitamin. This metabolic bottleneck can then lead to drastic changes in secreted metabolic products. A famous example is the citric acid production by Aspergillus niger. The cultivation medium used in the production process of citric acid should not contain any manganese (Mn). The absence of this metal blocks the conversion of isocitrate to alpha-ketoglutarate in the TCA-cycle, because the enzyme for this reaction cannot function without Mn. The result of this blockage is that large amounts of citric acid are secreted by the cells.
Clearly, if the N-source is the limiting nutrient, the formation of biomass is restricted, e.g., due to the limitation of protein biosynthesis. This will result in a surplus of e.g., the electron donor, which might lead to the formation of large amounts of byproducts (e.g., S. cerevisiae (baker's yeast) produces ethanol from glucose under N-limited conditions).

The concept of single nutrient limitation leads to important kinetic simplifications. The only extracellular concentration which influences the q-values under single limiting nutrient condition is that of the limiting nutrient itself, hence the q-rates only depend on the concentration Cj of the limiting nutrient j. The very complex kinetic function for qi can thus be simplified to:

qi = f (pH, T, pressure, concentration Cj of the single limiting nutrient)    (9.3)

And, if pH, T, pressure are kept constant, can be further simplified to:

qi = f (Cj only)    (9.4)
Example 1: Design of a Nonlimiting Medium

Biotin is a cofactor in certain enzymes. Assume that the value of the affinity constant K of microorganisms for biotin is equal to 1*10^−6 M. Also assume that for the synthesis of 1 g dry matter of biomass 1.1*10^−6 mol of biotin is consumed.

Task: If one desires to reach a final concentration of 15 g/l biomass, how much biotin is needed in the medium to keep biotin nonlimiting?

Answer: Biotin remains nonlimiting if C >> K, e.g., Cbiotin > 10*K = 10*(1*10^−6) = 10*10^−6 M. For growth 15*(1.1*10^−6) = 16.5*10^−6 mol biotin/l is needed. The total biotin concentration added to the medium should therefore be:

10*10^−6 + 16.5*10^−6 = 26.5*10^−6 mol biotin/l.
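The arithmetic of Example 1 can be reproduced with a short script. This is only a sketch: the function name and the `safety_factor` argument (the factor 10 used above to express C >> K) are illustrative choices, not notation from the text.

```python
def biotin_required(final_biomass_g_per_l, biotin_per_g_biomass,
                    affinity_k, safety_factor=10.0):
    """Return (residual, consumed, total) biotin concentrations in mol/l.

    residual: concentration that must remain at the end so that C >> K
              (taken here as safety_factor * K, as in Example 1)
    consumed: biotin incorporated into the biomass that is formed
    """
    residual = safety_factor * affinity_k
    consumed = final_biomass_g_per_l * biotin_per_g_biomass
    return residual, consumed, residual + consumed


# Values from Example 1: 15 g/l biomass, 1.1e-6 mol biotin per g biomass, K = 1e-6 M
residual, consumed, total = biotin_required(15.0, 1.1e-6, 1.0e-6)
print(f"residual = {residual:.2e} mol/l")
print(f"consumed = {consumed:.2e} mol/l")
print(f"total    = {total:.2e} mol/l")
```

The same function can be reused for any other nonlimiting medium component once its affinity constant and its consumption per gram of biomass are known.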
9.3 Fermentor Transport Mechanisms as a Tool to Control Extracellular Concentrations and therewith Control the q-rates: the Chemostat

9.3.1 Transport Mechanisms can be Applied to Control Extracellular Concentrations

It has been outlined above that the biomass specific rates (q-rates) for uptake and secretion of compounds are in general influenced by the properties of the organism (the genes) and the environmental conditions to which the organism is exposed. Obvious environmental factors of influence are pH and T. Therefore these are usually experimentally controlled at a selected constant value. A proper design of the cultivation medium in principle allows one to study the effect of a single nutrient on the behavior of the organism, that is, on the biomass specific conversion rates (q-rates). As has been argued before, batch cultivation is not the preferred way to carry out these studies because in batch culture the concentrations cannot be controlled by the experimenter. What is required is a cultivation method which allows us to precisely control the extracellular concentration of a compound of choice at desired levels (Cj ≈ Kj) in order to study its effect on the q-rates. The question is how control of extracellular concentrations can be achieved while there is ongoing consumption and production by the cells/organisms which are present in the vessel. The answer is that properly designed transport mechanisms (which can be different for the different compounds) must be implemented in the cultivation vessel/space in which the organisms are cultivated.
9.3.2 Control of a Constant Extracellular Substrate Concentration using Substrate Transport

We have seen that in a batch experiment the substrate concentration drops due to cellular consumption. Such a drop can only be stopped by adding substrate to the cultivation vessel from an external source at a certain rate. Hence we need a mechanism to transport substrate from a substrate storage vessel into the cultivation vessel. One can think of many possible ways to achieve this, but a particularly simple method is to have a sterilized substrate solution available in a storage vessel and pump this solution into the cultivation vessel with a controlled flow rate. Assuming that the substrate concentration in the substrate solution is equal to Cs,in (mol/m3) and that the flow rate of this substrate solution equals ∅in (m3/h) we can write for the rate of transport of substrate to the cultivation vessel:
Substrate feed rate = Cs,in ∅in (molS/h)
(9.5)
When this transport rate is kept equal to the consumption rate of substrate by the cells, which can be achieved by manipulating ∅in, then the amount of substrate Ms in the cultivation vessel remains constant. Because the substrate amount present in the cultivation vessel equals Ms = V ⋅ Cs this means that a constant substrate concentration Cs can only be achieved if the broth volume V in the cultivation vessel is also kept constant. However, the continuous addition of substrate solution ∅in will lead to an increase in broth volume, hence V will increase with time and Cs (= Ms/V) will still drop in time. To avoid this we need to keep V = constant while feeding substrate solution, which requires that liquid should be transported out of the cultivation vessel. This can be done by pumping out broth. However, this also results in the removal of biomass.
9.3.3 Control of a Constant Biomass Concentration using Biomass Transport

We have also observed that in the batch experiment the biomass amount Mx increases due to cellular growth and hence Cx increases. The biomass concentration under condition of growth can only be kept constant by removing the produced biomass from the cultivation vessel. If the rate of biomass removal from the fermentor (in C-molX/h) equals the rate of biomass production, Ratex (in C-molX/h), then the biomass amount Mx (in C-molX) in the cultivation vessel no longer changes. A simple method to remove biomass is to pump out the complete broth, which contains extracellular water (called supernatant) and biomass, with a flow rate ∅out (m3/h). Usually a cultivation vessel is well mixed (ideally mixed) using, e.g., a stirring device. The term well mixed means that concentrations inside the cultivation vessel have the same value at each position inside the vessel. Hence the biomass concentration, Cx, is the same everywhere and this means that it can be safely assumed that the biomass concentration is also Cx at the point where the broth is removed from the fermentor. For the transport rate of biomass from the cultivation vessel (in C-molX/h) one can write:
Removal rate of biomass = Cx ∅out (C -molX/h)
(9.6)
Continuous removal of broth from the cultivation vessel will thus result in a constant total amount of biomass Mx when there is continuous production of biomass. However, several additional aspects must now be considered:
• If broth is removed, the broth volume V inside the cultivation vessel will decrease. Maintaining a constant volume V requires that the outflow of broth is compensated by a sufficient inflow of another solution. The most logical choice is the inflow of a substrate solution as discussed before. The problem of a changing volume, due to either inflow of substrate solution or broth outflow, can be solved by using a simultaneous in- and outflow. This allows V to be kept constant at a value chosen by the experimenter. Note that a constant volume does not imply that ∅in = ∅out; exact equality hardly ever occurs.
• It should be realized that the broth does not only contain biomass. It also contains supernatant in which substrate, products, and other nutrients are present. Hence transport of biomass by broth removal also creates a transport of substrate, products, and nutrients from the fermentor:
Removal rate of substrate = Cs ∅out. (molS/h)
(9.7)
Removal rate of product = Cp ∅out. (molP/h)
(9.8)
whereby Cs and Cp are the substrate and product concentrations in the fermentor. One should realize that the broth supernatant contains many more compounds, e.g., vitamins, minerals, hormones, NH4+, H2PO4−, SO42−, which are not completely consumed. These are, therefore, also transported out of the cultivation vessel by the broth removal. Because these compounds are also consumed for cellular growth it is clear that the amount of each of these compounds would only decrease (due to transport out and consumption). To achieve constant amounts in the cultivation vessel these compounds also need to be transported to the vessel. This is most easily achieved by adding them to the solution containing the growth limiting substrate that is pumped into the cultivation vessel to keep the substrate concentration constant. Hence it is necessary to pump in a complete medium solution and not a solution containing only the substrate.
9.3.4 Control of a Constant Extra-cellular Product Concentration using Product Transport

In a batch experiment the product concentration can only rise, because it is produced by the organism. A constant product concentration requires therefore that product is transported out of the cultivation vessel. A constant product amount Mp, and hence a constant product concentration Cp, in the cultivation vessel will be achieved when the rate of production by the organisms in the fermentor (Ratep, molP/h) equals the rate of transport out of the cultivation vessel. We have seen above that this product transport already occurs when broth is removed to control the biomass concentration, because the product is also present in the broth.

Other possibilities for product transport

There are more possibilities to control the product concentration by using alternative transport mechanisms (compared to broth removal):
• One could add product to the medium inflow. This would create a second transport mechanism where product is transported into the vessel. In this way the product concentration in the cultivation vessel can be increased, for example to study the effect of higher product concentrations on the q-values (e.g., product inhibition).
• Some products are volatile (examples are ethanol and CO2) and are transferred easily from the supernatant to a gas phase. It is then possible to remove the product by sparging gas through the broth (called "stripping").

The above shows that a cultivation vessel with an inflow of fresh growth medium and a simultaneous outflow of broth has sufficient transport mechanisms to achieve constant concentrations of all compounds (Cs, Cx, Cp, Ci,…) in the broth in a situation where there is simultaneous cellular consumption and production of s, x, p, i,… This cultivation system is called a chemostat.
9.3.5 Manipulation of Biomass Specific Conversion Rates in a Chemostat

A classical chemostat is a well mixed cultivation vessel with a constant inflow rate of medium, containing a single growth limiting nutrient, and an outflow rate of broth which is controlled in such a way that the culture volume is kept at a certain desired value within narrow limits. Although the culture volume V in a chemostat can be assumed constant (dV/dt ≈ 0), it is quite possible that ∅in and ∅out are not the same. Possible explanations are:
• Evaporation of water from the broth always occurs, due to aeration of the broth (needed to transport O2 into and CO2 out of the broth). Evaporation causes ∅out < ∅in.
• If pH control is applied, often a significant amount of a pH controlling agent (e.g., 1 N NaOH or 1 N H2SO4 solution in water) is added to keep the pH constant, which would otherwise change due to produced H+ or OH−. In this case ∅out > ∅in.
• Densities of medium and broth may be different. This is usually of minor importance.

The most characteristic property of a chemostat is that after sufficient time a steady state is reached, which means that all concentrations, T, pH, and V become constant in time. Hence for a steady state chemostat it holds that:
$$\frac{dV}{dt} = 0 \quad \text{and} \quad \frac{dC_i}{dt} = 0$$

Total conversion rates can be calculated from the proper mass balances. The mass balance for compound i in a chemostat reads:

$$\frac{d(V \cdot C_i)}{dt} = \text{Rate}_i + \Phi_{in} \cdot C_{i,in} - \Phi_{out} \cdot C_i \tag{9.9}$$
Compared to the mass balances for a batch culture system, the mass balances for a chemostat system also contain transport terms to and from the culture system. After a chemostat has reached a steady state, the accumulation term becomes equal to zero and thus the mass balance for compound i can be simplified to:

$$0 = \text{Rate}_i + \Phi_{in} \cdot C_{i,in} - \Phi_{out} \cdot C_i \tag{9.10}$$
So where in case of a batch culture system the mass balance contains zero transport terms but a nonzero accumulation term, the mass balance for a steady state chemostat has a zero accumulation term but nonzero transport terms. Because of the presence of transport the chemostat is the most suitable cultivation system to manipulate the q-rates of microorganisms or cultured cells. The fact that specific conversion rates can be set by the experimenter, by means of manipulation of the transport rates, becomes clear from the biomass mass balance. If we assume that biomass is not present in the feed of the chemostat (Cx,in = 0) then it follows from the steady state mass balance for biomass that:

0 = Ratex − ∅out·Cx    (9.11)

This result shows that in a steady state chemostat the rate of biomass production equals its removal rate in the broth outflow. By definition it holds that:

Ratex = µ·Mx = µ·V·Cx    (9.12)

Combination of Equations 9.11 and 9.12 yields:

µ·V·Cx = ∅out·Cx    (9.13)

which can be rewritten as:

µ = ∅out/V    (9.14)
This wonderfully simple result shows that the experimenter (who can set the broth volume V and broth outflow rate ∅out) can set the value of the biomass specific growth rate µ that is imposed on the organism in the chemostat. The ratio ∅out/V is called the dilution rate D. Hence the chemostat makes it possible to perform experiments with an organism at different µ-values. In each chemostat experiment (see example below) one can then measure the concentrations of different compounds i, the flow rates and volumes, which can be entered into the mass balances for the different compounds, from which e.g., µ, qs, qp, etc. can be calculated. In general, sets of qi- and Cs-values (limiting substrate) can be obtained for different µ-values by performing chemostat cultivations at different values of ∅out/V. These sets of q-rates (qs, qp, µ), together with the measured substrate concentrations in the broth, are the basis of a stoichiometric and kinetic understanding of cultured microorganisms or cells.
Example 2: Calculation of q-rates from a Chemostat Experiment

A microorganism is grown in a chemostat on a cultivation medium containing substrate s. The broth volume is kept at a fixed value of V = 1.25 l. The feed solution contains 10 g/l of substrate s and no biomass. The inflow rate of the feed solution is 0.10 l/h. The broth is pumped out of the reactor with a flow rate of 0.13 l/h and contains 4 g/l substrate and 2 g/l biomass. The difference between the inflow and outflow rates is caused by the addition of an alkali solution, needed to maintain the pH at the proper value.

Task: Calculate the total rates of biomass formation (Ratex) and substrate consumption (Rates) and give their proper units.

Answer: The total rate of substrate consumption Rates follows from the substrate mass balance, which can be written as: 0 = Rates + 0.10 * 10 − 0.13 * 4. This gives Rates = −0.480 g/h (negative!!). The value of Ratex follows from the biomass mass balance, 0 = Ratex − 0.13 * 2, as: Ratex = 0.260 g/h (positive).

Task: Calculate the biomass specific rates qs and µ and provide the proper units.

Answer: The biomass specific rates of substrate consumption (qs) and growth (µ) can be calculated directly from the previously calculated total rates and the total amount of biomass Mx present in the reactor:

Mx = 1.25 * 2 = 2.50 gX

qs = Rates/Mx = −0.480/2.50 = −0.192 gS/(gX·h)

µ = Ratex/Mx = 0.260/2.50 = 0.104 gX/(gX·h)

From this experiment it has been found that for Cs = 4 g/l, qs = −0.192 gS/(gX·h) and µ = 0.104 h^−1.
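The steady state balances of Example 2 can be evaluated with a few lines of code. The sketch below assumes the balance 0 = Ratei + ∅in·Ci,in − ∅out·Ci (Equation 9.10) and uses the numbers stated in the example; the function and variable names are illustrative. With those numbers the substrate balance gives Rates = 0.13·4 − 0.10·10 = −0.48 g/h.

```python
def steady_state_rate(flow_in, conc_in, flow_out, conc_out):
    """Total conversion rate from 0 = Rate + flow_in*conc_in - flow_out*conc_out.

    Flows in l/h, concentrations in g/l, result in g/h.
    A negative result means net consumption, a positive one net production.
    """
    return flow_out * conc_out - flow_in * conc_in


V = 1.25          # broth volume, l
flow_in = 0.10    # feed flow rate, l/h
flow_out = 0.13   # broth outflow rate, l/h

rate_s = steady_state_rate(flow_in, 10.0, flow_out, 4.0)  # substrate, g/h
rate_x = steady_state_rate(flow_in, 0.0, flow_out, 2.0)   # biomass, g/h

m_x = V * 2.0          # biomass present in the vessel, gX
q_s = rate_s / m_x     # gS/(gX.h)
mu = rate_x / m_x      # 1/h

print(f"Rate_s = {rate_s:.3f} g/h, Rate_x = {rate_x:.3f} g/h")
print(f"q_s = {q_s:.3f} gS/(gX.h), mu = {mu:.3f} 1/h")
print(f"dilution rate flow_out/V = {flow_out / V:.3f} 1/h")
```

The last line illustrates Equation 9.14: the growth rate obtained from the biomass balance coincides with the dilution rate ∅out/V.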
It has been shown above that the chemostat is an excellent tool to obtain the kinetics (qs, µ, qp) under single nutrient (substrate) limited condition. Before doing so, however, the necessary kinetic functions must be introduced, which will be done below. It will furthermore be shown how the chemostat can be used to obtain kinetic parameters.
9.4 Black Box Kinetic Functions for qs, qp, µ under Single Nutrient (Substrate) Limited Conditions

9.4.1 Substrate Uptake Rate

Cells consume their carbon substrate with a certain specific rate (qs). Generally this substrate is used at different rates for different purposes:
• Growth (rate µ)
• Maintenance (rate ms)
• Product formation (rate qp)
An important question is now how each rate qs, q p, and µ depends on the extracellular concentration of substrate, under the condition that the carbon substrate (which is often identical to the electron donor) is the only growth limiting nutrient (single nutrient limitation). If we further assume that T, pressure and pH are constant, it can be understood that only the extracellular concentration of substrate has an effect on the value of qS, hence we can write:
qS = f (CS)
(9.15)
The next question concerns the form of this function. Here we have to consider our global knowledge of the metabolism of the substrate. Clearly, the substrate first has to be transported over the cellular membrane, usually by a specific membrane associated protein, called a transporter. Hence one can expect that qS increases with increasing extracellular concentration of substrate, CS. What remains is the form of this increase. A transporter always has a maximum specific transport rate (similar to enzymes). In addition there is a limit to the amount of transporter proteins present in the cellular membrane, because of space limitations or due to genetic regulation. Both factors explain why there is always a maximal value for qS, called qSmax. Describing the mechanism of transport of substrate over the cell membrane allows us to derive a rate equation for substrate uptake. Assume that the cell membrane contains a transporter protein (Tr) which is able to form a reversible substrate-transporter complex (STr) when extracellular substrate is present:
$$S + Tr \rightleftharpoons (STr) \tag{9.16}$$

The dissociation equilibrium constant of the (STr) complex follows as:

$$K_S = \frac{C_S \, C_{Tr}}{C_{STr}} \tag{9.17}$$

The transporter exists in two forms, the unbound form, with concentration CTr, and the substrate bound form, with concentration CSTr. Note that the sum of both concentrations is constant (indicated with CTr^tot):

$$C_{STr} + C_{Tr} = C_{Tr}^{tot} \tag{9.18}$$

Combination of these two equations yields an expression for the fraction of substrate-bound transporters (CSTr/CTr^tot):

$$\frac{C_{STr}}{C_{Tr}^{tot}} = \frac{C_S}{K_S + C_S} \tag{9.19}$$

It can be inferred from this equation that if CS = 0, CSTr/CTr^tot = 0 and that if CS >> KS the value of CSTr/CTr^tot = 1.

The complex STr is formed at the outside of the membrane and is subsequently translocated to face the inside. Because the intracellular concentration of S is very low (due to the consumption by metabolic reactions) the complex dissociates with a rate qSmax and releases S inside. This implies that the substrate transport rate is proportional to CSTr/CTr^tot, leading to a hyperbolic function (see Figure 9.1):

$$q_S = q_S^{\max}\,\frac{C_S}{K_S + C_S} \tag{9.20}$$
This function for qS resembles the Michaelis–Menten kinetics of a single enzyme, but this is only a mathematical resemblance; qS describes the overall kinetics of a complex biological system (microorganisms, tissues, etc.).
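The shape of the hyperbolic function in Equation 9.20 is easy to explore numerically. The following sketch uses hypothetical parameter values (q_max = 1.0 and K = 2.0 in arbitrary units, and a positive magnitude for q_max for readability); only the functional form is taken from the text.

```python
def hyperbolic_rate(c, q_max, k):
    """Hyperbolic (saturation) kinetics: q = q_max * c / (k + c)."""
    return q_max * c / (k + c)


q_max = 1.0   # hypothetical maximum rate (magnitude), arbitrary units
k = 2.0       # hypothetical affinity constant, same units as c

for c in (0.0, 0.1 * k, k, 20.0 * k):
    q = hyperbolic_rate(c, q_max, k)
    print(f"C = {c:6.2f}  q/q_max = {q / q_max:.3f}")
```

At C = K the rate is exactly half of its maximum, and at C = 20·K it equals 20/21 (about 95%) of the maximum, which is why such a concentration can be treated as nonlimiting.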
Figure 9.1 Hyperbolic function for qS. [Plot of –qS against CS, following qS = qSmax · CS/(KS + CS); the vertical axis is marked at 0, 0.5 qSmax, and qSmax, the horizontal axis at KS.]
The hyperbolic function contains two kinetic parameters: qSmax and KS. These parameters:
• can be estimated from experiments in which qs and Cs are varied.
• will change when the same microorganism is grown on a different substrate (electron donor) or electron acceptor.
• will change when a different T and pH is used.

The rate qS is 1st order in CS for CS << KS and 0-order in CS for CS >> KS. qSmax is the maximum substrate uptake rate (which has a negative value, in mol S/C-molX/h). KS is the substrate affinity (mol substrate/m3). The substrate limited condition can now be quantified precisely, as a substrate concentration such that qS is clearly below qSmax in magnitude, meaning that CS has a value close to or lower than KS. When CS >> KS, e.g., CS = 20 * KS, then qS = 0.95 qSmax ≈ qSmax, which means that no nutrient is limiting the microbial rates, so that all rates qi are at their so-called batch values qimax.

Problems in measuring CS in a fermentor under nutrient limited condition

Unfortunately the substrate concentration in a fermentor cannot be measured easily by a substrate specific sensor. On-line measurement systems have been developed but they are expensive and still not robust enough and therefore not used very often. The usual approach is still to withdraw a broth sample from the fermentor, to remove the biomass by filtration or centrifugation, and subsequently analyze the substrate in the supernatant. It should be realized, however, that if the substrate to be measured is the growth limiting nutrient, the concentration is very low and thus time is a critical factor. Suppose that the real substrate concentration is 10 mg/l, the fermentor volume is 1.0 l, and that the microorganisms present in the broth consume the substrate at RateS = 3600 mg substrate/h, which is equivalent to 1 mg/second. Compared to the total substrate amount in the fermentor, which is 10 mg, the substrate uptake rate is very high.
Therefore, when the sampling process, or biomass filtration takes several seconds, the substrate concentration will drop significantly because the microorganisms keep on eating the substrate, and the analysis of the substrate concentration in the sample will result in completely wrong results.
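The two numerical points made above (qS ≈ 0.95 qSmax at CS = 20 KS, and the rapid loss of substrate during sampling) can be checked with a short sketch. The function names are illustrative, and the depletion estimate assumes a constant uptake rate during the delay, which is a simplification: in reality the uptake slows down as CS falls.

```python
def q_s(c_s, q_s_max, k_s):
    """Hyperbolic substrate uptake rate: qS = qSmax * CS / (KS + CS)."""
    return q_s_max * c_s / (k_s + c_s)

# Fraction of qSmax reached at CS = 20 * KS (independent of the parameter values).
fraction_at_20_ks = q_s(20.0, 1.0, 1.0)

# Sampling problem from the text: 10 mg/l in a 1.0 l fermentor, consumed at 3600 mg/h.
substrate_mg = 10.0 * 1.0
uptake_mg_per_s = 3600.0 / 3600.0

def remaining_fraction(delay_s):
    """Fraction of substrate left after a delay (assumption: constant uptake rate)."""
    left = max(substrate_mg - uptake_mg_per_s * delay_s, 0.0)
    return left / substrate_mg

print("qS/qSmax at CS = 20 KS:", round(fraction_at_20_ks, 3))
for delay in (1, 3, 5):
    print(delay, "s delay, fraction of substrate left:", remaining_fraction(delay))
```

Under this assumption a delay of only a few seconds already removes a large part of the substrate, which is the reason fast sampling and quenching are needed.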
9.4.2 Substrate Consumption for Maintenance
Part of the substrate which is taken up with rate qS has to be used for maintenance. Maintenance stands for the rate of energy expenditure needed to maintain the viability of a living cell. This energy is expressed as a rate of Gibbs energy mG, in kJ of Gibbs energy used per hour per amount of biomass present in the experimental system.
The units for mG are therefore kJ per hour used for maintenance per C-mol biomass present in the fermentor.
A literature survey has shown (Heijnen, 1991) that the rate of maintenance Gibbs energy mG is similar for many microorganisms/cells:
$$ m_G = 4.5 \exp\left[ -\frac{69000}{R} \left( \frac{1}{T} - \frac{1}{298} \right) \right] \quad \frac{\text{kJ per hour}}{\text{C-mol biomass present in the fermentor}} $$
(9.21)
In this equation R is the gas constant (8.314 J/mol K) and T is the absolute temperature (273 + °C). This relation shows that mG is only dependent on temperature, according to a typical Arrhenius relation (with an activation energy of 69,000 J/mol). The temperature effect is strong; it can be calculated from this equation that a difference of 8°C (e.g., from 298 K to 306 K, meaning 25–33°C) approximately doubles mG from 4.5 kJ/C-molX/h to 9 kJ/C-molX/h.
Another point of interest is that mG does not depend significantly on the nature of the C-source or of the electron donor and electron acceptor used in catabolism to generate the maintenance energy. This is understandable because maintenance relates to biomass which has already been synthesized and for which viability must be maintained at the expense of a defined rate of Gibbs energy mG; it does not relate to new biomass that is being formed.
The need for maintenance energy can be increased significantly by addition of so-called energy uncoupling agents. For example, a weak acid like benzoic acid, which is present at pH = 4–5, easily crosses the cell membrane and releases H+ in the cell interior. To maintain the proton motive force and to avoid unacceptably high accumulation of the benzoate ion (Ac−) inside the cell, both H+ and Ac− must be exported at the expense of energy (ATP). This cyclic transport (in and out) of benzoic acid and (H+ + Ac−) represents an energy dissipating cycle.
The energy needed for maintenance is generated in a catabolic reaction, in which electron donor (or substrate S), electron acceptor and catabolic products (e.g., ethanol, CO2) are involved. Hence maintenance is not only characterized by mG; the generation of this energy also leads to associated so-called "chemical maintenance rates" of electron donor, electron acceptor, and catabolic products, which are consumed and produced in the catabolic reaction with rates mS, mO2, meth (ethanol), mCO2, etc. The relation between the various mi-values follows directly from the catabolic reaction used by the cellular system (see examples below).
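The temperature dependence in Equation 9.21 can be evaluated numerically. The following is a minimal sketch; the constant and function names are illustrative and not part of the original text.

```python
import math

R = 8.314        # gas constant, J/mol/K
E_A = 69000.0    # activation energy from Equation 9.21, J/mol
M_G_298 = 4.5    # maintenance Gibbs energy at 298 K, kJ/C-molX/h

def m_g(temp_k):
    """Maintenance Gibbs energy rate (kJ/C-molX/h) at absolute temperature temp_k."""
    return M_G_298 * math.exp(-(E_A / R) * (1.0 / temp_k - 1.0 / 298.0))

# Ratio of mG at 33 degrees C (306 K) to mG at 25 degrees C (298 K).
ratio = m_g(306.0) / m_g(298.0)
print("mG(306 K) / mG(298 K) =", round(ratio, 2))
```

The ratio comes out slightly above 2, in line with the statement that an 8°C increase approximately doubles mG.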
Example 3: Calculation of All Chemical Maintenance Coefficients mi from the Known mG
Consider the yeast Saccharomyces cerevisiae that grows aerobically with glucose as electron donor. The catabolic reaction under these conditions is:
$$ -1\,\mathrm{C_6H_{12}O_6} - 6\,\mathrm{O_2} + 6\,\mathrm{HCO_3^-} + 6\,\mathrm{H^+} $$
(9.22)
Under standard conditions (25°C = 298 K, pH = 7) the −∆GR = ∆Gcat = 2843.1 kJ. The energy need for maintenance at 25°C ( = 298 K) is (see correlation before) mG = 4.5 kJ/C -molX/h. To generate this Gibbs energy the organism must catabolize glucose with a rate mS = −(4.5/2843.1) = −0.00158 mol glucose/ C -molX/h. In addition O2 is needed to catabolize glucose, with a stoichiometry of 6O2 per mol glucose. Hence:
mO2 = −6 * 0.00158 = −0.0095 mol O2/C -molX/h.
The production of CO2 (as HCO3-) is equal to:
mHCO3- = + 6 * 0.00158 = 0.0095 mol HCO3-/C -molX/h,
and the production of protons equals:
mH+ = 6 * 0.00158 = 0.0095 mol H+ /C -molX/h.
Note that mS and mO2 are negative because substrate and oxygen are consumed.
Consider now the case that the yeast S. cerevisiae is cultured in the absence of O2 (anaerobically). It is known that under these conditions a different catabolic reaction is used, involving the production of ethanol (C2H6O) from glucose according to the following overall reaction:
$$ -1\,\mathrm{C_6H_{12}O_6} - 2\,\mathrm{H_2O} + 2\,\mathrm{C_2H_6O} + 2\,\mathrm{HCO_3^-} + 2\,\mathrm{H^+} $$
(9.23)
For this reaction −∆GR = ∆Gcat = 225.4 kJ. The stoichiometry of the catabolic reaction provides the chemical mi-values for catabolic reactants. Using mG = 4.5 kJ/C -molX/h it is easy to calculate that:
mS = −4.5/225.4 = −0.020 mol glucose/C -molX/h
meth = 2 * 0.02 = 0.040 mol ethanol/C -molX/h
mHCO3- = 2 * 0.02 = 0.040 mol HCO3- /C -molX/h
mH+ = 2 * 0.02 = 0.040 mol H + /C -molX/h
mH2O = 2 * -0.02 = −0.040 mol H2O/C -molX/h
Same maintenance energy requirement, but different mS!! It should be noted that mS under anaerobic conditions is about 13 (0.020/0.00158) times higher than under aerobic conditions, although the maintenance energy requirement (mG) is the same (4.5 kJ/C -mol X h). The reason for this is that the catabolic energy gain from 1 mol glucose under aerobic conditions is 13 times (2843.1/225.4) higher than under anaerobic conditions. In conclusion it appears that the kinetics of maintenance energy requirement are relatively straightforward. It is assumed that maintenance energy requirement is independent of the growth rate and is therefore usually expressed as a constant mS. The only relevant factor is temperature, where roughly speaking mS doubles for each 8°C increase in temperature. All other associated maintenance related rates mi (mG, mO2 , mCO2 , meth, etc.) follow from the catabolic reaction used to generate the energy needed for maintenance.
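The calculation in Example 3 follows one pattern for both catabolic reactions: divide mG by the catabolic Gibbs energy per mol glucose, then multiply by each stoichiometric coefficient. A minimal sketch of that pattern is given below; the function name and the dictionary layout are illustrative choices.

```python
def maintenance_rates(m_g, delta_g_cat, stoich):
    """Chemical maintenance rates (mol i/C-molX/h) for each compound.

    m_g         -- maintenance Gibbs energy rate, kJ/C-molX/h
    delta_g_cat -- Gibbs energy gained per mol glucose catabolized, kJ (positive)
    stoich      -- coefficients per mol glucose (negative = consumed)
    """
    glucose_per_hour = m_g / delta_g_cat  # mol glucose catabolized per C-molX per h
    return {name: coeff * glucose_per_hour for name, coeff in stoich.items()}

aerobic = maintenance_rates(
    4.5, 2843.1, {"glucose": -1, "O2": -6, "HCO3-": 6, "H+": 6})
anaerobic = maintenance_rates(
    4.5, 225.4, {"glucose": -1, "H2O": -2, "ethanol": 2, "HCO3-": 2, "H+": 2})

for label, rates in (("aerobic", aerobic), ("anaerobic", anaerobic)):
    print(label, {name: round(value, 5) for name, value in rates.items()})
print("mS ratio anaerobic/aerobic:", round(anaerobic["glucose"] / aerobic["glucose"], 1))
```

The ratio of the two mS values is simply 2843.1/225.4, which is the factor of about 13 mentioned in the text.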
9.5 The Herbert–Pirt Substrate Distribution Equation It has already been noted that the substrate which is taken up is used for three purposes: maintenance (rate ms), growth (rate µ) and product formation (rate q p). This allows postulating the following substrate distribution equation:
qS = aµ + b qP + mS
(9.24)
This is the famous Herbert–Pirt equation for substrate distribution (Pirt, 1965). Note that a, b, and ms are negative numbers, whereas µ and q p are positive. Hence, qs is by definition negative.
The units of the parameters of the Herbert–Pirt equation depend on the units of qS and qP. Assuming that all amounts are expressed in mol, the units are:
• a: mol substrate consumed per C-molX produced
• b: mol substrate consumed per mol product produced
• mS: mol S/h catabolized for maintenance per C-molX present in the cultivation vessel
Several important aspects of the Herbert–Pirt equation will be discussed below.
9.5.1 Distribution of Consumed Substrate
(Micro)organisms consume expensive substrate with rate qS and use it for growth, product formation and maintenance. A relevant problem is to find out how the consumed substrate is distributed over these three independent processes. This is best illustrated using an example. Let us consider the following Herbert–Pirt equation for aerobic growth with lysine as a product. (All rates are expressed in mol per amount of biomass per time.) The left-hand side is the total uptake of substrate; the three terms on the right-hand side are, in order, the part used for growth, the part used for lysine production, and the part used for maintenance:
$$ q_S = -0.333\,\mu - 1.5\,q_P - 0.005 $$
(9.25)
Question: Consider the above substrate Herbert–Pirt equation. Assume that µ = 0.05 h−1 and qP = 0.05 mol lysine/C-molX/h. Calculate the substrate distribution for growth, product formation and maintenance.
Answer: The total substrate consumption equals qS = −0.333 * 0.05 − 1.5 * 0.05 − 0.005 = −0.0967 mol glucose/C-molX/h.
The distribution of substrate is then:
Growth: (0.333 * 0.05)/0.0967 = 0.172
Lysine production: (1.5 * 0.05)/0.0967 = 0.776
Maintenance: 0.005/0.0967 = 0.052
From this we can conclude that substrate is used for growth (17%), product formation (78%) and maintenance (5%). This tells us that the organism is already highly efficient with respect to the production of lysine!!
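The same distribution can be computed for any combination of µ and qP. The sketch below uses the coefficients of Equation 9.25 as defaults; the function name is illustrative.

```python
def substrate_distribution(mu, q_p, a=-0.333, b=-1.5, m_s=-0.005):
    """Return total qS and the fraction of substrate spent on each process."""
    parts = {"growth": a * mu, "product": b * q_p, "maintenance": m_s}
    q_s = sum(parts.values())
    fractions = {name: value / q_s for name, value in parts.items()}
    return q_s, fractions

q_s_total, shares = substrate_distribution(mu=0.05, q_p=0.05)
print("qS =", round(q_s_total, 4), "mol glucose/C-molX/h")
for name, share in shares.items():
    print(name, round(share, 3))
```

Because every term in the Herbert–Pirt equation is negative, each share is a ratio of two negative numbers, so the fractions are positive and sum to one.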
9.5.2 Theoretical Maximum Yields
Consider the general Herbert–Pirt relation. Assume the theoretical case that only product formation occurs, no biomass growth takes place (µ = 0) and maintenance is negligible (mS = 0). In this case the Herbert–Pirt equation reduces to qS = b qP. This shows that all substrate consumed is used only for product formation. The yield of product on substrate Ysp = qP/(−qS) (mol product/mol substrate) is then at its theoretical maximal value because no substrate is used for the production of new biomass and no substrate is spent for maintenance. In such a theoretical situation Ysp = 1/|b| = Yspmax. For the lysine case shown above it can thus be calculated that Yspmax = 1/1.5 = 0.666 mol lysine/mol glucose. Hence the coefficient b of the substrate Herbert–Pirt equation represents, in absolute value, the reciprocal of the maximal theoretical product yield on substrate. This is essential information because this maximum can be compared to the actual, operational, yield and this comparison shows how much room there is for
improvement of the operational product yield. It should be kept in mind that the operational product yield will always be lower than the theoretical maximum yield, because part of the substrate will be spent for growth and maintenance. Similarly, the coefficient a of the Herbert–Pirt relation represents, in absolute value, the reciprocal of the theoretical maximum biomass yield: Ysxmax = 1/|a| (in C-molX/mol substrate).
9.6 Kinetics of Product Formation
9.6.1 The qP(µ) Function
In industrial fermentation processes microorganisms are usually applied to produce an economically attractive product. The performance of the microorganisms in producing this product is represented by the biomass specific rate of product formation qP, which is therefore an important rate. Under the single nutrient (substrate) limited conditions considered here, qP is only a function of the concentration of the extracellular substrate CS and thus we can write:
qP = f (CS)
(9.26)
The nature of this function is not easily deduced theoretically, as we did earlier for the qs (Cs) function. Therefore often an experimental approach is applied. However, although the experimental quantification of qP is relatively easy using the product mass balance, the measurement of CS under nutrient limited conditions is very difficult, as has been illustrated above. However, it can be argued that it is not necessary to measure CS. Because under single nutrient limited conditions µ = function (CS) (see below) then it is formally possible to use this (unknown) function to eliminate CS from qP = function (CS) to obtain:
qP = another function (µ)
(9.27)
This function is in most cases nonlinear. Because µ is easily manipulated experimentally (in a chemostat) it is fairly easy to experimentally measure the relation between qP and µ. This is the qP(µ) concept, which only holds under single nutrient limited conditions.
9.6.2 Categories of Product Formation It is important to distinguish the different categories of product formation which might occur. The first category is catabolic product formation. In case of catabolic product formation the product is produced in the catabolic reaction and therefore, the rate of product formation is directly coupled to the rate of the catabolic reaction. Examples are anaerobic formation of acetate, lactate, ethanol etc. Because the catabolic product formation is the unique, and therefore the sole source of energy generation which is stoichiometrically (meaning linear) coupled to growth and maintenance, it becomes clear that qP is coupled to growth and maintenance in a (stoichiometric) linear fashion:
qP = αµ + β
(9.28)
with α and β being the parameters of this linear q p (µ) relation. The second category is noncatabolic product formation. In this case the product is derived from the anabolic network. Examples are vitamins, amino acids, antibiotics, proteins, etc.
Some examples of qP − µ relations for noncatabolic products are (α, β, γ are kinetic constants):
decrease of qP with µ: $$ q_P = \frac{\alpha}{\beta + \mu} $$
power law relation: $$ q_P = \alpha\,\mu^{\beta} $$
hyperbolic function of µ: $$ q_P = \frac{\alpha\,\mu}{\beta + \mu} $$
function with a maximum: $$ q_P = \frac{\alpha\,\mu}{\beta + \mu + \gamma\,\mu^2} $$
(9.29)
Depending on the specific case, for noncatabolic product formation any relation might exist between the rate of product formation qP and the growth rate µ. Under the condition of single nutrient limitation the relation between the rate of product formation and the growth rate can be expressed by an algebraic function: qP = function (µ). In some cases this function is linear in µ. This especially happens in case of catabolic product formation. Usually the qP(µ) function is nonlinear, especially for noncatabolic products. The function itself and the parameter values must be obtained from proper experiments.
9.6.3 Kinetics of Growth
In the previous sections we have introduced:
• Hyperbolic kinetics for substrate uptake qS
• (Non)linear qP(µ)-relation
• Linear substrate Herbert–Pirt equation for substrate distribution with constant kinetics for maintenance (mS)
These three kinetic functions are sufficient to calculate how µ depends on CS. Two cases can be distinguished:
Case 1: The µ(CS)-function when there are no anabolic but only catabolic products
In this case (all q-rates in mol i/C-molX/h) the Herbert–Pirt equation only relates qS and µ; there is no separate contribution for qP. Let us consider aerobic growth on glucose (CO2 is the only catabolic product). Assume the following Herbert–Pirt relation for substrate distribution:
qS = −0.3125 ⋅ µ − 0.0015
(9.30)
In this equation the maintenance coefficient can be recognized as mS = −0.0015 mol glucose/C-molX/h, and YSXmax = 1/0.3125 = 3.2 C-molX/mol glucose. Let us now assume that hyperbolic kinetics apply for the specific rate of substrate consumption qS as a function of CS according to:
$$ q_S = \frac{-0.03\,C_S}{18 + C_S} $$
(9.31)
In this hyperbolic relation CS is the extracellular substrate concentration in mg/l. From this equation it can be inferred that qSmax = −0.03 mol glucose/C -molX/h and KS = 18 mg glucose/l. Combining these two equations by eliminating qS yields the following relation between µ and CS:
$$ \mu = \frac{1}{0.3125} \cdot \frac{0.03\,C_S}{18 + C_S} - \frac{0.0015}{0.3125} $$
(9.32)
A plot of this relation is shown in Figure 9.2. Several remarks can be made about the above derived kinetic equation for µ as a function of CS. In the literature the Monod equation is often used to express µ as a function of CS, that is:
$$ \mu = \mu^{max} \cdot \frac{C_S}{K_S + C_S} $$
(9.33)
The equation which has been derived before (Equation 9.32) is clearly not identical with the Monod equation (Equation 9.33). It should be noted, however, that if maintenance is absent Equation 9.32 becomes identical to the Monod type equation, because the maintenance term (in the above case 0.0015/0.3125) disappears. Furthermore it should be noted for this example that:
• At CS >> 18 mg/l, µ approaches µmax, which equals (0.03/0.3125) − (0.0015/0.3125) = 0.0912 h−1.
• At CS = 0, µ is negative and equal to −0.0048 h−1. The interpretation is that at CS = 0 (see Figure 9.1) there is no substrate uptake (qS = 0). However, maintenance energy is still required. In practice it is observed that in the absence of substrate (CS = 0) organisms start to catabolize part of themselves; they lose weight, which means that the cell mass decreases and hence µ < 0.
• µ becomes zero at a substrate concentration above zero. At this substrate concentration, called CSmin, there is still substrate uptake, but because there is no growth, all substrate consumed is spent for maintenance. By substituting µ = 0 in the above relation, CSmin can be calculated as 0.9474 mg/l. It can indeed be calculated that at this concentration qS = −0.03(0.9474)/(18 + 0.9474) = −0.0015 mol glucose/C-molX/h, which is equal to the substrate requirement for maintenance.
• Equation 9.32 can be rewritten as:
µ = 0.0912(CS − 0.9474)/(18 + CS)
(9.34)
We can recognize now that µmax = 0.0912 h−1, that CSmin = 0.9474 mgS/l (by substituting µ = 0), and that KS = 18 mgS/l. Therefore, we can rewrite the µ(CS) relation as:
$$ \mu = \mu^{max} \, \frac{C_S - C_S^{min}}{K_S + C_S} $$
(9.35)
Note that this equation becomes equal to the Monod equation by substituting C Smin = 0 .
Figure 9.2 µ as function of CS. [Plot of µ (1/h) versus CS (mg/L, 0 to 300) for µ = 3.200 (0.03 CS/(18 + CS) − 0.0015).]
It is clear from the above that µmax and CSmin are related to the already available parameters qSmax, Ysxmax, and mS. It is easy to derive these relations.
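One way to write these relations follows from eliminating qS between qS = aµ + mS and qS = qSmax CS/(KS + CS): µmax = (qSmax − mS)/a and CSmin = mS KS/(qSmax − mS), with a = −1/Ysxmax. The sketch below is a derivation under the sign convention of this chapter (qSmax, a and mS are negative); the function names are illustrative.

```python
def mu_of_cs(c_s, q_s_max, k_s, a, m_s):
    """Growth rate from hyperbolic uptake and the Herbert-Pirt relation (no product)."""
    q_s = q_s_max * c_s / (k_s + c_s)
    return (q_s - m_s) / a

def mu_max(q_s_max, a, m_s):
    """Limit of mu for CS >> KS."""
    return (q_s_max - m_s) / a

def cs_min(q_s_max, k_s, m_s):
    """Substrate concentration at which uptake just covers maintenance (mu = 0)."""
    return m_s * k_s / (q_s_max - m_s)

# Parameters of Equations 9.30 and 9.31 (CS in mg/l).
Q_S_MAX, K_S, A, M_S = -0.03, 18.0, -0.3125, -0.0015

print("mu_max =", round(mu_max(Q_S_MAX, A, M_S), 4), "1/h")
print("CS_min =", round(cs_min(Q_S_MAX, K_S, M_S), 4), "mg/l")
print("mu at CS = 0:", round(mu_of_cs(0.0, Q_S_MAX, K_S, A, M_S), 4), "1/h")
```

Setting mS = 0 in these expressions gives CSmin = 0 and µmax = qSmax/a, which recovers the Monod form.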
Combining the Monod equation with the Herbert–Pirt substrate distribution relation: A silly outcome.
In many books and scientific articles the relation between the specific growth rate and the limiting substrate concentration is introduced as the Monod kinetics, e.g.:
$$ \mu = \mu^{max} \cdot \frac{C_S}{K_S + C_S} $$
(9.33)
By substituting the parameters from the example above, i.e., µmax = 0.0912 h−1 and KS = 18 mgS/l, the following equation is obtained:
$$ \mu = 0.0912 \cdot \frac{C_S}{18 + C_S} $$
(9.36)
Introducing this kinetic function for µ as function of CS in the Herbert–Pirt substrate distribution relation for the example above, being: qS = −0.3125 ⋅ µ−0.0015
(9.30)
leads to the following kinetic equation for qS as function of CS:
$$ q_S = -0.0285 \cdot \frac{C_S}{18 + C_S} - 0.0015 $$
(9.37)
This result is, however, highly questionable. At CS >> 18 mg/l, qS approaches qSmax = −0.03 mol glucose/C-molX/h, which is correct. However, from this equation it follows that when there is no substrate (CS = 0), there is still substrate uptake, qS = −0.0015 mol glucose/C-molX/h. This is of course physically impossible. The problem is eliminated by introducing the hyperbolic kinetic function for qS into the Herbert–Pirt substrate distribution equation as shown above, which leads to the µ(CS) function shown earlier (Equation 9.32).
Case 2: The μ(CS) function in case of noncatabolic product formation
Let us now consider the case where growth is accompanied by the formation of a noncatabolic product. The procedure to obtain the kinetic function for µ = ƒ(CS) is most easily demonstrated with an example.
Example 4: Corynebacterium: Aerobic Growth and Lysine Production Assume that the following Herbert–Pirt substrate distribution has been found:
qS = −0.333µ−1.5qP − 0.005
(9.38)
This equation shows that mS = −0.005 mol glucose/C-molX/h, YSXmax = 3 C-molX/mol glucose and YSPmax = 0.666 mol lysine/mol glucose. Let us further assume that the (hyperbolic) glucose uptake kinetics are given by (where CS is the substrate concentration in mg/l):
$$ q_S = \frac{-0.10 \cdot C_S}{5 + C_S} $$
(9.39)
This shows that qSmax = −0.10 mol glucose/C -molX/h and KS = 5 mg glucose/l. Also the lysine production kinetics are known, with the following hyperbolic qP(µ) function (qP in mol lysine/C -molX/h):
$$ q_P = \frac{0.03 \cdot \mu}{0.01 + \mu} $$
(9.40)
Introducing the relations for qS and qP in the Herbert–Pirt substrate distribution relation yields:
$$ \frac{0.10\,C_S}{5 + C_S} = 0.333\,\mu + 1.5 \cdot \frac{0.03\,\mu}{0.01 + \mu} + 0.005 $$
(9.41)
The result is a nonlinear relation between µ and CS. Let us first consider the properties of this relation:
• It can be shown that µ increases monotonically with CS to a maximal value, called µmax. For CS >> 5 mg/l, the left side becomes constant and independent of CS (qS then has its maximal value of −0.10 mol glucose/C-molX/h). Under these conditions µ also achieves its maximal value, which can be found by solving the equation:
$$ 0.10 = 0.333 \cdot \mu^{max} + 1.5 \cdot \frac{0.03 \cdot \mu^{max}}{0.01 + \mu^{max}} + 0.005 $$
(9.42)
This can be solved to give µ = µmax = 0.1580 h−1.
• By combining the µ(CS) relation (Equation 9.41) and the qP(µ) function, a function for the relation between qP and CS is obtained.
• Also in this case µ = 0 at a certain CSmin. At this value of CS the value of qS equals the maintenance rate. For µ = 0, the nonlinear relation between µ and CS (Equation 9.41) becomes:
$$ \frac{0.10 \cdot C_S}{5 + C_S} = 0.005 $$
(9.43)
From this result it can be calculated that CSmin = 0.26 mg/l. At this concentration µ = 0 and qP = 0, but qS = mS = −0.005 mol glucose/C-molX/h.
• If product formation were absent, and assuming the same maintenance and substrate uptake kinetics, the µ(CS) relation would be:
$$ \frac{0.10 \cdot C_S}{5 + C_S} = 0.333\,\mu + 0.005 $$
(9.44)
The µmax-value (at CS >> 5 mg/l) now follows as µmax = (0.10 − 0.005)/0.333 = 0.285 h−1. This µmax value is much higher than when product formation occurs (µmax = 0.1580 h−1). The reason for this is that when the substrate is not used for product formation, all consumed substrate can be channeled to growth and maintenance, resulting in a higher growth rate.
It can be concluded from the above example that the presence of noncatabolic product formation has a significant effect on the µ(CS) relation, such that the growth rate can be much lower when product formation happens. This is logical and this phenomenon is called “metabolic burden.”
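The numbers in Example 4 can be reproduced by noting that Equation 9.41 is a quadratic in µ once the uptake rate is known. The sketch below solves that quadratic; it works with the magnitudes of the (negative) coefficients, is only meant for CS at or above CSmin, and uses illustrative names throughout.

```python
import math

A, B, M_S = 0.333, 1.5, 0.005   # magnitudes of the Herbert-Pirt coefficients (Eq. 9.38)
Q_S_MAX, K_S = 0.10, 5.0        # uptake kinetics (Eq. 9.39), CS in mg/l
Q_P_MAX, K_P = 0.03, 0.01       # hyperbolic qP(mu) kinetics (Eq. 9.40)

def uptake(c_s):
    """Magnitude of the substrate uptake rate at concentration c_s."""
    return Q_S_MAX * c_s / (K_S + c_s)

def mu_from_uptake(u):
    """Non-negative root of A*mu^2 + p*mu - (u - M_S)*K_P = 0, valid for u >= M_S."""
    net = u - M_S                      # substrate left after maintenance
    p = A * K_P + B * Q_P_MAX - net
    return (-p + math.sqrt(p * p + 4.0 * A * net * K_P)) / (2.0 * A)

mu_max_with_product = mu_from_uptake(Q_S_MAX)
mu_max_without_product = (Q_S_MAX - M_S) / A
cs_min = M_S * K_S / (Q_S_MAX - M_S)

print("mu_max with lysine production:", round(mu_max_with_product, 3), "1/h")
print("mu_max without product:", round(mu_max_without_product, 3), "1/h")
print("CS_min:", round(cs_min, 2), "mg/l")
```

The small difference in the fourth decimal of µmax relative to the text comes from using 0.333 rather than 1/3 for the growth coefficient.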
9.6.4 A Single Degree of Freedom under Single Nutrient Limited Condition
The kinetic model for substrate limited growth (hyperbolic substrate uptake equation, Herbert–Pirt substrate distribution, qP(μ) relation) is now complete. The three basic q-rates (qS, qP, µ) are completely specified as function of CS. Alternatively, because µ and CS are uniquely related (by the µ(CS) function), one can also state that all rates are determined when one rate is known, for example µ. At a chosen µ, the qP(µ) function yields qP. The Herbert–Pirt relation then yields the value of qS. Finally the hyperbolic qS-relation yields CS. This consideration shows that the complete black box kinetic model contains only one degree of freedom. Choosing the free variable, e.g., CS or µ or qS or qP, determines which variables are fixed by the kinetic equations (Table 9.1). It is a matter of practical consideration which variable is chosen as free variable. In chemostat experiments the growth rate µ is a logical choice (it is equal to the dilution rate and can easily be set by the experimenter); in a fed batch culture the feed rate of the substrate is a logical choice (it is directly related to qS under the condition of single nutrient limitation), etc.
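The chain of calculations described above (choose µ, then obtain qP, qS and CS in turn) can be sketched with the lysine kinetics of Example 4 (Equations 9.38 to 9.40). The function name is illustrative, and the last step inverts the hyperbolic uptake relation algebraically.

```python
def rates_from_mu(mu):
    """Given mu (1/h), return (qP, qS, CS) for the kinetics of Equations 9.38-9.40."""
    q_p = 0.03 * mu / (0.01 + mu)             # qP(mu) relation
    q_s = -0.333 * mu - 1.5 * q_p - 0.005     # Herbert-Pirt substrate distribution
    c_s = 5.0 * q_s / (-0.10 - q_s)           # inverse of qS = -0.10 CS / (5 + CS)
    return q_p, q_s, c_s

q_p, q_s, c_s = rates_from_mu(0.05)
print("qP =", round(q_p, 4), "mol lysine/C-molX/h")
print("qS =", round(q_s, 4), "mol glucose/C-molX/h")
print("CS =", round(c_s, 2), "mg/l")
```

The inversion is only meaningful while qS stays smaller in magnitude than qSmax, i.e., for µ below µmax.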
9.7 Estimation of the Parameters of the Kinetic Model from Chemostat Experiments
The kinetic and stoichiometric description of growth and product formation of cultured microorganisms or cells from higher organisms under single substrate limited condition requires information on:
• The (hyperbolic) substrate uptake kinetics (the qS(CS) function with qSmax and KS as parameters)
• The qP(μ) relation with its parameters (α, β, γ)
• The Herbert–Pirt substrate distribution equation with parameters a, b, mS
9.7.1 Minimal Number of Chemostat Experiments Needed
As has been shown earlier, chemostat cultivation allows manipulating the growth rate µ and therefore µ is the most obvious free variable for such a system. Experiments at different growth rates μ can therefore be carried out to obtain the parameters of the black box kinetic model. Previously it was shown for batch systems how the biomass specific consumption (or production) rate of a compound of interest, e.g., qS for substrate, μ for biomass, is calculated from the experimental measurements (volumes and concentrations) in combination with the proper mass balances. Here we will show how the q-rates can be obtained from chemostat experiments. An important question is what minimal number of different experiments is needed to obtain the parameters. In case of noncatabolic product formation three different data sets on µ, qS, qP, CS are needed to obtain the values of a, b, mS of the Herbert–Pirt equation by solving the resulting set of linear equations, as is shown in the example below.

Table 9.1 Choices of Free Variables for the Black Box Kinetic Model
Free Variable    Determined by Kinetic Model
CS               qs, µ, qp as function of CS
µ                qp, qs, CS as function of µ
qs               CS, µ, qp as function of qs
qp               µ, qs, CS as function of qp
Furthermore the parameters qSmax and KS can be obtained from a plot of qS versus CS, and the relation between qP and µ can be found from a plot of qP versus µ. If no noncatabolic product formation occurs the Herbert–Pirt equation reduces to: qS = aµ + mS
(9.45)
In this case minimally two different datasets on µ, qs and CS are required to obtain the parameters a, ms, qsmax, and Ks. In this case the q p(µ) function is a linear function of the growth rate: q p = αµ + β
(9.46)
Only in this case a and mS can be obtained graphically. According to the Herbert–Pirt equation a straight line is expected if qS is plotted versus µ, which is indeed found in most cases. The slope of this line equals a. The intercept with the vertical axis equals mS. The two parameters (α, β) of the qP(µ) relation and the parameters of the hyperbolic relation (KS, qSmax) also require a minimum of two experiments. These are obtained from plots of qS versus CS (hyperbolic qS function) and qP versus µ (qP(µ) function). In practice, however, it is wise to carry out more experiments than the minimum number, i.e., two or three, in order to obtain statistically reliable parameter values.
9.7.2 Chemostat Experiments and Obtaining the Model Parameters A typical set of chemostat experiments consists of cultivations at different known flow rates φout, whereby the concentrations of substrate, product, biomass, the flow rates and V are measured under steady state conditions. From these measurements and the mass balances the q-values are calculated at each μ imposed on the biological system. This set of calculated biomass specific rates and measured Cs values can then be used to establish the kinetic and stoichiometric functions and their parameters. The required procedures to do so are outlined in the example below.
Example 5: Kinetics and Stoichiometric Model from Chemostat Experiments
A microorganism is cultivated in a chemostat at different inflow rates. The chemostat broth volume is 1.2 m3 and is kept at this value by controlling the outflow rate. The organism grows aerobically, uses glucose as carbon source, NH4+ as N-source and produces alanine (C3H7O2N) as (noncatabolic) product. The pH is controlled using a 10 N solution of NaOH. Air sparging is used to transfer O2 to the culture and to remove the produced CO2. A stirrer is used to achieve ideal mixing of the contents of the culture vessel. The nutrient solution which is fed into the chemostat contains glucose at a concentration of 2000 mol/m3. For each flow rate applied, the chemostat is allowed to achieve steady state, where glucose is the single limiting nutrient. From the considerations outlined above it can be inferred that a minimum of three experiments is needed in this case because alanine is a noncatabolic product. However, in practice six experiments are performed at different inflow rates of the nutrient solution. Achievement of a steady state is observed from the measured concentrations in the chemostat, which reach constant values after some time. During each steady state measurements are performed on:
φin: flow rate into the chemostat of the nutrient solution (m3/h)
φout: flow rate of broth out of the chemostat (m3/h)
CSin: the glucose concentration in the inflowing nutrient solution (mol/m3)
CS: the glucose concentration in the chemostat (mol/m3)
V: the volume of the broth in the chemostat (m3)
CX: the biomass concentration in the chemostat (C-mol/m3)
CP: the alanine concentration in the chemostat (mol alanine/m3)
φalk: the supply rate of NaOH solution (alkali), to control the pH
Results can be found in Table 9.2.
Task 1: Calculate for experiment 4 the biomass specific rates μ, qS, qP.
Answer: It has been derived earlier in this chapter that from the mass balance for biomass for a steady state chemostat it follows that the specific growth rate μ is equal to the dilution rate of the chemostat (Equation 9.14), thus µ = D = φout/V and therefore:
$$ \mu = \frac{0.42}{1.2} = 0.35 \ \text{C-molX/C-molX/h} $$
The value of qS is the total rate of consumed substrate divided by the total biomass present in the chemostat: qS = RateS/MX. The rate of consumed substrate, RateS, is obtained from the substrate mass balance:
d(VCS)/dt = 0 = RateS + rate of substrate entering − rate of substrate leaving
This gives RateS = −0.3623 * 2000 + 0.42 * 0.097 = −724.6 + 0.0407 = −724.56 mol glucose/h. Note that this rate is negative, which is logical because substrate is consumed. Subsequently qS is calculated by dividing RateS by the total biomass amount present in the fermentor (MX = V CX = 1.2 * 2829 = 3394.8 C-molX): qS = RateS/MX = −724.56/3394.8 = −0.2134 mol glucose/C-molX/h.
The value of qP follows similarly from the product mass balance as:
qP = + 0.10 mol alanine/C -molX/h
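The Task 1 calculation can be sketched for all six experiments of Table 9.2 at once. The sketch assumes steady state and that the feed contains neither biomass nor alanine (the latter is an assumption made here to write the product balance); the tuple layout and function name are illustrative.

```python
V = 1.2          # broth volume, m3
CS_IN = 2000.0   # glucose concentration in the feed, mol/m3

# (phi_in m3/h, phi_out m3/h, CS mol/m3, CX C-mol/m3, CP mol/m3) from Table 9.2
experiments = [
    (0.0347, 0.0360, 0.008, 1805.0, 0.0),
    (0.1125, 0.1200, 0.016, 3126.0, 0.0),
    (0.3316, 0.3600, 0.048, 3941.0, 0.0),
    (0.3623, 0.4200, 0.097, 2829.0, 809.0),
    (0.4017, 0.4797, 0.190, 2335.0, 1167.0),
    (0.4699, 0.5759, 1.390, 1939.0, 1454.0),
]

def specific_rates(phi_in, phi_out, c_s, c_x, c_p):
    """Steady-state mu, qS and qP from the biomass, substrate and product balances."""
    m_x = V * c_x                               # total biomass, C-molX
    mu = phi_out / V                            # biomass balance: mu equals dilution rate
    rate_s = -phi_in * CS_IN + phi_out * c_s    # substrate balance, mol glucose/h
    rate_p = phi_out * c_p                      # product balance, mol alanine/h
    return mu, rate_s / m_x, rate_p / m_x

results = [specific_rates(*row) for row in experiments]
for number, (mu, q_s, q_p) in enumerate(results, start=1):
    print(number, round(mu, 3), round(q_s, 3), round(q_p, 3))
```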
In a similar way these rates are calculated for the other five chemostat experiments. The results are shown in Table 9.3.
Task 2: Make a graph of −qS versus CS, obtain the values for qSmax and KS and give their proper units.
Answer: This graph shows that −qS increases with CS in a nonlinear, hyperbolic way. The exact values of KS and qSmax can be obtained by nonlinear fitting of the qS and CS data to the hyperbolic substrate uptake relation. A popular alternative is to rewrite the hyperbolic substrate uptake relation in its inverse form (Lineweaver–Burk plot):
$$ \frac{1}{q_S} = \frac{K_S}{q_S^{max}} \cdot \frac{1}{C_S} + \frac{1}{q_S^{max}} $$
(9.47)
This shows that a plot of 1/qS versus 1/CS gives a straight line with slope KS/qSmax and intercept 1/qSmax. The slope and intercept can be obtained by linear regression, and qSmax and KS then follow. This method is not advised because it gives a disproportionate weight to the low concentration data. Using nonlinear regression one obtains KS = 0.1 mol glucose/m3 and qSmax = −0.433 mol glucose/C-molX/h.
Task 3: Propose an equation for the qP(μ) function.
Answer: A plot of qP versus μ using the data of the six experiments shows that:
Table 9.2 Results from a Series of Steady State Chemostat Experiments
Experiment  φin (m3/h)  φout (m3/h)  CSin (mol/m3)  CS (mol/m3)  V (m3)  CX (C-mol/m3)  CP (mol/m3)  φalk (m3/h)
1           0.0347      0.036        2000           0.008        1.200   1805           0            0.0013
2           0.1125      0.12         2000           0.016        1.200   3126           0            0.0075
3           0.3316      0.36         2000           0.048        1.200   3941           0            0.0284
4           0.3623      0.42         2000           0.097        1.200   2829           809          0.0577
5           0.4017      0.4797       2000           0.190        1.200   2335           1167         0.0780
6           0.4699      0.5759       2000           1.390        1.200   1939           1454         0.1060
Table 9.3 Calculated Specific Rates for the Chemostat Experiments
Experiment  CS (mol/m3)  qS (mol i/C-molX/h)  μ (C-molX/C-molX/h)  qP (mol i/C-molX/h)
1           0.008        −0.032               0.030                0
2           0.016        −0.060               0.100                0
3           0.048        −0.140               0.300                0
4           0.097        −0.213               0.350                0.100
5           0.190        −0.287               0.400                0.200
6           1.39         −0.404               0.480                0.360
• For μ < 0.30 h−1 product formation is absent, hence qP = 0.
• For μ > 0.30 h−1 qP increases linearly with μ. The slope is 2. Hence qP = 2(μ − 0.30).
This type of product formation kinetics is typical of overflow metabolism, where there is an imbalance between the uptake rate of the substrate and the rate of biomass formation. The surplus of the substrate taken up is spent by secretion of a product. If the cell did not have such an "escape", the surplus of substrate taken up (which cannot be converted into biomass) would lead to very high levels of unprocessed intracellular intermediates.
Task 4: Provide the substrate Herbert–Pirt relation for the experiments where μ < 0.30 h−1.
Answer: For μ < 0.30 h−1 noncatabolic product formation is absent and the Herbert–Pirt relation for qS has the form qS = a μ + mS. The qS and μ-values of experiments 1–3 can be used to estimate the values for a and mS. This can be done graphically by plotting qS versus μ, which gives a straight line whose intercept with the vertical axis equals the substrate maintenance coefficient mS = −0.020 mol glucose/C-molX/h. The slope a is found to be −0.40 mol glucose/C-molX.
Task 5: Answer:
Provide the Herbert–Pirt relation for qS in the range μ > 0.30 h −1. The linear relation for qS now has to contain also (apart from μ and mS) qP as variable because alanine is a noncatabolic product. Given the result of Task 4 with respect to a and ms we can write for the Herbert–Pirt relation:
qS = −0.40 μ + b qP−0.020
The values of qS, μ and qP obtained from experiment 4, 5, 6 (Task 2) allow to calculate b. E.g., using the qs, µ and qp-values from experiment 4: −0.213 = −0.40*0.35 + b*0.10−0.020. This gives b = −0.533 mol glucose/mol alanine. The linear relation is then:
qS = −0.40 μ−0.533 qP−0.020
We can also use experiments 5 and 6 to obtain values of b. These will be the same. The obtained maximal theoretical yields are

$$Y_{sx}^{\max} = \frac{1}{0.40} = 2.5\ \text{C-molX/mol glucose} \quad\text{and}\quad Y_{sp}^{\max} = \frac{1}{0.533}\ \text{mol alanine/mol glucose}.$$
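As a quick numerical check of Task 5, the sketch below recomputes b from each of experiments 4 to 6 using the rates of Table 9.3. The variable names are illustrative only. Because the tabulated rates are rounded to three decimals, the three estimates agree to within a few thousandths rather than exactly.

```python
# Rates from Table 9.3 for the experiments with product formation (mu > 0.30 1/h).
# Tuples are (experiment, qS, mu, qP); qS is negative because substrate is consumed.
experiments = [
    (4, -0.213, 0.350, 0.100),
    (5, -0.287, 0.400, 0.200),
    (6, -0.404, 0.480, 0.360),
]

A = -0.40     # growth coefficient a from Task 4 (mol glucose/C-molX)
M_S = -0.020  # maintenance coefficient mS from Task 4 (mol glucose/C-molX/h)

# Rearranging qS = a*mu + b*qP + mS gives b = (qS - a*mu - mS) / qP.
b_estimates = {exp: (q_s - A * mu - M_S) / q_p for exp, q_s, mu, q_p in experiments}

for exp, b in b_estimates.items():
    print(f"experiment {exp}: b = {b:.3f} mol glucose/mol alanine")

# Maximal theoretical yields are the reciprocals of the coefficient magnitudes.
y_sx_max = 1.0 / abs(A)   # C-molX/mol glucose
y_sp_max = 1.0 / 0.533    # mol alanine/mol glucose, using |b| from the text
```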
Parameter estimation from real experimental data. In the example above all calculated rates obey the Herbert–Pirt equation exactly. In practice this is of course not the case. The Herbert–Pirt equation is a mathematical model that often provides a satisfactory description of the culture behavior but is never exactly true. Furthermore, experimental measurements always contain errors, so the biomass specific rates calculated from these measurements contain errors as well and are only best estimates, with a certain standard error, of the “real” conversion rates. Weighted linear regression is therefore the best way to obtain the parameters of the Herbert–Pirt equation from real experimental data.
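A minimal sketch of such a weighted fit for the two-parameter relation qS = a μ + mS is given below. It uses the closed-form weighted least-squares solution with weights 1/σ². The rates are those of experiments 1 to 3 in Table 9.3; the standard errors are hypothetical values chosen only for illustration, since the text does not report measurement errors.

```python
def weighted_line_fit(x, y, sigma):
    """Fit y = slope*x + intercept by weighted least squares (weights 1/sigma^2)."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / det
    intercept = (swxx * swy - swx * swxy) / det
    # Standard errors of the estimates, valid if the sigmas are the true errors.
    se_slope = (sw / det) ** 0.5
    se_intercept = (swxx / det) ** 0.5
    return slope, intercept, se_slope, se_intercept

mu = [0.030, 0.100, 0.300]          # 1/h, experiments 1-3
q_s = [-0.032, -0.060, -0.140]      # mol glucose/C-molX/h
sigma_q_s = [0.002, 0.003, 0.007]   # hypothetical standard errors of qS

a, m_s, se_a, se_m_s = weighted_line_fit(mu, q_s, sigma_q_s)
print(f"a = {a:.3f} +/- {se_a:.3f}, mS = {m_s:.4f} +/- {se_m_s:.4f}")
```

Since these three points lie exactly on a line, the fit returns a = −0.40 and mS = −0.020 regardless of the weights; with noisy data the weights determine how strongly each experiment influences the estimates.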
9.7.3 Calculation of the Other q-rates using Three Independent Reactions

The question is now how the other biomass specific conversion rates (qO2, qCO2, qheat, etc.) can be calculated. Also this can be accomplished by using the information present in the Herbert–Pirt substrate distribution equation. This equation expresses that the organism spends the consumed substrate in three independent processes (growth, product formation, maintenance). Each of these processes proceeds according to its own stoichiometry.

Biomass growth, with rate µ:

a substrate + α1 O2 + α2 NH4+ + 1 C-molX + α3 CO2 + α4 H2O + α5 H+     (9.48)

Product formation, with rate qP:

b substrate + β1 O2 + β2 NH4+ + 1 mol Product + β3 CO2 + β4 H2O + β5 H+     (9.49)

Catabolic reaction for maintenance, with rate (−mS):

−1 substrate + γ1 O2 + γ2 CO2 + γ3 H2O     (9.50)
The unknown coefficients of these three reactions can be obtained from the conservation relations for the elements C, H, O, and N and for charge. For the growth reaction and for the product formation reaction, the five unknown coefficients (α1–α5 and β1–β5, respectively) are found by solving these five linear relations in five unknowns. For the catabolic reaction only three such relations exist, namely the conservation relations for C, H, and O, but because this reaction contains only three unknowns its coefficients can be calculated as well. The example below shows how to obtain the stoichiometries of the three independent reactions from experimental datasets of µ, qS and qP.
Example 6: Lysine Production: Evaluation of the Complete Stoichiometry of the Three Independent Reactions from Chemostat Experiments at Different Growth Rates µ

Corynebacterium is aerobically cultivated in a chemostat with glucose as C-source and produces lysine. Three chemostat cultivations were carried out at different growth rates μ to study the stoichiometry and kinetics of growth, substrate consumption, and lysine production. From proper measurements (medium inflow rate, broth outflow rate, measured concentrations of glucose, lysine and biomass, and broth volume in the chemostat) and proper mass balances (for glucose, product, and biomass) a set of q-values was obtained for three different growth rates μ (see Table 9.4). As has been outlined before, three chemostat experiments at different μ are just sufficient to obtain a, b, and mS.

Table 9.4 Calculated Biomass Specific Conversion Rates for Three Chemostat Cultures of Corynebacterium

µ (h−1)   −qS (mol glucose/C-molX/h)   qP (mol lysine/C-molX/h)
0.01      0.013                        0.0015
0.05      0.041                        0.0075
0.20      0.1010                       0.0075

Task 1: Calculate the value of the coefficients in the Herbert–Pirt substrate distribution relation.
Answer: The Herbert–Pirt relation is formulated as:

qS = aµ + bqP + mS

Using the q-values obtained above, three linear equations can be written:

−0.013 = a × 0.01 + b × 0.0015 + mS
−0.041 = a × 0.05 + b × 0.0075 + mS
−0.101 = a × 0.20 + b × 0.0075 + mS

These three equations can be solved to give a = −0.40, b = −2.0 and mS = −0.006, leading to

qS = −0.40µ − 2qP − 0.006     (9.51)

Task 2: Calculate the stoichiometry of the three independent reactions.
Answer: The three independent reactions are for growth, product formation, and maintenance. We use the standard composition for biomass, take lysine as C6H15O2N2+, and assume NH4+ as N-source. For the independent growth reaction (rate µ) we can write:

−0.40 C6H12O6 + a NH4+ + b O2 + 1 C1H1.8O0.5N0.2 + c CO2 + d H+ + e H2O = 0     (9.52)

The coefficient of glucose (−0.40) was obtained from the above experiments (Task 1), which leaves five unknown coefficients (a, b, c, d, e). Using the five conservation constraints (C, H, O, N and charge) one obtains:

−0.40 C6H12O6 − 0.20 NH4+ − 1.35 O2 + 1 C1H1.8O0.5N0.2 + 1.40 CO2 + 0.20 H+ + 1.80 H2O = 0     (9.53)

For the independent product formation reaction (rate qP) we can write (using −2 from the substrate Herbert–Pirt relation as the value of the glucose coefficient, Task 1):

−2 C6H12O6 + a NH4+ + b O2 + 1 C6H15O2N2+ + c CO2 + d H+ + e H2O = 0     (9.54)

Using the conservation constraints one obtains:

−2 C6H12O6 − 2 NH4+ − 5 O2 + 1 C6H15O2N2+ + 6 CO2 + 1 H+ + 8 H2O = 0     (9.55)
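Both calculations in this example are small linear systems: Task 1 has three equations in a, b and mS, and each reaction in Task 2 has five element and charge balances in five unknown coefficients. The sketch below solves them with a plain Gaussian elimination written out in Python so that it needs no external library. The helper names (`solve_linear`, `balance_reaction`) and the dictionary layout are illustrative choices, not part of the original text. The biomass composition is the standard C1H1.8O0.5N0.2.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

# Task 1: rows are [mu, qP, 1] for the three chemostats of Table 9.4; rhs is qS.
a_coef, b_coef, m_s = solve_linear(
    [[0.01, 0.0015, 1.0], [0.05, 0.0075, 1.0], [0.20, 0.0075, 1.0]],
    [-0.013, -0.041, -0.101],
)

# Task 2: composition of each compound as (C, H, O, N, charge).
COMPOSITION = {
    "glucose": (6, 12, 6, 0, 0),
    "NH4+": (0, 4, 0, 1, 1),
    "O2": (0, 0, 2, 0, 0),
    "CO2": (1, 0, 2, 0, 0),
    "H+": (0, 1, 0, 0, 1),
    "H2O": (0, 2, 1, 0, 0),
    "biomass": (1, 1.8, 0.5, 0.2, 0),
    "lysine": (6, 15, 2, 2, 1),
}
UNKNOWNS = ["NH4+", "O2", "CO2", "H+", "H2O"]

def balance_reaction(glucose_coef, product):
    """Return the five unknown coefficients for 1 (C-)mol of `product` formed."""
    # One balance per conserved quantity: sum(coefficient * content) = 0.
    matrix = [[COMPOSITION[name][k] for name in UNKNOWNS] for k in range(5)]
    rhs = [
        -(glucose_coef * COMPOSITION["glucose"][k] + COMPOSITION[product][k])
        for k in range(5)
    ]
    return dict(zip(UNKNOWNS, solve_linear(matrix, rhs)))

growth = balance_reaction(a_coef, "biomass")   # compare with (9.53)
lysine = balance_reaction(b_coef, "lysine")    # compare with (9.55)
```

Negative coefficients denote consumed compounds, following the sign convention of the reactions in the text.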
For the independent maintenance reaction (rate −mS = 0.006, as obtained from the Herbert–Pirt relation) the catabolic reaction for 1 mol catabolized glucose is written as:

−1 C6H12O6 − 6 O2 + 6 CO2 + 6 H2O = 0     (9.56)

Task 3: Calculate the linear relations which give qS, qO2, qCO2, qNH4+, qH+, and qw as linear functions of µ, qP, and maintenance.
Answer: These relations can be read from the stoichiometries and rates of the three independent reactions. The complete set of linear relations is:

qS = −0.40µ − 2qP − 0.006
qO2 = −1.35µ − 5qP − 0.036
qCO2 = 1.40µ + 6qP + 0.036
qNH4+ = −0.2µ − 2qP
qH+ = +0.2µ + 1qP
qw = 1.8µ + 8qP + 0.036

Note that all rates are expressed in mol i/C-molX/h. Knowing μ allows qP to be calculated (from the qP(μ) relation). These relations then lead to all other q-rates. This shows that each qi depends only on μ under single nutrient limiting conditions.

Task 4: Find, from the obtained linear relations, the value of the maximal theoretical yield of biomass on CO2, the maximal theoretical yield of lysine on O2, and the maintenance coefficient for O2. Also give the proper units.
Answer: The maximal theoretical yield of biomass on CO2, $Y_{cx}^{\max}$, is found in the linear q-relation for CO2. In the relation for qCO2 the biomass coefficient equals 1.40, which means that

$$Y_{cx}^{\max} = \frac{1}{1.40} = 0.714\ \text{C-molX/mol CO}_2.$$

Similarly, the maximal theoretical yield of lysine on O2, $Y_{op}^{\max}$, is found in the relation for qO2 as

$$Y_{op}^{\max} = \frac{1}{5} = 0.20\ \text{mol lysine/mol O}_2.$$

The value for mO2 = −0.036 mol O2/C-molX/h.
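The linear relations of Task 3 are simply the three reaction stoichiometries weighted by their rates µ, qP and −mS. The sketch below builds them that way and reads off the quantities asked for in Task 4. The coefficients are copied from (9.53), (9.55) and (9.56); the function name `q_rates` and the chosen test point are illustrative assumptions.

```python
# Stoichiometric coefficients per compound, copied from (9.53), (9.55) and (9.56).
GROWTH = {"S": -0.40, "O2": -1.35, "CO2": 1.40, "NH4+": -0.20, "H+": 0.20, "H2O": 1.80}
PRODUCT = {"S": -2.0, "O2": -5.0, "CO2": 6.0, "NH4+": -2.0, "H+": 1.0, "H2O": 8.0}
MAINTENANCE = {"S": -1.0, "O2": -6.0, "CO2": 6.0, "NH4+": 0.0, "H+": 0.0, "H2O": 6.0}
MINUS_M_S = 0.006  # rate of the maintenance reaction, mol glucose/C-molX/h

def q_rates(mu, q_p):
    """Biomass specific rates (mol i/C-molX/h) as rate-weighted sums of the reactions."""
    return {
        i: GROWTH[i] * mu + PRODUCT[i] * q_p + MAINTENANCE[i] * MINUS_M_S
        for i in GROWTH
    }

# Second chemostat of Table 9.4: mu = 0.05 1/h, qP = 0.0075 mol lysine/C-molX/h.
rates = q_rates(0.05, 0.0075)

# Task 4: yields are reciprocals of coefficients; mO2 is the constant term of qO2.
y_cx_max = 1.0 / GROWTH["CO2"]         # C-molX/mol CO2
y_op_max = 1.0 / abs(PRODUCT["O2"])    # mol lysine/mol O2
m_o2 = MAINTENANCE["O2"] * MINUS_M_S   # mol O2/C-molX/h
```

At this test point the computed qS reproduces the measured value of −0.041 mol glucose/C-molX/h from Table 9.4.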
9.8 Operational Yields

In the previous section we introduced the concept of independent reactions for growth, product formation and maintenance. In these reactions the stoichiometric coefficients represent the so-called theoretical maximum yields, $Y_{ij}^{\max}$. These yields are determined by the biochemical pathways used for synthesizing biomass and product from the substrate and are only valid for the isolated subprocesses. Another very important yield is the operational yield Yij, which is a ratio of q-rates obtained from measurements:

$$Y_{ij} = \frac{q_j}{q_i} \quad \left[\frac{\text{mol } j \text{ (produced or consumed)}}{\text{mol } i \text{ (produced or consumed)}}\right] \tag{9.57}$$

The operational yield relates to economic calculations. A most important yield (economically speaking) is

$$Y_{sp} = \frac{q_p}{q_s} \quad \left[\frac{\text{mol product}}{\text{mol substrate}}\right] \tag{9.58}$$

This operational yield is important because financial gain comes from product and financial cost is mostly related to substrate. In contrast to theoretical maximal yields, the operational yields are not constant stoichiometric coefficients.
9.8.1 Operational Yields Depend on µ and Often have an Optimum

Because each q-value is a function of µ, each operational yield is also a function of µ. This directly implies that the operational yield is constant when µ is constant. We have seen this already in the description of a batch fermentation (where µ = constant = µmax). The linear relations, together with the qP(µ) function, allow direct calculation of the algebraic function that gives the operational yield as a function of µ. Often this Ysp(µ) function shows an optimum.
9.8.2 Derivation of Yij(µ) Functions

Example 7: Noncatabolic Product Formation

Consider the lysine example shown earlier. From the obtained linear relations for qs and qp:

−qs = 0.40µ + 2qp + 0.006
qp = 0.15µ for µ < 0.05 h−1 and qp = 0.0075 for µ > 0.05 h−1

an expression for Ysp can be obtained:

$$Y_{sp} = \frac{q_p}{-q_s} = \frac{q_p}{0.40\mu + 2q_p + 0.006} = \frac{0.15\mu}{0.70\mu + 0.006} \quad (\mu < 0.05\ \text{h}^{-1}) \tag{9.59}$$

$$Y_{sp} = \frac{0.0075}{0.40\mu + 0.021} \quad (\mu > 0.05\ \text{h}^{-1}) \tag{9.60}$$

From a plot of Ysp as a function of µ (see Figure 9.3) it can be observed that Ysp has a maximum value at µ = 0.05 h−1. This µ-value is called the optimal µ, µopt. In this example µopt = 0.05 h−1 and qpopt = 0.0075 molP/C-molX/h.
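To see the optimum numerically, the sketch below evaluates Ysp(µ) from the two linear relations of this example and scans a grid of growth rates. The function names and the grid (0.001 to 0.300 h−1 in steps of 0.001) are illustrative choices; writing qp as min(0.15µ, 0.0075) is just a compact way to express the two branches given in the example.

```python
def q_p(mu):
    """Specific lysine production rate (mol P/C-molX/h): linear up to mu = 0.05, then constant."""
    return min(0.15 * mu, 0.0075)

def minus_q_s(mu):
    """Specific glucose consumption rate, -qs (mol S/C-molX/h), from the Herbert-Pirt relation."""
    return 0.40 * mu + 2.0 * q_p(mu) + 0.006

def y_sp(mu):
    """Operational yield of lysine on glucose (mol P/mol S)."""
    return q_p(mu) / minus_q_s(mu)

# Scan mu on a grid and keep the growth rate with the highest yield.
grid = [i / 1000 for i in range(1, 301)]
mu_opt = max(grid, key=y_sp)

print(f"mu_opt = {mu_opt:.3f} 1/h, Ysp = {y_sp(mu_opt):.3f} mol/mol")
```

Below µopt the yield is low because maintenance takes a large share of the substrate; above it the yield drops because growth consumes more substrate while qp no longer increases.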
Example 8: Catabolic Product Formation

Consider the fermentative growth of yeast on glucose and ammonium, with production of ethanol (C2H6O) as catabolic product. From chemostat experiments the linear equation for substrate consumption has been obtained:

qs = −1.111 µ − 0.020     (9.61)
Figure 9.3 Plot of the operational yield of product on substrate as a function of the specific growth rate µ for the lysine example. Axes: Ysp (mol/mol) versus specific growth rate µ (h−1).
If the elemental composition of the biomass is known (here we assume the standard average composition) this is sufficient information to derive the independent growth reaction:

−1.111 C6H12O6 − 0.2 NH4+ + 1.8722 C2H6O + C1H1.8O0.5N0.2 + 1.9222 CO2 + 0.2 H+ + 0.45 H2O = 0     (9.62)

The independent maintenance (catabolic) reaction, which is the fermentation of glucose to ethanol and CO2 (rate 0.02 mol glucose/C-molX/h), is:

−1 C6H12O6 + 2 C2H6O + 2 CO2 = 0     (9.63)

Task 1: Give the algebraic relation for the ethanol (symbol e) yield on glucose, Yse, as a function of µ and give its units.
Answer: Ethanol is produced both in the growth reaction and in the maintenance reaction. From the ethanol stoichiometry of both reactions and the reaction rates (µ for the growth reaction and 0.02 for the maintenance reaction) the linear expression for the specific ethanol production is obtained as

qe = 1.8722µ + 0.04     (9.64)

Now the expression for the operational yield of ethanol on glucose can be derived from Yse = qe/(−qs):

$$Y_{se} = \frac{q_e}{(-q_s)} = \frac{1.8722\mu + 0.04}{1.111\mu + 0.02} \tag{9.65}$$

A plot of Yse as a function of µ is shown in Figure 9.4. The units of Yse are mol ethanol per mol glucose. This equation shows that Yse depends on µ. At µ = 0 the yield is 2; at high µ, Yse approaches 1.685. These values are the ethanol/glucose ratios in the two independent reactions. The highest Yse is obtained at µ = 0, hence µopt = 0 and Yseopt = 2 mol/mol.

Task 2: Calculate the biomass yield on glucose as a function of µ, Ysx(µ).
Answer:

$$Y_{sx} = \frac{\mu}{(-q_s)} = \frac{\mu}{1.111\mu + 0.020} \tag{9.66}$$
Figure 9.4 Plot of the operational yield of ethanol on substrate as a function of the specific growth rate µ for the anaerobic yeast example. Axes: Yse (mol/mol) versus specific growth rate µ (h−1).

Figure 9.5 Plot of the operational yield of biomass on substrate as a function of the specific growth rate µ for the anaerobic yeast example. Axes: Ysx (C-molX/mol glucose) versus specific growth rate µ (h−1).
A plot of Ysx as a function of µ is shown in Figure 9.5. It can be seen from this figure that Ysx decreases with decreasing µ, due to the increasing contribution of maintenance at low growth rates.
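The two yield functions of this example, (9.65) and (9.66), are easy to tabulate. The sketch below evaluates them at a few growth rates; the function names and the chosen µ values are illustrative only.

```python
def minus_q_s(mu):
    """Specific glucose consumption rate, -qs (mol glucose/C-molX/h), from (9.61)."""
    return 1.111 * mu + 0.020

def q_e(mu):
    """Specific ethanol production rate (mol ethanol/C-molX/h), from (9.64)."""
    return 1.8722 * mu + 0.04

def y_se(mu):
    """Operational yield of ethanol on glucose (mol/mol), equation (9.65)."""
    return q_e(mu) / minus_q_s(mu)

def y_sx(mu):
    """Operational yield of biomass on glucose (C-molX/mol), equation (9.66)."""
    return mu / minus_q_s(mu)

for mu in (0.0, 0.01, 0.05, 0.10, 0.30):
    print(f"mu = {mu:.2f} 1/h  Yse = {y_se(mu):.3f}  Ysx = {y_sx(mu):.3f}")
```

Yse falls from 2 mol/mol at µ = 0 toward the growth-reaction ratio 1.8722/1.111 ≈ 1.685, while Ysx rises from 0 toward 1/1.111 ≈ 0.90 C-molX/mol glucose.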
9.8.3 Calculation of the Stoichiometry of the Overall Growth Plus Product Reaction

It has been shown above that, from the independent reactions for growth, product formation, and maintenance and the linear equation for substrate consumption, mathematical expressions can be derived for the operational yields of biomass and product on the substrate as a function of the growth rate µ. Note that in the case of a noncatabolic product an expression for qP as a function of µ is also needed. In a similar way the relations for the operational yields of the other relevant compounds of the system as a function of µ can be obtained. These relations allow the stoichiometry of a single overall reaction for growth and product formation to be calculated for a certain growth rate µ. As can be inferred from these relations, the stoichiometry of this overall growth plus product reaction changes as a function of µ. This is illustrated in the following example:
Example 9: Calculation of the Stoichiometry of the Overall Growth Plus Product Reaction for Noncatabolic Product Formation

Assume three independent reactions for growth, product formation, and maintenance:

Independent growth reaction (rate µ):

−0.333 C6H12O6 − 0.2 NH4+ − 0.95 O2 + 1 C1H1.8O0.5N0.2 + 1 CO2 + 0.2 H+ + 1.40 H2O = 0     (9.67)

Independent lysine (C6H15O2N2+) production reaction (rate qP):

−1.5 C6H12O6 − 2 NH4+ − 2 O2 + 1 C6H15O2N2+ + 3 CO2 + 1 H+ + 5 H2O = 0     (9.68)

Independent maintenance reaction (rate −mS = −(−0.005) = 0.005 mol glucose/C-molX/h):

−1 C6H12O6 − 6 O2 + 6 CO2 + 6 H2O = 0

The coefficients of the linear equation for substrate consumption,

qS = −0.333µ − 1.5qP − 0.005     (9.69)

have been used to derive these reactions. From the stoichiometries of the three reactions given above similar linear equations can be derived for the specific conversion rates of the other reactants:

qNH4+ = −0.2µ − 2qP
qO2 = −0.95µ − 2qP − 6 × 0.005
qCO2 = +1µ + 3qP + 6 × 0.005
qH+ = +0.2µ + 1qP
qW = +1.4µ + 5qP + 0.030
These linear relations can be used to calculate the overall growth (plus product) reaction at different growth rates µ and specific rates of product formation qP. Assume, for example, that µ = 0.05 C-molX/C-molX/h and that qP = 0.05 mol lysine/C-molX/h at this growth rate. The linear relations then yield the following q-values ((C-)mol i/C-molX/h):

µ = +0.05
qP = +0.05
qS = −0.0967
qNH4+ = −0.11
qO2 = −0.1775
qCO2 = +0.230
qH+ = +0.060
qW = +0.35

Dividing all conversion rates by the specific rate of lysine production provides the stoichiometric coefficients of the overall growth and product reaction, normalized to 1 mol lysine (C6H15O2N2+) produced:

−1.934 C6H12O6 − 2.2 NH4+ − 3.55 O2 + 1 C1H1.8O0.5N0.2 + 1 C6H15O2N2+ + 4.6 CO2 + 1.2 H+ + 7 H2O = 0     (9.70)

For a different µ and qP a different overall growth (plus product) reaction can be calculated in the same way.
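The normalization step of this example can be written out in a few lines. The sketch below combines the three independent reactions at the assumed rates µ = 0.05 and qP = 0.05 and divides every q-rate by qP; the dictionary keys and variable names are illustrative.

```python
# Coefficients per compound from (9.67), (9.68) and the maintenance reaction.
GROWTH = {"glucose": -0.333, "NH4+": -0.2, "O2": -0.95, "biomass": 1.0,
          "lysine": 0.0, "CO2": 1.0, "H+": 0.2, "H2O": 1.4}
PRODUCT = {"glucose": -1.5, "NH4+": -2.0, "O2": -2.0, "biomass": 0.0,
           "lysine": 1.0, "CO2": 3.0, "H+": 1.0, "H2O": 5.0}
MAINTENANCE = {"glucose": -1.0, "NH4+": 0.0, "O2": -6.0, "biomass": 0.0,
               "lysine": 0.0, "CO2": 6.0, "H+": 0.0, "H2O": 6.0}

mu = 0.05          # C-molX/C-molX/h
q_p = 0.05         # mol lysine/C-molX/h
minus_m_s = 0.005  # mol glucose/C-molX/h

# Each q-rate is the rate-weighted sum of the three reaction stoichiometries.
q = {i: GROWTH[i] * mu + PRODUCT[i] * q_p + MAINTENANCE[i] * minus_m_s
     for i in GROWTH}

# Overall reaction normalized to 1 mol lysine produced.
overall = {i: rate / q_p for i, rate in q.items()}

for compound, coef in overall.items():
    print(f"{coef:+.3f} {compound}")
```

The glucose coefficient comes out as −1.933 here rather than −1.934, because the text rounds qS to −0.0967 before dividing; the other coefficients match (9.70).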
9.9 Conclusions

In this chapter the basic concepts of black box modeling of fermentation processes have been introduced. It has been shown that a relatively simple approach, which does not require detailed information on the metabolism of the applied microorganism, is very useful for the design and optimization of fermentation processes. First the concept of single nutrient limited growth has been introduced. It has been shown that the kinetics of microbial growth and product formation under these conditions can be described with only a few parameters.

• A hyperbolic kinetic relation for substrate uptake, requiring only two parameters, Ks and $q_s^{\max}$.
• A relation between the specific rate of product formation, qP, and the specific growth rate, μ. For a noncatabolic product this function must be established experimentally.
• The Herbert–Pirt substrate distribution relation, which gives:
  • Information on the maintenance requirements, expressed in the parameter mS
  • The stoichiometry of the independent reactions for
    −− growth (which requires the stoichiometric parameter a = 1/$Y_{sx}^{\max}$)
    −− product formation (which requires the stoichiometric parameter b = 1/$Y_{sp}^{\max}$)
    −− maintenance (catabolism, which does not require a stoichiometric parameter)

This model description contains surprisingly few parameters ($q_s^{\max}$, Ks, mS, $Y_{sp}^{\max}$, $Y_{sx}^{\max}$, and several parameters in the qP(μ) function), but still provides a description of how all q-rates depend on CS, or equivalently how all q-rates and yields depend uniquely on μ. It should be considered remarkable that the enormous complexity of the living cell can be well described with respect to all relevant uptake and secretion rates using a relatively simple black box model with only a few parameters.
References and Further Reading

de Poorter, L.M.I., Geerts, W.J., and Keltjens, J.T., 2007. Coupling of Methanothermobacter thermautotrophicus methane formation and growth in fed-batch and continuous cultures under different H2 gassing regimes. Appl. Environ. Biotechnol., 73:740–49.
Geerdink, M.J., van Loosdrecht, M.C.M., and Luyben, K.Ch.A.M., 1996. Biodegradability of diesel oil. Biodegradation, 7:73–81.
Jansen, M.L.A., Krook, D.J.J., de Graaf, K., van Dijken, J.P., Pronk, J.T., and de Winde, J.H., 2006. Physiological characterization and fed-batch production of an extracellular maltase of Schizosaccharomyces pombe CBS 356. FEMS Yeast Res., 6:888–901.
Heijnen, J.J., Roels, J.A., and Stouthamer, A.H., 1979. Application of balancing methods in modeling the penicillin fermentation. Biotechnol. Bioeng., 21(12):2175–201.
Heijnen, J.J., 1991. A new thermodynamically based correlation of chemotrophic biomass yields. Antonie Van Leeuwenhoek, 60(3–4):235–56.
Lineweaver, H. and Burk, D., 1934. The determination of enzyme dissociation constants. J. Am. Chem. Soc., 56:658–66.
Michaelis, L. and Menten, M., 1913. Die Kinetik der Invertinwirkung. Biochem. Z., 49:333–69.
Pirt, S.J., 1965. The maintenance energy of bacteria in growing cultures. Proc. R. Soc. Lond. Ser. B, 163:224–31.
Revilla, G., Lopez-Nieto, M.J., Luengo, J.M., and Martin, J.F., 1984. Carbon catabolite repression of penicillin biosynthesis by Penicillium chrysogenum. J. Antibiot. (Tokyo), 37(7):781–89.
Savageau, M.A., 1995. Michaelis-Menten mechanism reconsidered: Implications of fractal kinetics. J. Theor. Biol., 176:115–24.
Schill, N., van Gulik, W.M., Voisard, D., and von Stockar, U., 1996. Continuous cultures limited by a gaseous substrate: Development of a simple, unstructured mathematical model and experimental verification with Methanobacterium thermoautotrophicum. Biotechnol. Bioeng., 51:645–58.
Smolders, G.J.F., Van der Meij, J., Van Loosdrecht, M.C.M., and Heijnen, J.J., 1994. Model of the anaerobic metabolism of the biological phosphorus removal process; stoichiometry and pH influence. Biotechnol. Bioeng., 43:461–70.
Van Gulik, W.M., Antoniewicz, M.R., deLaat, W.T., Vinke, J.L., and Heijnen, J.J., 2001. Energetics of growth and penicillin production in a high-producing strain of Penicillium chrysogenum. Biotechnol. Bioeng., 72(2):185–93.
Van Gulik, W.M., ten Hoopen, H.J., and Heijnen, J.J., 2001. The application of continuous culture for plant cell suspensions. Enzyme Microb. Technol., 28(9–10):796–805.
Xu, F. and Ding, H., 2007. A new kinetic model for heterogeneous (or spatially confined) enzymatic catalysis: Contributions from the fractal and jamming (overcrowding) effects. Appl. Catal. A Gen., 317:70–81.
10 Metabolic Models for Growth and Product Formation

Walter M. van Gulik
Delft University of Technology

10.1 Introduction
10.2 Modular Approach
10.3 Detailed Stoichiometric Models
Estimation of ATP Stoichiometry Parameters • ATP Stoichiometry in Metabolic Networks • Calculation of Maximum Yields of Biomass • Calculation of Maximum Yields of Biomass and Product • Calculation of Metabolic Network Topology for Growth on Mixed Substrates • Theoretical Yield Limits to the Overproduction of Amino Acids • Limit Functions for Maximum Product Yields
10.4 Conclusions
References
10.1 Introduction

In many fermentation processes the product yield YSP is a parameter of major economic importance. The theoretical maximum to the product yield for a certain organism and product, $Y_{sp}^{\max}$, is determined by the stoichiometry of the product pathway and connected central metabolic pathways. The theoretical maximum product yield can be changed by changing the stoichiometry of the metabolic network by means of genetic interventions, such as:

• Replacement of a transporter which consumes ATP by a transporter which does not (active transport becomes passive transport or vice versa)
• Replacement of a decarboxylation reaction
• Replacement of an NADPH consuming reaction by an NADH consuming reaction (or vice versa)
• Introduction of a catabolic pathway for a novel substrate
• Replacement of an ATP consuming reaction by a non-ATP consuming reaction
• Introduction of an alternative product pathway

The quantitative effects of such changes on the maximum theoretical yields of product, but also of biomass on substrate, can be calculated with so-called stoichiometric metabolic models. In principle a stoichiometric metabolic model is tailor made for each organism and incorporates all the available biochemical information of the studied organism. With the advent of (partially) annotated genomes the present knowledge is rather extensive (and is increasing every day) and large scale stoichiometric metabolic models based on genomic information may easily contain more than 1000 metabolic reactions (Feist et al., 2007; Oh et al., 2007; Duarte et al., 2004).
Table 10.1 Setting Up a Stoichiometric Metabolic Model

Detailed Approach
• Include all potentially available enzymes based on textbook, literature and (annotated) genome information.
• Use transcript analysis on expressed genes and enzymes present.
• Formulate the complete stoichiometry of each of the about 200–1000 reactions.
• Define the stoichiometric matrix S.
• Apply matrix calculations for analysis of the model and flux analysis.

Modular Approach
• Central metabolism lumped into the biosynthesis reactions for the 12 key carbon metabolites.
• Lumped reactions for regeneration of key cofactors such as ATP, NADH, NADPH.
• Lumped reactions for the biosynthesis of monomers, e.g., amino acids, nucleotides, fatty acids from the key carbon metabolites.
• Reactions for polymerization of monomers to polymers.
• Single, lumped reaction for biomass formation.
• Lumped reaction for product formation.
• Calculations can be performed by hand.
In principle there are two approaches to formulating a stoichiometric metabolic model (see Table 10.1). The detailed approach, which results in a large and detailed model containing hundreds of reactions, is becoming more and more feasible with genome-wide approaches. Here the computer must be used, and software is under development which allows the direct definition of the stoichiometric metabolic model from the annotated genome (together with, e.g., transcriptome information, which yields the expressed genes and hence the enzymes, and therefore the reactions, that are present). It has been shown that, because of their detailed structure, genome-scale metabolic models are well suited for the in silico analysis of, e.g., the behavior of certain organisms and the a priori assessment of the effects of alterations of the metabolic network stoichiometry by means of genetic intervention. However, the various applications of genome-scale models will not be treated here, but can be found in Section IV of this book. In this section we will focus on the applications of stoichiometric metabolic models of moderate size, containing 100–200 reactions, as well as the modular approach.
10.2 Modular Approach

The synthesis of each of the hundreds of different molecules which are present in the cells (amino acids present in proteins, nucleotides present in RNA and DNA, fatty acids, glycerol, and other compounds present in the membrane lipids, carbohydrates and other compounds present in cell walls, cofactors, etc.) and in secreted products occurs in the biosynthetic pathways. Here each particular compound synthesized has its own pathway, often a sequence of enzyme-catalyzed steps. In the modular approach many parts of the metabolic network are lumped into a single reaction. This reduces the complexity to a large extent, at the cost of losing detailed stoichiometric information. If the lumping procedure is performed properly, however, the simplified model should still provide the correct information on the overall stoichiometry of metabolism, e.g., maximum theoretical yields of biomass and product(s).

This approach was followed by Ingraham et al. (1983). They showed that precursor synthesis in central metabolism can be reduced to 12 reactions for the production of the 12 key intermediates from the C-source supplied. As an example they tabulated the biosynthesis costs, in terms of C-source and associated consumption/production of ATP, NADPH, NADH, and CO2, for two different substrates, namely glucose and malate. Furthermore, reactions were defined for the biosynthesis of the monomers, i.e., amino acids, nucleotides, fatty acids, lipopolysaccharides, carbohydrates, etc., from the 12 precursors, 1-C compounds, NH3, and S, taking into account the consumption/production of ATP, NADPH, and NADH. The energy requirements for polymerization of the monomers to macromolecules were also given. Finally, from the biochemical composition of the biomass, the required amounts of building blocks were calculated.

This approach can be considered as a relatively simple stoichiometric metabolic model which provides understanding of how cellular metabolism is organized and demonstrates what resources are needed to produce all the building blocks and coenzymes for the production of a certain amount of cells. If in addition the production of ATP and reducing equivalents from the substrate is included, this approach allows the calculation of biomass yields on different C-sources. However, in order to perform metabolic flux analysis (MFA), i.e., to calculate the fluxes through metabolic pathways under certain conditions, more detail is required. One of the first published papers on the application of a stoichiometric metabolic model is probably that of Verhoff and Spradlin (1976). They used a stoichiometric model of the TCA cycle, including variations thereof, to analyze different possible metabolic routes for the production of citric acid by Aspergillus niger, which can be considered as a first example of elementary mode analysis.
10.3 Detailed Stoichiometric Models

One of the first examples of the application of a detailed stoichiometric metabolic model for the quantitative estimation of metabolic fluxes was published by Rabkin and Blum in 1985. They used a complete stoichiometric model of the “upper” metabolic pathways (gluconeogenesis, glycolysis, pentose phosphate pathway) and a minimal model of the “lower” pathways (mitochondrial and associated reactions) to perform MFA of hepatocytes in the presence and absence of the hormone glucagon.

At present the availability of (partly) annotated genomes for an increasing number of microorganisms offers the possibility of genome-scale metabolic reconstruction, i.e., the construction of detailed metabolic models based on the available genes. Such models therefore consist of a large number of biochemical reactions, often more than 1000, and contain many parallel pathways, a large number of transport reactions for many different compounds which may serve as alternative substrates, and many connected catabolic reactions. In fact a genome-scale metabolic reconstruction should not be considered as a model but rather as a database of all biochemical reactions for which a certain organism has the genes available. Because most genomes are not yet completely annotated, these genome-scale reconstructions contain many dead-end reactions, that is, reactions which produce a certain compound for which no reaction is available to consume it. This is not a problem, because genome-scale reconstruction is an ongoing process: as annotation of genomes proceeds, the reaction databases can be extended and dead ends can be resolved.

It should be realized that in real life a microorganism growing under certain conditions and in the presence of certain nutrients never expresses all available genes. Depending on the conditions which the microorganisms encounter, genetic regulation will alter the topology of the biochemical reaction network such that the organism is optimally adapted to the growth conditions. Thus, depending on the growth conditions, the biochemical reaction network consists of a certain subset of all available reactions. Genome-scale metabolic reconstructions can be used, e.g., to explore the metabolic capabilities of a certain microorganism to adapt to certain conditions, to predict the effects of genetic alterations, to identify and characterize all possible phenotypes, and to calculate optimal reaction network states for maximum product formation (Price et al., 2004).

Genome-scale metabolic models can also be used to calculate the flux distributions through the metabolic network from measured net conversion rates under certain conditions. However, this is not possible as such with a complete metabolic reconstruction for a certain microorganism, because many parallel pathways and futile cycles exist. Meaningful flux distributions can only be obtained by using a relevant subset of reactions from the complete database. Subsets of reactions can be obtained in several ways: either manually, by using available biochemical and/or transcriptome information on which enzymes are expressed under certain conditions, or computationally, by means of constraint based optimization. In constraint based optimization, linear programming (LP), for example, is used to calculate the flux distribution through the metabolic network under the condition that a certain objective function, e.g., yield of biomass on substrate, is maximized. It has been shown that from a large database of reactions, obtained from genomic reconstruction, largely reduced minimal models can be obtained for the description of growth under certain defined conditions, e.g., on a defined minimal medium (Burgard et al., 2001), as is often the case under laboratory conditions and, in an increasing number of cases, under industrial conditions. Burgard et al. showed that a large stoichiometric model for E. coli, consisting of 720 biochemical reactions, could be reduced to 224 reactions to support growth on a glucose-only medium and to 229 for an acetate-only medium. Such reduced models can subsequently be applied to calculate flux distributions by means of MFA.

However, uncertainties remain concerning specific details of the metabolic network, e.g., possible alternative pathways, intracellular compartmentation and cofactor specificity of particular reactions. If these details are important, knowledge on specific aspects can be obtained through additional biochemical research. An exception to this is the operational ATP-stoichiometry of oxidative phosphorylation, as well as the growth dependent and growth independent maintenance energy costs. These are generally not known beforehand and cannot be obtained in a straightforward manner from biochemical research. Furthermore, the values of these parameters vary between microorganisms. Therefore the ATP-balance is normally not used as a constraint in the flux balancing procedures. However, the unknown ATP-stoichiometry parameters can be estimated from experimental data, as will be shown later on in this chapter. It will also be demonstrated that if the ATP stoichiometry is known the stoichiometric model can be applied to perform a priori flux calculations, and to calculate maximum theoretical yields of biomass and product on single and multiple substrates.
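The notion of a dead end in a reconstruction, mentioned earlier in this section, can be made concrete with a small sketch. The toy network below is entirely hypothetical (reaction and compound names are invented for illustration and do not come from any real reconstruction), and all reactions are assumed irreversible. The code flags intracellular compounds that are only produced or only consumed, and the reactions that involve them.

```python
# Hypothetical toy network: each reaction maps compound -> stoichiometric coefficient
# (negative = consumed, positive = produced). All names are illustrative only.
REACTIONS = {
    "uptake":  {"glc_ext": -1, "G6P": 1},
    "upper":   {"G6P": -1, "PYR": 2},
    "ferment": {"PYR": -1, "etoh_ext": 1, "co2_ext": 1},
    "side":    {"PYR": -1, "X": 1},   # X is produced but nothing consumes it
}
EXTERNAL = {"glc_ext", "etoh_ext", "co2_ext"}  # compounds exchanged with the medium

def dead_end_compounds(reactions, external):
    """Return intracellular compounds that are only produced or only consumed."""
    produced, consumed = set(), set()
    for stoich in reactions.values():
        for compound, coef in stoich.items():
            (produced if coef > 0 else consumed).add(compound)
    internal = (produced | consumed) - external
    return {c for c in internal if (c in produced) != (c in consumed)}

def dead_end_reactions(reactions, external):
    """Return reactions that involve at least one dead-end compound."""
    dead = dead_end_compounds(reactions, external)
    return {name for name, stoich in reactions.items() if dead & set(stoich)}

print(dead_end_compounds(REACTIONS, EXTERNAL))
print(dead_end_reactions(REACTIONS, EXTERNAL))
```

At steady state a dead-end reaction must carry zero flux, which is why such reactions are harmless in the reaction database but drop out of any reduced model used for flux calculations.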
10.3.1 Estimation of ATP Stoichiometry Parameters
By using different carbon substrates or ratios of mixed substrates and different growth rates, the relative contributions of substrate-level and oxidative phosphorylation to the total generation of ATP can be manipulated experimentally. This allows the estimation of the unknown ATP-stoichiometry coefficients of oxidative phosphorylation (P/O ratio), growth-dependent maintenance (K_X), and growth-independent maintenance (m_ATP) of the metabolic network model from experimental data (van Gulik and Heijnen, 1995; Vanrolleghem et al., 1996; van Gulik et al., 2001). A correct estimation of these coefficients, and moreover a verification of whether these coefficients may be considered constant for a certain range of experimental conditions, is crucial for acceptable flux predictions within the range of experimental conditions (interpolation) and beyond (extrapolation). This is for instance the case if the metabolic network model is used to predict maximum biomass and/or product yields. So far this aspect has received little attention. The method to estimate the ATP stoichiometry parameters is directly based on the ATP balance, which contains the parameters only in a linear form. This makes the application of the estimation procedure relatively straightforward.
10.3.2 ATP Stoichiometry in Metabolic Networks
The basis of metabolic network models is formed by the balance equations formulated for the components that take part in the biochemical conversions in the cell. For ATP, a component considered to be in pseudo steady state, the production equals the consumption, which puts the net result of the balance to zero. Although the ATP stoichiometry coefficients of many ATP-generating and ATP-consuming reactions are known, difficulties arise with respect to the uncertain ATP stoichiometry of oxidative phosphorylation, additional ATP costs of anabolism, and the ATP consumption in maintenance processes. As a result, the ATP balance can be written as:
\[ \frac{P}{O} \cdot q_{2e} - \sum_i q_i^{ATP} - K_X \cdot \mu - m_{ATP} = 0 \qquad (10.1) \]
where q_2e is the specific flux of electrons through the respiratory chain, Σq_i^ATP is the net rate of ATP consumption in the part of the metabolic network model for which the ATP stoichiometry is known (i.e., the net result of all stoichiometrically fixed ATP usage and of ATP production in substrate-level phosphorylation), and µ is the specific growth rate of the cells.
The parameters K_X and m_ATP are operational values for growth-associated maintenance and nongrowth-associated maintenance, respectively. It should be realized that P/O, being the rate of ATP synthesis divided by the rate of oxygen consumption in oxidative phosphorylation, cannot be considered a parameter. The reason is that this ratio is determined by the division of the electron flux over the different proton-translocating complexes (I, III, and IV) of the respiratory chain, which have different H+/2e stoichiometries. This division, and thus the P/O ratio, is a function of the growth conditions, e.g., the carbon substrate used, the growth rate, the rate of product formation, etc. However, if the metabolic model applied for the metabolic flux balancing is sufficiently detailed, the origin of the reducing equivalents generated in microbial catabolism is known, and thus also the relative contributions of complexes I, III, and IV of the respiratory chain to oxidative phosphorylation. To include this, the ATP balance has to be extended to:
\[ \delta \cdot \left( q_{2e}^{NADH:mit} + \alpha \cdot q_{2e}^{NADH:cyt} + \beta \cdot q_{2e}^{FADH} \right) - \sum_i q_i^{ATP} - K_X \cdot \mu - m_{ATP} = 0 \qquad (10.2) \]
where α and β represent the relative contributions to proton translocation of electrons delivered by cytosolic NADH and FADH, respectively. The values of these parameters depend on the construction of the electron transport chain of the organism under study. If electrons derived from cytosolic NADH and FADH bypass complex I, both α and β may have, for example, a value of 2/3. If complex I is not operative, e.g., in the case of Saccharomyces cerevisiae, α and β are both equal to 1. The parameter δ represents the maximum P/O ratio, i.e., the ratio obtained when all electrons pass the complete respiratory chain, and is thus, by definition, not equal to the P/O ratio as defined in Equation 10.1. However, if the H+/2e stoichiometry in proton translocation as well as the H+/ATP stoichiometry of the ATP synthase can be considered independent of the growth conditions, δ can also be considered independent of the growth conditions. Estimates of the parameters δ, K_X, and m_ATP are obtained by calculating, for each experimental condition, the values of q_2e^NADH:mit, q_2e^NADH:cyt, q_2e^FADH, Σq_i^ATP, and µ from metabolic flux balancing without using the ATP balance as a constraint in the flux balancing procedure. This can be accomplished either by leaving out the ATP balance from the network model altogether or by including an ATP-hydrolysis reaction:
\[ \mathrm{ATP + H_2O \rightarrow ADP + P_i + H^+} \qquad (10.3) \]
Subsequently, after the fluxes have been obtained from metabolic flux balancing, the coefficients δ, K_X, and m_ATP are estimated using Equation 10.2. This requires at least three different sets of the above-mentioned fluxes (q's and µ). These can be obtained, e.g., by performing chemostat cultivations on different carbon substrates or on a varying ratio of mixed substrates (van Gulik and Heijnen, 1995; Vanrolleghem et al., 1996; van Gulik et al., 2001). Because Equation 10.2 is linear in the parameters, the estimation procedure is straightforward. If only sets at a single growth rate are available, growth-dependent and nongrowth-dependent maintenance cannot be distinguished and Equation 10.2 must be simplified to:
\[ \delta \cdot \left( q_{2e}^{NADH:mit} + \alpha \cdot q_{2e}^{NADH:cyt} + \beta \cdot q_{2e}^{FADH} \right) - \sum_i q_i^{ATP} - K' \cdot \mu = 0 \qquad (10.4) \]
in which K′ is the overall result of growth related maintenance and non growth-related maintenance:
\[ K' = K_X + \frac{m_{ATP}}{\mu} \qquad (10.5) \]
The resulting values of δ, K X, and mATP (or δ and K′) will be the best estimates for the given sets of fluxes.
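Because Equation 10.2 is linear in δ, K_X, and m_ATP, the estimation reduces to ordinary linear least squares. The following is a minimal sketch of that idea, not the procedure used in the cited studies. It assumes the sign convention that Σq_i^ATP is a net consumption rate, so that Σq_i^ATP = δ·E − K_X·µ − m_ATP with E the weighted electron flux. All flux values, the values α = β = 2/3, and the function names are hypothetical; the "true" parameters are used only to generate synthetic data that the fit should then recover.

```python
# Sketch: least-squares estimate of delta, K_X and m_ATP from Equation 10.2.
# All fluxes below are synthetic placeholders, not measured data.

def solve_linear(a, b):
    """Solve the square system a*x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def weighted_electron_flux(c, alpha, beta):
    """E = q(NADH:mit) + alpha*q(NADH:cyt) + beta*q(FADH), as in Equation 10.2."""
    return c["q_nadh_mit"] + alpha * c["q_nadh_cyt"] + beta * c["q_fadh"]


def estimate_atp_parameters(conditions, alpha, beta):
    """Fit [delta, K_X, m_ATP] to sum_q_atp = delta*E - K_X*mu - m_ATP."""
    rows, rhs = [], []
    for c in conditions:
        rows.append([weighted_electron_flux(c, alpha, beta), -c["mu"], -1.0])
        rhs.append(c["sum_q_atp"])
    # Normal equations (A^T A) x = A^T y for the three unknowns.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
    return solve_linear(ata, aty)


ALPHA = BETA = 2.0 / 3.0            # illustrative assumption only
TRUE = (1.84, 0.38, 0.033)          # used only to generate synthetic data

conditions = [
    {"mu": 0.03, "q_nadh_mit": 0.10, "q_nadh_cyt": 0.03, "q_fadh": 0.015},
    {"mu": 0.06, "q_nadh_mit": 0.15, "q_nadh_cyt": 0.06, "q_fadh": 0.03},
    {"mu": 0.10, "q_nadh_mit": 0.30, "q_nadh_cyt": 0.06, "q_fadh": 0.03},
    {"mu": 0.05, "q_nadh_mit": 0.25, "q_nadh_cyt": 0.03, "q_fadh": 0.03},
]
for c in conditions:
    e = weighted_electron_flux(c, ALPHA, BETA)
    c["sum_q_atp"] = TRUE[0] * e - TRUE[1] * c["mu"] - TRUE[2]

delta, k_x, m_atp = estimate_atp_parameters(conditions, ALPHA, BETA)
print(delta, k_x, m_atp)
```

With noise-free synthetic data the fit returns the generating parameters up to rounding error. With real chemostat data the residuals would also be needed to judge whether the parameters can be treated as constant over the experimental range.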
10.3.3 Calculation of Maximum Yields of Biomass
A detailed stoichiometric metabolic model which contains the proper ATP stoichiometry allows the calculation of maximum theoretical yields of biomass and product(s) on the carbon substrate and therefore (see Chapter 9 of this section on black box modeling) also on consumed oxygen, produced carbon dioxide, etc. An example of this approach can be found in van Gulik and Heijnen (1995). They used published data on biomass yields in steady-state carbon-limited chemostat cultures of S. cerevisiae and C. utilis at a dilution rate of 0.1 h⁻¹ (Verduyn, 1991). The different conditions were anaerobic growth of S. cerevisiae on glucose; aerobic growth of S. cerevisiae on glucose, ethanol, and acetate; and aerobic growth of C. utilis on acetate, citrate, ethanol, gluconate, glucose, glycerol, lactate, pyruvate, and succinate. Determined stoichiometric metabolic models were constructed for the two yeast strains and the different growth conditions. All models contained three degrees of freedom, which means that three rates had to be specified in order to calculate all other rates (net conversion rates and reaction rates). The rates to be specified were the specific growth rate of the cells, μ, the rate of ATP hydrolysis to account for maintenance, and (in the case of aerobic growth) the rate of NADH oxidation, which was required to introduce the P/O ratio as a parameter in the stoichiometric model:
ATP hydrolysis for maintenance: 1 ATP + 1 H2O → 1 ADP + 1 Pi + 1 H+
NADH oxidation: 1 NADH + 0.5 O2 + 1 H+ → 1 H2O + 1 NAD+
The fact that all experimental data were collected at one and the same growth rate prevents the distinction between growth- and nongrowth-associated maintenance energy requirements. As a consequence, the growth-dependent and growth-independent maintenance coefficients could not be estimated separately, but only a combination of the two, namely K′ (see Equation 10.5). It is known, however, that nongrowth-associated maintenance needs of yeasts are relatively low (Verduyn et al., 1991), certainly at a specific growth rate of 0.1 h⁻¹. Therefore, it could be assumed that for both S. cerevisiae and C. utilis nongrowth-associated maintenance energy needs were negligible. In this study the costs for peptide chain elongation were assumed to be 4 ATP per amino acid. However, as pointed out by Verduyn et al. (1991), it should be realized that this is a relatively uncertain figure which might well be higher due to extra energy costs associated with the addition of incorrect amino acids to the chain and with subsequent proofreading. For this reason, growth-associated maintenance energy needs were assumed to be proportional to the rate of protein synthesis and to equal an amount of K′ mol ATP/C-mol of protein synthesized. The rate of extra ATP consumption for growth-associated maintenance therefore equals:
\[ r_{ATP,maint} = K' \cdot X_P \cdot \mu \qquad (10.6) \]
where X P is the protein fraction of the biomass. From the metabolic network for anaerobic growth of S. cerevisiae on glucose, which had two degrees of freedom, namely the growth rate μ and the rate of ATP hydrolysis for maintenance, the following expression for the biomass yield as a function of ATP consumption for maintenance was obtained:
\[ Y_{SX,netw}^{an} = \frac{1}{0.904 + 0.5 \cdot K' \cdot X_P(i)} \qquad (10.7) \]
Under aerobic conditions the biomass yield is a function of two parameters, the effective P/O ratio, δ, and the maintenance coefficient, K′ and has the general form:
\[ Y_{SX,netw}^{aer}(i) = \frac{\alpha_1(i) + \alpha_2(i) \cdot \delta}{\alpha_3(i) + \alpha_4(i) \cdot \delta + \alpha_5(i) \cdot K' \cdot X_P(i)} \qquad (10.8) \]
where α1(i) to α5(i) (for i = 1–12, i.e., the 12 different yeast-substrate combinations under aerobic conditions) are constant coefficients. It was assumed that the value of K′ is the same for both S. cerevisiae and C. utilis. However, the effective P/O ratio, δ, was assumed to be different, as follows from the differences between the electron transport chains of the two yeasts, i.e., S. cerevisiae lacks phosphorylation site I while C. utilis does not. Both parameters, K′ and δ, were assumed not to be influenced by the carbon substrate used. Thus, three parameters had to be estimated to describe the growth yields of the two yeasts on different carbon substrates. This was accomplished by minimizing the sum of the squared differences between the experimental biomass yields, Y_SX,exp(i), and the biomass yields calculated from the model, Y_SX,netw(i). The estimated values for the effective P/O ratio and growth-associated ATP needs for maintenance were, respectively, δ = 1.20 for S. cerevisiae and δ = 1.53 for C. utilis, and K′ = 1.37 mol ATP/C-mol protein for both yeasts. The lower estimate of the effective P/O ratio of S. cerevisiae agrees well with the absence of phosphorylation site I. Using an average figure for the protein content of the biomass of 50%, or 0.022 C-mol protein/g biomass, the estimated growth-associated maintenance was 30 mmol ATP/g biomass. Having the estimated values of the ATP stoichiometry parameters, δ and K′, makes it possible to calculate the biomass yields for the different experimental conditions using Equations 10.7 and 10.8. A comparison between the yields calculated from the metabolic networks with the fitted ATP stoichiometry parameters and the experimental yields obtained for S. cerevisiae and C. utilis is shown in Figure 10.1. As can be seen from this figure, the experimental biomass yields could be predicted well for growth on a wide variety of substrates using the estimated values for the effective P/O ratio and growth-associated maintenance.
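As a quick check of the maintenance figure quoted above, the fitted K′ can be multiplied by the assumed protein content of the biomass. Both numbers are taken from the text; only the rounding is added here.

```latex
\[
1.37\ \frac{\text{mol ATP}}{\text{C-mol protein}}
\times 0.022\ \frac{\text{C-mol protein}}{\text{g biomass}}
\approx 0.030\ \frac{\text{mol ATP}}{\text{g biomass}}
= 30\ \frac{\text{mmol ATP}}{\text{g biomass}}
\]
```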
10.3.4 Calculation of Maximum Yields of Biomass and Product
A comparable approach to that described above was followed for P. chrysogenum, although a more detailed metabolic network was constructed wherein also cellular compartmentation, i.e., division of the
[Figure 10.1: predicted yield (g/g) plotted against measured yield (g/g), both axes from 0 to 0.8; data points numbered 1 to 13]
Figure 10.1 Predicted versus measured biomass yields of S. cerevisiae and Candida utilis in carbon-limited chemostat culture at a dilution rate of 0.1 h⁻¹. (∆): S. cerevisiae: 1. anaerobic growth on glucose; 2. aerobic growth on glucose; 3. aerobic growth on ethanol; 4. aerobic growth on acetate. (■): Aerobic growth of C. utilis: 5. acetate; 6. succinate; 7. lactate; 8. gluconate; 9. glucose; 10. citrate; 11. glycerol; 12. ethanol; 13. pyruvate. (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)
Table 10.2 Estimated Values of the ATP-Stoichiometry Parameters for P. chrysogenum with Their 95% Confidence Intervals
Parameter    Value
δ            1.84 ± 0.08 mol ATP/mol O
K_X          0.38 ± 0.11 mol ATP/C-mol biomass
K_P          73 ± 20 mol ATP/mol penicillin
m_ATP        0.033 ± 0.012 mol ATP/C-mol biomass/h
Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke J.L., and Heijnen, J.J., Biotechnol. Bioeng., 71:185–193. With permission.
cells into cytosolic, mitochondrial, and peroxisomal compartments, was taken into account (van Gulik et al., 2000). From chemostat experiments on three different carbon sources, carried out at a range of different dilution rates, the ATP stoichiometry parameters, that is, the P/O ratio and the growth-dependent and growth-independent maintenance coefficients, were estimated (van Gulik et al., 2001). In addition, a further parameter, K_P, was introduced to account for additional ATP dissipation for penicillin-G production. Because the penicillin biosynthesis pathway is divided over different compartments of the cell and the final product is actively excreted, it was anticipated that additional ATP would be required for transport processes. However, the estimated value of the parameter K_P appeared to be surprisingly high, namely 73 mol ATP per mol of penicillin-G produced. The estimated parameters are shown in Table 10.2. In a similar way as has been pointed out above, expressions for the maximum yields of biomass and of the product penicillin-G on substrate as a function of the P/O ratio and the growth-dependent and growth-independent maintenance coefficients were derived from the stoichiometric metabolic models for the different substrates. The obtained relations are shown in Table 10.3. After substitution of the estimated ATP stoichiometry parameters, the numerical values for the yield and maintenance coefficients on the supplied carbon source and on oxygen can be calculated (see Tables 10.4 and 10.5, respectively). Validation of the predictions of the biomass yield under producing and nonproducing conditions in independent experiments showed that model predictions and experimental results corresponded very well (see Table 10.6).
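The substitution step can be sketched in a few lines of Python. The coefficients below are copied from Table 10.3 and the parameter point estimates from Table 10.2; the function and dictionary names are hypothetical. The maintenance relations are read here as ((5 − 2δ)/(p·δ + q) + r)·m_ATP, which is an assumption about the printed layout, chosen because it reproduces the tabulated values in Tables 10.4 and 10.5. Confidence intervals are not propagated in this sketch.

```python
# Point estimates from Table 10.2 (confidence intervals ignored in this sketch).
DELTA = 1.84    # mol ATP/mol O
K_X = 0.38      # mol ATP/C-mol biomass
K_P = 73.0      # mol ATP/mol penicillin
M_ATP = 0.033   # mol ATP/C-mol biomass/h

# Table 10.3 yield relations, read as (delta + a) / (b*delta + c*K + d).
# Each tuple is (a, b, c, d); K is K_X for biomass and K_P for penicillin.
YIELD_COEFFS = {
    "glucose": {"Y_SX": (0.283, 1.07, 0.566, 1.02),
                "Y_OX": (0.283, 0.0256, 0.566, 0.727),
                "Y_SP": (0.283, 11.2, 0.566, 12.3),
                "Y_OP": (0.283, 1.72, 0.566, 9.58)},
    "ethanol": {"Y_SX": (-0.179, 0.732, 0.357, 0.82),
                "Y_OX": (-0.179, 0.0471, 0.536, 1.42),
                "Y_SP": (-0.179, 7.71, 0.357, 9.29),
                "Y_OP": (-0.179, 2.07, 0.536, 15.6)},
    "acetate": {"Y_SX": (-0.278, 1.14, 0.556, 1.37),
                "Y_OX": (-0.278, 0.0833, 0.556, 1.66),
                "Y_SP": (-0.278, 11.9, 0.556, 16.8),
                "Y_OP": (-0.278, 2.36, 0.556, 19.4)},
}

# Maintenance relations, read as ((5 - 2*delta) / (p*delta + q) + r) * m_ATP.
# Each tuple is (p, q, r). The grouping is an assumption (see lead-in).
MAINT_COEFFS = {
    "glucose": {"m_S": (9.83, 2.78, 0.203), "m_O": (9.83, 2.78, 0.203)},
    "ethanol": {"m_S": (13.0, -2.32, 0.154), "m_O": (8.67, -1.55, 0.231)},
    "acetate": {"m_S": (8.0, -2.22, 0.25), "m_O": (8.0, -2.22, 0.25)},
}


def max_yield(substrate, name, delta=DELTA, k_x=K_X, k_p=K_P):
    """Maximum yield of biomass (names ending in X) or penicillin (ending in P)."""
    a, b, c, d = YIELD_COEFFS[substrate][name]
    k = k_x if name.endswith("X") else k_p
    return (delta + a) / (b * delta + c * k + d)


def maintenance(substrate, name, delta=DELTA, m_atp=M_ATP):
    """Maintenance coefficient on substrate (m_S) or on oxygen (m_O)."""
    p, q, r = MAINT_COEFFS[substrate][name]
    return ((5.0 - 2.0 * delta) / (p * delta + q) + r) * m_atp


for substrate in YIELD_COEFFS:
    values = {n: round(max_yield(substrate, n), 3) for n in YIELD_COEFFS[substrate]}
    values["m_S"] = round(maintenance(substrate, "m_S"), 4)
    values["m_O"] = round(maintenance(substrate, "m_O"), 4)
    print(substrate, values)
```

Because δ, K_X, K_P, and m_ATP are keyword arguments, the same functions can be used to explore how sensitive each yield is to a change in one of the ATP stoichiometry parameters.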
10.3.5 Calculation of Metabolic Network Topology for Growth on Mixed Substrates
Using a relatively simple, uncompartmented stoichiometric model for the growth of S. cerevisiae on different carbon sources, van Gulik and Heijnen (1995) showed that constraint-based optimization can provide correct predictions of changes of the metabolic network structure initiated by changes in the environmental conditions. The subset of reactions applied in the model allowed for growth on both ethanol and glucose as carbon substrates because, in addition to the central metabolic pathways for glucose catabolism, the pathways required for growth on ethanol (i.e., gluconeogenesis and the glyoxylate shunt) were also present. This resulted in a stoichiometric metabolic model with seven degrees of freedom, which was underdetermined because the only input variables were the consumption rates of glucose and ethanol. Therefore constrained linear optimization was applied to estimate the metabolic flux pattern as a function of the glucose/ethanol ratio in the feed. The objective chosen for the optimization was maximum biomass yield on the mixed carbon substrate. Subsequently the fluxes through the metabolic network of S. cerevisiae were estimated for growth on glucose and ethanol alone and for growth on a range of glucose/ethanol mixtures, using the estimated values for the ATP stoichiometry parameters. It was calculated that changes in the topology of the metabolic network, that is, switching on and switching off of metabolic pathways, occurred at ethanol fractions of the feed of 0.09, 0.48, 0.58, and 0.73 C-mol/C-mol (Figure 10.2a through f).
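Table 10.3 Derived Relations for the Calculation of the Maximum Biomass and Penicillin Yields and Maintenance Coefficients on Substrate and Oxygen from the Estimated ATP-Stoichiometry Parameters for P. chrysogenum

Growth on Glucose
\[ \begin{aligned}
Y_{SX}^{max} &= \frac{\delta + 0.283}{1.07\delta + 0.566K_X + 1.02} \\
Y_{OX}^{max} &= \frac{\delta + 0.283}{0.0256\delta + 0.566K_X + 0.727} \\
Y_{SP}^{max} &= \frac{\delta + 0.283}{11.2\delta + 0.566K_P + 12.3} \\
Y_{OP}^{max} &= \frac{\delta + 0.283}{1.72\delta + 0.566K_P + 9.58} \\
m_S &= \left( \frac{5 - 2\delta}{9.83\delta + 2.78} + 0.203 \right) \cdot m_{ATP} \\
m_O &= \left( \frac{5 - 2\delta}{9.83\delta + 2.78} + 0.203 \right) \cdot m_{ATP}
\end{aligned} \]

Growth on Ethanol
\[ \begin{aligned}
Y_{SX}^{max} &= \frac{\delta - 0.179}{0.732\delta + 0.357K_X + 0.82} \\
Y_{OX}^{max} &= \frac{\delta - 0.179}{0.0471\delta + 0.536K_X + 1.42} \\
Y_{SP}^{max} &= \frac{\delta - 0.179}{7.71\delta + 0.357K_P + 9.29} \\
Y_{OP}^{max} &= \frac{\delta - 0.179}{2.07\delta + 0.536K_P + 15.6} \\
m_S &= \left( \frac{5 - 2\delta}{13\delta - 2.32} + 0.154 \right) \cdot m_{ATP} \\
m_O &= \left( \frac{5 - 2\delta}{8.67\delta - 1.55} + 0.231 \right) \cdot m_{ATP}
\end{aligned} \]

Growth on Acetate
\[ \begin{aligned}
Y_{SX}^{max} &= \frac{\delta - 0.278}{1.14\delta + 0.556K_X + 1.37} \\
Y_{OX}^{max} &= \frac{\delta - 0.278}{0.0833\delta + 0.556K_X + 1.66} \\
Y_{SP}^{max} &= \frac{\delta - 0.278}{11.9\delta + 0.556K_P + 16.8} \\
Y_{OP}^{max} &= \frac{\delta - 0.278}{2.36\delta + 0.556K_P + 19.4} \\
m_S &= \left( \frac{5 - 2\delta}{8\delta - 2.22} + 0.25 \right) \cdot m_{ATP} \\
m_O &= \left( \frac{5 - 2\delta}{8\delta - 2.22} + 0.25 \right) \cdot m_{ATP}
\end{aligned} \]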
Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke J.L., and Heijnen, J.J., Biotechnol. Bioeng., 71:185–193. With permission.
Table 10.4 Calculated Yield and Maintenance Parameters of Penicillin and Biomass on Carbon Source with Their 95% Confidence Intervals
C-source    Y_SX^max (C-mol/C-mol)    Y_SP^max (mol/C-mol)    m_S (C-mol/C-mol/h)
Glucose     0.663 ± 0.013             0.029 ± 0.004           0.0088 ± 0.0032
Ethanol     0.721 ± 0.015             0.034 ± 0.005           0.0071 ± 0.0026
Acetate     0.425 ± 0.010             0.020 ± 0.003           0.0117 ± 0.0042
Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke J.L., and Heijnen, J.J., Biotechnol. Bioeng., 71: 185–193. With permission.
Figure 10.2a shows the calculated metabolic fluxes through the central metabolic pathways for growth on 100% glucose. The first change in the network structure, which was predicted when the ethanol fraction of the feed was 0.09 C-mol/C-mol, was that the flux through transketolase converting xylulose 5-phosphate + erythrose 4-phosphate into glyceraldehyde 3-phosphate + fructose 6-phosphate became equal to zero (Figure 10.2b). The reason for this is that, at this point, there is no need for NADPH synthesis through the pentose phosphate pathway because sufficient NADPH can be produced in a more economic way through NADP-linked acetaldehyde dehydrogenase and NADP-linked isocitrate dehydrogenase. However, it should be realized that the predicted switch depends on the assumption that
Table 10.5 Calculated Yield and Maintenance Parameters of Penicillin and Biomass on Oxygen with Their 95% Confidence Intervals
C-source    Y_OX^max (C-mol/mol)    Y_OP^max (mol/mol)    m_O (mol/C-mol/h)
Glucose     2.15 ± 0.143            0.039 ± 0.008         0.0088 ± 0.0032
Ethanol     0.97 ± 0.04             0.028 ± 0.005         0.0106 ± 0.0038
Acetate     0.77 ± 0.03             0.024 ± 0.004         0.0117 ± 0.0042
Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke J.L., and Heijnen, J.J., Biotechnol. Bioeng., 71: 185–193. With permission.
Table 10.6 Measured and Calculated Effect of the Specific β-Lactam Production on the Steady State Biomass Concentration in Glucose Limited Chemostat Cultures of P. chrysogenum
                             Measured Biomass Concentration (g/L)    Calculated Biomass Concentration (g/L)
                             High Production    Low Production       High Production    Low Production
Chemostat at µ = 0.01 h⁻¹    1.95               2.74                 1.98               2.60
Chemostat at µ = 0.03 h⁻¹    2.77               3.27                 2.63               3.28
Chemostat at µ = 0.06 h⁻¹    3.25               3.69                 3.31               3.65
Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke J.L., and Heijnen, J.J., Biotechnol. Bioeng., 71: 185–193. With permission.
these two NADP-linked enzymes are active under these conditions. The second change in the network was predicted to occur at an ethanol content of the feed of 0.48 C-mol/C-mol. At this point all acetyl-CoA is synthesized from ethanol, and therefore the flux through pyruvate dehydrogenase becomes equal to zero (Figure 10.2c). When the ethanol content of the feed is further increased, the filling-up of the citric acid cycle can no longer be provided for by pyruvate carboxylase alone, and the metabolic network predicted that the glyoxylate shunt (i.e., isocitrate lyase and malate synthase) became operative. At an ethanol fraction of 0.58 C-mol/C-mol the flux through pyruvate carboxylase became equal to zero and the carbon flux was predicted to be channeled through PEP-carboxykinase to convert oxaloacetate to PEP (Figure 10.2d). Thereafter, a further increase of the ethanol fraction resulted in a predicted reversal of several reversible steps in glycolysis until, at an ethanol content of 0.73 C-mol/C-mol, the calculated flux through phosphofructokinase fell to zero and was replaced by the reversed reaction through fructose-1,6-bisphosphatase (Figure 10.2e). After this last change the minimal model for growth on ethanol was obtained. Figure 10.2f shows the metabolic flux pattern for growth on 100% ethanol. The question is now how to validate these model predictions experimentally. One way to do so would be to cultivate S. cerevisiae in carbon-limited chemostat cultures on different mixtures of glucose and ethanol and then measure the activities of the relevant enzymes. This was done by de Jong-Gubbels et al. (1995). They found that the activities of isocitrate lyase, malate synthase, PEP-carboxykinase, and fructose-1,6-bisphosphatase in cell-free extracts were negligible in chemostat cultures on 100% glucose.
However, when the cells were cultivated on glucose/ethanol mixtures, malate synthase activity was detected at an ethanol content of 0.4 C-mol/C-mol and above, and fructose-1,6-bisphosphatase activity was detected at an ethanol content of 0.7 C-mol/C-mol and higher. This corresponds very well with the predictions obtained from the metabolic network. However, activities of isocitrate lyase and PEP-carboxykinase were already detectable at low ethanol fractions of the feed. Furthermore, it was found that the pyruvate kinase activity decreased at increasing ethanol content. This was indeed predicted by the metabolic network (Figure 10.2a through f), although it was also predicted that the flux through this enzyme reached a low but constant level above an ethanol content of 0.58 C-mol/C-mol. Unfortunately, the experimental data on pyruvate kinase activity contained too much scatter to draw further conclusions. The model predicted that the flux
[Figure 10.2, panels (a) and (b): metabolic flux maps of the central metabolic pathways]
Figure 10.2 Estimated optimal metabolic flux patterns for aerobic growth of S. cerevisiae on a mixture of glucose and ethanol. All fluxes are given in C-mol of carbon transferred and are presented as fractions of the consumption rate of mixed carbon substrate (C-mol/h). (a) Growth on 100% glucose. (b) Cessation of NADPH production in the pentose phosphate pathway at an ethanol fraction of 0.09 C-mol/C-mol in the feed. (c) Cessation of the flux through pyruvate decarboxylase and start of glyoxylate cycle at 0.48 C-mol ethanol/C-mol. (d) Cessation of the flux through pyruvate carboxylase and, instead, reversal of the carbon flux via PEP-carboxykinase at an ethanol fraction of 0.58 C-mol/C-mol. (e) Reversal of several reversible steps in glycolysis, cessation of the carbon flux through phosphofructokinase, and instead reversal of the carbon flux via fructose-1,6-bisphosphatase at an ethanol fraction of 0.73 C-mol/C-mol. (f) Growth on 100% ethanol. (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)
through pyruvate carboxylase was constant up to an ethanol content of 0.48 C-mol/C-mol. Up to this ethanol fraction this is the sole anaplerotic route to fill up the TCA cycle. Between an ethanol fraction of 0.48 and 0.58 C-mol/C-mol the model predicted that the anaplerotic function of pyruvate carboxylase is gradually taken over by the glyoxylate shunt. However, in contrast to this, enzyme activity measurements did not reveal significant changes in the activity of pyruvate carboxylase upon transition from 100% glucose to 100% ethanol (de Jong-Gubbels et al., 1995). Finally, the flux through phosphofructokinase was predicted to decrease at increasing ethanol fractions and to fall to zero at an ethanol fraction of 0.73 (Figure 10.2e). Also, in this case, the in vitro measured enzyme activity was not influenced by the ethanol fraction in the feed. It was concluded by the authors, however, that the actual fluxes through the enzymes most probably have been modulated at the metabolome level, instead of the enzyme level.
[Figure 10.2 (Continued), panels (c) and (d): metabolic flux maps of the central metabolic pathways]
It is clear that in vitro enzyme activity measurements can be used to verify the presence of certain enzymes; however, they do not provide proof of an actual flux through an enzyme. A much more elegant approach to verify the model predictions for growth of S. cerevisiae on glucose/ethanol mixtures was followed by Stueckrath et al. (2002). They constructed null mutants for the glyoxylate cycle enzymes malate synthase and isocitrate lyase and the gluconeogenic enzymes PEP carboxykinase and fructose bisphosphatase. Subsequently these null mutants were cultivated in carbon-limited chemostat cultures on glucose/ethanol mixtures ranging from 0 to 100% ethanol. Following this approach the metabolic switching points can be found experimentally. At an increasing ethanol content of the feed the cells need certain pathways in order to be able to metabolize all the ethanol supplied. If a key enzyme is not available, the cells will only be able to catabolize the ethanol supplied up to a certain ethanol/glucose ratio. Above this ratio the surplus of the ethanol cannot be consumed, and this will result in measurable amounts of residual ethanol in the effluent of the chemostat. From the experiments carried out by Stueckrath et al. (2002) it was found that the null mutants for isocitrate lyase and for malate synthase, both of which result in a nonfunctional glyoxylate cycle, could grow in ethanol + glucose limited chemostats up to an ethanol fraction in the feed of 0.50 C-mol/C-mol. A further increase of the ethanol content of the feed resulted in a proportional increase of the residual ethanol concentration and a proportional decrease of the biomass concentration, up to an ethanol content of 100%, where growth of these mutants was not possible at all. This observation corresponded very well with the ethanol fraction of 0.48 C-mol/C-mol which was predicted by the stoichiometric model as the switch point for requirement of the glyoxylate cycle (see Figure 10.3).
It was found that, as was predicted by the model calculations, the PEP carboxykinase null mutant was able to grow at higher ethanol fractions.
[Figure 10.2 (Continued), panels (e) and (f): metabolic flux maps of the central metabolic pathways]
However, above an ethanol fraction of 0.60 C-mol/C-mol the residual ethanol concentration increased and the biomass concentration decreased in a linear fashion until no growth occurred at 100% ethanol. These experimental observations also corresponded very well with the predicted switch point for this enzyme of 0.58 C-mol/C-mol. Only the behavior of the fructose bisphosphatase null mutant during chemostat growth on the ethanol/glucose mixtures deviated from the model predictions, although the trend was predicted well. Already at an ethanol fraction of 0.60 C-mol/C-mol the measured biomass concentration was significantly lower than predicted by the model, although no residual ethanol could be detected. Residual ethanol was detected only above an ethanol fraction of 0.84 C-mol/C-mol, while the model predicted the metabolic switch to occur at an ethanol fraction of 0.73 C-mol/C-mol. It can be concluded from these results that metabolic models of still moderate complexity can provide a fairly accurate description of changes in metabolic network structure as a result of changes in growth conditions. Furthermore, the approach of validating the model predictions experimentally by constructing the proper null mutants proved to be very successful. The experimental results showed that the metabolic model was able to provide a quantitative description of the behavior of these null mutants during growth on ethanol/glucose mixtures.
10.3.6 Theoretical Yield Limits to the Overproduction of Amino Acids
As has been shown above for penicillin-G production in P. chrysogenum, stoichiometric metabolic models can be applied to calculate limits to maximum product yields, if they contain the proper ATP stoichiometry parameters. In the following example theoretical yield limits to the overproduction of amino
[Figure 10.3, panels (a) to (d): yield (C-mol/C-mol) versus ethanol fraction in feed (%), predicted and measured]
Figure 10.3 Predicted and measured biomass yields of S. cerevisiae grown in carbon limited chemostat cultures on different ratios of glucose and ethanol in the feed. (a) Wild type; (b) ∆ mls1 and ∆ icl1; (c) ∆ pck1; (d) ∆ fbp1. (From Stückrath, I., Lange, H.C., Kötter, P., van Gulik, W.M., Entian, K.-D., and Heijnen, J.J., Biotechnol. Bioeng., 2002, 77(1), 61–72. With permission.)
acids will be calculated using the uncompartmented metabolic network model for S. cerevisiae (van Gulik and Heijnen, 1995). When nongrowth-associated maintenance energy needs are negligible the well-known linear equation for substrate consumption for growth and product formation can be written as:
\[ q_S = \frac{\mu}{Y_{SX}^{max}} + \frac{q_P}{Y_{SP}^{max}} \qquad (10.9) \]
The operational yield of product on substrate is then given by:
\[ Y_{SP} = \frac{q_P}{q_S} = \frac{q_P}{\dfrac{\mu}{Y_{SX}^{max}} + \dfrac{q_P}{Y_{SP}^{max}}} \qquad (10.10) \]
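The behavior of Equation 10.10 can be illustrated with a short Python sketch. The function name and all numerical values are hypothetical and chosen only for illustration. The sketch shows that the operational yield equals the maximum product yield at zero growth rate and falls below it as soon as part of the substrate is spent on biomass.

```python
def operational_product_yield(mu, q_p, y_sx_max, y_sp_max):
    """Operational product yield Y_SP = q_P / q_S, with q_S from Equation 10.9.

    Assumes negligible nongrowth-associated maintenance, as in the text.
    """
    q_s = mu / y_sx_max + q_p / y_sp_max   # Equation 10.9
    return q_p / q_s                        # Equation 10.10


# Hypothetical illustration values (not taken from the source).
Y_SX_MAX = 0.6   # C-mol biomass per C-mol substrate
Y_SP_MAX = 0.4   # mol product per C-mol substrate
Q_P = 0.01       # specific product formation rate

for mu in (0.0, 0.01, 0.05, 0.1):
    print(mu, operational_product_yield(mu, Q_P, Y_SX_MAX, Y_SP_MAX))
```

At mu = 0 the function returns the maximum product yield; for any positive growth rate the returned yield is lower, which is the origin of the yield limit discussed in this section.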
By applying the metabolic network for growth of S. cerevisiae possible stoichiometric limits to amino acid production were studied. Using the estimated values of δ′ and K′ and glucose as the substrate, the metabolic network provides values for the $Y_{SP}^{\max}$ parameter for each of the 20 amino acids which can theoretically be produced. From Equation 10.10 it follows that, at zero growth rate, μ, the maximum theoretical value of the operational product yield, $Y_{SP}$, is equal to the parameter $Y_{SP}^{\max}$. For each amino acid, the value of $Y_{SP}^{\max}$ can be calculated from the metabolic network. However, it was found that calculation of the fluxes through the metabolic network for the production of each of the 20 amino acids at zero growth rate (μ = 0) resulted, in some cases, in thermodynamic inconsistencies (e.g., backward operation of the citric acid cycle). It appeared that these inconsistencies occurred only for amino acids for which the production was accompanied by a net production of ATP. These thermodynamic inconsistencies could be avoided by dissipating the excess ATP produced. In these cases, biomass production might be a sink for excess ATP produced. Another possibility the cells might have is hydrolysis of ATP in futile cycles. For this example it was assumed that excess ATP could only be consumed through biomass production. For each amino acid produced the minimum biomass production rate was calculated for which no thermodynamic inconsistencies occurred. From Equation 10.10 it can be inferred that, when biomass growth is required for production of these amino acids, and thus part of the carbon substrate is necessarily consumed for biomass formation, this will result in a limit to the maximum theoretical yield, $Y_{SP}^{\lim}$, where:
$$Y_{SP} \le Y_{SP}^{\lim} < Y_{SP}^{\max} \tag{10.11}$$
In such cases, ATP dissipation by other means, e.g., by increased maintenance energy requirements, would increase $Y_{SP}^{\lim}$. These limits have been calculated for all 20 amino acids. The results are shown in Figure 10.4. It can be seen from this figure that, for some amino acids, $Y_{SP}^{\lim}$ may reach values of only 50% or less of $Y_{SP}^{\max}$.
10.3.7 Limit Functions for Maximum Product Yields Given the stoichiometry of the metabolic network the linear equation for substrate consumption for growth and amino acid production can be derived, as has been shown above for penicillin production
[Figure 10.4 bar chart: theoretical product yield (Cmol/Cmol, axis 0 to 1.2) for alanine, arginine, asparagine, aspartate, cysteine, glutamate, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine.]
Figure 10.4 Metabolic network estimation of maximum theoretical yields for amino acid production in S. cerevisiae. Grey bars: Maximum theoretical yield of product on the carbon source under the assumption of zero biomass growth. Black bars: Limits to the theoretical product yields resulting from thermodynamic constraints (see text). (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)
in P. chrysogenum. As an example this was done for aerobic growth of S. cerevisiae on glucose with production of leucine, using the metabolic model of van Gulik and Heijnen (1995). The resulting equation contains the ATP-stoichiometry parameters and reads
$$-q_S = \frac{0.176\,\delta + 0.0833\,K'_X + 0.166}{\delta + 0.400}\,\mu + \frac{1.25\,\delta + 0.667}{\delta + 0.400}\,q_P \tag{10.12}$$
where δ is the P/O-ratio and $K'_X$ (mol ATP/C-mol biomass) is the growth dependent maintenance coefficient which was estimated from chemostat data obtained at a specific growth rate of 0.1 h⁻¹ (van Gulik and Heijnen, 1995). From this equation the maximum yields of biomass and leucine on glucose follow as:
$$Y_{SX}^{\max} = \frac{\delta + 0.400}{0.176\,\delta + 0.0833\,K'_X + 0.166} \quad\text{and}\quad Y_{LEU}^{\max} = \frac{\delta + 0.400}{1.25\,\delta + 0.667} \tag{10.13}$$
From the relation for the maximum product yield it could be inferred that the minimum yield of leucine on glucose (for δ = 0) would be 0.60 mol/mol and that the maximum yield for the estimated P/O-ratio (δ = 1.20) would be 0.74 mol/mol. However, as has been pointed out above, if the biosynthesis of a product is accompanied by a net production of ATP, there should be a sink for the produced ATP as well. For the example on amino acid overproduction the formation of biomass has been assumed as the ATP sink, which resulted in lower operational yields for the amino acids alanine, glutamate, glutamine, glycine, leucine, phenylalanine, proline, serine, tyrosine, and valine. For each of these amino acids limit functions can be derived from the stoichiometry of the metabolic network, giving the upper limit to the yield of product on substrate as a function of the ATP stoichiometry parameters, such that thermodynamic constraints (i.e., no reversal of reactions which are irreversible under physiological conditions) are not violated. Below, the yield limit functions for the overproduction of two amino acids, namely leucine and valine, are shown:
$$Y_{LEU}^{\lim} = \frac{-\delta + 2.48\,K'_X + 2.46}{0.932\,\delta + 4.14\,K'_X + 4.110} \tag{10.14}$$
$$Y_{VAL}^{\lim} = \frac{-\delta + 2.48\,K'_X + 2.46}{-0.127\,\delta + 2.90\,K'_X + 2.87} \tag{10.15}$$
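The yield expressions in Equations 10.13 to 10.15 are easy to evaluate numerically. The sketch below does so in Python for δ = 1.20 and K′X = 0.644, the estimated parameter values used in this example; the function names are illustrative choices and not part of the original model.

```python
def y_leu_max(delta):
    """Maximum leucine yield on glucose, Equation 10.13."""
    return (delta + 0.400) / (1.25 * delta + 0.667)


def y_leu_lim(delta, k_x_prime):
    """Thermodynamic limit to the leucine yield, Equation 10.14."""
    return (-delta + 2.48 * k_x_prime + 2.46) / (
        0.932 * delta + 4.14 * k_x_prime + 4.110)


def y_val_lim(delta, k_x_prime):
    """Thermodynamic limit to the valine yield, Equation 10.15."""
    return (-delta + 2.48 * k_x_prime + 2.46) / (
        -0.127 * delta + 2.90 * k_x_prime + 2.87)


DELTA = 1.20        # P/O-ratio, mol ATP per 0.5 mol oxygen
K_X_PRIME = 0.644   # mol ATP per C-mol biomass

print(f"Y_LEU_max (delta = 0)    = {y_leu_max(0.0):.2f}")
print(f"Y_LEU_max (delta = 1.20) = {y_leu_max(DELTA):.2f}")
print(f"Y_LEU_lim                = {y_leu_lim(DELTA, K_X_PRIME):.3f}")
print(f"Y_VAL_lim                = {y_val_lim(DELTA, K_X_PRIME):.3f}")
```

Because the coefficients printed in the equations are rounded, the computed limits (about 0.36 for leucine and 0.62 for valine) can differ from the values quoted in the text in the third decimal.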
Substitution of the estimated values of the ATP stoichiometry parameters, δ = 1.20 (mol ATP/0.5 mol oxygen) and $K'_X$ = 0.644 (mol ATP/C-mol biomass) yields $Y_{LEU}^{\lim}$ = 0.363 (mol/mol) and $Y_{VAL}^{\lim}$ = 0.624 (mol/mol). As can be inferred from these equations these yield limits are not a function of the specific growth rate μ. The reason for this is that in the stoichiometric model for yeast of van Gulik and Heijnen (1995) growth independent maintenance energy requirements were not taken into account, because the data used were obtained from chemostat cultures carried out at the same growth rate and thus the growth independent maintenance could not be estimated. In order to close the ATP balance, the production rate of an amino acid which leads to a net production of ATP should be accompanied by a certain biomass production rate to consume the produced ATP. This implies that for each of these amino acids a ratio between the specific rate of amino acid production and specific growth rate exists for which the net production of ATP is equal to zero. This results in a fixed limit to the yield of amino acid on substrate which is independent of the growth rate. However, if growth independent maintenance is taken into account substitution of $K'_X = K_X + m_{ATP}/\mu$ in
[Figure 10.5 plot: yield limit (mol/mol, axis 0 to 0.8) versus growth rate (h⁻¹, axis 0 to 0.6).]
Figure 10.5 Predicted theoretical limits to the yield of the amino acid leucine on glucose: (dotted line) without thermodynamic constraints and without taking growth independent maintenance requirement into account; (dashed line) with thermodynamic constraints and without growth independent maintenance; (solid line) with thermodynamic constraints and with growth independent maintenance.
Equation 10.14 yields an expression for the thermodynamic limit of the leucine yield as a function of the growth rate.
$$Y_{LEU}^{\lim} = \frac{-\delta + 2.48\left(K_X + \dfrac{m_{ATP}}{\mu}\right) + 2.46}{0.932\,\delta + 4.14\left(K_X + \dfrac{m_{ATP}}{\mu}\right) + 4.10} \tag{10.16}$$
Assuming a growth independent maintenance coefficient $m_{ATP}$ = 0.033 mol ATP/(C-mol biomass·h) and a growth dependent maintenance coefficient $K_X$ = 0.31 mol ATP/C-mol biomass, which yields $K'_X$ = 0.64 mol ATP/C-mol for a growth rate of 0.1 h⁻¹, a plot can be made of the thermodynamic limit to the leucine yield on glucose as a function of the growth rate (see Figure 10.5). As a comparison the yield limit for the case where growth independent maintenance was not taken into account (Equation 10.14 with K′ = 0.64 mol ATP/C-mol) is plotted in the same figure (dashed line), as well as the maximum theoretical yield if thermodynamic constraints are violated (dotted line). It can be seen from this figure that, if growth independent maintenance is taken into account, a decrease of the growth rate toward zero results in a progressive increase in the yield limit, which reaches a value of 0.6 for zero growth.
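The growth rate dependence described above can be reproduced with a few lines of Python. This is a sketch of Equation 10.16 using the parameter values quoted in the text; the function names and the selected growth rates are illustrative assumptions.

```python
def k_x_apparent(k_x, m_atp, mu):
    """Apparent maintenance coefficient K'_X = K_X + m_ATP / mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return k_x + m_atp / mu


def y_leu_lim(delta, k_x_prime):
    """Thermodynamic limit to the leucine yield, Equation 10.16."""
    return (-delta + 2.48 * k_x_prime + 2.46) / (
        0.932 * delta + 4.14 * k_x_prime + 4.10)


DELTA = 1.20    # P/O-ratio
K_X = 0.31      # mol ATP per C-mol biomass
M_ATP = 0.033   # mol ATP per C-mol biomass per hour

for mu in (0.001, 0.01, 0.1, 0.3, 0.6):
    k_prime = k_x_apparent(K_X, M_ATP, mu)
    print(f"mu = {mu:5.3f} 1/h  K'_X = {k_prime:6.2f}  "
          f"Y_LEU_lim = {y_leu_lim(DELTA, k_prime):.3f}")
```

As the growth rate approaches zero, K′X grows without bound and the limit tends to 2.48/4.14 ≈ 0.60, consistent with the value stated for zero growth.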
10.4 Conclusions It has been shown in this chapter that stoichiometric metabolic models of moderate complexity can be successfully applied to provide a fairly accurate description of changes in metabolic network structure as a result of changes in growth conditions. It has also been shown that such models can be applied, in combination with experimental results, to estimate the ATP stoichiometry of oxidative phosphorylation and maintenance requirements for a certain microorganism. Incorporating the estimated ATP stoichiometry in the model allows the prediction of maximum yields of biomass and products for different substrates, substrate mixtures and metabolic network topologies. An important prerequisite for these calculations is that thermodynamic constraints are not violated.
References
Burgard P.A., Vaidyaraman S., and Maranas C.D., 2001. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog., 17:791–97.
de Jong-Gubbels P., Vanrolleghem P.A., Heijnen J.J., van Dijken J.P., and Pronk J.T., 1995. Regulation of carbon metabolism in chemostat cultures of Saccharomyces cerevisiae grown on mixtures of glucose and ethanol. Yeast, 11:407–18.
Duarte N.C., Herrgård M.J., and Palsson B.O., 2004. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res., 14(7):1298–309.
Feist A.M., Henry C.S., Reed J.L., Krummenacker M., Joyce A.R., Karp P.D., Broadbelt L.J., Hatzimanikatis V., and Palsson B.O., 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst. Biol., 3(121).
Ingraham J.L., Maaloee O., and Neidhardt F.C., 1983. Growth of the Bacterial Cell. Sinauer Associates, Sunderland, MA.
Oh Y.K., Palsson B.O., Park S.M., Schilling C.H., and Mahadevan R., 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem., 282(39):28791–9.
Price N.D., Reed J.L., and Palsson B.O., 2004. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2:886–97.
Rabkin M. and Blum J.J., 1985. Quantitative analysis of intermediary metabolism in hepatocytes incubated in the presence and absence of glucagon with a substrate mixture containing glucose, ribose, fructose, alanine and acetate. Biochem. J., 225:761–86.
Stückrath I., Lange H.C., Kötter P., van Gulik W.M., Entian K.-D., and Heijnen J.J., 2002. Characterization of null mutants of the glyoxylate cycle and gluconeogenic enzymes in S. cerevisiae through metabolic. Biotechnol. Bioeng., 77(1):61–72.
van Gulik W.M. and Heijnen J.J., 1995. A metabolic network stoichiometry analysis of microbial growth and product formation. Biotechnol. Bioeng., 48:681–98.
van Gulik W.M., De Laat W.T.A.M., Vinke J.L., and Heijnen J.J., 2000. Application of metabolic flux analysis for the identification of metabolic bottlenecks in the biosynthesis of penicillin-G. Biotechnol. Bioeng., 68:602–18.
van Gulik W.M., Antoniewicz M.R., Delaat W.T.A.M., Vinke J.L., and Heijnen J.J., 2001. Energetics of growth and penicillin production in a high-producing strain of Penicillium chrysogenum. Biotechnol. Bioeng., 72:185–93.
Vanrolleghem P.A., de Jong-Gubbels P., van Gulik W.M., Pronk J.T., van Dijken J.P., and Heijnen J.J., 1996. Validation of a metabolic network for Saccharomyces cerevisiae using mixed substrate studies. Biotechnol. Prog., 12(4):434–48.
Verhoff F.H. and Spradlin J.E., 1976. Mass and energy balances of metabolic pathways applied to citric acid production by Aspergillus niger. Biotechnol. Bioeng., 18:425–32.
Verduyn C., Stouthamer A.H., Scheffers W.A., and van Dijken J.P., 1991. A theoretical evaluation of growth yields of yeasts. Ant. van Leeuwenhoek Int. J. Gen. Mol. Microbiol., 59:49–63.
Verduyn C., 1991. Physiology of yeasts in relation to growth yields. Ant. van Leeuwenhoek Int. J. Gen. Mol. Microbiol. (Special issue: Growth and Metabolism of Microorganisms), 60:325–53.
11 A Thermodynamic Description of Microbial Growth and Product Formation
Joseph J. Heijnen, Delft University of Technology
11.1 Introduction
11.2 Thermodynamics of Microbial Growth Stoichiometry, $Y_{DX}^{\max}$
The Anabolic Reaction for Biomass Synthesis • Calculation of the Electron Donor Needed for Anabolism Using the Balance of Degree of Reduction • Calculation of the Gibbs Energy from the Catabolic Reaction • The Required Amount of Gibbs Energy for Anabolism and Calculation of the Amount of Electron Donor That Must Be Catabolized • A Thermodynamic Relation to Calculate the Biomass Yield on Electron Donor, $Y_{DX}^{\max}$
11.3 Thermodynamics of Maintenance
11.4 Calculation of the Operational Stoichiometry of a Growth Process at Different Growth Rates, Including Heat Using the Herbert–Pirt Relation for Electron Donor
11.5 A Correlation to Estimate the Maximum Specific Growth Rate, µmax
11.6 Thermodynamic Prediction of Minimal Concentration Electron Donor and Maximal Concentration of Catabolic Product
11.7 Thermodynamics and Stoichiometry of Product Formation
11.8 Conclusions
References and Recommended Reading
11.1 Introduction Growth of (micro)organisms occurs under a wide range of conditions (such as pH 0–13, temperature 0–110°C, salt concentration 0.1–2 M), using a huge variety of electron donors, electron acceptors, carbon, and nitrogen sources, each of which can be organic or inorganic. Growth of organisms is usually described by four parameters which belong to the hyperbolic substrate uptake relation (µmax, Ks) and the Herbert–Pirt relation ($Y_{SX}^{\max}$, $m_s$). The values of these four parameters are essential to design processes in which growing organisms are used. However, their values depend on the nutrients used (nature of carbon and nitrogen sources, electron donor and acceptor), temperature and pH and easily span a range of two orders of magnitude. For example Escherichia coli grows
on glucose (electron donor) using O2 (aerobically) with YSX = 0.50 g biomass/g glucose and µmax = 1 h⁻¹. In contrast methane bacteria using acetate show YSX = 0.01 g biomass/g acetate and µmax = 0.005 h⁻¹. A general method to predict the values of these four parameters for different growth systems is therefore of great value. Here we will present a thermodynamic approach to predict these four parameters for any growth system for which the C- and N-source, the electron donor and acceptor, the biomass specific growth rate µ, and the cultivation temperature are specified. This thermodynamic method also helps to explain the effect of changes in nutrients, temperature, and pH on these four parameters.
11.2 Thermodynamics of Microbial Growth Stoichiometry, $Y_{DX}^{\max}$
Growth stoichiometry is of major interest for biotechnological process design and is reflected in the parameter $Y_{SX}^{\max}$ (from the Herbert–Pirt relation, see Chapter 9 of this section). $1/Y_{SX}^{\max}$ represents the amount of electron donor, in mol, needed to synthesize 1 Cmol of biomass. Usually $Y_{SX}^{\max}$ is also written as $Y_{DX}^{\max}$, because both the term substrate (S) and electron donor (D) are used to indicate the same compound.
11.2.1 The Anabolic Reaction for Biomass Synthesis Microorganisms are composed of protein, RNA, DNA, lipid, and carbohydrates. Comparing many different (micro)organisms, it is found that their relative contents are similar (40–70% protein, 1–2% DNA, 5–15% RNA, 2–10% lipid, 3–10% carbohydrate). This similarity leads to an elemental composition of biomass which is also very similar, such that the organic part of biomass can be represented by a simple 1-C-formula:
Biomass = C1H1.8O0.5N0.2
This composition formula holds closely for many organisms. However, for each specific situation one can carry out an elemental analysis of the biomass and obtain a more precise elemental composition. For convenience we excluded the other elements P, S, K+, Mg2+, etc. present in biomass, because of their minor contribution (<1–2%).
Example 1: Obtaining the Biomass Composition Formula Consider a biotechnological process from which a biomass sample has been obtained. Usually one applies filtration and drying to obtain the biomass dry mass (in kg dry matter/m3 broth). Subsequently the biomass composition can be obtained by determining:
• The ash content (by burning at ≈500°C)
• The elemental composition with respect to C, H, O, N, and sometimes S, P, K, Mg, etc.
From the results of these analyses the biomass composition formula can be obtained as presented before. In most cases only the four major elements (C, H, O, N) are measured, which then give the biomass composition of the organic mass. The other elements S, P, K, Mg are considered to be the inorganic (ash) fraction of biomass. More detailed considerations on the proper determination of the biomass composition can be found in Lange and Heijnen (2001).
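The conversion from a measured elemental analysis to a 1-C composition formula can be sketched as follows. The mass fractions in this Python snippet are hypothetical values chosen for illustration (they are not measurements from the text), and the atomic masses are standard rounded values.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}


def one_carbon_formula(mass_fractions):
    """Convert mass fractions of C, H, O, N into mol of each element per mol C."""
    moles = {el: mass_fractions[el] / ATOMIC_MASS[el] for el in ATOMIC_MASS}
    return {el: moles[el] / moles["C"] for el in ATOMIC_MASS}


# Hypothetical elemental analysis of the organic (ash-free) dry mass.
sample = {"C": 0.488, "H": 0.074, "O": 0.325, "N": 0.114}

formula = one_carbon_formula(sample)
print("C1" + "".join(f"{el}{formula[el]:.2f}" for el in ("H", "O", "N")))
```

Only the ratios between the elements matter here, so the mass fractions do not need to sum exactly to one.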
The biomass composition formula shows that synthesis of biomass requires a carbon source and an N-source. In addition there is a need for an electron donor (D). When growth uses an organic compound as substrate, then the substrate is both carbon and electron donor. This is called heterotrophic growth. When growth uses CO2 as C-source, then there is a need for a separate electron donor to reduce CO2 to biomass. This is called autotrophic growth. When the carbon and nitrogen source and the electron donor are known it is possible to write the biomass synthesis reaction, called the anabolic reaction (see Figure 11.1). The unknown coefficients for the anabolic reaction are easily calculated from the conservation requirements (conservation of elements, electric charge), as shown in Examples 2a and 2b.
Heterotrophic growth: a electron donor/C-source + b N-source + c H+ + d H2O + e CO2 + 1 C1H1.8O0.5N0.2 = 0
Autotrophic growth: f CO2 + a electron donor + b N-source + c H+ + d H2O + e oxidized electron donor + 1 C1H1.8O0.5N0.2 = 0
Figure 11.1 Format of the anabolic reaction for the formation of 1 Cmol of biomass.
Example 2a: Calculation of the Anabolic Reaction for Heterotrophic Growth Consider the heterotrophic growth of an organism on oxalate (C2O42-) as C-source and electron donor, using NH4+ as N-source. We can then write (Figure 11.1) for the anabolic reaction
a C2O42- + b NH4+ + c H+ + d H2O + e CO2 + 1 C1H1.8O0.5N0.2 = 0
The five unknown coefficients are easily found by setting up the conservation relations for the involved elements (C, H, O, N) and electric charge:
C-balance:       2a + e + 1 = 0
H-balance:       4b + c + 2d + 1.8 = 0
O-balance:       4a + d + 2e + 0.5 = 0
N-balance:       b + 0.2 = 0
Charge balance:  -2a + b + c = 0
These five linear equations can be solved for a to e leading to the anabolic reaction:
-2.1 C2O42- - 0.2 NH4+ - 4 H+ + 1.5 H2O + 3.2 CO2 + 1 C1H1.8O0.5N0.2 = 0
This result shows that 2.1 mol oxalate is needed for the formation of 1 CmolX in the anabolic reaction. This example shows that biomass synthesis requires a certain amount of electron donor, which depends on the nature of the electron donor. However, the nature of the N-source also plays a role. For example, when NO3- is the N-source it is easily calculated that in Example 2a the amount of oxalate needed to synthesize 1 Cmol of biomass increases from 2.1 mol to 2.9 mol per CmolX.
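The five conservation relations of Example 2a form a small linear system, which can be solved by hand or numerically. The Python sketch below uses a generic Gauss-Jordan elimination with exact fractions; the helper `solve_linear` is an illustrative assumption, not a method prescribed by the text.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(rhs)
    rows = [[Fraction(str(v)) for v in row] + [Fraction(str(r))]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Eliminate this column from all other rows.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Unknowns in the order a, b, c, d, e (oxalate, NH4+, H+, H2O, CO2).
balances = [
    [2, 0, 0, 0, 1],    # C:      2a + e            = -1
    [0, 4, 1, 2, 0],    # H:      4b + c + 2d       = -1.8
    [4, 0, 0, 1, 2],    # O:      4a + d + 2e       = -0.5
    [0, 1, 0, 0, 0],    # N:      b                 = -0.2
    [-2, 1, 1, 0, 0],   # charge: -2a + b + c       =  0
]
rhs = [-1, -1.8, -0.5, -0.2, 0]

coefficients = dict(zip("abcde", solve_linear(balances, rhs)))
for name, value in coefficients.items():
    print(f"{name} = {float(value):+.1f}")
```

Negative coefficients denote consumed compounds and positive coefficients denote produced compounds, following the convention of Figure 11.1.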
Example 2b: Calculation of the Anabolic Reaction for Autotrophic Growth Consider the autotrophic growth of an organism which uses Fe2+ as electron donor, which is oxidized to Fe3+. NH4+ is the N-source and CO2 is the carbon source. The anabolic reaction for biomass synthesis then follows (Figure 11.1)
f CO2 + a Fe2+ + b NH4+ + c H+ + d H2O + e Fe3+ + 1 C1H1.8O0.5N0.2 = 0
We now again formulate the conservation relations:
C-balance:       f + 1 = 0
H-balance:       4b + c + 2d + 1.8 = 0
O-balance:       2f + d + 0.5 = 0
N-balance:       b + 0.2 = 0
Fe-balance:      a + e = 0
Charge balance:  2a + b + c + 3e = 0
There are now six relations (because of the extra element Fe), which can be solved for the anabolic reaction:
-1 CO2 - 4.2 Fe2+ - 0.2 NH4+ - 4 H+ + 1.5 H2O + 4.2 Fe3+ + 1 C1H1.8O0.5N0.2 = 0
Now we find that 4.2 mol Fe2+ are needed to make 1 CmolX by reducing CO2. When NO3- is available as N-source, and not NH4+, a similar calculation will show that there is a need of 5.8 mol Fe2+ to reduce 1 CO2 to 1 CmolX.
These examples show that the anabolic reaction for synthesis of biomass can be derived by knowing only the nature of the C-source, N-source, electron donor, and the biomass composition. The anabolic reaction is easily adapted for other biomass compositions. The incorporation of P, S, K+, etc. in the biomass composition requires the incorporation of the sources of P, S, K+ in the anabolic reaction, the coefficients of which follow from the conservation of P, S, K+ atoms. The anabolic reaction, based on known C- and N-sources and known electron donor, is already very informative. The anabolic reaction shows how much electron donor is needed for biomass synthesis, e.g., 2.1 mol oxalate (Example 2a) or 4.2 mol Fe2+ (Example 2b) for the synthesis of 1 Cmol biomass. The nature of the N-source does affect this amount (Examples 2a and 2b).
11.2.2 Calculation of the Electron Donor Needed for Anabolism Using the Balance of Degree of Reduction The coefficients in the anabolic reaction follow directly from the solution of the element and charge balances. This is a cumbersome approach. The balance of degree of reduction is a method to simplify these calculations. Degree of reduction (γi) is a stoichiometric quantity which can be calculated for each chemical compound for which the elemental composition is known. γi reflects the amount of electrons available in a compound. γi is calculated by writing the redox half reaction in which compound i is converted into a set of reference compounds, which are CO2, H2O, H+, Fe3+, SO42-, PO43-, N2, and electrons. The amount of produced electrons equals γi. A special point to consider is that the degree of reduction for a reference compound equals 0, by definition. Using this redox half reaction approach we can calculate the degree of reduction of elements and of electric charge, the result of which is shown in Table 11.1. A special case is the degree of reduction of the element N in biomass. Making the γ-value of N in biomass dependent on the nature of the N-source (as shown in Table 11.1) makes the degree of reduction of the N-source equal to zero (which leads to simplified calculations). Table 11.1 is very useful to directly calculate γi of compound i from its element composition (see Example 3).
Table 11.1 Generalized Degree of Reduction of Elements and Electric Charge
Element or Charge    Degree of Reduction
H                    +1
O                    -2
C                    +4
+ Charge             -1
- Charge             +1
S                    +6
P                    +5
Fe                   +3
N                    0
N in biomass         -3 for NH4+ as N-source; 0 for N2 as N-source; +5 for NO3- as N-source
Example 3: Calculation of Degree of Reduction of Compounds Consider a reference compound, e.g., CO2. Using Table 11.1 its γ-value follows as γCO2 = 1 × 4 + 2 × (-2) = 0. Similarly, for H+, γH+ = 1 × 1 + 1 × (-1) = 0, and it is easy to check that γ = 0 for all reference compounds. Now consider biomass. To calculate its degree of reduction, called γx, we need the biomass composition formula, for which we use C1H1.8O0.5N0.2. We also need information on the N-source. Consider NH4+ as N-source. Using Table 11.1 gives:
γx = 1 × 4 + 1.8 × 1 + 0.5 × (-2) + 0.2 × (-3) = 4.2
The degree of reduction of the N-source NH4+ is then γN-source = 1 × (-3) + 4 × (+1) + 1 × (-1) = 0, as expected. For NO3- as N-source one obtains γx = 5.8 and γN-source = 0. Let us now consider the electron donor. For oxalate (C2O42-) γ = 2 × 4 + 4 × (-2) + 2 = +2. For ferrous (Fe2+) iron γ = 3 + 2 × (-1) = 1 and for ferric (Fe3+) iron γ = 3 + 3 × (-1) = 0.
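The bookkeeping of Example 3 can be captured in a short function. The Python sketch below encodes Table 11.1; the function name and the way the N-source dependence is passed as a parameter (`gamma_n`) are illustrative assumptions.

```python
# Degree of reduction per element, from Table 11.1 (N is handled separately).
GAMMA_ELEMENT = {"H": 1, "O": -2, "C": 4, "S": 6, "P": 5, "Fe": 3}


def degree_of_reduction(elements, charge=0, gamma_n=0):
    """Degree of reduction of a compound.

    elements: dict of element symbol -> number of atoms
    charge:   signed electric charge (each + charge counts -1, each - charge +1)
    gamma_n:  value assigned to N: 0 by default (N2 reference); use -3 or +5
              for biomass and the N-source when NH4+ or NO3- is the N-source
    """
    total = 0.0
    for element, count in elements.items():
        if element == "N":
            total += gamma_n * count
        else:
            total += GAMMA_ELEMENT[element] * count
    return total - charge


biomass = {"C": 1, "H": 1.8, "O": 0.5, "N": 0.2}
print("CO2:             ", degree_of_reduction({"C": 1, "O": 2}))
print("biomass (NH4+):  ", round(degree_of_reduction(biomass, gamma_n=-3), 2))
print("biomass (NO3-):  ", round(degree_of_reduction(biomass, gamma_n=5), 2))
print("oxalate:         ", degree_of_reduction({"C": 2, "O": 4}, charge=-2))
print("Fe2+:            ", degree_of_reduction({"Fe": 1}, charge=2))
print("ethanol:         ", degree_of_reduction({"C": 2, "H": 6, "O": 1}))
```

The same function returns zero for the N-source itself when the matching `gamma_n` is used, for example NH4+ with `gamma_n=-3`.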
For organic compounds γ ranges between 0 and 8 per C-atom present in the compound, e.g., for CH4 γ = 8, for CO2 γ = 0, for C2H6O (ethanol) γ = 12 or 6 per carbon. Degree of reduction is a stoichiometric quantity for a compound i related to the amount of electrons available from the chemical compound i. Therefore it is also possible to set up a balance of degree of reduction (γ-balance), which should add up to zero because electrons are also conserved. One should note that the γ-balance is not an extra constraint beside the element and electric charge balances. The γ-balance is in essence a linear combination of the element/charge conservation relations and is therefore a dependent relation. However, the γ-balance has the useful practical property that its value for the reference compounds CO2, H+, H2O, N-source, SO42-, H2PO4-, Fe3+, and N2 is by definition equal to zero, which leads to very easy calculations for reactions. This already becomes clear when using the γ-balance for calculation of the anabolic reaction.
Example 2c: Balance of Degree of Reduction Applied to the Anabolic Reaction Consider Example 2a, where the anabolic reaction for biomass synthesis on oxalate (C2O42-) was calculated using the conservation relations for elements/charge. Now we use the balance of degree of reduction, for which we need γ-values. γoxalate = 2, γx = 4.2 and all other γ-values are zero. This gives a very simple γ-balance.
γ-balance: a (2) + 1 (4.2) = 0
This gives immediately a = -2.1, being 2.1 mol oxalate electron donor needed for the synthesis of 1 CmolX. Consider Example 2b. Here the γ-balance follows as (γFe2+ = 1, γFe3+ = 0, γX = 4.20):
γ-balance: a (+1) + 1 (4.2) = 0
This gives a = -4.2, which tells that 4.2 mol Fe2+ is consumed for 1 CmolX.
Example 2c shows that the consumption of electron donor (degree of reduction γD) in the anabolic reaction toward biomass equals:
$$\frac{\text{mol electron donor for anabolism}}{\text{CmolX}} = \frac{\gamma_X}{\gamma_D} \tag{11.1}$$
This is a very general result. For example consider the use of NO3- as N-source, instead of NH4+. The value of γX then changes from 4.2 to 5.8. For oxalate as electron donor (γD = 2), it can then be calculated that γX/γD = 5.8/2 = 2.9 mol oxalate needed for 1 Cmol biomass (see Example 2a) and γX/γD = 5.8/1 = 5.8 mol Fe2+ per CmolX (see Example 2b).
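Equation 11.1 reduces the anabolic electron donor requirement to a single division. The short Python sketch below tabulates the four cases mentioned in the text; the function name is an illustrative choice.

```python
def donor_for_anabolism(gamma_x, gamma_d):
    """Mol electron donor needed in anabolism per CmolX (Equation 11.1)."""
    if gamma_d <= 0:
        raise ValueError("an electron donor must have a positive degree of reduction")
    return gamma_x / gamma_d


# gamma_X depends on the N-source; gamma_D on the electron donor.
GAMMA_X = {"NH4+": 4.2, "NO3-": 5.8}
GAMMA_D = {"oxalate": 2, "Fe2+": 1}

for donor, gamma_d in GAMMA_D.items():
    for n_source, gamma_x in GAMMA_X.items():
        amount = donor_for_anabolism(gamma_x, gamma_d)
        print(f"{donor:8s} with {n_source}: {amount:.1f} mol per CmolX")
```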
11.2.3 Calculation of the Gibbs Energy from the Catabolic Reaction
11.2.3.1 The Need for a Catabolic Reaction Coupled to Anabolism It has been shown above that the complete stoichiometry of the anabolic reaction can be obtained from the elemental and charge balances and that it can be calculated from the balance of degree of reduction that the amount of electron donor needed in the anabolic reaction equals (γX/γD) mol electron donor/CmolX. However, in reality more than γX/γD mol electron donor is required to produce 1 CmolX. The reason for this is the second law of thermodynamics, which states that for a reaction to proceed ΔGR < 0. The anabolic reaction as presented in Section 11.2.1 is the result of hundreds of enzyme catalyzed reactions, which start with a simple carbon and N-source and an electron donor and which produce a highly complex end product, i.e., biomass which contains polymerized compounds (proteins, etc.). A detailed thermodynamic analysis of these individual reactions reveals that many of these reactions (polymerization, Pi-addition, carboxylations) require energy input in order to make the Gibbs energy of reaction sufficiently negative. Biochemically, this is well known and ATP delivers the energy input. This points, however, to the need for a biological process which delivers energy, which is called catabolism. In catabolism an electron acceptor performs a redox reaction with an electron donor leading to the production of energy. This energy is coupled to anabolism to drive the anabolic reaction (Figure 11.2).
11.2.3.2 Catabolic Reactions and their Gibbs Energy under Standard Conditions Figure 11.2 shows that the electron donor is not only used in anabolism but that a certain amount is also needed for catabolism. Because the catabolic reaction delivers Gibbs energy, it is first of all important to quantify the Gibbs energy in the catabolic reaction. This is done by setting up the complete catabolic reaction for 1 mol electron donor.
Table 11.2 presents examples of catabolic reactions for a number of biological systems. In principle each electron donor and acceptor combination which delivers sufficient Gibbs energy can function in the catabolic reaction. Given the diversity (organic and inorganic electron donors and acceptors) there are hundreds of thousands of possible catabolic reactions, which contributes significantly to microbial diversity. Table 11.2 allows some important observations. The catabolic energy produced per mol electron donor varies enormously, by nearly a factor of 100, from about 30 kJ to about 3000 kJ. For a certain electron donor the Gibbs energy produced per mol electron donor depends on the available electron acceptor couple. For example glucose with O2 as electron acceptor produces 2843.1 kJ per mol glucose (Table 11.2, reaction 1). Glucose which is catabolized anaerobically to ethanol (Table 11.2, reaction 4) produces only 225.4 kJ, which is nearly 13 times less. The Gibbs energy of a catabolic reaction is obtained by first calculating the correct stoichiometry of the complete catabolic reaction for the consumption of 1 mol electron donor, after which one calculates $-\Delta G_R^{01} = \Delta G_{cat}^{01}$ (which is the produced catabolic energy under biochemical standard conditions, indicated with superscript 01, i.e., 298 K, 1 bar, 1 mol/l, and pH 7 (H+ concentration of 10⁻⁷ mol/l)).
[Figure 11.2 diagram. Catabolism: electron donor (D) + electron acceptor (A) → oxidized donor + reduced acceptor. Anabolism: C-source/electron donor, N-source, H2O, HCO3-, H+ → biomass (X), CH1.8O0.5N0.2. The two are coupled through the Gibbs energy YXG (kJ/CmolX).]
Figure 11.2 Gibbs energy based coupling of catabolism to anabolism.
Table 11.2 Catabolic Reactions for 1 mol Electron Donor and Their Standard (pH = 7) Gibbs Energy of Reaction
No. | Catabolic Reaction for 1 mol Electron Donor | e-Donor Couple | e-Acceptor Couple | $-\Delta G_R^{01} = \Delta G_{cat}^{01}$ (kJ)
1. | C6H12O6 + 6 O2 → 6 HCO3- + 6 H+ | C6H12O6/HCO3- | O2/H2O | 2843.1
2. | C2H6O + 3 O2 → 2 HCO3- + 2 H+ + 1 H2O | C2H6O/HCO3- | O2/H2O | 1308.9
3. | C2H3O2- + 2 O2 → 2 HCO3- + H+ | C2H3O2-/HCO3- | O2/H2O | 844.16
4. | C6H12O6 + 2 H2O → 2 C2H6O + 2 HCO3- + 2 H+ | C6H12O6/HCO3- | C6H12O6/HCO3- | 225.4
5. | CH4O + 1.20 NO3- + 0.20 H+ → 0.60 N2 + HCO3- + 1.6 H2O | CH4O/HCO3- | NO3-/N2 | 649.36
6. | Fe2+ + ¼ O2 + H+ → Fe3+ + ½ H2O (pH = 1.85) | Fe2+/Fe3+ | O2/H2O | 33.78
7. | H2 + ¼ HCO3- + ¼ H+ → ¼ CH4 + ¾ H2O | H2/H+ | CO2/CH4 | 33.90
8. | C2H3O2- + H2O → HCO3- + CH4 | C2H3O2-/HCO3- | CO2/CH4 | 31.00
Assuming 1 mol electron donor, the remaining coefficients are in principle obtained by solving the element/charge conservation relations. However, the use of the γ-balance allows more convenient calculations (Example 4). After having a complete catabolic reaction, $-\Delta G_R^{01}$ ($= \Delta G_{cat}^{01}$) can be obtained using for each reactant the known standard Gibbs energy of formation (Table 11.3).
Example 4: Calculation of the Standard Catabolic Energy Gain in the Catabolic Reactions
Consider a catabolic reaction with methanol (CH4O) as electron donor and the reduction of NO3- to N2 as the electron acceptor reaction. The degree of reduction of the electron donor methanol is 6. The degree of reduction of the electron acceptor NO3- is -5 and the degree of reduction of N2 is 0. Note that electron acceptors always have negative γ-values. The γ-balance states that per mol of electron donor consumed one needs (γD/(-γA)) mol electron acceptor, or 6/(-(-5)) = 1.20 mol NO3-/mol methanol. The catabolic reaction for 1 mol consumed electron donor then follows as:
-1 CH4O - 1.20 NO3- - 0.20 H+ + 0.60 N2 + 1 HCO3- + 1.60 H2O = 0
The coefficient for N2 follows from the N-balance, that for HCO3- from the C-balance, that for H+ from the charge balance, and that for H2O from the O or H balance. The catabolic energy produced under biochemical standard conditions (ΔG_cat^01) follows from the Gibbs energies of formation listed in Table 11.3 and the stoichiometry of the catabolic reaction:
$\Delta G_{cat}^{01} = -\Delta G_{R}^{01} = -[\,1.60\,(-237.18) + 1.0\,(-586.85) - 0.20\,(-39.87) + 0.60\,(0) - 1.20\,(-111.34) - 1\,(-175.39)\,] = -(-649.36) = +649.36\ \text{kJ}$
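The bookkeeping of Example 4 can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `reaction_gibbs_energy` are assumptions made here, while the formation energies are those quoted from Table 11.3.

```python
# Standard Gibbs energies of formation at pH 7 (kJ/mol), copied from Table 11.3.
DGF01 = {
    "CH4O": -175.39,
    "NO3-": -111.34,
    "H+": -39.87,
    "N2": 0.0,
    "HCO3-": -586.85,
    "H2O": -237.18,
}


def reaction_gibbs_energy(stoichiometry, dgf=DGF01):
    """Sum nu_i * dGf_i; consumed species carry negative coefficients."""
    return sum(nu * dgf[species] for species, nu in stoichiometry.items())


# Catabolic reaction of Example 4, written per 1 mol methanol consumed.
methanol_denitrification = {
    "CH4O": -1.0, "NO3-": -1.20, "H+": -0.20,
    "N2": 0.60, "HCO3-": 1.0, "H2O": 1.60,
}

dg_r01 = reaction_gibbs_energy(methanol_denitrification)
dg_cat01 = -dg_r01  # catabolic energy gain, kJ per mol methanol
print(f"dG_R01 = {dg_r01:.2f} kJ, dG_cat01 = {dg_cat01:.2f} kJ")
```

The same function can be reused for any reaction in Table 11.2 by supplying its coefficients and the corresponding formation energies.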
11.2.3.3 Catabolic Gibbs Energy under Nonstandard Conditions
In the previous section ΔG_cat^01 represents the Gibbs energy produced from 1 mol electron donor under biochemical standard conditions. This implies that the temperature equals 25°C, that the concentrations of all dissolved reactants (except H+) are 1 mol/l, that for gaseous reactants (e.g., O2) the partial pressure equals 1 bar, and that the H+ concentration equals 10^-7 M (pH = 7). A standard pH = 7 is based on the knowledge that the pH inside the cells (where the biochemical reactions occur) is close to 7. In many situations, however, cultivation conditions differ strongly from this standard. For example, there is much interest in thermophilic organisms (growing at temperatures up to 95°C), in acidophilic organisms (pH ≈ 0-2, such as Leptospirillum ferrooxidans), and in alkalophilic organisms (pH = 8–12). Moreover, organisms are very often cultivated under single nutrient limited conditions, where, e.g., the electron donor or acceptor concentration is << 1 M. For example, E. coli using glucose as a limiting substrate grows at glucose concentrations of ≈10^-4 M. Therefore it is necessary to know how to calculate ∆GR
Balances and Reaction Models
Table 11.3 Gibbs Energy and Enthalpy of Formation under Standard Conditions
Compound name | Composition | ∆Gf01 (kJ/mol) | ∆Hf0 (kJ/mol)
Biomass | CH1.8O0.5N0.2 | -67 | -91
Water | H2O | -237.18 | -286
Bicarbonate | HCO3- | -586.85 | -692
CO2 (g) | CO2 | -394.359 | -394.1
Proton | H+ | -39.87 | 0
O2 (g) | O2 | 0 | 0
Oxalate2- | C2O4 2- | -674.04 | -824
Carbon monoxide | CO | -137.15 | -111
Formate- | CHO2- | -335 | -410
Glyoxylate- | C2O3H- | -468.6 |
Tartrate2- | C4H4O6 2- | -1010 |
Malonate2- | C3H2O4 2- | -700 |
Fumarate2- | C4H2O4 2- | -604.21 | -777
Malate2- | C4H4O5 2- | -845.08 | -843
Citrate3- | C6H5O7 3- | -1168.34 | -1515
Pyruvate- | C3H3O3- | -474.63 | -596
Succinate2- | C4H4O4 2- | -690.23 | -909
Gluconate- | C6H11O7- | -1154 |
Formaldehyde | CH2O | -130.54 |
Acetate- | C2H3O2- | -369.41 | -486
Dihydroxy acetone | C3H6O3 | -445.18 |
Lactate- | C3H5O3- | -517.18 | -687
Glucose | C6H12O6 | -917.22 | -1264
Mannitol | C6H14O6 | -942.61 |
Glycerol | C3H8O3 | -488.52 | -676
Propionate- | C3H5O2- | -361.08 |
Ethylene glycol | C2H6O2 | -330.50 |
Acetoine | C4H8O2 | -280 |
Butyrate | C4H7O2- | -352.63 | -535
Propanediol | C3H8O2 | -327 |
Butanediol | C4H10O2 | -322 |
Methanol | CH4O | -175.39 | -246
Ethanol | C2H6O | -181.75 | -288
Propanol | C3H8O | -175.81 | -331
n-Alkane (l) | C15H32 | +60 | -439
Propane (g) | C3H8 | -24 | -104
Ethane (g) | C2H6 | -32.89 | -85
Methane (g) | CH4 | -50.75 | -75
H2 (g) | H2 | 0 | 0
Ammonium | NH4+ | -79.37 | -133
N2 (g) | N2 | 0 | 0
Nitrite ion | NO2- | -37.2 | -107
Nitrate ion | NO3- | -111.34 | -173
Iron II | Fe2+ | -78.87 | -87
Iron III | Fe3+ | -4.6 | -4
Sulphur | S0 | 0 | 0
Hydrogen sulphide (g) | H2S | -33.56 | -20
Sulphide ion | HS- | +12.05 | -17
Sulphate ion | SO4 2- | -744.63 | -909
Thiosulphate ion | S2O3 2- | -513.2 | -608
of the catabolic reaction under nonstandard conditions. For this we need two types of equations, which allow calculation of the Gibbs energy of formation under nonstandard conditions. The effect of nonstandard concentration at 25°C on the Gibbs energy of formation (kJ/mol i) of a compound i follows from:
Dissolved compound: $\Delta G_{f,i} = \Delta G^{0}_{f,i} + RT \ln(C_i/1)$ (11.2)
Gaseous compound: $\Delta G_{f,i} = \Delta G^{0}_{f,i} + RT \ln(P_i/1)$ (11.3)
Proton: $\Delta G_{f,i} = -39.87 + RT \ln(\mathrm{H^+}/10^{-7})$ (11.4)
In these equations ∆G0fi is the Gibbs energy of formation found in Table 11.3. Ci/1 is the dissolved concentration of compound i (Ci, mol/l) divided by the standard concentration (1 mol/l). Pi/1 is the partial pressure of compound i (Pi, atm) divided by the standard pressure (1 atm). For H+ the biochemical standard is at H+ = 10^-7 M!! The effect of temperature (at standard concentration and pressure) on the Gibbs energy of formation is obtained from the Van 't Hoff relation
$\Delta G_f(T) = \Delta H_f - T\,\Delta S_f$ (11.5)
The value of the enthalpy of formation ΔHf is obtained from the standard thermodynamic tables (Table 11.3), because the enthalpy of formation is hardly influenced by temperature or concentration in the ranges relevant for biological systems. The value of ΔSf can also be obtained from Table 11.3, by using the listed ΔGf and ΔHf values in the Van 't Hoff relation for T = 298 K. In all catabolic reactions water occurs as a reactant. Because the concentrations of electron donor, acceptor, CO2, and H+ are usually all very low, we are dealing with a dilute aqueous system, and ∆Gf for water is taken as the standard value (Table 11.3). For nondilute, high salt conditions this approximation is not allowed. The effect of nonstandard concentration and/or temperature conditions on ∆GR for the catabolic reaction is most easily illustrated with some examples.
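Equations 11.2 through 11.5 translate directly into small helper functions. The sketch below is illustrative only: the function names are assumptions, the concentration corrections are applied at 25°C as in the text, and the temperature extrapolation assumes ΔHf and ΔSf are constant over the range of interest.

```python
import math

R = 8.314e-3   # gas constant, kJ/(mol K)
T_STD = 298.0  # standard temperature used in the text, K


def dgf_dissolved(dgf0, conc):
    """Equation 11.2: dissolved compound at concentration conc (mol/l), 25 C."""
    return dgf0 + R * T_STD * math.log(conc / 1.0)


def dgf_gas(dgf0, pressure):
    """Equation 11.3: gas at the given partial pressure (standard = 1), 25 C."""
    return dgf0 + R * T_STD * math.log(pressure / 1.0)


def dgf_proton(h_conc):
    """Equation 11.4: proton, with the biochemical standard state at 1e-7 M."""
    return -39.87 + R * T_STD * math.log(h_conc / 1e-7)


def dgf_at_temperature(dgf0, dhf0, temp):
    """Equation 11.5, with dSf derived from the 298 K table values."""
    dsf0 = (dhf0 - dgf0) / T_STD  # kJ/(mol K)
    return dhf0 - temp * dsf0


# Illustration with Table 11.3 values for glucose and ethanol.
glucose_dilute = dgf_dissolved(-917.22, 1e-4)              # glucose at 1e-4 M
ethanol_310k = dgf_at_temperature(-181.75, -288.0, 310.0)  # ethanol at 37 C
print(round(glucose_dilute, 2), round(ethanol_310k, 2))
```

A reaction ∆GR under nonstandard conditions then follows by summing the corrected formation energies with the stoichiometric coefficients.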
Example 5a: Effect of Nonstandard Temperature on ∆GR of the Catabolic Reaction
Consider a possible catabolic reaction:
Glucose + 6 H2O → 6 CO2 (g) + 12 H2 (g)
Glucose is split into CO2 and H2 gas (g). Using standard pressures and concentrations it is easy to show that at 25°C the catabolic reaction has ∆GR = -26 kJ, but at 95°C this value changes to ∆GR = -151 kJ. A high temperature leads to a much more negative ∆GR of the catabolic reaction, and therefore to more catabolic energy. Now consider an opposite (gas consuming) catabolic reaction:
4 H2 (g) + 2 CO2 (g) → Acetate- + H+ + 2 H2O
Now we find that at 25°C the catabolic reaction (∆GR = -95 kJ) generates 95 kJ Gibbs energy, but at 90°C ∆GR = -8 kJ and only 8 kJ of catabolic energy is generated.
Example 5a shows that for catabolic reactions which involve gases, and in which the total moles of reactants and products are very different (e.g., seven versus eighteen in the glucose example, and six moles of gas consumed in the acetate example), the catabolic energy depends strongly on temperature. When catabolism produces many more (gaseous) molecules than it consumes, the catabolic energy increases strongly with temperature, and vice versa. This is a purely entropic effect.
Example 5b: Effect of Nonstandard Concentration on ∆GR of the Catabolic Reaction
Consider the catabolic reaction of Fe2+ oxidation at 25°C (by Leptospirillum ferrooxidans).
H+ + Fe2+ + ¼ O2 (g) → Fe3+ + ½ H2O
This gives:
$\Delta G_R = \Delta G_R^{01} + RT \ln \dfrac{(\mathrm{Fe^{3+}}/1)}{(\mathrm{H^+}/10^{-7}) \times (\mathrm{Fe^{2+}}/1) \times (P_{O_2}/1)^{1/4}}$
Table 11.3 gives (at pH = 7)
$\Delta G_R^{01} = \tfrac{1}{2}(-237.18) + 1\,(-4.6) - \tfrac{1}{4}(0) - 1\,(-78.87) - 1\,(-39.87) = -4.45\ \text{kJ}$
This catabolic reaction therefore seems very poor!! The real cultivation conditions, however, are very different from standard with respect to concentration:
Fe2+ = 10^-3 M, Fe3+ = 0.3 M, H+ = 10^-1 M (pH = 1.0), PO2 = 10^-1 bar.
This leads to a concentration correction term RT ln[ ], using R = 8.314 × 10^-3 kJ/mol K:
$8.314 \times 10^{-3} \times 298 \,\ln \dfrac{(0.3/1)}{(10^{-1}/10^{-7})\,(10^{-3}/1)\,(10^{-1})^{1/4}} = 2.477 \ln(0.000535) = 2.477 \times (-7.53) = -18.65\ \text{kJ}$
The real ∆GR then becomes -18.65 - 4.45 = -23.10 kJ. Under realistic conditions the catabolic reaction of 1 mol electron donor Fe2+ generates 23.10 kJ of Gibbs energy. Clearly, growth under acidic conditions is necessary for this organism to generate enough catabolic energy.
These two examples show that the correction of ΔGR01 due to the effect of nonstandard temperature and concentration is only a few to maximally 60 kJ per mol electron donor. This means that such corrections are especially relevant for catabolic reactions with little energy gain (autotrophic and anaerobic growth). For heterotrophic aerobic catabolism the energy gain is very large (> 500 kJ/mol donor, see Table 11.2) and therefore these corrections only have a minor effect!!
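The concentration correction of Example 5b can be reproduced numerically. The following is a minimal sketch: the function name and argument names are illustrative assumptions, and the standard value ΔG_R^01 = -4.45 kJ is taken from the example rather than recomputed.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol K)


def fe_oxidation_gibbs(fe2, fe3, h_conc, p_o2, temp=298.0, dg_r01=-4.45):
    """Return (dG_R, correction) for H+ + Fe2+ + 1/4 O2 -> Fe3+ + 1/2 H2O.

    Concentrations in mol/l, p_o2 in bar; the proton is referenced to 1e-7 M
    and water is kept at its standard value (dilute solution).
    """
    quotient = (fe3 / 1.0) / ((h_conc / 1e-7) * (fe2 / 1.0) * (p_o2 / 1.0) ** 0.25)
    correction = R * temp * math.log(quotient)
    return dg_r01 + correction, correction


# Cultivation conditions quoted in Example 5b.
dg_r, correction = fe_oxidation_gibbs(fe2=1e-3, fe3=0.3, h_conc=1e-1, p_o2=1e-1)
print(f"correction = {correction:.2f} kJ, dG_R = {dg_r:.2f} kJ")
```

Without rounding of the intermediate logarithm the result agrees with the worked values of the example to within a few hundredths of a kJ.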
11.2.4 The Required Amount of Gibbs Energy for Anabolism and Calculation of the Amount of Electron Donor That Must Be Catabolized
In the previous section it has been shown how to calculate the amount of Gibbs energy which is produced in catabolism of 1 mol electron donor (∆Gcat in kJ/mol donor). In order to calculate the amount of electron donor which must be catabolized one needs to know how much Gibbs energy is needed as input in the anabolic reaction for the synthesis of 1 CmolX. Heijnen et al. (1992) presented two simple correlations (Equations 11.6 and 11.7) which provide the amount of Gibbs energy (in kJ) that must be supplied to the anabolic reaction for the synthesis of 1 Cmol of biomass:
For heterotrophic growth
$Y_{XG} = 200 + 18\,(6 - c)^{1.8} + \exp\left[\left((3.8 - \gamma_D/c)^2\right)^{0.16} \times (3.6 + 0.4\,c)\right]$ (11.6)
wherein γD is the degree of reduction of 1 mol of electron donor/C-source and c is the number of carbon atoms of the electron donor/C-source. For autotrophic growth
$Y_{XG} = 1000$ (without reversed electron transport (RET))
$Y_{XG} = 3500$ (with RET) (11.7)
These two correlations show that:
• The amount of Gibbs energy needed for heterotrophic growth (Equation 11.6) depends only on the nature of the organic carbon source used (which is also the electron donor in heterotrophic growth systems). This correlation has been established for conventional organic compounds of up to 6 C-atoms. Equation 11.6 shows that YXG ranges from about 250 (for glucose with γd = 24 and c = 6) to about 800 (for CO with γd = 2 and c = 1) kJ Gibbs energy per CmolX. In general, C-sources which have more C-atoms and whose degree of reduction per carbon (γd/c) is closer to 4 require less Gibbs energy for anabolism. The reason is obvious if we realize that biomass has a degree of reduction per carbon atom of about 4 (γx = 4.2) and is composed of polymers made from monomers with an average carbon length of about 6. Hence organic compounds with fewer than 6 carbon atoms, or which are more reduced or oxidized than biomass (γd/c deviates from about 4), require additional biochemical reactions (making more carbon–carbon bonds and performing extra oxidation/reduction reactions). All these extra biosynthesis reactions require extra Gibbs energy, leading to larger YXG-values. From this perspective, glucose is an extremely suitable C-source (with 6 C-atoms and 4 electrons per C-atom) which is very close to the biomass molecules.
• The amount of Gibbs energy for autotrophic growth depends on the electron donor which is available to reduce CO2 to biomass. The critical point is to inspect the Gibbs energy of reaction for the anabolic reaction. Depending on the nature of the electron donor it is found that:
  • The Gibbs energy of the anabolic reaction ∆GR ≈ 0, which applies for all organic electron donors and for some inorganic electron donors such as H2 or CO. In this situation the organism does not need to use the biochemical mechanism of RET. Making biomass by reducing CO2, using, e.g., H2 or CO, therefore costs (Equation 11.7) YXG = 1000 kJ Gibbs energy per CmolX. The correlation (Equation 11.6) also gives 990 kJ for CO2 (c = 1, γD = 0).
  • The Gibbs energy of the anabolic reaction ∆GR >> 0, which applies to many inorganic electron donors (Fe2+/Fe3+, NH4+/NO2-, NO2-/NO3-). The reduction of CO2 to biomass using these electron donors is thermodynamically highly unfavorable. The microorganism then uses the mechanism of RET to raise the Gibbs energy level of the electron donor electrons to such a level that CO2-reduction to biomass becomes thermodynamically feasible. This mechanism (RET) requires a large Gibbs energy input, so that YXG = 3500 kJ/CmolX (Equation 11.7).
The amount of Gibbs energy for biomass synthesis for autotrophic or heterotrophic growth only depends on the C-source and electron donor/need for RET. It does not depend on the nature of the
catabolic reaction and the electron acceptor used. This is logical because anabolism does not involve the electron acceptor. The electron acceptor is only relevant for the amount of catabolic energy which can be obtained from the electron donor. The correlations for the Gibbs energy needed for 1 CmolX, YXG (Equations 11.6 and 11.7), and the calculated Gibbs energy produced from catabolism of 1 mol electron donor, ∆Gcat (Table 11.2), directly give the amount of electron donor which must be catabolized as YXG/∆Gcat mol electron donor catabolized/CmolX produced. Here ∆Gcat must be calculated under actual conditions for anaerobic/autotrophic growth; for aerobic heterotrophic growth the standard ΔG_R^01 is sufficiently accurate.
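The correlations of Equations 11.6 and 11.7 are easy to evaluate numerically. The sketch below is one possible encoding: the function name and the `autotrophic`/`ret` flags are assumptions made for illustration, and the range check reflects the statement that the correlation was established for compounds of up to 6 C-atoms.

```python
import math


def gibbs_energy_per_cmolx(gamma_d=None, c=None, autotrophic=False, ret=False):
    """Y_XG in kJ per CmolX from Equations 11.6 (heterotrophic) and 11.7."""
    if autotrophic:
        return 3500.0 if ret else 1000.0
    if not 1 <= c <= 6:
        raise ValueError("Equation 11.6 was established for 1 to 6 carbon atoms")
    chain_term = 18.0 * (6 - c) ** 1.8
    redox_term = math.exp(((3.8 - gamma_d / c) ** 2) ** 0.16 * (3.6 + 0.4 * c))
    return 200.0 + chain_term + redox_term


# gamma_d is the degree of reduction per mol donor, c the number of carbon atoms.
for name, gamma_d, c in [("glucose", 24, 6), ("acetate", 8, 2), ("methanol", 6, 1)]:
    print(f"{name:9s} Y_XG = {gibbs_energy_per_cmolx(gamma_d, c):.0f} kJ/CmolX")
```

The three substrates shown are the ones for which YXG values are quoted elsewhere in this chapter (Examples 6 and 12, Table 11.4), which makes them convenient checks.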
11.2.5 A Thermodynamic Relation to Calculate the Biomass Yield on Electron Donor, YDX^max
Electron donor is needed for anabolism (amount (γX/γD) per CmolX) and for catabolism (amount (YXG/∆Gcat) per CmolX). Usually one defines the biomass yield on electron donor YDX^max (also called YSX^max) in CmolX produced per mol electron donor (= substrate) used, which means that 1/YDX^max equals the total electron donor (anabolic and catabolic) used per CmolX. Given the above results it follows
$\dfrac{1}{Y_{DX}^{\max}} = \dfrac{\gamma_X}{\gamma_D} + \dfrac{Y_{XG}}{\Delta G_{cat}}$ (11.8)
Equation 11.8 shows that the total need for electron donor per CmolX (1/YDX^max) is composed of the anabolic need (γX/γD) and the catabolic need (YXG/∆Gcat). We can rewrite Equation 11.8.
$Y_{DX}^{\max} = \dfrac{(\gamma_D/\gamma_X)\,\Delta G_{cat}}{\Delta G_{cat} + (\gamma_D/\gamma_X)\,Y_{XG}}$ (11.9)
Knowing the biomass composition and the N-source gives γX; knowing the electron donor and electron acceptor gives γD and ∆Gcat; knowing the C-source and electron donor gives YXG (Equations 11.6 and 11.7). Therefore, when C-, N-source, electron donor, and acceptor are known, Equation 11.9, together with the correlations (Equations 11.6 and 11.7), directly gives the estimated biomass yield on electron donor YDX^max in CmolX per mol electron donor. It has been shown that this approach gives estimates of YDX^max over a 100 fold range with 10–15% accuracy (Heijnen et al., 1992).
Equation 11.9 shows that high values of YDX^max are obtained for electron donors with a high degree of reduction (γD high), an electron acceptor which gives a high catabolic energy per mol electron donor (∆Gcat high), and a carbon source which requires a low amount of Gibbs energy for anabolism (YXG low). Therefore autotrophic growth, with a high value of YXG, will always show a low value of YDX^max. Anaerobic growth (∆Gcat low) also leads to low YDX^max. It is also clear that NO3- as N-source leads to lower YDX^max (γX higher). A thermodynamic theoretical limit to YDX^max is directly found by putting YXG = 0 in Equation 11.9, showing that this theoretical limit of the biomass yield equals γD/γX (the inverse of γX/γD, the amount of donor needed in the anabolic reaction alone).
Example 6: Estimation of the Biomass Yield Using Equations 11.6 through 11.9
Assume that an organism grows aerobically using glucose as C-source and electron donor. NH4+ is the N-source. Equation 11.8 or 11.9 can be used to obtain YDX^max in CmolX/mol glucose. Glucose (C6H12O6) is the electron donor, for which γD = 6 × 4 + 12 × 1 + 6 × (-2) = 24. For biomass γX = 4.2 (standard composition, NH4+ as N-source). Because O2 is the electron acceptor one can set up the catabolic reaction (which is reaction 1 of Table 11.2) and ΔG_cat^01 = 2843.1 kJ is found. Growth is heterotrophic without RET, therefore Equation 11.6 applies, giving (γD = 24, c = 6) YXG = 200 + 0 + 36 = 236 kJ/CmolX. The moles of glucose needed for 1 CmolX follow from Equation 11.8 as 4.2/24 + 236/2843.1 = 0.175 + 0.083 = 0.258 mol glucose/CmolX. We also recognize that 32% of the glucose is used for catabolism and 68% for anabolism. We also find immediately YDX^max = (0.258)^-1 = 3.87 CmolX/mol glucose.
Now we assume that the organism grows anaerobically on glucose and produces ethanol using the ethanol catabolic reaction (reaction 4, Table 11.2). ΔG_cat^01 is now different, being 225.4 kJ. The moles of glucose needed for 1 CmolX now follow from Equation 11.8 as 4.2/24 + 236/225.4 = 0.175 + 1.05 = 1.225 mol glucose/CmolX. Now 86% of the glucose is catabolized. The biomass yield on substrate is YDX^max = (1.225)^-1 = 0.82 CmolX/mol glucose, which is more than four times lower than for aerobic growth. This difference in biomass yield is completely due to the very different catabolic energy gain.
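The two yield estimates of Example 6 follow from Equation 11.8 in a couple of lines. This is a sketch only; the function name is an assumption, and YXG = 236 kJ/CmolX and the ΔG_cat^01 values are the ones quoted in the example.

```python
def max_biomass_yield(gamma_d, gamma_x, y_xg, dg_cat):
    """Equation 11.8: returns (Y_DX_max in CmolX/mol donor, catabolized fraction)."""
    anabolic = gamma_x / gamma_d   # mol donor built into 1 CmolX
    catabolic = y_xg / dg_cat      # mol donor catabolized per CmolX
    total = anabolic + catabolic
    return 1.0 / total, catabolic / total


# Glucose (gamma_D = 24), standard biomass (gamma_X = 4.2), Y_XG = 236 kJ/CmolX.
aerobic_yield, aerobic_cat = max_biomass_yield(24, 4.2, 236.0, 2843.1)
anaerobic_yield, anaerobic_cat = max_biomass_yield(24, 4.2, 236.0, 225.4)
print(f"aerobic:   {aerobic_yield:.2f} CmolX/mol, {aerobic_cat:.0%} catabolized")
print(f"anaerobic: {anaerobic_yield:.2f} CmolX/mol, {anaerobic_cat:.0%} catabolized")
```

Setting `y_xg` to zero returns the thermodynamic limit γD/γX discussed above.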
11.3 Thermodynamics of Maintenance
All organisms are complex structures in which many decay processes occur. For example, proteins denature and need to be reassembled. Membranes are leaky and molecules must be pumped out. In all these processes there is full material recycling; they only use Gibbs energy, and the collective spending of Gibbs energy for maintenance purposes can be assigned a value mG (kJ of Gibbs energy needed per hour for the maintenance of 1 CmolX). An Arrhenius type of relation has been found (Equation 11.10) which shows that mG is similar for many organisms and is not influenced by the C-, N-source or the electron donor or acceptor (Tijhuis et al., 1993). The only factor of interest is the temperature, where a higher temperature increases mG. This is understandable because a higher temperature increases the rate of decay processes and nearly all organisms have similar structure/chemistry.
$m_G = 4.5\,\exp\left[\dfrac{-69{,}000}{R}\left(\dfrac{1}{T} - \dfrac{1}{298}\right)\right]$ (11.10)
In Equation 11.10 T is the absolute temperature (K) and R is the gas constant (8.314 J/mol K). 69,000 is the energy of activation in J/mol, which shows that an increase of about 8°C doubles mG. Of course, Gibbs energy is derived from catabolism of electron donor, and therefore the catabolic reaction used by the organism and its ∆Gcat immediately allow one to calculate how much electron donor and acceptor are needed for maintenance (see Example 7).
Example 7: Calculation of the Consumption Rates of Electron Donor and Acceptor for Maintenance
Consider aerobic growth on glucose of Example 6, and assume that the cultivation temperature is 37°C (= 310 K). Equation 11.10 shows that mG = 13.12 kJ/CmolXh. The catabolic reaction of 1 mol glucose gives 2843.1 kJ (Table 11.2). This allows one to calculate mD = -13.12/2843.1 = -0.0046 mol glucose/CmolXh. Because catabolism consumes 6 O2 and produces 6 CO2 it also follows that mO2 = -0.0046 × 6 = -0.0276 mol O2/CmolXh and mCO2 = +0.0276 mol CO2/CmolXh. Now consider anaerobic growth on glucose. The catabolic energy gain is 225.4 kJ/mol glucose (Table 11.2). mG remains the same, but the amount of electron donor needed for maintenance becomes mD = -13.12/225.4 = -0.058 mol glucose/CmolXh. Hence for maintenance under anaerobic (compared to aerobic) conditions the organism spends about 13 times (= 2843.1/225.4) more glucose to generate the same rate of maintenance Gibbs energy, because of the 13 fold difference in catabolic energy gain per mol glucose. In anaerobic catabolism CO2 and ethanol are produced, hence
meth = -2 mD = +0.116 mol ethanol/CmolXh
mCO2 = -2 mD = +0.116 mol CO2/CmolXh
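Equation 11.10 and the maintenance rates of Example 7 can be evaluated as follows. This is a sketch with illustrative function names; evaluated with T = 310 K and R = 8.314 J/mol K it gives mG ≈ 13.2 kJ/CmolXh, marginally above the 13.12 quoted in the example, which presumably reflects rounding of the constants used there.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def maintenance_gibbs(temp_k):
    """Equation 11.10: m_G in kJ per CmolX per hour at temperature temp_k (K)."""
    return 4.5 * math.exp(-69000.0 / R * (1.0 / temp_k - 1.0 / 298.0))


def maintenance_donor_rate(temp_k, dg_cat):
    """m_D in mol donor per CmolX per hour (negative = consumption)."""
    return -maintenance_gibbs(temp_k) / dg_cat


m_g = maintenance_gibbs(310.0)
m_d_aerobic = maintenance_donor_rate(310.0, 2843.1)   # glucose with O2
m_d_anaerobic = maintenance_donor_rate(310.0, 225.4)  # glucose to ethanol
print(f"m_G = {m_g:.2f} kJ/CmolX/h")
print(f"m_D aerobic = {m_d_aerobic:.4f}, anaerobic = {m_d_anaerobic:.3f} mol/CmolX/h")
```

Rates for the other catabolic reactants and products follow by multiplying mD with their stoichiometric coefficients per mol donor.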
11.4 Calculation of the Operational Stoichiometry of a Growth Process at Different Growth Rates, Including Heat Using the Herbert–Pirt Relation for Electron Donor
Using the three thermodynamic correlations for growth and maintenance (Equations 11.6, 11.7, and 11.10) one can calculate for any growth system (with specified C-, N-source, electron donor and acceptor, temperature) a value for YDX^max and mD. The presence of maintenance causes the measured (operational) biomass yield YDX to be smaller than the theoretical maximum YDX^max, because part of the consumed electron donor is spent for maintenance. This is expressed in the Herbert–Pirt relation for electron donor consumption.
$-q_D = \dfrac{1}{Y_{DX}^{\max}}\,\mu + (-m_D)$ (11.11)
qD, which is negative, is the biomass specific rate of electron donor consumption (mol electron donor/CmolXh). (1/YDX^max)µ is the rate of electron donor spent on growth and (-mD) is the rate of electron donor consumed for maintenance. The operational yield YDX (CmolX/mol electron donor) is defined as the ratio of µ and (-qD):
$Y_{DX} = \dfrac{\mu}{-q_D} = Y_{DX}^{\max}\,\dfrac{\mu}{\mu + (-m_D\,Y_{DX}^{\max})}$ (11.12)
Equation 11.12 shows that the operational biomass yield YDX is a hyperbolic function of µ. At high µ (µ >> (-mD YDX^max)) the operational biomass yield approaches asymptotically the maximal yield. At low µ (µ << (-mD YDX^max)) the yield YDX drops to zero. Equation 11.12 shows that knowing mD and YDX^max allows one to calculate the operational biomass yield on electron donor YDX in CmolX/mol donor for any specified value of µ. Summarizing, when beside C-, N-source, the electron donor and electron acceptor, also µ and temperature are specified, one can calculate the operational biomass yield YDX and the effect of µ on YDX!!
Example 8: Calculation of the Effect of µ on YDX
Consider Example 7, aerobic growth on glucose. At 37°C mD = -0.0046 mol glucose/CmolXh. In Example 6 it was found that YDX^max = 3.87 CmolX/mol glucose. This gives (Equation 11.12) for the biomass yield: YDX = 3.87 µ/(µ + 0.018). This result shows that for large µ-values YDX becomes constant at 3.87 CmolX/mol glucose. For µ < 0.1 h^-1 YDX starts to drop significantly. Consider now anaerobic growth on glucose with ethanol production (Examples 6 and 7). Using mD = -0.058 mol glucose/CmolXh and YDX^max = 0.82 CmolX/mol glucose gives: YDX = 0.82 µ/(µ + 0.047).
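The hyperbolic dependence of the operational yield on µ (Equation 11.12) can be tabulated with a short script. The sketch below uses the parameter values of Example 8; the function name is an illustrative assumption.

```python
def operational_yield(mu, y_max, m_d):
    """Herbert-Pirt relation: Y_DX = mu / (-q_D), with -q_D = mu / y_max + (-m_d)."""
    donor_uptake = mu / y_max + (-m_d)  # mol donor per CmolX per hour
    return mu / donor_uptake


# (Y_DX_max, m_D) for aerobic growth and for anaerobic ethanol formation on glucose.
systems = {"aerobic": (3.87, -0.0046), "anaerobic": (0.82, -0.058)}
for mu in (0.01, 0.05, 0.1, 0.5):
    row = ", ".join(
        f"{name} {operational_yield(mu, y_max, m_d):.2f}"
        for name, (y_max, m_d) in systems.items()
    )
    print(f"mu = {mu:.2f} 1/h: {row} CmolX/mol glucose")
```

The anaerobic value at µ = 0.1 h^-1 is the yield that Example 9 takes as its starting point.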
Having calculated the value of YDX for a specified growth system with a chosen T and µ, one can obtain the complete growth stoichiometry which belongs to the specified values of µ and T. This is obtained by solving the element/charge conservation relations and/or the balance of degree of reduction, as already used before. Having the complete reaction one can also use energy conservation (first law of thermodynamics, the enthalpy balance) to obtain the heat production. The procedure is shown in Example 9.
Example 9: Calculation of Complete Growth Process Stoichiometry Including Heat
Consider anaerobic growth of an organism which produces ethanol (C2H6O) from glucose (C6H12O6). Also assume that the cultivation temperature is 37°C and µ = 0.1 h^-1. Example 8 shows that for µ = 0.1 h^-1 the value for YDX = (0.82 × 0.1)/(0.1 + 0.047) = 0.558 CmolX/mol glucose. This means that for the production of 1 CmolX one needs 1.79 mol glucose when µ = 0.1 h^-1. For the complete growth process we can now write for the synthesis of 1 CmolX:
-1.79 C6H12O6 + a H2O + b H+ + c NH4+ + d CO2 + e C2H6O + f heat (kJ) + 1 C1H1.8O0.5N0.2 = 0
This represents our specific knowledge (NH4+ is the N-source; 1.79 mol glucose is needed at µ = 0.1 h^-1 and 37°C, hence -1.79 mol C6H12O6 for 1 Cmol biomass production; ethanol is the catabolic product) and the general knowledge that H+, H2O, and CO2 are always involved. The six unknown coefficients (a–f) are easily found by setting up the six conservation relations for four elements, electric charge, and enthalpy (using ∆Hf from Table 11.3). These six linear relations are then solved. This results in
-1.79 C6H12O6 - 0.2 NH4+ + 0.2 H+ + 1 C1H1.8O0.5N0.2 + 3.23 C2H6O + 3.28 CO2 + 0.45 H2O + 153.4 kJ heat = 0
Example 9 shows that, when C-, N-source, electron donor, and acceptor are known and T and µ are specified, we can use the three thermodynamic correlations (Equations 11.6, 11.7, and 11.10) to obtain the complete process stoichiometry, including heat production. Changing T, µ, or N-source will lead to a different process stoichiometry, which can be directly calculated. This process stoichiometry is the basis of each process design because it specifies all consumption and production, all of which must be transported to and from the fermentor. Also all biomass specific rates are available, e.g., in Example 9, µ = 0.10 h^-1, qS = -1.79 × 0.10 mol glucose/CmolXh, qCO2 = +3.28 × 0.10 mol CO2/CmolXh, qeth = +3.23 × 0.10 mol ethanol/CmolXh, etc.
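The six conservation relations of Example 9 form a small linear system. The following sketch sets them up and solves them with a hand-written Gauss-Jordan elimination so that it needs no external library; the data layout and function names are assumptions made for illustration, and the enthalpies are the Table 11.3 values.

```python
# Columns: C, H, O, N, charge, dHf (kJ/mol); "heat" enters the enthalpy balance only.
SPECIES = {
    "glucose": (6, 12, 6, 0, 0, -1264.0),
    "biomass": (1, 1.8, 0.5, 0.2, 0, -91.0),
    "H2O":     (0, 2, 1, 0, 0, -286.0),
    "H+":      (0, 1, 0, 0, 1, 0.0),
    "NH4+":    (0, 4, 0, 1, 1, -133.0),
    "CO2":     (1, 0, 2, 0, 0, -394.1),
    "ethanol": (2, 6, 1, 0, 0, -288.0),
    "heat":    (0, 0, 0, 0, 0, 1.0),
}
KNOWN = {"glucose": -1.79, "biomass": 1.0}  # per CmolX at mu = 0.1 1/h
UNKNOWN = ["H2O", "H+", "NH4+", "CO2", "ethanol", "heat"]


def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# One balance per row (C, H, O, N, charge, enthalpy): sum(nu_i * content_i) = 0.
matrix = [[SPECIES[name][k] for name in UNKNOWN] for k in range(6)]
rhs = [-sum(nu * SPECIES[name][k] for name, nu in KNOWN.items()) for k in range(6)]
coefficients = dict(zip(UNKNOWN, solve_linear(matrix, rhs)))
for name, nu in coefficients.items():
    print(f"{name:8s} {nu:+.2f}")
```

Changing the glucose coefficient in `KNOWN` (i.e., choosing another µ or temperature) gives the corresponding stoichiometry and heat production without further changes.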
11.5 A Correlation to Estimate the Maximum Specific Growth Rate, µmax
A most important parameter in fermentation process design is the maximum specific growth rate µmax. It is known that this value differs widely among microbial growth systems (values are known to range between order 0.005 and 2 h^-1). A simple hypothesis to explain this variation is that cells have limits in catabolic energy production. Nearly all catabolic energy is obtained from the transport of electrons from the electron donor to the electron acceptor through the electron transport chain (ETC). The ETC consists of electron processing proteins embedded in membranes. Because cells are limited in the amount of membrane area, and the amount of ETC protein which can be placed in membranes is also physically (space) limited, it is to be expected that there is a limit in the electron transport rate per CmolX. It is also logical to expect that this electron transport capacity is higher at higher temperature. Heijnen (1999) has proposed the following correlation. Maximal electron transport capacity (e-mol/CmolXh) is:
$3\,\exp\left[\dfrac{-69{,}000}{R}\left(\dfrac{1}{T} - \dfrac{1}{298}\right)\right]$ (11.13)
For convenience the same energy of activation (69,000 J/mol) was used as found for maintenance (Equation 11.10). The ETC transports electrons obtained from an electron donor with γD electrons and releases ∆Gcat of Gibbs energy per mol electron donor; therefore the Gibbs energy obtained per catabolized electron is ∆Gcat/γD. This gives, together with Equation 11.13, for the maximal biomass specific Gibbs energy production rate (in kJ/CmolXh):
$q_G^{\max} = 3\,(\Delta G_{cat}/\gamma_D)\,\exp\left[\dfrac{-69{,}000}{R}\left(\dfrac{1}{T} - \dfrac{1}{298}\right)\right]$ (11.14)
Table 11.4 Estimated µmax-Values at 25°C Based on Limiting Electron Flux in the ETC
Microbial system | ∆Gcat/γD (kJ/mol electron) | YXG (kJ/Cmol biomass) | µmax (h^-1, 25°C)
Aerobic/glucose | 118.5 | 236 | 1.5
Aerobic/acetate | 105.5 | 432 | 0.7
Anaerobic/CH4 from acetate | 3.87 | 432 | 0.015
Anaerobic/ethanol from glucose | 9.39 | 236 | 0.10
Aerobic/Fe2+ oxidation (pH = 1.5) | 38.6 | 3,500 | 0.03
Aerobic/sulfide oxidation to SO4 2- | 99.6 | 3,500 | 0.08
Aerobic/nitrification | 45.8 | 3,500 | 0.04
The Gibbs energy produced is spent on growth (YXG × µmax) and maintenance (mG). Therefore we can write:
$q_G^{\max} = Y_{XG}\,\mu^{\max} + m_G$ (11.15)
Introducing Equation 11.10 for mG and Equation 11.14 gives for µmax
$\mu^{\max} = \dfrac{3\,\Delta G_{cat}/\gamma_D - 4.5}{Y_{XG}}\,\exp\left[\dfrac{-69{,}000}{R}\left(\dfrac{1}{T} - \dfrac{1}{298}\right)\right]$ (11.16)
In Equation 11.16, YXG is the Gibbs energy needed in anabolism according to Equations 11.6 and 11.7. Table 11.4 shows the µmax-values (25°C) obtained for various growth systems; they cover a 100 fold range (0.015–1.5 h^-1), which is indeed observed. Also, the µmax-values for these very different systems agree roughly with observed µmax-values. This indicates that the very simplistic approach of Gibbs energy limitation due to ETC-limitation has some merit for understanding the very different µmax-values.
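Equation 11.16 can be checked against the entries of Table 11.4. The sketch below assumes the form obtained by combining Equations 11.10, 11.14, and 11.15 (pre-exponential factors 3 and 4.5, shared Arrhenius term); the function name is illustrative.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def mu_max(dg_cat_per_electron, y_xg, temp_k=298.0):
    """Equation 11.16: maximum specific growth rate in 1/h.

    dg_cat_per_electron is dG_cat/gamma_D in kJ per mol electrons,
    y_xg the Gibbs energy needed for growth in kJ per CmolX.
    """
    arrhenius = math.exp(-69000.0 / R * (1.0 / temp_k - 1.0 / 298.0))
    return (3.0 * dg_cat_per_electron - 4.5) / y_xg * arrhenius


# (dG_cat/gamma_D, Y_XG) pairs taken from Table 11.4.
table_11_4 = {
    "aerobic/glucose": (118.5, 236.0),
    "aerobic/acetate": (105.5, 432.0),
    "anaerobic/CH4 from acetate": (3.87, 432.0),
    "anaerobic/ethanol from glucose": (9.39, 236.0),
    "aerobic/Fe2+ oxidation": (38.6, 3500.0),
}
for system, (dg_e, y_xg) in table_11_4.items():
    print(f"{system:32s} mu_max(25 C) = {mu_max(dg_e, y_xg):.3f} 1/h")
```

Passing a different `temp_k` scales all values by the same Arrhenius factor, since growth capacity and maintenance share the activation energy in this correlation.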
11.6 Thermodynamic Prediction of Minimal Concentration Electron Donor and Maximal Concentration of Catabolic Product
Growth is only possible when catabolic energy can be generated. In biochemical terms this means that the catabolic reaction can be coupled to the generation of proton motive force (pmf). The pmf equals about 15 kJ of Gibbs energy for 1 mol H+, and therefore one might expect that a catabolic reaction needs to generate at least 15 kJ of Gibbs energy in order to create pmf. It is easy to show that this sets minimal limits (thresholds) to the concentrations of electron donor/acceptor and maximum concentrations to the catabolic products. The concept of a minimal value of catabolic energy has been validated in microbial growth systems which have a catabolic reaction with a low gain of catabolic energy (Examples 10 and 11).
Example 10: Thiobacillus ferrooxidans Has a “Threshold” Concentration in Electron Donor for the Catabolic Reaction
Fe2+ + ¼ O2 (g) + H+ → Fe3+ + ½ H2O
Observation shows (Boon et al., 1995) that O2-consumption, and thus catabolism, stops at a threshold concentration of Fe2+ = 3 × 10^-4 M. Further conditions are PO2 = 0.12 bar, pH = 1.85, Fe3+ = 0.21 M, T = 308 K. At these conditions ∆GCAT = -16 kJ. This value is around the postulated minimal value needed to generate pmf.
Example 11: Inhibiting Concentration of a Catabolic Product
Interspecies H2-transfer occurs in the anaerobic conversion of ethanol to acetate and CH4. This is performed by two microorganisms with different catabolic reactions.
m.o.1: ethanol + H2O → acetate + 2 H2 + H+
m.o.2: 2 H2 + ½ HCO3- + ½ H+ → ½ CH4 + 1.5 H2O
The catabolic reaction of m.o.1 cannot proceed under standard conditions (ΔG_R^01 = +9.62 kJ) due to the standard H2 pressure of 1 bar. However, m.o.2 has a catabolic reaction with a large negative standard Gibbs energy of reaction (ΔG_R^01 = -67.68 kJ). M.o.2 decreases the H2 partial pressure to a very low level (3.2 × 10^-4 bar). This lowers ∆GR for m.o.1 to -30.3 kJ and gives ∆GR = -27.9 kJ for m.o.2. The total available Gibbs energy (combined reaction, -58.18 kJ) is neatly divided between both organisms by the low H2-pressure. Measurements have confirmed the calculated range of H2-pressure. Equipartition of Gibbs energy makes sense, because it allows both organisms to grow at equal rate, which can be expected because the rates of both catabolic reactions must be equal due to the H2-coupling.
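The effect of the H2 partial pressure on both organisms in Example 11 can be sketched as below. This is an illustration under stated assumptions: only H2 deviates from its standard state (all other species are kept at standard concentrations, pH = 7, 25°C), and the function names are hypothetical.

```python
import math

RT = 8.314e-3 * 298.0  # kJ/mol at 25 C


def syntrophy_gibbs(p_h2, dg01_producer=9.62, dg01_consumer=-67.68):
    """dG_R (kJ) of the H2 producer (m.o.1) and the H2 consumer (m.o.2)."""
    shift = 2.0 * RT * math.log(p_h2 / 1.0)  # both reactions involve 2 mol H2
    return dg01_producer + shift, dg01_consumer - shift


def equipartition_pressure(dg01_producer=9.62, dg01_consumer=-67.68):
    """H2 pressure (bar) at which both organisms obtain the same dG_R."""
    shift = (dg01_consumer - dg01_producer) / 2.0
    return math.exp(shift / (2.0 * RT))


dg_producer, dg_consumer = syntrophy_gibbs(3.2e-4)
p_equal = equipartition_pressure()
print(f"at 3.2e-4 bar: m.o.1 {dg_producer:.1f} kJ, m.o.2 {dg_consumer:.1f} kJ")
print(f"equal sharing at p_H2 = {p_equal:.1e} bar")
```

Under these assumptions the pressure for exactly equal sharing comes out at the same order of magnitude as the 3.2 × 10^-4 bar quoted in the example.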
Another interesting observation (Seitz et al., 1990a, 1990b) is that the minimal acetate concentration achieved by microorganisms that convert acetate to CH4 and CO2 decreases with increasing temperature. This is due to the increasing standard catabolic energy gain at higher temperature. The actual catabolic energy gain is maintained at a constant level by a decreasing acetate concentration. In conclusion, expected minimal and maximal concentrations of catabolic reactants can be directly quantified using a minimal catabolic energy gain of about 10–20 kJ.
11.7 Thermodynamics and Stoichiometry of Product Formation
For product formation one can distinguish two categories: catabolic or anabolic products. For both categories the thermodynamic approach allows interesting statements on the stoichiometry and sometimes on kinetics!! These statements do not need any information on metabolic pathways. This is attractive, because such information is often lacking, but the results can therefore only be approximate. A detailed biochemical pathway always provides more exact information on stoichiometry, but this belongs to the field of metabolic network analysis.
11.7.1 Catabolic Products
Many interesting products result from catabolic reactions. For catabolic products a direct coupling exists between catabolic product formation and growth. An important class of products is the anaerobic fermentation products (ethanol, acetate, lactate, etc.). Because there is then no external acceptor we can write the balance of degree of reduction.
$\gamma_D\,q_D + \gamma_X\,\mu + \gamma_P\,q_P = 0$ (11.17)
Here γP and qP are the degree of reduction and the biomass specific rate of product formation. Furthermore the Herbert–Pirt relation applies:
$-q_D = \dfrac{1}{Y_{DX}^{\max}}\,\mu + (-m_D)$ (11.11)
Elimination of qD by substituting Equation 11.11 in Equation 11.17 yields a linear relation between qP and µ.
$q_P = \left(\dfrac{\gamma_D}{\gamma_P\,Y_{DX}^{\max}} - \dfrac{\gamma_X}{\gamma_P}\right)\mu + \dfrac{\gamma_D}{\gamma_P}\,(-m_D)$ (11.18)
This relation can be completely calculated using the thermodynamic approach which provides max and (-m ) as outlined earlier in this chapter. numerical values for YDX D Thermodynamics therefore allow to calculate expected (possible) q p-rates for arbitrary catabolic products. An interesting quantity is the product yield on electron donor Y DP ( = qP/(-qD)). Using the definition one obtains for this product yield (molP/mol electron donor): γX γD γD γ Y max - γ µ + γ (- m D ) P DX P P YDP = 1 µ + ( m ) D max YDX
(11.19)
This relation shows the product yield to be a function of µ only. There are two extreme situations.
• µ = 0: In the absence of growth YDP = γD/γP, which is the stoichiometric ratio in the catabolic reaction, because the metabolism is only maintenance and therefore fully catabolic.
• µ very large: Maintenance is then negligible and
$$Y_{DP} = \frac{\gamma_D}{\gamma_P}\left(1 - \frac{\gamma_X}{\gamma_D}\,Y_{DX}^{\max}\right)$$
The product yield is then lower than for µ = 0, which is understandable because part of the electron donor is used for anabolism. Using the thermodynamic relation for $Y_{DX}^{\max}$ we can rewrite the relation for the product yield (if µ is large):
$$Y_{DP} = \frac{\gamma_D}{\gamma_P} \times \frac{1}{1 + \dfrac{\gamma_X\,(\Delta G_{CAT}/\gamma_D)}{Y_{XG}}} \tag{11.20}$$
This relation shows that the catabolic product yield comes closer to its catabolic maximum γD/γP when:
• ∆GCAT/γD, which is the catabolic energy gain per electron, is small
• YXG is large, which is understandable because this leads to lower biomass yields
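The relations above (Equations 11.11, 11.18, and 11.19) can be sketched numerically as follows. This is a minimal illustration, not part of the original treatment: the function names and the parameter set are hypothetical and were chosen only to exercise the formulas and to confirm the two limiting cases (µ = 0 and µ very large).

```python
def donor_rate(mu, y_dx_max, minus_m_d):
    """Herbert-Pirt relation (Eq. 11.11): returns -qD in mol donor/(CmolX h)."""
    return mu / y_dx_max + minus_m_d


def product_rate(mu, gamma_d, gamma_x, gamma_p, y_dx_max, minus_m_d):
    """Linear qP(mu) relation (Eq. 11.18) in mol product/(CmolX h)."""
    slope = gamma_d / (gamma_p * y_dx_max) - gamma_x / gamma_p
    return slope * mu + (gamma_d / gamma_p) * minus_m_d


def product_yield(mu, gamma_d, gamma_x, gamma_p, y_dx_max, minus_m_d):
    """Product yield on donor (Eq. 11.19): Y_DP = qP / (-qD)."""
    q_p = product_rate(mu, gamma_d, gamma_x, gamma_p, y_dx_max, minus_m_d)
    return q_p / donor_rate(mu, y_dx_max, minus_m_d)


# Hypothetical parameter set (assumption): not taken from the chapter.
params = dict(gamma_d=24.0, gamma_x=4.2, gamma_p=12.0,
              y_dx_max=0.5, minus_m_d=0.02)

# Limit 1: no growth, yield equals gamma_D / gamma_P.
yield_no_growth = product_yield(0.0, **params)

# Limit 2: very fast growth, maintenance becomes negligible.
yield_fast_growth = product_yield(1e9, **params)
high_mu_limit = (params["gamma_d"] / params["gamma_p"]) * (
    1.0 - params["gamma_x"] / params["gamma_d"] * params["y_dx_max"]
)

print(yield_no_growth, yield_fast_growth, high_mu_limit)
```

With any parameter set, the computed qP and −qD should also close the degree of reduction balance of Equation 11.17, which is a convenient consistency check on the signs.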
Example 12: Thermodynamic Prediction of a Catabolic Production Rate and Catabolic Product Stoichiometry
It is proposed to convert methanol anaerobically to acetate. The catabolic reaction is
−1 CH4O − 0.50 HCO3− + 0.75 C2H3O2− + 0.25 H+ + 1 H2O
This catabolic reaction gives (standard conditions) ∆GCAT = 55.54 kJ/mol methanol. The value of YXG (heterotrophic, no RET (Equation 11.6)) equals (for methanol C = 1, γD = 6) 698 kJ Gibbs energy/CmolX. This gives (Equation 11.9) a value for the maximal biomass yield $Y_{DX}^{\max}$ = 0.073 CmolX per mol methanol. Assuming T = 298 K, we obtain mD = −0.081 mol methanol/(h CmolX). The following qP(µ) relation is then obtained (Equation 11.18), with qP in mol acetate/(CmolX h) and γP = 8:
qP = 9.73 µ + 0.061
For the product yield at high µ one obtains (Equation 11.20):
$$Y_{DP} = \frac{6}{8} \times \frac{1}{1 + 0.056} = 0.710\ \text{mol acetate/mol methanol}$$
This shows a very profitable, high yield, even under full growth conditions. The yield is so high because of CO2 fixation in the catabolic reaction.
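The numbers of Example 12 can be retraced with a few lines of arithmetic. The sketch below uses only the values quoted in the example, with one assumption that is not stated in this section: a biomass degree of reduction γX = 4.2, a commonly used value for standard biomass. Under that assumption the slope comes out at roughly 9.75 rather than the quoted 9.73, a small difference that is presumably due to rounding of the biomass yield.

```python
# Values quoted in Example 12
GAMMA_D = 6.0        # degree of reduction of methanol (electron donor)
GAMMA_P = 8.0        # degree of reduction of acetate (product)
Y_DX_MAX = 0.073     # CmolX per mol methanol
MINUS_M_D = 0.081    # mol methanol/(h CmolX), i.e. -mD
DELTA_G_CAT = 55.54  # kJ per mol methanol, as quoted in the example
Y_XG = 698.0         # kJ Gibbs energy per CmolX

# Assumption (not given in this section): degree of reduction of biomass
GAMMA_X = 4.2

# Equation 11.18: qP = slope * mu + intercept
slope = GAMMA_D / (GAMMA_P * Y_DX_MAX) - GAMMA_X / GAMMA_P
intercept = (GAMMA_D / GAMMA_P) * MINUS_M_D

# Equation 11.20: product yield at high growth rate
energy_term = GAMMA_X * (DELTA_G_CAT / GAMMA_D) / Y_XG
y_dp_high_mu = (GAMMA_D / GAMMA_P) / (1.0 + energy_term)

print(f"qP = {slope:.2f} mu + {intercept:.3f}")
print(f"energy term = {energy_term:.3f}, Y_DP (high mu) = {y_dp_high_mu:.3f}")
```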
11.7.2 Anabolic Products
Here thermodynamics cannot provide a qP(µ) function, because the relation between product formation and growth is completely determined by the kinetic properties of the metabolic network. However, thermodynamics can still make a statement on the maximal product yield. The anabolic reaction for an anabolic product (e.g., amino acid, nucleotide, protein, antibiotic, lipid, etc.) has the same format as the anabolic reaction for biomass (Figure 11.1). The minimal amount of electron donor then equals γP/γD (see earlier in this chapter). In reality more electron donor is needed to provide Gibbs energy for the product synthesis. This amount is not easily predicted thermodynamically. However, the minimal amount of donor γP/γD leads to the thermodynamic maximal limit in product yield:
$$Y_{DP}^{\text{limit}} = \frac{\gamma_D}{\gamma_P}\quad \frac{\text{mol product}}{\text{mol donor}} \tag{11.21}$$
In conclusion, thermodynamics allows very definite statements about product rates and yields for catabolic products and puts a maximal limit to the yield of anabolic products.
Example 13: Limits of Anabolic Product
Consider as product the amino acid phenylalanine, C9H11O2N, with γP = 40. For growth on glucose (γD = 24) the thermodynamic limit of the yield equals γD/γP = 24/40 = 0.60 mol phenylalanine per mol glucose.
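The degrees of reduction used in Example 13 can be recomputed from the elemental formulas. The sketch below assumes the usual electron-counting convention with CO2, H2O, and NH3 as reference compounds (C = 4, H = 1, O = −2, N = −3, minus the net charge); the function name and the dictionary representation of a formula are illustrative choices, not notation from this chapter.

```python
# Electrons contributed per atom, assuming CO2, H2O and NH3 as the
# reference (zero degree of reduction) compounds.
ELECTRONS_PER_ATOM = {"C": 4, "H": 1, "O": -2, "N": -3}


def degree_of_reduction(formula, charge=0):
    """Degree of reduction per mol of compound.

    formula: dict mapping element symbol to atom count.
    charge:  net charge of the species (a negative charge adds electrons).
    """
    electrons = sum(ELECTRONS_PER_ATOM[el] * n for el, n in formula.items())
    return electrons - charge


phenylalanine = {"C": 9, "H": 11, "O": 2, "N": 1}
glucose = {"C": 6, "H": 12, "O": 6}

gamma_p = degree_of_reduction(phenylalanine)
gamma_d = degree_of_reduction(glucose)
yield_limit = gamma_d / gamma_p  # Equation 11.21, mol product per mol donor

print(gamma_p, gamma_d, yield_limit)
```

The same function reproduces the values used in Example 12 for methanol (CH4O) and acetate (C2H3O2−, charge −1).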
11.8 Conclusions It has been shown that thermodynamics and only three correlations provide very quantitative information on stoichiometry, rates and thresholds of microbial growth and product formation. The predictions are based on the unity of biochemistry for growth. Therefore when measured stoichiometry or rates are very different from the thermodynamically predicted values, this is an indication of very abnormal catabolic and/or anabolic routes. In this way, the presented thermodynamic approach not only provides predictions of stoichiometry and rates, but may also point the direction to new discoveries.
III Bacterial Transcriptional Regulation of Metabolism
James C. Liao, University of California
12 Transcribing Metabolism Genes: Lessons from a Feral Promoter
Alan J. Wolfe
The Fundamentals of Eubacterial Transcription • Regulation of a Native Metabolic Promoter • Concluding Remarks
13 Regulation of Secondary Metabolism in Bacteria
Wenjun Zhang, Joshua P. Ferreira, and Yi Tang
Introduction • Pathway Specific Regulators • Pleiotropic and Global Regulators in Actinomycetes • Nutritional and Physiological Factors • Other Functions of Secondary Metabolites • Conclusions
14 A Synthetic Approach to Transcriptional Regulatory Engineering
Wilson W. Wong and James C. Liao
Introduction • Lessons from Natural Circuits • Synthetic Intracellular Circuits • Synthetic Intercellular Circuits • Alternative Synthetic Strategies • Issues in Population versus Single Cell Measurements • Conclusion
Since Jacob and Monod discovered the regulation of the lactose operon half a century ago, many modes of regulation have been discovered (Pardee, Jacob, and Monod, 1959). Like other intracellular events, metabolism is regulated at various levels, ranging from transcriptional initiation to metabolite feedback inhibition. These regulatory mechanisms have evolved to control the timing and extent of metabolic flux to different branches in order to supply building blocks, reducing power, cofactors, and
energy needed for cell growth, maintenance, and adaptation to environmental changes. Because of these control systems, the cells are able to counter genetic or environmental changes that attempt to alter metabolic flux. As such, regulation is the most important issue that a metabolic engineer must deal with. Indeed, random mutagenesis that destroys regulation was the earliest tool of metabolic engineering for production of amino acids. Among different forms of regulation, the most common and best understood is transcriptional regulation. Because of its modularity, transcriptional regulation is readily amenable to engineering. This section deals with three aspects of bacterial transcriptional regulation: regulation of primary metabolism, regulation of secondary metabolism, and engineering of synthetic regulatory networks. Other forms of regulation, such as enzyme activity modulation, are certainly important, but tools for systematic engineering of these mechanisms remain to be developed. The lac operon model of Jacob and Monod set a great foundation for transcriptional regulation, but the details of regulation are often much more complicated, as the cell needs to respond to multiple scenarios in a quantitative manner. The chapter by Alan Wolfe (Chapter 12) gives a tutorial on the basic principles of transcriptional regulation and how they respond to environmental signals. An important departure from the operon-specific regulation is the coordinated regulation exerted by various sigma factors and global transcription factors, such as cAMP receptor protein (CRP) and FNR, which control a large number of genes. Sigma factors determine the binding of RNA polymerase to specific promoters. Upon a significant environmental change, such as heat shock or nitrogen starvation, the specific sigma factor is induced, which then turns on the expression of a set of genes that adapt to or counter the environmental challenges. 
These sigma factors are typically used to effect large and rapid induction of the relevant genes. They act as positive regulators. Global transcription factors are used for similar purposes but through different mechanisms, such that they can modulate transcription either positively or negatively. These proteins respond to environmental signals and bind to specific sites near the promoter regions of the genes they control. CRP is activated by cAMP, a secondary signaling molecule synthesized when glucose is depleted. CRP is a DNA-binding and DNA-bending protein that controls more than 200 genes in E. coli (Brown and Callan, 2004). FNR belongs to the CRP family of transcriptional regulators and contains an oxygen-labile iron-sulfur center that serves as a sensor for oxygen. FNR regulates more than 700 genes in E. coli during anaerobiosis (Salmon et al., 2003). These global regulators also work in conjunction with other, more specific regulators to respond to multiple conditions.
Another important regulatory mechanism is the so-called two-component system, which involves a sensor kinase (SK) and a response regulator (RR). Despite the name, many of these systems involve more than two components. Prior to the discovery of these systems, a regulator was typically understood to be a homodimer (such as CRP) or a homotetramer (such as LacI) that binds both a metabolite signal and a cis-acting DNA site. The discovery of the two-component systems EnvZ/OmpR (osmoregulation in E. coli), PhoR/PhoB (phosphate scavenging in E. coli), NtrB/NtrC (nitrogen assimilation in a variety of bacteria), DctB/DctD (dicarboxylate transport in Rhizobium leguminosarum), and VirA/VirG (virulence in Agrobacterium tumefaciens) demonstrated that these signaling pathways involve at least two proteins. The first protein, the SK, autophosphorylates a histidine residue in response to a specific signal. The SK then transfers the phosphoryl group to an aspartate residue on the RR.
These discoveries eventually led to the realization that these systems are modular and involve domains that are conserved across many bacteria. These systems are used in both primary and secondary metabolism (Chapter 13), and are common building blocks for synthetic regulators because of their modularity (Chapter 14). The regulation of transcription also involves small molecules such as guanosine 3′,5′-bisphosphate (ppGpp), which mediates the stringent response during starvation through a very complex mechanism that involves both direct and indirect effects. Furthermore, transcriptional regulation is modulated by a few nucleoid proteins, such as factor for inversion stimulation (FIS) and integration host factor (IHF). These histone-like proteins tend to bind to overlapping sites in intergenic regions associated with the binding of RNA polymerase, and are not known to respond to any particular signals. Although
their functions are incompletely characterized, they are believed to fine-tune gene expression under different growth conditions. Chapter 12 provides an overview of these basic paradigms in E. coli and uses the promoter for acs (the gene encoding acetyl coenzyme A synthetase) as an example of complex regulation in real life.
In addition to the regulation used in primary metabolism, growth, adaptation, and development, similar paradigms are also seen in the regulation of secondary metabolism (Chapter 13). Secondary metabolites are a class of compounds that are not essential for bacterial growth but are extremely important as bactericides and fungicides and for other pharmaceutical applications. Chapter 13 provides a detailed summary of known regulatory mechanisms in secondary metabolism, particularly for Streptomyces. Like other genes, secondary metabolic genes are regulated by both pathway-specific regulators and global regulators. The best-known pathway-specific regulators belong to the family of Streptomyces antibiotic regulatory proteins (SARP) and the large ATP-binding regulators of the LuxR family (LAL). Global regulation involves the typical prokaryotic two-component systems and other nutritional regulation, such as cAMP- and ppGpp-mediated regulation. The genomic sequence of S. coelicolor revealed more than 60 pairs of sensor kinases and response regulators used in two-component systems. At least five of them have been shown to be involved in the synthesis of secondary metabolites. Of particular interest are the unique modes of autoregulation involving γ-butyrolactones, the translational control, and the light induction of carotenoid biosynthesis gene clusters found in Streptomyces. The diffusible autoregulators, γ-butyrolactones, are similar to the autoinducers (homoserine lactones) used in quorum sensing, but they do not function as an indicator of population density.
Instead, the γ-butyrolactones are synthesized in a particular phase and induce secondary metabolism. They may also serve as cell-cell communication mediators. The rare translational control in Streptomyces takes advantage of the high GC content of this organism. Because of the high GC content, the leucine TTA codon is rare, and thus the tRNA that recognizes this codon becomes a regulator of genes that contain this codon. The light induction of carotenoids in Streptomyces involves a sigma factor, which in turn is regulated by a photo-modulated transcription factor, LitR. These features are not seen in E. coli and broaden the paradigms for prokaryotic transcriptional regulation.
To a metabolic engineer, ironically, the benefit of understanding cellular regulation is to destroy it. To date, the most common form of metabolic engineering requires inactivating native regulatory mechanisms so that the flux of interest can be amplified. This approach has brought about significant success thus far. However, more sophisticated production of biochemicals requires coordination of multiple substrates while maintaining the cell's native metabolism for cofactor balance and ATP production. In this regard, rewiring cellular regulation to achieve a complex phenotype, such as oscillation or toggle switching, may become useful to metabolic engineering. Chapter 14 discusses a synthetic approach to studying regulatory networks. This approach builds non-native regulatory networks using existing, well-characterized components as building blocks. The goal is both to achieve complex synthetic phenotypes and to help characterize native regulation. This “learning-by-building” approach contrasts with the traditional “learning-by-breaking” reductionist approach, and provides a new frontier to extend existing metabolic engineering practice. The synthetic approach requires both biological insight and some mathematical competence for quantitative analysis of the circuit.
Many fascinating synthetic regulatory circuits have been demonstrated so far. Remarkably, these synthetic circuits behaved as designed, suggesting that the underlying mechanisms of regulation are sufficiently understood. The philosophy of designing synthetic circuits parallels that of other engineering designs. In fact, the synthetic approach has been used by metabolic engineers to build non-native pathways. The integration of synthetic regulatory circuits and synthetic pathways is a promising approach, as demonstrated by Farmer and Liao (2000). Many more unsuccessful synthetic circuits remain unreported, just as in the early days of electronic circuit design. The primary difficulty lies in the incomplete understanding of cellular metabolism and regulation, despite the significant progress brought about by the “omic” technologies. Metabolic engineering will benefit from a more quantitative understanding of native regulation and principles for its
design. The three chapters in this section provide an overview of regulation from a metabolic engineering standpoint. Not included in this section are important subjects such as bioinformatics and systems biology, which are covered elsewhere.
References
Brown, C.T. and Callan, C.G. Jr. 2004. Evolutionary comparisons suggest many novel cAMP response protein binding sites in Escherichia coli. Proc. Natl. Acad. Sci. USA, 101(8):2404–2409.
Farmer, W.R. and J.C. Liao. 2000. Improving lycopene production in Escherichia coli by engineering metabolic control. Nature Biotechnol., 18:533–537.
Pardee, A.B., Jacob, F., and Monod, J. 1959. The genetic control and cytoplasmic expression of “Inducibility” in the synthesis of beta-galactosidase by E. coli.
Salmon, K., Hung, S., Mekjian, K., Baldi, P., Hatfield, G.W., and Gunsalus, R.P. 2003. Global gene expression profiling in Escherichia coli K12: the effects of oxygen availability and FNR. J. Biol. Chem., 278:29837–29855.
12 Transcribing Metabolism Genes: Lessons from a Feral Promoter
Alan J. Wolfe, Loyola University at Chicago
12.1 The Fundamentals of Eubacterial Transcription
RNAP • The Transcription Pathway • Promoter Recognition • Hard-Wired Regulation • Adaptive Regulation • Signaling • Simple Activation • Complex Regulation
12.2 Regulation of a Native Metabolic Promoter
Acetyl Coenzyme A Synthetase • acs Operon and Promoter Architecture • Repression of acsP1 • Activation of acsP2 • Modulation of CRP-Dependent Activation by Nucleoid Proteins
12.3 Concluding Remarks
Acknowledgments
References
12.1 The Fundamentals of Eubacterial Transcription In this section, the metabolic engineer will become acquainted with the molecular machine that drives transcription (RNA polymerase (RNAP)); the steps of the transcription initiation pathway; promoter architecture; “hard-wired” and “adaptive” regulation; proximal signaling processes; simple and complex regulatory mechanisms; and the roles of nucleoid proteins.
12.1.1 RNAP
The expression of any gene product begins with transcription, the synthesis of an RNA polymer from a DNA template by RNAP. The core (E) of this molecular machine is a large multi-subunit complex of about 400 kDa. It is composed of five protein subunits: α1, α2, β, β′, and ω. Each α subunit consists of two domains attached by a flexible linker. The N-terminal domains (α-NTD) of the two α subunits dimerize and, as such, serve as a nucleation site to assemble the rest of the complex; they also can serve as docking sites for transcription factors. The C-terminal domains of each α (α-CTD) help anchor RNAP to the DNA; they also can serve to dock transcription factors. The large β and β′ subunits form the catalytic center of the enzyme and bind to DNA nonspecifically, an ability that permits elongation of the nascent transcript [18,97,98]. For years, the small ω subunit remained an enigma, but now it appears to function as a chaperone that promotes assembly and restores denatured RNAP to its functional form [92]. To bind specific sequences called promoters, the core enzyme (E) requires an additional subunit (σ), of which there are several varieties (see below). The binding of a σ variant to E greatly reduces
the affinity of the core enzyme for nonspecific DNA, while simultaneously enhancing specificity for promoters recognized by that particular σ variant. Transcription initiation by eubacteria, therefore, depends upon the prior formation of the holoenzyme (Eσ),* whose structure includes a channel 25 Å in diameter and 55 Å in length. These dimensions fit well with those of double-stranded DNA. The 20 Å wide DNA fits within the 25 Å diameter channel, while the 55 Å length can accept 16 nucleotides, the approximate length of the elongation transcription “bubble” [18,32,57,97].
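The fit between these channel dimensions and the DNA can be checked with simple arithmetic. The sketch below assumes the canonical B-DNA axial rise of 3.4 Å per base pair, a textbook value that is not quoted in this chapter; the channel and DNA dimensions are those given above.

```python
# Dimensions quoted in the text (angstroms)
CHANNEL_DIAMETER = 25.0
CHANNEL_LENGTH = 55.0
DNA_DIAMETER = 20.0

# Assumption: canonical B-DNA axial rise per base pair (angstroms)
RISE_PER_BP = 3.4

bubble_nt = 16  # approximate length of the elongation "bubble"

# Length occupied by 16 nucleotides at B-DNA spacing
occupied_length = bubble_nt * RISE_PER_BP

# Largest whole number of nucleotides that fits along the channel
max_nt_in_channel = int(CHANNEL_LENGTH // RISE_PER_BP)

dna_fits_width = DNA_DIAMETER < CHANNEL_DIAMETER

print(occupied_length, max_nt_in_channel, dna_fits_width)
```

Under this assumption, 16 nucleotides span about 54 Å, which agrees with the 55 Å channel length stated in the text.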
12.1.2 The Transcription Pathway
The transcription process is a complex pathway that involves several intermediates (Figure 12.1). It begins when the holoenzyme (Eσ) binds to the promoter (a specific DNA sequence) located upstream (5′) of the gene to be transcribed. The resultant binary RNAP-DNA complex is called the transcription closed complex (TCC). For RNA synthesis to proceed, the duplex DNA around the transcription initiation site (+1) must unwind, a process known as isomerization. With one notable exception†, the energy used to melt the duplex DNA comes not from ATP (as in eukaryotic transcription), but instead from the breaking of noncovalent bonds (established during TCC formation) between amino acids on the surface of RNAP and the nucleotides of the promoter DNA. The resulting transcription open complex (TOC) can now accept the initiating ribonucleoside triphosphate (rNTP) and the adjacent rNTP, forming a phosphodiester bond between them. This ternary RNAP-DNA-RNA complex (called the transcription initiation complex or TIC) continues the synthetic process, adding several rNTPs to the growing chain. Each of
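[Figure 12.1 schematic: E + σ → Eσ; Eσ + promoter (Pro) → TCC (initial binding) → TOC (isomerization) → TIC (initiation, entry of NTP, abortive transcripts AT) → TEC (promoter escape, release of σ). The holoenzyme is drawn as α2ββ′σ positioned over the –35 and –10 elements and the +1 site; the elongating core is drawn as α2ββ′.]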
Figure 12.1 The transcription initiation pathway. This schematic provides a simplified version of the essential steps of initiation of transcription by eubacteria. The core enzyme (E) interacts with the sigma factor (σ) to form the holoenzyme (Eσ), which binds to the promoter (Pro) to form the binary RNAP-DNA transcription closed complex (TCC). With energy provided by the breaking of bonds between RNAP and DNA, the TCC isomerizes or melts into the binary RNAP-DNA transcription open complex (TOC). Now ribonucleotides (NTP) can enter the “transcription bubble” to begin polymerization. The result is the ternary RNAP-DNA-RNA transcription initiation complex (TIC). Often, the short nascent transcripts are aborted (AT) and the TIC reforms. Escape from the promoter requires that the TIC make an irreversible conversion to the highly processive TEC, the ternary RNAP-DNA-RNA transcription elongation complex. This conversion involves a major conformation change and release of σ.
* This contrasts with eukaryotic transcription, whose σ analog (the TATA binding protein, TBP) first assembles on the promoter and then recruits the catalytically active portion of RNAP [124].
† Like eukaryotic RNAP, EσN (also known as Eσ54) requires ATPase activity to melt duplex DNA. This activity is provided by the NtrC family of transcription factors, each of which binds specifically to an upstream DNA sequence. Once bound, the transcription factor oligomerizes, in the process gaining the required ATPase activity [41,116].
these steps are reversible, including the TIC. Reversal of this step results in release of short aborted transcripts. Success at this step causes the complex to undergo a radical and irreversible conformational change. This change includes the loss of σ and the expansion of the “transcription bubble” from 12 to 16 nucleotides. The resultant transcription elongation complex (TEC) is extremely processive and, as such, is responsible for RNA-chain extension [117]. Each of these steps represents an opportunity for regulation. The stabilization or destabilization of any pathway intermediate can either increase or decrease transcription, depending upon context. It is generally accepted that the major regulated steps are formation of the TCC, isomerization to the TOC, and the initial stages of transcript synthesis [23].
12.1.3 Promoter Recognition
The critical step in transcription is recognition of the promoter by the holoenzyme (Eσ). For the purpose of this chapter, I will describe recognition by the major eubacterial σ factor (σ70 in E. coli and related organisms, σA in others such as the Gram-positive Bacillus subtilis). Although other σ factors recognize distinct sequences, the fundamentals are similar. Five DNA sequence (cis) elements responsible for promoter recognition by σ70 have been identified and studied extensively (Figure 12.2): the –10 and –35 hexamers, the –16 element (also known as the extended –10 element), the discriminator, and the UP element [28,30,57,61,107,133]. The –10 hexamer is the principal recognition element. Located approximately 10 base pairs (bp) upstream of the transcription initiation site (+1), it is recognized by region 2.4 of the multidomain σ. It is here that melting of the duplex DNA begins; to facilitate this process the sequence must be AT-rich (hence, the consensus of TATAAT for σ70). After formation of the TCC, the duplex between about positions –10 and +2 unwinds to form a “transcription bubble.” The resultant complex is called the TOC [140,144]. This isomerization remains poorly understood, but it is known to involve the binding of the nontemplate strand to region 2.3 of σ and the movement of the now free template strand into the active site of RNAP so that RNA synthesis can begin [23].
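[Figure 12.2 schematic: RNAP (ββ′, α-NTD, α-CTD, σ70) aligned with the promoter elements. The α-CTD contacts the UP element (nnAAA(A/T)(A/T)T(A/T)TTTTnnAAAAnnn); σ region 4.2 contacts the –35 hexamer (TTGACA); σ region 3 contacts the –16 element (TGTGn); σ region 2.4 contacts the –10 hexamer (TATAAT); σ region 1.2 contacts the discriminator (D) upstream of +1. The –35 and –10 hexamers are separated by a spacer of 17 ± 1 nucleotides.]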
Figure 12.2 The promoter elements and interactions with RNAP. The five primary promoter elements of a σ70-dependent promoter are the A/T-rich UP element, the –35 hexamer, the –16 element, the –10 hexamer, and the discriminator. The UP element interacts with the α-CTD, while the –35 hexamer, –16 element, –10 hexamer, and discriminator contact regions 4.2, 3, 2.4, and 1.2 of σ, respectively. The spacing between the –35 and –10 hexamers is, on average, 17 bp. The spacing between the –10 hexamer and the transcription initiation site (+1) is, on average, 7 bp.
Like the –10 hexamer, the –35 hexamer is recognized by σ, in this case by region 4.2. With few exceptions, it is separated from the –10 hexamer by 17 ± 1 bp. The –16 element, a 3–4 bp motif adjacent to the –10 hexamer, is recognized by region 3 of σ [80,95,98,127]. A promoter that includes both the –10 hexamer and the –16 element is called an extended –10 promoter. The discriminator is the sequence that separates the –10 hexamer and the +1. Typically 7 bp in length, it is recognized by region 1.2 of σ and influences the rate of isomerization [60,61,107]. In contrast to the previously described DNA sequences, the UP element is bound by the α-CTD. It is an AT-rich sequence, about 20 bp in length, located upstream of the –35 hexamer. Since each half-site can independently bind an α-CTD, the UP element can be present with both half-sites adjacent to each other or as an isolated half-site [38,45,120,121].
The primary role of these five cis elements is to set the probability that the holoenzyme will dock with any given promoter. The E. coli cell is predicted to possess about 4600 promoter sequences [125]. None of these promoters possesses perfect versions of all five elements; such a promoter would bind so tightly that RNAP would be unable to escape. Instead, promoters that function most effectively possess near-consensus sequence elements [139]. Indeed, from promoter to promoter, the relative contribution of each element differs. Weak consensus in one element, or the lack of that element altogether, can be compensated by strong consensus in another [23]. For example, promoters that possess consensus –10 and –16 elements generally do not also possess the –35 hexamer [133], while a promoter with consensus –35 and –16 elements possesses a –10 hexamer with a poor match to consensus [64].
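The idea of scoring a promoter by its match to consensus can be illustrated with a small scan for the –35 and –10 hexamers separated by a 17 ± 1 bp spacer. This is a deliberately naive sketch under stated assumptions: it counts identical bases only, ignores the –16 element, discriminator, and UP element, and the function names and the demonstration sequence are hypothetical. Real promoter prediction relies on weighted position-specific models.

```python
CONSENSUS_35 = "TTGACA"   # -35 hexamer consensus for sigma70
CONSENSUS_10 = "TATAAT"   # -10 hexamer consensus for sigma70
SPACERS = (16, 17, 18)    # allowed spacer lengths, 17 +/- 1 bp


def count_matches(site, consensus):
    """Number of positions at which site and consensus are identical."""
    return sum(a == b for a, b in zip(site, consensus))


def best_promoter(seq):
    """Return (score, start_35, spacer, start_10) for the best hexamer pair.

    Positions are 0-based indices into seq; score ranges from 0 to 12.
    Returns None if seq is too short to hold both hexamers and a spacer.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        for spacer in SPACERS:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                continue
            score = (count_matches(seq[i:i + 6], CONSENSUS_35)
                     + count_matches(seq[j:j + 6], CONSENSUS_10))
            if best is None or score > best[0]:
                best = (score, i, spacer, j)
    return best


# Hypothetical test sequence with perfect hexamers and a 17-bp spacer.
# As noted in the text, natural promoters do not match consensus this well.
demo = "GC" + CONSENSUS_35 + "G" * 17 + CONSENSUS_10 + "GCGCGCA"
result = best_promoter(demo)
print(result)
```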
12.1.4 Hard-Wired Regulation
When considering the regulation of transcription initiation, one must remember that RNAP (Eσ) is present in limited quantities. On average, much of that RNAP is either busy transcribing genes that encode the stable RNAs required for translation (i.e., rRNA and tRNA) [67,69] or is bound nonproductively to the genome [46,78,129]. The remaining RNAP must transcribe the vast majority of genes [67], and the differential distribution of RNAP between promoters depends upon the differences in their sequences. For example, the promoters that drive stable RNA transcription are strong primarily because they possess consensus UP elements, which stabilize TCC formation by binding the α-CTD [45].
Competition between σ factors also dictates differential RNAP distribution. Most bacteria possess multiple σ factor variants [56,58]. For example, E. coli possesses seven distinct σ factors: σ70, σH, σE, σS, σN, σF, and σFecI. The primary σ factor, σ70, allows RNAP to recognize and bind the vast majority of promoters. Produced constitutively, σ70 drives transcription of genes dedicated to everyday cellular processes; thus, σ70 is sometimes called the “housekeeping” σ factor [106]. In contrast, accumulation of the other six σ factors depends upon exposure to specific environmental signals or stresses [67]. Accumulation of these alternative σ factors drives competition with σ70 for the core enzyme (E). Each resultant holoenzyme (Eσ) can initiate transcription only from the subset of promoters that possess its specific sequence elements [85]. This specificity provides cells with the opportunity to activate whole batteries of genes in response to a given specific signal or stress. For example, σH accumulates in response to heat shock, σE in response to envelope stress, σS during entry into stationary phase, σN during nitrogen starvation, σF to construct a flagellum, and σFecI when the cells require iron [63].
The ensuing competition for core enzyme (E) produces specific holoenzymes (Eσ) that can transcribe genes necessary to cope with the unique challenges associated with each type of signal or stress [1,3,122]. Cells limit the supply of σ factors; thus, intense competition also can ensue between different promoters for their required brand of holoenzyme [58,67,85,96]. Cells regulate the availability of alternative σ factors by a variety of mechanisms that operate at several levels, including transcriptional, translational, and post-translational. For example, post-translational control can be applied by an anti-σ factor, which sequesters the σ from the core enzyme [62,65]. Examples include FlgM, which sequesters the flagellar-specific σF, and RseA, which sequesters σE.
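The competition between σ factors for a limited pool of core enzyme can be illustrated with a toy equilibrium model. The sketch below is an assumption-laden illustration, not a model from this chapter: it treats each σ as a competitive ligand for a single site on the core, uses free σ concentrations, and ignores anti-σ factors and promoter binding. The function name, concentrations, and dissociation constants are hypothetical and in arbitrary units.

```python
def holoenzyme_fractions(free_sigma, k_d):
    """Fraction of core enzyme bound by each sigma factor at equilibrium.

    Standard competitive-binding expression for one site:
        f_i = (c_i / K_i) / (1 + sum_j c_j / K_j)
    free_sigma: dict of free sigma concentrations (arbitrary units).
    k_d:        dict of dissociation constants for core binding (same units).
    """
    weights = {name: conc / k_d[name] for name, conc in free_sigma.items()}
    denom = 1.0 + sum(weights.values())
    return {name: w / denom for name, w in weights.items()}


# Hypothetical dissociation constants (assumption: equal affinities).
K_D = {"sigma70": 1.0, "sigmaS": 1.0}

# Hypothetical exponential-phase cell: no sigmaS present.
growing = holoenzyme_fractions({"sigma70": 10.0, "sigmaS": 0.0}, K_D)

# Hypothetical stationary-phase cell: sigmaS has accumulated.
stationary = holoenzyme_fractions({"sigma70": 10.0, "sigmaS": 5.0}, K_D)

print(growing)
print(stationary)
```

In this toy picture, accumulation of the alternative σ factor lowers the share of core enzyme held by σ70 even though the amount of σ70 is unchanged, which is the qualitative behavior described in the text.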
Transcribing Metabolism Genes: Lessons from a Feral Promoter
12.1.5 Adaptive Regulation
The differences that are hard-wired into the DNA sequence (cis elements) can only set the range of possible promoter activities. Cells use two distinct mechanisms to modulate promoter activity precisely in response to specific environmental cues. One mechanism operates directly upon RNAP. The other utilizes trans-acting mediators, proteins called transcription factors.
The direct mechanism depends upon the small molecule guanosine 3',5'-bispyrophosphate (ppGpp). Synthesized when amino acids become scarce enough to limit translation [50,86] and apparently whenever the cell enters a substantial nutritional downshift [142], ppGpp acts by destabilizing the already unstable open complexes that tend to form at promoters responsible for the synthesis of the translational machinery. This process, performed in concert with the protein DksA [87,108,123], increases the pool of RNAP available to transcribe promoters responsible for the response to the downshift [8,9]. This mechanism permits a rapid and efficient response [86,101].
The indirect mechanism involves the binding of a transcription factor in the vicinity of a promoter. Such binding can either activate or repress transcription initiation. Because promoters with sub-optimal sequence elements are weak, they are particularly amenable to activation. In contrast, strong promoters with near-consensus sequence elements tend to be controlled by repression. Many promoters are subject to both. Although some transcription factors function solely as an activator or strictly as a repressor, others can act as either an activator or a repressor, with the precise role at any given promoter depending upon the location of the DNA binding site [109]. By diminishing the affinity of a given promoter for RNAP, a repressor reduces the activity of that promoter. Repressors can achieve this effect by diverse mechanisms.
These include steric hindrance, wherein the repressor binds to a DNA site that overlaps the promoter, thereby inhibiting the binding of RNAP (Figure 12.3a); promoter occlusion, in which multiple transcription factors interact to form a DNA loop small enough to deny access of RNAP to the promoter (Figure 12.3b); and over-stabilization of the RNAP-DNA complex, poising it in an unproductive state [2,23,78,81,115,128,129]. In contrast, activators enhance promoter efficiency by improving affinity. It appears that most activators, e.g., CRP and FNR, act by binding a DNA site near a target promoter (Figure 12.3c). From this site, the activator recruits the appropriate holoenzyme (e.g., Eσ70). In most cases, recruitment is mediated by one or more interactions between surfaces of the activator and the polymerase (see Section 12.1.7). A subset of activators, e.g., SoxS and MarA, first interact with free holoenzyme (Eσ) and then bind the promoter as a complex [53,54,89]. Like σ factors, these activators appear to “reserve” a proportion of the available RNAP molecules [23].
Strictly speaking, activation and repression refer to those mechanisms that control basal transcription, i.e., transcription driven strictly by the binding of RNAP in the absence of accessory transcription factors. Variations on the theme do exist, however. For example, transcription from an activated promoter can be diminished by inhibiting the function of the activator. This mechanism is called antiactivation (Figure 12.3d). Antiactivation can occur by steric hindrance of activator binding or by blocking the ability of the bound activator to recruit RNAP. Antirepression can occur by similar mechanisms.
More than 300 E. coli genes (about 7% of the entire genome) are predicted to encode transcription factors [126]. Most bind DNA in a sequence-specific manner. This specificity ensures that their activities become targeted to particular subsets of promoters.
Most transcription factors can be categorized as either gene-specific or global. Gene-specific transcription factors, e.g., the Lac repressor LacI, control one or a small number of promoters. For example, about 60 transcription factors control a single promoter. In contrast, global transcription factors modulate the activities of large numbers of promoters. Seven proteins (CRP, FNR, ArcA, NarL, IHF, Fis, and Lrp) are estimated to control 50% of all regulated genes [90]. Transcription factors can also be grouped on the basis of sequence analysis of their DNA-binding domains. To date, 75 families have been identified [109]. Of these, the best characterized are the CRP, OmpR, AraC, LacI, and LysR families.
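The contrast between repression by steric hindrance and activation by recruitment can be made concrete with a simple equilibrium occupancy calculation. The sketch below is an illustration under stated assumptions rather than a model from this chapter: it treats binding as being at equilibrium, expresses each protein's concentration relative to its dissociation constant (`p` for RNAP, `r` for a repressor, `a` for an activator), and uses a hypothetical cooperativity factor `omega` to represent the favorable activator-RNAP contact.

```python
def occupancy_steric_repression(p, r):
    """Probability that RNAP occupies the promoter when a repressor competes
    for an overlapping site (the two proteins cannot bind together).

    States and weights: empty (1), RNAP bound (p), repressor bound (r).
    """
    return p / (1.0 + p + r)


def occupancy_recruitment(p, a, omega):
    """Probability that RNAP occupies the promoter when an activator binds a
    nearby site and contacts RNAP.

    States and weights: empty (1), RNAP (p), activator (a),
    both bound (p * a * omega), where omega > 1 models recruitment.
    """
    both = p * a * omega
    return (p + both) / (1.0 + p + a + both)


# Hypothetical weak promoter: RNAP alone rarely occupies it.
weak = 0.01
print("basal:", occupancy_recruitment(weak, 0.0, 50.0))
print("activated:", occupancy_recruitment(weak, 10.0, 50.0))

# Hypothetical strong promoter: a repressor competes RNAP away.
strong = 10.0
print("unrepressed:", occupancy_steric_repression(strong, 0.0))
print("repressed:", occupancy_steric_repression(strong, 100.0))
```

With these illustrative numbers the weak promoter gains the most from recruitment and the strong promoter loses the most to a competing repressor, in keeping with the observation that weak promoters tend to be activated and strong promoters tend to be repressed.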
Bacterial Transcriptional Regulation of Metabolism
[Figure 12.3 graphic: four panels, (a) through (d), with labels RNAP, TF, TFBS, UP, –35, and –10.]
Figure 12.3 Mechanisms of transcription regulation. Simple repression can occur by (a) steric hindrance, in which a transcription factor (TF) inhibits binding of RNAP by binding to a site (TFBS) that overlaps with one or more promoter elements. Repression can also occur by (b) promoter occlusion, in which a transcription factor causes looping that occludes RNAP. Activation by recruitment (c) occurs because a transcription factor binds to a DNA site near to the promoter. By contacting a surface of RNAP, the transcription factor recruits RNAP to the promoter. Antiactivation (d) occurs because one transcription factor causes steric hindrance by binding to a DNA site that overlaps with the DNA site for the activator.
12.1.6 Signaling
The primary role of transcription factors is to couple gene expression to specific environmental signals [91,130]. This coupling can be achieved by regulating their concentration or by modulating their activity.
Several distinct mechanisms can control the cellular concentration of a transcription factor. The concentration can be determined by regulating expression at the level of transcription or of translation. Concentration can also be determined by proteolysis. For example, the concentration of the transcription factor SoxS controls one arm of the response to oxidative stress. The concentration of SoxS is controlled both at the level of expression and by proteolysis. First, SoxS expression is controlled by another transcription factor (SoxR) that is directly activated by oxidizing ligands (Figure 12.4a) [39]. Second, SoxS stability is controlled by the ATP-dependent protease Lon (Figure 12.4b) [52,131].
Several distinct mechanisms regulate activity, including allostery, covalent modification, and sequestration. Allostery occurs when a transcription factor becomes bound by a small ligand, whose concentration fluctuates in response to the availability of a nutrient or the exposure to stress (Figure 12.4c). Such binding alters the DNA-binding affinity. For example, the concentration of allolactose signals the presence of lactose in the environment. The binding of this small molecule to the Lac repressor reduces the affinity of this transcription factor for its DNA site, which overlaps the lac promoter. Although
[Figure 12.4 graphic: five panels, (a) through (e), with labels Ox Rad, SoxR, SoxS, soxS, Lon, Signal, LacI, CRP, lacZ, TFBS, SK, RR, P, α-TF, TF, and target.]
Figure 12.4 Mechanisms of signaling. The concentration of a transcription factor (SoxS) can be influenced by (a) signals (e.g., oxygen radicals) that control the transcription of its gene or by (b) influencing the stability of the protein through the action of a protease, e.g., Lon. The activity of a transcription factor can be influenced by (c) allostery. For example, the binding of allolactose (closed triangle) to the Lac repressor (LacI) diminishes the affinity of LacI for its binding site. Similarly, when cAMP (checked triangle) binds to the cAMP receptor protein (CRP), the affinity of CRP for its DNA site increases, permitting activation of transcription. (d) The activity of a transcription factor can be modified by covalent modification. For example, a signal can drive a two-component sensor kinase (SK) to autophosphorylate. The phosphorylated SK then acts as a phosphoryl donor for the autophosphorylation of the cognate response regulator (RR), which can now bind its binding site and activate transcription. (e) The activity of a transcription factor can be controlled by sequestration. For example, an antitranscription factor (α-TF) can bind to a transcription factor, keeping it from binding to its DNA site.
the loss of Lac repressor binding potentiates the promoter for activation, it does not actively increase lac transcription. This requires the binding of another small molecule (cyclic AMP or cAMP) to the cAMP receptor protein (CRP, also known as CAP). The consequent conformational change increases DNA binding affinity; this permits the cAMP-CRP complex to occupy its binding site near the lac promoter. From this site, the small molecule-transcription factor complex recruits RNAP by contacting the α-CTD. Because the concentration of cAMP is inversely proportional to the concentration of glucose in the environment, lac transcription depends upon both the presence of lactose and the absence of glucose [79,148]. Covalent modification can either increase or decrease the activity of a transcription factor (Figure 12.4d). This mechanism is exemplified by two-component signal transduction (2CST). A typical 2CST pathway consists of two components: a sensor kinase (SK) and a response regulator (RR). The sensor kinase, which can be located in the cytoplasm or within the cytoplasmic membrane, autophosphorylates in response to a specific signal. The phosphorylated SK then donates its phosphoryl group to the RR. Phosphorylation typically results in increased binding affinity [136,147]. For example, the RR PhoP
is controlled by the SK PhoQ, which becomes phosphorylated in response to reduced Mg2+ concentration, exposure to acidic pH, or increased amounts of cationic antimicrobial peptides [6,55,59,112].
Sequestration regulates the effective concentration of a transcription factor. One way to sequester a transcription factor is to bind it with another protein (Figure 12.4e). For example, the anti-σ factor FlgM sequesters σF, while the integral cytoplasmic membrane protein RseA sequesters σE [33,62,65]. Similar strategies can work for transcription factors. For example, the integral cytoplasmic membrane protein Enzyme IICB sequesters the transcription factor Mlc [111]. Another strategy is to recruit the transcription factor directly into close association with the cytoplasmic membrane. For example, the bifunctional PutA protein acts either as a proline catabolic enzyme or as an autogenous transcriptional repressor, depending upon the availability of proline [105].
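The two allosteric inputs at the lac promoter described in this section combine into a simple logic: lactose relieves repression by LacI, and the absence of glucose permits activation by cAMP-CRP. The sketch below encodes that qualitative logic only. The function name and the state labels are illustrative choices, and real promoters respond in a graded rather than an all-or-none fashion.

```python
def lac_promoter_state(lactose_present, glucose_present):
    """Qualitative state of the lac promoter from two environmental inputs.

    Assumptions (simplified from the text): allolactose is present whenever
    lactose is, and cAMP is high whenever glucose is absent.
    """
    laci_on_dna = not lactose_present    # allolactose lowers LacI affinity for its site
    crp_on_dna = not glucose_present     # cAMP-CRP occupies its site near the promoter

    if laci_on_dna:
        return "repressed"               # LacI overlaps the promoter and blocks RNAP
    if crp_on_dna:
        return "activated"               # CRP recruits RNAP via the alpha-CTD
    return "derepressed"                 # potentiated, but not actively increased


for lactose in (False, True):
    for glucose in (False, True):
        print(lactose, glucose, lac_promoter_state(lactose, glucose))
```

Only the combination of lactose present and glucose absent yields the activated state, which restates the dependence of lac transcription on both signals.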
12.1.7 Simple Activation
Activation of transcription can be simple, involving the action of a single activator. One simple mechanism of activation alters the promoter conformation, enabling RNAP to interact with the –10 and/or –35 hexamers (Figure 12.5a). This mechanism generally requires that the activator bind in close vicinity to the promoter elements [23]. For example, promoters activated by members of the MerR transcription factor family possess sub-optimal spacing between the –10 and –35 hexamers. The binding of a
[Figure 12.5 graphic: three panels, (a) through (c), with labels RNAP, TF, –35, –10, –60.5, –41.5, and activation regions 1, 2, and 3.]
Figure 12.5 Simple activation. Activation can occur by conformational change, in which a transcription factor optimizes a promoter with sub-optimal spacing between its cis elements (a). Activation can also occur by a Class I mechanism, in which a transcription factor (e.g., CRP) binds to an upstream site centered near position –61, –71, –81, or –91 and interacts through activation region 1 (AR1, 1) with an α-CTD (b). Activation can occur via a Class II mechanism, in which the activator binds a site near the –35 hexamer and interacts with RNAP via two surfaces, AR1 (1) and activation region 2 (AR2, 2) (c). With some proteins (e.g., FNR), a third surface called activation region 3 (AR3, 3) can interact with σ region 4.2.
MerR-type activator to this “spacer” sequence twists the DNA, reorienting the –10 and –35 hexamers. This reorientation facilitates the binding of the holoenzyme via its σ subunit [19,100].
Another, more common, mechanism does not alter promoter conformation, but rather recruits RNAP to a promoter and/or stabilizes it there. These activators tend to bind upstream of the transcription start site and make direct contact with RNAP [114]. They fall into two classes (Class I and Class II) on the basis of the location of their binding sites. Class I activators enhance transcription by making direct contact with the α-CTD of RNAP, thereby stabilizing the TCC (Figure 12.5b) [23,29,30]. Because a flexible linker connects the α-CTD and α-NTD, Class I activators can bind to several upstream locations, near positions –61, –71, –81, or –91 relative to the transcription start site. Depending upon context, the two α-CTDs can bind next to the σ70 subunit of RNAP or either upstream or downstream of the activator [77]. In contrast, Class II activators have a more restricted subset of sites, which overlap the –35 hexamer [23,29]. They activate transcription by contacting domain 4.2 of σ70 and/or the α-NTD (Figure 12.5c) [42]. At some promoters, Class II activators also make productive contacts with the α-CTD, which often binds upstream of the bound activator [118]. Class II activators can stabilize the TCC and enhance its isomerization to the TOC [30].
Arguably, the best-studied simple transcription factor is CRP, which becomes activated to bind DNA only when bound by cAMP. The active cAMP-CRP complex (hereafter referred to as CRP) can function either as a Class I or a Class II activator. As a Class I activator, it uses a well-characterized surface determinant called AR1 (activating region 1) to contact the α-CTD (Figure 12.5b). As a Class II activator, it uses another surface determinant called AR2 to contact the α-NTD (Figure 12.5c).
FNR is a paralog of CRP; in addition to AR1 and AR2, it possesses a third surface (AR3), which permits it to also contact region 4.2 of σ70. In CRP, this surface is masked; certain mutants remove the mask, exposing this nonnative surface [29,76].
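Because the two classes are defined by where the activator binds relative to the transcription start site, a binding-site position can be sorted with a few lines of code. The sketch below is a rough illustration only: the Class I centers are the four positions listed in this section, the Class II center of –41.5 is taken from Figure 12.5, and the tolerance of 2 bp is an assumption introduced here. Sites farther upstream, such as those discussed later in the chapter, fall outside this simple list.

```python
# Site centers relative to the transcription start site (+1).
CLASS_I_CENTERS = (-61.0, -71.0, -81.0, -91.0)   # positions named in the text
CLASS_II_CENTER = -41.5                          # overlaps the -35 hexamer


def classify_activator_site(center, tolerance=2.0):
    """Label a binding-site center as Class I, Class II, or unclassified.

    tolerance (bp) is a hypothetical allowance for sites described as
    lying "near" the listed positions.
    """
    if abs(center - CLASS_II_CENTER) <= tolerance:
        return "Class II"
    if any(abs(center - c) <= tolerance for c in CLASS_I_CENTERS):
        return "Class I"
    return "unclassified"


for position in (-41.5, -60.5, -69.5, -15.0):
    print(position, classify_activator_site(position))
```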
12.1.8 Complex Regulation
Promoters can be complicated [10]. At some promoters, a single transcription factor can activate or repress transcription. At other promoters, multiple signals must be integrated. At some promoters (e.g., lac), a gene-specific repressor and a global activator work in succession to ensure proper regulation. At a third set of promoters, however, proper regulation requires the codependence of two or more transcription factors. This codependence can arise through several distinct mechanisms. A simple mechanism involves the repositioning of one transcription factor by another: the second protein either physically moves the first protein along the DNA toward RNAP or bends the DNA to bring the first protein, bound to an upstream site, near to RNAP. In this section, I shall describe three more complex mechanisms: the integration of independent contacts with RNAP, cooperative binding of transcription factors, and codependence due to nucleoid proteins.
12.1.8.1 Integration of Independent Contacts
Activation at Class I and at Class II promoters requires only a single α-CTD [82]. Since each RNAP possesses two α-CTDs, this leaves the second α-CTD free to interact with a second activator bound at a DNA site located further upstream [10]. These independent interactions between two α-CTDs and two transcription factors tend to increase the affinity of RNAP for the promoter, thereby enhancing transcription. This mechanism, sometimes called Class III, was elucidated primarily through the use of semi-synthetic promoters [11,14,31,70,75,82] and then shown to operate in nature. The mechanism was first observed at a semi-synthetic CRP-dependent Class II promoter [31] that could be further activated by the binding of a second CRP at an upstream Class I location. In this scenario (Figure 12.6a), the CRP bound at the downstream site functions in a Class II role, making contact with an α-CTD via AR1 and an α-NTD via AR2.
At the same time, the upstream-bound CRP functions as if it were at a Class I promoter, using AR1 to contact the second α-CTD. The effect is synergistic: the combined effect of both interactions is more than additive [31,70]. Subsequent studies [76,138] showed that both activators can function
[Figure 12.6 graphic: three panels, (a) through (c), with labels RNAP, Class I, and Class II.]
Figure 12.6 Mechanisms of codependence on two activators. Independent contacts between RNAP and each of two transcription factors can activate transcription using two different Class III mechanisms. (a) Synergistic activation can occur when one transcription factor binds to a Class II position and a second factor binds to a Class I position. The two factors can be the same protein or two different proteins. (b) Synergistic activation can occur when both transcription factors bind to Class I positions. Again, the two factors can be the same protein or two different proteins. (c) Activation also can occur through co-operative binding. See Figure 12.7 for details.
from tandem Class I positions, with each activator making contact strictly through the α-CTD-AR1 interaction (Figure 12.6b). In nature, the Class I–Class II combination appears to be more common than the Class I–Class I combination [23]. An example of the former is the E. coli proPP2 promoter, at which CRP bound at position –121.5 acts as a Class I activator, while FIS bound at position –41.5 functions as a Class II activator [93]. An example of the latter is the E. coli acsP2 promoter, where one CRP binds at position –122.5 and the other binds at position –69.5 [12]. Because the contacts are independent, many variations on the theme are possible [29]. FNR, a paralog of CRP, also can activate using Class I or Class II mechanisms [29]. Intriguingly, promoters that carry tandem DNA sites for FNR do not exhibit synergistic activation. Instead, the binding of two FNR proteins results in antiactivation. At the E. coli yfiD promoter, for example, FNR bound at the upstream Class I site cannot contact an α-CTD and therefore cannot enhance activation. Instead, it seems to contact the downstream FNR located at –41.5, a Class II position. This direct FNR-FNR contact prevents the downstream FNR from achieving Class II activation [51]. The upstream site possesses lower affinity for FNR; thus, low concentrations of FNR that permit occupancy of the downstream site only result in activation, while high concentrations that permit occupancy of both sites result in inhibition. This mechanism ensures that the promoter is activated within a narrow range of conditions [88]. Although CRP does not seem to possess the ability to antiactivate transcription by occupation of tandem sites [11], it appears that such behavior can be enforced by the occupation of a nearby IHF site.
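The concentration dependence described for the tandem FNR sites at yfiD, activation at low FNR and inhibition at high FNR, amounts to a band-pass response. The sketch below illustrates this with a deliberately simple model. It assumes that the two sites bind FNR independently, that the promoter is active only when the downstream site is occupied and the upstream site is empty, and that the upstream site has a 100-fold weaker affinity; the function names and all numerical values are hypothetical.

```python
def site_occupancy(conc, kd):
    """Fraction of time a single site is occupied at a given concentration."""
    return conc / (kd + conc)


def tandem_fnr_activity(fnr, kd_downstream=1.0, kd_upstream=100.0):
    """Relative promoter activity for tandem sites of unequal affinity.

    Active state: downstream (Class II) site occupied, upstream site empty.
    Occupancy of the weaker upstream site models antiactivation.
    """
    downstream = site_occupancy(fnr, kd_downstream)
    upstream = site_occupancy(fnr, kd_upstream)
    return downstream * (1.0 - upstream)


for fnr in (0.01, 0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0):
    print(fnr, round(tandem_fnr_activity(fnr), 3))
```

Under these assumptions activity rises, peaks at intermediate FNR, and falls again as the upstream site fills, so the promoter is active only within a limited range of FNR concentrations.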
At the acsP2 promoter, the binding of IHF to a site centered at position –153 causes severe inhibition that requires the occupancy of tandem Class I sites separated by five turns of the helix [129], identified as the optimal separation for inhibition by tandem FNR sites [11].
12.1.8.2 Cooperative Binding of Transcription Factors
Codependence of two transcription factors does not always occur using independent contacts with RNAP (Figure 12.6c). At certain promoters, two proteins can activate transcription cooperatively. In this mechanism, activation depends upon one activator, whose binding depends upon the binding of another. For example, the MelR-activated E. coli melAB promoter possesses two pairs of MelR binding sites (Figure 12.7a; sites 2 and 2' and sites 1 and 1'). It also has a single DNA site for CRP, which sits between the two pairs of MelR sites [15]. Activation requires the binding of MelR to site 2', from which it acts as a Class II activator [47,49]. Occupancy of site 2' by MelR, however, depends upon the binding of CRP, whose binding requires the prior binding of MelR to both sites 1 and 2 (Figure 12.7b). Thus, Class II MelR-dependent activation of melAB transcription actually requires the formation of a stable DNA-protein complex that includes one CRP and four MelR proteins [146].
The construction of a stable multiprotein complex can also be used to inhibit transcription. Whereas direct protein–protein interactions with tandem-bound MelR recruit CRP to function as a coactivator of the melAB promoter, similar direct interactions with tandem-bound CRP recruit CytR to function as a corepressor at certain CRP-dependent CytR-repressed promoters (Figure 12.7c and d) [34,71]. The use of this sort of mechanism (cooperative interaction between different transcription factors) appears to be rare. This rarity likely results from evolutionary pressure against commitment to dedicated interactions [10].
[Figure 12.7 graphic: four panels, (a) through (d), with labels RNAP, MelR, CRP, CytR, and MelR sites 1, 1′, 2, and 2′.]
Figure 12.7 Cooperative interactions can activate or repress a promoter. (a) Co-operative binding of CRP to MelR bound in tandem activates the E. coli melAB promoter. In the absence of CRP, MelR cannot bind to its promoter-proximal site (2′) and the melAB promoter is inactive. Co-operative binding of CRP to the promoter permits MelR to occupy 2′, from where it activates transcription by a Class II mechanism. The binding sites for the α-CTD are unknown. (b) Co-operative binding of CytR to CRP bound in tandem represses the E. coli deoP2 promoter. CRP bound in tandem activates the deoP2 promoter by a Class II mechanism. CytR represses transcription by binding between the tandem CRP sites.
12.1.8.3 Nucleoid Proteins
Codependence can also involve a transcription factor and a nucleoid protein [94]. Like eukaryotic histones, these proteins fold the genome into a highly compacted, highly organized structure, called the nucleoid [36,69,103,104,134]. Like histones, these abundant proteins bind numerous DNA sites and cause transient localized rearrangements of chromosome architecture that can influence transcription [13,94,110]. They tend to bind to overlapping sites in intergenic regions associated with the binding of RNAP [49]. E. coli cells possess 12 nucleoid proteins, including factor for inversion stimulation (FIS) and integration host factor (IHF) [4,68]. These nucleoid proteins are regulated with respect to the physiological status of the cells. For example, FIS becomes the most abundant nucleoid protein upon dilution of cells into fresh medium. Since FIS negatively regulates its own transcription, its pool diminishes throughout exponential growth, becoming undetectable by stationary phase. In contrast, cells growing exponentially have relatively small amounts of IHF, which increases in concentration as the growth rate decreases to become the second-most abundant nucleoid protein as the culture enters stationary phase [5,7].
Regulation of transcription initiation by nucleoid proteins can be relatively simple. For example, at proPP2, FIS functions as a Class II activator that co-depends with CRP bound at an upstream Class I site (see Figure 12.6a) [93]. Furthermore, IHF is known to introduce a severe bend in the DNA, bringing an activator bound to a far upstream site called an enhancer into close proximity to RNAP [17] or bringing an UP element close enough to facilitate binding of an α-CTD [84]. However, the influence of nucleoid proteins can be considerably more complex. For example, at the E. coli pnrfA promoter, the nucleoid proteins IHF and FIS confer codependence on two activators [26,27].
This promoter drives expression of the nrf operon, which encodes an anaerobic formate-dependent nitrite reductase [66]. Expression of pnrfA codepends on FNR and either NarL or NarP [145]. The oxygen-sensitive transcription factor FNR binds to position −41.5, from which it functions as a Class II activator (see Figure 12.5c) [22,145]. NarP or NarL, homologous transcription factors whose binding to DNA is enhanced by the presence of nitrite or nitrate ions in the environment [135], bind a site centered at position −74.5 [145]. FNR-dependent activation of pnrfA is repressed by FIS and IHF. FIS binds to a site centered at −15; since it overlaps the −10 hexamer, FIS represses basal transcription by steric hindrance (see Figure 12.3a). In contrast, IHF binds to a site (centered at position −54) that overlaps the DNA sites for NarP/NarL and FNR (Figure 12.8a). NarL and NarP reverse this IHF-dependent repression by displacing IHF from its binding site (Figure 12.8b) [22,26,27].
[Figure 12.8 graphic: two panels, (a) and (b), with labels IHF (–54), FNR (–41.5), FIS (–15), and NarP/NarL (–74.5).]
Figure 12.8 Activation by remodeling a nucleoprotein complex. (a) The binding of FIS and IHF to sites centered at positions –15 and –54, respectively, represses FNR-dependent transcription at the nrf promoter. (b) The binding of NarL or NarP to the 7-2-7 element, centered at position –74.5, displaces IHF and activates the promoter provided that FIS is not present. Note that the further binding of NarL to weaker single heptamer sites at positions –50 and –22 re-establishes repression at higher nitrate concentrations. (From A. Darwin, H. Hussain, L. Griffiths, J. Grove, Y. Sambongi, S. Busby, and J. Cole, Mol. Microbiol., 1993, 9, 1255–1265 and K.L. Tyson, J.A. Cole, and S.J. Busby, Mol. Microbiol., 1994, 13, 1045–1055. With permission.)
The ability of a transcription factor to remodel a nucleoprotein complex is not unique to pnrfA. The related E. coli pnir possesses striking similarities. pnir drives transcription of an operon that encodes a different nitrite reductase. Its activation also codepends on FNR and NarP/NarL [145]. As at pnrfA, FNR functions as a Class II activator from position –41.5, while NarP/NarL binds to an upstream site; in this case, it is centered at position –69.5. Activation of FNR-dependent transcription is repressed by a complex that includes FNR, IHF, and FIS. In this complex, IHF binds at a site centered at position –88, while FIS binds two sites centered at –142 and +23. As at pnrfA, NarP/NarL reverse repression by displacing IHF [24,25]. Thus, at nir, codependence on the second activator results from the necessity to remodel an inhibitory nucleoprotein complex that includes nucleoid proteins and the Class II activator FNR. A similar mechanism seems to be at play at the acsP2 promoter. This synergistic CRP-dependent promoter transcribes divergently from pnrfA. Not surprisingly, considering that it sits within 200-bp of pnrfA, the ability of acsP2 to transcribe is influenced by the occupancy of the same DNA sites for the same nucleoid proteins [12,21,22,26,27]. Although the general scheme appears similar, the details are different.
12.2 Regulation of a Native Metabolic Promoter
Now that the metabolic engineer is equipped with the fundamentals of bacterial transcription, s/he is ready to venture into the “real” world, where multiple nucleoid proteins and multiple transcription factors integrate diverse signals to modulate the behavior of several closely packed promoters. As you might have deduced from my references to it in the preceding section, the nrf-acs intergenic region represents one such scenario. This stretch of about 300 bp contains all of the cis elements necessary to regulate two operons that perform very different functions under very different circumstances. They are transcribed from divergent promoters that are activated by different transcription factors. However, regulation of transcription from both promoters depends on the same set of nucleoid proteins. These nucleoid proteins mediate the formation of several distinct multiprotein complexes that ensure that the correct operon gets transcribed at the right time in the proper amount. In this section, I will focus on the machinery that controls transcription of the acs operon, citing nrf regulation whenever necessary.
12.2.1 Acetyl Coenzyme A Synthetase (acs)
Acs (acetyl coenzyme A synthetase) catalyzes the conversion of acetate to acetyl-CoA through an enzyme-bound acetyladenylate (acetyl-AMP) intermediate (Figure 12.9a) [16]. This activity permits the use of acetate to obtain energy through the tricarboxylic acid cycle and biosynthetic subunits via the glyoxylate bypass [35]. E. coli cells require Acs to scavenge acetate from the environment [20,74]. This delays entry into stationary phase and allows cells to compete successfully during periods of carbon starvation. Thus, the induction of Acs represents a survival response for cells entering a nutrient-poor environment (reviewed by Ref. [149]).
Like many other genes required for the use of secondary carbon sources, acs is subject to catabolite repression [72]: cells growing on their preferred carbon source (e.g., glucose for E. coli) inhibit acs transcription [72,102,132,143]. However, the major byproduct of glucose metabolism is acetate, which cells excrete into their environment. Thus, the extracellular concentration of acetate rises as the concentration of glucose falls (Figure 12.9b). Just prior to exhaustion of the glucose, cells induce acs transcription. This event occurs during the transition from exponential growth to stationary phase [72,132]. As the culture enters stationary phase, transcription peaks and then decreases rapidly. Transcription rises once again during the first several hours of stationary phase, even though the concentration of acetate remains low [149]. A similar expression pattern occurs during growth in tryptone broth, a mixture of amino acids. As E. coli cells exhaust L-serine, their preferred amino acid, they induce acs transcription [21,72,113]. As with glucose, transcription peaks as the culture enters
[Figure 12.9 graphic: panel (a) labels: Acetate, ATP, PPi, Acetyl-AMP, CoASH, AMP, Acetyl-CoA, ACS, and TCA/GS; panel (b) labels: Time, Growth, Conc, (glc), (ace), and acs TRXN.]
Figure 12.9 Acetyl-CoA synthetase (Acs) and transcription of its gene (acs). (a) Acs catalyzes the activation of acetate to acetyl-CoA in an ATP- and CoASH-dependent manner. The by-products are pyrophosphate (PPi) and adenosine monophosphate (AMP). The resultant acetyl-CoA enters the TCA cycle and the glyoxylate shunt (GS). (b) acs transcription is induced as glucose (glc) levels approach depletion and acetate (ace) concentration begins to peak. Peak transcription occurs as cells enter stationary phase.
stationary phase [72], decreases rapidly, and then slowly increases—this time, in parallel with the concentration of extracellular acetate [149]. Regulation of acs occurs primarily at the level of transcription initiation [72]. As befits a critical survival gene, this regulation is quite complex. What follows is a description of the machinery that controls acs transcription and the current thinking about how it all works. Several general lessons can be gleaned concerning transcription of genes involved in metabolism and other cellular processes.
12.2.2 acs Operon and Promoter Architecture
acs is the first gene in an operon that includes yjcH, a hypothetical gene, and actP (formerly known as yjcG), which encodes an acetate permease (Figure 12.10a) [44] whose physiological purpose remains unknown. No evidence exists for internal promoters; thus, transcription of the acs operon apparently initiates only from the regulatory region of acs, located upstream (5') of the acs open reading frame. The regulatory region of acs includes three σ70-dependent promoters: the proximal major promoter acsP2; a minor promoter called acsP2A, located 18 bp upstream of acsP2; and the distal minor promoter acsP1, located some 200 bp upstream of acsP2 and acsP2A (Figure 12.10b) [12,22,72,73]. The in vivo activities of these promoters, however, do not correlate with the strength of their promoter elements. Of all the possible promoter elements, the strongest promoter in vivo (acsP2) possesses a single clearly recognizable DNA element: the –10 hexamer (TATtAT). It also possesses an AT-rich upstream sequence that resembles an UP element; however, it appears to be mispositioned by about half a helical turn with respect to the transcription initiation site (+1). Furthermore, acsP2 possesses no recognizable –35 hexamer or –16 element. In contrast, the distal minor promoter (acsP1) has a –35 hexamer with moderate similarity to consensus (TTtAat) separated by 16 bp from a near-consensus –10 hexamer (TAaAAT). The proximal minor promoter (acsP2A) possesses a –10 hexamer with only poor similarity to consensus (TtTAAc) and no recognizable –16, –35, or UP sequences. On the basis of these sequences alone, one might predict acsP1 to be the strongest promoter and, indeed, it is in vitro! Although RNAP alone can bind to each of these three promoters, only acsP1 produces a significant amount of transcript. Therefore, in vivo, the weakly consensus acsP2 must depend upon transcription
[Figure 12.10 (labels recovered from the original artwork). (a) Genes shown: nrfC, nrfB, nrfA, acs, yjcH, actP; promoters shown: pnrfA, P1, P2A, P2. (b) Site centers relative to the acsP2 +1: FIS I –267, IHF I –226, IHF II –180, IHF III –153, CRP II –122.5, FIS II –98, CRP I –69.5, FIS III –59. (c) Site centers relative to the acsP1 +1: FIS I –59, FNR –32.5, NsrR –22, IHF I –19, NarP/L +0.5, IHF II +27.]
Figure 12.10 The acs operon and the nrf-acs intergenic region. (a) acs is the first gene in a three gene operon. Its transcription is controlled by three promoters, which transcribe divergently from the nrf promoter, which drives transcription of the nrf operon. (b) The region 5′ of acs includes three promoters, two DNA sites for the transcription factor CRP and three sites each for the nucleoid proteins FIS and IHF. The numbers are in reference to the +1 of acsP2. (c) The repressed state of acsP1 results from the action of two nucleoid proteins (FIS and IHF) and several transcription factors (FNR, NsrR, NarP, and NarL). X-10, the extended –10 of the nrf promoter; −10, the −10 hexamer of acsP1. The numbers are in reference to the +1 of acsP1.
factors to help RNAP drive transcription, while acsP1 must be repressed. Although it can recruit RNAP, acsP2A transcribes poorly and its primary role appears to be to inhibit acsP2 by competing with it for RNAP [12,21,22,72,73,129].
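The comparison of promoter elements above can be checked with a few lines of code. The following is a minimal sketch, assuming the standard σ70 consensus hexamers (TATAAT for –10 and TTGACA for –35); the function name `consensus_matches` is hypothetical, and the hexamers are those quoted in the text. Counting matches is only a crude measure of similarity and, as this section stresses, it does not predict promoter strength in vivo.

```python
# Standard sigma70 consensus hexamers (assumed here; not restated in this section).
CONSENSUS_MINUS10 = "TATAAT"
CONSENSUS_MINUS35 = "TTGACA"


def consensus_matches(hexamer: str, consensus: str) -> int:
    """Count the positions at which a hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


# -10 hexamers as quoted in the text (lowercase marks a non-consensus base).
minus10 = {"acsP2": "TATtAT", "acsP1": "TAaAAT", "acsP2A": "TtTAAc"}
minus10_scores = {
    name: consensus_matches(seq, CONSENSUS_MINUS10) for name, seq in minus10.items()
}

# The only recognizable -35 hexamer belongs to acsP1.
acsP1_minus35_score = consensus_matches("TTtAat", CONSENSUS_MINUS35)

for name, score in minus10_scores.items():
    print(f"{name}: {score}/6 matches in the -10 hexamer")
print(f"acsP1: {acsP1_minus35_score}/6 matches in the -35 hexamer")
```

By this count, acsP2 and acsP1 have equally good –10 hexamers, so the sequence-based prediction that acsP1 is stronger rests on its additional –35 hexamer, which acsP2 lacks.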
12.2.3 Repression of acsP1
In vitro, acsP1 exhibits substantial basal activity: RNAP can bind and transcribe without support of transcription factors. In vivo, however, acsP1 initiates transcription infrequently. This repressed state results from the combined action of two nucleoid proteins and four different transcription factors. acsP1 overlaps extensively with the divergently oriented pnrfA promoter (Figure 12.10c), yet both promoters transcribe independently: transcription from one promoter does not inhibit transcription from the other [22]. This independent regulation is conferred by the nucleoid proteins FIS and IHF, the oxygen regulator FNR, the nitrite-responsive RR NarL or the nitrate-responsive NarP, and the nitric oxide-sensitive regulator NsrR [22,43]. These nucleoid proteins and transcription factors ensure that pnrfA is repressed when the environment is replete with oxygen and good sources of carbon and ammonia
and induced when the oxygen becomes depleted—especially in the additional presence of nitrite or nitrate [37]. Although the physiological role of acsP1 remains a mystery, it makes sense to uncouple its transcription from that of nrfA because the products of the nrf and acs operons perform very different functions under widely different conditions.
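Figure 12.10 reports site positions in two coordinate frames: panel (b) relative to the +1 of acsP2 and panel (c) relative to the +1 of acsP1. The sketch below, which is only an illustration, uses the three sites that appear in both panels to infer where the acsP1 start site lies in acsP2 coordinates. The positions are read from the figure labels, and the calculation assumes that both frames run in the same direction along the DNA; the function name `implied_offsets` is hypothetical.

```python
# Site centers relative to the acsP2 +1, as read from the labels of Figure 12.10b.
REL_ACSP2 = {"FIS I": -267.0, "IHF I": -226.0, "IHF II": -180.0}

# The same sites relative to the acsP1 +1, as read from the labels of Figure 12.10c.
REL_ACSP1 = {"FIS I": -59.0, "IHF I": -19.0, "IHF II": 27.0}


def implied_offsets(rel_p2: dict, rel_p1: dict) -> dict:
    """Position of the acsP1 +1 in acsP2 coordinates implied by each shared site.

    Assumes both coordinate frames point the same way, so that
    position_in_P2_frame = position_in_P1_frame + offset.
    """
    return {site: rel_p2[site] - rel_p1[site] for site in rel_p2 if site in rel_p1}


offsets = implied_offsets(REL_ACSP2, REL_ACSP1)
for site, offset in offsets.items():
    print(f"{site}: acsP1 +1 lies at about {offset:+.0f} relative to the acsP2 +1")
```

All three sites place the acsP1 start site a little more than 200 bp upstream of acsP2, in agreement with the statement that acsP1 lies some 200 bp upstream of acsP2 and acsP2A.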
12.2.4 Activation of acsP2
In vitro, RNAP alone does not efficiently transcribe the proximal acsP2. It does, however, bind and melt an extensive region of DNA that includes both acsP2 and acsP2A [12,129]. For efficient transcription, it requires CRP. Because acs transcription depends on CRP, it is subject to catabolite repression [12,72]. This well-studied regulatory mechanism results from the normal action of the phosphoenolpyruvate (PEP)-dependent sugar phosphotransferase system (PTS). Using PEP as the phosphoryl donor, the PTS simultaneously phosphorylates certain sugars (e.g., glucose) and transports them across the cytoplasmic membrane into the cytosol, where the phosphorylated sugars enter glycolysis. As the sugars become scarce, the PTS instead activates adenylate cyclase, which synthesizes cAMP. The cAMP can then bind the CRP homodimer, altering its conformation so that it can bind to its DNA sites (reviewed in Ref. [40]).
CRP can activate the major promoter acsP2 using a variant of the Class III mechanism (Figure 12.11c). At this promoter, two CRP homodimers bind in tandem at DNA sites centered at Class I positions: the higher affinity and proximal site (CRP I) is centered at position −69.5, while the lower affinity and distal site (CRP II) is centered at −122.5. Activation absolutely requires that one dimer bind the proximal CRP I. To achieve maximal transcription, however, a second dimer also must bind to the distal CRP II [12].
Figure 12.11 CRP activates acsP2 transcription using the Class III mechanism that involves tandem Class I sites. CRP activates transcription (c) by inhibiting the formation of the nascent acsP2A TOC (a) and stabilizing the preformed acsP2 TOC (b). Black box, acsP2 –10 hexamer; black arrow, acsP2 +1; gray box, acsP2A –10 hexamer; gray arrow, acsP2A +1. Because the exact binding sites for the two α-CTDs are unknown, multiple α-CTDs are shown. To indicate that the positions are putative, each tether and α-CTD is denoted with or encompassed by a dotted line. The arrow indicates re-positioning of the α-CTDs by CRP.
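Because the two CRP sites are described as tandem Class I sites, it can be useful to express their separation in helical turns. The sketch below does this arithmetic for the site centers given in the text (−69.5 and −122.5). It assumes a helical repeat of about 10.5 bp per turn for B-form DNA, a textbook value that is not stated in this chapter; the function names are hypothetical.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA, in bp per turn

CRP_I = -69.5    # center of the proximal, higher affinity site (from the text)
CRP_II = -122.5  # center of the distal, lower affinity site (from the text)


def helical_turns(pos_a: float, pos_b: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Distance between two site centers, expressed in helical turns."""
    return abs(pos_a - pos_b) / bp_per_turn


def phase_offset_degrees(pos_a: float, pos_b: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Angular offset between two sites around the helix axis, in [-180, 180]."""
    turns = helical_turns(pos_a, pos_b, bp_per_turn)
    return (turns - round(turns)) * 360.0


crp_turns = helical_turns(CRP_I, CRP_II)
crp_phase = phase_offset_degrees(CRP_I, CRP_II)
half_turn_bp = 0.5 * BP_PER_TURN

print(f"CRP I to CRP II: {crp_turns:.2f} turns, phase offset {crp_phase:.0f} degrees")
print(f"Half a helical turn corresponds to about {half_turn_bp:.2f} bp")
```

Under this assumption the two sites are separated by very nearly a whole number of turns, which would place both CRP dimers on approximately the same face of the DNA. The same arithmetic shows that the UP-like element of acsP2, described earlier as mispositioned by about ½ helical turn, would be displaced by roughly 5 bp.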
Most often CRP functions from Class I sites by stabilizing the TCC. Using the surface determinant AR1, the CRP dimer contacts the 287 surface determinant of the α-CTD (Figure 12.5b). This interaction stabilizes the TCC, permitting the subsequent formation of a TOC, the next intermediate in the transcription initiation pathway [23,29]. At acsP2, the mechanism is a bit more complex. Although CRP I, AR1 and determinant 287 help to stabilize the TCC, their contribution cannot be limited to this role: RNAP alone can bind both acsP2A and acsP2 and form TOCs, albeit in inactive forms (Figure 12.11a and b) [12]. In a recent study [129], we showed that CRP does not substantially increase TOC formation. Thus, the function of CRP is not to recruit RNAP. Instead, it appears to cause a rearrangement of the interactions that RNAP has already made with the promoter, stabilizing a more active form of the TOC. The increased stability of this ternary CRP-RNAP-acsP2 TOC activates transcription at multiple levels. First, it inhibits the formation of the nascent binary RNAP-acsP2A TOC (Figure 12.11a and b), which sterically hinders formation of the binary RNAP-acsP2 TOC. Second, it results in a TOC conformation that permits RNAP to initiate transcription (Figure 12.11c). Third, it stabilizes this active TOC in the presence of nonspecific IHF binding. The inactivity of the binary RNAP-acsP2 TOC might result from improper positioning of the α-CTD. A similar scenario has been seen at the malT promoter, whose binding site for α-CTD (i.e., the UP element) and single binding site for CRP resemble CRP I of acsP2 and the DNA sequence located immediately downstream [137]. By analogy to the malT promoter, we propose that the binding of CRP to CRP I might reposition the α-CTD to a more favorable position, resulting in stabilization of the more active TOC (compare panels B and C) [129].
12.2.5 Modulation of CRP-Dependent Activation by Nucleoid Proteins
Of the 12 nucleoid proteins of E. coli, both FIS and IHF can inhibit transcription of acs. Within the acs promoter region, FIS and IHF can each bind to three sites (Figure 12.10b). Furthermore, all six sites can be occupied simultaneously. Finally, both nucleoid proteins function independently to antagonize CRP-dependent activation of acsP2 [21].
12.2.5.1 Negative Regulation by FIS
FIS can inhibit CRP-dependent acsP2 transcription directly using three distinct mechanisms. (i) FIS can repress acsP2 transcription by steric hindrance. This first mechanism relies upon the ability of FIS to bind to several lower affinity sites. Because one of these lower affinity sites overlaps the –35 element of acsP2, basal transcription is inhibited, and the mechanism is correctly called repression [21]. (ii) FIS also can antiactivate CRP-dependent acsP2 transcription. This second mechanism relies on the ability of FIS to bind to two higher affinity sites: FIS II and FIS III (Figure 12.12). FIS II (centered at –98)
Figure 12.12 FIS-dependent antiactivation of CRP-dependent acs transcription. FIS out-competes CRP binding to their overlapping DNA sites. Because CRP cannot bind, transcription does not occur.
lies between CRP II (–122.5) and CRP I (–69.5), while FIS III (–59) overlaps CRP I. Competitive DNase I footprinting, electrophoretic mobility shift analyses, and in vitro transcription assays indicate that FIS can displace CRP from both its sites. In vivo, a mutation in FIS II that diminishes its affinity for FIS by more than ten-fold increases acs transcription two- to three-fold during growth in tryptone broth. A similar increase results from a mutation that favors the binding of CRP over that of FIS to their overlapping sites (CRP I and FIS III). Thus, the competition between FIS and CRP for binding to their overlapping and tandem sites helps to keep acsP2 transcription low [21]. (iii) FIS can inhibit transcription from acsP1 (Figure 12.10c). This third mechanism depends upon the ability of FIS to bind to a third higher affinity site: FIS I (centered at position −59 relative to the acsP1 +1). In vivo, a fis mutant exhibits three-fold more transcription from acsP1, suggesting that FIS helps keep this promoter repressed. The upstream location of this site suggests that FIS does not inhibit by steric hindrance of RNAP [22,26]. Because FIS levels rise dramatically during outgrowth in rich medium from stationary phase and then progressively diminish throughout exponential growth [5,7], FIS appears to be responsible for maintaining acs transcription at basal levels during rapid growth conditions when the activity of acs is unnecessary.
12.2.5.2 Negative Regulation by IHF
The concentration of IHF in cells harvested from early exponential growth has been estimated at about 0.7 nM [99]. This concentration increases progressively throughout growth until, during stationary phase, IHF becomes the second most abundant nucleoid protein [4,68]. The binding of IHF to its specific sites causes the DNA to bend up to 180 degrees and to wrap around the protein [83,119,141].
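The competition between FIS and CRP for overlapping sites, described in Section 12.2.5.1, can be illustrated with a simple equilibrium model in which the two proteins bind in a mutually exclusive manner. The sketch below is an assumption-laden illustration rather than a fit to data: the function `crp_occupancy` is hypothetical, the concentrations and dissociation constants are arbitrary dimensionless numbers, and mutual exclusion is assumed only for the overlapping CRP I and FIS III sites.

```python
def crp_occupancy(crp: float, fis: float, kd_crp: float, kd_fis: float) -> float:
    """Fraction of promoters with CRP bound when CRP and FIS exclude each other.

    The site is free, CRP-bound, or FIS-bound, with statistical weights
    1, crp/kd_crp, and fis/kd_fis, respectively.
    """
    crp_term = crp / kd_crp
    fis_term = fis / kd_fis
    return crp_term / (1.0 + crp_term + fis_term)


# Hypothetical, dimensionless values chosen only for illustration.
CRP, FIS = 1.0, 5.0
KD_CRP, KD_FIS = 1.0, 1.0

# Wild-type site versus a site whose affinity for FIS is weakened ten-fold.
wild_type = crp_occupancy(CRP, FIS, KD_CRP, KD_FIS)
weak_fis_site = crp_occupancy(CRP, FIS, KD_CRP, 10.0 * KD_FIS)

print(f"CRP occupancy, wild-type site:     {wild_type:.3f}")
print(f"CRP occupancy, weakened FIS site:  {weak_fis_site:.3f}")
```

In this toy model, weakening FIS binding raises CRP occupancy, which is the qualitative behavior expected of antiactivation. The magnitude of the change depends entirely on the assumed values and should not be compared with the measured two- to three-fold effects.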
IHF affects acs transcription by binding to three high affinity sites (IHF I–III) located between positions −140 and −240 relative to the +1 of acsP2 (Figure 12.10b) [21]. Occupation of the most distal site (IHF I) strongly inhibits transcription from acsP1. Centered at position −19 relative to the +1 of acsP1, IHF I completely overlaps acsP1 [22]; thus, the binding of IHF competes directly with that of RNAP and causes repression of acsP1 by steric hindrance (Figure 12.10c). This simple mechanism contrasts sharply with the considerably more complex IHF-dependent mechanisms that operate at acsP2. In a recently published manuscript [129], we proposed that IHF uses two distinct mechanisms to affect transcription from acsP2. The first mechanism relies on IHF in its primary role as a nucleoid protein. When present at the higher concentrations associated with entry into stationary phase, IHF binds to lower affinity, less specific sites that overlap with acsP2 itself. Thus, IHF inhibits TOC formation by steric hindrance. In contrast, the second mechanism depends on the binding of IHF to its specific sites, especially the most proximal (IHF III). This binding mediates the formation of a tightly wrapped nucleoprotein complex that poises RNAP in an unproductive TOC.
12.2.5.2.1 Denial of Access to Promoter
The first type of IHF-dependent inhibition results from its ability to bind to lower affinity sites in the vicinity of acsP2. With IHF bound to these lower affinity sites, RNAP cannot form the TCC. The simple explanation for this behavior is that the initial interactions of RNAP with the cis elements of acsP2 are too unstable. In the presence of CRP, however, similar concentrations of IHF cannot inhibit RNAP, even when CRP is present at relatively low concentration. Thus, it appears that CRP can stabilize the TCC long enough to compete successfully with IHF and thereby permit the subsequent formation of the TOC.
This mechanism might be essential for transcription to proceed in the face of the increased amounts of IHF as the cells enter stationary phase [5]. Similar mechanisms have been demonstrated at several promoters.
12.2.5.2.2 Mediation of a Poised Open Complex
The second type of IHF-dependent inhibition depends upon the ability of IHF to bind to the proximal high affinity site (IHF III). Occupancy of this site mediates the formation of a complex that includes
Figure 12.13 IHF plays multiple roles at acsP2. (a) In the absence of CRP, transcription from acsP2 does not occur. (b) When CRP occupies CRP I, the occupancy of IHF III enhances CRP-dependent activation in an AR2-dependent manner. (c) As the concentration of CRP rises, the lower affinity CRP II site also becomes occupied. Now, occupancy of IHF III inhibits CRP-dependent activation in an AR2-dependent manner. (d) The upstream sequence that includes IHF I and IHF II enhances transcription even when both CRP I and CRP II are occupied. Because the exact binding sites for the two α-CTDs are unknown, multiple α-CTDs are shown. To indicate that the positions are putative, each tether and α-CTD is denoted with or encompassed by a dotted line.
IHF, two CRP dimers, and RNAP (Figure 12.13c). The result is an unproductive TOC that is poised to begin transcription upon receipt of the proper signal [129]. This model is supported by the following observations: (1) IHF-dependent transcription inhibition in vivo requires IHF III and CRP II and, surprisingly, at least one CRP dimer with an intact AR2. (2) In the presence of both RNAP and CRP, IHF mediates increased DNase I protection of the region between the two CRP sites [129]. (3) Specific single alanine substitutions in the α-CTD of RNAP cause increased transcription [12]. The critical residues include R265, which enhances DNA interaction. Mutation of this amino acid (or others) could increase transcription by eliminating one or more of the many contacts made by RNAP. This would decrease the stability of the poised open complex and permit promoter escape [129].
12.2.5.3 Positive Regulation by IHF
In a surprising twist, we found that IHF also can enhance CRP-dependent activation. For this enhancement to occur, IHF III and CRP I must be occupied, AR2 must be intact, and CRP II must remain unoccupied (Figure 12.13b). This behavior was observed in vivo only from a promoter truncation that lacked the upstream IHF sites (IHF I and IHF II) and possessed a mutant CRP II; however, experiments performed in vitro with the full-length acs promoter suggest that this scenario might happen in vivo. This would occur when CRP concentrations are low enough to favor occupancy of the higher affinity and proximal CRP I, but not that of the lower affinity and distal CRP II. This raises the intriguing possibility that occupation of IHF III creates a precise window of opportunity for CRP-dependent activation of acsP2 transcription. In the absence of CRP, acsP2 transcription could not occur (Figure 12.13a). At low CRP concentrations that favor occupancy of CRP I, however, prior occupation of IHF
III could increase the probability that a TOC will form at acsP2 (Figure 12.13b). In contrast, at high CRP concentrations that favor the binding to both CRP I and CRP II, prior occupation of IHF III could increase the probability that the TOC will become poised (Figure 12.13c) [129].
12.2.5.4 The Role of Upstream Elements
In vivo, the upstream sequence that includes IHF I and IHF II exerts an overall positive effect on transcription [21]. Currently, we do not understand this behavior. However, two distinct possibilities exist. First, some as yet unidentified transcription factor could bind to this region and antagonize the effect of IHF III. Second, this region includes two promoters (acsP1 and pnrfA) and three binding sites for four transcription factors (NarL/NarP, NsrR, and FNR). IHF bound at IHF I and IHF II could compete with these transcription factors and RNAP for their overlapping sites [22,26]. This competition could result in incomplete saturation of the two upstream IHF sites. This would be expected during exponential growth when the IHF pool remains small enough to only half-saturate its highest affinity sites [99]. This scenario would favor stability of the poised nucleoprotein complex. If this stability depends inversely on the number of occupied IHF sites, then this small IHF pool would favor poising, while a larger pool would antagonize IHF III, “loosening” the structure and permitting synergistic Class III transcription by the previously poised RNAP (Figure 12.13d). Because it would be poised in the TOC, RNAP could respond rapidly to the need to utilize acetate as a carbon source, a requirement that occurs as cells approach stationary phase and the IHF pool approaches its maximum [129].
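The occupancy-dependent outcomes proposed in Sections 12.2.4 and 12.2.5.3 (Figure 12.13a through c) amount to a small decision table, which the following sketch encodes. It is a qualitative restatement of the proposed model only: the names `AcsP2State` and `acs_p2_state` are hypothetical, AR2 is assumed to be intact, and the upstream sites IHF I and IHF II (Figure 12.13d) are deliberately left out because their effect is not yet understood.

```python
from enum import Enum


class AcsP2State(Enum):
    OFF = "no CRP at CRP I: no activation"
    ACTIVE = "CRP-dependent activation without IHF III"
    ENHANCED = "CRP I plus IHF III: enhanced activation"
    POISED = "CRP I, CRP II, and IHF III: poised, unproductive TOC"


def acs_p2_state(crp1_bound: bool, crp2_bound: bool, ihf3_bound: bool) -> AcsP2State:
    """Qualitative outcome at acsP2 for a given pattern of site occupancy."""
    if not crp1_bound:
        # Activation absolutely requires a CRP dimer at the proximal CRP I.
        return AcsP2State.OFF
    if ihf3_bound and crp2_bound:
        # High CRP: both CRP sites and IHF III occupied (Figure 12.13c).
        return AcsP2State.POISED
    if ihf3_bound:
        # Low CRP: only CRP I occupied, with IHF III bound (Figure 12.13b).
        return AcsP2State.ENHANCED
    return AcsP2State.ACTIVE


for crp1, crp2, ihf3 in [(False, False, True), (True, False, True), (True, True, True)]:
    state = acs_p2_state(crp1, crp2, ihf3)
    print(f"CRP I={crp1}, CRP II={crp2}, IHF III={ihf3}: {state.name}")
```

Written this way, the "window of opportunity" is the single occupancy pattern in which CRP I and IHF III are bound while CRP II remains free.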
12.3 Concluding Remarks
The nrf-acs intergenic region exemplifies the complexity that metabolic engineers may encounter as they attempt to manipulate bacterial promoters. At first glance, transcription of the nrf and acs operons appears to be simple enough: pnrfA is an extended –10 promoter activated in response to low oxygen tension and nitrite or nitrate by the Class II activator FNR with the assistance of NarP/NarL, while acsP2 is a weak CRP-dependent promoter activated by the tandem Class I variant of the Class III mechanism. Deeper study, however, shows that both operons are subject to considerably more sophisticated regulatory mechanisms that involve intricate competition between multiple transcription factors and multiple nucleoid proteins. These interactions between well-characterized regulators can, within the context of these and other complex promoters, result in novel mechanisms of gene regulation that remain to be explored and of which the well-informed metabolic engineer should be acutely aware.
Acknowledgments
I wish to thank my collaborators, especially Steve Busby, Doug Browning, Bianca Sclavi, and Christine Beatty. Studies of acs transcription were supported by grants from the National Science Foundation.
References
1. Ades, S. E. 2004. Control of the alternative sigma factor σE in Escherichia coli. Curr. Opin. Microbiol., 7:157–162. 2. Adhya, S., M. Geanacopoulos, D. E. Lewis, S. Roy, and T. Aki. 1998. Transcription regulation by repressosome and by RNA polymerase contact. Cold Spring Harb. Symp. Quant. Biol., 63:1–9. 3. Alba, B. M. and C. A. Gross. 2004. Regulation of the Escherichia coli σE-dependent envelope stress response. Mol. Microbiol., 52:613–619. 4. Azam, T. A. and A. Ishihama. 1999. Twelve species of the nucleoid-associated protein from Escherichia coli. Sequence recognition specificity and DNA binding affinity. J. Biol. Chem., 274:33105–33113.
5. Azam, T. A., A. Iwata, A. Nishimura, S. Ueda, and A. Ishihama. 1999. Growth phase-dependent variation in protein composition of the Escherichia coli nucleoid. J. Bacteriol., 181:6361–6370. 6. Bader, M. W., W. W. Navarre, W. Shiau, H. Nikaido, J. G. Frye, M. McClelland, F. C. Fang, and S. I. Miller. 2003. Regulation of Salmonella typhimurium virulence gene expression by cationic antimicrobial peptides. Mol. Microbiol., 50:219–230. 7. Ball, C. A., R. Osuna, K. C. Ferguson, and R. C. Johnson. 1992. Dramatic changes in FIS levels upon nutrient upshift in Escherichia coli. J. Bacteriol., 174:8043–8056. 8. Barker, M., T. Gaal, C. Josaitis, and R. Gourse. 2001. Mechanism of regulation of transcription initiation by ppGpp. I. Effects of ppGpp on transcription initiation in vivo and in vitro. J. Mol. Biol., 305:673–688. 9. Barker, M. M., T. Gaal, and R. L. Gourse. 2001. Mechanism of regulation of transcription initiation by ppGpp. II. Models for positive control based on properties of RNAP mutants and competition for RNAP. J. Mol. Biol., 305:689–702. 10. Barnard, A., A. Wolfe, and S. Busby. 2004. Regulation at complex bacterial promoters: how bacteria use different promoter organisations to produce different regulatory outcomes. Curr. Op. Microbiol., 7:102–8. 11. Barnard, A. M. L., J. Green, and S. J. W. Busby. 2003. Transcription regulation by tandem-bound FNR at Escherichia coli promoters. J. Bacteriol., 185:5993–6004. 12. Beatty, C. M., D. F. Browning, S. J. W. Busby, and A. J. Wolfe. 2003. CRP-dependent activation of the Escherichia coli acsP2 promoter by a synergistic Class III mechanism. J. Bacteriol., 185:5148–5157. 13. Becker, N. A., J. D. Kahn, and L. J. Maher, III. 2007. Effects of nucleoid proteins on DNA repression loop formation in Escherichia coli. Nucl. Acids Res., 35: 3988–4000. 14. Belyaeva, T. A., V. A. Rhodius, C. L. Webster, and S. J. Busby. 1998. 
Transcription activation at promoters carrying tandem DNA sites for the Escherichia coli cyclic AMP receptor protein: organisation of the RNA polymerase alpha subunits. J. Mol. Biol., 277:789–804. 15. Belyaeva, T. A., J. T. Wade, C. L. Webster, V. J. Howard, M. S. Thomas, E. I. Hyde, and S. J. W. Busby. 2000. Transcription activation at the Escherichia coli melAB promoter: the role of MelR and the cyclic AMP receptor protein. Mol. Microbiol., 36:211–222. 16. Berg, P. 1956. Acyl adenylates: an enzymatic mechanism of acetate activation. J. Biol. Chem., 222:991–1013. 17. Bertoni, G., N. Fujita, A. Ishihama, and V. de Lorenzo. 1998. Active recruitment of sigma54-RNA polymerase to the Pu promoter of Pseudomonas putida: role of IHF and alphaCTD. EMBO J., 17:5120–5128. 18. Borukhov, S. and E. Nudler. 2003. RNA polymerase holoenzyme: structure, function and biological implications. Curr. Opin. Microbiol., 6:93–100. 19. Brown, N. L., J. V. Stoyanov, S. P. Kidd, and J. L. Hobman. 2003. The MerR family of transcriptional regulators. FEMS Microbiol. Rev., 27:145–163. 20. Brown, T. D. K., M. C. Jones-Mortimer, and H. L. Kornberg. 1977. The enzymic interconversion of acetate and acetyl-coenzyme A in Escherichia coli. J. Gen. Microbiol., 102:327–336. 21. Browning, D. F., C. M. Beatty, E. A. Sanstad, K. A. Gunn, S. J. W. Busby, and A. J. Wolfe. 2004. Modulation of CRP-dependent transcription at the Escherichia coli acsP2 promoter by a nucleoprotein complex: anti-activation by the nucleoid proteins FIS and IHF. Mol. Microbiol., 51:241–254. 22. Browning, D. F., C. M. Beatty, A. J. Wolfe, J. A. Cole, and S. J. W. Busby. 2002. Independent regulation of the divergent Escherichia coli nrfA and acsP1 promoters by a nucleoprotein assembly at a shared regulatory region. Mol. Microbiol., 43:687–701. 23. Browning, D. F. and S. J. W. Busby. 2004. The regulation of bacterial transcription initiation. Nature Rev., 2:1–5.
24. Browning, D. F., J. A. Cole, and S. J. Busby. 2000. Suppression of FNR-dependent transcription activation at the Escherichia coli nir promoter by Fis, IHF and H-NS: modulation of transcription initiation by a complex nucleo-protein assembly. Mol. Microbiol., 37:1258–1269. 25. Browning, D. F., J. A. Cole, and S. J. W. Busby. 2004. Transcription activation by remodelling of a nucleoprotein assembly: the role of NarL at the FNR-dependent Escherichia coli nir promoter. Mol. Microbiol., 53:203–215. 26. Browning, D. F., D. C. Grainger, C. M. Beatty, A. J. Wolfe, J. A. Cole, and S. J. W. Busby. 2005. Integration of three signals at the Escherichia coli nrf promoter: a role for Fis protein in catabolite repression. Mol. Microbiol., 57:496–510. 27. Browning, D. F., D. J. Lee, A. J. Wolfe, J. A. Cole, and S. J. W. Busby. 2006. The Escherichia coli K-12 NarL and NarP proteins insulate the nrf promoter from the effects of IHF. J. Bacteriol., 188:7449–7456. 28. Burr, T., J. Mitchell, A. Kolb, S. Minchin, and S. Busby. 2000. DNA sequence elements located immediately upstream of the -10 hexamer in Escherichia coli promoters: a systematic study. Nucl. Acids Res., 28:1864–1870. 29. Busby, S. and R. Ebright. 1999. Transcription activation by catabolite activator protein (CAP). J. Mol. Biol., 293:199–213. 30. Busby, S. and R. H. Ebright. 1994. Promoter structure, promoter recognition, and transcription activation in prokaryotes. Cell, 79:743–746. 31. Busby, S., D. West, M. Lawes, C. Webster, A. Ishihama, and A. Kolb. 1994. Transcription activation by the Escherichia coli cyclic AMP receptor protein. Receptors bound in tandem at promoters can interact synergistically. J. Mol. Biol., 241:341–352. 32. Campbell, E. A., O. Muzzin, M. Chlenov, J. L. Sun, C. A. Olson, O. Weinman, M. L. Trester-Zedlitz, and S. A. Darst. 2002. Structure of the bacterial RNA polymerase promoter specificity σ subunit. Mol. Cell, 9:527–539. 33. Campbell, E. A., J. L. Tupy, T. M. Gruber, S. Wang, M. M. 
Sharp, C. A. Gross, and S. A. Darst. 2003. Crystal structure of Escherichia coli σE with the cytoplasmic domain of its anti-σ RseA. Mol. Cell, 11:1067–1078. 34. Chahla, M., J. Wooll, T. M. Laue, N. Nguyen, and D. F. Senear. 2003. Role of protein-protein bridging interactions on cooperative assembly of DNA-bound CRP-CytR-CRP complex and regulation of the Escherichia coli CytR regulon. Biochemistry, 42:3812–3825. 35. Cozzone, A. J. 1998. Regulation of acetate metabolism by protein phosphorylation in enteric bacteria. Ann. Rev. Microbiol., 52:127–164. 36. Dame, R. T., M. C. Noom, and G. J. L. Wuite. 2006. Bacterial chromatin organization by H-NS protein unravelled using dual DNA manipulation. Nature, 444:387–390. 37. Darwin, A., H. Hussain, L. Griffiths, J. Grove, Y. Sambongi, S. Busby, and J. Cole. 1993. Regulation and sequence of the structural gene for cytochrome c552 from Escherichia coli: not a hexahaem but a 50 kDa tetrahaem nitrite reductase. Mol. Microbiol., 9:1255–1265. 38. Davis, C. A., M. W. Capp, M. T. Record, Jr, and R. M. Saecker. 2005. The effects of upstream DNA on open complex formation by Escherichia coli RNA polymerase. Proc. Natl. Acad. Sci. USA, 102:285–290. 39. Demple, B. 1996. Redox signaling and gene control in the Escherichia coli soxRS oxidative stress regulon–a review. Gene, 179:53–57. 40. Deutscher, J., C. Francke, and P. W. Postma. 2006. How phosphotransferase system-related protein phosphorylation regulates carbohydrate metabolism in bacteria. Microbiol. Mol. Biol. Rev., 70:939–1031. 41. Doucleff, M., J. G. Pelton, P. S. Lee, B. T. Nixon, and D. E. Wemmer. 2007. Structural basis of DNA recognition by the alternative sigma-factor, σ54. J. Mol. Biol., 369:1070–1078. 42. Dove, S. L., S. A. Darst, and A. Hochschild. 2003. Region 4 of sigma as a target for transcription regulation. Mol. Microbiol., 48:863–874.
43. Filenko, N., S. Spiro, D. F. Browning, D. Squire, T. W. Overton, J. Cole, and C. Constantinidou. 2007. The NsrR regulon of Escherichia coli K-12 includes genes encoding the hybrid cluster protein and the periplasmic, respiratory nitrite reductase. J. Bacteriol., 189:4410–4417. 44. Gimenez, R., M. F. Nunez, J. Badia, J. Aguilar, and L. Baldoma. 2003. The gene yjcG, cotranscribed with the gene acs, encodes an acetate permease in Escherichia coli. J. Bacteriol., 185:6448–6455. 45. Gourse, R. L., W. Ross, and T. Gaal. 2000. UPs and downs in bacterial transcription initiation: the role of the alpha subunit of RNA polymerase in promoter recognition. Mol. Microbiol., 37:687–695. 46. Grainger, D. C., H. Aiba, D. Hurd, D. F. Browning, and S. J. W. Busby. 2007. Transcription factor distribution in Escherichia coli: studies with FNR protein. Nucl. Acids Res., 35:269–278. 47. Grainger, D. C., T. A. Belyaeva, D. J. Lee, E. I. Hyde, and S. J. W. Busby. 2004. Transcription activation at the Escherichia coli melAB promoter: interactions of MelR with the C-terminal domain of the RNA polymerase alpha subunit. Mol. Microbiol., 51:1311–1320. 48. Grainger, D. C., D. Hurd, M. D. Goldberg, and S. J. W. Busby. 2006. Association of nucleoid proteins with coding and non-coding segments of the Escherichia coli genome. Nucl. Acids Res., 34:4642–4652. 49. Grainger, D. C., C. L. Webster, T. A. Belyaeva, E. I. Hyde, and S. J. W. Busby. 2004. Transcription activation at the Escherichia coli melAB promoter: interactions of MelR with its DNA target site and with domain 4 of the RNA polymerase sigma subunit. Mol. Microbiol., 51:1297–1309. 50. Gralla, J. D. 2005. Escherichia coli ribosomal RNA transcription: regulatory roles for ppGpp, NTPs, architectural proteins and a polymerase-binding protein. Mol. Microbiol., 55:973–977. 51. Green, J. and F. A. Marshall. 1999. Identification of a surface of FNR overlapping activating region 1_ that is required for repression of gene expression. J. Biol. 
Chem., 274:10244–10248. 52. Griffith, K. L., I. M. Shah, and R. E. Wolf. 2004. Proteolytic degradation of Escherichia coli transcription activators SoxS and MarA as the mechanism for reversing the induction of the superoxide (SoxRS) and multiple antibiotic resistance (Mar) regulons. Mol. Microbiol., 51:1801–1816. 53. Griffith, K. L., I. M. Shah, T. E. Myers, M. C. O’Neill, and R. E. Wolf. 2002. Evidence for “Prerecruitment” as a new mechanism of transcription activation in Escherichia coli: the large excess of SoxS binding sites per cell relative to the number of SoxS molecules per cell. Biochem. Biophys. Res. Comm., 291:979–986. 54. Griffith, K. L. and R. E. Wolf, Jr. 2004. Genetic evidence for pre-recruitment as the mechanism of transcription activation by SoxS of Escherichia coli: the dominance of DNA binding mutations of SoxS. J. Mol. Biol., 344:1–10. 55. Groisman, E. A. 2001. The pleiotropic two-component regulatory system PhoP-PhoQ. J. Bacteriol., 183:1835–1842. 56. Gross, C., M. Lonetto, and R. Losick. 1992. Bacterial Sigma Factors. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. 57. Gross, C. A., C. Chan, A. Dombroski, T. Gruber, M. Sharp, J. Tupy, and B. Young. 1998. The functional and regulatory roles of sigma factors in transcription. Cold Spring Harb. Symp. Quant. Biol., 63:141–155. 58. Gruber, T. M. and C. A. Gross. 2003. Multiple sigma subunits and the partitioning of bacterial transcription space. Ann. Rev. Microbiol., 57:441–466. 59. Hancock, R. E. W. and J. B. McPhee. 2005. Salmonella’s sensor for host defense molecules. Cell, 122:320–322. 60. Harley, C. B. and R. P. Reynolds. 1987. Analysis of E. coli promoter sequences. Nucl. Acids Res., 15:2343–2361. 61. Haugen, S. P., M. B. Berkmen, W. Ross, T. Gaal, C. Ward, and R. L. Gourse. 2006. rRNA promoter regulation by nonoptimal binding of σ region 1.2: an additional recognition element for RNA polymerase. Cell, 125:1069–1082. 62. Helmann, J. D. 1999. Anti-sigma factors. Curr.
Opin. Microbiol., 2:135–141.
63. Helmann, J. D. 2002. The extracytoplasmic function (ECF) sigma factors. Adv. Microb. Physiol., 46:47–110. 64. Hook-Barnard, I., X. B. Johnson, and D. M. Hinton. 2006. Escherichia coli RNA polymerase recognition of a σ70-Dependent Promoter Requiring a –35 DNA element and an extended -10 TGn motif. J. Bacteriol., 188:8352–8359. 65. Hughes, K. T. and K. Mathee. 1998. The anti-sigma factors. Ann. Rev. Microbiol., 52:231–286. 66. Hussain, H., J. Grove, L. Griffiths, S. Busby, and J. Cole. 1994. A seven-gene operon essential for formate-dependent nitrite reduction to ammonia by enteric bacteria. Mol. Microbiol., 12:153–163. 67. Ishihama, A. 2000. Functional modulation of Escherichia coli RNA polymerase. Ann. Rev. Microbiol., 54:499–518. 68. Ishihama, A. 1999. Modulation of the nucleoid, the transcription apparatus, and the translation machinery in bacteria for stationary phase survival. Genes Cells, 4:135–143. 69. Jin, D. J. and J. E. Cabrera. 2006. Coupling the distribution of RNA polymerase to global gene regulation and the dynamic structure of the bacterial nucleoid in Escherichia coli. J. Struct. Biol., 156:284–291. 70. Joung, J. K., L. U. Le, and A. Hochschild. 1993. Synergistic activation of transcription by Escherichia coli cAMP receptor protein. Proc. Natl. Acad. Sci. USA, 90:3083–3087. 71. Kallipolitis, B. H. and P. Valentin-Hansen. 2004. A role for the interdomain linker region of the Escherichia coli CytR regulator in repression complex formation. J. Mol. Biol., 342:1–7. 72. Kumari, S., C. M. Beatty, D. F. Browning, S. J. Busby, E. J. Simel, G. Hovel-Miner, and A. J. Wolfe. 2000. Regulation of acetyl coenzyme A synthetase in Escherichia coli. J. Bacteriol., 182:4173–4179. 73. Kumari, S., E. Simel, and A. J. Wolfe. 2000. Sigma70 is the principal sigma factor responsible for the transcription of acs, which encodes acetyl-CoA synthetase in Escherichia coli. J. Bacteriol., 182:551–554. 74. Kumari, S., R. Tishel, M. Eisenbach, and A. J. Wolfe. 1995. 
Cloning, characterization, and functional expression of acs, the gene which encodes acetyl coenzyme A synthetase in Escherichia coli. J. Bacteriol., 177:2878–2886. 75. Langdon, R. C. and A. Hochschild. 1999. A genetic method for dissecting the mechanism of transcriptional activator synergy by identical activators. Proc. Natl. Acad. Sci. USA, 96:12673–12678. 76. Lawson, C. L., D. Swigon, K. S. Murakami, S. A. Darst, H. M. Berman, and R. H. Ebright. 2004. Catabolite activator protein: DNA binding and transcription activation. Curr. Opin. Struct. Biol., 14:10–20. 77. Lee, D. J., S. J. W. Busby, and G. S. Lloyd. 2003. Exploitation of a chemical nuclease to investigate the location and orientation of the Escherichia coli RNA polymerase α subunit C-terminal domains at simple promoters that are activated by cyclic AMP receptor protein. J. Biol. Chem., 278:52944–52952. 78. Lee, S. J. and J. D. Gralla. 2004. Osmo-regulation of bacterial transcription via poised RNA polymerase. Mol. Cell, 14:153–162. 79. Lewis, M. 2005. The lac repressor. Comptes Rendus Biologies Retour sur l’operon lac 328:521–548. 80. Liu, M., S. Garges, and S. Adhya. 2004. lacP1 promoter with an extended - 10 motif: pleiotropic effects of cyclic amp protein at different steps of transcription initiation. J. Biol. Chem., 279:54552–54557. 81. Lloyd, G., P. Landini, and S. Busby. 2001. Activation and repression of transcription initiation in bacteria. Essays Biochem., 37:17–31. 82. Lloyd, G. S., W. Niu, J. Tebbutt, R. H. Ebright, and S. J. Busby. 2002. Requirement for two copies of RNA polymerase alpha subunit C-terminal domain for synergistic transcription activation at complex bacterial promoters. Genes Develop., 16:2557–2565. 83. Lynch, T. W., E. K. Read, A. N. Mattis, J. F. Gardner, and P. A. Rice. 2003. Integration host factor: putting a twist on protein-DNA recognition. J. Mol. Biol., 330:493–502.
Transcribing Metabolism Genes: Lessons from a Feral Promoter
12-25
84. Macchi, R., L. Montesissa, K. Murakami, A. Ishihama, V. de Lorenzo, and G. Bertoni. 2003. Recruitment of σ54-RNA Polymerase to the Pu promoter of Pseudomonas putida through integration host factor-mediated positioning switch of α subunit carboxyl-terminal domain on an UP-like element. J. Biol. Chem., 278:27695–27702. 85. Maeda, H., N. Fujita, and A. Ishihama. 2000. Competition among seven Escherichia coli σ subunits: relative binding affinities to the core RNA polymerase. Nucl. Acids Res., 28:3497–3503. 86. Magnusson, L. U., A. Farewell, and T. Nystrom. 2005. ppGpp: a global regulator in Escherichia coli. Trends Microbiol., 13:236–242. 87. Magnusson, L. U., B. Gummesson, P. Joksimovic, A. Farewell, and T. Nystrom. 2007. Identical, independent, and opposing roles of ppGpp and DksA in Escherichia coli. J. Bacteriol., JB.00330–07. 88. Marshall, F. A., S. L. Messenger, N. R. Wyborn, J. R. Guest, H. Wing, S. J. Busby, and J. Green. 2001. A novel promoter architecture for microaerobic activation by the anaerobic transcription factor FNR. Mol. Microbiol., 39:747–753. 89. Martin, R. G., W. K. Gillette, N. I. Martin, and J. L. Rosner. 2002. Complex formation between activator and RNA polymerase as the basis for transcriptional activation by MarA and SoxS in Escherichia coli. Mol. Microbiol., 43:355–370. 90. Martinez-Antonio, A. and J. Collado-Vides. 2003. Identifying global regulators in transcriptional regulatory networks in bacteria. Curr. Opin. Microbiol., 6:482–489. 91. Martinez-Antonio, A., S. C. Janga, H. Salgado, and J. Collado-Vides. 2006. Internal-sensing machinery directs the activity of the regulatory network in Escherichia coli. Trends Microbiol., 14:22–27. 92. Mathew, R. and D. Chatterji. 2006. The evolving story of the omega subunit of bacterial RNA polymerase. Trends Microbiol., 14:450–455. 93. McLeod, S., S. Aiyar, R. Gourse, and R. Johnson. 2002. 
The C-terminal domains of the RNA polymerase alpha subunits: contact site with Fis and localization during co-activation with CRP at the Escherichia coli proP P2 promoter. J. Mol. Biol., 316:517–529. 94. McLeod, S. M. and R. C. Johnson. 2001. Control of transcription by nucleoid proteins. Curr. Opin. Microbiol., 4:152–159. 95. Mitchell, J. E., D. Zheng, S. J. W. Busby, and S. D. Minchin. 2003. Identification and analysis of 'extended -10' promoters in Escherichia coli. Nucl. Acids Res., 31:4689–4695. 96. Mooney, R. A., S. A. Darst, and R. Landick. 2005. Sigma and RNA polymerase: An on-again, offagain relationship? Mol. Cell, 20:335–345. 97. Murakami, K. S. and S. A. Darst. 2003. Bacterial RNA polymerases: the wholo story. Curr. Opin. Struct. Biol., 13:31–39. 98. Murakami, K. S., S. Masuda, E. A. Campbell, O. Muzzin, and S. A. Darst. 2002. Structural basis of transcription initiation: an RNA polymerase holoenzyme-DNA complex. Science, 296:1285–1290. 99. Murtin, C., M. Engelhorn, J. Geiselmann, and F. Boccard. 1998. A quantitative UV laser footprinting analysis of the interaction of IHF with specific binding sites: re-evaluation of the effective concentration of IHF in the cell. J. Mol. Biol., 284:949–961. 100. Newberry, K. J. and R. G. Brennan. 2004. The structural mechanism for transcription activation by MerR family member multidrug transporter activation, N terminus. J. Biol. Chem., 279:20356–20362. 101. Nystrom, T. 2004. Growth versus maintenance: a trade-off dictated by RNA polymerase availability and sigma factor competition? Mol. Microbiol., 54:855–862. 102. Oh, M.-K., L. Rohlin, K. C. Kao, and J. C. Liao. 2002. Global expression profiling of acetate-grown Escherichia coli. J. Biol. Chem., 277:13175–13183. 103. Ohniwa, R. L., K. Morikawa, J. Kim, T. Kobori, K. Hizume, R. Matsumi, H. Atomi, T. Imanaka, T. Ohta, C. Wada, S. H. Yoshimura, and K. Takeyasu. 2007. 
Atomic force microscopy dissects the hierarchy of genome architectures in eukaryote, prokaryote, and chloroplast. Microsc. Microanal., 13:3–12.
12-26
Bacterial Transcriptional Regulation of Metabolism
104. Ohniwa, R. L., K. Morikawa, J. Kim, T. Ohta, A. Ishihama, C. Wada, and K. Takeyasu. 2006. Dynamic state of DNA topology is essential for genome condensation in bacteria. EMBO J., 25:5591–5602. 105. Ostrovsky de Spicer, P., K. O’Brien, and S. Maloy. 1991. Regulation of proline utilization in Salmonella typhimurium: a membrane-associated dehydrogenase binds DNA in vitro. J. Bacteriol., 173:211–219. 106. Paget, M. and J. Helmann. 2003. The sigma70 family of sigma factors. Genome Biol., 4:203. 107. Pemberton, I. K., G. Muskhelishvili, A. A. Travers, and M. Buckle. 2000. The G + C-rich discriminator region of the tyrT promoter antagonises the formation of stable preinitiation complexes. J. Mol. Biol., 299:859–864. 108. Perederina, A., V. Svetlov, M. N. Vassylyeva, T. H. Tahirov, S. Yokoyama, I. Artsimovitch, and D. G. Vassylyev. 2004. Regulation through the secondary channel–structural framework for ppGppDksA synergism during transcription. Cell, 118:297–309. 109. Perez-Rueda, E. and J. Collado-Vides. 2000. The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucl. Acids Res., 28:1838–1847. 110. Peterson, S. N., F. W. Dahlquist, and N. O. Reich. 2007. The role of high affinity non-specific DNA binding by Lrp in transcriptional regulation and DNA organization. J. Mol. Biol., 369:1307–1317. 111. Plumbridge, J. 2002. Regulation of gene expression in the PTS in Escherichia coli: the role and interactions of Mlc. Curr. Opin. Microbiol., 5:187–193. 112. Prost, L. R., M. E. Daley, V. Le Sage, M. W. Bader, H. Le Moual, R. E. Klevit, and S. I. Miller. 2007. Activation of the bacterial sensor kinase PhoQ by acidic pH. Mol. Cell, 26:165–174. 113. Pruss, B. M., J. M. Nelms, C. Park, and A. J. Wolfe. 1994. Mutations in NADH:ubiquinone oxidoreductase of Escherichia coli affect growth on mixed amino acids. J. Bacteriol., 176:2143–2150. 114. Ptashne, M. and A. Gann. 1997. Transcriptional activation by recruitment. Nature, 386:569–577. 115. 
Radonjic, M., J.-C. Andrau, P. Lijnzaad, P. Kemmeren, T. T. J. P. Kockelkorn, D. van Leenen, N. L. van Berkum, and F. C. P. Holstege. 2005. Genome-wide analyses reveal RNA polymerase II located upstream of genes poised for rapid response upon S. cerevisiae stationary phase exit. Mol. Cell, 18:171–183. 116. Rappas, M., D. Bose, and X. Zhang. 2007. Bacterial enhancer-binding proteins: unlocking σ54dependent gene transcription. Curr. Opin. Struct. Biol., 17:110–116. 117. Record, J. M. T., W. S. Reznikoff, M. L. Craig, K. L. McQuade, and P. J. Schlax. 1996. Escherichia coli RNA polymerase (Esigma70), promoters, and the kinetics of the steps of transcription initiation. In F. C. Neidhardt, Curtiss, C.A., Ingraham, J. L., Lin, E. C. C., Low, K. B., Magasanik, B., Reznikoff, W., Riley, M., Schaechter, M. Umbarger, H. E. (ed.), Escherichia coli and Salmonella:cellular and Molecular Biology, 2nd ed. ASM, Washington, DC. 118. Rhodius, V. A. and S. J. Busby. 1998. Positive activation of gene expression. Curr. Opin. Microbiol., 1:152–159. 119. Rice, P. A., S. Yan, K. Mizuuchi, and H. Nash. 1996. Crystal structure of an IHF–DNA complex: a protein-induced DNA U-turn. Cell, 87:1295–1306. 120. Ross, W., A. Ernst, and R. L. Gourse. 2001. Fine structure of E. coli RNA polymerase-promoter interactions: α subunit binding to the UP element minor groove. Genes Dev., 15:491–506. 121. Ross, W. and R. L. Gourse. 2005. Sequence-independent upstream DNA-αCTD interactions strongly stimulate Escherichia coli RNA polymerase-lacUV5 promoter association. Proc. Natl. Acad. Sci. USA, 102:291–296. 122. Ruiz, N. and T. J. Silhavy. 2005. Sensing external stress: watchdogs of the Escherichia coli cell envelope. Curr. Opin. Microbiol., 8:122–126. 123. Rutherford, S. T., J. J. Lemke, C. E. Vrentas, T. Gaal, W. Ross, and R. L. Gourse. 2007. Effects of DksA, GreA, and GreB on transcription initiation: insights into the mechanisms of factors that bind in the secondary channel of RNA polymerase. J. of Mol. 
Biol., 366:1243–1257. 124. Sadhale, P., J. Verma, and A. Naorem. 2007. Basal transcription machinery: role in regulation of stress response in eukaryotes. J. Biosci., 32:569–578.
Transcribing Metabolism Genes: Lessons from a Feral Promoter
12-27
125. Salgado, H., S. Gama-Castro, M. Peralta-Gil, E. Diaz-Peredo, F. Sanchez-Solano, A. Santos-Zavaleta, I. Martinez-Flores, V. Jimenez-Jacinto, C. Bonavides-Martinez, J. Segura-Salazar, A. MartinezAntonio, and J. Collado-Vides. 2006. RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucl. Acids Res., 34:D394–397. 126. Salgado, H., A. Santos-Zavaleta, S. Gama-Castro, M. Peralta-Gil, M. Penaloza-Spinola, A. MartinezAntonio, P. Karp, and J. Collado-Vides. 2006. The comprehensive updated regulatory network of Escherichia coli K-12. BMC Bioinformatics, 7:5. 127. Sanderson, A., J. E. Mitchell, S. D. Minchin, and S. J. W. Busby. 2003. Substitutions in the Escherichia coli RNA polymerase σ70 factor that affect recognition of extended -10 elements at promoters. FEBS Lett., 544:199–205. 128. Schroder, O. and R. Wagner. 2000. The bacterial DNA-binding protein H-NS represses ribosomal RNA transcription by trapping RNA polymerase in the initiation complex. J. Mol. Biol., 298:737–748. 129. Sclavi, B., C. M. Beatty, D. S. Thach, C. E. Fredericks, M. Buckle, and A. J. Wolfe. 2007. The multiple roles of CRP at the complex acs promoter depend on activation of region 2 and IHF. Mol. Microbiol., 65:425–440. 130. Seshasayee, A. S., P. Bertone, G. M. Fraser, and N. M. Luscombe. 2006. Transcriptional regulatory networks in bacteria: from input signals to output responses. Curr. Opin. Microbiol., 9:511–519. 131. Shah, I. M. and R. E. Wolf. 2006. Inhibition of Lon-dependent degradation of the Escherichia coli transcription activator SoxS by interaction with ‘soxbox’ DNA or RNA polymerase. Mol. Microbiol., 60:199–208. 132. Shin, S., S. G. Song, D. S. Lee, J. G. Pan, and C. Park. 1997. Involvement of iclR and rpoS in the induction of acs, the gene for acetyl coenzyme A synthetase of Escherichia coli K-12. FEMS Microbiol. Lett., 146:103–108. 133. Shultzaberger, R. K., Z. Chen, K. A. Lewis, and T. D. Schneider. 2007. 
Anatomy of Escherichia coli σ70 promoters. Nucl. Acids Res., 35:771–788. 134. Skoko, D., D. Yoo, H. Bai, B. Schnurr, J. Yan, S. M. McLeod, J. F. Marko, and R. C. Johnson. 2006. Mechanism of chromosome compaction and looping by the Escherichia coli nucleoid protein Fis. J. Mol. Biol., 364:777–798. 135. Stewart, V. 2003. Biochemical Society Special Lecture. Nitrate- and nitrite-responsive sensors NarX and NarQ of proteobacteria. Biochem. Soc. Trans., 31:1–10. 136. Stock, A. M., V. L. Robinson, and P. N. Goudreau. 2000. Two-component signal transduction. Ann. Rev. Biochem., 69:183–215. 137. Tagami, H. and H. Aiba. 1999. An inactive open complex mediated by an UP element at Escherichia coli promoters. Proc. Natl. Acad. Sci. USA, 96:7202–7207. 138. Tebbutt, J., V. A. Rhodius, C. L. Webster, and S. J. Busby. 2002. Architectural requirements for optimal activation by tandem CRP molecules at a class I CRP-dependent promoter. FEMS Microbiol. Lett., 210:55–60. 139. Thouvenot, B., Charpentier, B., and Branlant, C. 2004. The strong efficiency of the Escherichia coli gapA P1 promoter depends on a complex combination of functional determinants. Biochem. J., 383:371–382. 140. Tomsic, M., L. Tsujikawa, G. Panaghie, Y. Wang, J. Azok, and P. L. deHaseth. 2001. Different roles for basic and aromatic amino acids in conserved Region 2 of Escherichia coli sigma 70 in the nucleation and maintenance of the single-stranded DNA bubble in open RNA polymerase-promoter complexes. J. Biol. Chem., 276:31891–31896. 141. Travers, A. 1997. DNA-protein interactions: IHF - the master bender. Curr. Biol., 7:R252–R254. 142. Traxler, M. F., D.-E. Chang, and T. Conway. 2006. Guanosine 3',5'-bispyrophosphate coordinates global gene expression during glucose-lactose diauxie in Escherichia coli. Proc. Natl. Acad. Sci. USA, 103:2374–2379.
12-28
Bacterial Transcriptional Regulation of Metabolism
143. Tseng, G. C., M.-K. Oh, L. Rohlin, J. C. Liao, and W. H. Wong. 2001. Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucl. Acids. Res., 29:2549–2557. 144. Tsujikawa, L., O. V. Tsodikov, and P. L. deHaseth. 2002. Interaction of RNA polymerase with forked DNA: Evidence for two kinetically significant intermediates on the pathway to the final complex. Proc. Natl. Acad. Sci. USA, 99:3493–3498. 145. Tyson, K. L., J. A. Cole, and S. J. Busby. 1994. Nitrite and nitrate regulation at the promoters of two Escherichia coli operons encoding nitrite reductase: identification of common target heptamers for both NarP- and NarL-dependent regulation. Mol. Microbiol., 13:1045–55. 146. Wade, J. T., T. A. Belyaeva, E. I. Hyde, and S. J. Busby. 2001. A simple mechanism for co-dependence on two activators at an Escherichia coli promoter. EMBO J., 20:7160–7167. 147. West, A. H. and A. M. Stock. 2001. Histidine kinases and response regulator proteins in two-component signaling systems. Trends Biochem. Sci., 26:369–376. 148. Wilson, C. J., Zhan, H., Swint-Kruse, L., and Matthews, K. S. 2007. The lactose repressor system: paradigms for regulation, allosteric behavior and protein folding. Cell Mol. Life Sci., 64:3–16. 149. Wolfe, A. J. 2005. The acetate switch. Microbiol. Mol. Biol. Rev., 69:12–50.
13 Regulation of Secondary Metabolism in Bacteria
Wenjun Zhang, Joshua P. Ferreira, and Yi Tang
University of California
13.1 Introduction
13.2 Pathway Specific Regulators
13.3 Pleiotropic and Global Regulators in Actinomycetes
Signal Transduction: Regulatory Protein Phosphorylation/Dephosphorylation • Autoregulator Signaling Systems • Other Global Regulation Mechanisms • Pleiotropic or Pathway-Specific Regulators?
13.4 Nutritional and Physiological Factors
Carbon Metabolism and Cyclic AMP • ppGpp and Nitrogen-Limitation • Phosphate Control • S-Adenosylmethionine Activation • Light Induction
13.5 Other Functions of Secondary Metabolites
13.6 Conclusions
References
13.1 Introduction
Secondary metabolism has long been classified as the set of cellular metabolic activities that are not essential for bacterial growth, respiration, development, and reproduction. An enormous variety of bioactive molecules can be formed from the secondary metabolic pathways,1 many of which have ecological and environmental functions. These bioactive chemical compounds, called secondary metabolites, can benefit the producing organisms and improve their survival fitness by serving as bactericides or fungicides against invading microorganisms, or by acting at specific receptors in competing organisms, as proposed by Williams et al. in 1989.2 Despite these observations, the exact purposes served by a large fraction of secondary metabolites in living organisms are not completely understood.3 The bioactive molecules are extremely important to humans because of their wide spectrum of pharmaceutical properties, such as antibacterial, antitumor, antifungal, antihelminthic, antiviral, herbicidal, insecticidal, and immunosuppressive activities.4 There have been intense efforts in the identification, isolation, and characterization of new compounds from bacterial sources. Secondary metabolites are often produced at low titers in the microorganisms, and the slow growth of producing hosts further impedes large-scale production. Furthermore, a majority of secondary metabolites are complex organic compounds that are difficult to synthesize and derivatize. Biotechnological advances, such as metabolic engineering, have been used effectively to produce numerous valuable secondary metabolites in higher quantities to meet the vast demand. One important approach is to fine-tune the regulatory mechanisms that govern secondary metabolite production. A number of examples demonstrating elevated titers of secondary metabolites as a result of overexpressing positive regulatory genes have been described.5,6
However, to rationally improve the efficiencies of these bacterial factories, a more complete and global understanding of the regulation pathways is needed. Although secondary metabolism is prevalent in nearly all living organisms, we are most interested in the secondary metabolism of soilborne actinomycetes and filamentous fungi.7 Actinomycetes are spore-forming, Gram-positive bacteria and are abundant sources of important bioactive natural products, such as tetracycline (Streptomyces aureofaciens), daunorubicin (Streptomyces peucetius), erythromycin (Saccharopolyspora erythraea), and amphotericin B (Streptomyces nodosus), to name a few. Among actinomycete species, the microbiology, genetics, and secondary metabolism of the streptomycetes have been the most extensively studied. Streptomycetes have a typical spore-to-spore life cycle, with several stages of differentiation and unusual aerial hyphae formation. A typical life cycle is as follows:
1. Germ tubes emerge from spores and extend into a mycelium of branching filaments (hyphae).
2. During growth of the vegetative mycelium, which essentially converts available nutrients into biomass, certain hyphae extend into the air.
3. The aerial hyphae are divided into compartments and transform into long chains of unigenomic spores.
In cultures grown on solid media, the production of secondary metabolites by streptomycetes is tied to the Streptomyces developmental cycle. Production of metabolites can generally be detected at the onset of, and throughout, aerial hyphae and spore formation,8,9 which is usually associated with nutrient-limited conditions. In cultures grown in liquid media, where most streptomycetes do not form spores, production of secondary metabolites is generally associated with the stationary growth phase that follows vegetative growth. Production of secondary metabolites is therefore closely synchronized with the life cycles of actinomycetes, and a complex regulatory network controls both development and secondary metabolism. The regulation of secondary metabolism in actinomycetes is characterized by its diversity and complexity.10 As a result, understanding and deconvoluting the regulatory cascades poses tremendous challenges, and many fundamental mechanisms are still unknown. Different experimental techniques have been employed. The most classic method is gene knockout followed by complementation, which is very useful in demonstrating the in vivo function of a gene. Mutations can be made randomly throughout the genome or in a site-specific fashion to induce an observable phenotype, followed by complementation to identify the mutated gene responsible for the physiological difference. One drawback of mutagenesis is that the underlying gene function remains undetermined, since disruption of different regulatory genes in the same regulatory cascade can lead to identical phenotypes. Biochemical methods can detect protein-DNA interactions directly, thereby assaying the roles of regulatory proteins and revealing their binding sites. For example, a gel mobility shift assay can give some information about the multiple complexes formed between DNA and proteins, while DNase I footprinting can efficiently localize the binding site of a protein on DNA.
In addition, reverse transcription-polymerase chain reaction (RT-PCR) and real-time PCR allow rapid quantification of gene expression levels in a particular cell or tissue. Reporter gene fusion is also a good method for determining gene expression levels and thereby identifying regulatory elements. Recently, the entire genomes of Streptomyces coelicolor and Streptomyces avermitilis have been sequenced,11,12 which has made research on the global regulatory network possible. Bioinformatics and proteomic studies have shed some light on the complex interactions involved in regulation.13,14 Insight into actinomycete gene function has also been achieved by systematic insertional mutagenesis on a genome-wide basis, by which the desired gene can be disrupted with high efficiency.15,16 Cohen et al. have used DNA microarrays to simultaneously and globally assess factors that affect the transcription of S. coelicolor genes and regulatory pathways for antibiotic biosynthesis.17 Regulation of secondary metabolism in actinomycetes includes at least three layers of cellular control and differs for individual secondary metabolites in different species.
1. The first level includes the structural genes and the respective proteins that direct biosynthesis of a particular secondary metabolite.18 Genes for the biosynthesis of individual secondary metabolites, such as polyketides and nonribosomal peptides, are arranged in large gene clusters on host chromosomes.
2. The second level of regulation is exerted by the protein products of dedicated regulatory genes, along with other pathway-specific regulators (regulators that control the biosynthesis of only one particular secondary metabolite). Many of these transcriptional regulatory proteins share very similar protein sequences and three-dimensional structures, and belong either to the Streptomyces antibiotic regulatory protein (SARP) family19 in streptomycetes or to the less common LAL family (large ATP-binding regulators of the LuxR family).20
3. The third and most complex level of regulation is exerted by pleiotropic and global regulators, along with several signaling pathways. The global regulators influence the activity of pathway-specific regulatory genes (such as those encoding SARP and LAL proteins) at the end of signal transduction cascades. Some global regulators are also involved in morphogenesis and development, and their disruption can lead to abnormalities in both secondary metabolism and differentiation. In addition, the global regulators are sensitive to growth and environmental conditions, such as pH, carbon source, and nitrogen abundance.
In the last few years, many individual secondary metabolite biosynthetic gene clusters from actinomycetes have been characterized, and S. coelicolor has often been used as a model organism for studying the regulation of secondary metabolism.21 This species produces four chemically distinct antibiotics: actinorhodin (Act), undecylprodigiosin (Red), calcium-dependent antibiotic (CDA), and methylenomycin (Mmy). The genes responsible for the synthesis of each of the four antibiotics are clustered in distinct regions of the genome. The production of the deeply pigmented Act (blue) and Red (red) has commonly been used as a phenotypic marker in genetic screens. Many members of the pathway-specific and global regulatory cascades have been identified through either up-regulation or down-regulation of the biosynthesis of these two compounds.
13.2 Pathway Specific Regulators
Most antibiotic gene clusters contain pathway-specific regulatory genes. In most cases, the pathway-specific regulators positively control the transcription of the biosynthetic genes, although some act as transcriptional repressors. The best-known family of proteins among the pathway-specific regulators is the SARP family. SARPs are similar in sequence to the OmpR family of DNA-binding domains and contain N-terminal helix-turn-helix motifs that bind to promoter regions of structural genes.19,22 The SARP family of proteins has only been found in actinomycetes, and most of its members occur among the streptomycetes (including ActII-ORF4, RedD, DnrI, CcaR, etc.). Although some members of this family (e.g., AfsR) are pleiotropic regulators,23,24 most SARP family members are closely associated with individual secondary metabolite gene clusters and work solely as pathway-specific regulators. Although in the majority of cases only a single SARP-encoding gene is found in a biosynthetic gene cluster, some clusters have more than one SARP-encoding gene, while some antibiotic biosynthetic clusters have no SARP regulatory genes at all. The regulatory role played by SARPs is simple: they directly up-regulate the expression of the biosynthetic genes present in the associated gene clusters. Hence SARPs act at the end of the signal transduction cascade and do not communicate with additional, downstream regulators. Transcription of SARP activators is in turn regulated by complex, upstream global regulators and is growth-phase dependent. Transcription usually increases during the transition period between exponential growth and the stationary phase in liquid medium, which is correlated with secondary metabolite production. The prototypical SARP regulator is ActII-ORF4, which regulates the synthesis of Act in S. coelicolor A3(2).25 ActII-ORF4 has been characterized as a regulatory protein with specific DNA-binding activity.26 ActII-ORF4 plays a positive role in the transcription of the biosynthetic genes of Act. Deletion of
actII-ORF4 resulted in no transcription of act biosynthetic genes, while cloned, extra copies of actII-ORF4 caused Act overproduction. Accumulation of actII-ORF4 transcripts was found to be limited to the post-exponential growth period, and introduction of extra plasmid-borne copies of actII-ORF4 induced exponential-phase Act production.25 Therefore, the temporal regulation of actII-ORF4 transcription is largely responsible for growth-phase dependent Act production in defined media. Besides actII-ORF4, many other pathway-specific regulatory genes have been identified from streptomycetes, including strR for the streptomycin pathway in S. griseus27 and dnrI for the daunorubicin pathway in S. peucetius.28 Both regulators have been shown experimentally to bind specifically to promoter regions within the corresponding gene clusters. The list of defined pathway-specific regulators is growing rapidly, and some of them are shown in Table 13.1. While the SARP family of proteins gained early recognition and its members are still the most important pathway-specific regulators, characterization of several other macrolide antibiotic pathways has led to the identification of a set of novel transcriptional regulators known as the LAL family, and several homologs are listed in Table 13.1. LAL proteins contain an N-terminal ATP-binding domain and a C-terminal helix-turn-helix DNA-binding domain also found in LuxR. Proteins in the LAL family (between 800 and 1200 amino acids) are significantly larger than those in the SARP family (between 200 and 700 amino acids). Analysis of PikD by Wilson et al.20 has shown that deletion of pikD from the chromosome of S. venezuelae results in complete loss of pikromycin production, and that complementation by a plasmid carrying pikD restores macrolide biosynthesis, demonstrating that PikD is an essential positive regulator.
Overall, very little is known about the role of LALs in the regulation of secondary metabolism, and more studies are needed to provide insight into their functions.
Table 13.1 Examples of Pathway-Specific Regulators in Streptomycetes
Example of Producing Organism | Example of Secondary Metabolite | Pathway-Specific Regulator | Family of Regulatory Proteins | Reference
S. coelicolor | Actinorhodin | ActII-ORF4 | SARP | 26
S. coelicolor | Undecylprodigiosin | RedD | SARP | 29,30
S. nogalater | Nogalamycin | SnoA | SARP | 31
S. clavuligerus | Cephamycin C, clavulanic acid | CcaR | SARP | 32,33
S. peucetius | Daunorubicin | DnrI | SARP | 22,28
S. argillaceus | Mithramycin | MtmR | SARP | 34
S. ambofaciens | Alpomycin | AlpV | SARP | 35
S. griseus | Streptomycin | StrR | SARP | 27
S. spheroides | Novobiocin | NovG | SARP | 36
S. cyanogenus S136 | Landomycin A | LanI | SARP | 37
S. globisporus 1912 | Landomycin E | LndI | SARP | 38
S. noursei | Nystatin | NysRI, NysRII, NysRIII, NysRIV | LAL | 39
S. hygroscopicus | Rapamycin | RapH | LAL | 40
S. venezuelae | Pikromycin | PikD | LAL | 20
S. natalensis | Pimaricin | PimR | SARP+LAL | 41
In streptomycetes, several antibiotic regulatory proteins have been identified that belong to the family of response regulators of bacterial two-component signal transduction systems, which are prevalent among prokaryotes.42 These widespread regulatory systems control the expression of sets of genes based on external stimuli and internal signals. A two-component system consists of a sensor histidine kinase, which is autophosphorylated in response to a specific signal, and a response regulator that activates the transcription of target genes. Some well-characterized two-component systems in actinomycetes act as global regulators (e.g., AbsA1-AbsA2 (ref. 43) and CutR-CutS (ref. 44)). Interestingly, some of the pathway-specific regulator genes encode proteins similar to response regulators. For example, Aur1P is a transcriptional activator for the biosynthesis of auricin in S. aureofaciens CCM 3239 that binds to the promoter and directs expression of the aur1 cluster.45 However, these pathway-specific response regulators (e.g., RedZ (ref. 46) and DnrN (ref. 47)) do not appear to be associated with any sensor histidine kinases and usually lack a conserved aspartate residue that is required for phosphorylation. Despite the successful characterization of numerous genes and pathway-specific regulators, a more fundamental question is how these pathway-specific regulatory genes are themselves regulated. Uguru et al. have identified and characterized a transcription factor (AtrA) that is required for maximum transcription of actII-ORF4.48 Interestingly, AtrA is a homolog of TetR-family transcriptional regulators, which usually function as repressors. AtrA has been shown to act specifically on actII-ORF4, since disruption of AtrA affects only the production of Act and does not affect Red or CDA biosynthesis in S. coelicolor. In the next section, we discuss the more fundamental regulators of secondary metabolism: pleiotropic regulators and signaling systems.
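The dependency of Act production on AtrA and ActII-ORF4 can be pictured as a small directed graph. The Python sketch below is illustrative only and makes several assumptions: the edge list encodes just the activation relationships named in this section and in Table 13.1 (AtrA on actII-ORF4, ActII-ORF4 on the act genes, RedD on the red genes), the node labels and the function name `affected_antibiotics` are invented for the example, and the model is purely qualitative (it ignores, for instance, that AtrA is required for maximal rather than all actII-ORF4 transcription).

```python
from collections import deque

# Activation edges taken from the text; node labels are illustrative assumptions.
ACTIVATES = {
    "atrA": ["actII-ORF4"],
    "actII-ORF4": ["act biosynthetic genes"],
    "act biosynthetic genes": ["Act"],
    "redD": ["red biosynthetic genes"],
    "red biosynthetic genes": ["Red"],
}

# CDA is listed as a product, but no regulator for it is modeled here.
ANTIBIOTICS = {"Act", "Red", "CDA"}


def affected_antibiotics(disrupted_gene):
    """Return the antibiotics downstream of a disrupted gene (breadth-first search)."""
    seen = set()
    queue = deque([disrupted_gene])
    while queue:
        node = queue.popleft()
        for target in ACTIVATES.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen & ANTIBIOTICS


if __name__ == "__main__":
    for gene in ("atrA", "actII-ORF4", "redD"):
        print(gene, "->", sorted(affected_antibiotics(gene)))
```

In this toy model, disrupting atrA and disrupting actII-ORF4 give the same result (only Act is affected), which mirrors the earlier caveat that mutations in different genes of one cascade can produce identical phenotypes.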
13.3 Pleiotropic and Global Regulators in Actinomycetes
Two classes of regulators act above the pathway-specific regulators. First, pleiotropic regulatory genes exert control over two or more secondary metabolites in the same organism while having no effect on morphological differentiation. Second, and more fundamental, are genes encoding global regulators, which control both secondary metabolism and morphological differentiation, such as many bld genes in S. coelicolor.49 Both types of regulators are present in many different regulatory networks. Because this chapter focuses on the production of secondary metabolites, most of the genes that are also required for morphological differentiation are not discussed.
13.3.1 Signal Transduction: Regulatory Protein Phosphorylation/Dephosphorylation
Reversible protein phosphorylation/dephosphorylation is a ubiquitous mechanism used to regulate protein and cellular functions in living organisms. Different amino acids within a protein can be phosphorylated: O-phosphorylation occurs on the hydroxyl-containing amino acids serine, threonine, and tyrosine; N-phosphorylation occurs on histidine; and acyl-phosphorylation occurs on the acidic amino acid aspartic acid.50 Phosphorylation of a protein induces structural and functional changes and serves as a switch in the signal transduction cascade. Although these sensory systems are widespread among bacteria, very little is known about the signals sensed by the sensor kinase or, in many cases, the targets of the response regulator of each system. Protein N- and acyl-phosphorylation were identified initially in prokaryotes, where they are involved in the two-component signal transduction network.42 At least five S. coelicolor two-component systems have been found to influence antibiotic production globally: AbsA1-AbsA2,43,51,52 CutR-CutS,44 PhoR-PhoP,53 OsaA-OsaB,15 and AfsQ1-AfsQ2.54 The first four pairs function as negative regulators, and disruption mutations in each of the pairs lead to antibiotic overproduction. The PhoR-PhoP system is also found in other streptomycetes, such as S. lividans, where it can control both primary and secondary metabolism.55 Notably, the activities of the PhoR-PhoP system are always related to intracellular phosphate levels.
OsaA-OsaB, involved in osmoadaptation, also functions as a negative regulator, and disruption mutations result in failure of aerial hyphae formation and in antibiotic overproduction.15 The function of AfsQ1-AfsQ2 is still unclear, although a positive effect on antibiotic production is observed when the genes are overexpressed on a low-copy-number plasmid.54 The range of environmental stimuli to which an organism can respond is usually directly linked to the number of sensor kinases encoded by that organism’s genome, and the number of sensor kinases is in turn proportional to the size of the genome.56,57 With the knowledge of the S. coelicolor genome sequence, Hutchings et al.
have identified 67 pairs of sensor kinase and response regulator genes, 17 unpaired sensor kinase genes, and 13 unpaired response regulator genes encoded within the S. coelicolor genome.57 Protein O-phosphorylation was originally discovered in eukaryotes, where it serves as the backbone of the signal transduction network, and is a common regulatory mechanism in prokaryotes as well.58 Many eukaryotic-type Ser/Thr protein kinase (ESTPK) genes have been identified in diverse bacterial species, including some related to global regulation of secondary metabolism. Although 34 putative ESTPK genes have been revealed in the genome sequence of S. coelicolor, the roles of most of them remain unclear.59 The prototypical AfsK-AfsR pair has been very well studied by Horinouchi’s group.24,60–62 Figure 13.1 shows a model of a signal transduction pathway involving AfsR and AfsK in S. coelicolor. The AfsK-AfsR system globally controls secondary metabolism. AfsK is the sensor kinase and can activate its kinase activity through autophosphorylation upon sensing external stimuli. AfsR, the response regulator, is phosphorylated by AfsK, which significantly increases the DNA-binding activity of AfsR. Interestingly, AfsR is also phosphorylated by other kinases in the cell, such as PkaG and AfsL. This led Horinouchi to propose that AfsR is capable of processing signals from multiple kinases. The activated AfsR increases transcription of afsS, which encodes a 63-amino-acid protein that activates transcription of pathway-specific transcriptional activator genes, such as actII-ORF4 and redD. Inspection of the genomes of S. coelicolor A3(2) and S. avermitilis reveals that each contains 55 putative eukaryotic-type protein phosphatases,63 which remove the phosphate group from phosphorylated proteins and thereby regulate the phosphorylation levels of phosphoproteins. The first protein phosphatase gene reported to be involved in the regulation of secondary metabolism is ptpA from S. coelicolor A3(2).64,65 Although disruption of ptpA has no effect on secondary metabolism, overexpression of ptpA in S. lividans increases production of Act and Red. Another interesting example of a phosphatase is AbsA1, which has been shown to
Figure 13.1 Signal transduction involving AfsR and AfsK in S. coelicolor. Genes and proteins are displayed in square and round boxes, respectively. + and - signs indicate the types of regulatory control. [Diagram labels: unknown signals; cell membrane (out/in); kinases PkaG, AfsK, and AfsL and their phosphorylated forms; KbpA (-); AfsR and phosphorylated AfsR; binding to other proteins; afsS (+); actII-ORF4 and redD (+); secondary metabolism.]
be both an AbsA2 kinase and an AbsA2~P phosphatase.66 The phosphorylation state of AbsA2, which has been shown to be involved in the regulation of secondary metabolism, is determined by the balance of the kinase and phosphatase activities of AbsA1.
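The idea that the phosphorylation state of AbsA2 reflects a balance between two opposing AbsA1 activities can be illustrated with a very small calculation. The sketch below is only an illustration under an assumed first-order model; the function name and the rate constants k_kinase and k_phosphatase are hypothetical and are not taken from the cited studies.

```python
def phosphorylated_fraction(k_kinase: float, k_phosphatase: float) -> float:
    """Steady-state fraction of AbsA2 present as AbsA2~P.

    Assumed (hypothetical) first-order model, with f the phosphorylated fraction:
        df/dt = k_kinase * (1 - f) - k_phosphatase * f
    Setting df/dt = 0 gives f = k_kinase / (k_kinase + k_phosphatase).
    """
    if k_kinase < 0 or k_phosphatase < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_kinase + k_phosphatase
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_kinase / total


if __name__ == "__main__":
    # Arbitrary illustrative rate constants (not measured values).
    for k_kin, k_phos in [(1.0, 1.0), (3.0, 1.0), (1.0, 9.0)]:
        f = phosphorylated_fraction(k_kin, k_phos)
        print(f"k_kinase={k_kin}, k_phosphatase={k_phos}: fraction AbsA2~P = {f:.2f}")
```

Under these assumptions, shifting the balance toward the phosphatase activity lowers the phosphorylated fraction without any change in the total amount of AbsA2.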
13.3.2 Autoregulator Signaling Systems
Production of γ-butyrolactones, diffusible signaling molecules that control secondary metabolism and/or morphological differentiation, is widespread among actinomycetes (Figure 13.2a).67,68 These compounds are autoregulators produced endogenously and are secreted into the medium at specific phases during growth. The γ-butyrolactones are extremely effective as facilitators of signaling systems at nanomolar concentrations and can bind to receptor proteins with high specificity. All of the γ-butyrolactone
Figure 13.2 (a) γ-Butyrolactone autoregulators found in actinomycetes. (b) Regulatory cascade involving A-factor in S. griseus. [Panel (a) structures: A-factor from S. griseus (A-factor type); SCB1 from S. coelicolor A3(2) and IM-2 from S. lavendulae FRI-5 (IM-2 type); virginiae butanolides A-E from S. virginiae (VB type). Panel (b) labels: unknown signal; AfsA; A-factor; arpA (-); adpA (+); strR (+); sporulation; streptomycin.]
autoregulators identified so far possess a characteristic 2,3-disubstituted γ-butyrolactone skeleton and can be classified into three types based on minor structural differences in the C-2 side chain: (a) virginiae butanolide (VB) type, possessing a 6-α-hydroxy group; (b) IM-2 type, possessing a 6-β-hydroxy group; and (c) A-factor type, possessing a 6-keto group. Accordingly, there are type-specific receptor proteins that recognize the subtle structural differences among the three types of autoregulators. The chemical structures of the γ-butyrolactone autoregulators are somewhat similar to those of the autoinducer acyl homoserine lactones, which many Gram-negative bacteria use in quorum sensing-controlled processes (e.g., bioluminescence, biofilm formation, virulence factor expression, antibiotic production, sporulation, and competence for DNA uptake).69 Increasing experimental evidence suggests, however, that γ-butyrolactones do not simply function as indicators of population density. Instead, they are synthesized at specific times to activate secondary metabolism and morphological differentiation, and perhaps to regulate cross-talk between individual secondary metabolic pathways. The first γ-butyrolactone discovered, and the best characterized, is A-factor (2-isocapryloyl-3R-hydroxymethyl-γ-butyrolactone), which is required for both antibiotic (streptomycin) production and sporulation in S. griseus.70–74 The signaling cascade is shown in Figure 13.2b. Synthesis of A-factor by AfsA is induced by as yet unknown signals. When a detectable level of A-factor has been synthesized and has accumulated in the culture medium, genes involved in the biosynthesis of streptomycin and in sporulation are up-regulated. It is believed that A-factor binds to the cytoplasmic A-factor-binding protein/repressor ArpA and causes ArpA to be released from the adpA promoter. In the absence of A-factor, ArpA represses the transcription of the pleiotropic regulatory gene adpA.
Up-regulation of adpA expression activates transcription of the pathway-specific activator gene strR, which in turn initiates the transcription of the streptomycin structural genes. The expression of the global regulator AdpA, which is also necessary for sporulation, is autorepressed.75 The ArpA protein has been shown to bind to a 22 bp palindromic ARE box formed by two inverted repeats of 5′-GG(T/C)CGGT(A/T)(T/C)G(T/G)-3′.76 ARE boxes for binding of butyrolactone receptor proteins are present upstream of several genes encoding SARP proteins. For example, butyrolactone receptor protein homologs bind to the ARE sequence and modulate clavulanic acid and cephamycin biosynthesis through their action on ccaR gene expression.77 In contrast to A-factor, some γ-butyrolactones appear to be devoted to the regulation of secondary metabolism and generally work as pathway-specific regulators. VB controls virginiamycin biosynthesis in S. virginiae through the receptor protein BarA, followed by another negative regulator, BarB,87 in the signaling cascade. IM-2 and its receptor FarA control antibiotic production in S. lavendulae FRI-5 but do not participate in morphological control.79 The tylosin biosynthetic regulatory cascade in S. fradiae, which contains five putative regulatory genes, is another interesting example of an autoregulatory system.82,88–91 Two of these genes (tylP and tylQ) encode γ-butyrolactone-binding proteins, while two others (tylT and tylS) encode SARPs. While most γ-butyrolactone-binding proteins have inhibitory roles in regulating transcription of secondary metabolic gene clusters (Table 13.2), some play positive roles (e.g., SpbR in pristinamycin production in S. pristinaespiralis).86 S. coelicolor A3(2) produces several γ-butyrolactones. Addition of SCB1 to cultures of S. coelicolor causes precocious production of Act and Red, demonstrating a positive effect on secondary metabolism. Unlike the A-factor model in S. griseus, the role of SCB1 in regulating antibiotic production is quite complex. ScbA is a positive autoregulator and is required for γ-butyrolactone production. ScbR (the γ-butyrolactone-binding protein) activates the transcription of scbA and represses its own expression by binding to the divergent promoter region, and these regulatory activities are inhibited by SCB1. DNA microarray analysis revealed a direct target for regulation by ScbR, kasO, a SARP-family pathway-specific regulatory gene. ScbR represses kasO until transition phase, when presumably SCB1 accumulates in sufficient quantity to bind to ScbR and relieve kasO repression. Expression of the cryptic antibiotic gene cluster is reported to be reduced or undetectable in a kasO deletion mutant.81 Overall, our understanding of γ-butyrolactone biosynthesis is very limited. A known key step is the coupling of a β-keto acyl chain with a glycerol-derived three-carbon precursor. The afsA gene from S.
Table 13.2 Examples of γ-Butyrolactones in Actinomycetes
Example of Producing Organism | Example of Secondary Metabolite | γ-Butyrolactone | γ-Butyrolactone Receptor Protein | Reference
S. griseus | Streptomycin | A-factor (2-isocapryloyl-3R-hydroxymethyl-γ-butyrolactone) | ArpA | 70–74
S. virginiae | Virginiamycin | VB (virginiae butanolides) | BarA | 78
S. lavendulae FRI-5 | Showdomycin, minimycin | IM-2 ((2R,3R,1′R)-2-(1′-hydroxybutyl)-3-hydroxymethyl-γ-butanolide) | FarA | 79
S. coelicolor | Act, Red (not related?) | SCB1 ((2R,3R,1′R)-2-(1′-hydroxy-6-methylheptyl)-3-hydroxymethylbutanolide) | ScbR | 80,81
S. fradiae | Tylosin | ? | TylP | 82
S. clavuligerus | Clavulanic acid, cephamycin C | ? | ScaR | 83
S. natalensis | Natamycin | ? | SngR | 84
Kitasatospora setae | Bafilomycin B1 | ? | KsbA | 85
S. pristinaespiralis | Pristinamycin | ? | SpbR | 86
griseus was thought to encode the coupling enzyme for A-factor synthesis. Studies of AfsA homologues, in particular BarX from S. virginiae92 and ScbA from S. coelicolor,80 suggest that this family of proteins may have additional regulatory functions. BarS1 has been identified as the enzyme that catalyzes the last reduction step of VB biosynthesis in S. virginiae.93 The autoregulator phenomenon is widespread in actinomycetes and involves different types of autoinducer molecules in addition to γ-butyrolactones. PI factor is a novel type of quorum-sensing (QS) inducer recently found in S. natalensis that can elicit production of pimaricin.94 In addition to these extracellular small molecules, bacteria can also make different kinds of intracellular small molecules to integrate numerous sensory inputs and offer flexibility of recognition and response. These are discussed later, together with the nutritional and physiological factors.
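The ARE half-site consensus quoted earlier in this section, 5′-GG(T/C)CGGT(A/T)(T/C)G(T/G)-3′, can be translated directly into a pattern for scanning promoter regions. The following is a minimal sketch, assuming exact matches to the consensus and using a made-up test sequence; the function names are hypothetical, and real ARE boxes may deviate from the consensus, so a position weight matrix would usually be preferable in practice.

```python
import re

# Consensus ARE half-site from the text: GG(T/C)CGGT(A/T)(T/C)G(T/G)
ARE_HALF_SITE = re.compile(r"GG[TC]CGGT[AT][TC]G[TG]")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_are_half_sites(seq: str):
    """Return sorted (start, strand, match) tuples for exact consensus matches.

    'start' is the 0-based forward-strand coordinate of the leftmost base
    covered by the match. Matches on each strand are non-overlapping.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group()) for m in ARE_HALF_SITE.finditer(seq)]
    for m in ARE_HALF_SITE.finditer(reverse_complement(seq)):
        # Convert reverse-strand coordinates back to the forward strand.
        hits.append((n - m.end(), "-", m.group()))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical 22 bp palindrome built from one half-site and its
    # reverse complement, flanked by arbitrary bases.
    half = "GGTCGGTATGT"
    test_seq = "AAAA" + half + reverse_complement(half) + "CCCC"
    for hit in find_are_half_sites(test_seq):
        print(hit)
```

A palindromic box appears as one hit on the forward strand immediately followed by one hit on the reverse strand, which is a simple way to recognize candidate 22 bp ARE boxes in the output.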
13.3.3 Other Global Regulation Mechanisms
An S. coelicolor antibiotic regulatory gene, absB, was identified from mutants that fail to produce any of S. coelicolor’s known antibiotics but are otherwise phenotypically normal.95 AbsB is a homolog of RNase III, an endonuclease that processes double-stranded RNA substrates to regulate expression of a set of cellular genes. In the absB mutant strain, actVI-ORF1, actI, actII-ORF4, and redD transcript levels were significantly lower than those of the parent strain, which suggests that AbsB can regulate or influence the expression of the antibiotic structural and pathway-specific regulatory genes in S. coelicolor. The absB gene is widely conserved in streptomycetes; however, the reason it is required for antibiotic biosynthesis is still unknown. The role of sigma factors in the regulation of secondary metabolism is still poorly understood, but a single amino acid substitution (G243D) in the principal and essential sigma factor (HrdB) of S. coelicolor A3(2) resulted in pleiotropic loss of antibiotic production while having no effect on growth rate or morphological differentiation.96 It was subsequently determined that the deficiency is a result of reduced transcription of the individual SARP genes actII-ORF4 and redD. BldG, a putative anti-anti-sigma factor, regulates both aerial hyphae formation and antibiotic production in different Streptomyces strains.97 In S. clavuligerus, a bldG null mutant failed to express the CcaR transcriptional regulator, which controls the expression of biosynthetic genes for both secondary metabolites as well as expression of a second regulator of clavulanic acid biosynthesis,
ClaR. Studies from Dawn et al. indicate that phosphorylation of BldG is necessary for morphological differentiation and antibiotic production in S. coelicolor.98 Although most regulatory mechanisms of secondary metabolism are defined at the transcriptional level, at least one elegant example of translational control has been studied. It involves the bldA gene product, a tRNA that recognizes the leucine TTA codon.99–101 This codon is rare in genes of Streptomyces spp., whose genomes are particularly rich in G+C bases. With few exceptions, the TTA codon is absent from genes expressed during vegetative growth and from structural genes of secondary metabolism. Interestingly, the TTA codon is present at the 5′ end of many SARP genes, including actII-ORF4.102 Translation of these SARPs is therefore largely dependent on the availability of the tRNA product of the bldA gene for translation of the rare codon.103 S. coelicolor bldA mutants are deficient in both sporulation and antibiotic production, while primary growth is unaffected. Not surprisingly, the bldA gene shows a growth-phase-dependent pattern of expression, and the expression level is highest after vegetative growth.
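Because bldA-dependent control hinges on the presence of an in-frame TTA codon, candidate target genes can be flagged by a simple codon scan. The sketch below assumes the input is a complete coding sequence starting at the first base of the start codon; the function name and the example sequences are hypothetical and serve only to illustrate the in-frame check.

```python
from typing import List


def find_tta_codons(cds: str) -> List[int]:
    """Return 1-based codon positions of in-frame TTA codons in a coding sequence.

    The sequence is read in frame from its first base, so a TTA that spans
    two codons (out of frame) is deliberately not reported.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [i // 3 + 1 for i in range(0, len(cds), 3) if cds[i:i + 3] == "TTA"]


if __name__ == "__main__":
    # Made-up sequences for illustration only (not real Streptomyces genes).
    in_frame = "ATGTTAGCCTTATAA"   # codons: ATG TTA GCC TTA TAA
    out_of_frame = "ATGCTTACGTAA"  # codons: ATG CTT ACG TAA
    print(find_tta_codons(in_frame))      # in-frame TTA at codons 2 and 4
    print(find_tta_codons(out_of_frame))  # TTA occurs only out of frame
```

Genes returning a non-empty list would be candidates for bldA-dependent translation, although experimental confirmation is still required.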
13.3.4 Pleiotropic or Pathway-Specific Regulators?
In some cases, it is very difficult to distinguish between pleiotropic and pathway-specific regulatory genes. For example, AbsA1-AbsA2 was the first two-component regulator pair identified in S. coelicolor that plays a pleiotropic role in regulating secondary metabolism, influencing the production of all four of the antibiotics known to be made by the strain. Intriguingly, the genome sequence of S. coelicolor reveals that the absA1-absA2 genes lie in close proximity to the cda gene cluster. A more quantitative analysis revealed that mutation in absA1 caused more significant down-regulation of the expression of cda genes than of genes in the act and red clusters.66,104 Therefore, the absA1-absA2 pair is likely a pathway-specific regulator for the cda gene cluster, and its crosstalk with promoter elements present in the red and act gene clusters may not be a direct or even intended consequence. On the other hand, there are also examples of pathway-specific regulators controlling other antibiotic biosynthetic gene clusters in a pleiotropic fashion in S. coelicolor.105 These examples further illustrate the complexity of the regulatory networks associated with secondary metabolism in actinomycetes.
13.4 Nutritional and Physiological Factors
Many nutritional, physiological, and environmental factors affect the onset of secondary metabolism. Certain intracellular small-molecule signaling systems that influence secondary metabolism can be activated upon the transduction of various extracellular signals.
13.4.1 Carbon Metabolism and Cyclic AMP
Little is known about the regulation of carbon utilization and carbohydrate transport, and how these affect secondary metabolism, in streptomycetes. It is known that the preferential utilization of readily metabolizable carbon sources and the synthesis of secondary metabolites are regulated by carbon catabolite repression (CCR).106 Escalante et al. demonstrated that glucose can exert CCR on anthracycline synthesis in S. peucetius var. caesius.107 Although CCR is widespread among bacteria, the actual mechanisms of regulation are very diverse. One of the most important systems for sugar uptake in bacteria is the phosphoenolpyruvate:sugar phosphotransferase system (PTS), which has been described in S. coelicolor, S. griseofuscus, and S. lividans.108 However, deletion mutations in S. coelicolor suggest that the PTS does not play a direct role in CCR.109 In S. coelicolor, glucose kinase has been reported to have a regulatory role in CCR, but it has also been demonstrated that glucose kinase alone cannot be responsible for carbon-source regulation in S. peucetius var. caesius.110,111 Cyclic AMP (cAMP, synthesized by adenylate cyclase) is a common second messenger in bacteria112 and is involved in activating a transcription factor to control gene expression. In enteric bacteria, the
PTS glucose-specific enzyme IIA is able to modulate adenylate cyclase activity.113 Upon disruption of the adenylate cyclase gene in S. coelicolor, spore germination, aerial-mycelium formation, and antibiotic production are severely affected.114 Interestingly, the lack of mycelium formation and secondary metabolism in the mutant is pH dependent, suggesting that cAMP may be necessary for neutralizing the pH of S. coelicolor cultures grown on different carbon sources. However, it is also reported that when different carbon sources are supplied to cultures of certain streptomycetes, cAMP levels do not fluctuate significantly.115 Conclusive evidence for cAMP involvement in actinomycete metabolism is still lacking.
13.4.2 ppGpp and Nitrogen Limitation
ppGpp, the highly phosphorylated guanosine nucleotide, is another common second messenger in bacteria. The accumulation of ppGpp is a first response to nutrient depletion and is associated with slower growth of the bacteria. Generally, ppGpp is produced from guanosine 5′-triphosphate (GTP) by a ribosome-associated protein (RelA homologs) in response to low levels of charged tRNAs (usually under amino acid limitation). Alternatively, ppGpp can be synthesized by other synthetases under carbon or phosphate limitation in a ribosome-independent mode. Subsequently, ppGpp likely binds to the β-subunit of RNA polymerase, represses the transcription of genes encoding ribosomal RNA and tRNA, and activates the transcription of other genes, such as those involved in amino acid synthesis and transport, to overcome nutritional limitation. It has been reported that ppGpp plays an important role during secondary metabolism and morphological differentiation in Streptomyces spp. The ribosome-associated ppGpp synthetase (RelA) is required for Red and Act production under conditions of nitrogen limitation in S. coelicolor A3(2) and for clavulanic acid and cephamycin C production in S. clavuligerus.116–119 Upon induction of ppGpp synthesis, transcription of actII-ORF4 is up-regulated, as demonstrated by Hesketh et al.120 Therefore, ppGpp may participate in regulation by activating the transcription of some pathway-specific activator genes through altering the promoter specificity of RNA polymerase. This was supported by mutation experiments in the rpoB gene (encoding the RNA polymerase β subunit) and the isolation of mutants that confer ppGpp-independent antibiotic production in S. coelicolor A3(2) and S. lividans.121,122 The mutated RNA polymerase may function by mimicking the ppGpp-bound form of wild-type polymerase in activating the onset of secondary metabolism.
13.4.3 Phosphate Control
The biosynthesis of many different types of secondary metabolites is suppressed by an excessive level of inorganic phosphate in the culture medium.53,123 In a few cases, there is evidence that the negative phosphate control is exerted at the transcriptional level.123,124 The two-component PhoR-PhoP system has been shown to be involved in the phosphate control of secondary metabolism in S. lividans and S. coelicolor.53,55 In response to phosphate starvation, the phosphorylated PhoP protein up-regulates the expression of the pho regulon genes by binding to consensus phosphate boxes (PHO boxes) in the promoter regions. No consensus PHO boxes have been located in the upstream region of phosphate-regulated secondary metabolism genes. In S. coelicolor, however, a number of PHO boxes have been identified according to the work of Sola-Landa et al.,125 and it is therefore conceivable that PhoP directly activates transcription of the structural genes. It is known that the PhoR-PhoP system also controls the expression of the alkaline phosphatase gene (phoA). Mutation of the PhoR-PhoP system in S. lividans results in reduced levels of alkaline phosphatase activity and in overproduction of large amounts of Act and Red.55 It is interesting that inactivation of the polyphosphate kinase (PPK) of S. lividans, which produces polyphosphate during conditions of phosphate sufficiency, also results in an increase in Act production and in increased levels of transcripts of pathway-specific regulatory genes for Act, Red, and CDA.126 This indicates that possibly several mechanisms are present in the phosphate control, and
intracellular phosphate itself, which is formed by breakdown of polyphosphate in this case, can cause repression of secondary metabolism.
13.4.4 S-Adenosylmethionine Activation
Owing to the importance of methylation in various biosynthetic processes, S-adenosylmethionine (SAM) has been studied extensively as an essential molecule in diverse living organisms.127 Recently, studies have revealed a novel function of SAM in the activation of secondary metabolism in streptomycetes. The addition of SAM to the culture medium, or the introduction of a high-copy-number plasmid containing the SAM synthetase gene into cells, induces Act, Red, and CDA biosynthesis in wild-type S. coelicolor A3(2);128 the accumulation of intracellular SAM activates Act production in S. lividans TK23, in which Act biosynthesis is normally silent;129 moreover, the addition of SAM causes hyperproduction of streptomycin in S. griseus, bicozamycin in S. griseoflavus,130 pristinamycin in S. pristinaespiralis, granaticin in S. violaceoruber, oleandomycin in S. antibioticus, and avermectin B1a in S. avermitilis.131 Therefore, up-regulation of secondary metabolism in streptomycetes by SAM is a widespread phenomenon. However, the mechanism by which SAM regulates antibiotic production is still unknown. Some of the up-regulated secondary metabolites require SAM-dependent methylation in their biosynthetic pathways, such as Red, oleandomycin, and avermectin B1a; in contrast, some do not, such as Act, pristinamycin, and granaticin. The finding that SAM increases the production of various antibiotics in different strains indicates that SAM may regulate secondary metabolism by means of a common mechanism. Expression of certain pathway-specific regulatory genes, such as actII-ORF4 and gra-ORF9, is stimulated upon the addition of SAM, suggesting that at least in some cases, SAM activation of secondary metabolism is mediated by pathway-specific regulators.
13.4.5 Light Induction
The carotenoid biosynthesis gene clusters (crt) have been identified in several Streptomyces spp., including S. coelicolor A3(2), S. griseus, and S. avermitilis,11,12,132 but the regulatory mechanisms for carotenoid production in these bacteria are mostly unknown. Recently, the mechanism for light-induced carotenoid biosynthesis in S. coelicolor A3(2) has been identified,133 which involves a photo-inducible extracytoplasmic function (ECF) sigma factor, LitS. The crt biosynthesis gene cluster is transcribed from two convergent promoters (PcrtE and PcrtY), which are specifically recognized by σLitS (Figure 13.3). The light-inducible transcriptional activity of σLitS is modulated by LitR, a transcriptional regulatory protein belonging to the MerR family. It is suggested that the carboxy-terminal domain of LitR contains a possible binding site for vitamin B12, which may serve as a capturing apparatus for the illumination signal. Upon receiving the illumination signal, the conformation of LitR alters, leading to changes in the DNA-binding affinity of LitR. Possibly, LitR negatively regulates litS as well as litR in the absence of light, and its conformational change upon light induction may cause up-regulation of litS transcription. The signaling cascade leads to activation of carotenoid biosynthesis in S. coelicolor A3(2).134
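The proposed light-induction cascade can be summarized as a chain of logical steps. The sketch below is a deliberately simplified Boolean illustration of the model described above, not an established quantitative description; it assumes that LitR repression of litS is fully relieved by light, and the function and key names are hypothetical.

```python
def light_induction_state(light: bool) -> dict:
    """Boolean sketch of the proposed LitR/LitS cascade.

    Assumptions (taken from the tentative model in the text):
      - in the dark, LitR represses litS transcription;
      - light changes the conformation of LitR and relieves that repression;
      - the crt promoters (PcrtE, PcrtY) are transcribed only when the
        sigma factor encoded by litS is available.
    """
    litr_represses_lits = not light
    lits_transcribed = not litr_represses_lits
    sigma_lits_available = lits_transcribed
    crt_transcribed = sigma_lits_available
    return {
        "LitR repressing litS": litr_represses_lits,
        "litS transcribed": lits_transcribed,
        "crt operons transcribed": crt_transcribed,
    }


if __name__ == "__main__":
    for light in (False, True):
        print("light" if light else "dark", light_induction_state(light))
```

Writing the cascade this way makes explicit that, in this model, carotenoid gene expression depends on light only through derepression of litS.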
13.5 Other Functions of Secondary Metabolites
Secondary metabolites have traditionally been characterized as compounds that are produced late in the cell cycle, primarily during stationary phase, that are not essential to the survival of the organism, and that are generally believed to be a type of defense mechanism against surrounding competitors. The appearance of these compounds late in the cell cycle, and the antibiotic properties they typically display, have previously excluded these metabolites from being considered an integral part of cell regulation and physiology. However, Price-Whelan et al. have shown that pyocyanin (PYO), which belongs to a class of compounds known as phenazines and is a type of “secondary” metabolite in Pseudomonas aeruginosa, does interact with other cellular components.135 That phenazines contribute to genetic
Figure 13.3 Mechanism of light induction for carotenoid biosynthesis in S. coelicolor A3(2). [Diagram labels: blue light; putative LitR (-); litR; litS; σLitS (+); crtEIBV; crtYTU; carotenoids.]
regulation through the action of pyocyanin and impact the status of primary metabolites, in addition to their displayed antibiotic properties, challenges the current classifications of secondary metabolites. Pyocyanin contributes to gene regulation through its involvement in a QS network, a motif employed by bacteria to allow for the regulation of genetic expression and cellular physiology in response to environmental and density-dependent signals. QS is utilized throughout the cell cycle of P. aeruginosa: the most prevalent signal molecules in the exponential phase are acylated homoserine lactones (HSLs), followed by a quinolone derivative, the Pseudomonas quinolone signal (PQS), during the late exponential phase, and by pyocyanin, whose role was recently discovered, in the stationary phase (Figure 13.4). It has been estimated that quorum sensing regulates anywhere from 6% to greater than 10% of the P. aeruginosa genome.136,137 This noticeable impact of QS throughout the cell cycle, and the apparent participation of phenazine derivatives, necessitates the reevaluation of the current definition of “secondary” metabolites. Early in cell growth, QS in P. aeruginosa is characterized by the interaction of two acylated homoserine lactones, N-(3-oxododecanoyl)-L-homoserine lactone (3-oxo-C12 HSL) and N-butanoyl-L-homoserine lactone (C4 HSL). The synthesis of these molecules is catalyzed by the products of the lasI and rhlI genes, respectively, and upon reaching a threshold concentration these inducers interact with the LasR and RhlR regulatory proteins, respectively, in order to enhance or curtail gene expression. Together LasR and RhlR comprise a dynamic pair affecting the expression of the virulence factors (characteristic of
Figure 13.4 Molecular structures of the discussed signaling molecules. (a) N-(3-oxododecanoyl)-L-homoserine lactone (3-oxo-C12-HSL), exponential phase. (b) N-butanoyl-L-homoserine lactone (C4-HSL), exponential phase. (c) Pseudomonas quinolone signal (PQS), late-exponential to early stationary phase. (d) Pyocyanin (PYO), early stationary to stationary phase.
the pathogenicity of Pseudomonas), secondary metabolites including phenazines, and subsequent QS signal molecules. Adding to the complexity of the QS system, Latifi et al. observed that the las system exerts transcriptional regulation over the rhl system.138 Furthermore, it was found that LasR positively, and RhlR negatively, controls another regulatory protein, PqsR (MvfR).139 PqsR controls transcription of the pqsABCDE and phnAB operons, which encode proteins involved in the synthesis of quinolones, including 2-heptyl-3-hydroxy-4-quinolone (PQS).140 The counteracting effects of LasR-C12 HSL and RhlR-C4 HSL on the regulation of PqsR appear to be dependent on the HSL ratio, and it was seen that PQS interacts with PqsR and can act as a coinducer, exerting a measure of control over its own synthesis.141 In addition to regulating its own synthesis, PQS was found to have a larger signaling role, extending the QS cascade into the late exponential phase.143 In addition to the PqsR-regulated pqsA-E genes, the conversion of the PQS precursor, anthranilate, to PQS requires the presence of PqsH. PqsH catalyzes the final step in PQS formation, the conversion of 4-hydroxy-2-heptylquinolone (HHQ) to PQS, and was found to be regulated by the las system.144 Gallagher et al. also showed that PqsE was not needed for PQS production but was required for PYO production. PqsE was thus needed to elicit the cellular response to PQS, but the exact mechanism remains unknown. Deziel et al. saw that HHQ molecules can be released from the cell, taken up by neighboring cells, and converted into PQS.145 This finding, as well as Mashburn and coworkers’ discovery that PQS and other quinolones can undergo intercellular transport via membrane vesicles (MVs), expanded the scope of the QS network.146 The vesicles are vital for the functioning of PQS, and the removal of the MVs was shown to inhibit expression of PQS-regulated genes, including those encoding pyocyanin formation.146 The continuation of the QS system in P. aeruginosa is dependent on the activation of PqsR and the functionality of PqsH,
as membrane vesicles serve to (i) transport PQS and quinolone analogs to neighboring cells in order to carry out QS and (ii) provide a vehicle for the delivery of quinolone derivatives, which display antibiotic properties, to neighboring competitor cells. PQS’s direct role in quorum-sensing revolves around its activation of the phzABCDEFG operon,147 amongst others, encoding the phenazine biosynthetic enzymes, and its propagation of the signaling message between the exponential and stationary phases. The signaling cascade is summarized in Figure 13.5.148 The production of phenazines thus depends on PQS being present at the onset of stationary growth,142 which in turn makes phenazines such as PYO dependent on the HSL ratios, as the effects early in the QS network propagate through to the stationary phase. P. aeruginosa has two copies of the seven-gene phz operon that are 98% identical in their coding regions but differ in their promoter regions.149 The two copies of the phenazine biosynthetic genes afford the bacterium more control over expression levels. The PQS-controlled phz1 operon, along with PqsE, is necessary for the synthesis of PYO.
Previously it was thought that because phenazines display antibiotic properties and are produced in the stationary phase, they are “secondary” metabolites whose functions are not an integral part of cellular development. However, the finding by Price-Whelan and coworkers that phenazines are beneficial to the producing cells even in the absence of competitor species calls this view of stationary-phase metabolites into question. Furthermore, Price-Whelan et al. discovered that PYO can also act as an intracellular signal molecule and as a transcriptional activator in the omnipresent QS network.135 The Price-Whelan group found that PYO upregulated the expression of two targets: a multidrug efflux pump, MexGHI-OpmD, and a mono-oxygenase, PA2274, both of which have previously been
[Figure 13.5 diagram; labels shown: virulence factors, C12-HSL, C4-HSL, lasR, rhlR, pqsR, pqsH, phnAB, pqsABCDE, pqsE, PQS, phzABCDEFG, phzM, phzS, other phenazines, PYO.]
Figure 13.5 Quorum-sensing hierarchy in Pseudomonas aeruginosa. The dotted line indicates PQS’s ability to act as a coinducer for its own production. The reader is directed to Diggle for a thorough description of the QS network leading up to phenazine production. (From Diggle, S. P., Winzer, K., Chhabra, S. R., Worrall, K. E., Camara, M., and Williams, P., Mol. Microbiol., 2003, 50(1), 29–43. With permission.)
proven to be QS dependent.135,150 The participation of PYO in QS explains the observation by Whiteley et al. that 3-oxo-C12 and C4 HSLs can up-regulate the expression of mexGHI-opmD (called qsc133 by Whiteley), but only after a lag time corresponding to the onset of stationary phase.151 The lag is consistent with the findings of Price-Whelan et al.: if PYO stimulates the MexGHI-OpmD efflux pump, then there will be a delay between the introduction of the HSLs and the effect on the pump, because the density-dependent QS pathway takes time to produce the necessary levels of PYO from its precursors. In addition, mutations in mexI, encoding the efflux protein, and opmD, encoding a membrane porin, resulted in decreased production of the main virulence component elastase, as well as PYO and other molecules known to be involved in QS.152 Aendekerk and coworkers theorized that the reduction in PYO could be due to a reduction in the amount of HSLs released, subsequently leading to an imbalance in the HSL ratio within the cell. This imbalance is believed to result in the accumulation of a toxic precursor, thought to be anthranilate, that a functional MexGHI-OpmD pump could theoretically excrete.153 The mechanism for the loss of QS-regulated gene expression in the pump mutants has not been fully elucidated, owing to the presence of other efflux pumps. It is believed that the MexAB-OprM pump transports 3-oxo-C12-HSL and MexEF-OprN exports PQS/quinolones, while C4-HSL can diffuse freely across the membrane.154,155 Regardless of the mechanism and the efflux system specificities, mutations in the system disturb signal molecule levels and concomitantly reduce PYO production.
The up-regulation of the mexGHI-opmD genes, as described by Price-Whelan et al., would help the organism to ensure the correct ratio of signal molecules and preclude the buildup of toxic precursors. The role of PYO in maintaining QS-signal homeostasis, through its impact on the efflux system, depicts a molecule that is not only important to the virulence of the bacteria but also involved in the regulation of common metabolites.
The regulation of PA2274, a mono-oxygenase, has not been extensively studied. Palma et al. describe the activation of the PA2274 gene by SoxR.150 Unlike the other regulatory proteins in quorum-sensing, the SoxR regulatory protein does not appear to have a corresponding small inducer molecule. Instead, it appears that SoxR might act as a regulator in response to O2•– (superoxide anion radical)-dependent oxidative stress. The same inducer-less activation mechanism was found to operate between SoxR and the mexGHI-opmD system, which is the only Mex system without a specific small molecule that interacts with its activator.150 Hassan and Fridovich found that the reduction of PYO can produce superoxide radicals.156 Thus, it appears that PYO could potentially provide the superoxide radicals during stationary phase that are required for expression of PA2274 in response to signals in the quorum-sensing network. Furthermore, the oxidizing properties of PYO, allowing for the activation of SoxR, would provide a mechanism for the observed up-regulation of the MexGHI-OpmD efflux system. Palma and coworkers also speculated that the mono-oxygenase could contribute to the viability of the bacteria by providing a detoxification mechanism to eliminate harmful compounds via their oxidation.150 It is also possible that the oxygenase could be involved in the reoxidation of the reduced PYO molecule for subsequent reuse (Figure 13.6).
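The lag described above, in which the HSLs up-regulate mexGHI-opmD only after PYO has accumulated, can be illustrated with a toy model. The following Python sketch treats the pathway as a linear three-stage cascade (PQS, PYO, pump transcript) driven by a constant HSL input, with first-order production and decay at each stage. The stage names follow the text, but the linear structure is a deliberate simplification, and the rate constants, time step, and function names are arbitrary assumptions chosen for illustration rather than measured values.

```python
# Toy cascade: constant HSL input -> PQS -> PYO -> mexGHI-opmD transcript.
# All parameters are hypothetical; this is a qualitative sketch only.

STAGES = ("PQS", "PYO", "mexGHI-opmD")


def simulate_cascade(k_make=1.0, k_decay=1.0, dt=0.01, t_end=20.0):
    """Euler-integrate dx_i/dt = k_make * x_(i-1) - k_decay * x_i.

    Stage 0 is driven by a constant input of 1.0 (the HSL signal).
    Returns a list of (time, levels) tuples, one per time step.
    """
    levels = [0.0] * len(STAGES)
    history = []
    for step in range(round(t_end / dt) + 1):
        history.append((step * dt, tuple(levels)))
        upstream = [1.0] + levels[:-1]  # each stage is fed by the one before it
        levels = [x + dt * (k_make * u - k_decay * x)
                  for x, u in zip(levels, upstream)]
    return history


def time_to_half_max(history, stage):
    """First recorded time at which a stage reaches half of its final level."""
    final = history[-1][1][stage]
    for t, levels in history:
        if levels[stage] >= 0.5 * final:
            return t
    return None


if __name__ == "__main__":
    run = simulate_cascade()
    for index, name in enumerate(STAGES):
        print(name, "half-maximal at t =", time_to_half_max(run, index))
```

In this sketch each additional stage postpones the half-maximal response, which is the qualitative behavior the lag argument relies on. It says nothing about the actual time scales in P. aeruginosa.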
The finding that PYO, a phenazine derivative and metabolite produced during stationary growth, is involved in the regulation of several genes necessitates a reconsideration of the classical term secondary metabolite. It now appears that these “secondary” metabolites can be integral members of a quorum-sensing network that controls gene expression, and that they interact with primary metabolites and signaling molecules in an ever-expanding regulatory hierarchy that has evolved to ensure the persistent survival of the bacteria in response to environmental and density-dependent conditions. In addition to the long-known role of phenazines as antibiotics, their involvement in gene regulation and their impact on primary metabolic and signaling molecules confer upon these “secondary” metabolites a more important function than previously allotted to them. The necessity of cell survival, especially during times of nutrient limitation, results in these late-phase metabolites carrying out a vital role that contributes to the viability of the bacteria.
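The regulatory relationships described in this section can be collected into a small signed graph, which makes it easier to trace how an early QS regulator reaches a late target. The sketch below is a hypothetical encoding in Python. The edge list is transcribed from the text above, not from any database; the positive sign on PqsR to pqsABCDE is an assumption, because the text says only that PqsR "controls" transcription; biosynthetic steps are treated as positive edges; and the requirement for PqsE in PYO synthesis is omitted for brevity. All function names are illustrative.

```python
from collections import defaultdict

# (source, target, sign): +1 = activates or produces, -1 = represses.
EDGES = [
    ("LasR", "PqsR", +1),         # LasR positively controls PqsR
    ("RhlR", "PqsR", -1),         # RhlR negatively controls PqsR
    ("PqsR", "pqsABCDE", +1),     # ASSUMED sign; text says "controls"
    ("pqsABCDE", "PQS", +1),      # quinolone biosynthesis
    ("PQS", "PqsR", +1),          # PQS acts as a coinducer (feedback loop)
    ("PQS", "phzABCDEFG", +1),    # PQS activates the phz operon
    ("phzABCDEFG", "PYO", +1),    # phenazine biosynthesis
    ("PYO", "mexGHI-opmD", +1),   # PYO up-regulates the efflux pump
    ("PYO", "PA2274", +1),        # PYO up-regulates the mono-oxygenase
]


def build_graph(edges):
    """Adjacency list mapping each source to its (target, sign) pairs."""
    graph = defaultdict(list)
    for source, target, sign in edges:
        graph[source].append((target, sign))
    return graph


def simple_paths(graph, start, goal, _path=None, _sign=1):
    """Yield (path, net_sign) for every cycle-free path from start to goal."""
    path = (_path or []) + [start]
    if start == goal:
        yield path, _sign
        return
    for target, sign in graph.get(start, []):
        if target not in path:  # skip nodes already visited (feedback loops)
            yield from simple_paths(graph, target, goal, path, _sign * sign)


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for regulator in ("LasR", "RhlR"):
        for path, sign in simple_paths(graph, regulator, "mexGHI-opmD"):
            print("+" if sign > 0 else "-", " -> ".join(path))
```

Under these assumptions, LasR reaches mexGHI-opmD through a single six-step path with a net positive sign, while the corresponding path from RhlR is net negative, mirroring the counteracting effects of the two HSL systems described in the text.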
[Figure 13.6 diagram; labels shown: PYOoxid, PYOred, O2•–, soxR, mexGHI-opmD, PA2274.]
Figure 13.6 Proposed involvement of PYO in the regulation of mexGHI-opmD and PA2274. Possible interactions based on the finding by Price-Whelan et al. that PYO up-regulates these genes, the discovery by Hassan and Fridovich that PYO can produce superoxide radicals, and the description of the SoxR protein by Palma et al. (From Price-Whelan, A., Dietrich, L. E., and Newman, D. K., Nat. Chem. Biol., 2006, 2 (2), 71–78; Palma, M., Zurita, J., Ferreras, J. A., Worgall, S., Larone, D. H., Shi, L., Campagne, F., and Quadri, L. E., Infect. Immun., 2005, 73 (5), 2958–2966; Hassan, H. M. and Fridovich, I., J. Bacteriol., 1980, 141 (1), 156–163. With permission.)
13.6 Conclusions
Although many regulatory mechanisms and pathways have been elucidated in bacteria, our knowledge of the global regulatory network of secondary metabolism is still very limited. Even with this limited knowledge, however, scientists and engineers have already been able to improve yields of secondary metabolites dramatically during fermentation. Pathway-specific regulatory genes (dnrI and actII-ORF4) have been introduced on high-copy plasmids into recombinant hosts, in which high yields of doxorubicin and Act were achieved.5,6 Strong heterologous promoters have been engineered to increase transcription of pathway-specific activator genes, leading ultimately to elevated levels of the product of interest.157 Therefore, the value in understanding and exploiting the regulation of secondary metabolism should not be underestimated.
References
1. Demain, A. L. Microbial secondary metabolism: a new theoretical frontier for academia, a new opportunity for industry. Ciba Found. Symp., 171, 3–16; discussion 16–23, 1992.
2. Williams, D. H., Stone, M. J., Hauck, P. R., and Rahman, S. K. Why are secondary metabolites (natural products) biosynthesized? J. Nat. Prod., 52 (6), 1189–208, 1989.
3. Challis, G. L. and Hopwood, D. A. Synergy and contingency as driving forces for the evolution of multiple secondary metabolite production by Streptomyces species. Proc. Natl. Acad. Sci. USA, 100 Suppl 2, 14555–61, 2003.
4. Strohl, W. R., ed. Biotechnology of Antibiotics, 2nd ed. Marcel Dekker, Inc., New York, 1997.
5. Park, H. S., Kang, S. H., Park, H. J., and Kim, E. S. Doxorubicin productivity improvement by the recombinant Streptomyces peucetius with high-copy regulatory genes cultured in the optimized media composition. J. Microbiol. Biotechnol., 15 (1), 66–71, 2005.
6. Bruheim, P., Sletta, H., Bibb, M. J., White, J., and Levine, D. W. High-yield actinorhodin production in fed-batch culture by a Streptomyces lividans strain overexpressing the pathway-specific activator gene actII-ORF4. J. Ind. Microbiol. Biotechnol., 28 (2), 103–11, 2002.
7. Yu, J. H. and Keller, N. Regulation of secondary metabolism in filamentous fungi. Annu. Rev. Phytopathol., 43, 437–58, 2005.
8. Champness, W. C. and Chater, K. F. Regulation and integration of antibiotic production and morphological differentiation in Streptomyces spp. In P. Piggot, C. Moran, and P. Youngman, eds. Regulation of Bacterial Differentiation. American Society for Microbiology, Washington, DC, 61–94, 1994.
9. Chater, K. F. and Bibb, M. J. Regulation of bacterial antibiotic production. In H. Kleinkauf and H. von Dohren, eds. Biotechnology. Vol. 6. Products of Secondary Metabolism. VCH, Weinheim, Germany, 57–105, 1997.
10. Bibb, M. J. Regulation of secondary metabolism in streptomycetes. Curr. Opin. Microbiol., 8 (2), 208–15, 2005.
11. Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M., and Omura, S. Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis. Nat. Biotechnol., 21 (5), 526–31, 2003.
12. Bentley, S. D., Chater, K. F., Cerdeno-Tarraga, A. M., Challis, G. L., Thomson, N. R., James, K. D., Harris, D. E., Quail, M. A., Kieser, H., Harper, D., Bateman, A., Brown, S., Chandra, G., Chen, C. W., Collins, M., Cronin, A., Fraser, A., Goble, A., Hidalgo, J., Hornsby, T., Howarth, S., Huang, C. H., Kieser, T., Larke, L., Murphy, L., Oliver, K., O’Neil, S., Rabbinowitsch, E., Rajandream, M. A., Rutherford, K., Rutter, S., Seeger, K., Saunders, D., Sharp, S., Squares, R., Squares, S., Taylor, K., Warren, T., Wietzorrek, A., Woodward, J., Barrell, B. G., Parkhill, J., and Hopwood, D. A. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature, 417 (6885), 141–7, 2002.
13. Novotna, J., Vohradsky, J., Berndt, P., Gramajo, H., Langen, H., Li, X. M., Minas, W., Orsaria, L., Roeder, D., and Thompson, C. J. Proteomic studies of diauxic lag in the differentiating prokaryote Streptomyces coelicolor reveal a regulatory network of stress-induced proteins and central metabolic enzymes. Mol. Microbiol., 48 (5), 1289–303, 2003.
14. Borodina, I., Krabben, P., and Nielsen, J. Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res., 15 (6), 820–29, 2005.
15. Bishop, A., Fielding, S., Dyson, P., and Herron, P. Systematic insertional mutagenesis of a streptomycete genome: a link between osmoadaptation and antibiotic production. Genome Res., 14 (5), 893–900, 2004.
16. Gehring, A. M., Nodwell, J. R., Beverley, S. M., and Losick, R. Genomewide insertional mutagenesis in Streptomyces coelicolor reveals additional genes involved in morphological differentiation. Proc. Natl. Acad. Sci. USA, 97 (17), 9642–47, 2000.
17. Huang, J., Lih, C. J., Pan, K. H., and Cohen, S. N. Global analysis of growth phase responsive gene expression and regulation of antibiotic biosynthetic pathways in Streptomyces coelicolor using DNA microarrays. Genes Dev., 15 (23), 3183–92, 2001.
18. Volokhan, O., Sletta, H., Sekurova, O. N., Ellingsen, T. E., and Zotchev, S. B. An unexpected role for the putative 4’-phosphopantetheinyl transferase-encoding gene nysF in the regulation of nystatin biosynthesis in Streptomyces noursei ATCC 11455. FEMS Microbiol. Lett., 249 (1), 57–64, 2005.
19. Wietzorrek, A. and Bibb, M. A novel family of proteins that regulates antibiotic production in streptomycetes appears to contain an OmpR-like DNA-binding fold. Mol. Microbiol., 25 (6), 1181–84, 1997.
20. Wilson, D. J., Xue, Y., Reynolds, K. A., and Sherman, D. H. Characterization and analysis of the PikD regulatory factor in the pikromycin biosynthetic pathway of Streptomyces venezuelae. J. Bacteriol., 183 (11), 3468–75, 2001.
21. Hopwood, D. A., Chater, K. F., and Bibb, M. J. Genetics of antibiotic production in Streptomyces coelicolor A3(2), a model streptomycete. Biotechnology, 28, 65–102, 1995.
22. Sheldon, P. J., Busarow, S. B., and Hutchinson, C. R. Mapping the DNA-binding domain and target sequences of the Streptomyces peucetius daunorubicin biosynthesis regulatory protein, DnrI. Mol. Microbiol., 44 (2), 449–60, 2002.
23. Floriano, B. and Bibb, M. afsR is a pleiotropic but conditionally required regulatory gene for antibiotic production in Streptomyces coelicolor A3(2). Mol. Microbiol., 21 (2), 385–96, 1996.
24. Horinouchi, S. AfsR as an integrator of signals that are sensed by multiple serine/threonine kinases in Streptomyces coelicolor A3(2). J. Ind. Microbiol. Biotechnol., 30 (8), 462–67, 2003.
25. Bibb, M. 1995 Colworth Prize Lecture. The regulation of antibiotic production in Streptomyces coelicolor A3(2). Microbiology, 142 (Pt 6), 1335–44, 1996.
26. Arias, P., Fernandez-Moreno, M. A., and Malpartida, F. Characterization of the pathway-specific positive transcriptional regulator for actinorhodin biosynthesis in Streptomyces coelicolor A3(2) as a DNA-binding protein. J. Bacteriol., 181 (22), 6958–68, 1999.
27. Retzlaff, L. and Distler, J. The regulator of streptomycin gene expression, StrR, of Streptomyces griseus is a DNA binding activator protein with multiple recognition sites. Mol. Microbiol., 18 (1), 151–62, 1995.
28. Tang, L., Grimm, A., Zhang, Y. X., and Hutchinson, C. R. Purification and characterization of the DNA-binding protein DnrI, a transcriptional factor of daunorubicin biosynthesis in Streptomyces peucetius. Mol. Microbiol., 22 (5), 801–13, 1996.
29. Takano, E., Gramajo, H. C., Strauch, E., Andres, N., White, J., and Bibb, M. J. Transcriptional regulation of the redD transcriptional activator gene accounts for growth-phase-dependent production of the antibiotic undecylprodigiosin in Streptomyces coelicolor A3(2). Mol. Microbiol., 6 (19), 2797–804, 1992.
30. Narva, K. E. and Feitelson, J. S. Nucleotide sequence and transcriptional analysis of the redD locus of Streptomyces coelicolor A3(2). J. Bacteriol., 172 (1), 326–33, 1990.
31. Ylihonko, K., Tuikkanen, J., Jussila, S., Cong, L., and Mantsala, P. A gene cluster involved in nogalamycin biosynthesis from Streptomyces nogalater: sequence analysis and complementation of early-block mutations in the anthracycline pathway. Mol. Gen. Genet., 251 (2), 113–20, 1996.
32. Perez-Llarena, F. J., Liras, P., Rodriguez-Garcia, A., and Martin, J. F. A regulatory gene (ccaR) required for cephamycin and clavulanic acid production in Streptomyces clavuligerus: amplification results in overproduction of both beta-lactam compounds. J. Bacteriol., 179 (6), 2053–59, 1997.
33. Santamarta, I., Rodriguez-Garcia, A., Perez-Redondo, R., Martin, J. F., and Liras, P. CcaR is an autoregulatory protein that binds to the ccaR and cefD-cmcI promoters of the cephamycin C-clavulanic acid cluster in Streptomyces clavuligerus. J. Bacteriol., 184 (11), 3106–13, 2002.
34. Lombo, F., Brana, A. F., Mendez, C., and Salas, J. A. The mithramycin gene cluster of Streptomyces argillaceus contains a positive regulatory gene and two repeated DNA sequences that are located at both ends of the cluster. J. Bacteriol., 181 (2), 642–7, 1999.
35. Aigle, B., Pang, X., Decaris, B., and Leblond, P. Involvement of AlpV, a new member of the Streptomyces antibiotic regulatory protein family, in regulation of the duplicated type II polyketide synthase alp gene cluster in Streptomyces ambofaciens. J. Bacteriol., 187 (7), 2491–500, 2005.
36. Eustaquio, A. S., Li, S. M., and Heide, L. NovG, a DNA-binding protein acting as a positive regulator of novobiocin biosynthesis. Microbiology, 151 (Pt 6), 1949–61, 2005.
37. Rebets, Y., Ostash, B., Luzhetskyy, A., Hoffmeister, D., Brana, A., Mendez, C., Salas, J. A., Bechthold, A., and Fedorenko, V. Production of landomycins in Streptomyces globisporus 1912 and S. cyanogenus S136 is regulated by genes encoding putative transcriptional activators. FEMS Microbiol. Lett., 222 (1), 149–53, 2003.
38. Rebets, Y., Ostash, B., Luzhetskyy, A., Kushnir, S., Fukuhara, M., Bechthold, A., Nashimoto, M., Nakamura, T., and Fedorenko, V. DNA-binding activity of LndI protein and temporal expression of the gene that upregulates landomycin E production in Streptomyces globisporus 1912. Microbiology, 151 (Pt 1), 281–90, 2005.
39. Sekurova, O. N., Brautaset, T., Sletta, H., Borgos, S. E., Jakobsen, M. O., Ellingsen, T. E., Strom, A. R., Valla, S., and Zotchev, S. B. In vivo analysis of the regulatory genes in the nystatin biosynthetic gene cluster of Streptomyces noursei ATCC 11455 reveals their differential control over antibiotic biosynthesis. J. Bacteriol., 186 (5), 1345–54, 2004.
40. Aparicio, J. F., Molnar, I., Schwecke, T., Konig, A., Haydock, S. F., Khaw, L. E., Staunton, J., and Leadlay, P. F. Organization of the biosynthetic gene cluster for rapamycin in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular polyketide synthase. Gene, 169 (1), 9–16, 1996.
41. Anton, N., Mendes, M. V., Martin, J. F., and Aparicio, J. F. Identification of PimR as a positive regulator of pimaricin biosynthesis in Streptomyces natalensis. J. Bacteriol., 186 (9), 2567–75, 2004.
42. Stock, A. M., Robinson, V. L., and Goudreau, P. N. Two-component signal transduction. Annu. Rev. Biochem., 69, 183–215, 2000.
43. Brian, P., Riggle, P. J., Santos, R. A., and Champness, W. C. Global negative regulation of Streptomyces coelicolor antibiotic synthesis mediated by an absA-encoded putative signal transduction system. J. Bacteriol., 178 (11), 3221–31, 1996.
44. Chang, H. M., Chen, M. Y., Shieh, Y. T., Bibb, M. J., and Chen, C. W. The cutRS signal transduction system of Streptomyces lividans represses the biosynthesis of the polyketide antibiotic actinorhodin. Mol. Microbiol., 21 (5), 1075–85, 1996.
45. Novakova, R., Homerova, D., Feckova, L., and Kormanec, J. Characterization of a regulatory gene essential for the production of the angucycline-like polyketide antibiotic auricin in Streptomyces aureofaciens CCM 3239. Microbiology, 151 (Pt 8), 2693–706, 2005.
46. Guthrie, E. P., Flaxman, C. S., White, J., Hodgson, D. A., Bibb, M. J., and Chater, K. F. A response-regulator-like activator of antibiotic synthesis from Streptomyces coelicolor A3(2) with an amino-terminal domain that lacks a phosphorylation pocket. Microbiology, 144 (Pt 3), 727–38, 1998.
47. Furuya, K. and Hutchinson, C. R. The DnrN protein of Streptomyces peucetius, a pseudo-response regulator, is a DNA-binding protein involved in the regulation of daunorubicin biosynthesis. J. Bacteriol., 178 (21), 6310–8, 1996.
48. Uguru, G. C., Stephens, K. E., Stead, J. A., Towle, J. E., Baumberg, S., and McDowall, K. J. Transcriptional activation of the pathway-specific regulator of the actinorhodin biosynthetic genes in Streptomyces coelicolor. Mol. Microbiol., 58 (1), 131–50, 2005.
49. Elliot, M. A. and Talbot, N. J. Building filaments in the air: aerial morphogenesis in bacteria and fungi. Curr. Opin. Microbiol., 7 (6), 594–601, 2004.
50. Cozzone, A. J. Protein phosphorylation in prokaryotes. Annu. Rev. Microbiol., 42, 97–125, 1988.
51. Anderson, T. B., Brian, P., and Champness, W. C. Genetic and transcriptional analysis of absA, an antibiotic gene cluster-linked two-component system that regulates multiple antibiotics in Streptomyces coelicolor. Mol. Microbiol., 39 (3), 553–66, 2001.
52. Aceti, D. J. and Champness, W. C. Transcriptional regulation of Streptomyces coelicolor pathway-specific antibiotic regulators by the absA and absB loci. J. Bacteriol., 180 (12), 3100–6, 1998.
53. Martin, J. F. Phosphate control of the biosynthesis of antibiotics and other secondary metabolites is mediated by the PhoR-PhoP system: an unfinished story. J. Bacteriol., 186 (16), 5197–201, 2004.
54. Ishizuka, H., Horinouchi, S., Kieser, H. M., Hopwood, D. A., and Beppu, T. A putative two-component regulatory system involved in secondary metabolism in Streptomyces spp. J. Bacteriol., 174 (23), 7585–94, 1992.
55. Sola-Landa, A., Moura, R. S., and Martin, J. F. The two-component PhoR-PhoP system controls both primary metabolism and secondary metabolite biosynthesis in Streptomyces lividans. Proc. Natl. Acad. Sci. USA, 100 (10), 6133–38, 2003.
56. Kim, D. and Forst, S. Genomic analysis of the histidine kinase family in bacteria and archaea. Microbiology, 147 (Pt 5), 1197–212, 2001.
57. Hutchings, M. I., Hoskisson, P. A., Chandra, G., and Buttner, M. J. Sensing and responding to diverse extracellular signals? Analysis of the sensor kinases and response regulators of Streptomyces coelicolor A3(2). Microbiology, 150 (Pt 9), 2795–806, 2004.
58. Kennelly, P. J. Protein kinases and protein phosphatases in prokaryotes: a genomic perspective. FEMS Microbiol. Lett., 206 (1), 1–8, 2002.
59. Petrickova, K. and Petricek, M. Eukaryotic-type protein kinases in Streptomyces coelicolor: variations on a common theme. Microbiology, 149 (Pt 7), 1609–21, 2003.
60. Umeyama, T. and Horinouchi, S. Autophosphorylation of a bacterial serine/threonine kinase, AfsK, is inhibited by KbpA, an AfsK-binding protein. J. Bacteriol., 183 (19), 5506–12, 2001.
61. Lee, P. C., Umeyama, T., and Horinouchi, S. afsS is a target of AfsR, a transcriptional factor with ATPase activity that globally controls secondary metabolism in Streptomyces coelicolor A3(2). Mol. Microbiol., 43 (6), 1413–30, 2002.
62. Umeyama, T., Lee, P. C., and Horinouchi, S. Protein serine/threonine kinases in signal transduction for secondary metabolism and morphogenesis in Streptomyces. Appl. Microbiol. Biotechnol., 59 (4–5), 419–25, 2002.
63. Shi, L. and Zhang, W. Comparative analysis of eukaryotic-type protein phosphatases in two streptomycete genomes. Microbiology, 150 (Pt 7), 2247–56, 2004.
64. Li, Y. and Strohl, W. R. Cloning, purification, and properties of a phosphotyrosine protein phosphatase from Streptomyces coelicolor A3(2). J. Bacteriol., 178 (1), 136–42, 1996.
65. Umeyama, T., Tanabe, Y., Aigle, B. D., and Horinouchi, S. Expression of the Streptomyces coelicolor A3(2) ptpA gene encoding a phosphotyrosine protein phosphatase leads to overproduction of secondary metabolites in S. lividans. FEMS Microbiol. Lett., 144 (2–3), 177–84, 1996.
66. Sheeler, N. L., MacMillan, S. V., and Nodwell, J. R. Biochemical activities of the absA two-component system of Streptomyces coelicolor. J. Bacteriol., 187 (2), 687–96, 2005.
67. Yamada, Y. Autoregulatory factors and regulation of antibiotic production in Streptomyces. In England, R. R., Hobbs, G., Bainton, N. J., and Roberts, D. McL., eds. Microbial Signalling and Communication. Cambridge University Press, Cambridge, U.K., 177–196, 1999.
68. Choi, S. U., Lee, C. K., Hwang, Y. I., Kinoshita, H., and Nihira, T. Gamma-butyrolactone autoregulators and receptor proteins in non-Streptomyces actinomycetes producing commercially important secondary metabolites. Arch. Microbiol., 180 (4), 303–7, 2003.
69. Fuqua, C. and Greenberg, E. P. Listening in on bacteria: acyl-homoserine lactone signaling. Nat. Rev. Mol. Cell. Biol., 3 (9), 685–95, 2002.
70. Horinouchi, S. and Beppu, T. A-factor as a microbial hormone that controls cellular differentiation and secondary metabolism in Streptomyces griseus. Mol. Microbiol., 12 (6), 859–64, 1994.
71. Ohnishi, Y., Yamazaki, H., Kato, J. Y., Tomono, A., and Horinouchi, S. AdpA, a central transcriptional regulator in the A-factor regulatory cascade that leads to morphological development and secondary metabolism in Streptomyces griseus. Biosci. Biotechnol. Biochem., 69 (3), 431–39, 2005.
72. Tomono, A., Tsai, Y., Yamazaki, H., Ohnishi, Y., and Horinouchi, S. Transcriptional control by A-factor of strR, the pathway-specific transcriptional activator for streptomycin biosynthesis in Streptomyces griseus. J. Bacteriol., 187 (16), 5595–604, 2005.
73. Kato, J. Y., Miyahisa, I., Mashiko, M., Ohnishi, Y., and Horinouchi, S. A single target is sufficient to account for the biological effects of the A-factor receptor protein of Streptomyces griseus. J. Bacteriol., 186 (7), 2206–11, 2004.
74. Yamazaki, H., Tomono, A., Ohnishi, Y., and Horinouchi, S. DNA-binding specificity of AdpA, a transcriptional activator in the A-factor regulatory cascade in Streptomyces griseus. Mol. Microbiol., 53 (2), 555–72, 2004.
75. Kato, J. Y., Ohnishi, Y., and Horinouchi, S. Autorepression of AdpA of the AraC/XylS family, a key transcriptional activator in the A-factor regulatory cascade in Streptomyces griseus. J. Mol. Biol., 350 (1), 12–26, 2005.
76. Onaka, H. and Horinouchi, S. DNA-binding activity of the A-factor receptor protein and its recognition DNA sequences. Mol. Microbiol., 24 (5), 991–1000, 1997.
77. Santamarta, I., Perez-Redondo, R., Lorenzana, L. M., Martin, J. F., and Liras, P. Different proteins bind to the butyrolactone receptor protein ARE sequence located upstream of the regulatory ccaR gene of Streptomyces clavuligerus. Mol. Microbiol., 56 (3), 824–35, 2005.
78. Yamada, Y., Sugamura, K., Kondo, K., Yanagimoto, M., and Okada, H. The structure of inducing factors for virginiamycin production in Streptomyces virginiae. J. Antibiot. (Tokyo), 40 (4), 496–504, 1987.
79. Kitani, S., Yamada, Y., and Nihira, T. Gene replacement analysis of the butyrolactone autoregulator receptor (FarA) reveals that FarA acts as a novel regulator in secondary metabolism of Streptomyces lavendulae FRI-5. J. Bacteriol., 183 (14), 4357–63, 2001.
80. Takano, E., Chakraburtty, R., Nihira, T., Yamada, Y., and Bibb, M. J. A complex role for the gamma-butyrolactone SCB1 in regulating antibiotic production in Streptomyces coelicolor A3(2). Mol. Microbiol., 41 (5), 1015–28, 2001.
81. Takano, E., Kinoshita, H., Mersinias, V., Bucca, G., Hotchkiss, G., Nihira, T., Smith, C. P., Bibb, M., Wohlleben, W., and Chater, K. A bacterial hormone (the SCB1) directly controls the expression of a pathway-specific regulatory gene in the cryptic type I polyketide biosynthetic gene cluster of Streptomyces coelicolor. Mol. Microbiol., 56 (2), 465–79, 2005.
82. Stratigopoulos, G., Gandecha, A. R., and Cundliffe, E. Regulation of tylosin production and morphological differentiation in Streptomyces fradiae by TylP, a deduced gamma-butyrolactone receptor. Mol. Microbiol., 45 (3), 735–44, 2002.
83. Kim, H. S., Lee, Y. J., Lee, C. K., Choi, S. U., Yeo, S. H., Hwang, Y. I., Yu, T. S., Kinoshita, H., and Nihira, T. Cloning and characterization of a gene encoding the gamma-butyrolactone autoregulator receptor from Streptomyces clavuligerus. Arch. Microbiol., 182 (1), 44–50, 2004.
84. Lee, K. M., Lee, C. K., Choi, S. U., Park, H. R., Kitani, S., Nihira, T., and Hwang, Y. I. Cloning and in vivo functional analysis by disruption of a gene encoding the gamma-butyrolactone autoregulator receptor from Streptomyces natalensis. Arch. Microbiol., 184 (4), 249–57, 2005.
85. Choi, S. U., Lee, C. K., Hwang, Y. I., Kinoshita, H., and Nihira, T. Cloning and functional analysis by gene disruption of a gene encoding a gamma-butyrolactone autoregulator receptor from Kitasatospora setae. J. Bacteriol., 186 (11), 3423–30, 2004.
86. Folcher, M., Gaillard, H., Nguyen, L. T., Nguyen, K. T., Lacroix, P., Bamas-Jacques, N., Rinkel, M., and Thompson, C. J. Pleiotropic functions of a Streptomyces pristinaespiralis autoregulator receptor in development, antibiotic biosynthesis, and expression of a superoxide dismutase. J. Biol. Chem., 276 (47), 44297–306, 2001.
87. Matsuno, K., Yamada, Y., Lee, C. K., and Nihira, T. Identification by gene deletion analysis of barB as a negative regulator controlling an early process of virginiamycin biosynthesis in Streptomyces virginiae. Arch. Microbiol., 181 (1), 52–9, 2004.
88. Bate, N., Butler, A. R., Gandecha, A. R., and Cundliffe, E. Multiple regulatory genes in the tylosin biosynthetic cluster of Streptomyces fradiae. Chem. Biol., 6 (9), 617–24, 1999.
89. Stratigopoulos, G., Bate, N., and Cundliffe, E. Positive control of tylosin biosynthesis: pivotal role of TylR. Mol. Microbiol., 54 (5), 1326–34, 2004.
90. Stratigopoulos, G. and Cundliffe, E. Expression analysis of the tylosin-biosynthetic gene cluster: pivotal regulatory role of the tylQ product. Chem. Biol., 9 (1), 71–8, 2002.
91. Bate, N., Stratigopoulos, G., and Cundliffe, E. Differential roles of two SARP-encoding regulatory genes during tylosin biosynthesis. Mol. Microbiol., 43 (2), 449–58, 2002.
92. Kawachi, R., Akashi, T., Kamitani, Y., Sy, A., Wangchaisoonthorn, U., Nihira, T., and Yamada, Y. Identification of an AfsA homologue (BarX) from Streptomyces virginiae as a pleiotropic regulator controlling autoregulator biosynthesis, virginiamycin biosynthesis and virginiamycin M1 resistance. Mol. Microbiol., 36 (2), 302–13, 2000.
93. Shikura, N., Yamamura, J., and Nihira, T. barS1, a gene for biosynthesis of a gamma-butyrolactone autoregulator, a microbial signaling molecule eliciting antibiotic production in Streptomyces species. J. Bacteriol., 184 (18), 5151–57, 2002.
94. Recio, E., Colinas, A., Rumbero, A., Aparicio, J. F., and Martin, J. F. PI factor, a novel type quorum-sensing inducer elicits pimaricin production in Streptomyces natalensis. J. Biol. Chem., 279 (40), 41586–93, 2004.
95. Price, B., Adamidis, T., Kong, R., and Champness, W. A Streptomyces coelicolor antibiotic regulatory gene, absB, encodes an RNase III homolog. J. Bacteriol., 181 (19), 6142–51, 1999.
Regulation of Secondary Metabolism in Bacteria
13-23
96. Aigle, B., Wietzorrek, A., Takano, E., and Bibb, M. J. A single amino acid substitution in region 1.2 of the principal sigma factor of Streptomyces coelicolor A3(2) results in pleiotropic loss of antibiotic production. Mol. Microbiol., 37 (5), 995–1004, 2000. 97. Bignell, D. R., Lau, L. H., Colvin, K. R., and Leskiw, B. K. The putative anti-anti-sigma factor BldG is post-translationally modified by phosphorylation in Streptomyces coelicolor. FEMS Microbiol. Lett., 225 (1), 93–9, 2003. 98. Bignell, D. R., Tahlan, K., Colvin, K. R., Jensen, S. E., and Leskiw, B. K. Expression of ccaR, encoding the positive activator of cephamycin C and clavulanic acid production in Streptomyces clavuligerus, is dependent on bldG. Antimicrob., Agents Chemother., 49 (4), 1529–41, 2005. 99. Leskiw, B. K., Lawlor, E. J., Fernandez-Abalos, J. M., and Chater, K. F. TTA codons in some genes prevent their expression in a class of developmental, antibiotic-negative, Streptomyces mutants. Proc. Natl. Acad. Sci. USA, 88 (6), 2461–5, 1991. 100. Fernandez-Moreno, M. A., Caballero, J. L., Hopwood, D. A., and Malpartida, F. The act cluster contains regulatory and antibiotic export genes, direct targets for translational control by the bldA tRNA gene of Streptomyces. Cell, 66 (4), 769–80, 1991. 101. Leskiw, B. K., Mah, R., Lawlor, E. J., and Chater, K. F. Accumulation of bldA-specified tRNA is temporally regulated in Streptomyces coelicolor A3(2). J. Bacteriol., 175 (7), 1995–2005, 1993. 102. Wright, F. and Bibb, M. J. Codon usage in the G+C-rich Streptomyces genome. Gene, 113 (1), 55–65, 1992. 103. Rebets, Y. V., Ostash, B. O., Fukuhara, M., Nakamura, T., and Fedorenko, V. O. Expression of the regulatory protein LndI for landomycin E production in Streptomyces globisporus 1912 is controlled by the availability of tRNA for the rare UUA codon. FEMS Microbiol. Lett. 256 (1), 30–7, 2006. 104. Ryding, N. J., Anderson, T. B., and Champness, W. C. 
Regulation of the Streptomyces coelicolor calcium-dependent antibiotic by absA, encoding a cluster-linked two-component system. J. Bacteriol. 184 (3), 794–805, 2002. 105. Huang, J., Shi, J., Molle, V., Sohlberg, B., Weaver, D., Bibb, M. J., Karoonuthaisiri, N., Lih, C. J., Kao, C. M., Buttner, M. J., and Cohen, S. N. Cross-regulation among disparate antibiotic biosynthetic pathways of Streptomyces coelicolor. Mol. Microbiol., 58 (5), 1276–87, 2005. 106. Stulke, J. and Hillen, W. Carbon catabolite repression in bacteria. Curr. Opin. Microbiol., 2 (2), 195– 201, 1999. 107. Escalante, L., Ramos, I., Imriskova, I., Langley, E., and Sanchez, S. Glucose repression of anthracycline formation in Streptomyces peucetius var. caesius. Appl. Microbiol. Biotechnol., 52, 572–78, 1999. 108. Titgemeyer, F., Walkenhorst, J., Reizer, J., Stuiver, M. H., Cui, X., and Saier, M. H., Jr. Identification and characterization of phosphoenolpyruvate:fructose phosphotransferase systems in three Streptomyces species. Microbiology, 141 (Pt 1), 51–8, 1995. 109. Butler, M. J., Deutscher, J., Postma, P. W., Wilson, T. J., Galinier, A., and Bibb, M. J. Analysis of a ptsH homologue from Streptomyces coelicolor A3(2). FEMS Microbiol. Lett., 177 (2), 279–88, 1999. 110. Guzman, S., Ramos, I., Moreno, E., Ruiz, B., Rodriguez-Sanoja, R., Escalante, L., Langley, E., and Sanchez, S. Sugar uptake and sensitivity to carbon catabolite regulation in Streptomyces peucetius var. caesius. Appl. Microbiol. Biotechnol., 69 (2), 200–6, 2005. 111. Ramos, I., Guzman, S., Escalante, L., Imriskova, I., Rodriguez-Sanoja, R., Sanchez, S., and Langley, E. Glucose kinase alone cannot be responsible for carbon source regulation in Streptomyces peucetius var. caesius. Res. Microbiol., 155 (4), 267–74, 2004. 112. Camilli, A. and Bassler, B. L. Bacterial small-molecule signaling pathways. Science, 311 (5764), 1113–36, 2006. 113. Postma, P. W., Lengeler, J. W., and Jacobson, G. R. 
Phosphoenolpyruvate:carbohydrate phosphotransferase systems of bacteria. Microbiol. Rev., 57 (3), 543–94, 1993.
13-24
Bacterial Transcriptional Regulation of Metabolism
114. Susstrunk, U., Pidoux, J., Taubert, S., Ullmann, A., and Thompson, C. J. Pleiotropic effects of cAMP on germination, antibiotic biosynthesis and morphological development in Streptomyces coelicolor. Mol. Microbiol., 30 (1), 33–46, 1998. 115. Chatterjee, S. and Vining, L. C. Catabolite repression in Streptomyces venezuelae. Induction of betagalactosidase, chloramphenicol production, and intracellular cyclic adenosine 3’,5’-monophosphate concentrations. Can. J. Microbiol., 28 (3), 311–7, 1982. 116. Chakraburtty, R. and Bibb, M. The ppGpp synthetase gene (relA) of Streptomyces coelicolor A3(2) plays a conditional role in antibiotic production and morphological differentiation. J. Bacteriol., 179 (18), 5854–61, 1997. 117. Sun, J., Hesketh, A., and Bibb, M. Functional analysis of relA and rshA, two relA/spoT homologues of Streptomyces coelicolor A3(2). J. Bacteriol., 183 (11), 3488–98, 2001. 118. Jin, W., Kim, H. K., Kim, J. Y., Kang, S. G., Lee, S. H., and Lee, K. J. Cephamycin C production is regulated by relA and rsh genes in Streptomyces clavuligerus ATCC27064. J. Biotechnol., 114 (1–2), 81–7, 2004. 119. Jin, W., Ryu, Y. G., Kang, S. G., Kim, S. K., Saito, N., Ochi, K., Lee, S. H., and Lee, K. J. Two relA/ spoT homologous genes are involved in the morphological and physiological differentiation of Streptomyces clavuligerus. Microbiology, 150 (Pt 5), 1485–93, 2004. 120. Hesketh, A., Sun, J., and Bibb, M. Induction of ppGpp synthesis in Streptomyces coelicolor A3(2) grown under conditions of nutritional sufficiency elicits actII-ORF4 transcription and actinorhodin biosynthesis. Mol. Microbiol., 39 (1), 136–44, 2001. 121. Xu, J., Tozawa, Y., Lai, C., Hayashi, H., and Ochi, K. A rifampicin resistance mutation in the rpoB gene confers ppGpp-independent antibiotic production in Streptomyces coelicolor A3(2). Mol. Genet. Genomics, 268 (2), 179–89, 2002. 122. Hu, H., Zhang, Q., and Ochi, K. 
Activation of antibiotic biosynthesis by specified mutations in the rpoB gene (encoding the RNA polymerase beta subunit) of Streptomyces lividans. J. Bacteriol., 184 (14), 3984–91, 2002. 123. McDowall, K. J., Thamchaipenet, A., and Hunter, I. S. Phosphate control of oxytetracycline production by Streptomyces rimosus is at the level of transcription from promoters overlapped by tandem repeats similar to those of the DNA-binding sites of the OmpR family. J. Bacteriol., 181 (10), 3025–32, 1999. 124. Liras, P., Asturias, J. A., and Martin, J. F. Phosphate control sequences involved in transcriptional regulation of antibiotic biosynthesis. Trends Biotechnol., 8 (7), 184–89, 1990. 125. Sola-Landa, A., Rodriguez-Garcia, A., Franco-Dominguez, E., and Martin, J. F. Binding of PhoP to promoters of phosphate-regulated genes in Streptomyces coelicolor: identification of PHO boxes. Mol. Microbiol., 56 (5), 1373–85, 2005. 126. Chouayekh, H. and Virolle, M. J. The polyphosphate kinase plays a negative role in the control of antibiotic production in Streptomyces lividans. Mol. Microbiol., 43 (4), 919–30, 2002. 127. Chiang, P. K., Gordon, R. K., Tal, J., Zeng, G. C., Doctor, B. P., Pardhasaradhi, K., and McCann, P. P. S-Adenosylmethionine and methylation. Faseb J., 10 (4), 471–80, 1996. 128. Okamoto, S., Lezhava, A., Hosaka, T., Okamoto-Hosoya, Y., and Ochi, K. Enhanced expression of S-adenosylmethionine synthetase causes overproduction of actinorhodin in Streptomyces coelicolor A3(2). J. Bacteriol., 185 (2), 601–9, 2003. 129. Kim, D. J., Huh, J. H., Yang, Y. Y., Kang, C. M., Lee, I. H., Hyun, C. G., Hong, S. K., and Suh, J. W. Accumulation of S-adenosyl-L-methionine enhances production of actinorhodin but inhibits sporulation in Streptomyces lividans TK23. J. Bacteriol., 185 (2), 592–600, 2003. 130. Natsumi, S., Kazuhiko, K., Jun, X., Okamoto, X., and Ochi, K. Effect of S-adenosylmethionine on antibiotic production in Streptomyces griseus and Streptomyces griseuoflavus. 
Actinomyceteology, 17, 47–49, 2003. 131. Huh, J. H., Kim, D. J., Zhao, X. Q., Li, M., Jo, Y. Y., Yoon, T. M., Shin, S. K., Yong, J. H., Ryu, Y. W., Yang, Y. Y., and Suh, J. W. Widespread activation of antibiotic biosynthesis by S-adenosylmethionine in streptomycetes. FEMS Microbiol. Lett., 238 (2), 439–47, 2004.
Regulation of Secondary Metabolism in Bacteria
13-25
132. Lee, H. S., Ohnishi, Y., and Horinouchi, S. A sigmaB-like factor responsible for carotenoid biosynthesis in Streptomyces griseus. J. Mol. Microbiol. Biotechnol., 3 (1), 95–101, 2001. 133. Takano, H., Obitsu, S., Beppu, T., and Ueda, K. Light-induced carotenogenesis in Streptomyces coelicolor A3(2): identification of an extracytoplasmic function sigma factor that directs photodependent transcription of the carotenoid biosynthesis gene cluster. J. Bacteriol., 187 (5), 1825–32, 2005. 134. Takano, H., Asker, D., Beppu, T., and Ueda, K. Genetic control for light-induced carotenoid production in non-phototrophic bacteria. J. Ind. Microbiol. Biotechnol., 33 (2), 88–93, 2006. 135. Price-Whelan, A., Dietrich, L. E., and Newman, D. K. Rethinking ‘secondary’ metabolism: physiological roles for phenazine antibiotics. Nat. Chem. Biol., 2 (2), 71–78, 2006. 136. Schuster, M., Lostroh, C. P., Ogi, T., and Greenberg, E. P. Identification, timing, and signal specificity of Pseudomonas aeruginosa quorum-controlled genes: a transcriptome analysis. J. Bacteriol., 185 (7), 2066–79, 2003. 137. Wagner, V. E., Bushnell, D., Passador, L., Brooks, A. I., and Iglewski, B. H. Microarray analysis of Pseudomonas aeruginosa quorum-sensing regulons: effects of growth phase and environment. J. Bacteriol., 185 (7), 2080–95, 2003. 138. Latifi, A., Foglino, M., Tanaka, K., Williams, P., and Lazdunski, A. A hierarchical quorum-sensing cascade in Pseudomonas aeruginosa links the transcriptional activators LasR and RhIR (VsmR) to expression of the stationary-phase sigma factor RpoS. Mol. Microbiol., 21 (6), 1137–46, 1996. 139. McGrath, S., Wade, D. S., and Pesci, E. C. Dueling quorum sensing systems in Pseudomonas aeruginosa control the production of the Pseudomonas quinolone signal (PQS). FEMS Microbiol., Lett 230 (1), 27–34, 2004. 140. Cao, H., Krishnan, G., Goumnerov, B., Tsongalis, J., Tompkins, R., and Rahme, L. G. 
A quorum sensing-associated virulence gene of Pseudomonas aeruginosa encodes a LysR-like transcription regulator with a unique self-regulatory mechanism. Proc. Natl. Acad. Sci. USA, 98 (25), 14613–18, 2001. 141. Wade, D. S., Calfee, M. W., Rocha, E. R., Ling, E. A., Engstrom, E., Coleman, J. P., and Pesci, E. C. Regulation of Pseudomonas quinolone signal synthesis in Pseudomonas aeruginosa. J. Bacteriol., 187 (13), 4372–80, 2005. 142. Diggle, S. P., Winzer, K., Chhabra, S. R., Worrall, K. E., Camara, M., and Williams, P. The Pseudomonas aeruginosa quinolone signal molecule overcomes the cell density-dependency of the quorum sensing hierarchy, regulates rhl-dependent genes at the onset of stationary phase and can be produced in the absence of LasR. Mol. Microbiol., 50 (1), 29–43, 2003. 143. Pesci, E. C., Milbank, J. B., Pearson, J. P., McKnight, S., Kende, A. S., Greenberg, E. P., and Iglewski, B. H. Quinolone signaling in the cell-to-cell communication system of Pseudomonas aeruginosa. Proc. Natl. Acad. Sci. USA, 96 (20), 11229–34, 1999. 144. Gallagher, L. A., McKnight, S. L., Kuznetsova, M. S., Pesci, E. C., and Manoil, C. Functions required for extracellular quinolone signaling by Pseudomonas aeruginosa. J. Bacteriol., 184 (23), 6472–80, 2002. 145. Deziel, E., Lepine, F., Milot, S., He, J., Mindrinos, M. N., Tompkins, R. G., and Rahme, L. G. Analysis of Pseudomonas aeruginosa 4-hydroxy-2-alkylquinolines (HAQs) reveals a role for 4-hydroxy-2heptylquinoline in cell-to-cell communication. Proc. Natl. Acad. Sci. USA, 101 (5), 1339–44, 2004. 146. Mashburn, L. M. and Whiteley, M. Membrane vesicles traffic signals and facilitate group activities in a prokaryote. Nature, 437 (7057), 422–5, 2005. 147. Mavrodi, D. V., Bonsall, R. F., Delaney, S. M., Soule, M. J., Phillips, G., and Thomashow, L. S. Functional analysis of genes for biosynthesis of pyocyanin and phenazine-1-carboxamide from Pseudomonas aeruginosa PAO1. J. Bacteriol., 183 (21), 6454–65, 2001. 148. Diggle, S. 
P., Cornelis, P., Williams, P., and Camara, M. 4-quinolone signalling in Pseudomonas aeruginosa: old molecules, new perspectives. Int. J. Med. Microbiol., 296 (2–3), 83–91, 2006. 149. Ledgham, F., Ventre, I., Soscia, C., Foglino, M., Sturgis, J. N., and Lazdunski, A. Interactions of the quorum sensing regulator QscR: interaction with itself and the other regulators of Pseudomonas aeruginosa LasR and RhlR. Mol. Microbiol., 48 (1), 199–210, 2003.
13-26
Bacterial Transcriptional Regulation of Metabolism
150. Palma, M., Zurita, J., Ferreras, J. A., Worgall, S., Larone, D. H., Shi, L., Campagne, F., and Quadri, L. E. Pseudomonas aeruginosa SoxR does not conform to the archetypal paradigm for SoxR-dependent regulation of the bacterial oxidative stress adaptive response. Infect. Immun., 73 (5), 2958–66, 2005. 151. Whiteley, M., Lee, K. M., and Greenberg, E. P. Identification of genes controlled by quorum sensing in Pseudomonas aeruginosa. Proc. Natl. Acad. Sci. USA, 96 (24), 13904–9, 1999. 152. Aendekerk, S., Ghysels, B., Cornelis, P., and Baysse, C. Characterization of a new efflux pump, MexGHI-OpmD, from Pseudomonas aeruginosa that confers resistance to vanadium. Microbiology, 148 (Pt 8), 2371–81, 2002. 153. Aendekerk, S., Diggle, S. P., Song, Z., Hoiby, N., Cornelis, P., Williams, P., and Camara, M. The MexGHI-OpmD multidrug efflux pump controls growth, antibiotic susceptibility and virulence in Pseudomonas aeruginosa via 4-quinolone-dependent cell-to-cell communication. Microbiology, 151 (Pt 4), 1113–25, 2005. 154. Pearson, J. P., Van Delden, C., and Iglewski, B. H. Active efflux and diffusion are involved in transport of Pseudomonas aeruginosa cell-to-cell signals. J. Bacteriol., 181 (4), 1203–10, 1999. 155. Kohler, T., van Delden, C., Curty, L. K., Hamzehpour, M. M., and Pechere, J. C. Overexpression of the MexEF-OprN multidrug efflux system affects cell-to-cell signaling in Pseudomonas aeruginosa. J. Bacteriol., 183 (18), 5213–22, 2001. 156. Hassan, H. M. and Fridovich, I. Mechanism of the antibiotic action pyocyanine. J. Bacteriol., 141 (1), 156–63, 1980. 157. Wilkinson, C. J., Hughes-Thomas, Z. A., Martin, C. J., Bohm, I., Mironenko, T., Deacon, M., Wheatcroft, M., Wirtz, G., Staunton, J., and Leadlay, P. F. Increasing the efficiency of heterologous promoters in actinomycetes. J. Mol. Microbiol. Biotechnol., 4 (4), 417–26, 2002.
14 A Synthetic Approach to Transcriptional Regulatory Engineering

Wilson W. Wong and James C. Liao
University of California

14.1 Introduction
Biological Motivation for the Synthetic Approach • Challenges in Synthetic Circuit Design and Construction
14.2 Lessons from Natural Circuits
The Lac Operon • Oocyte Development
14.3 Synthetic Intracellular Circuits
Modular Circuits • Integrated Circuits
14.4 Synthetic Intercellular Circuits
Natural Cell–Cell Communication Systems • Synthetic Intercellular Circuits • Artificial Cell–Cell Communication System
14.5 Alternative Synthetic Strategies
14.6 Issues in Population versus Single Cell Measurements
14.7 Conclusion
References
14.1 Introduction
It has been well recognized that regulation is a major hurdle to metabolic engineering. Various control loops are connected in multiple networks to direct metabolic flux in accordance with intracellular and extracellular conditions. As such, the cell is often able to negate the alteration introduced by the metabolic engineer. Traditional metabolic engineering has focused on disrupting these regulatory loops and forcing the metabolic flux to a desired pathway regardless of the cell’s need. While this approach has enjoyed tremendous success, complex metabolic engineering may require sophisticated redesign of regulation networks rather than their elimination. In addition, building de novo regulatory loops may be necessary for foreign pathways to achieve higher yield and productivity. These applications call for the design of “synthetic” regulatory circuits that do not naturally exist in cells. The best-studied regulatory system is transcriptional regulation. Various molecular tools are available to rewire and characterize the transcriptional regulatory circuits in many organisms. By connecting isolated regulatory systems, interesting properties may emerge. Several examples have demonstrated that synthetic regulatory circuits can successfully exhibit the designed behavior, such as bistability [1,2] and oscillation [3–5]. While most of these circuits have not been used in metabolic engineering, a synthetic feedback loop has increased productivity for isoprenoid production [6], demonstrating the potential of non-native regulatory circuits and the need to investigate the design principles of native regulatory circuits.
14.1.1 Biological Motivation for the Synthetic Approach
In addition to constructing non-native control circuits, the synthetic approach is also useful in identifying biological design principles that are of both fundamental and practical importance. Traditionally, regulatory circuits are studied using the reductionist approach, which focuses on the biochemical functions of each component. The systems approach, on the other hand, focuses on elucidating network connectivity at the whole-cell level. Although these two approaches are essential to elucidating the global behavior of biological systems, they do not readily yield design principles that help to guide the construction of non-native circuits. Existing intracellular networks are often complicated by many auxiliary circuits that may mask the basic design principle of the system. Deducing fundamental principles by illuminating each component in modern-day cells is therefore as difficult as rediscovering the fundamental laws of physics by disassembling an automobile.

An alternative, dubbed the synthetic approach, is to generate hypothetical operating principles and then test them using artificially synthesized networks designed according to these principles. This design approach may avoid second-order functions that are not important for the first principle. Furthermore, synthetic networks are not limited by the natural biological systems and provide a wider parameter range for characterization. The synthetic approach starts from a good idea, much like the design of any engineering system. At this stage, the operating principle is inspired by physical and mathematical insight but constrained by biological and chemical realities. A prototype mathematical model serving as a conceptual blueprint is useful and often necessary. Once such a conceptual prototype model is constructed, the proper biological components, such as promoters, regulators, enzymes, and metabolites, are identified to fulfill the design specification. Finally, the network is “reconstituted” inside the cell to test the properties of the system. Creation of artificial systems also allows one to explore potential applications that are not displayed by the natural design (Figure 14.1).

This network synthesis approach is analogous to the in vitro protein reconstitution commonly performed in biochemistry and molecular biology, except that the network is reconstituted in vivo by selecting the proper genetic and metabolic elements. The design principles tested using the synthetic approach may or may not be used in nature. However, they provide focal points to search for similar designs in the cell and to examine the gap between theoretical prediction and biological reality.
14.1.2 Challenges in Synthetic Circuit Design and Construction
Although the design of synthetic biological circuits shares many steps with conventional engineering construction, biological circuits face a major complication: uncertainty, which is manifested at two levels. First, the lack of detailed kinetic parameters for biological elements prevents the precise prediction of network behavior. Second, the unknown interactions of these molecules with other cellular components cause additional difficulty. Construction of synthetic circuits is therefore challenging and often iterative. Nonetheless, based on existing knowledge, several synthetic circuits have already been demonstrated and have provided valuable insights into the design principles of biological networks. Since the design of synthetic circuits is largely based on knowledge of natural biological circuits, we will begin by briefly reviewing two natural circuits whose molecular details are known: one relatively simple but rich in dynamics, and the other seemingly complex but manifesting insightful design principles. We will then discuss the design of synthetic circuits for both intracellular and intercellular systems.
14.2 Lessons from Natural Circuits
14.2.1 The Lac Operon
First described by Jacob and Monod in the early 1960s, the lac operon is one of the most studied gene regulation systems [71] and is considered the paradigm of bacterial gene regulation. The physiological function of this operon is to control the uptake and utilization of lactose. The lac operon
[Figure 14.1 panels: Design (design a network with the desired behavior); Mathematical Analysis (develop models to generate hypotheses for experimental testing); Biological Components (choose and characterize components with the desired property); Experiment (“reconstitute” the synthetic system or “prototype,” test, and make adjustments).]
Figure 14.1 Process flow diagram for the design and construction of synthetic biological circuits. As with many engineering projects, the first step is to generate a conceptual design. Biological components are then identified and mathematical models are constructed. After analyzing the feasibility of the concept, the design is implemented inside the cells. Due to uncertainties associated with biological systems, implementation of synthetic circuits tends to be iterative. Efforts are being made to create standardized parts and infrastructures to reduce uncertainties [69].
consists of three genes, lacZ, lacY, and lacA, that are transcribed together. The lacZ gene codes for β-galactosidase, which converts lactose into allolactose and also hydrolyzes lactose into galactose and glucose. The gene lacY encodes a lactose permease that facilitates the transport of lactose into the cytoplasm. The gene lacA codes for a pyranoside acetyltransferase, which is thought to serve as a detoxifying enzyme by acetylating non-metabolizable sugars to prevent their re-entry into the cell. The activity of the lac operon promoter is controlled by two transcription regulators, LacI and CRP. LacI binds to the lac operator site in the absence of lactose and represses transcription. CRP is a global regulator that, when bound to cyclic AMP (cAMP), helps stabilize the interaction of RNA polymerase with the lac promoter and increases gene expression from the operon (Figure 14.2a). An interesting dynamic feature of this operon is the positive feedback loop formed by lactose and LacY, which can generate multiple steady states. The presence of a positive feedback loop, however, does not guarantee multistability. To gain better insight into the behavior of this operon, Ozbudak et al. [7] examined the lac operon using a systems approach. Instead of focusing only on the promoter or the operon itself, they examined the lac operon together with its interactions with cAMP, lactose, and glucose metabolism. The result is a phase diagram of lactose metabolism that maps regions of bistability and
[Figure 14.2 diagram labels: (a) Glc, cAMP, CRP, lactose, LacY, LacI, LacZ, lactose metabolism, and the Plac promoter upstream of lacZ, lacY, and lacA; (b) Mos, MEK, p42 MAPK, Rsk, Myt1, and Cdc2/cyclin B.]
Figure 14.2 Natural biological circuits. Network diagram of (a) the lactose operon in E. coli, and (b) the MAP kinase pathway of oocyte maturation. The dashed line from the Cdc2/cyclin B complex to Mos represents an interaction that might involve as yet unknown pathways.
monostability. From the phase diagram, Ozbudak et al. were able to predict conditions that lead to monostability and bistability. Some of the predictions were verified experimentally.
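The idea of a phase diagram that separates monostable from bistable regions can be illustrated with a minimal sketch. The model below is a hypothetical one-variable positive feedback loop, not the lac model of Ozbudak et al.; the rate law, the parameter names `basal` and `vmax`, and the grid values are all assumptions chosen only for illustration. The code counts steady states by looking for sign changes of dx/dt and labels each parameter pair as monostable (M) or bistable (B).

```python
# Hypothetical toy model (an assumption, not the published lac model):
#   dx/dt = basal + vmax * x^2 / (1 + x^2) - x
# x is a dimensionless expression level; the Hill term is the positive feedback.

def rate(x, basal, vmax):
    """Net rate of change of x for the toy positive-feedback model."""
    return basal + vmax * x**2 / (1.0 + x**2) - x


def count_steady_states(basal, vmax, x_max=20.0, n_grid=20000):
    """Count sign changes of dx/dt on a fine grid of x values.

    Each sign change brackets one steady state. A root that falls exactly
    on a grid point would be missed; this is acceptable for a sketch.
    """
    count = 0
    prev = rate(0.0, basal, vmax)
    for i in range(1, n_grid + 1):
        cur = rate(i * x_max / n_grid, basal, vmax)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count


def classify(basal, vmax):
    """Return 'B' if three steady states exist (bistable), otherwise 'M'."""
    return "B" if count_steady_states(basal, vmax) == 3 else "M"


def phase_diagram(basals, vmaxes):
    """Classify every (basal, vmax) pair; rows follow basals."""
    return [[classify(b, v) for v in vmaxes] for b in basals]


if __name__ == "__main__":
    basals = [0.02, 0.05, 0.1, 0.2]
    vmaxes = [1.0, 1.5, 2.0, 2.5]
    for b, row in zip(basals, phase_diagram(basals, vmaxes)):
        print(f"basal={b:<5}", " ".join(row))
```

In this toy model, a strong feedback term with low basal expression gives three steady states (two stable, one unstable), whereas weak feedback or high basal expression leaves a single steady state, which mirrors the point that positive feedback alone does not guarantee multistability.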
14.2.2 Oocyte Development
Maturation and development are usually irreversible processes. When oocytes are transiently exposed to progesterone, they begin the maturation process by leaving the G2 arrest state, completing a meiotic division, and finally arresting at metaphase of meiosis II for days with active cell-division cycle protein kinase (Cdc2) and p42 mitogen-activated protein kinase (MAPK) present [8]. Progesterone first activates the synthesis of Mos, which in turn phosphorylates and activates MEK. Activated MEK then phosphorylates and activates MAPK, which further augments the activity of Cdc2 (Figure 14.2b). The presence of active Cdc2 further enhances the accumulation of Mos and thus forms a positive feedback loop. Since every phosphorylation step in the cascade is reversible, with turnover rates on the order of minutes, the positive feedback loop is considered to be the reason for the observed irreversibility. Indeed, Xiong and Ferrell have shown that this MAPK cascade signal transduction system possesses bistability, and oocytes remain at meiosis II even when induced only transiently by progesterone, thus forming a “memory” [9]. Even more remarkable is the fact that when the positive feedback loop was blocked, the oocyte maturation process became transient and the oocytes “dematured.” This elegant study demonstrates how systems-level analysis coupled with traditional biochemical characterization can generate novel results.
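The link between positive feedback and “memory” can be sketched with a minimal simulation. The model below is a hypothetical single-variable abstraction, not the MAPK cascade model of Xiong and Ferrell: the variable x stands loosely for pathway activity, the stimulus stands for transient progesterone exposure, and the rate law and all parameter values are assumptions. With feedback, the activity stays high after the stimulus is removed; with the feedback blocked, it relaxes back to the inactive state.

```python
# Hypothetical abstraction (an assumption, not the published oocyte model):
#   dx/dt = stimulus(t) + feedback * x^2 / (1 + x^2) - x
# stimulus(t) is 1 during the transient exposure and 0 afterwards.

def simulate(feedback, pulse_end=5.0, t_end=30.0, dt=0.01):
    """Integrate the toy model with forward Euler and return the final x."""
    x = 0.0
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        t = i * dt
        stimulus = 1.0 if t < pulse_end else 0.0
        dxdt = stimulus + feedback * x**2 / (1.0 + x**2) - x
        x += dt * dxdt
    return x


if __name__ == "__main__":
    # With feedback = 2.5 the unstimulated system has stable states at
    # x = 0 and x = 2, separated by an unstable state at x = 0.5.
    print("with feedback:   ", simulate(feedback=2.5))
    print("feedback blocked:", simulate(feedback=0.0))
```

The transient stimulus pushes x past the unstable threshold, after which the feedback term alone sustains the active state. Setting the feedback strength to zero corresponds, in this sketch, to the experiment in which the loop is blocked and the response becomes transient.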
14.3 Synthetic Intracellular Circuits
14.3.1 Modular Circuits
14.3.1.1 Negative Feedback Loops
One of the most direct ways to test network design principles is to recreate the network in a controlled environment to eliminate undesired interactions. From theoretical analysis, negative feedback is known to reduce disturbance. Biological circuits in the presence of stochasticity are noisy, and thus a negative feedback loop is expected to reduce transcription noise and to buffer against environmental fluctuations. To test this prediction experimentally, Becskei and Serrano constructed a negative feedback system in which the tet promoter controls its own negative regulator, TetR [10] (Figure 14.3a). By comparison with a system without the feedback control, they indeed demonstrated that a negative feedback loop reduces the variance of gene expression. On the other hand, it has been shown that negative feedback could also amplify the noise [11], depending on the context. Although this prediction has not been verified experimentally, the insight may have far-reaching impact since negative feedback is commonly used in biology.
14.3.1.2 Positive Feedback Loops
Positive feedback loops can theoretically generate multistability and are instrumental in developmental processes. A detailed understanding of this network motif can have profound implications in medicine and biotechnology. Becskei et al. created a positive feedback loop in Saccharomyces cerevisiae in which a promoter expresses its own activator, causing autocatalytic gene expression [2] (Figure 14.3b). Without the autocatalytic feature, the promoter showed a graded expression response. With autocatalysis, however, the promoter displayed a binary response, in which two populations of either high or low expression level exist. A different positive feedback loop, built from two mutually repressing genes, was constructed by Gardner et al. [1].
Promoter 1, which is regulated by repressor 1, controls the expression of repressor 2, which represses promoter 2. Promoter 2 in turn controls the expression of repressor 1 (Figure 14.3c). This network can function as a toggle switch and can be induced to a high gene expression level by transient exposure to inducers. Two versions of the toggle switch were constructed: one uses LacI and TetR as the repressors, and the other uses LacI and λ cI. Both versions displayed stable switching behavior over several generations. Based on the similar network architecture,
[Figure 14.3 diagram labels: (a) tetO1 promoter and tetR; (b) tetreg promoter and rtTA; (c) tetO1 or PL promoter and lacI, trc promoter and tetR or cI.]
Figure 14.3 Synthetic transcriptional feedback controllers. Network diagrams of (a) the negative feedback loop and (b) the positive feedback loop constructed by Becskei et al., and (c) the toggle switch constructed by Gardner et al. The promoter tetO1 is a synthetic promoter [70] with the –10 and –35 regions from the lambda phage PL promoter and two binding sites for the repressor TetR. The rtTA regulator is a tetracycline-responsive transactivator and can activate expression from the tetreg promoter in a doxycycline-dependent manner. The promoter trc is another synthetic promoter that contains the –35 region from the trp promoter and the –10 region from the lacUV5 promoter. (From Becskei, A., Seraphin, B. and Serrano, L., EMBO J., 2001, 20, 2528–2535; Lutz, R. and Bujard, H., Nucleic Acids Res., 1997, 25, 1203–1210. With permission.)
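The toggle switch architecture of Figure 14.3c, in which two repressors each shut off the promoter of the other, can be sketched as a pair of coupled rate equations. The dimensionless form below, the parameter values, and the way the inducer is modeled (scaling down the active fraction of repressor u) are assumptions made for illustration and should not be read as the fitted model of Gardner et al. The sketch shows that a transient inducer pulse flips the switch and that the new state persists after the inducer is removed.

```python
# Assumed dimensionless mutual-repression model (illustrative only):
#   du/dt = alpha / (1 + v^n) - u
#   dv/dt = alpha / (1 + (u / (1 + inducer))^n) - v
# u and v are the two repressor levels; the inducer inactivates repressor u.

def simulate_toggle(u, v, inducer_pulse=0.0, pulse_end=10.0,
                    alpha=10.0, hill=2.0, t_end=40.0, dt=0.01):
    """Forward-Euler integration; returns the final (u, v) state."""
    n_steps = int(round(t_end / dt))
    for i in range(n_steps):
        t = i * dt
        inducer = inducer_pulse if t < pulse_end else 0.0
        u_active = u / (1.0 + inducer)  # fraction of u still able to repress
        du = alpha / (1.0 + v**hill) - u
        dv = alpha / (1.0 + u_active**hill) - v
        u += dt * du
        v += dt * dv
    return u, v


if __name__ == "__main__":
    # Start in the "u high, v low" state.
    print("no inducer:     ", simulate_toggle(9.9, 0.1))
    print("transient pulse:", simulate_toggle(9.9, 0.1, inducer_pulse=100.0))
```

Without the pulse the system remains in the state it started in. With the pulse, v accumulates while u is inactivated, v then represses u, and the "v high, u low" state is retained once the inducer is withdrawn.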
[Figure 14.4 diagram labels: (a) Repressilator: tetO1, lacO1, λ PR, λ cI, LacI, TetR; (b) Ninfa–Atkinson clock: glnAp2Os, glnG, glnK, lacI.]
Figure 14.4 Synthetic transcription oscillators. Schematic network diagram for (a) the repressilator and (b) the Ninfa–Atkinson clock. tetO1 and lacO1 are synthetic promoters designed to have a large dynamic range with low leakiness. See text for more detailed descriptions.
a mammalian version of the toggle switch and a positive feedback loop were later engineered by Kramer et al. [12,13]. These synthetic circuits further demonstrate that positive feedback loops can generate multistability.
14.3.1.3 Oscillators
An oscillatory transcription network, termed the “repressilator,” that uses a three-repressor system to form a ring oscillator was constructed by Elowitz and Leibler [3]. In this circuit, a transcription factor (repressor 1) represses the expression of repressor 2. Similarly, repressor 2 reduces the expression of yet another transcription regulator (repressor 3) (Figure 14.4a). Finally, repressor 3 represses the expression of repressor 1 to complete the circuit. Based on this network structure, Elowitz and Leibler developed a mathematical model and determined that promoters with low leakiness and repressors with shorter half-lives would put the oscillator in a parametric regime that displays oscillation. The final construct uses LacI as repressor 1, TetR as repressor 2, and λ cI as repressor 3, each fused at the C-terminus to a short peptide sequence that shortens the half-life of the protein. To obtain a readout of the circuit, a green fluorescent protein gene, gfp, was placed under the control of the TetR-responsive promoter. This system was not designed to be synchronized, and therefore observation of the oscillatory behavior required single-cell measurements. Approximately 40% of the cells carrying the repressilator displayed oscillation, with a period of 150 minutes. This work demonstrates that a network of three mutually repressing controllers can generate oscillation in a biological system. A different system was later engineered by Atkinson et al. [4]. The conceptual design is reminiscent of the one proposed by Barkai and Leibler [14], which was inspired by observations of naturally occurring oscillators. Through simulation, this design was determined to be relatively noise resistant.
This system involves a promoter that controls its own activator and repressor (Figure 14.4b). The repressor also represses the expression of the activator. For the repressor, LacI was chosen and placed under the control of the glnK promoter. For the activator, glnG was chosen and placed under the control of a modified glnAp2 promoter with LacI binding sites. The gene glnG is part of the Ntr two-component system involved in the nitrogen starvation response. Both the glnAp2 and glnK promoters are activated by glnG, but glnK requires a higher level of activated glnG than glnAp2 to be fully induced. The period of the oscillation ranges from 10 to 20 hours and can be observed at the population level when cells are grown at constant cell density (turbidostat conditions). This circuit resembles a predator (lacI)-prey (glnG) model and demonstrates that such a circuit can exhibit oscillation in single-celled organisms.
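The ring structure of the repressilator described above can be explored numerically. The following is a minimal sketch, not the model from the cited work: it assumes a commonly used dimensionless form in which each mRNA m_i is produced at a rate repressed by the previous protein in the ring and each protein p_i relaxes toward its mRNA level. The equations, the function names, and the parameter values (alpha, alpha0, beta, n) are illustrative assumptions.

```python
def repressilator_rhs(state, alpha, alpha0, beta, n):
    """Right-hand side of an assumed dimensionless three-repressor ring model.

    state = [m1, m2, m3, p1, p2, p3]; gene i is repressed by the previous
    protein in the ring (index i - 1, wrapping around via negative indexing).
    alpha0 is the leaky (fully repressed) production rate.
    """
    m, p = state[:3], state[3:]
    dm = [-m[i] + alpha / (1.0 + p[i - 1] ** n) + alpha0 for i in range(3)]
    dp = [-beta * (p[i] - m[i]) for i in range(3)]
    return dm + dp


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for a list-valued state."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(t_end=300.0, dt=0.05, alpha=216.0, alpha0=0.216, beta=0.2, n=2.0):
    """Integrate the ring model from an asymmetric start (illustrative values)."""
    state = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]  # unequal proteins break the symmetry
    f = lambda s: repressilator_rhs(s, alpha, alpha0, beta, n)
    trace = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(f, state, dt)
        trace.append(state)
    return trace


trace = simulate()
late_p1 = [s[3] for s in trace[len(trace) // 2:]]  # protein 1, second half of run
print("protein 1 range late in the run:", min(late_p1), max(late_p1))
```

Raising alpha0 (leakier promoters) or changing beta (the ratio of protein to mRNA decay rates) in this sketch is one way to see how leakiness and half-life affect whether the late-time range collapses to a single value.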
14.3.2 Integrated Circuits
14.3.2.1 Plug and Play Circuits
To avoid complications associated with unknown interactions with the host’s cellular network, the pioneering synthetic genetic circuits were designed to be isolated from the rest of the cellular physiology. These systems have proven to be valuable tools for understanding network design principles. Biological circuits, however, have evolved to control cellular functions. Integration of circuits into cellular physiology and metabolism is therefore required to achieve the intended capability. One approach to integrating a circuit into the host is to take an existing modular circuit and connect its input and output to the regulatory networks of the host. Kobayashi et al. [15] incorporated the toggle switch into the SOS response and the biofilm formation network of E. coli. With this configuration, transient exposure to a small dose of UV radiation flips the toggle switch from one state to the other. The output can be protein expression or biofilm formation.
14.3.2.2 Integrated Dynamic Feedback Controller
Another approach to circuit integration is to employ cellular physiology as part of the circuit. Farmer and Liao designed a dynamic feedback controller in E. coli that places the expression of a key enzyme in the lycopene production pathway under the control of a metabolite, acetyl phosphate [6]. The goal of this circuit is to balance resource utilization between cell growth and metabolite production, a challenge commonly encountered in metabolic engineering. When grown on glucose, E. coli produces a metabolic by-product, acetate, when the tricarboxylic acid (TCA) cycle is unable to accommodate the incoming glycolytic flux (Figure 14.5a). The buildup of acetate signifies that the cells have enough energy and thus represents a prime opportunity to shift cellular resources from cell growth toward the production of metabolites.
When the level of acetate increases,
Figure 14.5 Synthetic gene-metabolic circuits. Network diagrams of (a) the dynamic intracellular feedback controller and (b) the metabolator. The dynamic feedback controller regulates the flux going to product formation based on the amount of waste product generated. The signaling molecule acetyl phosphate (AcP), an intermediate in the production of acetate (waste), is used as an indicator of waste formation. The promoter glnAp2 responds to the level of AcP and controls the expression of the key enzyme. The construction of the metabolator, which also employs the AcP system, demonstrated that metabolic flux can be used to control the dynamics of an oscillator.
its precursor, acetyl phosphate, also increases and activates the production of lycopene. This circuit illustrates how a synthetic circuit can be applied to a metabolic engineering problem.
14.3.2.3 Integrated Gene and Metabolic Oscillator
An integrated gene and metabolic oscillator, termed the metabolator, that couples directly with the central metabolism of E. coli was constructed by Fung et al. [5] (Figure 14.5b). The metabolator consists of a flux-carrying network with two interconverting metabolite pools (M1 and M2) whose interconversion is catalyzed by two enzymes (E1 and E2). The expression of E1 is negatively regulated by M2, and the expression of E2 is positively regulated by M2. In the first stage, when the M2 level is low, E1 is expressed while E2 is not. A high input metabolic flux converts M1 to M2 rapidly. The accumulation of M2 represses E1 and up-regulates E2. When the backward reaction rate exceeds the sum of the forward reaction rate and the output rate, the M2 level decreases and the M1 level increases. E1 is then expressed again and E2 is degraded, returning the system to the first stage. If the input flux is low, on the other hand, M2 does not accumulate quickly enough to cause a large swing in gene expression, and a stable steady state is reached. This design allows metabolism to control gene expression cycles, a characteristic commonly seen in circadian regulation.
The experimental design is realized with the promoter glnAp2 controlling two genes, lacI and acs (encoding acetyl-coenzyme A synthetase). Phosphotransacetylase (Pta), a reversible enzyme that catalyzes the conversion of acetyl coenzyme A (AcCoA) to acetyl phosphate (AcP), is placed under the control of the lacO1 promoter. The lacO1 promoter is a synthetic promoter designed to reduce the leakiness of gene expression when repressed while maintaining a large dynamic range of protein expression. To obtain a readout of the circuit, a green fluorescent protein is placed under the control of another LacI-repressible promoter.
All proteins were fused to an ssrA degradation peptide to reduce their half-lives. The promoter glnAp2, in the absence of NRII (a bifunctional protein kinase/phosphatase regulator involved in the nitrogen starvation response), can be activated by AcP [6]. Here AcCoA corresponds to M1, AcP corresponds to M2, Pta corresponds to E1, and Acs corresponds to E2. When the AcP level is low, the LacI and Acs expression levels are also low, which derepresses the lacO1 promoter and in turn increases the production of Pta. As Pta is produced, it converts AcCoA into AcP, which activates the glnAp2 promoter and drives the synthesis of Acs and LacI. The increasing concentration of LacI represses the transcription of pta, hence lowering the level of AcP. Meanwhile, as the level of Acs increases, it converts more acetate into AcCoA. This removes the downstream product of the AcP degradation pathway and helps lower the level of AcP. One important aspect of this design is the interconversion of two metabolite pools, AcCoA and AcP, through two enzymes, Pta and Acs, that are controlled by the circuit. These two enzymes in turn respond, indirectly and directly, to AcP.
To gain more insight into the system, nonlinear differential equations and bifurcation analysis were used to probe its dynamic properties. The analysis predicts that oscillation is favored when the metabolic influx is high. This flux-driven dynamics is a very important feature of the design: an imbalance of metabolic fluxes destabilizes the steady states and leads to oscillation, which allows the metabolator to respond to the glycolytic influx. To test the prediction, cells carrying the metabolator were cultured on different carbon sources that support different glycolytic rates. When grown on glucose, fructose, or mannose, which support high glycolytic flux, the metabolator exhibits oscillation.
When grown on glycerol, however, which yields a low glycolytic flux, the metabolator did not exhibit oscillation. The experimental results therefore confirmed the mathematical predictions. In addition to glycolytic flux, the protein degradation kinetics can also exert great influence on the metabolator. The degradation kinetics of ssrA-tagged protein is first-order when measured in vivo at the population level [16]. Through mathematical modeling, Wong et al. showed that zeroth-order degradation kinetics enhances the robustness of the metabolator by further destabilizing the steady states of the system [17]. The construction of the metabolator demonstrates that the two-pool network architecture produces oscillation with metabolic flux as the driving force.
14.4 Synthetic Intercellular Circuits
Intercellular communication is of paramount importance to the development of higher organisms. Recently, numerous reports have also demonstrated the importance of cell-cell communication in bacterial physiology. Further insight into biological networks requires a better understanding of both the molecular details and the system-level behavior of intercellular communication. Using well-characterized model organisms such as E. coli and S. cerevisiae, one can construct synthetic intercellular circuits to deduce underlying principles, similar to the approach used in developing synthetic intracellular circuits.
14.4.1 Natural Cell–Cell Communication Systems
Bacteria were long considered to be unicellular organisms that do not interact with other bacteria. This notion is mainly due to the way bacteria are studied: in most laboratory conditions, they are grown in pure culture. Bacteria in natural environments, however, rarely live alone in pure planktonic culture. Environmental biologists have recognized the importance of bacterial communities to the biogeochemical cycling that maintains the biosphere [18]. A biofilm is one example of a bacterial community. Biofilms can be composed of a single species or of multiple species. Such communities can live on biotic and abiotic surfaces and perform diverse metabolic functions. To coordinate population behavior, intercellular communication is essential.
In the 1970s, researchers identified a communication system in Vibrio fischeri with homoserine lactone as the signaling molecule, which was termed autoinducer-1 (AI-1). V. fischeri is a symbiont that lives in the light-producing organ of Euprymna scolopes. At low cell density, V. fischeri produces small amounts of AI-1 in proportion to its growth. Once a threshold concentration of AI-1 is reached, it activates the lux operon, which encodes the light-emitting enzyme (luciferase) and the AI-producing enzymes. This gene expression system responds to cell density, or quorum, and is therefore given the name quorum sensing. Since then, numerous examples of quorum sensing have been found in both Gram-positive and Gram-negative bacteria, with different types of diffusible molecules and different regulation mechanisms. The functions under the control of quorum sensing include sporulation, biofilm formation, and virulence factor expression.
In higher organisms, cellular development relies heavily on the signaling cues generated by other cells. Complex spatial patterning results from multiple intercellular signals coupled with feedback.
An example is the development of left-right asymmetry in the mouse embryo. The heart and other inner organs develop an asymmetrical arrangement and morphogenesis [19]. The expression and relay of the TGF-β family signaling molecule Nodal are crucial for breaking symmetry and differentiating left and right organs. Nodal couples with the extracellular cofactor EGF-CFC to form a positive feedback loop that sustains its own expression. Moreover, Nodal also induces another intercellular signal, Lefty-2/antivin, to form a negative feedback loop. In chicks, Nodal was found to induce a Cerberus-like molecule, Caronte, to relay the signal to more distal cells [20].
14.4.2 Synthetic Intercellular Circuits
Equipping circuits with communication systems greatly enhances their capabilities, as demonstrated by the marriage of the Internet and computers. Analogously, the capabilities of biological circuits can be greatly improved with an intercellular communication network. Engineering a communication system between cells allows cellular programming at the population level rather than at the single-cell level. Basu and Weiss used the quorum sensing system from V. fischeri to create a spatiotemporal pulse generator [21] (Figure 14.6a). This circuit contains sender cells that produce AI and receiver cells that generate a pulse of gene expression in response to an increasing level of AI. Using a similar concept but a different network configuration, Basu et al. were able to create specific spatial patterns, such as a “bull’s eye,” on solid media [22] (Figure 14.6b). These two reports demonstrate that
Figure 14.6 Intercellular communication network in E. coli based on the lux/AI system from V. fischeri. Network configurations for (a) the spatiotemporal pulse generator and (b) pattern formation. The * denotes that the protein half-life is reduced by the addition of a degradation sequence at the end of the protein. lacI M1 is a codon-modified mutant of lacI, used to reduce recombination with the other lacI in the system.
spatial and temporal patterns can be created in a unicellular organism, mimicking a powerful capability commonly found in higher organisms. You et al. developed a circuit in E. coli that can sense and control its own cell density [23]. Cells carrying this circuit produce AI continuously, so AI accumulates in a cell density-dependent manner (Figure 14.7a). When the AI level reaches a threshold, it activates a toxic gene, which leads to cell death. This circuit is a negative feedback loop that operates in concert at the population level. The stability of AI is pH dependent, and therefore the steady-state cell density can be modulated by pH. As with any negative feedback loop, this system can potentially oscillate when operated in the proper region of parameter space. In this system, the degradation of AI is probably too slow for oscillation to occur. When the cells are grown in a microchemostat, however, the constant washing of the growth chamber facilitates AI removal, which allows the circuit to display cell density oscillation [24].
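The dependence of the steady-state density on AI stability can be illustrated with a small worked model. This is only a sketch under stated assumptions: the three rate equations in the docstring, the function name, and every parameter value are illustrative choices, not taken from the chapter. The model assumes logistic growth of the density N, killing proportional to a killer protein E, E produced in proportion to AI (A), and A produced in proportion to N. Setting all three derivatives to zero gives a closed-form density.

```python
def steady_state_density(k, n_max, d, k_e, d_e, v_a, d_a):
    """Nonzero steady-state cell density of the assumed model

        dN/dt = k*N*(1 - N/n_max) - d*E*N   (growth minus killing)
        dE/dt = k_e*A - d_e*E               (killer protein)
        dA/dt = v_a*N - d_a*A               (autoinducer)

    At steady state A = v_a*N/d_a and E = k_e*A/d_e, so
    k*(1 - N/n_max) = d*k_e*v_a*N/(d_e*d_a), which is solved for N below.
    """
    feedback_gain = d * k_e * v_a / (d_e * d_a)
    return k / (k / n_max + feedback_gain)


# Illustrative (hypothetical) parameter values; only d_a differs between runs.
common = dict(k=1.0, n_max=1e9, d=0.004, k_e=5.0, d_e=2.0, v_a=5e-7)
slow_decay = steady_state_density(d_a=0.3, **common)
fast_decay = steady_state_density(d_a=0.6, **common)
print(slow_decay, fast_decay)
```

In this sketch a less stable autoinducer (larger d_a) weakens the feedback and raises the regulated density, which is the sense in which a pH-dependent AI decay rate could tune the set point.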
14.4.3 Artificial Cell–Cell Communication System
With only one channel of communication, the amount of information that can be transmitted is limited, and the availability of information carriers can limit the capability of the overall network. One way to increase the information content is to generate oscillating signals and encode information in the frequency. Another is to create more communication channels. The synthetic oscillators discussed in the earlier sections have the potential to generate an oscillating intercellular signal. A separate frequency decoder circuit, however, would be needed to complete the scheme. Bulter et al. addressed the latter possibility by designing and constructing an artificial communication system in E. coli that uses a gene and metabolic network with acetate as the signaling molecule [25]. Chen and Weiss
Figure 14.7 Engineering intercellular communication networks. (a) Schematic diagram of the population controller. AI is the diffusible autoinducer. Kill is a gene that, when expressed, can inhibit the growth of E. coli. (b) Artificial cell–cell communication in S. cerevisiae and (c) in E. coli. In (b), the signaling molecule and detection module are borrowed from the plant Arabidopsis thaliana. In (c), the signaling molecule is acetate, a metabolite from central metabolism, and the nitrogen starvation response two-component system Ntr is used to sense it. NR-I is responsive to acetate in the absence of its cognate sensor, NR-II.
later engineered an artificial communication system in Saccharomyces cerevisiae by incorporating Arabidopsis thaliana signal synthesis and receptor components into the host [26]. The design strategy identified here can also facilitate the development of more independent communication channels. An intercellular communication system can be divided into two modules: signal generation and signal detection. In designing a signal generation system, the signaling molecule must be chosen to allow transport into and out of the cells, and the production of the signal must be controllable. In designing the signal detection module, the response element must be specific to the signal and cannot be a general toxic response. The sensitivity of the detection system should also match the range of signal production levels and preferably be tunable. For the artificial communication system in E. coli, acetate was selected as the signaling molecule (Figure 14.7c). Acetate is mainly produced by the pta/ackA pathway [25]. The transport of acetate
across the membrane is passive, with the permeability of acetic acid (the conjugate acid of acetate) across the membrane being three orders of magnitude higher than that of acetate. Therefore, at low pH, where a larger fraction of the acetate is protonated, less total acetate is needed to move a given amount across the membrane. The acidity of the medium therefore changes the intracellular level of acetate. This property of weak acids allows dynamic tuning of the sensitivity of the system through pH modulation of the medium. For the detection module, the glnAp2 promoter described in the construction of the gene-metabolic oscillator was employed. The activity of the system was reported by gfp. This circuit exhibits cell density-dependent gene expression, and the sensitivity of this behavior can be modulated by pH and also through promoter engineering.
In the yeast communication system, the cytokinin isopentenyladenine (IP) from the plant Arabidopsis thaliana was chosen as the signaling molecule [26] (Figure 14.7b). IP can be generated by adenylate isopentenyl-transferases. To sense this signal, the cytokinin receptor AtCRE1 from the same plant was chosen. This receptor can interact with yeast’s endogenous phosphorylation signaling pathway YPD1/SKN7 to activate gene expression. In the wild-type strain, YPD1/SKN7 is part of a phosphorylation pathway with SLN1, a cell-surface osmosensor histidine kinase, and SSK1, an aspartate response regulator. Under normal conditions, SLN1 phosphorylates YPD1, which in turn phosphorylates SSK1 and represses HOG1 activity. HOG1 activity is crucial for survival under high-osmolarity conditions. Under normal conditions, however, HOG1 activity is lethal. The constant phosphorylation of YPD1 by SLN1 is a problem because it renders the system insensitive to IP. Deletion of SLN1, however, results in activation of HOG1. To circumvent this problem, Chen and Weiss removed SLN1 but overexpressed an endogenous HOG1 phosphatase to decrease HOG1 activity.
This artificial system exhibits quorum sensing behavior when both the signal generation module and the receiver module are present in the same cell.
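The pH tuning described above for the acetate-based E. coli system follows from weak-acid equilibrium. As a brief worked example, the Henderson-Hasselbalch relation gives the fraction of total acetate present as the membrane-permeable acetic acid form. The pKa of 4.76 is the standard textbook value for acetic acid; the function name and the pH values compared are illustrative assumptions, not taken from the chapter.

```python
ACETIC_ACID_PKA = 4.76  # standard textbook value near room temperature


def protonated_fraction(ph, pka=ACETIC_ACID_PKA):
    """Fraction of total acetate present as undissociated acetic acid (HOAc).

    From Henderson-Hasselbalch: [OAc-]/[HOAc] = 10**(pH - pKa), so
    HOAc / (HOAc + OAc-) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for ph in (6.0, 7.0):  # hypothetical medium pH values for comparison
    print(ph, protonated_fraction(ph))
```

Because the exponent changes by one for each pH unit, lowering the medium pH by one unit in this range raises the permeable fraction by roughly an order of magnitude, which is why less total acetate is needed at low pH.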
14.5 Alternative Synthetic Strategies
A major challenge facing synthetic circuit engineering is the tedious construction process. To expedite construction, different synthetic strategies have been employed to create circuits. Guet et al. used a combinatorial approach to circuit construction [27]. They randomly mixed five promoters with three transcription factors to generate an array of 125 possible circuits that behave as biological logic gates, such as NAND, NOR, and NOT IF. The behavior of some of the circuits can be readily determined from their connectivity. Other circuits, however, display behavior contrary to the output expected from their network connectivity.
Another approach to generating synthetic biological circuits is directed evolution, pioneered by Yokobayashi et al. [28]. In this approach, a simple genetic regulatory network was constructed by using a lacI-repressible promoter to control the expression of another repressor, cI. A cI-repressible promoter then controls a reporter protein, EYFP. With this design, one would expect that when a high level of IPTG is added, a large amount of cI is induced and, therefore, a low level of EYFP is produced. Because of mismatches among various biochemical parameters, however, the circuit does not function as expected. Yokobayashi et al. then used directed evolution to select for mutants that possess the desired properties. The works described above once again demonstrate that both network connectivity and biochemical parameters are needed to fully determine and design biological circuits.
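The count of 125 in the combinatorial approach above is consistent with a simple enumeration. The sketch below assumes that each of the three transcription factor genes is independently assigned one of the five promoters, which gives 5 × 5 × 5 = 125 combinations. The promoter and gene names are hypothetical placeholders, not the parts used in the cited work.

```python
from itertools import product

PROMOTERS = ["P1", "P2", "P3", "P4", "P5"]  # hypothetical placeholder names
GENES = ["geneA", "geneB", "geneC"]         # hypothetical placeholder names


def enumerate_networks(promoters=PROMOTERS, genes=GENES):
    """Return every assignment of one promoter to each gene as a dict."""
    return [dict(zip(genes, choice))
            for choice in product(promoters, repeat=len(genes))]


networks = enumerate_networks()
print(len(networks))  # 5 ** 3 = 125
print(networks[0])    # {'geneA': 'P1', 'geneB': 'P1', 'geneC': 'P1'}
```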
14.6 Issues in Population versus Single Cell Measurements
Accurate descriptions of cellular components are critical to the design and construction of synthetic circuits. Most biological measurements are performed on a population of cells. Because of the heterogeneity that exists in every population, however, bulk measurements can sometimes give an incomplete or even inaccurate picture of the process. For example, in processes that involve an ultrasensitive response to inducer, such as Xenopus oocyte maturation [29] and the various positive-feedback gene expression systems [7,30,31] discussed earlier, population measurements can yield a graded response to inducer while single-cell measurements show a switch-like response. This difference
between single-cell and population observations is caused by the presence of distinct subpopulations, a consequence of the ultrasensitivity. Because of this inconsistency, the underlying mechanism inferred from the population average will be incorrect. Single-cell measurements therefore represent an important method for uncovering phenomena that are masked by population diversity. Moreover, when the dynamics of the biological circuits are not synchronized, such as those displayed by the synthetic oscillators or by the Bacillus subtilis development process [32,33], single-cell measurements are the only means to observe the response of the circuits.
Typically, only “snapshots” are needed when performing single-cell measurements. Time-course tracking of individual cells is usually necessary only when the dynamics of individual cells are not synchronized. However, an example from a single-cell protein degradation study showed that synchronized dynamics can also be masked by population heterogeneity, causing the bulk dynamics to appear different from the single-cell dynamics. Theoretical analysis has shown that the kinetics of protein degradation can exert a profound effect on the performance of biological circuits [34,35]. In another theoretical analysis, zeroth-order degradation was shown to enhance the parametric robustness of the metabolator described earlier [17]. The degradation kinetics of an ssrA-tagged GFP, whose tag targets the GFP for degradation by the ClpXP and ClpAP proteases, is first-order when measured at the population level. The single-cell degradation kinetics, however, is zeroth-order [17]. The discrepancy is caused by a long-tailed distribution of the initial GFP level and potentially by variation in the protease level. This difference in kinetics between single cells and the population exists even when all the cells exhibit the same degradation rate and start the degradation at the same time.
Since the Km value determined in vitro (75 nM) for the ssrA-tagged protein and the ClpXP protease is much smaller than the amount of substrate protein present, the degradation kinetics is expected to fall into the zeroth-order regime of the Michaelis–Menten equation. The single-cell data appear to be consistent with the in vitro data. Because of the heterogeneity of the protein level, however, the first-order kinetics observed at the population level suggests a much higher Km value relative to the amount of substrate protein. A comparison of the in vivo population-level data with the in vitro data can lead to the incorrect assumption that unidentified factors in vivo are increasing the Km value, while the actual cause of the discrepancy is the population diversity in protein level. Although demonstrated for protein degradation, the masking phenomenon can be generalized to any protein modification process, such as phosphorylation.
A routinely encountered question in circuit design and analysis is the issue of robustness. How robust is the system of interest under various intrinsic and extrinsic fluctuations? Much exciting work has been done recently to quantify the stochasticity, or noise, in gene expression [36–39] and the sources of such stochasticity [40–44]. Based on a bioinformatics study, noise in gene expression seems to be minimized for essential genes, suggesting the importance of noise regulation in fitness enhancement [45]. Genome-scale analysis of protein expression noise in S. cerevisiae showed that housekeeping genes, such as ribosomal and protease genes, are less noisy than stress response genes [46,47]. These results suggest that noisy gene expression may be beneficial during the stress response. Blake et al. showed that, during acute environmental changes, cell-cell variability and rapid response can increase the fitness of the organism [48]. Noise has also been demonstrated to play an important role in E. coli pap operon pili expression [49,50], the bacteriophage lytic decision [51], and the HIV viral latency decision [52]. Incorporating these findings into the design of synthetic circuits promises to improve the design and performance of such systems.
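The masking of zeroth-order single-cell degradation by population heterogeneity, discussed earlier in this section, can be reproduced in a small simulation. The sketch below makes a loud simplifying assumption that is not from the chapter: initial protein levels are drawn from an exponential distribution (one convenient long-tailed choice), and every cell loses protein at the same constant rate k from time zero. Under that assumption the population mean is mu·exp(−k·t/mu), which is exactly first-order decay. Function names and parameter values are illustrative.

```python
import math
import random


def population_mean_after(t, k, mean_initial, n_cells=50000, seed=0):
    """Simulated mean protein level at time t.

    Every cell degrades protein at the same constant (zeroth-order) rate k
    until it reaches zero; initial levels are exponentially distributed with
    mean `mean_initial` (an assumed long-tailed distribution).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_cells):
        x0 = rng.expovariate(1.0 / mean_initial)
        total += max(x0 - k * t, 0.0)
    return total / n_cells


def first_order_prediction(t, k, mean_initial):
    """Closed-form population mean for the same setup: mu * exp(-k*t/mu)."""
    return mean_initial * math.exp(-k * t / mean_initial)


for t in (0.0, 50.0, 100.0):
    print(t, population_mean_after(t, 1.0, 100.0),
          first_order_prediction(t, 1.0, 100.0))
```

Each simulated cell follows a straight line down to zero, yet the averaged curve tracks an exponential with apparent rate constant k/mu, so fitting the bulk curve alone would suggest first-order kinetics.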
14.7 Conclusion
Owing to the lack of detailed kinetic parameters and to unknown interactions with the host’s endogenous networks, construction of synthetic circuits can be challenging and time consuming. A significant gap in robustness and performance remains between natural circuits and synthetic circuits. Nonetheless, rational design and construction of synthetic circuits have provided valuable knowledge of the governing principles of cellular networks. Failure to achieve the designed behavior can also provide valuable
lessons and direct our attention toward reducing the gap between current understanding and reality. Although significant advances in genetic manipulation and DNA synthesis technologies have been made in the last several decades, numerous questions remain to be answered before an organism can be completely custom designed. Can we custom design enzymes to perform specific chemical reactions so that we can engineer organisms to produce synthetic drugs, similar to a chemist’s ability to design synthetic chemical routes? Once we have engineered the enzymes, can we combine them with control systems, such as transcriptional, translational, post-translational, and metabolic control, to enhance the efficiency of production? Our group is focused on integrating metabolic and transcriptional regulation. Others have engineered control at the translational level by manipulating the three-dimensional structure of mRNA and the interaction between ribosomes and mRNA [53–55]. Protein signaling cascades can also be manipulated by rewiring the input and output domains [56–58]. Borrowing from the powerful chemical synthetic capabilities of enzymes, biosynthetic pathways have been engineered and rewired not only to improve the yield of high-value compounds but also to create completely new compounds [59–68]. Work has also been done to engineer circuits, such as toggle switches, in mammalian cells [12,13]. In natural settings, cells rarely live alone in pure culture. Can we engineer various species to communicate and cooperate with one another to further harness nature’s potential for chemical synthesis and regulation? With our sophisticated electronic networks, can we directly integrate external electronic circuits with intracellular and intercellular circuits to form a hybrid entity with superior control and powerful chemical synthesis capabilities?
To answer these questions, efforts from a wide variety of disciplines, such as physics, engineering, biology, chemistry, and mathematics, will be required. Aside from the possibility of constructing custom organisms for biotechnological applications, the answers to these questions will also allow us to explore the design principles of life, and probably yield insights into the evolution and origin of life as well.
References
1. Gardner, T. S., Cantor, C. R., and Collins, J. J. 2000. Nature, 403, 339–42. 2. Becskei, A., Seraphin, B., and Serrano, L. 2001. EMBO J., 20, 2528–35. 3. Elowitz, M. B. and Leibler, S. 2000. Nature, 403, 335–38. 4. Atkinson, M. R., Savageau, M. A., Myers, J. T., and Ninfa, A. J. 2003. Cell, 113, 597–607. 5. Fung, E., Wong, W. W., Suen, J. K., Bulter, T., Lee, S. G., and Liao, J. C. 2005. Nature, 435, 118–22. 6. Farmer, W. R. and Liao, J. C. 2000. Nat. Biotechnol. 18, 533–37. 7. Ozbudak, E. M., Thattai, M., Lim, H. N., Shraiman, B. I., and Van Oudenaarden, A. 2004. Nature, 427, 737–40. 8. Ferrell, J. E., Jr. 2002. Curr. Opin. Cell. Biol., 14, 140–48. 9. Xiong, W. and Ferrell, J. E., Jr. 2003. Nature, 426, 460–65. 10. Becskei, A. and Serrano, L. 2000. Nature, 405, 590–93. 11. Xu, B. L. and Tao, Y. 2006. J. Theor. Biol., 243, 214–21. 12. Kramer, B. P., Viretta, A. U., Daoud-El-Baba, M., Aubel, D., Weber, W., and Fussenegger, M. 2004. Nat. Biotechnol., 22, 867–70. 13. Kramer, B. P. and Fussenegger, M. 2005. Proc. Natl. Acad. Sci. USA, 102, 9517–22. 14. Barkai, N. and Leibler, S. 2000. Nature, 403, 267–68. 15. Kobayashi, H., Kaern, M., Araki, M., Chung, K., Gardner, T. S., Cantor, C. R., and Collins, J. J. 2004. Proc. Natl. Acad. Sci. USA, 101, 8414–19. 16. Andersen, J. B. 1998. Appl. Environ. Microbiol., 64, 2240–46. 17. Wong, W. W., Tsai, T. Y., and Liao, J. C. 2007. Mol. Syst. Biol., 3, 1–8. 18. Davey, M. E. and O’Toole, G. A. 2000. Microbiol. Mol. Biol. Rev., 64, 847–67. 19. Gaio, U., Schweickert, A., Fischer, A., Garratt, A. N., Muller, T., Ozcelik, C., Lankes, W., Strehle, M., Britsch, S., Blum, M., and Birchmeier, C. 1999. Curr. Biol., 9, 1339–42. 20. Freeman, M. 2000. Nature, 408, 313–19.
21. Basu, S., Mehreja, R., Thiberge, S., Chen, M. T., and Weiss, R. 2004. Proc. Natl. Acad. Sci. USA, 101, 6355–60.
22. Basu, S., Gerchman, Y., Collins, C. H., Arnold, F. H., and Weiss, R. 2005. Nature, 434, 1130–34.
23. You, L., Cox, R. S., 3rd, Weiss, R., and Arnold, F. H. 2004. Nature, 428, 868–71.
24. Balagadde, F. K., You, L., Hansen, C. L., Arnold, F. H., and Quake, S. R. 2005. Science, 309, 137–40.
25. Bulter, T., Lee, S. G., Wong, W. W., Fung, E., Connor, M. R., and Liao, J. C. 2004. Proc. Natl. Acad. Sci. USA, 101, 2299–304.
26. Chen, M. T. and Weiss, R. 2005. Nat. Biotechnol., 23, 1551–55.
27. Guet, C., Elowitz, M. B., Hsing, W., and Leibler, S. 2002. Science, 296, 1466–70.
28. Yokobayashi, Y., Weiss, R., and Arnold, F. H. 2002. Proc. Natl. Acad. Sci. USA, 99, 16587–91.
29. Ferrell, J. E., Jr and Machleder, E. M. 1998. Science, 280, 895–98.
30. Novick, A. and Weiner, M. 1957. Proc. Natl. Acad. Sci. USA, 43, 553–66.
31. Siegele, D. A. and Hu, J. C. 1997. Proc. Natl. Acad. Sci. USA, 94, 8168–72.
32. Suel, G. M., Kulkarni, R. P., Dworkin, J., Garcia-Ojalvo, J., and Elowitz, M. B. 2007. Science, 315, 1716–19.
33. Suel, G. M., Garcia-Ojalvo, J., Liberman, L. M., and Elowitz, M. B. 2006. Nature, 440, 545–50.
34. Buchler, N. E., Gerland, U., and Hwa, T. 2005. Proc. Natl. Acad. Sci. USA, 102, 9559–64.
35. Kim, P. M. and Tidor, B. 2003. Genome Res., 13, 2391–95.
36. Elowitz, M. B., Levine, A. J., Siggia, E. D., and Swain, P. S. 2002. Science, 297, 1183–86.
37. Raser, J. M. and O’Shea, E. K. 2004. Science, 304, 1811–14.
38. Ozbudak, E. M., Thattai, M., Kurtser, I., Grossman, A. D., and van Oudenaarden, A. 2002. Nat. Genet., 31, 69–73.
39. Pedraza, J. M. and van Oudenaarden, A. 2005. Science, 307, 1965–69.
40. Becskei, A., Kaufmann, B. B., and van Oudenaarden, A. 2005. Nat. Genet., 37, 937–44.
41. Rosenfeld, N., Young, J. W., Alon, U., Swain, P. S., and Elowitz, M. B. 2005. Science, 307, 1962–65.
42. Volfson, D., Marciniak, J., Blake, W. J., Ostroff, N., Tsimring, L. S., and Hasty, J. 2006. Nature, 439, 861–64.
43. Austin, D. W., Allen, M. S., McCollum, J. M., Dar, R. D., Wilgus, J. R., Sayler, G. S., Samatova, N. F., Cox, C. D., and Simpson, M. L. 2006. Nature, 439, 608–11.
44. Colman-Lerner, A., Gordon, A., Serra, E., Chin, T., Resnekov, O., Endy, D., Pesce, C. G., and Brent, R. 2005. Nature, 437, 699–706.
45. Fraser, H. B., Hirsh, A. E., Giaever, G., Kumm, J., and Eisen, M. B. 2004. PLoS Biol., 2, e137.
46. Bar-Even, A., Paulsson, J., Maheshri, N., Carmi, M., O’Shea, E., Pilpel, Y., and Barkai, N. 2006. Nat. Genet., 38, 636–43.
47. Newman, J. R., Ghaemmaghami, S., Ihmels, J., Breslow, D. K., Noble, M., DeRisi, J. L., and Weissman, J. S. 2006. Nature, 441, 840–46.
48. Blake, W. J., Balazsi, G., Kohanski, M. A., Isaacs, F. J., Murphy, K. F., Kuang, Y., Cantor, C. R., Walt, D. R., and Collins, J. J. 2006. Mol. Cell., 24, 853–65.
49. Zhou, B., Beckwith, D., Jarboe, L. R., and Liao, J. C. 2005. Biophys. J., 88, 2541–53.
50. Jarboe, L. R., Beckwith, D., and Liao, J. C. 2004. Biotechnol. Bioeng., 88, 189–203.
51. Arkin, A., Ross, J., and McAdams, H. H. 1998. Genetics, 149, 1633–48.
52. Weinberger, L. S., Burnett, J. C., Toettcher, J. E., Arkin, A. P., and Schaffer, D. V. 2005. Cell, 122, 169–82.
53. Pfleger, B. F., Pitera, D. J., Smolke, C. D., and Keasling, J. D. 2006. Nat. Biotechnol., 24, 1027–32.
54. Bayer, T. S. and Smolke, C. D. 2005. Nat. Biotechnol., 23, 337–43.
55. Isaacs, F. J., Dwyer, D. J., Ding, C., Pervouchine, D. D., Cantor, C. R., and Collins, J. J. 2004. Nat. Biotechnol., 22, 841–47.
56. Bhattacharyya, R. P., Remenyi, A., Yeh, B. J., and Lim, W. A. 2006. Annu. Rev. Biochem., 75, 655–80.
57. Dueber, J. E., Yeh, B. J., Chak, K., and Lim, W. A. 2003. Science, 301, 1904–8.
58. Park, S. H., Zarrinpar, A., and Lim, W. A. 2003. Science, 299, 1061–64.
59. Ro, D. K., Paradise, E. M., Ouellet, M., Fisher, K. J., Newman, K. L., Ndungu, J. M., Ho, K. A., Eachus, R. A., Ham, T. S., Kirby, J., Chang, M. C., Withers, S. T., Shiba, Y., Sarpong, R., and Keasling, J. D. 2006. Nature, 440, 940–43.
60. Yoshikuni, Y., Ferrin, T. E., and Keasling, J. D. 2006. Nature, 440, 1078–82.
61. Lee, T. S., Khosla, C., and Tang, Y. 2005. J. Am. Chem. Soc., 127, 12254–62.
62. Tang, Y., Lee, T. S., and Khosla, C. 2004. PLoS Biol., 2, e31.
63. Schmidt-Dannert, C. 2000. Curr. Opin. Biotechnol., 11, 255–61.
64. Schmidt-Dannert, C., Umeno, D., and Arnold, F. H. 2000. Nat. Biotechnol., 18, 750–53.
65. Achkar, J., Xian, M., Zhao, H., and Frost, J. W. 2005. J. Am. Chem. Soc., 127, 5332–33.
66. Guo, J. and Frost, J. W. 2002. J. Am. Chem. Soc., 124, 528–29.
67. Wang, C., Oh, M. K., and Liao, J. C. 2000. Biotechnol. Prog., 16, 922–26.
68. Wang, C. W., Oh, M. K., and Liao, J. C. 1999. Biotechnol. Bioeng., 62, 235–41.
69. Endy, D. 2005. Nature, 438, 449–53.
70. Lutz, R. and Bujard, H. 1997. Nucleic Acids Res., 25, 1203–10.
71. Jacob, F. and Monod, J. 1961. J. Mol. Biol., 3, 318–56.
IV Modeling Tools for Metabolic Engineering
Costas D. Maranas Pennsylvania State University
15 Metabolic Flux Analysis, Maria I. Klapa
Introduction • Metabolic Flux Quantification Methods • Observability-Redundancy-Sensitivity Analysis • New Directions and Challenges • Conclusions
16 Metabolic Control Analysis, Joseph J. Heijnen
Introduction • Definitions and Structure of Metabolic Reaction Networks • Mathematical Models of Metabolic Networks • MCA: A Linear Kinetic Approximation • Nonlinear Approximate Kinetics • Parameterisation of the Metabolic Reaction Network Model • Conclusion and Outlook
17 Structure and Flux Analysis of Metabolic Networks, Kiran Raosaheb Patil, Prashant Madhusudan Bapat, and Jens Nielsen
Introduction • Metabolic Network Structure • Network Functionality at Metabolite Level • Conclusions and Future Perspective
18 Constraint-Based Genome-Scale Models of Cellular Metabolism, Radhakrishnan Mahadevan
Introduction • Methods for Model Development • Methods for Interrogating Metabolic Networks • Software and Databases for Genome-Scale Modeling • Survey of Genome-Scale Metabolic Models • Conclusions
19 Multiscale Modeling of Metabolic Regulation, C.A. Leclerc and Jeffrey D. Varner
Introduction • Background • The Multiscale Nature of Metabolism Follows from the Central Dogma of Molecular Biology • Construction of First-Principle and Reversed Engineered Models of Transcriptional Programs • Models of the Prokaryotic Translational Program • Integrating Transcriptional and Translational Programs with Physiology Leads to More Predictive Models • Summary and Conclusions
20 Validation of Metabolic Models, Sang Yup Lee, Hyohak Song, Tae Yong Kim, and Seung Bum Sohn
Introduction • Metabolic Model Validation • Conclusions and Future Prospects
In this section, a number of modeling tools are described that are becoming increasingly commonplace in the practice of metabolic engineering. These tools address metabolic network reconstruction, enumeration of all feasible flux distributions, reconciliation with various experimental data (e.g., DNA microarray data, gene essentiality experiments, and isotopomer spectra), and metabolism redirection for various overproduction targets. Despite the breadth of the described activities, this eclectic compilation of studies is by no means a complete enumeration of all facets of metabolic engineering that have been affected by modeling and computation. Even though our knowledge of metabolism and of other interacting biological processes remains incomplete, models provide a systematic way of organizing our current understanding, and computations allow us to test the limits of their predictive capabilities by revealing new biological insight or, more often, gaps in our understanding.

In Chapter 15, it is described how modeling and computations can be used, in the context of metabolic flux analysis (MFA), to elucidate fluxes in metabolic networks given GC/MS or NMR metabolite measurements. The identification of the metabolic flux distribution in a metabolic network under a particular set of conditions is of great significance because it uniquely captures the metabolic state of the system in a way that DNA microarray or proteomic data alone cannot. In this contribution, we see how a model that describes the propagation of labeled atoms in a metabolic network, combined with experimental data on the final destination of the labeled atoms, enables the deciphering of internal metabolic flows under steady-state and transient conditions.

In Chapter 17, we see how the totality of metabolism can be organized using network and graph concepts, and how regulation decomposes metabolic networks into subnetworks that are active only under certain conditions. Flux elucidation is described based on maximization principles (e.g., maximum biomass) or experimental measurements (i.e., MFA). Finally, insights are provided into the use of optimization tools (e.g., OptKnock) for metabolism redesign.

In Chapter 18, we see how individual metabolic pathways can be organized into organism-specific genome-scale models. Metabolic network reconstruction practices based on homology searches and manual curation are presented, followed by the derivation of a biomass equation that drains all biomass components in their appropriate biological ratios and accounts for ATP maintenance requirements. The completeness and correctness of the genome-scale model are subsequently assessed based on high-throughput growth physiology, gene essentiality, and by-product secretion patterns. Any discrepancies are resolved by manual curation. A genome-scale model enables a number of queries, such as extreme pathway and elementary mode analyses, which aim to identify all feasible metabolic flux distributions. Alternatively, flux variability analysis identifies ranges on all fluxes given a set of constraints on growth and/or substrates. Flux coupling analyses provide the means to pinpoint co-variability of different flux pairs. Finally, this study highlights new features in metabolic models such as dynamics and regulatory and thermodynamic constraints.

In Chapter 19, a new dimension in the modeling of metabolic networks is introduced. Specifically, multiscale modeling tools are reviewed that account not only for stoichiometry but also integrate kinetics with regulatory and control responses. The motivation is to enable the prediction of physiological switches in a dynamic sense that stoichiometric models inherently cannot capture. Two different avenues are highlighted: the augmented constraints-based methods and the cybernetic modeling framework. A number of examples are provided where the explicit modeling of transcription and translation integrated with metabolism leads to improved predictive capabilities.

Finally, in Chapter 20, a number of procedures are introduced for the construction and validation of genome-scale models. We see the necessity of comparing the simulation results with experimental data under various genetic and environmental perturbations. The use of single and, in some cases, double gene essentiality studies is reviewed, along with the use of a novel gene amplification technique as the means of reconciling model predictions. Furthermore, the use of genome-scale models for the design of a defined growth medium is highlighted.
15 Metabolic Flux Analysis

Maria I. Klapa, Foundation for Research and Technology-Hellas

15.1 Introduction
15.2 Metabolic Flux Quantification Methods
Metabolic Network Reconstruction • Metabolic Flux Vector • Metabolite Balancing Analysis • Isotopomer Distribution Analysis
15.3 Observability-Redundancy-Sensitivity Analysis
15.4 New Directions and Challenges
Flux Estimation from Metabolomics • Flux Analysis under Transient Physiological Conditions
15.5 Conclusions
References
15.1 Introduction

Let us consider any type of network from our daily lives: electrical and power networks, telephone, transportation, and computer networks, and the Internet itself. In all these cases, a certain quantity of the characteristic variable moves from one node to another along particular paths, complying with the underlying network’s (a) structure, (b) path capacities, and (c) availability of relevant resources. In network flow theory [1], the flux (or flow rate) on a particular path of the network is defined as the number of units of the characteristic variable that move from the starting node to the end node of the path per unit time [2]. In this sense, it is the integral, over the cross-sectional area (or surface), of the flux as defined in transport phenomena [3]. Accordingly, in (bio)chemical reaction networks, flux is defined as “the rate at which material is processed through a pathway” [4,5]. Based on this definition, the flux of a particular (bio)chemical reaction is equal to its rate, and the steady-state flux of a linear metabolic pathway is equal to the steady-state rate of any involved reaction [4].

As with the flux map of any type of network, being able to measure the metabolic flux distribution of a biological system under a particular set of conditions is of great significance. It provides a metabolic fingerprint of the system’s in vivo physiological state and thus constitutes a fundamental determinant of cellular physiology [4–7]. Considering the role of metabolism in the context of overall cellular function, it is easy to understand why quantifying a complete and accurate metabolic flux map, a task termed metabolic flux analysis (MFA), is among the major objectives of metabolic pathway engineering, quantitative systems biology, and integrated multiscale interaction analysis over multiple levels of cellular function.
It should be noted that the cellular metabolic state, and consequently the metabolic flux map, cannot be directly extrapolated from the genotype, the full-genome gene expression map, or the full-proteome protein concentration map, individually or in combination. Such an approach disregards the fact that the relationship between gene expression, protein concentration, and metabolic reaction rate is neither linear nor unidirectional [8,9]. Metabolism is regulated by local control mechanisms [4]. In addition, it constitutes a molecular framework through which the cell interacts with its environment; thus, the metabolic state could itself affect transcription and/or translation and not just be determined by them [4,10].

If extensive and reliable in vivo kinetic information for the metabolic reactions were available, MFA would be a straightforward task. Without this information, the MFA parameter estimation problem is based on the mass conservation law. Specifically, the intracellular reaction fluxes, which cannot be measured directly, are expressed as functions of measurable quantities (e.g., metabolic output, isotopic tracer distribution measurements) through metabolite and isotopic tracer balance equations based on a stoichiometric model [7]. The objective of MFA is to accurately invert the system of balances and determine the metabolic flux map that leads to the observed measurements. Because the available measurements are subject to random experimental errors and process variability, their values are not expected to strictly satisfy the balance equations, which leads to mathematical singularities that do not correspond to physical reality. Therefore, instead of solving a system of balances, flux estimation is defined as a data reconciliation problem [7,11]. In these weighted least-squares constrained minimization problems, the measured variables are optimally adjusted so that their adjusted values satisfy the balance equations [11].

The next section describes the specifics of the currently used metabolic flux quantification methods. Their advantages and weaknesses, the latter necessitating the introduction of new methods, will also be discussed in an effort to present a systematic methodology for flux quantification. The last section describes new directions of MFA toward transient and high-throughput analyses, as these are dictated by the needs of the post-genomic era.
15.2 Metabolic Flux Quantification Methods

15.2.1 Metabolic Network Reconstruction

The first task in MFA is the reconstruction of the most extensive potentially active metabolic reaction network of the organism in question. This will be the stoichiometric model on which the mass and isotopic tracer balances connecting the unknown intracellular fluxes with the measurable quantities will be formulated. Before the DNA sequencing revolution, the reconstruction of an organism’s metabolic network was mainly based on the existing knowledge about the metabolic network structure of similar cellular systems, along with any available data regarding in vitro/in vivo enzymatic activity and metabolic output measurements under various genetic backgrounds or environmental conditions [see, e.g., 12]. In the post-genomic era, the available resources are further augmented by the constantly increasing knowledge about gene annotation based on high-throughput sequencing [e.g., 13–15] and gene expression analyses [16]. While the availability of genomic data is a significant advancement in the process of metabolic network reconstruction, reconstruction remains a nontrivial task that requires an expert’s judgment to decide among the sometimes multiple feasible answers to questions that arise during the process [17]. Since the genome sequencing and/or annotation processes have not yet been completed for any organism, uncertainties regarding the intracellular biochemistry are bound to exist in any MFA, to a higher or lower degree depending on the organism. Any uncertainty regarding the considered stoichiometric model can affect the reliability of flux estimates as true measures of the actual in vivo fluxes [5,7]. This is the reason why data reconciliation, and thus measurement redundancy, is an integral part of MFA [18–20], as will be discussed in further detail in Section 15.3.
Barring measurement biases, inconsistencies between the measured and estimated values of redundant data indicate the need for modifications in the initially assumed structure of the metabolic pathway network, which can lead to new discoveries regarding the intracellular biochemistry of a particular organism [21–26].

After its reconstruction, the considered metabolic network comprises M metabolites and N intracellular reactions. In the context of graph theory, a metabolic network is represented by a graph whose nodes are the metabolites of the network (see Figure 15.1). A metabolite A is connected through a directed edge with a metabolite B only if A and B are reactant (product) and product (reactant), respectively, of a particular network reaction; the direction of the edge indicates the direction of the reaction. Obviously, if a reaction is potentially reversible, the graph includes edges of both directions. Two edges originating from different metabolites merge toward a third metabolite only in the case of condensation reactions. A frame enclosing all intracellular metabolites and reactions differentiates the “intra-” from the “extra-” cellular environment or, in the case of compartmentalization, the “inside” from the “outside” of the compartment environment. Edges crossing the frame(s) to and from intracellular (or inside-a-compartment) metabolites depict the transport reactions of the substrate(s) and excreted product(s). When analyzing a particular metabolic network, the frame does not necessarily have to include the entire metabolic network or a particular compartment; it could isolate just a fraction of it for further, more specific analysis (see legend of Figure 15.1).

[Figure 15.1: graph of an example metabolic network with metabolites A through G, fluxes v1, v2, v3, v4, v5f, v5b, v6, v7f, and v7b, input rate rAinput, output rates rBoutput and rFoutput, and a frame separating the “inside” from the “outside” of the network environment.]

Figure 15.1 In the context of graph theory, a metabolic network is depicted by a graph with the metabolites as nodes and the reactions of the network as edges. A frame enclosing certain metabolites and reactions defines the border between the “inside” and the “outside” of the network environment. Edges crossing this frame depict the transport reactions of the substrates and excreted products between the two environments. In the depicted network, A is the substrate of the network with input rate equal to $r_A^{input}$, while B and F are excreted out of the network with excretion rates equal to $r_B^{output}$ and $r_F^{output}$, respectively. $v_i$ represents the forward (= net) flux of the i-th irreversible network reaction. $v_j^f$ and $v_j^b$ depict, respectively, the forward and reverse flux of a reversible network reaction (the forward direction of a reversible reaction is selected a priori and does not necessarily coincide with the in vivo forward direction of the particular reaction).
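The graph representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-reaction network, the reaction names, and the helper `build_graph` are hypothetical and do not reproduce the network of Figure 15.1. Each reactant is connected to each product by a directed edge, reversible reactions contribute edges in both directions, and a condensation reaction makes edges from two different metabolites merge on the same product.

```python
from collections import defaultdict

# Hypothetical toy network (not the network of Figure 15.1):
# reaction name -> (reactants, products, reversible?)
reactions = {
    "v1": (["A"], ["B"], False),
    "v2": (["B"], ["C"], True),
    "v3": (["B", "C"], ["D"], False),  # condensation: B + C -> D
}


def build_graph(reactions):
    """Return {metabolite: set of (neighbour, edge label)} with directed edges."""
    edges = defaultdict(set)
    for name, (reactants, products, reversible) in reactions.items():
        for reactant in reactants:
            for product in products:
                # Forward edge; reversible reactions get an "f" suffix
                edges[reactant].add((product, name + ("f" if reversible else "")))
                if reversible:
                    # Reverse edge of a reversible reaction
                    edges[product].add((reactant, name + "b"))
    return dict(edges)


graph = build_graph(reactions)
for metabolite in sorted(graph):
    print(metabolite, "->", sorted(graph[metabolite]))
```

In this sketch the edges labeled `v3` leave both B and C and arrive at D, which is the merging pattern the text associates with condensation reactions.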
15.2.2 Metabolic Flux Vector

The in vivo activity of a metabolic reaction i is described by at most two fluxes: the nonnegative fluxes of its forward and backward (reverse) directions, which, in text and figures, are depicted as $v_i^f$ and $v_i^b$, respectively (see Figure 15.1); the latter flux is equal to zero in the case of irreversible reactions. Equivalently, the activity of a metabolic reaction could be represented by its net and exchange flux [21], which are depicted as $v_i^{net}$ and $v_i^{exch}$, respectively. Specifically, for any reaction i of the network (i = 1, …, N), the net and exchange fluxes are defined as functions of the forward and reverse fluxes as follows:

$$v_i^{net} = v_i^f - v_i^b \tag{15.1}$$

$$v_i^{exch} = \min\left(|v_i^f|, |v_i^b|\right) \tag{15.2}$$
[Figure 15.2: Reaction 1: $v^f$ = 40. Reaction 2: $v^f$ = 2440, $v^b$ = 2400, $v^{exch}$ = 2400. For both reactions, $v^{net}$ = 40.]

Figure 15.2 Neglecting the importance of exchange flux estimation could lead to erroneous biologically relevant conclusions. The two depicted reactions have the same net flux; however, they are not biochemically equivalent: one is irreversible, while the other is practically at equilibrium.
The net flux of a metabolic reaction represents the net conversion rate of material through reaction i and may in theory take any value from -∞ to +∞. The exchange flux of a metabolic reaction is a measure of the extent of its reversibility; it is by definition nonnegative, and zero in the case of irreversible reactions. While the estimation of the net conversion rate map of a metabolic reaction network has traditionally been given higher significance in MFA, in metabolic pathway engineering the accurate determination of the exchange fluxes is also of high value. For example, let us consider the two reactions shown in Figure 15.2. Although they have the same net flux, the two reactions are not biochemically equivalent; one is irreversible, while the other is practically at equilibrium. In metabolic pathway engineering, failing to detect this difference could lead to erroneous conclusions regarding targets for genetic modification (see a case indicating the significance of accurate exchange flux quantification in Ref. [27]). In the rest of the text, the balance equations will be written in terms of the forward and reverse reaction fluxes (unless otherwise specified).
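As a small numerical check of Equations 15.1 and 15.2, the following Python sketch computes the net and exchange fluxes for the two reactions of Figure 15.2. The function name `net_and_exchange` is an assumption made for illustration; the flux values are those given in the figure.

```python
def net_and_exchange(v_forward, v_backward):
    """Return (net, exchange) flux from nonnegative forward and backward fluxes."""
    if v_forward < 0 or v_backward < 0:
        raise ValueError("forward and backward fluxes must be nonnegative")
    net = v_forward - v_backward           # Equation 15.1
    exchange = min(v_forward, v_backward)  # Equation 15.2 (fluxes are nonnegative)
    return net, exchange


reaction_1 = net_and_exchange(40.0, 0.0)       # irreversible reaction
reaction_2 = net_and_exchange(2440.0, 2400.0)  # reaction close to equilibrium
print(reaction_1)  # (40.0, 0.0)
print(reaction_2)  # (40.0, 2400.0)
```

Both reactions return a net flux of 40, and only the exchange flux distinguishes them.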
15.2.3 Metabolite Balancing Analysis (MBA)

Metabolite balancing analysis (MBA) [4–7] refers to the estimation of intracellular metabolic fluxes based on metabolic output measurements, i.e., the net excretion rates of the network substrates and products (see Figure 15.1). For a particular stoichiometric model, intracellular fluxes are connected to the metabolic output through the metabolite balances, a term used in MFA for the mole balances around each metabolite of the metabolic network. Specifically, the mole balance around metabolite j of a network with M metabolites and N metabolic reactions is defined as follows:

$$\sum_{i=1}^{2N} (s_{ji} \cdot v_i) - (r_j^{output} - r_j^{input}) = \frac{dc_j}{dt} \tag{15.3}$$
where

$s_{ji}$, the stoichiometric coefficient of metabolite j in reaction i (the forward and reverse directions of a metabolic reaction constitute different reactions in this definition); $s_{ji}$ is positive, negative, or zero if metabolite j is, respectively, a product of, a reactant of, or does not participate in reaction i.
$v_i$, the flux of the i-th reaction in the metabolic network. In this definition, the forward and reverse directions of each of the N metabolic reactions constitute different reactions; thus, the total number of fluxes is 2N.
$(r_j^{output} - r_j^{input})$, the net excretion rate of metabolite j out of the network; $r_j^{output}$ and $r_j^{input}$ depict the rate of excretion and assimilation, respectively, of metabolite j out of and into the network.
$dc_j/dt$, the rate of change of the pool size of metabolite j, the latter being depicted by $c_j$; this is the accumulation term.

Separating the forward from the reverse fluxes of the metabolic reactions in the network, Equation 15.3 can be rewritten as follows:

$$\sum_{k=1}^{N} \left[s_{jk} \cdot (v_k^f - v_k^b)\right] - (r_j^{output} - r_j^{input}) = \frac{dc_j}{dt}$$

or equivalently (according to Equation 15.1)

$$\sum_{k=1}^{N} s_{jk} \cdot v_k^{net} - (r_j^{output} - r_j^{input}) = \frac{dc_j}{dt} \tag{15.4}$$
where

$s_{jk}$, the stoichiometric coefficient of metabolite j in the forward direction of metabolic reaction k; it is the opposite of the stoichiometric coefficient of j in the reverse direction of metabolic reaction k.

Equation 15.4 indicates that metabolite balances constrain only the net fluxes of the metabolic network, as linear functions of the metabolite net excretion rates. Thus, metabolic output measurements can provide information only about the net fluxes. This significant conclusion needs to be taken seriously into consideration when designing flux analysis experiments. In addition, the accurate measurement of intracellular metabolite concentrations remains to date a cumbersome task, if not an impossible one for many free metabolite pools. Developments in metabolomic analysis [27–29] are expected to alter this situation, as will be discussed in the last section. Due to these experimental limitations, flux quantification in general, and metabolite balancing in particular, has to date been applied mainly under physiological steady-state or pseudo-steady-state conditions. In this case, all reaction fluxes are constant and so are the metabolite pool sizes, as the formation rate of each metabolite is equal to its consumption rate in the network. Thus, in the metabolite balance equations (Equations 15.3 and 15.4), the steady-state conditions correspond to a zero accumulation term. The (pseudo)steady-state assumption can be practically justified in many cases, as most metabolites are characterized by a high turnover rate [4]. However, it should always be validated in metabolically transient systems. The experimental environment for attaining metabolic steady-state conditions is the continuous flow bioreactor, termed chemostat. In this theoretical and experimental context, the MBA problem for a metabolic network of M metabolites and N metabolic reactions at (pseudo)steady-state conditions, with A ≤ M metabolite net excretion rates measured, is defined as follows:

$$\min \sum_{j \in \text{list } A} \frac{\left[\widehat{(r_j^{output} - r_j^{input})} - (r_j^{output} - r_j^{input})\right]^2}{(\sigma_{r_j})^2} \tag{15.5a}$$

subject to

$$S \cdot v^{net} = (r^{output} - r^{input}) \tag{15.5b}$$

$$x_k \le v_k^{net} \le y_k \quad \forall\, k = 1, \ldots, N \tag{15.5c}$$
where

list A, the list of the A network metabolites whose extracellular net excretion rates have been measured (or are measurable). In this formulation, this list also includes all intracellular metabolites; their net excretion rate is postulated to be equal to zero.
$\widehat{(r_j^{output} - r_j^{input})}$, the measured value of the net excretion rate of metabolite j; as described in the previous definition, for any intracellular metabolite this is equal to zero.
$(r_j^{output} - r_j^{input})$, the adjusted (to the measurements) [estimated] net excretion rate of metabolite j (variable).
$(r^{output} - r^{input})$, the M × 1 vector of the adjusted metabolite net excretion rates.
$\sigma_{r_j}$, the standard deviation of the measured net excretion rate of metabolite j. For the intracellular metabolites this is postulated to be ≤ $10^{-4}$. In this way, a high penalty value ($1/\sigma_{r_j}$) prevents the objective function from converging to a value that would correspond to a (large) nonzero net excretion rate for any of the intracellular metabolites.
S, the M × N stoichiometric matrix of the forward directions of the metabolic network reactions; the forward direction of any metabolic reaction is selected at the stage of problem formulation and does not necessarily coincide with the net flux direction of this reaction under the investigated conditions.
$v^{net}$, $v_k^{net}$, the N × 1 net flux vector and the net flux of the k-th metabolic reaction.
$x_k$, $y_k$, the lower and upper bound on the k-th net flux. These bounds are imposed based on previous biological knowledge. If no special bounds are applicable, the net flux of a metabolic reaction could in theory take any value between -∞ and +∞. A negative net flux has no special physical meaning; it simply indicates that, under the investigated experimental conditions, the net conversion of material through the particular reaction proceeds in the direction opposite to the one selected as its forward direction.

Due to its (theoretical and experimental) simplicity, and to the type and amount of information that it can provide, MBA is usually the first and most common MFA tool to be used. However, its inherent limitation of providing information only about the net fluxes necessitates the use of other flux quantification methods for the determination of exchange fluxes. In addition, as will be explained in Section 15.3, MBA alone cannot estimate the flux through metabolic cycles or the flux ratio between parallel pathways. Last, MBA usually provides a limited degree of redundancy for data reconciliation and metabolic network structure validation. To increase the degree of both flux observability and measurement redundancy, quantification methods that can use measurements additional to the metabolic output for higher-resolution flux estimation have been introduced. These involve the use of isotopically labeled substrates and are presented in the next section.
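To make the MBA data reconciliation problem more concrete, the following Python sketch solves a small instance of Equations 15.5a and 15.5b as a weighted least-squares problem. It is a minimal illustration under several assumptions, not the method of any particular MFA software: the four-metabolite network, the measured rates, and the standard deviations are invented; the adjusted rates are eliminated by substituting $S \cdot v^{net}$ for them; the flux bounds of Equation 15.5c are ignored; and the helpers `solve_linear` and `reconcile` are hypothetical names. The intracellular metabolite B is given a measured net excretion rate of zero with a very small standard deviation, as described in the definitions above.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def reconcile(S, r_measured, sigma):
    """Minimise sum_j ((r_measured_j - (S v)_j) / sigma_j)^2 over the net fluxes v."""
    n_met, n_rxn = len(S), len(S[0])
    w = [1.0 / s ** 2 for s in sigma]
    # Normal equations of the weighted least-squares problem: (S^T W S) v = S^T W r
    lhs = [[sum(w[j] * S[j][a] * S[j][b] for j in range(n_met))
            for b in range(n_rxn)] for a in range(n_rxn)]
    rhs = [sum(w[j] * S[j][a] * r_measured[j] for j in range(n_met))
           for a in range(n_rxn)]
    v = solve_linear(lhs, rhs)
    # Adjusted net excretion rates implied by the estimated fluxes
    r_adjusted = [sum(S[j][k] * v[k] for k in range(n_rxn)) for j in range(n_met)]
    return v, r_adjusted


# Hypothetical network: v1: A -> B, v2: B -> C, v3: B -> D (rows: A, B, C, D)
S = [[-1, 0, 0],
     [1, -1, -1],
     [0, 1, 0],
     [0, 0, 1]]
r_measured = [-10.0, 0.0, 6.1, 3.8]  # net excretion rates; B is intracellular
sigma = [0.2, 1e-4, 0.2, 0.2]        # tiny sigma for B enforces its balance

v_net, r_adjusted = reconcile(S, r_measured, sigma)
print([round(v, 3) for v in v_net])
```

The invented measurements are slightly inconsistent (10.0 units of A are consumed but only 9.9 units of C and D are excreted). Because the three measured rates have equal standard deviations, the reconciliation spreads the 0.1 imbalance evenly, giving net fluxes of about 9.97, 6.13, and 3.83 while keeping the balance around B essentially at zero. With four balances and three unknown net fluxes, this toy problem has one degree of redundancy.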
15.2.4 Isotopomer Distribution Analysis

Isotopomer distribution analysis extends MBA by enabling the use of isotopic tracer distribution measurements for the estimation of metabolic fluxes. It is based on the fact that, for a particular substrate labeling, the isotopic tracer distribution through a metabolic network depends on (a) the stoichiometry of carbon transfer through the network reactions and (b) the metabolic fluxes. Because carbon is the most abundant element in metabolic networks, its stable (13C) or radioactive (14C) isotope is most commonly used as the “label”/tracer in metabolic flux estimation. The various labeling patterns of a molecule are termed (positional) isotopomers. A metabolite with n atoms of a particular element has $2^n$ positional isotopomers of that element in one of its less naturally abundant stable isotopes [7]. For example, a molecule with two carbon atoms has four 13C positional isotopomers, as the two carbon atoms might be either (a) both 12C atoms (“unlabeled” form), or (b) the first 13C and the second 12C, or (c) the first 12C and the second 13C, or (d) both 13C atoms (“fully or uniformly labeled” form). The relative population of a particular metabolite’s isotopomer is defined as the fraction of this metabolite’s pool that is labeled according to that isotopomer’s labeling pattern. By definition, the sum of the relative populations of all isotopomers in a metabolite pool is equal to 1. If the relative populations of all isotopomers of a particular isotopic tracer in a metabolite pool are known, this is the most refined representation of the isotopic labeling status of this pool.
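The counting argument above can be illustrated with a short Python sketch that enumerates the positional isotopomers of a molecule with n carbon atoms. The encoding is an assumption made for illustration: each labeling pattern is written as a string in which '0' stands for 12C and '1' for 13C at the corresponding carbon position, and `positional_isotopomers` is a hypothetical helper name.

```python
from itertools import product


def positional_isotopomers(n_carbons):
    """List all labeling patterns; '0' = 12C and '1' = 13C at each position."""
    return ["".join(bits) for bits in product("01", repeat=n_carbons)]


patterns = positional_isotopomers(2)
print(patterns)                        # ['00', '01', '10', '11']
print(len(positional_isotopomers(3)))  # 8
```

For the two-carbon example in the text, the four patterns correspond to the unlabeled form ('00'), the two singly labeled forms ('10' and '01'), and the uniformly labeled form ('11').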
For a particular substrate labeling, intracellular fluxes are connected to the relative populations of the metabolites’ isotopomers through the isotopomer balances, a term used in MFA for the mole balance around any isotopomer of any metabolite of the metabolic network. Specifically, the mole balance around isotopomer q of a metabolic network with M metabolites, N biochemical reactions (2N metabolic fluxes), P carbon atoms, and Q isotopomers is defined as follows [7]:
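$$r_{j_q}^{input} \cdot I_q^{input} + \sum_{i=1}^{2N} \sum_{l=1}^{Q} (f_{lq}^{i} \cdot v_i \cdot I_l) + \sum_{i=1}^{2N} \sum_{l=1}^{Q} \sum_{h=1}^{Q} (w_{lhq}^{i} \cdot v_i \cdot I_l \cdot I_h) - r_{j_q}^{output} \cdot I_q = \frac{d[c_{j_q} \cdot I_q]}{dt} \tag{15.6}$$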
where (for symbols not listed here, see their definitions in Equations 15.3 through 15.5)

$j_q$, the metabolite to which the q-th isotopomer of the network corresponds.
$I_q^{input}$, the fraction of the feed (input) rate of metabolite $j_q$ that is labeled according to the q-th isotopomer’s labeling pattern; it is a known constant for a specific substrate labeling.
$I_q$, the relative population of the q-th isotopomer of the network.
$IMM_{ql}^{i}$, the stoichiometric coefficient of the conversion of the l-th to the q-th isotopomer through the i-th reaction of the network; the stoichiometry of isotopomer conversions is represented in a compact form by the isotopomer mapping matrices (IMMs) for every reactant-product pair of any of the network reactions ([30], Box 2.2 in Ref. [7]).
$f_{lq}^{i}$ = $-IMM_{ql}^{i} \cdot s_{j_l i}$ (≥ 0), if metabolite $j_q$ is produced from metabolite $j_l$ through the noncondensation reaction i; = $s_{j_q i}$ (< 0), if l = q and $j_q$ is a reactant of reaction i; = 0, otherwise.
$w_{lhq}^{i}$ = $IMM_{ql}^{i} \cdot IMM_{qh}^{i} \cdot s_{j_l i} \cdot s_{j_h i}$ (≥ 0) = 1, if $j_l$ and $j_h$ condense to produce $j_q$ through reaction i (see explanation in Ref. [7]); = 0, otherwise.

Through the isotopomer balances (Equation 15.6), the metabolic fluxes are nonlinear functions of the relative isotopomer populations. If no condensation reactions are present in the network, the isotopomer balances become bilinear with respect to the intracellular fluxes and the relative isotopomer populations. While the latter cannot be measured directly, they are linear functions of measurable quantities [7,31], e.g., isotopic enrichment measurements from 13C-Nuclear Magnetic Resonance (NMR) [31], the multiplet structure of 13C-NMR spectra [32], and mass isotopomer fraction measurements obtained from mass spectrometry [33]. Isotopomer distribution analysis aims at inverting the mapping of the fluxes into these measurements.
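The linear relation between relative isotopomer populations and measurable quantities can be sketched for a single two-carbon metabolite pool. The populations below are invented for illustration, and the helper names `enrichment` and `mass_isotopomer_fractions` are hypothetical. The 13C-enrichment of a carbon position is the sum of the populations of all isotopomers labeled at that position, and a mass isotopomer fraction is the sum of the populations of all isotopomers with the same number of 13C atoms; both are therefore linear combinations of the populations with coefficients 0 or 1.

```python
# Hypothetical relative isotopomer populations of a two-carbon metabolite.
# Keys are labeling patterns: '1' marks a 13C atom, '0' a 12C atom.
populations = {"00": 0.5, "10": 0.2, "01": 0.1, "11": 0.2}


def enrichment(populations, position):
    """Fraction of the pool with 13C at the given (0-based) carbon position."""
    return sum(frac for pattern, frac in populations.items()
               if pattern[position] == "1")


def mass_isotopomer_fractions(populations):
    """Fractions of the pool grouped by number of 13C atoms (M+0, M+1, ...)."""
    n_carbons = len(next(iter(populations)))
    fractions = [0.0] * (n_carbons + 1)
    for pattern, frac in populations.items():
        fractions[pattern.count("1")] += frac
    return fractions


print(enrichment(populations, 0))              # first carbon
print(enrichment(populations, 1))              # second carbon
print(mass_isotopomer_fractions(populations))  # [M+0, M+1, M+2]
```

Different isotopomer distributions can give the same enrichments or the same mass isotopomer fractions, which is one reason why combining several measurement types adds information for flux estimation.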
As discussed in the MBA section, experimental limitations have to date restricted the application of MFA to the analysis of metabolic networks under metabolic and isotopic steady-state conditions. Under the latter, neither the size nor the relative isotopomer composition of the metabolite pools changes with time. In this case, the right-hand side of Equation 15.6 is equal to zero. It must be emphasized that, to use the steady-state isotopomer balances for the estimation of fluxes, the metabolic and isotopic steady-state conditions should be experimentally validated. If the steady-state criteria are not satisfied, the flux estimates are not reliable. In this theoretical and experimental context, the isotopomer distribution analysis problem under metabolic and isotopic steady-state conditions is defined as follows; this formulation refers to a metabolic network with M metabolites, N biochemical reactions (2N fluxes), P carbon atoms, Q isotopomers, A ≤ M measured metabolite net excretion rates, T ≤ P measured 13C-enrichments, Γ measured 13C-NMR sub-peak areas, and Δ ≤ (P + M) measured mass isotopomer fractions (all symbols in Equations 15.3 through 15.6 are used here identically):
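\[
\min \;
\sum_{j \in \mathrm{list}\,A} \frac{\left[ \widehat{\left( r^{\mathrm{output}}_{j} - r^{\mathrm{input}}_{j} \right)} - \left( r^{\mathrm{output}}_{j} - r^{\mathrm{input}}_{j} \right) \right]^2}{\left( \sigma_{r_j} \right)^2}
+ \sum_{t \in \mathrm{list}\,T} \frac{\left( \hat{C}_t - C_t \right)^2}{\left( \sigma_{\hat{C}_t} \right)^2}
+ \sum_{\gamma \in \mathrm{list}\,\Gamma} \frac{\left( \widehat{SP}_\gamma - SP_\gamma \right)^2}{\left( \sigma^{SP}_{\gamma} \right)^2}
+ \sum_{\delta \in \mathrm{list}\,\Delta} \frac{\left( \widehat{MI}_\delta - MI_\delta \right)^2}{\left( \sigma^{MI}_{\delta} \right)^2}
\tag{15.7a}
\]
subject to:
\[
S \cdot v^{\mathrm{net}} = \left( r^{\mathrm{output}} - r^{\mathrm{input}} \right) \quad \text{(metabolite balances)}
\tag{15.7b}
\]
\[
r^{\mathrm{input}}_{j_q} \cdot I^{\mathrm{input}}_{q}
+ \sum_{i=1}^{2N} \sum_{l=1}^{Q} \left( f^{i}_{lq} \cdot v_i \cdot I_l \right)
+ \sum_{i=1}^{2N} \sum_{l=1}^{Q} \sum_{h=1}^{Q} \left( w^{i}_{lhq} \cdot v_i \cdot I_l \cdot I_h \right)
- r^{\mathrm{output}}_{j_q} \cdot I_q = 0
\quad \forall\, q = 1, \ldots, Q \quad \text{(isotopomer balances)}
\tag{15.7c}
\]
\[
C_t = \sum_{q=1}^{Q} u_{tq} \cdot I_q \quad \forall\, t \in \mathrm{list}\,T
\tag{15.7d}
\]
\[
SP_\gamma = \sum_{q=1}^{Q} \beta_{\gamma q} \cdot I_q \quad \forall\, \gamma \in \mathrm{list}\,\Gamma
\tag{15.7e}
\]
\[
MI_\delta = \sum_{q=1}^{Q} \xi_{\delta q} \cdot I_q \quad \forall\, \delta \in \mathrm{list}\,\Delta
\tag{15.7f}
\]
\[
x_k \le v^{\mathrm{net}}_{k} \le y_k \quad \forall\, k = 1, \ldots, N
\tag{15.7g}
\]
\[
w_k \le v^{\mathrm{exch}}_{k} \le \eta_k \quad \forall\, k = 1, \ldots, N
\tag{15.7h}
\]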
where
$\hat{C}_t$, $C_t$, the measured and adjusted (estimated) 13C-enrichment, respectively, of the t-th carbon atom of the metabolic network. The 13C-enrichment of the t-th carbon atom of the network is defined as the fraction of the metabolite pool to which this carbon atom belongs in which the particular carbon is 13C.
list T, the list of the T network carbon atoms whose 13C-enrichment has been measured.
$\sigma_{\hat{C}_t}$, the standard deviation of the measured 13C-enrichment value of the t-th carbon atom.
$\widehat{SP}_\gamma$, $SP_\gamma$, the measured and adjusted (estimated) γ-th 13C-NMR sub-peak area.
list Γ, the list of the Γ measured 13C-NMR sub-peak areas.
$\sigma^{SP}_{\gamma}$, the standard deviation of the γ-th measured 13C-NMR sub-peak area.
$\widehat{MI}_\delta$, $MI_\delta$, the measured and adjusted (estimated) relative population of the δ-th mass isotopomer of the network. Mass isotopomers of a metabolite are the labeling patterns of different molecular weight (e.g., the unlabeled positional isotopomer of a two-carbon metabolite is a different mass isotopomer than the uniformly labeled positional isotopomer; however, the positional isotopomers labeled at the first or at the second carbon atom correspond to the same mass isotopomer of the metabolite).
list Δ, the list of the Δ network mass isotopomers whose relative population has been measured.
$\sigma^{MI}_{\delta}$, the standard deviation of the measured value of the δ-th mass isotopomer relative population.
$u_{tq}$ = 1, if the t-th carbon atom of the network is 13C in the q-th network isotopomer (this implies that the metabolite to which the t-th carbon atom of the network belongs, $j_{C_t}$, is the metabolite of the q-th network isotopomer, $j_q$); = 0, otherwise.
$\beta_{\gamma q}$ = 1, if the q-th network isotopomer contributes to the γ-th measured 13C-NMR sub-peak area (both quantities correspond to the same metabolite); = 0, otherwise.
$\xi_{\delta q}$ = 1, if the q-th positional isotopomer has the same molecular weight as the δ-th mass isotopomer of the network (both quantities correspond to the same metabolite); = 0, otherwise.
$x_k$, $y_k$, the lower and upper bound of the net flux of the k-th metabolic reaction of the network. In the absence of more specific constraints, these are -∞ and +∞, respectively.
$w_k$, $\eta_k$, the lower and upper bound of the exchange flux of the k-th metabolic reaction of the network. In the absence of more specific constraints, these bounds are 0 and +∞, respectively. If a reaction is irreversible, then $w_k = \eta_k = 0$, while if it is at equilibrium, $w_k = \eta_k = +\infty$.
The described formulation of the isotopomer distribution analysis problem refers to the three most commonly used types of isotopic tracer measurements. However, any other type of isotopic tracer measurement could be incorporated accordingly as a linear function of the relative populations of the positional isotopomers of the network. Because the isotopic tracer distribution depends not only on the net fluxes but also on the exchange fluxes [34,35], isotopomer distribution analysis is expected to provide a higher degree of flux observability than MBA, enabling the estimation of exchange fluxes. In addition, the flux ratio between parallel pathways could be observed under certain circumstances [33,36].
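The linear measurement relations of Equations 15.7d and 15.7f, and one weighted sum of the objective in Equation 15.7a, can be illustrated with a minimal sketch. It assumes a single hypothetical metabolite whose positional isotopomers are enumerated as 0/1 tuples (1 = 13C); the function names, the isotopomer ordering, and the numerical fractions are illustrative assumptions, not part of the formulation above.

```python
from itertools import product

def isotopomers(n):
    """All 2**n positional isotopomers of an n-carbon metabolite as 0/1 tuples (1 = 13C)."""
    return list(product((0, 1), repeat=n))

def enrichments(fractions, n):
    """C_t = sum_q u_tq * I_q, where u_tq = 1 when carbon t is 13C in isotopomer q."""
    iso = isotopomers(n)
    return [sum(f for q, f in zip(iso, fractions) if q[t] == 1) for t in range(n)]

def mass_isotopomers(fractions, n):
    """MI_d = sum_q xi_dq * I_q, where xi_dq = 1 when isotopomer q carries d labeled carbons."""
    iso = isotopomers(n)
    return [sum(f for q, f in zip(iso, fractions) if sum(q) == d) for d in range(n + 1)]

def weighted_ssr(measured, estimated, sigma):
    """One sum of the objective: sum over measurements of (measured - estimated)^2 / sigma^2."""
    return sum(((m - e) / s) ** 2 for m, e, s in zip(measured, estimated, sigma))

# Hypothetical two-carbon metabolite; isotopomer order is 00, 01, 10, 11.
I = [0.5, 0.2, 0.2, 0.1]
C = enrichments(I, 2)         # 13C-enrichment of carbon 1 and carbon 2
MI = mass_isotopomers(I, 2)   # fractions of the M+0, M+1 and M+2 mass isotopomers
```

In this example the two singly labeled positional isotopomers (01 and 10) fall into the same M+1 mass isotopomer, which is the distinction between positional and mass isotopomers drawn in the definitions above.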
The increased number of measurements for the same set of metabolic fluxes is also expected to increase the degree of measurement redundancy, which is essential for data reconciliation analysis (see Section 15.3). These advantages come, however, at the cost of solving a much larger and more complex problem than in the case of MBA. Specifically, if an average metabolic network comprises 50 metabolites (which is also the number of metabolite balances), then with an average of four carbon atoms, and thus 2^4 = 16 13C positional isotopomers per metabolite, the metabolic network comprises 800 isotopomers. This indicates that the isotopomer balancing analysis problem may consist of thousands of variables and nonlinear constraints, compared to tens of linear constraints for the respective metabolite balancing problem. The main difficulty of isotopomer balancing analysis arises from the condensation reactions in the metabolic network, which give rise to the quadratic term in the isotopomer balance constraints (Equations 15.6 or 15.7). If metabolic networks did not include condensation reactions, the isotopomer balances would be bilinear, and the system of isotopomer balance equations would have an analytical solution when all intracellular fluxes are known. In addition, the relevant observability and redundancy analysis (see Section 15.3) would be less intricate. In 1999, Wiechert et al. [37] transformed the isotopomer balance system of equations into a “cascade” of bilinear systems of equations by introducing the (artificial) “cumomer” entity. The cumomer fraction refers to the sum of certain isotopomer fractions of a metabolite. A metabolite has as many cumomers as isotopomers. Thus, a metabolite with n carbon atoms has 2^n 13C-cumomers. A 13C-cumomer is characterized by the number and the position of the carbon atoms that have been specified as labeled. Specifically, a metabolite has n!/[k!(n-k)!] k-cumomers. 
They correspond to any possible combination of k carbon atoms of the metabolite and the relative population of a particular k-cumomer is defined as the sum of the relative populations of all positional isotopomers which are labeled at the specified k carbon atoms at least. Based on this definition a metabolite with n carbon atoms has a single n-cumomer, which coincides with the fully labeled isotopomer of the metabolite, while it has n 1-cumomers, each coinciding with the 13C enrichment of the corresponding carbon atom. The single 0-cumomer corresponds to the sum of all metabolite isotopomers and is by
definition equal to 1. The significance of the linear transformation of the isotopomer into the cumomer space is that the cumomer balances can be solved from a “cascade of linear equations” [37]. Notably, the system of balances around the 0-cumomers of all metabolites is equal to the metabolite balances (see Equation 15.5b), while the system of balances around the 1-cumomers is equal to the system of the bilinear carbon enrichment balances, which has an analytical solution [31]. By representing the (k-1)-cumomer fractions in the k-cumomer (k is a nonzero integer) balances as a function of fluxes, as this has been analytically estimated after solving the (k-1)-cumomer balances, all cumomer fractions could be finally represented as a function of fluxes and the known cumomer fractions of the labeled substrate. The isotopomer fractions could be estimated from the cumomer fractions through the linear transformation that connects them. In this sense, all numerical and statistical methods that have been derived for parameter estimation problems with bilinear constraints and/or solution of bilinear equation systems could be applied in isotopomer distribution analysis. Even after the cumomer transformation, isotopomer/cumomer distribution analysis remains a challenging problem, due to the large number of isotopomers/cumomers. The size of the problem increases dramatically when multiple isotopic tracers are used [38]. In this case, it is recommended that the recently introduced elementary metabolite units (EMU) framework [38] is used. This provides a modeling approach that significantly reduces the number of problem variables without loss of information. The crucial step in the EMU framework is the decomposition of the metabolic network into reactions between EMUs rather than metabolites through the known stoichiometry of the atomic transitions in the network reactions. Then, in the context of the EMU reaction network, balances are formulated around each EMU. 
The EMU balances replace the isotopomer balance constraints in the isotopomer distribution analysis problem (Equation 15.7a), while all isotopic measurements can be accordingly written as linear functions of the network’s EMUs. Antoniewicz et al. [38] showed that for the gluconeogenesis pathway and three isotopic tracers, i.e., 13C, 2H and 18O, the EMU analysis is based on only 354 EMUs, while the metabolic network involves ~2.7 million isotopomers/cumomers.
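The cumomer definition given earlier in this section (a k-cumomer fraction is the sum of the fractions of all positional isotopomers labeled at least at the k specified carbon atoms) can be sketched as follows. This is only an illustration of the linear isotopomer-to-cumomer mapping for one hypothetical metabolite, not of the cascade solution of the balances; the function name, the 0/1 tuple encoding, and the fractions are assumptions.

```python
from itertools import combinations, product

def cumomer_fractions(fractions, n):
    """Map isotopomer fractions of an n-carbon metabolite to its cumomer fractions.

    Each cumomer is keyed by the tuple of carbon positions specified as labeled;
    its fraction sums every isotopomer that is labeled at least at those positions.
    """
    iso = list(product((0, 1), repeat=n))  # order for n = 2: 00, 01, 10, 11
    result = {}
    for k in range(n + 1):
        for positions in combinations(range(n), k):
            result[positions] = sum(
                f for q, f in zip(iso, fractions)
                if all(q[p] == 1 for p in positions)
            )
    return result

# Hypothetical two-carbon metabolite, isotopomer order 00, 01, 10, 11.
I = [0.5, 0.2, 0.2, 0.1]
cum = cumomer_fractions(I, 2)
```

Here `cum[()]` is the 0-cumomer (the sum of all isotopomer fractions), `cum[(0,)]` and `cum[(1,)]` are the 1-cumomers (the 13C-enrichments of the two carbons), and `cum[(0, 1)]` is the single 2-cumomer, which coincides with the fully labeled isotopomer.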
15.3 Observability-Redundancy-Sensitivity Analysis
Observability, redundancy, and sensitivity analyses are integral parts of MFA. As fluxes cannot be measured directly, but are estimated from available measurements as described in Section 15.2, flux quantification can be considered successfully designed if (a) the acquired experimental data contain sufficient information to allow for the estimation (observation) of all unknown fluxes, and (b) the measurements are reliable and sensitive indicators of the metabolic fluxes. The degree of observability of a metabolic flux network from a particular set of measurements is defined as the number of fluxes that are estimable (observable) from these data [7]. If a flux is unobservable from the given data, its value is not mapped into these measurements; the obtained data values would therefore be the same regardless of the actual value of this flux [39]. Identification of unobservable fluxes, termed observability analysis, is imperative prior to any flux quantification, because flux determination algorithms are bound to assign a certain value to these fluxes as they converge to the least-squares solution. This value, however, which could vary within the allowable bounds, has no practical meaning and cannot be used to draw biologically relevant conclusions. Observability analysis can also indicate which additional measurements are needed to increase the degree of observability of a certain metabolic flux network, thus greatly assisting successful experimental design. Observability analysis is set in the context of “error-free measurements” [40]. If, however, all or some of the measurements are biased, the estimates of the observable fluxes may be negatively affected. Thus, grossly faulty, unreliable measurements should be excluded from further analysis. Such measurements can be identified only through the satisfaction of redundancies. 
Redundant (or noncritical) measurements are those whose exclusion from the dataset does not decrease the number of observable fluxes [2]. Redundancy provides a safeguard against biases, potentially improving the accuracy of flux estimates. A high degree of redundancy could also assist in
the identification of inconsistencies in the initially assumed intracellular biochemistry. Finally, it is also critical to determine how sensitive the available reliable measurements are as “sensors” of the observable metabolic fluxes. A priori sensitivity analysis is significant for correct experimental design. A posteriori sensitivity analysis determines the confidence intervals for the estimated fluxes based on the variance in the available measurements. If vnet, vexch, and ω are the vectors of net fluxes, exchange fluxes, and available measurements, respectively, for a particular metabolic network, vnet and vexch are fully observable from ω if and only if the “derivative” [41] or “sensitivity” matrix, D ω, of the measurement vector with respect to the unknown flux vectors, based on the balance equations that connect them, has full rank for any values of the flux vectors. If the derivative matrix has full rank for any values of the flux vectors, it can be inverted at any point of the flux space and the unknown fluxes can be estimated from the given set of measurements. If the rank of D ω is not equal to the number of unknown fluxes, the rank of its largest full-rank sub-matrix is equal to the degree of observability of the metabolic flux network for this set of measurements. The fluxes that correspond to the zero columns of D ω after rearrangement are unobservable from the given dataset [7]. Obviously, if the number of measurements is smaller than the number of unknown fluxes, the flux network is not fully observable. However, in the case of the nonlinear isotopic tracer balance constraints, full flux observability is not necessarily guaranteed even if the number of measurements is equal to or greater than the number of unknown fluxes. The difference between the number of measurements and the number of observable fluxes defines the number of redundant measurements. 
Hence, the degree of redundancy in a given dataset may be underestimated within the isotopomer distribution analysis framework if it is calculated as just the difference between the number of measurements and the number of unknown fluxes. In metabolite balancing, the measurable net excretion rates are linearly connected to the unknown net fluxes through the metabolite balances (Equation 15.5b). In this case, observability and redundancy analysis [18,19,42] have been based on the corresponding methods that were introduced for chemical process networks with linear constraints in the 1970s and 1980s [43–49]. Specifically, if the net excretion rates of all metabolites are measurable, then the derivative matrix of the measurements with respect to the unknown net fluxes is the stoichiometric matrix (of the forward directions of the metabolic reactions, see Equation 15.5b). If any of the net excretion rates are not measurable, the derivative matrix is the stoichiometric matrix augmented by as many columns as there are unknown net excretion rates; each such column corresponds to a metabolite whose net excretion rate is not measured and has all elements equal to zero except the element in the row that corresponds to that metabolite, which is equal to -1. Thus, any conclusions about flux observability and measurement redundancy or sensitivity depend only on the topology (stoichiometry) of the metabolic network and are independent of the actual value (unknown prior to an experiment) of the unknown net fluxes at the investigated experimental conditions. This characteristic greatly facilitates the design of an MBA experiment, as the degrees of observability and redundancy can be known a priori and are independent of the experimental conditions. When isotopic tracers are used in the analysis of metabolic networks, the isotopic distribution measurements are nonlinearly connected to the unknown metabolic fluxes through the isotopomer balances (Equation 15.7c). 
Thus, in contrast to MBA, the derivative matrix of the measurements with respect to the unknown fluxes depends on the actual values of the unknown fluxes at the investigated experimental conditions. Another parameter in isotopomer distribution analysis that is not an issue in MBA is the labeling of the substrate(s). The substrate labeling, a known input to the analysis, does not modify the connectivity of the isotopic measurements with the unknown fluxes. The latter is determined by the stoichiometry of carbon transfer in the particular metabolic network. However, the substrate labeling dictates which atoms are labeled by the isotopic tracer, depending on the flux values. Hence, it affects the part of the isotopic tracer distribution network that is finally utilized and the values of the isotopic measurements. This implies that, for the same flux values, different substrate labelings may lead to different degrees of flux observability; for a particular substrate labeling, isotopomers that are expected to be good indicators of some fluxes may not be generated at all, or may be generated in quantities too small to be measured accurately [7]. Thus, the type and quantity of substrate labeling might also affect
the accuracy of the measurements and consequently their reliability as flux sensors [34]. Therefore, the design of an isotopomer distribution analysis experiment involves careful selection of the substrate labeling and of the measurements that can enhance flux quantifiability. A systematic methodology to determine the degree of flux observability and measurement redundancy before the actual flux estimation is thus required [7,33,50]. Structural observability analysis [7,33] could provide such a methodology. It takes into consideration the topological rather than the functional form of the derivative matrix, which is called the occurrence matrix. Its [i,j] element is nonzero (usually depicted as “x”) if variable j participates in equation i of the system of balances represented by the matrix. If the occurrence matrix of the derivative matrix does not have full rank, then the flux distribution is not fully observable from the particular set of measurements, independently of the substrate labeling and/or the actual flux values. Thus, structural observability analysis is significant, because it can identify a priori the fluxes that cannot be observed from the particular set of measurements under any circumstances, regardless of the accuracy of the available experimental techniques. The converse, however, does not hold. If a set of unknown fluxes is identified as structurally observable from a given set of measurements, this does not guarantee that it is actually observable from this set of measurements under any experimental circumstances. Topological observability analysis cannot capture numerical singularities due to the actual values of the fluxes at the investigated experimental conditions and the actual labeling of the selected substrate. It has to be complemented by numerical observability analysis. Numerical observability analysis depends on the value of the flux vector at which the rank of the derivative matrix is estimated. 
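For the linear (MBA) case described above, in which the derivative matrix is the stoichiometric matrix when all net excretion rates are measured, the rank test can be sketched as follows. The two small networks, the function names, and the use of exact rational arithmetic are illustrative assumptions; exact arithmetic only helps when the matrix entries are exact (e.g., integer stoichiometric coefficients) and does not apply to flux-dependent derivative matrices of isotopomer balances.

```python
from fractions import Fraction

def rank(matrix):
    """Matrix rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def observability(derivative_matrix, n_measurements):
    """Return (degree of observability, fully observable?, redundant measurements)."""
    n_fluxes = len(derivative_matrix[0])
    degree = rank(derivative_matrix)
    return degree, degree == n_fluxes, n_measurements - degree

# Hypothetical network 1: A -> B -> C (rows: A, B, C; columns: two net fluxes).
S_chain = [[-1, 0],
           [1, -1],
           [0, 1]]
# Hypothetical network 2: A -> B, followed by two parallel reactions B -> C.
S_parallel = [[-1, 0, 0],
              [1, -1, -1],
              [0, 1, 1]]

chain_result = observability(S_chain, 3)
parallel_result = observability(S_parallel, 3)
```

With three measured net excretion rates, the chain has rank 2 for two unknown fluxes (fully observable, one redundant measurement), whereas the parallel branches give rank 2 for three unknown fluxes, so the network is not fully observable from metabolite balances alone.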
Strict bounds on the flux values, potentially resulting from previous biological knowledge, could assist a priori observability analysis. Despite its advantages, the numerical calculation of the rank is inevitably subject to round-off errors, limited machine accuracy, and the accompanying ambiguity concerning the numerical threshold below which an element of the derivative matrix is considered equal to zero. For each pair of measured variable and unknown flux, this threshold should be set based on the standard deviation of the measured variable and the range of potential values of the unknown flux. If, for example, the standard deviation of a measurable variable j is 0.1 unit, then a sensitivity coefficient of this variable with respect to a net flux k smaller than 10^-4 should be considered (practically) equal to 0. Such a value implies that the net flux k would have to change by 1000 for the change to be detected by the measured variable. If all net fluxes are normalized with respect to the substrate uptake rate and the latter is set equal to 100, then all the net fluxes of the network are of the order of magnitude of the substrate uptake rate, and all feasible changes of net flux k would remain unobservable by the particular measured variable, because a change of 1000 is much larger than the maximum possible value of a net flux. The exchange fluxes of the network reactions may vary from 0 to infinity times the corresponding net flux. Since the relationship of an isotopic tracer variable with an exchange flux is not linear, Wiechert et al. [34] proposed a transformation from the vexch to the vexch[0,1] space:
\[
v^{\mathrm{exch}} = \frac{v^{\mathrm{exch}[0,1]}}{1 - v^{\mathrm{exch}[0,1]}}.
\tag{15.8}
\]
In this space, the relationship between an isotopic tracer variable and vexch[0,1] is “almost” linear, and vexch[0,1] can only take values between 0 and 1. In this case, if, for example, the standard deviation of a measurable variable j is 0.1 unit, a sensitivity coefficient of this variable with respect to the d-th vexch[0,1] flux of the network smaller than 10^-2 should be considered (practically) equal to 0. Sensitivity analysis for isotopic tracer measurements has indicated that the latter become insensitive to changes in any exchange flux for high values of this exchange flux, in practice higher than ten times the net flux of a reaction [7,34]. If the value of the exchange flux is that high, isotopic tracer measurements can only “report” that the exchange flux is in this value range, but they cannot “sense” its exact value. In this case, the accuracy of the exchange flux estimate could be considered numerically low.
However, such a result is biologically significant, because it indicates a value threshold for the exchange flux beyond which the reversible reaction should be considered practically at equilibrium in vivo.
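The transformation of Equation 15.8 and the thresholding rule discussed above can be sketched as follows. The function names are hypothetical, and the safety margin of 10 is an assumption inferred from the two numerical examples in the text (a standard deviation of 0.1 with flux ranges of 100 and 1 giving thresholds of 10^-4 and 10^-2); it is not a prescribed value.

```python
def to_unit_interval(v_exch):
    """Inverse of Equation 15.8: map v_exch in [0, inf) to v_exch[0,1] in [0, 1)."""
    return v_exch / (1.0 + v_exch)

def from_unit_interval(v01):
    """Equation 15.8: v_exch = v_exch[0,1] / (1 - v_exch[0,1]), for 0 <= v01 < 1."""
    return v01 / (1.0 - v01)

def is_practically_zero(sensitivity, sigma, flux_range, margin=10.0):
    """Treat a sensitivity coefficient as zero when even the largest feasible flux
    change, inflated by the (assumed) margin, moves the measurement by less than
    one standard deviation."""
    return abs(sensitivity) * flux_range * margin < sigma

# Net flux normalized to a substrate uptake rate of 100, standard deviation 0.1.
net_case = is_practically_zero(5e-5, sigma=0.1, flux_range=100.0)
# Transformed exchange flux, which can only vary between 0 and 1.
exch_case = is_practically_zero(5e-3, sigma=0.1, flux_range=1.0)
```

Both cases evaluate to `True` under these assumptions, and `from_unit_interval(to_unit_interval(v))` recovers `v` up to floating-point error for any non-negative `v`.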
15.4 New Directions and Challenges
Even though MFA is extremely important for the determination of the metabolic state of biological systems and the analysis of flux control [5,51], in the post-genomic era it has not been as widely utilized as genomic, transcriptomic, and even proteomic analyses for the study of biological systems. This could be attributed to the fact that (a) due primarily to experimental limitations, flux analysis methodologies have to date been designed for and applied to steady- or pseudo-steady-state conditions, and (b) flux analysis is not high-throughput and requires extensive knowledge of the stoichiometry of the metabolic network, which, especially for complex systems, may not be directly available. These are the main reasons why flux analysis is not directly applicable to tissue (or body fluid) samples obtained under transient physiological conditions, or to pharmaceutical biotechnology, in which chemostat cultures can be costly to maintain compared to batch or fed-batch processes. New directions in flux analysis involve efforts to exploit the experimental advances of the post-genomic era along with novel numerical methodologies to establish flux estimation methodologies for transient biological systems. In addition, metabolomics has recently provided a high-throughput platform for the analysis of a cellular metabolic fingerprint. Deriving flux information from metabolomic profiles is among the challenges that the fluxomics community is currently attempting to address.
15.4.1 Flux Estimation from Metabolomics
Metabolomics, the most recent of the “omics” techniques, refers to the high-throughput determination of a cellular metabolic fingerprint: the profile of the free metabolite pools [27,28]. Metabolomic analysis is the metabolic high-throughput equivalent of transcriptomic and proteomic profiling; it does not require information about the metabolic network structure and regulation, and the profiles can also be acquired and interpreted under transient metabolic conditions. Fluxes depend on and affect the concentrations of free metabolites in a metabolic network; thus, the metabolic profile provides insight into the metabolic state of a biological system. However, the relationship between fluxes and free metabolite concentrations is not linear. Therefore, metabolic profiles cannot be directly translated into metabolic flux information. Figure 15.3 indicates that the concentration of metabolites in a linear pathway, and thus the relevant part of the metabolic profile, might remain unaltered despite changes in the fluxes of the reactions involved in the pathway. This holds, however, only if one compares the flux map with the metabolic profile of this linear pathway alone. While in that case the two profiles may appear contradictory, the change in the fluxes of this linear pathway is expected to affect the metabolic profile of other parts of the network and to be observable through these differences (e.g., accumulation of a metabolic product later on in the network). In this sense, it is important to increase the number of metabolites whose peaks can be annotated in the metabolic profile [52] and to enhance the accuracy of the metabolomic measurements [29]. In addition, the use of isotopically labeled substrates, when possible, and the measurement of the “labeled” metabolic profile could upgrade the flux information that can be obtained from the metabolic profiles of systems that cannot be considered at steady or pseudo-steady-state conditions, without extensive knowledge of the metabolic network structure [53].
Figure 15.3 Change in the metabolic flux distribution of a particular section of a metabolic network is not necessarily reflected in its metabolic profile. (Panels: linear pathway through metabolites A and B with fluxes v0, v1, and v2. Case 1: Change in flux is not reflected in the metabolite concentration profile. Case 2: Change in flux is reflected in the metabolite concentration profile.)
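The point made with Figure 15.3 can be illustrated with a minimal mass-balance sketch for a linear pathway with fluxes v0 (into A), v1 (A to B), and v2 (out of B). The flux values below are illustrative assumptions and are not read from the figure; the sketch only shows that a balanced change in all fluxes leaves the pool sizes unchanged, whereas an unbalanced change does not.

```python
def pool_rates(v0, v1, v2):
    """Mass balances for the linear pathway -> A -> B ->: returns (dA/dt, dB/dt)."""
    return (v0 - v1, v1 - v2)

v = 1.0
reference = pool_rates(v, v, v)                 # reference flux state
doubled = pool_rates(2 * v, 2 * v, 2 * v)       # every flux doubled
unbalanced = pool_rates(2 * v, 2 * v, 3 * v)    # B is consumed faster than it is produced
```

In the first two states both accumulation terms are zero, so the pools of A and B (and hence this part of the metabolic profile) do not change even though every flux has doubled; in the third state the pool of B decreases, so the flux change becomes visible in the profile.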
15.4.2 Flux Analysis under Transient Physiological Conditions
To date, MFA has been carried out on systems that are under steady or pseudo-steady-state conditions. This limits the applications of MFA. Even in cases where metabolic and isotopic steady-state conditions are reachable, establishing them (mainly the latter) requires long experimental times that are not feasible for some biological systems and processes. In addition, under steady-state conditions, isotopic tracer experiments have relied on the easily accessible biomass constituents to provide the information about the isotopic tracer distribution of the intracellular metabolites. To elucidate the flux map under transient metabolic conditions, the concentrations of all free intracellular metabolites must be measured at frequent time points, so that the accumulation terms in Equation 15.3 can be determined. One current challenge for this analysis is the fact that, as indicated in Section 15.4, metabolomic analysis has not progressed to the point that all peaks in the metabolic profile are annotated. This hinders the direct application of metabolomics to MBA, because the number of variables is much higher than the number of measurements. A second challenge, common to all time-series experiments in biology, is the selection of the time interval between two measurements so as to capture the actual changes in cellular physiology. The high-throughput measurement of the labeled metabolic profile after the introduction of labeled substrates could provide better qualitative insight into the flux map and its change with time. The same limitation as in metabolomics, namely the limited number of observable metabolites (and thus isotopomer fractions), applies to this analysis. In addition, estimating the metabolic fluxes (through Equation 15.7b) under transient metabolic and isotopic conditions from the metabolic measurements can be a cumbersome task. In 2007, Noh et al. 
[54] proposed an intermediate model for metabolic steady-state and isotopically instationary conditions. In this case, some unobservable fluxes under steady-state MBA and some pool sizes that cannot be measured from the available experimental techniques can be estimated. Antoniewicz et al. [55] presented a modeling strategy that “combines key ideas from isotopomer spectral analysis (ISA) and in stationary MFA” to estimate the flux distribution in a fed-batch fermentation of Escherichia coli producing 1,3-propanediol (PDO). The strategy used the EMU distribution framework (see Section 15.2.4) and two dilution parameters to account for the dilution of isotopic tracer and biomass constituents.
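As noted earlier in this section, flux analysis under transient conditions requires the accumulation terms of the metabolite balances to be determined from concentrations measured at frequent time points. A minimal sketch of one way to do this, assuming central finite differences on a hypothetical time series (the function name, sampling times, and concentrations are illustrative, and noisy data would in practice call for smoothing or regression), is:

```python
def accumulation_rates(times, conc):
    """Estimate dc/dt at the interior sampling points by central differences."""
    return [
        (conc[i + 1] - conc[i - 1]) / (times[i + 1] - times[i - 1])
        for i in range(1, len(times) - 1)
    ]

# Hypothetical time series of one intracellular metabolite pool.
times = [0.0, 1.0, 2.0, 3.0]
conc = [1.0, 1.5, 2.5, 4.0]
rates = accumulation_rates(times, conc)   # estimates at t = 1.0 and t = 2.0
```

The quality of such estimates depends directly on the sampling interval, which is the time-interval selection challenge raised above: if the interval is too long, changes between two samples are averaged out.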
15.5 Conclusions
Obtaining a complete and accurate flux map is as essential for the analysis of metabolic networks as the gene expression profile is for the reconstruction of gene regulation networks. In this chapter, the rationale behind flux quantification and an extensive description of currently used flux determination methodologies were provided. In the absence of reliable and extensive in vivo kinetic data, all flux analysis methodologies are based on the fact that intracellular fluxes are mapped into measurable quantities
through metabolite and isotopomer balances. In addition, the significance of observability, redundancy, and sensitivity analysis for the success of flux quantification and analysis was discussed, and the relevant methods and issues were presented. Last, the new directions and challenges of MFA in the high-throughput post-genomic era were discussed. These involve mainly the connection between flux maps and metabolomic profiles and the extension of the flux determination methodologies to model transient metabolic and isotopic conditions.
References
1. Ahuja, R.K., Magnanti, T.L., and Orlin, J.B. 1993. Network Flows: Theory, Algorithms, and Applications. Prentice Hall, Upper Saddle River, NJ.
2. Romagnoli, J.A. and Sanchez, M.C. 2000. Data Processing and Reconciliation for Chemical Process Operations (Process Systems Engineering, Vol 2). Academic Press, San Diego, CA.
3. Bird, R.B., Stewart, W.E., and Lightfoot, E.N. 1960. Transport Phenomena (1st edition). John Wiley and Sons, New York, NY.
4. Stephanopoulos, G., Nielsen, J., and Aristidou, A. 1998. Metabolic Engineering: Principles and Methodologies. Academic Press, San Diego, CA.
5. Stephanopoulos, G. 1998. Metabolic fluxes and metabolic engineering. Metabolic Engin., 1: 1–10.
6. Klapa, M.I. and Stephanopoulos, G. 2000. Metabolic flux analysis. In: Schugerl, K./Bellgardt, K.H. (eds). Bioreaction Engineering: Modeling and Control. Springer, Berlin, Heidelberg, New York.
7. Klapa, M.I. 2001. High-resolution metabolic flux determination using stable isotopes and mass spectrometry. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA.
8. Mattick, J.S. 2003. Challenging the dogma: The hidden layer of non-protein-coding RNAs in complex organisms. BioEssays, 25: 930–9.
9. Klapa, M.I. and Quackenbush, J. 2003. The quest for the mechanisms of life. Biotechnol. Bioeng., 84: 739–42.
10. Klapa, M.I. and Stephanopoulos, G. 2000. Metabolic engineering: A framework for the integration of genomic and physiological data. In: Barbotin, J.N./Portais, J.C. (eds). NMR in Microbiology: Theory and Applications. Horizon Scientific Press, U.K., 453–77.
11. Crowe, C.M. 1996. Data reconciliation. Progress and challenges. J. Process Control., 6: 89–98.
12. Vallino, J.J. and Stephanopoulos, G. 1993. Metabolic flux distributions in Corynebacterium glutamicum during growth and lysine overproduction. Biotechnol. Bioeng., 41: 633–46.
13. Foster, J., Famili, I., Fu, P.C., Palsson, B.O., and Nielsen, J. 2003. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13: 244–53.
14. Reed, J.L., Vo, T.D., Schilling, C.H., and Palsson, B.O. 2003. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol., 4: R54.
15. Schilling, C.H., Covert, M.W., Famili, I., Church, G.M., Edwards, J.S., and Palsson, B.O. 2002. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184: 4582–93.
16. Kharchenko, P., Vitkup, D., and Church, G.M. 2004. Filling gaps in a metabolic network using expression information. Bioinformatics, 20(Suppl. 1): I178–I185.
17. Tsantili, I.C., Karim, M.N., and Klapa, M.I. 2007. Quantifying the metabolic capabilities of engineered Zymomonas mobilis using linear programming analysis. Microbial Cell Factories, 6: 8.
18. Wang, N.S. and Stephanopoulos, G. 1983. Application of macroscopic balances to the identification of gross measurement errors. Biotechnol. Bioeng., 25: 2177–2208.
19. van der Heijden, R.T.J.M., Heijnen, J.J., Hellinga, C., Romein, B., and Luyben, K.C.A.M. 1994a. Linear constraint relations in biochemical reaction systems: I. Classification of the calculability and the balanceability of conversion rates. Biotechnol. Bioeng., 43: 3–10.
20. van der Heijden, R.T.J.M., Heijnen, J.J., Hellinga, C., Romein, B., and Luyben, K.C.A.M. 1994b. Linear constraint relations in biochemical reaction systems: II. Diagnosis and estimation of gross errors. Biotechnol. Bioeng., 43: 11–20.
15-16
Modeling Tools for Metabolic Engineering
21. Wiechert, W. and de Graaf, A.A. 1997. Bidirectional reaction steps in metabolic networks: I. Modeling and simulation of carbon isotope labeling experiments. Biotechnol. Bioeng., 55: 101–17. 22. Aiba, S. and Matsuoka, M. 1979. Identification of metabolic model: Citrate production from glucose by Candida lipolytica. Biotechnol. Bioeng., 21: 1373–86. 23. Follstad, B.D., Balcarcel, R.R., Stephanopoulos, G., and Wang, D.I.C. 1999. Metabolic flux analysis of hybridoma continuous culture steady state multiplicity. Biotechnol. Bioeng., 63: 675–83. 24. Jørgensen, H., Nielsen, J., Villadsen, J., and Møllgaard H. 1995. Metabolic flux distributions in Penicillum-chrysogenum during fed-batch cultivations. Biotechnol. Bioeng., 46: 117–31. 25. Park, S.M., Shaw-Reid, C., Sinskey, A.J., and Stephanopoulos, G. 1997. Elucidation of anaplerotic pathways in Corynebacterium glutamicum via 13C-NMR spectroscopy and GC-MS. Appl. Microbiol. Biot., 47: 430–40. 26. Nissen, T.L., Schulze, U., Nielsen, J., and Villadsen, J. 1997. Flux distributions in anaerobic, glucoselimited continuous cultures of Saccharomyces cerevisiae. Microbiology, 143: 203–18. 27. Fiehn, O., Kopka, J., Dormann, P., Altmann, T., Trethewey, R. N., and Willmitzer, L. 2000a. Metabolite profiling for plant functional genomics. Nature Biotech., 18: 1157–68. 28. Fiehn, O., Kopka, J., Trethewey, R. N., and Willmitzer, L. 2000b. Identification of uncommon plant metabolites based on calculation of elemental compositions using gas chromatography and quadrupole mass spectrometry. Anal. Chem., 72: 3573–80. 29. Κanani, H. and Klapa, M.I. 2007. Data correction strategy for metabolomics analysis using gas chromatography-mass spectrometry. Metab. Eng., 9: 39–51. 30. Schmidt, K., Carlsen, M., Nielsen, J., and Villadsen J. 1997a. Modeling isotopomer distributions in biochemical networks using isotopomer mapping matrices. Biotechnol. Bioeng., 55: 831–40. 31. Wiechert, W., Mollney, M., Isermann, N., Wurzel, W., and de Graaf A.A. 1999. 
Bidirectional reaction steps in metabolic networks: III. Explicit solution and analysis of isotopomer labeling systems. Biotechnol. Bioeng., 66: 69–85. 32. Klapa, M.I., Park, S.M., Sinskey A.J., and Stephanopoulos G.N. 1999. Metabolite and isotopomer balancing in the analysis of metabolic cycles: I. Theory. Biotechnol. Bioeng., 62: 375–91. 33. Klapa, M.I., Aon, J.C., and Stephanopoulos, G. 2003. Systematic quantification of complex metabolic flux networks using stable isotopes and mass spectrometry. Eur. J. Biochem., 270: 3525–42. 34. Wiechert, W., Siefke, C., de Graaf, A.A., and Marx, A. 1997. Bidirectional reaction steps in metabolic networks: II. Flux estimation and statistical analysis. Biotechnol. Bioeng., 55: 118–35. 35. Follstad, B.D. and Stephanopoulos, G. 1998. Effect of reversible reactions on isotope label redistribution. Analysis of the pentose phosphate pathway. Eur. J. Biochem. 252: 360–71. 36. Sonntag, K.E.L., de Graaf A.A., and Sahm H. 1993. Flux partitioning in the split pathway of lysine synthesis in Corynebacterium glutamicum- quantification by 13C NMR and 1H-NMR spectroscopy. Eur. J. Biochem., 213: 1325–31. 37. Wiechert, W., Mollney, M., Isermann, N., Wurzel, M., and de Graaf A.A. 1999. Bidirectional reaction steps in metabolic networks: III. Explicit solution and analysis of isotopomer labeling systems. Biotechnol. Bioeng., 66: 69–85. 38. Antoniewicz, M.R., Kelleher, J. K., and Stephanopoulos, G. 2007. Elementary metabolite units (EMU): A novel framework for modeling isotopic distributions. Metab. Eng., 9: 68–86. 39. Mah, R.S.H. 1990. Chemical Process Structures and Information Flows. Butterworths Series in Chemical Engineering. Butterworths, Boston. 40. Cobelli, C. and Caumo, A. 1998. Using what is accessible to measure that which is not: necessity of model of system. Metabolism, 47: 1009–35. 41. Bates, D.M. and Watts, D.G. 1988. Nonlinear Regression Analysis and its Applications. 
John Wiley and Sons, New York, Chichester, Brisbane, Toronto, Singapore. 42. Vallino, J.J. 1990. Indentification of branch-point restrictions in microbial metabolism through metabolic flux analysis and local network perturbations. Ph.D. Thesis. Massachusetts Institute of Technology, Cambridge, MA.
Metabolic Flux Analysis
15-17
43. Vaclavek, V.L.M. 1976. Selection of measurements necessary to achieve multicomponent mass balances in chemical plant. Chem. Eng. Sci., 31: 1199–1205. 44. Romagnoli, J.A and Stephanopoulos, G. 1980. On the rectification of measurement errors for complex chemical plants—steady-state analysis. Chem. Eng. Sci., 35: 1067–81. 45. Romagnoli, J.A. and Stephanopoulos, G. 1981. Rectification of process measurement data in the presence of gross errors. Chem. Eng. Sci., 36: 1849–63. 46. Stanley, G.M. and Mah, R.S.H. 1981. Observability and redundancy classification in process networks—theorems and algorithms. Chem. Eng. Sci., 36: 1941–81. 47. Stanley, G.M. and Mah, R.S.H. 1981. Observability and redundancy in process data estimation. Chem. Eng. Sci., 36: 259–72. 48. Kretsovalis, A. and Mah, R.S.H. 1988. Observability and redundancy classification in generalized process networks. 1. Theorems. Comp. Chem. Eng., 12: 671–87. 49. Kretsovalis, A. and Mah, R.S.H. 1988. Observability and redundancy classification in generalized process networks. 2. Algorithms. Comp. Chem. Eng., 12: 689–703. 50. van Winden, W.A., Heijnen, J.J., Verheijen, P.J.T., and Grievink J. 2001. A priori analysis of metabolic flux identifiability from 13C-labeling data. Biotechnol. Bioeng., 74: 505–16. 51. Bailey, J.E. 1998. Mathematical modeling and analysis in biochemical engineering: Past accomplishments and future opportunities. Biotechnol. Prog., 14: 8–20. 52. Kind, T. and Fiehn, O. 2007. Seven golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinformatics. 8: 105. 53. Antoniewicz, M.R., Stephanopoulos, G., and Kelleher, J.K. 2006. Evaluation of regression models in metabolic physiology: Predicting fluxes from isotopic data without knowledge of the pathway. Metabolomics, 2: 41–52. 54. Noh, K., Aljoscha, W., and Wiechert, W. 2006. Computational tool for isotopically instationary 13C labeling experiments under metabolic steady-state conditions. Metab. 
Eng. 8: 554–77. 55. Antoniewicz, M.R., Kraynie, D.F., Laffend, L.A., Gonzalez-Lergier, J., Kelleher, J.K., and Stephanopoulos, G. 2007. Metabolic flux analysis in a nonstationary system: Fed-batch fermentation of a high yielding strain of E. coli producing 1,3-propanediol. Metab Eng., 9: 277–92.
16 Metabolic Control Analysis
Joseph J. Heijnen, Delft University of Technology
16.1 Introduction
16.2 Definitions and Structure of Metabolic Reaction Networks
Definitions • Vector and Matrix Notation • Structural Features of Metabolic Networks
16.3 Mathematical Models of Metabolic Networks
Metabolite Mass Balances • Reduction to the Independent Metabolite Mass Balances and Simplification • Steady State Metabolite and Flux Functions: The Problems
16.4 MCA: A Linear Kinetic Approximation
Reference Steady State and a Linearized Kinetic Relation Using Elasticity Parameters • The Metabolite Steady State Solution • The Steady State Flux Solution • Problems of the Linear Approximation Approach
16.5 Nonlinear Approximate Kinetics
Introduction • Lin-Log Kinetics • Steady State Metabolite and Flux Function for Large Perturbations
16.6 Parameterisation of the Metabolic Reaction Network Model
Steady State Perturbation Experiments • Dynamic Perturbation Experiments
16.7 Conclusion and Outlook
Appendix A
Appendix B
References
16.1 Introduction
With the rapid developments in molecular biology tools, genome sequencing and bioinformatics, it has become very easy to manipulate enzyme levels and properties in metabolic reaction networks, with the aim of achieving desirable changes in product formation rates by cells and microorganisms. For many interesting products the stoichiometry of their synthesis pathways is known, including their links to central metabolism. Much less is known about the effect of changes in enzyme/transporter levels (or changes in their kinetic properties) in the product pathway or in central metabolism on the product flux. This lack of knowledge makes it very difficult to select an enzyme/transporter (or a combination of enzymes) in the product pathway or in central metabolism as target(s) for genetically based change. Currently most enzymes are selected in an intuitive way. For example, if the product requires a lot of NADPH, a likely target is the pentose phosphate pathway, as indicated by Van Gulik et al. (2000). If the product draws its carbon from an intermediate of the TCA cycle (e.g., lysine production requires oxaloacetic acid, OAA), then a likely candidate is the anaplerotic reaction which replenishes OAA in the TCA cycle. Often the final product shows end-product feedback inhibition on the first committed reaction of the product pathway. An often proposed strategy is then to decrease the intracellular product concentration by increasing the level of the product exporter. However, in many situations the intuitive approach fails because metabolic reaction networks have many nonlinear interactions and many enzymes. The usual approach to select gene targets for such complex metabolic reaction networks is to construct a mathematical model. In this contribution the focus will be on a special class: "MCA models." The basis of all models of metabolic reaction networks is the definition of the reactions and the structure of the reaction network, which is, therefore, discussed first.
16.2 Definitions and Structure of Metabolic Reaction Networks
16.2.1 Definitions
In a metabolic network (Figure 16.1) a substrate is transported over a membrane (using a transporter protein e1), is subsequently converted through a network of connected enzyme-catalyzed reactions inside the organism (e2, e3, etc.), and finally a product is secreted using a transporter (e4). The network is, therefore, characterized by the different enzymes (transporters) which are present. Each enzyme is present in an amount ei. This amount needs to be determined experimentally (using classical activity assays or the more modern quantitative proteome analysis). ei is conveniently expressed per amount of biomass. Each enzyme performs a reaction for which the stoichiometry is known, e.g., the enzyme fructose 1,6-bisphosphate aldolase performs the reaction

$$1\,\mathrm{F16BP} \rightarrow 1\,\mathrm{GAP} + 1\,\mathrm{DHAP}.$$
This stoichiometry is taken from textbooks or databases. In addition each reaction has a reaction rate vi. The usual dimension of vi is biomass specific, e.g., mol product of the reaction per hour per amount of biomass. The reactions involve (dependent) intracellular metabolites Xj and (independent), mostly extracellular, metabolites Ck. Extracellular metabolites can occur in intracellular reactions, for example as substrates of transporters, but also as allosteric effectors (inhibitors, activators). The reason to distinguish Ck from Xj is that Ck represents concentrations which the experimenter can manipulate independently (using proper experimental devices to control, e.g., pH, T, and the extracellular substrate, product, O2 and CO2 concentrations). A special category of independent concentrations are conserved moiety sums. These sums (e.g., NADH + NAD+ or ATP + ADP + AMP) are considered constant in certain conditions, but they can be experimentally manipulated using genetic engineering (cofactor engineering). The intracellular concentrations depend on the kinetics and the levels of the enzymes and on the levels of the independent concentrations.

Figure 16.1 Metabolic reaction network, vector and matrix definitions. [Figure: example network S → X1 → X2 → X3 → P with enzymes e1 to e4 and rates v1 to v4, shown together with the vectors C, Xfull, V and e and the matrix Sfull.]
16.2.2 Vector and Matrix Notation
A metabolic network is characterized by n reactions. Reaction i is characterized by:
• Enzyme level ei
• A reaction rate vi, which is by definition positive
• A given reaction stoichiometry
The whole network contains n enzymes and therefore also n reaction rates vi. It is convenient to define the n × 1 column rate vector V (written v in the mass balance equations), which contains each reaction rate vi. In this network one can distinguish m different intracellular metabolites Xj, which are represented in the m × 1 column vector Xfull. In addition there are k different independent (extracellular) metabolites Ck, which are represented in the k × 1 column vector C. The reaction stoichiometry is now easily represented by the full stoichiometry matrix Sfull, which contains (m + k) rows and n columns. The dimension of Sfull is, therefore, (m + k) × n. Column i represents the stoichiometry of reaction i. Element ji represents the stoichiometry of metabolite j in reaction i. Each row represents a metabolite, where the upper k rows represent the independent metabolites Ck and the lower m rows represent the dependent intracellular metabolites Xj. Sfull can be separated into a lower matrix $S^{x}_{\mathrm{full}}$ for all the intracellular metabolites (X) and an upper matrix $S^{c}$ for the extracellular compounds. Finally the network is characterized by the enzyme level ei of each reaction (transporter), which is represented by the n × 1 column enzyme vector e. Figure 16.1 shows an example network and the vector/matrix definitions. The network contains three dependent metabolites (X1, X2, X3), two independent metabolites (S, P) and four reactions. Therefore, C is 2 × 1, X is 3 × 1, V is 4 × 1, e is 4 × 1 and Sfull is 5 × 4 dimensional. $S^{c}$ consists of the two upper rows and $S^{x}_{\mathrm{full}}$ of the three lower rows of Sfull.
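The matrix definitions for the example network can be written out explicitly. The following is a minimal sketch that assumes the network of Figure 16.1 is the linear pathway S → X1 → X2 → X3 → P; the matrix entries are read from the figure under that assumption, and the variable names (`S_full`, `S_c`, `S_x_full`) are illustrative only.

```python
# Rows: the k = 2 independent metabolites (S, P) on top,
# the m = 3 dependent metabolites (X1, X2, X3) below.
# Columns: the n = 4 reactions v1..v4.
# Entries are an assumption based on Figure 16.1 (linear pathway).
row_labels = ["S", "P", "X1", "X2", "X3"]
S_full = [
    [-1,  0,  0,  0],   # S  is consumed by transporter e1
    [ 0,  0,  0, +1],   # P  is produced by transporter e4
    [+1, -1,  0,  0],   # X1 is produced by v1, consumed by v2
    [ 0, +1, -1,  0],   # X2 is produced by v2, consumed by v3
    [ 0,  0, +1, -1],   # X3 is produced by v3, consumed by v4
]

k = 2                      # number of independent metabolites
S_c = S_full[:k]           # upper block: extracellular compounds
S_x_full = S_full[k:]      # lower block: intracellular metabolites


def shape(M):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(M), len(M[0]))


print(shape(S_full), shape(S_c), shape(S_x_full))
```

Splitting by row index works here only because the independent metabolites were placed in the upper rows, as the text prescribes.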
16.2.3 Structural Features of Metabolic Networks
Metabolic networks have structural features.
• Linear path
A linear path is a linear sequence of enzyme-catalyzed reactions. Important examples are product pathways which convert a central metabolite into a secreted product.
• Branch
At a branch point a metabolite (the branch point metabolite) is involved in more than one reaction. Examples are G6P, which is used in the pentose phosphate pathway and in glycolysis, or pyruvate, which can be used in the TCA cycle, in anaplerosis or for a fermentative product. Branch points are extremely relevant because here the material flow in the network is determined, influencing product yields.
• Conserved moieties
Biological networks contain many so-called conserved moieties, such as ATP/ADP/AMP or NADH/NAD+. These molecules carry certain entities (energy or electrons) and are not converted themselves. Their sum amount (e.g., ATP + ADP + AMP or NADH + NAD+) does not change in a given situation. This always allows writing so-called conserved moiety sums:

$$\mathrm{ATP} + \mathrm{ADP} + \mathrm{AMP} = A$$

$$\mathrm{NADH} + \mathrm{NAD}^{+} = N$$

Here A and N are the moiety sums.
The conserved moiety sum (A or N) present in organisms depends on the activity of their synthesis pathways. The sum can be influenced by changing cultivation conditions and/or by genetic techniques. Metabolic networks contain many examples of such conserved moiety sums. The sums put mathematical constraints on the individual intracellular concentrations. A mathematical analysis of the full stoichiometry matrix Sfull (using well-known matrix operations) immediately reveals the presence of such conserved moieties (see Heinrich and Schuster, 1996).
• Cycles
Networks often contain cycles. An important example is the TCA cycle, which oxidizes acetyl-CoA to CO2 and NADH. The intermediates in the TCA cycle form a conserved moiety sum of C4 + C5 + C6 metabolites. However, it is also known that the TCA cycle intermediates αKG, OAA and Succ-CoA are consumed for synthesis purposes. This decreases the conserved moiety sum of the TCA cycle. It is, therefore, absolutely mandatory that removal of TCA cycle compounds (due to biosynthesis) is precisely compensated by influx through the so-called anaplerotic reaction. Other types of cycles are substrate cycles or shuttles (which transport molecules or redox equivalents between mitochondrion and cytosol, such as the malate aspartate shuttle) and futile cycles (where ATP is dissipated).
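One common choice for the "well-known matrix operations" mentioned above is to compute the left null space of the stoichiometry matrix: every vector y with yᵀS = 0 defines a weighted sum of metabolites that no reaction can change. The sketch below is only an illustration of that idea, not the procedure of Heinrich and Schuster (1996); the four-reaction toy network, its metabolite names and the helper `left_null_space` are assumptions.

```python
from fractions import Fraction


def left_null_space(S):
    """Return a basis of vectors y with y^T S = 0 (one per conserved sum).

    Row-reduces the augmented matrix [S | I] with exact fractions; rows whose
    S-part becomes zero carry the conserved combination in their I-part.
    """
    m, n = len(S), len(S[0])
    rows = [[Fraction(x) for x in S[i]] +
            [Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    pivot_row = 0
    for col in range(n):
        p = next((r for r in range(pivot_row, m) if rows[r][col] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        for r in range(m):
            if r != pivot_row and rows[r][col] != 0:
                f = rows[r][col] / rows[pivot_row][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [row[n:] for row in rows if all(x == 0 for x in row[:n])]


# Hypothetical toy network (not from the chapter):
#   v1: -> X1,  v2: X1 + NAD+ -> X2 + NADH,  v3: X2 ->,  v4: NADH -> NAD+
labels = ["X1", "X2", "NADH", "NAD+"]
S_x_full = [
    [1, -1,  0,  0],   # X1
    [0,  1, -1,  0],   # X2
    [0,  1,  0, -1],   # NADH
    [0, -1,  0,  1],   # NAD+
]

moieties = left_null_space(S_x_full)
for y in moieties:
    print(" + ".join(f"{c}*{name}" for c, name in zip(y, labels) if c != 0))
```

For this toy network the only conserved combination found is NADH + NAD+, which corresponds to the moiety sum N of the text.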
16.3 Mathematical Models of Metabolic Networks
16.3.1 Metabolite Mass Balances
The mathematical model of a metabolic network is always based on the mass balances of the dependent, intracellular metabolites. In vector notation this mass balance reads:

$$\frac{dX_{\mathrm{full}}}{dt} = S^{x}_{\mathrm{full}}\, v - \mu X_{\mathrm{full}} \tag{16.1a}$$

Here $S^{x}_{\mathrm{full}}$ is the m × n stoichiometric matrix (the lower part of Sfull, Figure 16.1) representing all the intracellular metabolites (X), dXfull/dt is the m × 1 column vector of time derivatives of all intracellular metabolites, and v is the n × 1 column vector of reaction rates. µXfull is the so-called dilution term due to biomass growth at specific growth rate µ.
16.3.2 Reduction to the Independent Metabolite Mass Balances and Simplification
We have already seen that metabolic reaction networks often contain conserved moiety sums, for example NADH + NAD+ = N. Differentiating this sum with respect to time (with N constant) gives:

$$\frac{d\,\mathrm{NADH}}{dt} + \frac{d\,\mathrm{NAD}^{+}}{dt} = 0$$

This shows that the mass balances for NADH and NAD+ are dependent. Therefore, we need to remove one mass balance (e.g., that for NAD+). The mathematical model then only gives the NADH concentration as time dependent output; NAD+ follows immediately from the conserved moiety sum above. Conserved moiety sums, therefore, require the reduction of the metabolite vector Xfull and of the stoichiometry matrix $S^{x}_{\mathrm{full}}$ by removing a metabolite present in a conserved sum relation (e.g., NAD+ is removed). The result of handling conserved moieties is
• That the stoichiometry matrix ($S^{x}_{\mathrm{full}}$), which represents all intracellular metabolites (X), has been reduced by removing one metabolite (= one row) for each conserved moiety sum. The number of rows removed equals the number of conserved moiety sums. This reduced stoichiometry matrix is called S, with dimension m′ × n, where m′ is the number of remaining intracellular metabolites.
• That the vector of all intracellular metabolites (Xfull) has also been reduced by removing the same conserved moiety related metabolites. This reduced vector of intracellular metabolites is called X.
After reduction one can simplify the metabolite mass balances, because the intracellular metabolite concentration Xj is generally very low, meaning that the µXj term in the mass balance for Xj is much smaller than the contribution from the reactions. Therefore, for metabolites this dilution term µXj is usually neglected. However, for storage compounds (e.g., PHB, glycogen) this is not allowed. This leads to the independent mass balances for intracellular metabolites (with neglected dilution term and conserved moiety metabolites removed):

$$\frac{dX}{dt} = S v \tag{16.1b}$$
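The reduction step can be sketched in a few lines. The toy network, its metabolite names, the helper names and the numerical value used for the moiety sum N are assumptions made only for illustration.

```python
# Hypothetical toy network: rows X1, X2, NADH, NAD+; columns v1..v4.
labels_full = ["X1", "X2", "NADH", "NAD+"]
S_x_full = [
    [1, -1,  0,  0],   # X1
    [0,  1, -1,  0],   # X2
    [0,  1,  0, -1],   # NADH
    [0, -1,  0,  1],   # NAD+  (its row is minus the NADH row)
]


def reduce_rows(S, labels, removed):
    """Drop one metabolite (row) per conserved moiety sum."""
    keep = [i for i, name in enumerate(labels) if name not in removed]
    return [S[i] for i in keep], [labels[i] for i in keep]


def nad_plus(nadh, moiety_sum):
    """Recover the removed metabolite from NADH + NAD+ = N."""
    return moiety_sum - nadh


# Reduced matrix S (m' x n) and reduced metabolite list X.
S, labels = reduce_rows(S_x_full, labels_full, removed={"NAD+"})
print(labels, len(S), "x", len(S[0]))

# With an assumed moiety sum N = 1.0 and a simulated NADH value of 0.2
# (arbitrary concentration units), NAD+ follows without its own balance.
print(nad_plus(0.2, 1.0))
```

Which member of a conserved sum is removed is a free choice; the remaining member then carries the information for both.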
In addition one can formulate the mass balances for extracellular metabolites. Usually the organisms are cultivated in a chemostat (e.g., to study controlled steady states), with biomass concentration Cx. The mass balances for extracellular compounds then also involve in- and out-transport terms, represented as DCin and DC. This gives as balances:

$$\frac{dC}{dt} = C_x S^{c} v + D C_{\mathrm{in}} - D C \tag{16.1c}$$

Here $S^{c}$ is the part of Sfull relating to extracellular compounds. Cin is the k × 1 column vector representing the concentrations of extracellular compounds in the inflow, C is the corresponding vector for the concentrations of extracellular compounds in the chemostat, and D is the dilution rate of the chemostat (= φin/V = φout/V), with V the chemostat broth volume. A further simplification of Equation 16.1b is possible using the assumption of pseudo steady state (PSS) for intracellular metabolites, meaning that the contribution of the dX/dt term is negligible when compared to the reaction term (Sv). This assumption is only valid for time scales more than 10 times the metabolite pool turnover time (t.o.t.). Because X is generally small, the t.o.t. is of the order of seconds to minutes. This has been confirmed repeatedly in, e.g., Saccharomyces cerevisiae (Wu et al., 2006) and Penicillium chrysogenum (Nasution et al., 2006). This shows that, even in dynamic experiments (at time scales >10 minutes), we can often invoke PSS for X. With these simplifications, and after removal of conserved moieties, the mass balances for intracellular metabolites reduce to a mathematically independent set of steady state equations:

$$S v = 0 \tag{16.1d}$$
These independent mass balances are also the basis of the metabolic flux balance analysis (MFA) calculations (see Palsson, 2006). Note that PSS does not hold for extracellular metabolites (Equation 16.1c).
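The two ingredients of the pseudo steady state argument, the balance Sv = 0 and the turnover time criterion, can be checked numerically. The sketch below assumes the linear example pathway of Figure 16.1 for S; the pool size and flux used for the turnover time are hypothetical round numbers, not measured values.

```python
def matvec(S, v):
    """Matrix-vector product S v for a list-of-lists matrix."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


def turnover_time(pool, flux):
    """Pool size divided by the flux through the pool (units follow inputs)."""
    return pool / flux


# Reduced stoichiometry matrix for X1, X2, X3 (assumed linear pathway).
S = [
    [1, -1,  0,  0],
    [0,  1, -1,  0],
    [0,  0,  1, -1],
]

balanced = matvec(S, [1.0, 1.0, 1.0, 1.0])     # all rates equal: S v = 0
unbalanced = matvec(S, [1.0, 0.8, 0.8, 0.8])   # v1 > v2: X1 accumulates

# Hypothetical numbers: a 2 µmol/gDW pool carrying 1 mmol/gDW/h.
tau_s = turnover_time(pool=2e-6, flux=1e-3) * 3600.0   # hours -> seconds
pss_valid_beyond_s = 10.0 * tau_s                      # ">10 x t.o.t." rule

print(balanced, unbalanced, tau_s, pss_valid_beyond_s)
```

With these assumed numbers the pool turns over in a few seconds, so the pseudo steady state assumption would be acceptable for phenomena slower than roughly a minute.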
16.3.3 Steady State Metabolite and Flux Functions: The Problems
In general there are m′ dynamic metabolite balances, where it is assumed that the conserved moiety partners have been removed already, so that these balances are mathematically independent. In (pseudo) steady state these m′ mass balances allow the calculation of the m′ steady state metabolite concentrations. To be able to do this calculation we need an expression for the rate equation of each enzyme-catalysed reaction (vi). In general

$$v_i = f_i(e_i, X_j, C_k, \text{parameters}) \tag{16.2}$$
This relation states that vi depends:
• On the enzyme amount ei present in the metabolic reaction network. Usually ei and vi are proportional.
• In a nonlinear fashion on intracellular (Xj) and extracellular (Ck) metabolite concentrations.
• In a nonlinear fashion on parameters.
These enzyme kinetic functions have been elucidated by enzymologists for many different enzymes in the past decades. Indeed all kinetics found are strongly nonlinear (hyperbolic, sigmoid and others). This makes a general analytical solution of the PSS mass balances (Equation 16.1d) impossible. Only numerical approaches are possible, but these require the values of the kinetic parameters for each enzyme. Here a second problem arises. The above mentioned enzyme kinetic studies have been performed on purified enzymes. These in vitro conditions usually are far from the conditions in the cell (in vivo), and it has been found that the available kinetic parameters are not applicable in vivo (Teusink et al., 2000). This means that kinetic studies of enzymes in metabolic networks need to use in vivo conditions where each enzyme is still in the cell. These two problems have been tackled by a theoretical and an experimental approach, which will be presented below:
• Approximate kinetics, which allows analytical solutions of the metabolite mass balances (Sections 16.4 and 16.5)
• Experimental in vivo perturbation of whole cells (perturbing enzyme levels and metabolite concentrations, Section 16.6)
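To make the first problem concrete, the sketch below solves the steady state balance of a single intracellular metabolite numerically. It is a toy illustration, not a model from the chapter: the two rate laws, all parameter values and the function names are assumptions. The second problem, the in vivo validity of the parameters, is exactly what such a calculation takes for granted.

```python
def v1(X, e1=1.0, vmax=2.0, ki=1.0):
    """Supply reaction, hyperbolically inhibited by its product X (assumed)."""
    return e1 * vmax / (1.0 + X / ki)


def v2(X, e2=1.0, vmax=3.0, km=2.0):
    """Michaelis-Menten consumption of X (assumed)."""
    return e2 * vmax * X / (km + X)


def steady_state(e1=1.0, e2=1.0, lo=0.0, hi=1e3, tol=1e-12):
    """Bisection on the mass balance dX/dt = v1 - v2 = 0.

    v1 decreases and v2 increases with X, so the balance has a single root
    between lo and hi for the parameter values used here.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if v1(mid, e1) - v2(mid, e2) > 0:
            lo = mid   # net production: the root lies at higher X
        else:
            hi = mid
    return 0.5 * (lo + hi)


X_ref = steady_state()                 # reference steady state
J_ref = v2(X_ref)
X_new = steady_state(e2=2.0)           # doubled level of the second enzyme
J_new = v2(X_new, e2=2.0)
print(X_ref, J_ref, X_new, J_new)
```

In this toy case doubling e2 raises the steady state flux by much less than a factor of two, because the metabolite level drops at the same time; this is the kind of nonlinear interaction that makes intuitive target selection unreliable.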
16.4 MCA: A Linear Kinetic Approximation
16.4.1 Reference Steady State and a Linearized Kinetic Relation Using Elasticity Parameters
We have seen that the metabolite mass balances cannot be solved analytically due to the nonlinear enzyme kinetic functions. Because the metabolite mass balances are already linear in v (Equation 16.1d), it is clear that a kinetic equation which is linear in ei, Xj and Ck would allow a general solution of the steady state metabolite balances. In essence this means that the nonlinear kinetic equation needs to be linearized. In order to do this one needs a working point (indicated by the superscript o). Because the metabolic system is considered in steady state it is useful to choose a working point which is a steady state, called the reference steady state. For the rate function vi (Equation 16.2) we can now write (using a Taylor expansion):

$$dv_i = \frac{\partial f_i}{\partial e_i}\,de_i + \frac{\partial f_i}{\partial X_j}\,dX_j + \frac{\partial f_i}{\partial C_k}\,dC_k \tag{16.3a}$$
ei is present only once, but there are as many Xj and Ck terms as there are different intra- and extracellular metabolites present in the kinetic function. Using $J_i^{o}$ as reference flux, $e_i^{o}$ as enzyme level in the reference steady state, $X_j^{o}$ as reference intracellular metabolite concentration and $C_k^{o}$ as extracellular reference concentration, we can write (with $dz = z - z^{o}$):

$$v_i - J_i^{o} = \left(\frac{\partial f_i}{\partial e_i}\right)^{o}\left(e_i - e_i^{o}\right) + \left(\frac{\partial f_i}{\partial X_j}\right)^{o}\left(X_j - X_j^{o}\right) + \left(\frac{\partial f_i}{\partial C_k}\right)^{o}\left(C_k - C_k^{o}\right) \tag{16.3b}$$

Note that the symbol Ji (flux) is used for a rate of reaction in a steady state network. $J_i^{o}$ is then this rate in the chosen reference steady state.
We can rewrite this equation by introducing fold changes in enzyme level ($e_i/e_i^{o}$), in intra- and extracellular metabolite concentrations ($X_j/X_j^{o}$ and $C_k/C_k^{o}$), and in reaction rate ($v_i/J_i^{o}$, where $J_i^{o}$ is also $f_i^{o}$):

$$\frac{v_i}{J_i^{o}} - 1 = \left(\frac{e_i}{f_i}\frac{\partial f_i}{\partial e_i}\right)^{o}\left(\frac{e_i}{e_i^{o}} - 1\right) + \left(\frac{X_j}{f_i}\frac{\partial f_i}{\partial X_j}\right)^{o}\left(\frac{X_j}{X_j^{o}} - 1\right) + \left(\frac{C_k}{f_i}\frac{\partial f_i}{\partial C_k}\right)^{o}\left(\frac{C_k}{C_k^{o}} - 1\right) \tag{16.3c}$$

• The $(\;)^{o}$ terms can now be seen to be dimensionless quantities, and all other terms are dimensionless as well. Using fold changes is practical because it removes all kinds of different dimensions and it introduces a natural scaling of experimental data.
• $\left(\frac{e_i}{f_i}\frac{\partial f_i}{\partial e_i}\right)^{o}$ can be seen to be equal to 1 because vi and ei are nearly always proportional.
• $\left(\frac{X_j}{f_i}\frac{\partial f_i}{\partial X_j}\right)^{o}$ equals $\left(\frac{\partial f_i / f_i}{\partial X_j / X_j}\right)^{o}$, which represents the relative change in rate fi upon a relative change in metabolite Xj. This quantity is called the elasticity $\varepsilon_{ij}^{xo}$ of intracellular metabolite Xj on enzyme ei. The superscript o indicates that the ε-value depends on the reference.
• $\left(\frac{C_k}{f_i}\frac{\partial f_i}{\partial C_k}\right)^{o}$ can in a similar way be identified as the elasticity $\varepsilon_{ik}^{co}$ of extracellular metabolite Ck on reaction i.
• The interesting point of elasticities is that they represent dimensionless kinetic parameters which are well bounded. If metabolite Xj stimulates an enzyme ei in a classical Michaelis and Menten way, then 0 < ε < 1, where 0 is approached at enzyme saturation and 1 at low metabolite levels. If there is hyperbolic inhibition then the elasticity is negative, –1 < ε < 0. For Hill type kinetics |ε| can be larger than 1. Usually, following enzyme structural features (number of subunits n), |ε| < n. Composite elasticities (Appendix A) can also be larger than 1 in absolute value.
• It should be noted that elasticities are defined in the reference steady state (hence the superscript o).
Figure 16.2 shows the elasticity concept for a simple Michaelis and Menten enzyme. Through the behavior of the slope, this figure shows that $\varepsilon_{ij}^{o}$ approaches 0 for high $X_j^{o}$ and 1 for low $X_j^{o}$.
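The bounds quoted for a Michaelis and Menten enzyme follow from the definition of the elasticity as a scaled slope. For v = Vmax·X/(Km + X) the definition gives ε = Km/(Km + X°). The sketch below compares this expression with a finite-difference estimate; the parameter values and function names are illustrative assumptions.

```python
def mm_rate(x, vmax=1.0, km=1.0):
    """Michaelis-Menten rate v = Vmax * x / (Km + x)."""
    return vmax * x / (km + x)


def elasticity_numeric(rate, x0, rel_step=1e-6):
    """Scaled slope (x/v)(dv/dx) at the reference x0, by central difference."""
    h = x0 * rel_step
    dv_dx = (rate(x0 + h) - rate(x0 - h)) / (2.0 * h)
    return dv_dx * x0 / rate(x0)


def elasticity_mm(x0, km=1.0):
    """Analytical elasticity for Michaelis-Menten kinetics: Km / (Km + x0)."""
    return km / (km + x0)


# Low, intermediate and near-saturating reference concentrations (Km = 1).
for x0 in (0.01, 1.0, 100.0):
    print(x0, elasticity_numeric(mm_rate, x0), elasticity_mm(x0))
```

The elasticity is close to 1 far below Km, exactly 0.5 at X° = Km, and close to 0 near saturation, in line with the bounds stated above.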
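Figure 16.2 Elasticity for a Michaelis and Menten enzyme. [Figure: plot of vi against xj; the elasticity $\varepsilon_{ij}^{o} = \left(\partial v_i / \partial x_j\right)^{o} \cdot x_j^{o} / J_i^{o}$ is the scaled slope at the reference point ($x_j^{o}$, $J_i^{o}$).]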
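Using elasticities, we can now rewrite Equation 16.3c:

$$\frac{v_i}{J_i^{o}} - 1 = 1\cdot\left(\frac{e_i}{e_i^{o}} - 1\right) + \varepsilon_{ij}^{xo}\left(\frac{X_j}{X_j^{o}} - 1\right) + \varepsilon_{ik}^{co}\left(\frac{C_k}{C_k^{o}} - 1\right) \tag{16.3d}$$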
This linearized kinetic equation for enzyme ei contains one elasticity term for each metabolite which has a kinetic effect on enzyme ei. The elasticity as defined here is the same as in classical metabolic control analysis (MCA) as proposed by Kacser and Burns (1973) and Heinrich and Rapoport (1974), and as reviewed in the book of Heinrich and Schuster (1996). Previously (Section 16.3.2) we noted that conserved moieties often occur in metabolic reaction networks. To obtain metabolite mass balances which are mathematically independent, we have to remove, for each conserved moiety sum, a metabolite from the stoichiometry matrix (by removing its corresponding row). For example, one removes NAD+ (because of the conserved moiety sum NADH + NAD+) and ADP (because of the conserved moiety sum ATP + ADP + AMP). This leads to a reduced stoichiometry matrix (from $S^{x}_{\mathrm{full}}$ to S) with only independent rows. The metabolite vector has to be reduced similarly (from Xfull to X, Section 16.3.2). An additional required action is to remove these metabolites (e.g., NAD+ and ADP) from each linearized kinetic expression as well, using the conserved moiety relation. Appendix A shows that this leads to modified kinetic relations with several features:
• The conserved moiety sum T appears in the kinetic relation as an independent concentration.
• New, composite elasticities are defined for the conserved moiety metabolites which have not been removed.
• An elasticity is defined for the effect of the independent conserved moiety sum T on the rate of reaction vi.
However, the structure of the linearized kinetic relation remains untouched.
An interesting observation is that, when the conserved moiety sum CT is not perturbed ($C_T = C_T^{o}$), which is usually the case, there is an identifiability problem for the elasticities of the conserved moiety metabolites (see Appendix A). The kinetic equation (Equation 16.3d) can be written for each enzyme, and because Equation 16.3d is linear it is convenient to write all kinetic equations in matrix notation:

$$\frac{v}{J^{o}} = i + I\left(\frac{e}{e^{o}} - i\right) + E^{xo}\left(\frac{X}{X^{o}} - i\right) + E^{co}\left(\frac{c}{c^{o}} - i\right) \tag{16.4}$$
Here
• $(v/J^{o})$ is an n × 1 column vector with elements $v_i/J_i^{o}$.
• $(e/e^{o})$ is an n × 1 column vector with elements $e_i/e_i^{o}$.
• $(X/X^{o})$ is an m′ × 1 column vector for the m′ intracellular metabolites (reduced with respect to the conserved moieties) with elements $X_j/X_j^{o}$.
• $(c/c^{o})$ is a k′ × 1 column vector for the independent metabolites with elements $C_k/C_k^{o}$. These comprise both the extracellular concentrations Ck and the conserved moiety sums ($C_T/C_T^{o}$).
• $E^{xo}$ is the elasticity matrix for the intracellular (dependent) metabolites (where the conserved moiety partners have been removed), of dimension n × m′. This matrix also contains "composite" elasticities for the nonremoved conserved moiety partners (Appendix A).
• $E^{co}$ is the elasticity matrix for the independent metabolites (extracellular and conserved moiety sums), of dimension n × k′.
• The elasticities are defined in the reference state, hence $E^{xo}$ and $E^{co}$.
• i is a column vector of ones and I is an identity matrix, each with the proper dimension.
Metabolic Control Analysis
Equation 16.4 represents for all enzymes the linearized kinetic functions, with the elasticities for the dependent (X) and independent (C) concentrations as kinetic (affinity related) parameters and Jo as the rate parameter.
16.4.2 The Metabolite Steady State Solution
The linearized kinetic equation (Equation 16.4) can now be used to solve the independent steady state mass balances for the dependent metabolites. The steady state metabolite mass balances (using the reduced stoichiometry matrix S) follow from Equation 16.1d by rewriting.
$$S\,[J^o]\,\frac{V}{J^o} = 0 \tag{16.1e}$$
Here $[J^o]$ is an n × n square diagonal matrix with elements $J_i^o$. Introducing $(V/J^o)$ from Equation 16.4 gives
$$S[J^o]\,i + S[J^o]\left(\frac{e}{e^o} - i\right) + S[J^o]E^{xo}\left(\frac{X}{X^o} - i\right) + S[J^o]E^{co}\left(\frac{c}{c^o} - i\right) = 0 \tag{16.5}$$
By definition the matrix $S[J^o]E^{xo}$ is (m′ × m′) square and of full rank because the conserved moieties have been taken care of. Therefore, $S[J^o]E^{xo}$ is invertible. Equation 16.5 is then easily solved for the dependent metabolite vector $((X/X^o) - i)$, leading to Equation 16.6 (note that $S[J^o]i = 0$ because $J^o$ belongs to a steady state).
$$\frac{X}{X^o} - i = C^{xo}\left(\frac{e}{e^o} - i\right) + R^{xo}\left(\frac{c}{c^o} - i\right) \tag{16.6}$$
In this equation matrices Cxo and R xo represent the so-called intracellular metabolite (X) control and response coefficients, defined in the reference state o. The solution of Equation 16.5 shows that the following relations exist:
$$C^{xo} = \left[-S[J^o]E^{xo}\right]^{-1} S[J^o] \tag{16.7a}$$
$$R^{xo} = C^{xo}E^{co} \tag{16.7b}$$
The intracellular metabolite control coefficient $C^{xo}_{ij}$ shows, upon a small relative change $(e_j - e_j^o)/e_j^o$ in enzyme activity $e_j$, the resulting small relative change in dependent metabolite $X_i$. In a similar way the response coefficient $R^{xo}_{jk}$ relates a relative change $(C_k/C_k^o) - 1$ in an independent metabolite (extracellular or conserved moiety sum) to the resulting change in dependent metabolite $X_j$. Equations 16.7a and 16.7b show that the intracellular metabolite control and response coefficients are completely known when the enzyme kinetics (represented in $E^{xo}$, $E^{co}$ and $J^o$) are known. Equation 16.6 is a complete solution for all intracellular metabolites, where part of the conserved moiety concentrations follow from Equation 16.6 and the remaining ones are found using the available conserved moiety sum relations (Appendix A).
The control coefficients $C^{xo}_{ij}$ are not all independent. This is in contrast to the elasticity parameters defined in matrix $E^{xo}$. This is easily understood from the fact that $C^{xo}$ contains m′ × n entries, each of which has a value. In contrast $E^{xo}$ is an n × m′ matrix, which is very sparse (many zero entries). In general there are far fewer $\varepsilon^o_{ij}$ values than there are $C^{xo}_{ij}$ entries. This leads directly to dependencies. In Appendix B, this dependency aspect is discussed in more detail. Clearly there must exist dependency relations
Modeling Tools for Metabolic Engineering
between the $C^{xo}_{ij}$. The most famous relation is the summation relation. If we post-multiply Equation 16.7a by the unit vector i (meaning we add up each row of the $C^{xo}$ matrix) we obtain:
Cxoi = 0
(16.7c)
This shows that the sum of entries in each row of the Cxo matrix equals 0 by definition. Another interesting relation for Cxo is easily obtained from Equation 16.7a by post-multiplying with Exo, leading to:
CxoExo = - I
(16.7d)
This is called the metabolite control connectivity relation. Equation 16.6 shows how a small relative perturbation in the enzyme or independent concentration (extracellular or conserved moiety sum) leads to a small relative change in the dependent (intracellular) metabolite concentration. In this solution the metabolite control and response coefficients are the parameters which follow from enzyme kinetics (Eco, Exo, Jo) and stoichiometry (S) (Equation 16.7a, b).
16.4.3 The Steady State Flux Solution
The steady state flux solution is easily obtained by entering the steady state metabolite solution (Equation 16.6) in each enzyme kinetic equation (Equation 16.4), which then gives the rate of each enzyme catalysed reaction in that steady state. A steady state rate is called a flux, with symbol J. In matrix notation this gives:
$$\frac{J}{J^o} - i = C^{Jo}\left(\frac{e}{e^o} - i\right) + R^{Jo}\left(\frac{c}{c^o} - i\right) \tag{16.8}$$
In this flux solution matrices CJo and RJo are the flux control and response coefficients, defined in the reference state. $C^{Jo}_{ij}$ gives, upon a relative change $(e_j/e_j^o - 1)$ in enzyme $e_j$, with all other enzyme levels and all extracellular concentrations unchanged, the small relative change in the flux of reaction i. $R^{Jo}_{ik}$ is a similar coefficient which relates the relative change in one extracellular concentration $C_k$ to the resulting change in the flux of reaction i. The substitution of Equation 16.6 in Equation 16.4 shows that the following relations hold:
CJo = I + ExoCxo
(16.9a)
RJo = ExoR xo + Eco = CJoEco
(16.9b)
Equations 16.9a and 16.9b (combined with Equations 16.7a, b) show that these flux control and response coefficients are completely known when the enzyme kinetics (matrices Eco, Exo, [Jo]) and the stoichiometry matrix S are known. The flux control coefficients are also not independent. There are many dependency relations (see Appendix B for more details). The most famous is the flux control summation relation, which is obtained by summing the coefficients in each row of CJo. Equation 16.9a shows (using also Equation 16.7c) that Equation 16.9c holds, which says that the control coefficients for a flux sum up to 1.
CJoi = i
(16.9c)
Also for the flux control matrix there exists a connectivity relation. This relation is obtained by postmultiplying Equation 16.9a with Exo (and using the other connectivity relation Equation 16.7d).
CJoExo = 0
(16.9d)
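The relations of Equations 16.7a, 16.9a and the summation and connectivity relations can be checked numerically. The following is only a minimal sketch for a three-step linear pathway S → X1 → X2 → P; the elasticity values are illustrative assumptions (they match the worked linear-pathway example of Section 16.6.1.1), the reference fluxes are scaled to 1, and the helper names `matmul` and `inv2` are hypothetical.

```python
# Sketch: control coefficients from stoichiometry and elasticities
# (Equations 16.7a and 16.9a) for a linear pathway S -> X1 -> X2 -> P.
# All numerical values are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Invert a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Reduced stoichiometry matrix S (rows: X1, X2; columns: reactions 1..3).
S = [[1, -1, 0],
     [0, 1, -1]]
# Reference fluxes: in a linear pathway every step carries the same flux.
Jo = [1.0, 1.0, 1.0]
# Elasticity matrix Exo (rows: reactions; columns: X1, X2), assumed values.
Exo = [[-0.40, 0.00],
       [0.70, -0.30],
       [0.00, 0.50]]

# S[Jo]: scale column j of S by the reference flux Jo[j].
SJ = [[S[i][j] * Jo[j] for j in range(3)] for i in range(2)]

# Equation 16.7a: Cxo = [-S[Jo]Exo]^-1 S[Jo]
neg_SJE = [[-v for v in row] for row in matmul(SJ, Exo)]
Cxo = matmul(inv2(neg_SJE), SJ)

# Equation 16.9a: CJo = I + Exo Cxo
ExC = matmul(Exo, Cxo)
CJo = [[(1.0 if i == j else 0.0) + ExC[i][j] for j in range(3)]
       for i in range(3)]

for name, mat in (("Cxo", Cxo), ("CJo", CJo)):
    print(name)
    for row in mat:
        print("  ", [round(v, 3) for v in row], "row sum:", round(sum(row), 6))
```

In this sketch each row of Cxo sums to 0 (Equation 16.7c) and each row of CJo sums to 1 (Equation 16.9c). The three rows of CJo are identical, as expected for a linear pathway in which all steps carry the same flux.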
16.4.4 Problems of the Linear Approximation Approach
Equations 16.8 and 16.6 describe completely how small changes in enzyme levels (e/eo - 1) and independent concentrations (c/co - 1) (extracellular and conserved moiety sum) lead to small changes in fluxes and metabolite levels. The coefficients in these steady state equations must be found from steady state experiments where
• Perturbations are applied in the independent quantities such as enzyme levels, extracellular concentrations and conserved moiety sums.
• The resulting changes in intracellular metabolite levels X/Xo and fluxes J/Jo are measured.
The major problem with this approach is that the linear approximation requires very small perturbations (in $e_i/e_i^o$, $C_k/C_k^o$, $C_T/C_T^o$), leading to very small changes in $X_j/X_j^o$ and $J_i/J_i^o$. It is nearly impossible to create such small perturbations experimentally, and it is even more difficult to measure (error less than 1%) the resulting small changes in $X_j/X_j^o$ and $J_i/J_i^o$. This is a fundamental problem of the linear approximation. A last point of concern is that, even if these small perturbations could be realized experimentally and the small changes in X and J could be measured with enough accuracy, the obtained control and response coefficients are still only valid close to the steady state which is used. This means that it is impossible to predict the effect of, e.g., large changes in enzyme levels on fluxes. This linear MCA model therefore has very limited practical application, although the obtained control coefficients and their properties are very relevant for understanding network properties (see Fell, 1996). What is clearly required is a nonlinear approach which allows reliable predictions of the new fluxes and metabolite levels following large changes in enzyme levels and/or extracellular concentrations. This can be achieved using nonlinear approximate kinetics.
16.5 Nonlinear Approximate Kinetics
16.5.1 Introduction
The use of nonlinear approximate kinetics in modeling of metabolic networks has been recently reviewed (Heijnen, 2005). It was concluded that lin-log kinetics has favorable properties compared to other approaches such as biochemical systems theory (Savageau, 1976 and Voit, 2000), the large perturbation approach of Small and Kacser (1993a,b) and the logarithmic linearization approach (Hatzimanikatis and Bailey, 1997). Here we will use only the lin-log approach as approximate kinetics for enzymes (Heijnen, 2005).
16.5.1 Lin-Log Kinetics
The kinetics are easily written in the vector/matrix notation already used, because lin-log kinetics uses the same kinetic parameters ($E^{xo}$, $E^{co}$, $[J^o]$) as MCA (Section 16.4). The key is that in lin-log kinetics the metabolites are present nonlinearly, as $\ln(X/X^o)$ terms. The logarithmic form is due to the concept of thermodynamic driving force which is at the basis of lin-log kinetics (Visser and Heijnen, 2003 and Heijnen, 2005). Figure 16.3 shows how hyperbolic kinetics can be approximated (at $X^o = K_m$) using lin-log kinetics. It is clear that lin-log kinetics gives a very satisfying approximation. A second point is that in lin-log kinetics the proportionality between $V_i$ and $e_i$ is maintained, in contrast to the traditional linearized MCA approach (which is linear, but not proportional, in $e_i/e_i^o$). For an enzyme $e_i$ one can write lin-log kinetics:
$$\frac{V_i}{J_i^o} = \frac{e_i}{e_i^o}\left(1 + \sum_j \varepsilon^{xo}_{ij}\ln\frac{X_j}{X_j^o} + \sum_k \varepsilon^{co}_{ik}\ln\frac{C_k}{C_k^o}\right) \tag{16.10a}$$
[Figure 16.3: relative rate $v_i/J_i^o$ plotted against relative concentration $x_j/x_j^o$ for linear, lin-log and hyperbolic kinetics.]
Figure 16.3 Lin-log and linear kinetics as approximation of hyperbolic kinetics (at working point $X_j^o = K_m$).
This equation shows that rate and enzyme level are proportional (which is considered a safe assumption), that a dependent metabolite $X_j$ has a kinetic effect through a logarithmic concentration term with an elasticity $\varepsilon^{xo}_{ij}$ as kinetic parameter, and that the independent (extracellular or conserved moiety) concentration $C_k$ has a kinetic effect in a similar way with an elasticity $\varepsilon^{co}_{ik}$. A kinetic equation shows as many logarithmic contributions as there are kinetically active metabolites, but each metabolite $X_j$ has only one kinetic parameter $\varepsilon^o_{ij}$ for enzyme i. The number of kinetic parameters per reaction (enzyme) equals the number of elasticities plus the rate parameter $J_i^o$. This is the minimum number of parameters possible. An important aspect of lin-log kinetics is that the elasticity is not a constant (as in power law kinetics, see Heijnen, 2005), but changes in a downward concave way with concentration. For example, when
$$\frac{V}{J^o} = \frac{e}{e^o}\left(1 + \varepsilon^{xo}\ln\frac{X}{X^o}\right)$$
then one obtains
$$\varepsilon^{x} = \frac{\partial V}{\partial X}\cdot\frac{X}{V} = \frac{\varepsilon^{xo}}{1 + \varepsilon^{xo}\ln\left(X/X^o\right)}.$$
This shows that for lin-log kinetics the elasticity drops at higher metabolite levels, which is according to experience (see also Figure 16.3). A final important property of lin-log kinetics is that the elasticity parameters are linear in the kinetic functions. This allows the use of linear regression to obtain the elasticity parameters from experiments (see Section 16.6). We can now write lin-log kinetics in vector notation:
$$\frac{V}{J^o} = \left[\frac{e}{e^o}\right]\left(i + E^{xo}\ln\frac{X}{X^o} + E^{co}\ln\frac{c}{c^o}\right) \tag{16.10b}$$
In this equation $[e/e^o]$ is an n × n square diagonal matrix with elements $e_i/e_i^o$, $V/J^o$ is an n × 1 column vector with elements $V_i/J_i^o$, $E^{xo}$ and $E^{co}$ are the elasticity matrices in the reference state, $(X/X^o)$ is the vector of dependent metabolites, from which conserved moiety metabolites have been removed, and $(c/c^o)$ is the vector of independent metabolites. Equation 16.10b contains vectors with $\ln(X_j/X_j^o)$ and $\ln(c_k/c_k^o)$ as elements.
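The approximation quality shown in Figure 16.3 and the falling lin-log elasticity can be reproduced with a few lines of code. This is a minimal sketch assuming an irreversible hyperbolic (Michaelis-Menten type) rate law with the reference point at $X^o = K_m$, so that the reference elasticity equals 0.5; the function names are hypothetical.

```python
import math

# Reference elasticity of v = Vmax*X/(Km + X) at X^o = Km: Km/(Km + X^o) = 0.5.
EPS0 = 0.5

def hyperbolic(r):
    """Relative rate v/J^o of the hyperbolic rate law; r = X/X^o with X^o = Km."""
    return 2.0 * r / (1.0 + r)

def linear(r):
    """Linearized (classical MCA) approximation around r = 1."""
    return 1.0 + EPS0 * (r - 1.0)

def linlog(r):
    """Lin-log approximation around r = 1 (enzyme level unchanged)."""
    return 1.0 + EPS0 * math.log(r)

def linlog_elasticity(r):
    """Elasticity of the lin-log rate law at relative concentration r."""
    return EPS0 / (1.0 + EPS0 * math.log(r))

print("  r   hyperbolic  lin-log  linear  lin-log elasticity")
for r in (0.5, 1.0, 2.0, 3.0, 5.0):
    print("%4.1f %10.3f %8.3f %7.3f %10.3f"
          % (r, hyperbolic(r), linlog(r), linear(r), linlog_elasticity(r)))
```

All three rate laws coincide at the reference point. Away from it, the lin-log curve stays much closer to the hyperbolic one than the linear approximation does, and the lin-log elasticity decreases as the metabolite level rises.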
16.5.2 Steady State Metabolite and Flux Function for Large Perturbations
We can obtain the steady state metabolite and flux solution by combining the independent steady state mass balances (Equation 16.1d) with lin-log kinetics (Equation 16.10b). An equivalent, more elegant, parallel approach for the flux equation is possible using Equations 16.7 and 16.9. Rewriting Equation 16.10b gives in steady state (V = J):
$$\frac{J}{J^o}\frac{e^o}{e} - i = E^{xo}\ln\frac{X}{X^o} + E^{co}\ln\frac{c}{c^o} \tag{16.10c}$$
In this relation $((J/J^o)(e^o/e))$ is an n × 1 column vector with $((J_i/J_i^o)(e_i^o/e_i))$ as elements. If we can eliminate $\ln(X/X^o)$ from Equation 16.10c, we obtain the desired steady state flux solution. This is simply achieved by premultiplying Equation 16.10c with the flux control matrix $C^{Jo}$:
$$C^{Jo}\frac{J}{J^o}\frac{e^o}{e} = C^{Jo}\,i + C^{Jo}E^{xo}\ln\frac{X}{X^o} + C^{Jo}E^{co}\ln\frac{c}{c^o} \tag{16.11a}$$
Due to the connectivity relation (Equation 16.9d) the ln (X/Xo) term disappears and the term before ln(c/co) can be recognized to be equal to the flux response matrix (Equation 16.9b). The use of the flux summation relation, CJoi = i, gives the desired flux relation:
$$C^{Jo}\frac{J}{J^o}\frac{e^o}{e} = i + R^{Jo}\ln\frac{c}{c^o} \tag{16.11b}$$
In the absence of perturbations in e ($e^o/e = i$) and in the independent metabolites ($c/c^o = i$), Equation 16.11b reduces to the constraint equation on rates as proposed by Delgado and Liao (1991, 1992). In a similar way one can obtain the metabolite solution, by premultiplying Equation 16.10c with the metabolite control matrix $C^{xo}$:
$$\ln\frac{X}{X^o} = -C^{xo}\frac{J}{J^o}\frac{e^o}{e} + R^{xo}\ln\frac{c}{c^o} \tag{16.12}$$
Equation 16.11b allows calculating the new flux J upon large perturbations in the levels of (different) enzymes and/or independent concentrations. Using these new fluxes, Equation 16.12 gives the new metabolite levels. Instrumental in these equations is that one knows the flux and metabolite control and response coefficients CJo, RJo, Cxo, Rxo. This essentially requires knowledge of network stoichiometry (S) and of enzyme kinetics Exo, Eco, [Jo] (see Equations 16.7a, b, and 16.9a, b). A final interesting result is the so-called design equation (Visser et al., 2004), which gives, for a selected set of desired fluxes and metabolite levels, the required changes in enzyme levels. The validity of the flux and metabolite equations (Equations 16.11b and 16.12) of course depends on the adequacy of the lin-log approximation.
• For a branched pathway with conserved moieties and strongly nonlinear kinetics it was shown that lin-log kinetics allowed adequate dynamic modeling and accurate redesign (using the design equation, Visser and Heijnen, 2003 and Visser et al., 2004).
• A model of E. coli glycolysis was successfully approximated using lin-log kinetics and also here (in silico) redesign of a product pathway was successful (Visser et al., 2004).
In order to apply the results (Equations 16.10b, 16.11b, and 16.12) of the lin-log kinetics it is very important to address the problem of how to obtain the parameters (Exo, Eco, Jo) in these equations.
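To make the use of Equations 16.11b and 16.12 concrete, the sketch below predicts the new flux and metabolite levels of a three-step linear pathway after a large change in one enzyme level. It is an illustration under stated assumptions only: the elasticities are the values of the linear-pathway example in Section 16.6.1.1, reference fluxes are scaled to 1, independent concentrations are unperturbed ($\ln(c/c^o) = 0$), and the helpers `matmul`, `inv2` and `predict` are hypothetical names. Because all steps of a linear pathway carry the same flux, one row of Equation 16.11b suffices to solve for J/Jo.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Invert a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Linear pathway S -> X1 -> X2 -> P; reference fluxes scaled to 1, so S[Jo] = S.
S = [[1, -1, 0], [0, 1, -1]]
Exo = [[-0.40, 0.00], [0.70, -0.30], [0.00, 0.50]]  # assumed elasticities

# Equations 16.7a and 16.9a.
Cxo = matmul(inv2([[-v for v in row] for row in matmul(S, Exo)]), S)
ExC = matmul(Exo, Cxo)
CJo = [[(1.0 if i == j else 0.0) + ExC[i][j] for j in range(3)]
       for i in range(3)]

def predict(e_rel):
    """Return (J/Jo, [X1/X1o, X2/X2o]) for relative enzyme levels e_rel."""
    inv_e = [1.0 / r for r in e_rel]
    # Equation 16.11b, first row, with a common pathway flux and ln(c/co) = 0.
    j_rel = 1.0 / sum(c * w for c, w in zip(CJo[0], inv_e))
    # Vector with elements (J/Jo)(eo_i/e_i).
    y = [j_rel * w for w in inv_e]
    # Equation 16.12: ln(X/Xo) = -Cxo (J/Jo)(eo/e).
    ln_x = [-sum(c * v for c, v in zip(row, y)) for row in Cxo]
    return j_rel, [math.exp(v) for v in ln_x]

e_rel = [2.0, 1.0, 1.0]  # enzyme 1 doubled
j_rel, x_rel = predict(e_rel)
print("J/Jo = %.3f, X1/X1o = %.3f, X2/X2o = %.3f" % (j_rel, x_rel[0], x_rel[1]))

# Consistency check: every lin-log rate (Equation 16.10a) must equal the flux.
rates = [e * (1.0 + sum(eps * math.log(x) for eps, x in zip(row, x_rel)))
         for e, row in zip(e_rel, Exo)]
print("lin-log rates:", [round(v, 3) for v in rates])
```

Under these assumptions, doubling enzyme 1 gives a flux increase of roughly 35% with both intermediates about doubling, which is close to the data used for perturbation 1 in Section 16.6.1.1.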
16.6 Parameterisation of the Metabolic Reaction Network Model
In order to use the derived equations to predict fluxes and metabolites it is essential to obtain the in vivo kinetic parameters of each enzyme in the metabolic network. This requires perturbations of metabolic systems, which can be steady state and/or dynamic perturbations.
16.6.1 Steady State Perturbation Experiments
In steady state perturbation experiments one uses genetic engineering techniques and/or environmental perturbations to generate sets of perturbed enzyme levels (e/eo). Each different steady state can then be analyzed for the changed fluxes and intracellular metabolite levels. When genetic engineering techniques are used one often modifies the expression of one target gene. The mutant organism is then grown and one usually assumes that only the enzyme corresponding to the target gene has changed, while all other enzymes in the network have not changed. This assumption completely disregards genetic control mechanisms, which are triggered by the changed metabolite levels and fluxes in the mutant organism. Consequently, all enzyme levels must be quantified in each mutant organism. The pioneering study of Niederberger et al. (1992) on the tryptophan synthesis pathway shows convincingly that many nontarget enzymes change levels due to genetic control mechanisms. Alternatively one often uses inhibitors to modify the activities of selected enzymes. However, the precise change in enzyme activity is difficult to quantify, because the intracellular level of inhibitor is difficult to control (due to possibly complicated transport behavior of the inhibitor, such as active import or export). Finally one should realize that measuring intracellular metabolite concentrations is a tedious task (requiring rapid sampling, quenching, and advanced MS/MS analysis). It therefore often happens that metabolite levels in the perturbed states are not measured, but only the new fluxes, which only requires uptake measurements and flux balance calculations. Metabolite levels are then not available. In this situation one can only quantify the flux control coefficients CJo. It is not possible to obtain elasticity parameters or metabolite control coefficients Cxo.
Only when the mutants are in addition analyzed for intracellular metabolites ($X_j/X_j^o$) can one obtain the elasticity parameters, from which CJ and Cx can be calculated. In the following sections methods of parameterisation are presented for linear (Section 16.6.1.1) and branched pathways (Section 16.6.1.2).
16.6.1.1 Steady State Linear Pathways
Consider the linear pathway shown in Figure 16.4. In this pathway the steady state reference flux $J^o$ is the same for all enzymes, and the reference concentrations are $e_1^o, e_2^o, e_3^o$ for the enzymes, $X_1^o, X_2^o$ for the dependent metabolites and $C_s^o$ and $C_p^o$ for the independent extracellular metabolites. Reaction 1 is activated by substrate ($\varepsilon^o_{1s} > 0$), and negatively affected by metabolites $X_1$ ($\varepsilon^o_{11} < 0$) (e.g., due to reversibility) and $X_2$ (allosteric, $\varepsilon^o_{12} < 0$). Reaction 2 is activated by metabolite $X_1$ ($\varepsilon^o_{21} > 0$) and inhibited by $X_2$ ($\varepsilon^o_{22} < 0$). Reaction 3 is similar to reaction 2 ($\varepsilon^o_{32} > 0$, $\varepsilon^o_{3p} < 0$). We can now distinguish three experimental situations, with respect to available measurements.
• Only flux and enzyme measurements
In this situation one can derive the flux equation (Equation 16.11b) for the linear path where the flux control coefficients can be found as parameters (Figure 16.5). In this pathway (three enzymes) there are nine CJ-coefficients (3 × 3 CJ-matrix); however, this matrix has only one independent row (because the flux for each enzyme in steady state is the same). This results
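[Figure 16.4: linear pathway S → X1 → X2 → P, catalysed by enzymes e1, e2, e3 with rates v1, v2, v3, and its lin-log kinetics:]
$$\frac{V_1}{J^o} = \frac{e_1}{e_1^o}\left(1 + \varepsilon^o_{11}\ln\frac{X_1}{X_1^o} + \varepsilon^o_{12}\ln\frac{X_2}{X_2^o} + \varepsilon^o_{1s}\ln\frac{C_s}{C_s^o}\right)$$
$$\frac{V_2}{J^o} = \frac{e_2}{e_2^o}\left(1 + \varepsilon^o_{21}\ln\frac{X_1}{X_1^o} + \varepsilon^o_{22}\ln\frac{X_2}{X_2^o}\right)$$
$$\frac{V_3}{J^o} = \frac{e_3}{e_3^o}\left(1 + \varepsilon^o_{32}\ln\frac{X_2}{X_2^o} + \varepsilon^o_{3p}\ln\frac{C_p}{C_p^o}\right)$$
Figure 16.4 A linear pathway and its lin-log kinetics.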
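[Figure 16.5: relative flux $J/J^o$ plotted against relative enzyme level $e_1/e_1^o$, with the asymptote $1/(1 - C_1^J) = 1.33$ marked.]
Figure 16.5 The dependence of the relative flux on the changing relative enzyme level. For this example $C_1^{Jo}$ is taken as 0.25.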
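in $C_1^{Jo}$, $C_2^{Jo}$ and $C_3^{Jo}$ as control coefficients of the flux for enzymes 1, 2 and 3. The flux equation (Equation 16.11b) now follows as:
$$\frac{J}{J^o} = \frac{1 + \varepsilon^o_{1s}C_1^{Jo}\ln\dfrac{C_s}{C_s^o} + \varepsilon^o_{3p}C_3^{Jo}\ln\dfrac{C_p}{C_p^o}}{C_1^{Jo}\dfrac{e_1^o}{e_1} + C_2^{Jo}\dfrac{e_2^o}{e_2} + C_3^{Jo}\dfrac{e_3^o}{e_3}} \tag{16.13a}$$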
This is an interesting result.
In case that S and P are kept constant and equal to the reference steady state values ($C_s = C_s^o$, $C_p = C_p^o$) and that, e.g., only one enzyme (e1) is modified ($e_2 = e_2^o$, $e_3 = e_3^o$), it follows that:
$$\frac{J}{J^o} = \frac{1}{C_1^{Jo}\dfrac{e_1^o}{e_1} + C_2^{Jo} + C_3^{Jo}} \tag{16.13b}$$
Recognizing that the flux summation relation holds, $C_1^{Jo} + C_2^{Jo} + C_3^{Jo} = 1$, it follows:
$$\frac{J}{J^o} = \frac{e_1/e_1^o}{C_1^{Jo} + \left(1 - C_1^{Jo}\right)\left(e_1/e_1^o\right)} \tag{16.13c}$$
The flux relation (Equation 16.13c) shows that
• Upon modification of only one enzyme, e.g., e1, the flux changes hyperbolically with $e_1/e_1^o$. This theoretical result has nearly always been observed in experimental studies (Fell, 1996).
• The maximal flux increase ($e_1/e_1^o \to \infty$) equals $J/J^o = 1/(1 - C_1^{Jo})$, showing that a significant increase is only possible when $C_1^{Jo}$ is close to 1.
• The flux equation (Equation 16.13a) directly allows one to calculate how the flux control coefficient changes with changing enzyme levels, by calculating the derivative $C_i^J = (e_i/J)(\partial J/\partial e_i)$. The result shows that $C^J$ drops at increasing enzyme level (higher $e_1/e_1^o$), as expected. This calculation also shows how $C^J$ depends on all enzymes.
• The flux equation (Equation 16.13a) has a lin-log structure with respect to the lumped reaction S → P and the inverse of the denominator equals the pathway capacity factor. This shows that lin-log kinetics is canonical.
• The parameterisation of the two independent flux control coefficients minimally requires only two perturbations, e.g., in enzymes e1 and e2.
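The hyperbolic flux response of Equation 16.13c, its upper limit and the decline of the flux control coefficient with enzyme level can be tabulated directly. The sketch below assumes the value $C_1^{Jo} = 0.25$ used in Figure 16.5. The function names are hypothetical, and the expression in `flux_control` is obtained here by differentiating Equation 16.13c, so it applies only when enzyme 1 alone is changed.

```python
def relative_flux(e_rel, c1):
    """Equation 16.13c: J/Jo when only enzyme 1 changes (e_rel = e1/e1o)."""
    return e_rel / (c1 + (1.0 - c1) * e_rel)

def flux_control(e_rel, c1):
    """C1^J = (e1/J) dJ/de1, obtained by differentiating Equation 16.13c."""
    return c1 / (c1 + (1.0 - c1) * e_rel)

C1 = 0.25  # reference flux control coefficient, as assumed in Figure 16.5

print("e1/e1o    J/Jo    C1^J")
for e_rel in (0.25, 0.5, 1.0, 2.0, 5.0, 20.0):
    print("%6.2f  %6.3f  %6.3f"
          % (e_rel, relative_flux(e_rel, C1), flux_control(e_rel, C1)))

# Upper limit of the flux for unlimited overexpression of enzyme 1.
print("limit of J/Jo:", round(1.0 / (1.0 - C1), 3))
```

With a reference control coefficient of 0.25 the flux cannot rise by more than one third, however strongly enzyme 1 is overexpressed, and the control exerted by enzyme 1 shrinks as its level increases.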
Equation 16.13b has been applied successfully on experimental data for the penicillin pathway (Van Gulik et al., 2003) and for a reconstituted partial glycolysis pathway (Wu et al., 2004). Figure 16.6 (Van Gulik et al., 2003) shows for the penicillin pathway the experimental measurements (flux and three enzymes, ACVS, IPNS, AT during a fed batch experiment; Nielsen, 1995) and the estimated CJ-coefficients using Equation 16.13b. It appears that IPNS is the rate limiting enzyme.
[Figure 16.6: panel (a) shows the measured levels of ACVS, IPNS and AT and the penicillin flux JP (µmol/g DW/h) against time (hours); panel (b) shows the estimated flux control coefficients C(acvs), C(ipns), C(at) and the estimated flux against time (hours).]
Figure 16.6 Estimating CJ for the penicillin pathway using fed batch data on penicillin flux (Jp) and enzyme levels (ACVS, IPNS, AT). Left panel (a), the available experimental data as measured enzyme levels and flux. Right panel (b), estimated flux control coefficients of the three enzymes and the estimated flux using Equation 16.13a, b.
• Flux, enzyme and metabolite measurements
In this situation one has measurements of enzyme perturbations (created by whatever means) and of the resulting changes in fluxes and all metabolite levels. Although it is possible to use Equations 16.11a and 16.12 to obtain CJo and Cxo, this approach is complicated because of the many dependency relations between CJ-parameters and Cx-parameters (Appendix B). A much more elegant and simpler approach is to enter measured values for e/eo, J/Jo, X/Xo directly in the lin-log equations (Equation 16.10b). This results in a linear set of equations from which the elasticities can be obtained using the powerful tools of linear regression. In Wu et al. (2004) this approach is shown for experimental glycolysis data. Consider the network of Figure 16.4 and assume that only two perturbations are available. For perturbation 1 only the enzyme level of e1 is increased: $e_1/e_1^o = 2.0$, $e_2/e_2^o = 1.0$, $e_3/e_3^o = 1.0$, resulting in $J/J^o = 1.35$, $X_1/X_1^o = 2.25$, $X_2/X_2^o = 2.0$, $C_s/C_s^o = 1$ and $C_p/C_p^o = 1$. This gives three lin-log relations (one for each reaction):
$$1.35 = 2\,(1 + \varepsilon^o_{11}\ln 2.25 + \varepsilon^o_{12}\ln 2.0)$$
$$1.35 = 1\,(1 + \varepsilon^o_{21}\ln 2.25 + \varepsilon^o_{22}\ln 2.0)$$
$$1.35 = 1\,(1 + \varepsilon^o_{32}\ln 2.0)$$
In perturbation 2 enzyme 2 is reduced: $e_1/e_1^o = 1$, $e_2/e_2^o = 0.40$, $e_3/e_3^o = 1.0$, which results in $J/J^o = 0.69$, $X_1/X_1^o = 2.17$, $X_2/X_2^o = 0.53$, $C_s/C_s^o = 1$, $C_p/C_p^o = 1$. This gives three additional relations:
$$0.69 = 1\,(1 + \varepsilon^o_{11}\ln 2.17 + \varepsilon^o_{12}\ln 0.53)$$
$$0.69 = 0.40\,(1 + \varepsilon^o_{21}\ln 2.17 + \varepsilon^o_{22}\ln 0.53)$$
$$0.69 = 1\,(1 + \varepsilon^o_{32}\ln 0.53)$$
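The six lin-log relations above are linear in the elasticities, so each reaction can be fitted separately. The following is a minimal sketch of that regression using only the perturbation data quoted above; the solver `least_squares` (normal equations with Gauss-Jordan elimination) is a hypothetical helper written for this illustration, not the procedure of the cited studies.

```python
import math

def least_squares(A, b):
    """Least-squares solution of A x = b via the normal equations."""
    n = len(A[0])
    # Augmented normal equations [A^T A | A^T b].
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(A, b))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [v - f * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Per perturbation: (e1/e1o, e2/e2o, e3/e3o), J/Jo, (X1/X1o, X2/X2o).
perturbations = [
    ((2.0, 1.0, 1.0), 1.35, (2.25, 2.0)),
    ((1.0, 0.40, 1.0), 0.69, (2.17, 0.53)),
]
# Metabolites (0 = X1, 1 = X2) with a kinetic effect on each reaction.
effectors = {1: (0, 1), 2: (0, 1), 3: (1,)}

elasticities = {}
for rxn, mets in effectors.items():
    # Rearranged lin-log relation: (J/Jo)/(e/eo) - 1 = sum_j eps_ij ln(Xj/Xjo).
    A = [[math.log(x[m]) for m in mets] for _, _, x in perturbations]
    b = [j / e[rxn - 1] - 1.0 for e, j, _ in perturbations]
    for m, eps in zip(mets, least_squares(A, b)):
        elasticities[(rxn, m + 1)] = eps

for key in sorted(elasticities):
    print("eps%d%d = %+.3f" % (key[0], key[1], elasticities[key]))
```

Reactions 1 and 2 each have two unknowns and two relations, so they are solved exactly. Reaction 3 has one unknown and two relations, so its elasticity is a least-squares estimate.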
These six equations, with five unknown elasticities, are easily solved by linear regression, leading to $\varepsilon^o_{11} = -0.40$, $\varepsilon^o_{12} = 0$, $\varepsilon^o_{21} = 0.70$, $\varepsilon^o_{22} = -0.30$, $\varepsilon^o_{32} = 0.50$. (The values for $\varepsilon^o_{1s}$ and $\varepsilon^o_{3p}$ cannot be found because Cs and Cp were not changed.) The obtained elasticity values directly give the CJ and Cx-values using Equations 16.7a and 16.9a. To obtain the elasticities ($\varepsilon^o_{1s}$ and $\varepsilon^o_{3p}$) for the extracellular compounds one needs to perturb Cs and Cp. Then also the response coefficients can be calculated.
• Noncharacterized perturbation
In the so-called co-response method (Hofmeyr and Cornish-Bowden, 1996) it is supposed that one can create an enzyme perturbation $e_i/e_i^o$ which is completely specific, meaning that all other enzymes (j ≠ i) are not changed. In addition one does not know the magnitude of the change in $e_i$ (which is, therefore, called a noncharacterized perturbation). However, one has measured the changed fluxes and metabolites. This situation could arise when applying a totally specific inhibitor for which the inhibition kinetics are not known. However, it is hard to prove that an inhibitor is totally specific for its enzyme. Alternatively one could genetically change an enzyme, while lacking an experimental method to measure the change in $e_i/e_i^o$. In addition it is assumed that all the other enzymes are not changed, which is highly unlikely given the gene regulation effects (see Niederberger et al., 1992). Although the practical applicability of this approach must be severely doubted (because of the many assumptions) it is interesting to show its application. Now one obviously needs more perturbations than the two needed before. We use here the same example as presented by Hofmeyr and Cornish-Bowden (1996) and three perturbations are available:
Perturbation 1 (in e1): $J/J^o = 1.3$, $X_1/X_1^o = 2.12$, $X_2/X_2^o = 1.60$
Perturbation 2 (in e2): $J/J^o = 2.0$, $X_1/X_1^o = 0.32$, $X_2/X_2^o = 4.80$
Perturbation 3 (in e3): $J/J^o = 1.10$, $X_1/X_1^o = 0.90$, $X_2/X_2^o = 15.50$
Each perturbation is specific, but its magnitude is unknown. E.g., for perturbation 1 (in e1), $e_1/e_1^o$ is unknown, but $e_2/e_2^o = 1$ and $e_3/e_3^o = 1$. We can now write the lin-log kinetics for reactions 2 and 3:
Perturbation 1 (only for reactions 2 and 3)
$$1.30 = 1\,(1 + \varepsilon^o_{21}\ln 2.12 + \varepsilon^o_{22}\ln 1.60)$$
$$1.30 = 1\,(1 + \varepsilon^o_{32}\ln 1.60)$$
For reaction 1 the lin-log kinetic relation is not useful because it contains the unknown $e_1/e_1^o$.
Perturbation 2 (only for reactions 1 and 3)
$$2.0 = 1\,(1 + \varepsilon^o_{11}\ln 0.32 + \varepsilon^o_{12}\ln 4.80)$$
$$2.0 = 1\,(1 + \varepsilon^o_{32}\ln 4.80)$$
Perturbation 3 (only for reactions 1 and 2)
$$1.1 = 1\,(1 + \varepsilon^o_{11}\ln 0.90 + \varepsilon^o_{12}\ln 15.50)$$
$$1.1 = 1\,(1 + \varepsilon^o_{21}\ln 0.90 + \varepsilon^o_{22}\ln 15.50)$$
Linear regression leads to the elasticities $\varepsilon^o_{11} = -0.885$, $\varepsilon^o_{12} = 0$, $\varepsilon^o_{21} = 0.434$, $\varepsilon^o_{22} = -0.054$, $\varepsilon^o_{32} = 0.637$, which were also found by Hofmeyr and Cornish-Bowden (1996) using much more complicated matrix operations. The approach presented here is much simpler. It should be noted that one needs only one more perturbation (3 against 2) to obtain all elasticities.
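16.6.1.2 Steady State Branched Pathway
The branch point pathway is shown in Figure 16.7.
[Figure 16.7: branch point in which reaction 1 (enzyme e1, rate V1) produces metabolite X, which is consumed by reactions 2 and 3 (enzymes e2 and e3, rates V2 and V3), with the lin-log kinetic relations:]
$$\frac{J_1}{J_1^o} = \frac{e_1}{e_1^o}\left(1 + \varepsilon^o_1\ln\frac{X}{X^o}\right),\quad \frac{J_2}{J_2^o} = \frac{e_2}{e_2^o}\left(1 + \varepsilon^o_2\ln\frac{X}{X^o}\right),\quad \frac{J_3}{J_3^o} = \frac{e_3}{e_3^o}\left(1 + \varepsilon^o_3\ln\frac{X}{X^o}\right)$$
Figure 16.7 Branch point and lin-log kinetic relations for the three reactions.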
In this pathway there is only one metabolite (X). Usually this metabolite inhibits the branch feed reaction ($\varepsilon_1 < 0$) and stimulates reactions 2 and 3 ($\varepsilon_2 > 0$, $\varepsilon_3 > 0$). We can now distinguish different experimental measurement situations following perturbations, as before (Section 16.6.1.1).
• Only flux and enzyme measurements
We can eliminate $\ln(X/X^o)$ from the three kinetic equations (Figure 16.7), leading to two equations involving only enzyme and flux ratios.
$$\frac{J_1}{J_1^o}\frac{e_1^o}{e_1} - 1 = \frac{\varepsilon^o_1}{\varepsilon^o_2}\left(\frac{J_2}{J_2^o}\frac{e_2^o}{e_2} - 1\right)$$
$$\frac{J_2}{J_2^o}\frac{e_2^o}{e_2} - 1 = \frac{\varepsilon^o_2}{\varepsilon^o_3}\left(\frac{J_3}{J_3^o}\frac{e_3^o}{e_3} - 1\right)$$
In these two equations one can enter the flux results from a single perturbation experiment, e.g., in enzyme 1:
$e_1/e_1^o = 2.0$, $e_2/e_2^o = 1$, $e_3/e_3^o = 1$, $J_1/J_1^o = 2.86/2.0$, $J_2/J_2^o = 1.29/1$, $J_3/J_3^o = 1.57/1$
Entering these results in the two relations gives $\varepsilon^o_2/\varepsilon^o_3 = 0.50$ and $\varepsilon^o_1/\varepsilon^o_2 = -1.00$. The above two relations together with the mass balance ($J_1 = J_2 + J_3$) give the relations for each $J_i/J_i^o$ as a function of the three enzyme levels with only the two elasticity ratios as parameters. It has also been shown (Heijnen et al., 2004) that these two ε-ratios completely define the nine CJ-values of this network. The relations are given in Appendix B.
• Metabolite, enzyme and flux measurements
Assume that in the above perturbation also the change in metabolite was measured, $X/X^o = 1.77$. We can now enter all information in the lin-log kinetic relations for the three reactions.
$$\frac{2.86}{2} = 2\,(1 + \varepsilon^o_1\ln 1.77)$$
$$\frac{1.29}{1} = 1\,(1 + \varepsilon^o_2\ln 1.77)$$
$$\frac{1.57}{1} = 1\,(1 + \varepsilon^o_3\ln 1.77)$$
This directly gives the three elasticity values $\varepsilon^o_1 = -0.50$, $\varepsilon^o_2 = 0.50$, $\varepsilon^o_3 = 1$, which allow all nine CJ and three Cx-values to be calculated (see Appendix B). Note that the ε-ratios previously obtained agree with these values and that minimally only one perturbation is required to obtain all ε-values. Of course more perturbations lead to ε-values with smaller error (Heijnen et al., 2004).
• Noncharacterized perturbations
Suppose we perform a noncharacterized perturbation (see also Section 16.6.1.1) in reaction 1 (hence $e_1/e_1^o$ is not known, but $e_2/e_2^o = 1$ and $e_3/e_3^o = 1$). The perturbation leads to $J_1/J_1^o = 2.86/2$, $J_2/J_2^o = 1.29/1$, $J_3/J_3^o = 1.57/1$, $X/X^o = 1.77$. For reactions 2 and 3 the lin-log kinetics follow:
$$\frac{1.29}{1} = 1\,(1 + \varepsilon^o_2\ln 1.77)$$
$$\frac{1.57}{1} = 1\,(1 + \varepsilon^o_3\ln 1.77)$$
Next we perform a second noncharacterized perturbation, but now in reaction 2, leading to $J_1/J_1^o = 2.33/2$, $J_2/J_2^o = 1.67/1$, $J_3/J_3^o = 0.67/1$, $X/X^o = 0.72$. For reactions 1 and 3 the lin-log kinetic relations follow:
$$\frac{2.33}{2} = 1\,(1 + \varepsilon^o_1\ln 0.72)$$
$$\frac{0.67}{1} = 1\,(1 + \varepsilon^o_3\ln 0.72)$$
Solving these four equations for the elasticities gives $\varepsilon^o_1 = -0.50$, $\varepsilon^o_2 = 0.50$, $\varepsilon^o_3 = 1.0$, as before, leading (see Appendix B) to all CJ and Cx-values. A remarkable aspect of noncharacterized perturbations is that, when only flux measurements are available, without e/eo information, one can still obtain CJ as follows. For the two perturbations one enters the results in the equations derived before (see "only flux and enzyme measurements" above).
Noncharacterized perturbation in e1 (hence $e_2/e_2^o = 1$, $e_3/e_3^o = 1$):
$$\frac{1.29}{1}\times 1 - 1 = \frac{\varepsilon^o_2}{\varepsilon^o_3}\left(\frac{1.57}{1}\times 1 - 1\right)$$
Noncharacterized perturbation in e2 (hence $e_1/e_1^o = 1$, $e_3/e_3^o = 1$):
$$\frac{2.33}{2}\times 1 - 1 = \frac{\varepsilon^o_1}{\varepsilon^o_3}\left(\frac{0.67}{1}\times 1 - 1\right)$$
This gives $\varepsilon^o_2/\varepsilon^o_3 = 0.50$ and $\varepsilon^o_1/\varepsilon^o_3 = -0.50$, which agrees with the first approach (only flux and enzyme measurements) and gives all CJ-values according to Appendix B. Note that one needs one characterized flux perturbation (knowing e/eo and J/Jo) or two noncharacterized perturbations (knowing only J/Jo) to solve the branch. This approach has been applied to experimental branch point data (lysine production, glutamate production, and glycolysis) in a recent paper by Heijnen et al. (2004).
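The branch point calculations can be retraced in a few lines. This sketch uses only the flux, enzyme and metabolite ratios quoted in this section. The function names `elasticity` and `elasticity_ratio` are hypothetical, and each is simply the corresponding lin-log relation solved for the unknown.

```python
import math

def elasticity(j_rel, e_rel, x_rel):
    """Solve J/Jo = (e/eo)(1 + eps ln(X/Xo)) for eps."""
    return (j_rel / e_rel - 1.0) / math.log(x_rel)

def elasticity_ratio(j_a, e_a, j_b, e_b):
    """eps_a/eps_b from (J_a/J_a^o)(e_a^o/e_a) - 1 = (eps_a/eps_b)((J_b/J_b^o)(e_b^o/e_b) - 1)."""
    return (j_a / e_a - 1.0) / (j_b / e_b - 1.0)

# Characterized perturbation in enzyme 1 (e1/e1o = 2, X/Xo = 1.77).
j1 = [2.86 / 2.0, 1.29 / 1.0, 1.57 / 1.0]
e1 = [2.0, 1.0, 1.0]
eps = [elasticity(j, e, 1.77) for j, e in zip(j1, e1)]
print("elasticities:", [round(v, 3) for v in eps])

# The same perturbation with flux and enzyme data only gives elasticity ratios.
r12 = elasticity_ratio(j1[0], e1[0], j1[1], e1[1])
r23 = elasticity_ratio(j1[1], e1[1], j1[2], e1[2])
print("eps1/eps2 = %.3f, eps2/eps3 = %.3f" % (r12, r23))

# Noncharacterized perturbation in enzyme 2 (e2/e2o unknown, X/Xo = 0.72):
# only reactions 1 and 3, whose enzyme levels are unchanged, can be used.
j2 = [2.33 / 2.0, 1.67 / 1.0, 0.67 / 1.0]
eps1_b = elasticity(j2[0], 1.0, 0.72)
eps3_b = elasticity(j2[2], 1.0, 0.72)
r13 = elasticity_ratio(j2[0], 1.0, j2[2], 1.0)
print("eps1 = %.3f, eps3 = %.3f, eps1/eps3 = %.3f" % (eps1_b, eps3_b, r13))
```

Within rounding of the input data, the elasticities come out near -0.5, 0.5 and 1.0 and the ratios near -1.0, 0.5 and -0.5, in line with the values quoted in the text.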
16.6.2 Dynamic Perturbation Experiments
16.6.2.1 The Measurement Problem for Steady State Perturbation Experiments
For the previously mentioned steady state perturbation experiments of metabolic networks one needs different measurements:
Extracellular concentrations
These concentrations give (using proper mass balances) the uptake/secretion rates. These rates are used in flux balance analysis to calculate the steady state fluxes. This gives $J_i/J_i^o$.
Enzyme activities
In a mutant in which a target gene has been changed, leading to a change in enzyme k, $e_k/e_k^o$, a measurement must be done for each individual other enzyme in the network to provide the other values $e_i/e_i^o$. It is not sufficient to quantify only the change in, e.g., the enzyme whose level was modified using genetic techniques. Due to genetic regulation mechanisms, which respond to the changed metabolite levels, in principle all enzyme levels (and not only the target enzyme) have changed in the mutant (Niederberger et al., 1992). In practice this poses severe problems. Often traditional enzyme activity assays are the only method of quantification, but for many enzymes these are not available. Here the recent developments in protein identification and quantification using mass spectrometry (Groot et al., 2007) are a big leap forward. We have seen (Section 16.6.1) that in principle, using $J_i/J_i^o$ and $e_i/e_i^o$ data, one can obtain the flux control coefficients CJ. It is not possible to obtain Cx; for this one needs metabolite measurements.
Intracellular metabolite measurements
For each perturbed steady state one needs to measure all concentrations of the intracellular metabolites, to provide $X_j/X_j^o$. The experimental effort for intracellular metabolite measurements is significant. Due to rapid turnover of metabolites one needs rapid sampling and quenching of the biomass, such as the cold methanol method (Lange et al., 2001). Subsequently one needs to wash the biomass under cold conditions, to remove the extracellularly present metabolites. Then the washed biomass is extracted for intracellular metabolites (using, e.g., the boiling ethanol method). Finally the intracellular metabolites must be quantitatively analysed. This requires sophisticated MS/MS techniques and the use of 13C standards (Dam et al., 2002, Wu et al., 2005, Mashego et al., 2004). The combined data of $J/J^o$, $X_j/X_j^o$ and $e_i/e_i^o$ allow calculation of the metabolite concentration control coefficients (Cx) and flux control coefficients CJ. This is called the direct approach. CJ and Cx values can then be used to obtain the elasticity coefficients using a general matrix equation (Westerhoff and Kell, 1987). It is, however, much simpler to use $J_i/J_i^o$, $e_i/e_i^o$, $X_j/X_j^o$, $C_k/C_k^o$ directly with lin-log kinetics to obtain the elasticities using linear regression as shown in Section 16.6.1.
The direct lin-log regression avoids the nasty problem of (usually unknown) dependencies between the $C^J$ and $C^x$ parameters (Appendix B), which must be taken into account when one applies the direct method. It is clear that to obtain all elasticities from steady state perturbation experiments one needs extensive quantitative datasets of fluxes, enzyme levels and metabolite levels. This is a formidable task.

16.6.2.2 Dynamic Perturbations Only Require Metabolite Measurements

Dynamic perturbation experiments are an interesting alternative for obtaining ε-values. In such experiments the organism in steady state is perturbed extracellularly. The perturbation can be the addition of substrate, a switch of electron acceptor, the addition of an inhibitor, a change in dissolved O2 or CO2, a change in pH, etc. Subsequently the dynamic pattern of intracellular metabolites is measured in a short time frame, e.g., during a hundred seconds. In these rapid pulse experiments the enzyme activity is considered not to change because of the short time (Rizzi et al., 1997; Theobald et al., 1997; Vaseghi et al., 1999; Kresnowati et al., 2006). This method requires only extracellular and intracellular metabolite analysis; enzyme activity levels are not needed (being constant). Recently it was shown that these rapid pulse experiments can be performed in a mini (3 ml) reactor, called the Bioscope (Visser et al., 2002; Mashego et al., 2006). The Bioscope is fed from the chemostat with a constant (about 1 ml/min) broth stream containing steady state biomass, which is perturbed and sampled in the Bioscope. Compared to performing the pulse in the fermentor this is highly advantageous, because many different pulses can be performed with biomass from the same steady state chemostat. Also the amount of sample per time point is unlimited. Delgado and Liao (1991, 1992) have shown that the $C^J$ and $C^x$ parameters can be obtained directly from such concentration time traces.
However, it would be more convenient to obtain the elasticity parameters directly from such rapid pulse experiments. The data set of dynamic concentrations can be used for the parametrization of a classical nonlinear model of, e.g., glycolysis (Rizzi et al., 1997; Chassagnole et al., 2002). For a given steady state, such a nonlinear kinetic model then allows the calculation of the elasticities, followed by the $C^J$ and $C^x$ values. The key problem is that parameter estimation in such nonlinear models is troublesome (Degenring et al., 2004; Haunschild et al., 2005, 2006; Nikerel et al., 2008; Wahl et al., 2006; Wiechert, 2002; Wiechert and Takors, 2004). The reason is that a nonlinear parameter estimation algorithm needs an initial guess of the parameters (which is not available), so that the best global estimate is not guaranteed.
In addition many parameters are hardly identifiable, which leads to model reduction afterwards. Kresnowati et al. (2005) and Nikerel et al. (2006) have shown that the direct use of lin-log kinetics enables the direct estimation of elasticities. The key property of lin-log kinetics is that the elasticity parameters appear linearly in the equations, whereas the concentrations appear nonlinearly (as logarithms). A dynamic experiment is then described (using lin-log kinetics) by the independent mass balances for the intracellular metabolites (Equation 16.1b):

$$\frac{dX}{dt} = S\,[J^o]\left[E^{xo}\ln\left(\frac{X}{X^o}\right) + E^{co}\ln\left(\frac{c}{c^o}\right)\right] \tag{16.14a}$$

In addition there are the mass balances for the extracellular concentrations, which contain the transport terms ($DC_{in}$ for transport in and $DC$ for transport out) and the biomass concentration $C_x$:

$$\frac{dC}{dt} = C_x S_c\,[J^o]\left[E^{xo}\ln\left(\frac{X}{X^o}\right) + E^{co}\ln\left(\frac{c}{c^o}\right)\right] + DC_{in} - DC \tag{16.14b}$$
Integration of the left- and right-hand sides of these equations over the time intervals between samples (using a linear approximation of $X/X^o$ to calculate the integral of $\ln(X/X^o)$) leads to a set of equations that are linear in the elasticities. Linear regression then gives a first estimate of the ε-parameters. This estimate is the initial parameter set for a conventional nonlinear parameter estimation algorithm (Nikerel et al., 2006, 2007, 2008).

Figure 16.8 shows an example toy network (a branch) which has two intracellular metabolites (X1, X2), a substrate S and two products P1 and P2. There is inhibition by X2 of reaction 1 and by P2 of reaction 3. The kinetics are highly nonlinear, as shown. The stoichiometry matrix ($S_c$, $S$) is shown, together with a reference steady state in which the indicated elasticities hold. Using the elasticities and the reference fluxes a lin-log model is constructed. The steady state in a chemostat is perturbed at t = 0 by shifting S (1 → 2), X1 (2 → 1.5), X2 (1 → 1.5) and P1 (1 → 0.8). Figure 16.8 shows the calculated metabolite response from the original model (dots) and from the lin-log kinetics (line). Clearly, lin-log kinetics describes these large perturbations very well. Subsequently these metabolite data points were used to estimate the elasticity set using the above approach (assuming that kinetic knowledge allowed certain elasticities in $E^{c,x}$ to be set to zero, see Figure 16.8). Figure 16.9 shows the estimated elasticities, which are very close to the expected values. The dynamic lin-log model with the estimated ε-values also performs very well.

Recently glycolysis was studied in silico (Nikerel et al., 2006, 2008) with respect to estimating the elasticities from dynamic concentration data and a lin-log model. It appeared that not all elasticities could be identified. However, owing to the linear parameter character of lin-log kinetics, the required model reduction could be performed a priori, using only dynamic metabolite measurement data.
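The integrate-then-regress idea can be sketched for the smallest possible case. The example below assumes a single intracellular metabolite with relative concentration $x = X/X^o$, whose lin-log balance reduces to $dx/dt = k\,[a \ln x + b \ln(c/c^o)]$, where $k = J^o/X^o$, $a$ is the net elasticity towards $X$ and $b$ the elasticity towards an extracellular metabolite. The pulse shape, the parameter values and all names are hypothetical, and the integrals are approximated with the trapezoidal rule on the logarithms, which is a simplification of the linear interpolation of $X/X^o$ mentioned above.

```python
import math

K = 1.0                      # J^o / X^o in 1/s (hypothetical)
A_TRUE, B_TRUE = -0.6, 0.9   # hypothetical elasticities used to generate data

def c_rel(t):
    """Hypothetical extracellular pulse: c/c^o jumps to 2 and relaxes to 1."""
    return 1.0 + math.exp(-0.2 * t)

def dxdt(t, x):
    """Lin-log mass balance for the relative concentration x = X/X^o."""
    return K * (A_TRUE * math.log(x) + B_TRUE * math.log(c_rel(t)))

def rk4_step(t, x, h):
    """One classical Runge-Kutta step of size h."""
    k1 = dxdt(t, x)
    k2 = dxdt(t + h / 2, x + h * k1 / 2)
    k3 = dxdt(t + h / 2, x + h * k2 / 2)
    k4 = dxdt(t + h, x + h * k3)
    return x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Generate "measurements" every 0.1 s for 20 s (internal step 0.01 s).
dt_sample, steps_per_sample, h = 0.1, 10, 0.01
times, xs = [0.0], [1.0]
for i in range(200):
    t, x = times[-1], xs[-1]
    for s in range(steps_per_sample):
        x = rk4_step(t + s * h, x, h)
    times.append((i + 1) * dt_sample)
    xs.append(x)

# Integrate both sides over each sampling interval (trapezoidal rule):
# x[k+1] - x[k] = a * K * int(ln x) dt + b * K * int(ln c/c^o) dt
rows = []
for k in range(len(times) - 1):
    dt = times[k + 1] - times[k]
    int_ln_x = 0.5 * dt * (math.log(xs[k]) + math.log(xs[k + 1]))
    int_ln_c = 0.5 * dt * (math.log(c_rel(times[k])) + math.log(c_rel(times[k + 1])))
    rows.append((K * int_ln_x, K * int_ln_c, xs[k + 1] - xs[k]))

# Linear least squares for (a, b) via the 2x2 normal equations.
s11 = sum(p * p for p, _, _ in rows)
s12 = sum(p * q for p, q, _ in rows)
s22 = sum(q * q for _, q, _ in rows)
s1y = sum(p * y for p, _, y in rows)
s2y = sum(q * y for _, q, y in rows)
det = s11 * s22 - s12 * s12
a_est = (s1y * s22 - s2y * s12) / det
b_est = (s11 * s2y - s12 * s1y) / det

print(f"a = {a_est:.3f} (true {A_TRUE}), b = {b_est:.3f} (true {B_TRUE})")
```

In this sketch only the net elasticity $a$ of the producing and consuming reactions towards $X$ appears in the balance, so the two individual elasticities cannot be separated from these data alone. This is a small instance of the identifiability issue discussed in the text.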
In the glycolysis study it was also shown that all parameters could be identified using a proper combination of dynamic and steady state perturbations, which were used simultaneously for the estimation of the elasticities. Once the elasticities of the metabolic reaction network have been obtained, lin-log kinetics, together with the reference fluxes, provides a complete dynamic model to be used for simulation and network optimization (Visser et al., 2004). The lin-log model also directly allows:

• The calculation of $C^J$ and $C^X$.
• The calculation of large changes in X and J upon large changes in enzyme levels.
• The inverse calculation, where the new J and X are specified and the design equation gives the required changes in enzyme levels ($e/e^o$) (Visser and Heijnen, 2003).
• Unravelling of silent mutations (Raamsdonk et al., 2001) using only flux and metabolite data (Wu et al., 2005).

This approach seems a very powerful way to use metabolome and flux data for detailed functional genomics (Wu et al., 2005).
[Figure 16.8 graphic not reproduced: network diagram (S → X1 via v1, X1 → X2 via v2, X2 → P2 via v3, X1 → P1 via v4) with its nonlinear rate equations, and a plot of S, X1, X2, P1, P2 versus time (0 to 30). The stoichiometric matrices and reference elasticities recoverable from the figure are listed below.]

Stoichiometric matrices (rows S, P1, P2 form Sc; rows X1, X2 form S)

        v1     v2     v3     v4
S       –1      0      0      0
P1       0      0      0      1
P2       0      0      1      0
X1       1     –1      0     –1
X2       0      1     –1      0

Reference elasticities

        S      P1     P2     X1     X2
v1      0.50   0      0      0      0
v2      0      0      0      0.5   –0.25
v3      0      0     –1.86   0      0.5
v4      0      0      0      1.5    0
Figure 16.8 Toy metabolic network with nonlinear kinetics, the stoichiometric matrix, the used reference steady state, generated perturbation data (•), calculated reference elasticities and simulation of the same perturbation with the calculated elasticities (—).
16.7 Conclusion and Outlook

It has been shown that traditional (small perturbation) MCA can easily be extended, using lin-log kinetics, to a full nonlinear kinetic model which describes large perturbations well. This kinetic model has the MCA elasticities (and $J^o$) as kinetic parameters. The model is nonlinear in concentrations, but linear in parameters. This last property is the key difference from traditional nonlinear enzyme kinetics (which are nonlinear in both concentrations and parameters). The parameter linearity of lin-log kinetics allows the powerful toolbox of linear algebra to be used for the estimation of the parameters (elasticities). In addition this parameter linearity reveals parameter identifiability problems and allows a priori model reduction. Finally, owing to the linear parameter character of lin-log kinetics, these methods of parameter identification scale favorably to large, realistic reaction networks.
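[Figure 16.9 graphic not reproduced: plot of S, X1, X2, P1, P2 versus time (0 to 30). The estimated elasticities recoverable from the figure are listed below.]

Estimated elasticities

        S      P1     P2     X1     X2
v1      0.49   0      0      0      0
v2      0      0      0      0.47  –0.26
v3      0      0     –1.68   0      0.40
v4      0      0      0      1.47   0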
Figure 16.9 Generated dynamic perturbation data using mechanistic model (•), simulation of the perturbation of the toy network using the estimated elasticities of the lin-log kinetic (—) and the estimated elasticities.
In future the aspects of noise and experimental design need to be studied more extensively. Even more challenging is the design of in vivo experiments from which elasticities can be extracted in an unbiased way, by allowing all entries in $E^{xo}$ and $E^{co}$ to differ from zero (inverse engineering). This allows an unbiased investigation of metabolite/enzyme allosteric interactions. These results are all very relevant for the quantitative description of metabolic reaction networks. However, enzyme levels/activities are present in these models as parameters. In reality there is a coupling between metabolite status and gene expression and (de)phosphorylation cascades of enzymes. This regulation level must also be put into a convenient mathematical framework. The challenge will be to use model formats which are scalable to large networks and which accommodate large perturbations, easy parameter estimation and identifiability studies. In the past, the power law approach (Savageau, 1976; Voit, 2000) has shown promise here.
Appendix A Enzyme Kinetics in the Presence of Conserved Moieties

Assume that the reaction rate of an enzyme is influenced by an intracellular metabolite X1, three metabolites which belong to a conserved moiety (X2, X3, X4) and an extracellular metabolite present in concentration Cs (Equation 16.1). We can then write for the reaction rate V:

V = f(X1, X2, X3, X4, Cs, enzyme, parameters)

This kinetic equation is linearized (using the approach in Section 16.4.4) around a steady state:

$$\frac{V}{J^o} - 1 = \left(\frac{e}{e^o} - 1\right) + \varepsilon_1^o\left(\frac{X_1}{X_1^o} - 1\right) + \varepsilon_2^o\left(\frac{X_2}{X_2^o} - 1\right) + \varepsilon_3^o\left(\frac{X_3}{X_3^o} - 1\right) + \varepsilon_4^o\left(\frac{X_4}{X_4^o} - 1\right) + \varepsilon_s^o\left(\frac{C_s}{C_s^o} - 1\right) \tag{A.1a}$$

There is also a conserved moiety sum with sum total T:

$$X_2 + X_3 + X_4 = T \tag{A.2a}$$
In the reference steady state the conserved moiety sum follows as

$$X_2^o + X_3^o + X_4^o = T^o \tag{A.2b}$$

Here $T^o$ is the conserved moiety sum in the reference steady state, which can be perturbed, hence the general sum T. We can combine Equations A.2a and A.2b (by subtraction) and rewrite:

$$X_2^o\left(\frac{X_2}{X_2^o} - 1\right) + X_3^o\left(\frac{X_3}{X_3^o} - 1\right) + X_4^o\left(\frac{X_4}{X_4^o} - 1\right) = T^o\left(\frac{T}{T^o} - 1\right) \tag{A.2c}$$
To obtain independent mass balances the stoichiometric matrix S was reduced by eliminating a row corresponding to a chosen metabolite present in the conserved moiety sum. The dependent metabolite vector is reduced accordingly (matrix $S_{full}$ becomes $S$ and vector $X_{full}$ becomes vector $X$). We now also have to eliminate the removed metabolite from the enzyme kinetic relations in which it appears. If we assume that X4 is the removed metabolite, then we have to adapt Equation A.1a, using Equation A.2c to eliminate $(X_4/X_4^o - 1)$. This gives the new kinetic equation:

$$\frac{V}{J^o} - 1 = \left(\frac{e}{e^o} - 1\right) + \varepsilon_1^o\left(\frac{X_1}{X_1^o} - 1\right) + \left(\varepsilon_2^o - \varepsilon_4^o\frac{X_2^o}{X_4^o}\right)\left(\frac{X_2}{X_2^o} - 1\right) + \left(\varepsilon_3^o - \varepsilon_4^o\frac{X_3^o}{X_4^o}\right)\left(\frac{X_3}{X_3^o} - 1\right) + \varepsilon_s^o\left(\frac{C_s}{C_s^o} - 1\right) + \varepsilon_4^o\frac{T^o}{X_4^o}\left(\frac{T}{T^o} - 1\right) \tag{A.1b}$$

This kinetic relation still has the linear format, but:

• For the remaining conserved moiety metabolites (X2, X3) new composite elasticities (′) arise:

$$\varepsilon_2^{o\prime} = \varepsilon_2^o - \varepsilon_4^o\frac{X_2^o}{X_4^o}, \qquad \varepsilon_3^{o\prime} = \varepsilon_3^o - \varepsilon_4^o\frac{X_3^o}{X_4^o}$$

It should be noted that these elasticities can be quite different from 1, because the metabolite ratios can be very different from 1.

• A new independent metabolite shows up, the conserved moiety sum $T/T^o$, with its own composite elasticity:

$$\varepsilon_T^{o\prime} = \varepsilon_4^o\frac{T^o}{X_4^o}$$

This shows that the vector of independent concentrations C is extended with the conserved moiety sum. In the case of lin-log kinetics the conserved moiety sum (e.g., Equation A.2c) is approximated by its logarithmic format ($y - 1 \approx \ln y$), which is therefore only accurate for changes in the conserved moiety metabolites that are not too large (plus or minus 30%, Heijnen, 2005). This kinetic equation (Equation A.1b) is the equation to be introduced in the independent metabolite mass balances represented by the reduced matrix S and the reduced vector X. In these mass balances the elasticity matrices $E^x$ and $E^c$ contain the composite elasticities. The steady state metabolite and flux solutions then also show the effect of changed conserved moiety sums on metabolite and flux changes. These (composite) elasticities are the only kinetic parameters which can be estimated using proper perturbation experiments (see Section 16.6).
If the conserved moiety sum does not change in such experiments (hence $T = T^o$), it directly follows that $\varepsilon_T^{o\prime}$ cannot be obtained; only the composite elasticities for X2 and X3 ($\varepsilon_2^{o\prime}$ and $\varepsilon_3^{o\prime}$) can be obtained. These two composite elasticities are made up of the three unknown individual elasticities (one for each of the metabolites X2, X3 and X4 of the conserved moiety) and the known reference state metabolite levels of the conserved moiety ($X_2^o$, $X_3^o$, $X_4^o$). Therefore the original elasticities ($\varepsilon_2^o$, $\varepsilon_3^o$, $\varepsilon_4^o$) cannot be resolved, which shows that conserved moieties lead to an identifiability problem. If the conserved moiety sum is perturbed, then $\varepsilon_T^{o\prime}$ can be found and with it all original elasticities (as expected, because conserved moieties are absent if T is allowed to vary).
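As a numerical illustration of the elimination step, the sketch below computes the composite elasticities for one set of made-up reference concentrations and elasticities, and checks that the reduced rate expression (Equation A.1b) gives the same value as the full one (Equation A.1a) when X4 follows from the conserved moiety sum. Only the terms involving X2, X3, X4 and T are kept; all numbers and names are hypothetical.

```python
# Hypothetical reference state of the conserved moiety and its elasticities.
X2o, X3o, X4o = 4.0, 1.0, 0.5
To = X2o + X3o + X4o
eps2, eps3, eps4 = 0.6, -0.3, 0.2

# Composite elasticities after eliminating X4.
eps2_c = eps2 - eps4 * X2o / X4o
eps3_c = eps3 - eps4 * X3o / X4o
epsT_c = eps4 * To / X4o

def rate_full(x2, x3, x4):
    """Conserved-moiety part of V/J^o - 1, written with X2, X3 and X4."""
    return (eps2 * (x2 / X2o - 1)
            + eps3 * (x3 / X3o - 1)
            + eps4 * (x4 / X4o - 1))

def rate_reduced(x2, x3, total):
    """Same quantity written with X2, X3 and the moiety sum T."""
    return (eps2_c * (x2 / X2o - 1)
            + eps3_c * (x3 / X3o - 1)
            + epsT_c * (total / To - 1))

# A perturbed state in which the moiety sum itself has changed.
x2, x3, x4 = 3.0, 1.4, 1.3
full = rate_full(x2, x3, x4)
reduced = rate_reduced(x2, x3, x2 + x3 + x4)

print("composite elasticities:", eps2_c, eps3_c, epsT_c)
print("full vs reduced rate term:", full, reduced)
```

In this made-up example the elasticity towards X2 changes sign on elimination (from 0.6 to about –1.0), because X2 is much more abundant than the eliminated X4.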
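Appendix B Analysis of Control Coefficients and Dependency Relations for a Branch Point

The branch point split ratio $a = J_2^0/J_1^0$ is defined in the reference state. Solving this network gives the following equations for the metabolite x and the fluxes J1, J2 and J3 as functions of the enzyme levels:

$$\frac{J_3}{J_3^0}\cdot\frac{e_3^0}{e_3} = \frac{\dfrac{e_1}{e_1^0}\left(\dfrac{\varepsilon_3^0}{\varepsilon_1^0} - 1\right) + a\,\dfrac{e_2}{e_2^0}\left(\dfrac{\varepsilon_2^0}{\varepsilon_1^0} - \dfrac{\varepsilon_3^0}{\varepsilon_1^0}\right)}{a\,\dfrac{e_2}{e_2^0}\,\dfrac{\varepsilon_2^0}{\varepsilon_1^0} + (1-a)\,\dfrac{e_3}{e_3^0}\,\dfrac{\varepsilon_3^0}{\varepsilon_1^0} - \dfrac{e_1}{e_1^0}} \tag{B.1}$$

$$\frac{J_2}{J_2^0}\cdot\frac{e_2^0}{e_2} = \frac{(1-a)\,\dfrac{e_3}{e_3^0}\left(\dfrac{\varepsilon_3^0}{\varepsilon_1^0} - \dfrac{\varepsilon_2^0}{\varepsilon_1^0}\right) + \dfrac{e_1}{e_1^0}\left(\dfrac{\varepsilon_2^0}{\varepsilon_1^0} - 1\right)}{a\,\dfrac{e_2}{e_2^0}\,\dfrac{\varepsilon_2^0}{\varepsilon_1^0} + (1-a)\,\dfrac{e_3}{e_3^0}\,\dfrac{\varepsilon_3^0}{\varepsilon_1^0} - \dfrac{e_1}{e_1^0}} \tag{B.2}$$

$$\frac{J_1}{J_1^0} = a\,\frac{J_2}{J_2^0} + (1-a)\,\frac{J_3}{J_3^0} \tag{B.3}$$

$$-\ln\left(\frac{x}{x^0}\right) = \frac{-\dfrac{e_1}{e_1^0} + a\,\dfrac{e_2}{e_2^0} + (1-a)\,\dfrac{e_3}{e_3^0}}{-\dfrac{e_1}{e_1^0}\,\varepsilon_1^0 + a\,\dfrac{e_2}{e_2^0}\,\varepsilon_2^0 + (1-a)\,\dfrac{e_3}{e_3^0}\,\varepsilon_3^0} \tag{B.4}$$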
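A short sketch may help in reading Equations B.1 to B.4. It assumes that reaction 1 produces x, that reactions 2 and 3 consume it, and that each rate follows the lin-log form $v_i/J_i^0 = (e_i/e_i^0)\,(1 + \varepsilon_i^0 \ln(x/x^0))$. The function name and the numerical values are hypothetical.

```python
import math

def branch_response(a, eps, e_rel):
    """Relative fluxes and metabolite level from Equations B.1 to B.4.

    a     : split ratio J2^0 / J1^0 in the reference state
    eps   : (eps1, eps2, eps3), reference elasticities towards x
    e_rel : (e1/e1^0, e2/e2^0, e3/e3^0), relative enzyme levels
    """
    eps1, eps2, eps3 = eps
    e1, e2, e3 = e_rel
    r2, r3 = eps2 / eps1, eps3 / eps1
    denom = a * e2 * r2 + (1 - a) * e3 * r3 - e1
    j3 = e3 * (e1 * (r3 - 1) + a * e2 * (r2 - r3)) / denom           # B.1
    j2 = e2 * ((1 - a) * e3 * (r3 - r2) + e1 * (r2 - 1)) / denom     # B.2
    j1 = a * j2 + (1 - a) * j3                                       # B.3
    ln_x = -(-e1 + a * e2 + (1 - a) * e3) / (
        -e1 * eps1 + a * e2 * eps2 + (1 - a) * e3 * eps3)            # B.4
    return j1, j2, j3, math.exp(ln_x)

# Hypothetical case: 30% of the flux goes to branch 2, enzyme 2 is tripled.
a = 0.3
eps = (-0.5, 0.8, 0.4)
e_rel = (1.0, 3.0, 1.0)
j1, j2, j3, x_rel = branch_response(a, eps, e_rel)

# Cross-check against the assumed lin-log rate equations at the new x.
direct = [e * (1 + ei * math.log(x_rel)) for e, ei in zip(e_rel, eps)]

print(f"J1/J1^0 = {j1:.3f}, J2/J2^0 = {j2:.3f}, J3/J3^0 = {j3:.3f}, x/x^0 = {x_rel:.3f}")
```

For these made-up numbers a threefold increase of enzyme 2 raises J2 only about twofold (2.04) and J1 by about 20%, because the metabolite level drops to roughly two thirds of its reference value.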
The following relations are obtained for the nine flux control coefficients:
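$$C_{11}^{J0} = \frac{a\,\varepsilon_2^0/\varepsilon_1^0 + (1-a)\,\varepsilon_3^0/\varepsilon_1^0}{D} \tag{B.5a}$$

$$C_{12}^{J0} = \frac{-a}{D} \tag{B.5b}$$

$$C_{13}^{J0} = \frac{-1 + a}{D} \tag{B.5c}$$

$$C_{21}^{J0} = \frac{\varepsilon_2^0/\varepsilon_1^0}{D} \tag{B.5d}$$

$$C_{22}^{J0} = \frac{(1-a)\,\varepsilon_3^0/\varepsilon_1^0 - 1}{D} \tag{B.5e}$$

$$C_{23}^{J0} = \frac{-(1-a)\,\varepsilon_2^0/\varepsilon_1^0}{D} \tag{B.5f}$$

$$C_{31}^{J0} = \frac{\varepsilon_3^0/\varepsilon_1^0}{D} \tag{B.5g}$$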
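$$C_{32}^{J0} = \frac{-a\,\varepsilon_3^0/\varepsilon_1^0}{D} \tag{B.5h}$$

$$C_{33}^{J0} = \frac{a\,\varepsilon_2^0/\varepsilon_1^0 - 1}{D} \tag{B.5i}$$

where the denominator D is defined as:

$$D = a\,\frac{\varepsilon_2^0}{\varepsilon_1^0} + (1-a)\,\frac{\varepsilon_3^0}{\varepsilon_1^0} - 1$$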
Eliminating the elasticity ratios ($\varepsilon_2^0/\varepsilon_1^0$ and $\varepsilon_3^0/\varepsilon_1^0$) gives seven relations between the $C^{J0}$, showing strong dependency.

Mass balance derived constraints

$$C_{1i}^{J0} = a\,C_{2i}^{J0} + (1-a)\,C_{3i}^{J0}, \qquad i = 1, 2 \tag{B.6}$$

Summation constraints

$$C_{11}^{J0} + C_{12}^{J0} + C_{13}^{J0} = 1 \tag{B.7a}$$

$$C_{21}^{J0} + C_{22}^{J0} + C_{23}^{J0} = 1 \tag{B.7b}$$

$$C_{31}^{J0} + C_{32}^{J0} + C_{33}^{J0} = 1 \tag{B.7c}$$

Branch point constraints

$$(1-a)\,C_{12}^{J0} - a\,C_{13}^{J0} = 0 \tag{B.8a}$$

$$(1-a)\,C_{21}^{J0} + C_{23}^{J0} = 0 \tag{B.8b}$$
For the metabolite control coefficients the following relations are obtained:

$$C_1^{x0} = \frac{1}{D'} \tag{B.9a}$$

$$C_2^{x0} = \frac{-a}{D'} \tag{B.9b}$$

$$C_3^{x0} = \frac{-(1-a)}{D'} \tag{B.9c}$$

The denominator D′ is defined as:

$$D' = \varepsilon_1^0 D = -\varepsilon_1^0 + a\,\varepsilon_2^0 + (1-a)\,\varepsilon_3^0$$

These three relations contain only one ε-group (= D′), which can be eliminated. The two resulting constraints are the metabolite control summation theorem and the kinetics-based relation:

$$C_1^{x0} + C_2^{x0} + C_3^{x0} = 0 \tag{B.10}$$

$$a\,C_1^{x0} + C_2^{x0} = 0 \tag{B.11}$$
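The dependency relations can be checked numerically. The sketch below evaluates Equations B.5 and B.9 for one made-up parameter set and tests the mass balance constraint (B.6), the summation constraints (B.7), the branch point constraint (B.8a) and the metabolite constraints (B.10, B.11). As an additional cross-check it uses the general MCA relation $C^J = I + E^x C^x$, which is not stated in this appendix and is introduced here as an assumption. All names and values are hypothetical.

```python
def branch_control_coefficients(a, eps1, eps2, eps3):
    """Flux (B.5) and metabolite (B.9) control coefficients of the branch point."""
    r2, r3 = eps2 / eps1, eps3 / eps1
    d = a * r2 + (1 - a) * r3 - 1
    cj = [
        [(a * r2 + (1 - a) * r3) / d, -a / d, (-1 + a) / d],
        [r2 / d, ((1 - a) * r3 - 1) / d, -(1 - a) * r2 / d],
        [r3 / d, -a * r3 / d, (a * r2 - 1) / d],
    ]
    d_prime = eps1 * d
    cx = [1 / d_prime, -a / d_prime, -(1 - a) / d_prime]
    return cj, cx

a = 0.3
eps = (-0.5, 0.8, 0.4)   # hypothetical elasticities of reactions 1, 2, 3
cj, cx = branch_control_coefficients(a, *eps)

# Summation constraints (B.7): each row of flux control coefficients sums to 1.
row_sums = [sum(row) for row in cj]

# Mass balance constraint (B.6) for i = 1, 2.
mass_balance = [cj[0][i] - (a * cj[1][i] + (1 - a) * cj[2][i]) for i in range(2)]

# Branch point constraint (B.8a) and metabolite constraints (B.10, B.11).
branch_a = (1 - a) * cj[0][1] - a * cj[0][2]
metabolite_sum = sum(cx)
metabolite_kin = a * cx[0] + cx[1]

# Assumed general relation: C^J_ij = delta_ij + eps_i * C^x_j.
max_dev = max(abs(cj[i][j] - ((1.0 if i == j else 0.0) + eps[i] * cx[j]))
              for i in range(3) for j in range(3))

print("row sums:", [round(s, 12) for s in row_sums])
print("largest deviation from I + E C^x:", max_dev)
```

Since nine flux control coefficients are determined by only two elasticity ratios, such checks are a simple way to test whether a set of experimentally estimated coefficients is internally consistent.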
References

Chassagnole, C. et al. 2002. Dynamic modelling of the central carbon metabolism of Escherichia coli. Biotechnol. Bioeng., 79: 53–73.
Degenring, D. et al. 2004. Sensitivity analysis for the reduction of complex metabolism models. J. Process Contr., 14: 729–745.
de Groot, M.J.L., Daran-Lapujade, P., van Breukelen, B., Knijnenburg, T.A., de Hulster, E.A.F., Reinders, M.J.T., Pronk, J.T., Heck, A.R., and Slijper, M. 2007. Quantitative proteomics and transcriptomics of anaerobic and aerobic yeast cultures reveals post-transcriptional regulation of key cellular processes. Microbiology, 153: 3864–3878.
Delgado, J. and Liao, J.C. 1992. Metabolic control analysis using transient metabolite concentration. Biochem. J., 285: 965–972.
Delgado, J.P. and Liao, J.C. 1991. Identifying rate-controlling enzymes in metabolic pathways without kinetic-parameters. Biotechnol. Prog., 7: 15–20.
Fell, D. 1996. Understanding the Control of Metabolism. Portland Press, London.
Hatzimanikatis, V. and Bailey, J.E. 1997. Effects of spatiotemporal variations in metabolic control: approximate analysis using (log)-linear kinetic models. Biotechnol. Bioeng., 57: 75–87.
Haunschild, M.D. et al. 2005. Investigating the dynamic behaviour of biochemical networks using model families. Bioinformatics, 21: 1617–1625.
Haunschild, M.D. et al. 2006. A general framework for large scale model selection. Optimiz. Methods Software, 21: 901–917.
Heinrich, R. and Rapoport, T.A. 1974. A linear steady-state treatment of enzymatic chains: general properties, control and effector strength. Eur. J. Biochem., 42: 89–95.
Heijnen, J.J. 2005. Approximative kinetic formats used in metabolic network modelling. Biotechnol. Bioeng., 91: 534–545.
Heijnen, J.J., van Gulik, W.M., Shimizu, H., and Stephanopoulos, G. 2004. Metabolic flux control analysis of branch points: an improved approach to obtain flux control coefficients from large perturbation data. Metabol. Eng., 6: 391–400.
Heinrich, R. and Schuster, S. 1996. The Regulation of Cellular Systems. Chapman & Hall, New York.
Hofmeyr, J.H. and Cornish-Bowden, A. 1996. Co-response analysis: a new strategy for experimental metabolic control analysis. J. Theor. Biol., 182: 371–380.
Kacser, H. and Burns, I. 1973. Rate control in biological processes, Darries DD, ed., 65–104, Cambridge University Press, Cambridge.
Kresnowati, M.T.A.P., van Winden, W.A., and Heijnen, J.J. 2005. Determination of elasticities, concentration and flux control coefficients from transient metabolite data using linlog kinetics. Metabol. Eng., 7: 142–153.
Kresnowati, M.T.A.P., van Winden, W.A., Almering, M.J.H., ten Pierick, A., Ras, C., Knijnenburg, T.A., Daran-Lapujade, P.A.S., Pronk, J.T., Heijnen, J.J., and Daran, J.M. 2006. When transcriptome meets metabolome: fast cellular responses of yeast to sudden relief of glucose limitation. Mol. Systems Biol., 2 (49): 1–16.
Lange, H.C., Eman, M., van Zuijlen, G., Visser, D., van Dam, J.C., Frank, J., Teixeira de Mattos, M.J., and Heijnen, J.J. 2001. Improved rapid sampling for in-vivo kinetics of intracellular metabolite in Saccharomyces cerevisiae. Biotechnol. Bioeng., 75 (4): 406–415.
Mashego, M.R., Wu, L., van Dam, J.C., Ras, C., Vinke, J.L., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2004. Miracle: mass isotopomer ratio analysis of U-13-C labeled extracts. A new method for accurate quantification of changes in concentrations of intracellular metabolites. Biotechnol. Bioeng., 85 (6): 620–628.
Mashego, M.R., van Gulik, W.M., Vinke, J.L., Visser, D., and Heijnen, J.J. 2006. In-vivo kinetics with rapid perturbation experiments in Saccharomyces cerevisiae using a second generation Bioscope. Metabol. Eng., 8: 370–383.
Nasution, U., van Gulik, W.M., Pröll, A., van Winden, W.A., and Heijnen, J.J. 2006. Generating short-term kinetic responses of primary metabolism of Penicillium chrysogenum through glucose perturbation in the Bioscope mini reactor. Metabol. Eng., 8 (5): 395–405.
Niederberger, P., Prasad, R., Miozarri, G., and Kacser, H. 1992. A strategy for increasing an in-vivo flux by genetic manipulations. Biochem. J., 287: 473–479.
Nielsen, J. 1995. Physiological engineering aspects of Penicillium chrysogenum. DSc thesis, Technical University of Denmark, Lyngby, Denmark.
Nikerel, I.E., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2006. A method for estimation of in vivo elasticities in metabolic networks using data from steady-state and rapid sampling experiments with linlog kinetics. BMC Bioinformatics, 7: 540.
Nikerel, I.E., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2007. Linear-logarithmic kinetics; a framework for modeling kinetics of metabolic reaction networks. Simulation News Europe, 17 (1): 19–26.
Nikerel, I.E., van Winden, W.A., Verheijen, P.J.T., and Heijnen, J.J. 2009. Model reduction and a priori kinetic parameter identifiability analysis using metabolome time series for metabolic reaction networks with lin-log kinetics. Metabol. Eng., 11: 20–30.
Raamsdonk, L.M., Teusink, B., Broadhurst, D., Zhang, N.S., Hayes, A., and Walsh, M.C. et al. 2001. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nature Biotechnol., 19: 45–50.
Rizzi, M. et al. 1997. In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: II. Mathematical model. Biotechnol. Bioeng., 55: 592–608.
Savageau, M.A. 1976. Biochemical Systems Analysis: A Study of Function and Design in Molecular Biology. Addison-Wesley, London.
Small, J.R. and Kacser, H. 1993a. Response of metabolic systems to large changes in enzyme activities and effectors 2. The linear treatment of branched pathway and metabolite concentrations. Assessment of the general non-linear case. Eur. J. Biochem., 213: 625–640.
Small, J.R. and Kacser, H. 1993b. Responses of metabolic systems to large changes in enzyme-activities and effectors. 1. The linear treatment of unbranched chains. Eur. J. Biochem., 213: 613–624.
Teusink, B., Passarge, J., Reijenga, C.A., Esgalhado, E., van der Weijden, C.C., and Schepper, M. et al. 2000. Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing biochemistry. Eur. J. Biochem., 267: 5313–5329.
Theobald, U. et al. 1997. In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: I. Experimental observations. Biotechnol. Bioeng., 55: 305–316.
van Dam, J.C., Eman, M.R., Frank, J., Lange, H.C., van Dedem, G.W.K., and Heijnen, J.J. 2002. Analysis of glycolytic intermediates in Saccharomyces cerevisiae using anion exchange chromatography and electrospray ionisation with tandem mass spectrometric detection. Analytica Chimica Acta, 460 (2): 209–218.
van Gulik, W.M., De Laat, W.T.A.W., Vinke, J.L., and Heijnen, J.J. 2000. Application of metabolic flux analysis for the identification of metabolic bottlenecks in the biosynthesis of Penicillin G. Biotechnol. Bioeng., 68: 602–618.
van Gulik, W.M., van Winden, W.A., and Heijnen, J.J. 2003. Metabolic flux analysis, modeling and engineering solutions. In Handbook of Industrial Cell Culture, Vinci, V. and Parekh, S.R., eds. Humana Press, Totowa, New Jersey.
Vaseghi, S., Baumeister, A., Rizzi, M., and Reuss, M. 1999. In-vivo dynamics of the pentose phosphate pathway in Saccharomyces cerevisiae. Metabol. Eng., 1 (1): 128–140.
Visser, D. and Heijnen, J.J. 2002. The mathematics of metabolic control analysis revisited. Metabol. Eng., 4: 114–123.
Visser, D. and Heijnen, J.J. 2003. Dynamic simulation and metabolic re-design of a branched pathway using lin-log kinetics. Metabol. Eng., 5: 164–176.
Visser, D., Schmid, J.W., Mauch, K., Reuss, M., and Heijnen, J.J. 2004. Optimal redesign of primary metabolism in Escherichia coli using lin-log kinetics. Metabol. Eng., 6: 378–390.
Visser, D. et al. 2002. Rapid sampling for analysis of in vivo kinetics using the BioScope: a system for continuous-pulse experiments. Biotechnol. Bioeng., 79: 674–681.
Voit, E.O. 2000. Computational Analysis of Biochemical Systems. Cambridge University Press, Cambridge.
Wahl, S.A. et al. 2006. Unravelling the regulatory structure of biochemical networks using stimulus response experiments and large scale model selection. IEE Proc. Systems Biol., 153: 275–286.
Westerhoff, H.V. and Kell, D.B. 1987. Matrix method for determining steps most rate limiting to metabolite fluxes in biotechnological processes. Biotechnol. Bioeng., 30: 101–107.
Wiechert, W. 2002. Modeling and simulation: tools for metabolic engineering. J. Biotechnol., 94: 37–63.
Wiechert, W. and Takors, R. 2004. Validation of metabolic models: concepts, tools, and problems. Metabolic engineering in the post genomic era. Horizon Biosci., Wymondham, England.
Wu, L., Wang, W., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2004. A new framework for the estimation of control parameters in metabolic pathways using lin-log kinetics. Eur. J. Biochem., 271: 3348–3359.
Wu, L., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2005. Application of metabolome data in functional genomics: A conceptual strategy. Metabol. Eng., 7: 302–310.
Wu, L., van Dam, J.C., Schipper, D., Kresnowati, M.T.A.P., Pröll, A., Ras, C., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2006. Short term metabolome dynamics and carbon, electron and ATP balances in chemostat-grown Saccharomyces cerevisiae CEN-PK.113-7D following a glucose pulse. Appl. Environ. Microbiol., 72 (5): 3566–3577.
Wu, L., Mashego, M.R., van Dam, J.C., Pröll, A., Vinke, J.L., Ras, C., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2005. Quantitative analysis of the microbial metabolome by isotope dilution mass spectrometry using uniformly 13C-labeled cell extracts as internal standards. Anal. Biochem., 336: 164–171.
17 Structure and Flux Analysis of Metabolic Networks

Kiran Raosaheb Patil and Prashant Madhusudan Bapat
Technical University of Denmark

Jens Nielsen
Chalmers University of Technology

17.1 Introduction
17.2 Metabolic Network Structure
     Representation of Metabolic Networks • Structure–Function Relationship
17.3 Network Functionality at Metabolite Level
     Experimental Estimation of Fluxes • In Silico Prediction of Fluxes • The Fluxome in Metabolic Engineering: Applications • Kinetic Models for Flux Simulations
17.4 Conclusions and Future Perspective
References
17.1 Introduction

Conceptual understanding of complex cellular organization can be facilitated through a perspective based on the central dogma of biology¹ (Figure 17.1). Accordingly, information coded in a genome is translated into proteins via mRNA. Proteins play a variety of roles in a cell, including that of enzymes, which selectively catalyze chemical transformations between metabolites. The ensemble of all nongenetically encoded compounds (thus excluding mRNA, proteins, etc.) and the enzymes operating on them is generally referred to as a metabolic network.² In essence, metabolic networks convert nutrients available from the environment into fundamental building blocks for the synthesis of proteins, DNA, and other cellular components. By providing energy and building blocks for the growth and maintenance of cells, metabolic networks play a central role in sustaining life. This key role of metabolic networks in cellular operations is evident from two facts. First, the basic architecture of metabolic networks is largely conserved across many different species, ranging from microscopic bacteria to humans.³ Second, cellular response and adaptation to genetic or environmental perturbations is often mediated through, or reflected in, the operation of metabolic networks.⁴ Although the structure of metabolic networks differs significantly at local levels (e.g., specific pathway structures),³,⁵ their large-scale conservation across different species implies common biochemical and evolutionary principles underlying their operation.⁶,⁷ Understanding such general principles has great implications for: (i) correlating and extrapolating knowledge across different species, especially from model organisms (such as yeast) to humans, (ii) devising rational strategies for metabolic engineering, (iii) finding remedies for metabolism-related diseases, and (iv) synthetic biology. Most metabolic engineering problems are concerned with optimization of metabolic network function at the level of fluxes.
Important exceptions may be found in higher eukaryotes such as plants, where optimization of certain metabolic pools may be of more relevance.⁸ A flux for any reaction can be defined as the amount of substrates processed (or products produced) per unit time. The whole metabolic network can be viewed as an interconnected set of mass flow channels. Most microbial metabolic engineering problems can then be represented as an optimization of a certain set of cellular exchange fluxes, i.e., rates of secretion or uptake of compounds of interest. Knowledge of the intracellular flux distribution, and computational tools for predicting fluxes in mutant strains, are thus of prime importance for metabolic engineering. Some of the key aspects of network structure and flux analysis relevant for metabolic engineering applications are depicted in Figure 17.2. We note here that most of the discussion in this chapter is presented with a global (genome-scale) view of the metabolic network. Although most flux and structure analysis tools are usually applied to semi-global reduced networks, the use of genome-scale metabolic models will be inevitable in modern metabolic engineering studies. We have therefore refrained from discussing tools and approaches that are based on the isolated analysis of selected pathways.

[Figure 17.1 graphic not reproduced. Panel (a): DNA (replication) → transcription → mRNA → translation → protein; protein roles shown: structural element, enzyme, regulatory protein. Panel (b): metabolites Mi and Mj interconverted by Enzyme k, with Flux = f(Mi, Mj, Enzyme k); a metabolic network converting nutrients into by-products, energy, and building blocks for growth, with an example reaction list (M1 + M2 → M3 + M4 :: Enzyme j; M4 ↔ M5 :: Enzyme k; M5 + M6 → M1 + M7 :: Enzyme j).]

Figure 17.1 (a) The central dogma in molecular biology. DNA replication can be thought of as information flow (back-up) from genome to genome. Information coded in genes flows to proteins via transcription and translation. Proteins may play a variety of functional roles in a cell; only three roles are shown as examples. (b) Enzymes catalyze chemical transformation of metabolites. The rate of an enzyme-catalyzed reaction (flux) is a function not only of enzyme availability and properties, but also of the concentrations of substrates and products. Several such enzymatic steps constitute a metabolic network in which the products of some reaction(s) serve as substrates for other reaction(s), thus creating an interconnected reaction web. The overall function of a metabolic network can then be viewed as utilizing environmentally available nutrients to generate energy and building molecules for growth and maintenance of the cell.
17.2 Metabolic Network Structure 17.2.1 Representation of Metabolic Networks Use of appropriate representations for depiction and analysis is an essential element in discovering universal organizational and operational principles in metabolic networks. Convenient and biologically
17-3
Structure and Flux Analysis of Metabolic Networks
[Figure 17.2 diagram labels: Micro-organism; Genome sequence, biochemical data, literature; Structure analysis (graph topology, Petri net, flux coupling, network-based data integration); Genome-scale model; Reduced model; Flux analysis, with modeling tools (FBA, ROOM, MOMA, EBA) and experimental tools (13C flux analysis, metabolite analysis); Fluxes: simulated; Fluxes: experimental; Structure–function; Metabolic engineering tools (OptGene, OptKnock, OptStrain, heuristic based, dynamic optimization).]
Figure 17.2 Schematic overview of tools and information flow in global structure and flux analysis of metabolic networks. Structure analysis: the information retrieved from the genome sequence, biochemistry, and the literature can be used to deduce the metabolic network structure of the given organism. Existing reduced models are often used as templates for reconstructing global models. Flux analysis: the connectivity and stoichiometry from structure analysis are systematically exploited for measurement and simulation of intracellular fluxes. The integrated information can then be applied to deduce the underlying structure–function relationship. Metabolic engineering tools: flux analysis and the structure–function relationship are used to identify metabolic engineering targets.
Modeling Tools for Metabolic Engineering
meaningful representations help not only in comparing different metabolic networks, on both global and local scales, but also in quantitatively categorizing different network structures. Furthermore, network structure has also been shown to be an inherent element in genome-scale data integration.4,9 We hence begin with a brief overview of the different representations of metabolic networks and then discuss the structure–function relationship. To facilitate the description of the different network representations, an example metabolic network is illustrated in the different forms discussed in the following text (Figure 17.3).
17.2.1.1 Pathway Representation
The pathway is the oldest and perhaps the most commonly used way to represent a metabolic network. A pathway generally depicts a part of metabolism (a collection of enzyme-catalyzed reactions and the corresponding metabolites) that performs a certain biochemical task. Examples of pathways include the TCA cycle, the Embden–Meyerhof–Parnas pathway, and histidine biosynthesis. Such representations are familiar to biologists owing to their widespread use in biochemistry textbooks and online databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.kegg.com).10 In a pathway representation, metabolites are shown as nodes (usually simply as text) and enzymes as arrows connecting the corresponding metabolites. Currency metabolites, i.e., metabolites taking part in a large number of reactions (e.g., NADH, ATP, and CO2), are shown only at the individual reaction level. Although the pathway representation is generally used only for small parts of metabolism (e.g., KEGG), there are examples of pathway-style representation of genome-scale metabolic networks (e.g., see the EcoCyc database, http://ecocyc.org/). Indeed, categorization and pictorial depiction of biochemical functionality is an important aid to the human understanding of the operational and regulatory logic of metabolism.
Consequently, several pathway-based analysis tools are routinely used for the analysis of metabolic networks. Despite their usefulness, pathway-based representations have certain drawbacks and pitfalls. Most of these can be attributed to two facts: (i) pathways often fail to account for the high connectivity of metabolic networks, and (ii) the definition of a pathway is vague and does not necessarily correspond strictly to a particular physiological functionality (e.g., homeostasis of metabolites, balanced flux distribution, etc.). Indeed, many metabolites span several different pathways, since the end-points or intermediates of one pathway often act as substrates/products in other pathways (see Table 17.1 for a list of selected metabolites and the number of yeast KEGG pathways they participate in). Consequently, the choice of reactions shown as part of a pathway is rather arbitrary. These drawbacks are becoming more apparent as large amounts of genome-wide data on gene expression, protein abundances, metabolite levels, and fluxes are being generated. Complex patterns observed in these datasets seldom fit into standard pathway definitions, and thus the operation of metabolic networks cannot be explained through them. Consequently, other, more comprehensive representations are being sought in order to describe metabolic networks systematically.
17.2.1.2 Graph Theoretical Representation
Complex cellular organization can be viewed as an ensemble of several biomolecular interaction networks, such as protein–protein and protein–DNA interaction networks. In addition to the interactions arising from physical contacts between biomolecules, functional relationships between cellular species can also be conceptualized as interactions. Metabolic reactions can thus be seen as functional relationships between metabolites and vice versa.
An interaction-centered view of cellular metabolism can be used to construct a graph theoretical representation (Figure 17.3b) in which metabolites and reactions (or the corresponding enzymes) are represented as nodes, while the interactions between them form edges. The resulting graph is bi-partite, meaning that there are two classes of nodes, viz. metabolite nodes and enzyme nodes, and no two nodes in the same class are directly connected to each other. Two uni-partite metabolic graphs can be derived from the bi-partite representation. In the reaction interaction graph (Figure 17.3c) only reactions are represented as nodes, and two reactions sharing one or more common metabolites are connected by an edge. Similarly, a metabolite interaction graph can also be
Figure 17.3 Different depictions of a metabolic network that are commonly used for visualization, conceptual representation, or unraveling underlying structural and functional properties. A small schematic representation of pyruvate metabolism in Lactococcus lactis54 is used as an example. (a) Traditional pathway-style representation. Reactions are usually represented as arrows showing conversion of the corresponding metabolites. Highly connected metabolites (especially cofactors such as NADH and ATP) are displayed only locally, at the individual reaction level. (b) Graph theoretical representation of the metabolic network (bi-partite view). Reactions (circles) and metabolites (squares) form two classes of nodes in a graph and the interactions among them form the corresponding edges. (c) Reaction interaction graph. This is another graph theoretical view of the metabolic network where only reactions form the nodes while the metabolites form the edges connecting the corresponding reactions. (d) Metabolite interaction graph. Metabolite-centered graphical representation where metabolites form the nodes while enzymes act as edges connecting the corresponding metabolites. (e) Flux coupling graph. A concept based on the stoichiometric constraints on the operation of the network. All reactions are represented as nodes. Two reactions for which the corresponding fluxes are correlated are connected by an edge (directional coupling: dashed line, full coupling: thick complete line).
[Figure 17.3 node labels. Reactions: Glycolysis, LDH, PFL, PDH, PTA, ACKA, ADHE, ADHA, ALS, ILVB, ALDB, BUTA, BUTB, Chemical_oxidation. Metabolites: Glucose, Pyruvate, Lactate, Formate, Acetyl-CoA, Acetyl-P, Acetate, Acetaldehyde, Ethanol, 2-Acetolactate, Acetoin, Diacetyl, 2,3-Butanediol, CO2, O2, ATP, ADP, NADH, NAD+.]
Table 17.1 Selected Metabolites from Yeast Central Metabolism That Participate in Several Different Pathway Definitions as per KEGG Database

Metabolite             Number of KEGG Pathways
Glucose 6-phosphate    6
NH3                    7
Glucose                6
L-Glutamine            5
Glyoxylate             5
2-Oxoglutarate         12
Urea                   4
Isocitrate             3
Oxaloacetate           9
Acetyl-CoA             23
Malate                 7
constructed (Figure 17.3d). Graph theoretical representations of metabolism offer several advantages: the global connectivity of the network is conserved, and it is not necessary to remove the highly connected nodes from the network to simplify the analysis. Another gain is the algorithm-friendly data structure offered by graph theoretical representations. Indeed, several algorithms for the analysis of metabolic networks are based on graph data structures. Graph theoretical representations thus provide a platform for systematic integration of omics data with metabolic networks in order to discover new biological patterns and hypotheses.
17.2.1.3 Stoichiometric Matrix
Although it does not fall strictly into the category of visual representation, a collective stoichiometric matrix of all reactions comprising a network has important theoretical and practical implications for the analysis of network structure and function. The stoichiometric information that is usually missing from the previously described network depictions is systematically arranged in a stoichiometric matrix. The flow of mass through metabolic networks (fluxes) can be calculated, estimated, measured, or understood only in the light of the stoichiometry of the reactions occurring in the network. Indeed, under (pseudo-) steady-state conditions the stoichiometric matrix defines the feasible space of all possible flux solutions in the network. By using a stoichiometric matrix, it is thus possible to describe all possible steady-state flux solutions of a given network. For example, the boundaries of the feasible solution space can be identified in terms of elementary flux modes. An elementary flux mode is defined as a set of reactions that can operate at steady state and cannot be further decomposed into smaller such sets. Consequently, any flux distribution at steady state can be expressed as a weighted linear combination of elementary flux modes.
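The steady-state condition described above can be illustrated with a minimal sketch. It uses only the glycolysis and LDH steps as drawn in Figure 17.3a; treating ATP, ADP, glucose, and lactate as external (unbalanced) species, as well as all variable names, are assumptions made for this illustration only.

```python
# Two lumped reactions read off the pathway drawing of Figure 17.3a.
# Negative coefficients are substrates, positive coefficients are products.
reactions = {
    "Glycolysis": {"Glucose": -1, "ADP": -2, "NAD+": -2,
                   "Pyruvate": 2, "ATP": 2, "NADH": 2},
    "LDH": {"Pyruvate": -1, "NADH": -1, "Lactate": 1, "NAD+": 1},
}
reaction_names = list(reactions)
metabolites = sorted({m for coeffs in reactions.values() for m in coeffs})

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
S = [[reactions[r].get(m, 0) for r in reaction_names] for m in metabolites]


def net_production(S, v):
    """Return S.v, the net production rate of every metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Candidate flux distribution: 1 unit of glycolysis, 2 units of LDH.
v = [1, 2]
balance = dict(zip(metabolites, net_production(S, v)))

# Assumption: only these metabolites are treated as internal (balanced).
internal = ["Pyruvate", "NADH", "NAD+"]
is_steady_state = all(balance[m] == 0 for m in internal)

# The non-zero entries give the overall conversion carried by this flux vector.
overall = {m: b for m, b in balance.items() if b != 0}
print(is_steady_state, overall)
```

Under these assumptions the flux vector balances all internal metabolites, and the remaining net conversion corresponds to the overall reaction listed for mode E1 in Table 17.2.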
Elementary flux modes for the example metabolic network shown in Figure 17.3a are listed in Table 17.2. Further applications of the stoichiometric matrix to flux estimation are discussed in the second part of this chapter. The different representations of metabolic networks discussed above must be seen only as different ways of depicting the same data. In this sense the stoichiometric matrix is perhaps the most complete data structure, as it holds stoichiometric coefficients in addition to connectivity information. However, a bipartite graph can easily be extended to incorporate stoichiometric coefficients (e.g., as edge properties). Conversely, any graph structure can always be represented in a matrix format (the most common example being the adjacency matrix). Thus, the choice of representation should be dictated by the intended application. It must be noted, however, that the pathway representation is perhaps useful mostly for pictorial depiction, owing to the limitations of this approach discussed before. In contrast, since the other approaches usually do not make any a priori assumptions about a particular set of reactions/metabolites
Table 17.2 Elementary Flux Modes for the Reactions Depicted in Figure 17.3a

Mode   Overall Reaction
E1     Glucose + 2 ADP = 2 ATP + 2 Lactate
E2     Glucose + 2 ADP = 2 ATP + 2 Ethanol + CO2
E3     Glucose + 2 ADP + O2 = 2 ATP + 2,3-Butanediol + 2 CO2
E4     Glucose + 3 ADP = 3 ATP + Ethanol + Acetate + 2 Formate
E5     1.5 Glucose + 3 ADP = 3 ATP + Ethanol + 2,3-Butanediol + 2 CO2 + Formate
being a part of a particular process, they are more successful in uncovering the principles underlying the complex operation of metabolic networks, in terms of both fluxes and their regulation. Indeed, it is only for visual and conceptual convenience that large, highly connected metabolic networks are partitioned into pathways. Some examples illustrating this idea are discussed in the following section.
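The graph theoretical representations of Section 17.2.1.2 can be made concrete with a short sketch. The four reactions below are a simplified excerpt in the spirit of Figure 17.3 (for instance, coenzyme A and phosphate are omitted, as in the figure); the dictionary layout and all variable names are assumptions chosen for illustration.

```python
from itertools import combinations

# Simplified excerpt: each reaction mapped to the metabolites it involves.
reaction_metabolites = {
    "LDH": {"Pyruvate", "NADH", "Lactate", "NAD+"},
    "PFL": {"Pyruvate", "Acetyl-CoA", "Formate"},
    "PTA": {"Acetyl-CoA", "Acetyl-P"},
    "ACKA": {"Acetyl-P", "ADP", "Acetate", "ATP"},
}

# Bi-partite graph: every edge joins a reaction node to a metabolite node,
# so no two nodes of the same class are directly connected.
bipartite_edges = {
    (reaction, metabolite)
    for reaction, mets in reaction_metabolites.items()
    for metabolite in mets
}

# Reaction interaction graph: two reactions are linked if they share a metabolite.
reaction_graph = {
    frozenset((r1, r2))
    for r1, r2 in combinations(reaction_metabolites, 2)
    if reaction_metabolites[r1] & reaction_metabolites[r2]
}

# Metabolite interaction graph: two metabolites are linked if some reaction
# involves both of them.
metabolite_graph = {
    frozenset(pair)
    for mets in reaction_metabolites.values()
    for pair in combinations(sorted(mets), 2)
}

print(len(bipartite_edges), sorted(sorted(edge) for edge in reaction_graph))
```

Both uni-partite graphs are derived from the same reaction-to-metabolite mapping, which reflects the point made above that the representations are different views of the same data.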
17.2.2 Structure–Function Relationship
17.2.2.1 Topological and Functional Features of Network Elements
One of the simplest and most intuitive measures of the topological importance of an element in a network is the degree of a node. The degree of a node is defined as the number of edges connected to that node (or the number of its immediate neighbors). In a directed graph it may also be convenient to distinguish between in-degree and out-degree (see Figure 17.4 for an illustration of some graph-related definitions). Although relatively simple, the distribution of node degrees in a network can elucidate several structural characteristics of the network. The degree distribution of metabolite nodes in the bi-partite graph of Saccharomyces cerevisiae is shown in Figure 17.5. This distribution reveals an important feature of metabolic networks, namely the existence of a few metabolites that participate in a large number of reactions (e.g., ATP, NADH, and NADPH), while most metabolites take part in relatively few reactions. The degree distribution of metabolites thus follows a power law, P(k) ∝ k^(−γ), where γ is a constant. Although the existence of highly connected metabolites has long been known in biology (as currency/cofactor metabolites), the network structure at genome scale allows a systematic study of network topology and the structure–function relationship from applied and evolutionary perspectives. Studies of several metabolic networks across all three domains of life have revealed that the power law degree distribution is prevalent among them.11,12 Interestingly, the power law degree distribution indicates a scale-free organization of metabolic networks, in line with other physical/biological networks occurring in nature. As the name implies, “scale-free” networks show similar basic topological features irrespective of the scale at which the network is viewed.
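As a rough illustration of the power-law degree distribution, the sketch below computes P(k) from a list of node degrees and estimates γ from the slope of a straight-line fit on log-log axes. The degree list is invented for the example, the function names are hypothetical, and a log-log least-squares fit is only a crude estimator chosen here for brevity.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Map each degree k to the fraction P(k) of nodes having that degree."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: n / total for k, n in sorted(counts.items())}


def fit_power_law_exponent(distribution):
    """Estimate gamma in P(k) ~ k^(-gamma) by least squares on log-log axes."""
    points = [(math.log(k), math.log(p)) for k, p in distribution.items() if k > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return -slope


# Invented metabolite degrees: many poorly connected nodes and a single hub.
degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
gamma = fit_power_law_exponent(degree_distribution(degrees))
print(round(gamma, 3))
```

The invented counts fall off as the inverse square of the degree, so the fitted exponent should come out close to 2; real degree data are noisier and usually call for more careful estimation.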
As mentioned above for metabolic networks, such networks are characterized by the presence of a few highly connected nodes (hubs), while the rest of the nodes have relatively few links. Hubs bestow the small-world property on metabolic networks,13 meaning that any two nodes are, on average, at a relatively small distance from each other. Although scale-free networks have been found in many biological and physical systems, a scale-free topology is far from what would be expected if these networks had been created through a random process. Thus, scale-free networks also display certain properties that are not found in random networks. Perhaps the most interesting property of scale-free networks in the metabolic engineering context is their robustness against random failures.11 Since most of the nodes have relatively low connectivity, deletion of randomly selected nodes hardly alters the connectivity of the network. On the other hand, the presence of a few highly connected nodes (hubs) makes the network susceptible to targeted attacks. To what extent do these simple topological measures explain the functioning and evolutionary origin of metabolic network structure? A metabolic network can often be conceptually viewed as a collection of modules working together (see the previous section on metabolic network representation). Such
[Figure 17.4 panels: (a) undirected graph of nodes n1 to n5, each labeled with its degree; (b) directed graph of the same nodes, each labeled as in-degree + out-degree = degree.]
Figure 17.4 Illustration of some basic graph-related definitions. (a) Undirected graph. Edges do not carry any information about the direction of flow (of information, mass, energy, etc.) between the nodes. This implies either that such flow is possible in both directions or that no information is available on the directionality of the edges. (b) Directed graph. In contrast to an undirected graph, edges are “arrowed” and imply the possible direction of flow in the network.
[Figure 17.5 plot: frequency (y-axis, 0 to 500) against degree of metabolites (x-axis, 0 to 160).]
Figure 17.5 Distribution of degree of metabolites in the bi-partite graph representing genome-scale metabolic network of Saccharomyces cerevisiae. (From Forster, J., Famili, I., Fu, P., Palsson, B.O., Nielsen, J. Genome Res., 2003, 13 (2), 244–253. With permission).
modularization could be based on, e.g., the chemical nature of metabolites.14 Indeed, it has been shown computationally that the scale-free nature of metabolic networks can be explained through a hierarchical modular organization that evolves according to the “rich becomes rich” principle.12 Thus, a network is built starting from small non-scale-free modules that replicate and connect preferentially to other modules. Several of the modules obtained by decomposing the E. coli metabolic network were found to coincide with known biochemical functional modules of metabolism. Thus, the network structure not only provides clues to the evolutionary origin of organization in metabolic networks but may also help in automated and robust classification of metabolism into different functional units.15,16 Another key piece of information emerging from (or confirmed through) topological analysis is the “bow tie” architecture of metabolic networks.16,17 In this architecture, several nutrients enter the central knot, while different biosynthetic building blocks fan out from this knot. The central knot represents the 12 precursor molecules from which amino acids, nucleotides, and other essential components are built. Furthermore, redox and energy cofactors and other hub metabolites act as connecting links between the central knot and other parts of the metabolic network. The bow tie architecture bestows a remarkable balance of flexibility and robustness, and thus evolvability, on metabolic networks. This architecture can be seen as a combination of standardization and “plug and play” modularity of nutrient intake and secondary metabolism, achieved through a fixed set of precursor molecules. For example, new pathways for antibiotic synthesis can easily be acquired by an organism through horizontal gene transfer, since their synthesis will start from the existing precursor molecules. Robust and global control of the complex network is achieved via hub metabolites.
This modular yet flexible design also allows the cell to keep a minimum inventory of metabolites and to synthesize the necessary building blocks for growth just in time. On the other hand, the bow tie nature of metabolic networks also makes them fragile against changes in the central core and hub metabolites. From a metabolic engineering point of view, the bow tie architecture can be used to formulate rules of thumb for choosing or rejecting certain targets for genetic manipulation. Indeed, it has been observed on several occasions that perturbations of the metabolic network that adversely affect either precursor or hub metabolites (e.g., ATP) often lead to deleterious phenotypes. The central role of metabolite hubs can, on the other hand, also be exploited for redirecting fluxes toward desired products.18,19 The plug and play nature of the “fan-in” and “fan-out” parts of bow ties can be exploited for creating super hosts for the production of heterologous proteins or secondary metabolites. Moreover, these rules of thumb will also help in devising better strategies for combating infectious microorganisms20 and understanding metabolic diseases.17 An understanding of large-scale organizational principles can thus lead to more complete modeling platforms for in silico metabolic engineering. This can be achieved, e.g., by exploiting the general principles of operation rather than focusing on very detailed kinetic modeling, for which reliable in vivo information is difficult to obtain on a whole-network scale.
17.2.2.2 Fluxes and Metabolic Network Structure
In contrast to protein–protein interaction networks, where a good correlation has been observed between the essentiality of a protein for growth and the number of interactions in which it takes part, no such strong correlation has been observed in metabolic networks.18,21 Thus, the operation of metabolic networks appears to be fundamentally different from that of other biological networks (e.g., protein–protein interaction networks) and technological networks (e.g., the internet). Furthermore, connectivity in commonly used graph theoretical representations of a metabolic network does not completely represent mass and energy flow through the network. This is because the stoichiometry and the transfer of structural moieties between metabolites are generally not accounted for in graph representations. Consequently, although the topology of a metabolic network implies a small world, this characteristic high connectivity does not hold when strict biochemical transformation networks are considered.22 Since most metabolic engineering applications are aimed at the manipulation and redirection of fluxes, it is vital to account for the relationships between different reactions in the network not only at the shared-metabolite level (as reflected in the topology), but also at the flux level. Such relationships can be elucidated only by systematically accounting for the stoichiometry of all reactions involved.
Flux coupling analysis, an elegant mathematical formulation reported by Burgard et al.,23 can be used to identify connectivity at the level of fluxes (flux coupling) under the assumption of steady-state operation. Flux coupling analysis uses linear programming to decide whether flux through a particular reaction implies a fixed or variable flux through other reactions such that no metabolites are accumulated or depleted in the cell. Thus, two fluxes f1 and f2 can be (i) fully coupled, i.e., a nonzero flux for f1 implies a nonzero and fixed flux for f2 and vice versa; (ii) partially coupled, i.e., a nonzero flux for f1 implies a nonzero, though variable, flux for f2 and vice versa; (iii) directionally coupled, i.e., a nonzero flux for f1 implies a nonzero flux for f2 but not necessarily the reverse; or (iv) uncoupled, i.e., the two fluxes operate independently. A comparison of Figure 17.3c and e shows the difference between the reaction interaction graph and the flux coupling graph for the same metabolic network. In particular, the flux coupling graph extends much further than the connectivity implied by the metabolites participating in the corresponding reactions. Flux coupling analysis can thus not only greatly aid metabolic engineering by revealing distant and nonintuitive relationships, but also provide a new representation of the metabolic network that can be used as a data integration scaffold. Interestingly, the topology of the genome-scale E. coli flux coupling graph also shows a scale-free architecture,23 and so does the global organization of fluxes.24 Thus, a metabolic network is characterized by large fluxes through a few reactions, while most reactions carry relatively low fluxes. A few of the fluxes also act as hubs by being coupled to a large number of fluxes throughout the network. This topological similarity between different structural counterparts of a metabolic network underscores the common global principle of their operation.
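The four coupling categories can be illustrated with a small sketch. It does not reproduce the linear programming formulation of Burgard et al.; instead it swaps in a simpler shortcut that reads the coupling type off a given list of flux modes, which is valid only under the assumption that all reactions are irreversible and that the list contains every elementary mode of the network. The two modes below are hypothetical and only loosely follow the homolactic and mixed-acid routes of Figure 17.3a; the function name is also an assumption.

```python
def classify_coupling(modes, r1, r2, tol=1e-9):
    """Classify the coupling of reactions r1 and r2 from a list of flux modes.

    Each mode maps reaction names to non-negative fluxes (all reactions are
    assumed irreversible); reactions absent from a mode carry zero flux.
    """
    uses_r1 = [m for m in modes if m.get(r1, 0) > tol]
    uses_r2 = [m for m in modes if m.get(r2, 0) > tol]
    if not uses_r1 or not uses_r2:
        return "blocked"  # at least one reaction never carries flux

    r1_implies_r2 = all(m.get(r2, 0) > tol for m in uses_r1)
    r2_implies_r1 = all(m.get(r1, 0) > tol for m in uses_r2)

    if r1_implies_r2 and r2_implies_r1:
        # Same set of modes in both directions: compare the flux ratios.
        ratios = [m[r1] / m[r2] for m in uses_r1]
        fixed = all(abs(r - ratios[0]) <= tol for r in ratios)
        return "fully coupled" if fixed else "partially coupled"
    if r1_implies_r2:
        return f"directionally coupled ({r1} -> {r2})"
    if r2_implies_r1:
        return f"directionally coupled ({r2} -> {r1})"
    return "uncoupled"


# Hypothetical modes, for illustration only.
modes = [
    {"Glycolysis": 1, "LDH": 2},
    {"Glycolysis": 1, "PFL": 2, "PTA": 1, "ACKA": 1, "ADHE": 1, "ADHA": 1},
]

print(classify_coupling(modes, "PTA", "ACKA"))
print(classify_coupling(modes, "LDH", "Glycolysis"))
print(classify_coupling(modes, "LDH", "PFL"))
```

For genome-scale networks, enumerating all elementary modes is generally impractical, which is one reason the linear programming formulation cited above is used instead.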
17.2.2.3 Network Structure and Regulation
The relevance of graph theoretical analysis to metabolic engineering is perhaps not as directly evident as that of flux coupling analysis (and other stoichiometry-centered steady-state approaches). However, in several metabolic engineering problems in microorganisms, as well as in mammalian and plant cells, dynamic metabolic operation is of interest. Furthermore, steady-state analysis approaches typically reveal only the boundaries of the operation of metabolic fluxes, and thus the observed solution is not always theoretically deducible from the stoichiometry alone. Cellular metabolism, as reflected in metabolite levels and fluxes, is an integrated result of mass balance constraints (stoichiometric constraints) and regulation at several different levels. The inherent interdependency between enzymatic regulation, metabolite levels, and fluxes is therefore partially reflected in the high connectivity of metabolic graphs. Both metabolite and enzyme nodes potentially contribute to the regulation of metabolite levels and fluxes. Disturbances at any node (or nodes) of the network can then spread through a highly connected network in the form of changes in metabolite levels, enzyme levels, and fluxes. Consequently, it can be hypothesized that the topology of the interactions involved in metabolism can be used to understand the underlying regulatory mechanisms (e.g., at the transcriptional level) controlling the flow of mass and energy. This hypothesis was formalized into an algorithm by Patil and Nielsen.4 The algorithm integrates gene-expression data with topological information from genome-scale metabolic models, and thus enables systematic identification of so-called reporter metabolites, which represent hot spots of metabolic regulation. Several metabolites, especially those with high connectivity, usually span many pathways and act as connecting bridges across these pathways.
Consequently, pathways as a whole are not subject to strict stoichiometric/thermodynamic constraints on their own. Constraints on a pathway can thus be invoked only in connection with other, connected pathways, owing to the overlap of metabolites across pathways. In contrast, coordinated transcriptional changes around metabolites are indeed necessary for one or both of two reasons: either to maintain homeostasis, or to change the enzyme and metabolite levels so as to adjust to the new flux demands placed on the metabolic network by one or more perturbations. Thus, the transcriptional coregulation of the genes surrounding a metabolite is, in part, a stoichiometric and thermodynamic necessity, and reporter metabolites indicate specific parts of metabolism where significant transcriptional regulation is exerted.
In order to identify the reporter metabolites each metabolite node in a metabolic graph is scored based on the normalized transcriptional response of its neighboring enzymes.
\[ Z_{\text{metabolite}} = \frac{1}{\sqrt{k}} \sum_{i=1}^{k} Z_{ni} \]
where Zni is the score of the ith neighboring enzyme, typically estimated as the inverse normal cumulative distribution of the p-value indicating the significance of the expression change, and k is the number of neighboring enzymes. Zmetabolite scores should be corrected for the background distribution by subtracting the mean (µk) and dividing by the standard deviation (σk) of the aggregated Z-scores of several sets of k enzymes chosen randomly from the metabolic graph.
\[ Z_{\text{metabolite}}^{\text{corrected}} = \frac{Z_{\text{metabolite}} - \mu_k}{\sigma_k} \]
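The scoring and background correction above can be sketched as follows. This is a minimal illustration and not the published implementation: the p-values are invented, the enzyme names are borrowed from Figure 17.3, the function names are hypothetical, and it assumes that Z-scores are obtained as the inverse normal cumulative distribution of 1 − p and aggregated with a 1/√k normalization.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def enzyme_z_score(p_value):
    """Convert a p-value (strictly between 0 and 1) into a Z-score."""
    return NormalDist().inv_cdf(1.0 - p_value)


def metabolite_score(z_scores):
    """Aggregate the Z-scores of the k neighbouring enzymes: sum / sqrt(k)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def corrected_score(neighbour_z, all_z, n_samples=1000, seed=0):
    """Correct a metabolite score against random enzyme sets of the same size k."""
    rng = random.Random(seed)
    k = len(neighbour_z)
    background = [metabolite_score(rng.sample(all_z, k)) for _ in range(n_samples)]
    mu_k, sigma_k = mean(background), stdev(background)
    return (metabolite_score(neighbour_z) - mu_k) / sigma_k


# Invented p-values for the expression change of each enzyme-coding gene.
p_values = {"LDH": 0.001, "PFL": 0.004, "PDH": 0.02, "ALS": 0.3,
            "PTA": 0.5, "ACKA": 0.6, "ADHE": 0.7, "ADHA": 0.9}
z = {enzyme: enzyme_z_score(p) for enzyme, p in p_values.items()}

# Enzymes assumed to neighbour pyruvate in the metabolic graph.
pyruvate_neighbours = ["LDH", "PFL", "PDH", "ALS"]
score = corrected_score([z[e] for e in pyruvate_neighbours], list(z.values()))
print(score > 0)
```

In this toy data set the enzymes around pyruvate carry the most significant expression changes, so the corrected score lies above the random background.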
The scoring used for identifying reporter metabolites is essentially a test of the null hypothesis “the neighbor enzymes of a metabolite in the metabolic graph show the observed normalized transcriptional response by chance.” Metabolites with a significant score are defined as reporter metabolites.
17.2.2.4 Transcriptionally Responsive Sub Networks
Changes in a metabolic network are characterized by coordinated changes throughout the network. An extension of the reporter algorithm4 searches the enzyme interaction graph to identify a sub network with maximum collective transcriptional response. Thus, while reporter metabolites probe local points in the metabolic network for significant changes, sub networks paint a global picture of the transcriptional regulation. Both reporter metabolites and sub networks can find small but coordinated changes in a network without a priori assumptions about particular pathway structures. Together, these tools have been successfully employed to correlate transcriptional changes with flux changes in a mutant strain.4 Owing to the strong biological hypothesis underlying the reporter algorithm, it can also easily be used to integrate other omics data with metabolic networks. One example is the use of metabolome data together with transcriptome data to predict whether a particular flux is controlled at the hierarchical level or the metabolic level.25 At present, it is not possible to reliably estimate fluxes in many different parts of metabolism, while mRNA expression can be measured for all genes in a sequenced organism. Moreover, metabolome and proteome data are becoming increasingly available for different parts of metabolism. Consequently, the reporter and sub network algorithms are valuable tools for obtaining a holistic picture of metabolic changes, even from the flux point of view. In cases where fluxome data are available, they can be used to improve the results/predictions of the reporter/sub network algorithms.
Another approach that uses stoichiometric constraints in addition to topology to elucidate regulatory logic is based on elementary flux modes. Stelling et al.26 introduced the concept of control-effective flux, which accounts for both network efficiency and flexibility at a particular node in the network. The control-effective flux is defined as the average flux through a reaction over all elementary modes, whereby for each mode the actual flux is weighted by the mode's efficiency in supporting cellular growth. Transcriptional changes were found to correlate well with control-effective fluxes for several metabolic genes, results that could not be explained by considering only optimal routes. Considering all elementary flux modes thus accounts for network flexibility, an important characteristic that bestows robustness on cellular networks.
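The idea of an efficiency-weighted average over elementary modes can be sketched as below. This follows only the verbal definition given above: the exact efficiency measure and normalization used by Stelling et al. are not reproduced here, and the modes, efficiency values, and function name are hypothetical.

```python
def efficiency_weighted_flux(modes, efficiencies, reaction):
    """Average flux through `reaction` over all modes, weighted by mode efficiency."""
    total_weight = sum(efficiencies)
    weighted = sum(eff * mode.get(reaction, 0.0)
                   for mode, eff in zip(modes, efficiencies))
    return weighted / total_weight


# Two hypothetical elementary modes sharing reaction R1, with assumed
# efficiencies (e.g., growth supported per unit of total flux).
modes = [
    {"R1": 1.0, "R2": 1.0},
    {"R1": 1.0, "R3": 1.0},
]
efficiencies = [0.75, 0.25]

for reaction in ("R1", "R2", "R3"):
    print(reaction, efficiency_weighted_flux(modes, efficiencies, reaction))
```

A reaction used by every mode (R1) keeps its full weight, whereas a reaction used only by an inefficient mode (R3) is weighted down, which is the sense in which the measure combines efficiency with flexibility.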
17.3 Network Functionality at Metabolite Level
Owing to the high connectivity between and within various metabolic processes, the space of possible flux distributions in a given metabolic network is very large. In other words, substrates consumed by cells can
be distributed through metabolic channels in numerous ways. Rechanneling this mass flow toward a desired compound thus demands an understanding of the biological basis of a particular flux distribution under a given condition. This task is challenging owing to the complexity of the factors constraining and regulating fluxes. The flux through any given reaction in the network is an (often unknown) integrated function of enzyme activity, substrate and product concentrations, and the underlying kinetic mechanisms. Enzyme activity is in turn a function of the transcriptional and translational efficiency of the corresponding protein as well as the accompanying regulatory mechanisms. Thus, a given flux can be thought of as being regulated at the hierarchical level (from gene to enzyme activity) and at the metabolic level (kinetic dependence of flux on metabolite pools).27 Since it is now possible to quantify a large number of intracellular metabolite pools, it is possible to infer whether reactions are hierarchically or metabolically regulated. For example, the principle underlying the reporter metabolite algorithm can be used to map different layers of regulation within metabolic networks through a combination of metabolome and transcriptome data.28 However, such analysis usually reveals the regulatory architecture only in qualitative terms and only for a given set of experimental conditions. Indeed, the high connectivity of cellular processes at both the hierarchical and metabolite levels, as well as regulatory interactions, contribute to the complexity of the dependence of flux on genotype under given environmental conditions. On the other hand, this complexity can be conveniently exploited by viewing fluxes as an integrated outcome of all complex cellular processes.29 Full exploitation of this view motivates the development of tools for measuring in vivo fluxes in the system under investigation.
One useful simplification that can be applied on both the theoretical and experimental fronts of flux measurement is the assumption of (pseudo) steady state. We briefly discuss both theoretical and experimental flux-estimation tools in the following text.
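As a hedged illustration of what the (pseudo) steady-state assumption implies, the metabolite balances can be written compactly as below. The symbols are notation assumed here for illustration and do not appear in the original text: c is the vector of intracellular metabolite concentrations, S the stoichiometric matrix, and v the vector of fluxes. Dilution of metabolite pools by growth is neglected in this sketch.

```latex
\frac{d\mathbf{c}}{dt} = \mathbf{S}\,\mathbf{v} \approx \mathbf{0}
```

Each row of S then contributes one linear constraint on the fluxes, which is what turns flux estimation into a linear-algebra problem.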
17.3.1 Experimental Estimation of Fluxes
There are no direct methods available for the analysis of in vivo metabolic fluxes. Intracellular fluxes, or in vivo reaction rates, can instead be quantified by combining experimental metabolite measurements with mass balances applied around intracellular metabolites. The mass balances are based on the stoichiometry of the intracellular reactions included in the metabolic model, and are largely based on assumed biochemistry.30 Under the steady-state assumption mentioned above, the balances around each metabolite impose a number of constraints on the system for a given metabolic network. In general, if there are J fluxes and K metabolites, then the number of degrees of freedom is F = J – K, and by measuring only F fluxes (biosynthetic requirement (μ), nutrient uptake rates (–rs), and product secretion rates (rp)), the remaining fluxes can be calculated. Although this methodology works well for linear reaction sequences, it often fails for intermediary metabolism. Limited data and stoichiometric constraints often lead to an underdetermined system that does not allow the flux distribution to be resolved uniquely. One approach to overcome this limitation is to combine metabolite balancing with feeding labeled tracers (stable isotopes) to the cells and measuring the distribution of labeling in the different intracellular metabolites. Several experimental techniques for analysis of the enrichment pattern in intracellular metabolites have been developed (for an excellent review, see Ref. 31). All these techniques are currently based on nuclear magnetic resonance (NMR)32 or gas chromatography-mass spectrometry (GC-MS).33 Due to the low intracellular concentration of central metabolites, it is impractical to use these compounds for the analysis of labeling patterns. However, since central metabolites are converted to amino acids, this labeling information is stored in the respective proteins through conserved biosynthetic pathways.
The proteins can then be hydrolyzed to release the labeled proteinogenic amino acids which can be further analyzed using NMR or GC-MS. A consequence of the use of proteinogenic amino acids for analysis is that steady state cultivation is required for flux quantification through the 13C tracer approach. However, 13C-labeling methods can be applied in batch cultivation for quantitative assessment of flux distribution if there is sampling in the exponential growth phase after several doublings of the biomass concentration.
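The metabolite-balancing calculation described earlier in this section (F = J – K degrees of freedom, with F measured rates fixing the remaining fluxes) can be sketched as follows. This is a minimal sketch under stated assumptions: the five-reaction network, the choice of measured fluxes, and the numerical values are hypothetical and chosen only so that the linear system is small and uniquely solvable.

```python
import numpy as np

# Hypothetical toy network (not from the text): 3 balanced metabolites, 5 fluxes.
#   v0: -> A            substrate uptake (treated as measured)
#   v1: A -> B
#   v2: 2 A -> B + P
#   v3: B ->            drain to biomass
#   v4: P ->            product secretion (treated as measured)
S = np.array([
    [1, -1, -2,  0,  0],   # balance around A
    [0,  1,  1, -1,  0],   # balance around B
    [0,  0,  1,  0, -1],   # balance around P
], dtype=float)

K, J = S.shape
# Degrees of freedom; this equals J - K when the K balances are linearly independent.
F = J - int(np.linalg.matrix_rank(S))

measured = {0: 10.0, 4: 2.0}            # flux index -> made-up measured rate
unknown = [j for j in range(J) if j not in measured]

# Split S v = 0 into measured and unknown parts: S_u v_u = -S_m v_m
S_m = S[:, list(measured)]
v_m = np.array(list(measured.values()))
S_u = S[:, unknown]
v_u = np.linalg.solve(S_u, -S_m @ v_m)  # requires a square, nonsingular S_u

v = np.zeros(J)
v[list(measured)] = v_m
v[unknown] = v_u

print("degrees of freedom:", F)
print("flux vector:", v)
```

If the matrix of unknown columns is singular or not square, the system is underdetermined (or overdetermined) in the sense discussed above, and `np.linalg.solve` raises an error rather than returning a unique answer.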
Once NMR or MS spectra are recorded, the next step is the quantitative interpretation of the isotopomer data by employing mathematical models that describe the relationship between fluxes and the observed isotopomer abundances. Similar to metabolite balancing, balances can be set up around all isotopomers of a particular metabolite. Schmidt et al.34 described an elegant matrix-based method for automatically generating the complete set of isotopomer balances. Other approaches include cumulative isotopomers (cumomers),35 bondomers,36 and summed fractional labelings.37 Such comprehensive accounting of all available physiological and isotopomer data from a single experiment retrieves the maximum information through data integration. Although the mathematical framework of metabolic flux analysis (MFA) has emerged as a tool of great significance, an important limitation is the large search space over which the flux distribution must be optimized, which is computationally expensive. This cost becomes especially limiting when multiple isotopic tracers are used to label the system, and it often reduces the ability of MFA to fully utilize the power of multiple tracers in elucidating the physiology of the organism. Recently, Antoniewicz and coworkers38 proposed a mathematical framework based on elementary metabolite units (EMUs). It relies on a highly efficient decomposition method that uses knowledge of the atomic transitions occurring in the network reactions to identify the minimum amount of information needed to simulate isotopic labeling within a reaction network. For a gluconeogenesis pathway labeled with 2H, 13C, and 18O, this reduced the problem from two million isotopomers to 354 EMUs.
Apart from this, new flux estimation tools are emerging that use information from the direct detection of 13C patterns in pathway intermediates rather than in proteinogenic amino acids or accumulated extracellular metabolites.39 This approach has so far been demonstrated for only a few selected metabolites, and the method is not yet suitable for more global analysis.
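To make the isotopomer bookkeeping discussed in this section concrete, the sketch below enumerates the positional 13C isotopomers of a carbon backbone and collapses them into the mass isotopomer fractions that an MS measurement reports. The function names and the example labeling pool are hypothetical and serve only as an illustration; this is not the matrix-based, cumomer, or EMU method itself.

```python
from itertools import product


def isotopomers(n_carbons):
    """All labeling patterns of a carbon backbone (0 = 12C, 1 = 13C per position)."""
    return list(product((0, 1), repeat=n_carbons))


def mass_isotopomer_distribution(fractions, n_carbons):
    """Collapse positional isotopomer fractions into M+0 ... M+n fractions."""
    mid = [0.0] * (n_carbons + 1)
    for pattern, fraction in fractions.items():
        mid[sum(pattern)] += fraction   # number of 13C atoms sets the mass shift
    return mid


# The number of isotopomers doubles with every carbon atom: 2**n patterns.
n_three = len(isotopomers(3))   # a three-carbon metabolite
n_six = len(isotopomers(6))     # a six-carbon metabolite

# Made-up pool: half unlabeled, a quarter labeled at C1 only, a quarter fully labeled.
pool = {(0, 0, 0): 0.5, (1, 0, 0): 0.25, (1, 1, 1): 0.25}
mid = mass_isotopomer_distribution(pool, 3)

print(n_three, n_six, mid)
```

The exponential growth in the number of patterns is the reason full isotopomer balances become expensive, and it grows faster still when several elements are labeled at once.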
17.3.2 In Silico Prediction of Fluxes
Available experimental methods for intracellular flux measurement are often limited to only a part of the whole metabolism. This limitation is problematic when studying systems at the global level and in cases where the fluxes of interest lie outside the scope of experimental determination. In these situations, computational methods for predicting fluxes are desirable. More importantly, theoretical flux prediction tools allow mutants to be designed in silico. Because the flux solution space is overwhelmingly large, even under steady-state assumptions, it is not computationally feasible to enumerate all possible flux solutions under a given condition. One way to overcome this problem is to simulate fluxes by optimizing a functional property of the network. Such an optimization function can be viewed as a biological objective of the cellular metabolism. For bacteria and simple eukaryotes such as Saccharomyces cerevisiae, it has been demonstrated that this objective function can be chosen to be the formation of biomass building blocks and/or the maximization of energy production. This objective function can be formulated simply as a linear combination of fluxes in the metabolic network. Under steady-state assumptions this results in a linear optimization problem, and the approach is often referred to as flux balance analysis (FBA). Thus, given a metabolic network (in the form of a stoichiometric matrix) and experimentally measured or hypothesized constraints on the uptake of substrates, FBA yields a metabolic flux distribution that maximizes, e.g., biomass formation. FBA with biomass formation (or growth rate, when the substrate uptake rate is fixed) as the objective function has been shown to successfully predict gene essentiality in single gene deletion mutants of E. coli40 and S. cerevisiae.41 Moreover, several nonoptimally growing E. coli single gene deletion mutants were observed to evolve toward the FBA-predicted optimal solution.40 Biomass formation thus seems to be a useful objective function for predicting intracellular fluxes in microbial systems with FBA, although notable exceptions exist.42 Another limitation of the FBA approach is the nonuniqueness of the flux solution obtained under many physiological conditions. Additional constraints become necessary to resolve such ambiguities; these can be obtained, e.g., from experimental measurements of some of the fluxes.
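The FBA formulation described above (a linear objective maximized subject to steady-state balances and uptake constraints) can be sketched with a general-purpose linear programming solver. The example below is an illustration under stated assumptions: the toy network, the uptake limit of 10, and the use of SciPy's `linprog` are choices made here and are not taken from the original text or from any particular FBA package.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: rows are metabolites A, B, P; columns are fluxes.
#   v0: -> A (uptake)   v1: A -> B   v2: 2 A -> B + P
#   v3: B -> (growth)   v4: P -> (secretion)
S = np.array([
    [1, -1, -2,  0,  0],
    [0,  1,  1, -1,  0],
    [0,  0,  1,  0, -1],
], dtype=float)

# All reactions irreversible; uptake limited to an assumed 10 units.
bounds = [(0, 10), (0, None), (0, None), (0, None), (0, None)]
GROWTH = 3

# linprog minimizes, so growth is maximized by minimizing its negative.
c = np.zeros(S.shape[1])
c[GROWTH] = -1.0

result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
v_fba = result.x

print("solver message:", result.message)
print("optimal fluxes:", np.round(v_fba, 6))
```

In this small case the optimum is unique because the second route to B wastes substrate on P. In genome-scale networks alternative optima are common, which is the nonuniqueness limitation noted above.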
The FBA approach basically assumes optimal operation of the metabolic network. This assumption is justified on the grounds that cells have a long evolutionary history of maximizing their growth. Engineered mutants have no such history, however, and the assumption of optimality may easily become invalid for them. In an alternative approach to FBA, Segre and coworkers43 proposed that the flux distribution in a mutant strain lies at minimal distance from the flux map of the reference metabolic network (wild type). The metabolic objective for mutant strains can thus be formulated as minimization of metabolic (flux) adjustment (MOMA). The MOMA approach usually predicts changes in a large number of fluxes, which may represent a high adaptation cost for the perturbed cell. Shlomi and others44 therefore proposed a computational method termed regulatory on/off minimization (ROOM), in which the number of flux changes in a perturbed strain is minimized. Some evidence, although not conclusive, suggests that a genetic perturbation initially leads to a flux distribution predicted by MOMA and then eventually converges to a solution predicted by FBA or ROOM.44 All three strategies (FBA, MOMA, and ROOM) consider thermodynamic constraints only partially, in the form of the directionality of fluxes. Beard et al.45 impose additional thermodynamic constraints on the system more explicitly in order to improve the FBA solution.
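The minimal-distance idea behind MOMA can be sketched as a small quadratic program: find the steady-state flux vector of the mutant that is closest, in the Euclidean sense, to a wild-type reference. The toy network, the reference vector, and the use of SciPy's SLSQP solver are assumptions made for this illustration; a dedicated quadratic programming solver would normally be used for a genome-scale model.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network: metabolites A, B, P; fluxes as listed.
#   v0: -> A (uptake)   v1: A -> B   v2: 2 A -> B + P
#   v3: B -> (growth)   v4: P -> (secretion)
S = np.array([
    [1, -1, -2,  0,  0],
    [0,  1,  1, -1,  0],
    [0,  0,  1,  0, -1],
], dtype=float)

v_wt = np.array([10.0, 10.0, 0.0, 10.0, 0.0])   # assumed wild-type reference fluxes
upper = [10.0, None, None, None, None]          # assumed upper bounds (None = unbounded)


def moma(S, v_ref, upper, knockouts):
    """Steady-state flux vector closest to v_ref after deleting some reactions."""
    keep = [j for j in range(S.shape[1]) if j not in knockouts]
    S_k, ref_k = S[:, keep], v_ref[keep]
    res = minimize(
        fun=lambda x: float(np.sum((x - ref_k) ** 2)),   # squared distance
        x0=np.zeros(len(keep)),                          # zero flux is feasible
        jac=lambda x: 2.0 * (x - ref_k),
        bounds=[(0.0, upper[j]) for j in keep],          # irreversible reactions
        constraints=[{"type": "eq",
                      "fun": lambda x: S_k @ x,          # steady state
                      "jac": lambda x: S_k}],
        method="SLSQP",
    )
    v = np.zeros(S.shape[1])
    v[keep] = res.x          # deleted reactions stay at zero flux
    return v


v_moma = moma(S, v_wt, upper, knockouts={1})
print("MOMA fluxes after deleting v1:", np.round(v_moma, 4))
```

For this toy case the distance-minimizing mutant grows at 30/7 (about 4.29), whereas maximizing growth in the same mutant would give 5. The difference illustrates how MOMA trades optimality for proximity to the reference state.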
17.3.3 The Fluxome in Metabolic Engineering: Applications
Genome-scale stoichiometric models represent the integrated metabolic potential of a microorganism by defining flux-balance constraints that characterize all feasible metabolic phenotypes under steady-state conditions. Combinatorial complexity prevents the calculation of all feasible metabolic phenotypes that a microbial genotype can assume under given environmental conditions. One approach to determining the metabolic phenotype (i.e., the fluxes through all metabolic reactions) is to use FBA, MOMA, or ROOM, desirably in combination with experimentally measured fluxes. All these methods provide a basis for using genome-scale metabolic models to predict possible metabolic phenotypes, and hence for in silico metabolic engineering. The algorithm developed by Maranas et al.46 (named OptKnock) represents one of the first rational modeling frameworks for suggesting gene knockouts that lead to the overproduction of a desired metabolite. OptKnock searches for a set of gene (reaction) deletions that maximizes the flux toward a desired product while the internal flux distribution still operates such that growth (or another biological objective) is optimized. Thus, the identified gene deletions force the microorganism to produce the desired product in order to achieve maximal growth. The design philosophy underlying the OptKnock approach thereby takes advantage of inherent properties of microbial metabolism to drive the optimization of the desired metabolic phenotype. This link between OptKnock and the biological objectives of microorganisms makes it an attractive and promising modeling framework for in silico metabolic engineering.
The same modeling framework can be extended to determine the optimal set of new genes to be added to a given host for the production of new compounds or for optimizing the production of native molecules of interest.47,48 OptKnock is implemented by formulating a bi-level linear optimization problem that is solved using mixed integer linear programming (MILP), which guarantees that the global optimal solution is found. The applicability of the OptKnock approach can be extended by formulating the in silico design problem using a genetic algorithm (GA), hereafter referred to as OptGene.49 The direct relation of GAs to biological evolution makes them a natural choice for identifying suitable genetic modifications for an improved metabolic phenotype. The OptGene formulation has two major advantages. First, OptGene demands relatively little computational time and thus makes it possible to solve more complex problems. This is of particular importance because the search space (the combinations of enzymes to be deleted) grows combinatorially with the size of the problem (as defined by the number of enzymes and the number of deletions desired). Second, the OptGene formulation allows the optimization of nonlinear objective functions, which is relevant to several problems of commercial interest. One example of an important nonlinear engineering objective function is productivity (the amount of product formed per unit time).
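The knockout-design idea can be illustrated on a toy network by brute force: every candidate deletion set is tested by maximizing growth and then asking how much product is guaranteed at that growth. This is explicitly not the bilevel MILP of OptKnock nor the genetic algorithm of OptGene; exhaustive enumeration is swapped in here only because it is easy to follow, and it is feasible only for very small candidate sets. The network, reaction names, and thresholds are hypothetical.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: metabolites A, B, P.
S = np.array([
    [1, -1, -2,  0,  0],   # A
    [0,  1,  1, -1,  0],   # B
    [0,  0,  1,  0, -1],   # P
], dtype=float)
names = ["uptake", "direct", "coupled", "growth", "secretion"]
bounds = [(0, 10), (0, None), (0, None), (0, None), (0, None)]
GROWTH, PRODUCT = 3, 4
candidates = [1, 2]        # internal reactions that may be deleted


def evaluate(knockouts):
    """Return (maximal growth, product flux guaranteed at that growth)."""
    n = S.shape[1]
    zeros = np.zeros(S.shape[0])
    b = [(0, 0) if j in knockouts else bounds[j] for j in range(n)]
    c = np.zeros(n)
    c[GROWTH] = -1.0                       # inner problem: maximize growth
    inner = linprog(c, A_eq=S, b_eq=zeros, bounds=b)
    if inner.status != 0:
        return 0.0, 0.0
    growth = inner.x[GROWTH]
    # Second LP: smallest product flux compatible with (near-)optimal growth.
    b[GROWTH] = (max(growth - 1e-9, 0.0), None)
    c2 = np.zeros(n)
    c2[PRODUCT] = 1.0
    worst = linprog(c2, A_eq=S, b_eq=zeros, bounds=b)
    product = worst.x[PRODUCT] if worst.status == 0 else 0.0
    return growth, product


def best_knockouts(max_deletions, min_growth=0.1):
    """Exhaustively search deletion sets; keep the one with most guaranteed product."""
    best = ((), 0.0, 0.0)
    for k in range(1, max_deletions + 1):
        for ko in combinations(candidates, k):
            growth, product = evaluate(ko)
            if growth >= min_growth and product > best[2]:
                best = (ko, growth, product)
    return best


ko, growth, product = best_knockouts(max_deletions=2)
print("delete:", [names[j] for j in ko], "growth:", growth, "product:", product)
```

Deleting the direct route forces all flux through the reaction that co-produces P, so product formation becomes coupled to growth. The number of deletion sets grows combinatorially with the number of candidates, which is why OptKnock and OptGene replace enumeration with MILP and evolutionary search, respectively.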
17.3.4 Kinetic Models for Flux Simulations
Steady-state models of metabolism show good promise for predicting and exploiting the flux phenotype of cells for metabolic engineering. The assumption of steady state, however, is not valid under several conditions of practical importance, e.g., batch and fed-batch cultivations. Furthermore, a solution predicted by a steady-state model may not be realizable in light of the kinetic characteristics of the system and the given initial state of the metabolic network. Although a full kinetic model of the system is desirable, present-day experimental techniques are far from being able to provide all the necessary in vivo kinetic parameters and an accurate metabolic state (e.g., the concentrations of all metabolites). Nevertheless, several metabolic engineering strategies based on kinetic modeling of metabolism are being proposed.50–53 These modeling frameworks are, in general, limited to small-scale metabolic models, which may still be practically relevant.
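A minimal kinetic sketch can show how a flux depends on a metabolite pool and why steady state is reached only after a transient. The single-metabolite pathway, the Michaelis-Menten rate law, the parameter values, and the explicit Euler integrator are all assumptions chosen for illustration; they are not taken from the modeling frameworks cited above.

```python
def simulate(v_in=1.0, vmax=2.0, km=0.5, x0=0.0, dt=0.01, t_end=20.0):
    """Explicit Euler integration of dX/dt = v_in - vmax * X / (km + X).

    X is the concentration of one intermediate that is supplied at a constant
    rate v_in and consumed by an enzyme with Michaelis-Menten kinetics.
    """
    x, t = x0, 0.0
    trajectory = [(t, x)]
    for _ in range(int(round(t_end / dt))):
        v_out = vmax * x / (km + x)     # consumption flux depends on the pool size
        x += dt * (v_in - v_out)
        t += dt
        trajectory.append((t, x))
    return trajectory


trajectory = simulate()
x_final = trajectory[-1][1]

# Analytical steady state for comparison: v_in = vmax * X / (km + X)
x_steady = 0.5 * 1.0 / (2.0 - 1.0)      # km * v_in / (vmax - v_in)
v_out_final = 2.0 * x_final / (0.5 + x_final)

print("final concentration:", x_final)
print("steady-state concentration:", x_steady)
print("final consumption flux:", v_out_final)
```

At the start of the simulation the consumption flux is zero although the supply flux is not, so a steady-state balance would misrepresent the early transient. Only once the pool has filled do the two fluxes match.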
17.4 Conclusions and Future Perspective
Understanding the “genome to fluxome” relationship is key to the rational design of microbial cells through metabolic engineering. Unraveling such a relationship (even partially), however, is not easy due to the highly nonlinear and complex nature of cellular organization and operation. This challenging task is to some extent being addressed through (i) simplifying assumptions such as steady state and (ii) the deduction of general principles of metabolic regulation through hypothesis-driven methods (e.g., FBA, MOMA, and reporter metabolites). Although these methods have succeeded in expanding our knowledge and our capabilities for developing new rational tools for metabolic engineering, current methods account for only a small fraction of the cellular complexity and nonlinearity. Thus, new tools need to be developed that will allow us to generate quantitative metabolomic and fluxomic data spanning the different species and environmental conditions of interest. Novel model-based and hypothesis-driven computational tools will be necessary to uncover and exploit the patterns emerging from these datasets. The lack of such algorithmic tools is a bottleneck even for presently available datasets, such as genome, transcriptome, and (to a limited extent) fluxome and metabolome information. In particular, tools are needed that use genome-scale metabolic models in combination with experimental flux measurements to obtain global flux maps.
References
1. Crick, F. Central dogma of molecular biology. Nature, 1970, 227 (5258), 561–563.
2. Nielsen, J. and Oliver, S. The next wave in metabolome analysis. Trends Biotechnol., 2005, 23 (11), 544–546.
3. Peregrin-Alvarez, J. M., Tsoka, S., and Ouzounis, C. A. The phylogenetic extent of metabolic enzymes and pathways. Genome Res., 2003, 13 (3), 422–427.
4. Patil, K. R. and Nielsen, J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. PNAS, 2005, 102 (8), 2685–2689.
5. Huynen, M. A., Dandekar, T., and Bork, P. Variation and evolution of the citric-acid cycle: a genomic perspective. Trends Microbiol., 1999, 7 (7), 281–291.
6. Stryer, L. Biochemistry, 4th ed. W.H. Freeman & Company, New York, 2005.
7. Woese, C. The universal ancestor. PNAS, 1998, 95 (12), 6854–6859.
8. Ratcliffe, R. G. and Shachar-Hill, Y. Measuring multiple fluxes through plant metabolic networks. Plant J., 2006, 45 (4), 490–511.
9. Ideker, T., Ozier, O., Schwikowski, B., and Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics, 2002, 18 (90001), 233S–240.
10. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. The KEGG resource for deciphering the genome. Nucleic Acids Res., 2004, 32 Database issue, D277–D280.
11. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., and Barabasi, A. L. The large-scale organization of metabolic networks. Nature, 2000, 407 (6804), 651–654.
12. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., and Barabasi, A. L. Hierarchical organization of modularity in metabolic networks. Science, 2002, 297 (5586), 1551–1555.
13. Fell, D. A. and Wagner, A. The small world of metabolism. Nat. Biotechnol., 2000, 18 (11), 1121–1122.
14. Malygin, A. G. Structure-chemical approach to organization of information on metabolic charts. Biochemistry (Mosc.), 2004, 69 (12), 1379–1385.
15. Gagneur, J., Jackson, D. B., and Casari, G. Hierarchical analysis of dependency in metabolic networks. Bioinformatics, 2003, 19 (8), 1027–1034.
16. Ma, H. W. and Zeng, A. P. The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics, 2003, 19 (11), 1423–1430.
17. Csete, M. and Doyle, J. Bow ties, metabolism and disease. Trends Biotechnol., 2004, 22 (9), 446–450.
18. Roca, C., Nielsen, J., and Olsson, L. Metabolic engineering of ammonium assimilation in xylose-fermenting Saccharomyces cerevisiae improves ethanol production. Appl. Environ. Microbiol., 2003, 69 (8), 4732–4736.
19. Verho, R., Londesborough, J., Penttila, M., and Richard, P. Engineering redox cofactor regeneration for improved pentose fermentation in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 2003, 69 (10), 5892–5897.
20. Rahman, S. A. and Schomburg, D. Observing local and global properties of metabolic pathways: ‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics, 2006, 22 (14), 1767–1774.
21. Mahadevan, R. and Palsson, B. O. Properties of metabolic networks: structure versus function. Biophys. J., 2005, 88 (1), L7–L9.
22. Arita, M. The metabolic world of Escherichia coli is not small. PNAS, 2004, 101 (6), 1543–1547.
23. Burgard, A. P., Nikolaev, E. V., Schilling, C. H., and Maranas, C. D. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res., 2004, 14 (2), 301–312.
24. Almaas, E., Kovacs, B., Vicsek, T., Oltvai, Z. N., and Barabasi, A. L. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature, 2004, 427 (6977), 839–843.
25. Cakir, T., Patil, K. R., Onsan, Z. I., Ulgen, K. O., Kirdar, B., and Nielsen, J. Integration of metabolome data with metabolic networks reveals reporter reactions. Mol. Syst. Biol., 2006, 2, 50.
26. Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E. D. Metabolic network structure determines key aspects of functionality and regulation. Nature, 2002, 420 (6912), 190–193.
27. ter Kuile, B. H. and Westerhoff, H. V. Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS Lett., 2001, 500 (3), 169–171.
28. Nielsen, J. It is all about metabolic fluxes. J. Bacteriol., 2003, 185 (24), 7031–7035.
29. Varma, A. and Palsson, B. O. Metabolic flux balancing: basic concepts, scientific and practical use. Bio-Technology, 1994, 12 (10), 994–998.
30. Sauer, U. Metabolic networks in motion: C-13-based flux analysis. Mol. Syst. Biol., 2006, 2, 62.
31. Szyperski, T. C-13-NMR, MS and metabolic flux balancing in biotechnology research. Quart. Rev. Biophys., 1998, 31 (1), 41–106.
32. Dauner, M. and Sauer, U. GC-MS analysis of amino acids rapidly provides rich information for isotopomer balancing. Biotechnol. Prog., 2000, 16 (4), 642–649.
33. Schmidt, K., Carlsen, M., Nielsen, J., and Villadsen, J. Modeling isotopomer distributions in biochemical networks using isotopomer mapping matrices. Biotechnol. Bioeng., 1997, 55 (6), 831–840.
34. Wiechert, W., Mollney, M., Isermann, N., Wurzel, W., and de Graaf, A. A. Bidirectional reaction steps in metabolic networks: III. Explicit solution and analysis of isotopomer labeling systems. Biotechnol. Bioeng., 1999, 66 (2), 69–85.
35. van Winden, W. A., Heijnen, J. J., and Verheijen, P. J. T. Cumulative bondomers: a new concept in flux analysis from 2D [C-13,H-1] COSY NMR data. Biotechnol. Bioeng., 2002, 80 (7), 731–745.
36. Christensen, B., Gombert, A. K., and Nielsen, J. Analysis of flux estimates based on C-13-labelling experiments. Eur. J. Biochem., 2002, 269 (11), 2795–2800.
37. Antoniewicz, M. R., Kelleher, J. K., and Stephanopoulos, G. Elementary metabolite units (EMU): a novel framework for modeling isotopic distributions. Metabol. Eng., 2007, 9 (1), 68–86.
38. van Winden, W. A., van Dam, J. C., Ras, C., Kleijn, R. J., Vinke, J. L., van Gulik, W. M., and Heijnen, J. J. Metabolic-flux analysis of Saccharomyces cerevisiae CEN.PK113-7D based on mass isotopomer measurements of C-13-labeled primary metabolites. FEMS Yeast Res., 2005, 5 (6–7), 559–568.
39. Ibarra, R. U., Edwards, J. S., and Palsson, B. O. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature, 2002, 420 (6912), 186–189.
40. Forster, J., Famili, I., Palsson, B. O., and Nielsen, J. Large-scale evaluation of in silico gene deletions in Saccharomyces cerevisiae. OMICS: A J. Integrat. Biol., 2003, 7 (2), 193–202.
41. Fischer, E. and Sauer, U. Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat. Genet., 2005, 37 (6), 636–640.
42. Segre, D., Vitkup, D., and Church, G. M. Analysis of optimality in natural and perturbed metabolic networks. PNAS, 2002, 99 (23), 15112–15117.
43. Shlomi, T., Berkman, O., and Ruppin, E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. PNAS, 2005, 102 (21), 7695–7700.
44. Beard, D. A., Liang, S. C., and Qian, H. Energy balance for analysis of complex metabolic networks. Biophys. J., 2002, 83 (1), 79–86.
45. Burgard, A. P., Pharkya, P., and Maranas, C. D. OptKnock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 2003, 84 (6), 647–657.
46. Pharkya, P., Burgard, A. P., and Maranas, C. D. OptStrain: a computational framework for redesign of microbial production systems. Genome Res., 2004, 14 (11), 2367–2376.
47. Bro, C., Regenberg, B., Forster, J., and Nielsen, J. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metabol. Eng., 2006, 8 (2), 102–111.
48. Patil, K. R., Rocha, I., Forster, J., and Nielsen, J. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 2005, 6, 308.
49. Klipp, E., Nordlander, B., Kruger, R., Gennemark, P., and Hohmann, S. Integrative model of the response of yeast to osmotic shock (vol 23, pg 975, 2005). Nat. Biotechnol., 2005, 23 (8), 975–982.
50. Liebermeister, W. and Klipp, E. Bringing metabolic networks to life: integration of kinetic, metabolic, and proteomic data. Theor. Biol. Med. Model., 2006, 3, 42.
51. Steuer, R., Gross, T., Selbig, J., and Blasius, B. Structural kinetic modeling of metabolic networks. PNAS, 2006, 103 (32), 11868–11873.
52. Wang, L. Q. and Hatzimanikatis, V. Metabolic engineering under uncertainty. I: framework development. Metabol. Eng., 2006, 8 (2), 133–141.
53. Oliveira, A. P., Nielsen, J., and Forster, J. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol., 2005, 5, 39.
54. Forster, J., Famili, I., Fu, P., Palsson, B. O., and Nielsen, J. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 2003, 13 (2), 244–253.
18 Constraint-Based Genome-Scale Models of Cellular Metabolism
Radhakrishnan Mahadevan
University of Toronto
18.1 Introduction
18.2 Methods for Model Development
Curated Reaction and Metabolite Database • Metabolic Network Reconstruction Methods • Representation of Biomass Reaction • Determination of Maintenance • Integration with Physiology Data for Validation and Refinement
18.3 Methods for Interrogating Metabolic Networks
Flux-Based Methods • Regulatory and Dynamic Extensions • Thermodynamic and Metabolic Extensions • Optimization of Metabolic Networks
18.4 Software and Databases for Genome-Scale Modeling
18.5 Survey of Genome-Scale Metabolic Models
18.6 Conclusions
References
18.1 Introduction
The explosion in the database of microbial genome sequences has motivated intense efforts in the functional characterization of these genomes. As metabolism is fairly well conserved across organisms, several techniques for metabolic network reconstruction from the genome sequence using bioinformatics algorithms have been developed. The underlying metabolic network represents the metabolic potential of the organism, and hence is valuable for interrogating its metabolic capabilities and their relation to physiology. The stoichiometry of the biochemical reactions associated with metabolism is well established; the stoichiometric matrix associated with the genome-scale network is therefore a concise representation of the highly interconnected metabolic network and is amenable to systematic computational analysis. Unlike other biological networks such as protein interaction networks, the links in the metabolic network are mainly chemical reactions and consequently are time-invariant from the standpoint of connectivity. Hence, these reconstructions, once validated, represent unchanging snapshots of metabolic potential that are augmented, and can only grow in size, as new functional assignments for genes are made. The main advantage of such large-scale descriptions of metabolism is the molecular detail that is represented in genome-scale metabolic reconstructions. Such molecular detail enables the representation and analysis of genome-wide data such as gene expression data, large-scale gene deletions, large-scale growth physiology, and large-scale pathway analysis for metabolic engineering. However, the description of a genome-scale network presents several computational challenges regarding the scalability of algorithms. These challenges are partly reflected in the development of methods that are primarily based on linear or quadratic optimization and that mostly deal with representing the flux distribution in the metabolic network. The first genome-scale model was developed in 1999 (H. influenzae, Edwards and Palsson, 1999), and at that time it was derived primarily from the genome sequence and available literature data. Subsequent genome-scale models, however, began to incorporate other forms of high-throughput and physiological data, including gene expression and proteome data sets. Further, the metabolite and reaction databases became more sophisticated as additional details on charge and molecular formula were included and all reactions were charge and elementally balanced. Cheminformatic algorithms have been used to determine the charge from an analysis of the acid dissociation constants (pKa). This critical development enhanced the quality of the reaction network and enabled the tracking of protons generated as part of metabolism, which could affect the model predictions. An additional development was the consideration of the thermodynamics of chemical reactions and the derivation of constraints from such data. Finally, such models have been integrated with metabolomic and thermodynamic data to identify feasible ranges of metabolite concentrations. Another effort toward enhancing predictive capabilities is the incorporation of nonmetabolic phenomena such as transcriptional regulation and translation. However, such efforts have been attempted only for well-studied organisms such as Escherichia coli and Saccharomyces cerevisiae, as information on regulatory mechanisms is not yet widely available.
In this chapter, a brief introduction to the development and utilization of constraint-based genome-scale models of cellular metabolism is presented. The methods for metabolic network reconstruction, the subsequent computational analysis of the reconstructed network, and the software for genome-scale modeling are presented in the first part of this chapter. In the second part, the applications of these models and a brief summary of the status of constraint-based models of metabolism in different organisms are reviewed. Finally, the state of the art in the development of such models and future directions are outlined. Although the scope of this chapter is limited to developments in constraint-based modeling, it is important to note that other metabolic modeling approaches have been used for pathway-scale metabolic networks and are also rapidly evolving in sophistication (Savageau, 1969; Varner, 2000).
18.2 Methods for Model Development
The development and validation of genome-scale models require several levels of biological information, including the genome sequence, physiology, gene essentiality, and literature on biochemical and genetic data. The first step in the construction of a constraint-based model (CBM) (Figure 18.1) is the reconstruction of a highly curated metabolic network, which forms the basis for the numerical computations. This is followed by the determination of the biomass composition and the representation of biomass synthesis. Finally, additional physiological data are required for the identification of maintenance requirements to enable quantitative predictions of growth and by-product secretion patterns. It must be noted that modifications to the network can be identified even at steps 2 and 3, in the event that reactions required for the synthesis of key biomass components are missing. Hence, although model development is divided into separate steps in Figure 18.1, the discovery of additional network components and links is a continuous process that necessitates an iterative approach to model development. In the following sections, a brief summary of each of the steps in the model development process is provided.
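As a hedged illustration of how physiological data can be used to identify maintenance requirements, the sketch below fits a straight line to substrate uptake rate as a function of growth rate. The linear relation, the data points, and the units are assumptions made for this example; under that assumed relation, the intercept is read as the substrate requirement for nongrowth maintenance and the slope as the substrate needed per unit of biomass formed.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data, e.g., from steady-state cultivations at different growth rates.
growth_rate = [0.1, 0.2, 0.3, 0.4]    # 1/h (hypothetical)
uptake_rate = [1.5, 2.5, 3.5, 4.5]    # mmol/gDW/h (hypothetical)

slope, intercept = fit_line(growth_rate, uptake_rate)

print("substrate per unit biomass formed (slope):", slope)
print("nongrowth maintenance requirement (intercept):", intercept)
```

Real data are noisy and may deviate from linearity at low or high growth rates, so the fitted values should be treated as estimates to be checked against model predictions.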
[Figure 18.1: flowchart. Labeled elements: genome sequence, omics data, biochemical databases, physiology, literature, biological data, experimental data; metabolic network reconstruction; determination of biomass composition (% w biomass: protein content 46%, RNA content 10%, DNA content 4%, carbohydrate content 15%, lipid content 15%, other 10%); determination of maintenance energy (substrate uptake rate plotted against growth rate, indicating growth-associated maintenance and the substrate requirement for nongrowth maintenance); validation; genome-scale models.]
Figure 18.1 Three critical steps in the development of genome-scale metabolic models from biological data.

18.2.1 Curated Reaction and Metabolite Database
An important component in developing genome-scale models is a well-curated database of metabolites and reactions. Databases of metabolic reactions often contain compounds with inconsistent formulas and structures, or elementally unbalanced reactions, which have to be reconciled before modeling. This is especially critical because one of the features of constraint-based models is the ability to track the flow of elements across pathways based on the reaction stoichiometry. As an example, proton balancing of all of the chemical reactions was valuable in identifying the physiological basis for the difference in yield during growth of G. sulfurreducens with different electron acceptors, and in predicting the pH changes in the extracellular medium during growth of E. coli on substrates with varying degrees of reduction (Reed et al., 2003; Mahadevan et al., 2006). Hence, it is important to represent the correct charged formula for all the metabolites in the database and to ensure that all reactions are both elementally and charge balanced before incorporating them into the metabolic model.
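The elemental and charge balance check described above can be sketched in a few lines. The metabolite identifiers, the helper functions, and the charged formulas (as commonly tabulated for near-neutral pH) are assumptions introduced for this illustration; a curated database would supply its own formulas and charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or isotopes)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


# Hypothetical database entries: identifier -> (charged formula, charge).
METABOLITES = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "g6p": ("C6H11O9P", -2),
    "adp": ("C10H12N5O10P2", -3),
    "h": ("H", 1),
}


def imbalance(reaction):
    """Net element counts and net charge of {metabolite: coefficient}.

    Reactants carry negative coefficients, products positive ones.
    A balanced reaction returns ({}, 0).
    """
    elements, charge = Counter(), 0
    for met, coeff in reaction.items():
        formula, z = METABOLITES[met]
        for element, n in parse_formula(formula).items():
            elements[element] += coeff * n
        charge += coeff * z
    return {e: n for e, n in elements.items() if n != 0}, charge


hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
without_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(hexokinase))
print(imbalance(without_proton))
```

Omitting the proton leaves the reaction short of one hydrogen atom and one unit of charge on the product side, which is exactly the kind of inconsistency that would distort proton tracking in a model.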
18.2.2 Metabolic Network Reconstruction Methods Network reconstruction is primarily performed using tools from bioinformatics and typically involves a sequence-based metabolic network identification step followed by pathway analysis to close network gaps. The steps involved in the metabolic network reconstruction have been extensively reviewed before (see reviews by Covert et al., 2001; Francke et al., 2005) and are only summarized here. A notable difference between metabolic networks reconstructed in enzyme databases and those needed for the development of metabolic model is the requirement of a complete network that can synthesize the components
of the biomass from basic substrates typically found in minimal media (assuming the organism can grow in such media). However, because several of the reconstructed networks in databases are obtained automatically, it is critical to manually curate or inspect these automatically generated networks to ensure that any inconsistency in the network is eliminated. Increasingly, automatic algorithms for pathway gap filling and model development are also being developed to further facilitate the process of genome-scale modeling (Karp et al., 2002; Segre et al., 2003; Notebaart et al., 2006; Herrgard et al., 2006a).
The first step in metabolic network reconstruction is the identification of genes with a defined metabolic function in the annotation resulting from a completed genome. These genes are then verified by examining their homologs in other well characterized organisms and are typically assigned confidence levels according to the degree of sequence similarity. In addition, all of the genes are evaluated through sequence comparison and phylogenetic analysis against up-to-date enzyme databases such as KEGG and BRENDA, and the manually curated subset of databases such as SWISSPROT (Kanehisa et al., 2004; Schomburg et al., 2004; Wu et al., 2006). If only a draft genome is available, metabolic network reconstruction tools such as metaSHARK (Pinney et al., 2005) can be used in combination with manual curation to identify the metabolic network.
The next step in the network reconstruction involves the completion of metabolic pathways by identifying network gaps and analyzing them further. These network gaps typically correspond to metabolites that can only be consumed or only be produced in the network. The missing reactions that can close the gaps associated with such metabolites are identified from reaction databases.
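The identification of network gaps as metabolites that are only produced or only consumed can be sketched as follows. The reaction identifiers and the toy network are hypothetical, and the reversibility handling is a simplifying assumption: a reversible reaction is counted as both producing and consuming its metabolites, so a metabolite touched by a single reversible reaction would not be flagged by this simple count.

```python
from collections import defaultdict

def find_gaps(reactions):
    """Find metabolites that are only produced or only consumed.

    reactions maps reaction id -> (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed).
    Returns (only_produced, only_consumed) as sorted lists.
    """
    produced, consumed = defaultdict(set), defaultdict(set)
    for rxn, (stoich, reversible) in reactions.items():
        for met, coeff in stoich.items():
            if coeff > 0 or reversible:
                produced[met].add(rxn)
            if coeff < 0 or reversible:
                consumed[met].add(rxn)
    mets = set(produced) | set(consumed)
    only_produced = sorted(m for m in mets if not consumed[m])
    only_consumed = sorted(m for m in mets if not produced[m])
    return only_produced, only_consumed

# Hypothetical toy network with two gaps.
toy = {
    "EX_glc": ({"glc": 1}, False),               # uptake from the medium
    "R1": ({"glc": -1, "g6p": 1}, True),
    "R2": ({"g6p": -1, "f6p": 1}, False),        # f6p is never consumed
    "R3": ({"x5p": -1, "biomass": 1}, False),    # x5p is never produced
    "EX_biomass": ({"biomass": -1}, False),
}

only_produced, only_consumed = find_gaps(toy)
print(only_produced, only_consumed)  # ['f6p'] ['x5p']
```

Each flagged metabolite is a starting point for a database search for a missing reaction, as described in the text.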
The next step is the identification of genes encoding enzymes that catalyze the missing reactions in other organisms, followed by sequence comparison of these genes with the genome of the modeled organism. Such analysis can lead to the assignment of novel metabolic functions based on comparison of the protein sequence and domains.
Another critical component of the model development process is the representation of the proton translocation stoichiometry of the proteins in the electron transport chain. Proton translocation across the inner membrane in Gram-negative bacteria and the cell membrane in Gram-positive bacteria is directly correlated with ATP synthesis. Hence, variations in the translocation stoichiometry can significantly affect the maximum energy, in terms of ATP, that can be generated from a mole of a substrate such as glucose. An important factor to consider in determining the translocation stoichiometry is the total energy that can be generated given the substrate (electron donors such as reduced sugars) and oxidant (electron acceptors such as oxygen) and the efficiency of the cellular machinery. As an example, the efficiency of the electron transport chain in human mitochondria is 57% of the theoretical maximum (1.25 mol ATP/mol electron with oxygen as the acceptor, given a theoretical thermodynamic maximum of 2.2 mol ATP/mol electron) (Kroger et al., 2002). Such thermodynamic constraints are required to ensure that the rate of ATP generation is physiologically realistic. In addition, the standard Gibbs free energy change associated with reactions can also be used to determine whether a reaction is reversible or irreversible, so that the appropriate constraints on the reaction direction can be imposed.
Another critical factor that has to be determined for many of the reactions, and especially for the redox reactions, is the choice of the cofactor (NADH/NADPH) that acts as the donor/acceptor. In some cases, the cofactor specificity can be determined from sequence and phylogenetic analysis (Zhu et al., 2005) and the biochemical literature (Lehninger et al., 1993).
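As a quick check of the efficiency quoted above for the mitochondrial electron transport chain, the ratio of the reported ATP yield to the thermodynamic maximum can be written out explicitly (the symbol η is introduced here only for illustration):

```latex
\[
\eta \;=\; \frac{1.25\ \text{mol ATP/mol electron}}{2.2\ \text{mol ATP/mol electron}} \;\approx\; 0.57
\]
```

A proposed translocation stoichiometry that implied more than 2.2 mol ATP/mol electron with oxygen as the acceptor would exceed this thermodynamic bound and would need to be revisited.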
18.2.3 Representation of Biomass Reaction
The second step in the development of a genome-scale metabolic model is the representation of the biomass synthesis reactions in the model. The synthesis of one gram of cells requires over 30 metabolites, which include structural components such as cell wall and proteins, energy metabolites such as ATP, and storage polymers such as glycogen. Experimental protocols for determining the composition of macromolecules such as proteins, nucleic acids, carbohydrates, lipids, and other ions are established, and the biomass composition of several well studied organisms is available. The ATP requirements for the
synthesis of the macromolecular components are also included in the biomass reaction. The distribution of metabolites that make up the macromolecules (e.g., the amino acid composition) has to be determined to rigorously define the biomass synthesis reaction. In the absence of such data, in specific cases such as the amino acid composition, the distribution can be inferred from the sequence based on the assumption that all of the proteins are expressed. Another alternative employed in the development of genome-scale models is the assumption that the amino acid composition is similar to that of E. coli, for which experimental data are available.
It is important to note that the biomass reaction derived from experimental measurements is expected to be valid for growth simulations corresponding to the environment from which the biomass composition data were collected. However, unless the biomass composition changes significantly between environments, small variations are unlikely to result in significant changes in the growth rate. As an example, for Geobacter sulfurreducens, even 10% variations in the macromolecular composition resulted in only a 1.5% change in the growth rate, which appears to be consistent with previous studies on the impact of variations in E. coli biomass composition (Mahadevan et al., 2006).
The definition of a comprehensive biomass reaction is critical for the accuracy of gene essentiality predictions. The impact of any deletion that disrupts the synthesis of a metabolite required for biomass composition can be predicted accurately only if the metabolite is incorporated in the biomass reaction.
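Inferring the amino acid distribution from sequence, as mentioned above, amounts to counting residues over the predicted proteome. The sketch below assumes that every protein is expressed and, as an additional simplification not stated in the text, that all proteins are weighted equally. The two sequences are hypothetical placeholders rather than real proteins.

```python
from collections import Counter

def amino_acid_fractions(protein_sequences):
    """Mole fraction of each amino acid over all sequences (one-letter codes)."""
    counts = Counter()
    for seq in protein_sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}

# Hypothetical toy "proteome" of two short sequences.
proteome = ["MKTAYIAK", "MAAK"]
fractions = amino_acid_fractions(proteome)

for aa, x in fractions.items():
    print(f"{aa}: {x:.3f}")
```

Converting these mole fractions into biomass reaction coefficients additionally requires the measured protein content of the cell and the residue molecular weights, which are not shown here.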
18.2.4 Determination of Maintenance
A key component in the representation of the biomass synthesis reaction is the incorporation of ATP requirements for the maintenance of cellular processes not included in the biomass reaction, such as the energy required for the turnover of amino acid pools, maintenance of membrane potential, and other cellular events that might be proportional to the growth rate of the cells. Methods for calculating the growth and nongrowth associated maintenance are well established (Pirt, 1965; Neijssel et al., 1996) and essentially require physiological data on the substrate uptake rate at different growth rates. A schematic of this procedure is shown in Figure 18.2, where the substrate uptake rate extrapolated to zero growth rate (y-intercept) is first used to calculate the nongrowth associated ATP maintenance parameter. Here, this uptake rate is imposed as a constraint, the ATP synthesis rate is maximized, and the resulting objective value is set as the nongrowth associated maintenance. The slope of the predicted uptake rate at different growth rates depends on the growth associated ATP maintenance parameter, which is varied to match the experimental observations. Although in most cases the experimental relationship between substrate uptake rate and growth rate is linear, there can be instances where this relation is piecewise linear, indicating that the energetic efficiency and the mode of growth can differ (e.g., the Crabtree effect observed during chemostat growth of S. cerevisiae on glucose at high growth rates (van Hoek et al., 1998)). In such instances, it will be necessary to incorporate additional constraints on the regulatory network to obtain an accurate prediction of metabolism at different growth rates.
[Figure 18.2 labels: substrate uptake rate plotted against growth rate; experimental data; growth associated maintenance; substrate requirement for nongrowth maintenance.]
Figure 18.2 Schematic illustrating the determination of the maintenance parameters.
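The extrapolation step of this procedure can be sketched as an ordinary least-squares fit of substrate uptake rate against growth rate. The data points below are hypothetical and the function name is an assumption. Converting the intercept into an ATP maintenance value still requires the metabolic model, as described in the text, and is not attempted here.

```python
def fit_uptake_vs_growth(growth_rates, uptake_rates):
    """Least-squares line: uptake = slope * growth_rate + intercept."""
    n = len(growth_rates)
    mean_mu = sum(growth_rates) / n
    mean_q = sum(uptake_rates) / n
    sxx = sum((mu - mean_mu) ** 2 for mu in growth_rates)
    sxy = sum((mu - mean_mu) * (q - mean_q)
              for mu, q in zip(growth_rates, uptake_rates))
    slope = sxy / sxx
    intercept = mean_q - slope * mean_mu
    return slope, intercept

# Hypothetical chemostat data: growth rate (1/h), uptake rate (mmol/gDW/h).
mu = [0.1, 0.2, 0.3, 0.4]
q_s = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_uptake_vs_growth(mu, q_s)
# The intercept is the substrate requirement for nongrowth maintenance;
# the slope is what the growth associated parameter is tuned to reproduce.
print(f"slope = {slope:.3f} mmol/gDW, intercept = {intercept:.3f} mmol/gDW/h")
```

With piecewise linear data, such as the Crabtree example above, a single fit of this kind would not be appropriate and each linear regime would have to be treated separately.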
18.2.5 Integration with Physiology Data for Validation and Refinement
The final step in the model development process is validation with experimental data, after which the model can be used to generate experimental hypotheses about cellular functions. Data on growth and by-product secretion patterns in conditions other than those used to calculate the model parameters are useful for validation. Recently, it has become possible to obtain such data at large scale using high-throughput physiology techniques such as phenotype microarrays (Biolog Inc.; Bochner et al., 2001). Phenotype microarrays essentially evaluate the growth and respiration patterns of cells in over 700 environments with varying substrates. The technique relies on a colorimetric assay based on dye reduction that is then linked to growth. Such high-throughput growth data in a multitude of environments have been obtained for both E. coli and B. subtilis (Covert et al., 2004) and are useful for identifying missing cellular functions such as transporters.
CBMs capture most of the known metabolic pathways and hence can be used to predict cellular fate in the wake of a genetic perturbation. High-throughput gene essentiality data, including genome-scale transposon mutagenesis and single-gene knockouts, are already available for well studied organisms such as E. coli, B. subtilis, and P. aeruginosa. The comparison of model predictions with large-scale gene essentiality data can be used to identify novel pathways (in the case of false negatives), inactive enzymes represented in the model (false positives), and additional regulatory features not captured by the model. As an example, analyzing the gene deletion phenotypes of redundant pathways predicted by the model of Geobacter sulfurreducens identified several cases of inactive enzymes in central metabolism, such as pyruvate dehydrogenase and succinyl-CoA synthetase (Segura et al., 2008).
The reconciliation of the model with high-throughput growth physiology, gene essentiality, and by-product secretion patterns is essential in creating a compact and systematic representation of cellular metabolic capabilities for further computational and experimental interrogation.
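The comparison of predicted and observed deletion phenotypes can be organized as a simple cross-tabulation. In the sketch below, "positive" is taken to mean that the deletion strain grows. This convention is an assumption chosen so that false negatives point to missing pathways and false positives to inactive enzymes or regulation, in line with the usage above. The gene names and phenotypes are hypothetical.

```python
def compare_essentiality(predicted_growth, observed_growth):
    """Cross-tabulate predicted vs. observed growth of single-gene deletions.

    Both arguments map gene id -> True if the deletion strain grows.
    Only genes present in both data sets are compared.
    """
    table = {"TP": [], "TN": [], "FP": [], "FN": []}
    for gene in sorted(predicted_growth.keys() & observed_growth.keys()):
        pred, obs = predicted_growth[gene], observed_growth[gene]
        if pred and obs:
            table["TP"].append(gene)
        elif not pred and not obs:
            table["TN"].append(gene)
        elif pred and not obs:
            # Model predicts a viable bypass that the cell does not use:
            # candidate inactive enzyme or missing regulation.
            table["FP"].append(gene)
        else:
            # Cell grows although the model cannot: candidate missing pathway.
            table["FN"].append(gene)
    return table

# Hypothetical predictions and observations.
predicted = {"geneA": True, "geneB": False, "geneC": True, "geneD": False}
observed = {"geneA": True, "geneB": False, "geneC": False, "geneD": True}

result = compare_essentiality(predicted, observed)
print(result)
```

The false positive and false negative lists are the cases worth inspecting by hand during model refinement.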
18.3 Methods for Interrogating Metabolic Networks
Although the first stoichiometric model of metabolism was constructed in 1990 (Majewski and Domach, 1990), the driver for the development of computational tools for metabolic network analysis was the reconstruction of genome-scale models in the late 1990s. Since then, a vast array of methods has been formulated; these are extensively summarized in several reviews (Covert et al., 2003; Price et al., 2004a; Reed et al., 2006). The methods for metabolic network analysis (Figure 18.3) can be broadly categorized into four classes: (1) methods based solely on stoichiometric and reaction directionality constraints, (2) extensions that incorporate additional constraints based on thermodynamics, kinetics, and metabolite concentrations, (3) regulatory and dynamic extensions, and (4) optimization methods for design and analysis. Here, these methods are only briefly recounted; further information is available in other chapters of this book.
18.3.1 Flux-Based Methods
Most of the methods that relied on stoichiometric and reaction directionality constraints were developed to analyze the feasible solution space determined by the imposed constraints. These methods either relied on the selection of one point in the solution space based on an objective function (thereby biasing the selection toward optimizing the objective) or attempted to characterize the solution space in its entirety without any bias toward a particular solution (unbiased methods). These unbiased methods included the definition of extreme pathways, elementary modes, and random sampling of the solution space, whereas the biased methods included flux balance analysis, flux variability analysis, and deletion analysis as defined below.
[Figure 18.3 labels: Analysis of genome-scale metabolic networks. Flux analysis based on C13 isotope distribution (overdetermined systems). Physico-chemical constraints (underdetermined systems). Stoichiometric and capacity constraints. Biased methods analyzing a subset of the solution space: FBA, sensitivity analysis, MOMA, ROOM, FVA, FCF, α-spectrum. Unbiased methods for uniform analysis of the entire space: ExPA, ElMo, random sampling, volume analysis. Thermodynamic and metabolic extensions: energy balance analysis, flux minimization, network embedded thermodynamic analysis, k-cone analysis. Regulatory and dynamic extensions: regulatory flux balance analysis, dynamic flux balance analysis. Optimization methods for metabolic networks: OptKnock, ObjFind, OptStrain, OptReg, OMNI, error reconciliation.]
Figure 18.3 Methods for computational analysis of genome-scale metabolic models.
18.3.1.1 Biased Methods
Flux balance analysis (FBA). FBA has been extensively reviewed over the years and is the classical method for predicting the flux distribution in genome-scale metabolic networks based on linear programming, where an objective function corresponding to a cellular goal is defined. Typically, the growth rate maximization objective is used, based on the hypothesis that cellular metabolism is programmed through evolution for optimal resource utilization and growth. The genome-scale metabolic network is used to derive the stoichiometric constraints based on the assumption that metabolite levels are at steady state during balanced growth (Equation 18.1a). The stoichiometric constraints are augmented with the directionality and enzymatic capacity constraints (Equation 18.1b), and with substrate uptake constraints that correspond to the media composition (Equation 18.1c). Hence, the FBA problem is formulated as follows:

$$
\begin{aligned}
\max_{v} \quad & \mu = f^{T} v \\
\text{s.t.} \quad & S v = 0 && \text{(18.1a)} \\
& lb \le v \le ub && \text{(18.1b)} \\
& v_s = q_s && \text{(18.1c)}
\end{aligned}
$$

where $v$ is the vector of fluxes ($v \in \Re^{n}$), $f$ is the vector of objective coefficients (so that $f^{T} v$ is the growth rate $\mu$), $S$ is the $m \times n$ dimensional stoichiometric matrix, $m$ is the number of metabolites, $n$ is the number of reactions, $lb$ and $ub$ are the lower and upper bounds on the fluxes, $q_s$ are the experimentally measured uptake rates, and $v_s$ are
fluxes corresponding to the substrate (e.g., glucose) uptake rate. It is important to note that when the substrate uptake rate is fixed, the solution of the FBA problem is a flux distribution that maximizes the growth yield. If uptake rates of several substrates and their standard deviations are available, Equation 18.1c can be modified to incorporate experimental error. In some cases, where there are variations in the experimental measurements, data reconciliation methods are required to ensure consistency between the experimental measurements and the stoichiometric constraints (van der Heijden et al., 1994).
Flux variability analysis (FVA). FVA is used to evaluate the degree of flexibility in the metabolic network and is based on a series of optimization problems that identify the extremes of the optimal solution space (Mahadevan and Schilling, 2003). The objective function is the maximization and minimization of every flux in the network subject to the constraint that the growth rate is optimal. The FVA problem is formulated as below (the minimization problems replace Max with Min):

$$
\begin{aligned}
\max_{v} \quad & e_i^{T} v \\
\text{s.t.} \quad & S v = 0 \\
& lb \le v \le ub \\
& v_s = q_s \\
& f^{T} v = \mu^{*}
\end{aligned}
\qquad \text{(18.2)}
$$

where $e_i$ is the $i$th unit vector ($e_i \in \Re^{n}$) and $\mu^{*}$ is the optimal growth rate calculated through the solution of an LP as described in Equation 18.1. The solution of the resulting 2n linear programming problems defines the range of values that each reaction flux can take while still supporting the optimal growth rate. Reactions whose flux can vary over such a range represent redundant pathways in the network and can substitute for one another. A variant of this algorithm was used to analyze fermentation data of L. plantarum and identify the flexibility of the metabolic pathways for a case where the objective was not clearly defined (Teusink et al., 2006).
Deletion analysis methods. Three different methods have been proposed for simulating the effect of gene deletion on the metabolic flux distribution (Edwards and Palsson, 2000a; Segre et al., 2002; Shlomi et al., 2005). The key difference among these methods is the hypothesis underlying their formulation. In the first approach, the cellular goal is assumed to be the maximization of the growth rate even after the loss of enzymatic activity due to gene deletion. Here, an LP problem is formulated by augmenting the FBA algorithm with an additional constraint eliminating flux through the reaction catalyzed by the deleted gene product. Segre et al. proposed another approach, known as MOMA (minimization of metabolic adjustment). Here, the cellular objective of the mutant strain was assumed to be homeostasis of the metabolic flux distribution rather than growth rate maximization, and the Euclidean distance between the mutant and the wild type flux distributions was minimized. In the third approach, known as ROOM, the hypothesis is similar to that of MOMA; however, instead of the Euclidean distance, the number of flux changes was minimized. The rationale was that the Euclidean distance minimization approach led to changes in several fluxes and sometimes did not identify short alternative pathways used for rerouting metabolic flux.
However, recent studies have shown that FBA is more predictive of the growth rate of the adapted mutant than of the initially generated strain (Fong and Palsson, 2004). Further investigation is required to understand the changes in the flux distribution as mutant strains evolve to higher growth rates during selection for growth.
Sensitivity analysis methods. The impact of changes in the substrate uptake rates at both local and global scales can be investigated by a variety of methods. At the local scale, the shadow prices and reduced costs obtained during the solution of the linear programming problem can be used to assess the sensitivity of the objective function. The dimension of the shadow price vector corresponds to the number of constraints or metabolites in the problem, and the shadow prices contain information on the potential change in the objective function value when a small change in the availability of the corresponding
metabolite (source/sink term) is made. The shadow price reflects the value of a metabolite and is useful in network debugging to identify missing biosynthetic pathways in the network. Further details on the shadow prices can be found in Edwards et al. (2002), Chvatal (1983), and Palsson (2006). Additional methods to evaluate the sensitivity of the objective function over a broader parametric range are also available. For example, in robustness analysis (Edwards and Palsson, 2000b), one of the substrate uptake rates (or any other flux) is varied over a range and the resulting objective function profile is plotted. Robustness analysis essentially represents a two-dimensional slice of the feasible solution space defined in the flux coordinates by the physico-chemical constraints in Equation 18.1. An extension of robustness analysis is phase plane analysis (Edwards et al., 2002), where the value of the objective function is calculated while varying two parameters (fluxes) over a range. In phase plane analysis, a region in which the shadow prices remain the same is defined as a metabolic phase, and the boundaries between the phases are also calculated (Bell and Palsson, 2005). These phases can be linked to a particular phenotype, such as acetate overflow in E. coli, and the slope of the phase boundaries can be used to identify regions of single and dual substrate limitation.
18.3.1.2 Unbiased Methods
Although the assumption of growth rate maximization appears to describe metabolism in prokaryotic networks, it is not clear whether metabolism in higher organisms can be represented similarly. However, physicochemical constraints such as mass and energy balances have to be satisfied by the metabolic networks in complex biological systems. Hence, the analysis of the solution space defined by these constraints is valuable for characterizing metabolism in higher organisms, where a clear objective function is absent.
The details of the methods proposed for analyzing the properties of the solution space are discussed below.
Extreme pathway analysis and elementary modes. Extreme pathway (ExPa) analysis and elementary mode (ElMo) analysis are two convex analysis based approaches for analyzing metabolic pathways (Schilling et al., 1999; Stelling et al., 2002; Papin et al., 2003). These two related methods attempt to characterize all feasible metabolic flux distributions and define the metabolic pathways associated with the distributions. Both approaches are combinatorial in nature and attempt to characterize the solution space in its entirety rather than pick out a particular solution. These methods have been extensively reviewed and compared in detail elsewhere (Papin et al., 2004). Briefly, the ElMos are the set of all feasible solutions that are non-decomposable (i.e., an ElMo is not a subset of any other ElMo), whereas the extreme pathways also require an additional condition of systemic independence and are a subset of the ElMos. The numbers of ElMos and ExPas grow combinatorially (e.g., the number of ElMos for a 110 reaction network was 27,099 during growth with glucose), and genome-scale computation of ElMos and ExPas is still an area of extensive research (Bell and Palsson, 2005).
Random and uniform sampling. The challenges in computing genome-scale ElMos and ExPas led to other approaches, such as random sampling, in an effort to comprehensively analyze the solution space (Wiback et al., 2004; Price et al., 2004b). Here, a Monte Carlo method is used to generate random flux distributions uniformly throughout the constrained space. The physiological properties of the points that remain feasible after an additional constraint (e.g., decreased capacity as in an enzymopathy) is imposed can be used to obtain information on the outcome of perturbations without reliance on biased methods. Almaas et al. (2004) used such sampling methods to analyze the genome-scale metabolic network of E. coli under varying environments and identified a high-flux backbone in the network that was selectively reorganized in response to the environment. A similar sampling algorithm was implemented to analyze the metabolic network of human mitochondria under different pathophysiological conditions (Thiele et al., 2005). In that study, reaction co-sets, which have highly correlated flux values in the sampled distributions, were identified in these different physiological conditions, providing insights into the regulation of the metabolic network.
Flux coupling analysis (FCA). FCA is an optimization based algorithm for determining the correlations between metabolic fluxes for a genome-scale reaction network (Burgard et al., 2004). Here, the
pairwise ratio of flux values is maximized and minimized to obtain the range of the flux ratio. If two flux values are perfectly correlated, then the flux ratio is a constant. FCA has been used to identify both perfectly correlated and partially correlated reaction sets in genome-scale networks of S. cerevisiae, H. pylori, and E. coli, and it represents a powerful method for topological analysis of genome-scale metabolic networks and for identifying metabolic modules that function together in the network.
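To make the biased methods of Section 18.3.1.1 concrete, the sketch below solves the FBA problem (Equation 18.1), the FVA problems (Equation 18.2), and an FBA-type deletion analysis on a hypothetical five-reaction toy network using SciPy's `linprog`. The network, the reaction names, and the bounds are invented for illustration. As a numerical safeguard, the FVA optimality constraint is relaxed slightly from the equality in Equation 18.2 to $f^{T} v \ge (1 - 10^{-6})\,\mu^{*}$, which is an assumption of this sketch.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns are reactions, rows are metabolites A, B.
# uptake: -> A | r1: A -> B | r2: A -> B | growth: B -> | byprod: A ->
reactions = ["uptake", "r1", "r2", "growth", "byprod"]
S = np.array([
    [1, -1, -1,  0, -1],   # metabolite A
    [0,  1,  1, -1,  0],   # metabolite B
], dtype=float)
q_s = 10.0                                   # fixed substrate uptake rate
lb = np.array([q_s, 0, 0, 0, 0], dtype=float)
ub = np.array([q_s, 1000, 1000, 1000, 1000], dtype=float)
f = np.array([0, 0, 0, 1, 0], dtype=float)   # objective selects the growth flux

def fba(lb, ub):
    """Maximize f^T v subject to S v = 0 and lb <= v <= ub (Equation 18.1)."""
    res = linprog(-f, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    return (-res.fun, res.x) if res.success else (0.0, None)

def fva(mu_star, frac=1.0 - 1e-6):
    """Min and max of every flux while growth stays (near) optimal."""
    common = dict(A_ub=-f.reshape(1, -1), b_ub=[-frac * mu_star],
                  A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    ranges = {}
    for i, name in enumerate(reactions):
        e_i = np.zeros(len(reactions))
        e_i[i] = 1.0
        v_min = linprog(e_i, **common).fun
        v_max = -linprog(-e_i, **common).fun
        ranges[name] = (v_min, v_max)
    return ranges

def knockout(name):
    """FBA growth rate after forcing the flux of one reaction to zero."""
    i = reactions.index(name)
    lb_ko, ub_ko = lb.copy(), ub.copy()
    lb_ko[i] = ub_ko[i] = 0.0
    return fba(lb_ko, ub_ko)[0]

mu_star, v_opt = fba(lb, ub)
flux_ranges = fva(mu_star)
print("optimal growth flux:", mu_star)
for name, (lo, hi) in flux_ranges.items():
    print(f"{name}: [{lo:.3f}, {hi:.3f}]")
print("growth after deleting r1:", knockout("r1"))
```

In this toy network the parallel reactions r1 and r2 can each carry anywhere between none and all of the flux at optimal growth, which is the redundancy that FVA is designed to expose, and deleting either one alone leaves the predicted growth unchanged.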
18.3.2 Regulatory and Dynamic Extensions
Classical approaches to constraint-based modeling assumed that all of the metabolic pathways could be active at all times, whereas it is clear that some of these pathways are subject to regulatory mechanisms and are active only under specific conditions. In order to account for such mechanisms, a regulatory extension to the classical approach was proposed by Covert and Palsson (2002). In that study, transcriptional regulation was represented as a Boolean formulation, whereby in the presence of an environmental signal, pathways repressed by the signal would be constrained to have zero flux. Since concentrations are not represented in the FBA approach, the environmental signals (e.g., presence of oxygen, carbon source) were determined from the FBA solution without any constraints. The addition of the regulatory constraints reduced the solution space and eliminated solutions inconsistent with such regulatory mechanisms. A genome-scale integrated model of transcriptional and metabolic pathways is available for both E. coli and S. cerevisiae (Covert et al., 2004; Herrgard et al., 2006b). However, such genome-scale extension to other organisms is possible only if the transcriptional regulatory network in those organisms is well characterized (Tavazoie et al., 1999; Tegner et al., 2003).
Another area where classical FBA has been extended is the dynamic modeling of metabolism. Classical FBA relies on the steady state assumption that leads to the linear stoichiometric flux balance constraints. The assumption that the intracellular metabolites are at steady state levels can be reasonable because the time scales of enzymatic reaction events are fast (seconds to minutes) relative to the time scale of cellular growth (minutes to hours). However, in several cases, the cellular environment changes during growth, and such changes can impact the metabolic flux distribution and the cellular growth. An example is oxygen limitation due to increased cell density in a batch culture, which leads to the secretion of fermentation products. These processes can be represented using classical FBA by switching the constraints to reflect the changes in the oxygen levels (Varma et al., 1993). However, the switching between different metabolic states is assumed to be instantaneous in these formulations. In order to capture such dynamic effects due to the regulation of pathways, dynamic FBA (dFBA) was proposed, in which the dynamics of the extracellular environment were integrated with metabolic models of cellular growth (Mahadevan et al., 2002). The dFBA formalism has been used to identify optimal genetic and environmental manipulation profiles for maximizing the formation of chemicals such as ethanol and acetate in a fed-batch bioreactor (Gadkar et al., 2005). Hence, this formulation is critical for integrating the detailed molecular representation of metabolism with the macroscopic bioprocess description for optimization and design of these processes. More recently, this formulation has been used for the analysis of metabolic dynamics in mammalian myocardia under ischemic conditions (Luo et al., 2006).
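The coupling between extracellular dynamics and a metabolic model can be illustrated with a minimal batch culture sketch. Several assumptions are made that go beyond the text: time is discretized with a simple Euler step, the uptake bound follows Monod-type kinetics, and the inner FBA problem is replaced by its closed-form solution for a one-substrate toy model (growth rate = yield × uptake). All parameter values are hypothetical. In a genuine dFBA implementation, an LP such as Equation 18.1 would be solved at each time step.

```python
def dfba_batch(x0, s0, q_max, k_s, yield_xs, dt, t_end):
    """Euler integration of biomass X (gDW/L) and substrate S (mmol/L)."""
    t, x, s = 0.0, x0, s0
    trajectory = [(t, x, s)]
    for _ in range(int(round(t_end / dt))):
        # Uptake bound set by the extracellular concentration, capped so
        # that no more substrate is consumed than is available in the step.
        q = min(q_max * s / (k_s + s), s / (x * dt))
        # Stand-in for the inner FBA problem: for this toy model the
        # optimal growth rate is simply yield * uptake.
        mu = yield_xs * q
        x, s = x + mu * x * dt, max(0.0, s - q * x * dt)
        t += dt
        trajectory.append((t, x, s))
    return trajectory

# Hypothetical parameters: q_max in mmol/gDW/h, k_s in mmol/L,
# yield in gDW/mmol, time in hours.
traj = dfba_batch(x0=0.01, s0=10.0, q_max=10.0, k_s=0.5,
                  yield_xs=0.1, dt=0.01, t_end=10.0)
t_final, x_final, s_final = traj[-1]
print(f"t = {t_final:.1f} h, biomass = {x_final:.3f} gDW/L, "
      f"substrate = {s_final:.2e} mmol/L")
```

Switching constraints as the environment changes, for example closing an oxygen uptake bound once dissolved oxygen is depleted, would enter this loop at the point where the uptake bound is computed.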
18.3.4 Thermodynamic and Metabolic Extensions
The initial genome-scale models incorporated limited thermodynamic information on the directionality of the reactions based on the standard Gibbs free energy change of metabolic reactions. However, it was recognized that energy balances were required in addition to the stoichiometric constraints in order to enforce the laws of thermodynamics (Beard et al., 2002). Energy balance analysis incorporated explicit constraints to prevent flux through thermodynamically infeasible pathways such as reaction cycles, and obtained the flux distribution through the solution of a quadratic programming problem. Price et al. (2002) presented an alternative approach to enforcing thermodynamic feasibility via the elimination of reaction cycles.
Recently, Henry et al. (2006) calculated the Gibbs free energy change associated with all of the reactions in the genome-scale metabolic model of E. coli and identified thermodynamically unfavorable reactions essential for growth. Further, such thermodynamic information has been used to calculate feasible metabolite ranges based on the measurement of a subset of metabolites using stoichiometric constraints (Mavrovouniotis, 1996; Kummel et al., 2006).
An alternative formulation to FBA that incorporates thermodynamic information, flux minimization (FM), was proposed by Holzhutter (2006). This involves the calculation of the flux distribution that minimizes the weighted sum of the fluxes when a few measured fluxes are specified. The weights on these fluxes are determined based on the standard Gibbs free energy change associated with the reactions. The prediction of the flux distribution in the red blood cell metabolic network using the FM approach was found to be consistent with the kinetic model. These results suggest that a combination of stoichiometry and other physicochemical constraints can be used to analyze metabolism in higher organisms even if the cellular goal in such cases is unclear.
Another approach that integrated metabolomic data and stoichiometric constraints was proposed by Famili et al. (2005). Here, the data and the constraints were used to derive constraints on the range of kinetic parameters for dynamic model development. k-cone analysis of S. cerevisiae metabolism was performed to determine the consistency between in vitro enzymatic parameters and in vivo concentrations, and to determine the minimum number of enzymatic parameters that needed to be changed to ensure consistency with the data. This approach was applied to the red blood cell metabolic network to determine the range of the kinetic parameters under different physiological conditions using Monte Carlo sampling.
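One way to see how Gibbs free energy information constrains reaction direction is to bound the free energy change over an assumed range of metabolite concentrations, using the standard relation ΔG = ΔG° + RT Σ νᵢ ln cᵢ. The sketch below is illustrative only: the function names, the ΔG° values, and the concentration bounds are hypothetical, and it is not the specific procedure of any of the studies cited above.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)

def delta_g_range(dg0, stoich, conc_bounds, temperature=298.15):
    """Min and max of dG = dG0 + RT * sum(nu_i * ln c_i) over a box of
    metabolite concentrations (mol/L). stoich maps metabolite ->
    coefficient (negative for substrates, positive for products)."""
    rt = R_KJ * temperature
    lo = hi = dg0
    for met, nu in stoich.items():
        c_min, c_max = conc_bounds[met]
        a, b = nu * math.log(c_min), nu * math.log(c_max)
        lo += rt * min(a, b)
        hi += rt * max(a, b)
    return lo, hi

def direction(dg_lo, dg_hi):
    """Classify a reaction from the sign of its feasible dG range."""
    if dg_hi < 0:
        return "forward only"
    if dg_lo > 0:
        return "reverse only"
    return "reversible"

# Hypothetical reaction A -> B with concentrations between 1 uM and 10 mM.
stoich = {"A": -1, "B": 1}
bounds = {"A": (1e-6, 1e-2), "B": (1e-6, 1e-2)}

for dg0 in (-30.0, -5.0, 30.0):  # hypothetical standard values, kJ/mol
    lo, hi = delta_g_range(dg0, stoich, bounds)
    print(f"dG0 = {dg0:+.1f}: dG in [{lo:.1f}, {hi:.1f}] kJ/mol "
          f"-> {direction(lo, hi)}")
```

A reaction classified as "forward only" would be given a lower flux bound of zero, whereas a "reversible" classification leaves both directions open within the capacity constraints.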
18.3.5 Optimization of Metabolic Networks
A suite of design and analysis approaches for metabolic engineering is now available for the constraint-based representation of metabolic networks. Most of these approaches rely on well established optimization techniques routinely used in systems engineering and control. These approaches use mathematical programming to optimize an alternative objective function (e.g., the number of active reactions) subject to either stoichiometric constraints alone or both stoichiometric constraints and the objective function of growth maximization. They are classified into two categories based on problem formulation and discussed in further detail below.
18.3.5.1 Integer Programming
In this class of algorithms, Boolean or binary variables are used to represent the activity state (0 for inactive and 1 for active) of the enzyme catalyzing the reaction, along with the value of the flux through the reaction. These variables can then be used to formulate problems in which both the flux and the activity of the enzyme are varied to optimize an objective function in both continuous and binary variables. As an example, Burgard et al. (2001) used integer programming to determine the minimum number of reactions required for supporting the synthesis of biomass components. In another study, Pharkya et al. (2004) used this formulation to identify and incorporate metabolic reactions from a database that lead to enhanced yield of specific metabolic products in E. coli. Hence, integer programming representations are valuable for identifying required additional metabolic functions or eliminating existing ones for optimization of the metabolic network.
18.3.5.2 Bilevel Programming
Bilevel programming is another class of optimization approach, in which two optimization problems are nested within each other.
Such problems naturally arise in the design of a metabolic network, whereby the FBA problem with the cellular objective of growth rate maximization is nested within another problem with a higher level engineering objective. This formulation was first introduced by Burgard et al. (2003), where the outer level objective was the maximization of product yield given the cellular objective
and constraints on the maximum allowable number of knock-outs. In that study, the nested optimization problem was solved by converting the inner LP problem into linear constraints using duality theory. A binary variable was used to represent the activity of each reaction, and the solution of the resulting MILP problem identified reactions that have to be knocked out to increase the product yield while still maximizing growth. Hence, the OptKnock formulation, discussed in a subsequent chapter, is valuable for coupling product formation to growth. Additionally, bilevel programming has been used to identify objective functions that are most consistent with experimentally measured flux distributions, and to reconcile experimental measurements with the stoichiometric constraints (Burgard and Maranas, 2003; Raghunathan et al., 2003). Recently, Pharkya and Maranas (2006) extended the OptKnock formulation to determine reaction activation/inhibition, rather than just knock-outs (a binary state), that can lead to enhanced product yield. Finally, Herrgard et al. (2006a) proposed optimal metabolic network identification for reconciling the predictions of the genome-scale model with experimentally observed flux distributions and for identifying potential bottleneck reactions leading to suboptimal growth. Several of these optimization methods for metabolic network analysis and design are discussed in detail in other chapters of this book.
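The nested structure of a knock-out design problem can be illustrated with a minimal sketch. It is not the duality-based MILP reformulation of Burgard et al. (2003); it simply enumerates the binary knock-out choices by brute force. The toy network, its three elementary modes, the yields, and all names (`MODES`, `inner_problem`, `outer_problem`) are hypothetical. Because the network is written as elementary modes sharing a single uptake bound, the inner growth-maximization LP is assumed to be solved by the single surviving mode with the highest growth yield.

```python
from itertools import combinations

UPTAKE = 10.0  # assumed substrate uptake bound (e.g., mmol/gDW/h)

# Hypothetical elementary modes, with yields per unit of substrate uptake.
MODES = {
    "respiration":  {"reactions": {"glycolysis", "tca"},     "growth": 0.10, "product": 0.0},
    "bypass":       {"reactions": {"bypass"},                "growth": 0.07, "product": 0.0},
    "fermentation": {"reactions": {"glycolysis", "ferment"}, "growth": 0.04, "product": 1.5},
}
CANDIDATES = ["bypass", "ferment", "tca"]  # reactions that may be knocked out


def inner_problem(knockouts):
    """Cellular objective: pick the surviving mode with maximal growth."""
    removed = set(knockouts)
    alive = [m for m in MODES.values() if not (m["reactions"] & removed)]
    if not alive:
        return 0.0, 0.0  # no route to biomass remains
    best = max(alive, key=lambda m: m["growth"])
    return best["growth"] * UPTAKE, best["product"] * UPTAKE


def outer_problem(max_knockouts):
    """Engineering objective: maximize product at the cell's growth optimum."""
    best = ((), *inner_problem(()))  # wild type as the baseline
    for k in range(1, max_knockouts + 1):
        for ko in combinations(CANDIDATES, k):
            growth, product = inner_problem(ko)
            # Require a viable strain and a strict improvement in product.
            if growth > 0 and product > best[2]:
                best = (ko, growth, product)
    return best


knockouts, growth, product = outer_problem(2)
print(knockouts, growth, product)
```

In this toy case no single knock-out forces product formation, whereas removing both the `tca` and `bypass` reactions leaves fermentation as the only growth-supporting route, so the product becomes coupled to growth. Enumeration scales combinatorially, which is one reason a single-level MILP reformulation is used at genome scale.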
18.4 Software and Databases for Genome-Scale Modeling
A number of academic and commercial software packages are available for the development of genome-scale metabolic models and the implementation of the computational analysis techniques discussed in the earlier sections. These are briefly summarized in Table 18.1. Most of these packages include a reaction and metabolite database for the construction of models and a link to an optimization solver for the solution of the underlying linear or quadratic programming problem. One of the critical components in the construction of such models is the representation of the appropriate charged form of each metabolite in solution, as the global proton balance can have a significant impact on the physiology predictions (Reed et al., 2003; Mahadevan et al., 2006). Another often overlooked issue in the analysis of genome-scale models is the numerical challenge associated with large-scale metabolic model formulations. Although the solution of the underlying linear programming problems is comparatively efficient even at genome scale, the biomass component requirements, whose coefficients vary across two to three orders of magnitude, can cause numerical scaling issues. Hence, the solution returned by the LP solver has to be examined carefully and the optimization parameters changed appropriately to ensure model accuracy. Another feature to consider when evaluating the different software packages is the ability to import and export models in a standard format such as the Systems Biology Markup Language (SBML), which is emerging as a primary standard for the exchange and archiving of biological models. The available commercial and academic software and some of their features are summarized in Table 18.1.
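The importance of the charged form of each metabolite can be made concrete with a small balance check. The sketch below is only an illustration: the function names are hypothetical, and the formulas and charges are assumed values for the major protonation states near neutral pH. It tests the hexokinase reaction (glucose + ATP → glucose 6-phosphate + ADP + H+) for elemental and charge balance.

```python
import re

# Assumed formulas and charges (major protonation states near pH 7).
METABOLITES = {
    "glc": ("C6H12O6", 0),
    "atp": ("C10H12N5O13P3", -4),
    "adp": ("C10H12N5O10P2", -3),
    "g6p": ("C6H11O9P", -2),
    "h":   ("H", 1),
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets)."""
    counts = {}
    for element, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(n) if n else 1)
    return counts


def imbalance(stoich):
    """Return the nonzero element and charge imbalances of a reaction.

    stoich maps metabolite id -> coefficient (negative for reactants).
    An empty result means the reaction is elementally and charge balanced.
    """
    totals = {"charge": 0}
    for met, coeff in stoich.items():
        formula, charge = METABOLITES[met]
        totals["charge"] += coeff * charge
        for element, n in parse_formula(formula).items():
            totals[element] = totals.get(element, 0) + coeff * n
    return {key: value for key, value in totals.items() if value != 0}


hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
print(imbalance(hexokinase))  # {} when the proton is included
```

Dropping the proton from the reaction leaves one hydrogen and one unit of charge unaccounted for. Summed over many reactions, errors of this kind are what distort the global proton balance.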
18.5 Survey of Genome-Scale Metabolic Models
As of 2006, 12 genome-scale models of bacteria, archaea, and eukaryotes have been developed and utilized for applications ranging from metabolic engineering and recombinant protein production to bioremediation and anti-microbial development (Table 18.2). Initial genome-scale models were constructed with academic-grade software (FBA) and did not include charge-balanced reactions. However, such models were able to predict the outcome of gene deletions with high accuracy (~70%) (Edwards and Palsson, 2000a). With the availability of more sophisticated tools such as MetaFluxNet, CellNetAnalyzer, and SimPheny, the more recent genome-scale models have all incorporated charge and elemental balancing and are coupled to commercial linear programming solvers. The models range from 373 reactions for M. succiniciproducens to 1220 reactions for the eukaryotic M. musculus cell line, and some of the models have been updated as additional genome annotation information and software became available. There are four versions of the E. coli model, suggesting that these models grow in size as new functions are
Table 18.1 Commercial and Academic Software Available for Development and Analysis of Genome-Scale Metabolic Models

| Software/Vendor | Built-in Charge/Elementally Balanced Database | Linear Optimization Tools | LP Solver | C13 Metabolic Flux Analysis | SBML Import/Export |
|---|---|---|---|---|---|
| SimPheny, Genomatica Inc. | Y | Y | XPRESS | N | N* |
| In Silico Discovery, InSilicoBiotechnology Inc. | N | Y | JAVA | Y | Y |
| FBA, UCSD | N | Y | LINDO | N | N |
| CellNetAnalyzer, Garching Innovation GmbH | N | Y | MATLAB/MEX Interface | N | Y |
| MetaFluxNet, KAIST | N | Y | LPSOLVE | N | Y |

* Export available.
Table 18.2 List of Genome-Scale Models and Their Features

| Organism | Model Version | Software Platform | Size (Metabolites × Reactions) | Reference |
|---|---|---|---|---|
| Escherichia coli | iJE660a | FBA | 438 × 627 | Edwards and Palsson, 2000c |
| | iJR904 | SimPheny | 625 × 937 | Reed et al., 2003 |
| | iMC1010 | SimPheny | 626 × 939 | Covert et al., 2004 |
| | iHJ873 | | 518 × 873 | Henry et al., 2006 |
| Haemophilus influenzae | iJE295 | FBA | 343 × 488 | Edwards and Palsson, 1999 |
| Helicobacter pylori | iCS291 | FBA | 339 × 388 | Schilling et al., 2002 |
| | iIT341 | SimPheny | 411 × 476 | |
| Saccharomyces cerevisiae | iFF708 | FBA | 584 × 1175 | Famili et al., 2003 |
| | iND750 | SimPheny | 646 × 1149 | Duarte et al., 2004 |
| Staphylococcus aureus | iSB619 | SimPheny | 571 × 640 | Becker and Palsson, 2005 |
| Geobacter sulfurreducens | iRM588 | SimPheny | 541 × 523 | Mahadevan et al., 2006 |
| Mus musculus | iKS1156 | Lindo | 872 × 1220 | Sheikh et al., 2005 |
| Methanosarcina barkeri | iAF692 | SimPheny | 558 × 619 | Feist et al., 2006 |
| Mannheimia succiniciproducens | iHK335 | MetaFluxNet | 352 × 373 | Hong et al., 2004 |
| Streptomyces coelicolor | iIB711 | Matlab, Lindo | 500 × 700 | Borodina et al., 2005 |
| Bacillus subtilis | iYK850 | SimPheny | 986 × 1033 | Oh et al., 2007 |
| Lactococcus lactis | iAO358 | GNU LP kit | 509 × 621 | Oliveira et al., 2005 |
identified and non-stoichiometric constraints, such as transcriptional regulatory and thermodynamic constraints, are incorporated in the model. Genome-scale models of well-studied organisms such as E. coli, S. cerevisiae, and B. subtilis have been extensively validated experimentally, whereas for the other organisms these models have been used primarily to understand the unique features of their metabolic networks. For example, the genome-scale model of H. pylori was used to identify minimal media requirements for that organism, and the G. sulfurreducens model revealed the metabolic challenges associated with metal respiration and extracellular electron transfer. The models of E. coli, S. cerevisiae, and M. succiniciproducens have been used for designing strains with improved lactate, ethanol, and succinate yields, respectively, further highlighting the potential of validated models. Thus far, the genome-scale models have been used in a variety of applications including: (1) analysis and refinement of the network through reconciliation with data, (2) organization of high-throughput “omics” data, (3) redesign of cellular metabolism and optimization of bioprocesses, and (4) identification of the network flux distribution through the analysis of C13 isotope label incorporation in biomass. The model predictions of growth and essentiality have been compared with genome-scale
data for E. coli, B. subtilis, and S. cerevisiae and have led to significant modifications in the models. Further, the model-based engineering of metabolism in S. cerevisiae, M. succiniciproducens, and E. coli has led to improved product yields for ethanol, succinate, and lactic acid, respectively (Hong et al., 2004; Bro et al., 2006), and flux analysis has been used to identify the experimental flux distributions for several organisms. In summary, these genome-scale models have been used in a variety of applications to characterize and design cellular metabolism across the different kingdoms of life.
18.6 Conclusions
As these recent studies suggest, the availability of additional layers of omics data, along with computational analysis methods, has resulted in an unprecedented opportunity to analyze cellular metabolism and redesign metabolic networks for practical applications ranging from the production of biofuels such as ethanol, commodity chemicals such as succinate and lactate, and nutraceuticals to unconventional products such as electrical current generated by bacterial respiration in microbial fuel cells. While some of these model-based computational approaches are summarized in this chapter, the reader is referred to other chapters in this book for a comprehensive treatment of the applications to metabolic engineering. With recent advances in bioinformatics enabling the efficient reconstruction of metabolic networks followed by model development, the number of such genome-scale models is expected to increase. We expect that these models will initially be used to improve our understanding of metabolism by iteratively (1) designing and conducting experiments to test model predictions, (2) reconciling the experimental data with computational results to discover novel functional constraints, and (3) refining the model to account for the new constraints. This iterative process will ultimately lead to improved understanding of metabolism across these organisms, and the resulting models will be critical for manipulating metabolism for practical applications in metabolic engineering, bioremediation, recombinant protein production, and anti-microbial discovery.
References
Almaas, E., Kovacs, B., Vicsek, T., Oltvai, Z.N., and Barabasi, A.L. 2004. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature, 427, 839–843.
Beard, D.A., Liang, S.C., and Qian, H. 2002. Energy balance for analysis of complex metabolic networks. Biophys. J., 83, 79–86.
Becker, S.A. and Palsson, B.O. 2005. Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol., 5, 8.
Bell, S.L. and Palsson, B.O. 2005. Expa: a program for calculating extreme pathways in biochemical reaction networks. Bioinformatics, 21, 1739–1740.
Bochner, B.R., Gadzinski, P., and Panomitros, E. 2001. Phenotype microarrays for high-throughput phenotypic testing and assay of gene function. Genome Res., 11, 1246–1255.
Borodina, I., Krabben, P., and Nielsen, J. 2005. Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res., 15, 820–829.
Bro, C., Regenberg, B., Forster, J., and Nielsen, J. 2006. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metab. Eng., 8, 102–111.
Burgard, A.P. and Maranas, C.D. 2003. Optimization-based framework for inferring and testing hypothesized metabolic objective functions. Biotechnol. Bioeng., 82, 670–677.
Burgard, A.P., Nikolaev, E.V., Schilling, C.H., and Maranas, C.D. 2004. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res., 14, 301–312.
Burgard, A.P., Pharkya, P., and Maranas, C.D. 2003. OptKnock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 84, 647–657.
Burgard, A.P., Vaidyaraman, S., and Maranas, C.D. 2001. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog., 17, 791–797.
Chvatal, V. 1983. Linear Programming. W.H. Freeman and Company, New York.
Covert, M.W., Famili, I., and Palsson, B.O. 2003. Identifying constraints that govern cell behavior: a key to converting conceptual to computational models in biology? Biotechnol. Bioeng., 84, 763–772.
Covert, M.W., Knight, E.M., Reed, J.L., Herrgard, M.J., and Palsson, B.O. 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature, 429, 92–96.
Covert, M.W. and Palsson, B.O. 2002. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J. Biol. Chem., 277, 28058–28064.
Covert, M.W., Schilling, C.H., Famili, I., Edwards, J.S., Goryanin, I.I., Selkov, E., and Palsson, B.O. 2001. Metabolic modeling of microbial strains in silico. Trends Biochem. Sci., 26, 179–186.
Duarte, N.C., Herrgard, M.J., and Palsson, B. 2004. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res., 14, 1298–1309.
Edwards, J.S. and Palsson, B.O. 1999. Systems properties of the Haemophilus influenzae Rd metabolic genotype. J. Biol. Chem., 274, 17410–17416.
Edwards, J.S. and Palsson, B.O. 2000a. Metabolic flux balance analysis and the in silico analysis of Escherichia coli K-12 gene deletions. BMC Bioinformatics, 1, 1.
Edwards, J.S. and Palsson, B.O. 2000b. Robustness analysis of the Escherichia coli metabolic network. Biotechnol. Prog., 16, 927–939.
Edwards, J.S. and Palsson, B.O. 2000c. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. USA, 97, 5528–5533.
Edwards, J.S., Ramakrishna, R., and Palsson, B.O. 2002. Characterizing the metabolic phenotype: a phenotype phase plane analysis. Biotechnol. Bioeng., 77, 27–36.
Famili, I., Forster, J., Nielsen, J., and Palsson, B.O. 2003. Saccharomyces cerevisiae phenotypes can be predicted by using constraint-based analysis of a genome-scale reconstructed metabolic network. Proc. Natl. Acad. Sci. USA, 100, 13134–13139.
Famili, I., Mahadevan, R., and Palsson, B.O. 2005. k-Cone analysis: determining all candidate values for kinetic parameters on a network scale. Biophys. J., 88, 1616–1625.
Feist, A.M., Scholten, J.C.M., Palsson, B.O., Brockman, F.J., and Ideker, T. 2006. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Mol. Syst. Biol., msb4100046, E1–E14.
Fong, S.S. and Palsson, B.O. 2004. Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat. Genet., 36, 1056–1058.
Francke, C., Siezen, R.J., and Teusink, B. 2005. Reconstructing the metabolic network of a bacterium from its genome. Trends Microbiol., 13, 550–558.
Gadkar, K.G., Doyle, F.J., Edwards, J.S., and Mahadevan, R. 2005. Estimating optimal profiles of genetic alterations using constraint-based models. Biotechnol. Bioeng., 89, 243–251.
Henry, C.S., Jankowski, M.D., Broadbelt, L.J., and Hatzimanikatis, V. 2006. Genome-scale thermodynamic analysis of Escherichia coli metabolism. Biophys. J., 90, 1453–1461.
Herrgard, M.J., Fong, S.S., and Palsson, B.O. 2006a. Identification of genome-scale metabolic network models using experimentally measured flux profiles. PLoS Comput. Biol., 2, e72.
Herrgard, M.J., Lee, B.S., Portnoy, V., and Palsson, B.O. 2006b. Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae. Genome Res., 16, 627–635.
Holzhutter, H.G. 2006. The generalized flux-minimization method and its application to metabolic networks affected by enzyme deficiencies. Biosystems, 83, 98–107.
Hong, S.H., Kim, J.S., Lee, S.Y., In, Y.H., Choi, S.S., Rih, J.K., Kim, C.H., Jeong, H., Hur, C.G., and Kim, J.J. 2004. The genome sequence of the capnophilic rumen bacterium Mannheimia succiniciproducens. Nat. Biotechnol., 22, 1275–1281.
Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res., 32, Database issue, D277–D280.
Karp, P.D., Paley, S., and Romero, P. 2002. The Pathway Tools software. Bioinformatics, 18 Suppl 1, S225–S232.
Kroger, A., Biel, S., Simon, J., Gross, R., Unden, G., and Lancaster, C.R. 2002. Fumarate respiration of Wolinella succinogenes: enzymology, energetics and coupling mechanism. Biochim. Biophys. Acta, 1553, 23–38.
Kummel, A., Panke, S., and Heinemann, M. 2006. Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Syst. Biol., 2, 2006.
Lehninger, A.L., Cox, M.M., and Nelson, D.L. 1993. Principles of Biochemistry. Worth Publishers, New York.
Luo, R.Y., Liao, S., Tao, G.Y., Li, Y.Y., Zeng, S., Li, Y.X., and Luo, Q. 2006. Dynamic analysis of optimality in myocardial energy metabolism under normal and ischemic conditions. Mol. Syst. Biol., 2, 2006.
Mahadevan, R., Bond, D.R., Butler, J.E., Esteve-Nunez, A., Coppi, M.V., Palsson, B.O., Schilling, C.H., and Lovley, D.R. 2006. Characterization of metabolism in the Fe(III)-reducing organism Geobacter sulfurreducens by constraint-based modeling. Appl. Environ. Microbiol., 72, 1558–1568.
Mahadevan, R., Edwards, J.S., and Doyle, F.J. 2002. Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys. J., 83, 1331–1340.
Mahadevan, R. and Schilling, C.H. 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng., 5, 264–276.
Majewski, R. and Domach, M. 1990. Simple constrained-optimization view of acetate overflow in Escherichia coli. Biotechnol. Bioeng., 35, 732–738.
Mavrovouniotis, M.L. 1996. Duality theory for thermodynamic bottlenecks in bioreaction pathways. Chem. Eng. Sci., 51, 1495–1507.
Neijssel, O.M., Teixeira de Mattos, M.J., and Tempest, D.W. 1996. Growth yield and energy distribution. In Neidhardt, F. (Ed.), Escherichia coli and Salmonella: Cellular and Molecular Biology. ASM Press, Washington, DC.
Notebaart, R.A., Van Enckevort, F.H.J., Francke, C., Siezen, R.J., and Teusink, B. 2006. Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics, 7.
Oh, Y.K., Palsson, B.O., Park, S.M., Schilling, C.H., and Mahadevan, R. 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem., 282, 28791–28799.
Oliveira, A.P., Nielsen, J., and Forster, J. 2005. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol., 5, 39.
Palsson, B. 2006. Systems Biology: Properties of Reconstructed Networks. Cambridge University Press, New York.
Papin, J.A., Price, N.D., Wiback, S.J., Fell, D.A., and Palsson, B. 2003. Metabolic pathways in the post-genome era. Trends Biochem. Sci., 28, 250–258.
Papin, J.A., Stelling, J., Price, N.D., Klamt, S., Schuster, S., and Palsson, B.O. 2004. Comparison of network-based pathway analysis methods. Trends Biotechnol., 22, 400–405.
Pharkya, P., Burgard, A.P., and Maranas, C.D. 2004. OptStrain: a computational framework for redesign of microbial production systems. Genome Res., 14, 2367–2376.
Pharkya, P. and Maranas, C.D. 2006. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng., 8, 1–13.
Pinney, J.W., Shirley, M.W., McConkey, G.A., and Westhead, D.R. 2005. metaSHARK: software for automated metabolic network prediction from DNA sequence and its application to the genomes of Plasmodium falciparum and Eimeria tenella. Nucleic Acids Res., 33, 1399–1409.
Pirt, S.J. 1965. The maintenance energy of bacteria in growing cultures. Proc. R. Soc. London (Biol), 163, 224–231.
Price, N.D., Famili, I., Beard, D.A., and Palsson, B.O. 2002. Extreme pathways and Kirchhoff’s second law. Biophys. J., 83, 2879–2882.
Price, N.D., Reed, J.L., and Palsson, B.O. 2004a. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2, 886–897.
Price, N.D., Schellenberger, J., and Palsson, B.O. 2004b. Uniform sampling of steady-state flux spaces: means to design experiments and to interpret enzymopathies. Biophys. J., 87, 2172–2186.
Raghunathan, A.U., Perez-Correa, J.R., and Biegler, L.T. 2003. Data reconciliation and parameter estimation in flux-balance analysis. Biotechnol. Bioeng., 84, 700–708.
Reed, J.L., Famili, I., Thiele, I., and Palsson, B.O. 2006. Towards multidimensional genome annotation. Nat. Rev. Genet., 7, 130–141.
Reed, J.L., Vo, T.D., Schilling, C.H., and Palsson, B. 2003. Escherichia coli iJR904: an expanded genome-scale model of E. coli K-12. Genome Biol., 4, R54.1–R54.12.
Savageau, M.A. 1969. Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions. J. Theor. Biol., 25, 365–369.
Schilling, C.H., Covert, M.W., Famili, I., Church, G.M., Edwards, J.S., and Palsson, B.O. 2002. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184, 4582–4593.
Schilling, C.H., Schuster, S., Palsson, B.O., and Heinrich, R. 1999. Metabolic pathway analysis: basic concepts and scientific applications in the post-genomic era. Biotechnol. Prog., 15, 296–303.
Schomburg, I., Chang, A., Ebeling, C., Gremse, M., Heldt, C., Huhn, G., and Schomburg, D. 2004. BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res., 32, D431–D433.
Segre, D., Vitkup, D., and Church, G.M. 2002. Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl. Acad. Sci. USA, 99, 15112–15117.
Segre, D., Zucker, J., Katz, J., Lin, X., D’Haeseleer, P., Rindone, W.P., Kharchenko, P., Nguyen, D.H., Wright, M.A., and Church, G.M. 2003. From annotated genomes to metabolic flux models and kinetic parameter fitting. OMICS, 7, 301–316.
Segura, D., Mahadevan, R., Juárez, K., and Lovley, D.R. 2008. Computational and experimental analysis of redundancy in the central metabolism of Geobacter sulfurreducens. PLoS Comput. Biol., 4, e36.
Sheikh, K., Forster, J., and Nielsen, L.K. 2005. Modeling hybridoma cell metabolism using a generic genome-scale metabolic model of Mus musculus. Biotechnol. Prog., 21, 112–121.
Shlomi, T., Berkman, O., and Ruppin, E. 2005. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc. Natl. Acad. Sci. USA, 102, 7695–7700.
Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E.D. 2002. Metabolic network structure determines key aspects of functionality and regulation. Nature, 420, 190–193.
Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J., and Church, G.M. 1999. Systematic determination of genetic network architecture. Nat. Genet., 22, 281–285.
Tegner, J., Yeung, M.K., Hasty, J., and Collins, J.J. 2003. Reverse engineering gene networks: integrating genetic perturbations with dynamical modeling. Proc. Natl. Acad. Sci. USA, 100, 5944–5949.
Teusink, B., Wiersma, A., Molenaar, D., Francke, C., de Vos, W.M., Siezen, R.J., and Smid, E.J. 2006. Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. J. Biol. Chem., 281, 40041–40048.
Thiele, I., Price, N.D., Vo, T.D., and Palsson, B.O. 2005. Candidate metabolic network states in human mitochondria. Impact of diabetes, ischemia, and diet. J. Biol. Chem., 280, 11683–11695.
van der Heijden, R.T.J.M., Heijnen, J.J., Hellinga, C., Romein, B., and Luyben, K.C.A.M. 1994. Linear constraint relations in biochemical reaction systems: II. Diagnosis and estimation of gross measurement errors. Biotechnol. Bioeng., 43, 11–20.
van Hoek, P., van Dijken, J.P., and Pronk, J.T. 1998. Effect of specific growth rate on fermentative capacity of baker’s yeast. Appl. Environ. Microbiol., 64, 4226–4233.
Varma, A., Boesch, B.W., and Palsson, B.O. 1993. Stoichiometric interpretation of Escherichia coli glucose catabolism under various oxygenation rates. Appl. Environ. Microbiol., 59, 2465–2473.
Varner, J.D. 2000. Large-scale prediction of phenotype: Concept. Biotechnol. Bioeng., 69, 664–678.
Wiback, S.J., Famili, I., Greenberg, H.J., and Palsson, B.O. 2004. Monte Carlo sampling can be used to determine the size and shape of the steady-state flux space. J. Theor. Biol., 228, 437–447.
Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Mazumder, R., O’Donovan, C., Redaschi, N., and Suzek, B. 2006. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., 34, D187–D191.
Zhu, G., Golding, G.B., and Dean, A.M. 2005. The selective cause of an ancient adaptation. Science, 307, 1279–1282.
19 Multiscale Modeling of Metabolic Regulation
C.A. Leclerc McGill University
Jeffrey D. Varner Cornell University
19.1 Introduction
19.2 Background
19.3 The Multiscale Nature of Metabolism Follows from the Central Dogma of Molecular Biology
19.4 Construction of First-Principle and Reversed Engineered Models of Transcriptional Programs
19.5 Models of the Prokaryotic Translational Program
19.6 Integrating Transcriptional and Translational Programs with Physiology Leads to More Predictive Models
Multiscale Constraints Based Models Have Increased Capabilities • Cybernetic Models Bridge Metabolic Hierarchies
19.7 Summary and Conclusions
References
19.1 Introduction
The capability to gather organism-wide data has far outstripped the ability to understand it. Transforming large-scale data sets into a better cell requires tools that integrate physiology with its environment. One such tool is multiscale mathematical modeling, in which stoichiometry and kinetics are integrated with metabolic regulation and control. Integrated multiscale models could in principle predict physiological shifts resulting from environmental or genetic perturbation, thereby enhancing our ability to engineer metabolism. However, the complexity underlying the formulation and validation of multiscale models of metabolic regulation severely hampers the approach. In this chapter, we review the salient developments in the area of multiscale metabolic models with an emphasis on understanding the evolution of the field. We begin by presenting the general metabolic modeling landscape, reviewing both dynamic and stoichiometric models. We then present a general set of multiscale metabolic mass balances and, in the context of these balances, frame the discussion of how models of transcriptional and translational programs are formulated and validated from large-scale data sets and first principles. We conclude by reviewing two multiscale modeling techniques: the augmented constraints based models of Covert, Palsson, and coworkers and the cybernetic models of Ramkrishna and coworkers. One important area, namely stochastic models of gene expression, is not considered here; see Ref. [1] for a review of the origin of stochastic fluctuation in gene expression.
19.2 Background
The deepest level of metabolic analysis, ultimately culminating in the prediction of metabolic dynamics (for example, the metabolic reprogramming observed in the seminal work of Brown and coworkers during the diauxic shift in Saccharomyces cerevisiae [2]), requires that stoichiometry and kinetics be married
with metabolic regulation and control. Constructing multiscale or hierarchical models of physiology is not new; Shuler and coworkers in the late 1970s and early 1980s formulated dynamic single cell models of Escherichia coli [3–6], Chinese Hamster Ovary (CHO) cells [7,8], and S. cerevisiae [9]. These models were capable of predicting physiological characteristics ranging from the dependence of cell geometry upon growth rate and the impact of nutrient conditions [6,10,11] to plasmid replication and host-plasmid interactions [12–14]. Many examples of the single cell model paradigm can be found in the literature; see Shuler [15]. While arguably the best formalism to describe cell growth and physiology, single cell models are computationally expensive and require a large number of kinetic parameters and detailed biological knowledge [15]. Reuss and coworkers have developed structured unsegregated dynamic models (state averaged over the population) of both S. cerevisiae [16,17] and E. coli [18] and have studied the in vivo dynamics of key pathways such as the pentose phosphate pathway (PPP) and sugar transport in S. cerevisiae [19,20]. Dynamic models of varying complexity have also been constructed to study the penicillin biosynthetic pathway [21–23], threonine pathway dynamics [24,25], regulatory architectures in metabolic reaction networks [26,27], red-blood cell metabolic pathways [28–32], and plant metabolic pathways [33–35]. Stoichiometric models, such as those used in flux balance analysis (FBA), have also emerged as powerful analysis tools that couple observed extracellular phenomena (uptake/production rates, growth rate, product and biomass yields, etc.) with the intracellular carbon flux and energy distribution. Constraints based stoichiometric models do away with kinetics in favor of a pseudo-steady-state picture of metabolism.
FBA and stoichiometric models have been employed to calculate genomic-scale snapshots of several organisms as well as portraits of key subnetworks such as central carbon metabolism. One of the first examples of what would evolve into FBA was the analysis of butyric-acid bacteria by Papoutsakis [36–38]. Later, Varma, Palsson, and coworkers employed a stoichiometric model of E. coli W3110 to study oxygen limitation and by-product secretion [39,40]. Vallino and Stephanopoulos employed FBA to explore Corynebacterium glutamicum during lysine overproduction [41,42], while Sauer et al. characterized the metabolic capabilities of riboflavin-producing B. subtilis [43]. Pramanik and Keasling explored the impact of time-varying biomass composition on E. coli metabolism [44,45], while Maranas and coworkers explored the performance limits of E. coli subject to gene additions or deletions [46], the coupling of metabolic fluxes in large-scale networks [47], the generation of optimal gene deletion strategies [48], the production of lactic acid in E. coli [49], and the computational identification of reaction activation/inhibition or elimination candidates in metabolic networks [50]. Edwards, Schilling, Palsson, and coworkers extended FBA to genomic-scale metabolic reconstructions of Helicobacter pylori 26695 (389 reactions) [51], E. coli MG1655 (740 reactions) [52,53], E. coli K-12 (931 reactions) [54], S. cerevisiae (1173 reactions) [55], and most recently to the human metabolic map with a genome-scale reconstruction consisting of 3,311 metabolic and transport reactions and 2,766 metabolites [56].
An attractive feature of constraints based models is the relative ease of computation (solving a linear program or determining a matrix inverse) and the ability to directly incorporate process information, for example on-line CO2, O2, or cell mass measurements, into the constraints (see Savinell and Palsson for a discussion of optimal measurement selection [57] or Becker et al. for FBA software [58]). In addition to physiological measurements, 13C-NMR/GC-MS labeling techniques have been employed by many groups to add additional constraints to the flux calculation [59–74]. Sauer et al. (and others) have pushed 13C-enhanced metabolic flux estimation beyond serial experiments into the realm of parallel high-throughput data generation; see Ref. [75].
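The "matrix inverse" route mentioned above can be sketched for a determined system: at pseudo-steady state S v = 0, so partitioning the fluxes into measured (m) and unmeasured (u) sets gives S_u v_u = -S_m v_m. The four-reaction network below, the measured values, and the helper names are hypothetical. A small Gauss-Jordan solver with exact fractions stands in for a numerical library, and S_u is assumed to be square and nonsingular.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot (raises StopIteration if singular).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [x / scale for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]


def unmeasured_fluxes(s_u, s_m, v_m):
    """Compute v_u from S_u v_u = -S_m v_m (pseudo-steady state)."""
    rhs = [-sum(s * v for s, v in zip(row, v_m)) for row in s_m]
    return solve(s_u, rhs)


# Toy network: v1: -> A, v2: A -> B, v3: B ->, v4: A -> (by-product).
# Rows are the metabolites A and B.
S_M = [[1, -1],   # columns: measured fluxes v1 (uptake), v4 (by-product)
       [0, 0]]
S_U = [[-1, 0],   # columns: unmeasured fluxes v2, v3
       [1, -1]]

v2, v3 = unmeasured_fluxes(S_U, S_M, [10, 3])
print(v2, v3)  # 7 7
```

When there are fewer measurements than degrees of freedom, the system is underdetermined, and this direct solve is replaced by the linear program of FBA with an assumed objective.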
19.3 The Multiscale Nature of Metabolism Follows from the Central Dogma of Molecular Biology

The central dogma of molecular biology (Figure 19.1), i.e., that information stored in DNA is transcribed into an intermediate mRNA message, which is then translated into a working protein machine that carries out a catalytic, regulatory, or structural role in the cell, dictates that metabolism is hierarchical or multiscale. This is true because the different layers of metabolism are integrated via explicit dependencies (e.g., translation of protein j cannot occur without the corresponding mRNA transcript) or via feedback loops, as described by Csete and Doyle [76], which have developed over evolutionary time to ensure robustness in the face of shifting external environments.

Figure 19.1 (See color insert following page 10-18.) Schematic of the central dogma of molecular biology. Genetic information is transcribed into mRNA which is then translated into protein machines. The layers and programs of metabolism are coupled and hierarchical; transcriptional programs influence translation which then drives metabolic programs. Metabolic programs in turn influence transcription, thus forming feedback loops that integrate the metabolic layers. [Figure labels: Transcriptional programs (Gene X, P, σ, +/–, mRNA X); Translational programs (mRNA X, pX); Metabolic programs (A, B).]

Traditional dynamic metabolic models or constraints based stoichiometric models do not, in general, systematically account for metabolic regulation and control. This is not to say that regulation and control are neglected; rather, they are often incorporated into the kinetics for dynamic models or into the constraints for stoichiometric models. The distinction between traditional metabolic modeling approaches and the multiscale paradigm is that the regulation and control programs governing the dynamics of the different metabolic hierarchies are explicitly and systematically incorporated into the model formulation. A general unsegregated multiscale model of metabolism consists of mass balance equations governing the time rate of change of Z mRNA, E protein, and M metabolite species, where each mass balance explicitly, though not necessarily mechanistically, accounts for the output of the control programs managing metabolism. Thus, the mass balance around transcript j under condition k, denoted by zjk, is given by:
$$\frac{dz_{jk}}{dt} = r_{x,z_{jk}}\,u_{jk} - \left(k_{d,z_{jk}} + \mu_k\right) z_{jk} + \eta_{jk}, \qquad j = 1, 2, \ldots, Z \tag{19.1}$$
where rx,zjk denotes the specific rate of expression of transcript j under condition k and ujk denotes the control or management variable governing expression for transcript j. We assume transcript degradation is first-order where kd,zjk denotes the rate constant governing the degradation of transcript j in condition k and ηjk denotes the specific rate of constitutive expression of transcript j under condition k. The
quantity µk denotes the specific growth rate in condition k. The transcript zjk can be translated to form protein ejk where the specific concentration of ejk obeys the mass balance:
$$\frac{de_{jk}}{dt} = r_{T,e_{jk}}(z_{jk}, k)\,w_{jk} - \left(k_{d,e_{jk}} + \mu_k\right) e_{jk}, \qquad j = 1, 2, \ldots, E \tag{19.2}$$
The specific rate of translation of transcript j in condition k, denoted by rT,ejk(zjk, k), is a function of the transcript concentration and is modified by the wjk term, which denotes the control or management variable governing the translation of transcript j under condition k. We assume protein degradation is non-specific and first-order, where kd,ejk denotes the rate constant governing the degradation of protein j in condition k. The mass balance around metabolite j in condition k, denoted by xjk, is
$$\frac{dx_{jk}}{dt} = \sum_{i=1}^{R} \alpha_{ji}\, r_i(e, x, k)\, v_i + \sum_{l=1}^{Q} \beta_{jl}\, q_l(t) - \mu_k x_{jk}, \qquad j = 1, 2, \ldots, M \tag{19.3}$$
where αji, βjl denote the stoichiometric coefficients relating metabolite xjk with reaction ri and transport flux ql. The term vi denotes the control variable describing enzyme activity regulation while R denotes the number of intracellular reaction rates or fluxes (unknown) and Q denotes the number of exchange fluxes (measured). The last term in the metabolite mass balances accounts for dilution of the specific metabolite concentration by cell growth.
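To make the structure of Equations 19.1 through 19.3 concrete, the following is a minimal sketch for a single transcript, a single protein, and a single metabolite, integrated with an explicit Euler scheme. All parameter values, the linear translation rate `k_T * z`, and the single enzyme-catalyzed consumption reaction are assumptions introduced purely for illustration; the control variables `u`, `w`, and `v` are held constant rather than computed by a control program.

```python
def simulate(u=1.0, w=1.0, v=1.0, t_end=60.0, dt=0.01):
    """Integrate a one-gene instance of Eqs. 19.1-19.3 with explicit Euler."""
    # Hypothetical parameter values, for illustration only
    r_x, eta, kd_z = 1.0, 0.05, 0.5   # transcription rate, constitutive term, mRNA decay
    k_T, kd_e = 2.0, 0.1              # translation rate constant, protein decay
    k_cat, q_in, mu = 0.3, 1.0, 0.2   # enzyme rate constant, exchange flux, growth rate

    z = e = x = 0.0
    for _ in range(int(round(t_end / dt))):
        dz = r_x * u - (kd_z + mu) * z + eta        # Eq. 19.1
        de = (k_T * z) * w - (kd_e + mu) * e        # Eq. 19.2 with r_T = k_T * z
        # Eq. 19.3: one consuming reaction (alpha = -1), one uptake flux (beta = +1)
        dx = -(k_cat * e * x) * v + q_in - mu * x
        z, e, x = z + dt * dz, e + dt * de, x + dt * dx
    return z, e, x


z_ss, e_ss, x_ss = simulate()          # fully expressed gene
z_rep, e_rep, x_rep = simulate(u=0.0)  # transcription limited to the constitutive term
print(z_ss, e_ss, x_ss)
print(z_rep, e_rep, x_rep)
```

Setting `u` to zero leaves only the constitutive term in the transcript balance, so the protein level falls and the metabolite it consumes accumulates, which is the kind of cross-layer coupling the multiscale formulation is meant to capture.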
19.4 Construction of First-Principle and Reversed Engineered Models of Transcriptional Programs

The ujk variable modifying the kinetic rate of transcription in Equation 19.1 can be thought of as the output of a transcriptional program governing the expression of gene j in condition k. If ujk ~ 1, expression of gene j can occur near the kinetic limit; if, however, ujk << 1, the expression of gene j is not favored. While this formalism is conceptually simple to visualize, the discovery and implementation of mathematical models describing transcriptional programs is difficult. In general, first-principles models of gene regulation require a detailed parametric and structural understanding of the mechanisms controlling gene expression that is generally not available. However, if this information is available, there are mathematical frameworks that can be used to formulate models for ujk. One such framework is the genetically structured modeling framework of Lee and Bailey [77,78]. Lee postulated a model structure very similar to Equation 19.1 where the ujk term, called the transcriptional efficiency, was formulated based upon the statistical mechanical weight of promoter configurations that yielded transcription. The genetically structured paradigm was used to model lac promoter–operator function for both chromosomal and plasmid-based protein expression in E. coli [77,78]. Another notable genetically structured model was the glucose-lactose lac operon model of Wong et al., which was able to mechanistically describe diauxic growth of E. coli on mixtures of glucose and lactose [79]. A number of studies have been conducted and frameworks proposed to extract regulatory networks from gene expression data. Most early network inference methods relied primarily on clustering genes on the basis of their expression profiles [80–83].
Recently, there has been considerable interest in developing computational tools that go beyond answering the question of whether two or more genes have similar expression profiles. Instead, the central question has become whether we can uncover, hidden within gene expression data, the signature, extent, and directionality of interactions between different genes. In other words, rather than simply grouping genes with similar expression profiles, new methods have attempted to learn gene regulatory patterns from expression data. Broadly, these methods can be classified into two distinct categories based on their fundamental treatment of gene interactions. Deterministic model-based
methods assume there exists a deterministic formalism Y = f(X) that captures the effect of the expression level of gene X on gene Y. Different choices for the function f(X) (e.g., linear, sigmoidal, etc.) have given rise to many versions of model-based methods [82,84–87]. Conversely, stochastic model-based methods start by postulating that experimentally observed gene expression profiles correspond to samples drawn from an unknown multivariate probability distribution. Bayesian networks provide a popular alternative for achieving this objective by postulating a multivariate joint conditional probability model that explains the observed expression data [88,89]. In addition to classifying gene network inference methods based upon the mathematical formalism used to model the control program, a further distinction can be made based upon how gene expression is handled within these formalisms. Boolean networks were among the first formalisms proposed to model gene interactions [90–92]. In the Boolean approach, genes are assumed to be either ON or OFF and the input–output relationships between them are modeled through deterministic logical functions (such as AND, OR, NOT, etc.). More recently, an extension of this approach to account for uncertainty in expression data has been proposed in the form of probabilistic Boolean networks [93–95]. However, in most real gene expression settings, Boolean idealizations may not be appropriate, as genes are expressed at continuously varying intermediate expression levels [96]. Consequently, more general approaches have been proposed which model mRNA expression level as a continuously varying quantity. These include linear weight modeling [82,84], linear and nonlinear ordinary differential equations [97–99], graph-theoretic and hierarchical models [100–102], and S-systems [93,103–105].
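The Boolean idealization described above can be sketched in a few lines. The three-gene circuit and its logic rules below are hypothetical and serve only to show how ON/OFF states are propagated through deterministic logical functions under a synchronous update; they do not correspond to any network in the cited work.

```python
def step(state, rules):
    """Synchronously update every gene from the current ON/OFF state."""
    return {gene: rule(state) for gene, rule in rules.items()}


# Hypothetical three-gene circuit
rules = {
    "A": lambda s: s["A"],                 # A is an external input held fixed
    "B": lambda s: s["A"] and not s["C"],  # B needs A and is repressed by C
    "C": lambda s: s["B"],                 # C is activated by B
}

state = {"A": True, "B": False, "C": False}
trajectory = [state]
for _ in range(4):
    state = step(state, rules)
    trajectory.append(state)

for t, s in enumerate(trajectory):
    print(t, s)
```

Because C represses its own activator with a one-step delay, this particular circuit cycles rather than settling, a behavior that a purely ON/OFF description can represent but cannot resolve into intermediate expression levels.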
19.5 Models of the Prokaryotic Translational Program

The wjk variable modifying the rate of translation in Equation 19.2 can be thought of as the output of a control program governing the translation of mRNA transcript j in condition k. If wjk ~ 1, the translation of transcript j can occur near the kinetic limit; if, however, wjk << 1, the translation of transcript j is not favored. Translational programs and posttranslational regulatory modifications have not received the same level of attention as transcriptional networks from the modeling community. This lack of focus is surprising, as Hatzimanikatis and Lee [106] and later Doyle and coworkers have shown that both gene expression and proteomic measurements are needed to identify even simple mathematical models of metabolism [87]. Mehra et al. developed a general mathematical model of prokaryotic translation [107] which began with a protein balance similar to Equation 19.2 with w = 1 and
$$r_{T,e_i} = k_{s,i}\, z_i \tag{19.4}$$
After making the pseudo steady-state assumption and some algebraic manipulation, Mehra et al. arrived at an expression relating shifts in the transcript level to changes in the protein concentration:
$$f_i^{p} = \frac{k_{s,i}}{k_{s,i,o}}\,\frac{k_{d,i,o} + \mu_o}{k_{d,i} + \mu}\, f_i^{m} \tag{19.5}$$

where:

$$f_i^{p} = \frac{e_i}{e_{i,o}}, \qquad f_i^{m} = \frac{z_i}{z_{i,o}} \tag{19.6}$$
The quantities ks,i and kd,i denote the protein synthesis and degradation rate constants, respectively, while µ denotes the specific growth rate. The subscript o denotes the value of the corresponding quantity at a reference state. On the basis of Equation 19.5 alone, Mehra et al. suggested that there could be situations where the amplification ratios observed in condition-dependent transcription profiling would not have a one-to-one relationship with the corresponding protein ratios. However, they went on to
acknowledge that translation is much more complex than Equation 19.5 would suggest and formulated a molecular model that mechanistically describes the initiation, elongation, and termination steps in translation; the molecular model had 4n + 1 dependent variables, where n denotes the number of transcripts being considered. Lee et al. experimentally tested the Mehra translation model by comparing gene expression patterns gathered using high-density microarrays and protein abundance measurements gathered using a two-dimensional electrophoresis-tandem mass spectrometry technique in perturbed E. coli [108]. Two different perturbation strategies were explored: first, a genetic perturbation was quantified where wild-type and an E. coli W3110 hemolysin hypersecretion phenotype were compared, and second, an environmental perturbation was explored where varying concentrations of isopropyl-β-D-thiogalactopyranoside (IPTG) were added to minimal media shake flask cultures of E. coli MG1655. In both cases, the general finding was that changes in protein concentration stemming from perturbation do not directly correspond with changes in mRNA abundance. The r² correlation between mRNA and protein abundance for the IPTG perturbation ranged from 0.02 to 0.04, while the correlation between mRNA and protein concentrations for the hypersecretion hemolysin phenotype was 3–7 × 10⁻⁴. While the Mehra model was not able to predict the relative changes for any single gene, it did give insight into how the centroid of mRNA and protein ratios changed as a function of perturbation.
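A short numerical reading of Equation 19.5 shows why transcript and protein ratios need not agree even in this simplified picture. The sketch below evaluates the protein ratio for a twofold transcript induction under two hypothetical scenarios; the rate constants and growth rates are illustrative assumptions, not values reported by Mehra et al. or Lee et al.

```python
def protein_ratio(f_m, ks, ks_o, kd, kd_o, mu, mu_o):
    """Equation 19.5: protein ratio implied by the transcript ratio f_m."""
    return (ks / ks_o) * ((kd_o + mu_o) / (kd + mu)) * f_m


# Hypothetical case 1: the perturbation changes only the transcript level
f_p_same = protein_ratio(2.0, ks=1.0, ks_o=1.0, kd=0.1, kd_o=0.1, mu=0.5, mu_o=0.5)

# Hypothetical case 2: the perturbation also slows growth (mu drops from 0.5 to 0.2),
# so dilution by growth removes less protein and the protein ratio is amplified
f_p_slow = protein_ratio(2.0, ks=1.0, ks_o=1.0, kd=0.1, kd_o=0.1, mu=0.2, mu_o=0.5)

print(f_p_same, f_p_slow)
```

In the first case the protein ratio equals the transcript ratio; in the second it is roughly twice as large, purely because the dilution term in the denominator changed.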
19.6 Integrating Transcriptional and Translational Programs with Physiology Leads to More Predictive Models

A body of literature, reviewed in the previous sections, exists on defining the underlying transcriptional and, to a lesser extent, translational programs of different organisms. In the context of multiscale models, these biological programs can be translated into formulations of the u and w control variables in Equations 19.1 and 19.2, respectively. In what follows, two examples are reviewed of how models of transcription and translation can be integrated or embedded into larger models of central carbon metabolism and beyond. Two different integrated modeling paradigms are explored: augmented constraints based models and cybernetic models.
19.6.1 Multiscale Constraints Based Models Have Increased Capabilities

Constraints based models and FBA in general do not explicitly contain regulatory program models; however, one might argue that FBA does implicitly consider the input from regulatory programs through experimentally derived constraints. Covert and Palsson have explicitly augmented FBA to include transcriptional regulation and have demonstrated marked improvement in model capabilities [109,110]. Constraints based models are formulated by writing mass balances around intracellular metabolites similar in form to Equation 19.3 with vi ≡ 1. If the M-dimensional vector of intracellular metabolites is denoted by x, then the mass balances governing the time rate of change of intracellular metabolites are given by:
$$\frac{dx}{dt} = S\,r(e, x, k) + T\,q(t) \tag{19.7}$$
where r(e, x, k) denotes the R-dimensional vector of reaction rates or fluxes (a function of enzyme and metabolite concentrations and kinetic parameters). The M × R dimensional stoichiometric matrix is denoted by S, while the M × Q transport matrix, which describes the stoichiometry of material exchange between the biotic and abiotic phases, is denoted by T. The Q-dimensional vector of measured exchange fluxes is given by q. In general, the control programs affecting enzyme concentration and the kinetic parameters are not known; thus, how can we say anything about physiology? The key insight offered by the constraints based paradigm is that important metabolic properties can be calculated despite parametric uncertainty by reducing the complexity of the
problem. The time-scale of metabolic transients is 1 ms–10 s while the time-scale of cell-growth is on the order of hours to days, thus, Equation 19.7 is effectively at steady-state:
$$S\,r^{*}(t_j) = -T\,q(t_j), \qquad j = 1, 2, 3, \ldots \tag{19.8}$$
where r*(tj) denotes the unknown steady-state flux vector while q(tj) denotes a vector of measured exchange fluxes at time step j. The unknown flux vector is constrained by experimental measurements and/or thermodynamic arguments, i.e., some metabolic transformations are irreversible whereas others are reversible; thus r*(tj) is bounded:
$$\alpha_i(t_j) \le r_i^{*}(t_j) \le \beta_i(t_j), \qquad i = 1, 2, \ldots, R \tag{19.9}$$
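Once a linear objective is chosen, the steady-state balance (Equation 19.8) and the flux bounds (Equation 19.9) become the constraints of a linear program. Maximization of growth is a common choice of objective. The weight vector c below, which selects the flux or combination of fluxes representing growth, is notation introduced here for illustration only:

$$\max_{r^{*}(t_j)} \; c^{T} r^{*}(t_j) \quad \text{subject to} \quad S\,r^{*}(t_j) = -T\,q(t_j), \qquad \alpha_i(t_j) \le r_i^{*}(t_j) \le \beta_i(t_j), \quad i = 1, 2, \ldots, R$$

Regulatory information can enter such a problem through the bounds: setting both bounds of a flux to zero removes the corresponding reaction from the feasible region for that time step.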
Conceptually, the mass balance and bounds constraints describe a feasible region of the flux space which contains all possible physiological states; however, not all of these states are reachable at any given time, in large part because of the influence of transcriptional control programs. To explore the impact that transcriptional control has upon constraints based models, Covert et al. developed a Boolean representation of transcriptional programs associated with carbon substrate preference, physiological response to oxygen limitation, amino acid biosynthesis, and the regulation of key metabolite concentrations. These Boolean programs were then integrated with a simplified core metabolic network and tested by simulation. The core metabolic model equipped with the Boolean programs was used to explore a series of examples ranging from predicting diauxic growth to understanding the physiological response to a complex medium [109]. In each case, the intracellular steady-state fluxes were computed at a given instant in time by solving Equation 19.8 using linear programming with the objective of maximizing growth subject to the flux bounds constraints. Extracellular species concentrations were then calculated by integrating the predicted exchange flux vector forward in time. The performance of the augmented model was impressive; diauxic growth on substitutable carbon substrates was correctly predicted along with several other physiologically distinct regimes. Condition-dependent control programs superimposed on stoichiometric models reduced the feasible region of flux space, making refined estimates of physiology possible. Covert and Palsson, using the skeleton metabolic network developed in Ref. [109], showed through an extreme pathway decomposition of the feasible flux space [110,111] that the number of extreme pathways (basis vectors which span the feasible flux space) was reduced from 80 to a maximum of 26 pathways.
Thus, the control logic superimposed upon the flux calculation reduced the possible solutions to only those that were appropriate given the environmental condition. Covert and Palsson went further with the Boolean logic paradigm and demonstrated that Boolean control programs could be developed for genome-scale models and that these large-scale augmented models could be valuable tools in data integration and hypothesis generation [112].
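The loop described in this section (evaluate the Boolean program, set flux bounds, maximize growth, integrate the exchange fluxes forward) can be sketched on a deliberately trivial example. In the toy below, growth is a positively weighted sum of two independent uptake fluxes, so the linear program is solved by inspection: each uptake flux sits at its upper bound. A real stoichiometric network would require an LP solver at this step. The parameters, the single Boolean rule, and the function name `simulate_diauxic_growth` are all hypothetical and are not taken from the Covert and Palsson models.

```python
def simulate_diauxic_growth(t_end=12.0, dt=0.01, eps=1e-9):
    """Toy Boolean-regulated growth on two substrates (explicit Euler)."""
    # Hypothetical parameters (illustrative, not fitted to data)
    vmax = {"glucose": 10.0, "lactose": 6.0}            # max uptake, mmol/gDW/h
    biomass_yield = {"glucose": 0.09, "lactose": 0.05}  # gDW per mmol taken up
    conc = {"glucose": 5.0, "lactose": 5.0}             # extracellular, mM
    biomass = 0.01                                      # gDW/L
    history = []

    for step in range(int(round(t_end / dt))):
        # Boolean program: lactose uptake is ON only once glucose is exhausted
        lactose_on = conc["glucose"] <= eps

        # Flux bounds: regulation and substrate availability cap each uptake
        upper = {}
        for s in conc:
            allowed = True if s == "glucose" else lactose_on
            available = conc[s] / (biomass * dt)  # cannot exceed what is present
            upper[s] = min(vmax[s], available) if allowed else 0.0

        # Trivial "LP": positive objective weights and independent bounded
        # fluxes, so the growth-maximizing solution is at the upper bounds
        flux = dict(upper)
        mu = sum(biomass_yield[s] * flux[s] for s in flux)

        # Integrate the exchange fluxes and biomass forward in time
        for s in conc:
            conc[s] = max(conc[s] - flux[s] * biomass * dt, 0.0)
        biomass += mu * biomass * dt
        history.append(((step + 1) * dt, biomass, conc["glucose"], conc["lactose"]))
    return history


history = simulate_diauxic_growth()
t_final, x_final, g_final, l_final = history[-1]
print(t_final, x_final, g_final, l_final)
```

With these assumed numbers the lactose concentration stays at its initial value until glucose is exhausted, after which growth continues at a lower rate, which is the qualitative signature of diauxic growth.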
19.6.2 Cybernetic Models Bridge Metabolic Hierarchies

Cybernetic models are a class of dynamic models, largely pioneered by Ramkrishna and coworkers, which have their origin in the hypothesis that metabolic regulation and control have evolved so that cells make optimal decisions when presented with metabolic choices [113]. The hallmark of cybernetic models is the integration of kinetics with an abstracted description of metabolic regulation and control that is manifested as control variables that modify the kinetic rates. Thus, in the context of a cybernetic model, the ujk term in Equation 19.1 is the output of an optimal control program that manages the expression of gene j in condition k. Other cybernetic control variables have been formulated; the wjk term in Equation 19.2 denotes the output of an optimal translation program for transcript j in condition k, while vqk denotes the cybernetic variable which describes enzyme activity in condition k. Since the early abstracted cybernetic models of Dhurjati et al. [114] and Kompala et al. [115,116] which were
primarily focused upon modeling diauxic growth on mixtures of sugars, cybernetic models have significantly grown in metabolic complexity, albeit not to the degree of the genome-scale models of Palsson and coworkers. Straight was the first to build explicit pathway structure into cybernetic models [117,118]; Ramakrishna [119] and later Varner [120,121] built upon this foundation and developed more biologically refined portraits of intracellular networks. The work of Varner et al. in particular highlighted both the promise and the downside of cybernetic models. A cybernetic model describing 45 genes in the central carbon metabolism of E. coli, equipped with a description of transcription, translation, and enzyme-level regulation, was, after model identification on wild-type physiological data, able to predict the physiology of a pyk knockout mutant [121]. Optimization of flux through the aspartate amino acid network is another example where a cybernetic model, identified on wild-type data, was able to predict the local impact of genetic manipulation (overexpression of feedback-resistant pathway enzymes) [120]. Cybernetic models have also been used to study storage product formation and in advanced bioreactor control system design studies [122]. However, in all these cases the model complexity is overwhelming and model identification is ad hoc. The central issue is that, in addition to identifying kinetic parameters, which is itself difficult, the structure of the optimal control programs governing metabolism must be formulated. Namjoshi and Ramkrishna [123] and later Young and Ramkrishna [124] have made progress on the cybernetic identification problem; however, it remains a critical issue.
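For readers unfamiliar with how a cybernetic control variable is computed, the sketch below uses the forms commonly associated with the early models of Kompala et al.: a matching law for enzyme synthesis (u) and a proportional law for enzyme activity (v). These expressions are not stated in this chapter and are included here as an assumption about the simplest cybernetic formulation; the rate values are hypothetical.

```python
def cybernetic_variables(rates):
    """Return (u, v) for competing pathways from their current rates.

    u_i = r_i / sum_j r_j   (matching law, allocates synthesis resources)
    v_i = r_i / max_j r_j   (proportional law, scales enzyme activity)
    """
    total = sum(rates)
    largest = max(rates)
    if total <= 0.0 or largest <= 0.0:
        # No pathway returns anything: nothing is synthesized or activated
        return [0.0] * len(rates), [0.0] * len(rates)
    u = [r / total for r in rates]
    v = [r / largest for r in rates]
    return u, v


# Hypothetical growth rates supported by two substitutable sugars
u, v = cybernetic_variables([0.9, 0.3])
print(u, v)
```

Under these assumed laws the faster pathway receives most of the synthesis resources and runs at full activity, while the slower one is partially suppressed; as the preferred substrate is depleted its rate falls and the allocation shifts, which is how such models reproduce sequential substrate use without an explicit mechanistic regulatory model.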
19.7 Summary and Conclusions

The capability to gather organism-wide data has far outstripped the ability to understand it. Transforming large-scale data sets into a better cell requires tools that integrate physiology with its environment. One such tool is multiscale mathematical modeling, where stoichiometry and kinetics are integrated with metabolic regulation and control. In this chapter, we reviewed the salient developments in the area of multiscale metabolic models, starting with traditional dynamic and stoichiometric models and finishing by exploring two different multiscale modeling paradigms: augmented constraints based models and cybernetic models. Augmented constraints based models trade the description of metabolic transients for biological complexity. Constraints based models are outstanding tools to deeply explore microbial physiology because of their very detailed representation of reaction stoichiometry. Moreover, augmented constraints based models are able to capture the influence of regulatory and control decisions, even at the genome scale. Conversely, cybernetic models are capable of predicting both metabolic transients and steady-state behavior along with intracellular metabolite, protein, and mRNA concentrations. However, cybernetic models have a limited biological scope, with the largest model being 45 genes in the central carbon metabolism of E. coli. The central issue with cybernetic models is model identification; if this issue can be overcome, then cybernetic models may become comparable to their constraints based counterparts.
References

1. Raser, J. and O’Shea, E., 2005. Noise in gene expression: origins, consequences, and control. Science, 309:2010–2013.
2. DeRisi, J., Iyer, V., and Brown, P., 1997. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 278:680–686.
3. Shuler, M., Leung, S., and Dick, C., 1979. A mathematical model for the growth of a single bacterial cell. Ann. NY Acad. Sci., 326:35–36.
4. Shuler, M. and Domach, M., 1983. Mathematical models of the growth of individual cells. Tools for testing biochemical mechanisms. In Foundations of Biochemical Engineering. 207. ACS.
5. Domach, M. and Shuler, M., 1984. Testing of a potential mechanism for E. coli temporal cycle imprecision with a structural model. J. Theor. Biol., 106:577–585.
6. Domach, M., Leung, S., Cahn, R., Cocks, G., and Shuler, M., 1984. Computer model for glucose-limited growth of a single cell of Escherichia coli B/r-A. Biotech. Bioeng., 26:203–216.
7. Wu, P., Ray, N., and Shuler, M., 1992. A single cell model for CHO cells. Ann. NY Acad. Sci., 665:152–187.
8. Wu, P., Ray, N., and Shuler, M., 1993. A computer model for intracellular pH regulation in Chinese Hamster Ovary cells. Biotech. Bioeng., 9:374–384.
9. Steinmeyer, D. and Shuler, M., 1989. Structured model for Saccharomyces cerevisiae. Chem. Eng. Sci., 44:2017–2030.
10. Lee, A., Ataai, M., and Shuler, M., 1984. Double substrate limited growth of Escherichia coli. Biotech. Bioeng., 26:1391–1401.
11. Ataai, M. and Shuler, M., 1985. Simulation of the growth pattern of a single cell of Escherichia coli under anaerobic conditions. Biotech. Bioeng., 27:1026–1035.
12. Kim, B., Good, T., Ataai, M., and Shuler, M., 1987. Growth behavior and prediction of copy number and retention of ColE1-type plasmids in E. coli under slow growth conditions. Ann. NY Acad. Sci., 506:384–395.
13. Ataai, M. and Shuler, M., 1987. A mathematical model for prediction of plasmid copy number and genetic stability in Escherichia coli. Biotech. Bioeng., 30:389–397.
14. Kim, B.G. and Shuler, M., 1990. A structured, segregated model for genetically modified E. coli cells and its use for prediction of plasmid stability. Biotech. Bioeng., 36:581–592.
15. Shuler, M. 1999. Single-cell models: promise and limitations. J. Biotechnol., 71:225–228.
16. Rizzi, M., Baltes, M., Theobald, U., and Reuss, M., 1997. In-vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: II. Mathematical model. Biotechnol. Bioeng., 55:592–608.
17. Mailinger, U.T.W., Baltes, M., Rizzi, M., and Reuss, M., 1997. In-vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: I. Experimental observation. Biotech. Bioeng., 55:305–316.
18. Chassagnole, C., Rizzi, N., Schmid, J., Mauch, K., and Reuss, M., 2002. Dynamic modeling of the central carbon metabolism of Escherichia coli. Biotech. Bioeng., 79:53–73.
19. Vaseghi, S., Baumeister, A., Rizzi, M., and Reuss, M., 1999. In-vivo dynamics of the pentose phosphate pathway in Saccharomyces cerevisiae. Metab. Eng., 1:128–140.
20. Buziol, S., Becker, J., Baumeister, A., Jung, S., and Mauch, K. et al., 2002. Determination of in-vivo kinetics of the starvation-induced Hxt5 glucose transporter of Saccharomyces cerevisiae. FEMS Yeast Res., 2:283–291.
21. Nielsen, J. and Jorgensen, H., 1995. Metabolic control analysis of the penicillin biosynthetic pathway in a high-yielding strain of Penicillium chrysogenum. Biotechnol. Prog., 11:299–305.
22. Zangirolami, T., Johansen, C., Nielsen, J., and Jorgensen, S., 1997. Simulation of penicillin production in fed-batch cultivations using a morphologically structured model. Biotech. Bioeng., 56:593–604.
23. Theilgaard, H. and Nielsen, J., 1999. Metabolic control analysis of the penicillin biosynthetic pathway: the influence of the LLD-ACV:bisACV ratio on the flux control. Antonie Van Leeuwenhoek, 75:145–154.
24. Chassagnole, C., Rais, B., Quentin, E., Fell, D., and Mazat, J., 2001. An integrated study of threonine-pathway enzyme kinetics in Escherichia coli. Biochem. J., 356:415–423.
25. Chassagnole, C., Quentin, E., Fell, D., de Atauri, P., and Mazat, J., 2003. Dynamic simulation of pollutant effects on the threonine pathway in Escherichia coli. CR Biol., 326:501–508.
26. Hatzimanikatis, V., Floudas, C., and Bailey, J., 1996. Optimization of regulatory architectures in metabolic reaction networks. Biotech. Bioeng., 52:485–500.
27. Hatzimanikatis, V., Emmerling, M., Sauer, U., and Bailey, J., 1998. Application of mathematical tools for metabolic design of microbial ethanol production. Biotech. Bioeng., 58:154–161.
28. Joshi, A. and Palsson, B., 1989. Metabolic dynamics in the human red cell. Part I: a comprehensive kinetic model. J. Theor. Biol., 141:515–528.
29. Joshi, A. and Palsson, B., 1990. Metabolic dynamics in the human red cell. Part IV Data prediction and some model computations. J. Theor. Biol., 142:69–85.
30. Lee, I., and Palsson, B., 1992. A Macintosh software package for simulation of human red blood cell metabolism. Comput. Methods Programs Biomed., 38:195–226.
31. Jamshidi, N., Edwards, J., Fahland, T., Church, G., and Palsson, B., 2001. Dynamic simulation of the human red blood cell metabolic network. Bioinformatics, 17:286–287.
32. Kauffman, K., Pajerowski, J., Jamshidi, N., and Palsson, B., 2002. Description and analysis of metabolic connectivity and dynamics of the human red blood cell. Biophys. J., 83:646–662.
33. Daae, E., Dunnill, P., Mitsky, T., Padgette, S., and Taylor, N. et al., 1999. Metabolic modeling as a tool for evaluating polyhydroxyalkanoate copolymer production in plants. Metab. Eng., 1:243–254.
34. Poolman, M., Fell, D., and Thomas, S., 2000. Modelling photosynthesis and its control. J. Exp. Bot., 51:319–328.
35. Morgan, J. and Rhodes, D., 2002. Mathematical modeling of plant metabolic pathways. Metab. Eng., 4:80–89.
36. Papoutsakis, E. 1983. A useful equation for fermentations of butyric-acid bacteria. Biotechnol. Lett., 5:253–258.
37. Papoutsakis, E. 1984. Equations and calculations for fermentations of butyric acid bacteria. Biotech. Bioeng., 26:174–187.
38. Papoutsakis, E. and Meyer, C., 1985. Equations and calculations of product yields and preferred pathways for butanediol and mixed-acid fermentations. Biotech. Bioeng., 27:50–66.
39. Varma, A., Boesch, B., and Palsson, B., 1993. Stoichiometric interpretation of Escherichia coli glucose catabolism under various oxygenation rates. Appl. Environ. Microbiol., 59:2465–2473.
40. Varma, A. and Palsson, B., 1994. Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol., 60:3724–3731.
41. Vallino, J. and Stephanopoulos, G., 1994. Carbon flux distributions at the pyruvate branch point in C. glutamicum during lysine overproduction. Biotechnol. Prog., 10:327–334.
42. Vallino, J. and Stephanopoulos, G., 1994. Carbon flux distributions at the glucose-6 phosphate branch point in Corynebacterium glutamicum during lysine overproduction. Biotechnol. Prog., 10:320–326.
43. Sauer, U., Hatzimanikatis, V., Hohmann, H., Manneberg, M., and van Loon, A. et al., 1996. Physiology and metabolic fluxes of wild-type and riboflavin-producing Bacillus subtilis. Appl. Environ. Microbiol., 62:3687–3696.
44. Pramanik, J. and Keasling, J., 1997. Stoichiometric model of Escherichia coli metabolism: incorporation of growth-rate dependent biomass composition and mechanistic energy requirements. Biotech. Bioeng., 56:398–421.
45. Pramanik, J. and Keasling, J., 1998. Effect of Escherichia coli biomass composition on central metabolic fluxes predicted by a stoichiometric model. Biotech. Bioeng., 60:230–238.
46. Burgard, A. and Maranas, C., 2001. Probing the performance limits of the Escherichia coli metabolic network subject to gene additions or deletions. Biotech. Bioeng., 74:364–375.
47. Burgard, A., Nikolaev, E., Schilling, C., and Maranas, C., 2004. Flux coupling analysis of genome-scale metabolic reconstructions. Genome Res., 14:301–312.
48. Pharkya, P., Burgard, A., and Maranas, C., 2003. Exploring the overproduction of amino acids using the bilevel optimization framework OptKnock. Biotech. Bioeng., 84:887–899.
49. Fong, S., Burgard, A., Herring, C., Knight, E., and Blattner, F. et al., 2005. In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol. Bioeng., 91:643–648.
50. Pharkya, P. and Maranas, C., 2006. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng., 8:1–13.
51. Schilling, C., Covert, M., Famili, I., Church, G., and Edwards, J. et al., 2002. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184:4582–4593.
52. Edwards, J. and Palsson, B., 2000. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. USA, 97:5528–5533.
53. Edwards, J., Ibarra, R., and Palsson, B., 2001. In-silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol., 19:125–130.
54. Reed, J., Vo, T., Schilling, C., and Palsson, B., 2003. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol., 54:1–12.
55. Forster, J., Famili, I., Fu, P., Palsson, B., and Nielsen, J., 2003. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13:244–253.
56. Duarte, N., Becker, S., Jamshidi, N., Thiele, I., and Mo, M. et al., 2007. Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proc. Natl. Acad. Sci. USA, 104:1777–1782.
57. Savinell, J. and Palsson, B., 1992. Optimal selection of metabolic fluxes for in-vivo measurement: I. Development of mathematical methods. J. Theor. Biol., 155:201–214.
58. Becker, S.A., Feist, A.M., Mo, M.L., Hannum, G., and Palsson, B.O., et al. 2007. Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox. Nat. Protocols., 2:727–738.
59. Sriram, G. and Shanks, J., 2004. Improvements in Metabolic Flux Analysis using carbon bond labeling experiments: bondomer balancing and Boolean function mapping. Metab. Eng., 6:116–132.
60. Sauer, U., Cameron, D., and Bailey, J., 1998. Metabolic capacity of Bacillus subtilis for the production of purine nucleosides, riboflavin, and folic acid. Biotech. Bioeng., 59:227–238.
61. Fischer, E. and Sauer, U., 2003. Metabolic flux profiling of Escherichia coli mutants in central carbon metabolism using GC-MS. Eur. J. Biochem., 270:880–891.
62. Zupke, C. and Stephanopoulos, G., 1994. Modeling of isotope distribution and intracellular fluxes in metabolic networks using atom mapping matrices. Biotechnol. Prog., 10:489–498.
63. Szyperski, T. 1995. Biosynthetically directed fractional 13C-labeling of proteinogenic amino acids. An efficient analytical tool to investigate intermediary metabolism. Eur. J. Biochem., 232:433–438.
64. Schmidt, J., Carlsen, M., Nielsen, J., and Villadsen, J., 1997. Modeling isotopomer distributions in biochemical networks using isotopomer mapping matrices. Biotech. Bioeng., 55:831–840.
65. Sauer, U., Hatzimanikatis, V., Bailey, J., Hochuli, M., and Szyperski, T. et al., 1997. Metabolic fluxes in riboflavin-producing Bacillus subtilis. Nat. Biotechnol., 15:448–452.
66. Schmidt, K., Nielsen, J., and Villadsen, J., 1999. Quantitative analysis of metabolic fluxes in Escherichia coli, using two-dimensional NMR spectroscopy and complete isotopomer models. J. Biotechnol., 71:175–189.
67. Wittmann, C. and Heinzle, E., 1999. Mass spectrometry for metabolic flux analysis. Biotech. Bioeng., 62:739–750.
68. de Graaf, A., Mahle, M., Mollney, M., Wiechert, W., and Stahmann, P. et al., 2000. Determination of full 13C isotopomer distributions for metabolic flux analysis using heteronuclear spin echo difference NMR spectroscopy. J. Biotechnol., 77:25–35.
69. Canonaco, F., Hess, T., Heri, S., Wang, T., and Szyperski, T. et al., 2001. Metabolic flux response to phosphoglucose isomerase knock-out in Escherichia coli and impact of overexpression of the soluble transhydrogenase UdhA. FEMS Microbiol. Lett., 204:247–252.
70. Maaheimo, H., Fiaux, J., Cakar, Z., Bailey, J., and Sauer, U. et al., 2001. Central carbon metabolism of Saccharomyces cerevisiae explored by biosynthetic fractional (13)C labeling of common amino acids. Eur. J. Biochem., 268:2464–2479.
71. Wittmann, C. 2002. Metabolic flux analysis using mass spectrometry. Adv. Biochem. Eng. Biotechnol., 74:39–64.
72. Wiechert, W. 2002. An introduction to 13C metabolic flux analysis. Genet. Eng., 24:215–238.
73. Emmerling, M., Dauner, M., Ponti, A., Fiaux, J., and Hochuli, M. et al., 2002. Metabolic flux responses to pyruvate kinase knockout in Escherichia coli. J. Bacteriol., 184:152–164.
74. Fischer, E., Zamboni, N., and Sauer, U., 2004. High-throughput metabolic flux analysis based on gas chromatography-mass spectrometry derived 13C constraints. Anal. Biochem., 325:308–316.
75. Sauer, U. 2004. High-throughput phenomics: experimental methods for mapping fluxomes. Curr. Opin. Biotechnol., 15:58–63.
76. Csete, M. and Doyle, J., 2002. Reverse engineering of biological complexity. Science, 295:1664–1669.
77. Lee, S.B. and Bailey, J., 1984. Genetically structured models for lac promoter–operator function in Escherichia coli chromosome and in multicopy plasmids: lac operator function. Biotechnol. Bioeng., 26:1372–1383.
78. Lee, S.B. and Bailey, J., 1984. Genetically structured models for lac promoter–operator function in Escherichia coli chromosome and in multicopy plasmids: lac promoter function. Biotechnol. Bioeng., 26:1383–1389. 79. Wong, P., Gladney, S., and Keasling, J., 1997. Mathematical model of the lac operon: inducer exclusion, catabolite repression, and diauxic growth on glucose and lactose. Biotechnol. Prog., 13:132–143. 80. Eisen, M., Spellman, P., Brown, P., and Botstein, D., 1998. Cluster analysis and display of genomewide expression patterns. Proc. Natl. Acad. Sci. USA, 95:14863. 81. Wen, X., Fuhrman, S., Michaels, G., Carr, D., and Smith, S. et al., 1998. Largescale temporal gene expression mapping of central nervous system development. Proc. Natl. Acad. Sci. USA, 95:334–339. 82. D’Haeseleer, P., Wen, X., Fuhrman, S., and Somogyi, R., 1999. Linear modeling of mRNA expression levels during CNS development and injury. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 4, 41. World Scientific Publishing Co., California. 83. Dougherty, E., Barrera, J., Brun, M., Cesar, S.K.R., and Chen, Y. et al., 2002. Inference from clustering with applications to gene-expression microarrays. J. Comp. Biol., 9:105–126. 84. Weaver, D., Workman, C., and Stormo, G., 1999. Modeling regulatory networks with weight matrices. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 4, 1 12. World Scientific Publishing Co., California. 85. Holter, N., Maritan, A., Cieplak, M., Fedoroff, N., and Banavar, J., 2001. Dynamic modeling of gene expression data. Proc. Natl. Acad. Sci. USA, 98:1693–1698. 86. di Bernardo, T.G.D., Lorenz, D., and Collins, J., 2003. Inferring genetic networks and identifying compound mode of action via expression profiling. Science, 302:102–105. 87. Zak, D., Gonye, G., and Schwaber, J., 3rd FD, 2003. 
Importance of input perturbations and stochastic gene expression in the reverse engineering of genetic regulatory networks: insights from an identifiability analysis of an in silico network. Genome Res., 13:2396–2405. 88. Friedman, N., Linial, M., Nachman, I., and Pe’er D., 2000. Using Bayesian networks to analyze expression data. J. Comp. Biol., 7:601. 89. Pe’er, D., Regev, A., Elidan, G., and Friedman, N., 2001. Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17:215. 90. Somogyi, R. and Sniegoski, C., 1996. Modeling the complexity of genetic networks: understanding multigenic and pleitropic regulation. Complexity, 1:45. 91. Akutsu, T., Miyano, S., and Kuhara, S., 1999. Identification of genetic networks from a small number of gene expression patterns under the Boolean network model. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 4, 17. World Scientific Publishing Co., California. 92. Ideker, T., Thorsson, V., and Karp, R., 2000. Discovery of regulatory interactions through perturbation: inference and experimental design. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 5, 302. World Scientific Publishing Co., California. 93. Akutsu, T., Miyano, S., and Kuhara, S., 2000. Inferring qualitative regulations in genetic networks and metabolic pathways. Bioinformatics, 16:727–734. 94. Shmulevich, I., Dougherty, E., Kim, S., and Zhang, W., 2002. Probabilistic Boolean networks: a rule based uncertainty model for gene regulatory networks. Bioinformatics, 18:261–274. 95. Shmulevich, I., Lahdesmaki, H., Dougherty, E., Astola, J., and Zhang, W., 2003. The role of certain post classes in Boolean network models of genetic networks. Proc. Natl. Acad. Sci. USA, 100:10734–10739. 96. Jong, H., 2002. Modeling and simulation of genetic regulatory systems: a literature review. J. Comp. Biol. 9:67. 97. Chen, T., He, H., and Church, G., 1999. Modeling gene expression with differential equations. 
In: Proceedings of the Pacific Symposium on Biocomputing. Volume 4, 9. World Scientific Publishing Co., California. 98. Dasika, M.S., Gupta, A., Maranas, C.D., and Varner, J.D., 2004. A mixed integer linear programming (MILP) framework for inferring time delay in gene regulatory networks. In: Proceedings of the Pacific Symposium on Biocomputing. 474–485. World Scientific Publishing Co., California.
99. Gupta, A., Varner, J., and Maranas, C., 2005. Large-scale inference of the transcriptional regulation of Bacillus subtilis. Comp. Chem. Eng., 29:565–576. 100. Dobrin, R., Beg, Q., Barabasi, A., and Oltvai, Z., 2004. Aggregation of topological motifs in the Escherichia coli transcriptional regulation network. BMC Bioinformatics, 5:10–17. 101. Christensen, C., Gupta, A., Maranas, C., and Alberta, R., 2007. Large-scale inference and graphtheoretical analysis of gene-regulatory networks in B. subtilis. Physica A, 373:786–810. 102. Bergmann, S., Ihmels, J., and Barkai, N., 2004. Similarities and differences in genome-wide expression data of six organisms. PLOS Biol., 2:E9. 103. Savageau, M. 1998. Rules for the evolution of gene circuitry. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 3, 54. World Scientific Publishing Co., California. 104. Maki, Y., Tominaga, D., Okamoto, M., Watanabe, S., and Eguchi, Y., 2001. Development of a system for the inference of large scale genetic networks. In: Proceedings of the Pacific Symposium on Biocomputing. Volume 6, 446. World Scientific Publishing Co., California. 105. Kikuchi, S., Tominaga, D., Arita, M., Takahashi, K., and Tomita, M., 2003. Dynamic modeling of genetic networks using genetic algorithm and S-system. Bioinformatics, 19:643–650. 106. Hatzimanikatis, V., and Lee, K., 1999. Dynamical analysis of gene networks requires both mRNA and protein expression information. Metab. Eng., 1:275 –281. 107. Amit, M., Lee, K., and Hatzimanikatis, V., 2003. Insights into the relation between mRNA and protein expression patterns: I. Theoretical considerations. Biotech. Bioeng., 84:822–833. 108. Lee, P., Shaw, L., Choe, L., Mehra, A., and Hatzimanikatis, V. et al., 2003. Insights into the relation between mRNA and protein expression patterns: II. Experimental observations in Escherichia coli. Biotechnol. Bioeng., 84:834–841. 109. Covert, M. and Palsson, B., 2002. 
Transcriptional regulation in constraintsbased metabolic models of Escherichia coli. J. Biol. Chem., 277:28058–28064. 110. Covert, M. and Palsson, B., 2003. Constraints-based models. Regulation of gene expression reduces the steady-state solution space. J. Theor. Biol., 221:309–325. 111. Schilling, C. and Palsson, B., 1998. The underlying pathay structure of biochemical reaction networks. Proc. Natl. Acad. Sci. USA, 95:4193–4198. 112. Covert, M., Knight, E., Reed, J., Herrgard, M., and Palsson, B., 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature, 429:92–96. 113. Ramkrishna, D., 1982. Cybernetic perspective of microbial growth. In Foundations of Biochemical Engineering: Kinetics and Thermodynamics in Biological Systems. American Chemical Society, Washington, DC. 114. Dhurjati, P., Ramkrishna, D., Flickinger, M., and Tsao, G., 1985. A cybernetic view of microbial growth: modeling of cells as optimal strategists. Biotech. Bioeng., 27:1–9. 115. Kompala, D., Ramkrishna, D., and Tsao, G., 1984. Cybernetic modeling of microbial growth on multiple substrates. Biotechnol. Bioeng., 26:1272–1281. 116. Kompala, D., Ramkrishna, D., Jansen, N., and Tsao, G., 1986. Investigation of bacterial growth on mixed substrates: Experimental evaluation of cybernetic models. Biotechnol. Bioeng., 28:1044–1055. 117. Straight, J. and Ramkrishna, D., 1994. Cybernetic modeling and regulation of metabolic pathways. Growth on complementary nutrients. Biotechnol. Prog., 10:574–587. 118. Straight, J. and Ramkrishna, D., 1994. Modeling of bacterial growth under multiply-limiting conditions. Experiments under carbonor/and nitrogen-limiting conditions. Biotechnol. Prog., 10:588–605. 119. Ramakrishna, R., Ramkrishna, D., and Konopka, A., 1996. Cybernetic modeling of growth in mixed, substitutable substrate environments: preferential and simultaneous utilization. Biotech. Bioeng., 52:141–151. 120. Varner, J. and Ramkrishna, D., 1999. 
Metabolic engineering from a cybernetic perspective. The aspartate family of amino acids. Metab. Eng., 1:88–116.
121. Varner, J., 2000. Large-scale prediction of phenotype: Concept. Biotechnol. Bioeng., 69:664–678. 122. Gadkar, K., 3rd FD, Crowley, T., Varner, J., 2003. Cybernetic model predictive control of a continuous bioreactor with cell recycle. Biotechnol. Prog., 19:1487–1497. 123. Namjoshi, A. and Ramkrishna, D., 2005. A cybernetic modeling framework for analysis of metabolic systems. Comp. Chem. Eng., 29:487–498. 124. Young, J. and Ramkrishna, D., 2007. On the matching and proportional laws of cybernetic models. Biotechnol. Prog., 23:83–99.
20 Validation of Metabolic Models

Sang Yup Lee, Hyohak Song, Tae Yong Kim, and Seung Bum Sohn
Metabolic and Biomolecular Engineering National Research Laboratory (BK 21 Program)

20.1 Introduction
Metabolic Modeling in the Pre–Genome Era • Metabolic Modeling in the Post–Genome Era
20.2 Metabolic Model Validation
Tools for Validating the Metabolic Model • Validation of the Metabolic Model • Validation by Genetic Perturbation
20.3 Conclusions and Future Prospects
Acknowledgments
References
20.1 Introduction

Genome sequences and high-throughput experimental data are now being generated in unprecedented quantities. It is therefore increasingly important to analyze these data systematically in order to understand an organism as a whole system [1,2]. Metabolic modeling has progressed rapidly and now allows complex intracellular reaction networks to be understood both qualitatively and quantitatively; by simulating a metabolic model, one can predict the functional behavior of an organism under various genetic and environmental perturbations. More recently, various genome-scale metabolic models have been constructed and used to understand metabolic characteristics and their alterations under various conditions [3,4]. Before genome information was available, the construction of metabolic models relied on physicochemical laws and principles, drawing on knowledge of biochemical reactions and thermodynamic parameters [5,6]. Although these models are useful, they do not satisfactorily represent the true metabolic characteristics of a cell. Genome sequencing has paved the way for the construction of genome-scale metabolic models that integrate annotation results, biochemical and physiological data, and other related information from various studies [7–9]. The first genome-scale metabolic model of Escherichia coli appeared in 2000 and contained 720 reactions associated with metabolism and the transport of metabolites [10]. In parallel with the rapid development of genomics, advances in high-throughput experimental techniques have generated a wide range of omics data, such as transcriptomic, proteomic, and metabolomic data, that allow us to construct well-organized metabolic models [1,11,12].
While these models allow quantitative predictions of the behavior of organisms, they must be validated against real experimental results before they can be used as prediction and analysis tools. Thus, the most important and essential step in metabolic model construction is validating the ability of the model to predict the behavior of an organism with reasonable accuracy [13,14]. Although a myriad of data on metabolite concentrations, metabolic fluxes, reaction kinetics, and enzyme activities are becoming available in the literature and databases [15–18], many reactions remain uncertain and the quantitative influence of these factors is ambiguous. Also, the kinetics and constants governing the metabolic reactions are often incomplete or missing entirely. Thus, validation must always be a central part of the modeling process: the simulation results of a model should be compared with experimental data obtained under the corresponding genetic and environmental perturbations (Figure 20.1). This chapter briefly introduces the procedures for the construction and validation of genome-scale metabolic models.
20.1.1 Metabolic Modeling in the Pre–Genome Era

Modeling and simulation have long served as powerful tools for interpreting experimental results and describing natural phenomena in a wide range of disciplines across science and engineering [1,5,10,19]. In the biological sciences, many attempts have been made to develop metabolic models that interpret cellular physiology and predict cellular behavior under different environmental conditions [20–22]. In particular, the application of metabolic models to the optimization of existing fermentation processes and the design of new processes has increased considerably over the years with the growing demand for commercially valuable bioproducts [23–25]. However, these models are highly unstructured and rest on a limited number of measurable data combined with various hypotheses, and thus cannot adequately describe the behavior of organisms. Metabolic modeling requires detailed information on cellular metabolism, and the accuracy and usefulness of a model depend strongly on the quality and quantity of the available metabolic data. Traditional fermentation experiments can only provide information on biomass, substrates, and extracellular metabolites, but typically not on the intracellular metabolites needed to build the model. Although advances in analytical techniques for measuring enzyme activities and intracellular metabolites have made it possible to build more structured metabolic models, the models still depend heavily on assumptions because reliable information on cellular metabolism is scarce [26,27]. Additionally, the complexity of the metabolic network and the large volume of information required for its reconstruction have made it difficult to construct a well-structured model that fully captures the nature of cellular functions and predicts cellular behavior [28,29].

Figure 20.1 Validation and iterative refinement of a genome-scale metabolic model. (Diagram elements: environmental and genetic perturbation; experimental analysis; omics technology; experimental model; observation; genome-scale metabolic model; computational model; hypothesis; simulation; model validation; knowledge generation; iterative model refinement based on new information and knowledge.) Genome-scale models can be used to predict the responses of cellular metabolism to various genetic and environmental perturbations. These responses can be used to validate and refine genome-scale metabolic models by comparing the simulation results with the actual data obtained from wet experiments. Even after the genome-scale metabolic model is initially validated, it needs to go through iterative refinement procedures based on new information and knowledge.
20.1.2 Metabolic Modeling in the Post–Genome Era

In the post–genome era, the development of detailed and more accurate metabolic networks has become possible based on annotated genome information. Also, the rapid development of new analytical techniques for profiling the transcriptome, proteome, and metabolome has made it possible to decipher cellular states at the mRNA, protein, and metabolite levels. These tools have paved the way for a new strategy for metabolic modeling: genome-scale in silico metabolic modeling. With the availability of genome sequences since the mid-1990s, genome-scale biochemical reaction networks have been reconstructed for a number of microorganisms. Currently, 949 complete genome sequences are available and the genome sequencing of 3,509 organisms is in progress (last updated: February 6, 2009). Although genome data by themselves cannot serve as a complete metabolic model, the integration of genome information with known biochemical and physiological information allows the initial construction of a metabolic network. More recently, imposing the governing constraints that the in silico cell represented by a genome-scale metabolic model must satisfy in order to survive has successfully elucidated new characteristics and capabilities of organisms. There are two classes of models that are used to simulate an organism: the kinetic model and the stoichiometric model [1]. The kinetic model requires detailed kinetic information on each metabolic reaction. With this information, the kinetic model can dynamically predict the results of any perturbation made to the system. Unfortunately, kinetic data are lacking for many reactions and are not easy to obtain for a large number of reactions. As a result, the construction of genome-scale kinetic models and their use in systems biological studies are limited. Even for reactions with known kinetic parameters, the outcomes of simulations can be doubtful because kinetic parameters determined in vitro do not necessarily reflect the true values in vivo. Thus, the stoichiometric model has been employed more often to represent the genome-scale metabolism of a cell. A number of genome-scale metabolic models, including those of Escherichia coli [7,10,30], Haemophilus influenzae [31], Helicobacter pylori [32,33], Lactobacillus plantarum [34], Lactococcus lactis [35], Mannheimia succiniciproducens [36], Saccharomyces cerevisiae [37,38], Staphylococcus aureus [39,40], Streptomyces coelicolor [41], and Methanosarcina barkeri [42], are now available (Table 20.1). The genome-scale metabolic networks of E. coli [7] and S. cerevisiae [37,38], representative prokaryotic and eukaryotic microorganisms, respectively, have already been validated to a large extent using information from the literature and experimental data. These models are being actively employed for developing metabolic engineering strategies. For instance, the genome-scale metabolic model of E. coli was successfully used to develop a recombinant E. coli strain capable of enhanced succinic acid production by applying metabolic flux analysis and pathway analysis [43]. Also, the validated metabolic model of S. cerevisiae was used to design a strategy to enhance ethanol production [44]. Through the addition of regulatory mechanisms that allow conditional activation and inactivation of metabolic pathways, the gene-to-protein relationship can be incorporated into the genome-scale metabolic model. The genome sequence of H. influenzae Rd was determined in 1995 [45]. Through the functional annotation of the genome, a great number of metabolic reactions were newly characterized, leading to a rapid increase in the sizes of the metabolic networks [7,33,37].
This demonstrated that it is possible to construct metabolic networks of biologically lesser studied organisms. As a result of these newly characterized reactions, the metabolic network of H. influenzae was expanded to 461 reactions and 343 metabolites; similarly, that of H. pylori was expanded to 476 reactions and 411 metabolites [31,33]. Simulations of these models correlate well with experimental data and predict phenotypic changes under different environmental conditions with sufficient accuracy. Also, validated models, such as the genome-scale models of E. coli and S. cerevisiae, offer to some degree a basis for selecting target genes for metabolic engineering to maximize the production of desired bioproducts. Thus, validated genome-scale metabolic models can be used not only for deciphering metabolic characteristics but also for developing metabolic engineering strategies.

Table 20.1 Recently Developed Genome-Scale in Silico Metabolic Models

| Organism | Year | Genome size (kbp) | Metabolites (ea) | Reactions (ea) | References |
|---|---|---|---|---|---|
| Escherichia coli K-12 iJE660a GSM | 2001 | 4,639 | 438 | 627 | [10] |
| Escherichia coli K-12 iJR904 GSM/GPR | 2003 | 4,639 | 625 | 931 | [7] |
| Escherichia coli K-12 EcoMBEL979 | 2005 | 4,639 | 814 | 979 | [30] |
| Haemophilus influenzae Rd | 2002 | 1,830 | 343 | 488 | [28] |
| Helicobacter pylori 26695 | 2001 | 1,667 | 339 | 388 | [29] |
| Helicobacter pylori 26695 | 2005 | 1,667 | 411 | 476 | [30] |
| Lactobacillus plantarum WCFS1 | 2005 | 3,308 | 670 | 704 | [31] |
| Lactococcus lactis | 2005 | 2,365 | 509 | 621 | [32] |
| Mannheimia succiniciproducens MBEL55E | 2004 | 2,314 | 352 | 373 | [33] |
| Methanosarcina barkeri | 2006 | 4,873 | 558 | 619 | [39] |
| Saccharomyces cerevisiae iFF708 | 2003 | 12,069 | 584 | 842 | [35] |
| Saccharomyces cerevisiae iND750 | 2004 | 12,069 | 646 | 1149 | [34] |
| Staphylococcus aureus N315 | 2005 | 2,813 | 712 | 774 | [36] |
| Staphylococcus aureus N315 | 2005 | 2,813 | 571 | 640 | [37] |
| Streptomyces coelicolor A3(2) | 2005 | 8,667 | 500 | 971 | [38] |
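The two model classes introduced in this section (kinetic and stoichiometric) can be contrasted with a pair of mass-balance equations. The notation is an assumption introduced here for illustration and is not taken from the chapter: $x$ is the vector of intracellular metabolite concentrations, $S$ the stoichiometric matrix, $v$ the vector of reaction rates (fluxes), $p$ the kinetic parameters, and $v_{\min}$, $v_{\max}$ the flux bounds.

```latex
% Kinetic model: every rate is an explicit function of concentrations and parameters
\frac{dx}{dt} = S\, v(x, p)

% Stoichiometric model: (pseudo-)steady state; rates are unknowns restricted by bounds
S\, v = 0, \qquad v_{\min} \le v \le v_{\max}
```

The kinetic form needs the rate laws $v(x, p)$, which is exactly the information the text describes as missing for many reactions. The stoichiometric form needs only the network structure and bounds, which is why it scales more readily to genome-scale networks.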
20.2 Metabolic Model Validation

Validation is possibly the most difficult and critical part of the model building process, yet it is often overlooked. Model validation can be defined as “the process of determining that a model or simulation is an accurate representation of the real world from the perspective of the intended use of the model and simulation” [46]. However, it is still difficult to define what a valid model is, because the precision required of a model depends on the purpose of the modeling and simulation. For quantitative model validation, all experimental data must be consistent with all model predictions within an acceptable error range [47]. In other cases, a model can be considered valid if it qualitatively represents the basic behavior of the real system to the satisfaction of the user. When quantitative analysis of a metabolic network is desired, the model must be validated by comparing the simulation results with as many experimental data as possible. The selection of the data to be compared and of the perturbations to be simulated is important for model validation. Various analytical techniques now make it possible to generate large amounts of quantitative intracellular data on metabolite concentrations, metabolic fluxes, reaction kinetics, and enzyme activities. Organisms respond in characteristic ways to different culture conditions, such as medium composition, dilution rate, headspace gas composition, and carbon source. It has also become possible to impose genetic perturbations, including gene knock-out and gene overexpression, on genome-scale metabolic models and examine the consequences.
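The quantitative criterion stated above (every measurement consistent with the corresponding prediction within an acceptable error range) can be sketched as a simple check. This is only an illustration under stated assumptions: the function name `validate_quantitatively`, the 10% relative tolerance, and all numbers are hypothetical and do not come from the chapter or from any particular study.

```python
from typing import Dict, List, Tuple


def validate_quantitatively(
    predicted: Dict[str, float],
    measured: Dict[str, float],
    rel_tol: float = 0.10,   # assumed acceptable relative error (10%)
    abs_tol: float = 1e-6,   # floor so that measurements near zero are comparable
) -> Tuple[bool, List[str]]:
    """Return (is_valid, failures).

    The model passes only if every measured quantity has a prediction
    that lies within the allowed error of the measurement.
    """
    failures: List[str] = []
    for name, observed in measured.items():
        if name not in predicted:
            failures.append(name)  # no prediction available counts as a failure
            continue
        allowed = max(abs_tol, rel_tol * abs(observed))
        if abs(predicted[name] - observed) > allowed:
            failures.append(name)
    return (not failures, failures)


# Hypothetical values for illustration only (units such as 1/h or mmol/gDCW/h assumed)
measured = {"growth_rate": 0.70, "glucose_uptake": 10.0, "acetate_secretion": 4.0}
predicted = {"growth_rate": 0.73, "glucose_uptake": 10.0, "acetate_secretion": 5.2}

ok, failures = validate_quantitatively(predicted, measured)
print(ok, failures)  # False ['acetate_secretion']
```

The returned list of failing quantities matters more than the pass/fail flag, since it points to the part of the network that may need refinement. The tolerance itself should be chosen from the measurement error and the intended use of the model, in line with the definition quoted above.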
Once the model is constructed, it must be validated to show that it can accurately simulate the organism. One of the most straightforward methods for validating metabolic models is to measure in vivo intracellular fluxes by NMR or GC/MS experiments. Despite their successful application, however, NMR and GC/MS experiments have so far been limited to relatively small networks of central metabolism. Instead, many metabolic models have been validated by analyzing cellular responses to applied perturbations: the cellular metabolic network and/or its environment is perturbed, and the experimental results are compared with those predicted by in silico simulations. This method not only allows validation of the genome-scale model but also helps to reveal new information on the metabolic characteristics of the organism. The new information is then incorporated into the model and the process is repeated. Although current genome-scale metabolic models do not correctly represent all the characteristics of organisms, they are accurate enough to serve their purpose of explaining metabolic properties under given conditions.
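The perturbation-based comparison described above can be sketched for the qualitative case, where each knock-out is scored only as growth or no growth. Everything here is an assumption made for illustration: the function name `knockout_agreement`, the placeholder gene names, and the True/False values are hypothetical and are not experimental results.

```python
from typing import Dict, List, Tuple


def knockout_agreement(
    predicted: Dict[str, bool],
    observed: Dict[str, bool],
) -> Tuple[float, List[str]]:
    """Compare predicted and observed growth (True) or no growth (False).

    Only knock-outs present in both dictionaries are scored.
    Returns the fraction of agreeing cases and the list of disagreeing genes.
    """
    shared = sorted(set(predicted) & set(observed))
    if not shared:
        raise ValueError("no knock-outs in common between prediction and observation")
    mismatches = [gene for gene in shared if predicted[gene] != observed[gene]]
    accuracy = (len(shared) - len(mismatches)) / len(shared)
    return accuracy, mismatches


# Placeholder gene names and outcomes, for illustration only
observed = {"geneA": True, "geneB": False, "geneC": True, "geneD": False, "geneE": True}
predicted = {"geneA": True, "geneB": False, "geneC": False, "geneD": True, "geneE": True}

accuracy, mismatches = knockout_agreement(predicted, observed)
print(accuracy, mismatches)  # 0.6 ['geneC', 'geneD']
```

In the iterative scheme the text describes, the mismatching cases are the useful output: each one marks a place where the network may be missing a reaction, contain a wrong one, or lack regulation, and so suggests what to revisit before the next round of simulation.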
20.2.1 Tools for Validating the Metabolic Model

20.2.1.1 Flux Balance Analysis (FBA)

When genome-scale models are reconstructed, the number of metabolites is in general much smaller than the number of reactions. In the case of the EcoMBEL979 E. coli model [30], the number of reactions is 979 and the number of metabolites is 814. Some measurable exchange fluxes can be used as additional constraints on the system, but the number of these fluxes is usually smaller than the number of degrees of freedom. Thus, a linear optimization algorithm is commonly employed to obtain the optimal flux distribution under various genotypic and environmental conditions by optimizing an objective function. In this way, an optimal solution satisfying the applied constraints and the objective function can be obtained. The objective function is commonly defined for the following purposes: first, the prediction of an optimal growth phenotype, such as the maximum growth rate; second, the identification of the capacity to produce desired products; and third, the minimization of byproduct formation. Other objective functions can also be defined depending on the purpose of the simulation. This approach, based on the linear optimization of an objective function to study flux distribution, is referred to as FBA and is widely used for the analysis of various metabolic systems [48].

FBA has been applied to various wild-type organisms to validate genome-scale metabolic models and to decipher metabolic characteristics under various conditions. FBA has also been applied to recombinant organisms producing metabolites. In lactic acid producing E. coli, FBA was used to search for genes whose deletion would increase the lactic acid production rate as the growth rate increases. Three strain designs for lactic acid production were implemented, yielding a total of 11 evolved production strains. When FBA was applied to a recombinant E. coli strain producing the biodegradable polymer poly-3-hydroxybutyrate (PHB), it predicted that the Entner–Doudoroff pathway was activated during PHB production. This prediction was validated with a mutant E. coli strain deficient in the activity of 2-keto-3-deoxy-6-phosphogluconate aldolase (Eda); the eda mutant strain accumulated less PHB than its parent strain, a deficit that could be restored by overexpressing the eda gene [49]. These results suggest that FBA is useful for the analysis of metabolite producing systems, especially for identifying important metabolic pathways operating under particular conditions. Therefore, FBA is useful both in the validation of metabolic models and in the design of metabolic engineering strategies for the improved production of various metabolites of biotechnological interest.

20.2.1.2 Metabolic Pathway Analysis

There are two key methods of pathway analysis: elementary mode analysis and extreme pathway analysis. The two methods share many characteristics and are easily confused with one another, but certain properties distinguish them. Elementary modes are a set of vectors derived from the stoichiometric matrix through convex analysis. These vectors form a unique set for a given network. Each consists of the minimum number of reactions needed to form a functional unit (genetic independence) [50], and together they make up the set of all possible routes through the system. Because elementary modes comprise all possible routes, the number of pathways involved is large. Extreme pathway analysis uses the stoichiometric matrix to create a set of convex basis vectors [51]. These pathways share the first two characteristics of elementary modes: they are a unique set for the given network, and each is made up of the minimum number of reactions needed to exist as a functional unit. However, no extreme pathway is a nonnegative linear combination of other extreme pathways, whereas the elementary modes consist of all possible pathways, including such nonnegative linear combinations. Another characteristic that distinguishes extreme pathways from elementary modes is that extreme pathway analysis treats a reversible reaction as two distinct reactions, whereas elementary mode analysis does not. In many cases the two methods lead to the same result and are indistinguishable. However, extreme pathways are actually a subset of the elementary modes lying on the edges of the solution space. Elementary modes can include any number of additional pathways since they consist of all possible routes, including the nonnegative linear combinations. As a result, the number of elementary modes is greater than or equal to the number of extreme pathways. In a more complex network of reactions, there are groups of reactions that always occur together at fixed ratios and form a subset within the extreme pathways, meaning that if one reaction carries flux, the others in the same subset do as well. This helps in the construction and validation of the metabolic model: if one of the reactions is found in the system, then the rest of the reactions in the subset must be included.
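Both FBA and the pathway methods above operate on the stoichiometric matrix, and the underdetermination noted for FBA (fewer metabolites than reactions) can be checked directly: at steady state the number of degrees of freedom equals the number of reactions minus the rank of the matrix. The sketch below computes this with exact arithmetic from the Python standard library. The four-reaction toy network and the function names are assumptions made for illustration, not part of any published model.

```python
from fractions import Fraction
from typing import List


def matrix_rank(matrix: List[List[int]]) -> int:
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # find a row at or below the current rank with a nonzero entry in this column
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


def degrees_of_freedom(S: List[List[int]]) -> int:
    """Number of independent fluxes left undetermined by S v = 0."""
    return len(S[0]) - matrix_rank(S)


# Toy network (hypothetical): rows are metabolites A and B, columns are reactions
#   R1: -> A      R2: A -> B      R3: B ->      R4: A ->
S = [
    [1, -1, 0, -1],  # A
    [0, 1, -1, 0],   # B
]

print(matrix_rank(S), degrees_of_freedom(S))  # 2 2

# Lower bound for the EcoMBEL979 figures quoted in the text:
# the rank cannot exceed the 814 metabolites, so at least 979 - 814 fluxes are free.
print(979 - 814)  # 165
```

With two free fluxes in the toy network, two independent measurements (for example, uptake through R1 and secretion through R4) would fix the whole flux distribution. When fewer measurements are available than degrees of freedom, as the text notes is usual at genome scale, an objective function is needed to select one solution.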
In the validation of the metabolic model, determining the characteristics of the metabolic pathways provides a list of possible flux routes through the system and an indication of how the system responds to changes. In a complicated network where more than two products are formed, the fluxes to the products can be determined and adjusted so that the main product is maximized with a minimal amount of by-product. These predicted flux values can be compared with experimental results to determine the accuracy of the model prediction. Likewise, should an intermediate reaction be eliminated, the change in fluxes can be predicted and then compared with experimental observations. Additionally, metabolic pathway analysis can identify redundancies in the network and thereby indicate how sensitive cellular responses are to certain changes, such as gene deletion. Because an extreme pathway or elementary mode contains the minimal number of reactions needed to make a particular pathway functional, knocking out any one of its reactions renders that pathway silent. If the cell depends on that reaction, a drastic change in cellular function appears. However, if another pathway can compensate for the deleted reaction and the flux distribution is restructured, the response is not dramatic. Using this method, the redistribution of the metabolic fluxes can be observed, and strategies can be designed to redirect the fluxes toward the desired outcome. Extreme pathway analysis can also be used to identify reactions that do not participate in any extreme pathway and are therefore considered unused. These are “dead end” reactions, and they complicate the simulation because they often cause a mathematical build-up of metabolites that are never consumed, which is not realistic. Therefore, by identifying these reactions, further analysis can be done to characterize the system and identify pathways that utilize the dead end product.
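A quick way to see what a dead end looks like in practice is a purely topological screen: a metabolite that is produced but never consumed (or consumed but never produced) cannot carry steady-state flux. Note that this is a simpler check than the extreme pathway analysis described above, and it is swapped in here only for illustration. The data layout, the function name `find_dead_ends`, and the toy reactions are all hypothetical.

```python
from typing import Dict, Tuple

# reaction name -> (stoichiometry {metabolite: coefficient}, reversible?)
Reactions = Dict[str, Tuple[Dict[str, float], bool]]


def find_dead_ends(reactions: Reactions) -> Dict[str, str]:
    """Flag metabolites that are only produced or only consumed.

    A reversible reaction counts as both a producer and a consumer
    of every metabolite that takes part in it.
    """
    produced, consumed = set(), set()
    for stoich, reversible in reactions.values():
        for metabolite, coeff in stoich.items():
            if coeff == 0:
                continue
            if coeff > 0 or reversible:
                produced.add(metabolite)
            if coeff < 0 or reversible:
                consumed.add(metabolite)
    dead_ends = {}
    for metabolite in sorted(produced | consumed):
        if metabolite not in consumed:
            dead_ends[metabolite] = "produced but never consumed"
        elif metabolite not in produced:
            dead_ends[metabolite] = "consumed but never produced"
    return dead_ends


# Toy network (hypothetical): D is made from B, but nothing uses or exports it
toy = {
    "uptake_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "R3": ({"B": -1, "D": 1}, False),
    "export_C": ({"C": -1}, False),
}

print(find_dead_ends(toy))  # {'D': 'produced but never consumed'}
```

Each flagged metabolite is a candidate for gap filling: a consuming reaction known from other organisms can be sought, and its gene searched for in the genome of the organism being modeled. Adding such a reaction to the toy network (for example, one converting D to C) removes D from the result.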
Because the dead end product must go somewhere, reactions that use that particular metabolite as a substrate can be identified in other organisms. The gene sequences associated with those reactions are then searched against the genome sequence of the organism being modeled. When a matching reaction is identified, the dead end pathway is no longer a dead end, and the accuracy of the model increases. Schilling et al. [32] used extreme pathway analysis along with FBA in the modeling and validation of the H. pylori metabolic network. In that study, they divided the system into subsystems according to the functions of the pathways, such as amino acid synthesis and nucleotide synthesis, to facilitate
the analysis of these extreme pathways. They also calculated the ratio of the number of pathways to the number of reactions in each subsystem in order to determine the robustness of the network. The ratios, except for that of the nucleotide subsystem, were similar to those found in H. influenzae, which occupies environments similar to those of H. pylori. More recently, a new approach was proposed for complementarily identifying multiple flux distributions during metabolic flux analysis and multiple reaction pathways during structural pathway analysis [52]. This integrated approach identifies multiple flux distributions and multiple reaction pathways by combining FBA with a graph-theoretical method for reaction-pathway identification.
20.2.1.3 Computational Tools: Simulation of Metabolic Model
The advent of high-throughput technologies has somewhat eased the shortage of experimental results required for simulating local cellular components or the whole cell. Computational models continue to be generated, and the modeling targets are expanding beyond metabolic pathways to more complex cellular signaling pathways, metabolic cascades, and gene regulation [53,54]. The simulation of these cellular circuits, with their high level of interconnection and organization, is thus possible in the same way that electrical circuits can be simulated prior to production. This approach may lead to the understanding, replication, and even prediction of experimental results through simulations. Furthermore, it is expected that systems biological analysis will enable genome-scale synthetic biology to redesign and reverse engineer life [55,56]. Simulation methods are therefore an important issue, and simulations are often carried out using various optimization techniques. Currently, both dynamic and static simulations have been greatly facilitated by the development of innovative software programs. These programs reduce the need to develop sophisticated implementations, such as custom code, and make systems biology a powerful discipline for generating new knowledge.
FBA is one of the most widely used optimization techniques for the static simulation of biological networks. Palsson’s group has released an FBA program through their website (http://gcrg.ucsd.edu). The advances made in this group eventually led to the establishment of a company called Genomatica, which developed an integrated software system, SimPheny, for FBA (http://www.genomatica.com). MetaFluxNet is a program specifically designed for in silico analyses of biological networks using FBA by enabling simulations and efficient data management [57]. Its main features include an environment for customized model reconstruction, FBA simulations under genetic and environmental conditions, comparative flux analysis of different strains, different types of numerical solvers, and support for the systems biology markup language (SBML) [58] and the metabolic flux analysis markup language (MFAML) [59] for convenient data exchange with other platforms.
Algorithms are also available for pathway analysis. For studying elementary modes, the program METATOOL [60] is available. METATOOL calculates the elementary modes of the metabolic network of interest and has been upgraded several times to become more user friendly. It takes as input the stoichiometric matrix and a vector that marks which reactions are reversible, and returns an output of the optimal yields for the fluxes. ExPA [61] can be used for analyzing extreme pathways. It identifies the extreme pathways from the stoichiometric matrix and can give the output either as a matrix or as a list of reactions, depending on the preference of the user. It should be noted that when a large number of pathways is generated, substantial computation is needed to analyze the system. Moreover, as the size of the metabolic model increases, the number of computations increases exponentially.
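To make the idea of a static FBA simulation concrete, here is a minimal sketch that poses FBA as a linear program and solves it with SciPy's general-purpose linprog routine. This is a stand-in chosen for illustration and does not reproduce the interface of MetaFluxNet, SimPheny, or any other program named above. The four-reaction network, its bounds, and the function name run_fba are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not taken from any published model).
# Columns: uptake (-> A), conversion (A -> B), biomass (B ->), byproduct (A ->)
reactions = ["uptake", "conversion", "biomass", "byproduct"]
S = np.array([
    [1, -1,  0, -1],   # metabolite A
    [0,  1, -1,  0],   # metabolite B
], dtype=float)

def run_fba(S, bounds, objective_index):
    """Maximize one flux subject to S v = 0 and the flux bounds."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0  # linprog minimizes, so negate to maximize
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x

# The substrate uptake rate is capped at an assumed value of 10 units.
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]
fluxes = run_fba(S, bounds, reactions.index("biomass"))

for name, value in zip(reactions, fluxes):
    print(f"{name}: {value:.2f}")
```

With biomass formation as the objective, all of the substrate taken up is routed to biomass and the by-product flux is zero. Genome-scale models follow the same pattern with far larger stoichiometric matrices.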
20.2.2 Validation of the Metabolic Model
Validation of the model is not a one-time event. In fact, validation requires continuous revisions to the model as new information on the system is discovered. The cycle of testing the model and updating it
is an important process for ensuring that the model is as accurate as it can be. A typical procedure for the validation of metabolic models is as follows. First, a standard culture condition for the organism of interest is determined based on the literature and experience. Then the organism is cultivated under this condition and all measurable exchange fluxes are experimentally determined. Using the carbon source uptake rate determined in the previous step as a constraint, FBA is performed to calculate all other cellular fluxes, both intracellular and exchange fluxes. The calculated exchange fluxes are then compared to those measured experimentally. If the two sets of fluxes agree within an acceptable range, the metabolic model can be considered valid. If they do not agree well, the metabolic model and/or the simulation constraints need to be modified (Figure 20.1). When the model does not conform to the experimental observations, there are constraints that should be further characterized and included in the model, such as additional limits on metabolic fluxes or enzyme capacities. All of these constraints are generated according to the pertinent physicochemical properties and/or cellular regulation. Disagreement between the simulated results and the experimental observations can also originate from the metabolic pathway structure rather than the fluxes. In this case, all possible phenotypes are observed, and the pathways that lead to the expression of each phenotype are traced back until the routes responsible for it are identified. Although most of the problematic pathways, such as dead ends and missing reactions, are characterized during model construction, some pathways do not manifest any difficulties except under certain environmental conditions or genetic changes. It is these pathways that need to be checked by comparing experimental results with the simulations. This process must be iterated under a wide range of perturbations until the model prediction matches the system’s phenotype well. The parameters that are altered in these validation processes, in order to ensure that the model can represent all aspects of the system, are described in the following sections. It should be noted that most simulations are based on the exponential growth state, as indicated by the objective function of maximizing biomass formation. However, in many biotechnological applications, the exponential growth phase may not be the growth state of interest. Therefore, more studies are required to extend genome-scale metabolic simulations to these other growth states.
20.2.2.1 Validation by Environmental Perturbation
The environmental conditions that cells can encounter are determined by factors such as the dilution rate in a chemostat culture, media composition, osmolarity, head space gas composition, temperature, and medium pH, to name a few. Currently, genome-based metabolic models can be used to study metabolic states under varying dilution rates, media compositions, and gas compositions (Figure 20.2).
20.2.2.1.1 Energetic Parameters
Genome-scale metabolic models are used for predicting optimal growth and the formation of various products. In many cases, some parameters, such as energetic parameters, should be included directly in the metabolic model in order to predict the real growth rate. Many metabolic reactions consume ATP with or without contributing to a net synthesis of biomass, and these reactions are represented by the energetic parameters. The energetic parameters are the P/O ratio, the ATP yield coefficient YxATP, and the nongrowth-associated maintenance coefficient mATP.
The YxATP term is further composed of a synthesis term for building blocks, a polymerization term for the polymerization of small molecules into macromolecules, and a growth-associated maintenance term for maintaining the electrochemical gradients across the plasma membrane. Under aerobic conditions, these parameters (P/O ratio, YxATP, and mATP) are usually unknown. Thus, one of the three parameters, generally the P/O ratio, is fixed and the other two are determined from experiments. The YxATP component related to growth-associated maintenance and mATP are estimated from experimental data obtained from chemostat cultures [62]. These parameters have been used for fitting optimal growth phenotypes (Figure 20.3). When the parameters associated with maintenance energy are adjusted, the simulated substrate uptake rate or cell growth rate changes accordingly, and it fits the experimental data once the correct parameter values have been determined. However, the maintenance coefficients are a function of a number of other operating variables, e.g., medium composition, temperature, and medium pH. When the medium composition or temperature is changed, a change in maintenance metabolism may occur, and this causes a change in the maintenance coefficient. Thus, a quantitative study of the relationship between maintenance coefficients and other variables will be needed for a more accurate simulation of a genome-scale metabolic model.
[Figure 20.2 diagram labels: nutrient availability, energetic parameter, gas composition, gene deletion, gene amplification, in silico perturbation, comparison with experimental result, model validation, model refinement.]
Figure 20.2 Various genetic and environmental perturbations. Genetic perturbations are mainly gene deletion and gene amplification. Environmental perturbations include changing dilution rates in a chemostat culture, media composition, osmolarity, head space gas composition, temperature, and medium pH, to name a few. Genome-scale metabolic models can be used to study metabolic states under gene deletion, gene amplification, varying dilution rates, media compositions, and gas compositions.
[Figure 20.3 plots, panels A and B: glucose uptake rate versus dilution rate, showing chemostat data together with the effect of changing mATP (nongrowth-associated maintenance energy) and YxATP (growth-associated maintenance energy).]
Figure 20.3 Schematic procedure for the estimation of maintenance energy. The points shown in the graphs are experimental results obtained from chemostat cultures, each run done at a different dilution rate. The line represents in silico simulation of the model of the same system. (A) Without the values of the maintenance energy, the line will not fit the data from the fermentation. (B) By adjusting the parameters associated with maintenance energy, the line will adjust accordingly and will fit the experimental data when the values for the parameters are correctly determined.
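The fitting idea in Figure 20.3 can be sketched with a much simpler stand-in than a genome-scale simulation. The sketch assumes that, at steady state in a chemostat, the specific growth rate equals the dilution rate and that the glucose uptake rate is approximately linear in the dilution rate. Under those assumptions, the slope of the fitted line reflects the growth-associated energy requirement and the intercept reflects the nongrowth-associated maintenance requirement. The data points are invented for illustration, and fit_line is a hypothetical helper. A real workflow would instead adjust YxATP and mATP in the metabolic model until the simulated line matches the data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented chemostat data: dilution rate (1/h) and glucose uptake rate
# (mmol per g dry cell weight per h). These are not measured values.
dilution_rate = [0.1, 0.2, 0.3, 0.4]
glucose_uptake = [1.5, 2.5, 3.5, 4.5]

slope, intercept = fit_line(dilution_rate, glucose_uptake)

# Residuals show how well the fitted line reproduces each data point.
residuals = [q - (slope * d + intercept)
             for d, q in zip(dilution_rate, glucose_uptake)]

print(f"growth-associated term (slope): {slope:.3f}")
print(f"nongrowth-associated term (intercept): {intercept:.3f}")
print("largest residual:", max(abs(r) for r in residuals))
```

The intercept corresponds to the uptake that would remain at zero growth, which is the role played by mATP in panel B of the figure, while a change in YxATP would alter the slope.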
20.2.2.1.2 Medium Composition and Essential Metabolites
In order to validate cell growth in various media, metabolites such as amino acids, vitamins, and nucleotides are individually removed from the genome-scale metabolic model to determine whether they are required for the formation of biomass constituents and, consequently, for cell growth. This can be implemented in FBA by constraining the uptake flux of the corresponding metabolite to zero and optimizing for the biomass objective reaction. A metabolite whose removal results in no cellular growth can be defined as an essential compound. This procedure was successfully applied to H. pylori and identified the essential transport systems necessary for growth [33].
20.2.2.1.3 Gas Composition
The importance of the head space gas for cell growth is condition dependent. Typical conditions are aerobic and anaerobic, but sometimes other gases such as H2 or CO2 can be present. The effects of gas conditions on cellular metabolism can be significant. For example, when oxygen is limited, the carbon substrate is only partially oxidized, which leads to the secretion of by-products. The formation and secretion of metabolic by-products reflect the cell’s effort to generate energy while balancing the cellular redox potential. This was shown for E. coli growing under oxygen-limiting conditions [63]. Similar studies were carried out for Mannheimia succiniciproducens, which was isolated from the CO2-rich bovine rumen. Simulation of the genome-scale metabolic model of M. succiniciproducens under varying gas conditions showed that CO2 is important for cell growth as well as for the carboxylation of phosphoenolpyruvate to oxaloacetate, which is then converted to succinic acid by the reductive tricarboxylic acid cycle using fumarate as a major electron acceptor. Other head space gas conditions did not support cell growth and high succinic acid production. Thus, gas conditions can be of importance for the quantitative analysis of metabolism and for the validation of the genome-scale metabolic model.
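The link between oxygen availability and by-product secretion can be illustrated with a small FBA sketch in which the oxygen uptake bound is varied. The network is hypothetical and heavily simplified: the stoichiometric coefficients, the reaction names, and the use of an ATP drain as a proxy for growth are all assumptions made for this example and do not come from the E. coli or M. succiniciproducens models. SciPy's linprog is used as a generic linear programming solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network with made-up stoichiometry.
# Columns: glucose uptake, O2 uptake, glycolysis, respiration,
#          fermentation (by-product secretion), TCA cycle, ATP drain
S = np.array([
    # upt  o2  glyc  resp  ferm  tca  drain
    [1,   0,  -1,    0,    0,    0,   0],   # carbon substrate
    [0,   0,   1,    0,   -1,   -1,   0],   # pyruvate-like intermediate
    [0,   0,   1,   -1,   -1,    4,   0],   # reducing equivalents (NADH)
    [0,   1,   0,  -0.5,   0,    0,   0],   # oxygen
    [0,   0,   1,    2,    0,    1,  -1],   # ATP
], dtype=float)
FERMENTATION, ATP_DRAIN = 4, 6

def simulate(o2_max, substrate_max=10.0):
    """Maximize the ATP drain (growth proxy) for a given oxygen limit."""
    c = np.zeros(S.shape[1])
    c[ATP_DRAIN] = -1.0  # linprog minimizes, so negate to maximize
    bounds = [(0, substrate_max), (0, o2_max)] + [(0, 1000)] * 5
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[ATP_DRAIN], result.x[FERMENTATION]

# Anaerobic, oxygen-limited, and oxygen-excess conditions.
results = {o2: simulate(o2) for o2 in (0, 5, 100)}
for o2, (atp, byproduct) in results.items():
    print(f"O2 limit {o2}: ATP proxy {atp:.1f}, by-product {byproduct:.1f}")
```

In this toy model, the by-product flux falls from 10 under anaerobic conditions to 8 under oxygen limitation and to 0 when oxygen is in excess, because reducing equivalents that cannot be oxidized by respiration must be disposed of through fermentation.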
20.2.3 Validation by Genetic Perturbation
20.2.3.1 Gene Deletion
In the past, gene knock-out was one of the primary means of engineering new strains for fundamental studies and for the production of valuable bioproducts. However, thousands of mutant strains had to be screened before a promising candidate could be identified for further analysis. With the availability of a genome-scale metabolic model, target genes to be knocked out can be quickly identified through simulations of the model. However, the accuracy of the model first needs to be validated before this method can be used. One of the outcomes of knock-out validation is the determination of gene essentiality, that is, whether or not the gene is essential for the survival of the organism under the conditions it encounters. Each gene is knocked out and changes in the in silico growth behavior are observed. If a gene is knocked out and the organism fails to survive, the gene is considered essential; if the organism survives, the gene is considered nonessential. There are also several categories between the two extremes, such as partial essentiality or false positives/negatives [64,65], depending on the style and nature of the experiments. For example, in many microorganisms, the phosphotransferase system (PTS) is important for the uptake of the carbon source. In most microorganisms, should the gene or genes that encode this system be knocked out, the cells would fail to grow despite the presence of carbon sources, because they cannot transport them into the cell for utilization. In this case, the gene (or genes) is essential. However, other microorganisms may have an alternative means of carbon uptake, such as glucokinase. The PTS then becomes partially essential (or nonessential if the growth rate is unaffected, regardless of the method of carbon uptake). Additionally, the essentiality of genes can be tested under different environmental conditions, such as varying media compositions. For example, glucokinase would be essential if the system has only glucose. However, if an alternate carbon source such as organic acids is present, then it would be nonessential.
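The essentiality test described above can be sketched as a loop over single-gene deletions, each implemented by setting the flux bounds of the associated reaction to zero and re-running FBA. The network, the capacity limits, the gene-to-reaction mapping (including the placeholder gene bioX), and the classification thresholds are all hypothetical. SciPy's linprog is used here as a generic solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network with two parallel routes for glucose uptake.
# Columns: supply (-> Glc), pts (Glc -> G6P), glk (Glc -> G6P), growth (G6P ->)
reactions = ["supply", "pts", "glk", "growth"]
S = np.array([
    [1, -1, -1,  0],   # glucose pool available to the cell
    [0,  1,  1, -1],   # glucose 6-phosphate
], dtype=float)
base_bounds = {"supply": (0, 10), "pts": (0, 1000),
               "glk": (0, 5), "growth": (0, 1000)}
# Illustrative gene-to-reaction mapping; bioX is a made-up placeholder gene.
gene_to_reaction = {"ptsG": "pts", "glk": "glk", "bioX": "growth"}

def max_growth(bounds):
    """Maximal growth flux under the given bounds (0 if infeasible)."""
    c = np.zeros(len(reactions))
    c[reactions.index("growth")] = -1.0  # negate to maximize
    result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=[bounds[r] for r in reactions], method="highs")
    return result.x[reactions.index("growth")] if result.success else 0.0

def classify(gene, wild_type_growth, tol=1e-6):
    """Delete one gene in silico and classify it by the resulting growth."""
    bounds = dict(base_bounds)
    bounds[gene_to_reaction[gene]] = (0, 0)  # knock-out: no flux allowed
    growth = max_growth(bounds)
    if growth < tol:
        return "essential"
    if growth < wild_type_growth - tol:
        return "partially essential"
    return "nonessential"

wild_type_growth = max_growth(base_bounds)
essentiality = {gene: classify(gene, wild_type_growth)
                for gene in gene_to_reaction}
print(wild_type_growth, essentiality)
```

In this toy case the alternative uptake route has limited capacity, so deleting the PTS gene reduces growth without abolishing it, while deleting the alternative route has no effect on the predicted growth rate.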
Multiple in silico knock-out tests can also be performed. However, when the number of combinatorial knock-outs is large, the experiments needed to verify them can become the limiting factor. Despite the difficulty of performing multiple knock-outs, in silico double knock-outs and occasionally triple knock-outs have been successfully performed to provide additional information on cellular function and to suggest new targets for metabolic engineering. Multiple knock-out experiments often reveal interesting results that are not observed in single gene knock-out studies. For example, double deletion tests in the H. pylori model suggested a new lethal mutant when the genes for urease and the urea transporter were deleted [33].
20.2.3.2 Gene Amplification
In silico gene amplification has not been studied as extensively as gene knock-out because of the difficulty in formulating the process mathematically. The metabolic phenotype after gene deletion is much easier to predict in silico because one can set the corresponding flux to zero during the flux analysis. Gene amplification, however, does not necessarily increase the corresponding metabolic fluxes, owing to complex regulatory and kinetic mechanisms. With current knowledge, attempting to predict by how much fluxes will increase upon gene amplification would be too ambitious. Because of these hurdles, implementing gene amplification in flux analysis has been difficult. There are two types of gene amplification: amplification of homologous genes and amplification of heterologous genes. Homologous gene amplification provides additional copies of the genes of an existing pathway and is often attempted in order to increase flux. Heterologous gene amplification, on the other hand, is often performed to introduce a new pathway or reaction into the system and, as a result, changes the model. These additions can bring about many possible results, from a change in the efficiency of the system to no effect at all. For example, Burgard and Maranas showed that arginine and asparagine can be produced more efficiently, using fewer ATPs, by introducing 6-phosphofructokinase and ADP-forming aspartate-ammonia ligase, respectively [65]. This shows that although a pathway toward an end product is present, an alternate version can be designed to improve the efficiency of the system. Obviously, one would like to see tests succeed and verify the results of any predictions, but this is often not the case. Fortunately, negative results are also helpful in improving the metabolic model of the organism of interest. False predictions can help identify where missing links in the model are located or point out erroneous experimental data, and can consequently be used to improve the model. In silico reconstructions of genome-scale metabolic models have been completed for a number of organisms, and gene deletion studies have been conducted to compare the results of in silico simulations with data collected or generated from experiments. So far, these in silico simulations cannot be said to be 100% accurate because numerous factors are still unknown. As more information becomes available in the future, the accuracy of the models will increase.
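Comparing in silico deletion predictions with experimental observations can be summarized by counting agreements and disagreements, as in the sketch below. The gene names and outcomes are invented for illustration. The convention that growth counts as the positive outcome is an assumption of this example, and other studies may define false positives and false negatives differently.

```python
# Invented outcomes for five single-gene deletions.
# True means the deletion strain grows; False means it does not.
predicted = {"geneA": True, "geneB": True, "geneC": False,
             "geneD": False, "geneE": True}
observed = {"geneA": True, "geneB": False, "geneC": False,
            "geneD": True, "geneE": True}

def compare(predicted, observed):
    """Count agreements and disagreements between model and experiment."""
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    mismatches = []
    for gene in sorted(predicted):
        pred, obs = predicted[gene], observed[gene]
        if pred and obs:
            counts["true_positive"] += 1
        elif pred and not obs:
            counts["false_positive"] += 1   # model predicts growth, none seen
            mismatches.append(gene)
        elif not pred and obs:
            counts["false_negative"] += 1   # model predicts no growth, growth seen
            mismatches.append(gene)
        else:
            counts["true_negative"] += 1
    correct = counts["true_positive"] + counts["true_negative"]
    return counts, correct / len(predicted), mismatches

counts, accuracy, mismatches = compare(predicted, observed)
print(counts)
print(f"accuracy: {accuracy:.2f}; genes to re-examine: {mismatches}")
```

The mismatched genes are the useful output for model refinement. Under the convention used here, a false negative may point to a missing reaction or alternative route in the model, whereas a false positive may point to regulation or a constraint that the model does not capture.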
20.3 Conclusions and Future Prospects
One of the key challenges of metabolic engineering in the post-genome era is to rationally design cells with improved metabolic properties for industrial applications by using genome-scale biological data. Genome-scale metabolic models have made it possible to predict cellular behavior under different genetic and environmental conditions, and thus facilitate the design of an optimal metabolic system. However, the strength of a metabolic model depends on its validity, which fundamentally relies on the agreement between simulation results and experimental observations. Quantitative validation of a metabolic model is difficult, but it is critical for discovering how close the model is to the actual cellular behavior under different environmental conditions and genetic backgrounds. Experimental data sets are always relatively small, however, and thus genome-scale metabolic networks cannot be said to be complete. The limitation in the process of metabolic model validation is how to obtain detailed and reliable data that represent cellular metabolism under various conditions. It has also become important to develop methods for rapidly identifying dominant parameters, fitting selected model parameters, and analyzing statistical results to validate models. In conclusion, the main challenges in the field of genome-scale metabolic model validation are not only to increase the accuracy of the model but also to increase the range of measurable data and to develop new validation tools that efficiently determine the accuracy of the model. Nonetheless, the validation processes described above are a good starting strategy for fine-tuning genome-scale models and, consequently, for designing metabolic engineering strategies for the enhanced production of desired bioproducts.
Acknowledgments
This work was supported by the Korean Systems Biology Program from the Ministry of Education, Science and Technology through the Korea Science and Engineering Foundation (No. M10309020000-03B5002-00000). Further support from the LG Chem Chair Professorship and Microsoft is appreciated.
References
1. Gombert, A.K. and Nielsen, J. Mathematical modelling of metabolism. Curr. Opin. Biotechnol., 11, 180, 2000.
2. Varner, J. and Ramkrishna, D. Mathematical models of metabolic pathways. Curr. Opin. Biotechnol., 10, 146, 1999.
3. Lee, S.J., Song, H., and Lee, S.Y. Genome-based metabolic engineering of Mannheimia succiniciproducens for succinic acid production. Appl. Environ. Microbiol., 72, 1939, 2006.
4. Lee, S.Y., Lee, D.-Y., and Kim, T.Y. Systems biotechnology for strain improvement. Trends Biotechnol., 23, 349, 2005.
5. Aiba, S. and Matsuoka, M. Identification of metabolic model: Citrate production from glucose by Candida lipolytica. Biotechnol. Bioeng., 21, 1373, 1979.
6. Vallino, J.J. and Stephanopoulos, G. Flux determination in cellular bioreaction networks: Applications to lysine fermentations. In Frontiers in Bioprocessing. CRC Press, Boca Raton, FL, 1990, 205.
7. Reed, J.L. et al. An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR). Genome Biol., 4, R54, 2003.
8. Schilling, C.H. et al. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184, 4582, 2002.
9. Thiele, I. et al. An expanded metabolic reconstruction of Helicobacter pylori (iIT341 GSM/GPR): An in silico genome-scale characterization of single and double deletion mutants. J. Bacteriol., 187, 5818, 2005.
10. Edwards, J.S. and Palsson, B.Ø. The Escherichia coli MG1655 in silico metabolic genotype: Its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. USA, 97, 5528, 2000.
11. Patil, K.R., Akesson, M., and Nielsen, J. Use of genome-scale microbial models for metabolic engineering. Curr. Opin. Biotechnol., 15, 64, 2004.
12. Wiechert, W. Modeling and simulation: Tools for metabolic engineering. J. Biotechnol., 94, 37, 2002.
13. Koch, I., Junker, B.H., and Heiner, M. Application of Petri net theory for modeling and validation of the sucrose breakdown pathway in the potato tuber. Bioinformatics, 21, 1219, 2005.
14. Vanrolleghem, P.A. et al. Validation of a metabolic network for Saccharomyces cerevisiae using mixed substrate studies. Biotechnol. Prog., 12, 434, 1996.
15. Ge, H., Walhout, A.J., and Vidal, M. Integrating ‘omic’ information: A bridge between genomics and systems biology. Trends Genet., 19, 551, 2003.
16. Liolios, K. et al. The genomes on line database (GOLD) v.2: A monitor of genome projects worldwide. Nucleic Acids Res., 34, D332, 2006.
17. Nielsen, J. and Oliver, S. The next wave in metabolome analysis. Trends Biotechnol., 23, 544, 2005.
18. Patterson, S.D. and Aebersold, R.H. Proteomics: The first decade and beyond. Nature Genet., 33, 311, 2003.
19. Bailey, J.E. Mathematical modeling and analysis in biochemical engineering: Past accomplishments and future opportunities. Biotechnol. Prog., 14, 8, 1998.
20. Nielsen, J. and Jorgensen, H.S. A kinetic model for the penicillin biosynthetic pathway in Penicillium chrysogenum. Control. Eng. Practice, 4, 765, 1996.
21. van Riel, N.A.W. et al. A structured, minimal parameter model of the central nitrogen metabolism in Saccharomyces cerevisiae: The prediction of the behavior of mutants. J. Theor. Biol., 191, 397, 1998.
22. Vaseghi, S. et al. In vivo dynamics of the pentose phosphate pathway in Saccharomyces cerevisiae. Metab. Eng., 1, 128, 1999.
23. Lee, B. et al. Incorporating qualitative knowledge in enzyme kinetic models using fuzzy logic. Biotechnol. Bioeng., 62, 722, 1999.
24. Pissara, P.N., Nielsen, J., and Bazin, M.J. Pathway kinetics and metabolic control analysis of a high-yielding strain of Penicillium chrysogenum during fed batch cultivations. Biotechnol. Bioeng., 51, 168, 1996.
25. van Can, H.J.L. et al. An efficient model development strategy for bioprocesses based on neural networks in macroscopic balances. Biotechnol. Bioeng., 62, 666, 1999.
26. Laffend, L.A. and Shuler, M.L. Structured model of genetic control via the lac promoter in Escherichia coli. Biotechnol. Bioeng., 43, 399, 1994.
27. Lee, J. et al. Optimality of co-feeding of carbon sources for maximizing cellular and energetic yields using a constrained network analysis. Appl. Environ. Microbiol., 63, 710, 1997.
28. Price, N.D. et al. Genome-scale models of microbial cells: Evaluating the consequence of constraints. Nat. Rev. Microbiol., 2, 886, 2004.
29. Covert, M.W. et al. Integrating high-throughput and computational data elucidates bacterial networks. Nature, 429, 92, 2004.
30. Lee, S.Y. et al. Systems-level analysis of genome-scale microbial metabolisms under the integrated software environment. Biotechnol. Bioproc. Eng., 10, 425, 2005.
31. Edwards, J.S. and Palsson, B.Ø. Systems properties of the Haemophilus influenzae Rd metabolic genotype. J. Biol. Chem., 274, 17410, 1999.
32. Schilling, C.H. et al. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184, 4582, 2002.
33. Thiele, I. et al. An expanded metabolic reconstruction of Helicobacter pylori (iIT341 GSM/GPR): An in silico genome-scale characterization of single and double deletion mutants. J. Bacteriol., 187, 5818, 2005.
34. Teusink, B. et al. In silico reconstruction of the metabolic pathways of Lactobacillus plantarum: Comparing predictions of nutrient requirements with those from growth experiments. Appl. Environ. Microbiol., 71, 7253, 2005.
35. Oliveira, A.P., Nielsen, J., and Forster, J. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol., 5, 39, 2005.
36. Hong, S.H. et al. The genome sequence of the capnophilic rumen bacterium Mannheimia succiniciproducens. Nat. Biotechnol., 22, 1275, 2004.
37. Duarte, N.C., Herrgard, M.J., and Palsson, B.Ø. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res., 14, 1298, 2004.
38. Forster, J. et al. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13, 244, 2003.
39. Becker, S.A. and Palsson, B.Ø. Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: An initial draft to the two-dimensional annotation. BMC Microbiol., 5, 8, 2005.
40. Heinemann, M. et al. In silico genome-scale reconstruction and validation of the Staphylococcus aureus metabolic network. Biotechnol. Bioeng., 92, 850, 2005.
41. Borodina, I., Krabben, P., and Nielsen, J. Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res., 15, 820, 2005.
42. Feist, A.M. et al. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Mol. Syst. Biol., msb4100046, E1, 2006.
43. Lee, S.J. et al. Metabolic engineering of Escherichia coli for the enhanced production of succinic acid based on genome comparison and in silico gene knock-out simulation. Appl. Environ. Microbiol., 71, 7880, 2005.
44. Bro, C. et al. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metab. Eng., 8, 102, 2006.
45. Fleischmann, R.D. et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269, 496, 1995.
46. Page, E.H. and Canova, B.S. A case study of verification, validation and accreditation for advanced distributed simulation. Model Comp. Sim., 7, 393, 1997.
47. Seber, G.A.F. and Wild, C.J. Nonlinear Regression. Wiley, New York, NY, 768, 1989.
48. Edwards, J.S., Ibarra, R.U., and Palsson, B.Ø. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol., 19, 125, 2001.
49. Hong, S.H. et al. In silico prediction and validation of the importance of the Entner-Doudoroff pathway in poly(3-hydroxybutyrate) production by metabolically engineered Escherichia coli. Biotechnol. Bioeng., 83, 854, 2003.
50. Schuster, S. et al. Reaction routes in biochemical reaction systems: Algebraic properties, validated calculation procedure and example from nucleotide metabolism. J. Math. Biol., 45, 153, 2002.
51. Schilling, C.H. et al. Theory for the systemic definition of metabolic pathways and their use in interpreting metabolic function from a pathway-oriented perspective. J. Theor. Biol., 203, 229, 2000.
52. Lee, D.-Y. et al. Complementary identification of multiple flux distributions and multiple metabolic pathways. Metab. Eng., 7, 182, 2005.
53. Bhalla, U.S. The chemical organization of signalling interactions. Bioinformatics, 18, 855, 2002.
54. Kuroda, S., Schweighofer, N., and Kawato, M. Exploration of signal transduction pathways in cerebellar long-term depression by kinetic simulation. J. Neurosci., 21, 5693, 2001.
55. McAdams, H.H. and Shapiro, L. Circuit simulation of genetic networks. Science, 269, 650, 1995.
56. Barrett, C.L. et al. Systems biology as a foundation for genome-scale synthetic biology. Curr. Opin. Biotechnol., 17, 488, 2006.
57. Lee, D.-Y. et al. MetaFluxNet: The management of metabolic reaction information and quantitative metabolic flux analysis. Bioinformatics, 19, 2144, 2003.
58. Hucka, M. et al. The systems biology markup language (SBML): A medium for representation and exchange of biochemical network models. Bioinformatics, 19, 524, 2003.
59. Yun, H. et al. MFAML: A standard data structure for representing and exchanging metabolic flux models. Bioinformatics, 21, 3329, 2005.
60. Pfeiffer, T. et al. METATOOL: For studying metabolic networks. Bioinformatics, 15, 251, 1999.
61. Bell, S.L. and Palsson, B.Ø. Expa: A program for calculating extreme pathways in biochemical reaction networks. Bioinformatics, 21, 1739, 2005.
62. Melzoch, K. et al. Lactic acid production in a cell retention continuous culture using lignocellulosic hydrolysate as a substrate. J. Biotechnol., 56, 25, 1997.
63. Varma, A., Boesch, B.W., and Palsson, B.Ø. Stoichiometric interpretation of Escherichia coli glucose catabolism under various oxygenation rates. Appl. Environ. Microbiol., 59, 2465, 1993.
64. Forster, J. et al. Large scale evaluation of in silico gene knockouts in Saccharomyces cerevisiae. Omics, 7, 193, 2003.
65. Burgard, A.P. and Maranas, C.D. Probing the performance limits of the Escherichia coli metabolic network subject to gene additions or deletions. Biotechnol. Bioeng., 74, 364, 2001.
Developing Appropriate Hosts for Metabolic Engineering
V
Jens Nielsen Chalmers University of Technology
21 Escherichia coli as a Well-Developed Host for Metabolic Engineering
Eva Nordberg Karlsson, Louise Johansson, Olle Holst, and Gunnar Lidén
Why Escherichia coli? • Fundamentals of Metabolic Engineering of E. coli • Metabolic Engineering of a Specific Pathway: The Shikimate Pathway • Concluding Remarks
22 Metabolic Engineering in Yeast
Maurizio Bettiga, Marie F. Gorwa-Grauslund, and Bärbel Hahn-Hägerdal
Introduction • Extension of Substrate Range • Metabolites Yield and Productivity • Extended Product Range • Improved Cellular Properties • Yeast as Biocatalyst • Concluding Remarks
23 Metabolic Engineering of Bacillus subtilis
John Perkins, Markus Wyss, Hans-Peter Hohmann, and Uwe Sauer
Introduction • Genetic Engineering Methods Unique to B. subtilis • Metabolic Engineering of Products with Well-Known Biochemistry • Metabolic Engineering of Products with Unusual Biochemistry • Other Potentially Relevant Products • Current Challenges and New Possibilities
24 Metabolic Engineering of Streptomyces
Irina Borodina, Anna Eliasson, and Jens Nielsen
Streptomyces as Superhosts • The Streptomyces Genome and Its Modification • Analysis of Streptomyces Strains • Modeling and Design of Streptomyces Strains • Examples of Metabolic Engineering in Streptomyces • Perspectives
25 Metabolic Engineering of Filamentous Fungi
Mikael Rørdam Andersen, Kanchana Rucksomtawin, Gerald Hofmann, and Jens Nielsen
Introduction • System-Wide Approaches • Examples • Perspectives
26 Metabolic Engineering of Mammalian Cells
Lake-Ee Quek and Lars Keld Nielsen
Introduction • Use of Mammalian Cell Culture in Recombinant Protein Production • Conventional Metabolic Engineering • Future Directions: Systems Biology • Summary
Introduction
In the biotech industry there has been a constant drive to improve the efficiency of the cell factories used for the production of fuels and chemicals. This is well illustrated by the more than 10,000-fold improvement in the productivity of penicillin by the filamentous fungus Penicillium chrysogenum over the last 60 years. The introduction of genetic engineering by Cohen, Boyer, and coworkers in 1973 opened a new approach to the optimization of existing biotech processes and the development of completely new ones. Several successful applications of microorganisms for the production of human proteins followed shortly afterward, e.g., the production of growth hormone and human insulin by Escherichia coli and Saccharomyces cerevisiae, respectively. With the further development of genetic engineering techniques, the possibility of applying them to the optimization of classical fermentation processes soon became obvious. However, it was soon realized that engineering metabolic pathways was technologically difficult, and even though there are several examples of engineering microorganisms for the production of chemicals, e.g., production of indigo by E. coli, few of these examples developed into industrially viable processes. It is only in recent years that directed pathway engineering has taken off in industry, and today industrial biotechnology is a rapidly growing field. With the increasing shift toward a bio-based economy, there is a demand for the ability to quickly and reliably develop efficient cell factories that can produce desirable products. Metabolic engineering is the enabling science of industrial biotechnology, as it focuses on developing new cell factories or improving existing ones (Bailey, 1991).
There are several definitions of metabolic engineering, but most are consistent with the following: the use of genetic engineering to perform directed genetic modifications of cell factories with the objective of improving their properties for industrial application. In this definition the word improve is to be interpreted in its broadest sense, i.e., it also encompasses the insertion of completely new pathways with the objective of producing a heterologous product in a given host cell factory. Metabolic engineering distinguishes itself from applied genetic engineering by the use of advanced analytical tools for identification of appropriate targets for genetic modification, and mathematical models are often used to perform in silico design of optimized cell factories. In fact, the relatively slow migration of genetic engineering into industrial biotechnology is primarily due to the requirement for advanced analytical techniques that allow mapping of activities in different parts of the metabolism and detailed phenotypic characterization. With the developments in genomics, primarily driven by the large investments in the medical sciences, several advanced new techniques have been developed for phenotypic analysis, and with these techniques it has become possible to better guide the introduction of directed genetic modifications. Metabolic engineering is therefore often seen as a cyclic process (Nielsen, 2001), in which the cell factory is analyzed and, based on this analysis, an appropriate target is identified (the design phase). This target is then experimentally implemented and the resulting strain is analyzed again.
Tools of Metabolic Engineering
Many different tools are applied in the field of metabolic engineering. As mentioned above, metabolic engineering often involves the heterologous expression of complete biosynthetic pathways leading toward interesting and valuable products. By introducing entire pathways it is possible either to produce known compounds more efficiently or, through combinatorial biosynthesis, to produce completely new chemical entities that may serve as new food ingredients, nutraceuticals, or pharmaceuticals. The insertion of heterologous pathways for production of valuable products does not, in general, by itself result in high-level production of the desired product. To improve the yield or productivity, it is therefore normally necessary to improve the supply of the precursor metabolites and the cofactors required for biosynthesis of the product. Here it is worth noting that metabolism has a bow-tie structure in which nutrients are first converted into 12 precursor metabolites, which then form the basis for the synthesis of all macromolecules and smaller metabolites in nature. Besides these 12 precursor metabolites, the biosynthesis of metabolites and proteins requires cofactors such as NADPH, NADH, and ATP. The 12 precursor metabolites and the different cofactors participate in a large number of cellular reactions, and this knits different parts of the metabolism together into a tightly connected metabolic network (Patil and Nielsen, 2005). Tight coupling of many different biochemical pathways imposes a major constraint when the objective is to increase the flux toward a specific precursor metabolite. As a result, redirection of fluxes requires a fundamental understanding of the operation of the complete network, and not only of how the fluxes distribute over a few branch points. For this purpose, methods for flux quantification play a central role in metabolic engineering.
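Flux balance analysis is one widely used way to estimate fluxes from network stoichiometry. The following is a minimal sketch, not taken from the chapter: it assumes a hypothetical four-reaction toy network (the reaction names, bounds, and objective are illustrative assumptions) and uses SciPy's general-purpose `linprog` solver to maximize product secretion subject to the steady-state mass balance S·v = 0.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative only, not from the text):
#   v_uptake : (medium) -> A
#   v_convert: A -> B
#   v_byprod : A -> (secreted byproduct)
#   v_product: B -> (secreted product)
REACTIONS = ["v_uptake", "v_convert", "v_byprod", "v_product"]

# Stoichiometric matrix S: rows are internal metabolites, columns are reactions.
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0, -1.0],   # metabolite B
])

# Assumed flux bounds: uptake fixed at 10, conversion step capped at 6.
BOUNDS = [(10.0, 10.0), (0.0, 6.0), (0.0, None), (0.0, None)]

# Objective: maximize the product secretion flux.
OBJECTIVE = np.array([0.0, 0.0, 0.0, 1.0])


def solve_fba(S, bounds, objective):
    """Maximize objective @ v subject to S @ v = 0 and the flux bounds."""
    result = linprog(
        c=-objective,                # linprog minimizes, so negate the objective
        A_eq=S,
        b_eq=np.zeros(S.shape[0]),   # steady state: no net accumulation
        bounds=bounds,
    )
    if not result.success:
        raise ValueError(f"LP did not converge: {result.message}")
    return dict(zip(REACTIONS, result.x))


if __name__ == "__main__":
    for name, flux in solve_fba(S, BOUNDS, OBJECTIVE).items():
        print(f"{name}: {flux:.2f}")
```

Because uptake is fixed at 10 and the conversion step is capped at 6 in this toy case, the optimum routes 6 units to product and the remaining 4 to byproduct. Genome-scale models apply the same principle to far larger stoichiometric matrices.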
The activity of different branches of the metabolic network, often quantified in the form of metabolic fluxes, can be estimated either through flux balance analysis (Price et al., 2003) or through feeding of 13C-labeled substrate followed by analysis of the labelling patterns in intracellular metabolites (Sauer, 2006). 13C-based methods have conventionally relied on proteinogenic amino acids, which are abundant and stable, to detect labelling patterns, but recently methods for direct analysis of the free pool of metabolites have been developed (van Winden et al., 2005; Wiechert and Noh, 2005; Noh and Wiechert, 2006). Metabolic flux analysis does not by itself provide much information, but by analyzing how the metabolic network operates under different growth conditions and on different media, and how specific mutations affect the operation of the network, it is possible to gain insight into how the fluxes are regulated, and this information can be used to design new metabolic engineering strategies. An example of the application of metabolic flux analysis for gaining insight into the regulation of metabolic networks is a study of a large collection of Bacillus subtilis mutants, in which it proved possible to identify regulators that are involved in inhibition of growth (Fischer and Sauer, 2005). Thus, flux analysis today represents a standard technique for rapid phenotypic characterization of metabolically engineered strains, and this tool is likely to gain even wider use in the future.
Another important tool in metabolic engineering is the analysis of intracellular metabolites, or metabolome analysis if a large number of metabolites are measured (Jewett et al., 2006). It is inherently difficult to measure intracellular metabolites quantitatively, as the very short time constants for turnover of these metabolites require very rapid quenching (Villas-Boas et al., 2005b).
A requirement for the quenching process is that the metabolites do not leak out of the cells, and it is difficult to find a method that can be generally applied to measure different types of metabolites. However, in recent years several robust methods have been developed for analysis of specific groups of metabolites, e.g., sugar phosphates (Smits et al., 1998; Mashego et al., 2006) and amino and organic acids (Villas-Boas et al., 2005a). Metabolome analysis can also be used for high-throughput characterization of different mutants (Raamsdonk et al., 2001) and it has even been shown possible to rely on measurements of extracellular metabolites (so-called foot-printing) for rapid classification of yeast mutants (Allen et al., 2003). A major limitation of metabolome analysis is, however, that it is difficult to obtain
truly quantitative data for a large number of metabolites, and there is therefore clearly a need for the development of better analytical methods (Nielsen and Oliver, 2005). With the high connectivity of the different reactions within the metabolic network, it is difficult to dissect how specific mutations influence the individual pathways in the cell. However, by exploiting tools from functional genomics for mapping of global regulatory structures, or by using the high-throughput experimental techniques provided by the various omics, it may be possible to dissect how the fluxes through the different branches of the metabolic network are controlled. This can only be done by combining experimental data with mathematical models of one kind or another, and for this purpose the concept of metabolic control analysis can be extended to quantify the distribution of flux control between the hierarchical level and the metabolic level (ter Kuile and Westerhoff, 2001; Rossell et al., 2006). Flux control at the hierarchical level means that the flux through a given reaction is controlled by transcription, translation, or post-translational modification, i.e., by changes in the active enzyme concentration, whereas flux control at the metabolic level means that the flux is controlled through interaction between the enzyme and the metabolites. Another approach to using omics data is to take the network structure provided by a genome-scale metabolic network as a scaffold for the analysis of transcriptome data, which allows identification of coregulated modules within the metabolic network along with so-called reporter metabolites (Patil and Nielsen, 2005). Here reporter metabolites represent hot-spots in the metabolic network around which the most statistically significant transcriptional changes between conditions or strains occur.
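The scoring idea behind reporter metabolites can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published implementation: gene p-values are converted to Z-scores with the inverse normal distribution, the Z-scores of the genes encoding enzymes around a metabolite are aggregated, and the aggregate is corrected against randomly drawn gene sets of the same size. The gene names, p-values, and metabolite-to-gene map below are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def gene_z_scores(p_values):
    """Map each gene's p-value to a Z-score via the inverse normal CDF."""
    normal = NormalDist()
    return {gene: normal.inv_cdf(1.0 - p) for gene, p in p_values.items()}


def aggregate(z_values):
    """Combine k Z-scores into one score: sum(Z) / sqrt(k)."""
    return sum(z_values) / math.sqrt(len(z_values))


def reporter_scores(metabolite_genes, p_values, n_random=2000, seed=0):
    """Score each metabolite by the aggregated Z-score of its neighbor genes,
    corrected by the mean and standard deviation of random same-size gene sets."""
    z = gene_z_scores(p_values)
    all_z = list(z.values())
    rng = random.Random(seed)
    scores = {}
    for metabolite, genes in metabolite_genes.items():
        k = len(genes)
        raw = aggregate([z[g] for g in genes])
        background = [aggregate(rng.sample(all_z, k)) for _ in range(n_random)]
        scores[metabolite] = (raw - mean(background)) / stdev(background)
    return scores


# Hypothetical differential-expression p-values for eight genes.
P_VALUES = {
    "g1": 0.001, "g2": 0.01, "g3": 0.02, "g4": 0.3,
    "g5": 0.5, "g6": 0.6, "g7": 0.8, "g8": 0.9,
}

# Hypothetical map from metabolite to the genes of its neighboring enzymes.
METABOLITE_GENES = {
    "M1": ["g1", "g2", "g3"],   # surrounded by strongly changed genes
    "M2": ["g6", "g7", "g8"],   # surrounded by unchanged genes
}

if __name__ == "__main__":
    for metabolite, score in reporter_scores(METABOLITE_GENES, P_VALUES).items():
        print(metabolite, round(score, 2))
```

In this toy data set M1 scores well above the background and M2 below it, so M1 would be flagged as the candidate reporter metabolite. A real analysis would draw the metabolite-to-gene map from a genome-scale metabolic model.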
The concept of reporter metabolites has been extended to use metabolome data for identification of reporter reactions, and by identifying both reporter reactions and reporter metabolites it was possible to classify reactions as metabolically or hierarchically regulated (Cakir et al., 2006). Another type of multilevel analysis for capturing how information stored at the genetic level is translated into phenotypic landscapes is the so-called vertical genomics strategy, in which molecular measurements from multiple layers of the cellular hierarchy (e.g., mRNA, proteins, and metabolites) are combined for a particular functional pathway. In this way the sequential response of glycolytic reactions in S. cerevisiae to a sudden relief of glucose limitation has been studied (Kresnowati et al., 2006), and proteomics-based analysis has provided insight into transcriptional control (Kolkman et al., 2006). With transcriptome analysis being the most mature and well-implemented omics technique, there has been much focus on whether it can provide information on how the metabolic network is operating. An example of a successful application of transcriptome analysis for identification of metabolic engineering targets is the improvement of galactose uptake in S. cerevisiae. Through genome-wide transcription analysis of several different mutants with improved galactose uptake, Bro et al. (2005) found that PGM2, encoding phosphoglucomutase, was up-regulated, and by overexpressing the PGM2 gene the galactose uptake could be increased by 80%. A limitation of this strategy, however, is that because of regulation at the level of translation and at the metabolic level, there is generally a poor correlation between transcript levels and metabolic fluxes.
However, using the concept of control effective fluxes (Stelling et al., 2002), which are functions of the different elementary flux modes (Schuster et al., 2000) in the metabolic network, it is possible to obtain a good correlation between mRNA levels and metabolic fluxes, as shown in a study of shifts between growth on different carbon sources for S. cerevisiae (Cakir et al., 2004). To look further into the possible correlation between metabolic fluxes and transcript levels, Regenberg et al. (2006) performed transcriptome analysis at different specific growth rates in glucose-limited chemostat cultures. In this way they identified which genes decrease and which increase in expression with increasing specific growth rate. Besides mapping all genes related to the Crabtree effect, i.e., the onset of fermentative metabolism under aerobic growth conditions, they found that genes responsible for catabolism of C2 carbon sources, e.g., ethanol, are transcribed at low specific growth rates. This was further confirmed in a study by Vemuri et al. (2007), who through heterologous expression of oxidases in both the cytosol and the mitochondria showed that there is indeed excess capacity of the TCA cycle, but that the onset of the Crabtree effect is caused by a lack of capacity for oxidation of NADH in the mitochondria.
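The split between hierarchical and metabolic flux control described earlier in this section can be written compactly. The derivation below is a sketch of the regulation-analysis idea under one assumption: the rate of an enzyme-catalyzed step factors into a term depending only on enzyme concentration and a term depending only on metabolite concentrations. The symbols f, g, e, X, K, J, ρ_h, and ρ_m are introduced here for illustration and do not appear in the text above.

```latex
\begin{align}
  v &= f(e)\, g(X, K)
    && \text{rate as enzyme term times metabolite term} \\
  \Delta \ln J &= \Delta \ln f(e) + \Delta \ln g(X, K)
    && \text{at steady state } J = v \text{, comparing two conditions} \\
  1 &= \underbrace{\frac{\Delta \ln f(e)}{\Delta \ln J}}_{\rho_h}
     + \underbrace{\frac{\Delta \ln g(X, K)}{\Delta \ln J}}_{\rho_m}
    && \text{dividing by } \Delta \ln J
\end{align}
```

Under this assumption, a hierarchical coefficient ρ_h close to 1 indicates that a flux change is explained by a change in enzyme capacity, while ρ_h close to 0 (so that ρ_m is close to 1) indicates regulation through interaction of the enzyme with metabolites.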
Examples of Metabolic Engineering
There are many examples of metabolic engineering. In an early review on metabolic engineering, Bailey (1991) divided the examples into two groups: (1) recruiting heterologous activities for strain improvement and (2) redirecting metabolite flow. In a later review, Cameron and Tong (1993) classified the examples into five groups: (1) improved production of chemicals already produced by the host organism, (2) extended substrate range for growth and product formation, (3) addition of new catabolic activities for degradation of toxic materials, (4) production of chemicals new to the host organism, and (5) modification of cell properties. More recently, Nielsen (2001) grouped examples of metabolic engineering into the following categories:
• Heterologous protein production. Examples are in the production of pharmaceutical proteins (hormones, antibodies, vaccines, etc.) and in the production of enzymes. Initially the heterologous gene needs to be inserted in the production host. Subsequently it is often necessary to engineer the protein synthesis pathway, e.g., to obtain efficient glycosylation or secretion of the protein from the cell. In some cases it may also be necessary to engineer the strain to obtain an improved productivity.
• Extension of substrate range. In many industrial processes it is of interest to extend the substrate range of the applied microorganism in order to utilize the raw material more efficiently. Initially, it is necessary to insert the pathway (or enzyme) required for utilization of the substrate of interest. Subsequently, it is important to ensure that the substrate is metabolized at a reasonable rate, and that the metabolism of the new substrate does not result in the formation of undesirable byproducts. This may in some cases involve extensive pathway engineering.
• Pathways leading to new products.
It is often of interest to use a certain host for production of several different products, especially if a good host system is available. This is exemplified by the production of adipoyl-7-ADCA by the penicillin-producing fungus Penicillium chrysogenum. This can be achieved by extending existing pathways by recruiting heterologous enzymes. Another approach is to generate completely new pathways through gene shuffling or other methods of directed evolution. In both cases it is often necessary to further engineer the organism to improve the rate of production and eliminate byproduct formation.
• Degradation of xenobiotics. Many organisms naturally degrade xenobiotics, but there are few organisms that degrade several different xenobiotics. In bioremediation it is attractive to have a few organisms that can degrade several different compounds. This may be achieved either by inserting pathways from other organisms or through engineering of the existing pathways.
• Engineering of cellular physiology. In the industrial exploitation of microorganisms or higher cells it may be of significant interest to engineer the cellular physiology for process improvement, e.g., to make the cells tolerant to low oxygen concentration, make them less sensitive to high glucose concentrations, improve their morphology, or increase flocculation. In cases where the underlying mechanisms are known this can be achieved by metabolic engineering. This may involve expression of heterologous genes, disruption of genes, or over-expression of homologous genes.
• Elimination or reduction of byproduct formation. In many industrial processes byproducts are formed, which may be a problem due to the loss of carbon in these byproducts, due to toxicity of these compounds, or due to interference of the byproducts with the purification of the product.
In some cases the byproducts can be eliminated through simple gene disruption, but in other cases the formation of the byproduct is essential for the overall cellular function and disruption of the pathway leading to the byproduct may be lethal. In the latter case it is necessary to analyze the complete metabolic network and, based on this analysis, design a strategy for reduction of the byproduct formation.
• Improvement of yield and productivity. In many industrial processes, especially in the production of low-value-added products, it is important to continuously improve the yield and/or
productivity. In some cases this can be achieved simply by increasing the activity of the biosynthetic pathway, e.g., by inserting additional gene copies. In other cases the pathway leading to the product of interest involves many steps, and it is therefore not possible to increase the activity of all the enzymes. Analysis of the flux regulation in the pathway is therefore required. Finally, in some cases the limitation is not in the actual pathway, and it may therefore be necessary to engineer the central carbon metabolism, which is generally very difficult due to the tight interaction of the different pathways, as mentioned above.
This section gives many examples of metabolic engineering using different cell factories, ranging from bacteria, yeast, and fungi to cell cultures. The choice of cell factory generally depends on the type of product, but there is a trend in industry toward application of a few well-characterized cell factories, as this allows faster development and approval of new processes.
Impact of Systems Biology on Metabolic Engineering
As mentioned above, there is much interest in exploiting tools from functional genomics and systems biology in the field of metabolic engineering. Systems biology is a basic science that aims at obtaining a quantitative description of the biological system under study, and this quantitative description may be in the form of a mathematical model. In some cases, the model may be the final result of the study, i.e., the model captures key features of the biological system and can hence be used to predict the behavior of the system under conditions different from those used to derive the model. In other cases, mathematical modelling serves rather as a tool to extract information about the biological system, i.e., to enrich the information content in the data. There is not necessarily a conflict between the two, and generally mathematical modelling goes hand in hand with experimental work. So far there are only a few examples of how systems biology has impacted metabolic engineering and industrial biotechnology. However, the introduction of high-throughput experimental techniques has clearly enabled much faster progress in the phenotypic characterization of different mutants. In the future, when more advanced mathematical models and bioinformatics algorithms specifically suited for metabolic engineering have been developed, the value of using high-throughput experimental techniques for mapping detailed phenotypes will clearly increase. Methods that allow specific mapping of omics data onto metabolic networks are of particular interest (Patil and Nielsen, 2005), as they allow rapid identification of hot-spots in the metabolism based on transcriptome data.
It is expected that mathematical models will be used more extensively in the design of metabolic engineering strategies, particularly as recent results have shown that the predictive power of metabolic models is sufficiently good to allow for identification of metabolic engineering targets (Bro et al., 2006). To further capitalize on systems biology in the field of industrial biotechnology, it is, however, important that metabolic models are extended to include regulation, as it is often possible to de-regulate fluxes through engineering of regulatory structures (Ostergaard et al., 2000). For this purpose it may be necessary to develop more detailed mathematical models that include details on the kinetics of at least a few key processes in the cell, and we are likely to see many examples of this in the future.
References
Allen, J., Davey, H.M., Broadhurst, D., Heald, J.K., Rowland, J.J., Oliver, S.G., and Kell, D.B. 2003. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nat. Biotechnol., 21, 692–696. Bailey, J.E. 1991. Toward a science of metabolic engineering. Science, 252, 1668–1675. Bro, C., Knudsen, S., Regenberg, B., Olsson, L., and Nielsen, J. 2005. Improvement of galactose uptake in Saccharomyces cerevisiae through overexpression of phosphoglucomutase: example of transcript analysis as a tool in inverse metabolic engineering. Appl. Environ. Microbiol., 71, 6465–6472.
Cakir, T., Kirdar, B., and Ulgen, K.O. 2004. Metabolic pathway analysis of yeast strengthens the bridge between transcriptomics and metabolic networks. Biotechnol. Bioeng., 86, 251–260. Cakir, T., Patil, K.R., Onsan, Z., Ulgen, K.O., Kirdar, B., and Nielsen, J. 2006. Integration of metabolome data with metabolic networks reveals reporter reactions. Mol. Syst. Biol., 2, 50. Cameron, D.C. and Tong, I.-T. 1993. Cellular and metabolic engineering. Appl. Biochem. Biotechnol., 38, 105–140. Fischer, E. and Sauer, U. 2005. Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat. Genet., 37, 636–640. Jewett, M.C., Hofmann, G., and Nielsen, J. 2006. Fungal metabolite analysis in genomics and phenomics. Curr. Opin. Biotechnol., 17, 191–197. Kolkman, A., Daran-Lapujade, P., Fullaondo, A., Olsthoorn, M.M., Pronk, J.T., Slijper, M., and Heck, A.J. 2006. Proteome analysis of yeast response to various nutrient limitations. Mol. Syst. Biol., 2, 2006.0026. Kresnowati, M.T., van Winden, W.A., Almering, M.J., ten Pierick, A., Ras, C., Knijnenburg, T.A., Daran-Lapujade, P., Pronk, J.T., Heijnen, J.J., and Daran, J.M. 2006. When transcriptome meets metabolome: fast cellular responses of yeast to sudden relief of glucose limitation. Mol. Syst. Biol., 2, 49. Mashego, M.R., van Gulik, W.M., Vinke, J.L., Visser, D., and Heijnen, J.J. 2006. In vivo kinetics with rapid perturbation experiments in Saccharomyces cerevisiae using a second-generation BioScope. Metab. Eng., 8, 370–383. Nielsen, J. 2001. Metabolic engineering. Appl. Microbiol. Biotechnol., 55, 263–283. Nielsen, J. and Oliver, S. 2005. The next wave in metabolome analysis. Trends Biotechnol., 23, 544–546. Noh, K. and Wiechert, W. 2006. Experimental design principles for isotopically instationary 13C labeling experiments. Biotechnol. Bioeng., 94, 234–251. Ostergaard, S., Olsson, L., Johnston, M., and Nielsen, J. 2000.
Increasing galactose consumption by Saccharomyces cerevisiae through metabolic engineering of the GAL gene regulatory network. Nat. Biotechnol., 18, 1283–1286. Patil, K.R. and Nielsen, J. 2005. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl. Acad. Sci. USA, 102, 2685–2689. Price, N.D., Papin, J.A., Schilling, C.H., and Palsson, B.O. 2003. Genome-scale microbial in silico models: the constraints-based approach. Trends Biotechnol., 21, 162–169. Raamsdonk, L.M., Teusink, B., Broadhurst, D., Zhang, N.S., Hayes, A., Walsh, M.C., Berden, J.A., Brindle, K.M., Kell, D.B., Rowland, J.J., Westerhoff, H.V., van Dam, K., and Oliver, S.G. 2001. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat. Biotechnol., 19, 45–50. Regenberg, B., Grotkjaer, T., Winther, O., Fausboll, A., Akesson, M., Bro, C., Hansen, L.K., Brunak, S., and Nielsen, J. 2006. Growth-rate regulated genes have profound impact on interpretation of transcriptome profiling in Saccharomyces cerevisiae. Genome Biol., 7, R107. Rossell, S., van der Weijden, C.C., Lindenbergh, A., van Tuijl, A., Francke, C., Bakker, B.M., and Westerhoff, H.V. 2006. Unraveling the complexity of flux regulation: a new method demonstrated for nutrient starvation in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA, 103, 2166–2171. Sauer, U. 2006. Metabolic networks in motion: 13C-based flux analysis. Mol. Syst. Biol., 2, 62. Schuster, S., Fell, D.A., and Dandekar, T. 2000. A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks. Nat. Biotechnol., 18, 326–332. Smits, H.P., Cohen, A., Buttler, T., Nielsen, J., and Olsson, L. 1998. Cleanup and analysis of sugar phosphates in biological extracts by using solid-phase extraction and anion-exchange chromatography with pulsed amperometric detection. Anal. Biochem., 261, 36–42. 
Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E.D. 2002. Metabolic network structure determines key aspects of functionality and regulation. Nature, 420, 190–193. ter Kuile, B.H. and Westerhoff, H.V. 2001. Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS Lett., 500, 169–171.
van Winden, W.A., van Dam, J.C., Ras, C., Kleijn, R.J., Vinke, J.L., van Gulik, W.M., and Heijnen, J.J. 2005. Metabolic-flux analysis of Saccharomyces cerevisiae CEN.PK113-7D based on mass isotopomer measurements of (13)C-labeled primary metabolites. FEMS Yeast Res., 5, 559–568. Vemuri, G.N., Eiteman, M.A., McEwen, J.E., Olsson, L., and Nielsen, J. 2007. Increasing NADH oxidation reduces overflow metabolism in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA, 104, 2402–2407. Villas-Boas, S.G., Moxley, J.F., Akesson, M., Stephanopoulos, G., and Nielsen, J. 2005a. High-throughput metabolic state analysis: the missing link in integrated functional genomics of yeasts. Biochem. J., 388, 669–677. Villas-Boas, S.G., Mas, S., Akesson, M., Smedsgaard, J., and Nielsen, J. 2005b. Mass spectrometry in metabolome analysis. Mass Spectrom. Rev., 24, 613–646. Wiechert, W. and Noh, K. 2005. From stationary to instationary metabolic flux analysis. Adv. Biochem. Eng. Biotechnol., 92, 145–172.
21 Escherichia coli as a Well-Developed Host for Metabolic Engineering
Eva Nordberg Karlsson, Louise Johansson, Olle Holst, and Gunnar Lidén, Lund University
21.1 Why Escherichia coli?
21.2 Fundamentals of Metabolic Engineering of E. coli
Metabolism and Products • Genetic Elements and Tools • Commonly Used Strains and Strain Improvements • Cultivation Technology
21.3 Metabolic Engineering of a Specific Pathway: The Shikimate Pathway
Increase of the Flux into the Pathway • Supply of Precursors • Minimization of By-Products • Comparison with Metabolic Engineering of an Artificial Pathway: 1,3-Propanediol Production
21.4 Concluding Remarks
References
21.1 Why Escherichia coli?
In any encyclopedia concerning industrial applications of microorganisms, Escherichia coli will be included. Although few people would disagree on the importance of the organism today, it is a highly legitimate question to ask why a bacterium originally isolated from human intestines has become an important production host for products as diverse as precursors for plastics and therapeutic proteins. It has been stated [1] that if you “mention E. coli to the man in the street, he’s most likely to make some references to a dodgy burger and the resulting diarrhea.” However, even if there are pathogenic strains, E. coli is actually one of the dominating species in the bowel of healthy individuals. Furthermore, most applied scientists regard E. coli as a well-known and useful microbial model system. An important reason is that E. coli early became known for its ease of growth on synthetic media and its fast doubling times [2], making it an attractive system to handle. The development leading to the current importance of E. coli is the result of many factors, some of which are summarized in Figure 21.1. Starting from the beginning, E. coli was first cultured from faeces of healthy individuals in 1885 by the German pediatrician Theodor Escherich. At that time he named the newly isolated organism “Bacterium coli” [3], to reflect the fact that it was a bacterium present in the colon (hence the name “coli”), but it was later renamed Escherichia coli [2,4]. The ease of cultivation was certainly one of the factors that promoted its distribution between scientists, who exchanged available strains for the elucidation of microbial and biochemical phenomena and pathways. The 1940s and 1950s saw major developments in bacterial genetics, biochemistry, and physiology. During this period, Jacques Monod and coworkers at the Pasteur Institute performed their classic studies on the lac-operon in E. coli, which resulted in the explanation of “diauxie” (or diauxic
Developing Appropriate Hosts for Metabolic Engineering
1885: Discovery of “Bacterium coli” (T. Escherich)
1892: Indicator of sanitary conditions in food and water (“coliform” test) to alert to fecal contamination/possible pathogen presence
1919: Renamed Escherichia coli
Early 1940s: “The phage group” established by Delbrueck and Luria to work on genes in a less complex system than Drosophila
1946: Bacterial conjugation (Lederberg)
1940s–50s: Coordinated gene regulation (diauxic growth, lac-operon, enzyme induction) by Monod and Jacob
1952: Lederberg suggests the term “plasmid” for extrachromosomal hereditary elements
1968: Purification of plasmids. Bacterial antibiotic resistance genes shown to be carried on plasmids (Cohen)
1970: Discovery of restriction enzymes and ligases; EcoRI isolated by Boyer
1972: First recombinant DNA with bacterial genes (Berg)
1973: First cloning of a DNA fragment in a plasmid (pSC101) (Boyer and Cohen)
1978: Humulin, Genentech's human insulin, licensed
1987: Metabolic engineering for phenylalanine production (Miller et al.)
1997: Complete sequence of E. coli K12 genome (Blattner et al.)
2003: Production of 1,3-propanediol in engineered E. coli (Nakamura and Whited)
Figure 21.1 A brief overview of historical developments leading to the position of Escherichia coli as a model organism for many scientists, and a “workhorse” in molecular biology.
growth), i.e., the shift of growth from one substrate (glucose) to another (lactose) upon depletion of the former, where growth on the respective substrates was separated by an adaptation or “lag” phase [5]. Based on the E. coli model system it was also proposed, for the first time, that the ability to utilize lactose was coupled to the formation of an enzyme, β-galactosidase (called lactase at that time). Fundamental questions on the relation between gene, enzyme, and inductive substrate were raised, leading to an understanding of
Escherichia coli as a Well-Developed Host for Metabolic Engineering
Table 21.1 Pros and Cons of E. coli as a Production Organism
Advantages
  Rapid growth on a multitude of sugars
  Possible to grow to high cell densities
  Well-established molecular tools (fully sequenced organism, many plasmids available)
  No need for complex medium (i.e., vitamin addition)
  Well-developed cultivation strategies
  Industrial acceptance due to large experience
Disadvantages
  Overflow metabolism giving acetate production at high glucose fluxes
  Limited capacity for synthesis of large proteins
  No protein glycosylation
  A relatively high maintenance requirement
  Due to its facultative metabolism, the product formation may change in large scale operation in zones of oxygen depletion
  PTS sugar transport system
  Endotoxin production
the phenomenon of enzyme induction [6]. The parallel discovery of bacterial conjugation in E. coli in 1946 by Lederberg [7,8] allowed an early genetic analysis that led to identification of the genes involved in lactose metabolism. By the end of the 1960s the main questions surrounding gene expression and regulation of the genes in the E. coli lac-operon had been solved, and this work can be said to have laid a foundation for the field of molecular biology [9]. During the years to come, the increased understanding of genetic elements and DNA-modifying enzymes led to the development of molecular biology techniques. E. coli was again chosen as the model system in the pioneering work by Stanley Cohen and Herbert Boyer, who with the help of restriction endonucleases and purified plasmids managed to perform the first successful cloning of a recombinant plasmid, which was transformed for expression in an E. coli host strain [10]. The new techniques constituted the basis for a symbiotic relationship between science and technology [11] that quickly resulted in commercial products in the pharmaceutical sector. Not surprisingly, the first products were recombinant proteins. The possibility of producing a target protein by microbial cultivation opened up entirely new possibilities in terms of supplying human proteins/peptides in amounts allowing therapeutic applications. In 1978 the first “molecular biology based” biotechnology company, Genentech, licensed human insulin from a gene cloned in E. coli to the company Eli Lilly, one of the major producers of insulin. Soon afterwards (the same year), Genentech received the commission to clone a gene encoding human growth hormone, hGH, Genotropin (originally named Somatonorm), from Kabi-Gen, a Swedish pharmaceutical company (today part of Pfizer). This product finally reached the market in 1985 [12]. The speed of molecular biology development has since then accelerated and recombinant proteins may now be produced in a range of host organisms. E.
coli maintains a strong position as a host for protein production, despite limitations in terms of, e.g., protein secretion, glycosylation, and folding. The disadvantages are balanced by advantages such as the ease of cultivation, the possibility to achieve very high intracellular titers of recombinant protein (up to 30% of the dry weight) in E. coli and, not least, the substantial amount of genetic tools and practical experience of handling E. coli accumulated to date in academia and industry (Table 21.1). The interest in E. coli as a host for metabolite production began with the isolation of high-yielding threonine-producing E. coli strains, and the use of E. coli for amino acid production carried into the metabolic engineering era with the development of phenylalanine and tryptophan production [13,14]. One of the most spectacular recent achievements in terms of metabolic engineering of any microorganism is the engineering of E. coli for direct production of 1,3-propanediol from glucose, carried out by Genencor in collaboration with Du Pont [15]. The establishment of large-scale commodity chemical production with metabolically engineered E. coli is certainly a milestone in the development of biotechnology.
21.2 Fundamentals of Metabolic Engineering of E. coli
21.2.1 Metabolism and Products
To enable targeted metabolic engineering, a detailed knowledge of metabolic pathways and control elements for gene expression is necessary. A factor to keep in mind is that alterations of the metabolic
pathways to improve cell properties and survival is a natural adaptation process in bacteria, allowing them to colonize different natural habitats. This flexibility is an asset also in metabolic engineering work. The principal advantage of E. coli as a host organism in terms of its metabolism is its ability to grow rapidly on many different sugars with the addition of only mineral salts. The fact that only mineral salts are needed as nutrients considerably increases the range of commercially interesting end products. The metabolic drawbacks include overflow metabolism, i.e., the formation of acetate even under aerobic conditions at high glycolytic fluxes [16], and a rather substantial maintenance energy requirement, giving a nongrowth-associated consumption of the carbon source. The latter requires careful consideration in the process design. E. coli is a facultative anaerobe, i.e., it is capable of forming fermentative end products and of generating ATP also anaerobically. The anaerobic metabolism provides pathway flexibility, but may also complicate process scale-up in aerobic processes due to oxygen concentration gradients in large scale reactors [17]. A fundamental question to be asked is: what kind of products are likely to be produced efficiently in this host organism? A summary of already established commercial products as well as currently investigated new potential products from metabolically engineered E. coli is given in Table 21.2. From a market perspective the products range from therapeutic compounds, in the very high price end, down to commodity chemicals such as 1,3-propanediol. Also from a metabolic engineering perspective, the products span a wide range, with some products being achieved with only modest changes in the normal metabolism, whereas others require the insertion of entire pathways. The metabolism of E. coli can be broken down into principal blocks as shown in Figure 21.2.
In terms of smaller metabolites, products to be formed are likely to be derived from the 12 central precursor metabolites formed in the catabolism. These can either be converted into a normal metabolite in E. coli, or into a metabolite which is only formed in E. coli after introduction of an artificial pathway.
Table 21.2 Examples of Products Obtained in Metabolically Engineered E. coli and the Principal Modifications Made
Product Class | Compound | Use | Principal Metabolic Modifications Required | Reference
Primary metabolites | Ethanol | Fuel | Overexpression of PDC (encoding pyruvate decarboxylase) and adhB (encoding alcohol dehydrogenase) from Zymomonas mobilis. | 18
Primary metabolites | Lactic acid | Materials | Deletion of pathways giving anaerobic by-products, i.e., deletion of fumarate reductase (deletion of frdABCD), alcohol dehydrogenase (deletion of adhE), and pyruvate formate lyase (deletion of pflB). Interesting difference in strain background, where an optically purer grade of D-lactate for polylactate production was obtained in strain B (KO11). | 19,20
Primary metabolites | Succinic acid | Food, Materials | Minimize drainage of NADH in undesired reductions, such as formation of lactate (deletion of ldhA) and ethanol (deletion of adhE). Increase carbon dioxide binding reactions, e.g., carboxylation of pyruvate (heterologous expression of pyc, Lactococcus lactis) or carboxylation of PEP (heterologous expression of PEPC, e.g., from Actinobacillus succinogenes). Provide an alternative nonreductive pathway for succinate production via activation of the glyoxylate pathway. This can be achieved by inactivation of the repressor encoded by iclR. Minimize acetate formation by deletion of ackA and pta. | 21,22
Amino acids | Tryptophan | Pharma | Modification of feed-back resistance in the first step in the aromatic pathway (mutation of, e.g., aroH, aroG, and aroF), modifications in the PTS sugar uptake system, overexpression of the Trp operon (trpEDCBA) (catalyzing the reaction steps downstream of chorismate), deletion of tnaA encoding tryptophanase to minimize the breakdown of tryptophan. | 23,24
(Continued)
Table 21.2 (Continued)
Product Class | Compound | Use | Principal Metabolic Modifications Required | Reference
Amino acids | Phenylalanine | Food | Similar modifications as for overproduction of tryptophan with respect to sugar transport and feed-back inhibition in the upper part of the aromatic pathway. Modification of feed-back resistance in pheA encoding prephenate dehydratase, i.e., the dedicated step into L-phenylalanine formation from chorismate. | 25
Intermediates in pathways | Shikimic acid | Pharma | Similar modifications as for overproduction of aromatic amino acids with respect to sugar transport and feed-back inhibition in the upper part of the aromatic pathway. Deletion of shikimate kinase (aroK, aroL). | 26,27
Secondary metabolites | Amorpha-4,11-diene (an isoprenoid) | Pharma | Insertion of the entire mevalonate-dependent isoprenoid pathway (8 genes), mainly from S. cerevisiae: atoB (acetoacetyl-CoA thiolase, E. coli), HMGS, HMGR (truncated), ERG12, ERG8, MVD1 (S. cerevisiae), idi (IPP isomerase, E. coli), ispA (FPP synthase, E. coli). Insertion of a codon-optimized variant of ADS (amorphadiene synthase). | 28
Monomers | 1,3-propanediol | Materials | Insertion of heterologous genes encoding glycerol-3-phosphate dehydrogenase (GPD1, S. cerevisiae), glycerol-3-phosphate phosphatase (GPP2, S. cerevisiae), glycerol dehydratase (dhaB1-B3, Klebsiella pneumoniae), and 1,3-propanediol oxidoreductase*. Deletion of genes encoding glycerol kinase (glpK) and glycerol dehydrogenase (gldA). Change of uptake system, i.e., deletion of the PTS system and creation of an ATP-coupled uptake via overexpression of galactose permease (galP) and glucokinase (glk). Furthermore, insertion of reactivation factors for the glycerol dehydratase (dhaBX and orfX, Klebsiella pneumoniae). | 15
Polymers | Polyhydroxyalkanoates | Materials | Expression of PHA synthase phaC from Aeromonas acromogenes, phaJ from A. hydrophila (encoding enoyl-CoA hydratase), and phbB from Ralstonia eutropha. | 29,30
Recombinant proteins | Insulin, hGH | Pharma | The first successful recombinant insulin production method was based on a two-chain method. A- and B-chains were produced separately as fusion proteins, and were extracted from inclusion bodies. For hGH, a first production combined an E. coli expressed peptide with a chemically synthesized peptide. Later hGH was correctly processed in E. coli after periplasmic production. | 31–33
DNA | Plasmid DNA | Pharma (gene therapy) | Supported replication of ColE1 plasmids. Decreasing RNA degradation. | 34
* The source for the oxidoreductase used in the final construct was in fact the E. coli gene yqhD.
The changes made to obtain the products shown in Table 21.2 can be said to principally fall into the following categories:
1. Modification of substrate uptake
2. Increase of the yield of the essential precursor molecule(s) from the substrate
3. Insertion of genes enabling the synthesis of the desired product molecule from precursor molecule(s)
4. Modification(s), typically gene deletions, aiming at decreasing the formation of by-products
The initial modification made is very often of the third category, i.e., primarily one makes sure that the product of interest is indeed synthesized from a suitable precursor. Ethanol production from pyruvate can for example be obtained in E. coli by providing a heterologous pyruvate decarboxylase and an
[Figure 21.2, schematic. Main flow: substrates (glucose, lactose, sucrose, xylose, galactose); transport; fuelling reactions; precursor metabolites (G6P, R5P, F6P, E4P, G3P, 3-PG, PEP, pyruvate, AcCoA, OA, SuccCoA, α-KG); biosynthetic reactions; building blocks (amino acids, RNA nucleotides, DNA nucleotides, fatty acids, UDP-glucose, UDP-N-acetylglucosamine); polymerization; macromolecules (proteins, RNA, DNA, lipids, LPS, peptidoglycan); assembly; biomass (more cells). Product branches: inherent or engineered pathways to metabolites (ethanol, lactic acid, succinic acid, 1,3-propanediol), followed by transport; secretion of proteins.]
Figure 21.2 A schematic breakdown of metabolism, showing principal precursors and potential product formation.
alcohol dehydrogenase [18]. The introduction of genes encoding the necessary enzymes for converting the precursor to product may be sufficient in some cases, but normally other modifications are also necessary. With respect to sugar uptake, many sugars, including glucose, mannose, and fructose, are taken up via the phosphoenolpyruvate:sugar phosphotransferase system (PTS). As a net effect, this results in an intracellular phosphorylated sugar molecule produced at the expense of the conversion of one molecule of phosphoenolpyruvate (PEP) to pyruvate [35]. Since the uptake is coupled to phosphorylation, the consumption of one PEP is not necessarily disadvantageous from an energetic viewpoint [36]. However, inactivation of PTS components has been shown to be an efficient way to avoid catabolite repression, and thereby allow uptake of several sugars simultaneously [37]. Furthermore, use of the PTS decreases the maximum yield of PEP from glucose from two to one. This is a drawback for products derived from PEP, such as aromatic amino acids, as will be discussed later. An alternative in such cases is to introduce a facilitated glucose transporter, e.g., by expression of the glf gene encoding a glucose facilitator from the bacterium Zymomonas mobilis, together with a glucose kinase [38]. Decreasing by-product formation is in some cases a question of fine-tuning, but may in other cases in fact be the main, or only, modification made. For example, production of lactate in E. coli is obtained by deleting the competing pathways in the fermentative metabolism, i.e., deletion of the enzymes catalyzing formation of ethanol, acetate, formate, and succinate (cf. Table 21.2).
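The PEP bookkeeping described above can be illustrated with a minimal stoichiometric sketch. It assumes only what the text states (two PEP formed per glucose in glycolysis, one PEP converted to pyruvate per glucose imported through the PTS); the function name and the route labels are hypothetical and chosen for illustration.

```python
# Net PEP available per glucose, with and without PTS-mediated uptake.
# Stoichiometry only; no kinetics or regulation are modelled.

PEP_FORMED_PER_GLUCOSE = 2  # glycolysis yields two PEP per glucose

# PEP converted to pyruvate per glucose taken up (route labels are hypothetical)
PEP_SPENT_ON_UPTAKE = {
    "pts": 1,                      # PEP:sugar phosphotransferase system
    "facilitator_plus_kinase": 0,  # e.g., glf plus a glucose kinase (uses ATP)
}


def max_pep_yield(uptake_route: str) -> int:
    """Return the maximum PEP per glucose left for biosynthesis."""
    if uptake_route not in PEP_SPENT_ON_UPTAKE:
        raise ValueError(f"unknown uptake route: {uptake_route}")
    return PEP_FORMED_PER_GLUCOSE - PEP_SPENT_ON_UPTAKE[uptake_route]


if __name__ == "__main__":
    for route in PEP_SPENT_ON_UPTAKE:
        print(route, max_pep_yield(route))
```

The sketch makes the trade-off explicit: the non-PTS route keeps both PEP molecules for PEP-derived products, at the cost of ATP for the kinase step.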
21.2.2 Genetic Elements and Tools
To implement the gene insertions or deletions in the E. coli metabolic pathways, tools for cloning, transformation, and control of gene expression are needed. As indicated in Table 21.3, a wealth of alternatives is available. A difference in metabolic engineering, compared to genetic engineering for recombinant protein production, is the interest in both inserting and deleting genes as well as both up- and down
Table 21.3 Selected Promoter Elements Used for Gene Expression in E. coli
Promoter | Source | Induction | Comments | Reference
lac (lacUV5, lac(TS)) | E. coli | IPTG, lactose (thermal for lac(TS)) | Well characterized, weak induction, leaky promoter, many mutants and hybrids available. | 39,40
araBAD | E. coli | L-arabinose | Well characterized, titratable induction (to high levels), tight regulation, catabolite glucose repressed. | 39–41
trp | E. coli | Trp-starvation/β-indoleacrylic acid | Well characterized, induction can be titrated (to high levels), leaky expression. | 40,42,43
phoA | E. coli | Phosphate starvation | Relatively high level expression and low temp expression possible, limited media options. | 39,40
recA | E. coli | Nalidixic acid | High level expression, no special host requirements. | 39,40
cspA | E. coli | Low temperature shift, <20°C | Relatively weak induction, not so well-known. | 39,44
cadA | E. coli | Acid pH | Induction can be titrated (up to high levels), not so well characterized. |
lpp | E. coli | Constitutive | Strong promoter, not so frequently used. |
T7 | T7 | IPTG | Well characterized, very high level expression, leaky expression, only transcribed by the T7-polymerase (chromosomally integrated under the lacUV5 promoter). |
PL | λ | High temperature | Well characterized, high level expression. |
PSPA | S. aureus | Constitutive | Expression during growth-phase. |
tac | Hybrid | IPTG/high temperature | Well characterized, titratable induction up to high level expression, leaky expression. |
T7lac | Hybrid | IPTG | Well characterized promoter, very high expression levels, tight control. |
lpplac | Hybrid | IPTG | Strong promoter, fusion to lac-regulation sequence allows induction to high expression levels. |
References for rows cadA to lpplac (row assignment not recoverable): 39,40; 47,48; 49; 39,45; 40,46; 40; 46,50; 39,45
regulating gene expression, in order to fine-tune specific steps in metabolic pathways. Vectors, like plasmids, are needed for sequence and gene transfer, and are either maintained as extrachromosomal elements (for insertions) or utilized for site-specific integration (both insertions and deletions), where in the case of gene deletions/modifications, chromosomal copies of the target genes or their control elements are replaced by homologous recombination. Plasmid stability is important when using extrachromosomal plasmids, and control of their maintenance (selection markers) and copy number (replicon) in the cell needs to be considered. Plasmids for expression in E. coli are often based on pBR322 (copy number 15–20) or pUC (copy number 500–700), both with the ColE1 replicon [41,51,52], or on the pACYC series (copy number 10–12) with the p15A replicon [41,53]. Stable plasmids (resulting in stable expression) are highly desired in metabolic engineering, and low-copy-number plasmids are excellent alternatives for expression, and evaluation of expression levels, of inserted genes, as a lower copy number generally means a decreased metabolic burden [54]. Novel derivatives of the pBR322-based pET system, like pETcoco (Novagen, EMD Biosciences Inc., Germany), with a copy number of only 1 per cell, have recently been developed. If more than one plasmid needs to be introduced in the cell, plasmid compatibility has to be considered, and derivatives containing ColE1 and p15A replicons are known to be compatible [55]. Insertion of genes encoding selection markers is an established way to maintain extrachromosomal plasmids. Markers conferring antibiotic resistance are today completely dominant in E. coli systems [44]. One of the most common for lab-scale cultivations is the bla gene encoding β-lactamase, which gives rise to ampicillin resistance. However, because β-lactamase degrades the antibiotic, ampicillin is generally replaced by other antibiotics at larger scale.
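As a rough illustration of the replicon considerations above, the following sketch encodes the copy numbers and replicons quoted in the text and applies the simplified rule of thumb that two plasmids sharing a replicon are not stably maintained together. The class and function names are hypothetical, and real compatibility also depends on selection markers and host background.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Plasmid:
    name: str
    replicon: str
    copy_number: Tuple[int, int]  # (low, high) copies per cell, as quoted in the text


# Values taken from the paragraph above
PLASMIDS = {
    "pBR322": Plasmid("pBR322", "ColE1", (15, 20)),
    "pUC": Plasmid("pUC", "ColE1", (500, 700)),
    "pACYC": Plasmid("pACYC", "p15A", (10, 12)),
}


def compatible(a: Plasmid, b: Plasmid) -> bool:
    """Simplified rule: plasmids with different replicons are treated as compatible."""
    return a.replicon != b.replicon


def lower_burden(a: Plasmid, b: Plasmid) -> Plasmid:
    """Return the plasmid with the lower upper-bound copy number."""
    return a if a.copy_number[1] <= b.copy_number[1] else b


if __name__ == "__main__":
    print(compatible(PLASMIDS["pBR322"], PLASMIDS["pACYC"]))  # ColE1 with p15A
    print(compatible(PLASMIDS["pBR322"], PLASMIDS["pUC"]))    # both ColE1
    print(lower_burden(PLASMIDS["pUC"], PLASMIDS["pACYC"]).name)
```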
An alternative to extrachromosomal plasmids, and a necessity for gene deletions, is use of chromosomal integration, e.g., via integrating plasmids. Chromosomal integration offers the
Cloned gene material (considerations: stability, expression control)
  Extrachromosomal plasmid
  Chromosomal integration
    Random: transposon systems (Tn10-, Tn7-, or Tn5-based vectors)
    Site-specific:
      RecA-mediated recombination
      Phage-protein-mediated recombination (Redα/Redβ from phage λ or RecE/RecT from the Rac phage)
      Phage attachment-site recombination (attB)
Figure 21.3 Some methods used for transfer of gene material into E. coli. Chromosomal integration can be desirable in order to improve strain stability. Several methods have been developed to insert or delete genes, and create mutations. (For a more detailed review of integration methodologies see Smolke et al. [56], and references therein.)
possibility to introduce stable genetic changes. Integration into the genome by homologous recombination can be achieved using different mechanisms (Figure 21.3). The RecA-mediated mechanism, in RecA-positive strains like JM101 (see Table 21.4), is perhaps the most well-known method [57,58], and for this purpose, vectors carrying the temperature-sensitive SC101-replicon may be useful tools [59]. These are unable to replicate at 44°C [60] and, therefore, offer the possibility to select for integration at this temperature [58]. Factors controlling expression levels of target genes are also central for metabolic engineering. The gene expression level depends on the promoter and other sequences controlling mRNA translation and stability. Modifications of these control elements influence gene expression, and allow evaluation of the degree of control exercised by individual enzymes on overall pathway flux [61]. For native E. coli genes, the natural promoter may be used, but when introducing novel genes from other organisms, other promoters must be used, and a number of different promoters are available for this purpose. Due to the previously discussed historical significance of the lac-operon, the lac-promoter (or lac-derived control elements) is one of the most well known and most utilized promoters in vectors for heterologous gene regulation [62]. However, several other promoters are available to meet a desired expression level (Table 21.3). Many promoters are designed to be relatively strong, with expression either fully on (in presence of inducer) or off. Use in metabolic engineering has also created a desire to “fine tune” induction, e.g., to gradually increase expression as a result of titration of an inducer. Expression systems like those utilizing the araBAD promoter (Table 21.3), or strains such as Tuner (Table 21.4), which allow inducer titration, are examples of developments in this direction.
Other factors of importance for the gene expression level include those that affect mRNA translation levels. Less attention is often given to these factors, but E. coli simultaneously uses many different types of sequence elements for this purpose, e.g., Shine Dalgarno sequences [39], +2 codons [89],
Table 21.4 Some Common E. coli Strains Originating from Strains K12 and B, and Selected Use in Metabolic Engineering
Strain | Origin | Genotype | Remarks | Use in Metabolic Engineering
C600 | K12 | supE44 hsdR thi-1 thr-1 leuB6 lacY1 tonA21 [63] | Suppressing strain often used for making lysates and for propagation of λgt10. Commercially available from Stratagene. | Developed in engineering for enrichment of polyhydroxyalkanoate (PHA) monomers [64].
DH1 | K12 | endA1 recA1 gyrA96 thi-1 glnV44 relA1 hsdR17(rK- mK+) λ- [65,66] | Hoffman-Berling 1100 strain derivative, recombination deficient, more efficient at transforming large (40–60 kb) plasmids, nalidixic acid resistant. | Analysed concerning phenylacetyl-CoA catabolon [67]. Used in engineering for PHA formation [68].
DH5α | K12 | F- endA1 glnV44 thi-1 recA1 relA1 gyrA96 deoR nupG Φ80dlacZΔM15 Δ(lacZYA-argF)U169, hsdR17(rK- mK+), λ- [66,69] | Developed from strain DH1, nalidixic acid resistant. | Cloning host [70]. Engineered for lycopene production [71]. Analyzed concerning phenylacetyl-CoA catabolon [67]. Used in engineering for PHA formation [68].
JM101 | K12 | endA1 thi Δ(lac-proAB) [F´ traD36 proAB+ lacIq lacZΔM15] [52,63] | A strain supporting growth of vectors carrying amber mutations. Commercially available from NEB or Stratagene. | Engineered for lycopene production [71] and for further breeding of the carotenoid biosynthetic pathway [72].
JM109 | K12 | endA1 glnV44 thi-1 relA1 gyrA96 recA1 mcrB+ Δ(lac-proAB) glnV44 e14- [F´ traD36 proAB+ lacIq lacZΔM15] hsdR17(rK- mK+) [52] | Partly restriction-deficient; good strain for cloning repetitive DNA (RecA−). Can also be used for M13 cloning/sequencing and blue/white screening. Commercially available from NEB or Stratagene. | Analyzed concerning phenylacetyl-CoA catabolon [67]. Used in molecular breeding of the carotenoid biosynthetic pathway [72].
NM522 | K12 | F´ proA+B+ lacIq Δ(lacZ)M15 / Δ(lac-proAB) glnV thi-1 Δ(hsdS-mcrB)5 [73] | General purpose strain/host. Commercially available from NEB or Stratagene. | Cloning host [74]. Developed for carbohydrate synthesis [75].
XL1-blue | K12 | endA1 gyrA96(nalR) thi-1 recA1 relA1 lac glnV44 F´[::Tn10 proAB+ lacIq Δ(lacZ)M15] hsdR17(rK- mK+) [63,76] | Recombination deficient strain that supports growth of vectors with some amber mutations; F´ allows blue/white screening and permits bacteriophage M13 superinfection. Commercially available from Stratagene. | Engineered for lycopene production [71].
W3110 | K12 | λ- rrnD, rrnE [77] | Widely used strain for genetic and physiological studies, some genetic variations in the strain reported from different stocks. | Used in metabolic engineering of phenylalanine pathway [70]. Used in engineering for enrichment of PHA monomers [64]. Used for shikimic acid production [78].
(Continued)
Table 21.4 (Continued)
Strain | Origin | Genotype | Remarks | Use in Metabolic Engineering
Top10 | K12 | F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80lacZΔM15 ΔlacX74 deoR nupG recA1 araD139 Δ(ara-leu)7697 galU galK rpsL(StrR) endA1 λ- [79,80] | Cloning host, very similar to DH10B. Commercially available from Invitrogen. | Used as cloning host [70]. Metabolic engineering of the carotenoid pathway [81].
BL21(DE3) | B | F- ompT gal dcm hsdSB (rB- mB-) λ(DE3) [46] | Expression strain derived from strain B834. With DE3, a λ prophage carrying the T7 RNA polymerase gene and lacIq. Commercially available from Novagen. | Engineering to construct a catalyst for D-mannitol production [82]. Engineering of sialyte metabolism [83].
Tuner* (DE3) | B | F- ompT gal dcm hsdSB (rB- mB-) lacY1 λ(DE3) [50] | Expression host with a lac-permease mutation that allows control of expression level. Commercially available from Novagen. | Not yet documented use in metabolic engineering, but with potential as inducer titration can be made using IPTG. The strain has been used for artificial bacterial chromosome expression [84].
Rosetta* (DE3) | B | F- ompT gal dcm hsdSB (rB- mB-) lacY1 λ(DE3) pRARE** (CmR) [50] | Expression host supplying tRNAs for rare codons, with a lac-permease mutation that allows control of expression level. Commercially available from Novagen. | Used in production of plant specific flavonols [85].
HB101 | K12/B hybrid | F- mcrB mrr hsdS20(rB- mB-) recA13 leuB6 ara-14 proA2 lacY1 galK2 xyl-5 mtl-1 rpsL20(SmR) glnV44 λ- [86] | Host for pBR322 and many plasmids. | Used in carotenoid pathway engineering [87]. Analysed concerning phenylacetyl-CoA catabolon [67].
* Registered trademark: Novagen, EMD Biosciences Inc., Germany.
** pRARE encodes the tRNA genes argU, argW, ileX, glyT, leuW, proL, metT, thrT, tyrU, and thrU. The rare codons AGG, AGA, AUA, CUA, CCC, and GGA are supplemented [88].
and codon usage [39,90]. The stability of mRNA also affects production of the encoded protein, and specific sequences (U-rich regions) in the 5´-untranslated region prolong its half-life. A hairpin at the 5´-terminus [91], and stem-loop structures at the 3´-terminus [89] are other examples of factors shown to promote the half-life of mRNA.
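One practical way to judge whether a heterologous gene may run into the codon usage issue mentioned above is to scan its coding sequence for rare codons. The sketch below assumes the six rare codons named in the footnote to Table 21.4 (AGG, AGA, AUA, CUA, CCC, GGA); the function name and the example sequence are hypothetical, and the scan ignores codon context and expression level.

```python
from typing import List, Tuple

# Rare codons as listed in the footnote to Table 21.4 (RNA alphabet)
RARE_CODONS = {"AGG", "AGA", "AUA", "CUA", "CCC", "GGA"}


def rare_codon_positions(cds: str) -> List[Tuple[int, str]]:
    """Return (codon_index, codon) for every rare codon in an in-frame sequence.

    Accepts DNA or RNA; T is converted to U before comparison.
    """
    rna = cds.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return [(i, codon) for i, codon in enumerate(codons) if codon in RARE_CODONS]


if __name__ == "__main__":
    example = "ATGAGGCTGATATAA"  # hypothetical 5-codon sequence
    print(rare_codon_positions(example))
```

A gene with many hits, especially clustered ones, would be a candidate either for codon optimization or for expression in a strain that supplies the corresponding tRNAs.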
21.2.3 Commonly Used Strains and Strain Improvements
The most common E. coli strains are derived from either strain K-12 or strain B, with the former being predominant [1,2]. Both strains are nonpathogenic. Strain K-12, whose genome was fully sequenced in 1997 [92], was isolated at Stanford University in 1922 from human feces, and was kept there as a stock strain under the label K-12 for many years [2]. The origin of strain B appears to be somewhat uncertain, but it was briefly described by Kalmanson and Bronfenbrenner in 1939, who at that time stated that the strain had been in use over a period of 15 years [93]. These two strains, and their derivatives (Table 21.4), have during many years of cultivation lost their “O”-surface antigens, which is a further assurance of their harmlessness to people [1]. In addition, the strains are continuously developed (and renamed) both for general recombinant protein production and for specific metabolic engineering purposes, resulting in an ever increasing number of derivatives. Table 21.4 gives an overview of the changes introduced compared to the respective type strain (K-12 or B). Typically, the genetic changes in commercially available strains are introduced either to improve expression of genes with features that are uncharacteristic for E. coli (e.g., by introducing tRNA genes for rare codons, as in the strain Rosetta, Table 21.4), or to promote protein folding/stability (Figure 21.4). Genetic changes have also been introduced to stabilize extrachromosomal plasmids without the necessity to utilize antibiotic resistance markers by use of complementation strategies (Figure 21.4).
Such improved strains can be advantageous from the metabolic engineering perspective: genes from a wider range of sources can be used without the necessity to exchange rare codons, expression in active form is easier to optimize for aggregation-prone enzymes, and with stable plasmids the expression levels of genes selected as critical in a pathway can more easily be optimized and their effects on pathway flux (i.e., steady state reaction rates, for which genetic stability is crucial) can be evaluated. Moreover, a few examples of strain developments are also known where cellular parameters (e.g., secretion ability or oxygen uptake) are modified, opening possibilities to modify the cultivation technology or strategy. Among such
Plasmid selection
  Complementation: Alternative to antibiotic resistance. An example is ValS tRNA synthetase mutated to temperature sensitivity chromosomally, with the wild type on the plasmid.
Gene expression
  Codon usage: tRNA genes for rare codons incorporated. Allows expression of genes with rare codons, e.g., E. coli strain Rosetta.
  Inducer titration/uptake: Lac permease mutated. Allows diffusion-controlled, titratable inducer uptake for expression tuning of the lac promoter, e.g., E. coli strain Tuner.
Protein folding
  Chaperone over-expression: Promotes folding of “foreign proteins”.
Protein stability
  Deletion of protease-encoding genes: Longer half-life of proteins. Often periplasmic proteases are deactivated, as in E. coli strain BL21.
Figure 21.4 Examples of E. coli strain engineering.
examples, an interesting strategy is to improve oxygen uptake of the strains, by introducing production of haemoglobin in E. coli [94,95]. This may allow more efficient oxygen uptake under conditions with oxygen gradients in the reactor.
21.2.4 Cultivation Technology
Well-developed and reproducible cultivation strategies are another advantage of using E. coli in industrial applications. The mode of operation can be classified, based on the mass balances describing the system, as open or closed with respect to liquids (containing the nutrients) and solids (the cells). The principal modes of operation are batch, fed-batch, and continuous cultivation, and elaborations on these can be found in bioengineering textbooks (e.g., [96–98]). An important difference between the modes is that in continuous cultivation operated at steady state the chemical environment is constant, whereas in the other modes it changes with time. Hence, continuous cultivations offer a constant environment that facilitates analysis of metabolic flux. A high volumetric productivity is desired, and this requires a high cell density. Furthermore, process robustness and reproducibility are desired for production of both heterologous proteins and selected metabolites. A common process limitation is the oxygen supply rate, and oxygen limitation carries a risk of formation of undesired by-products. In addition, a metabolic limitation, the bacterial Crabtree effect, which results in acetic acid production when the specific glucose uptake rate is too high, has to be taken into account. For these reasons fed-batch is the most commonly used strategy for E. coli cultivations, although alternatives such as dialysis cultivation have been suggested [99]. A number of attempts have been made to develop control strategies that avoid acetate formation and oxygen limitation while reaching high cell densities. Acetic acid formation can, for example, be detected on-line by making pulses in the substrate feed profile.
By analysis of the response in the dissolved oxygen signal it can be decided whether the glucose feed rate is below the critical value, and the glucose feed can be adjusted to be below, but close to, the critical value, so that metabolic overflow is avoided and a high productivity is maintained [100]. Temperature limitation in fed-batch cultivation has also been proposed as an alternative way of controlling the specific growth rate [101,102], and thereby also the oxygen uptake rate. Such a strategy has also been shown to be useful for limiting the amount of endotoxins released to the culture medium [103]. Substantial effort has been invested in the optimisation of heterologous protein production in E. coli. Addition of compatible solutes has proven to be a useful technique for efficient production of difficult proteins that normally form inclusion bodies [104,105]. Additions of metabolic intermediates or amino acids to the culture medium have also been shown to affect either growth or protein production [106,107], and, if carefully evaluated, such results could give knowledge of the effects (promoting or inhibiting) on cellular metabolism. As shown by the examples above and as discussed previously, substantial methodology has been developed for heterologous protein production using E. coli. To some extent, the production of low-value bulk products based on metabolic engineering (e.g., production of amino acids) can benefit from this. However, it is likely that new processes based on continuous cultivation will also emerge, as the productivity in such systems is inherently higher than in batch and fed-batch operations.
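As an illustration of how a fed-batch feed can be kept below the critical glucose uptake rate, the following minimal sketch computes an exponential feed profile that targets a fixed specific growth rate. It assumes the textbook substrate-limited fed-batch mass balance with a constant biomass yield and no maintenance term. The function name, the parameter names, and all numerical values are hypothetical and are not taken from Refs. [100–102].

```python
import math

def exponential_feed_rate(t_h, mu_set, x0_g_per_l, v0_l, yield_xs, s_feed_g_per_l):
    """Feed rate (L/h) at time t_h for a substrate-limited fed-batch.

    Assumes a constant biomass yield and negligible maintenance, so that
    F(t) = mu_set * X0 * V0 * exp(mu_set * t) / (Y_xs * S_feed).
    """
    if min(mu_set, x0_g_per_l, v0_l, yield_xs, s_feed_g_per_l) <= 0:
        raise ValueError("all process parameters must be positive")
    biomass_at_start_g = x0_g_per_l * v0_l
    substrate_demand_g_per_h = mu_set * biomass_at_start_g * math.exp(mu_set * t_h) / yield_xs
    return substrate_demand_g_per_h / s_feed_g_per_l

# Hypothetical example values, for illustration only.
for t in (0.0, 2.0, 4.0):
    feed = exponential_feed_rate(t, mu_set=0.2, x0_g_per_l=5.0, v0_l=2.0,
                                 yield_xs=0.5, s_feed_g_per_l=500.0)
    print(f"t = {t:4.1f} h   feed = {feed * 1000:.2f} mL/h")
```

In such a scheme the set point mu_set would be chosen below the specific growth rate at which overflow metabolism starts; a probing controller of the kind described above could then be used to adjust it on-line.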
21.3 Metabolic Engineering of a Specific Pathway: The Shikimate Pathway
To illustrate the problems that need to be addressed in metabolic engineering of E. coli, we here discuss a specific example in somewhat more detail: the engineering of E. coli for the production of shikimate. The shikimate pathway is part of the aromatic pathway, which has considerable industrial importance (recently reviewed in Ref. [108]), and systematic engineering of this pathway has a relatively long history. The shikimate pathway (shown in Figure 21.5), which is
[Figure 21.5: reaction scheme with metabolite structures. PEP (from glycolysis) and E4P (from the PPP) are condensed to DAHP (1), which is converted via DHQ (2), DHS (3), shikimate (SA) (4), S3P (5), and EPSP (6) to chorismate (Chor) (7); quinate (QA) is formed from DHQ (8). Chorismate is the precursor of L-Phe, L-Tyr, L-Trp, enterobactin (Ent), menaquinone (Men), ubiquinone (Ubi), and tetrahydrofolate (THF).]
Figure 21.5 The shikimate pathway reactions and the structures of the different metabolites. This pathway is also called the common pathway of the aromatic amino acids, enterobactin, menaquinone, ubiquinone, and tetrahydrofolate pathways. The E. coli enzymes (genes) are: 1. DAHP synthase (aroF, aroG, aroH); 2. DHQ synthase (aroB); 3. DHQ dehydratase (aroD); 4./8. shikimate/quinate dehydrogenase (aroE, ydiB); 5. shikimate kinase I and II (aroK, aroL); 6. EPSP synthase (aroA); 7. chorismate synthase (aroC). (From L. Johansson and G. Lidén, J. Biotechnol., 126, 528–545, 2006. With permission.)
part of the aromatic amino acid pathway, holds many chemically interesting metabolites, and this pathway has therefore been subjected to metabolic engineering in various ways [24]. Metabolic engineering is necessary to obtain an efficient shikimate-overproducing E. coli strain, since shikimate is an intermediate in the pathway and is normally not excreted. The metabolic engineering for efficient production of shikimate can be divided into the following subproblems:
• To obtain efficient transport of the carbon source into the cell
• To provide precursors for the shikimate pathway
• To optimize the flux into and within the shikimate pathway, i.e., to minimize by-product formation and avoid feed-back inhibition of certain steps
• To obtain efficient transport of shikimate out of the cell
Many different approaches have been taken to address these problems (schematically shown in Figure 21.6), and high-producing strains have indeed been constructed (reviews on the subject include Refs. [23,109–111]). As previously discussed, the order in which the strain engineering is carried out normally differs from the order in the list above.
21.3.1 Increase of the Flux into the Pathway
In shikimate production, as in engineering of the aromatic pathway in general, a critical problem is to increase the flux into the pathway. The aromatic pathway is feed-back inhibited by its end products (tyrosine, phenylalanine, and tryptophan). This takes place by direct inhibition at the enzyme level of the first dedicated step, i.e., the condensation of E4P and PEP to form 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP), but there is also transcriptional regulation [112]. Three DAHP synthases, encoded by aroF, aroG, and aroH and feed-back inhibited by tyr, phe, and trp, respectively, are found in E. coli. The enzymes encoded by the first two genes account for the main DAHP synthase activity, whereas less than a few percent is related to the aroH gene. The aromatic amino acids normally account for only a few percent of the total amino acids in E. coli protein (phe = 3.5%, tyr = 2.6%, and trp = 1.0%), whereas other amino acids are present in much higher amounts (e.g., gly = 11.5%, ala = 9.6%, and val = 7.9%) [110]. As a consequence, only a small fraction of the carbon taken up enters the aromatic amino acid/shikimate pathway in a wild-type strain. An additional reason for the need for feed-back resistance is that if the pathway is terminated to produce an intermediate (like shikimate), the strain generally becomes auxotrophic for the aromatic amino acids produced downstream of the intermediate. These amino acids then need to be supplied in the growth medium, where they may inhibit the DAHP synthase. To overcome the feed-back inhibition, a feed-back resistant form of one DAHP synthase needs to be overexpressed. AroG [113] or, most often, AroF [114–116] has been chosen for this purpose. Feed-back resistance can be obtained by mutation of a single amino acid [111,117], but the transcriptional repression is not affected by making the enzymes feed-back resistant.
However, if the DAHP synthase gene is introduced on a multicopy plasmid, the amount of repressor will not be sufficient to occupy all the repressor binding sites of the DAHP synthase gene copies [111].
21.3.2 Supply of Precursors
Once the feed-back resistant DAHP synthase has been overexpressed, there is a risk that the supply of the precursors, PEP and E4P, becomes limiting. E4P is produced in the nonoxidative pentose phosphate pathway (PPP), and stoichiometrically it can be calculated that three molecules of E4P can be produced from two molecules of F6P (Table 21.5). E4P is formed in reactions catalyzed by transketolase (TktA, TktB) and by transaldolase (TalA, TalB). Of these enzymes, the A isoform of transketolase has been shown to be the most important for the availability of E4P to the shikimate pathway [118–120]. An amplification
[Figure 21.6: schematic of central carbon metabolism in E. coli (glucose uptake via the PTS, GalP, or Glf; glycolysis and gluconeogenesis; the pentose phosphate pathway; the TCA cycle and acetate formation) feeding the shikimate pathway, with the engineered genes marked.]
Figure 21.6 (See color insert following page 10-18.) Principal targets for metabolic engineering for shikimate production. Genes that have been overexpressed are shown by thick red arrows and are written with a large font size. Genes that have been deleted or inactivated are shown by dotted blue arrows, crossed by a line, and written with a small font size. Proteins (genes in parentheses): Glf (glf) = glucose facilitator, PTS (crr, ptsH, ptsI, ptsG) = phosphotransferase system, GalP (galP) = galactose MFS-transporter. Genes: aroG, aroF, aroH, aroB, aroE, aroL, aroK (see Figure 21.5), tktA = transketolase, pps, ppc, pck, pykF, pykA, glk = glucose kinase. Note: not all of the modifications have been carried out simultaneously. The most productive shikimate strain contains this combination of modifications: a nonfunctional PTS (∆crrptsHptsI), nonfunctional shikimate kinases (∆aroLaroK), and a multicopy plasmid containing aroFfbr, aroB, aroE, tktA, and Z. mobilis glf and glk [26].
of this enzyme in combination with an overexpressed DAHP synthase was shown to double the flux into the pathway in comparison to the case where only DAHP synthase was overexpressed [121]. The availability of the other precursor, PEP, is determined by the balance between many competing reactions (Table 21.6) [36]. During growth on glucose, PEP is formed only by the glycolytic pathway [122]. To increase the availability of PEP to the shikimate pathway, competing reactions should be inactivated and/or formation of PEP should be boosted by (over)expressing genes for one of the gluconeogenic reactions in order to replenish the PEP (cf. Table 21.6 and Figure 21.6). As already mentioned, sugar transport by the PTS will directly decrease the maximum theoretical yield of DAHP from glucose by 50%, since one out of two PEP generated in glycolysis will be consumed
Table 21.5 Connected Reactions of the Non-Oxidative Pentose Phosphate Pathway and Glycolysis

Pathway            Reaction                     Enzyme
Nonoxidative PPP   F6P + G3P ↔ E4P + X5P        TktA, TktB
                   X5P + R5P ↔ G3P + S7P        TktA, TktB
                   S7P + G3P ↔ E4P + F6P        TalA, TalB
                   F6P + G3P ↔ E4P + X5P        TktA, TktB
                   X5P ↔ Ru5P                   Rpe
                   Ru5P ↔ R5P                   RpiA, RpiB
Glycolysis         F6P + ATP → F1,6P + ADP      PfkA, PfkB
                   F1,6P ↔ G3P + DHAP           FbaA, FbaB
                   DHAP ↔ G3P                   Tpi
Total reaction     2F6P + ATP ↔ 3E4P + ADP
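The total reaction of Table 21.5 can be checked by summing the nine listed reactions, each taken once. The short sketch below does this bookkeeping; the data structure and the function name are illustrative choices and are not part of the original table.

```python
from collections import Counter

# Reactions of Table 21.5 as (substrates, products); every coefficient is 1.
REACTIONS = [
    (("F6P", "G3P"), ("E4P", "X5P")),    # TktA, TktB
    (("X5P", "R5P"), ("G3P", "S7P")),    # TktA, TktB
    (("S7P", "G3P"), ("E4P", "F6P")),    # TalA, TalB
    (("F6P", "G3P"), ("E4P", "X5P")),    # TktA, TktB (listed twice in the table)
    (("X5P",), ("Ru5P",)),               # Rpe
    (("Ru5P",), ("R5P",)),               # RpiA, RpiB
    (("F6P", "ATP"), ("F1,6P", "ADP")),  # PfkA, PfkB
    (("F1,6P",), ("G3P", "DHAP")),       # FbaA, FbaB
    (("DHAP",), ("G3P",)),               # Tpi
]

def net_reaction(reactions):
    """Sum the reactions; negative = net consumed, positive = net produced."""
    balance = Counter()
    for substrates, products in reactions:
        for metabolite in substrates:
            balance[metabolite] -= 1
        for metabolite in products:
            balance[metabolite] += 1
    # Intermediates that cancel out are dropped from the result.
    return {m: n for m, n in balance.items() if n != 0}

print(net_reaction(REACTIONS))
```

All intermediates (G3P, X5P, R5P, S7P, Ru5P, F1,6P, and DHAP) cancel, leaving two F6P and one ATP consumed for three E4P and one ADP formed, in agreement with the last row of the table.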
Table 21.6 Production and Consumption of PEP in E. coli During Aerobic Growth in Minimal Medium

Pathway                   Reaction                                                  Flux Relative to Total Production (%)   Enzyme
Consumption
Sugar uptake (PTS)        PEP + glucose → pyruvate + G6P                            50                                      Crr, PtsG, PtsH, PtsI
Glycolysis                PEP + ADP → pyruvate + ATP                                15                                      PykF, PykA
Peptidoglycan synthesis   PEP + UDP-GlcNAc → UDP-GlcNAc-pyruvate-enol-ether + Pi    16                                      MurA
Anaplerotic pathway       PEP + CO2 → OA + Pi                                       16                                      Ppc
Shikimate pathway         PEP + H2O + E4P → DAHP + Pi                               3                                       AroF, AroG, AroH
Production
Glycolysis                2PG ↔ PEP + H2O                                           NA                                      Eno
Gluconeogenesis           OA + ATP → PEP + CO2 + ADP                                NA                                      Pck
Gluconeogenesis           H2O + ATP + pyruvate → PEP + AMP + Pi                     NA                                      Pps
in transport. The maximum yield of DAHP on glucose will therefore be 0.43 mol/mol. Patnaik and coworkers showed that a DAHP yield close to this value was obtained by overexpression of feed-back resistant DAHP synthase and TktA [123]. Therefore, deletion of pykA, pykF, ppc, or murA (cf. Table 21.6) appears not to be needed. To increase the yield further, the PTS needs to be considered. Glucose may be transported by the galactose permease, GalP, and subsequently phosphorylated by a glucose kinase, Glk [110]. Inactivation of the PTS may therefore be an option. However, the PTS is involved in global regulation by activating adenylate cyclase, which catalyzes the production of cAMP, which in turn activates the global regulator CRP (cAMP receptor protein). Deletion of this transport system is therefore not completely straightforward, and deletion strains have been shown to grow poorly on glucose in general [36]. To obtain a functional strain, further modifications are needed, e.g., by “evolutionary engineering” in chemostats where the dilution rate is slowly increased, so that strains with increasing specific growth rate may be selected [124]. Alternatively, the glucose facilitator (Glf) from Zymomonas mobilis can be introduced in combination with the glucose kinase (Glk) from the same organism. This change is in fact part of the modifications in the most efficient shikimate strains of today [26]. In principle, PEP may also be regenerated by overexpression of the gluconeogenic genes pps or pck
giving PEP formation from pyruvate or oxaloacetate, respectively (cf. Figure 21.6). Overexpression of pps has been shown to be very efficient, and a yield close to the theoretical maximum has been obtained in a DAHP-producing strain [113]. Overexpression of pck, on the other hand, has been shown to be efficient only if TCA-cycle metabolites such as succinic acid, malic acid, or α-ketoglutarate are added to the growth medium together with glucose [125]. A higher yield of DAHP (and downstream products) is also expected if a carbon source that is not transported by the PTS, such as arabinose or xylose, is used instead of glucose [123]. A higher yield of the product DHS from xylose and arabinose was indeed obtained by Li and Frost [125]. Even higher yields were obtained with a mixture of glucose/xylose/arabinose in a 3/3/2 molar ratio. Since this molar ratio is close to that obtained in hydrolysate from corn fiber, this is clearly of practical interest.
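The maximum DAHP yield of 0.43 mol/mol quoted above, and its doubling when glucose uptake does not consume PEP, can be rationalized with a simple precursor balance. The sketch below is an illustration under explicitly stated simplifying assumptions and is not the calculation used in Refs. [36,123]; the function name and its parameter are hypothetical.

```python
from fractions import Fraction

def max_dahp_yield(pep_per_glucose_uptake):
    """Maximum DAHP yield (mol/mol glucose) from a simple precursor balance.

    Simplifying assumptions of this sketch:
      * a fraction b of the glucose goes through glycolysis (2 PEP per glucose);
      * the remaining fraction (1 - b) is converted to E4P at 3 E4P per 2 F6P
        (Table 21.5);
      * uptake costs `pep_per_glucose_uptake` PEP per glucose (1 for the PTS,
        0 for a PEP-independent transporter) and the pyruvate formed is not
        recycled to PEP;
      * ATP and reducing equivalents are not limiting.
    DAHP needs one PEP and one E4P, so the optimum is where the two supplies
    match: 1.5 * (1 - b) = 2 * b - u.
    """
    u = Fraction(pep_per_glucose_uptake)
    b = (Fraction(3, 2) + u) / Fraction(7, 2)  # glycolytic fraction at the optimum
    return Fraction(3, 2) * (1 - b)            # E4P (= DAHP) formed per glucose

for label, u in (("PTS uptake", 1), ("PEP-independent uptake", 0)):
    y = max_dahp_yield(u)
    print(f"{label}: {y} = {float(y):.2f} mol DAHP / mol glucose")
```

Under these assumptions the PTS case gives 3/7 (about 0.43) and the PEP-independent case gives 6/7 (about 0.86), which is the 50% reduction referred to in the text. Recycling pyruvate to PEP, e.g., by Pps, would move the PTS case toward the higher value.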
21.3.3 Minimization of By-Products
The most important reaction to remove for the production of shikimic acid is obviously the phosphorylation that converts shikimate to shikimate-3-phosphate (S-3-P). This reaction is catalyzed by two isozymes, shikimate kinase I and II (the major enzyme), encoded by aroK and aroL, respectively. Shikimate production is enabled by deleting either aroL alone [126] or both aroL and aroK [27]. If both genes are deleted, the strain can accumulate more shikimic acid than if only aroL is deleted, since in the latter case some of the shikimate is consumed by shikimate kinase I. The drawback of deleting both genes, however, is auxotrophy for the aromatic amino acids, ubiquinone, menaquinone, and folic acid. A double deletion mutant therefore needs a supply of the aromatic amino acids and of precursors for the other metabolites to survive. Unfortunately, termination of the pathway at the level of shikimate kinase does not lead to secretion of shikimate only; several other compounds found in, or derived from, the upper part of the pathway, such as DHS, DHQ, and quinate, are also secreted. In general, the further down the pathway the metabolite of interest is found, the worse the problem becomes. This is likely due to pools of intracellular metabolites, which may “leak” or be actively transported out of the cell. All enzymes in the pathway, except for DHQ dehydratase (aroD), have been shown to exert some level of control of the shikimate pathway flux [127], although there are contrasting opinions concerning the rate-limiting character of shikimate dehydrogenase (aroE) [128]. Overexpression of tentatively rate-limiting enzymes did not solve the problem of by-product formation [127], and “equilibration” caused by the reversible character of AroE and AroD is most likely important for the by-product formation (cf. Figure 21.5) [26,126].
More knowledge of the transport of the intermediates is clearly needed to address the by-product formation further. An additional observation is that the sum of the yields of all products formed in the shikimate pathway is lower than the yield obtained when DAHP is the product, for which yields close to the theoretical maximum are obtained in the best producing strains [113]. The sum of yields tends to decrease the further down the pathway the metabolite of interest is found. For example, a total yield of 51% could be reached in a DHS-producing strain, compared to 38% in the best shikimate-producing strain [26,129]. The reason behind this phenomenon is not known. An alternative way to obtain shikimate production is to terminate the pathway at the next enzyme, i.e., the EPSP synthase encoded by aroA (cf. Figure 21.6). In such a strain S-3-P has been shown to be secreted to the medium, and a mixture of shikimate and S-3-P is produced. However, the conversion of S-3-P to shikimate is relatively simple, and the strategy has been shown to give fewer other by-products than deletion of the shikimate kinases [109].
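The deletion logic discussed in this section can be summarized in a small qualitative sketch. It encodes the main trunk of the pathway with the genes given in the caption of Figure 21.5 and reports where a set of deletions blocks the pathway and which intermediates lie at or above the block. This is a deliberately crude illustration: a step is treated as blocked only when every listed gene is deleted, the quinate branch, transport, and kinetics are ignored, and treating aroE and ydiB as interchangeable for step 4 is an assumption of the sketch. All names in the code are hypothetical.

```python
# Main trunk of the shikimate pathway: (substrate, product, genes for the step).
PATHWAY = [
    ("E4P + PEP", "DAHP", {"aroF", "aroG", "aroH"}),
    ("DAHP", "DHQ", {"aroB"}),
    ("DHQ", "DHS", {"aroD"}),
    ("DHS", "SA", {"aroE", "ydiB"}),
    ("SA", "S3P", {"aroK", "aroL"}),
    ("S3P", "EPSP", {"aroA"}),
    ("EPSP", "Chor", {"aroC"}),
]

def first_blocked_index(deleted_genes):
    """Index of the first step whose genes are all deleted, or None."""
    deleted = set(deleted_genes)
    for index, (_, _, genes) in enumerate(PATHWAY):
        if genes <= deleted:  # no isoenzyme gene is left for this step
            return index
    return None

def upstream_intermediates(deleted_genes):
    """Pathway intermediates at or above the block (candidates for secretion)."""
    index = first_blocked_index(deleted_genes)
    if index is None:
        return []
    # Skip the first entry, whose substrates are precursors from central metabolism.
    return [substrate for substrate, _, _ in PATHWAY[1:index + 1]]

print(first_blocked_index({"aroL"}))             # aroK remains, so no complete block
print(upstream_intermediates({"aroK", "aroL"}))  # block at shikimate kinase
print(upstream_intermediates({"aroA"}))          # block at EPSP synthase
```

In this picture, deleting aroL alone leaves the step leaky, deleting both kinase genes stops the trunk at shikimate with DAHP, DHQ, and DHS upstream of it, and deleting aroA moves the block one step down so that S3P is included as well.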
21.3.4 Comparison with Metabolic Engineering of an Artificial Pathway: 1,3-Propanediol Production
The engineering of E. coli for shikimate production is an example of “amplification” of the production of a compound naturally produced in the organism. Among the successful examples of metabolic engineering of E. coli, we also find introduced artificial pathways. One example is the production of 1,3-propanediol (cf. Table 21.2). In this metabolically engineered E. coli an aerobic pathway
utilizing glucose as the carbon source was created (instead of the naturally occurring fermentative glycerol pathway found in, e.g., Clostridium and Klebsiella species). A K-12 strain with weak glycerol-producing capacity and no 1,3-propanediol-producing capacity was used, meaning that the successfully engineered strain relied on a heterologous pathway that diverts carbon from DHAP in the central carbon metabolism to 1,3-propanediol [15]. Clearly, insertion of genes to create the missing links in metabolism is crucial. But in order to create an efficient pathway, genes were also deleted, both to avoid side reactions with glycerol and to remove the PTS (Table 21.2). The pathway was constructed by inserting genes from Saccharomyces cerevisiae and Klebsiella pneumoniae, encoding three different enzyme activities. The S. cerevisiae genes encoding glycerol 3-phosphate dehydrogenase and glycerol 3-phosphate phosphatase [130] were inserted to obtain a route for glycerol production, and the three genes encoding the subunits of glycerol dehydratase from K. pneumoniae were inserted to enable the conversion of glycerol to 3-hydroxypropionaldehyde [15]. Finally, a 1,3-propanediol oxidoreductase was needed. Surprisingly, excellent results were achieved with a native E. coli gene, yqhD, encoding a previously uncharacterized oxidoreductase that functions as a 1,3-propanediol oxidoreductase in the final step of the production pathway [131]. The superior results obtained in fed-batch cultivations using this gene (instead of genes encoding the more obvious dehydrogenases obtained from dha regulons) are believed to result from differences in the reduced/oxidized cofactor ratios. This illustrates the versatility of microbial metabolism and shows that a priori unforeseen alternatives may prove very efficient in metabolic engineering.
21.4 Concluding Remarks
One may argue that, had molecular biology developed differently, we might never have considered using E. coli as a host for metabolic engineering. However, as discussed, E. coli has many desirable traits and has today firmly established itself as a host organism. This organism will certainly continue to play an important role in the development of metabolic engineering.
References 1. Thomas, G. Escherichia coli: model and menace. Microbiol. Today, 31, 115, 2004. 2. Lederberg, J. E. coli K-12. Microbiol. Today, 31, 116, 2004. 3. Escherich, T. Die Darmbakterien des Neugeborenen und Säuglings, Fortschr.d. Med. 3, 515-522; 251-251, 1885. 4. Feng, P., Weagant, S. D., and Grant, M. A. Bacteriological Analytical Manual, 8th edn. US Food and Drug Administration, Revision A, 1998, revised 2002, chapter 4. (on-line at http://www.cfsan.fda. gov/~ebam/bam-4.html) 5. Monod, J. From enzymatic adaptation to allosteric transitions. Science, 154, 475, 1966. 6. Jacob, F. and Monod, J. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol., 3, 318, 1961. 7. Lederberg, J. and Tatum, E. L. Gene recombination in E. coli. Nature,158, 558, 1946. 8. Curtiss III, R. Bacterial conjugation. Annual Rev. Microbiol., 23, 69, 1969. 9. Ullman, A. Encyclopedia of Life Sciences. Nature Publishing Group, 2001. 10. Cohen S. N. et al. Construction of biologically functional plasmids in vitro. Proc. Nat. Acad. Sci. USA, 70, 3240, 1973. 11. Wiens, A. E. The symbiotic relationship of science and technology in the 21st Century. J. Technol. Studies, XXV, no 2, (no page numbers given), 1999. 12. Läkemedelsvärlden, Nyponextrakt, tuggummi och fusioner, no 3, 2002, http://www.lakemedelsvarlden.nu 13. Miller, J. E. et al. Production of phenylalanine and organic acids by phosphoenolpyruvate carboxylase-deficient mutants of Escherichia coli. J. Ind. Microbiol. Biotechnol., 2, 143, 1987.
14. Nilsson, J. and Skogman, S. G. Stabilization of Escherichia coli tryptophan−production vectors in continuous cultures: a comparison of three different systems. Bio/Technol., 4, 901, 1986. 15. Nakamura, C. E. and Whited, G. M. Metabolic engineering for the microbial production of 1,3propanediol. Curr. Opinion Biotechnol., 14, 454, 2003. 16. Majewski, R. A. and Domach, M. M. Simple constrained-optimization view of acetate overflow in E. coli. Biotechnol. Bioeng., 35, 732, 1990. 17. Lidén, G. Understanding the bioreactor. Bioproc. Biosys. Eng., 24, 273, 2002. 18. Ingram, L. O. et al. Genetic engineering of ethanol production in Escherichia coli. Appl. Environ. Microbiol., 53, 2420, 1987. 19. Zhou, A., Causey, T. B., Hasona, A., Shanmugam, K. T., and Ingram, L. O. Production of optically pure D-lactic acid in mineral salts medium by metabolically engineered Escherichia coli W3110. Appl. Environ. Microbiol., 69, 399, 2003. 20. Zhou, A. et al. Fermentation of 10% (w/v) sugar to D-lactate by engineered Escherichia coli B. Biotechnol. Lett., 27, 1891, 2005. 21. Sanchez, A. M., Bennett, G. N., and San, K.-Y. Efficient succinic acid production from glucose through overexpression of pyruvate carboxylase in an Escherichia coli alcohol dehydrogenase and lactate dehydrogenase mutant. Biotechnol. Prog., 21, 358, 2005. 22. Sanchez, A. M., Bennett, G. N., and San, K.-Y. Novel pathway engineering design of the anaerobic central metabolic pathway in Escherichia coli to increase succinate yield and productivity. Metab. Eng. 7, 229, 2005. 23. Berry, A. Improving production of aromatic compounds in Escherichia coli by metabolic engineering. Trends Biotechnol., 14, 250, 1996. 24. Bongaerts, J. et al. Metabolic engineering for microbial production of aromatic amino acids and derived compounds. Metabol. Eng., 3, 289, 2001. 25. Grinter, N. J. Developing an L-phenylalanine process. CHEMTECH, 28, 33, 1998. 26. Chandran, S. S. et al. 
Phosphoenolpyruvate availability and the biosynthesis of shikimic acid. Biotechnol. Prog., 19, 808, 2003. 27. Draths, K. M., Knop, D. R., and Frost, J. W. Shikimic acid and quinic acid: replacing isolation from plant sources with recombinant microbial biocatalysis. J. Am. Chem. Soc., 121, 1603, 1999. 28. Martin V. J. J. et al. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nature Biotechnol., 21, 796, 2003. 29. Park, S. J. et al. Production of poly(3-hydroxybutyrate-co-3-hydroxyhexanoate by metabolically engineered Escherichia coli strains. Biomacromol., 2, 248, 2001. 30. Park, S. J., Choi, J., and Lee, S. Y. Engineering of Escherichia coli fatty acid metabolism for the production of polyhydroxyalkanoates. Enz. Microb. Technol., 36, 579, 2005. 31. Ladisch, M. R. and Kohlmann, K. L. Recombinant human insulin. Biotechnol. Prog., 8, 469, 1992. 32. Goeddel, D. V. et al. Direct expression in Escherichia coli of a DNA sequence coding for human growth hormone, Nature, 281, 544, 1979. 33. Grey, G. L. et al. Periplasmic production of correctly processed human growth hormone in Escherichia coli: natural and bacterial signal sequences are interchangeable. Gene, 39, 247, 1985. 34. Wang, Z., Yuan Z., and Hengge, U. R. Processing of plasmid DNA with ColE1-like replication origin. Plasmid, 51, 149, 2004. 35. Postma, P. W., Lengeler, J. W., and Jacobson, G. R. Phosphoenolpyruvate: carbohydrate phosphotransferase systems. In: Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edn, Neidhardt, F. C., Ed. ASM Press, Washington, DC, 1999. 36. Gosset, G. Improvement of Escherichia coli production strains by modification of the phosphoenolpyruvate: Sugar phosphotransferase system. Microbial Cell Fact., 4, 14, 2005. doi:10.1186/1475-2859-4-14. 37. Lindsay S. E, Bothast, B. S., and Ingram. L. O. Improved strains of recombinant Escherichia coli for ethanol production from sugar mixtures. Appl. Microbiol. Biotechnol, 43, 70, 1995.
38. Snoep, J. L. et al. Reconstitution of glucose uptake and phosphorylation in a glucose-negative mutant of Escherichia coli by using Zymomonas mobilis genes encoding the glucose facilitator protein and glucokinase. J. Bacteriol., 176, 2133, 1994. 39. Hannig, G. and Makrides, S. C. Strategies for optimizing heterologous protein expression in Escherichia coli. Trends Biotechnol., 16, 54, 1998. 40. Weickert, M. et al. Optimization of heterologous protein production in Escherichia coli. Curr. Opin. Biotechnol., 7, 494, 1996. 41. Sörensen H. P. and Mortensen, K. K. Advanced genetic strategies for recombinant protein expression in Escherichia coli. J. Biotechnol., 115, 113, 2005. 42. Jonasson, P. et al. Genetic design for facilitated production and recovery of recombinant proteins in Escherichia coli. Biotechnol. Appl. Biochem., 35, 91, 2002. 43. Yansura, D. G. Expression as trpE fusion. Methods Enzymol., 185, 161, 1990. 44. Goldenberg, D., Azar, I., and Oppenheim, A. B. Differential mRNA stability of the cspA gene in the cold-shock response of Escherichia coli. Mol. Microbiol., 19, 241, 1996. 45. Inouye, S. and Inouye, M. Up-promoter mutations in the lpp gene of Escherichia coli. Nucl. Acids Res., 13, 3101, 1985. 46. Studier, F. W. and Moffat, B. A. Use of bacteriophage T7 RNA polymerase to direct selective highlevel expression of cloned genes. J. Mol. Biol. 189, 113, 1986. 47. Högset, A. et al. Expression and characterization of a recombinant human parathyroid hormone secreted by Escherichia coli employing the staphylococcal protein A promoter and signal sequence. J. Biol. Chem., 265, 7338, 1990. 48. Löfdahl, S. et al. Gene for Staphylococcal protein A. Proc. Natl. Acad. Sci. USA, 80, 697, 1983. 49. De Boer, H. A., Comstock, L. J., and Vasser, M. The tac promoter: a functional hybrid derived from the trp and lac promoters. Proc. Nat. Acad. Sci. USA, 80, 21, 1983. 50. pET-system Manual. 11th edn (http://www.merckbiosciences.co.uk/product/TB055), 2006. 51. Bolivar F. 
et al. Construction and characterization of new cloning vehicles. II. A multipurpose cloning system. Gene, 2, 95, 1977. 52. Yanisch-Perron, C., Vieira, J., and Messing, J. Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33, 103, 1985. 53. Nakano, Y. et al. Construction of a series of pACYC-derived plasmid vectors. Gene, 162, 157, 1995. 54. Keasling, J. D. et al. New tools for metabolic engineering of Escherichia coli. In: Metabolic Engineering, Lee, S. Y., and Papoutsakis, E. T. Eds. Marcel Dekker, Inc., New York, 1999. 55. Mayer, M. P. A new set of useful cloning and expression vectors derived from pBlueScript, Gene, 163, 41, 1995. 56. Smolke, C. D., Martin, V. J. J., and Keasling, J. D. Tools for metabolic engineering in Escherichia coli. Prot. Exp. Technol., 149, 2004. 57. Gooljarsingh, L. T. et al., Localization of GAR transformylase in Escherichia coli and mammalian cells. Proc. Natl. Acad. Sci. USA, 98, 6565, 2001. 58. Hamilton, C.A. et al. New method for generating deletions and gene replacements in Escherichia coli. J. Bacteriol., 171, 4617, 1989. 59. Gabriel K. and McClain, W. H. A set of plasmids constitutively producing different RNA Levels in Escherichia coli. J. Mol. Biol., 290, 385, 1999. 60. Hashimoto-Gotoh, T. and Sekiguchi, M. Mutations of temperature sensitivity in R plasmid pSC101. J. Bacteriol., 131, 405, 1977. 61. Koffas, M. et al. Metabolic engineering. Annu. Rev. Biomed. Eng. 01, 535, 1999. 62. Baneyx, F. Recombinant protein expression in Escherichia coli. Curr. Opin. Biotechnol., 10, 411, 1999. 63. Sambrook J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: a Laboratory Manual, 2nd edn. Cold Spring Harbor Laboratory Press, NY, 1989.
64. Park, S. J. et al. Enrichment of specific monomer in medium-chain-length poly(3-hydroxyalkanoates) by amplification of fadD and fadE genes in recombinant Escherichia coli. Enz. Microb. Technol., 33, 62, 2003. 65. Durwald, H. and Hoffman-Berling, H. Endonuclease I-deficient and ribonuclease I-deficient Escherichia coli mutants. J. Mol. Biol., 34, 331, 1968. 66. Hanahan, D. Studies on transformation of Escherichia coli with plasmids. J.Mol. Biol.,166, 577, 1983. 67. Luengo, J. M., García, J. L., and Olivera, E. R. The phenylacetyl-CoA catabolon: a complex catabolic unit with broad biotechnological applications. Mol. Microbiol. 39, 1434, 2001. 68. Snell, K. D. et al. YfcX enables medium-chain-length poly(3-hydroxyalkanoate) formation from fatty acids in recombinant Escherichia coli fadB strains. J. Bacteriol., 184, 5696, 2002. 69. Aguilar, A. et al. Two genes from the capsule of Aeromonas hydrophila (serogroup O:34) confer serum resistance to Escherichia coli K12 strains. Res. Microbiol., 150, 395, 1999. 70. Mueller, U. et al. Metabolic engineering of the E. coli l-phenylalanine pathway for the production of d-phenylglycine (d-Phg). Met. Eng., 8, 196, 2006. 71. Kim, S.-W. and Keasling, J. D. Metabolic engineering of the nonmevalonate isopentenyl diphosphate synthesis pathway in Escherichia coli enhances lycopene production. Biotechnol. Bioeng., 72, 408, 2001. 72. Schmidt-Dannert, C., Umeno, D., and Arnold, F. H. Molecular breeding of carotenoid biosynthetic pathways. Nature Biotechnol., 18, 750, 2000. 73. Gough, J. A. and Murray, N. E. Sequence diversity among related genes for recognition of specific targets in DNA molecules. J. Mol. Biol., 166, 1, 1983. 74. Tabata, K. et al. Production of UDP-N-acetylglucosamine by coupling metabolically engineered bacteria. Biotechnol. Lett., 22, 479, 2000. 75. Zhang, J. et al. Large-scale synthesis of carbohydrates for pharmaceutical development. Curr. Org. Chem., 5, 1169, 2001. 76. Bullock, W. O., Fernandez, J. 
M., and Short, J. M. XL1-Blue: A high efficiency plasmid transforming recA Escherichia coli strain with beta-galactosidase selection. BioTechniques, 5, 376, 1987. 77. Asako, H. et al. Organic solvent tolerance and antibiotic resistance increased by overexpression of marA in Escherichia coli. Appl. Environ. Microbiol., 63, 1428, 1997. 78. Johansson, L. et al. Shikimic acid production by a modified strain of E. coli (W3110.shik1) under phosphate limited and carbon limited conditions. Biotechnol. Bioeng., 92, 541, 2005. 79. Casadaban, M. J. and Cohen, S. N. Analysis of gene control signals by DNA fusion and cloning in Escherichia coli. J. Mol. Biol., 138, 179, 1980. 80. Grant, S. G. N. et al. Differential plasmid rescue from transgenic mouse DNAs into Escherichia coli methylation-restriction mutants. Proc. Natl. Acad. Sci. USA, 87, 4645, 1990. 81. Matthews, P. D. and Wurtzel, E. T. Metabolic engineering of carotenoid accumulation in Escherichia coli by modulation of the isoprenoid precursor pool with expression of deoxyxylulose phosphate synthase. Appl. Microbiol. Biotechnol., 53, 396, 2000. 82. Kaup, B., Bringer-Meyer, S., and Sahm, H. Metabolic engineering of Escherichia coli: construction of an efficient biocatalyst ford -mannitol formation in a whole-cell biotransformation. Appl. Microbiol. Biotechnol., 64, 333, 2004. 83. Ringenberg, M., Lichtensteiger, C., and Vimr, E. Redirection of sialic acid metabolism in genetically engineered Escherichia coli. Glycobiology, 11, 533, 2001. 84. Chang, T.-S. et al. High-level expression of a lacZ gene from a bacterial artificial chromosome in Escherichia coli. Appl. Microbiol. Biotechnol., 61, 234, 2003. 85. Leonard, E., Yan, Y., and Koffas, M. A. G. Functional expression of a P450 flavonoid hydroxylase for the biosynthesis of plant-specific hydroxylated flavonols in Escherichia coli Met. Eng. 8, 172, 2006. 86. Boyer, H. W. and Roulland-Dussoix, D. 
A complementation analysis of the restriction and modification of DNA in Escherichia coli. J. Mol. Biol., 41, 459, 1969.
87. Umeno, D. and Arnold, F. H. Evolution of a pathway to novel long-chain carotenoids. J. Bacteriol., 186, 1531, 2004. 88. Novy, R. et al. Overcoming the codon bias of E. coli for enhanced protein expression. inNovations 12, 1, 2001. 89. Stenström, C. M. et al. Codon bias at the 3´-side of the initiation codon is correlated with translation initiation efficiency in Escherichia coli. Gene, 263, 273, 2001. 90. Kurland, C. and Gallant, J. Errors of heterologous protein expression. Curr. Opinion Biotechnol., 7, 489, 1996. 91. Carrier, T., Jones, K. L., and Keasling, J. D. mRNA stability and plasmid copy number effects on gene expression from an inducible promoter system. Biotechnol. Bioeng., 59, 666, 2000. 92. Blattner, F. R. et al. The complete genome sequence of Escherichia coli K-12. Science, 277, 1453, 1997. 93. Abedon, S. T. The murky origin of Snow white and her T-even dwarfs. Genetics, 155, 481, 2000. 94. Khosla, C. and Bailey, J. E. Evidence for partial export of Vitreoscilla hemoglobin into the periplasmic space in Escherichia coli: implications for protein function. J. Mol. Biol. 210, 79, 1989. 95. Roos, V., Andersson, C. I. J., and Bulow, L. Gene expression profiling of Escherichia coli expressing double Vitreoscilla haemoglobin. J. Biotechnol., 114, 107, 2004. 96. Wang, D. I. C. et al. Fermentation and Enzyme Technology. John Wiley & Sons, New York, 1979. 97. Pirt, S. J. Principles of Microbe and Cell Cultivation. Blackwell Scientific Publications, Oxford, 1975. 98. Nielsen, J., Villadsen, J., and Lidén, G. Bioreaction Engineering Principles. Kluwer Academic/Plenum Publishers, New York, 2003. 99. Märkl, H. et al. Cultivation of Escherichia coli to high cell densities in a dialysis reactor. Appl. Microbiol. Biotechnol., 39, 48, 1993. 100. Åkesson, M. et al. Acetate formation and dissolved oxygen responses to feed transients in Escherichia coli cultivations. Biotechnol. Bioeng,, 64, 590, 1999. 101. Svensson, M., Svensson, I., and Enfors, S.-O. 
Osmotic stability of the cell membrane of Escherichia coli from a temperature limited fed-batch. Appl. Microbiol. Biotechnol., 67, 345, 2005. 102. de Mare, L. et al. A feeding strategy for E. coli fed-batch cultivations operating close to the maximum oxygen transfer capacity of the reactor. Biotechnol. Lett., 27, 983, 2005. 103. Svensson, M. et al. Control of endotoxin release in E. coli fed-batch cultures. Bioprocess Biosyst. Eng., 27, 91, 2005. 104. Blackwell, J. R. and Horgan, R. A novel strategy for production of a highly expressed recombinant protein in an active form. FEBS Lett., 295, 10, 1991. 105. Barth, S. et al. Compatible - solute -supported periplasmic expression of functional recombinant proteins under stress conditions. Appl. Environ. Microbiol., 66, 1572, 2000. 106. Ramchuran, S. O., Holst, O., and Nordberg Karlsson, E. Effect of postinduction nutrient feed composition and use of lactose as inducer during production of thermostable xylanase in Escherichia coli glucose-limited fed-batch cultivations. J. Biosci. Bioeng., 99, 477, 2005. 107. Han, L. et al. Effect of glycine on the cell yield and growth rate of Escherichia coli: evidence for cell-density-dependent glycine degradation as determined by 13C NMR spectroscopy. J. Biotechnol., 92:237, 2002. 108. Leuchtenberger, W., Huthmacher, K., and Drauz, K. Biotechnological production of amino acids and derivatives: current status and prospects Appl. Microbiol. Biotechnol., 69, 1, 2005. 109. Krämer, M. et al. Metabolic engineering for microbial production of shikimic acid. Met. Eng. 5, 277, 2003. 110. Valle, F. and Berry, A. Metabolic engineering of Escherichia coli for the production of aromatic compounds. In: Metabolic Engineering, Lee, S. Y., and Papoutsakis, E. T. Eds. Marcel Dekker, Inc., New York, 79, 1999. 111. Frost, J. W. and Draths, K. M. Biocatalytic syntheses of aromatics from D-glucose: renewable microbial sources of aromatic compounds. Ann. Rev. Microbiol., 49, 557, 1995.
112. Pittard, A. J. Biosynthesis of the aromatic amino acids. In: Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edn, Neidhardt, F. C., Ed. ASM Press, Washington, DC, 1999. 113. Patnaik, R. and Liao, J. C. Engineering of Escherichia coli central metabolism for aromatic metabolite production with near theoretical yield. Appl. Environ. Microbiol., 60, 3903, 1994. 114. Schmitz, M. et al. Pulse experiments as a prerequisite for the quantification of in vivo enzyme kinetics in aromatic amino acid pathway of Escherichia coli. Biotechnol. Prog., 18, 935, 2002. 115. Frost, J. W., Frost, K. M., and Knop, D. R. Biocatalytic synthesis of shikimic acid in genetically engineered Escherichia coli. In: PCT Int. Appl. 27 pp. (Board of Trustees Operating Michigan State University), Wo, 2000. 116. Chen, R. et al. Metabolic consequences of phosphotransferase (PTS) mutation in a phenylalanineproducing recombinant Escherichia coli Biotechnol. Prog., 13, 768, 1997. 117. Jossek, R., Bongaerts, J., and Sprenger, G. A. Characterization of a new feedback resistant 3-deoxyD-arabino-heptulosonate 7-phophate synthase AroF of Escherichia coli. FEMS Microbiol. Lett., 202, 145, 2001. 118. Josephson, B. L. and Fraenkel, D. G. Transketolase mutants of Escherichia coli. J. Bacteriol., 100, 1289, 1969. 119. Josephson, B. L. and Fraenkel, D. G. Sugar metabolism in transketolase mutants of Escherichia coli. J. Bacteriol. 118, 1082, 1974. 120. Sprenger, G. A. Genetics of pentose-phosphate pathway enzymes of Escherichia coli K-12. Arch. Microbiol., 164, 324, 1995. 121. Draths, K. M. et al. Biocatalytic synthesis of aromatics from D-glucose: the role of transketolase. J. Am. Chem. Soc. 114, 3956, 1992. 122. Karp, P., Riley, M., and Paley, S. Electronic encyclopedia of E. coli genes and metabolism. Nucl. Acids Res., 27, 55, 1999. 123. Patnaik R., Spitzer, R. G., and Liao, J. C. 
Pathway engineering for production of aromatics in Escherichia coli: Confirmation of stoichiometric analysis by independent modulation of AroG, TktA, and Pps activities. Biotechnol. Bioeng., 46, 361, 1995. 124. Flores, N. et al. Pathway engineering for the production of aromatic compounds in Escherichia coli. Nature Biotechnol., 14, 620, 1996. 125. Li, K. and Frost, J. W. Utilizing succinic acid as a glucose adjunct in fed-batch fermentation: is butane a feedstock option in microbe-catalyzed synthesis? J. Am. Chem. Soc., 121, 9461, 1999. 126. Johansson, L. Metabolic analysis of shikimic acid producing Escherichia coli. Ph.D. thesis, Lund University, Sweden, 2006. 127. Dell, K. A. and Frost, J. W. Identification and removal of impediments to biocatalytic synthesis of aromatics from D-glucose: rate-limiting enzymes in the common pathway of aromatic amino acid biosynthesis. J. Am. Chem. Soc. 115, 11581, 1993. 128. Oldiges, M. et al. Stimulation, monitoring, and analysis of pathway dynamics by metabolic profiling in the aromatic amino acid pathway. Biotechnol. Prog., 20, 1623, 2004. 129. Yi, J., Li, K., Draths, K. M., and Frost, J. W. Modulation of phosphoenolpyruvate synthase expression increases shikimate pathway product yields in E. coli. Biotechnol. Prog., 18, 1141, 2002. 130. Norbeck J. et al. Purification and characterization of two isoenzymes of DL-glycerol-3-phosphatase from Saccharomyces cerevisiae. J. Biol. Chem., 271, 13875, 1996. 131. Emptage M. et al. Process for the biological production of 1,3-propanediol with high titer. United States Patent, No 6, 514, 733, 2003.
22 Metabolic Engineering in Yeast

Maurizio Bettiga, Marie F. Gorwa-Grauslund, and Bärbel Hahn-Hägerdal
Lund University

22.1 Introduction
22.2 Extension of Substrate Range
     Cellobiose • Cellulose • Starch • Pentose Sugars • Xylan • Galactose • Lactose • Melibiose
22.3 Metabolites Yield and Productivity
     Redirection of Carbon and Redox Equivalents to Glycerol Production • Modulation of Ethanol Production • Succinic Acid • Diacetyl
22.4 Extended Product Range
     Lactic Acid • Terpenoids • Other Compounds • Heterologous Proteins
22.5 Improved Cellular Properties
22.6 Yeast as Biocatalyst
     L-Malic Acid Degradation • Xylitol Production • Stereoselective Bioreduction
22.7 Concluding Remarks
References
22.1 Introduction

Since the term metabolic engineering was first introduced to define a new engineering sub-discipline [1], its development in yeast has been almost synonymous with metabolic engineering in baker’s yeast, Saccharomyces cerevisiae. S. cerevisiae is not only the oldest industrially exploited microorganism, it is also the industrial microorganism used at the largest scale, in facilities of up to 6000 m³, representing annual sales of billions of euros. Traditionally, S. cerevisiae has been used for bread leavening and the production of beer, wine, distilled spirits, and industrial ethanol [2]. Recently the focus has been on the production of fuel ethanol from renewable resources [3]. Like the intestinal bacterium Escherichia coli (see Chapter 21), S. cerevisiae is an integral part of humanity and an organism for which considerable knowledge was acquired already in the pregenomic era [4,5]. Molecular tools for cloning and expression of S. cerevisiae genes were developed in the 1970s and 1980s [6–10]. Similarly, the expression of heterologous genes in S. cerevisiae was explored early [11–13]. The sequencing of the S. cerevisiae genome [14,15] helped the development of novel metabolic engineering strategies. The metabolic engineering sub-discipline has lately been extended into a systems biology sub-discipline [16] as well as a systems biotechnology sub-discipline [17], which aim to integrate information acquired in high-throughput analysis at different cellular levels, such as the genome, transcriptome, proteome, metabolome, and fluxome, to identify metabolic functions to target for improved metabolic performance. For baker’s yeast S. cerevisiae the
systems biology approach has lately provided new knowledge about regulatory functions governing metabolite production [18–21]. The widespread use and industrial exploitation of S. cerevisiae have provided a solid knowledge base for large-scale applications other than the production of baker’s yeast and ethanol. Recombinant strains of S. cerevisiae have been developed for the production of pharmaceuticals [22–25], commodity and fine chemicals, and materials [26,27]. The experience generated through the development of novel strains of S. cerevisiae has also been translated to other yeast species that are traditionally called “nonconventional yeasts” (NCY) [28,29]. For certain industrial applications these yeasts have favorable characteristics, such as their pH, temperature, and other environmental requirements [30–33]. Presently, the genomes of Schizosaccharomyces pombe [34], Yarrowia lipolytica, Kluyveromyces lactis, Debaryomyces hansenii, Candida glabrata [35], and Pichia stipitis [36] are available, as is an increasingly complete toolbox of molecular biology techniques [32,37–39], which will no doubt accelerate the development of metabolic engineering in these yeast species.

In addition to generating novel strains with improved cellular properties, metabolic engineering has also greatly enhanced our knowledge of cellular functions. The early design of metabolic engineering strategies in yeast was based on the biochemistry of enzyme regulation and proposed metabolic pathways [40,41]. When glycolytic enzymes known to be subject to stringent allosteric regulation were overexpressed in S. cerevisiae with the aim of improving the rate of glycolysis, the yeast cell failed to respond [40], showing that glycolytic flux in S. cerevisiae is tightly regulated at levels other than the allosteric enzyme level [42,43]. Similarly, an amino acid based strategy aiming to construct a brewing strain producing less diacetyl generated knowledge about the transport of amino acids in S. cerevisiae [44,45].

In the following, metabolic engineering strategies in yeast are discussed in relation to (i) extended substrate range; (ii) increased/decreased metabolite yield and productivity; (iii) extended product range; (iv) improved cellular properties; and (v) yeast as biocatalyst. As will be clear from the text and tables, a strong focus has been put on S. cerevisiae. The yeast metabolic engineering examples discussed in the text are summarized in chronological order in 14 tables. The different contributions are grouped according to the goal (e.g., “xylan utilization” or “lactic acid production”). The column designated “Activities” or “Target genes” reports the gene(s) that have been expressed or deleted. S. cerevisiae genes are simply reported according to the conventional nomenclature. Footnotes indicate figures that have been recalculated, while a dash is reported when no data were available or calculation was not possible.
22.2 Extension of Substrate Range

In this part, examples of yeast metabolic engineering aimed at extending or improving substrate utilization are reported. The substrate is a sugar, often in polymeric form (Figure 22.1), that is to be used by yeast as a carbon source. The vast majority of the carbon fixed by plant photosynthesis is available in the form of the sugar polymers constituting cellulose, hemicellulose, and starch [46]. Cellulose and hemicellulose are structural components, while starch is a storage carbohydrate [47,48]. The three polymers differ in the composition of sugars, the type of bond linking the monomers, and the degree of branching. While cellulose and starch are composed only of D-glucose, hemicellulose is a more heterogeneous assembly of polysaccharides comprising D-xylose, L-arabinose, D-mannose, D-glucose, D-galactose, D-glucuronic acid, and D-galacturonic acid [47]. In order to be used by microorganisms as a carbon source, sugar polymer chains need to be depolymerized to yield their main monomeric building blocks [3,49,50]. In the context of so-called consolidated bioprocessing (CBP), the microorganism should express hydrolytic enzymes, so that polysaccharide hydrolysis and sugar monomer utilization take place simultaneously, with potential benefit for the process economy [48,51]. Consequently, metabolic engineering strategies have aimed at endowing yeast with cellulolytic, xylanolytic, and amylolytic capacities, with a strong focus on S. cerevisiae and ethanol production.

[Figure 22.1: diagram of the substrates considered: polymers (cellulose, hemicellulose/xylan, starch), trisaccharide (raffinose), disaccharides (cellobiose, melibiose, lactose), and monosaccharides (arabinose, xylose, mannose, fructose, glucose, galactose).]

Figure 22.1 Extended substrate range. Solid arrow indicates sugars rapidly metabolized by S. cerevisiae; striped arrow indicates that galactose is metabolized but utilization can be improved by genetic engineering; empty arrow indicates that genetic modification is a requirement for pentose sugar utilization.
22.2.1 Cellobiose

An extensive overview of the research performed up to 2002 to endow yeast with cellulolytic and amylolytic properties has been given by L.R. Lynd and colleagues [48]. Therefore, only the most recent and successful examples of new yeast strains engineered for cellobiose, cellulose, and starch utilization are reported in this and the next two sections. Cellobiose is a β-1,4-linked glucose dimer that is not metabolized by S. cerevisiae. Consequently, metabolic engineering approaches for cellobiose utilization have focused on the expression of the enzyme β-glucosidase, which can cleave the β-1,4 bond and liberate two glucose molecules. Different fungal β-glucosidases have been expressed in S. cerevisiae (Table 22.1) [52,53]. While in all cases the enzyme was secreted with the help of a signal peptide, the strategies differed in whether or not the secreted protein was anchored to the cell [54,55]. Anchoring is sometimes referred to as cell surface engineering [56]. Saccharomycopsis fibuligera β-glucosidase, anchored to the cell wall via fusion with the C-terminus of either α-agglutinin or cell wall protein 2, allowed a higher cellobiose hydrolysis rate than the unanchored enzyme and therefore a higher aerobic biomass yield, equal to 0.42 g/g total sugars (Table 22.1) [53]. The improvement in strain performance was suggested to result from the positive effect of the anchoring peptide on protein secretion and folding, rather than from an effect of the anchor on the kinetic properties of the enzyme per se. The same study showed that Aspergillus kawachii β-glucosidase expressed from analogous constructs also benefited from fusion with a cell wall anchor, even though the overall performance of the strain was inferior. Cell surface engineering with a fungal β-glucosidase has been used in an S. cerevisiae strain engineered for xylose utilization, enabling it to metabolize the xylose and cellooligosaccharides present in prehydrolyzed lignocellulosic substrate (Table 22.1) [52]. The specific ethanol productivity from cellobiose was 25% of that from glucose, indicating that the hydrolytic activity was still a limiting factor.
Table 22.1 Cellobiose

Strain: Y294
Activity(ies): β-glucosidase (BGL1; Saccharomycopsis fibuligera) fused to Trichoderma reesei xyn2 secretion signal
Genetic engineering approach(a): P
Promoter: PGK1
Cultivation conditions(b): B, aerobic, mineral medium
Scale (l): 0.1
Outcome(c), growth: µ = 0.20 h−1
Outcome(c), yields (g/g total sugars): YX/S = 0.60; YE/S = 0.09
Reference: [53]

Strain: Y294
Activity(ies): β-glucosidase (BGL1; Saccharomycopsis fibuligera) fused to Trichoderma reesei xyn2 secretion signal + S. cerevisiae α-agglutinin cell wall anchor
Genetic engineering approach(a): P
Promoter: PGK1
Cultivation conditions(b): B, aerobic, mineral medium
Scale (l): 0.1
Outcome(c), growth: µ = 0.23 h−1
Outcome(c), yields (g/g total sugars): YX/S = 0.42; YE/S = 0.26
Reference: [53]

Strain: MT8-1
Activity(ies): β-glucosidase (BGL1; Aspergillus aculeatus) fused to Rhizopus oryzae unspecified glucoamylase secretion signal + S. cerevisiae α-agglutinin cell wall anchor
Genetic engineering approach(a): P
Promoter: TDH(−)
Cultivation conditions(b): B, anaerobic, complex medium
Scale (l): 0.1
Outcome(c), growth: (−)(d)
Outcome(c), yields (g/g total sugars): YX/S = ~0.08 (R); YE/S = 0.38
Reference: [52]

(a) P: plasmid vector.
(b) B: batch.
(c) R: recalculated; YX/S: biomass yield; YE/S: ethanol yield.
(d) Growth is reported, but recalculation of growth rate could not be done with sufficient accuracy.
22.2.2 Cellulose

Native cellulose fibers consist of linear β-1,4-linked glucose chains, held together in a tightly compacted, crystalline structure that renders the polymer water-insoluble and recalcitrant to enzymatic hydrolysis [48]. In amorphous cellulose, such as phosphoric acid swollen cellulose (PASC), the crystalline structure has been disrupted and as a result the polymer is more prone to enzymatic hydrolysis [57,58]. Three major types of cellulolytic activities can be found: (i) endoglucanases, cutting randomly inside the polysaccharide chain; (ii) exoglucanases, divided into (iia) cellodextrinases, which liberate glucose monomers from the chains’ extremities, and (iib) cellobiohydrolases, which liberate cellobiose from the chains’ extremities; (iii) β-glucosidases (Section 22.2.1), which liberate glucose monomers from cellooligosaccharides and from cellobiose [48]. Coexpression of an endoglucanase and a β-glucosidase by cell surface engineering was reported to allow growth on 45 g/l barley β-glucan as the sole carbon source (Table 22.2) [59]. However, β-glucans may not be considered strictly related to cellulose, due to their different distribution in nature and their far lower recalcitrance to hydrolysis [48,60]. Using amorphous cellulose, ethanol fermentation could only be achieved by expressing an endoglucanase (Trichoderma reesei), a cellobiohydrolase (T. reesei), and a β-glucosidase (Aspergillus aculeatus), all anchored at the cell wall [57] (Table 22.2), which confirmed that endoglucanases and cellobiohydrolases need to act synergistically in cellulose hydrolysis [61]. No cell growth was possible, but 65% of the initial 10 g/l cellulose was utilized for the production of 2.9 g/l ethanol (Table 22.2). Anaerobic growth on amorphous cellulose was achieved very recently by applying a different strategy, in which endoglucanase and β-glucosidase were not anchored to the cell wall but secreted to the medium [58]. However, production of 0.27 g/l dry cell weight (DCW) and 1 g/l ethanol from hydrolysis of ~27% of the initial 10 g/l cellulose required over 200 hours (Table 22.2). In this study, coexpression led to a decrease in activity, whereas previous reports indicated the inverse effect, although enzyme activity was not determined [57]. This effect may instead be strain dependent, as well as dependent on the expression system, since the phenomenon appears to occur consistently with the experimental setup used by each research group.
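The yields quoted above follow directly from the reported concentrations, and a short calculation makes the difference between a yield on supplied cellulose and a yield on hydrolyzed cellulose explicit. The sketch below is only an illustration: the function name is hypothetical, the water taken up during hydrolysis is ignored, and the theoretical ethanol yield on glucose (two ethanol per glucose) is added from general stoichiometry as a reference point rather than taken from the chapter.

```python
# Theoretical maximum (assumption from stoichiometry, not from the chapter):
# one glucose (180.16 g/mol) gives two ethanol (46.07 g/mol).
THEORETICAL_YIELD = 2 * 46.07 / 180.16  # g ethanol per g glucose


def yields(supplied_g_l, fraction_hydrolyzed, product_g_l):
    """Return (g product per g supplied substrate, g product per g hydrolyzed substrate)."""
    hydrolyzed_g_l = supplied_g_l * fraction_hydrolyzed
    return product_g_l / supplied_g_l, product_g_l / hydrolyzed_g_l


# Values quoted in the text for 10 g/l amorphous cellulose.
anchored_ethanol = yields(10.0, 0.65, 2.9)    # three anchored enzymes [57]
secreted_ethanol = yields(10.0, 0.27, 1.0)    # two secreted enzymes [58]
secreted_biomass = yields(10.0, 0.27, 0.27)   # dry cell weight, same culture [58]

for label, (on_supplied, on_hydrolyzed) in [
    ("ethanol, anchored enzymes", anchored_ethanol),
    ("ethanol, secreted enzymes", secreted_ethanol),
    ("biomass, secreted enzymes", secreted_biomass),
]:
    print(f"{label}: {on_supplied:.3f} g/g supplied, {on_hydrolyzed:.2f} g/g hydrolyzed")
print(f"theoretical ethanol yield on glucose: {THEORETICAL_YIELD:.3f} g/g")
```

The yields on supplied cellulose reproduce the YE/S and YX/S values listed in Table 22.2, while the yields on hydrolyzed cellulose show how much of the released sugar actually ended up in product.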
Table 22.2 Cellulose

Strain: MT8-1
Activity(ies): endoglucanase (EGII; Trichoderma reesei); β-glucosidase (BGL1; Aspergillus aculeatus). The two enzymes are expressed fused to Rhizopus oryzae unspecified glucoamylase secretion signal + S. cerevisiae α-agglutinin cell wall anchor
Genetic engineering approach(a): P, P
Promoter: TDH(−), TDH(−)
Cultivation conditions(b): B, anaerobic, mineral medium, β-glucan 45 g/l
Scale (l): −
Outcome(c), growth: OD600 = 1.4 final
Outcome(c), cellulose hydrolysis: ~78% (R)
Outcome(c), yields (g/g total sugars): YE/S = 0.37 (R)
Reference: [59]

Strain: MT8-1
Activity(ies): endoglucanase (EGII; Trichoderma reesei); cellobiohydrolase (CBHII; Trichoderma reesei); β-glucosidase (BGL1; Aspergillus aculeatus). The three enzymes are expressed fused to Rhizopus oryzae unspecified glucoamylase secretion signal + S. cerevisiae α-agglutinin cell wall anchor
Genetic engineering approach(a): P, P, P
Promoter: TDH(−), TDH(−), TDH(−)
Cultivation conditions(b): B, anaerobic, complex medium, amorphous cellulose 10 g/l
Scale (l): −
Outcome(c), growth: −
Outcome(c), cellulose hydrolysis: ~65% (R)
Outcome(c), yields (g/g total sugars): YE/S = 0.29
Reference: [57]

Strain: Y294
Activity(ies): endoglucanase (EGII; Trichoderma reesei) with native secretion signal; β-glucosidase (BGL1; Saccharomycopsis fibuligera) fused to Trichoderma reesei xyn2 secretion signal
Genetic engineering approach(a): P, P
Promoter: ENO1, PGK1
Cultivation conditions(b): B, anaerobic, complex medium, amorphous cellulose 10 g/l
Scale (l): 0.1
Outcome(c), growth: µ = 0.03 h−1
Outcome(c), cellulose hydrolysis: ~27%
Outcome(c), yields (g/g total sugars): YX/S = ~0.027; YE/S = 0.10
Reference: [58]

(a) P: plasmid vector.
(b) B: batch.
(c) R: recalculated; YX/S: biomass yield; YE/S: ethanol yield.

22.2.3 Starch

Today starch remains the main feedstock for many fermentation processes, including ethanol production in the United States [2,46]. Starch is a storage polymer in which glucose moieties are joined by α-1,4 bonds in linear chains, with branches linked through α-1,6 bonds. Starch-degrading enzymes can be grouped into four major categories: (i) α-amylase, cutting inside the α-1,4 chains; (ii) β-amylase, releasing maltose units from the non-reducing ends of the chains; (iii) glucoamylase, releasing monomeric glucose from the non-reducing ends of the chains; (iv) the debranching enzymes pullulanase and isoamylase, active on α-1,6 side chains. S. cerevisiae is not able to degrade starch, but some strains belonging to Saccharomyces diastaticus do express glucoamylases, encoded by the STA gene family [62–64]. The cell surface engineering concept was also applied to starch utilization (Table 22.3) [55]. The expression of a fungal glucoamylase anchored to the cell wall allowed hydrolysis of ~35 g/l of pretreated starch and released enough glucose to sustain aerobic growth in rich medium and anaerobic ethanol production of more than 20 g/l (Table 22.3) [65]. Coexpression of Lipomyces kononenkoae α-amylases 1 and 2 in S. cerevisiae also enabled the hydrolysis of approximately 16 g/l out of 20 g/l of soluble starch to glucose [66]. No anaerobic growth was reported, but ~6 g/l ethanol was produced in mineral medium (Table 22.3). Finally, direct ethanol production from raw starch was achieved in a high cell density bioconversion, without cellular growth. The strain combined the expression of a glucoamylase and an α-amylase, in this case also anchored to the cell wall [67]. The fraction of hydrolyzed polysaccharide and the ethanol yield on total sugars are, in some cases, comparable for cellulolytic and amylolytic strains (Tables 22.2 and 22.3) [57,66,67]. However, the reported final ethanol titer from raw corn starch reached ~60 g/l with amylolytic strains [67], while cellulolytic strains could produce only 3 g/l [57]. It should be noted that hydrolysis and fermentation of starch were performed with up to 200 g/l of starchy raw material (Table 22.3), whereas the insoluble cellulose raw material was used at a concentration of only 10 g/l (Table 22.2). At present, industrial implementation of cellulolytic and amylolytic yeast strains remains to be seen.

Table 22.3 Starch

Strain: YF207
Activity(ies): glucoamylase (−; Rhizopus oryzae) fused to S. cerevisiae α-agglutinin cell wall anchor
Genetic engineering approach(a): P
Promoter: TDH(−)
Cultivation conditions(b): B, anaerobic, complex medium, soluble starch 50 g/l
Scale (l): 2
Outcome(c), growth: (−)(d)
Outcome(c), starch hydrolysis: ~70% (R)
Outcome(c), yields (g/g total sugars): YE/S = ~0.455 (R)
Reference: [65]

Strain: Σ1278b
Activity(ies): α-amylases (LKA1 and LKA2; Lipomyces kononenkoae)
Genetic engineering approach(a): I
Promoter: PGK1
Cultivation conditions(b): B, anaerobic, mineral medium, soluble starch 20 g/l
Scale (l): 0.15
Outcome(c), growth: −
Outcome(c), starch hydrolysis: ~82%
Outcome(c), yields (g/g total sugars): YE/S = 0.305
Reference: [66]

Strain: YF207
Activity(ies): glucoamylase (−; Rhizopus oryzae) fused to S. cerevisiae α-agglutinin cell wall anchor; α-amylase (amyA; Streptococcus bovis) fused to S. cerevisiae FLO1 flocculation functional domain
Genetic engineering approach(a): P, P
Promoter: TDH(−), TDH(−)
Cultivation conditions(b): B, anaerobic, complex medium, raw starch 200 g/l
Scale (l): 0.05
Outcome(c), growth: −
Outcome(c), starch hydrolysis: ~70% (R)
Outcome(c), yields (g/g total sugars): YE/S = 0.310
Reference: [67]

(a) P: plasmid vector; I: chromosomal integration.
(b) B: batch.
(c) R: recalculated; YE/S: ethanol yield.
(d) Not clear.
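The comparison of titers above can be made concrete by noting that the final ethanol titer is roughly the yield on total sugars multiplied by the substrate loading. The following sketch uses the YE/S values and loadings listed in Tables 22.2 and 22.3 for references [57] and [67]; the helper name and the dictionary layout are illustrative assumptions, not part of the cited studies.

```python
def expected_titer(yield_g_per_g, substrate_g_l):
    """Ethanol titer (g/l) implied by a yield on total sugars and a substrate loading."""
    return yield_g_per_g * substrate_g_l


# Yields on total sugars and substrate loadings as listed in Tables 22.3 and 22.2.
starch = {"yield": 0.310, "loading": 200.0}    # raw starch [67]
cellulose = {"yield": 0.29, "loading": 10.0}   # amorphous cellulose [57]

starch_titer = expected_titer(starch["yield"], starch["loading"])
cellulose_titer = expected_titer(cellulose["yield"], cellulose["loading"])

titer_ratio = starch_titer / cellulose_titer
loading_ratio = starch["loading"] / cellulose["loading"]
yield_ratio = starch["yield"] / cellulose["yield"]

print(f"starch titer: {starch_titer:.1f} g/l, cellulose titer: {cellulose_titer:.1f} g/l")
print(f"titer ratio {titer_ratio:.1f} = loading ratio {loading_ratio:.0f} x yield ratio {yield_ratio:.2f}")
```

Under these numbers the roughly twentyfold gap in titer is explained almost entirely by the twentyfold difference in substrate loading, since the two yields are close to each other.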
22.2.4 Pentose Sugars

Lignocellulosic material is considered a renewable and environmentally friendly feedstock for the production of fuel ethanol. However, for process competitiveness, complete conversion of all the sugars in the feedstock has to be achieved [46]. The pentose sugars xylose and arabinose constitute a significant fraction of hemicellulose [68]. However, S. cerevisiae, the organism of choice for hexose-based ethanol production, is not naturally able to ferment xylose and arabinose, whereas P. stipitis, a xylose-fermenting yeast, offers poorer ethanol production performance. Following the growing interest in obtaining an efficient lignocellulosic ethanol production process, a number of studies have focused on metabolic engineering of these two yeast species in particular. The progress in the field has been extensively covered in several very recent reviews [3,27,36,46,69–71]. Therefore, only the most recent results are reported here.

The thermotolerant yeast Hansenula polymorpha is naturally able to ferment xylose through a pathway based on xylose reductase (Xr) and xylitol dehydrogenase (Xdh) [33]. Xylose is converted by Xr and Xdh to xylulose, which then enters the pentose phosphate pathway after being phosphorylated to xylulose 5-phosphate by xylulokinase (Xk). H. polymorpha Xr and Xdh have been replaced with E. coli xylose isomerase, which directly converts xylose to xylulose [72]. Improvement in ethanol productivity was achieved only after overexpression of H. polymorpha Xk. In particular, ethanol volumetric productivity at 47°C increased from 1.4 mg/l/h for the parent strain to 5.8 mg/l/h for the new recombinant strain.

Aerobic arabinose utilization by S. cerevisiae has been achieved in laboratory strains [73] and implemented in an industrial setting [74]. On the other hand, anaerobic growth on arabinose has only very recently been achieved by introducing the bacterial pathway from Lactobacillus plantarum, consisting of arabinose isomerase, ribulokinase, and L-ribulose 5-phosphate epimerase, on a multicopy plasmid [75]. However, this was achieved only after (i) overexpression of the enzymes belonging to the non-oxidative part of the pentose phosphate pathway and (ii) extensive evolutionary engineering in mineral medium with L-arabinose as the sole carbon source. The haploid laboratory genetic background and the use of plasmids make this strain hardly usable for an industrial process; it rather serves as a useful source of knowledge for reverse metabolic engineering strategies in industrial strains [76]. Transport is considered to be a limiting factor for the efficient fermentation of arabinose in S. cerevisiae. Recently, thanks to a screening in a transport-deficient strain, a gene encoding a bona fide P. stipitis arabinose transporter has been cloned in S. cerevisiae [77]. The existence of a high-affinity and a low-affinity transport system, based on proton symport and facilitated diffusion, respectively, was demonstrated in Candida arabinofermentans and Pichia guilliermondii [78]. In addition, a study of arabinose and xylose utilization as a function of aeration revealed that arabinose metabolism is exclusively respiratory, and that oxygen limitation causes excretion of the metabolic intermediates L-arabitol and xylitol [79]. The fact that sub-optimal carbon utilization also takes place in species that natively utilize pentoses confirms that simple pathway transfer from these strains to, e.g., S. cerevisiae is only a starting point, and that subsequent optimization is needed. The intense research of recent years on pentose sugar metabolism has not only generated new engineered strains, but has also increased the knowledge available to the community, for example through the completion of the P. stipitis genome sequence [36].
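The pentose routes described above can be viewed as a small directed graph of metabolites connected by enzymes, which makes it easy to see that the isomerase route to xylulose 5-phosphate is one step shorter than the reductase/dehydrogenase route. The sketch below is an illustration only: the enzyme names come from the text, while the intermediates xylitol, L-ribulose, and L-ribulose 5-phosphate, and D-xylulose 5-phosphate as the product of the epimerase, are filled in from general biochemistry as assumptions. The names `PATHWAYS` and `routes` are hypothetical.

```python
# Each metabolite maps to a list of (enzyme, product) steps.
PATHWAYS = {
    "D-xylose": [
        ("xylose reductase (Xr)", "xylitol"),
        ("xylose isomerase", "D-xylulose"),
    ],
    "xylitol": [("xylitol dehydrogenase (Xdh)", "D-xylulose")],
    "D-xylulose": [("xylulokinase (Xk)", "D-xylulose 5-phosphate")],
    "L-arabinose": [("arabinose isomerase", "L-ribulose")],
    "L-ribulose": [("ribulokinase", "L-ribulose 5-phosphate")],
    "L-ribulose 5-phosphate": [
        ("L-ribulose 5-phosphate epimerase", "D-xylulose 5-phosphate"),
    ],
}


def routes(start, goal, graph=PATHWAYS):
    """List every enzyme sequence leading from start to goal (the graph is acyclic)."""
    if start == goal:
        return [[]]
    found = []
    for enzyme, product in graph.get(start, []):
        for tail in routes(product, goal, graph):
            found.append([enzyme] + tail)
    return found


xylose_routes = routes("D-xylose", "D-xylulose 5-phosphate")
arabinose_routes = routes("L-arabinose", "D-xylulose 5-phosphate")

for route in xylose_routes + arabinose_routes:
    print(" -> ".join(route))

# Fold change in volumetric ethanol productivity quoted for H. polymorpha at 47 °C.
fold_change = 5.8 / 1.4
print(f"productivity fold change: {fold_change:.1f}")
```

Both xylose routes share xylulokinase as their final step, which is consistent with the observation that productivity improved only once Xk was overexpressed.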
22.2.5 Xylan
Xylan is one of the major components of hemicellulose in hardwoods and agricultural crops, and it is the second most abundant source of fixed carbon in nature [68].
Table 22.4 Xylan

Reference: [82]
Strain: S. cerevisiae Y294
Activity(ies): β-xylanase (xyn2; Trichoderma reesei); β-xylosidase (xynB; Bacillus pumilus) fused to mating factor α1 secretion signal
Genetic engineering approach (a): P (both genes)
Promoter: ADH2 (both genes)
Cultivation conditions (b): Bioconversion in buffer
Scale (l): 0.01
Outcome (c): Xylan hydrolysis: 4.2% (R) (d); rate: ~0.05 g/l/h (R) (d)

Reference: [83]
Strain: S. cerevisiae Y294
Activity(ies): β-xylanase (xyn2; Trichoderma reesei); β-xylosidase (xlnD; Aspergillus niger) fused to mating factor α1 secretion signal
Genetic engineering approach (a): P (both genes)
Promoter: ADH2 (both genes)
Cultivation conditions (b): B, aerobic, complex medium
Scale (l): 0.1
Outcome (c): Xylan hydrolysis: 47.0% (R) (d); rate: ~0.17 g/l/h (R) (d)

Reference: [84]
Strain: Pichia stipitis TJ26
Activity(ies): β-xylanase (xynC; Aspergillus kawachii); β-xylosidase (xlnD; Aspergillus niger) fused to mating factor α1 secretion signal
Genetic engineering approach (a): P (both genes)
Promoters: XYL1 (P. stipitis); TKL(−)
Cultivation conditions (b): B, aerobic, mineral medium
Scale (l): 0.05
Outcome (c): Growth: final CFU/ml = 1.02 × 10^8

Reference: [85]
Strain: S. cerevisiae MT8-1
Activity(ies): β-xylanase (xyn2; Trichoderma reesei) fused to Rhizopus oryzae glucoamylase secretion signal (unspecified) + S. cerevisiae α-agglutinin cell wall anchor; β-xylosidase (xylA; Aspergillus oryzae) fused to Rhizopus oryzae glucoamylase secretion signal (unspecified) + S. cerevisiae α-agglutinin cell wall anchor; xylose reductase (XYL1; Pichia stipitis); xylitol dehydrogenase (XYL2; Pichia stipitis); XKS1
Genetic engineering approach (a): P (all five genes)
Promoter: TDH(−) (all five genes)
Cultivation conditions (b): B, anaerobic, complex medium
Scale (l): 0.01
Outcome (c): Xylan hydrolysis: 23.7% (R); ethanol production YE/S: 0.071 g/g xylan (R)

(a) P: plasmid vector. (b) B: batch. (c) R: recalculated; YE/S: ethanol yield. (d) Calculated from the amount of monomeric D-xylose released from 50 g/l xylan.
Xylan is a complex water-soluble polysaccharide, mainly constituted by a backbone of β-1,4-linked xylose moieties, which are partially substituted with acetyl, arabinosyl and glucuronosyl residues [47]. Metabolic engineering strategies of yeast for xylan degradation have been developed with a CBP perspective [51], in which lignocellulosic biomass would be the feedstock for bioethanol production. Strains that efficiently utilize xylose and arabinose are a prerequisite for the CBP concept (Section 22.2.4) [3,80]. Two main enzymatic activities are needed for complete xylan degradation: (i) endo-1,4-β-xylanase (for simplicity designated “xylanase”), which cleaves the glycosidic bonds inside the xylan backbone, and (ii) β-D-xylosidase (for simplicity designated “xylosidase”), which releases xylose monomers from xylooligosaccharides or xylobiose [47,81]. The most successful examples of xylan degradation by recombinant yeasts entailed coexpression of both activities (Table 22.4) [82–85].
T. reesei xylanase with its own secretion signal was first coexpressed with the bacterial xylosidase from Bacillus pumilus, which was fused to the S. cerevisiae mating factor α1 secretion signal [82]. The two genes were cloned in tandem on the same multicopy vector, whose stability in nonselective medium was ensured by the autoselective fur1 genetic background of the strain [86]. Incubation of the recombinant strain in the presence of 50 g/l birchwood xylan led to the release of xylooligosaccharides and xylobiose, which are known to inhibit xylanase [47], but only small amounts of xylose. Moreover, an isogenic strain expressing only the T. reesei xylanase yielded approximately the same hydrolysis pattern, indicating that the bacterial xylosidase had very little effect. A ten-fold improvement was achieved by replacing the bacterial xylosidase with the one from Aspergillus niger, resulting in 47% of the xylan being hydrolyzed by the recombinant strain [83].
When cell surface engineering [54,56] was used to express fungal xylanase and xylosidase anchored to the cell wall in a laboratory S. cerevisiae strain harboring the P. stipitis xylose utilization pathway, the resulting strain was able to directly convert xylan to ethanol (Table 22.4) [85]. However, xylan hydrolysis was significantly lower than achieved elsewhere [83]. The high degree of xylan substitution may limit the accessibility of the β-1,4 backbone to the xylanolytic enzymes [47,87]. It has been suggested that the strain could be further improved by coexpression of critical xylanolytic enzymes, e.g., α-L-arabinofuranosidase, which has already been expressed in S. cerevisiae [88]. In P. stipitis, enhanced xylanolytic activity was achieved by coexpressing fungal xylanase and xylosidase, which allowed a laboratory strain to grow to a five-fold higher cell density than the untransformed control [84]. Xylan hydrolysis was not quantified, but the cell counts achieved by strains expressing either of the two proteins individually suggest that the two enzymes acted synergistically.
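The hydrolysis figures quoted in this section can be related to one another with a few lines of arithmetic. The following is a minimal sketch and is not taken from the cited studies: the function names are hypothetical, and the hydration factor (molar mass of xylose over that of an anhydroxylose backbone unit) is an assumption about how released xylose might be converted into hydrolyzed xylan; substituents on the backbone are ignored.

```python
# Illustrative arithmetic only; names and the hydration factor are assumptions.
XYLOSE_MW = 150.13         # g/mol, D-xylose
ANHYDROXYLOSE_MW = 132.11  # g/mol, one xylose unit within the xylan backbone


def percent_hydrolysis(xylose_released_g_l: float, xylan_g_l: float = 50.0) -> float:
    """Percent of xylan hydrolyzed, estimated from monomeric xylose released (g/l)."""
    # Xylose obtainable if every backbone unit were released as a monomer.
    max_xylose_g_l = xylan_g_l * XYLOSE_MW / ANHYDROXYLOSE_MW
    return 100.0 * xylose_released_g_l / max_xylose_g_l


def fold_improvement(before: float, after: float) -> float:
    """Ratio between two hydrolysis percentages (or any two positive values)."""
    return after / before


# Hydrolysis percentages reported for the B. pumilus and A. niger xylosidases.
print(round(fold_improvement(4.2, 47.0), 1))
```

With the reported values of 4.2% and 47.0%, the ratio is about 11, consistent with the ten-fold improvement described in the text.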
22.2.6 Galactose
Galactose is metabolized by S. cerevisiae through the Leloir pathway with a flux that is several times lower than the glucose flux [89]. Improved galactose utilization is relevant for industrial processes based on galactose-rich feedstocks, such as cheese whey (Section 22.2.7 and Table 22.6), beet molasses (Section 22.2.8 and Table 22.7) and softwood lignocellulosic biomass (Sections 22.2.4 and 22.2.5, and Table 22.4).
The Leloir pathway is tightly regulated at the transcriptional level by glucose repression and galactose induction [90]. Several proteins are involved in the regulatory network, the four most relevant being Gal4p, Gal6p, Gal80p and Mig1p [90,91]. Gal4p is needed for transcriptional activation, whereas Gal6p exerts negative control on the network. Gal80p may also be considered a negative regulator of the network, since it is responsible for Gal4p inhibition in the absence of galactose; finally, Mig1p is a third negative regulator, since it mediates glucose repression.
Two different strategies, both aiming at increasing the flux through the galactose pathway, have been pursued in S. cerevisiae (Table 22.5). One strategy, which aimed at increasing the pathway flux by engineering the regulatory network [91], was based on the hypothesis that the levels of the pathway enzymes could be increased by suppressing negative
regulators of the pathway, or by enhancing transcriptional activators. Deletion of the three repressors GAL6, GAL80, and MIG1 and overexpression of the activator GAL4 on a multicopy plasmid indeed successfully enhanced the flux through the pathway [91]. The growth rate on galactose as sole carbon source, which is less than one half of the growth rate on glucose, was not affected by the increased carbon uptake. Rather, the extra carbon flux was channeled to ethanol production, consistent with the respiro-fermentative propensity of S. cerevisiae [92].
The second strategy was developed through reverse metabolic engineering [76,91,93]. Reverse metabolic engineering relies on the analysis of different mutants in order to identify targets for new metabolic engineering strategies [76]. The transcriptional profiles of two strains harboring an engineered regulatory network were compared with that of a reference strain [93]. The analysis showed an upregulation of PGM2, encoding phosphoglucomutase, the last enzyme of the Leloir pathway. When PGM2 was overexpressed on a multicopy plasmid, the galactose uptake rate was indeed significantly increased, which in this case was also reflected by a higher growth rate on galactose as sole carbon source.

Table 22.5 Galactose

Reference: [91]
Strain: CEN.PK
Target gene(s): GAL6, GAL80, MIG1
Genetic engineering approach (a): ∆ (all three genes)
Promoter: −
Cultivation conditions (b): B, aerobic, mineral medium
Scale (l): 4
Growth rate (h−1): parent strain 0.13; engineered strain 0.13
Max specific galactose uptake rate (mmol/gCDW/h): parent strain 3.01; engineered strain 4.25
YE/S (mmol/g) (c): parent strain 5.00; engineered strain 8.48

Reference: [91]
Strain: CEN.PK
Target gene(s): GAL4
Genetic engineering approach (a): P
Promoter: Native
Cultivation conditions (b): B, aerobic, mineral medium
Scale (l): 4
Growth rate (h−1): parent strain 0.13; engineered strain 0.13
Max specific galactose uptake rate (mmol/gCDW/h): parent strain 3.01; engineered strain 3.80
YE/S (mmol/g) (c): parent strain 5.00; engineered strain 7.83

Reference: [93]
Strain: CEN.PK
Target gene(s): PGM2
Genetic engineering approach (a): P
Promoter: PMA1
Cultivation conditions (b): B, aerobic, mineral medium
Scale (l): 2–4
Growth rate (h−1): parent strain 0.17; engineered strain 0.23
Max specific galactose uptake rate (mmol/gCDW/h): parent strain 3.33; engineered strain 5.78
YE/S (mmol/g) (c): parent strain 3.91; engineered strain 6.30

(a) P: plasmid vector; ∆: gene deletion. (b) B: batch. (c) R: recalculated; YE/S: ethanol yield.
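To put the rates listed in Table 22.5 into perspective, the relative change in galactose uptake and the doubling time implied by each growth rate can be computed directly. This is a minimal sketch; the helper names are hypothetical, and the doubling time assumes simple exponential growth (doubling time = ln 2 / µ).

```python
import math


def percent_change(parent: float, engineered: float) -> float:
    """Relative change of the engineered strain versus its parent, in percent."""
    return 100.0 * (engineered - parent) / parent


def doubling_time_h(mu_per_h: float) -> float:
    """Doubling time (h) for exponential growth at specific growth rate mu (1/h)."""
    return math.log(2) / mu_per_h


# (parent, engineered) maximum specific galactose uptake rates, mmol/gCDW/h (Table 22.5)
uptake_rates = [(3.01, 3.80), (3.01, 4.25), (3.33, 5.78)]
for parent, engineered in uptake_rates:
    print(f"{parent} -> {engineered}: {percent_change(parent, engineered):+.0f}%")

# Specific growth rates on galactose (1/h) listed in Table 22.5
for mu in (0.13, 0.17, 0.23):
    print(f"mu = {mu} 1/h, doubling time = {doubling_time_h(mu):.1f} h")
```

The uptake rates correspond to increases of roughly 26%, 41% and 74%, whereas only the change from 0.17 to 0.23 h−1 shortens the doubling time (from about 4.1 h to about 3.0 h).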
22.2.7 Lactose
Lactose is a disaccharide of galactose and glucose linked through a β-1,4 bond, and it is the most abundant sugar in milk and cheese whey. It is a potential carbon source for fermentation processes aiming at the production of biomass and biomass-associated products, or of fuels such as ethanol and biogas [94]. Using the lactose contained in waste cheese whey as carbon source for fermentative production processes would have the double advantage of exploiting a relatively cheap and regional feedstock and of disposing of cheese whey in an environmentally friendly way; whey is otherwise very challenging for wastewater treatment because of its biological and chemical oxygen demand [95].
The enzyme β-galactosidase is needed to release monomeric glucose and galactose, which can then enter cellular carbon metabolism. Yeasts belonging to the Kluyveromyces genus naturally possess β-galactosidase-encoding genes. However, this activity needs to be introduced into S. cerevisiae, which is the most widely used yeast in industry. Therefore, metabolic engineering for lactose utilization mainly involves transformation of S. cerevisiae with a β-galactosidase gene (Table 22.6). In some strategies, lactose hydrolysis by β-galactosidase depended on lactose uptake through a heterologously expressed permease [96–99]. In other cases, β-galactosidase was targeted to the extracellular medium either by active secretion [100,101] or by controlled autolysis of a subpopulation of cells [102].
In contrast to the work on galactose (Section 22.2.6) and melibiose (Section 22.2.8), glucose repression of the utilization of galactose released from lactose was not perceived as a major problem; yields on total sugars and monotonic growth suggest partial coconsumption of glucose and galactose in the recombinant strains, but direct experimental evidence is missing.
Limited lactose transport [97–99] or slow glucose release by the secreted β-galactosidase [100,101] have been indicated as sufficient to limit overflow metabolism in the lactose-utilizing strains. Assessment of the stability of plasmid-based strains showed that ~60% [101] and up to 90% [100] of the population had lost the plasmid(s). For strains secreting A. niger β-galactosidase, the residual activity provided by the fraction of cells that still retained the plasmid prolonged the fermentation, but lactose consumption was not complete [100]. However, this is the sole example of plain whey permeate fermentation with an industrial strain, which displayed ethanol yields on consumed sugars of 0.46 g/g. The chromosomal integration of K. lactis β-galactosidase and lactose permease into the rDNA locus generated a stable strain harboring multiple copies of the heterologous genes [97]. However, the fermentation setup used in this investigation did not allow comparison with other strategies. Finally, lactose utilization was combined with amylolytic activity [99,102] and flocculence [98,99]. While the latter may represent an obvious advantage from the process point of view, conditions requiring both lactose-utilizing and amylolytic activity may be encountered less often.
• lactose permease (putative sequence; Kluyveromyces. lactis) • β-galactosidase (LAC4; Kluyveromyces. lactis)
• β-galactosidase (secretable, putative sequence; Aspergillus niger)
• β-galactosidase (LacZ; Escherichia coli) • GAL4
• lactose permease (LAC12; Kluyveromyces lactis) • β-galactosidase (LAC4; Kluyveromyces lactis) • lactose permease (LAC12; Kluyveromyces marxianus) • β-galactosidase (LAC4; Kluyveromyces marxianus) • lactose permease (LAC12; Kluyveromyces lactis) • β-galactosidase (LAC4; Kluyveromyces lactis)
• β-galactosidase (LAC4; Kluyveromyces lactis) fused to: α-factor secretion signal
YNN27
Industr. Mauri’s
X4004
W303-1A and S288C
BJ3505
FB, aerobic, whey + yeast extract
B, aerobic, mineral medium
UASGAL/CYC1 UASGAL/CYC1 UASGAL/CYC1 UASGAL/CYC1
P P I I
B, aerobic, whey + proteases
B, aerobic, whey + yeast extract
UASGAL/CYC1 UASGAL/CYC1 ADH2
P @ I I P
B, aerobic, mineral medium
− −
P
P
B, anaerobic, whey permeate
Native
I
b
a
B, aerobic, mineral medium
Cultivation Conditionsb
ADH1
Native
Promoter
I
Genetic Engineering Approacha
P: plasmid vector; I: chromosomal integration; @: adaptation. B: batch; FB: fed batch. c R: recalculated; YX/S: biomass yield; YE/S: ethanol yield.
S288C
NCYC869-A3
Activity(ies)
Strain
Table 22.6 Lactose
−
200
−
1
2
2
−
Scale(l) −1
µ = 0.23 h−1
CFU/ ml = 3 × 108 final
−
µ = 0.32 h−1
−
−
µ = 0.24 h
Growth
100%
~95%
~98%
100%
100%
YX/S = 0.34(R)
−
YX/S = 0.032 - 0.200(R) YE/S = 0.310 - 0.460(R)
YX/S = 0.415(R) YE/S = 0.225(R)
YE/S = 0.122(R)
YX/S = 0.086 YE/S = 0.1(R)
−
yes
21%
Yields (g/g Total Sugars)
Lactose Uptake
Outcomec
[101]
[99]
[98]
[97]
[102]
[100]
[96]
Reference
22.2.8 Melibiose
Molasses are commonly used as feedstock for ethanol and bakers’ yeast production. Molasses are rich in glucose, sucrose and fructose, but also contain 1–2% (wt/wt) of the trisaccharide raffinose [103]. Only the fructose moiety of raffinose is utilized by distillers’ and bakers’ yeast strains after hydrolysis by invertase (β-fructosidase), an enzyme encoded by SUC2 that is naturally present in all bakers’ and distillers’ yeast strains [104,105]. The remaining disaccharide is melibiose, constituted by galactose and glucose moieties linked through an α-1,6 bond. Melibiose is not utilized by most S. cerevisiae strains due to the lack of melibiase (α-galactosidase) activity [105]. Therefore, new distillers’ and bakers’ yeast strains expressing melibiase would positively affect the process economy, allowing a more complete utilization of the sugars present in molasses and thus higher ethanol or biomass yields from the feedstock. In addition, raffinose depletion from the effluent streams would reduce the biological oxygen demand (BOD) in process wastewater treatment.
Some Saccharomyces strains (e.g., Saccharomyces carlsbergensis or S. cerevisiae var. uvarum), designated as Mel+, do express melibiase, encoded by the polygenic family MEL1–MEL11 [105–107]. Consequently, expression of the MEL1 gene, cloned from a Mel+ strain, has been attempted in bakers’ or distillers’ strains for melibiose utilization (Table 22.7). Integration of one or two copies of MEL1 into industrial bakers’ yeast strains allowed growth on mineral medium with melibiose as sole carbon source and an 8% improvement in biomass yield on molasses [108]. Similarly, some bakers’ yeast strains expressing MEL1 completely consumed melibiose in molasses medium [109]. The strains were also characterized for CO2 evolution in high-sugar synthetic dough and performed similarly to the parental strain.
A class of dominant mutants in ILV2 (SMR) displays resistance to the herbicide sulphometuron methyl (SM) [110]. The dominant marker SMR1 was chosen for the selection of recombinant clones in the first of the two examples [110,111]. Since both MEL1 and SMR1 are genes naturally occurring in S. cerevisiae, dominant selection of the resulting strain is possible without introducing any superfluous antibiotic resistance sequences or other sequences alien to yeast, potentially facilitating its approval and commercialization. The same selection marker has been used by other authors as well [89,108,111–114].
Hydrolysis of melibiose yields glucose and galactose. To further improve raffinose utilization, MEL1 expression has been combined with improved galactose utilization (Section 22.2.6) [89,112]. By combining genetic engineering and classical breeding, diploid industrial strains expressing MEL1 and deleted in MIG1 and GAL80 were generated [112]. The strains metabolized melibiose from the early stages of anaerobic molasses fermentation, with no apparent accumulation of galactose, confirming the successful expression of melibiase and concomitant relief of glucose repression. All sugars were consumed and the ethanol yield was improved by 4% by the end of the fermentation, although transient accumulation of fructose and glucose occurred, indicating a decreased fermentation rate compared with the parental strain. This could be due to the repeated breeding steps, and it demonstrates the importance of screening and characterizing the best hybrids.
MEL1 strains, with or without the deletion of MIG1 and GAL80, were also characterized in aerobic fermentation of mineral medium (Table 22.7) [89]. The MEL1 strain hydrolyzed melibiose completely, albeit at a low specific rate of 0.04 C-mol/g/h, equivalent to ~1.14 g/g/h. Slow release of glucose favored a mainly oxidative metabolism, resulting in high biomass and low ethanol yields.
Galactose consumption started only 2 hours after complete glucose depletion. When the deletion of MIG1 and GAL80 was combined with additional copies of MEL1 [112], a six-fold higher specific melibiose hydrolysis rate (0.24 C-mol/g/h, equivalent to ~6.84 g/g/h) was obtained. High sugar availability translated into the typical respiro-fermentative metabolism of S. cerevisiae, with increased aerobic ethanol yield at the expense of biomass yield. Moreover, thanks to the relief of glucose repression, galactose was partly coutilized with glucose and rapidly consumed after glucose depletion.
Generally, it appears that the design of melibiose-utilizing strains is less problematic than obtaining efficient whey-fermenting strains of S. cerevisiae. The first important advantage is that melibiase is naturally expressed and secreted by some yeast strains, which greatly diminishes the problems connected
(−)d (−)d
PGK1 − − PGK1 − − PGK1
I
I
I ∆ ∆
If ∆ ∆
I
• α−galactosidase (MEL1, S. cerevisiae)
• α−galactosidase; (MEL1, S. cerevisiae)
• α−galactosidase (MEL1, S. cerevisiae) MIG1 GAL80
• α−galactosidase (MEL1, S. cerevisiae) MIG1 GAL80
• α−galactosidase (MEL1, S. cerevisiae)
Ind. DGI342
Ind. DGI342
Ind. DGI342
b
a
I: chromosomal integration; ∆: gene deletion. B: batch; FB: fed batch. c R: recalculated; YX/S: biomass yield; YE/S: ethanol yield. d Not clear. e g extracted proteins/g raffinose. f Additional copies of MEL1 were added. g g melibiose/g cells/h.
Ind. L38
Ind. CT
Promoter
Activity(ies)/Gene(s)
Strain
Genetic Engineering Approacha
Table 22.7 Melibiose
B, aerobic, mineral medium
B, aerobic, mineral medium
FB, anaerobic, Complex medium (beet molasses)
B, aerobic, complex medium (cane molasses)
B, aerobic, mineral medium
Cultivation Conditionsb
4
4
10
−
0.1−0.5
Scale (l)
0.28
0.25
−
−
0.25
Engineered Strain
Growth Rate (h−1)
100% Rate = 1.14g
100% Rate = 6.84g
100%
100%
−
Engineered Strain
Melibiose Hydrolysis
−
−
YE/S = 0.48
−
YX/S = 0.015e
YE/S = 0.032(R) YX/S = 0.355(R)
YE/S = 0.266(R) YX/S = 0.056(R)
YE/S = 0.50
−
YX/S = 0.038e
Engineered Strain
Yields (g/g Total Sugars) Parent Strain
Outcomec
[89]
[112]
[109]
[108]
Reference
to substrate import or enzyme secretion, as seen for lactose. Second, molasses are already routinely employed as feedstock for yeast growth and fermentation. Fructose and glucose present in molasses readily sustain growth, allowing yeast cells to express and secrete melibiase. Glucose repression of invertase expression and of galactose metabolism genes could be an obstacle to optimal process performance, but the increasing knowledge of S. cerevisiae regulatory networks should make it possible to overcome it, at least partly.
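The specific melibiose hydrolysis rates in this section are quoted both in C-mol/g/h and in g/g/h. The conversion can be sketched as follows; the function name is hypothetical, and the molar mass of anhydrous melibiose (C12H22O11, 342.30 g/mol) is assumed.

```python
MELIBIOSE_CARBONS = 12   # carbon atoms per molecule, C12H22O11
MELIBIOSE_MW = 342.30    # g/mol, anhydrous melibiose (assumed)


def cmol_rate_to_gram_rate(rate_cmol: float,
                           carbons: int = MELIBIOSE_CARBONS,
                           mw: float = MELIBIOSE_MW) -> float:
    """Convert a specific rate from C-mol/g/h to g substrate/g cells/h."""
    mol_per_g_h = rate_cmol / carbons   # C-mol -> mol of substrate
    return mol_per_g_h * mw             # mol -> g of substrate


for rate in (0.04, 0.24):
    print(f"{rate} C-mol/g/h = {cmol_rate_to_gram_rate(rate):.2f} g/g/h")
```

Under these assumptions the two rates come out at about 1.14 and 6.85 g/g/h, in agreement with the approximate values of ~1.14 and ~6.84 g/g/h given in the text, and their ratio is six.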
22.3 Metabolites Yield and Productivity
22.3.1 Redirection of Carbon and Redox Equivalents to Glycerol Production
Ethanol and glycerol are the main fermentation products of the yeast S. cerevisiae. Conversion of glucose to ethanol is a redox-neutral process, since the NAD+ reduced by glyceraldehyde-3-phosphate (GA3P) dehydrogenase (Tdh) is reoxidized by alcohol dehydrogenase (Adh) (Figure 22.2). However, biomass growth is associated with a net production of NADH [115]. Under anaerobic conditions, generation of glycerol is the sole process allowing yeast cells to reoxidize the extra cytosolic NADH. GPD1 and GPD2 encode glycerol-3-phosphate dehydrogenase (Gpd), responsible for the NADH-dependent reduction of dihydroxyacetone phosphate (DHAP) to glycerol-3-phosphate (G3P), which is eventually dephosphorylated to glycerol by G3P phosphatase (Gpp), encoded by both GPP1 and GPP2 (Figure 22.2). In the presence of oxygen, cytosolic NADH can be shuttled to mitochondria, but it may also be reconverted to NAD+ by Gpd1p or Gpd2p, resulting in glycerol formation. Therefore, glycerol synthesis is an important mechanism for maintaining cytosolic redox balance. In addition, glycerol is the most important compatible solute in response to osmotic stress. For a comprehensive overview of the physiological role and the metabolism of glycerol, the reader is referred to the extensive literature in the field [116,117].
Glycerol has a number of uses in the food, pharmaceutical and chemical industries, and high yields would be required to compete with chemical processes [118]. Similarly, increased glycerol production and consequently lower final ethanol titers are sometimes desirable features for wine strains, as glycerol confers body and sweetness to wine, while an appropriate concentration of alcohol is important for the sensory balance of wine [119–122]. Metabolic engineering strategies to increase glycerol yield are summarized in Table 22.8.
The relative importance of Gpd and Gpp activities for the achievement of glycerol overproduction has been addressed in a number of investigations. In general, overexpression of either GPD1 or GPD2 is required, and the two genes are equally important (Table 22.8), while the activity of Gpp is not rate controlling. The comparison of two wine strains, a low glycerol producer and a high glycerol producer, revealed a significant difference in Gpd activity [119]. This has also been pointed out through the characterization of strains expressing the genes in different combinations and by the construction of a kinetic model built on parameters collected from independent experimental results [123,124]. In practice, glycerol production was increased by the overexpression of GPD1 or GPD2 in laboratory strains and in industrial strains for wine making [119–122,125–127]. Further improvement of glycerol yield was achieved when the activity of pyruvate decarboxylase (Pdc) was reduced in a ∆pdc1 strain, in combination with overexpression of GPD1 [127]. However, in all cases, increased glycerol production was accompanied by accumulation of unwanted by-products such as acetaldehyde and acetate, which seriously affect the quality of wine [119,121,123]. The problem was partly solved by deletion of ALD6, encoding the acetaldehyde dehydrogenase required for acetate production [120,122]. Acetate titers in glycerol-overproducing strains were indeed reduced from 0.98 g/l in the simple Gpd-overexpressing strain to 0.43 g/l in the ∆ald6 strain (parent strain level 0.38 g/l), but other off-flavor-determining compounds such as acetoin and acetaldehyde accumulated. A thorough characterization of the redistribution of fluxes toward flavor compounds in recombinant wine strains is required to refine engineering strategies and to minimize formation of undesirable compounds.
Figure 22.2 Relevant metabolites (plain text) and proteins or protein complexes (shaded boxes) mentioned in Section 22.2. Heterologous proteins expressed in metabolic engineering strategies are in rectangular boxes, yeast proteins are in elliptical boxes. Redox cofactor requirements are reported for relevant reactions. ATP is omitted. Abbreviations, in alphabetical order. Metabolites: 2,3BPG, 2,3-bisphosphoglycerate; 2OG, 2-oxoglutarate; 3PG, 3-phosphoglycerate; AA, acetaldehyde; AC, acetate; A-CoA, acetyl-coenzyme A; DHAP, dihydroxyacetone phosphate; E, glutamate; EtOH, ethanol; FRU1,6BP, fructose 1,6-bisphosphate; GA3P, glyceraldehyde 3-phosphate; GLU, glucose; GLY, glycerol; GLY3P, glycerol 3-phosphate; PYR, pyruvate. Proteins: Acs, acetyl-coenzyme A synthase; Adh, alcohol dehydrogenase; Ald, aldehyde dehydrogenase; Aox, alternative oxidase; Cs, acetyl-coenzyme A shuttle; Fps, glycerol channel; Gapn, NADP+-dependent non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase; Gdh, glutamate dehydrogenase; Gln, glutamine synthetase; Glt, NAD+-dependent glutamate synthase; Gpd, glycerol 3-phosphate dehydrogenase; Gpp, glycerol 3-phosphate phosphatase; Gut, mitochondrial FAD-dependent glycerol 3-phosphate dehydrogenase; Hxt, hexose transporter; Nde, mitochondrial external NADH dehydrogenase; Nox, water-forming NADH oxidase; Pdc, pyruvate decarboxylase; Pdh, pyruvate dehydrogenase; Pgk, phosphoglycerate kinase; Pt, mitochondrial pyruvate carrier; Tdh, glyceraldehyde 3-phosphate dehydrogenase; Tpi, triosephosphate isomerase.
Another strategy consisted in inactivating triose phosphate isomerase (Tpi, encoded by TPI1) to obtain a molar stoichiometry of 1:1 for the generation of glycerol from glucose (Figure 22.2). A ∆tpi1 mutant showed molar yields of 0.7–0.9 in the conversion of glucose to glycerol, but the strain was not able to grow on glucose as sole carbon source (Table 22.8) [128]. Accumulation of DHAP in ∆tpi1 mutants, resulting from a shortage of cytosolic NADH, was indicated as one of the factors limiting growth [125,129,130]. Consequently, the availability of cytosolic NADH for Gpd was investigated by deleting the genes encoding mitochondrial NADH shuttle systems (NDE1, NDE2, and GUT2) (Figure 22.2). The strategy proved to be successful even with normal levels of Gpd, reaching a molar yield of 0.99 [129]. However, this was achieved only after an adaptation procedure, resulting in as yet unknown mutations in the new strain. It was speculated that the adaptation step increased the respiratory capacity of the strain. Experimental characterization of the adapted strain could represent a highly valuable source for a reverse metabolic engineering approach [76]. More
P P
GPD1 fps1-∆1 TPI1 NDE1 NDE2 GUT2
V5
ALD6 GPD1
NDE1 NDE2 GUT2 GPD2 FDH1
ALD3 ADH1 GPD1 TPI1
Ind. Wine « K1marquée »
TAM strain
CEN.PK
∆ ∆ ∆ P P P ∆ P ∆
∆ P
∆ P
∆ ∆ ∆ ∆ @
b
a
P: plasmid vector; ∆: gene deletion; @: adaptation. B: batch; FB: fed batch; C: continuous. c R: recalculated. d Not clear.
ALD6 GPD2
22767
CEN.PK
P
GPD1
V5
∆
TPI1
W303-1A
∆ P
PDC2 GPD1
Genetic Engineering Approacha
YSH306
Strain
Target Gene(s)
− − – TPI1 (−) d ADH1 – ADH1 –
− ADH1
− ADH1
− − − −
ADH1 (−)d
ADH1
− (−)d −
Promoter B, anaerobic, complex medium
B, aerobic, mineral medium
C, aerobic, mineral medium
B, anaerobic, mineral medium
B, anaerobic, mineral medium
B, aerobic, mineral medium
B, oxygen-limited, mineral medium B, anaerobic, mineral medium
G418/Ura/His
–
G418/Zeo
His/Ura
G418
Zeo
−
Ade/Trp/His
G418
Cultivation Conditionsb
FB, anaerobic, mineral medium
Auxotrophies/ Antibiotic(s) Resistance
Table 22.8 Redirection of Carbon and Redox Equivalents to Glycerol
2
1
1.1
0.05
2
1.1
1.1
−
0.5
Scale (l)
−
0.039(R)
–
0.029–0.032(R)
0.065(R)
−
0.022–0.029(R)
0.069
0.068
Parent Strain
0.90(R)
1.08
0.110–0.122(R)
0.21(R)
0.99
~0.12(R)
0.28
0.7–0.9
0.54
Engineered Strain
Glycerol Yield (mol/mol Glucose)c
[125]
[132]
[122]
[120]
[129]
[123]
[119]
[128]
[127]
Reference
recently, the improvement of ∆tpi1 strains focused on draining the DHAP pool of the cell by combining increased cytosolic NADH supply with Gpd overexpression [125,129]. In this strategy, GPD1 was overexpressed and sufficient cytosolic NADH was provided by engineering the acetaldehyde node, in particular by reducing ethanol formation (∆adh1) and promoting a NAD+-dependent pyruvate dehydrogenase bypass (overexpression of ALD3) (Figure 22.2) [125]. Thanks to the synergy of these modifications, a glycerol molar yield from glucose of 0.9 was reached. Unfortunately, the available data do not include glycerol formation by the corresponding strain with normal Gpd levels, which would have been useful for a comparison with the strategy of deleting the mitochondrial shuttle systems [129]. The ∆tpi1 ∆adh1 strategy did not require adaptation, and overexpression of the glycerol facilitator Fps1p did not improve glycerol yield. This is in contrast with a previous investigation in which, however, a truncated gene coding for a constitutively open channel was overexpressed [123]. Most Tpi-based approaches relied on a fully aerobic setup, as respiratory ATP generation appeared to be needed for sustained growth. This can be explained by the fact that the glycolytic ATP yield equals zero at a molar glycerol yield of 1 (Figure 22.2), so growth ceases unless ATP is generated by respiration.
The highest reported glycerol yield was achieved thanks to a third strategy in a laboratory strain subjected to extensive genetic manipulation combined with evolutionary engineering [131]. This strategy relied on the so-called TAM strain, obtained by extensive evolutionary engineering. The TAM strain is Pdc-negative and displays zero alcoholic fermentation, but it is able to grow on glucose as sole carbon source [132].
In the TAM strain, a heterologous formate dehydrogenase provided NADH-regenerating activity which, upon cofeeding cells with formate, allowed higher cytosolic NADH concentrations and a molar glycerol yield on glucose equal to 1.08 [131]. When the experimental data were compared to a compartmentalized stoichiometric model developed for this particular strain, the authors postulated an extra mechanism to dissipate cytosolic NADH, perhaps via malate dehydrogenase. However, attempts to delete such a gene failed. The system may represent a model for academic research on the possibilities of exceeding the limits of yeast metabolism.
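The statement that the glycolytic ATP yield equals zero at a molar glycerol yield of 1 follows from simple bookkeeping. The sketch below is illustrative only: the function name is hypothetical, and it assumes textbook stoichiometry (2 ATP invested per glucose in upper glycolysis, 2 ATP and 1 NADH gained per triose phosphate converted to pyruvate, and 1 NADH consumed per DHAP reduced to glycerol). NADH reoxidized through ethanol formation or respiration, and ATP used for growth, are deliberately not counted.

```python
def glycolytic_balance(glycerol_yield: float) -> dict:
    """Net cytosolic ATP and NADH per mol glucose at a given molar glycerol yield.

    Assumed stoichiometry:
      glucose -> 2 triose phosphates : -2 ATP
      triose phosphate -> pyruvate   : +2 ATP, +1 NADH (Tdh step)
      DHAP -> glycerol (Gpd, Gpp)    : -1 NADH, no ATP
    """
    if not 0.0 <= glycerol_yield <= 2.0:
        raise ValueError("glycerol yield must lie between 0 and 2 mol/mol glucose")
    trioses_to_pyruvate = 2.0 - glycerol_yield
    atp = 2.0 * trioses_to_pyruvate - 2.0      # substrate-level ATP minus investment
    nadh = trioses_to_pyruvate - glycerol_yield  # NADH formed by Tdh minus NADH used by Gpd
    return {"atp": atp, "nadh": nadh}


for y in (0.0, 0.5, 1.0):
    print(y, glycolytic_balance(y))
```

Under these assumptions both balances reach zero at a molar glycerol yield of 1, which is why growth then depends on respiratory ATP. A yield above 1, such as the 1.08 reported for the TAM strain, gives negative values for both, so that an additional source of NADH (formate in that study) and of ATP is required.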
22.3.2 Modulation of Ethanol Production
Channeling the carbon flux to ethanol, and therefore reducing the production of glycerol as a by-product, would be advantageous, for example, in a strain designed for the distilled beverages industry or for large-scale fuel ethanol production [133,134]. While increasing the availability of cytosolic NADH was shown to improve glycerol formation, the opposite situation was found to favor ethanol formation. This has been achieved by engineering the ammonium assimilation pathway [133], an example where it was possible to improve the flux through a certain pathway by modifying a completely different one. The NADPH-dependent ammonium assimilation route via glutamate dehydrogenase (GDH1) was replaced by a new route, mediated by glutamine synthetase and glutamate synthase (GLN1 and GLT1, respectively), which consume NADH and ATP (Figure 22.2). In addition, ATP-dependent ammonium assimilation through Gln1p and Glt1p increased the ATP demand for biomass formation, which should direct cellular metabolism toward higher glycolytic and ethanol flux. The ethanol yield did indeed increase by 10% and the glycerol yield decreased by 30%; however, this occurred at the expense of a reduced growth rate. GLN1 and GLT1 were overexpressed by promoter replacement, and the authors suggest that the strain could be further improved by insertion of extra copies; however, this was pursued further only in pentose-fermenting strains [136].
The intracellular redox state can also be altered by the insertion of heterologous activities. A genome-scale metabolic model [18] suggested that 10% of glyceraldehyde-3-phosphate (GA3P) could be diverted from GA3P dehydrogenase to a heterologously expressed nonphosphorylating, NADP+-dependent GA3P dehydrogenase (Gapn) (Figure 22.2 and Table 22.9), in order to reduce NADH formation and
Table 22.9 Modulation of Ethanol Production
Strain | Activity(ies)/Gene(s) | Genetic Engineering Approach(a) | Promoter | Cultivation Conditions(b) | Auxotrophy/Antibiotic(s) Resistance | Scale (l) | Ethanol Yield, Parent Strain (mol/mol Glucose)(c) | Ethanol Yield, Engineered Strain (mol/mol Glucose)(c) | Reference
TN2 | GDH1; GLN1; GLT1 | ∆; I; I | –; PGK1; PGK1 | B, anaerobic, mineral medium | G418 | 4.5 | 1.44 (R) | 1.59 (R) | [133]
CEN.PK | NADP+-dependent glyceraldehyde-3-phosphate dehydrogenase (gapN; Streptococcus mutans) | P | TPI1 | B, anaerobic, mineral medium | – | 2 | 1.60 (R) | 1.65 (R) | [135]
M4054 | HXT1 to HXT7; hxt1/hxt7 chimera, S279Y mutation | ∆; I | –; HXT7t | B, aerobic, mineral medium | – | 1.5 | ~1.2 (R) | ~0 (R) | [140]
V5 | Water-forming NADH oxidase (noxE; Lactococcus lactis) | P | TPI1 | B, aerobic, mineral medium | Zeo | 1 | 1.82 (R) | 1.54 (R) | [138]
CEN.PK | Alternative oxidase (AOX1; Histoplasma capsulatum) | P | TDH3 | B, aerobic, mineral medium | – | 4 | 1.21 (R) | 0.31 (R) | [139]
(a) P: plasmid vector; I: chromosomal integration; ∆: gene deletion. (b) B: batch. (c) R: recalculated.
eliminate glycerol production [137]. In vivo results confirmed the prediction of the model. Even though the ethanol yield did not increase significantly (+3%), the 40% decrease in glycerol yield attested to an actual redirection of the carbon flux in the recombinant strain. It was suggested that the ethanol yield could be increased by a higher Gapn activity; however, this was not experimentally verified. In conclusion, the great number and variety of approaches aimed at modulating one of the most peculiar traits of yeast physiology, ethanolic fermentation, demonstrate how strong the interest in this pathway still is, from both the fundamental and the applied point of view. On the other hand, its recalcitrance to comply with researchers' wishes in terms of yields and productivity demonstrates how robustly evolution has shaped it. From an applied point of view, expression of Gapn in order to partly bypass the two-step route through GA3P dehydrogenase and phosphoglycerate kinase (Pgk) is an attractive alternative [137]. Such a modification does not introduce ectopic cofactor regeneration cycles, but rather acts directly on the site of glycolytic NADH generation, without affecting the growth rate. In other applications, the level of ethanol had to be reduced. One example concerns the use of yeast to produce wine from grapes with high sugar content. Another is heterologous protein production, where carbon should be redirected from ethanol to biomass production. Dissipation of NADH by the expression of a water-forming NADH oxidase from Lactococcus lactis was shown to reduce the aerobic ethanol yield from 1.82 to 1.54 mol/mol glucose [138]. The effect was even more pronounced when an alternative oxidase from Histoplasma capsulatum was expressed in the mitochondria (Figure 22.2, Table 22.9) [139]. Increased respiratory capacity and reduced overflow metabolism allowed a reduction of ethanol yield from 1.21 to 0.31 mol/mol glucose under aerobic conditions.
A different approach drastically reduced aerobic ethanol generation by modulating the glucose uptake rate. When all seven Hxt sugar transporter genes were deleted and replaced by a chimeric variant, the lower glucose uptake rate reduced the growth rate by 44%, while the ethanol yield approached zero (Table 22.9) [140]. The glycolytic flux was reduced to a value sustainable for the actual mitochondrial capacity, and therefore no overflow metabolism was observed. The functional chimeric transporter was obtained thanks to the serendipitous point mutation S279Y.
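The yield changes quoted above can be put on a common footing by expressing them as relative changes and as fractions of the stoichiometric ceiling. The following is a minimal sketch: the yield pairs are the values quoted in the text, the ceiling of 2 mol ethanol per mol glucose follows from glucose -> 2 ethanol + 2 CO2, and the dictionary labels and function names are illustrative choices made for this example, not identifiers from the cited studies.

```python
# Ethanol yields in mol ethanol per mol glucose, as quoted in the text above.
# The dictionary keys are informal labels for this sketch, not strain names.
YIELDS = {
    "water-forming NADH oxidase [138]": (1.82, 1.54),
    "alternative oxidase [139]": (1.21, 0.31),
}

# Glucose -> 2 ethanol + 2 CO2 sets the stoichiometric ceiling.
THEORETICAL_MAX = 2.0


def relative_change(parent: float, engineered: float) -> float:
    """Percent change of the engineered strain relative to its parent."""
    return 100.0 * (engineered - parent) / parent


def fraction_of_max(ethanol_yield: float) -> float:
    """Yield expressed as a fraction of the stoichiometric maximum."""
    return ethanol_yield / THEORETICAL_MAX


if __name__ == "__main__":
    for label, (parent, engineered) in YIELDS.items():
        print(f"{label}: {relative_change(parent, engineered):+.1f}% "
              f"({fraction_of_max(engineered):.2f} of maximum)")
```

Reporting both numbers is a deliberate choice: the relative change shows how strongly an intervention acts, while the fraction of the maximum shows how much fermentative capacity remains.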
22.3.3 Succinic Acid
Succinic acid is a key building block for further conversion to precursor molecules such as tetrahydrofuran, 1,4-butanediol, and butyrolactone. Succinic acid has the potential to become a commodity chemical, with worldwide annual demand exceeding $2 billion USD and over 160 million kg currently produced by petrochemical conversion of maleic anhydride. Recently, an evolutionary programming approach for in silico metabolic engineering was developed, with succinate as a case study [19]. Several gene deletion strategies were proposed to increase the biomass-coupled yield of succinate, and two S. cerevisiae deletion mutants were constructed [141]. The approach was to minimize glycine formation from serine and threonine, thereby shifting the flux demand to the reaction catalyzed by alanine:glyoxylate aminotransferase, which converts glyoxylate to glycine. Because isocitrate is cleaved to succinate and glyoxylate, the formation of both is expected to increase in proportion to the demand for glycine, an amino acid required for biomass formation. Both mutants exhibited reduced growth rates compared to the reference strain; however, one mutant had a significantly improved succinate yield. The batch yield of succinate on glucose was 0.03 g-succinate/g-glucose, whereas the reference strain produced no succinate. Although the yields obtained require significant improvement for industrial relevance, a proof of concept for coupling biomass formation to production of succinic acid has been established. Furthermore, characterization of both mutants should enable future metabolic engineering targets to be considered and increase the understanding of the glyoxylate pathway in S. cerevisiae [142,143].
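To compare the reported mass yield with molar or carbon-based yields used elsewhere in this chapter, a unit conversion is needed. The sketch below assumes standard molar masses for glucose (C6H12O6) and succinic acid (C4H6O4); the function names are illustrative and not taken from the cited work.

```python
# Standard molar masses in g/mol (assumed values for this sketch).
M_GLUCOSE = 180.16   # C6H12O6
M_SUCCINIC = 118.09  # C4H6O4


def mass_to_molar_yield(yield_g_per_g: float,
                        m_product: float,
                        m_substrate: float) -> float:
    """Convert g product / g substrate into mol product / mol substrate."""
    return yield_g_per_g * m_substrate / m_product


def carbon_yield(molar_yield: float,
                 carbons_product: int,
                 carbons_substrate: int) -> float:
    """Fraction of substrate carbon recovered in the product."""
    return molar_yield * carbons_product / carbons_substrate


if __name__ == "__main__":
    y_mol = mass_to_molar_yield(0.03, M_SUCCINIC, M_GLUCOSE)
    y_c = carbon_yield(y_mol, carbons_product=4, carbons_substrate=6)
    print(f"molar yield:  {y_mol:.4f} mol/mol")
    print(f"carbon yield: {y_c:.4f} C-mol/C-mol")
```

With the reported 0.03 g/g, only about 3% of the glucose carbon ends up in succinate, which makes concrete why the text calls for significant improvement.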
22.3.4 Diacetyl
Lagering is a space- and energy-consuming part of beer brewing. During lagering, beer is stored in tanks and kept at low temperatures for a period of two weeks or more [144,145]. Given the volumes of beer produced daily in modern industrial breweries, the lagering step represents a significant cost due to the large structures and the energy required by the process. The duration of the process is mainly dictated by the slow conversion of the off-flavor compound diacetyl, which is converted first to acetoin and finally to 2,3-butanediol, two compounds with higher tasting thresholds [144,146,147]. The tasting threshold of diacetyl is 0.02–0.1 mg/l [145,148]. A diacetyl concentration approaching these values in green beer would allow shorter lagering times, or none at all. Therefore, it would be advantageous to have brewers' yeast strains yielding low diacetyl concentrations. The literature on this topic has already been extensively reviewed [134,147]. The story of metabolic engineering of yeast for diacetyl reduction can be considered an exemplary case study for metabolic engineering, encompassing several key aspects such as design strategy, genetics of industrial yeast, and public acceptance issues. Therefore, some examples will be presented in this section to highlight general aspects that are important in a broader sense for the whole field of metabolic engineering of yeast. Diacetyl originates from the nonenzymatic conversion of the valine biosynthesis intermediate α-acetolactate, which is in turn synthesized from pyruvate by α-acetolactate synthase, encoded by ILV2. Acetolactate reductoisomerase (Ilv5p) further converts α-acetolactate to dihydroxyisovalerate. The compound α-acetolactate is the node around which all the different metabolic engineering strategies for diacetyl titer reduction have been centered.
In fact, according to a classic metabolic engineering approach, accumulation of a certain compound (i.e., α-acetolactate) can be prevented either by restricting the flux upstream or by improving the flux downstream [1,134,149]. Accordingly, the mechanisms envisaged for reducing diacetyl accumulation during beer fermentation aimed either at lowering α-acetolactate formation or at increasing its conversion rate. The only α-acetolactate-forming reaction is catalyzed by the enzyme α-acetolactate synthase (ILV2). As introduced in Section 22.2.8, SMR alleles of ILV2 confer resistance to the herbicide SM. Through classical breeding, it has been possible to select industrial strains with decreased α-acetolactate synthase activity, which therefore accumulated less diacetyl (Table 22.10) [148]. Selection of SM-resistant mutants is a convenient strategy that has also been applied in other metabolic engineering studies [91,108,111–114]. This characteristic has more recently been exploited to obtain a brewing strain with two desirable features: amylolytic capacity and reduced diacetyl formation [113]. In a single step, ILV2 was deleted through the introduction of an amylase gene. Furthermore, targeted replacement of the wild-type ILV2 with the SMR1 allele conferred SM resistance without impairing valine prototrophy [110,111]. A 50% decrease in diacetyl formation was also achieved through a symmetrical approach, in which ILV5, encoding the enzyme downstream of α-acetolactate, was overexpressed on a multicopy plasmid [45]. Increased drainage of the α-acetolactate pool was the strategy in most of the other approaches. Instead of acting on the natural valine biosynthesis pathway, in those cases the preferred strategy entailed expression of a heterologous α-acetolactate decarboxylase. Early investigations demonstrated that a relevant reduction of the diacetyl titer in green beer was possible upon expression of Enterobacter aerogenes α-acetolactate decarboxylase (Table 22.10) [147].
However, the heterologous gene was expressed on a plasmid, which caused stability problems. In addition, selection was performed with geneticin resistance, which is not acceptable in industrial brewing strains. With the aim of obtaining a stable strain, subsequent strategies moved in the direction of achieving sufficient α-acetolactate decarboxylase activity in integrant strains, using acceptable selection markers derived from natural yeast sequences [150–152]. Strains displaying all the properties mentioned so far, with a chromosomally integrated Klebsiella terrigena α-acetolactate decarboxylase, were tested under fully operational conditions in 50 l brewing trials, and the finished product was tested by a tasting panel of
(−)d
I
• α-acetolactate decarboxylase (α-ald; Klebsiella terrigena)
• α-acetolactate decarboxylase (α-ald; Acetobacter xylinum) I ∆
PGK1 –
Copper
G418
3
0.5
– –
50
50
0.5
0.5
26
Scale (l)
–
–
–
Auxotrophies/ Antibiotic(s) Resistance
All the reported examples refer to brewing trials in wort medium. a P: plasmid vector; I: chromosomal integration; ∆: gene deletion; B: isolation of mutant via traditional breeding. b Flavor threshold value of diacetyl: 0.02–0.1 mg/l [144,147]. c Approximate values inferred from graphs. d Not clear.
• α-amylase (AMY; Saccharomycopsis fibuligera) fused to: α-factor secretion signal ILV2
PGK1
I
• α-acetolactate decarboxylase (α-ald; Klebsiella terrigena)
S. cerevisiae Ind. VTT-A-63015 or VTT-A-66024 S. cerevisiae Ind. VTT-A-63015 S. cerevisiae Ind. KI084
S. pastorianus Ind. Sc-11
PGK1
P
• α-acetolactate decarboxylase (ALDC; E. aerogenes)
S. carlsbergensis Ind. IFO0751
ADH1
ADH1
P
• α-acetolactate decarboxylase (Enterobacter aerogenes; ALDC) I
Native
B
Promoter
ILV2
Activity(ies)/Gene(s)
Genetic Engineering Approacha
S. carlsbergensis Ind. 244 S. cerevisiae var. uvarum Ind. KI084
Strain
Table 22.10 Diacetyl
0.73
0.44
~0.08–0.30c
~0.10–0.20c
0.63–0.98
0.68
0.84
Parent Strain
0.22
0.26
0.005–0.020
0.05
0.21
0.15
0.48
Recombinant Strain
Diacetyl Concentration in Green Beer (mg/l)b
[113]
[152]
[151]
[145]
[150]
[148]
[45]
Reference
• L-lactate dehydrogenase (LDH-A; Bos taurus)
• KlPDC1 • L-lactate dehydrogenase (LDH-A; Bos taurus) • KlPDC1 • KlPDA1 • L-lactate dehydrogenase (LDH-A; Bos taurus) • PDC1 • PDC5 • L-lactate dehydrogenase (LDH-A; Bos taurus) • PDC1 • D-lactate dehydrogenase (D-LDH; Leuconostoc mesenteroides mesenteroides) • JEN1 • L-lactate dehydrogenase (LDH-A; Lactobacillus plantarum) • L-lactate dehydrogenase (ldhL; Lactobacillus helveticus)
GRF18
Kluyveromyces lactis CBS2359 Kluyveromyces lactis CBS2359 and CBS2360 OC-2T
I
I I
∆ Ι Μ
∆ ∆ P
∆ ∆ P
∆ P
P
Genetic Engineering Approacha
b
a
B, aerobic, mineral medium. (xylose as C source)
B, aerobic, mineral medium
TPI1 TPI1
ADH1 (P. stipitis)
B, aerobic, complex medium
B, aerobic, complex medium
FB, aerobic, mineral medium
FB, aerobic + anaerobic, mineral medium FB, aerobic, complex medium
Cultivation Conditionsb
PDC1
– PDC1 (K. lactis) – – PDC1 (K. lactis) – – PDC1
UASGAL/TATACYC1
Promoter
P: plasmid vector; I: chromosomal integration; ∆: gene deletion; M: mutagenesis. B: batch; FB: fed batch. c R: recalculated.
Pichia stipitis FPL UC7
GRF18U
OC-2T
Activity(ies)
Strain
Table 22.11 Lactic Acid
–
Leu
–
–
–
G418 Phleomycin
Ura
Auxotrophies / Antibiotic(s) Resistance
0.05
–
0.04
0.04
1
14
2
Scale (l)
–
3.2
~0.03
–
110–150
–
1.5
LDH Activity (U/mg Protein)
~58
~8
61.5
82.3
60
109
20
Lactic Acid Titer (g/l)
Outcomec
0.58
~0.40
0.61
0.81
0.85
0.58
0.5–0.6
Lactic Acid Max Yield (g/g Sugar Consumed)
[160]
[156]
[154]
[155]
[158]
[161]
[153]
Reference
12–15 volunteers [145,151]. In both cases diacetyl titers were already below 0.02 mg/l in the green beer (Table 22.10), which therefore required no lagering before bottling and tasting. In an attempt to design a recombinant strain that could more easily obtain GRAS status, the α-acetolactate decarboxylase from E. aerogenes or K. terrigena was replaced by the enzyme from the food microbe Acetobacter aceti ssp. xylinum [152]. Diacetyl reduction was successfully achieved; however, the lowest titer in the green beer was 0.22 mg/l, which remained above the tasting threshold (Table 22.10).
22.4 Extended Product Range
22.4.1 Lactic Acid
Lactic acid is a three-carbon carboxylic acid with a wide range of industrial uses. Besides being a widespread additive in food, cosmetics, and drug formulations, L-(+)-lactic acid has gained great importance in recent years as a monomer for the renewable and biodegradable plastic polylactic acid (PLA) [26]. PLA has a potential market of hundreds of thousands of metric tons per year. Lactic acid is currently produced mainly via fermentation with lactic acid bacteria. However, lactic acid bacteria have fastidious growth requirements, and the production of lactic acid is severely inhibited by low pH. Therefore, neutralization is required. Yeasts display vigorous growth on different substrates and tolerate low pH, but are not naturally able to produce lactic acid. Hence, metabolic engineering of yeast has received great attention, with the aim of providing a more robust and productive organism for the fermentative production of lactic acid. Lactic acid is obtained in one step by direct reduction of pyruvate, catalyzed by the enzyme lactate dehydrogenase (Ldh), in the so-called lactic fermentation. Expression of an Ldh from various donor organisms is a common feature of all the reported strategies in different yeasts (Table 22.11): S. cerevisiae [153–156], K. lactis [157,158], Zygosaccharomyces bailii [159], Torulaspora delbrueckii [157], and P. stipitis [160]. Lactic acid formation is a feasible alternative for the reoxidation of glycolytically produced NADH. Similarly to ethanol production, lactic acid production from glucose is redox neutral and ATP-positive. Nonetheless, when Ldh is expressed in S. cerevisiae, a fraction of the carbon flux is still directed toward ethanol, which accumulates as a by-product, reducing the net yield of lactic acid [153]. When a S.
cerevisiae strain overexpressing bovine Ldh was grown aerobically in fed batch, and aerobic growth was protracted until ethanol consumption in order to partially repress pyruvate decarboxylase activity, the subsequent anaerobic bioconversion of glucose to lactic acid never exceeded a yield on consumed sugar of 0.6 g/g (Table 22.11), with the remaining carbon going to ethanol [153]. Subsequent strategies therefore aimed at reducing pyruvate consumption for ethanol production together with expressing an LDH, in K. lactis [158,161] and in S. cerevisiae [154,155]. In the Crabtree-negative yeast K. lactis, aerobic ethanol production is much less favored than in S. cerevisiae. Furthermore, K. lactis harbors only one pyruvate decarboxylase gene (KlPDC1), and its deletion does not affect aerobic growth on glucose. Indeed, simultaneous KlPDC1 deletion and overexpression of bovine LDH led to a completely homolactic strain, in which the pyruvate flux toward ethanol was entirely replaced by lactic acid production [161]. However, the strain was poorly capable of anaerobic/microaerophilic homolactic fermentation, and the aerobic yield on consumed sugar reached a maximum value of only 0.58 g/g (Table 22.11). Since cofactor regeneration should have been ensured by the complete diversion of the flux toward lactate, the explanation for the low lactate yield was sought in the energy requirements of lactic acid producing cells. In fact, respiratory metabolism was suspected to be necessary in order to provide enough ATP to sustain growth and lactic acid excretion, thus requiring channeling of some carbon into the TCA cycle through pyruvate dehydrogenase. The energy requirements of a homolactic engineered yeast strain have subsequently been thoroughly investigated in S. cerevisiae laboratory strains [162]. The highest lactic acid yield was reached by additional deletion of KlPDA1, encoding a Pdh subunit [158]. The strain was viable by virtue of the
K. lactis mitochondrial acetyl-CoA transport, but it required supplementation with both glucose and ethanol: the former to be converted to lactic acid and the latter to acetyl-CoA (A-CoA). Reduction of the Pdc activity has also been pursued in S. cerevisiae, resulting in redirection of carbon fluxes but, unlike in K. lactis, in very poor growth [127,132,153,163]. The highest aerobic lactate yield in S. cerevisiae, 0.81 g/g, was achieved in a ∆pdc1 ∆pdc5 strain expressing bovine Ldh, although with low productivity [155]. Similar results were achieved in the production of the other enantiomer, D-lactic acid, which can be blended with L-(+)-lactic acid to improve the thermal stability of PLA. A ∆pdc1 strain overexpressing a D-lactate dehydrogenase, converting pyruvate to D-lactic acid, produced up to 61.5 g/l D-lactic acid, with a yield of 0.61 g product per g consumed glucose (Table 22.11) [154]. As mentioned above, secretion of lactic acid across the plasma membrane may influence productivity [160,161]. JEN1 codes for a lactate transporter that is known to be required for lactate uptake. On the assumption that Jen1p could also promote lactate secretion in the case of intracellular overproduction, JEN1 was expressed in S. cerevisiae strains harboring different Ldhs, yielding detectable improvements in yields and productivities [156]. In particular, coexpression of JEN1 together with L. plantarum Ldh increased the final lactate titer from ~6 g/l (Ldh-only strain) to ~8 g/l (Ldh plus transporter strain), with yields of 0.3 g/g and 0.4 g/g, respectively (Table 22.11). Expression of a Lactobacillus helveticus lactate dehydrogenase in P. stipitis allowed up to 58 g/l lactic acid production from xylose, glucose, or a mixture of the two sugars (Table 22.11) [160]. The yield on consumed xylose was comparable to the one obtained in the early attempts of S.
cerevisiae engineering; however, the result is relevant since pentose sugars are expected to gain increasing importance in the perspective of a switch toward a chemical (and fuel) industry based on lignocellulosic biomass (Sections 22.2.4 and 22.2.5). In addition, there still seems to be significant room for improvement of this P. stipitis strain, since no other key step of lactic acid production (e.g., transport and pyruvate fluxes) has been investigated.
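The lactic acid yields quoted in this section are easier to compare when related to the stoichiometric maximum. The sketch below assumes the overall conversion glucose -> 2 lactic acid and standard molar masses, which give a ceiling of 1 g/g; the yield values are those quoted in the text, while the dictionary labels and function names are illustrative only.

```python
# Standard molar masses in g/mol (assumed values for this sketch).
M_GLUCOSE = 180.16  # C6H12O6
M_LACTIC = 90.08    # C3H6O3

# Mass yields (g lactic acid / g consumed sugar) quoted in the text.
# Keys are informal labels chosen for this example.
REPORTED_YIELDS = {
    "S. cerevisiae, bovine Ldh [153]": 0.60,
    "K. lactis, KlPDC1 deletion [161]": 0.58,
    "S. cerevisiae, pdc1 pdc5 deletion [155]": 0.81,
    "S. cerevisiae, D-lactate [154]": 0.61,
}


def max_mass_yield(n_product: int, m_product: float, m_substrate: float) -> float:
    """Stoichiometric ceiling in g product per g substrate."""
    return n_product * m_product / m_substrate


def fraction_of_max(reported: float, ceiling: float) -> float:
    """Reported yield as a fraction of the stoichiometric ceiling."""
    return reported / ceiling


if __name__ == "__main__":
    ceiling = max_mass_yield(2, M_LACTIC, M_GLUCOSE)
    print(f"ceiling: {ceiling:.2f} g/g")
    for label, y in REPORTED_YIELDS.items():
        print(f"{label}: {fraction_of_max(y, ceiling):.0%} of ceiling")
```

Because two lactic acid molecules have the same mass as one glucose molecule, the ceiling is 1 g/g, so the reported mass yields can be read directly as fractions of the maximum.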
22.4.2 Terpenoids
Terpenoids are a large family of natural compounds with applications as flavoring agents, perfumes, antioxidants, and drugs [164]. At the cellular level they are important for membrane fluidity, protein modification, and signaling. Commercially interesting terpenoids are often extremely complex molecules, which are found in natural sources (often plants) at low concentrations. This makes de novo chemical synthesis or purification inconvenient, hence the interest in engineering a new organism to produce these substances in an optically pure form at high titers. The starting point for terpenoid synthesis is A-CoA. A-CoA is used in the mevalonate pathway as a building block to assemble isopentenyl pyrophosphate (IPP), the five-carbon module from which longer terpenoid molecules are built (Figure 22.3). Other large nodal compounds are geranyl pyrophosphate (GPP, C10), farnesyl pyrophosphate (FPP, C15), and geranylgeranyl pyrophosphate (GGPP, C20). GPP is the precursor for the C10 monoterpenes and FPP, while FPP is the precursor for the C15 sesquiterpenes and the starting point for the biosynthesis of steroids. GGPP is the precursor of the C20 diterpenes and carotenoids [164]. Successful metabolic engineering strategies for the production of new terpenoid compounds in yeast have focused on (i) the expression of heterologous enzymes leading to new compounds, and (ii) the enhancement of the pool of the FPP precursor. Parallel strategies concerning point (ii) have been successful both in Candida utilis [164] and in S. cerevisiae (Table 22.12) [25,166–168]. The carotenoid compound lycopene is not normally synthesized by C. utilis [165]. When the three activities from Erwinia uredovora required for biosynthesis of lycopene were expressed in C. utilis, the recombinant strain accumulated 1.1 mg/g DCW lycopene [165]. The low production was attributed to the low levels of the precursor compound FPP. A previously reported strategy to increase FPP pool in
Figure 22.3 Simplified terpenoid pathway. Labels shown in the figure: A-CoA (C2); MEV; IPP (C5); GPP (C10); FPP (C15); GGPP (C20); enzymes Hmg1, Erg20, and Erg9; monoterpenes (C10), geraniol; sesquiterpenes (C15), amorphadiene, artemisinic acid, epi-cedrol; diterpenes (C20); carotenoids (lycopene); squalene; sterols (hydrocortisone). Abbreviations, in alphabetical order: A-CoA, acetyl-coenzyme A; FPP, farnesyl pyrophosphate; GGPP, geranylgeranyl pyrophosphate; GPP, geranyl pyrophosphate; IPP, isopentenyl pyrophosphate; MEV, mevalonate.
S. cerevisiae consisted in modifying the level of 3-hydroxy-3-methylglutaryl-CoA reductase (Hmg1p), an enzyme of the mevalonate pathway encoded by HMG1 (Figure 22.3) [169]. Whereas HMG1 expression is tightly regulated and product-inhibited, a truncated allele (tHMG1) was found to be constitutively active [169]. Increased and constitutive Hmg1p activity was expected to increase the flux toward FPP. This proved to be effective also in C. utilis, probably due to the high homology between HMG1 in the two species [165]. Therefore, the E. uredovora lycopene biosynthetic pathway was expressed in a C. utilis strain in which truncated HMG1 was expressed, together with deletion of the ERG9 gene, which encodes the FPP-consuming enzyme squalene synthase. In this high-FPP background, a sevenfold increase in lycopene production was achieved, reaching 7.8 mg/g DCW [165]. The dominant mutant of the transcription factor Upc2, upc2-1, has been reported to increase metabolic flux toward sterols [170,171]. Thus, the expression of the upc2-1 allele was tested in a S. cerevisiae strain expressing epi-cedrol synthase from the plant Artemisia annua, which converts FPP to the sesquiterpene alcohol epi-cedrol [166]. Contrary to expectation, epi-cedrol production was reduced upon expression of upc2-1. The result was not further elucidated, but it is likely that, whereas the flux through FPP was increased, its availability to the epi-cedrol synthase did not benefit from it, since downstream enzymes were induced as well. In contrast, epi-cedrol production more than doubled (from 180 µg/l to 370 µg/l, Table 22.12) when the expression of upc2-1 and tHMG1 was combined. In the same study, the yield of FPP-derived compounds was found to be two times higher in mating type "a" strains than in isogenic "α" strains. The authors suspected higher production of FPP in Mat a strains because it is needed for the posttranslational modification of the corresponding mating pheromone [172].
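The emphasis on precursor supply becomes clearer when the demand for each nodal compound is counted in C5 units. The sketch below assumes the standard mevalonate pathway stoichiometry (3 acetyl-CoA, 3 ATP, and 2 NADPH per IPP, with one carbon lost as CO2); these per-IPP figures are textbook values supplied for this example rather than numbers from the chapter, and the names are illustrative.

```python
# Number of C5 (IPP-derived) units in each nodal prenyl pyrophosphate.
C5_UNITS = {"GPP": 2, "FPP": 3, "GGPP": 4}

# Assumed cost of one IPP via the mevalonate pathway (textbook stoichiometry):
# 3 acetyl-CoA -> HMG-CoA, 2 NADPH for the reductase step, 3 ATP to reach IPP.
PER_IPP = {"acetyl_CoA": 3, "ATP": 3, "NADPH": 2}


def carbon_count(compound: str) -> int:
    """Carbon atoms in the prenyl chain of the given nodal compound."""
    return 5 * C5_UNITS[compound]


def precursor_demand(compound: str) -> dict:
    """Acetyl-CoA, ATP, and NADPH needed to build one molecule."""
    n = C5_UNITS[compound]
    return {cofactor: n * amount for cofactor, amount in PER_IPP.items()}


if __name__ == "__main__":
    for name in C5_UNITS:
        print(name, f"C{carbon_count(name)}", precursor_demand(name))
```

Under these assumptions one FPP costs nine acetyl-CoA, which is consistent with the attention paid in this section both to Hmg1p activity and to the cytosolic acetyl-CoA supply.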
Artemisinin is a sesquiterpene produced from FPP in A. annua, and to date it is among the most effective drugs against the malaria etiological agent Plasmodium falciparum. Artemisinin can also be obtained by synthesis from artemisinic acid, an intermediate of the A. annua biosynthetic pathway. Artemisinic acid has been obtained in an engineered S. cerevisiae strain in which the expression of a part of the A. annua pathway was combined with a multiple strategy to increase the flux toward FPP [25]. All the previously mentioned strategies, tHMG1 and upc2-1 expression together with ERG9 repression, were combined with overexpression of the FPP synthase, encoded by ERG20. The maximum titer, which exceeded 115 mg/l (Table 22.12), would still need to be increased about 200-fold to make the process competitive. A more recent strategy aimed at enhancing the FPP supply in S. cerevisiae for the biosynthesis of amorphadiene from FPP by amorphadiene synthase, i.e., the first step of the artemisinic acid (artemisinin) biosynthetic pathway in A. annua [167]. The new approach concentrated on the supply of the first
HMG1(truncated) ERG9 • Geranylgeranyl pyrophosphate synthetase • Phytoene synthase • Phytoene dehydrogenase (crtE, crtB, crtI; Erwinia uredovora) upc2-1 HMG1(truncated) • Epi-cedrol synthase (-; Artemisia annua)
Candida utilis ATCC9950
S.cerevisiae JBY575
FY1679-28C
Carotenoids (Lycopene)
Epi-cedrol
Hydrocortisone
• ∆7 reductase (DWF5; A. thaliana) • Adrenodoxin, mature form (matFDX1; Bos taurus) • Adrenodoxin reductase, mature form (matFDXR; Bos taurus) • P450 side chain cleaving (CYP11A1; Bos taurus) • 3β-hydroxysteroid dehydrogenase II (3β-HSD; Homo sapiens) • 17α-steroid hydroxylase (CYP17A1; Bos taurus) • 21-steroid hydroxylase (CYP21A1; Homo sapiens) • 11β-steroid hydroxylase (CYP11B1; Homo sapiens) ERG5 ATF2 GCY1 YPR1 ARH1
Activity(ies)/Gene(s)
Strain
Compound
Table 22.12 Terpenoids
PGK1 PGK1 (C. utilis)
P P
GAL10/CYC1 GAL10/CYC1 TEF1 TDH3 CYC1
CYC1
P P I I P ∆ ∆ ∆ ∆ I
– – – –
GAL10/CYC1
P
I P
B, –, complex medium
−,−, complex medium
Cultivation Conditionsb
GAL10/CYC1 B, –, complex medium GAL10/CYC1
– GAL1 GAL1
GAP1
P
– P P
GAP1 –
Promoter
I ∆
Genetic Engineering Approacha
–
0.005
–
Scale (l)
0 (mg/l)
180 (µg/l)
0 (mg/g DCW)
Reference Strain
11.5 (mg/l)
370 (µg/l)
7.8 (mg/g DCW)
Recombinant Strain
Compound Final Titer
[175]
[166]
[165]
Reference
ERG20 erg20-2 • Geraniol synthase (GES; Ocimum basilicum)
S.cerevisiae Y21258
Terpenoid alcohols
b
– Native PMA1
GAL1 GAL1 GAL1 MET3 GAL1
Ι Ι Ι D P ∆ Ι P
GAL1 GAL10
GAL1 GAL1
P P
P P
GAL1 GAL1 GAL1 MET3 GAL1
Ι Ι Ι D P
B, aerobic, mineral medium
B, aerobic, mineral medium
B, aerobic, mineral medium
P: plasmid vector; I: chromosomal integration; ∆: gene deletion; D: downregulation via promoter replacement. B: batch.
ALD6 • AcoA synthase (AcsL641P; Salmonella enterica) upc2-1 HMG1(truncated) ERG20 ERG9 • Amorphadiene synthase (ADS, Artemisia annua)
S.cerevisiae BY4742
Amorphadiene
a
upc2-1 HMG1(truncated) ERG20 ERG9 • Amorphadiene synthase (ADS, Artemisia annua) • Cytochrome P450 monooxygenase and oxidoreductase (CYP71AV1 and CPR; Artemisia annua)
S.cerevisiae BY4742
Antimalarial drug precursor artemisinic acid
0.15
0.05
0.05
–
0.356 (mM/ OD600)
0 (mg/l)
989 (µg/l)
0.604 (mM/ OD600)
~115 (mg/l)
[168]
[167]
[25]
building block (A-CoA) in the cytosol. Cytosolic A-CoA is produced in three steps via the so-called pyruvate dehydrogenase by-pass, which is an alternative pathway to the one-step conversion of pyruvate by pyruvate dehydrogenase (Figure 22.2). The three reactions are catalyzed by pyruvate decarboxylase, acetaldehyde dehydrogenase (Ald) and A-CoA synthase (Acs), respectively (Figure 22.2). To increase cytosolic A-CoA availability, the S. cerevisiae Ald and the constitutively active Salmonella enterica Acs mutant were overexpressed [173]. As a result, the increased A-CoA synthesis through the Pdh by-pass led to a significant increase in amorphadiene production (Table 22.12). The strategy is now waiting to be implemented in a strain carrying the complete pathway for artemisinic acid production. Terpenoid alcohols have a number of industrial applications, including the utilization as fragrances in essential oils and perfumes [164]. Monoterpenes such as the terpenoid alcohol geraniol are derived from GPP. As mentioned before, Erg20p is responsible for the formation of FPP, by addition of an IPP module to a GPP molecule (Figure 22.3). Yeast erg20-2 mutants are defective in FPP synthesis and secrete, under certain conditions, small amounts of the terpenoid alcohols geraniol and linalool [174]. Expression of Ocimum basilicum geraniol synthase in an erg20-2 background resulted in an overall increase of the production of terpenoid alcohols, with an almost complete shift toward geraniol [168]. Ergosterol is the sole steroid molecule produced by yeast. Production of hydrocortisone, the major steroid hormone in mammals, was achieved in yeast by reconstructing the mammalian pathway starting from ergosterol and its precursor, ergosta 5-7 dienol [175]. This strategy differed from those previously described because it aimed at reconstructing the pathway and eliminating side-reactions leading to dead-ends, rather than increasing the flux through the mevalonate pathway and FPP. 
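The three-step pyruvate dehydrogenase by-pass described above can be summarized as a net reaction by adding up the individual steps. The sketch below is a simplified bookkeeping exercise: water and protons are omitted, the aldehyde dehydrogenase step is assumed to be NADP+-dependent (as for the cytosolic Ald6p), and the AMP-forming acetyl-CoA synthetase reaction is assumed; the dictionaries and the helper function are illustrative and not taken from the cited work.

```python
# Each reaction maps metabolite -> stoichiometric coefficient
# (negative = consumed, positive = produced). Water and protons are omitted.
PDC = {"pyruvate": -1, "acetaldehyde": 1, "CO2": 1}
# Assumption: NADP+-dependent cytosolic acetaldehyde dehydrogenase.
ALD = {"acetaldehyde": -1, "NADP+": -1, "acetate": 1, "NADPH": 1}
# Assumption: AMP-forming acetyl-CoA synthetase.
ACS = {"acetate": -1, "ATP": -1, "CoA": -1, "acetyl-CoA": 1, "AMP": 1, "PPi": 1}


def net_reaction(reactions: list) -> dict:
    """Sum stoichiometric coefficients and drop metabolites that cancel out."""
    total = {}
    for reaction in reactions:
        for metabolite, coefficient in reaction.items():
            total[metabolite] = total.get(metabolite, 0) + coefficient
    return {m: c for m, c in total.items() if c != 0}


if __name__ == "__main__":
    for metabolite, coefficient in net_reaction([PDC, ALD, ACS]).items():
        print(f"{coefficient:+d} {metabolite}")
```

The intermediates acetaldehyde and acetate cancel, leaving one acetyl-CoA per pyruvate at the cost of one ATP converted to AMP, which illustrates why the Ald and Acs steps are natural overexpression targets.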
In total, one plant and seven mammalian genes were introduced, four yeast genes were deleted, and one was overexpressed (Table 22.12). Unfortunately, this very large body of work was presented in an extremely condensed form [175]. The reader is therefore referred to other publications that extensively reviewed and commented on this work [176–178].
22.4.3 Other Compounds
Each of the strategies discussed so far has received intense interest over the years, and can be described by going through a "history" of publications that, in some cases, spans more than 20 years. Other yeast metabolic engineering strategies have received far less attention; nonetheless, each of them is proof of the vast opportunities that yeasts offer as safe, convenient, and versatile cell factories. This section reports examples of yeast metabolic engineering strategies for the biosynthesis of new compounds, comprising mannitol, glycerol 3-phosphate, 1,2-propanediol, resveratrol and other flavonoids, and L-ascorbic acid (vitamin C) (Table 22.13). Metabolic engineering of S. cerevisiae for the production of mannitol, used as a sweetener and as a hyperosmolar agent in medical therapy [179,180], and of L-glycerol 3-phosphate, used as a building block in pharmaceutical chemistry [181], benefited from the knowledge on metabolism and redox balance generated by the work on glycerol (Section 22.3.1 and Figure 22.2). E. coli NAD+-dependent mannitol 1-phosphate dehydrogenase, which reduces fructose 6-phosphate to mannitol 1-phosphate, was expressed in a ∆gpd1∆gpd2 strain that is not capable of anaerobic growth, since cytosolic NAD+ is not regenerated by Gpd activity. Anaerobic mannitol formation was observed, although it was not sufficient to restore growth [182]. However, the absence of growth was assumed to result from detrimental intracellular mannitol accumulation rather than from insufficient cofactor regeneration by the heterologous dehydrogenase. Regarding glycerol 3-phosphate, interruption of the glycerol pathway downstream of glycerol 3-phosphate combined with GPD1 overexpression (Figure 22.2) led to a slight accumulation of glycerol 3-phosphate [181].
Additional PDC2 deletion, as previously shown [127], caused a further redirection of the carbon flux toward the glycerol branch, with an additional increase in glycerol 3-phosphate titers up to 17 mg/g DCW. However, this achievement was accompanied by the previously reported detrimental effects on growth [119,121,127] and product inhibition of Gpd was suspected to cause this very low productivity.
Table 22.13 Other Compounds
Compound | Strain | Activity(ies)/Gene(s) | Genetic Engineering Approach^a | Promoter | Cultivation Conditions^b | Scale (l) | Compound Final Titer | Reference
Mannitol | S. cerevisiae W303-1A | Mannitol 1-phosphate dehydrogenase (mtlD; Escherichia coli); GPP1; GPP2 | P; ∆; ∆ | TPI1; –; – | B, aerobic + anaerobic, mineral medium | 2 | 0.33 g/l | [182]
Glycerol 3-phosphate | S. cerevisiae W303-1A | GPD1; PDC2; GPP1; GPP2 | P; ∆; ∆; ∆ | (–)^c; –; –; – | B, aerobic, mineral medium | – | 17 mg/g DCW | [181]
1,2-Propanediol | S. cerevisiae NOY386aA | Methylglyoxal synthase (mgs; Escherichia coli); glycerol dehydrogenase (gldA; Escherichia coli) | I; I | CUP1; CUP1 | B, aerobic, complex medium | 0.005 | 0.18 g/g DCW | [183]
Resveratrol | S. cerevisiae Lab FY23 | Coenzyme A ligase (4CL-116; hybrid Populus trichocarpa x deltoides); resveratrol synthase (vst1; Vitis vinifera) | P; P | ADH2; ENO2 | B, aerobic, mineral medium + p-coumaric acid | 0.2 | 1.45 µg/l | [187]
Flavonoids (naringenin) | S. cerevisiae AH22 | Phenylalanine ammonia lyase (PAL; Rhodosporidium toruloides); 4-coumarate:coenzyme A ligase (4CL; Arabidopsis thaliana); chalcone synthase (CHS; Hypericum androsaemum) | P; P; P | GAL10; GAL10; GAL10 | B, aerobic, complex medium | 0.025 | 7.0 mg/l | [189]
Nicotianamine | S. cerevisiae SCY4^d | Nicotianamine synthase (AtNAS2; Arabidopsis thaliana) | P | TDH(–) | B, aerobic, mineral medium | 0.05 | 0.766 mg/g WCW | [22]
L-Ascorbic acid | S. cerevisiae GRF18; Z. bailii ATCC60483 | L-galactose dehydrogenase (LGDH; Arabidopsis thaliana); D-arabino-1,4-lactone oxidase (ALO1; S. cerevisiae) | P; P | ScTPI1; ScTPI1 | B, aerobic, mineral medium | – | 70 mg/l (S. cerevisiae); 2.8 mg/l (Z. bailii) | [193]
^a P: plasmid vector; I: chromosomal integration; ∆: gene deletion.
^b B: Batch.
^c Not clear.
^d The strain had been previously engineered for accumulation of S-adenosylmethionine [190].
The compound 1,2-propanediol has a number of applications as an additive in plastics, cosmetics, preservatives, and antifreeze agents [183]. In S. cerevisiae, two E. coli genes encoding methylglyoxal synthase (mgs) and glycerol dehydrogenase (gldA) were integrated into the yeast genome to enable the conversion of DHAP to 1,2-propanediol, and a 1,2-propanediol production of 0.18 g/g DCW, corresponding to a final titer of ~0.24 g/l, was achieved (Table 22.13) [183]. The introduced activities do not represent all the reactions needed to convert DHAP to 1,2-propanediol, and the intermediate steps in the pathway were presumably catalyzed by native S. cerevisiae enzymes. In addition, expression of methylglyoxal synthase led to the accumulation of byproducts such as methylglyoxal, lactaldehyde, or acetol [184]. The genetic approach used to construct this 1,2-propanediol-producing strain was elaborate. Two sets of haploid strains, each containing only one of the two genes integrated in one, two, or three copies, were created. Upon mating, a complete set of diploid strains covering all possible combinations of copy numbers of the two genes was generated. Enzymatic activities correlated with copy number in the haploid strains; after mating, however, the correlation was lost. In addition, 1,2-propanediol production correlated neither with copy number nor with enzymatic activity. The authors speculated that the variation could be due to strain variance, since the two original haploid strains were not isogenic. Unfortunately, this variation was not investigated further, for example by using different strains.
Flavonoid compounds constitute a large family of plant pigments reported to be beneficial for health [185–187]. For example, the grape flavonoid resveratrol is an antioxidant with putative antiaging properties and is believed to be responsible for some of the positive properties attributed to red wine [186–188].
The starting point for the biosynthesis of complex flavonoid compounds is p-coumaroyl-CoA, an activated form of p-coumaric acid that is generated from L-phenylalanine (or L-tyrosine in some plants) in the upper part of the so-called phenylpropanoid pathway [189]. One molecule of resveratrol is synthesized by resveratrol synthase from one molecule of p-coumaroyl-CoA and three molecules of malonyl-CoA. With the aim of increasing the resveratrol content of wine, an S. cerevisiae strain producing 1.45 µg/l of this compound was obtained by expressing resveratrol synthase together with a coenzyme A ligase that synthesizes p-coumaroyl-CoA from coumaric acid (Table 22.13) [187]. The strategy relied on the fact that coumaric acid is present in grape must and that malonyl-CoA is naturally synthesized by yeast. This simple and pragmatic approach focused on the particular application in wine strains. However, more extensive modification of metabolism would probably have a greater impact on strain performance and wine quality. A strain designed for resveratrol production from a medium devoid of precursors would have to be endowed with the enzymatic activities of the upper part of the phenylpropanoid pathway, and perhaps also with genetic modifications that enhance the carbon flux toward the required precursors. From this perspective, there would be an interesting connection with some of the work done on terpenoid production (Section 22.4.2 and Table 22.12), since malonyl-CoA is derived from the carboxylation of acetyl-CoA. Indeed, much higher titers of flavonoids such as naringenin (7 mg/l) were obtained when a phenylalanine ammonia lyase was expressed together with a coenzyme A ligase (4-coumarate:coenzyme A ligase) [189]. In this case, the final biosynthetic step was accomplished by a heterologously expressed chalcone synthase, which synthesizes naringenin from p-coumaroyl-CoA and malonyl-CoA (again in a stoichiometric ratio of 1:3).
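The 1:3 stoichiometry of p-coumaroyl-CoA to malonyl-CoA described above can be turned into a rough estimate of precursor demand. The following Python sketch is an illustration only: the molar masses are standard values that are not given in the text, the function name `precursor_demand` is hypothetical, and the calculation assumes that each malonyl-CoA derives from one acetyl-CoA and that there are no losses to side reactions.

```python
# Rough precursor demand for resveratrol and naringenin (illustrative sketch).
# Assumptions not taken from the chapter: molar masses (g/mol), no side reactions.
MOLAR_MASS = {"resveratrol": 228.24, "naringenin": 272.25}


def precursor_demand(product, titer_g_per_l):
    """Return the molar demand (mol/l) implied by a product titer."""
    product_mol = titer_g_per_l / MOLAR_MASS[product]
    return {
        "product": product_mol,
        "p_coumaroyl_coa": product_mol,  # 1 per product molecule
        "malonyl_coa": 3 * product_mol,  # 3 per product molecule
        "acetyl_coa": 3 * product_mol,   # 1 carboxylation per malonyl-CoA
    }


naringenin = precursor_demand("naringenin", 7e-3)       # 7 mg/l [189]
resveratrol = precursor_demand("resveratrol", 1.45e-6)  # 1.45 µg/l [187]

print(f"naringenin:  {naringenin['malonyl_coa'] * 1e6:.1f} µmol/l malonyl-CoA")
print(f"resveratrol: {resveratrol['malonyl_coa'] * 1e9:.1f} nmol/l malonyl-CoA")
```

Under these assumptions, the malonyl-CoA drawn by either product at the reported titers is in the micromolar range or below.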
Nicotianamine (NA) is a potential antihypertensive molecule found in plants, where it is synthesized through the trimerization of S-adenosylmethionine, catalyzed by the enzyme NA synthase [190]. Production of 0.766 mg NA per g wet cell weight (Table 22.13) was achieved upon expression of A. thaliana NA synthase in the S. cerevisiae SCY4 strain, which had previously been engineered for enhanced S-adenosylmethionine accumulation [22,190]. It was pointed out that NA production exceeded the amount that could theoretically be produced from the accumulated S-adenosylmethionine. Since NA appears to be less toxic to the cells than S-adenosylmethionine, consumption of S-adenosylmethionine by NA synthase could benefit the cell and increase the overall flux toward the end product. This supports the idea that accumulation of biosynthetic intermediates can reduce the yield of a desired product not only because it represents a loss of carbon but also, in the worse case, because toxic intermediates are detrimental to the cells.
Table 22.14 L-Malic Acid
Strain | Activity(ies) | Genetic Engineering Approach^a | Promoter | Auxotrophies or Antibiotic(s) Resistance | Cultivation Conditions^b | Initial Malate Concentration (g/l) | Scale (l) | Outcome: Time (days) for Total Malate Conversion | Reference
YPH259 | Malate permease (mae1; Schizosaccharomyces pombe); malolactic enzyme (mleS; Lactococcus lactis) | P; P | PGK1; PGK1 | Ade/His/Lys | B, mineral medium | 4.5 | – | 4 | [204]
OL1 | Malate permease (mae1; Schizosaccharomyces pombe); malolactic enzyme (mleS; Lactococcus lactis) | P; P | PGK1; ADH1 | His | B, anaerobic, mineral medium | 3 | 1.1 | ~4 | [211]
YPH259 | Malate permease (mae1; Schizosaccharomyces pombe); malic enzyme (mae2; Schizosaccharomyces pombe) | I; I | PGK1; PGK1 | Ade/His/Leu/Lys/Ura | B, anaerobic, grape must | 5 | 0.8 | 10 | [114]
Ind. S92 | Malate permease (mae1; Schizosaccharomyces pombe); malolactic enzyme (mleA; Oenococcus oeni) | I; I | PGK1; PGK1 | – | B, anaerobic, grape must | 5.5 | 0.5 | 9 | [212]
^a P: plasmid vector; I: chromosomal integration.
^b B: Batch.
Vitamin C (L-ascorbic acid) is an essential nutrient for humans and some other animals. It affects a number of cellular functions and is an active antioxidant and free-radical scavenger [192,193]. Instead of L-ascorbic acid, yeast synthesizes erythroascorbic acid, a related compound with similar properties. The last two steps of the yeast erythroascorbic acid biosynthesis pathway resemble the corresponding steps of the plant pathway for L-ascorbic acid. Moreover, S. cerevisiae and Z. bailii do accumulate L-ascorbic acid when incubated with plant pathway intermediates [193]. On this basis, a strategy for producing vitamin C was successfully implemented in the two yeasts [193]. More specifically, (over)expression in the two yeasts of Arabidopsis thaliana L-galactose dehydrogenase and S. cerevisiae D-arabino-1,4-lactone oxidase (ALO1) allowed partial conversion of 250 mg/l L-galactose to 70 mg/l and 2.8 mg/l L-ascorbic acid in S. cerevisiae and Z. bailii, respectively (Table 22.13) [193]. The strategy was further developed through the additional expression of A. thaliana mannose epimerase and/or myo-inositol phosphatase, allowing L-ascorbic acid production from D-glucose in both yeast species [30]. The recombinant strains producing L-ascorbic acid also gained in robustness, notably toward weak organic acids and hydrogen peroxide [194]. These cellular properties, which are desirable in industrial yeast strains, could be advantageous in the context of other metabolic engineering strategies, such as organic acid production.
Antibiotics represent another very important category of chemicals produced by fermentation. A recent report presented the successful expression in H. polymorpha of the Penicillium chrysogenum pcl gene, which encodes the penicillin biosynthesis pathway enzyme phenylacetyl-CoA ligase [194]. Although reconstruction of the penicillin pathway would require the expression of several more activities, this work opens new possibilities for metabolic engineering in H. polymorpha, because it required the construction of a novel vector and selection system that could be useful for other applications.
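The L-galactose bioconversion figures quoted above for L-ascorbic acid (250 mg/l substrate giving 70 mg/l and 2.8 mg/l product) can be expressed on a molar basis. The sketch below is only an illustration: the molar masses are standard values not stated in the text, the function name is hypothetical, and the result is the fraction of supplied (not necessarily consumed) L-galactose recovered as product.

```python
# Molar conversion of supplied L-galactose to L-ascorbic acid (illustrative sketch).
# Assumption not taken from the chapter: molar masses in g/mol.
MW_L_GALACTOSE = 180.16
MW_L_ASCORBIC_ACID = 176.12


def molar_conversion(product_mg_per_l, substrate_mg_per_l):
    """Fraction of the supplied substrate (mol) recovered as product (mol)."""
    product_mmol = product_mg_per_l / MW_L_ASCORBIC_ACID
    substrate_mmol = substrate_mg_per_l / MW_L_GALACTOSE
    return product_mmol / substrate_mmol


# Titers reported for the two engineered yeasts [193]
reported = {"S. cerevisiae": 70.0, "Z. bailii": 2.8}
for yeast, titer in reported.items():
    percent = 100 * molar_conversion(titer, 250.0)
    print(f"{yeast}: {percent:.1f}% of supplied L-galactose")
```

Because the two molar masses are similar, the molar figures stay close to the simple mass ratios.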
22.4.5 Heterologous Proteins
Heterologous protein production is probably the field in which the largest number of applications of recombinant yeasts has been implemented industrially. Most of the metabolic engineering strategies described so far in this chapter entailed the expression of heterologous proteins in yeast. In those cases, however, the expressed protein was used to carry out its regulatory or catalytic function in the context of the cellular system. In this section, the expression of a heterologous protein is aimed at producing the protein itself. Beyond transformation of the yeast with the gene encoding the protein of interest, most of the engineering efforts in the field have focused on modifying the glycosylation pathway and, in some cases, on improving secretion [23,196]. For a detailed overview of the key aspects of heterologous protein production in a number of yeast species, the reader is referred to Refs. [24,32,197].
22.5 Improved Cellular Properties
In the context of an industrial process, the chosen microorganism may face harsh conditions, such as the presence of inhibitory compounds, nonoptimal or fluctuating pH, low oxygen levels, and limited nutrient availability. The cell's efforts to cope with and respond to adverse conditions are energy demanding and therefore directly affect biomass and product yield. Furthermore, some compounds directly inhibit enzymes involved in the synthesis of products of interest, thereby reducing the flux through the pathway and ultimately lowering productivity [198,199]. Ethanolic fermentation of lignocellulosic hydrolyzates is an example of a process in which the microorganism of choice has to face a combination of environmental stresses [198]. First, the microorganism can be exposed to very high sugar concentrations at the beginning of the fermentation, while the final ethanol titer can reach inhibitory levels. In addition, lignocellulosic hydrolyzates contain a variety of compounds that affect fermentation performance at different levels. The yeast S. cerevisiae is naturally tolerant to high sugar concentrations, to ethanol, and to some extent also to a variety of inhibitory compounds. However, to achieve optimal process performance, it is desirable to improve these properties further. Inhibitory compounds in lignocellulosic hydrolyzates, their effects on S. cerevisiae, and metabolic engineering strategies to obtain more tolerant strains have recently been reviewed [197]. In another line of research, L-ascorbic acid accumulation in engineered S. cerevisiae and Z. bailii has been shown to improve the overall intracellular antioxidant capacity of the cell (Section 22.4.3) [30,193]. Evolutionary engineering has also been used to improve the resistance of a laboratory strain to multiple stresses [200]. In that case, the most successful strategy entailed batch growth with selection pressure applied through freezing and thawing, whereas other strategies produced only strains with improved characteristics for the specific kind of stress applied during the selection procedure [200]. The reason for this evolutionary response of S. cerevisiae has to be sought in the molecular mechanisms of resistance to each specific stress. In particular, elucidation of the regulatory networks underlying the stress response could clarify possible hierarchies and interdependences among the different responses. To this end, as suggested by the authors, characterization of the transcriptome, proteome, and metabolome of the evolved strains would provide useful information. Directed evolution of cell populations, driven by spontaneous mutations and mutagenesis, is a holistic approach which, in contrast to targeted genetic modifications, affects the whole cellular system and can produce unforeseeable but favorable rearrangements not only in the metabolic network but also in the regulatory network. Global transcription machinery engineering is an intermediate approach in which mutation of one or more global regulatory transcription factors results in extensive cellular reprogramming. This approach was used to improve ethanol tolerance in S. cerevisiae by random mutagenesis of the TATA-binding protein encoded by SPT15 [201].
Rapid response to sudden changes in environmental conditions requires considerable amounts of cellular energy in the form of ATP. Some types of animal cells are able to store high-energy phosphate in so-called phosphagens, such as phosphocreatine and phosphoarginine [202]. ATP can be generated promptly from this pool, allowing the cells to respond to sudden increases in cellular energy requirements. Under physiological conditions, the phosphocreatine and phosphoarginine pools are refilled by the respective phosphagen kinases [199,202]. Yeast does not possess any phosphagen pathway. On the hypothesis that a functional phosphagen pathway would improve the ability to cope with environmental stresses, arginine kinase was expressed in an S. cerevisiae strain [203]. Compared with a control strain, the engineered strain showed an increased ability to respond to brief periods of starvation, which translated into higher biomass yields and a more constant intracellular ATP pool [199,203].
22.6 Yeast as Biocatalyst
22.6.1 L-Malic Acid Degradation
Malic acid is the most abundant organic acid in grape must [204]. Wine yeast strains performing the primary (alcoholic) wine fermentation are not able to metabolize extracellular malic acid efficiently; therefore a secondary (malolactic) fermentation, which is normally performed by lactic acid bacteria, is needed for many wines [114,205,206]. During malolactic fermentation, malic acid is converted to L-(+)-lactic acid and CO2 by lactic acid bacteria in a reaction catalyzed by the malolactic enzyme. However, lactic acid bacteria grow poorly in wine, and this sometimes halts the malolactic fermentation, which is then referred to as “sluggish” or “stuck” [206]. The designation “stuck fermentation” is used improperly in this case, since in wine making the term refers to premature termination of yeast growth during alcoholic fermentation [207]. Other yeast species, such as S. pombe, can, to a lesser extent, degrade malate via an alternative fermentative route, the so-called maloethanolic fermentation. In maloethanolic fermentation, extracellular malate is converted to pyruvate by malic enzyme and ultimately to ethanol. However, growth of S. pombe is undesirable, since it is often associated with the formation of off-flavors [204]. The goal of a number of metabolic engineering strategies has been to obtain an S. cerevisiae strain able to degrade malic acid completely via either malolactic or maloethanolic fermentation. In the great majority of the approaches, the core of the strategy was the expression of a malolactic enzyme from lactic acid bacteria (Table 22.14) [205,208–212]. The expression in S. cerevisiae of the malic enzyme from S. pombe was also investigated (Table 22.14) [114,205].
Initially, expression in yeast of the L. lactis gene mleS alone, encoding the malolactic enzyme, did not have any significant effect on the yeast's capacity to metabolize extracellular malic acid [209]. Despite the high malolactic activity detectable in crude cell extracts, the recombinant strain converted approximately the same amount of malic acid as the parent strain, and only small amounts of lactate were produced. To elucidate the fluxes toward lactate, a recombinant yeast expressing mleS was grown in the presence of [13C]glucose and [14C]malate. Isotopic filiation showed that (i) only 25% of the lactate produced originated from extracellular malate, (ii) all the exogenous malate that was degraded was converted to lactate, and (iii) lactate was also produced in the absence of any added extracellular malic acid. Altogether, the results suggested that malate uptake was important for malolactic fermentation [210]. Indeed, complete malate conversion was achieved with a laboratory strain coexpressing malate permease (encoded by mae1 from S. pombe) and the L. lactis malolactic enzyme [205]. It was also shown that malate conversion correlated with the expression level of the transporter [211].
Conversion of malate via maloethanolic fermentation, as in S. pombe, has sometimes been preferred, since in certain varieties of wine malolactic fermentation is considered to affect the organoleptic profile negatively [205]. Partial malate conversion to ethanol via maloethanolic fermentation was achieved in 2% glucose medium using S. cerevisiae laboratory strains expressing S. pombe malate permease and malic enzyme (encoded by mae2) [205]. However, when the medium was also supplemented with 10% w/v fructose and 10% w/v glucose, malate conversion was no longer observed for the maloethanolic strain, while a related malolactic strain could completely convert 4.5 g/l of malate in four days (Table 22.14). The authors argued that the malic enzyme could suffer from either cofactor shortage or pyruvate inhibition, owing to the high glycolytic flux in the presence of high extracellular sugar concentrations. Nonetheless, a related strain obtained later, in which mae1 and mae2 were chromosomally integrated, was shown to convert 5 g/l of malate entirely in 10% glucose medium [205]. Since differences in enzyme levels were ruled out by Western blot experiments, the only difference seemed to be the presence of 10% fructose. Unfortunately, comparisons of the two strains under identical conditions are missing, and the problem remains to be elucidated.
Finally, a commercial wine strain harboring the S. pombe permease and the Oenococcus oeni malolactic enzyme, and able to degrade malate under the operating conditions of wine fermentation, has been generated and granted Generally Recognized as Safe (GRAS) status by the FDA [210]. Great attention was given to meeting the requirements for a food-grade recombinant microorganism. The sources of genetic material are the organisms S. pombe and O. oeni, which are naturally present in the wine microflora. In addition, the two heterologous genes were introduced by cotransformation with a phleomycin resistance plasmid and targeted to the chromosomal URA3 locus via homologous recombination. Selection of the recombinant clones on phleomycin medium was followed by screening for lactic acid accumulation. After the plasmid was cured by growth on nonselective medium, the resulting strain was devoid of bacterial or antibiotic resistance sequences. The new strain has also been thoroughly characterized at the genetic, transcriptomic, proteomic, and physiological levels. The resulting data allowed the researchers to claim substantial equivalence with the parental strain.
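The initial malate concentrations and conversion times listed in Table 22.14 can be compared through an average volumetric conversion rate. The sketch below is an illustration under stated assumptions: the pairing of concentrations, times, and references follows the table as read here, the “~4” days entry is taken as exactly 4, and the function name is hypothetical. The strains were grown in different media and at different scales, so the numbers are not a like-for-like ranking.

```python
# Average volumetric malate conversion rates derived from Table 22.14 (sketch).
# Each entry: (initial malate in g/l, days to total conversion).
# Assumption: the "~4" days reported for strain OL1 is taken as 4.
STRAINS = {
    "YPH259, mae1 + mleS [204]": (4.5, 4),
    "OL1, mae1 + mleS [211]": (3.0, 4),
    "YPH259, mae1 + mae2 [114]": (5.0, 10),
    "Ind. S92, mae1 + mleA [212]": (5.5, 9),
}


def average_rate(initial_g_per_l, days):
    """Average conversion rate in g/l/day, assuming complete conversion."""
    if days <= 0:
        raise ValueError("days must be positive")
    return initial_g_per_l / days


rates = {name: average_rate(*values) for name, values in STRAINS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} g/l/day")
```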
22.6.2 Xylitol Production
Xylitol is a five-carbon polyol with a number of uses in the food and pharmaceutical industries owing to its sweetening and anticariogenic properties [211]. Xylitol is produced on a large scale via chemical reduction of D-xylose. Chemical synthesis offers high productivity, but it has been suggested that bioconversion could offer higher yields and lower costs and could therefore be economically advantageous [212]. Furthermore, microbial conversion offers high selectivity for D-xylose and thus requires a less pure substrate [213]. Several yeasts, particularly those belonging to the genera Candida and Pichia [215,216], are naturally able to reduce xylose to xylitol by means of xylose reductase. Research on xylitol production by yeast bioconversion has mainly focused on process engineering, but examples of metabolic engineering of S. cerevisiae and C. tropicalis have also been reported [217,218]. In the case of S. cerevisiae, a laboratory strain was transformed with P. stipitis xylose reductase, which enabled it to convert xylose to xylitol with a yield of 0.95 g/g [217]. C. tropicalis naturally expresses a complete xylose utilization pathway, consisting of xylose reductase and xylitol dehydrogenase. The metabolic engineering strategy in this case therefore entailed blocking the pathway downstream of the product of interest by deleting the gene encoding xylitol dehydrogenase [218]. The recombinant strain needed to be supplemented with glycerol as a cosubstrate for cell growth and cofactor regeneration, which allowed xylitol yields on xylose as high as 0.97 g/g. However, the bioconversion method has not yet been able to compete with the chemical method on an industrial scale.
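The reported xylitol yields of 0.95 and 0.97 g/g can be compared with the stoichiometric maximum for the reduction of one xylose to one xylitol. The sketch below is an illustration only: the molar masses are standard values not given in the text, and the function name is hypothetical.

```python
# Reported xylitol yields as a fraction of the stoichiometric maximum (sketch).
# Assumptions not taken from the chapter: molar masses in g/mol and a
# 1:1 molar conversion of xylose to xylitol.
MW_XYLOSE = 150.13
MW_XYLITOL = 152.15

# Maximum mass yield: g xylitol per g xylose
MAX_YIELD = MW_XYLITOL / MW_XYLOSE


def fraction_of_maximum(yield_g_per_g):
    """Reported mass yield divided by the stoichiometric maximum."""
    return yield_g_per_g / MAX_YIELD


reported = {"S. cerevisiae [217]": 0.95, "C. tropicalis [218]": 0.97}
print(f"stoichiometric maximum: {MAX_YIELD:.3f} g/g")
for strain, y in reported.items():
    print(f"{strain}: {100 * fraction_of_maximum(y):.1f}% of maximum")
```

Since xylitol is slightly heavier than xylose, the maximum mass yield is just above 1 g/g, and both reported yields lie within a few percent of it.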
22.6.3 Stereoselective Bioreduction
Enantiomerically pure organic molecules are important building blocks for the production of drugs and fine chemicals. In particular, chiral alcohols, derived from the stereoselective reduction of β-ketoesters and diketones, are used for the synthesis of a number of economically interesting compounds [219]. Chemical routes for the synthesis of specific enantiomers can be complex, give low product yields, and require extreme reaction conditions, such as high temperature and pressure [220]. In contrast, whole-cell microbial catalysts may represent a cheaper and more accessible alternative for stereoselective reduction, also for nonspecialists. Wild-type strains of S. cerevisiae and other yeast species have been used extensively for stereoselective bioreductions at small scale. However, economically feasible production of chiral alcohols by yeast requires further strain engineering to increase product yield and productivity and to decrease the amount of cosubstrate, usually glucose, used for NAD(P)H cofactor regeneration. Examples of metabolic engineering of S. cerevisiae for this purpose have started to emerge [221]. Increased enantiomeric excess (ee) of the desired alcohol enantiomer was obtained by overexpressing suitable reductase genes and/or deleting genes encoding reductases that generated unwanted enantiomers [220,222]. In addition, a reduction in the level of phosphoglucose isomerase (Pgi) made it possible to reduce glucose uptake and to channel more glucose toward NADPH generation, which helped to reduce the glucose requirement by a factor of 10 [222,223]. Expression of heterologous or mutated reductases, strain engineering for enhanced cofactor regeneration, and process engineering still leave significant room for improvement of whole-cell bioreductions, which certainly have the potential to be adopted as efficient and environmentally friendly processes for the production of complex chiral molecules.
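The enantiomeric excess (ee) mentioned above is conventionally defined as the difference between the amounts of the two enantiomers divided by their sum. The following sketch shows that calculation; the function name and the example amounts are hypothetical and do not come from the studies cited.

```python
# Enantiomeric excess from the amounts (or peak areas) of two enantiomers.
# The example values below are hypothetical, chosen only to show the arithmetic.
def enantiomeric_excess(major, minor):
    """Return ee in percent: 100 * |major - minor| / (major + minor)."""
    total = major + minor
    if major < 0 or minor < 0 or total == 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return 100.0 * abs(major - minor) / total


# A product mixture with 97.5 parts of one enantiomer and 2.5 parts of the other
print(f"ee = {enantiomeric_excess(97.5, 2.5):.1f}%")
# A racemic mixture has no excess of either enantiomer
print(f"ee = {enantiomeric_excess(50.0, 50.0):.1f}%")
```

Deleting a reductase that forms the unwanted enantiomer lowers the `minor` term, which is how the strain modifications described above raise ee.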
22.7 Concluding Remarks
In this review, a number of examples of metabolic engineering of yeast have been discussed. Some strategies proved successful, progressing from proof of principle to the construction of new strains ready to be exploited in an industrial context, such as the wine strains able to degrade malic acid (Section 22.6.1 and Table 22.14) [212] or brewers' strains with negligible diacetyl production (Section 22.3.3 and Table 22.10) [151,224]. However, such recombinant strains may never be used in commercial applications, although at least one is commercially available in some countries [212]. This is linked to public acceptance of recombinant DNA technology applied to foods. In applications such as bread, beer, and wine making, which are strongly characterized by adherence to traditional practices and even to some sorts of “rituals,” especially in the case of wine, the advantages offered by recombinant yeast strains may not counterbalance the negative perception that, at least today, would prevent a relevant part of the public from accepting the innovative product.
Engineering complex pathways for the production of unusual molecules in yeast has proven to be feasible (Sections 22.4.2 and 22.4.3). Some examples are seminal, paving the way for the future development of more process-friendly strains and generating useful knowledge for the whole yeast metabolic engineering community. Nonetheless, all reported examples are still quite far from industrial application, and they mainly represent a proof of concept. This is not only because of low titers or productivities. Once the proof of concept is given, the whole metabolic engineering strategy has to be rethought from the perspective of the final, real aim: scale-up and production with a system constituted by more or less growing, or more or less living, cells. Yeast physiology should therefore always be taken into consideration when it comes to products with, for example, high redox and oxygen demands, such as terpenoids or steroids. Remarkably, none of the studies reviewed in Section 22.4.2 has gone further than tube or, sometimes, shake flask culture, with no assessment of, for example, the quality of aeration. Moreover, with few exceptions, galactose-inducible promoters always seem to be the easiest choice (Tables 22.13 and 22.14). This is somewhat questionable, because (i) it has been shown that strong, constitutive promoters are more effective for the expression of heterologous activities [225]; (ii) induction with galactose is not feasible in cheap industrial media; and, most importantly, (iii) a shift in carbon source, especially from glucose to galactose, exerts profound effects on central carbon metabolism. In other cases, engineered strains retained one or more auxotrophies, which heavily influence strain performance [226]. Finally, besides accelerating the overall technology transfer from proof of concept to application, the development of more process-oriented strains would open the opportunity for great improvement of the results through process engineering.
References
1. Bailey, J.E. Toward a science of metabolic engineering. Science, 252, 1668, 1991.
2. Rudolf, A., Karhumaa, K., and Hahn-Hägerdal, B. Yeast Biotechnology: Diversity and Applications. Satyanarayana, T. and Kunze, G. Eds. Springer, Berlin-Heidelberg, Germany, 2009.
3. Hahn-Hägerdal, B., Karhumaa, K., and Fonseca, C. et al. Towards industrial pentose-fermenting yeast strains. Appl. Microbiol. Biotechnol., 74, 937, 2007.
4. Rose, A.H. and Harrison, J.S. The Yeasts, Vol. 1–4. Academic Press Ltd., London, U.K., 1989.
5. Kurtzman, C.P. and Fell, J.W. The Yeasts, A Taxonomic Study. Elsevier, Amsterdam, the Netherlands, 1998.
6. Ausubel, F.M., Brent, R., and Kingston, E. et al. Current Protocols in Molecular Biology. John Wiley & Sons, New York, 1995.
7. Guthrie, C. and Fink, G.R. Guide to Yeast Genetics and Molecular Biology. Part A, Volume 194. Guthrie, C., Fink, G.R., Abelson, J., and Simon, M. Eds. Academic Press, San Diego, CA, 1991.
8. Guthrie, C. and Fink, G.R. Part B: Guide to Yeast Genetics and Molecular and Cell Biology, Volume 350. Guthrie, C. and Fink, G.R. Eds. Academic Press, San Diego, CA, 2002.
9. Guthrie, C. and Fink, G.R. Guide to Yeast Genetics and Molecular and Cell Biology Part C, Volume 351. Guthrie, C. and Fink, G.R. Eds. Academic Press, San Diego, CA, 2002.
10. Sambrook, J. and Russell, D.W. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 2001.
11. Mellor, J., Dobson, M.J., and Roberts, N.A. et al. Efficient synthesis of enzymatically active calf chymosin in Saccharomyces cerevisiae. Gene, 24, 1, 1983.
12. McAleer, W.J., Buynak, E.B., and Maigetter, R.Z. et al. Human hepatitis B vaccine from recombinant yeast. Nature, 307, 178, 1984.
13. Chen, C.Y., Oppermann, H., and Hitzeman, R.A. Homologous versus heterologous gene expression in the yeast, Saccharomyces cerevisiae. Nucl. Acids Res., 12, 8951, 1984.
14. Goffeau, A., Barrell, B.G., and Bussey, H. et al. Life with 6000 genes. Science, 274, 546, 1996.
15. Clayton, R.A., White, O., and Ketchum, K.A. et al. The first genome from the third domain of life. Nature, 387, 459, 1997.
16. Kitano, H. Systems Biology: a brief overview. Science, 295, 1662, 2002.
17. Lee, S.Y., Lee, D.Y., and Kim, T.Y. Systems biotechnology for strain improvement. Trends Biotechnol., 23, 349, 2005. 18. Forster, J., Famili, I., and Fu, P. et al. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 13, 244, 2003. 19. Patil, K., Rocha, I., and Förster, J. et al. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6, 308, 2005. 20. Westergaard, S.L., Oliveira, A.P., and Bro, C. et al. A systems biology approach to study glucose repression in the yeast Saccharomyces cerevisiae. Biotechnol. Bioeng., 96, 134, 2007. 21. Nielsen, J. and Jewett, M.C. Impact of systems biology on metabolic engineering of Saccharomyces cerevisiae. FEMS Yeast Res., 8, 122, 2008. 22. Wada, Y., Kobayashi, T., and Takahashi, M. et al. Metabolic engineering of Saccharomyces cerevisiae producing nicotianamine: potential for industrial biosynthesis of a novel antihypertensive substrate. Biosci. Biotechnol. Biochem., 70, 1408, 2006. 23. Kjeldsen, T., Ludvigsen, S., and Diers, I. et al. Engineering-enhanced protein secretory expression in yeast with application to insulin. J. Biol. Chem., 277, 18245, 2002. 24. Gerngross, T.U. Advances in the production of human therapeutic proteins in yeasts and filamentous fungi. Nat. Biotechnol. 22, 1409, 2004. 25. Ro, D.K., Paradise, E.M., and Ouellet, M. et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature, 440, 940, 2006. 26. Steinbuchel, A. Non-biodegradable biopolymers from renewable resources: perspectives and impacts. Curr. Opin. Biotechnol., 16, 607, 2005. 27. Otero, J.M., Panagiotou, G., and Olsson, L. Fueling industrial biotechnology growth with bioethanol. Adv. Biochem. Eng. Biotechnol., 108, 1, 2007. 28. Abbas, C.A. The Alcohol Textbook. Nottingham University Press, Nottingham. U.K., 2003. 29. Boekhout, T. and Kurtzman, C.P. 
Principles and methods used in yeast classification, and an overview of currently accepted yeast genera. In Nonconventional Yeasts in Biotechnology. Wolf, K. Ed. Springer, Berlin-Heidelberg, Germany, 1996, pp. 1. 30. Branduardi, P., Sauer, M., and Mattanovich, D. et al. 2005. Ascorbic acid production from D-glucose in yeast. Patent application no. 20060234360. 31. Passoth, V., Fredlund, E., and Druvefors, U.A. et al. Biotechnology, physiology and genetics of the yeast Pichia anomala. FEMS Yeast Res., 6, 3, 2006. 32. Gellissen, G., Kunze, G., and Gaillardin, C. et al. New yeast expression platforms based on methylotrophic Hansenula polymorpha and Pichia pastoris and on dimorphic Arxula adeninivorans and Yarrowia lipolytica - A comparison. FEMS Yeast Res., 5, 1079, 2005. 33. Ryabova, O.B., Chmil, O.M., and Sibirny, A.A. Xylose and cellobiose fermentation to ethanol by the thermotolerant methylotrophic yeast Hansenula polymorpha. FEMS Yeast Res., 4, 157, 2003. 34. Wood, V., Gwilliam, R., and Rajandream, M.-A. The genome sequence of Schizosaccharomyces pombe. Nature, 415, 871, 2002. 35. Dujon, B., Sherman, D., and Fischer, G. et al. Genome evolution in yeasts. Nature, 430, 35, 2004. 36. Jeffries, T.W., Grigoriev, I.V., and Grimwood, J. et al. Genome sequence of the lignocellulose-bioconverting and xylose-fermenting yeast Pichia stipitis. Nat. Biotechnol., 25, 319, 2007. 37. Pribylova, L., Straub, M.-L., and Sychrova, H. et al. Characterisation of Zygosaccharomyces rouxii centromeres and construction of first Z. rouxii centromeric vectors. Chromosome Res., 15, 439, 2007. 38. Boretsky, Y.R., Pynyaha, Y.V., and Boretsky, V.Y. et al. Development of a transformation system for gene knock-out in the flavinogenic yeast Pichia guilliermondii. J. Microbiol. Methods, 70, 13, 2007. 39. Pecota, D.C., Rajgarhia, V., and Da Silva, N.A. Sequential gene integration for the engineering of Kluyveromyces marxianus. J. Biotechnol., 127, 408, 2007. 40. 
Schaaff, I., Heinisch, J., and Zimmermann, F.K. Overproduction of glycolytic enzymes in yeast. Yeast, 5, 285, 1989.
41. Petersen, J.G.L., Holmberg, S., and Nilsson-Tillgren, T. et al. Molecular cloning and characterization of the threonine deaminase (ILV1) gene of Saccharomyces cerevisiae. Carlsberg Res. Commun., 48, 149, 1983. 42. Gancedo, J.M. Yeast carbon catabolite repression. Microbiol. Mol. Biol. Rev., 62, 334, 1998. 43. Santangelo, G.M. Glucose signaling in Saccharomyces cerevisiae. Microbiol. Mol. Biol. Rev., 70, 253, 2006. 44. Tullin, S., Gjermansen, C., and Kielland-Brandt, M.C. A high-affinity uptake system for branched-chain amino acids in Saccharomyces cerevisiae. Yeast, 7, 933, 1991. 45. Gjermansen, C., Nilsson-Tillgren, T., and Petersen, J.G. et al. Towards diacetyl-less brewers’ yeast. Influence of ilv2 and ilv5 mutations. J. Basic Microbiol., 28, 175, 1988. 46. Hahn-Hägerdal, B., Galbe, M., and Gorwa-Grauslund, M.F. et al. Bio-ethanol - the fuel of tomorrow from the residues of today. Trends Biotechnol., 24, 549, 2006. 47. Polizeli, M.L.T.M., Rizzatti, A.C.S., and Monti, R. et al. Xylanases from fungi: properties and industrial applications. Appl. Microbiol. Biotechnol., 67, 577, 2005. 48. Lynd, L.R., Weimer, P.J., and van Zyl, W.H. et al. Microbial cellulose utilization: fundamentals and biotechnology. Microbiol. Mol. Biol. Rev., 66, 506, 2002. 49. Chandra, R., Bura, R., and Mabee, W. et al. Substrate Pretreatment: The Key to Effective Enzymatic Hydrolysis of Lignocellulosics? Adv. Biochem. Eng. Biotechnol., 108, 67, 2007. 50. Galbe, M. and Zacchi, G. Pretreatment of lignocellulosic materials for efficient bioethanol production. In Advances in Biochemical Engineering/Biotechnology. Springer, Berlin/Heidelberg, Germany, 2007. 51. Lynd, L.R., van Zyl, W.H., and McBride, J.E. et al. Consolidated bioprocessing of cellulosic biomass: an update. Curr. Opin. Biotechnol., 16, 577, 2005. 52. Katahira, S., Mizuike, A., and Fukuda, H. et al. Ethanol fermentation from lignocellulosic hydrolysate by a recombinant xylose- and cellooligosaccharide-assimilating yeast strain. 
Appl. Microbiol. Biotechnol., 72, 1136, 2006. 53. van Rooyen, R., Hahn-Hagerdal, B., and La Grange, D.C. et al. Construction of cellobiose-growing and fermenting Saccharomyces cerevisiae strains. J. Biotechnol., 120, 284, 2005. 54. Ueda, M. and Tanaka, A. Cell surface engineering of yeast: construction of arming yeast with biocatalyst. J. Biosci. Bioeng., 90, 125, 2000. 55. Kondo, A. and Ueda, M. Yeast cell-surface display-applications of molecular display. Appl. Microbiol. Biotechnol., 64, 28, 2004. 56. Georgiou, G., Stathopoulos, C., and Daugherty, P.S. et al. Display of heterologous proteins on the surface of microorganisms: from the screening of combinatorial libraries to live recombinant vaccines. Nat. Biotech., 15, 29, 1997. 57. Fujita, Y., Ito, J., and Ueda, M. et al. Synergistic saccharification, and direct fermentation to ethanol, of amorphous cellulose by use of an engineered yeast strain codisplaying three types of cellulolytic enzyme. Appl. Environ. Microbiol., 70, 1207, 2004. 58. Den Haan, R., Rose, S.H., and Lynd, L.R. et al. Hydrolysis and fermentation of amorphous cellulose by recombinant Saccharomyces cerevisiae. Metab. Eng., 9, 87, 2007. 59. Fujita, Y., Takahashi, S., and Ueda, M. et al. Direct and efficient production of ethanol from cellulosic material with a yeast strain displaying cellulolytic enzymes. Appl. Environ. Microbiol., 68, 5136, 2002. 60. Muralikrishna, G. and Subba Rao, M.V.S.S.T. Cereal non-cellulosic polysaccharides: structure and function relationship-An overview. Crit. Rev. Food Sci. Nutr., 47, 599, 2007. 61. Medve, J., Karlsson, J., and Lee, D. et al. Hydrolysis of microcrystalline cellulose by cellobiohydrolase I and endoglucanase II from Trichoderma reesei: adsorption, sugar production pattern, and synergism of the enzymes. Biotechnol. Bioeng., 59, 621, 1998. 62. Pretorius, I.S. and Marmur, J. Localization of yeast glucoamylase genes by PFGE and OFAGE. Curr. Genet., 14, 9, 1988.
63. Salema-Oom, M., Valadao Pinto, V., and Goncalves, P. et al. Maltotriose utilization by industrial Saccharomyces strains: characterization of a new member of the α-glucoside transporter family. Appl. Environ. Microbiol., 71, 5044, 2005. 64. Spencer-Martins, I. and van Uden, N. Yields of yeast growth on starch. Appl. Microbiol. Biotechnol., 4, 29, 1977. 65. Kondo, A., Shigechi, H., and Abe, M. et al. High-level ethanol production from starch by a flocculent Saccharomyces cerevisiae strain displaying cell-surface glucoamylase. Appl. Microbiol. Biotechnol., 58, 291, 2002. 66. Eksteen, J.M., Van Rensburg, P., and Cordero Otero, R.R. et al. Starch fermentation by recombinant Saccharomyces cerevisiae strains expressing the alpha-amylase and glucoamylase genes from Lipomyces kononenkoae and Saccharomycopsis fibuligera. Biotechnol. Bioeng., 84, 639, 2003. 67. Shigechi, H., Koh, J., and Fujita, Y. et al. Direct production of ethanol from raw corn starch via fermentation by use of a novel surface-engineered yeast strain codisplaying glucoamylase and alpha-amylase. Appl. Environ. Microbiol., 70, 5037, 2004. 68. Hayn, M., Steiner, W., and Klinger, R. et al. Basic research and pilot studies on the enzymatic conversion of lignocellulosics. In Bioconversion of Forest and Agricultural Plant Residues, Saddler, J.N. Ed. CAB International, Wallingford, U.K., 1993, pp. 33. 69. Jeffries, T.W. and Jin, Y.-S. Metabolic engineering for improved fermentation of pentoses by yeasts. Appl. Microbiol. Biotechnol., 63, 495, 2004. 70. Chu, B.C.H. and Lee, H. Genetic improvement of Saccharomyces cerevisiae for xylose fermentation. Biotechnol. Adv., 25, 425, 2007. 71. van Maris, A., Abbott, D., and Bellissimi, E. et al. Alcoholic fermentation of carbon sources in biomass hydrolysates by Saccharomyces cerevisiae: current status. Antonie van Leeuwenhoek, 90, 391, 2006. 72. Dmytruk, O.V., Voronovsky, A.Y., and Abbas, C.A. et al. 
Overexpression of bacterial xylose isomerase and yeast host xylulokinase improves xylose alcoholic fermentation in the thermotolerant yeast Hansenula polymorpha. FEMS Yeast Res., doi:10.1111/j.1567-1364.2007.00289.x, E, 2007. 73. Becker, J. and Boles, E. A modified Saccharomyces cerevisiae strain that consumes L-Arabinose and produces ethanol. Appl. Environ. Microbiol., 69, 4144, 2003. 74. Karhumaa, K., Wiedemann, B., and Hahn-Hägerdal, B. et al. Co-utilization of L-arabinose and D-xylose by laboratory and industrial Saccharomyces cerevisiae strains. Microb. Cell. Fact., 5, 18, 2006. 75. Wisselink, H.W., Toirkens, M.J., and del Rosario Franco Berriel, M. et al. Engineering of Saccharomyces cerevisiae for efficient anaerobic alcoholic fermentation of L-arabinose. Appl. Environ. Microbiol., 73, 4881, 2007. 76. Bailey, J.E., Sburlati, A., and Hatzimanikatis, V. et al. Inverse metabolic engineering: A strategy for directed genetic engineering of useful phenotypes. Biotechnol. Bioeng., 52, 109, 1996. 77. Keller, M. and Boles, E. Cloning on an L-arabinose transporter from the yeast Pichia stipitis and its functional expression in recombinant Saccharomyces cerevisiae. In Physiology of Yeasts and Filamentous Fungi-PYFF 3, VTT Technical Research Centre of Finland, Helsinki, Finland, 2007. 78. Fonseca, C., Romao, R., and Rodrigues de Sousa, H. et al. L-arabinose transport and catabolism in yeast. FEBS J., 274, 3589, 2007. 79. Fonseca, C., Spencer-Martins, I., and Hahn-Hägerdal, B. L-arabinose metabolism in Candida arabinofermentans PYCC 5603T and Pichia guilliermondii PYCC 3012: influence of sugar and oxygen on product formation. Appl. Microbiol. Biotechnol., 75, 303, 2007. 80. Jeffries, T.W. Engineering yeasts for xylose metabolism. Curr. Opin. Biotechnol., 17, 320, 2006. 81. Collins, T., Gerday, C., and Feller, G. Xylanases, xylanase families and extremophilic xylanases. FEMS Microbiol. Rev., 29, 3, 2005. 82. La Grange, D.C., Claeyssens, M., and Pretorius, I.S. et al. 
Coexpression of the Bacillus pumilus beta-xylosidase (xynB) gene with the Trichoderma reesei beta-xylanase 2 (xyn2) gene in the yeast Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol., 54, 195, 2000.
83. La Grange, D.C., Pretorius, I.S., and Claeyssens, M. et al. Degradation of xylan to D-xylose by recombinant Saccharomyces cerevisiae coexpressing the Aspergillus niger beta-xylosidase (xlnD) and the Trichoderma reesei xylanase II (xyn2) genes. Appl. Environ. Microbiol., 67, 5512, 2001. 84. Den Haan, R. and Van Zyl, W.H. Enhanced xylan degradation and utilisation by Pichia stipitis overproducing fungal xylanolytic enzymes. Enzyme Microb. Technol., 33, 620, 2003. 85. Katahira, S., Fujita, Y., and Mizuike, A. et al. Construction of a xylan-fermenting yeast strain through codisplay of xylanolytic enzymes on the surface of xylose-utilizing Saccharomyces cerevisiae cells. Appl. Environ. Microbiol., 70, 5407, 2004, doi:10.1128/AEM.70.9.5407-5414.2004. 86. Kern, L., de Montigny, J., and Jund, R. et al. The FUR1 gene of Saccharomyces cerevisiae: cloning, structure and expression of wild-type and mutant alleles. Gene, 88, 149, 1990. 87. Li, K., Azadi, P., and Collins, R. et al. Relationships between activities of xylanases and xylan structures. Enzyme Microb. Technol., 27, 89, 2000. 88. Margolles-Clark, E., Tenkanen, M., and Nakari-Setala, T. et al. Cloning of genes encoding alpha-L-arabinofuranosidase and beta-xylosidase from Trichoderma reesei by expression in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 62, 3840, 1996. 89. Ostergaard, S., Roca, C., and Ronnow, B. et al. Physiological studies in aerobic batch cultivations of Saccharomyces cerevisiae strains harboring the MEL1 gene. Biotechnol. Bioeng., 68, 252, 2000. 90. Melcher, K. Galactose metabolism in Saccharomyces cerevisiae: a paradigm for eukaryotic gene regulation. In Yeast Sugar Metabolism: Biochemistry, Genetics, Biotechnology, and Applications, Zimmermann, F. K. and Entian, K. D. Eds. Technomic Publishing Company Inc., Lancaster, PA, 1997, pp. 235. 91. Ostergaard, S., Olsson, L., and Johnston, M. et al. 
Increasing galactose consumption by Saccharomyces cerevisiae through metabolic engineering of the GAL gene regulatory network. Nat. Biotechnol., 18, 1283, 2000. 92. Pronk, J.T., Steensma, H.Y., and Van Dijken, J.P. Pyruvate metabolism in Saccharomyces cerevisiae. Yeast, 12, 1607, 1996. 93. Bro, C., Knudsen, S., and Regenberg, B. et al. Improvement of galactose uptake in Saccharomyces cerevisiae through overexpression of phosphoglucomutase: example of transcript analysis as a tool in inverse metabolic engineering. Appl. Environ. Microbiol., 71, 6465, 2005. 94. Gonzalez Siso, M.I. The biotechnological utilization of cheese whey: a review. Bioresour. Technol., 57, 1, 1996. 95. Gillies, M.T. Whey Processing and Utilization. Noyes Data Corporation, London, U.K., 1974. 96. Sreekrishna, K. and Dickson, R.C. Construction of strains of Saccharomyces cerevisiae that grow on lactose. Proc. Natl. Acad. Sci. USA, 82, 7909, 1985. 97. Rubio-Texeira, M., Castrillo, J.I., and Adam, A.C. et al. Highly efficient assimilation of lactose by a metabolically engineered strain of Saccharomyces cerevisiae. Yeast, 14, 827, 1998. 98. Domingues, L., Teixeira, J.A., and Lima, N. Construction of a flocculent Saccharomyces cerevisiae fermenting lactose. Appl. Microbiol. Biotechnol., 51, 621, 1999. 99. Rubio-Texeira, M., Arevalo-Rodriguez, M., and Lequerica, J.L. et al. Lactose utilization by Saccharomyces cerevisiae strains expressing Kluyveromyces lactis LAC genes. J. Biotechnol., 84, 97, 2000. 100. Ramakrishnan, S. and Hartley, B.S. Fermentation of lactose by yeast cells secreting recombinant fungal lactase. Appl. Environ. Microbiol., 59, 4230, 1993. 101. Becerra, M., Díaz Prado, S., and Rodríguez-Belmonte, E. et al. Metabolic engineering for direct lactose utilization by Saccharomyces cerevisiae. Biotechnol. Lett., 24, 1391, 2002. 102. Compagno, C., Porro, D., and Smeraldi, C. et al. Fermentation of whey and starch by transformed Saccharomyces cerevisiae cells. Appl. Microbiol. 
Biotechnol., 43, 822, 1995. 103. Rosen, K. Production of baker’s yeast. In Yeast Biotechnology Part V, Berry, D.R., Russel, I., and Stewart, G.G. Eds. Allen & Unwin, London, U.K., 1987, pp. 471. 104. Carlson, M., Celenza, J.L., and Eng, F.J. Evolution of the dispersed SUC gene family of Saccharomyces by rearrangements of chromosome telomeres. Mol. Cell. Biol., 5, 2894, 1985.
105. Kreger-van Rij, N.J.W. The yeasts. A Taxonomic Study. Elsevier Science Publishers, B.V., Amsterdam, the Netherlands, 1984. 106. Liljeström-Suominen, P.L., Joutsjoki, V., and Korhola, M. Construction of a stable alpha-galactosidase-producing baker’s yeast strain. Appl. Environ. Microbiol., 54, 245, 1988. 107. Sumner-Smith, M., Bozzato, R.P., and Skipper, N. et al. Analysis of the inducible MEL1 gene of Saccharomyces carlsbergensis and its secreted product, alpha-galactosidase (melibiase). Gene, 36, 333, 1985. 108. Gasent-Ramirez, J.M., Codon, A.C., and Benitez, T. Characterization of genetically transformed Saccharomyces cerevisiae baker’s yeasts able to metabolize melibiose. Appl. Environ. Microbiol., 61, 2113, 1995. 109. Vincent, S.F., Bell, P.J.L., and Bissinger, P. et al. Comparison of melibiose utilizing baker’s yeast strains produced by genetic engineering and classical breeding. Lett. Appl. Microbiol., 28, 148, 1999. 110. Falco, S.C. and Dumas, K.S. Genetic analysis of mutants of Saccharomyces cerevisiae resistant to the herbicide sulfometuron methyl. Genetics, 109, 21, 1985. 111. Xiao, W. and Rank, G.H. The construction of recombinant industrial yeasts free of bacterial sequences by directed gene replacement into a nonessential region of the genome. Gene, 76, 99, 1989. 112. Rønnow, B., Olsson, L., and Nielsen, J. et al. Derepression of galactose metabolism in melibiase producing bakers’ and distillers’ yeast. J. Biotechnol., 72, 213, 1999. 113. Liu, Z., Zhang, G., and Liu, S. Constructing an amylolytic brewing yeast Saccharomyces pastorianus suitable for accelerated brewing. J. Biosci. Bioeng., 98, 414, 2004. 114. Volschenk, H., Viljoen-Bloom, M., and Subden, R.E. et al. Malo-ethanolic fermentation in grape must by recombinant strains of Saccharomyces cerevisiae. Yeast, 18, 963, 2001. 115. Oura, E. Reaction products of yeast fermentation. Process Biochem., 12, 19, 1977. 116. Nevoigt, E. and Stahl, U. 
Osmoregulation and glycerol metabolism in the yeast Saccharomyces cerevisiae. FEMS Microbiol. Rev., 21, 231, 1997. 117. Hohmann, S. Osmotic stress signaling and osmoadaptation in yeasts. Microbiol. Mol. Biol. Rev., 66, 300, 2002. 118. Wang, Z., Zhuge, J., and Fang, H. et al. Glycerol production by microbial fermentation—A review. Biotechnol. Adv., 19, 201, 2001. 119. Michnick, S., Roustan, J.L., and Remize, F. et al. Modulation of glycerol and ethanol yields during alcoholic fermentation in Saccharomyces cerevisiae strains overexpressed or disrupted for GPD1 encoding glycerol 3-phosphate dehydrogenase. Yeast, 13, 783, 1997. 120. Eglinton, J.M., Heinrich, A.J., and Pollnitz, A.P. et al. Decreasing acetic acid accumulation by a glycerol overproducing strain of Saccharomyces cerevisiae by deleting the ALD6 aldehyde dehydrogenase gene. Yeast, 19, 295, 2002. 121. Remize, F., Roustan, J.L., and Sablayrolles, J.M. et al. Glycerol overproduction by engineered Saccharomyces cerevisiae wine yeast strains leads to substantial changes in by-product formation and to a stimulation of fermentation rate in stationary phase. Appl. Environ. Microbiol., 65, 143, 1999. 122. Cambon, B., Monteil, V., and Remize, F. et al. Effects of GPD1 overexpression in Saccharomyces cerevisiae commercial wine yeast strains lacking ALD6 genes. Appl. Environ. Microbiol., 72, 4688, 2006. 123. Remize, F., Barnavon, L., and Dequin, S. Glycerol Export and Glycerol-3-phosphate Dehydrogenase, but not Glycerol Phosphatase, are rate limiting for Glycerol production in Saccharomyces cerevisiae. Metab. Eng., 3, 301, 2001. 124. Cronwright, G.R., Rohwer, J.M., and Prior, B.A. Metabolic control analysis of glycerol synthesis in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 68, 4448, 2002. 125. Cordier, H., Mendes, F., and Vasconcelos, I. et al. A metabolic and genomic study of engineered Saccharomyces cerevisiae strains for high glycerol production. Metab. Eng., 9, 364, 2007. 126. 
Remize, F., Cambon, B., and Barnavon, L. et al. Glycerol formation during wine fermentation is mainly linked to Gpd1p and is only partially controlled by the HOG pathway. Yeast, 20, 1243, 2003.
127. Nevoigt, E. and Stahl, U. Reduced pyruvate decarboxylase and increased glycerol-3-phosphate dehydrogenase [NAD+] levels enhance glycerol production in Saccharomyces cerevisiae. Yeast, 12, 1331, 1996. 128. Compagno, C., Boschi, F., and Ranzi, B.M. Glycerol production in a triose phosphate isomerase deficient mutant of Saccharomyces cerevisiae. Biotechnol. Prog., 12, 591, 1996. 129. Overkamp, K.M., Bakker, B.M., and Kotter, P. et al. Metabolic engineering of glycerol production in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 68, 2814, 2002. 130. Shi, Y., Vaden, D.L., and Ju, S. et al. Genetic perturbation of glycolysis results in inhibition of de novo inositol biosynthesis. J. Biol. Chem., 280, 41805, 2005. 131. Geertman, J.-M.A., van Maris, A.J.A., and van Dijken, J.P. et al. Physiological and genetic engineering of cytosolic redox metabolism in Saccharomyces cerevisiae for improved glycerol production. Metab. Eng., 8, 532, 2006. 132. van Maris, A.J., Geertman, J.M., and Vermeulen, A. et al. Directed evolution of pyruvate decarboxylase-negative Saccharomyces cerevisiae, yielding a C2-independent, glucose-tolerant, and pyruvate-hyperproducing yeast. Appl. Environ. Microbiol., 70, 159, 2004. 133. Nissen, T.L., Kielland-Brandt, M.C., and Nielsen, J. et al. Optimization of ethanol production in Saccharomyces cerevisiae by metabolic engineering of the ammonium assimilation. Metab. Eng., 2, 69, 2000. 134. Ostergaard, S., Olsson, L., and Nielsen, J. Metabolic engineering of Saccharomyces cerevisiae. Microbiol. Mol. Biol. Rev., 64, 34, 2000. 135. Grauslund, M., Didion, T., Kielland-Brandt, M.C., and Andersen, H.A. BAP2, a gene encoding a permease for branched-chain amino acids in Saccharomyces cerevisiae. Biochim. Biophys. Acta, 1269, 275, 1995. 136. Roca, C., Nielsen, J., and Olsson, L. Metabolic engineering of ammonium assimilation in xylose-fermenting Saccharomyces cerevisiae improves ethanol production. Appl. Environ. Microbiol., 69, 4732, 2003. 137. 
Bro, C., Regenberg, B., and Forster, J. et al. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metab. Eng., 8, 102, 2006. 138. Heux, S., Cachon, R., and Dequin, S. Cofactor engineering in Saccharomyces cerevisiae: Expression of a H(2)O-forming NADH oxidase and impact on redox metabolism. Metab. Eng., 8, 303, 2006. 139. Vemuri, G.N., Eiteman, M.A., and McEwen, J.E. et al. Increasing NADH oxidation reduces overflow metabolism in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA, 104, 2402, 2007. 140. Otterstedt, K., Larsson, C., and Bill, R.M. et al. Switching the mode of metabolism in the yeast Saccharomyces cerevisiae. EMBO Rep., 5, 532, 2004. 141. Otero, J.M., Cimini, D., and Patil, K. et al. Metabolic engineering of S. cerevisiae for overproduction of succinic acid. In ISSY 25. Systems Biology of Yeasts—From Models to Applications. Espoo, Finland, 2006. 142. Otero, J.M., Personal communication, 2007. 143. Otero, J.M., Cimini, D., and Patil, K. et al. Central carbon metabolism in Saccharomyces cerevisiae: succinic acid production. In Metabolic Engineering VI. Noordwijkerhout, The Netherlands, 2006. 144. Mac Leod, A. Beer. In Alcoholic Beverages, Rose, A.H. Ed. Academic Press Inc. Ltd., London, U.K., 1977, pp. 44. 145. Suihko, M.L., Blomqvist, K., and Penttila, M. et al. Recombinant brewer’s yeast strains suitable for accelerated brewing. J. Biotechnol., 14, 285, 1990. 146. Rose, A.H. History and scientific basis of alcoholic beverage production. In Alcoholic Beverages, Rose, A.H. Ed. Academic Press Inc. Ltd., London, U.K., 1977, pp. 1. 147. Hansen, J. and Kielland-Brandt, M.C. Modification of biochemical pathways in industrial yeasts. J. Biotechnol., 49, 1, 1996. 148. Sone, H., Fujii, T., and Kondo, K. et al. Nucleotide sequence and expression of the Enterobacter aerogenes alpha-acetolactate decarboxylase gene in brewer’s yeast. Appl. Environ. Microbiol., 54, 38, 1988.
149. Stephanopoulos, G., Aristidou, A., and Nielsen, J. Metabolic Engineering. Academic Press Inc., San Diego, CA, 1998. 150. Fujii, T., Kondo, K., and Shimizu, F. et al. Application of a ribosomal DNA integration vector in the construction of a brewer’s yeast having alpha-acetolactate decarboxylase activity. Appl. Environ. Microbiol., 56, 997, 1990. 151. Blomqvist, K., Suihko, M.L., and Knowles, J. et al. Chromosomal integration and expression of two bacterial alpha-acetolactate decarboxylase genes in brewer’s yeast. Appl. Environ. Microbiol., 57, 2796, 1991. 152. Yamano, S., Kondo, K., and Tanaka, J. et al. Construction of a brewer’s yeast having alpha-acetolactate decarboxylase gene from Acetobacter aceti ssp. xylinum integrated in the genome. J. Biotechnol., 32, 173, 1994. 153. Porro, D., Brambilla, L., and Ranzi, B.M. et al. Development of metabolically engineered Saccharomyces cerevisiae cells for the production of lactic acid. Biotechnol. Prog., 11, 294, 1995. 154. Ishida, N., Suzuki, T., and Tokuhiro, K. et al. D-lactic acid production by metabolically engineered Saccharomyces cerevisiae. J. Biosci. Bioeng., 101, 172, 2006. 155. Ishida, N., Saitoh, S., and Onishi, T. et al. The effect of pyruvate decarboxylase gene knockout in Saccharomyces cerevisiae on L-lactic acid production. Biosci. Biotechnol. Biochem., 70, 1148, 2006. 156. Branduardi, P.P., Sauer, M.M., and De Gioia, L.L. et al. Lactate production yield from engineered yeasts is dependent from the host background, the lactate dehydrogenase source and the lactate export. Microb. Cell. Fact., 5, 4, 2006. 157. Porro, D., Bianchi, M.M., and Ranzi, B.M. et al. Yeast strains for the production of lactic acid. PCT WO 99/14335, 1999. 158. Bianchi, M.M., Brambilla, L., and Protani, F. et al. Efficient homolactic fermentation by Kluyveromyces lactis strains defective in pyruvate utilization and transformed with the heterologous LDH gene. Appl. Environ. Microbiol., 67, 5621, 2001. 159. 
Branduardi, P., Valli, M., and Brambilla, L. et al. The yeast Zygosaccharomyces bailii: a new host for heterologous protein production, secretion and for metabolic engineering applications. FEMS Yeast Res., 4, 493, 2004. 160. Ilmén, M., Koivuranta, K., and Ruohonen, L. et al. Efficient production of L-lactic acid from xylose by Pichia stipitis. Appl. Environ. Microbiol., 73, 117, 2007. 161. Porro, D., Bianchi, M.M., and Brambilla, L. et al. Replacement of a metabolic pathway for large-scale production of lactic acid from engineered yeasts. Appl. Environ. Microbiol., 65, 4211, 1999. 162. van Maris, A.J., Winkler, A.A., and Porro, D. et al. Homofermentative lactate production cannot sustain anaerobic growth of engineered Saccharomyces cerevisiae: possible consequence of energy-dependent lactate export. Appl. Environ. Microbiol., 70, 2898, 2004. 163. Hohmann, S. Characterization of PDC6, a third structural gene for pyruvate decarboxylase in Saccharomyces cerevisiae. J. Bacteriol., 173, 7963, 1991. 164. Maury, J., Asadollahi, M., and Møller, K. et al. Microbial isoprenoid production: an example of green chemistry through metabolic engineering. In Biotechnology for the Future, Nielsen, J. Ed. Springer, Berlin-Heidelberg, Germany, 2005, pp. 19. 165. Shimada, H., Kondo, K., and Fraser, P.D. et al. Increased carotenoid production by the food yeast Candida utilis through metabolic engineering of the isoprenoid pathway. Appl. Environ. Microbiol., 64, 2676, 1998. 166. Jackson, B.E., Hart-Wells, E.A., and Matsuda, S.P. Metabolic engineering to produce sesquiterpenes in yeast. Org. Lett., 5, 1629, 2003. 167. Shiba, Y., Paradise, E.M., and Kirby, J. et al. Engineering of the pyruvate dehydrogenase bypass in Saccharomyces cerevisiae for high-level production of isoprenoids. Metab. Eng., 9, 160, 2007. 168. Oswald, M., Fischer, M., and Dirninger, N. et al. Monoterpenoid biosynthesis in Saccharomyces cerevisiae. FEMS Yeast Res., 7, 413, 2007.
169. Hampton, R., Dimster-Denk, D., and Rine, J. The biology of HMG-CoA reductase: the pros of contra-regulation. Trends in Biochem. Sci., 21, 140, 1996. 170. Lewis, T.L., Keesler, G.A., and Fenner, G.P. et al. Pleiotropic mutations in Saccharomyces cerevisiae affecting sterol uptake and metabolism. Yeast, 4, 93, 1988. 171. Crowley, J.H., Leak, F.W., Jr., and Shianna, K.V. et al. A mutation in a purported regulatory gene affects control of sterol uptake in Saccharomyces cerevisiae. J. Bacteriol., 180, 4177, 1998. 172. Anderegg, R., Betz, R., and Carr, S. et al. Structure of Saccharomyces cerevisiae mating hormone a-factor. Identification of S-farnesyl cysteine as a structural component. J. Biol. Chem., 263, 18236, 1988. 173. Starai, V.J., Gardner, J.G., and Escalante-Semerena, J.C. Residue Leu-641 of Acetyl-CoA synthetase is critical for the acetylation of residue Lys-609 by the protein acetyltransferase enzyme of Salmonella enterica. J. Biol. Chem., 280, 26200, 2005. 174. Chambon, C., Ladeveze, V., and Oulmouden, A. et al. Isolation and properties of yeast mutants affected in farnesyl diphosphate synthetase. Curr. Genet., 18, 41, 1990. 175. Szczebara, F.M., Chandelier, C., and Villeret, C. et al. Total biosynthesis of hydrocortisone from a simple carbon source in yeast. Nat. Biotechnol., 21, 143, 2003. 176. Dumas, B., Brocard-Masson, C., and Assemat-Lebrun, K. et al. Hydrocortisone made in yeast: metabolic engineering turns a unicellular microorganism into a drug-synthesizing factory. Biotechnol. J., 1, 299, 2006. 177. Kelly, D. and Kelly, S. Rewiring yeast for drug synthesis. Nat. Biotechnol., 21, 133, 2003. 178. Veen, M. and Lang, C., Production of lipid compounds in the yeast Saccharomyces cerevisiae. Appl. Microbiol. Biotechnol., 63, 635, 2004. 179. Cruz, J., Minoja, G., and Okuchi, K. Improving clinical outcomes from acute subdural hematomas with the emergency preoperative administration of high doses of mannitol: a randomized trial. 
Neurosurgery, 49, 864, 2001. 180. Hugenholtz, J. and Smid, E.J. Nutraceutical production with food-grade microorganisms. Curr. Opin. Biotechnol., 13, 497, 2002. 181. Nguyen, H.T., Dieterich, A., and Athenstaedt, K. et al. Engineering of Saccharomyces cerevisiae for the production of L-glycerol 3-phosphate. Metab. Eng., 6, 155, 2004. 182. Costenoble, R., Adler, L., and Niklasson, C. et al. Engineering of the metabolism of Saccharomyces cerevisiae for anaerobic production of mannitol. FEMS Yeast Res., 3, 17, 2003. 183. Lee, W. and Dasilva, N.A. Application of sequential integration for metabolic engineering of 1,2-propanediol production in yeast. Metab. Eng., 8, 58, 2006. 184. Altaras, N.E. and Cameron, D.C. Metabolic engineering of a 1,2-propanediol pathway in Escherichia coli. Appl. Environ. Microbiol., 65, 1180, 1999. 185. Di Carlo, G., Mascolo, N., and Izzo, A.A. et al. Flavonoids: old and new aspects of a class of natural therapeutic drugs. Life Sci., 65, 337, 1999. 186. Chen, D. and Guarente, L. SIR2: a potential target for calorie restriction mimetics. Trends Mol. Med., 13, 64, 2007. 187. Becker, J.V., Armstrong, G.O., and van der Merwe, M.J. et al. Metabolic engineering of Saccharomyces cerevisiae for the synthesis of the wine-related antioxidant resveratrol. FEMS Yeast Res., 4, 79, 2003. 188. Kopp, P. Resveratrol, a phytoestrogen found in red wine. A possible explanation for the conundrum of the ‘French paradox’? Eur. J. Endocrinol., 138, 619, 1998. 189. Jiang, H., Wood, K.V., and Morgan, J.A. Metabolic engineering of the phenylpropanoid pathway in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 71, 2962, 2005. 190. Shojima, S., Nishizawa, N.K., and Fushiya, S. et al. Biosynthesis of nicotianamine in the suspension-cultured cells of tobacco (Nicotiana megalosporum). Biol. Metals, 2, 142, 1989. 191. Roje, S., Chan, S.Y., and Kaplan, F. et al. 
Metabolic engineering in yeast demonstrates that S-adenosylmethionine controls flux through the methylenetetrahydrofolate reductase reaction in vivo. J. Biol. Chem., 277, 4056, 2002.
192. Padh, H. Cellular functions of ascorbic acid. Biochem. Cell. Biol., 68, 1166, 1990. 193. Sauer, M., Branduardi, P., and Valli, M. et al. Production of L-ascorbic acid by metabolically engineered Saccharomyces cerevisiae and Zygosaccharomyces bailii. Appl. Environ. Microbiol., 70, 6086, 2004. 194. Branduardi, P., Pagani, P., and Papini, M. et al. L-ascorbic acid production from D-glucose in metabolic engineered Saccharomyces cerevisiae and its effect on strain robustness. In ISSY 25. Systems Biology of Yeasts—From Models to Applications. Espoo, Finland, 2006. 195. Gidijala, L., van der Klei, I.J., and Veenhuis, M. et al. Reprogramming Hansenula polymorpha for penicillin production: expression of the Penicillium chrysogenum pcl gene. FEMS Yeast Res., doi:10.1111/j.1567-1364.2007.00228.x, 2007. 196. Kjeldsen, T., Balschmidt, P., and Diers, I. et al. Expression of insulin in yeast: the importance of molecular adaptation for secretion and conversion. Biotechnol. Genet. Eng. Rev., 18, 89, 2001. 197. Porro, D., Sauer, M., and Branduardi, P. et al. Recombinant protein production in yeasts. Mol. Biotechnol., 31, 245, 2005. 198. Almeida, J., Modig, T., and Petersson, A. et al. Increased tolerance and conversion of inhibitors in lignocellulosic hydrolysates by Saccharomyces cerevisiae. J. Chem. Technol. Biotechnol., 82, 340, 2007. 199. Sauer, U. and Schlattner, U. Inverse metabolic engineering with phosphagen kinase systems improves the cellular energy state. Metab. Eng., 6, 220, 2004. 200. Çakar, Z.P., Seker, U.O.S., and Tamerler, C. et al. Evolutionary engineering of multiple-stress resistant Saccharomyces cerevisiae. FEMS Yeast Res., 5, 569, 2005. 201. Alper, H. Moxley, J., and Nevoigt, E. et al. Engineering yeast transcription machinery for improved ethanol tolerance and production. Science, 314, 1565, 2006. 202. Wyss, M. and Kaddurah-Daouk, R. Creatine and creatinine metabolism. Physiol. Rev. 80, 1107, 2000. 203. Canonaco, F., Schlattner, U., and Pruett, P.S. et al. 
Functional expression of phosphagen kinase systems confers resistance to transient stresses in Saccharomyces cerevisiae by buffering the ATP pool. J. Biol. Chem., 277, 31303, 2002. 204. Radler, F. Yeasts-metabolism of organic acids. In Wine Microbiology and Biotechnology, Fleet, G.H. Ed. Harwood Academic Publishers, Chur, Switzerland, 1993, pp. 165. 205. Volschenk, H., Viljoen, M., and Grobler, J. et al. Engineering pathways for malate degradation in Saccharomyces cerevisiae. Nat. Biotechnol., 15, 253, 1997. 206. Henick-Kling, T. Malolactic fermentation. In Wine Microbiology and Biotechnology, Fleet, G.H. Ed. Harwood Academic Publishers, Chur, Switzerland, 1993, pp. 289. 207. Fleet, G.H. and Heard, G.M. Yeasts-growth during fermentation. In Wine Microbiology and Biotechnology, Fleet, G.H. Ed. Harwood Academic Publishers, Chur, Switzerland, 1993, pp. 43. 208. Williams, S.A., Hodges, R.A., and Strike, T.L. et al. Cloning the gene for the malolactic fermentation of wine from Lactobacillus delbrueckii in Escherichia coli and yeasts. Appl. Environ. Microbiol., 47, 288, 1984. 209. Denayrolles, M., Aigle, M., and Lonvaud-Funel, A. Functional expression in Saccharomyces cerevisiae of the Lactococcus lactis mleS gene encoding the malolactic enzyme. FEMS Microbiol. Lett., 125, 37, 1995. 210. Ansanay, V., Dequin, S., and Camarasa, C. et al. Malolactic fermentation by engineered Saccharomyces cerevisiae as compared with engineered Schizosaccharomyces pombe. Yeast, 12, 215, 1996. 211. Bony, M., Bidart, F., and Camarasa, C. et al. Metabolic analysis of S. cerevisiae strains engineered for malolactic fermentation. FEBS Lett., 410, 452, 1997. 212. Husnik, J.I., Volschenk, H., and Bauer, J. et al. Metabolic engineering of malolactic wine yeast. Metab. Eng., 8, 315, 2006. 213. Granström, T., Izumori, K., and Leisola, M. A rare sugar xylitol. Part II: biotechnological production and future applications of xylitol. Appl. Microbiol. Biotechnol., 74, 273, 2007.
214. Leathers, T.D. Bioconversions of maize residues to value-added coproducts using yeast-like fungi. FEMS Yeast Res., 3, 133, 2003. 215. Sirisansaneeyakul, S., Staniszewski, M., and Rizzi, M. Screening of yeasts for production of xylitol from –xylose. J. Ferm. Bioeng., 80, 565, 1995. 216. Mayerhoff, Z.D.V.L., Roberto, I.s.C., and Silva, S.l.S. Xylitol production from rice straw hemicellulose hydrolysate using different yeast strains. Biotechnol. Lett., 19, 407, 1997. 217. Hallborn, J., Walfridsson, M., and Airaksinen, U. et al. Xylitol production by recombinant Saccharomyces cerevisiae. Biotechnology (NY), 9, 1090, 1991. 218. Ko, B., Rhee, C., and Kim, J.S. Enhancement of xylitol productivity and yield using a xylitol dehydrogenase gene-disrupted mutant of Candida tropicalis under fully aerobic conditions. Biotechnol. Lett., 28, 1159, 2006. 219. De Souza Pereira, R. The use of baker’s yeast in the generation of asymmetric centers to produce chiral drugs and other compounds. Crit. Rev. Biotechnol.,18, 28, 1998. 220. Rodriguez, S., Kayser, M.M., and Stewart, J.D. Highly stereoselective reagents for beta-keto ester reductions by genetic engineering of baker’s yeast. J. Am. Chem. Soc., 123, 1547, 2001. 221. Johanson, T., Katz, M., and Gorwa-Grauslund, M.F. Strain engineering for stereoselective bioreduction of dicarbonyl compounds by yeast reductases. FEMS Yeast Res., 5, 513, 2005. 222. Katz, M., Frejd, T., and Hahn-Hägerdal, B. et al. Efficient anaerobic whole cell stereoselective bioreduction with recombinant Saccharomyces cerevisiae. Biotechnol. Bioeng., 84, 573, 2003. 223. Katz, M., Sarvary, I., and Frejd, T. et al. An improved stereoselective reduction of a bicyclic diketone by Saccharomyces cerevisiae combining process optimization and strain engineering. Appl. Microbiol. Biotechnol., 59, 641, 2002. 224. Yamano, S., Ishii, T., and Nakagawa, M. et al. Metabolic engineering for production of beta-carotene and lycopene in Saccharomyces cerevisiae. Biosci. 
Biotechnol. Biochem., 58, 1112, 1994. 225. Takahashi, S., Yeo, Y., and Greenhagen, B.T. et al. Metabolic engineering of sesquiterpene metabolism in yeast. Biotechnol. Bioeng., 97, 170, 2007. 226. Pronk, J.T. Auxotrophic yeast strains in fundamental and applied research. Appl. Environ. Microbiol., 68, 2095, 2002.
23 Metabolic Engineering of Bacillus subtilis

John Perkins, Markus Wyss, and Hans-Peter Hohmann
DSM Nutritional Products Ltd

Uwe Sauer
ETH Zürich

23.1 Introduction
23.2 Genetic Engineering Methods Unique to B. subtilis
Strain History • Gene Transfer and Selectable Markers • Expression of Genes by Use of Plasmid Vectors • Marker-Retaining Chromosomal Modifications • Marker-Free Chromosomal Modifications • Chromosomal Gene Amplification • Large Scale, Stable Chromosomal Cloning
23.3 Metabolic Engineering of Products with Well-Known Biochemistry
Riboflavin • Pantothenic Acid
23.4 Metabolic Engineering of Products with Unusual Biochemistry
Thiamin • Biotin • Vitamin B6
23.5 Other Potentially Relevant Products
23.6 Current Challenges and New Possibilities
Metabolic Engineering of General Host Properties • Genome Resequencing • Omics and Systems Biology • Conclusions
References
23.1 Introduction
As a microbial model organism, Bacillus subtilis is second only to Escherichia coli.1–4 This status was earned primarily because this Gram-positive bacterium features perhaps the simplest form of developmental processes. In particular, the capacity of B. subtilis cells to differentiate into a mother cell and an endospore has sparked extensive research. From a biotechnological perspective, all the usual benefits of using a model organism apply; e.g., established genetic methods and tools,5 well-characterized physiology and biochemistry, the genome sequence,6 comprehensive mutant libraries,7 and a proteome map.8 Very importantly, there is also long-standing experience with large-scale fermentations of B. subtilis.9 For these reasons, and because of the large number of secreted exoenzymes, B. subtilis became the organism of choice for a whole range of commercial products, ranging from specialty chemicals and antibiotics to vitamins and food enzymes.10 When produced by B. subtilis, the latter two product classes have Generally Recognized as Safe (GRAS) status in the United States, which is based either on a recognized history of safe use in foods prior to 1958 or on scientific procedures similar to those required for a food additive regulation. In particular, the lack of any known enterotoxin genes in the B. subtilis type strain 168 (Ref. 11) was pivotal for this status. A status similar to GRAS, called Qualified Presumption of Safety (QPS), has been recently proposed by the EU regulatory authorities.12 It is important to note, however, that GRAS status is only for the product,
not for the microorganisms themselves. Although there are only a few examples of the use of B. subtilis or related species as a probiotic for animal or human application,13–16 this might change because B. subtilis natto, a close relative of strain 168, is used to produce the traditional Japanese fermented soybean delicacy “natto.” Early industrial interest in B. subtilis was based on its natural capacity to secrete large quantities (20–25 g/l) of extracellular protein, and Bacillus enzymes currently make up about half of the world market for industrial enzymes, which exceeds one billion US$.3 The main applications are food enzymes (29%), feed enzymes (15%), and general technical enzymes (56%), and most of the detergent proteases currently on the market are Bacillus spp. serine proteases.10,17 While significant progress was made in engineering improved protein-secreting strains,18,19 the impact of metabolic engineering has been comparatively low because of the complexity of the secretion processes. Hence, we focus here primarily on the commercially relevant product class of vitamins, for which processes were developed in the last decade or are under development using metabolic engineering. From this experience, the empirical observation emerges that some compounds can rapidly be produced in the gram per liter range, while for others we often struggle at the milligram per liter level. With hindsight, the more recalcitrant compounds are often synthesized through incompletely known biosynthetic routes or involve unusual or complicated biochemical reactions. Since this ultimately also affects the metabolic engineering approaches taken, we structure this review according to this somewhat empirical classification of products. A final note to the reader: the information in this chapter mainly covers the period until 2006, but also includes a limited update of information up to the time of publication.
23.2 Genetic Engineering Methods Unique to B. subtilis
Over the last 25 years, many sophisticated genetic engineering methods have been developed for B. subtilis that take advantage of its genetic and physiological properties. Here we focus on methods that are unique (or nearly so) to B. subtilis; most of them relate to the fact that plasmid-based expression systems do not work as well as chromosome-based systems. For a description of general molecular biological methods for B. subtilis, there are several excellent reviews and books available to investigators.5,20
23.2.1 Strain History
The B. subtilis type strain 168 is a trpC2 mutant (defective in tryptophan biosynthesis) of the wild-type Marburg strain that was isolated by X-ray mutagenesis21 and later shown to be naturally competent (i.e., having the ability for DNA transformation).22 Over the past 45 years, this one isolate was the progenitor to almost all mutants and strains in commercial or academic research on cell physiology, competence, DNA recombination, sporulation, and other basic prokaryotic biological processes. Most of these strains are deposited at the Bacillus Genetic Stock Center at The Ohio State University (http://www.bgsc.org/). Typically, prototrophy was conferred to B. subtilis 168 by transformation using trp+ DNA from the related B. subtilis 23 or W23. Strain 23 is a UV-induced thr mutant isolated at the same time as 168, and W23 is a prototrophic, streptomycin-resistant (str) Bacillus of unknown ancestry that probably originates from a culture collection from Western Reserve University.23 The physiology of W23 differs from 168 to such an extent that it has been proposed to reclassify it as the distinct subspecies spizizenii,23–26 with a sequence divergence of about 5%.27,28 Recent DNA resequencing studies (see Section 23.6.2 below) have shown that W23 is in fact derived from strain 23.29 Interestingly, the early practice of transformation for prototrophy led to W23 Trp+ “islands” surrounding the trp locus.29 In at least one case, the sequence divergence between both strains led to lower activity of a biotechnologically relevant enzyme; i.e., the panB encoded ketopantoate hydroxymethyltransferase. Moreover, there is at least one additional W23 island (at sacA) within the genomes of Trp+ derivatives of 168, since early transformation experiments were done at high DNA concentrations that
facilitate general integration of W23 DNA islands to other regions of the 168 chromosome by a process called congression.30,31 Genetic or physiological effects of this genome “mixing” are unclear, but might account for some of the different growth properties reported for the PY79 and SMY strains (A.L. Sonenshein, personal communication; M. Hans, personal communication).
A major regulatory issue for industrial processes using GMO-derived B. subtilis is physical containment of the engineered strains. Since B. subtilis can produce endospores, a dormant form of the bacteria that can persist in the environment for years due to their unique properties of heat, chemical, and UV resistance, commercial strains are required to contain mutation(s) that arrest this terminal developmental stage. Consequently, most commercialized production strains contain a mutation within the spo0A gene, which encodes the master regulatory protein of sporulation.32,33 Such mutants can only grow vegetatively and, upon nutrient deprivation, will lyse and die. In the case of riboflavin biosynthesis, introduction of this mutation in engineered strains actually increases riboflavin production by an unknown process.34 Conversely, in the case of pantothenate biosynthesis, the presence of the spo0A mutation dramatically decreases production, again by an unknown process. For this reason, mutations were identified in two late-stage sporulation transcription factors, SigE and SigG, which prevented sporulation yet maintained high pantothenate production.35 Alternative sporulation mutants (e.g., SigF) have also been described for B. licheniformis for the production of enzymes.36
23.2.2 Gene Transfer and Selectable Markers
A prerequisite for metabolic engineering is the ability to delete, insert, or modify genes in a fast and efficient manner. The genetic transfer methods for B. subtilis include transformation, transduction (specialized and general), and protoplast fusion, while conjugation and electroporation are technically more difficult or inefficient. Independent of whether plasmids or chromosomal modifications are used, antibiotic resistances typically serve as positive selection markers. A variety of antibiotic resistance cassettes (Abr) have been developed with genes from Streptococcus, Staphylococcus, or Bacillus sources, including resistance genes to chloramphenicol (cat), macrolide-lincosamide-streptogramin (MLS) antibiotics (erm), neomycin/kanamycin (neo/kan), tetracycline (tet), spectinomycin (spec), phleomycin (ple), trimethoprim (dhrf), and blasticidin S.37–42 There are also new methods for rapidly changing from one resistance to another by transformation and in vivo recombination.43
23.2.3 Expression of Genes by Use of Plasmid Vectors
Although B. subtilis does not have endogenous plasmids, a plethora of plasmid vectors have been imported from other Gram-positives (e.g., Staphylococcus aureus plasmids pUB110, pC194, and pE194). Several excellent reviews44,45 and databases (http://www.plasmids.bham.ac.uk/default.htm; http://www.bgsc.org/) are available that list these vectors and their attributes. The value of these vectors, however, has been limited due to a low efficiency of cloning and high instability of the recombinant plasmid. In part, these problems are attributed to the B. subtilis mechanism of plasmid replication, which occurs via a rolling-circle model involving a single-stranded DNA intermediate. Moreover, establishment of these plasmids during competent cell transformation is hampered by the requirement of multimeric plasmid forms, which can be difficult to produce during DNA ligation. Despite several attempts to improve B. subtilis plasmid vectors (i.e., protoplast fusion, E. coli shuttle vectors, cryptic Bacillus plasmids, plasmids that replicate via the theta-type mechanism, and donor-helper plasmids), only a handful of commercial production strains contain expression plasmids with entire biosynthetic operons. The most notable examples are the riboflavin production strains generated by the Stepanov group in the late 1980s46 and later by the Perumov group.47 For these reasons, the preferred overexpression strategy for commercial B. subtilis production strains is stable chromosomal integration and amplification, which are discussed below.
23.2.4 Marker-Retaining Chromosomal Modifications
Early methods utilize E. coli integration plasmids containing an antibiotic resistance gene next to a modified gene, and antibiotic-resistant cells arise by integration of the whole vector into the chromosome. Endogenous genes integrate at the native locus, but exogenous genes without B. subtilis homologues must be targeted to a specific location by adding two sequences of at least 250 bp that flank the chromosomal target site. Since E. coli plasmids cannot replicate in B. subtilis, the Abr cassette is inserted into the target site in single copy by double-crossover recombination between the chromosome and the flanking DNA segments on the plasmid. Depending on the junction end points of the flanking DNA sequences and the resistance gene, this technique can introduce the Abr within a gene, generating an in vivo insertional knock-out or deletion mutation, or next to a gene, allowing genetic information to be introduced into other strains by transduction or transformation.
Several improvements of this method increased speed and efficiency. To prevent polar mutations that affect downstream gene expression, the Ehrlich group at INRA developed the pMUTIN integration vector and several derivatives,48 in which the spac promoter is positioned within the plasmid-disrupted target gene so that it can transcribe the genes downstream of the disruption by IPTG induction. This vector was used for systematic inactivation of all unknown B. subtilis y-genes as part of an international genome project.7,49 For commercial production strains, however, these vectors have several drawbacks, the most critical being the need to maintain antibiotic selection pressure during analysis of the mutant. A simpler approach utilizes antibiotic cassettes without transcription termination sequences that allow readthrough from the promoter of the antibiotic resistance gene into the downstream sequences.
One example is a terminator-less cat cassette, which has been used to analyze biosynthetic operons involved in riboflavin, biotin, and thiamin production.50,51 For unstable plasmid constructs, direct transformation of the ligation mixture can be used. While the efficiency is quite low, it can be improved by long flanking homology (LFH) PCR transformation. This method was first developed for generating yeast gene disruption cassettes in vitro without the need of first generating an E. coli plasmid,52 and was later adapted for B. subtilis.53 Several versatile and widely used systems are available to target foreign DNA into well-characterized, nonessential regions of the chromosome54,55 in a process called ectopic integration. These loci include the amyE gene, bpr, dif, epr, sacB, thrC, vpr and more recently gltA, lacA, pyrD, and sacA (Refs 32,34,56–62; J. Perkins and J. Pero, unpublished data). Successful integration at the amyE, sacA, and sacB loci can be quickly confirmed by testing colonies with a simple color assay (e.g., α-amylase), and the gltA, pyrD, and thrC loci can be tested for their auxotrophic phenotype. While Abr-based methods are generally rapid and efficient, the number of modifications per chromosome is limited by the number of available resistance genes, and the required multiantibiotic pressure could affect the physiology of the engineered strain. Recently, US and EU government regulatory agencies have also taken a more critical view of commercial enzyme or fine chemical processes that utilize strains which retain antibiotic resistance genes, especially for additives used in animal nutrition.63 For that reason, construction of genetically engineered strains without these elements has increasingly gained favor, as discussed in the next section.
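The double-crossover targeting described in this section can be illustrated with a minimal Python sketch. It assumes a linear string as a stand-in for the chromosome, takes the 250 bp lower bound quoted above as the minimum length of each flank, and treats homologous recombination as a plain string replacement between the two flanks. The function names, the coordinates, and the placeholder marker sequence are hypothetical and serve only the illustration.

```python
import random

MIN_FLANK_BP = 250  # lower bound for each flanking sequence, as quoted in the text


def build_cassette(chromosome: str, start: int, end: int, marker: str,
                   flank_len: int = MIN_FLANK_BP) -> str:
    """Return upstream flank + marker + downstream flank for the target [start, end)."""
    if flank_len < MIN_FLANK_BP:
        raise ValueError("each flank should be at least 250 bp")
    if start - flank_len < 0 or end + flank_len > len(chromosome):
        raise ValueError("target too close to the end of this linear toy sequence")
    upstream = chromosome[start - flank_len:start]
    downstream = chromosome[end:end + flank_len]
    return upstream + marker + downstream


def double_crossover(chromosome: str, cassette: str,
                     flank_len: int = MIN_FLANK_BP) -> str:
    """Replace the region between the two flanks with the cassette payload."""
    upstream, downstream = cassette[:flank_len], cassette[-flank_len:]
    i = chromosome.find(upstream)
    if i == -1:
        raise ValueError("upstream flank has no homology in the chromosome")
    j = chromosome.find(downstream, i + flank_len)
    if j == -1:
        raise ValueError("downstream flank has no homology in the chromosome")
    # Everything between the first and second crossover is exchanged.
    return chromosome[:i] + cassette + chromosome[j + flank_len:]


# Toy data: a random 2 kb "chromosome" and a placeholder 900 bp marker (hypothetical).
rng = random.Random(0)
chromosome = "".join(rng.choice("ACGT") for _ in range(2000))
marker = "N" * 900
cassette = build_cassette(chromosome, 800, 1000, marker)
edited = double_crossover(chromosome, cassette)
print(len(cassette), len(edited))
```

In this toy model the 200 bp target is replaced by the marker in single copy, which mirrors the insertional knock-out case; placing `start` and `end` at the same coordinate would instead model insertion next to a gene.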
23.2.5 Marker-Free Chromosomal Modifications
Early methods to generate recombinant strains that are devoid of antibiotic resistance genes require several recombination steps to remove the initially used marker and are quite labor intensive.57,64 The principle is based on inserting a second resistance cassette such that both can be looped out with an integrated plasmid upon growth without selection pressure. The efficiency of this process is variable but low, ranging from 0.1 to 10% depending on the chromosomal locus. Several one-step strategies have been recently developed by introducing a cassette into the chromosome that contains direct repeats (DR) flanking a counter-selectable marker. The excision of the counter-selectable marker leaves solely
the desired mutation in the chromosome. Such counter-selectable markers can be a toxic gene (either mazF,65 or upp in the presence of 5-fluorouracil66) or a conditional auxotrophy created by placing expression of a biosynthetic gene under the control of blaI, which encodes a repressor involved in β-lactamase regulation.67 A more recently reported method utilizes the endogenous chromosomal RipX/CodV site-specific recombinase site, dif.68
A much simpler method can be used to replace metabolic or biosynthetic wild-type genes with an engineered construct. A deletion mutation of the target gene that generates an auxotrophic phenotype (i.e., failure to grow on minimal medium) is first generated using an antibiotic resistance gene. This mutant is then transformed with either an integrative plasmid or LFH PCR products that contain the modified gene (or parts thereof) and chromosomal sequences that flank the target gene. By selecting for prototrophy and screening for loss of the Abr marker, colonies can be obtained that contain the modified gene at its native locus. This method has been used to replace native, regulated promoters with strong constitutive promoters for genes and operons involved in the biosynthesis of riboflavin (rib), biotin (bio), pantothenate (pan), isoleucine-valine (ilv), folate (fol), or thiamin (thi).51,62,69–72 This method can also be used to introduce point mutations, as long as sufficient enzymatic activity remains to allow for selection of prototrophic colonies.
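To give a feel for why the early loop-out methods are called labor intensive, the efficiency range of 0.1 to 10% quoted at the start of this section can be turned into a rough screening estimate. The Python sketch below assumes that every colony is an independent trial with the same loop-out probability, which is a simplification introduced here and not a claim from the source; the function name and the 95% confidence default are likewise illustrative.

```python
import math


def colonies_to_screen(efficiency: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one loop-out among n colonies) >= confidence.

    Assumes independent colonies with identical loop-out probability `efficiency`.
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - efficiency))


for p in (0.001, 0.01, 0.10):  # 0.1%, 1%, and 10% loop-out efficiency
    print(f"efficiency {p:.1%}: screen about {colonies_to_screen(p)} colonies")
```

Under these assumptions the low end of the range calls for a few thousand colonies and the high end for a few dozen, which is one way to read the appeal of counter-selectable markers that select directly for the excision event.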
23.2.6 Chromosomal Gene Amplification
A hallmark of B. subtilis is its ability to stably increase the copy number of a gene or operon in the chromosome.73 Specifically, the copy number of integrated vectors increases by isolating clones that grow at increasingly higher concentrations of antibiotics. It is important to note that the amplified state is preexisting in the transformant, and challenging the strain to grow at higher antibiotic levels enriches for, rather than selects, those colonies with the amplified state.73 What differentiates gene amplification in B. subtilis from other microorganisms is that the amplified state can be maintained in the absence of antibiotic pressure, at least through a standard 48 h fermentation regime. The reason for this attribute is not known, but could be caused by differences in intramolecular-specific recombination functions. A weakness of the early methods using E. coli vectors was retention of the E. coli DNA in the amplified chromosomal region. However, ligated concatemer DNA containing just the engineered gene or operon and the antibiotic resistance gene has been successfully used to generate strains with amplified genes or operons.51,61,69 Another issue is the presence of the antibiotic resistance gene in the engineered strain, as discussed above. The challenge for the future will be to identify nonantibiotic resistance genes that can be used for chromosomal amplification.
23.2.7 Large Scale, Stable Chromosomal Cloning
With the discovery of horizontal gene transfer across genera, the idea to transfer large regions of chromosomal DNA from one bacterium to another has gained favor for increasing genome variation. Based on this idea, Itaya and coworkers74 have recently developed a new technology that allows giant DNA segments to be introduced into the B. subtilis chromosome. Exogenous DNA is guided into the B. subtilis genome by simultaneous homologous recombination at two small flanking DNAs called landing pad sequences (LPSs). Two such LPSs constitute an array called an LPA; LPA sites were located in the proB, leuB, or ytqB loci. Sliding the LPA by repeated transformation allows contiguous segments of exogenous DNA to be incorporated into the chromosome. As an example of the power of this method, the complete 3.5 Mb genome of Synechocystis was cloned into the genome of B. subtilis. Although this method has not been used to generate commercial production strains, it offers clear advantages over other large-scale DNA cloning methods, e.g., bacterial and yeast artificial chromosome cloning, by increasing the size and stability of cloned giant DNA.
23.3 Metabolic Engineering of Products with Well-Known Biochemistry
23.3.1 Riboflavin
23.3.1.1 Physiological and Commercial Significance
Riboflavin (vitamin B2) is the precursor molecule for the coenzymes flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD), which serve as the prosthetic groups of flavoproteins. Important examples of the large family of flavoproteins, all of which are involved in cellular oxidation and reduction reactions, are succinate dehydrogenase, NADH-Q reductase, acyl-CoA dehydrogenase, and glutathione reductase. The essential compound riboflavin can be produced by plants and microorganisms, but not by higher animals, which must acquire it through the diet. Like many other vitamins, riboflavin is industrially produced at a scale of several thousand metric tons per year. Roughly 70% of the commercial riboflavin is used as a feed additive for livestock, particularly for pigs and poultry. About 18% and 12% are used for pharma and food applications, respectively. Major industrial riboflavin producers are DSM Nutritional Products (Switzerland), BASF (Germany), and Hubei Guanji (China). For decades, the major industrial production route was a three-step chemical process conceived by Karrer and Tishler.75 During the 1990s, competitive microbial production processes were developed based on either yeasts or B. subtilis as host strains, and these have replaced the chemical process to a great extent. The microbial riboflavin processes are superior to the chemical process not only economically but also ecologically. A cradle-to-grave life cycle comparison of the traditional chemical and the B. subtilis-based microbial production process, funded by the German federal environmental protection agency, showed that out of five environmental impact categories, four (global warming, acidification, ozone creation, and energy consumption) were clearly in favor of the microbial process (Figure 23.1).
With regard to one category, eutrophication, the chemical process was superior (Ref. 76; http://www.umweltbundesamt.de/gentechnik/index.htm). The study concluded that the microbial riboflavin production processes improve the life cycle performance of riboflavin production and are the more sustainable option.
[Figure 23.1: bar chart of life cycle impact (%, axis from 0 to 160) comparing the chemical process and the biotechnological process for global warming potential, acidification potential, eutrophication potential, ozone creation potential, and cumulated energy consumption.]
Figure 23.1 Life cycle impact assessment of chemical versus biotechnological riboflavin process. Results of a study performed in 2004 by the Bavarian Institute of Applied Environmental Research and Technology GmbH, Augsburg, Germany. Funded by the German Environmental Protection Agency, Berlin, Germany. Data from the riboflavin production plants of DSM Nutritional Products, Grenzach-Wyhlen, Germany. Methodological framework: DIN EN ISO 14040pp. Global warming potential assessed from emission of CO2-equivalents; acidification potential assessed from emission of SO2-equivalents; eutrophication potential assessed from emission of PO43- equivalents; photochemical ozone creation potential assessed from emission of ethylene-equivalents; resource consumption assessed from energy consumption.
The following provides an overview of the development of microbial riboflavin production processes based on B. subtilis, which are currently used by DSM Nutritional Products and Hubei Guanji. For additional reading, please refer to two recent reviews.10,77
23.3.1.2 Biosynthesis of Riboflavin and Regulation of rib Gene Expression
Riboflavin biosynthesis in B. subtilis (Figure 23.2) follows the common prokaryotic route starting from guanosine triphosphate and ribulose-5-phosphate. The hydrolytic opening of the imidazole ring of GTP is followed by deamination of the resulting pyrimidinone ring, by reduction of the ribosyl side chain, and by dephosphorylation of the resulting ribityl side chain. 6,7-Dimethyl-8-ribityl lumazine (DMRL), the direct biosynthetic precursor of riboflavin, is synthesized by adding a four-carbon moiety derived from ribulose-5-phosphate to pyrimidinedione. Finally, two molecules of DMRL are converted in an intriguing dismutation reaction to riboflavin and the pyrimidinedione intermediate of the pathway (for detailed reviews on riboflavin biosynthesis see Refs. 78–80). The six enzymatic activities of the riboflavin biosynthetic pathway in B. subtilis are associated with four proteins; i.e., RibD, RibE, RibA, and RibH. The corresponding genes are clustered together with a fifth gene of unknown function (ribT) as an operon at map position 208° of the B. subtilis chromosome. During the course of pathway engineering, chemically plausible and experimentally supported reaction mechanisms were derived for the conversion of ribulose-5-phosphate to 3,4-dihydroxy-2-butanone 4-phosphate (DHBP)81 and the dismutation reaction of DMRL to afford riboflavin.82 Riboflavin secretion is not observed from wild-type B. subtilis strains because rib gene expression is tightly controlled by transcriptional attenuation.
Mutants secreting the vitamin are easily obtained by selection for resistance against riboflavin analogues, e.g., roseoflavin.83,84 Two classes of riboflavin-secreting mutants can be distinguished.85 The first class comprises mutants of the ribO type, which carry point mutations in the 5′ untranslated leader region of the rib operon, within the operator structure required to trigger formation of the transcription attenuation loop. The second class comprises ribC mutants with lesions in the flavokinase/FAD synthase-encoding gene.86,87 The flavokinase in ribC mutants has a drastically reduced enzymatic activity compared to the wild-type enzyme, leading to a reduced intracellular concentration of the coenzyme FMN, which is synthesized from riboflavin by a phosphorylation step catalyzed by RibC.88 Since FMN, but not riboflavin, is the effector molecule of the rib regulatory system, ribC mutants are deregulated for riboflavin biosynthesis and secrete up to 30 mg/l riboflavin in overnight shake flask cultures. From the plethora of rib-deregulated B. subtilis mutants isolated over decades in various laboratories, only cis-acting ribO and ribC type mutants were identified, but never a mutant that indicated the existence of a trans-acting rib regulator protein. Hence, repression of rib gene expression in B. subtilis by FMN most likely does not involve a mediator protein.88,89 In fact, it has been shown in vitro that premature transcription attenuation occurred by direct interaction of FMN with the nascent rib mRNA.90,91
23.3.1.3 Riboflavin Pathway Engineering
B. subtilis VNIIGenetika 304/pMX45 from the Russian Institute for Genetics and Selection of Industrial Microorganisms, Moscow,46 was the first genetically engineered riboflavin production strain ever constructed. With this strain 4.5 g/l riboflavin were produced from 100 g/l saccharose during a 25 h aerobic fermentation run.
[Figure 23.2: reaction scheme. Labels recoverable from the figure: GTP; ribose-5-P; ribulose-5-P; 3-P-glycerate (ser, gly, pur pathway); RibA (cyclohydrolase II; DHBP synthase); RibD (deaminase; reductase); unspecific phosphatase (?); RibH (lumazine synthase); RibE (riboflavin synthase); formate; 6,7-dimethyl-8-ribityllumazine; riboflavin.]
Figure 23.2 Riboflavin biosynthesis in B. subtilis. The relevant metabolic building blocks ribulose-5-phosphate, ribose-5-phosphate, and 3-phosphoglycerate are indicated.
The host strain carried a resistance marker against 8-azaguanine and a ribC mutation to deregulate the purine92 and the riboflavin biosynthetic pathway, respectively. The plasmid pMX45 transformed into the host strain contained a 6.3 MDa (ca. 10 kbp) EcoRI fragment of the chromosome of a B. subtilis ribO mutant comprising the entire rib operon as deduced from complementation studies.93 The concurrence of a chromosomal and episomal copy of the rib operon in B. subtilis VNIIGenetika 304/pMX45 gave rise to plasmid instability, which could be prevented by introducing a recE4 mutation.47 However, the recE4 mutation resulted in decreased growth rates and prevented further selection of genetically modified strains because of the low survival rate of recE mutants. In an attempt to breed more stable production strains, the rib operon of a B. amyloliquefaciens ribO mutant was episomally introduced into a B. subtilis host strain.47 In B. subtilis Marburg 168 trpC2, but not in a riboflavin production host, the plasmid was stably passed on. The plasmid instability in the latter strain was correlated with an “overdosing” of rib gene expression. To finally obtain genetically stable riboflavin production strains, integration plasmids were constructed containing a 6 kbp DNA fragment from a B. amyloliquefaciens ribO mutant comprising the entire rib operon, a chloramphenicol resistance marker, and random BamHI fragments of B. subtilis DNA.94 The circular plasmids lacking a B. subtilis replication origin were targeted via the BamHI fragments into various loci of the chromosome of a riboflavin auxotrophic B. subtilis strain. The resulting transformants overproduced riboflavin, but only a small percentage of them could stably transmit the overproducing phenotype in the absence of antibiotic selection. The high frequency of unstable strains was explained by the formation of DR during the single crossover transformation event (see also below and Ref. 95). From one of the stable transformants containing the transgenes in close proximity to the pur operon, a phage lysate was prepared and used to transduce a riboflavin-producing B. subtilis host strain. The resulting strain Y32 produced 3 g/l riboflavin during a 72 h shake flask fermentation, but has the potential to produce considerably more under industrial fed-batch fermentation conditions. Instability of even integrated plasmids in the above studies was correlated with an “overdosing” of rib gene expression. The need for such precise copy number dosing of the deregulated rib operons was recently emphasized by a research group from Tianjin University, Tianjin, China,96 where a copy number of seven to eight was found optimal.
Chromosomal integration via double cross-over proved superior to single cross-over, presumably for reasons of stability.97 Starting from the B. subtilis-derived riboflavin production strain AS5,98 strains resistant to threonine or proline analogues were bred by researchers from the CJ Corporation in Korea.99 Riboflavin titers of 22.4 g/l and 26.5 g/l in 60–70 h fed-batch fermentation runs were obtained with the parent strain AS5 and the analogue-resistant strains, respectively. B. subtilis AS5 contains a ribO mutation and the plasmid pMX45, like B. subtilis VNIIGenetika 304/pMX45 (see above). To link the threonine analogue resistance with the increased riboflavin productivity, the CJ researchers referred to an Ashbya gossypii strain whose riboflavin productivity had been improved by debottlenecking the glycine supply from threonine via the threonine aldolase-catalyzed reaction.100 Increased resistance to osmotic pressure due to deregulated proline biosynthesis was proposed to explain the improved performance of the proline analogue-resistant strains.

The primary structure of the B. subtilis rib operon,101,102 functional assignment of the open reading frames within the operon, and identification of the relevant transcriptional and translational signals were elucidated by OmniGene Bioproducts, Inc. (Cambridge, MA) in collaboration with Roche Vitamins Ltd (Basel, Switzerland) in order to facilitate more precise genetic engineering of riboflavin production strains.34 The endogenous upstream promoter and the leader sequence comprising the repressible rib operator were replaced by a strong constitutive phage promoter. Another copy of the same promoter was inserted between ribE and ribA. The modified rib operon was then closely linked to a chloramphenicol resistance cassette and introduced via single cross-over recombination into the native rib locus of a B. subtilis 168 descendant with deregulated purine and riboflavin biosynthesis.
In the resulting strain RB50::[pRF69]n, the chloramphenicol resistance gene was flanked by extended repetitive DNA sequences encoding the native and the modified rib operon. Such genomic structures give rise to individuals within the bacterial population with multiple copies of the rib operon and of the chloramphenicol resistance gene.95 The increased chloramphenicol resistance of bacteria with multiple copies of the cat gene provides a convenient selection tool for individuals with amplifications. The antibiotic selection pressure was maintained during the inoculum fermentation runs, whereas the 48 h main fermentation process under glucose-limited conditions could be performed in the absence of the antibiotic without loss of amplification as revealed by Southern analysis and confirmed by the analysis of riboflavin biosynthetic enzyme activities. During the 48 h fermentation process, 14 g/l riboflavin were produced.
Developing Appropriate Hosts for Metabolic Engineering
Cultivation of RB50::[pRF69]n with fractionally 13C-labeled glucose as carbon source and metabolic flux balancing revealed that riboflavin formation in this strain was limited by the fluxes through the terminal biosynthetic rather than the central carbon metabolic pathways.103 To further increase rib gene expression and debottleneck the biosynthetic pathway, an approach similar to that described above was applied to introduce a second modified rib operon at map position 136°, affording strain RB50::[pRF69]n::[pRF93]m. Proteome analysis of an extract of amplified RB50::[pRF69]n::[pRF93]m revealed that each of the Rib proteins, which are barely detectable in the untransformed parent strain, accumulated to 3–5% of the total soluble protein content (I. Maillet, unpublished data). Double-crossover integration at map position 302° of a single copy of the ribA gene transcribed from the medium-strength B. subtilis veg promoter yielded strain VB2XL1, which displayed a further increase in riboflavin productivity.104

23.3.1.4 Transketolase Mutants with Improved Production Performance

Guanosine triphosphate and ribulose-5-phosphate are recruited in a 1:2 stoichiometric ratio by GTP cyclohydrolase II and DHBP synthase, respectively, for riboflavin biosynthesis. Since at substrate saturation the activity of the latter enzyme is twice that of the former (our unpublished observations), and since both enzymatic activities are associated with a single protein encoded by ribA, a balanced formation of the pyrimidinedione and the dihydroxybutanone intermediates is ensured.
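The kinetic balance described here, together with the Michaelis constants quoted in the paragraph that follows (about 1 mM for DHBP synthase and 0.01 mM for GTP cyclohydrolase II), can be illustrated with a short calculation. The sketch below is only a minimal illustration: it assumes simple Michaelis-Menten kinetics for both activities, relative maximal rates of 2:1, and, purely for convenience, equal concentrations of ribulose-5-phosphate and GTP. The function names are hypothetical and none of this is taken from the cited work.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


# Constants quoted in the text (mM); maximal rates are relative units (2:1).
KM_DHBP_MM = 1.0     # DHBP synthase, substrate ribulose-5-phosphate
KM_GCH2_MM = 0.01    # GTP cyclohydrolase II, substrate GTP
VMAX_DHBP = 2.0
VMAX_GCH2 = 1.0


def branch_ratio(ru5p_mm, gtp_mm):
    """Rate of DHBP synthase divided by rate of GTP cyclohydrolase II.

    A value of 2 corresponds to the balanced 2:1 supply of the two
    intermediates; lower values mean the pyrimidine branch runs ahead.
    """
    v_dhbp = mm_rate(VMAX_DHBP, KM_DHBP_MM, ru5p_mm)
    v_gch2 = mm_rate(VMAX_GCH2, KM_GCH2_MM, gtp_mm)
    return v_dhbp / v_gch2


if __name__ == "__main__":
    # Assumption for illustration only: both substrates at the same concentration.
    for conc_mm in (10.0, 1.0, 0.1):
        print(f"{conc_mm:5.1f} mM -> ratio {branch_ratio(conc_mm, conc_mm):.2f}")
```

Under these assumptions the ratio is close to 2 only at saturating substrate levels and falls to about 0.2 at 0.1 mM, which is the imbalance the text associates with low pentose phosphate pools.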
However, the Michaelis constant of DHBP synthase (~1 mM, our unpublished observations) is about 100-fold higher than that of GTP cyclohydrolase II (0.01 mM, our unpublished observations). This imposes the risk of excessive synthesis of the pyrimidinone and pyrimidinedione intermediates when intracellular concentrations of pentose phosphate pathway intermediates are reduced, as can be expected, for instance, in the glucose-limited fed-batch fermentations frequently used in industrial applications. The pyrimidinone and pyrimidinedione intermediates are highly reactive, oxidative compounds that can seriously damage the bacteria. In mutants of various Bacillus strains defective for transketolase, a key enzyme of the pentose phosphate pathway, intracellular C5 sugar pools rise to a level that exceeds the physiological requirements of the bacteria and leads to secretion of excess ribose into the fermentation broth.105,106 Because of the importance of a sufficiently high intracellular ribulose-5-phosphate concentration for a balanced precursor supply for riboflavin biosynthesis, the effects of mutations in the transketolase-encoding tkt gene on riboflavin production performance have been evaluated. Improved riboflavin production was observed with the B. subtilis transketolase mutant 24.107 The mutant was auxotrophic for shikimic acid and could not grow with gluconate as sole carbon source, in accordance with the expected phenotype of a transketolase knock-out mutant. By contrast, in the RB50::[pRF69]n strain, riboflavin production was affected negatively by a transketolase knock-out mutation (PCT patent application by DSM Nutritional Products).108 However, certain RB50::[pRF69]n mutants with tkt genes encoding impaired enzymes showed increased production performance. The mutated tkt genes were generated by error-prone PCR and selected for their ability to rescue a B. subtilis tkt deletion mutant from shikimic acid auxotrophy and to allow it to grow on gluconate, albeit at reduced growth rates compared to a B. subtilis strain containing wild-type tkt.

23.3.1.5 Development of a Decoupled Fermentation Process

A glucose-limited fed-batch protocol was initially developed for cultivation of prototrophic riboflavin production strains like RB50::[pRF69]n (Refs. 34,101) and derivatives of this strain. After an initial phase of unrestricted growth with excess substrates, during which mainly biomass and only minor amounts of riboflavin were produced, glucose supply became growth rate limiting. Riboflavin accumulated mainly at growth rates below µ = 0.05 h⁻¹ under strict glucose-limited conditions. A series of glucose-limited continuous culture experiments at dilution rates between µ = 0.03 h⁻¹ and µ = 0.3 h⁻¹ with BS9711, a derivative of the riboflavin production strain RB50::[pRF69]n::[pRF93]m, revealed a dichotomy between the fermentation conditions to reach optimal productivity versus optimal product
yield on the consumed carbon source. At the higher growth rates with increased specific glucose uptake rates, higher specific productivities were obtained, whereas higher yields on glucose were achieved at the lower growth rates. To enable a high-yield fermentation process at an elevated specific glucose uptake rate, and thus to maintain a high specific productivity, a decoupled fermentation process was developed in which the rates of biomass and riboflavin production are controlled by different fermentation substrates. While amino acid auxotrophs were unsuccessful, presumably because of generally detrimental effects on protein synthesis, biotin-limited growth of a biotin auxotrophic mutant of the production strain proved to be a suitable strategy.109 In chemostat fermentations with BS1101, a biotin auxotrophic mutant of the riboflavin production strain BS9711, biotin was the growth-limiting fermentation substrate if its content in the feed medium was kept below 0.188 µg per gram of glucose. With the transition from a glucose-limited to a biotin-limited fermentation regime, an increase in both the specific riboflavin productivity and the specific glucose uptake rate of BS1101 was observed. Since the former was more pronounced than the latter, a clear increase in product yield on glucose was achieved in biotin-limited fermentation runs around biotin to glucose ratios of 0.125 and 0.150 µg/g. Lower biotin to glucose ratios resulted in a further increase in specific productivity, but the increase was outpaced by the increase in specific glucose uptake rate. This had a negative impact on the product yield on glucose, which dropped even below the yield obtained in biotin excess cultures.

23.3.1.6 Outlook

B. subtilis proved to be a particularly suitable and versatile host strain for microbial riboflavin production. Limitations of the metabolic flux through the riboflavin biosynthetic pathway were overcome by massively increasing the expression of the riboflavin biosynthetic enzymes. Further improvements can be expected from fine-tuning the expression of the individual rib genes and balancing the flux through the pathway. Another problem is the relatively low specific activity of the Rib enzymes, which are normally needed only to synthesize a minor cellular constituent. Here, protein engineering can be expected to yield more active enzymes, as already exemplified for GTP cyclohydrolase II.110
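The yield argument in the discussion of the decoupled fermentation process (Section 23.3.1.5) rests on a simple relation: the product yield on glucose is the specific productivity divided by the specific glucose uptake rate. The following sketch spells this out. The 0.188 µg/g threshold is the value quoted in the text for strain BS1101; the three sets of rates are invented numbers chosen only to reproduce the qualitative trend and are not measurements. The function names are hypothetical.

```python
BIOTIN_LIMIT_UG_PER_G = 0.188  # threshold quoted in the text for BS1101


def limiting_substrate(biotin_ug_per_g_glucose):
    """Classify the chemostat regime from the biotin:glucose ratio of the feed."""
    if biotin_ug_per_g_glucose < BIOTIN_LIMIT_UG_PER_G:
        return "biotin"
    return "glucose"  # assumption: glucose is limiting whenever biotin is not


def product_yield(q_product, q_glucose):
    """Yield on glucose (g product per g glucose) as q_P / q_S.

    Both rates must use the same basis, e.g. g per g biomass per hour.
    """
    if q_glucose <= 0:
        raise ValueError("specific glucose uptake rate must be positive")
    return q_product / q_glucose


if __name__ == "__main__":
    # Hypothetical rates (NOT data): (label, biotin:glucose in ug/g, q_P, q_S)
    scenarios = [
        ("biotin excess", 0.300, 0.010, 0.20),
        ("moderate biotin limitation", 0.150, 0.015, 0.24),
        ("severe biotin limitation", 0.050, 0.018, 0.40),
    ]
    for label, ratio, q_p, q_s in scenarios:
        print(f"{label}: limited by {limiting_substrate(ratio)}, "
              f"yield {product_yield(q_p, q_s):.4f} g/g")
```

With these illustrative numbers the yield rises under moderate biotin limitation, because productivity grows faster than glucose uptake, and falls below the biotin excess value when glucose uptake outpaces productivity.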
23.3.2 Pantothenic Acid

23.3.2.1 Physiological and Commercial Significance

As another B family vitamin, pantothenate is a nutritional requirement for mammals that is used primarily for the biosynthesis of coenzyme A (CoA) and acyl carrier protein. These essential coenzymes function in the metabolism of acyl moieties, which form thioesters with the sulfhydryl group of the 4′-phosphopantetheine portion of these molecules. The key starting material for large-scale chemical processes is R-pantolactone. This molecule can be coupled either with β-alanine to form R-pantothenate or with 3-aminopropanol to form panthenol (an important ingredient in cosmetics). Purification of R-pantolactone from racemic mixtures is achieved by stereoselective chemical or enzymatic rearrangement.111 Pantothenate is industrially produced as calcium pantothenate (calpan) at a scale of several thousand metric tons per year. Roughly 75% of the commercial calpan is used as a feed additive for livestock, particularly for pigs and poultry. The remainder is split between pharma (15%) and food applications (10%). Major industrial calpan producers are DSM Nutritional Products (Switzerland), BASF (Germany), Daiichi (Japan), and Xinfu (China). For decades, the major industrial production route was a chemical process that utilizes the starting materials (R)-pantolactone and calcium β-alanine.112 (R)-Pantolactone is also used for the preparation of panthenol. Despite the development of several processes for the preparation of optically active (R)-pantolactone by resolution of racemates by chemical or enzymatic recycling,113–115 this step remains the key cost factor in these processes. During the 1990s and early 2000s several microbial production processes were developed to supplant this chemical process, using E. coli,116 Corynebacterium
glutamicum,117–119 or B. subtilis62,70,71 as hosts. The B. subtilis process appears to be the most attractive in terms of pantothenate production (>80 g/l in 48 h). Nevertheless, commercial implementation of such a microbial process is still awaited.

23.3.2.2 Biosynthesis of Pantothenate

In microorganisms, pantothenate is composed of two units, pantoic acid and β-alanine.120 The two units are synthesized by separate pathways and are then coupled to give pantothenate via an enzyme-catalyzed condensation reaction. Pantothenate is then converted to CoA in a multistep pathway; the key rate-limiting step is controlled by pantothenate kinase (PanK), which is subject to feedback regulation by CoA. Both E. coli and B. subtilis utilize many of the same precursors, intermediates, and biosynthetic genes (ilv and pan) in de novo synthesis of pantothenate (Figure 23.3, Table 23.1).121 Pantoic acid is derived from α-ketoisovaleric acid, which, in turn, is formed from two molecules of pyruvate by enzymes IlvB, IlvN, IlvC, and IlvD. These enzymes are also utilized for the synthesis of leucine, valine, and isoleucine. Consequently, manipulation of these genes to improve pantothenate production could affect synthesis of these branched-chain amino acids. The first committed step to form pantoate is a hydroxymethyl transfer reaction catalyzed by the PanB enzyme, which converts α-ketoisovaleric acid (α-KIV) to α-ketopantoate. This reaction also requires the cofactor methylenetetrahydrofolate (CH2-THF) as the source of a carbon unit. Thus, the glycine cleavage pathway plays a key role in pantothenate biosynthesis. α-Ketopantoate is then reduced to form pantoate by the PanE1 (YlbQ) enzyme. The IlvC enzyme can also catalyze this reaction; however, the ketopantoate reductase activity of IlvC is only about 5% of that of PanE. In addition, the B. subtilis genome contains a panE paralog (panE2/ykpB), which could play a role in the synthesis of side products.62 β-Alanine, on the other hand, is formed from another amino acid, aspartic acid. This reaction is catalyzed by aspartate-1-decarboxylase (PanD), a unique pyruvoyl-containing enzyme that is translated as a preprotein and then further processed. The efficiency of processing could be a rate-limiting step for this reaction. Aspartic acid is formed from oxaloacetate, a key intermediate in central carbon metabolism, through the action of one or more unidentified deaminases. The pantothenate biosynthetic genes are located in two operons, panBCD and panE. The α-KIV biosynthetic genes are located in a supra-operon linked to leucine biosynthetic genes (ilvBNC-leuABCD) and in a separate single-gene operon for ilvD. B. subtilis contains two PanK genes, coaA and coaX. CoaA is the ortholog of E. coli PanK, whereas coaX is otherwise found only in Gram-positive pathogens,122 although B. subtilis itself is not a pathogen (see above).

23.3.2.3 Pantothenate Pathway Engineering

Methods of producing D-pantothenic acid were first described using E. coli, a natural producer of this vitamin. It has been reported that wild-type bacteria produce approximately 12.5 µM in the growth medium when grown to stationary phase;123 it is thought that production of such copious amounts of pantothenate is due to a combination of pantothenate synthetase not being tightly regulated and tight feedback inhibition of PanK, the next enzymatic step toward CoA.120 The first increases in pantothenate synthesis were achieved by isolating E. coli mutants resistant to salicylic acid alone or in combination with additional analogues and/or intermediates, viz. α-ketoisovaleric acid, α-ketobutyric acid, α-aminobutyric acid, β-hydroxyaspartic acid, or O-methyl-threonine.116 Production of D-pantothenic acid was further enhanced by transformation of the multiple analogue-resistant E. coli strains with a plasmid (pFV31) comprising the genes panB, panC, and panD,116 or a plasmid (pFV202) comprising, in addition to the pantothenic acid biosynthesis genes, the branched-chain amino acid biosynthesis-related ilvGM genes.124 Pantothenate production of these engineered strains ranged from 52 to 66 g/l in shake flask experiments after 72 h incubation. However, feeding of 30 g/l β-alanine was required for high pantothenate production by these strains. It is not clear why this is so, since the expression of the panD gene was substantially increased.
Figure 23.3 Pantothenate biosynthesis pathway of B. subtilis. The pan and ilv genes were identified by genetic and/or biochemical evidence, or by homology to known E. coli genes. B. subtilis contains two panE genes, panE1 (ylbQ) and panE2 (ykpB); PanE1 has ketopantoate reductase activity, but PanE2 does not and may be involved in formation of the side-product [R]-3-(2-hydroxy-3-methyl-butyrylamino)-propionic acid (HMBPA). The grey arrow labeled CH2-THF represents the cofactor methylenetetrahydrofolate produced from the glycine cleavage pathway (serA, glyA).
[Figure not reproduced. Steps shown: pyruvate → acetolactate (ilvB, N; acetohydroxy acid synthase) → dihydroxyisovalerate (ilvC; isomeroreductase) → α-ketoisovalerate (ilvD; dihydroxy-acid dehydratase) → α-ketopantoate (panB; ketopantoate hydroxymethyl transferase, with CH2-THF) → pantoate (panE1; ketopantoate reductase). Oxaloacetate → aspartate (various deaminases) → β-alanine (panD; aspartate-1-decarboxylase, releasing CO2). Pantoate + β-alanine → pantothenate (panC; pantothenate synthase).]
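The pathway of Figure 23.3 can also be written down as a small data structure, which makes it easy to ask which enzymes a strain needs for a given set of supplied precursors. The sketch below is a minimal illustration, assuming only the steps and gene products named in the text and figure; the metabolite spellings, the list layout, and the function name are hypothetical choices made for this example.

```python
# Each step: (substrates, product, gene product) as named in the text/figure.
PATHWAY = [
    (("pyruvate",), "acetolactate", "IlvB/IlvN"),
    (("acetolactate",), "dihydroxyisovalerate", "IlvC"),
    (("dihydroxyisovalerate",), "alpha-ketoisovalerate", "IlvD"),
    (("alpha-ketoisovalerate", "CH2-THF"), "alpha-ketopantoate", "PanB"),
    (("alpha-ketopantoate",), "pantoate", "PanE1"),
    (("oxaloacetate",), "aspartate", "unidentified deaminase(s)"),
    (("aspartate",), "beta-alanine", "PanD"),
    (("pantoate", "beta-alanine"), "pantothenate", "PanC"),
]


def enzymes_needed(target, available):
    """List the enzymes required to make `target` from `available` metabolites.

    Enzymes are returned in an order in which the steps could run.
    Raises KeyError if a required metabolite has no producing step.
    """
    producers = {product: (subs, enzyme) for subs, product, enzyme in PATHWAY}
    order = []

    def visit(metabolite):
        if metabolite in available:
            return
        if metabolite not in producers:
            raise KeyError(f"no route to {metabolite}")
        substrates, enzyme = producers[metabolite]
        for substrate in substrates:
            visit(substrate)
        if enzyme not in order:
            order.append(enzyme)

    visit(target)
    return order


if __name__ == "__main__":
    de_novo = {"pyruvate", "oxaloacetate", "CH2-THF"}
    print(enzymes_needed("pantothenate", de_novo))
    # Feeding beta-alanine bypasses the aspartate branch.
    print(enzymes_needed("pantothenate", de_novo | {"beta-alanine"}))
```

In this representation, supplying β-alanine in the medium removes PanD and the aspartate-forming step from the required set, which mirrors the β-alanine feeding used with the E. coli strains described above.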
Table 23.1 Pantothenate Biosynthetic and Regulatory Genes

Gene | Enzyme/Function | Length (aa) | Location (°)
panB | Ketopantoate hydroxymethyltransferase | 831 | 201.1
panC | Pantoate-β-alanine ligase | 858 | 201
panD | Aspartate 1-decarboxylase | 381 | 200.9
panE1 (ylbQ) | Ketopantoate reductase | 298 | 134.7
panE2 (ykpB) | Possible enzyme involved in HMBPA formation | 303 | 129.3
coaA | Pantothenate kinase | 319 | 210.9
coaX (yacB) | Pantothenate kinase | 233 | 6.8
glyA | Serine hydroxymethyltransferase | 415 | 323.7
serA | Phosphoglycerate dehydrogenase | 525 | 205.9
ilvB | Acetolactate synthase (acetohydroxy-acid synthase) (large subunit) | 574 | 247.4
ilvH | Acetolactate synthase (acetohydroxy-acid synthase) (small subunit) | 174 | 247.3
ilvC | Ketol-acid reductoisomerase (acetohydroxy-acid isomeroreductase) | 342 | 247.2
ilvD | Dihydroxy-acid dehydratase | 588 | 196.6

Source: Data from SubtiList (http://genomeweb.pasteur.fr/GenoList/SubtiList) and NCBI (http://www.ncbi.nih.gov) websites.
Like E. coli, B. subtilis is also a natural producer of pantothenate. Wild-type strains produce approximately 0.5 mg/l pantothenate in overnight culture on chemically defined pantothenate-free medium (G. Schyns, personal communication). However, genetically engineered B. subtilis strains produce much higher pantothenate titers (>80 g/l in 48 h fed-batch fermentations) without the need for addition of β-alanine. These strains are reported in a series of patent applications by the Pero group.62,70,71 Overproduction of pantothenate was achieved solely by increasing the transcription levels of the ilv and pan genes and genes involved in one-carbon metabolism. This was done by introducing stronger promoters (e.g., SPO1-26) upstream of the native operons (panBCD, panE, ilvBNC, and ilvD), by adding an artificial SPO1-26-driven engineered operon at a second chromosomal site (panBD), and by increasing the copy number of glyA and serA using gene amplification. These latter modifications are critical in reducing the level of pantothenate-like by-products, such as [R]-3-(2-hydroxy-3-methyl-butyrylamino)-propionic acid (HMBPA). Thus, by comparing the different changes in gene transcription and copy number, it appears that panD, and genes glyA and serA in the glycine cleavage pathway, undergo the most manipulation, suggesting that these enzymes are critically important in pantothenate synthesis. It remains to be determined whether further genetic engineering improvements in pantothenate production are feasible. Possible options include manipulation of coaA or coaX70 or introduction of analog-resistance mutations into these strains.
As mentioned above, inactivation of the spo0A gene, the master regulator of sporulation, results in a dramatic decrease in pantothenate production.35 Sporulation-deficient strains for commercial production can be constructed, however, by incorporation of mutations in later-stage transcription factors SigE and SigG, which prevent sporulation yet maintain high pantothenate production.
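The pantothenate titers quoted in this section are given in different units (about 12.5 µM for wild-type E. coli, about 0.5 mg/l for wild-type B. subtilis, and more than 80 g/l for the engineered strains), so a unit conversion helps when comparing them. The sketch below is plain arithmetic under one assumption: that the values refer to free pantothenic acid (C9H17NO5, about 219.24 g/mol) rather than to a salt. The function names are hypothetical, and the culture conditions behind the three figures differ, so the comparison is indicative only.

```python
PANTOTHENIC_ACID_G_PER_MOL = 219.24  # free acid, C9H17NO5 (assumed form)


def micromolar_to_mg_per_l(conc_um, molar_mass=PANTOTHENIC_ACID_G_PER_MOL):
    """Convert a concentration in µmol/l to mg/l."""
    return conc_um * molar_mass / 1000.0


def mg_per_l_to_micromolar(conc_mg_l, molar_mass=PANTOTHENIC_ACID_G_PER_MOL):
    """Convert a concentration in mg/l to µmol/l."""
    return conc_mg_l * 1000.0 / molar_mass


if __name__ == "__main__":
    ecoli_wt_mg_l = micromolar_to_mg_per_l(12.5)   # wild-type E. coli
    bsub_wt_um = mg_per_l_to_micromolar(0.5)       # wild-type B. subtilis
    fold_increase = 80_000.0 / 0.5                 # 80 g/l versus 0.5 mg/l
    print(f"E. coli wild type: {ecoli_wt_mg_l:.2f} mg/l")
    print(f"B. subtilis wild type: {bsub_wt_um:.2f} µM")
    print(f"Engineered versus wild-type B. subtilis: {fold_increase:,.0f}-fold")
```

On this basis the two wild-type values are of the same order (a few mg/l or a few µM), and the engineered B. subtilis titer exceeds the wild-type value by roughly five orders of magnitude.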
23.4 Metabolic Engineering of Products with Unusual Biochemistry

Using strong promoters to drive biosynthetic gene transcription and taking advantage of the copy number effect on gene expression were the key elements in engineering B. subtilis production strains for riboflavin and pantothenic acid. In both cases, the biochemical reactions from common precursor compounds to the final products were known in great detail. Moreover, efficient in vitro riboflavin production from GTP and ribulose-5-phosphate with purified Rib enzymes and alkaline phosphatase was possible (our own unpublished results), indicating that no important element influencing the metabolic flux through the riboflavin pathway is missing. For the products discussed below, there is a distinct lack of knowledge of the exact biochemistry and, in some cases, genetics, which has so far hampered successful development of commercial B. subtilis production processes. Biosynthesis of biotin and thiamin, for example, involves unusual chemical reactions that are only partially understood at present. Biosynthetic genes that are crucial to achieve a high metabolic flux toward the desired product seem to be missing for these vitamins, and only inefficient in vitro production systems with purified enzymes could be reconstituted.
23.4.1 Thiamin

23.4.1.1 Physiological and Commercial Significance

Thiamin, also called vitamin B1, serves as the coenzyme for a large number of enzyme systems in the metabolism of carbohydrates and amino acids (e.g., pyruvate dehydrogenase, pyruvate oxidase, and transketolase). Deficiency of thiamin in humans leads to the disease beriberi, for which the benefit of thiamin in prevention and treatment is not debated. Thiamin deficiencies have been associated with other possible medical syndromes (e.g., sudden infant death125), but no direct cause and effect relationships have been established. Thiamin is industrially produced in several commercial forms at several thousand metric tons per year. The commercial forms are mainly thiamin chloride HCl and thiamin nitrate.126 The market breakdown is approximately 40% for feed additives, 40% for pharma, and 20% for food applications. The major industrial thiamin producers are DSM Nutritional Products (Switzerland) and various companies in China. Several chemical processes have been published, but the one that is generally followed by thiamin manufacturers involves construction of the thiazole ring around a preformed pyrimidine moiety, Grewe diamine.126 No complete glucose-to-thiamin microbial process has been described. Only a biotransformation method has been reported, which produces thiamin phosphates from thiamin or thiamin monophosphate using E. coli cells that overproduce thiamin kinase (ThiK) or TMP kinase (ThiL).127,128

23.4.1.2 Biosynthesis of Thiamin

The thiamin biosynthetic pathway in B. subtilis is diagrammed in Figure 23.4 and the genes/enzymes are listed in Table 23.2. The biosynthetic genes are located in at least three operons, thiC, ywbI-thiME, and tenAI-thiOSGFD.
The reader is directed to several reviews on the biosynthesis of this vitamin.129–131 Briefly, thiamin is composed of two units, a pyrimidine moiety, 4-amino-5-hydroxymethyl-2-methylpyrimidine phosphate (HMP-P), and a thiazole moiety, 5-(2-hydroxyethyl)-4-methylthiazole phosphate (HET-P). HMP-P is derived from an intermediate in the de novo purine biosynthetic pathway, 5-aminoimidazole ribotide (AIR), in a conversion catalyzed by the thiC gene product and, possibly, an as yet unknown protein (see below). HMP-P is then phosphorylated to HMP-PP by the product of the thiD (yjbV) gene prior to coupling with the thiazole unit. The thiazole moiety, HET-P, is derived from 1-deoxy-D-xylulose 5-phosphate (DXP), glycine, and cysteine in a complex oxidative condensation reaction requiring the products of at least five different genes, thiF, thiS, thiO, thiG, and a nifS-like gene. The two moieties are then coupled by the ThiE enzyme to yield thiamin monophosphate (TMP). Additional phosphorylation of TMP by ThiL yields thiamin pyrophosphate (TPP), the biologically active form of the cofactor. So, unlike riboflavin and biotin, thiamin itself is not the product of de novo synthesis, but is part of the salvage pathway and can be converted by the cell to TMP and TPP using two sequential kinase reactions (thiamin to TMP to TPP) or a single pyrophosphorylase reaction (thiamin to TPP) (see below). Over 98% of intracellular thiamin products are in the form of TPP, and essentially undetectable levels of thiamin or of other thiamin forms are found outside the cell.
Figure 23.4 Thiamin biosynthesis and salvage pathways of B. subtilis. The thi genes were identified by genetic, biochemical, and/or in silico evidence; in some cases their original y-gene designation is listed. E. coli intermediates or genes that are not present in B. subtilis but have similar functions are given in parentheses. The thiamin-TMP-TPP salvage pathway is boxed. Question marks indicate putative gene(s). The operon organization of thiamin biosynthesis genes is shown in the right panel.
[Figure not reproduced. Labels recoverable from the figure: purine pathway → aminoimidazole ribotide (AIR) → hydroxymethylpyrimidine phosphate (HMP-P) (thiC + gene X ?) → hydroxymethylpyrimidine pyrophosphate (HMP-PP) (thiD/yjbV); hydroxymethylpyrimidine (HMP) → HMP-P (thiD/yjbV, pdxK/ywdB); pyruvate + glyceraldehyde-3-phosphate → 1-deoxy-D-xylulose phosphate (dxs/yqiE); 1-deoxy-D-xylulose phosphate + L-glycine (L-tyrosine) + L-cysteine → hydroxyethylthiazole phosphate (HET-P) (thiF, thiS, thiG, thiO (thiH, thiI), nifZ (iscS)); hydroxyethylthiazole (HET) → HET-P (thiM); HMP-PP + HET-P → thiamin monophosphate (TMP) (thiE); TMP → thiamin pyrophosphate (TPP) (thiL/ydiA); exogenous thiamin products (THI, TMP, TPP) with the thiamin ABC transporter, ykoFEDC (thiBPQ), and yuaJ; thiN/yloS (marked "?") and (thiK) in the salvage box. Gene organization panel: thiC; ywbI-thiME; tenAI-thiOSGFD; thiL-ydiBCDE; ywdC-pdxK; yuaJ; ykoFEDC; nifZ-ytbJ.]
Table 23.2 Thiamin Biosynthetic and Regulatory Genes

Gene | Enzyme/Function | Length (aa) | Location (°)
thiC | Biosynthesis of hydroxymethylpyrimidine phosphate | 590 | 81.8
ywbI | Similar to LysR family of transcriptional regulators | 301 | 335.9
thiM | Hydroxyethylthiazole kinase | 272 | 335.8
thiE | Thiamin phosphate synthase | 222 | 335.7
tenA | Thiaminase | 236 | 106.1
tenI | Possible antagonist of TenA | 201 | 106.1
thiO (goxB) | Glycine oxidase | 369 | 106.2
thiS (yjbS) | Sulfur carrier protein | 66 | 106.3
thiG (yjbT) | Thiazole synthase | 256 | 106.3
thiF (yjbU) | ThiS adenylyltransferase | 336 | 106.4
thiD (yjbV) | Phosphomethylpyrimidine kinase | 271 | 106.5
pdxK (thiD) | Pyridoxol/pyridoxal/pyridoxamine kinase | 271 | 333.1
thiL (ydiA) | TMP kinase | 325 | 54.7
thiN (yloS) | Thiamin pyrophosphorylase | 214 | 141.3
dxs (yqiE) | 1-Deoxy-D-xylulose synthase | 633 | 215.7
thiX (ykoC) | Possible transmembrane component of thiamin-related ABC transporter | 254 | 118.6
thiW (ykoD) | Possible ATPase component of thiamin-related ABC transporter | 490 | 118.7
thiV (ykoE) | Possible transmembrane component of thiamin-related ABC transporter | 199 | 118.8
thiU (ykoF) | Possible ligand binding protein of thiamin-related ABC transporter | 200 | 118.8
thiT (yuaJ) | Possible thiamin permease | 576 | 271.5
ylmB | Similar to acetylornithine deacetylase | 426 | 137.3

Source: Data from SubtiList (http://genomeweb.pasteur.fr/GenoList/SubtiList) and NCBI (http://www.ncbi.nih.gov) websites.
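The map positions in Table 23.2 already hint at the operon organization described in the text: genes that sit within a fraction of a degree of one another tend to belong to the same transcription unit. The sketch below groups the table entries by chromosomal position. It is only a heuristic illustration: the 0.5° gap threshold and the function name are assumptions made for this example, proximity alone does not prove co-transcription, and wrap-around at 0°/360° is ignored.

```python
# (gene, map position in degrees) as listed in Table 23.2.
THI_GENES = [
    ("thiC", 81.8), ("ywbI", 335.9), ("thiM", 335.8), ("thiE", 335.7),
    ("tenA", 106.1), ("tenI", 106.1), ("thiO", 106.2), ("thiS", 106.3),
    ("thiG", 106.3), ("thiF", 106.4), ("thiD", 106.5), ("pdxK", 333.1),
    ("thiL", 54.7), ("thiN", 141.3), ("dxs", 215.7), ("thiX", 118.6),
    ("thiW", 118.7), ("thiV", 118.8), ("thiU", 118.8), ("thiT", 271.5),
    ("ylmB", 137.3),
]


def cluster_by_position(genes, max_gap_deg=0.5):
    """Group genes whose neighbours lie within `max_gap_deg` on the map.

    Returns only clusters with more than one gene, ordered by position.
    """
    ordered = sorted(genes, key=lambda gene: gene[1])  # stable for ties
    clusters = []
    for name, position in ordered:
        if clusters and position - clusters[-1][-1][1] <= max_gap_deg:
            clusters[-1].append((name, position))
        else:
            clusters.append([(name, position)])
    return [[name for name, _ in c] for c in clusters if len(c) > 1]


if __name__ == "__main__":
    for cluster in cluster_by_position(THI_GENES):
        print("-".join(cluster))
```

With the table values, the three groups recovered this way correspond to the tenAI-thiOSGFD and ywbI-thiME operons named in the text and to the ykoFEDC transporter genes shown in Figure 23.4.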
23.4.1.3 Thiamin Pathway Engineering

Since the current chemical processes produce thiamin chloride or thiamin nitrate,126 a fermentation process should produce a “substantially equivalent” product to avoid registration of a new product, a procedure that is time-consuming and costly. Consequently, a key hurdle in developing an economically relevant glucose-to-thiamin fermentation process is the need to produce one or both of these compounds. This has proven to be difficult because major re-routing of the thiamin-TMP-TPP pathway is needed to excrete thiamin from the cell without disrupting intracellular synthesis of TPP. Nevertheless, Schyns et al.51 have made important strides toward this goal through the isolation and characterization of four mutations in thiamin biosynthetic and salvage genes: thiL, encoding TMP kinase; thiN, encoding thiamin pyrophosphorylase; thiT (yuaJ), encoding a thiamin permease; and thiW (ykoD), encoding a component of a thiamin ABC transporter. Combination of these four mutations into a single strain, called TH95, resulted in significant excretion of thiamin into the culture medium. Fermentation studies at the 10 l scale resulted in the production of thiamin products in the mg/l range. With the successful re-routing of the thiamin-TMP-TPP pathway to excrete thiamin, it should be possible to further improve thiamin production through direct engineering of the thi biosynthetic genes. Schyns et al. have also taken the first steps toward this goal by increasing the expression of two thi operons in the TH95 strain: thiC, encoding an HMP-P synthase, and the so-called thiB operon, tenAI-thiOSGFD, encoding genes involved in thiazole formation and HMP kinase. To this end, the native promoters were replaced with constitutive promoters derived from the Bacillus SPO1 bacteriophage132,133 and the engineered genes were amplified in the chromosome.134 Unfortunately, only a very modest increase in thiamin production (<10-fold) was observed.
Although the reason for this result is not clear, two inherent features of the pathway and the biosynthetic genes themselves may make it difficult to achieve increases in thiamin production to the gram per liter level.
First, the activity of several of the Thi biosynthetic enzymes is quite poor. This has been quantified in vitro by Begley and his coinvestigators using reconstitution assays for both the thiazole and HMP pathways. In particular, the HMP-P synthase reaction catalyzed by ThiC requires an intricate intramolecular rearrangement of AIR, a purine intermediate, to form the pyrimidine. It is therefore not surprising that this reaction is very slow (Lawhorn et al., 2004).135 Recent evidence indicates that ThiC contains a 4Fe–4S cluster required for its physiological function.136,137 Consequently, this reaction has mechanistic features similar to those of the biotin synthase (BioB) reaction, which constitutes a difficult bottleneck to breach in engineering of the biotin biosynthetic pathway (see below). Second, the availability of the key thiamin precursors, AIR for HMP-P formation and DXP for thiazole formation, may be insufficient. While carbon flow through the purine pathway can be substantial in purine-producing B. subtilis, the steady-state level of AIR may be too low to allow for higher production levels of HMP-P. Nevertheless, it should be possible to increase the levels of AIR by decreasing PurE activity while simultaneously deregulating the purine pathway. This must be done in such a way that limitation of purines and their derivatives in the cell is prevented. Similarly, the competition of isoprenoid biosynthesis for DXP also limits the availability of this precursor for thiazole biosynthesis. Attempts to increase the expression of the dxs gene have not led to further increases in thiamin levels. Moreover, manipulation of this gene to increase isoprenoid biosynthesis has also led to only modest increases.138 This may indicate that Dxs is subject to allosteric feedback inhibition, thus making overexpression an inappropriate engineering tool to increase flux through this reaction.
Finally, tenA, the first gene in the thiB operon for thiazole biosynthesis, has been recently shown to encode a type 2 thiaminase.139 However, this enzyme is involved in the regeneration of the thiamin pyrimidine rather than in thiamin degradation, identifying a new pathway involved in the salvage of base-degraded forms of thiamin, which is widely distributed among bacteria, archaea, and eukaryotes. Consequently, overexpression of the thiB operon may enhance salvage of HMP from thiamin analogs generated by degradation.140
23.4.1.4 Outlook
While recent genetic engineering efforts have shown that it is possible to produce thiamin by fermentation, it is clear that the low activity of the biosynthetic enzymes and the low carbon flux into the pathway prevent generation of high-performance strains. Consequently, more basic knowledge of the biosynthetic pathway for thiamin and its precursors will be necessary in any attempt to generate a commercially viable fermentation process.
23.4.2 Biotin
23.4.2.1 Physiological and Commercial Significance
Biotin, also called vitamin H, is a cofactor required for numerous carboxylases, such as acetyl coenzyme A (acetyl-CoA) carboxylase and, putatively, pyruvate carboxylase. Commercially, biotin is produced exclusively via chemical syntheses involving ten or more reaction steps,141,142 because fermentation processes have not yet cracked the 1 g/l boundary. The total market size is around 120 t/a, with biotin being almost exclusively sold to the animal feed sector and only 10% sold for pharma and food applications. The major industrial biotin producers are DSM Nutritional Products (Switzerland) and various companies in China.
23.4.2.2 Biosynthesis of Biotin
All known biosynthetic genes are organized into a single operon (bioWFADBI-ytbQ) (Table 23.3); two other genes, bioY1 and bioY2, show strong similarity to bioY of Rhizobium etli, which has been shown to be involved in biotin uptake.143 Although considerable work has been done on the biosynthesis of biotin (e.g., Refs. 129,130,139,142,143), the pathway is not entirely clear. Briefly, the first known intermediate is
Metabolic Engineering of Bacillus subtilis

Table 23.3 Biotin Biosynthetic and Regulatory Genes

Gene            Enzyme/Function                                   Length (aa)   Location (°)
bioW            Pimeloyl-CoA synthase                             259           264.3
bioA            DAPA aminotransferase                             448           264.2
bioF            7-KAP synthase                                    389           264.1
bioD            DTB synthetase                                    231           264
bioB            Biotin synthase                                   335           264
bioI            Biotin cytochrome P450                            395           263.9
ytbQ            Unknown/not required for biotin synthesis         253           263.8
birA            Biotin operon repressor/biotin-protein ligase     325           201.2
bioY1 (yhfU)    Possible biotin permease                          186           95
bioY2 (yuiG)    Possible biotin permease                          200           281.3
Source: Data from SubtiList (http://genomeweb.pasteur.fr/GenoList/SubtiList) and NCBI (http://www.ncbi.nih.gov) websites.
pimelic acid, which in B. subtilis is synthesized from an unknown fatty acid precursor by a unique cytochrome P450 encoded by the bioI gene (Figure 23.5). Extensive work by the Munro and De Voss laboratories has shown that BioI has a high affinity for long-chain fatty acids and can form small amounts of pimelic acid from tetradecanoic acid in vitro.146–148 Pimelic acid is then converted to pimeloyl-CoA by BioW, which so far has been found only in B. subtilis and B. sphaericus. Biosynthetic enzymes BioA, BioF, BioD, and BioB convert pimeloyl-CoA to biotin in steps equivalent to those in most other bacteria. However, the DAPA aminotransferases from E. coli and B. sphaericus use S-adenosylmethionine (SAM) as an amino donor to convert 8-amino-7-ketopelargonic acid (7-KAP) to 7,8-diaminopelargonic acid (DAPA), whereas the DAPA aminotransferase of B. subtilis uses lysine as the amino donor.61
23.4.2.3 Biotin Pathway Engineering
Several biotin producers have been reported, with those for E. coli and Serratia marcescens producing the highest quantity of biotin at approximately 1 g/l.141 These strains combine overexpression of biotin biosynthesis genes with host mutations selected by resistance to biotin analogs such as acidomycin. Engineered strains of B. subtilis (and B. sphaericus), conversely, produce much lower levels of biotin. In these engineered B. subtilis strains, the expression of the biosynthetic genes is increased by replacing the native promoter and regulatory site with the SPO1-15 promoter and, then, by amplifying the entire operon (P15 bioWFADBIorf2) containing all of the biosynthetic genes to five to seven copies per cell. These strains produce up to 1 g/l dethiobiotin from exogenously fed pimelic acid in a 48 h fed-batch fermentation, but at most 25 mg/l of authentic biotin,69 indicating a severe block in the biotin synthase (encoded by the bioB gene) step in the pathway.
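To put the reported titers on a common footing, the following minimal sketch converts them to molar concentrations and estimates which fraction of the dethiobiotin (DTB) plus biotin pool has passed the BioB step. The molar masses are standard values that are not given in the chapter, the function names are hypothetical, and because both titers are quoted as upper limits ("up to", "at most") the result is only an order-of-magnitude illustration.

```python
# Molar masses in g/mol (standard values, not taken from the chapter).
MW_DTB = 214.26     # dethiobiotin, C10H18N2O3
MW_BIOTIN = 244.31  # biotin, C10H16N2O3S


def mmol_per_l(titer_g_per_l, molar_mass):
    """Convert a mass titer (g/l) into a molar concentration (mmol/l)."""
    return titer_g_per_l / molar_mass * 1000.0


def biotin_fraction(dtb_g_per_l, biotin_g_per_l):
    """Molar fraction of the (DTB + biotin) pool that was converted to biotin."""
    dtb = mmol_per_l(dtb_g_per_l, MW_DTB)
    biotin = mmol_per_l(biotin_g_per_l, MW_BIOTIN)
    return biotin / (dtb + biotin)


# Titers quoted in the text: up to 1 g/l DTB, at most 25 mg/l biotin.
fraction = biotin_fraction(1.0, 0.025)
print(f"Fraction converted by BioB: {fraction:.1%}")
```

Under these assumptions only about 2% of the pool reaches biotin, which is the quantitative form of the "severe block" at the BioB step.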
Limitation of this reaction is hardly surprising since it is also seen in all BioB orthologs from other genera. The enzymology of this step is complex, involving free-radical chemistry to insert a sulfur atom between two nonreactive carbon atoms (the methyl and methylene carbon atoms adjacent to the ureido ring). Biotin synthase has been shown to contain two iron sulfur clusters, a [4Fe-4S] center that forms a free radical (5′-deoxyadenosyl radical, DOA*) by reductive cleavage of S-adenosyl-L-methionine and catalyzes the formation of the two C-S bonds, and a [2Fe-2S] center that is thought to donate the sulfur atom.145,149,150 In addition to BioB, this step also requires other enzymes [e.g., flavodoxin (or ferredoxin) and flavodoxin (or ferredoxin) NADP+ reductase] to shuttle reducing equivalents from NADPH to BioB to drive the reaction.144 In all, two molecules of SAM are required to insert one sulfur atom into DTB. Why do engineered Gram-negative strains produce more biotin than Gram-positive strains? The reason for this is not known, but it clearly resides in better conversion of DTB to biotin by BioB. Until recently, no in vitro assay has been developed where turnover of BioB has been demonstrated. The Cronan group, however, has recently demonstrated turnover of E. coli BioB in a novel quantitative
[Figure 23.5, pathway scheme. Recoverable content: pimelic acid is converted by bioW (ATP, CoA) to pimeloyl-CoA; bioF (L-Ala) yields 7-keto-8-aminopelargonic acid (7-KAP); bioA (amino donor: SAM in E. coli, lysine in B. subtilis) yields 7,8-diaminopelargonic acid (DAPA); bioD (ATP, HCO3-) yields dethiobiotin (DTB); bioB (NADPH, two SAM, Cys/Fe-S, NifSU-like proteins, flavodoxin, ferredoxin reductase, other unknown factors) yields biotin. Upstream, the scheme shows bioI for B. subtilis, bioX for B. sphaericus, and bioC/bioH for E. coli, each starting from an unknown precursor (?).]
Figure 23.5 Biotin biosynthesis pathways of B. subtilis, E. coli, and B. sphaericus. Question marks indicate yet unknown reactions. The last reaction is catalyzed by the bioB gene product; the potential sulfur donor and the additional proteins and cofactors listed are discussed in the text.
Western blot assay that measured incorporation of labeled biotin into AccB, the biotinylated subunit of acetyl-CoA carboxylase.151 Although the level of turnover was modest (~50 molecules of biotin per molecule of BioB), it still demonstrated that BioB acts as a true catalyst rather than as a substrate or reactant. Moreover, Cronan’s group has demonstrated that 5′-deoxyadenosine inhibits BioB activity and that this inhibitor is removed by
5′-methylthioadenosine/S-adenosylhomocysteine nucleosidase encoded by E. coli pfs.152 Furthermore, the Jarrett group has recently shown that excess iron reduces the rate of BioB degradation, presumably by preventing loss of the Fe-S cluster.153 It could be that the selective combination of chemical mutagenesis and genetic engineering in E. coli and S. marcescens has improved the stability of BioB, partially debottlenecking the BioB step and resulting in strains that produce up to 1 g/l biotin.141 Since B. subtilis contains similar genes and enzymes, it will be interesting to see if a similar approach will be successful in this Gram-positive bacterium.
23.4.3 Vitamin B6
23.4.3.1 Physiological and Commercial Significance
The terms “vitamin B6” and “pyridoxine” designate a group of six B6 vitamers: pyridoxol, pyridoxal, pyridoxamine, and their respective 5′-phosphate esters. Pyridoxal 5′-phosphate (PLP) is an essential cofactor for more than 100 enzymes, the majority of which are involved in amino acid metabolism. Pyridoxamine 5′-phosphate is an essential cofactor in the biosynthesis of deoxysugars. Commercially, vitamin B6 is produced exclusively via chemical synthesis. The total market size is <5000 t/a. Vitamin B6 is mostly used for animal feed and pharma applications, with lesser use in food and cosmetics. The major industrial vitamin B6 producers are DSM Nutritional Products (Switzerland) and various companies in China.
23.4.3.2 Biosynthesis of Vitamin B6
Most organisms contain salvage pathways that allow interconversions between the different B6 vitamers. For de novo biosynthesis of vitamin B6, two different pathways are known. (1) The PdxA/PdxJ pathway, found in E. coli and many other Gram-negative bacteria, comprises six dedicated steps and yields pyridoxol 5′-phosphate as the initial B6 vitamer. (2) The second pathway is more prevalent and is found, e.g., in archaebacteria, fungi, plants, plasmodia, some metazoa, and a subset of eubacteria including B. subtilis. In this pathway, PdxS and PdxT (YaaD/YaaE in B. subtilis) are the only enzymes required to convert three substrates from central metabolism into PLP: ribose 5-phosphate or ribulose 5-phosphate; dihydroxyacetone phosphate or glyceraldehyde 3-phosphate; and glutamine (or another ammonia donor). PdxS and PdxT appear to be the synthase and glutaminase subunits, respectively, of an equimolar enzyme complex.154
23.4.3.3 Vitamin B6 Pathway Engineering
Reports on the overproduction of vitamin B6 in B. subtilis are scarce and of limited success, with maximal product titers of 61 mg/l in 48-h shake-flask culture.155 The attractiveness of producing vitamin B6 in B. subtilis lies in the facts that all required substrates come from central metabolism, and that only two enzymes may need to be overexpressed and/or engineered to achieve high-level production. Nevertheless, a number of arguments question the feasibility of such an approach. Firstly, the catalytic efficiency of the PdxST PLP synthase complex is very low (kcat = 0.02 min⁻¹).156,157 Therefore, overexpression of PdxST alone will not be sufficient to attain meaningful PLP levels, requiring protein engineering to drastically increase the specific activity of the PLP synthase complex. Secondly, the catalytic repertoire of the PdxST PLP synthase is more complex than reactions catalyzed by other glutamine amidotransferases (see Refs. 156,157). In addition to ammonia transfer, the PdxST complex catalyzes condensation of two phosphosugars, closure of the pyridine ring, as well as isomerase reactions for its two pairs of phosphosugar substrates. The complexity of this enzyme is reflected in high sequence conservation of PdxS from archaebacteria to eukarya (at least 60% amino acid sequence identity), suggesting that targeted engineering of PdxS may be very difficult. Thirdly, overproduction of PLP may interfere with metabolic pathways that depend on this cofactor. Fourthly, PLP and pyridoxal are more reactive and, thus, less inert and stable than pyridoxol 5′-phosphate and pyridoxol, respectively, making the latter the B6 vitamers of choice for biotechnological production. In summary, building on
B. subtilis PdxST for biotechnological production of vitamin B6 appears to require several technical breakthroughs, which may be too numerous to make this approach a viable option.
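The first argument above, that overexpression alone cannot compensate for a kcat of 0.02 min⁻¹, can be made concrete with a back-of-the-envelope calculation. The sketch below is only illustrative: the 1 g/l target titer, the 48 h process time, and the approximate 32 kDa subunit mass for PdxS are assumptions that do not come from the chapter, and the enzyme is assumed to run saturated at kcat for the whole process, which is optimistic.

```python
KCAT_PER_MIN = 0.02          # PdxST turnover number quoted in the text
MW_PLP = 247.14              # g/mol, pyridoxal 5'-phosphate (standard value)
MW_PDXS_ASSUMED = 32_000.0   # g/mol, rough PdxS subunit mass (assumption)


def turnovers(kcat_per_min, hours):
    """Product molecules formed per enzyme molecule over the process time."""
    return kcat_per_min * 60.0 * hours


def enzyme_needed_g_per_l(target_g_per_l, hours):
    """Enzyme mass (g/l) needed to reach the target titer at constant kcat."""
    product_mol_per_l = target_g_per_l / MW_PLP
    enzyme_mol_per_l = product_mol_per_l / turnovers(KCAT_PER_MIN, hours)
    return enzyme_mol_per_l * MW_PDXS_ASSUMED


# Hypothetical target: 1 g/l PLP within a 48 h process.
print(f"Turnovers per enzyme in 48 h: {turnovers(KCAT_PER_MIN, 48):.1f}")
print(f"PdxS required: {enzyme_needed_g_per_l(1.0, 48):.2f} g/l")
```

Under these assumptions each enzyme molecule completes fewer than 60 turnovers in 48 h, so more than 2 g/l of PdxS would be needed to make 1 g/l of PLP. The catalyst would outweigh the product, which is why the text calls for a drastic increase in specific activity.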
23.5 Other Potentially Relevant Products
Purine nucleotides are industrially relevant primarily for their ability to enhance the flavor of food. Corynebacterium glutamicum is an attractive host because it secretes the phosphorylated compounds directly. Alternatively, B. subtilis K (later reclassified as B. amyloliquefaciens) processes were developed that accumulate above 20 g/l of the nucleosides inosine or guanosine, which are then chemically phosphorylated to the nucleotides.10,160 Initially as the starting material for the chemical synthesis of flavor-enhancing nucleotides but more importantly today as a chiral synthon for antiviral and anticancer therapeutics, the pentose sugar ribose has received significant attention. Commercial feasibility of fermentative ribose production was achieved with B. subtilis and B. pumilus strains with mutations in transketolase and/or ribulose-5-phosphate 3-epimerase. Thus, by blocking the nonoxidative pentose phosphate pathway, an effective one-way route from glucose to ribose is established that yields ribose titers exceeding 90 g/l.105 Such transketolase mutants are apparently also capable of producing in the range of 20 g/l of the seven-carbon sugar sedoheptulose, an intermediate of the pentose phosphate pathway, from ribose as the substrate.161 This preexisting industrial experience from the 1970s and 1980s also facilitated the initial decision to engineer the above-discussed vitamin biosynthesis pathways in B. subtilis, because purine nucleotides, their intermediates, and/or ribose are key biochemical building blocks. As an unbranched polysaccharide of up to 20000 disaccharide units of N-acetyl-D-glucosamine and D-glucuronic acid, hyaluronic acid has a broad range of applications in the cosmetic and pharmaceutical industries with a current market estimate of $1 billion.
The current commercial sources of hyaluronic acid are rooster combs (with the concomitant viral pathogenic and allergic potential) and attenuated strains of group C Streptococcus. While most metabolic engineering efforts focussed on various streptococci,162 recent overexpression of the hyaluronan synthase gene (hasA) from S. equisimilis in B. subtilis resulted in a modest hyaluronan yield.163 Construction of artificial operons that contain hasA plus one or more genes involved in UDP-precursor synthesis revealed that the production of UDP-glucuronic acid is limiting in B. subtilis. Overexpression of hasA along with the endogenous tuaD gene is sufficient for modest production of hyaluronic acid at a quality that meets the standards of commercial preparations. Less impressive results have recently been published on the overproduction of the vitamin folic acid and the lantibiotic subtilin in B. subtilis. Zhu et al.164 tested a number of strategies to increase folic acid production in B. subtilis. A strain possessing IPTG-inducible pyruvate kinase, overexpressed aroH (to improve supply of the folic acid precursor para-aminobenzoic acid), and increased transcription and translation of genes from the folate operon exhibited the best yield. However, the improvement was only eight-fold relative to the parent, B. subtilis 168. Heinzmann et al.165 employed two engineering strategies to improve the production of the lantibiotic subtilin in B. subtilis. Overexpression of the subtilin self-protection (immunity) genes spaIFEG enhanced the subtilin yield 1.7-fold. Deletion of a repressor of subtilin gene expression, abrB, caused a six-fold increase in subtilin yield. Disappointingly, however, combination of the two engineering strategies lowered rather than further increased subtilin production, and the produced subtilin fraction predominantly consisted of succinylated subtilin species with less antimicrobial activity compared to unmodified subtilin.
23.6 Current Challenges and New Possibilities
23.6.1 Metabolic Engineering of General Host Properties
The aforementioned applications of metabolic engineering focused almost exclusively on optimizing the specific biosynthetic pathways to particular products and their supply routes. One key area that will
inevitably gain increasing relevance in commercial overproducers is engineering of general host properties that relate to the performance under industrial process conditions, which typically include low growth rates, large material flows to products at little (preferably no) biomass formation, and/or growth of highly engineered strains in large-scale bioreactors under nutrient limitation. A particular problem of B. subtilis is its comparatively high maintenance energy demand;166 i.e., the amount of energy that cells require to simply maintain themselves in an active state without product or biomass formation.167 This maintenance energy demand is an inherent feature of an organism to maintain, for example, ion gradients across cellular membranes, and becomes a major consumer of substrate resources during industrial fed-batch fermentations with low growth rates and high biomass concentrations. A successful example of host engineering to reduce the substrate expenditures for the still unaltered maintenance demand was rerouting the electron flux through more efficient branches of the respiratory chain of B. subtilis, thereby significantly improving riboflavin production in fed-batch processes.168,169 To further cut down on process and substrate costs, the conceptual development of completely decoupled product formation from cell growth would be extremely rewarding.170 If production of biochemicals could be sustained at a high rate for extended periods under conditions of zero growth, much higher titers and yields would be feasible because valuable nutrients and energy would not be channeled away from the desired product. The advantages of a partial decoupling were already demonstrated in the riboflavin section.109 Whether a complete decoupling is actually possible in bacteria at commercially relevant rates is an open question, but first advances were made by using appropriate selection protocols169 and rational engineering in E. 
coli.172 Another recent development that offers promising opportunities for the engineering of generally improved production hosts is the advent of powerful genome modification techniques. The key applied goal is a “minimal” production host that contains only the required set of genes to grow and optimally produce the compound(s) of interest. It is naïve to expect that simply aiming for a minimal genome would concomitantly yield superior production strains, as such strains would at least lack the industrially required robustness and sturdiness. Nevertheless, many cellular traits are clearly irrelevant or even detrimental for the overproduction of industrially relevant compounds. One such example is the 15% reduction of the E. coli genome, where primarily removal of recombinogenic or mobile DNA elements and cryptic virulence genes generated a host for stable propagation of otherwise unstable plasmids/genes.173 For B. subtilis, removal of large dispensable genome regions containing 332 genes (e.g., prophages and polyketide synthesis genes) reduced the genome by 7.7%, without concomitantly affecting cell physiology, fluxes, or protein secretion.174 At the other extreme, Itaya et al.74 cloned the entire 3.5-Mb genome of the photosynthetic bacterium Synechocystis PCC6803 into the genome of B. subtilis 168, thereby generating a 7.7-Mb composite genome. What is required now are systematic attempts to remove and add further genome regions in combination with rigorous testing of strain performance under industrially relevant conditions.
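As a worked illustration of the maintenance argument made earlier in this section, the sketch below uses the standard Pirt relation, q_S = mu / Y_max + m_S, which is a textbook formulation and is not given in the chapter. It estimates what share of substrate uptake is spent on maintenance at different specific growth rates. The parameter values are purely illustrative assumptions, not measured values for B. subtilis.

```python
def substrate_uptake(mu, y_max, m_s):
    """Pirt relation: specific substrate uptake q_S in g substrate/(g biomass*h).

    mu    : specific growth rate (1/h)
    y_max : maximum biomass yield on substrate (g/g)
    m_s   : maintenance coefficient (g substrate/(g biomass*h))
    """
    return mu / y_max + m_s


def maintenance_fraction(mu, y_max, m_s):
    """Share of substrate uptake that is spent on maintenance alone."""
    return m_s / substrate_uptake(mu, y_max, m_s)


# Illustrative (assumed) parameters, not taken from the chapter.
Y_MAX = 0.5   # g biomass per g substrate
M_S = 0.05    # g substrate per g biomass per hour

for mu in (0.5, 0.05, 0.01):
    share = maintenance_fraction(mu, Y_MAX, M_S)
    print(f"mu = {mu:4.2f} 1/h -> maintenance share = {share:.0%}")
```

With these assumed parameters the maintenance share grows from roughly 5% at fast growth to about one third at mu = 0.05 1/h and beyond two thirds at mu = 0.01 1/h. This is the regime of slow-growing, high-density fed-batch cultures in which a high maintenance demand becomes costly.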
23.6.2 Genome Resequencing
Of particular industrial value for truly genome-based strain reconstructions175 are the emerging lower-cost technologies for genomic (re)sequencing from companies such as NimbleGen, Affymetrix, 454 Life Sciences, or Solexa.176–178 Much of the contemporary industrial practice still exploits the traditional approach of random mutagenesis and screening for overproducers, and several such examples that screen for antimetabolite-resistant vitamin overproducers were discussed above. While incredibly successful, the drawback of such brute-force methods is the accumulation of (near-) neutral or even counterproductive mutations that often result in crippled production hosts.179,180 Affordable comprehensive resequencing of strains from classical mutagenesis programs can identify the crucial mutations for metabolite overproduction, thereby allowing a rational (re)construction of production strains without the otherwise accumulating negative traits. As a first application, resequencing of B. subtilis 168 with NimbleGen microarrays demonstrated that the published B. subtilis 168 genome sequence6 may contain
~2000 sequencing errors (B. Chevreux et al., unpublished data). Recent publications demonstrate the utility of resequencing methods to measure sequence divergence among different bacterial laboratory strains, between their individual isolates, and in strain lineages.29,181 Such methods are quickly finding their way into industrial strain engineering programs.175,182,183
23.6.3 Omics and Systems Biology
Reflecting its status as the primary Gram-positive model bacterium, the available omics methods for B. subtilis are quite well advanced. Expression microarrays covering virtually all protein-encoding genes are available on a number of different platforms (e.g., Refs. 182,183), and classical expression microarrays and so-called tiling microarrays are used to systematically map the relevance of small, regulatory RNAs in B. subtilis.186 Advanced proteomics technologies allow concurrent identification and quantification of several hundred B. subtilis proteins in a single experiment.8,187 Similarly, emerging metabolomics platforms allow the concurrent analysis of several hundred metabolites of B. subtilis.188,189 A major challenge in metabolomics remains the large number of still unidentified metabolites, even in a simple model organism such as B. subtilis. Beyond molecular concentrations, regulatory processes are crucial, and the DBTBS database (http://dbtbs.hgc.jp; Ref. 188) is a collection of experimentally validated gene regulatory relations and the corresponding transcription factor binding sites upstream of B. subtilis genes. In addition, DBTBS contains information on operon structures and transcriptional terminators. As a complement, regulatory mechanisms can be inferred computationally, for instance from microarray data.191 Knowledge of protein–protein interactions is less well advanced and covers primarily proteins involved in DNA replication and chromosome dynamics.192 With few exceptions, the available information on biomolecular turnover rates is also rather scarce and fragmented for B. subtilis. Using a combination of L-[35S] methionine pulse-labelling with 2D-gel electrophoresis, Bernhardt et al.193 analyzed qualitatively the response of protein synthesis rates to heat shock and oxidative stress. Hambraeus et al.194 determined the decay rates for mRNAs corresponding to ~1500 B. subtilis genes at early stationary phase.
About 80% of these mRNAs had a half-life of less than 7 min. On the other hand, more than 30 mRNAs, including both mono- and polycistronic transcripts, had half-lives of ≥15 min. At the level of intracellular fluxes, the most comprehensive “fluxome” analysis with a plethora of B. subtilis knockout mutants to date revealed a robust distribution of flux, which indicates a surprisingly high degree of network rigidity of B. subtilis.195 This result might explain why, with the notable exception of the impaired transketolase modification described in the riboflavin section, various attempts to redirect the central metabolic flux toward the building blocks were met with limited success. Such negative examples include expression modification of glycolysis, TCA cycle, or pentose phosphate pathway enzymes for riboflavin production (e.g., Ref. 194). Possibly the application of multiple omics methods (for an example of riboflavin production see Ref. 195) may increase the success rate. While “holistic” concentration analysis with any of the above omics technologies can provide valuable insights, experience far beyond B. subtilis clearly reveals the misconception that any such single technology will have a strong impact on metabolic engineering. Not only are we still waiting for true success stories of omics applications in industrial biotechnology, they fail inherently to capture the complex and interdependent component interactions and processes that define a living cell. Obviously metabolic flux measurements are a most valuable complement to proteomics and metabolomics analysis for the elucidation of further targets in metabolite overproducing strains. Simply doing these analyses in parallel without some integrating framework, however, exploits only a fraction of the potential. 
Conceptual advances are only expected when such data are truly integrated by some type of mathematical/computational framework, such as for example network-embedded thermodynamic analysis198 or detailed kinetic models,199,200 and we thus enter the territory of systems biology.201,202 Ideally, data emerging from both established and forthcoming “omics” technologies should be quantitatively interpreted in the context of existing knowledge (e.g., the metabolic network) through mathematical
models of increasing detail and precision. Development of such models in B. subtilis, however, has significantly lagged behind virtually all other model microbes. Previous modelling efforts have generally ignored cellular dynamics and focussed on metabolism; and even in these studies, only a few stoichiometric models were used to elucidate limits and capabilities of vitamin or biomass production.203–207 Only very recently has there been a notable shift in focus to dynamic and partly stochastic modelling, primarily of the differentiation processes in B. subtilis.208–211 A large-scale attempt at modelling metabolism is the recently started EC project BaSysBio, coordinated by Philippe Noirot from INRA, Jouy en Josas, France. By bringing together a number of well-known modelling and informatics groups with strong experimental partners from the B. subtilis world, the project aims to achieve just such data integration by developing models that unravel the global regulatory structure of B. subtilis metabolism to understand how transcriptional regulation is integrated with the other levels of control.
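For readers unfamiliar with the stoichiometric models mentioned above, the following minimal sketch shows the core idea on a hypothetical four-reaction toy network: a flux distribution is admissible only if every internal metabolite is balanced (S·v = 0), and production limits follow from that constraint plus bounds on individual fluxes. The network, reaction names, and numbers are invented for illustration. Real genome-scale models have hundreds of reactions and are typically solved by linear programming (flux balance analysis) rather than by inspection.

```python
# Hypothetical toy network (illustration only):
#   r_uptake : (medium) -> A
#   r_biomass: A -> (biomass)
#   r_synth  : A -> B
#   r_export : B -> (product)
REACTIONS = ["r_uptake", "r_biomass", "r_synth", "r_export"]

# Stoichiometric matrix S: rows are internal metabolites, columns are reactions.
S = [
    [1, -1, -1, 0],   # metabolite A
    [0, 0, 1, -1],    # metabolite B
]


def net_production(s_matrix, fluxes):
    """Net production rate of each internal metabolite (S multiplied by v)."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in s_matrix]


def is_steady_state(s_matrix, fluxes, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(x) <= tol for x in net_production(s_matrix, fluxes))


def max_product_fluxes(uptake, min_biomass):
    """Flux vector with maximal product export for this linear toy network.

    Steady state forces r_synth = r_export = uptake - r_biomass, so export is
    maximal when the biomass flux sits at its lower bound.
    """
    if min_biomass > uptake:
        raise ValueError("biomass requirement exceeds substrate uptake")
    product = uptake - min_biomass
    return [uptake, min_biomass, product, product]


v = max_product_fluxes(uptake=10.0, min_biomass=2.0)
print(dict(zip(REACTIONS, v)), "steady state:", is_steady_state(S, v))
```

Even this toy case shows the trade-off that larger stoichiometric models quantify: every unit of flux committed to biomass lowers the attainable product flux.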
23.6.4 Conclusions
A fair question at the end of any such review is how a new project would be executed today, compared to the successful empirical and intuitive efforts used some 15 years ago, e.g., in metabolic engineering of riboflavin overproduction.34 Despite the omics revolution in the last decade, we are inclined to conclude that from a conceptual perspective, hardly anything has changed. However, the impressive improvements in comprehensiveness and throughput of analytical (omics) technologies would allow obtaining more detailed insights into cellular physiology, at greater speed and/or higher time resolution. The genome sequence simplifies many strategies, and constructed mutants and production strains can be tested more rigorously, which certainly helps to abandon unsuccessful lines more rapidly and possibly also to point out unsuspected further targets. While detailed stoichiometry of central metabolism and riboflavin biosynthesis was already modelled years ago for vitamin production assessment and qualitative in silico elucidation of different metabolic engineering strategies,203,204 there is some chance that present stoichiometric models at genome scale might suggest even more useful strategies.205,212 Overall, omics approaches would save some man-years in reaching metabolic engineering targets. Equally or even more important contributors to commercial success are, in our view, experimental know-how in rapid genetic engineering, a deep understanding of the organism’s physiology, and experience with large-scale cultivation. The trajectory to more powerful predictive “whole-cell” models of B. subtilis, involving both metabolic and regulatory network structures, however, clearly has the potential to change industrial practice. While work in this field still needs to be done, systems biology holds the promise to process high-throughput experimental data such that meaningful hypotheses and models can be generated and validated.
References
1. Sonenshein, A. L., Hoch, J. A., and Losick, R. Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. American Society for Microbiology: Washington, DC, 1993.
2. Sonenshein, A. L., Hoch, J. A., and Losick, R. Bacillus Subtilis and its Closest Relatives. ASM Press: Washington, DC, 2002.
3. Outtrup, H. and Jorgensen, S. T. The importance of Bacillus species in the production of industrial enzymes. In Applications and Systems of Bacillus and Relatives. Berkeley, R., Ed. Blackwell Science Inc.: Malden, MA, 2002; pp 206–218.
4. Zeigler, D. R. and Perkins, J. B. The genus Bacillus. In Practical Handbook of Microbiology, 2nd ed. Goldman, E. and Green, L., Eds. CRC Press: Boca Raton, FL, 2008; pp 301–329.
5. Harwood, C. R. and Cutting, S. M. Molecular Biological Methods for Bacillus. John Wiley & Sons: New York, 1990.
6. Kunst, F., Ogasawara, N., and Moszer, I. et al. The complete genome sequence of the Gram-positive bacterium Bacillus subtilis. Nature, 1997, 390, 249–256.
7. Kobayashi, K., Ehrlich, D. S., Albertini, A. M., Amati, G., Andersen, K. K., Arnaud, M., and Asai, K. et al. Essential Bacillus subtilis genes. Proc. Natl. Acad. Sci. USA, 2003, 100, 4678–4683.
8. Hecker, M. and Volker, U. Towards a comprehensive understanding of Bacillus subtilis cell physiology by physiological proteomics. Proteomics, 2004, 4 (12), 3727–3750.
9. Arbige, M. V., Bulthuis, B. A., Schultz, J., and Crabb, D. Fermentation of Bacillus. In Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 1993; pp 871–895.
10. Schallmey, M., Singh, A., and Ward, O. P. Developments in the use of Bacillus species for industrial production. Can. J. Microbiol., 2004, 50 (1), 1–17.
11. Phelps, R. J. and McKillip, J. L. Enterotoxin production in natural isolates of Bacillaceae outside the Bacillus cereus group. Appl. Environ. Microbiol., 2002, 68, 3147–3151.
12. EFSA. Opinion of the Scientific Committee on a request from EFSA related to a generic approach to the safety assessment by EFSA of microorganisms used in food/feed and the production of food/feed additives. EFSA J., 2005b, 226, 1–12.
13. Alexopoulos, C., Georgoulakis, I. E., Tzivara, A., Kritas, S. K., Siochu, A., and Kyriakis, S. C. Field evaluation of the efficacy of a probiotic containing Bacillus licheniformis and Bacillus subtilis spores, on the health status and performance of sows and their litter. J. Anim. Physiol. Anim. Nutr. (Berlin), 2004, 88, 381–392.
14. Tam, N. K. M., Uyen, N. G., Hong, H. A., Duc, L. H., Hoa, T. T., Serra, C. R., Henriques, A. O., and Cutting, S. M. The intestinal life cycle of Bacillus subtilis and close relatives. J. Bacteriol., 2006, 188, 2692–2700.
15. Sanders, M. E., Morelli, L., and Tompkins, T. A. Spore formers as human probiotics: Bacillus, Sporolactobacillus, and Brevibacillus. Compr. Rev. Food Sci. Food Safety (CRFSFS), 2003, 2, 101–110.
16. Senesi, S. Bacillus spores as probiotic products for human use. In Bacterial Spore Formers: Probiotics and Emerging Applications. Ricca, E., Henriques, A. O., and Cutting, S. M., Eds. Horizon Scientific Press: Norwich, U.K., 2004; pp 131–141.
17. Westers, L., Westers, H., and Quax, W. J. Bacillus subtilis as cell factory for pharmaceutical proteins: a biotechnological approach to optimize the host organism. Biochim. Biophys. Acta, 2004, 1694, 299–310.
18. Tjalsma, H., Antelmann, H., Jongbloed, J. D., Braun, P. G., Darmon, E., Dorenbos, R., and Dubois, J. Y. et al. Proteomics of protein secretion by Bacillus subtilis: separating the “secrets” of the secretome. Microbiol. Mol. Biol. Rev., 2004, 68, 207–233.
19. Sarvas, M., Harwood, C. R., Bron, S., and van Dijl, J. M. Post-translocational folding of secretory proteins in Gram-positive bacteria. Biochim. Biophys. Acta, 2004, 1694, 311–327.
20. Meima, R. B., van Dijl, J. M., Holsappel, S., and Bron, S. Expression systems in Bacillus. In Protein Expression Technologies: Current Status and Future Trends. Baneyx, F., Ed. Horizon Bioscience: Norfolk, U.K., 2004; pp 199–252.
21. Burkholder, P. R. and Giles, N. H. Induced biochemical mutations in Bacillus subtilis. Am. J. Bot., 1947, 34, 345–348.
22. Spizizen, J. Transformation of biochemically deficient strains of Bacillus subtilis by deoxyribonucleate. Proc. Natl. Acad. Sci. USA, 1958, 44, 1072–1078.
23. Hemphill, H. E. and Whiteley, H. R. Bacteriophages of Bacillus subtilis. Bacteriol. Rev., 1975, 39 (3), 257–315.
24. Nakamura, L. K., Roberts, M. S., and Cohan, F. M. Relationship of Bacillus subtilis clades associated with strains 168 and W23: a proposal for Bacillus subtilis subsp. subtilis subsp. nov. and Bacillus subtilis subsp. spizizenii subsp. nov. Int. J. Syst. Bacteriol., 1999, 49 (Pt 3), 1211–1215.
25. Lazarevic, V., Abellan, F. X., Moller, S. B., Karamata, D., and Mauel, C. Comparison of ribitol and glycerol teichoic acid genes in Bacillus subtilis W23 and 168: identical function, similar divergent organization, but different regulation. Microbiology, 2002, 148 (Pt 3), 815–824.
26. Dubnau, D., Smith, I., Morell, P., and Marmur, J. Gene conservation in Bacillus species. I. Conserved genetic and nucleic acid base sequence homologies. Proc. Natl. Acad. Sci. USA, 1965, 54 (2), 491–498.
27. Rudner, R. Variation of nucleotide sequences among related Bacillus genomes. In Genetic Exchange: A Celebration and a New Generation. Streips, U. N., Ed. Marcel Dekker, Inc.: New York and Basel, 1982; pp 339–351.
28. Bower, S., Perkins, J., Yocum, R. R., Serror, P., Sorokin, A., Rahaim, P., Howitt, C. L., Prasad, N., Ehrlich, S. D., and Pero, J. Cloning and characterization of the Bacillus subtilis birA gene encoding a repressor of the biotin operon. J. Bacteriol., 1995, 177, 2572–2575.
29. Zeigler, D. R., Prágai, Z., Rodriguez, S., Chevreux, B., Muffler, A., Albert, T., Bai, R., Wyss, M., and Perkins, J. B. The origins of 168, W23, and other Bacillus subtilis legacy strains. J. Bacteriol., 2008, 190, 6983–6995.
30. Nester, E. W., Schafer, M., and Lederberg, J. Gene linkage in DNA transfer: a cluster of genes concerned with aromatic biosynthesis in Bacillus subtilis. Genetics, 1963, 48, 529–551.
31. Erickson, R. J. and Copeland, J. C. Congression of unlinked markers and genetic mapping in the transformation of Bacillus subtilis 168. Genetics, 1973, 73 (1), 13–21.
32. Guérout-Fleury, A. M., Frandsen, N., and Stragier, P. Plasmids for ectopic integration in Bacillus subtilis. Gene, 1996, 180, 57–61.
33. Errington, J. Regulation of endospore formation in Bacillus subtilis. Nat. Rev. Microbiol., 2003, 1, 117–126.
34. Perkins, J. B., Sloma, A., Hermann, T., Theriault, K., Zachgo, E., Erdenberger, T., and Hannett, N. et al. Genetic engineering of Bacillus subtilis for the commercial production of riboflavin. J. Ind. Microbiol. Biotechnol., 1999a, 22 (1), 8–18.
35. Perkins, J. B. and Prágai, Z. Production of pantothenate using microorganisms incapable of sporulation. PCT International Patent Application No. WO 2004/113510, 2004.
36. Priest, F. G., Fleming, A. B., Tangney, M., Jorgensen, P. L., and Diderichsen, B. Production of proteins using Bacillus incapable of sporulation. PCT International Patent Application No. WO 97/03185, 1997.
37. Itaya, M., Kondo, K., and Tanaka, T. A neomycin resistance gene cassette selectable in a single copy state in the Bacillus subtilis chromosome. Nucl. Acids Res., 1989, 17, 4410.
38. Dale, G. E., Langen, H., Page, M. G., Then, R. L., and Stuber, D. Cloning and characterization of a novel, plasmid-encoded trimethoprim-resistant dihydrofolate reductase from Staphylococcus haemolyticus MUR313. Antimicrob. Agents Chemother., 1995, 39 (9), 1920–1924.
39. Horinouchi, S. and Weisblum, B. Nucleotide sequence and functional map of pC194, a plasmid that specifies inducible chloramphenicol resistance. J. Bacteriol., 1982, 150, 815–825.
40. Perkins, J. B. and Youngman, P. Streptococcus plasmid pAMα1 is a composite of two separable replicons, one of which is closely related to Bacillus plasmid pBC16. J. Bacteriol., 1983, 155, 607–615.
41. Guérout-Fleury, A. M., Shazand, K., Frandsen, N., and Stragier, P. Antibiotic-resistance cassettes for Bacillus subtilis. Gene, 1995, 167 (1–2), 335–336.
42. Itaya, M., Yamaguchi, I., Kobayashi, K., Endo, T., and Tanaka, T. The blasticidin S resistance gene (bsr) selectable in a single copy state in the Bacillus subtilis chromosome. J. Biochem. (Tokyo), 1990, 107, 799–801.
43. Steinmetz, M. and Richter, R. Plasmids designed to alter the antibiotic resistance expressed by insertion mutations in Bacillus subtilis, through in vivo recombination. Gene, 1994, 142, 79–83.
44. Bron, S. Plasmids. In Molecular Biological Methods for Bacillus. Harwood, C. R. and Cutting, S. M., Eds. John Wiley & Sons: New York, 1990; pp 75–174.
45. Jannière, L., Gruss, A., and Ehrlich, S. D. Plasmids. In Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 1993; pp 625–644.
46. Stepanov, A. I., Zhdanov, V. G., Kukanova, A. Y., Khaikinson, M. Y., Rabinovich, P. M., Iomantas, J. A. V., and Galushkina, Z. M. Riboflavin preparation. French Patent Application No. FR 2546907, 1984.
47. Stoinova, N. V., Abalakina, E. G., Gusarov, I. I., Podchernyaev, D. A., Iomantas, Y. V., Kozlov, Y. I., Kun, C., Hong, L., Perumov, D. A., and Kreneva, R. A. The construction of recombinant strains that produce riboflavin. I. Cloning and expression of the riboflavin operon of Bacillus amyloliquefaciens in Bacillus subtilis cells: recombinant riboflavin production via vector plasmid pMX30 and plasmid p30R2-mediated gene transfer and expression in a bacterium for use in the food industry. Biotekhnologiya, 1996a, 11, 1–5.
48. Vagner, V., Dervyn, E., and Ehrlich, S. D. A vector for systematic gene inactivation in Bacillus subtilis. Microbiology, 1998, 144 (Pt 11), 3097–3104.
49. Biaudet, V., Samson, F., and Bessieres, P. Micado: a network-oriented database for microbial genomes. Comput. Appl. Biosci., 1997, 13 (4), 431–438.
50. Bower, S., Perkins, J. B., Yocum, R. R., Howitt, C. L., Rahaim, P., and Pero, J. Cloning, sequencing, and characterization of the Bacillus subtilis biotin biosynthetic operon. J. Bacteriol., 1996, 178, 4122–4130.
51. Schyns, G., Potot, S., Geng, Y., Barbosa, T., Henriques, A., and Perkins, J. B. Isolation and characterization of new thiamin-deregulated mutants of Bacillus subtilis. J. Bacteriol., 2005, 187, 8127–8136.
52. Wach, A. PCR-synthesis of marker cassettes with long flanking homology regions for gene disruptions in S. cerevisiae. Yeast, 1996, 12, 259–265.
53. Eichenberger, P., Jensen, S. T., Conlon, E. M., Ooij, V. C., Silvaggi, J., Gonzalez-Pastor, J.-E., Fujita, M., Ben-Yehuda, S., Stragier, P., Liu, J.-S., and Losick, R. The sigma E regulon and the identification of additional sporulation genes in Bacillus subtilis. J. Mol. Biol., 2003, 327, 945–972.
54. Mountain, A. Gene expression systems for Bacillus subtilis. In Biotechnology Handbooks 2: Bacillus. Harwood, C. R., Ed. Plenum Publishing Corporation: New York, 1989; pp 73–114.
55. Kiel, J. A., ten Berge, A. M., Borger, P., and Venema, G. A general method for the consecutive integration of single copies of a heterologous gene at multiple locations in the Bacillus subtilis chromosome by replacement recombination. Appl. Environ. Microbiol., 1995, 61, 4244–4250.
56. Shimotsu, H. and Henner, D. J. Construction of a single-copy integration vector and its use in analysis of regulation of the trp operon of Bacillus subtilis. Gene, 1986, 43 (1–2), 85–94.
57. Perego, M. Integrational vectors for genetic manipulation in Bacillus subtilis. In Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 1993; pp 615–624.
58. Middleton, R. and Hofmeister, A. New shuttle vectors for ectopic insertion of genes into Bacillus subtilis. Plasmid, 2004, 51 (3), 238–245.
59. Sciochetti, S. A., Piggot, P. J., and Blakely, G. W. Identification and characterization of the dif site from Bacillus subtilis. J. Bacteriol., 2001, 183 (3), 1058–1068.
60. Hartl, B., Wehrl, W., Wiegert, T., Homuth, G., and Schumann, W. Development of a new integration site within the Bacillus subtilis chromosome and construction of compatible expression cassettes. J. Bacteriol., 2001, 183 (8), 2696–2699.
61. Van Arsdell, S. W., Perkins, J. B., Yocum, R. R., Luan, L., Howitt, C. L., Chatterjee, N. P., and Pero, J. G. Removing a bottleneck in the Bacillus subtilis biotin pathway: bioA utilizes lysine rather than S-adenosylmethionine as the amino donor in the KAPA-to-DAPA reaction. Biotechnol. Bioeng., 2005, 91 (1), 75–83.
62. Hermann, T., Patterson, T. A., Pero, J. G., Yocum, R. R., Baldenius, K.-U., and Beck, C. Process for enhanced production of pantothenate. World Patent Application WO02/057474 A2, 2002.
63. EFSA. Opinion of the Scientific Panel on Additives and Products or Substances used in Animal Feed on the updating of the criteria used in the assessment of bacteria for resistance to antibiotics of human or veterinary importance. EFSA J., 2005a, 223, 1–12.
64. Pero, J. G. and Sloma, A. Proteases. In Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 1993; pp 939–952.
65. Zhang, X. Z., Yan, X., Cui, Z. L., Hong, Q., and Li, S. P. mazF, a novel counter-selectable marker for unmarked chromosomal manipulation in Bacillus subtilis. Nucleic Acids Res., 2006, 34 (9), e71.
66. Fabret, C., Ehrlich, S. D., and Noirot, P. A new mutation delivery system for genome-scale approaches in Bacillus subtilis. Mol. Microbiol., 2002, 46, 25–36.
67. Brans, A., Filée, P., Chevigné, A., Claessens, A., and Joris, B. New integrative method to generate Bacillus subtilis recombinant strains free of selection markers. Appl. Environ. Microbiol., 2004, 70, 7241–7250.
68. Bloor, A. E. and Cranenburgh, R. M. An efficient method of selectable marker gene excision by Xer recombination for gene replacement in bacterial chromosomes. Appl. Environ. Microbiol., 2006, 72, 2520–2525.
69. Bower, S. G., Perkins, J. B., Yocum, R. R., and Pero, J. Biotin biosynthesis in Bacillus subtilis. US Patent No. 6,057,136, 2000.
70. Yocum, R. R., Patterson, T. A., Hermann, T., and Pero, J. Methods and microorganisms for production of panto-compounds. PCT International Patent Application WO 01/21772 A2, 2001.
71. Yocum, R. R., Patterson, T. A., Pero, J. G., and Hermann, T. Microorganisms and processes for enhanced production of pantothenate. PCT Patent Application No. WO 02/061108 A2, 2002.
72. Perkins, J. B., Sloma, A., Pero, J. G., Hatch, R. T., Hermann, T., and Erdenberger, T. Bacterial strains which overproduce riboflavin. US Patent No. US 5,925,538, 1999b.
73. Jannière, L., Niaudet, B., Pierre, E., and Ehrlich, S. D. Stable gene amplification in the chromosome of Bacillus subtilis. Gene, 1985, 40, 47–55.
74. Itaya, M., Tsuge, K., Koizumi, M., and Fujita, K. Combining two genomes in one cell: stable cloning of the Synechocystis PCC6803 genome in the Bacillus subtilis 168 genome. Proc. Natl. Acad. Sci. USA, 2005, 102 (44), 15971–15976.
75. Isler, O. I., Brubacher, G., Ghisla, S., and Kräutler, B. Vitamine II. Georg Thieme Verlag: Stuttgart, Germany, 1988.
76. Hoppenheidt, K., Mücke, W., Peche, R., Tronecker, D., Roth, U., Würdinger, E., Hottenroth, S., and Rommel, W. Reducing Environmental Load of Chemical Engineering Processes and Chemical Products by Biotechnological Substitutes; Bayerisches Institut für Angewandte Umweltforschung und -technik GmbH: Augsburg, Germany, 2004.
77. Stahmann, K. P., Revuelta, J. L., and Seulberger, H. Three biotechnical processes using Ashbya gossypii, Candida famata, or Bacillus subtilis compete with chemical riboflavin production. Appl. Microbiol. Biotechnol., 2000, 53 (5), 509–516.
78. Bacher, A. Biosynthesis of flavins. In Chemistry and Biochemistry of Flavoenzymes. Muller, F., Ed. CRC Press: Boca Raton, FL, 1990; Vol. 1, pp 215–259.
79. Bacher, A., Eberhardt, S., Fischer, M., Kis, K., and Richter, G. Biosynthesis of vitamin B2 (riboflavin). Annu. Rev. Nutr., 2000, 20, 153–167.
80. Fischer, M. and Bacher, A. Biosynthesis of flavocoenzymes. Nat. Prod. Rep., 2005, 22 (3), 324–350.
81. Volk, R. and Bacher, A. Biosynthesis of riboflavin. Studies on the mechanism of L-3,4-dihydroxy-2-butanone 4-phosphate synthase. J. Biol. Chem., 1991, 266, 20610–20618.
82. Fischer, M., Romisch, W., Illarionov, B., Eisenreich, W., and Bacher, A. Structures and reaction mechanisms of riboflavin synthases of eubacterial and archaeal origin. Biochem. Soc. Trans., 2005, 33 (4), 780–784.
83. Matsui, K., Wang, H. C., Hirota, T., Matsukawa, H., Kasai, S., Shinagawa, K., and Otani, S. Riboflavin production by roseoflavin-resistant strains of some bacteria. Agric. Biol. Chem., 1982, 46 (8), 2003–2008.
84. Stepanov, A. I., Kukanova, A., Glazunov, E. A., and Zhdanov, V. G. Analogs of riboflavin, lumiflavin and alloxazine derivatives. II. Effect of roseoflavin on 6,7-dimethyl-8-ribityllumazine and riboflavin synthetase synthesis and growth of Bacillus subtilis. Genetika, 1977, 13, 490–495.
85. Bresler, S. E., Cherepenko, E. I., and Perumov, D. A. Investigation of the operon of riboflavin biosynthesis in Bacillus subtilis. 3. Production and properties of mutants with a complex regulator genotype. Genetika, 1971, 7 (11), 1466–1470.
86. Gusarov, I. I., Kreneva, R. A., Rybak, K. V., Podcherniaev, D. A., Iomantas, I. V., Kolibaba, L. G., Polanuer, B. M., Kozlov, I. I., and Perumov, D. A. Primary structure and functional activity of the Bacillus subtilis ribC gene. Mol. Biol. (Mosk), 1997, 31 (5), 820–825.
87. Coquard, D., Huecas, M., Ott, M., van Dijl, J. M., van Loon, A. P. G. M., and Hohmann, H. P. Molecular cloning and characterisation of the ribC gene from Bacillus subtilis: a point mutation in ribC results in riboflavin overproduction. Mol. Gen. Genet., 1997, 254, 81–84.
88. Mack, M., van Loon, A. P. G. M., and Hohmann, H. P. Regulation of riboflavin biosynthesis in Bacillus subtilis is affected by the activity of the flavokinase/flavin adenine dinucleotide synthetase encoded by ribC. J. Bacteriol., 1998, 180 (4), 950–955.
89. Gelfand, M. S., Mironov, A. A., Jomantas, J., Kozlov, Y. I., and Perumov, D. A. A conserved RNA structure element involved in the regulation of bacterial riboflavin synthesis genes. Trends Genet., 1999, 15 (11), 439–442.
90. Mironov, A. S., Gusarov, I., Rafikov, R., Lopez, L. E., Shatalin, K., Kreneva, R. A., Perumov, D. A., and Nudler, E. Sensing small molecules by nascent RNA: a mechanism to control transcription in bacteria. Cell, 2002, 111, 747–756.
91. Winkler, W. C., Cohen-Chalamish, S., and Breaker, R. R. An mRNA structure that controls gene expression by binding FMN. Proc. Natl. Acad. Sci. USA, 2002, 99 (25), 15908–15913.
92. Shiio, I. Production of primary metabolites. In Bacillus Subtilis: Molecular Biology and Industrial Application. Maruo, B. and Yoshikawa, H., Eds. Kodansha Ltd. and Elsevier Science Publishers: Tokyo, Japan, 1989.
93. Panina, L. I., Iomantas, I. V., Khaikinson, M., and Rabinovich, P. M. Cloning the operon genes of riboflavin biosynthesis in Bacillus subtilis on plasmid vector pBR322 in Escherichia coli. Genetika, 1983, 19, 174–176.
94. Stoinova, N. V., Abalakina, E. G., Gusarov, I. I., Podchernyaev, D. A., Iomantas, Y. V., Kozlov, Y. I., Kun, C., Hong, L., Perumov, D. A., and Kreneva, R. A. The construction of recombinant strains that produce riboflavin. II. Development of methods of integration of foreign genes into the Bacillus subtilis chromosome on a model of the riboflavin operon of Bacillus amyloliquefaciens. Biotekhnologiya, 1996b, 11, 7–10.
95. Albertini, A. M. and Galizzi, A. Amplification of a chromosomal region in Bacillus subtilis. J. Bacteriol., 1985, 162 (3), 1203–1211.
96. Chen, T., Chen, X., Wang, J. Y., and Zhao, X. M. Effects of riboflavin operon dosage on riboflavin productivity in Bacillus subtilis. Trans. Tianjin Univ., 2005, 11, 1–5.
97. Chen, T., Wang, J. Y., Ban, R., Chen, X., and Zhao, X. M. The effect of integration of riboflavin operon in B. subtilis 24R7 chromosome on riboflavin production. J. Wuxi Univ. Light Ind., 2005, 24, 6–10.
98. Gershanovich, V. N., Bol’shakova, T. N., Dobrynina, O., Galushkina, Z. M., Kukanova, A., and Stepanov, A. Nitrogen assimilation enzymes in Bacillus subtilis mutants with hyperproduction of riboflavin. Mol. Gen. Mikrobiol. Virusol., 2005, 3, 29–34.
99. Choi, H., Han, J. G., Lee, G. H., Lee, G. H., Park, J. H., and Park, Y. H. Microorganism producing riboflavin and production methods of riboflavin using thereof. Korean Patent Application KR2003051237, 2003.
100. Monschau, N., Sahm, H., and Stahmann, K. Threonine aldolase overexpression plus threonine supplementation enhanced riboflavin production in Ashbya gossypii. Appl. Environ. Microbiol., 1998, 64, 4283–4290.
101. Perkins, J. B., Pero, J. G., and Sloma, A. Riboflavin overproducing strains of bacteria. European Patent Application No. EP 0405370 A1, 1991.
102. Mironov, V. N., Kraev, A. S., Chernov, B. K., Ul’ianov, A. V., and Golova, Y. B. Riboflavin biosynthesis genes of Bacillus subtilis: complete primary structure and organization model. Dokl. Akad. Nauk. SSSR, 1989, 305, 482–487.
103. Sauer, U., Hatzimanikatis, V., Bailey, J. E., Hochuli, M., Szyperski, T., and Wuthrich, K. Metabolic fluxes in riboflavin-producing Bacillus subtilis. Nat. Biotechnol., 1997, 15, 448–452.
104. Hümbelin, M., Griesser, V., Keller, T., Schurter, W., Haiker, M., Hohmann, H. P., Ritz, H., Richter, G., Bacher, A., and van Loon, A. P. G. M. GTP cyclohydrolase II and 3,4-dihydroxy-2-butanone 4-phosphate synthase are rate-limiting enzymes in riboflavin synthesis of an industrial Bacillus subtilis strain used for riboflavin production. J. Ind. Microbiol. Biotechnol., 1999, 22 (1), 1–7.
105. De Wulf, P. and Vandamme, E. J. Production of D-ribose by fermentation. Appl. Microbiol. Biotechnol., 1997, 48, 141–148.
106. Sasajima, K. and Yoneda, M. Production of pentoses by microorganisms. Biotechnol. Genet. Eng. Rev., 1984, 2, 175–213.
107. Gershanovich, V. N., Kukanova, A., Galushkina, Z. M., and Stepanov, A. I. Transketolase mutation in riboflavin-synthesizing strains of Bacillus subtilis. Mol. Gen. Mikrobiol. Virusol., 2000, 3, 3–7.
108. Lehmann, M., Hohmann, H. P., Laudert, D., and Hans, M. Modified transketolase and use thereof. European Patent Application No. EP1957640, 2008.
109. Hohmann, H. P., Mouncey, N., Schlieker, H., and Stebbins, J. Polynucleotide portions of the biotin operon from B. subtilis for use in enhanced fermentation. US Patent Application US6656721, 2003.
110. Ebert, S., Hohmann, H. P., Lehmann, M., Mouncey, N., and Wyss, M. Improved enzymes. PCT International Patent Application No. WO 2006/003015 A2, 2006.
111. Blum, R. Pantothenic acid. In Ullmann’s Encyclopedia of Industrial Chemistry. VCH Verlagsgesellschaft: Weinheim, Germany, 1996; Vol. A27, pp 559–572.
112. Kaiser, K. and Potzolli-de, B. Pantothenate. In Ullmann’s Encyclopedia of Industrial Chemistry. VCH Verlagsgesellschaft: Weinheim, Germany, 1996; Vol. A27, pp 559–566.
113. Sun, Z. Preparation of D-lactone valerate with microbial enzyme method. Chinese Patent Application CN1313402, 2001.
114. Sakamoto, K., Yamada, H., and Shimizu, S. D-pantolactone hydrolase and production thereof. European Patent Application EP0504421, 1992.
115. Sakamoto, K., Yamada, H., and Shimizu, S. Process for the preparation of D-pantolactone. US Patent Application US5275949, 1994.
116. Hikichi, Y., Moriya, T., Miki, H., Yamaguchi, T., and Nogami, I. Production of D-pantoic acid and D-pantothenic acid. US Patent Application US5,518,906, 1996.
117. Sahm, H. and Eggeling, L. D-pantothenate synthesis in Corynebacterium glutamicum and use of panBC and genes encoding L-valine synthesis for D-pantothenate overproduction. Appl. Environ. Microbiol., 1999, 65, 1973–1979.
118. Eggeling, L. and Sahm, H. Method for the production of pantothenic acid by fermentation. PCT International Patent Application No. WO02055711 A2, 2002.
119. Hüser, A. T., Chassagnole, C., Lindley, N. D., Merkamm, M., Guyonvarch, A., Elisáková, V., Pátek, M., Kalinowski, J., Brune, I., Pühler, A., and Tauch, A. Rational design of a Corynebacterium glutamicum pantothenate production strain and its characterization by metabolic flux analysis and genomewide transcriptional profiling. Appl. Environ. Microbiol., 2005, 71, 3255–3268.
120. Jackowski, S. Biosynthesis of pantothenic acid and coenzyme A. In Escherichia Coli and Salmonella Typhimurium Cellular and Molecular Biology, 2nd ed. Neidhardt, F. C., Curtiss, R., Ingraham, J. L., Lin, E. C. C., Low, K. B., Magasanik, B., Reznikoff, W. S., Riley, M., Schaechter, M., and Umbarger, H. E., Eds. American Society for Microbiology: Washington, DC, 1996.
121. Webb, M. E., Smith, A. G., and Abell, C. Biosynthesis of pantothenate. Nat. Prod. Rep., 2004, 21, 695–721.
122. Brand, L. A. and Strauss, E. Characterization of a new pantothenate kinase isoform from Helicobacter pylori. J. Biol. Chem., 2005, 280, 20185–20188.
123. Jackowski, S. and Rock, C. O. Regulation of coenzyme A biosynthesis. J. Bacteriol., 1981, 148, 926–932.
124. Moriya, T., Hikichi, Y., Moriya, Y., and Yamaguchi, T. Process for producing D-pantoic acid and D-pantothenic acid or salts thereof. US Patent No. 5,932,457, 1999.
125. Lonsdale, D. A review of the biochemistry, metabolism and clinical benefits of thiamin(e) and its derivatives. Evid. Based Complement. Alternat. Med., 2006, 3, 49–59.
126. Moine, G. and Hohmann, H.-P. Thiamin. In Ullmann’s Encyclopedia of Industrial Chemistry. VCH Verlagsgesellschaft: Weinheim, Germany, 1996; Vol. A27, pp 506–521.
127. Imamura, N. and Nakayama, H. thiK and thiL loci of Escherichia coli. J. Bacteriol., 1982, 151 (2), 708–717.
128. Fujio, T., Hayashi, M., Akihiro, I., Nishi, T., and Hagihara, T. Process for producing thiamine phosphates. European Patent Application No. EP 0417953, 1991.
129. Begley, T. P., Downs, D. M., Ealick, S. E., McLafferty, F. W., Van Loon, A. P., Taylor, S., Campobasso, N., Chiu, H. J., Kinsland, C., Reddick, J. J., and Xi, J. Thiamin biosynthesis in prokaryotes. Arch. Microbiol., 1999, 171, 293–300.
130. Perkins, J. B. and Pero, J. G. Vitamin biosynthesis. In Bacillus Subtilis and its Relatives: From Genes to Cells. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 2001; pp 279–293.
131. Perkins, J. B. and Pero, J. G. Biosynthesis of riboflavin, biotin, folic acid and cobalamin. In Bacillus Subtilis and Other Gram-Positive Bacteria: Biochemistry, Physiology, and Molecular Genetics. Sonenshein, A. L., Hoch, J. A., and Losick, R., Eds. American Society for Microbiology: Washington, DC, 1993; pp 319–334.
132. Lee, G. and Pero, J. Conserved nucleotide sequences in temporally controlled bacteriophage promoters. J. Mol. Biol., 1981, 152, 247–265.
133. Lee, G., Talkington, C., and Pero, J. Nucleotide sequence of a promoter recognized by Bacillus subtilis RNA polymerase. Mol. Gen. Genet., 1980, 180, 57–65.
134. Goese, M., Perkins, J. B., and Schyns, G. Thiamin production by fermentation. PCT International Patent Application No. WO 2004/106557 A2, 2004.
135. Lawhorn, B. G., Mehl, R. A., and Begley, T. P. Biosynthesis of the thiamin pyrimidine: the reconstitution of a remarkable rearrangement reaction. Org. Biomol. Chem., 2004, 2, 2538–2546.
136. Chatterjee, A., Li, Y., Zhang, Y., Grove, T. L., Lee, M., Krebs, C., Booker, S. J., Begley, T. P., and Ealick, S. E. Reconstitution of ThiC in thiamine pyrimidine biosynthesis expands the radical SAM superfamily. Nat. Chem. Biol., 2008, 4, 758–765.
137. Martinez-Gomez, N. C. and Downs, D. M. ThiC is an [Fe-S] cluster protein that requires AdoMet to generate the 4-amino-5-hydroxymethyl-2-methylpyrimidine moiety in thiamin synthesis. Biochemistry, 2008, 47, 9054–9056.
138. Harker, M. and Bramley, P. M. Expression of prokaryotic 1-deoxy-D-xylulose-5-phosphatases in Escherichia coli increases carotenoid and ubiquinone biosynthesis. FEBS Lett., 1999, 448, 115–119.
139. Toms, A. V., Haas, A. L., Park, J. H., Begley, T. P., and Ealick, S. E. Structural characterization of the regulatory proteins TenA and TenI from Bacillus subtilis and identification of TenA as a thiaminase II. Biochemistry, 2005, 44, 2319–2329.
140. Jenkins, A. H., Schyns, G., Potot, S., Sun, G., and Begley, T. P. A new thiamin salvage pathway. Nat. Chem. Biol., 2007, 3, 492–497.
141. Streit, W. R. and Entcheva, P. Biotin in microbes, the genes involved in its biosynthesis, its biochemical role and perspectives for biotechnological production. Appl. Microbiol. Biotechnol., 2003, 61 (1), 21–31.
142. Casutt, M. Biotin. In Ullmann’s Encyclopedia of Industrial Chemistry. VCH Verlagsgesellschaft: Weinheim, Germany, 1996; Vol. A27, pp 566–575.
143. Guillen-Navarro, K., Araiza, G., Garcia-de los Santos, A., Mora, Y., and Dunn, M. F. The Rhizobium etli bioMNY operon is involved in biotin transport. FEMS Microbiol. Lett., 2005, 250, 209–219.
144. Schneider, G. and Lindqvist, Y. Structural enzymology of biotin biosynthesis. FEBS Lett., 2001, 495, 7–11.
145. Marquet, A., Tse Sum Bui, B., and Florentin, D. Biosynthesis of biotin and lipoic acid. Vitamins Hormones, 2001, 61, 51–101.
146. Lawson, R. J., Leys, D., Sutcliffe, M. J., Kemp, C. A., Cheesman, M. R., Smith, S. J., and Clarkson, J. et al. Thermodynamic and biophysical characterization of cytochrome P450 BioI from Bacillus subtilis. Biochemistry, 2004, 39, 12410–12426.
147. Lawson, R. J., von Wachenfeld, C., Ihtshamul, H., Perkins, J. B., and Munro, A. W. Expression and characterization of the two flavodoxin proteins of Bacillus subtilis, YkuN and YkuP: biophysical properties and interactions with cytochrome P450 BioI. Biochemistry, 2004, 39, 12390–12409.
148. Cryle, M. J. and De Voss, J. J. Carbon-carbon bond cleavage by cytochrome p450(BioI)(CYP107H1). Chem. Commun. (Camb.), 2004 (1), 86–87.
149. Jarrett, J. T. Biotin synthase: enzyme or reactant. Chem. Biol., 2005, 12, 409–410.
150. Fontecave, M., Ollagnier-de-Choudens, S., and Mulliez, E. Biological radical sulfur insertion reactions. Chem. Rev., 2003, 103, 2149–2166.
151. Choi-Rhee, E. and Cronan, J. E. Biotin synthase is catalytic in vivo, but catalysis engenders destruction of the protein. Chem. Biol., 2005a, 12, 461–468.
152. Choi-Rhee, E. and Cronan, J. E. A nucleosidase required for in vivo function of the S-adenosyl-L-methionine radical enzyme, biotin synthase. Chem. Biol., 2005b, 12, 589–593.
153. Reyda, M. R., Dippold, R., Dotson, M. E., and Jarrett, J. T. Loss of iron-sulfur clusters from biotin synthase as a result of catalysis promotes unfolding and degradation. Arch. Biochem. Biophys., 2008, 471, 32–41.
154. Belitsky, B. R. Physical and enzymological interaction of Bacillus subtilis proteins required for de novo pyridoxal 5′-phosphate biosynthesis. J. Bacteriol., 2004, 186 (4), 1191–1196.
155. Yocum, R. R., Williams, M. K., and Pero, J. G. Methods and organisms for production of B6 vitamers. PCT International Patent Application No. WO 2004/035010 A2, 2004.
156. Raschle, T., Amrhein, N., and Fitzpatrick, T. B. On the two components of pyridoxal 5′-phosphate synthase from Bacillus subtilis. J. Biol. Chem., 2005, 280 (37), 32291–32300.
157. Burns, K. E., Xiang, Y., Kinsland, C. L., McLafferty, F. W., and Begley, T. P. Reconstitution and biochemical characterization of a new pyridoxal-5′-phosphate biosynthetic pathway. J. Am. Chem. Soc., 2005, 127, 3682–3683.
158. Zhu, J., Burgner, J. W., Harms, E., Belitsky, B. R., and Smith, J. L. A new arrangement of (β/α)8 barrels in the synthase subunit of PLP synthase. J. Biol. Chem., 2005, 280 (30), 27914–27923.
159. Hanes, J. W., Burns, K. E., Hilmey, D. G., Chatterjee, A., Dorrestein, P. C., and Begley, T. P. Mechanistic studies on pyridoxal phosphate synthase: the reaction pathway leading to a chromophoric intermediate. J. Am. Chem. Soc., 2008, 130, 3043–3052.
160. Shiio, I. Industrial applications of Bacillus subtilis. Production of primary metabolites. In Bacillus Subtilis: Molecular Biology and Industrial Applications. Maruo, B. and Yoshikawa, H., Eds. Elsevier Science B.V.: Amsterdam, The Netherlands, 1989; pp 191–211.
161. Yokuta, A. Production of sedoheptulose by Bacillus subtilis. J. Ferm. Bioeng., 1993, 75, 409–413.
162. Chong, B. F., Blank, L. M., Mclaughlin, R., and Nielsen, L. K. Microbial hyaluronic acid production. Appl. Microbiol. Biotechnol., 2005, 66, 341–351.
163. Widner, B., Behr, R., Von Dollen, S., Tang, M., Heu, T., Sloma, A., Sternberg, D., Deangelis, P. L., Weigel, P. H., and Brown, S. Hyaluronic acid production in Bacillus subtilis. Appl. Environ. Microbiol., 2005, 71 (7), 3747–3752.
164. Zhu, T., Pan, Z., Domagalski, N., Koepsel, R., Ataai, M. M., and Domach, M. M. Engineering of Bacillus subtilis for enhanced total synthesis of folic acid. Appl. Environ. Microbiol., 2005, 71 (11), 7122–7129.
165. Heinzmann, S., Entian, K. D., and Stein, T. Engineering Bacillus subtilis ATCC 6633 for improved production of the lantibiotic subtilin. Appl. Microbiol. Biotechnol., 2006, 69 (5), 532–536.
166. Sauer, U., Hatzimanikatis, V., Hohmann, H. P., Manneberg, M., van Loon, A. P. G. M., and Bailey, J. E. Physiology and metabolic fluxes of wild-type and riboflavin-producing Bacillus subtilis. Appl. Environ. Microbiol., 1996, 62, 3687–3696.
167. Russell, J. B. and Cook, G. M. Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiol. Rev., 1995, 59, 48–62.
168. Zamboni, N. and Sauer, U. Knockout of the high-coupling cytochrome aa3 oxidase reduces TCA cycle fluxes in Bacillus subtilis. FEMS Microbiol. Lett., 2003, 226, 121–126.
169. Zamboni, N., Mouncey, N., Hohmann, H. P., and Sauer, U. Reducing maintenance metabolism by metabolic engineering of respiration improves riboflavin production by Bacillus subtilis. Metab. Eng., 2003, 5, 49–55.
170. Stouthamer, A. H. and van Verseveld, H. W. Microbial energetics should be considered in manipulating metabolism for biotechnological purposes. Trends Biotechnol., 1987, 5, 149–155.
171. Sonderegger, M., Schümperli, M., and Sauer, U. Selection of quiescent Escherichia coli with high metabolic activity. Metab. Eng., 2005, 7, 4–9.
172. Rowe, D. C. D. and Summers, D. K. The quiescent-cell expression system for protein synthesis in Escherichia coli. Appl. Environ. Microbiol., 1999, 65, 2710–2715.
173. Posfai, G., Plunkett, G., III, Feher, T., Frisch, D., Keil, G. M., Umenhoffer, K., and Kolisnychenko, V. et al. Emergent properties of reduced-genome Escherichia coli. Science, 2006, 312, 1044–1046.
174. Westers, H., Dorenbos, R., van Dijl, J. M., Kabel, J., Flanagan, T., Devine, K. M., Jude, F., and Séror, S. J. et al. Genome engineering reveals large dispensable regions in Bacillus subtilis. Mol. Biol. Evol., 2003, 20, 2076–2090.
175. Ohnishi, J., Mitsuhashi, S., Hayashi, M., Ando, S., Yokoi, H., Ochiai, K., and Ikeda, M. A novel methodology employing Corynebacterium glutamicum genome information to generate a new L-lysine-producing mutant. Appl. Microbiol. Biotechnol., 2002, 58, 217–223.
176. Albert, T. J., Dailidiene, D., Dailide, G., Norton, J. E., Chang, D., Kalia, A., Richmond, T. A., Molla, M., Singh, J., Green, R. D., and Berg, D. E. Whole genome scanning for point mutations by hybridization-based “comparative genome resequencing”: change associated with high-level metronidazole resistance in Helicobacter pylori. Nat. Meth., 2005, 2 (12), 951–953.
177. Bennett, S. T., Barnes, C., Cox, A., Davies, L., and Brown, C. Toward the $1,000 human genome. Pharmacogenomics, 2005, 6 (4), 373–382.
178. Margulies, M., Egholm, M., Altman, W. E., Attiya, S., Bader, J. S., Bemben, L. A., and Berka, J. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature, 2005, 437 (7057), 376–380.
179. Sauer, U. Evolutionary engineering of industrially important microbial phenotypes. Adv. Biochem. Eng. Biotechnol., 2001, 73, 129–170.
180. Adrio, J. L. and Demain, A. L. Genetic improvement of processes yielding microbial products. FEMS Microbiol. Rev., 2006, 30, 187–214.
181. Srivatsan, A., Han, Y., Peng, J., Tehranchi, A. K., Gibbs, R., Wang, J. D., and Chen, R. High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies. PLOS Genet., 2008, 4, e1000139.
182. Ikeda, M., Ohnishi, J., Hayashi, M., and Mitsuhashi, S. A genome-based approach to create a minimally mutated Corynebacterium glutamicum strain for efficient L-lysine production. J. Ind. Microbiol. Biotechnol., 2006, 33, 610–615.
183. Ohnishi, J., Mizoguchi, H., Takeno, S., and Ikeda, M. Characterization of mutations induced by N-methyl-N′-nitro-N-nitrosoguanidine in an industrial Corynebacterium glutamicum strain. Mutat. Res., 2008, 649, 239–244.
184. Lee, J. M., Zhang, S., Saha, S., Santa Anna, S., Jiang, C., and Perkins, J. RNA expression analysis using an antisense Bacillus subtilis genome array. J. Bacteriol., 2001, 183, 7371–7380.
185. Jürgen, B., Tobisch, S., Wumpelmann, M., Gordes, D., Koch, A., Thurow, K., Albrecht, D., Hecker, M., and Schweder, T. Global expression profiling of Bacillus subtilis cells during industrial-close fed-batch fermentations with different nitrogen sources. Biotechnol. Bioeng., 2005, 92 (3), 277–298.
186. Silvaggi, J. M., Perkins, J. B., and Losick, R. Genes for small, noncoding RNAs under sporulation control in Bacillus subtilis. J. Bacteriol., 2006, 188, 532–541.
187. Völker, U. and Hecker, M. From genomics via proteomics to cellular physiology of the Gram-positive model organism Bacillus subtilis. Cell Microbiol., 2005, 7 (8), 1077–1085.
Metabolic Engineering of Bacillus subtilis
23-35
188. Soga, T., Ohashi, Y., Ueno, Y., Naraoka, H., Tomita, M., and Nishioka, T. Quantitative metabolome analysis using capillary electrophoresis mass spectrometry. J. Proteome Res., 2003, 2 (5), 488–494. 189. Koek, M. M., Muilwijk, B., van der Werf, M. J., and Hankemeier, T. Microbial metabolomics with gas chromatography/mass spectrometry. Anal. Chem., 2006, 78 (4), 1272–1281. 190. Makita, Y., Nakao, M., Ogasawara, N., and Nakai, K. DBTBS: database of transcriptional regulation in Bacillus subtilis and its contribution to comparative genomics. Nucleic Acids Res., 2004, 32 (Database issue), D75–D77. 191. Gupta, A., Varner, J. D., and Maranas, C. D. Large-scale inference of the transcriptional regulation of Bacillus subtilis. Comput. Chem. Eng., 2005, 29, 565–576. 192. Noirot, P. and Noirot-Gros, M. F. Protein interaction networks in bacteria. Curr. Opin. Microbiol., 2004, 7 (5), 505–512. 193. Bernhardt, J., Buttner, K., Scharf, C., and Hecker, M. Dual channel imaging of two-dimensional electropherograms in Bacillus subtilis. Electrophoresis, 1999, 20 (11), 2225–2240. 194. Hambraeus, G., von Wachenfeldt, C., and Hederstedt, L. Genome-wide survey of mRNA half-lives in Bacillus subtilis identifies extremely stable mRNAs. Mol. Genet. Genomics, 2003, 269 (5), 706–714. 195. Fischer, E. and Sauer, U. Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat. Genet., 2005, 37 (6), 636–640. 196. Zamboni, N., Maaheimo, H., Szyperski, T., Hohmann, H.-P., and Sauer, U. The phosphoenolpyruvate carboxykinase also catalyzes C3 carboxylation at the interface of glycolysis and the TCA cycle of Bacillus subtilis. Metab. Eng., 2004, 6, 277–284. 197. Zamboni, N., Fischer, E., Muffler, A., Wyss, M., Hohmann, H. P., and Sauer, U. Transient expression and flux changes during a shift from high to low riboflavin production in continuous cultures of Bacillus subtilis. Biotechnol. Bioeng., 2005, 89, 219–232. 198. 
Kümmel, A., Panke, S., and Heinemann, M., Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Sys. Biol., 2006, 2, 34. 199. Chassagnole, C., Noisommit-Rizzi, N., Schmid, J. W., Mauch, K., and Reuss, M. Dynamic modeling of the central carbon metabolism of Escherichia coli. Biotechnol. Bioeng., 2002, 79, 53–73. 200. Moritz, B., Striegel, K., de Graaf, A. A., and Sahm, H. Changes of pentose phosphate pathway flux in vivo in Corynebacterium glutamicum during leucine-limited batch cultivation as determined from intracellular metabolite concentration measurements. Metab. Eng., 2002, 4, 295–305. 201. Kitano, H. Systems biology: a brief overview. Science, 2002, 295, 1662–1664. 202. Alberghina, L. and Westerhoff, H. V. Systems Biology: Definitions and Perspectives. Springer: Berlin, Germany, 2005. 203. Sauer, U. and Bailey, J. E. Estimation of P-to-O ratio in Bacillus subtilis and its influence on maximum riboflavin yield. Biotechnol. Bioeng., 1999, 64, 750–754. 204. Sauer, U., Cameron, D. C., and Bailey, J. E. Metabolic capacity of Bacillus subtilis for the production of purine nucleotides, riboflavin, and folic acid. Biotechnol. Bioeng., 1998, 59, 227–238. 205. Park, S. M., Schilling, C. H., and Palsson, B. O. Compositions and methods for modeling Bacillus subtilis metabolism. PCT International Patent Application No US 2003/008326 A1, 2003. 206. Lee, J., Goel, A., Ataai, M. M., and Domach, M. M. Supply-side analysis of growth of Bacillus subtilis on glucose-citrate medium: feasible network alternatives and yield optimality. Appl. Environ. Microbiol., 1997, 63, 710–718. 207. Dettwiler, B., Dunn, I. J., Heinzle, E., and Prenosil, J. E. A simulation model for the continuous production of acetoin and butanediol using Bacillus subtilis with integrated pervaporation separation. Biotechnol. Bioeng., 1993, 41, 791–800. 208. Igoshin, O. A., Price, C. W., and Savageau, M. A. 
Signalling network with a bistable hysteretic switch controls developmental activation of the sigma transcription factor in Bacillus subtilis. Mol. Microbiol., 2006, 61, 165–184. 209. Iber, D. A computational analysis of the impact of the transient genetic imbalance on compartmentalized gene expression during sporulation in Bacillus subtilis. J. Mol. Biol., 2006, 360, 15–20.
23-36
Developing Appropriate Hosts for Metabolic Engineering
210. Suel, G. M., Garcia-Ojalvo, J., Liberman, L. M., and Elowitz, M. B. An excitable gene regulatory circuit induces transient cellular differentiation. Nature, 2006, 440, 545–550. 211. De Jong, H., Geiselmann, J., Batt, G., Hernandez, C., and Page, M. Qualitative simulation of the initiation of sporulation in Bacillus subtilis. Bull. Math. Biol., 2004, 66, 261–299. 212. Price, N. D., Reed, J. L., and Palsson, B. O. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2004, 2, 886–897.
24 Metabolic Engineering of Streptomyces

Irina Borodina and Anna Eliasson, Technical University of Denmark
Jens Nielsen, Chalmers University of Technology

24.1 Streptomyces as Superhosts
24.2 The Streptomyces Genome and Its Modification
Streptomyces Genomes • Molecular Biology of Streptomyces
24.3 Analysis of Streptomyces Strains
Transcriptome • Proteome • Metabolome • Fluxome
24.4 Modeling and Design of Streptomyces Strains
24.5 Examples of Metabolic Engineering in Streptomyces
Using an Optimized Host • Increasing Expression of Genes from the Biosynthetic Cluster • Increasing Precursor or Cofactor Supply • Changing Morphology • Improving Oxygen Supply • Improving Secretion and Reducing Degradation of Recombinant Proteins • Changing Regulation • Modifying the Product
24.6 Perspectives
Acknowledgments
References
24.1 Streptomyces as Superhosts

Among the most common soil microorganisms are the actinomycetes, Gram-positive bacteria with high GC content. Because of their mycelial habit they were initially believed to be fungi, which is reflected in their name (mykes (Gr.) means fungus). In 1939, one year before the rediscovery of penicillin by Florey and Chain, the soil microbiologist Waksman set his laboratory on a quest for new antimicrobial drugs. From previous studies he knew that actinomycetes can inhibit the growth of other soil bacteria by secreting bioactive compounds, which he named “antibiotics” (anti (Gr.) against, bios (Gr.) life). A systematic search for antibiotics produced by actinomycetes resulted in the discovery of actinomycin (1940), clavacin, and streptothricin (1942), all of which unfortunately turned out to be toxic in animal tests. In 1943 Waksman’s student Schatz isolated a streptomycin-producing strain of Streptomyces griseus.1 Streptomycin was not particularly toxic to animals and humans and, remarkably, was the first compound active against the tuberculosis bacterium. Many pharmaceutical companies and research laboratories then started to collect soil samples from all over the world in search of antibiotic-producing organisms. Most of the discoveries were made in the first ten years of the “hunt,” and the larger part involved Streptomyces species.

Streptomyces is a genus within the actinomycetes; many of these bacteria produce volatile compounds that give the earth its characteristic odor. Streptomyces proved to be an excellent source of secondary metabolites, including antibiotics, anticancer agents, antihelmintic drugs, and other useful compounds (Table 24.1). At present more than half of the antibiotics in clinical use are produced by Streptomyces species.
Table 24.1 Examples of Industrial Processes That Use Streptomyces

Product Type | Product | Streptomyces sp. | Companies*
Antibacterial | Rifampicin | S. mediterranei | Sanofi-Aventis (France)
Antifungal | Nystatin | S. noursei | Bristol-Myers Squibb (USA), Bayer (Germany)
Antihelmintic | Ivermectin | S. avermitilis | Merck
Herbicidal | Bialaphos | S. hygroscopicus | Meiji Seika Kaisha, Ltd (Japan)
Insecticidal | Avermectins | S. avermitilis | Novartis (Switzerland)
Antitumour | Bleomycin | S. verticillus | Bristol-Myers Squibb (USA)
Immunosuppressants | Tacrolimus | S. tsukubaensis | Fujisawa (Japan)
Enzymes | Glucose isomerase | S. murinus | Novozymes (Denmark)
Heterologous proteins | Human Granulocyte-Macrophage Colony-Stimulating Factor (GM-CSF) | S. lividans expressing rhGM-CSF gene on a plasmid | Cangene (Canada)

*Only a few examples of the producing companies are given. For most of the drugs the original patents have expired and multiple generic variants are available.
Besides secondary metabolite biosynthesis, Streptomyces have also been exploited for enzyme production. Streptomyces spp. are capable of decomposing cellulose, lignin, chitin, and other complex polymers thanks to the excretion of efficient hydrolytic enzymes. These enzymes can be applied for degradation of cellulose in bioethanol production, for making high-fructose corn syrup, and in other processes.2–4 Some streptomycetes can degrade toxic compounds and are used in bioremediation.5–8 Owing to their ability to secrete proteins, Streptomyces have recently also been used for manufacturing pharmaceutical recombinant proteins.9,10 Thus Streptomyces, besides being indispensable in nature as decomposers, are also of high industrial importance.

Very few Streptomyces strains are pathogenic; one of them is S. scabies, which causes scab on tubers of potatoes, beets, etc. Most Streptomyces strains are nonpathogenic and nontoxic, and some of the products made in Streptomyces fermentations have GRAS status (“generally recognized as safe”), e.g., glucose isomerase made by S. rubiginosus (Food Standards Agency, U.K.).

Typical features of Streptomyces are mycelial growth and morphological differentiation, which make them one of the most sophisticated groups of bacteria. In favorable conditions Streptomyces spores germinate and produce vegetative mycelium (hyphae); when nutrients become scarce or the cells are stressed in another way, morphological differentiation begins. Parts of the mycelial colonies undergo lysis, and the nutrients released in this way support the growth of aerial mycelium, which later segments into spores (Figure 24.1). The spores are unicellular with a single copy of the genome, in contrast to the hyphae, which consist of cells with several nucleoids. Streptomyces spores are resistant to the absence of nutrients and to dry environments. It is thought that many Streptomyces strains survive in the soil in the form of spores.11

Figure 24.1 (See color insert following page 10-18.) Streptomyces life cycle. The pictures show cross-sections of S. coelicolor colonies on agar medium. In the upper right corner of each picture the development stage is drawn schematically.
[Figure labels: spore germination; growth (substrate mycelium on agar); aerial mycelium and secondary metabolites (colored secondary metabolite: actinorhodin); spore maturation; spore.]

During the transition from vegetative growth to aerial mycelium formation, Streptomyces often secrete bioactive compounds, e.g., antibiotics. Keith Chater has proposed that Streptomyces are quite vulnerable at the stage of differentiation and that antibiotic production has evolved as a defense mechanism that prevents other soil bacteria from using the nutrients released during differentiation.11 Indeed, about half of the Streptomyces strains isolated in the laboratory proved to produce an antimicrobial compound of one kind or another, and many strains produced several antibiotics. Sequencing of Streptomyces genomes revealed many secondary metabolite gene clusters per genome, but quite a few of these clusters seem to be silent under laboratory conditions.12

Streptomyces are the cell factories of choice for production of many secondary metabolites, particularly antibiotics. They are easy to grow in submerged cultures, they can utilize cheap complex industrial media, and they secrete metabolites into the broth. Industry most commonly optimizes natural Streptomyces isolates by random mutagenesis and screening and uses the resulting overproducing mutants in production. Strains optimized in this way are not considered to be genetically modified organisms (GMOs), which eases the regulatory requirements for the production facilities and the process.

For production of enzymes and heterologous proteins, Streptomyces is one of the hosts to consider. Bacterial processes are usually preferred over eukaryotic expression systems for protein production because they are cheaper, easier to run in reactors, and give higher yields. In comparison with the typical bacterial host Escherichia coli, which often accumulates recombinant proteins intracellularly in insoluble form, Gram-positive bacteria such as Bacillus subtilis and Streptomyces can efficiently secrete proteins into the medium. They can also express genes with a wide range of GC content without codon adjustment, while E. coli has difficulties expressing high-GC genes. Protein folding differs between B. subtilis and streptomycetes, and some proteins are better expressed by the latter. For instance, a large xyloglucanase from Jonesia sp. was successfully secreted in functional soluble form by an engineered S. lividans, while attempts to express it in E. coli and B. subtilis failed.13 Streptomyces are also less susceptible to phage infections than E. coli and B. subtilis.

On the other hand, there are also disadvantages of using Streptomyces instead of classical bacterial hosts such as E. coli and B. subtilis, e.g.:
• Mycelial growth limits mass transfer and complicates mixing and oxygen supply.
• Tools for genetic manipulation are not as well developed.
• Streptomyces genomes are usually very large and encode many regulatory proteins, which makes it difficult to predict the outcome of a genetic modification.

All in all, Streptomyces are very suitable for production of secondary metabolites, and as more knowledge accumulates they are also becoming more attractive hosts for production of other compounds. In this chapter we will briefly discuss Streptomyces genomic organization and the tools available for genetic modification (Table 24.2). We will thereafter describe the application of analysis tools that allow more advanced knowledge-based strain design, such as microarrays, proteomics, and fluxomics.
Table 24.2 Useful Links for Streptomyces

Link | Description
http://streptomyces.org.uk | Streptomyces resource, contains links to S. coelicolor A3(2) genome annotation, pathway/genome database ScoCYC, microarray and proteomics data
http://www.surrey.ac.uk/SBMS/Fgenomics/Microarrays/html/Facility.html | Microarray production facility at Surrey University, making S. coelicolor DNA arrays
http://genome-www5.stanford.edu/ | Stanford MicroArray Database, contains some expression data from Streptomyces
strep-microarray.sbs.surrey.ac.uk/stropE.html | Operon predictions in S. coelicolor47
http://streptomyces.org.uk/redirect | Web tools for design of REDIRECT knock-outs in S. coelicolor
The chapter will end with a review of various strategies that have been used to enhance production of secondary metabolites and proteins in Streptomyces species.
24.2 The Streptomyces Genome and Its Modification

24.2.1 Streptomyces Genomes

Metabolic engineering requires that genetic tools for changing gene expression are available. Knowing the genome sequence of the organism gives a tremendous advantage and improves both the speed and the quality of genetic modifications. Until 2002 only sequences of single Streptomyces genes or clusters were available. In 1999 Hopwood and coworkers from the John Innes Centre and the Sanger Institute started an initiative to sequence the whole genome of S. coelicolor A3(2), the genetically best-studied Streptomyces strain, which is often used as a model antibiotic producer. In 2002 the genome sequence was published in Nature, which marked a new era for Streptomyces research.14 At the time of publication it was the largest bacterial genome sequenced, at 8.7 Mb (Figure 24.2). The coding density was high, as found for other bacteria, and about 8,000 open reading frames (ORFs) were predicted. Thus the bacterium contains more genes than the eukaryotic yeast Saccharomyces cerevisiae.

The chromosomes of most Streptomyces spp. are linear, in contrast to the more common circular chromosomes of prokaryotes.15,16 Terminal proteins are covalently bound to the ends of the chromosome; these proteins probably act as primers for the replication of the last fragment of the lagging strand when the replication machinery reaches the end of the chromosome. The ends contain long terminal inverted repeats. The origin of replication is often positioned around the center of the chromosome. The genes essential for growth are mostly located in the core of the chromosome, while the arms contain genes for secondary metabolism and utilization of alternative nutrients. The arms of the chromosome are often subject to mutations, which can cause deletions of up to 2 Mb of sequence, duplications of parts of the sequence, or circularization of the chromosome.17 Consequently, industrial producers are often genetically unstable, that is, product titers can decrease in the course of propagation or cultivation because of spontaneous genetic rearrangements.

The genes for secondary metabolism are grouped into clusters, which usually include biosynthetic, resistance, and regulatory genes. In the S. coelicolor genome 20 clusters were predicted, of which only three were known prior to the genome sequencing. It was predicted that around 1,000 proteins play a regulatory role as transcription factors, response regulators, σ-factors, etc. This makes Streptomyces one of the most elaborately regulated bacteria and thus makes metabolic engineering of this organism more challenging.

A year after publication of the S. coelicolor sequence, the genome sequence of S. avermitilis, an industrial producer of the antiparasitic avermectins, was released.18 The sequence of the plant pathogen S. scabies became available in 2006, although the annotation of the genome was not yet complete at the time of writing. Other Streptomyces genomes are in the pipeline for sequencing: S. ambofaciens and S. peucetius (Genomes OnLine Database, www.genomesonline.org). Genomic sequencing of S. diversa was performed by Diversa/Celera, but the sequence remains proprietary so far.
[Figure 24.2 image: circular map labeled “Streptomyces coelicolor, 8,667,507 bp” with scale marks 1 to 8 (Mb) and the origin of replication (Ori).]

Figure 24.2 (See color insert following page 10-18.) Streptomyces coelicolor chromosome. The outer scale is numbered anticlockwise in megabases and indicates the core (dark blue) and arm (light blue) regions of the chromosome. Circles 1 and 2 (from the outside in), all genes (reverse and forward strand, respectively) color-coded by function (black, energy metabolism; red, information transfer and secondary metabolism; dark green, surface associated; cyan, degradation of large molecules; magenta, degradation of small molecules; yellow, central or intermediary metabolism; pale blue, regulators; orange, conserved hypothetical; brown, pseudogenes; pale green, unknown; grey, miscellaneous); circle 3, selected “essential” genes (for cell division, DNA replication, transcription, translation, and amino-acid biosynthesis, color coding as for circles 1 and 2); circle 4, selected “contingency” genes (red, secondary metabolism; pale blue, exoenzymes; dark blue, conservon; green, gas vesicle proteins); circle 5, mobile elements (brown, transposases; orange, putative laterally acquired genes); circle 6, G + C content; circle 7, GC bias ((G – C)/(G + C); khaki indicates values > 1, purple < 1). The origin of replication (Ori) and terminal protein (blue circles) are also indicated. (From Bentley, S.D., et al., Nature, 417, 141, 2002. With permission.)
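The G + C content and GC bias tracks described for circles 6 and 7 of Figure 24.2 are window statistics over the sequence. The following is a minimal Python sketch of how such tracks could be computed; the function names, the window and step sizes, and the toy sequence are illustrative assumptions and are not taken from the published analysis.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_bias(seq: str) -> float:
    """GC bias (G - C) / (G + C); 0.0 if the window contains no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, gc_content, gc_bias) for every full window."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), gc_bias(chunk)


if __name__ == "__main__":
    # Made-up high-GC toy sequence, NOT real S. coelicolor data.
    toy = "GGGCGCGGCCATGCGGGCGC" * 5
    for start, gc, bias in sliding_windows(toy, window=20, step=20):
        print(start, round(gc, 2), round(bias, 2))
```

For a real chromosome one would read the sequence from a FASTA file and use windows of several kilobases; the choice of window size is a trade-off between noise and resolution.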
24.2.2 Molecular Biology of Streptomyces

Molecular biology work with Streptomyces has its own rules and tricks, which are well described in the excellent handbook “Practical Streptomyces Genetics” by Kieser et al. from the John Innes Centre.11 Here we will only briefly go through the main difficulties that one might encounter when starting to work with Streptomyces. Streptomyces have a very high GC content, e.g., 72% in the S. coelicolor genome. This complicates PCR and sequencing and might render plasmids unstable in E. coli. Furthermore, many Streptomyces have a restriction barrier, which makes it impossible to transform them with methylated DNA.19 To avoid methylation the plasmids can be passed through nonmethylating E. coli strains (such as ET12567)19 or through S. lividans before introduction into the host. It is common to use conjugation from E. coli to deliver the DNA into Streptomyces, and the method works well with nearly all species.20,21 Some strains can be transformed by PEG-assisted protoplast transformation or transfected by phages. One should be careful when choosing antibiotic selection markers because many strains are naturally resistant to certain antibiotics.

24.2.2.1 Overexpression of Genes

Many vectors for overexpression of genes in Streptomyces have been published and patented recently. These include both self-replicating and integrating vectors. There have been some reports that multicopy vectors reduce antibiotic production, so it is advisable to use integrating vectors whenever possible. High-producing strains seem to be particularly sensitive to the metabolic load imposed by multicopy plasmids, and the vectors are therefore lost in the absence of selection pressure.11 Integrative vectors often contain the attachment site attP and the gene int, coding for the integrase from phage phiC31.22,23 Many Streptomyces carry the corresponding attachment site attB in their chromosome. The integrase protein promotes recombination between the att sites and thus causes vector integration. Even when there is no attB sequence present in the chromosome, integration can still occur through recombination with sequences that are homologous to attB.24 A range of strong constitutive and inducible promoters is available, e.g., thiostrepton-inducible tipA,25 modified constitutive ermP* (personal communication with Mervyn Bibb, John Innes Centre, UK), and snpR-activated snp.26,27 To achieve a range of expression levels one can also use synthetic promoter libraries.28,29

24.2.2.2 Replacement of Genes

The classical gene inactivation method works by interrupting the target gene with a plasmid through homologous recombination. Depending on the number and position of homologous sites on the plasmid, a single or double cross-over occurs, and the gene is either interrupted by insertion of a foreign fragment or partly or totally substituted by a selection marker. The selection marker can afterward be removed, leaving a small nucleotide sequence, called a “scar,” in place of the deleted gene. The process of plasmid construction and mutant selection is normally quite time-consuming and can easily take several months of work. The real breakthrough for yeast genetics was the discovery of a PCR-based gene replacement method,30 which reduced the time needed to make a mutation to a few days and could furthermore be almost fully automated. It was later found that a similar strategy could be applied in E. coli by expressing the λ Red recombination functions from a low-copy-number, easily curable plasmid.31 The success with E. coli paved the way for a faster gene replacement method for Streptomyces (Figure 24.3).32

Figure 24.3 Flowchart of gene disruption in Streptomyces using PCR targeting.32 1, The knock-out cassette, containing an antibiotic resistance marker, an origin of transfer, and elements for cassette loop-out, is cut from the plasmid and purified. 2, 3, The cassette is amplified by PCR using primers that carry the upstream and downstream sequences of the gene to be replaced. 4, The Streptomyces cosmid is transformed into E. coli strain BW25113/pIJ790, which was designed for improved homologous recombination. 5, The E. coli carrying the cosmid with the gene of interest is transformed with the knock-out cassette. On induction of the λ-RED genes by arabinose, the cassette integrates into the cosmid by homologous recombination. The λ-RED plasmid, which has a temperature-sensitive replicon, is cured by incubation at 37°C. The mutated cosmid is isolated and transformed into the nonmethylating E. coli strain ET12567, which carries transfer functions on plasmid pUZ8002. 6, The mutated cosmid is transferred into Streptomyces by conjugation from E. coli and the mutants are screened; the double cross-over exconjugants should be resistant to the chosen resistance marker and sensitive to kanamycin. 7, The knock-out cassette can be removed from the mutated cosmid in E. coli expressing the FLP-recombinase gene, and the newly made cosmid with a “scar” can be introduced into the mutated Streptomyces strain. The double cross-over event is selected by loss of antibiotic resistance. (From Binnie, C., et al., J Bacteriol, 177, 6033, 1995. With permission.)
[Figure 24.3 image; only its labels survive. Panel titles: Templates for PCR amplification (plasmids pIJ773, pIJ778, and pIJ780, each with an FRT-flanked marker and oriT; cassette sizes of 1384, 1424, and 1496 bp are marked); Primer design (39-nt extensions joined to 20-nt and 19-nt priming sequences); Amplification of the disruption cassette (PCR); Preparation of E. coli containing λ RED plasmid and S. coelicolor cosmid (E. coli BW25113/pIJ790); Induction of λ RED by L-arabinose; PCR-targeting (E. coli BW25113/pIJ790/cosmid); Loss of temperature-sensitive λ RED recombination plasmid pIJ790 overnight at 37°C; Transformation of ET12567 containing the Tra+ plasmid pUZ8002; Conjugation with S. coelicolor; Conjugative transfer and screening for double cross-overs (Streptomyces disruption mutant); Transformation of E. coli strain containing FLP system (E. coli DH5α/BT340); FLP-mediated excision of disruption cassette; Induce FLP synthesis and loss of temperature-sensitive FLP recombination plasmid BT340 overnight (30°C, 43°C); Streptomyces in-frame deletion (SCAR). Gene disruptions can be repeated using the same marker.]

Legend:
aac(3)IV: Apramycin resistance gene
aadA: Spectinomycin/streptomycin resistance gene
bet, exo: Promote recombination
cat: Chloramphenicol resistance gene
FLP: FLP-recombinase
FRT: FLP recognition target
gam: Inhibits the host RecBCD exonuclease V
neo: Kanamycin resistance gene
ORF: Open reading frame
oriT: Origin of transfer from RK2
ts: Temperature-sensitive replicon
vph: Viomycin resistance gene
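The primer design step of the PCR-targeting flowchart (Figure 24.3) joins a 39-nt extension that matches the sequence flanking the target gene to a 20-nt or 19-nt priming sequence that anneals to the cassette template. A minimal Python sketch of this construction is given below. It makes several assumptions: the function names are hypothetical, the two priming sequences are placeholders rather than the sequences used with the real templates, and the target is described simply by top-strand coordinates without any handling of start and stop codons or gene orientation.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Placeholder priming sequences (assumptions for illustration only);
# the real ones must match the cassette template actually used.
PLACEHOLDER_FWD_PRIMING = "ACGTACGTACGTACGTACGT"  # 20 nt
PLACEHOLDER_REV_PRIMING = "TGCATGCATGCATGCATGC"   # 19 nt


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_targeting_primers(template: str, start: int, end: int,
                             fwd_priming: str, rev_priming: str,
                             homology: int = 39):
    """Build long primers for replacing template[start:end].

    template is the top strand of the cloned region (e.g., a cosmid insert);
    start and end are 0-based, half-open coordinates of the region to replace.
    Returns (forward_primer, reverse_primer), both written 5' to 3'.
    """
    if start < homology or end + homology > len(template):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = template[start - homology:start]
    downstream = template[end:end + homology]
    forward = upstream + fwd_priming
    # The reverse primer anneals to the top strand, so its homology arm is
    # the reverse complement of the downstream flank.
    reverse = reverse_complement(downstream) + rev_priming
    return forward, reverse


if __name__ == "__main__":
    # Toy template: 39 nt upstream flank, a 9 nt "gene", 39 nt downstream flank.
    toy = "A" * 39 + "ATGCCCTAA" + "G" * 39
    fwd, rev = design_targeting_primers(
        toy, 39, 48, PLACEHOLDER_FWD_PRIMING, PLACEHOLDER_REV_PRIMING)
    print(len(fwd), fwd)
    print(len(rev), rev)
```

Amplifying the cassette with such primers gives a linear fragment whose ends are identical to the sequences flanking the target, which is what allows λ RED-mediated recombination into the cosmid.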
The method, called REDIRECT, is based on replacement of a gene on a cosmid by a PCR fragment in an E. coli strain, engineered for efficient λ-recombination and for low degradation of linear DNA strands. The mutated cosmid is then extracted from E. coli and introduced into Streptomyces, where it by homologous recombination replaces the gene of interest. In comparison to the plasmid-based methods, the mutant cosmid construction is quite fast and the success of recombination in Streptomyces is increased due to very large homologous regions on the cosmid. It is possible to loop out the resistance cassette later, leaving a “scar,” which will not disturb the transcription of the surrounding genes. This allows studying an effect of single gene deletion even when the gene is a part of an operon.32 The removal can either be done using FLP recombination or Cre-loxP system. In FLP recombination method the mutated cosmid is transferred into an E. coli strain carrying FLP recombinase on a plasmid. After the recombination has occurred the cosmid with the “scar” can be isolated from the E. coli and transferred into the mutated Streptomyces strain. The replacement of resistant marker by “scar” in Streptomyces is selected by loss of antibiotics resistance.32 Another procedure suggests to use loxP sequences instead of FLP recombination targets and to use an engineered Cre phage to infect the mutated Streptomyces strain and to perform the loxP recombination directly inside the Streptomyces cell. The phage infection can be cured in a few sub-cultivations.33 The prerequisite for using this knock-out method is the availability of a cosmid with the gene of interest, which is not always readily available for industrial strains. 24.2.2.3 Transposon Mutagenesis Genome-scale mutant libraries are interesting for systems biology approaches, for example to relate phenotypic features to the genotype. 
To generate large libraries, transposon-based mutagenesis techniques have been successfully applied.34–36 Transposons are mobile genetic elements, which can change their position in the genome. Depending on their structure they are divided into class I and class II transposons. Class I transposons are composed of some gene (e.g., drug resistance gene) flanked by insertion sequences (IS). ISs are typically 0.7–1.8 kb long sequences with repeats at the termini; they code for a single protein, which is involved in transposition. Class I transposons can transpose as a unit, but the flanking ISs can also transpose separately. Class II transposons carry a transposase and some other gene (-s) flanked by indirect repeats of 30–40 bp, they can only transpose as the whole unit. Genetically altered transposons lacking transposase have been designed. Transposase can be supplied in trans for the transposition event and then removed; without transposase the transposon insertion is stable in the genome. There are some difficulties associated with transposon mutagenesis. More than one insertion per genome can occur and hence some mutants can carry insertions in several places of DNA. The problem can be circumvented by using Tn3-like transposon, which insertion results in transposition immunity whereby a second insertion in the DNA molecule is inhibited.37 Another problem is that transposons are usually biased toward integrating in particular sequences and so the obtained mutations will not be evenly distributed over the chromosome.11,38 Finally, the transposons are usually polar, that is upon insertion they will disturb expression of the downstream genes. This is because the flanking regions of the tranposons normally serve as termination signals for RNA transcription and transposons have nonsense codons in all reading frames causing translation termination. 
In vivo transposon mutagenesis is performed by introducing transposase on a suicide plasmid and a transposon into the host and selecting for antibiotics resistant phenotype caused by transposon insertion into the chromosome Figure 24.4a). The result of in vivo mutagenesis is a mix of mutants, which can further be isolated and characterized to find out which gene that has been disrupted. Because identification of insertion position is at present a time-consuming procedure, only strains with interesting phenotypes are generally investigated. A mutant library of S. lividans has been generated by in vivo transposon mutagenesis.39 Screening identified a bald mutant, which was investigated further. By inverse PCR and sequencing it was determined that the mutant had an insertion in the osmoadaptation
Figure 24.4 Simplified scheme of in vivo (a) and in vitro (b) transposon mutagenesis. (a) 1, Streptomyces is transformed with a suicide plasmid containing transposon and transposase gene. 2, Streptomyces insertion mutants are selected by apramycin resistance. 3, Mutants with interesting phenotypes are characterized by PCR to determine the transposon insertion position. (b) 1, Cosmid containing a part of Streptomyces genome is mixed with transposon and purified transposase in vitro, the resulting mutated cosmids are electroporated into E. coli. 2, Antibiotics resistant clones are selected; the position of insertions is characterized by PCR for a large number of clones. 3, For disruption of a particular gene (yellow) a suitable mutated cosmid is transformed into Streptomyces by conjugation from E. coli. 4, Streptomyces colonies that are apramycin resistant and kanamycin sensitive are selected; they result from double cross-over event and carry insertion in the gene of interest.
regulator osaB gene. The mutant turned out to overproduce antibiotics on medium with a high concentration of osmolytes.40

In vitro mutagenesis can be used as a targeted approach. Here the mutagenesis is performed by mixing cosmids containing sequence of the host organism with the transposon and purified transposase (Figure 24.4b). The mutagenized cosmids are transformed into E. coli and isolated, and the insertion position is determined. To obtain a desired insertion, a suitable cosmid is transferred into the host, where its integration by double cross-over can be detected using selection markers. A collection of mutagenized S. coelicolor cosmids is being constructed with funding from the Biotechnology and Biological Sciences Research Council (BBSRC, U.K.) and is available for academic research from Swansea University, U.K. The positions of the insertions are shown on the SCODB website (http://streptomyces.org.uk/sco/index.html). At the time of writing the library contained mutagenized cosmids for disruption of about 60% of the ORFs. In the cosmids that have been processed, single insertions were obtained for about 90% of the ORFs; the rest of the ORFs will have to be inactivated by a different method (e.g., REDIRECT). Gene inactivation by insertion instead of deletion has two disadvantages. First, the gene can remain partly or fully functional despite the insertion, particularly if the insertion lies toward the 3′-end of the gene.41,42 Second, expression of the downstream genes will most likely be disturbed by the transposon, making the method unsuitable for studying knock-outs of genes that are part of an operon.
24.3 Analysis of Streptomyces Strains

Recent years have seen extensive development of analytical techniques in the biological sciences. These techniques allow quantitative measurement of various cellular components, from RNA and proteins to metabolites. Advances in analytical methods, together with progress in computer-driven data analysis, gave birth to systems biology, in which the cell is viewed as a system of interacting components. One outcome of systems biology is integrative models that allow better estimates of how cellular metabolism can be changed in a desired direction, so that metabolic engineering can be carried out with a higher success rate.
24.3.1 Transcriptome

As soon as the genomic sequence of S. coelicolor started to be released, the first cDNA arrays appeared. All of the transcriptome work published to date has been done with PCR- or oligonucleotide-based arrays available for academic research from Stanford School of Biomedical and Molecular Sciences (U.S.) and from Surrey University (U.K.). It is also possible to buy in situ synthesized arrays (NimbleGen Systems Inc., USA), though their high price is a barrier to extensive use.

Gene expression analyses have cast light on many transcriptionally regulated processes in Streptomyces. In the first published study the authors monitored gene expression during the course of a batch fermentation of S. coelicolor. They observed that expression of secondary metabolite gene clusters, as well as of many biosynthetic, ribosomal, and regulatory genes, changed during the different developmental phases. This paper also presented an algorithm for finding boundaries of clusters of coexpressed genes.43 In another study the same group tested the hypothesis that the genes in the central part of the chromosome encode “core” functions connected to growth. They found that during nonlimited growth the expression level was indeed higher in the core of the chromosome than in its right and left arms. During the stationary phase and under different stress conditions, expression of the core genes decreased and various other genes throughout the chromosome made up the larger part of the transcripts. These were genes related to stress and morphological differentiation, as well as other, less characterized genes.44 There have also been a few studies on regulatory mutants of S. coelicolor, e.g., with deletions of specific activators of antibiotic gene clusters (actII-4 and redD),43 deletion of the heat shock protein hspR,45 and deletion of the two-component element absA1, which affects expression of antibiotic genes.46
A range of bioinformatic tools has been used for prediction of operons in S. coelicolor.47 When expression data from some of the previous studies were overlaid on the predicted operons, it was observed that expression usually decreases from the first to the last gene in a given operon. This polarity of gene expression is not observed in E. coli and is thought to be attributable to the high GC content of Streptomyces, which complicates the transcription process. A few genes positioned in the middle of an operon were expressed at a higher level than the first gene, but internal promoters were commonly found in front of those genes.47

The global transcriptional studies have revealed regional fluctuations of gene expression across the chromosome (Colin Smith, Surrey University, U.K., presentation at Genetics of Industrial Microorganisms 2006). One explanation could be variation in the binding of histone-like proteins to DNA. The S. coelicolor histone-like proteins are small proteins of around 12–16 kDa, which bind to the chromosome as dimers or heterodimers, forming the nucleoid. It has been shown that gene transcription is positively correlated with DNA exposure, as determined by sensitivity to DNase I treatment.48 Hopefully, further research on gene expression will provide system-level knowledge of transcriptional regulation in Streptomyces.

Microarrays can also be used to analyze overproducing mutants generated by random mutagenesis, in order to understand why overproduction occurs and to use that knowledge for directed strain design, a process known as inverse metabolic engineering. The industrial overproducers of erythromycin and tylosin, Saccharopolyspora erythraea and S. fradiae, respectively, were compared to the corresponding wild type strains using spotted arrays. The arrays were based on the S. coelicolor genomic sequence but additionally contained spots for the erythromycin and tylosin gene clusters. The industrial S. erythraea strain was found to have longer and more extensive expression of antibiotic biosynthetic genes than the wild type, while the industrial S. fradiae strain had changes in the expression of some metabolic genes involved in biosynthesis of the tylosin precursor.49
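The operon polarity described above (expression falling from the first to the last gene, with occasional internal genes expressed above the first) can be checked on expression data with a few lines of code. The following is a minimal sketch, not the method of the cited study: the function names and the expression values are hypothetical, and a simple least-squares slope over gene position is assumed to be an adequate summary of the trend.

```python
from statistics import mean

def polarity_slope(expression):
    """Least-squares slope of expression versus gene position in an operon.

    `expression` holds one value per gene, ordered from the first
    (promoter-proximal) gene to the last. A negative slope means that
    expression falls along the operon.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two genes")
    positions = range(n)
    x_bar = mean(positions)
    y_bar = mean(expression)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(positions, expression))
    sxx = sum((x - x_bar) ** 2 for x in positions)
    return sxy / sxx

def genes_above_first(expression):
    """Indices of downstream genes expressed higher than the first gene.

    Such genes are candidates for having an internal promoter upstream.
    """
    first = expression[0]
    return [i for i, value in enumerate(expression) if i > 0 and value > first]

# Hypothetical log2 expression values for a five-gene operon (illustrative only).
operon = [10.0, 9.0, 11.0, 8.0, 7.0]
print("slope per gene position:", polarity_slope(operon))
print("genes above the first gene:", genes_above_first(operon))
```

In this made-up operon the slope is negative, and the third gene (index 2) is flagged as a candidate for an internal promoter.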
24.3.2 Proteome

The levels of mRNA, which can be measured by microarrays, do not necessarily correlate with protein concentrations. In addition, protein activity, function, and half-life are greatly affected by posttranslational modifications. Protein concentrations and modifications can be estimated by proteomic analysis, and the information may be helpful in metabolic engineering. Hesketh et al. reported results of 2D gel electrophoresis of S. coelicolor whole cell extracts followed by MALDI-TOF mass spectrometry.50 About 10% of all the predicted S. coelicolor proteins were detected. Encouragingly, the sequences of all the detected proteins agreed with the previously predicted sequences,14 indicating the high quality of the genome annotation. On average 1.2 spots were detected per protein sequence, which points to a high degree of posttranslational modification. 2D gels provide more qualitative than quantitative information, which limits data interpretation. A more advanced approach to proteome analysis is multidimensional LC-LC-MS, which has been applied for analysis of the industrial S. diversa strain (Steve Briggs, Diversa, USA, presentation at the conference Metabolic Engineering V).
24.3.3 Metabolome

There are no published data on metabolome analysis in Streptomyces. For the analysis of intracellular metabolites, the popular method of quenching with cold methanol is not advisable because it causes substantial leakage of intracellular metabolites.51,52 Rapid filtration followed by quenching with liquid nitrogen can be an alternative,53 but it is not well suited to metabolites with high turnover rates.
24.3.4 Fluxome

In metabolic engineering, comparison of flux maps for reference and mutant strains is an invaluable source of information about the operation of the metabolic network in the two strains. Metabolic fluxes
for Streptomyces have often been calculated by applying metabolic models of various sizes based on only a few measurements of fluxes in and out of the cell.54–56 One should be careful in interpreting these data, as such metabolic models often have many degrees of freedom, and hence the solution identified, i.e., the set of fluxes, may not be unique. More reliable flux data are generated by applying isotope labeling for metabolic flux analysis (13C MFA), where metabolite balancing is supplemented by isotope balancing, making the models better determined. The isotope enrichment of a compound can be determined by gas chromatography coupled to mass spectrometry (GC-MS) or by 13C nuclear magnetic resonance (NMR). Besides enabling robust quantification of the metabolic fluxes, this approach may be used to map the network topology. Thus, GC-MS analysis of cells grown on [1-13C]-glucose led to the discovery of a glycolytic pathway unusual for actinomycetes in Nonomuraea sp. strain 39727 (ref. 57) and in S. tenebrarius.58 In both organisms the Entner–Doudoroff (ED) pathway was identified, and its presence has implications for reducing cofactor biosynthesis and energy metabolism, because the stoichiometry of this pathway differs from that of the Embden–Meyerhof–Parnas (EMP) pathway. In vivo 13C and 15N NMR analysis of actinomycin D-producing Streptomyces parvulus, grown on fructose-glutamate medium, revealed a high flux via gluconeogenesis as well as the origin of some of the actinomycin D precursor amino acids.59

In other studies using 13C MFA it was possible to spot changes in metabolism occurring over the time course of cultivation as the cells progress from exponential growth to the stationary production phase. In both actinorhodin-producing S. lividans and nystatin-producing S. noursei, a decrease in pentose phosphate (PP) pathway activity was observed upon onset of antibiotic production.54,60 In actinomycetes that have an active ED pathway (Nonomuraea sp. 39727 and S. tenebrarius), the ED pathway flux was found to decrease and more glucose was diverted to the EMP and PP pathways upon onset of production.58,61 13C flux analysis of an S. coelicolor mutant with a deletion of one of the phosphofructokinase isoenzymes (PFKA2) showed that this mutant had a higher flux through the PP pathway. The PP pathway supplies NADPH, a cofactor used in the biosynthesis of many antibiotics. The increase in PP flux correlated with overproduction of the pigmented antibiotics actinorhodin (ACT) and prodigiosins (RED) in the mutant.52
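The caution about non-unique flux solutions can be made concrete by counting degrees of freedom. At steady state the fluxes v satisfy S·v = 0, where S is the stoichiometric matrix of the internal metabolites, and each measured flux adds one further constraint. The sketch below is an illustration under stated assumptions: the four-reaction toy network (uptake, two parallel routes from A to B, and secretion) and the function names are hypothetical and are not taken from any of the cited models.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix, computed by exact Gaussian elimination on fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def degrees_of_freedom(stoichiometry, measured_fluxes=()):
    """Number of fluxes left undetermined by balances plus measurements.

    Each index in `measured_fluxes` adds one constraint row fixing that flux.
    """
    n_fluxes = len(stoichiometry[0])
    constraints = [list(row) for row in stoichiometry]
    for j in measured_fluxes:
        constraints.append([1 if k == j else 0 for k in range(n_fluxes)])
    return n_fluxes - matrix_rank(constraints)

# Hypothetical toy network (rows: internal metabolites A and B).
# v0: uptake -> A, v1: A -> B (route 1), v2: A -> B (route 2), v3: B -> secreted
toy_network = [
    [1, -1, -1, 0],   # balance on A
    [0, 1, 1, -1],    # balance on B
]

print(degrees_of_freedom(toy_network))                          # 2
print(degrees_of_freedom(toy_network, measured_fluxes=(0, 3)))  # 1
print(degrees_of_freedom(toy_network, measured_fluxes=(0, 1)))  # 0
```

Measuring only uptake and secretion leaves the split between the two parallel routes undetermined. Resolving it requires information on an internal flux, which is the kind of information that isotope balancing contributes.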
24.4 Modeling and Design of Streptomyces Strains

Whole-cell models remain a distant prospect, even for such well-studied organisms as E. coli and yeast. For the complex Streptomyces bacteria, whole-cell models that include metabolism, regulation, kinetics, etc. are unlikely to become available for a very long time. However, for the purpose of metabolic engineering, simpler genome-scale metabolic models have proven to be useful. Genome-scale metabolic models represent most of the metabolic reactions known to be performed by a given organism, inferred either from its genomic sequence or from physiological and biochemical data. As an example of genomic evidence, the presence of a lactate dehydrogenase gene in the S. coelicolor genome implies that the cells can reduce pyruvate to lactate at the expense of NADH. An example of physiological evidence is the ability of S. coelicolor to degrade starch, which indicates that the organism possesses amyloglucosidases. Furthermore, biochemical data show, for instance, that S. coelicolor proteins contain histidine, and since the cells do not require amino acid supplementation for growth, all the enzymes for histidine biosynthesis must be present. A genome-scale metabolic model has been reconstructed for S. coelicolor,62 and models for a few other Streptomyces are under way (e.g., S. scabies). The models can be useful for interpretation of omics data63 and for predictions, e.g., of strain improvement targets.64,65 The reporter metabolite methodology,66 which identifies the metabolites around which the most pronounced transcriptional changes have occurred, was applied to transcriptional data from the phosphofructokinase deletion mutant (∆pfkA2) compared to the reference strain. One of the highest-scoring reporter metabolites was NADPH, which is consistent with the increased PP pathway flux in the mutant.52
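The idea behind reporter metabolite scoring can be sketched in a few lines. The sketch assumes a commonly described scheme: the p-value for differential expression of each gene is converted to a Z-score, and the Z-scores of the k genes whose enzymes act on a metabolite are summed and divided by √k. The correction against a background distribution of random gene sets is omitted here, and the gene-to-metabolite assignments and p-values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_from_p(p_value):
    """Convert a p-value for differential expression into a Z-score."""
    return NormalDist().inv_cdf(1.0 - p_value)

def metabolite_score(neighbor_p_values):
    """Aggregate Z-score of the k genes whose enzymes act on a metabolite."""
    k = len(neighbor_p_values)
    if k == 0:
        raise ValueError("metabolite has no neighboring genes")
    return sum(z_from_p(p) for p in neighbor_p_values) / sqrt(k)

# Hypothetical p-values of genes neighboring two metabolites
# (illustrative only, not data from the study discussed in the text).
neighbors = {
    "NADPH": [0.001, 0.01, 0.02, 0.005],
    "pyruvate": [0.4, 0.6, 0.5],
}

# Rank metabolites from the highest to the lowest aggregate score.
ranked = sorted(neighbors, key=lambda m: metabolite_score(neighbors[m]), reverse=True)
for name in ranked:
    print(name, round(metabolite_score(neighbors[name]), 2))
```

With these made-up numbers, the metabolite surrounded by strongly changed genes ranks first. A full analysis would also compare each score with scores of randomly drawn gene sets of the same size.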
Model-guided strain design is a new phenomenon and has not yet been used for Streptomyces; however, it has been successfully applied to other organisms, e.g., for improvement of lycopene production in E. coli.67,68 This systematic metabolic engineering approach could allow the design of superhosts for production of different classes of secondary metabolites. Another application could be the design of strains with specific metabolic imbalances (such as excess production of NADPH). These strains can be used to improve enzyme properties. For example, to improve an NADPH-dependent reductase, the enzyme-coding gene is inserted into an engineered NADPH-overproducing strain and the strain is grown for many generations until the growth rate improves. The resulting gene is then isolated, and it will often carry mutations that have improved enzymatic activity (METEVOL® technology from METabolic EXplorer, France).
24.5 Examples of Metabolic Engineering in Streptomyces

24.5.1 Using an Optimized Host

Development of industrial Streptomyces production strains usually starts (and often ends) with chemical or UV mutagenesis followed by selection. Another popular method is protoplast fusion, which often yields strains that perform better than the parental strains. These methods are simple and relatively easy to apply; they do not require prior knowledge about the biosynthetic pathway, coding genes, etc. Their disadvantages are that they are random and nonreproducible, can take a long time, and have to be applied anew to each strain that produces a new product.

Many new secondary metabolite-producing organisms have been discovered recently. To evaluate the bioactivity of new metabolites it is necessary to produce them in fair amounts, which is often a problem with natural producers. Optimizing each wild type strain is tedious and expensive. For secondary metabolite genes that are isolated from soil samples or generated by gene shuffling and other procedures, there is no strain to start with at all. A good expression host would greatly speed up the testing process and further enable high-level production if the metabolite turns out to possess interesting properties. The most natural solution would be to have a range of optimized hosts, each for production of compounds from a particular class (β-lactams, polyketides, nonribosomal peptides, etc.). The hosts should be clean, that is, unable to produce native secondary metabolites in significant amounts. They should also be capable of producing precursors for many compounds belonging to the same class.

One option is to develop superhosts from already optimized industrial strains. Industrial strains of S. fradiae and Saccharopolyspora erythraea have been engineered into plug-and-play superhosts for polyketide production.69 To start with, the whole native clusters for tylosin and erythromycin were deleted. S. erythraea did not contain a specific attB site for phage integration, and hence this site was introduced to allow integration of the plasmid delivering biosynthetic genes for a new product. To test the hosts, the biosynthetic genes were then cloned from low-producing wild type strains of S. fradiae and S. erythraea and introduced into the engineered hosts using integrative plasmids. The levels of antibiotic production did not differ from those obtained with the endogenous antibiotic clusters in the original industrial strains, even when native promoters for the endogenous biosynthetic genes were used. It was found that the promoter sequences did not differ between the mutated industrial and the wild type strains, indicating that during the optimization of these strains most of the changes occurred not in the sequences of the biosynthetic genes or their promoters, but elsewhere. Better production in industrial strains might instead have been caused by increased substrate uptake rates, better supply of Gibbs free energy and precursors, and most probably also deregulation of secondary metabolite gene expression. One should, however, keep in mind that strains can sometimes be overproducing for other reasons, such as multiplication of the biosynthetic clusters (see also Section 24.5.2), and obviously such strains do not represent a suitable superhost platform for production of other antibiotics. Interestingly, expressing the biosynthetic gene clusters behind a strong constitutive promoter like ermEP* or by using the S. coelicolor actUIp/actII2-4 expression system in the above mentioned industrial strain backgrounds resulted
in a 20-fold decrease in antibiotic titers.69 This is curious because there has been a successful attempt to improve erythromycin production by replacing the native promoter in a wild type S. erythraea with the actUIp/actII2-4 system.70 However, in the industrial strains the regulation might be changed, and expression of the biosynthetic genes from the newly introduced promoters might have been weaker than from the original ones.69

The barrier to using industrial strains as superhost platforms is that they are usually proprietary and not sequenced. Alternatives could be well-studied strains such as S. coelicolor and S. lividans. Both of these strains have been used for heterologous expression of secondary metabolite biosynthetic genes from other organisms that are poorly characterized genetically or are problematic to grow in the lab. Expression of the genes for a large modular polyketide synthase from S. erythraea (about 30.8 kb) in S. coelicolor71 and in S. lividans72 gave good yields of 6-deoxyerythronolide B, the aglycone unit of erythromycin. The myxobacterium Sorangium cellulosum is a natural producer of the valuable anti-cancer agent epothilone, but the bacterium is slow-growing and forms multicellular fruiting bodies on agar plates, which complicates genetic engineering.73 The epothilone biosynthetic genes were therefore cloned into the S. coelicolor CH999 host, in which the actinorhodin biosynthetic genes are deleted and prodigiosin production is blocked.74 This resulted in a strain that could produce up to 0.1 mg/L of epothilones A and B in nonoptimized fermentation.75 The titer was lower than in Sorangium cellulosum fermentations (up to 20 mg/L), but similar values could be reached after medium optimization. The main advantage of this strategy is that S. coelicolor grows ten times faster than the native host and is readily amenable to genetic manipulation, which is useful for generating epothilone analogues.
24.5.2 Increasing Expression of Genes from the Biosynthetic Cluster

In wild type strains secondary metabolite genes are often weakly expressed or not expressed at all under the given conditions, and the yields of the compounds are very low. Overexpression of the whole cluster or of some of the rate-controlling enzymes will often result in higher production.

Multiplication of the whole cluster was observed in industrial strains that have undergone empirical selection. A commercial S. kanamyceticus strain (Meiji Seika Kaisha Ltd., Japan) used for kanamycin production contains up to 36 copies of the kanamycin biosynthetic cluster.76 The cluster is located within an amplifiable unit of DNA (AUD), which made this high copy number possible. An AUD consists of an internal sequence flanked by direct repeats. AUDs are usually located in the arms of the chromosome, which are genetically unstable. AUDs can form tandem repeats, which can include several hundred copies of the DNA unit.39 The amplification is often accompanied by deletions of other regions of DNA. Interestingly, Meiji Seika Kaisha’s strain is maintained as a mycelial culture because the high-producing kanamycin phenotype is lost in a single round of sporulation. The authors hypothesized that the problem lies in the limited space for chromosome packaging in the spore, which cannot accommodate the 5 Mb long amplification.76

Another way to increase gene expression is to replace a native promoter with a stronger, possibly inducible, promoter. Promoter substitution has been used to increase the expression of the xylose isomerase gene (xylA). This gene is normally induced by xylose and repressed by glucose; to obtain constitutive expression, the xylA gene was put under the control of a constitutive promoter and the construct was integrated into the chromosome of Streptomyces violaceoniger. 
This resulted in four- to fivefold higher xylose isomerase activity in the absence of xylose as inducer.77

Higher expression of whole biosynthetic gene clusters has been achieved by overexpressing positive regulators of the pathway or by deleting negative regulators. When the activator of the actinorhodin cluster (actII-4) or the activator of the prodigiosin cluster (redD) was introduced into S. coelicolor on a multicopy plasmid, the production of ACT and RED increased.78,79 In another study two polyketide synthase clusters were found by DNA fingerprinting in Streptomyces sp. PGA64: a rubromycin cluster and a putative angucycline cluster. The second cluster was most probably silent under the given fermentation conditions, because the strain produced only rubromycin in detectable amounts. Disruption of a putative angucycline cluster repressor resulted in partial activation of the cluster and in secretion of two new angucycline metabolites, UWM6 and rabelomycin.80
Overexpression of one or a few enzyme-coding genes from the biosynthetic cluster has also been a successful strategy. The challenging part is to determine which enzyme has high flux control. In fermentations of S. fradiae two major macrolides were found: tylosin, which is the product of interest, and the tylosin precursor macrocin. For the industrial strains from Eli Lilly it was reported that tylosin accounted for 50–55% of all macrolides produced, while macrocin accounted for about 40%.81 This clearly indicated that the conversion of macrocin into tylosin is rate-limiting in these strains. This conversion is catalyzed by a single enzyme, macrocin O-methyltransferase (TylF), and by inserting a second copy of the tylF gene it was possible to increase the fraction of tylosin in the macrolide pool from 50 to 80–85%.81
24.5.3 Increasing Precursor or Cofactor Supply

Other approaches to enhance secondary metabolite production are to increase the biosynthesis of the precursor metabolites and to remove or hamper pathways that compete with the secondary metabolite reactions for these precursors. For example, malonyl-CoA and methylmalonyl-CoA serve as precursors for many polyketide antibiotics. To increase the supply of these precursors, the dicarboxylate transporter matC and malonyl-CoA synthase matB from Rhizobium trifolii were expressed in S. coelicolor, resulting in a strain that could take up relatively cheap substrates such as malonate and methylmalonate and convert them into the corresponding CoA thioesters. The strategy was tested in S. coelicolor producing the macrolactone 6-deoxyerythronolide B and resulted in a 3-fold increase in the macrolactone titer.82 Overexpression of acetyl-CoA carboxylase, which converts acetyl-CoA into malonyl-CoA, also had a positive effect on production of the polyketide actinorhodin in S. coelicolor.83 Nikkomycin production in S. ansochromogenes was improved by overexpressing two genes, sanU and sanV, whose products together with coenzyme B12 form an active glutamate mutase, which participates in the biosynthesis of a nikkomycin precursor.84

A more straightforward approach is to add precursors directly to the fermentation medium, and if the price of the precursor(s) is low this can be a feasible industrial solution. Thus, it has been found that addition of glycine, arginine, phenylalanine, and tyrosine improves vancomycin production in Amycolatopsis orientalis.85 Tyrosine is a direct precursor in the pathway, whereas phenylalanine degradation leads to formation of another precursor, 4-hydroxyphenylpyruvate. Some precursors also induce the genes encoding enzymes of the biosynthetic pathway. 
For example, lysine is a substrate for the first enzyme of the cephalosporin biosynthetic pathway, L-lysine epsilon-aminotransferase (LAT), but it also induces expression of this enzyme in S. clavuligerus, and addition of lysine to defined medium improved cephamycin production.86

An example of removing a competing pathway is the blocking of clavam and cephamycin C biosynthesis in industrial S. clavuligerus, which improved clavulanic acid titers by 10%.87 Disruption of the glyceraldehyde-3-phosphate dehydrogenase gene gap1 in wild type S. clavuligerus doubled clavulanic acid titers, most likely because more glyceraldehyde-3-phosphate was diverted to clavulanic acid biosynthesis rather than to glycolysis.88

An interesting finding was that deletion of the polyphosphate kinase gene ppk causes antibiotic overproduction in S. lividans.89 In vivo, under conditions of phosphate limitation, Ppk likely acts as a nucleoside diphosphate kinase, regenerating ATP from ADP and polyphosphates.90 The absence of this important ATP-regenerating enzyme is predicted to lower the intracellular energy charge (low ATP/ADP ratio), which in turn triggers a strong activation of central metabolism coupled to the respiratory chain in order to regenerate the necessary ATP.90 The increased antibiotic production may be caused by a higher supply of precursors from central carbon metabolism, though regulatory events due to the lower ATP/ADP ratio are probably important as well.
24.5.4 Changing Morphology

Filamentous bacteria pose certain challenges in large-scale fermentation processes compared to unicellular organisms. During fermentation there are problems with nutrient and oxygen
transfer and shear damage of cells from agitation. Filtration and centrifugation in downstream processing are also complicated by the high viscosity of the fermentation broth. In many strains hyphae clump together into spherical aggregates called pellets. The cells in the center of a pellet may lack nutrients and oxygen, and these cells are often stressed or dead, resulting in an inactive pellet core. Effectively, only the cells on the surface of the pellets grow and synthesize the product. It has been observed for S. noursei that when pellet size increased above a certain critical value, growth and nystatin production ceased.91 In submerged cultivations of S. tendae, maximal productivity of nikkomycins was achieved at a pellet diameter of 1.4 mm, and productivity was lower for both smaller and bigger pellets.92 Possibly, pellet formation protects S. tendae mycelia from hyphal fragmentation and thereby has a positive effect until the pellets reach a critical diameter at which nutrient diffusion becomes a problem. Much is still unknown about the impact of morphology on production, and in some cases formation of larger pellets seems to be desirable. For instance, S. lividans engineered for production of a hybrid antibiotic made the product only in the presence of compact mycelial pellets.93

Apart from physical methods that can be applied to decrease pellet formation (high agitation rates, addition of viscosity-increasing agents, etc.), one can change the morphology at the genetic level. The SsgA protein, which accumulates just before the onset of spore formation, was first described in S. griseus (Kawamoto 1995). The protein is involved in the formation of cross walls called septa. 
Vegetative septa are thin single-layer walls that divide vegetative mycelium into compartments; in contrast, sporulation septa are thick double-layer walls that often form synchronously and divide sporogenic hyphae into chains of spores.94 When ssgA was overexpressed, S. griseus failed to sporulate in submerged culture and the mycelium was highly fragmented.95 A homologous gene was also found in S. coelicolor, and overexpression of this gene caused increased septum formation in the vegetative mycelium.96 This resulted in a more fragmented mycelium and smaller and fewer pellets. The specific growth rate increased by 40 to 70%, depending on the medium, and the final concentration of undecylprodigiosin was higher by an order of magnitude.97 Overexpressing the S. coelicolor ssgA gene in other Streptomyces had similar effects: in S. roseosporus and S. lividans mycelial clumps became smaller, and S. venezuelae no longer sporulated in liquid medium. Excess SsgA also improved growth and productivity of an S. lividans strain making the model protein tyrosinase: the specific growth rate increased by 45% and tyrosinase productivity more than twofold.97
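The notion of a critical pellet size beyond which the core becomes starved can be illustrated with a textbook diffusion estimate. The sketch assumes a spherical pellet, steady state, a single limiting substrate (here taken to be oxygen), and zero-order uptake at a volumetric rate q. The concentration profile is then C(r) = C_s − q(R² − r²)/(6D), and the center is depleted when R reaches √(6·D·C_s/q). The parameter values below are rough illustrative assumptions, not measurements from the studies cited above.

```python
from math import sqrt

def critical_pellet_radius(diffusivity, surface_conc, uptake_rate):
    """Radius (m) at which the substrate concentration at the center hits zero.

    Assumes a spherical pellet at steady state with zero-order uptake:
        R_crit = sqrt(6 * D * C_s / q)
    diffusivity  D   in m^2/s
    surface_conc C_s in mol/m^3
    uptake_rate  q   in mol/(m^3 s)
    """
    if min(diffusivity, surface_conc, uptake_rate) <= 0:
        raise ValueError("all parameters must be positive")
    return sqrt(6.0 * diffusivity * surface_conc / uptake_rate)

def center_concentration(radius, diffusivity, surface_conc, uptake_rate):
    """Substrate concentration (mol/m^3) at the pellet center, floored at zero."""
    drop = uptake_rate * radius ** 2 / (6.0 * diffusivity)
    return max(0.0, surface_conc - drop)

# Illustrative assumptions only: oxygen diffusivity of the order found in
# water, roughly air-saturated broth, and a hypothetical uptake rate.
D = 2e-9      # m^2/s
C_S = 0.2     # mol/m^3
Q = 1e-2      # mol/(m^3 s), hypothetical

r_crit = critical_pellet_radius(D, C_S, Q)
print("critical diameter (mm):", round(2 * r_crit * 1e3, 2))
print("center concentration at half the critical radius:",
      center_concentration(r_crit / 2, D, C_S, Q))
```

Because the critical radius scales with the square root of C_s/q, such an estimate mainly indicates how sensitive the active fraction of a pellet is to uptake rate and bulk substrate concentration. It does not predict the productivity optimum reported for any particular strain.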
24.5.5 Improving Oxygen Supply

Because of the filamentous character of Streptomyces, oxygen supply during fermentations is often limited. Streptomyces are strictly aerobic organisms, and proper oxygen supply is essential for obtaining good biomass yields in the initial growth phase. Sufficient aeration is also necessary for some oxygen-dependent steps in product biosynthesis. For instance, there are two oxygen-dependent reactions in the biosynthesis of erythromycin.98,99 Oxygen uptake of Saccharopolyspora erythraea was improved by overexpressing the bacterial hemoglobin gene vhb from Vitreoscilla sp. on an integrating vector, resulting in 60% higher erythromycin production.100 A positive effect of vhb expression on antibiotic production has also been observed in other Streptomyces species.101
24.5.6 Improving Secretion and Reducing Degradation of Recombinant Proteins

E. coli, the commonly used host for recombinant protein production, often accumulates large amounts of heterologous proteins as insoluble aggregates in the form of inclusion bodies. Recombinant protein purification consequently has to involve cell disruption, separation of inclusion bodies, and laborious protein refolding. Production of pharmaceutical proteins in secreted form has some major advantages over intracellular production: the secreted target protein is usually natively folded, yields can be as high as or higher than those obtained from intracellular accumulation in E. coli, and there is a reduced requirement for expensive extraction and purification procedures. In E. coli, a Gram-negative
bacterium, secreted proteins accumulate in the periplasm due to the presence of an outer membrane. Secretion of proteins directly into the fermentation medium can be obtained using Gram-positive bacteria, e.g., B. subtilis, as production hosts. Lately, S. lividans has also been employed for a number of recombinant protein processes.9,10 Streptomyces are good at secreting proteins because their survival in soil depends on their ability to secrete enzymes for degrading complex substrates. In the genome of S. coelicolor, 819 genes encoding secreted proteins were predicted. These proteins include various enzymes for the degradation of exogenous substrates: proteases/peptidases, chitinases/chitosanases, cellulases/endoglucanases, amylases, and pectate lyases. Extracellular proteins accumulate in the culture broth of S. coelicolor mostly during the transition and stationary phases.102
Proteins in Gram-positive organisms are transported through the Sec and the twin-arginine translocation (Tat) pathways, but some polypeptides can be exported by ABC transporters, and for some specific proteins other Sec/Tat-independent mechanisms exist (Paul Dyson, Swansea University, U.K., presentation at Genetics of Industrial Microorganisms, 2006). The Sec pathway is the most commonly used pathway (Figure 24.5). In B. subtilis, deletion of the gene encoding SecA, a major player in the Sec pathway, reduced the extracellular proteome by 90%, suggesting that all these proteins use Sec transport.103 The Sec pathway consists of a few components. The membrane proteins SecY, SecE, and SecG form a channel through which proteins are transported.104–106 SecA is the main driver of the transport process: it exists in a free form in the cytoplasm and in the SecYEG-bound form.
SecA binds pre-proteins and, using the energy from ATP hydrolysis, pushes the largely unfolded pre-proteins step by step through the cytoplasmic membrane channel.107–110 SecDF associates with the SecYEG complex and contributes to the efficiency of protein secretion.111–114 Targeting of proteins to the SecYEG channel can occur post-translationally, assisted by chaperones, or cotranslationally via the SRP-mediated pathway. In so-called cotranslational translocation, the translocation of proteins is coupled to their synthesis on the ribosomes.115,116 When these proteins emerge from the ribosome, their signal peptide is recognized by the signal recognition particle (SRP), which binds to the SRP receptor, the FtsY protein,117 which is conserved among all studied bacteria, including S. lividans.118,119 In B. subtilis, SRP is required for the targeting of most secretory proteins.120 Other proteins are first completely synthesized and then exported by so-called post-translational translocation. It is important that folding of these proteins in the cytoplasm is prevented, because fully folded proteins cannot be transported through the Sec system. In E. coli, the secretion-dedicated chaperone SecB stabilizes pre-proteins in the cytoplasm so that they remain in a translocation-competent state.121,122 SecB binds to SecA and in this way directs the proteins toward the translocation channel.123,124 SecB homologs supporting post-translational targeting are not present in Gram-positive organisms.125 The role of other common chaperones and heat shock proteins (the GroE and DnaK series) in protein secretion is not very clear, but there is some evidence that they enhance secretion by reducing the formation of protein aggregates. Higher expression of the GroE and DnaK operons, mediated by deletion of the negative regulator hrcA, combined with overexpression of the extracytoplasmic molecular chaperone PrsA, improved production of single-chain antibody fragments in B.
subtilis.126,127 In Streptomyces, the GroE operon is negatively regulated by the HrcA repressor128 and the DnaK operon is repressed by HspR.45,129,130 The deletion mutants ∆hrcA and ∆hspR have been constructed in S. coelicolor, S. lividans, and S. albus strains.45,128–130 So far, the effect of hrcA and hspR deletion on secretion has been investigated only in S. lividans, where no positive effect on Sec-dependent protein secretion was observed (personal communication with Jozef Anné, Katholieke Universiteit Leuven, Belgium).
The twin-arginine translocation (Tat) pathway can transport already folded proteins, which are often bound to redox cofactors. The pathway uses only the proton-motive force (PMF) as the energy source for transport. Proteins transported through the Tat pathway possess a signal peptide with the characteristic motif S/T-R-R-x-Φ-Φ (where Φ is a hydrophobic residue), which contains two neighboring arginines. The Tat pathway is well studied in E. coli131 and B. subtilis132 and was recently found in S. lividans.133 A search for Tat-dependent signal peptides in the sequenced genomes of S. coelicolor and S. avermitilis gave a list of 230 proteins, which is the highest number of predicted Tat substrates for a bacterial genome.134 Very
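[Figure 24.5 schematic. Labels: ribosome; N and C termini of the polypeptide; SRP; FtsY; SecA; SecYEG; SecDF; PrsA; signal peptidases SipSTUVW; proteases; cell wall. Energy inputs shown: ATP, GTP (GDP + Pi), and ∆µH+.]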
Figure 24.5 Cotranslational (solid line) and post-translational (dashed line) translocation of proteins via the Sec pathway. (From van Wely, K.H. et al., FEMS Microbiol Rev, 25, 437, 2001. With permission.)
recently, experimental data provided strong evidence that the Tat system is used as a major general export pathway in Streptomyces.135 S. lividans strains with deletions of principal components of the Tat pathway, such as TatC or TatB, showed retarded growth as well as impaired morphological differentiation on solid medium, confirming the importance of Tat-dependent secretion.134 Some proteins require Tat-dependent secretion to be active. For instance, when xylanase C, a Tat substrate, was directed to the Sec pathway through replacement of the signal peptide, the result was an inactive protein, which
was quickly degraded after secretion.136 Overexpression of the tatABC genes in S. lividans, however, improved xylanase C production.137 To compare transport through the Sec and Tat pathways, two human proteins (tumor necrosis factor and interleukin 10) were expressed in S. lividans fused to Sec- or Tat-dependent signal peptides. Secretion was lower for the Tat pathway, which might be attributed to suboptimal combinations of signal peptide and target protein or to the lower capacity and speed or higher energy demand of the Tat pathway itself.138 The authors made an interesting discovery: secretion through the Sec pathway was improved several-fold (up to 15-fold) in a strain with an impaired Tat pathway (∆tatB). It seems that the presence of Tat proteins has a negative effect on the Sec pathway, because it has also been shown that secretion of Sec substrates was reduced in an S. lividans strain overexpressing the TatABC proteins.137 In some cases a combination of both secretion pathways can be beneficial: xylanase B production was improved in S. lividans when xylanase B1 was engineered to be secreted by the Sec pathway and xylanase B2 by the Tat pathway.139
The signal peptide, which directs the pre-protein toward a secretion pathway, is an important determinant of how efficiently the protein is secreted. A library of naturally occurring signal peptides from B. subtilis has recently been constructed.140 Screening of the library was successfully used to optimize production of cutinase from Fusarium solani pisi and of a cytoplasmic esterase. However, the signal peptide that was optimal for one protein was not efficient for secretion of the other.140 This study highlights that secretion depends on both the signal peptide sequence and the sequence of the mature part; hence, signal peptides should be optimized individually for different proteins. Li et al.
mutagenized the signal peptide of xylanase C, which is exported through the Tat pathway, but did not achieve improved secretion in S. lividans.141 In contrast, it was possible to enhance secretion of mouse tumor necrosis factor alpha through the Sec pathway by decreasing the positive charge in the signal peptide sequence.142,143
During export through the Sec or Tat pathway, the signal peptide has to be cleaved off by signal peptidases. There are four known type I signal peptidases in S. lividans: sipW, sipX, sipY, and sipZ.144 Deletion of a single peptidase gene does not influence protein secretion, with the exception of the sipY deletion, which decreases the extracellular protein concentration and delays sporulation.145 Overexpression of sip genes had a positive effect on protein processing in S. lividans (personal communication with Nick Geukens, Katholieke Universiteit Leuven, Belgium), as also observed in B. subtilis.146,147 Recently, it was also shown that phage-shock protein A (PspA), which is thought to play a role in maintaining the PMF,148 affects the protein secretion yield in S. lividans. As also observed in E. coli,149 pspA overexpression was found to improve Tat-dependent protein secretion in S. lividans. The effect on Sec-dependent secretion was less pronounced and appeared to be protein dependent.150
Throughout translocation, as well as after secretion, proteins are targets of endo- and extracellular peptidases/proteases. S. lividans is a popular strain for heterologous protein production due to its low protease activity.151 Growing S. lividans in pelleted form further reduces the activity of extracellular peptidases.152 Some intra- and extracellular proteases have, however, been identified in S. lividans.153,154 Actinomycetes were also found to contain the 20S proteasome, a structure made of a self-compartmentalizing protease, which otherwise occurs only in eukaryotes and archaea.155 Deletion of the 20S proteasome in S.
lividans had a positive effect on the production of two heterologous proteins, soluble human tumour necrosis factor receptor II (shuTNFRII), secreted via the Sec pathway, and salmon calcitonin (sCT), secreted via the Tat pathway, but did not affect the production of soluble human tumour necrosis factor receptor I (shuTNFRI).156 This suggests that only some proteins are subject to proteasome activity and can benefit from proteasome deletion.
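Two sequence features discussed in this section, the twin-arginine motif S/T-R-R-x-Φ-Φ of Tat signal peptides and the positive charge at the N-terminus of the signal peptide, lend themselves to a simple computational check. The following is a minimal sketch, not the prediction method used in the cited studies. The set of residues counted as hydrophobic, the 40-residue search window, the 10-residue length of the charged N-terminal region, and both example sequences are assumptions made for illustration only.

```python
import re
from typing import Optional

# Assumption: residues treated as hydrophobic (the "Phi" positions of the motif).
HYDROPHOBIC = "AFILMVW"

# S/T-R-R-x-Phi-Phi, with x matching any residue.
TAT_MOTIF = re.compile(rf"[ST]RR.[{HYDROPHOBIC}]{{2}}")


def find_tat_motif(seq: str, window: int = 40) -> Optional[int]:
    """Return the 0-based start of the first twin-arginine motif, or None.

    Only the first `window` residues are searched, since the motif is
    part of the N-terminal signal peptide (window length is an assumption).
    """
    match = TAT_MOTIF.search(seq.upper()[:window])
    return match.start() if match else None


def n_region_charge(seq: str, n_region_len: int = 10) -> int:
    """Net side-chain charge of the first `n_region_len` residues.

    Counts K and R as +1 and D and E as -1; the free N-terminal amino
    group and histidine are ignored in this simplification.
    """
    region = seq.upper()[:n_region_len]
    positive = sum(region.count(aa) for aa in "KR")
    negative = sum(region.count(aa) for aa in "DE")
    return positive - negative


# Hypothetical signal peptides, invented for illustration only.
TAT_LIKE = "MNDSRRSFLKGAAAVGALGLAGSAQA"
SEC_LIKE = "MKKTLLAASLALAGSAQA"

if __name__ == "__main__":
    for name, peptide in (("Tat-like", TAT_LIKE), ("Sec-like", SEC_LIKE)):
        print(name, find_tat_motif(peptide), n_region_charge(peptide))
```

A scan of this kind only flags candidates. As the results above illustrate, secretion efficiency also depends on the mature part of the protein, so a motif match or a favorable charge does not by itself predict high secretion yields.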
24.5.7 Changing Regulation
By changing the central transcriptional regulators, it is possible to introduce global changes in metabolism. The outcome of such a modification is difficult to predict, but the increasing amount of data on protein–DNA interactions and transcription will hopefully allow modifications of this magnitude to be designed in the future.
One of the best-studied global regulators of secondary metabolism is the regulatory protein AfsR. It was first discovered in S. coelicolor as an element that induced overproduction of the signaling molecule A-factor and of the antibiotics actinorhodin, undecylprodigiosin,157 and calcium-dependent antibiotic (CDA)158 when cloned into S. lividans.158 The effect on antibiotic production was mediated by enhanced transcription of secondary metabolic genes. Antibiotic production was reduced but not eliminated in a ∆afsR mutant.159 A homologue of the afsR gene was found in S. peucetius; when overexpressed, it caused antibiotic overproduction in S. peucetius (doxorubicin), S. lividans TK 24 (γ-actinorhodin), S. clavuligerus (clavulanic acid), and S. griseus (streptomycin). AfsR is believed to activate transcription of the sigma-like gene afsR2, which in turn enhances transcription of the secondary metabolic genes by an as yet unknown mechanism.160 Overexpression of afsR2 caused antibiotic overproduction in S. coelicolor,161 S. lividans,162 S. avermitilis,163 and S. noursei.164
Another regulator is S-adenosylmethionine (SAM), the methyl donor in the methylation of nucleic acids, proteins, and small metabolites. SAM levels in S. coelicolor peak around the point when actinorhodin appears in the medium.165 When SAM synthase, encoded by the metK gene, was overexpressed in S. coelicolor, intracellular SAM levels increased, which caused overproduction of actinorhodin, prodigiosins, and CDA. Actinorhodin biosynthesis was influenced through increased expression of the actinorhodin cluster activator actII-4. The antibiotics were also synthesized earlier than normal. A similar effect was observed when SAM was added to the culture of a wild-type strain.165 A positive effect of SAM on antibiotic production has also been observed in S. lividans166 and other Streptomyces spp.167
24.5.8 Modifying the Product
The increasing resistance of pathogens to antibiotics in current use raises the need to discover new antibiotics. The screening of soil isolates continues to yield new antibiotic-producing Streptomyces, but the process is slowing down as more and more species become known. It is speculated that 99% of all microorganisms cannot be cultivated under laboratory conditions. To capture the antibiotics that could be produced by noncultivable Streptomyces and other microorganisms, researchers have started to look for DNA instead of the microbes themselves. DNA can be isolated from environmental samples and incorporated into libraries, which can, for example, be screened by hybridization with conserved gene sequences (such as PKS genes).168–170 Subsequently, the discovered genes can be expressed in a heterologous host to study the possible products encoded by the DNA.
Another way of obtaining new products is by varying the genes coding for their biosynthesis. Most of the work has been done on polyketide synthases (PKS), which are large modular enzymes that catalyze the synthesis of complex polyketide molecules. Polyketides are made by a mechanism that resembles the biosynthesis of long-chain fatty acids, but the units added at each step are often larger than an acetyl unit and, moreover, the molecules undergo various transformations during extension. Polyketide synthases consist of modules, each containing transferases, reductases, dehydrases, and other enzymes in different numbers and order. The modular structure and the position of the modules in the cluster define the final product. This unique organization of polyketide synthases allows new products to be generated by combinatorial biosynthesis.74 Native parts of a module can be substituted with analogous enzymes from different strains: replacement of the acyl carrier protein in the S.
coelicolor actinorhodin cluster by homologues from other clusters resulted in production of new actinorhodin analogues.171 Other strategies are to change the substrate specificity of PKS enzymes by mutation,172 to shuffle the positions of the modules in the PKS,173 or to delete certain enzyme functionalities.174 Finally, it has been shown to be possible to reconstruct a desirable PKS from the required elements.175 Other genes that are very amenable to “genetic molecule design” are those encoding nonribosomal peptide synthetases (NRPS), in which different subunits are responsible for incorporating particular amino acids into the peptide. A number of new daptomycin derivatives with antibacterial activity were obtained by substituting the native NRPS subunit with a homolog from a different cluster.176 Feeding of slightly changed
antibiotic precursors that can still be recognized and processed by the biosynthetic enzymes can also be used as a strategy to obtain new products.177
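The combinatorial strategies described in this section (substituting a domain with a homolog, deleting an enzyme functionality, and shuffling module positions) can be pictured as operations on an ordered list of modules. The following is a toy sketch of that bookkeeping only. The class, the function names, the three-module example, and its domain composition are hypothetical; the abbreviations KS, AT, KR, DH, and ACP stand for ketosynthase, acyltransferase, ketoreductase, dehydratase, and acyl carrier protein. Nothing here predicts whether a given variant would be catalytically active.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Module:
    """One PKS module: a name and an ordered tuple of domain labels."""
    name: str
    domains: Tuple[str, ...]


PKS = Tuple[Module, ...]


def substitute_domain(module: Module, old: str, new: str) -> Module:
    """Replace a domain with an analogous one, e.g. a homolog from another cluster."""
    if old not in module.domains:
        raise ValueError(f"{old!r} not present in module {module.name!r}")
    return Module(module.name, tuple(new if d == old else d for d in module.domains))


def delete_domain(module: Module, domain: str) -> Module:
    """Remove one enzyme functionality from a module."""
    if domain not in module.domains:
        raise ValueError(f"{domain!r} not present in module {module.name!r}")
    return Module(module.name, tuple(d for d in module.domains if d != domain))


def shuffled_orders(pks: PKS) -> Iterator[PKS]:
    """Yield every module order other than the native one.

    Simplification: all modules are treated as freely movable.
    """
    for order in permutations(pks):
        if order != pks:
            yield order


# Hypothetical three-module synthase, for illustration only.
TOY_PKS: PKS = (
    Module("M1", ("KS", "AT", "KR", "ACP")),
    Module("M2", ("KS", "AT", "KR", "DH", "ACP")),
    Module("M3", ("KS", "AT", "ACP")),
)

if __name__ == "__main__":
    print(substitute_domain(TOY_PKS[0], "ACP", "ACP_homolog"))
    print(delete_domain(TOY_PKS[1], "DH"))
    print(sum(1 for _ in shuffled_orders(TOY_PKS)), "alternative module orders")
```

Even this small example shows why the design space grows quickly: the number of module orders increases factorially with the number of modules, before any domain substitutions or deletions are considered.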
24.6 Perspectives
Over the past 50 years, Streptomyces have been a rich source of many valuable secondary metabolites. With recent developments in genomic techniques, Streptomyces studies have entered the postgenomic era, and this may allow a better understanding of the metabolism, morphological differentiation, and overall regulation in these complex bacteria. The accumulated knowledge will allow us to optimize industrial Streptomyces strains faster for the production of secondary metabolites, enzymes, and other products. We envisage improvements in genetic manipulation techniques, which will facilitate the molecular biology of streptomycetes. We hope that the many projects screening environmental DNA and RNA samples will bear fruit and that new bioactive compounds, including efficient antibiotics, will be discovered to help us fight infections caused by drug-resistant pathogenic bacteria, as well as other diseases. Thus, we foresee that Streptomyces will continue to play a very important role as workhorses for the production of natural products that serve as important pharmaceuticals in the next 50 years as well.
Acknowledgments
We thank Nick Geukens (Laboratory of Bacteriology, Rega Institute for Medical Research, Katholieke Universiteit Leuven, Belgium), Paul Dyson (Institute of Life Science, School of Medicine, University of Wales Swansea, U.K.), Marie Jolie Virolle (Laboratoire de Biologie et Genetique Moleculaire de l’Institut de Genetique et Microbiologie, France) and Jette Thykkær (Center for Microbial Biotechnology, BioCentrum-DTU, Technical University of Denmark, Denmark) for comments on the manuscript.
References
1. Schatz, A., Bugie, E., and Waksman, S. Streptomycin, a substance exhibiting antibiotic activity against gram-positive and gram-negative bacteria. Proc. Soc. Exptl. Biol. Med., 55, 66, 1944.
2. Tuncer, M., Kuru, A., Isikli, M., Sahin, N., and Celenk, F.G. Optimization of extracellular endoxylanase, endoglucanase and peroxidase production by Streptomyces sp. F2621 isolated in Turkey. J. Appl. Microbiol., 97, 783, 2004.
3. Macedo, J.M., Gottschalk, L.M., and Bon, E.P. Lignin peroxidase and protease production by Streptomyces viridosporus T7A in the presence of calcium carbonate. Nutritional and regulatory carbon sources. Appl. Biochem. Biotechnol., 77–79, 735, 1999.
4. Yokoyama, K., Nio, N., and Kikuchi, Y. Properties and applications of microbial transglutaminase. Appl. Microbiol. Biotechnol., 64, 447, 2004.
5. Okeke, B.C. and Frankenberger, W.T. Jr. Use of starch and potato peel waste for perchlorate bioreduction in water. Sci. Total. Environ., 347, 35, 2005.
6. Sette, L.D., de Oliveira, V.M., and Manfio, G.P. Isolation and characterization of alachlor-degrading actinomycetes from soil. Antonie Van Leeuwenhoek, 87, 81, 2005.
7. Okeke, B.C. and Frankenberger, W.T. Jr. Biodegradation of methyl tertiary butyl ether (MTBE) by a bacterial enrichment consortia and its monoculture isolates. Microbiol. Res., 158, 99, 2003.
8. Sariaslani, F.S., Trower, M.K., and Omer, C.A. Constitutive expression of P450SOY and ferredoxinSOY in Streptomyces, and biotransformation of chemicals by recombinant organisms. Patent US 9210885, 1993.
9. Garvin, R.T. and Malek, L.T. An expression system for the secretion of bioactive human granulocyte macrophage colony stimulating factor (GM-CSF) and other heterologous proteins from Streptomyces. Patent 89113607.9, 1989.
10. DeSanti, C.L. and Strohl, W.R. Soluble recombinant endostatin and method of making same from Streptomyces sp. Patent US 0009747, 2000.
11. Kieser, T., Bibb, M., Buttner, M., Chater, K.F., and Hopwood, D.A. Practical Streptomyces Genetics. The John Innes Foundation, Norwich, 2000.
12. Hopwood, D.A. Streptomyces genes: from Waksman to Sanger. J. Ind. Microbiol. Biotechnol., 30, 468, 2003.
13. Sianidis, G., Pozidis, C., Becker, F., Vrancken, K., Sjoeholm, C., Karamanou, S., Takamiya-Wik, M., van Mellaert, L., Schaefer, T., Anne, J., and Economou, A. Functional large-scale production of a novel Jonesia sp. xyloglucanase by heterologous secretion from Streptomyces lividans. J. Biotechnol., 2005.
14. Bentley, S.D., Chater, K.F., Cerdeno-Tarraga, A.M., Challis, G.L., Thomson, N.R., James, K.D., Harris, D.E. et al. Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2). Nature, 417, 141, 2002.
15. Lin, Y.S., Kieser, H.M., Hopwood, D.A., and Chen, C.W. The chromosomal DNA of Streptomyces lividans 66 is linear. Mol. Microbiol., 10, 923, 1993.
16. Lezhava, A., Mizukami, T., Kajitani, T., Kameoka, D., Redenbach, M., Shinkawa, H., Nimi, O., and Kinashi, H. Physical map of the linear chromosome of Streptomyces griseus. J. Bacteriol., 177, 6492, 1995.
17. Chen, C.W., Huang, C.H., Lee, H.H., Tsai, H.H., and Kirby, R. Once the circle has been broken: dynamics and evolution of Streptomyces chromosomes. Trends Genet., 18, 522, 2002.
18. Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M., and Omura, S. Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis. Nat. Biotechnol., 21, 526, 2003.
19. MacNeil, D.J. Characterization of a unique methyl-specific restriction system in Streptomyces avermitilis. J. Bacteriol., 170, 5607, 1988.
20. Mazodier, P., Petter, R., and Thompson, C. Intergeneric conjugation between Escherichia coli and Streptomyces species. J. Bacteriol., 171, 3583, 1989.
21. Flett, F., Mersinias, V., and Smith, C.P. High efficiency intergeneric conjugal transfer of plasmid DNA from Escherichia coli to methyl DNA-restricting streptomycetes. FEMS Microbiol. Lett., 155, 223, 1997.
22. Boccard, F., Smokvina, T., Pernodet, J.L., Friedmann, A., and Guerineau, M. The integrated conjugative plasmid pSAM2 of Streptomyces ambofaciens is related to temperate bacteriophages. EMBO J., 8, 973, 1989.
23. Boccard, F., Smokvina, T., Pernodet, J.L., Friedmann, A., and Guerineau, M. Structural analysis of loci involved in pSAM2 site-specific integration in Streptomyces. Plasmid, 21, 59, 1989.
24. Combes, P., Till, R., Bee, S., and Smith, M.C. The Streptomyces genome contains multiple pseudo-attB sites for the (phi)C31-encoded site-specific recombination system. J. Bacteriol., 184, 5746, 2002.
25. Murakami, T., Holt, T.G., and Thompson, C.J. Thiostrepton-induced gene expression in Streptomyces lividans. J. Bacteriol., 171, 1459, 1989.
26. DeSanti, C.L. and Strohl, W.R. Characterization of the Streptomyces sp. strain C5 snp locus and development of snp-derived expression vectors. Appl. Environ. Microbiol., 69, 1647, 2003.
27. Nikodinovic, J. and Priestley, N.D. A second generation snp-derived Escherichia coli - Streptomyces shuttle expression vector that is generally transferable by conjugation. Plasmid, 2006.
28. Jensen, P.R. and Hammer, K. The sequence of spacers between the consensus sequences modulates the strength of prokaryotic promoters. Appl. Environ. Microbiol., 64, 82, 1998.
29. Hammer, K., Mijakovic, I., and Jensen, P.R. Synthetic promoter libraries - tuning of gene expression. Trends Biotechnol., 24, 53, 2006.
30. Baudin, A., Ozier-Kalogeropoulos, O., Denouel, A., Lacroute, F., and Cullin, C. A simple and efficient method for direct gene deletion in Saccharomyces cerevisiae. Nucleic Acids Res., 21, 3329, 1993.
31. Datsenko, K.A. and Wanner, B.L. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. USA, 97, 6640, 2000.
32. Gust, B., Challis, G.L., Fowler, K., Kieser, T., and Chater, K.F. PCR-targeted Streptomyces gene replacement identifies a protein domain needed for biosynthesis of the sesquiterpene soil odor geosmin. Proc. Natl. Acad. Sci. USA, 100, 1541, 2003.
33. Khodakaramian, G., Lissenden, S., Gust, B., Moir, L., Hoskisson, P.A., Chater, K.F., and Smith, M.C. Expression of Cre recombinase during transient phage infection permits efficient marker removal in Streptomyces. Nucleic Acids Res., 34, 2006.
34. Shin, S.J., Wu, C.W., Steinberg, H., and Talaat, A.M. Identification of novel virulence determinants in Mycobacterium paratuberculosis by screening a library of insertional mutants. Infect. Immun., 74, 3825, 2006.
35. Pobigaylo, N., Wetter, D., Szymczak, S., Schiller, U., Kurtz, S., Meyer, F., Nattkemper, T.W., and Becker, A. Construction of a large signature-tagged mini-Tn5 transposon library and its application to mutagenesis of Sinorhizobium meliloti. Appl. Environ. Microbiol., 72, 4329, 2006.
36. Kumar, A., Seringhaus, M., Biery, M.C., Sarnovsky, R.J., Umansky, L., Piccirillo, S., Heidtman, M. et al. Large-scale mutagenesis of the yeast genome using a Tn7-derived multipurpose transposon. Genome Res., 14, 1975, 2004.
37. Wallace, L.J., Ward, J.M., and Richmond, M.H. The tnpR gene product of TnA is required for transposition immunity. Mol. Gen. Genet., 184, 87, 1981.
38. Herron, P.R., Hughes, G., Chandra, G., Fielding, S., and Dyson, P.J. Transposon Express, a software application to report the identity of insertions obtained by comprehensive transposon mutagenesis of sequenced genomes: analysis of the preference for in vitro Tn5 transposition into GC-rich DNA. Nucleic Acids Res., 32, 2004.
39. Volff, J.N. and Altenbuchner, J. Genetic instability of the Streptomyces chromosome. Mol. Microbiol., 27, 239, 1998.
40. Bishop, A., Fielding, S., Dyson, P., and Herron, P. Systematic insertional mutagenesis of a streptomycete genome: a link between osmoadaptation and antibiotic production. Genome Res., 14, 893, 2004.
41. Chakraburtty, R. and Bibb, M. The ppGpp synthetase gene (relA) of Streptomyces coelicolor A3(2) plays a conditional role in antibiotic production and morphological differentiation. J. Bacteriol., 179, 5854, 1997.
42. Chakraburtty, R., White, J., Takano, E., and Bibb, M. Cloning, characterization and disruption of a (p)ppGpp synthetase gene (relA) of Streptomyces coelicolor A3(2). Mol. Microbiol., 19, 357, 1996.
43. Huang, J., Lih, C.J., Pan, K.H., and Cohen, S.N. Global analysis of growth phase responsive gene expression and regulation of antibiotic biosynthetic pathways in Streptomyces coelicolor using DNA microarrays. Genes Dev., 15, 3183, 2001.
44. Karoonuthaisiri, N., Weaver, D., Huang, J., Cohen, S.N., and Kao, C.M. Regional organization of gene expression in Streptomyces coelicolor. Gene, 353, 53, 2005.
45. Bucca, G., Brassington, A.M., Hotchkiss, G., Mersinias, V., and Smith, C.P. Negative feedback regulation of dnaK, clpB and lon expression by the DnaK chaperone machine in Streptomyces coelicolor, identified by transcriptome and in vivo DnaK-depletion analysis. Mol. Microbiol., 50, 153, 2003.
46. Mehra, S., Lian, W., Jayapal, K.P., Charaniya, S.P., Sherman, D.H., and Hu, W.S. A framework to analyze multiple time series data: a case study with Streptomyces coelicolor. J. Ind. Microbiol. Biotechnol., 33, 159, 2006.
47. Laing, E., Mersinias, V., Smith, C.P., and Hubbard, S.J. Analysis of gene expression in operons of Streptomyces coelicolor. Genome Biol., 7, 2006.
48. McArthur, M. and Bibb, M. In vivo DNase I sensitivity of the Streptomyces coelicolor chromosome correlates with gene expression: implications for bacterial chromosome structure. Nucleic Acids Res., 34, 5395, 2006.
49. Lum, A.M., Huang, J., Hutchinson, C.R., and Kao, C.M. Reverse engineering of industrial pharmaceutical-producing actinomycete strains using DNA microarrays. Metab. Eng., 6, 186, 2004.
50. Hesketh, A.R., Chandra, G., Shaw, A.D., Rowland, J.J., Kell, D.B., Bibb, M.J., and Chater, K.F. Primary and secondary metabolism, and post-translational protein modifications, as portrayed by proteomic analysis of Streptomyces coelicolor. Mol. Microbiol., 46, 917, 2002.
51. Wittmann, C., Kromer, J.O., Kiefer, P., Binz, T., and Heinzle, E. Impact of the cold shock phenomenon on quantification of intracellular metabolites in bacteria. Anal. Biochem., 327, 135, 2004.
52. Borodina, I., Siebring, J., Zhang, J., Smith, C., Van Keulen, G., Dijkhuizen, L., and Nielsen, J. Antibiotics overproduction in Streptomyces coelicolor A3(2) mediated by phosphofructokinase A2 deletion. J. Biol. Chem., 283, 25186, 2008.
53. Kromer, J.O., Sorgenfrei, O., Klopprogge, K., Heinzle, E., and Wittmann, C. In-depth profiling of lysine-producing Corynebacterium glutamicum by combined analysis of the transcriptome, metabolome, and fluxome. J. Bacteriol., 186, 1769, 2004.
54. Avignone Rossa, C., White, J., Kuiper, A., Postma, P.W., Bibb, M., and Teixeira de Mattos, M.J. Carbon flux distribution in antibiotic-producing chemostat cultures of Streptomyces lividans. Metab. Eng., 4, 138, 2002.
55. Kim, H.B., Smith, C.P., Micklefield, J., and Mavituna, F. Metabolic flux analysis for calcium dependent antibiotic (CDA) production in Streptomyces coelicolor. Metab. Eng., 6, 313, 2004.
56. Naeimpoor, F. and Mavituna, F. Metabolic flux analysis in Streptomyces coelicolor under various nutrient limitations. Metab. Eng., 2, 140, 2000.
57. Gunnarsson, N., Mortensen, U.H., Sosio, M., and Nielsen, J. Identification of the Entner-Doudoroff pathway in an antibiotic-producing actinomycete species. Mol. Microbiol., 52, 895, 2004.
58. Borodina, I., Scholler, C., Eliasson, A., and Nielsen, J. Metabolic network analysis of Streptomyces tenebrarius, a Streptomyces species with an active Entner-Doudoroff pathway. Appl. Environ. Microbiol., 71, 2294, 2005.
59. Inbar, L. and Lapidot, A. 13C nuclear magnetic resonance and gas chromatography-mass spectrometry studies of carbon metabolism in the actinomycin D producer Streptomyces parvulus by use of 13C-labeled precursor. J. Bacteriol., 173, 7790, 1991.
60. Jonsbu, E., Christensen, B., and Nielsen, J. Changes of in vivo fluxes through central metabolic pathways during the production of nystatin by Streptomyces noursei in batch culture. Appl. Microbiol. Biotechnol., 56, 93, 2001.
61. Gunnarsson, N., Bruheim, P., and Nielsen, J. Glucose metabolism in the antibiotic producing actinomycete Nonomuraea sp. ATCC 39727. Biotechnol. Bioeng., 88, 652, 2004.
62. Borodina, I., Krabben, P., and Nielsen, J. Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res., 15, 820, 2005.
63. Borodina, I. and Nielsen, J. From genomes to in silico cells via metabolic networks. Curr. Opin. Biotechnol., 16, 350, 2005.
64. Pharkya, P. and Maranas, C.D. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng., 8, 1, 2006.
65. Patil, K.R., Rocha, I., Forster, J., and Nielsen, J. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 6, 308, 2005.
66. Patil, K.R. and Nielsen, J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl. Acad. Sci. USA, 102, 2685, 2005.
67. Alper, H., Jin, Y.S., Moxley, J.F., and Stephanopoulos, G. Identifying gene targets for the metabolic engineering of lycopene biosynthesis in Escherichia coli. Metab. Eng., 7, 155, 2005.
68. Alper, H., Miyaoku, K., and Stephanopoulos, G. Construction of lycopene-overproducing E. coli strains by combining systematic and combinatorial gene knockout targets. Nat. Biotechnol., 23, 612, 2005.
69. Rodriguez, E., Hu, Z., Ou, S., Volchegursky, Y., Hutchinson, C.R., and McDaniel, R. Rapid engineering of polyketide overproduction by gene transfer to industrially optimized strains. J. Ind. Microbiol. Biotechnol., 30, 480, 2003.
70. Rowe, C.J., Cortes, J., Gaisser, S., Staunton, J., and Leadlay, P.F. Construction of new vectors for high-level expression in actinomycetes. Gene, 216, 215, 1998.
71. Kao, C.M., Katz, L., and Khosla, C. Engineered biosynthesis of a complete macrolactone in a heterologous host. Science, 265, 509, 1994.
72. Ziermann, R. and Betlach, M.C. Recombinant polyketide synthesis in Streptomyces: engineering of improved host strains. Biotechniques, 26, 106, 1999.
73. Zirkle, R., Ligon, J.M., and Molnar, I. Heterologous production of the antifungal polyketide antibiotic soraphen A of Sorangium cellulosum So ce26 in Streptomyces lividans. Microbiology, 150, 2761, 2004.
74. McDaniel, R., Ebert-Khosla, S., Hopwood, D.A., and Khosla, C. Engineered biosynthesis of novel polyketides. Science, 262, 1546, 1993.
75. Tang, L., Shah, S., Chung, L., Carney, J., Katz, L., Khosla, C., and Julien, B. Cloning and heterologous expression of the epothilone gene cluster. Science, 287, 640, 2000.
76. Yanai, K., Murakami, T., and Bibb, M. Amplification of the entire kanamycin biosynthetic gene cluster during empirical strain improvement of Streptomyces kanamyceticus. Proc. Natl. Acad. Sci. USA, 103, 9661, 2006.
77. Bejar, S., Belghith, K., and Ellouz, R. Glucose isomerase of S. violaceoniger. Fundamental and applied aspects. Arch. Inst. Pasteur Tunis, 71, 407, 1994.
78. Gramajo, H.C., Takano, E., and Bibb, M.J. Stationary-phase production of the antibiotic actinorhodin in Streptomyces coelicolor A3(2) is transcriptionally regulated. Mol. Microbiol., 7, 837, 1993.
79. Takano, E., Gramajo, H.C., Strauch, E., Andres, N., White, J., and Bibb, M.J. Transcriptional regulation of the redD transcriptional activator gene accounts for growth-phase-dependent production of the antibiotic undecylprodigiosin in Streptomyces coelicolor A3(2). Mol. Microbiol., 6, 2797, 1992.
80. Metsa-Ketela, M., Ylihonko, K., and Mantsala, P. Partial activation of a silent angucycline-type gene cluster from a rubromycin beta producing Streptomyces sp. PGA64. J. Antibiot., 57, 502, 2004.
81. Baltz, R.H., McHenney, M.A., Cantwell, C.A., Queener, S.W., and Solenberg, P.J. Applications of transposition mutagenesis in antibiotic producing streptomycetes. Antonie Van Leeuwenhoek, 71, 179, 1997.
82. Lombo, F., Pfeifer, B., Leaf, T., Ou, S., Kim, Y.S., Cane, D.E., Licari, P., and Khosla, C. Enhancing the atom economy of polyketide biosynthetic processes through metabolic engineering. Biotechnol. Prog., 17, 612, 2001.
83. Ryu, Y.G., Butler, M.J., Chater, K.F., and Lee, K.J. Engineering of primary carbohydrate metabolism for increased production of actinorhodin in Streptomyces coelicolor. Appl. Environ. Microbiol., 2006.
84. Li, Y., Ling, H., Li, W., and Tan, H. Improvement of nikkomycin production by enhanced copy of sanU and sanV in Streptomyces ansochromogenes and characterization of a novel glutamate mutase encoded by sanU and sanV. Metab. Eng., 7, 165, 2005.
85. McIntyre, J., Bull, A., and Bunch, A. Vancomycin production in batch and continuous culture. Biotechnol. Bioeng., 49, 412, 1996.
86. Rius, N., Maeda, K., and Demain, A.L. Induction of L-lysine epsilon-aminotransferase by L-lysine in Streptomyces clavuligerus, producer of cephalosporins. FEMS Microbiol. Lett., 144, 207, 1996.
87. Paradkar, A.S., Mosher, R.H., Anders, C., Griffin, A., Griffin, J., Hughes, C., Greaves, P., Barton, B., and Jensen, S.E. Applications of gene replacement technology to Streptomyces clavuligerus strain development for clavulanic acid production. Appl. Environ. Microbiol., 67, 2292, 2001.
88. Li, R. and Townsend, C.A. Rational strain improvement for enhanced clavulanic acid production by genetic engineering of the glycolytic pathway in Streptomyces clavuligerus. Metab. Eng., 8, 240, 2006.
89. Chouayekh, H. and Virolle, M.J. The polyphosphate kinase plays a negative role in the control of antibiotic production in Streptomyces lividans. Mol. Microbiol., 43, 919, 2002.
90. Ghorbel, S., Smirnov, A., Chouayekh, H., Sperandio, B., Esnault, C., Kormanec, J., and Virolle, M.J., Regulation of ppk expression and in vivo function of Ppk in Streptomyces lividans TK24. J. Bacteriol., 188, 6269, 2006. 91. Jonsbu, E., McIntyre, M., and Nielsen, J. The influence of carbon sources and morphology on nystatin production by Streptomyces noursei. J. Biotechnol., 95, 133, 2002. 92. Vecht-Lifshitz, S.E., Sasson, Y., and Braun, S. Nikkomycin production in pellets of Streptomyces tendae. J. Appl. Bacteriol., 72, 195, 1992. 93. Sarra, M., Perez-Pons, J.A., Godia, F., and Casas Alvero, C. Importance of growth form on production of hybrid antibiotic by Streptomyces lividans TK21 by fed-batch and continuous fermentation. Appl. Biochem. Biotechnol., 75, 235, 1998. 94. Kwak, J., Dharmatilake, A.J., Jiang, H., and Kendrick, K.E. Differential regulation of ftsZ transcription during septation of Streptomyces griseus. J. Bacteriol., 183, 5092, 2001. 95. Kawamoto, S., Watanabe, H., Hesketh, A., Ensign, J.C., and Ochi, K. Expression analysis of the ssgA gene product, associated with sporulation and cell division in Streptomyces griseus. Microbiology, 143 (Pt 4), 1077, 1997. 96. van Wezel, G., J, v.d.M., Kawamoto, S., Luiten, R.G., Koerten, H.K., and Kraal, B. ssgA is essential for sporulation of Streptomyces coelicolor A3(2) and affects hyphal development by stimulating septum formation. J. Bacteriol., 182, 5653, 2000. 97. van Wezel, G.P., W., Krabben, P., Traag, B.A., Keijser, B.J., Kerste, R., Vijgenboom, E., Heijnen, J.J., and Kraal, B. Unlocking Streptomyces spp. for use as sustainable industrial production platforms by morphological engineering. Appl. Environ. Microbiol., 72, 5283, 2006. 98. Katz, L. and Donadio, S. Macrolides. Biotechnology (Reading, MA), 28, 385, 1995. 99. Stassi, D., Donadio, S., Staver, M.J., and Katz, L. Identification of a Saccharopolyspora erythraea gene required for the final hydroxylation step in erythromycin biosynthesis. J. 
Bacteriol., 175, 182, 1993. 100. Brunker, P., Minas, W., Kallio, P.T., and Bailey, J.E. Genetic engineering of an industrial strain of Saccharopolyspora erythraea for stable expression of the Vitreoscilla haemoglobin gene (vhb). Microbiology, 144, 2441, 1998. 101. Tabakov, V., Emel’ianova, L.K., Antonova, S.V., and Voeikova, T.A. Effect of the gene for bacterial hemoglobin vhb on the effectiveness of the process of Escherichia coli - Streptomyces interspecies conjugation and production of antibiotics in streptomycetes. Genetika, 37, 422, 2001. 102. Kim, D.W., Chater, K., Lee, K.J., and Hesketh, A. Changes in the extracellular proteome caused by the absence of the bldA gene product, a developmentally significant tRNA, reveal a new target for the pleiotropic regulator AdpA in Streptomyces coelicolor. J. Bacteriol., 187, 2957, 2005. 103. Hirose, I., Sano, K., Shioda, I., Kumano, M., Nakamura, K., and Yamane, K. Proteome analysis of Bacillus subtilis extracellular proteins: a two-dimensional protein electrophoretic study, Microbiology, 146 (Pt 1), 65, 2000. 104. Brundage, L., Hendrick, J.P., Schiebel, E., Driessen, A.J., and Wickner, W. The purified E. coli integral membrane protein SecY/E is sufficient for reconstitution of SecA-dependent precursor protein translocation. Cell, 62, 649, 1990. 105. Akimaru, J., Matsuyama, S., Tokuda, H., and Mizushima, S. Reconstitution of a protein translocation system containing purified SecY, SecE, and SecA from Escherichia coli. Proc. Natl. Acad. Sci. USA, 88, 6545, 1991. 106. Manting, E.H., van Der Does, C., Remigy, H., Engel, A., and Driessen, A.J. SecYEG assembles into a tetramer to form the active protein translocation channel. EMBO J., 19, 852, 2000. 107. Schiebel, E., Driessen, A.J., Hartl, F.U., and Wickner, W. Delta mu H + and ATP function at different steps of the catalytic cycle of preprotein translocase. Cell, 64, 927, 1991. 108. 
van der Wolk, J., Klose, M., Breukink, E., Demel, R.A., de Kruijff, B., Freudl, R., and Driessen, A.J. Characterization of a Bacillus subtilis SecA mutant protein deficient in translocation ATPase and release from the membrane. Mol. Microbiol., 8, 31, 1993.
109. Papanikou, E., Karamanou, S., Baud, C., Sianidis, G., Frank, M., and Economou, A. Helicase Motif III in SecA is essential for coupling preprotein binding to translocation ATPase. EMBO Rep., 5, 807, 2004. 110. Karamanou, S., Vrontou, E., Sianidis, G., Baud, C., Roos, T., Kuhn, A., Politou, A.S., and Economou, A. A molecular switch in SecA protein couples ATP hydrolysis to protein translocation. Mol. Microbiol., 34, 1133, 1999. 111. Gardel, C., Johnson, K., Jacq, A., and Beckwith, J. The secD locus of E.coli codes for two membrane proteins required for protein export. EMBO J., 9, 3209, 1990. 112. Arkowitz, R.A. and Wickner, W. SecD and SecF are required for the proton electrochemical gradient stimulation of preprotein translocation. EMBO J., 13, 954, 1994. 113. Economou, A., Pogliano, J.A., Beckwith, J., Oliver, D.B., and Wickner, W. SecA membrane cycling at SecYEG is driven by distinct ATP binding and hydrolysis events and is regulated by SecD and SecF. Cell, 83, 1171, 1995. 114. Duong, F. and Wickner, W. Distinct catalytic roles of the SecYE, SecG and SecDFyajC subunits of preprotein translocase holoenzyme. EMBO J, 16, 2756, 1997. 115. Luirink, J., ten Hagen-Jongman, C.M., van der Weijden, C.C., Oudega, B., High, S., Dobberstein, B., and Kusters, R. An alternative protein targeting pathway in Escherichia coli: studies on the role of FtsY. EMBO J., 13, 2289, 1994. 116. Powers, T. and Walter, P. Co-translational protein targeting catalyzed by the Escherichia coli signal recognition particle and its receptor. EMBO J., 16, 4880, 1997. 117. Seluanov, A. and Bibi, E. FtsY, the prokaryotic signal recognition particle receptor homologue, is essential for biogenesis of membrane proteins. J. Biol. Chem., 272, 2053, 1997. 118. Palacin, A., de la Fuente, R., Valle, I., Rivas, L.A., and Mellado, R.P. Streptomyces lividans contains a minimal functional signal recognition particle that is involved in protein secretion. Microbiology, 149, 2435, 2003. 119. Palomino, C. 
and Mellado, R.P. The Streptomyces lividans cytoplasmic signal recognition particle receptor FtsY is involved in protein secretion. J. Mol. Microbiol. Biotechnol., 9, 57, 2005. 120. Yamane, K., Bunai, K., and Kakeshita, H. Protein traffic for secretion and related machinery of Bacillus subtilis. Biosci. Biotechnol. Biochem., 68, 2007, 2004. 121. Kumamoto, C.A. and Beckwith, J. Mutations in a new gene, secB, cause defective protein localization in Escherichia coli. J. Bacteriol., 154, 253, 1983. 122. Baars, L., Ytterberg, A.J., Drew, D., Wagner, S., Thilo, C., van Wijk, K., and de Gier, J. Defining the role of the Escherichia coli chaperone SecB using comparative proteomics. J. Biol. Chem., 281, 10024, 2006. 123. Trun, N.J. and Silhavy, T.J. The genetics of protein targeting in Escherichia coli K12. J. Cell Sci. Suppl., 11, 13, 1989. 124. Crane, J.M., Mao, C., Lilly, A.A., Smith, V.F., Suo, Y., Hubbell, W.L., and Randall, L.L. Mapping of the docking of SecA onto the chaperone SecB by site-directed spin labeling: insight into the mechanism of ligand transfer during protein export. J. Mol. Biol., 353, 295, 2005. 125. van Wely, K.H., Swaving, J., Freudl, R., and Driessen, A.J. Translocation of proteins across the cell envelope of Gram-positive bacteria. FEMS Microbiol. Rev., 25, 437, 2001. 126. Wu, S.C., Yeung, J.C., Duan, Y., Ye, R., Szarka, S.J., Habibi, H.R., and Wong, S.L. Functional production and characterization of a fibrin-specific single-chain antibody fragment from Bacillus subtilis: effects of molecular chaperones and a wall-bound protease on antibody fragment production. Appl. Environ. Microbiol., 68, 3261, 2002. 127. Wu, S.C., Ye, R., Wu, X.C., Ng, S.C., and Wong, S.L. Enhanced secretory production of a single-chain antibody fragment from Bacillus subtilis by coproduction of molecular chaperones. J. Bacteriol., 180, 2830, 1998. 128. Grandvalet, C., Rapoport, G., and Mazodier, P. 
hrcA, encoding the repressor of the groEL genes in Streptomyces albus G, is associated with a second dnaJ gene. J. Bacteriol., 180, 5129, 1998.
129. Bucca, G., Hindle, Z., and Smith, C.P. Regulation of the dnaK operon of Streptomyces coelicolor A3(2) is governed by HspR, an autoregulatory repressor protein. J. Bacteriol., 179, 5999, 1997. 130. Grandvalet, C., Servant, P., and Mazodier, P. Disruption of hspR, the repressor gene of the dnaK operon in Streptomyces albus G. Mol. Microbiol., 23, 77, 1997. 131. Palmer, T., Sargent, F., and Berks, B.C. Export of complex cofactor-containing proteins by the bacterial Tat pathway. Trends Microbiol., 13, 175, 2005. 132. Jongbloed, J.D., Martin, U., Antelmann, H., Hecker, M., Tjalsma, H., Venema, G., Bron, S., van Dijl, J.M., and Muller, J. TatC is a specificity determinant for protein secretion via the twin-arginine translocation pathway. J. Biol. Chem., 275, 41350, 2000. 133. Schaerlaekens, K., Schierova, M., Lammertyn, E., Geukens, N., Anne, J., and Van Mellaert, L. Twin-arginine translocation pathway in Streptomyces lividans. J. Bacteriol., 183, 6727, 2001. 134. Schaerlaekens, K., Van Mellaert, L., Lammertyn, E., Geukens, N., and Anne, J. The importance of the Tat-dependent protein secretion pathway in Streptomyces as revealed by phenotypic changes in tat deletion mutants and genome analysis. Microbiology, 150, 21, 2004. 135. Widdick, D.A., Dilks, K., Chandra, G., Bottrill, A., Naldrett, M., Pohlschroder, M., and Palmer, T. The twin-arginine translocation pathway is a major route of protein export in Streptomyces coelicolor. Proc. Natl. Acad. Sci. USA, 103, 17927, 2006. 136. Faury, D., Saidane, S., Li, H., and Morosoli, R. Secretion of active xylanase C from Streptomyces lividans is exclusively mediated by the Tat protein export system. Biochim. Biophys. Acta, 1699, 155, 2004. 137. De Keersmaeker, S., Vrancken, K., Van Mellaert, L., Lammertyn, E., Anne, J., and Geukens, N. Evaluation of TatABC overproduction on Tat- and Sec-dependent protein secretion in Streptomyces lividans. Arch. Microbiol., 186, 507, 2006. 138.
Schaerlaekens, K., Lammertyn, E., Geukens, N., De Keersmaeker, S., Anne, J., and Van Mellaert, L. Comparison of the Sec and Tat secretion pathways for heterologous protein production by Streptomyces lividans. J. Biotechnol., 112, 279, 2004. 139. Gauthier, C., Li, H., and Morosoli, R. Increase in xylanase production by Streptomyces lividans through simultaneous use of the Sec- and Tat-dependent protein export systems. Appl. Environ. Microbiol., 71, 3085, 2005. 140. Brockmeier, U., Caspers, M., Freudl, R., Jockwer, A., Noll, T., and Eggert, T. Systematic screening of all signal peptides from Bacillus subtilis: a powerful strategy in optimizing heterologous protein secretion in Gram-positive bacteria. J. Mol. Biol., 362, 393, 2006. 141. Li, H., Faury, D., and Morosoli, R. Impact of amino acid changes in the signal peptide on the secretion of the Tat-dependent xylanase C from Streptomyces lividans. FEMS Microbiol. Lett., 255, 268, 2006. 142. Lammertyn, E., Van Mellaert, L., Schacht, S., Dillen, C., Sablon, E., Van Broekhoven, A., and Anne, J. Evaluation of a novel subtilisin inhibitor gene and mutant derivatives for the expression and secretion of mouse tumor necrosis factor alpha by Streptomyces lividans. Appl. Environ. Microbiol., 63, 1808, 1997. 143. Lammertyn, E., Desmyter, S., Schacht, S., Van Mellaert, L., and Anne, J. Influence of charge variation in the Streptomyces venezuelae alpha-amylase signal peptide on heterologous protein production by Streptomyces lividans. Appl. Microbiol. Biotechnol., 49, 424, 1998. 144. van Roosmalen, M., Geukens, N., Jongbloed, J.D., Tjalsma, H., Dubois, J.Y., Bron, S., van Dijl, J., and Anne, J. Type I signal peptidases of Gram-positive bacteria. Biochim. Biophys. Acta, 1694, 279, 2004. 145. Palacin, A., Parro, V., Geukens, N., Anne, J., and Mellado, R.P. SipY is the Streptomyces lividans type I signal peptidase exerting a major effect on protein secretion. J. Bacteriol., 184, 4875, 2002. 146.
Bolhuis, A., Tjalsma, H., Smith, H.E., de Jong, A., Meima, R., Venema, G., Bron, S., and van Dijl, J. Evaluation of bottlenecks in the late stages of protein secretion in Bacillus subtilis. Appl. Environ. Microbiol., 65, 2934, 1999.
147. van Dijl, J., de Jong, A., Vehmaanpera, J., Venema, G., and Bron, S. Signal peptidase I of Bacillus subtilis: patterns of conserved amino acids in prokaryotic and eukaryotic type I signal peptidases. EMBO J., 11, 2819, 1992. 148. Darwin, A.J. The phage-shock-protein response. Mol. Microbiol., 57, 621, 2005. 149. DeLisa, M.P., Lee, P., Palmer, T., and Georgiou, G. Phage shock protein PspA of Escherichia coli relieves saturation of protein export via the Tat pathway. J. Bacteriol., 186, 366, 2004. 150. Vrancken, K., De Keersmaeker, S., Geukens, N., Lammertyn, E., Anne, J., and Van Mellaert, L. pspA overexpression in Streptomyces lividans improves both Sec- and Tat-dependent protein secretion. Appl. Microbiol. Biotechnol., 73, 1150, 2007. 151. Binnie, C., Cossar, J.D., and Stewart, D.I. Heterologous biopharmaceutical protein expression in Streptomyces. Trends Biotechnol., 15, 315, 1997. 152. Yun, S.I., Yahya, A.R., Malten, M., Cossar, D., Anderson, W.A., Scharer, J.M., and Moo-Young, M. Peptidases affecting recombinant protein production by Streptomyces lividans. Can. J. Microbiol., 47, 1137, 2001. 153. Binnie, C., Butler, M.J., Aphale, J.S., Bourgault, R., DiZonno, M.A., Krygsman, P., Liao, L., Walczyk, E., and Malek, L.T. Isolation and characterization of two genes encoding proteases associated with the mycelium of Streptomyces lividans 66. J. Bacteriol., 177, 6033, 1995. 154. Krieger, T.J., Bartfeld, D., Jenish, D.L., and Hadary, D. Purification and characterization of a novel tripeptidyl aminopeptidase from Streptomyces lividans 66. FEBS Lett., 352, 385, 1994. 155. De Mot, R., Nagy, I., Walz, J., and Baumeister, W. Proteasomes and other self-compartmentalizing proteases in prokaryotes. Trends Microbiol., 7, 88, 1999. 156. Hong, B., Wang, L., Lammertyn, E., Geukens, N., Van Mellaert, L., Li, Y., and Anne, J. Inactivation of the 20S in Streptomyces lividans and its influence on the production of heterologous proteins. Microbiology, 151, 3137, 2005. 157. 
Horinouchi, S., Hara, O., and Beppu, T. Cloning of a pleiotropic gene that positively controls biosynthesis of A-factor, actinorhodin, and prodigiosin in Streptomyces coelicolor A3(2) and Streptomyces lividans. J. Bacteriol., 155, 1238, 1983. 158. Horinouchi, S., Malpartida, F., Hopwood, D.A., and Beppu, T. afsB stimulates transcription of the actinorhodin biosynthetic pathway in Streptomyces coelicolor A3(2) and Streptomyces lividans. Mol. Gen. Genet., 215, 355, 1989. 159. Floriano, B. and Bibb, M. afsR is a pleiotropic but conditionally required regulatory gene for antibiotic production in Streptomyces coelicolor A3(2). Mol. Microbiol., 21, 385, 1996. 160. Lee, P.C., Umeyama, T., and Horinouchi, S. afsS is a target of AfsR, a transcriptional factor with ATPase activity that globally controls secondary metabolism in Streptomyces coelicolor A3(2). Mol. Microbiol., 43, 1413, 2002. 161. Matsumoto, A., Ishizuka, H., Beppu, T., and Horinouchi, S. Involvement of a small ORF downstream of the afsR gene in the regulation of secondary metabolism in Streptomyces coelicolor A3(2). Actinomycetologica, 9, 37, 1995. 162. Vogtli, M., Chang, P.C., and Cohen, S.N. afsR2: a previously undetected gene encoding a 63-amino-acid protein that stimulates antibiotic production in Streptomyces lividans. Mol. Microbiol., 14, 643, 1994. 163. Lee, J., Hwang, Y., Kim, S., Kim, E., and Choi, C. Effect of a global regulatory gene, afsR2, from Streptomyces lividans on avermectin production in Streptomyces avermitilis. J. Biosci. Bioeng., 89, 606, 2000. 164. Sekurova, O., Sletta, H., Ellingsen, T.E., Valla, S., and Zotchev, S., Molecular cloning and analysis of a pleiotropic regulatory gene locus from the nystatin producer Streptomyces noursei ATCC11455. FEMS Microbiol. Lett., 177, 297, 1999. 165. Okamoto, S., Lezhava, A., Hosaka, T., Okamoto-Hosoya, Y., and Ochi, K. Enhanced expression of S-adenosylmethionine synthetase causes overproduction of actinorhodin in Streptomyces coelicolor A3(2). J. 
Bacteriol., 185, 601, 2003.
166. Kim, D.J., Huh, J.H., Yang, Y.Y., Kang, C.M., Lee, I.H., Hyun, C.G., Hong, S.K., and Suh, J.W. Accumulation of S-adenosyl-L-methionine enhances production of actinorhodin but inhibits sporulation in Streptomyces lividans TK23. J. Bacteriol., 185, 592, 2003. 167. Huh, J.H., Kim, D.J., Zhao, X.Q., Li, M., Jo, Y.Y., Yoon, T.M., Shin, S.K., Yong, J.H., Ryu, Y.W., Yang, Y.Y., and Suh, J.W. Widespread activation of antibiotic biosynthesis by S-adenosylmethionine in streptomycetes. FEMS Microbiol. Lett., 238, 439, 2004. 168. Bertrand, H., Poly, F., Van, V.T., Lombard, N., Nalin, R., Vogel, T.M., and Simonet, P. High molecular weight DNA recovery from soils prerequisite for biotechnological metagenomic library construction. J. Microbiol. Methods, 62, 1, 2005. 169. Ginolhac, A., Jarrin, C., Gillet, B., Robe, P., Pujic, P., Tuphile, K., Bertrand, H., Vogel, T.M., Perriere, G., Simonet, P., and Nalin, R. Phylogenetic analysis of polyketide synthase I domains from soil metagenomic libraries allows selection of promising clones. Appl. Environ. Microbiol., 70, 5522, 2004. 170. Courtois, S., Cappellano, C.M., Ball, M., Francou, F.X., Normand, P., Helynck, G., Martinez, A. et al. Recombinant environmental libraries provide access to microbial diversity for drug discovery from natural products. Appl. Environ. Microbiol., 69, 49, 2003. 171. Khosla, C., McDaniel, R., Ebert-Khosla, S., Torres, R., Sherman, D.H., Bibb, M.J., and Hopwood, D.A. Genetic construction and functional analysis of hybrid polyketide synthases containing heterologous acyl carrier proteins. J. Bacteriol., 175, 2197, 1993. 172. Reeves, C.D., Murli, S., Ashley, G.W., Piagentini, M., Hutchinson, C.R., and McDaniel, R. Alteration of the substrate specificity of a modular polyketide synthase acyltransferase domain through site-specific mutations. Biochemistry, 40, 15464, 2001. 173. McDaniel, R., Kao, C.M., Hwang, S.J., and Khosla, C. Engineered intermodular and intramodular polyketide synthase fusions. Chem.
Biol., 4, 667, 1997. 174. Byrne, B., Carmody, M., Gibson, E., Rawlings, B., and Caffrey, P. Biosynthesis of deoxyamphotericins and deoxyamphoteronolides by engineered strains of Streptomyces nodosus. Chem. Biol., 10, 1215, 2003. 175. McDaniel, R., Ebert-Khosla, S., Fu, H., Hopwood, D.A., and Khosla, C. Engineered biosynthesis of novel polyketides: influence of a downstream enzyme on the catalytic specificity of a minimal aromatic polyketide synthase. Proc. Natl. Acad. Sci. USA, 91, 11542, 1994. 176. Miao, V., Coëffet-Le Gal, M.F., Nguyen, K., Brian, P., Penn, J., Whiting, A., Steele, J. et al. Genetic engineering in Streptomyces roseosporus to produce hybrid lipopeptide antibiotics. Chem. Biol., 13, 269, 2006. 177. Carreras, C., Frykman, S., Ou, S., Cadapan, L., Zavala, S., Woo, E., Leaf, T. et al. Saccharopolyspora erythraea-catalyzed bioconversion of 6-deoxyerythronolide B analogs for production of novel erythromycins. J. Biotechnol., 92, 217, 2002
25 Metabolic Engineering of Filamentous Fungi

Mikael Rørdam Andersen, Kanchana Rucksomtawin, and Gerald Hofmann
Technical University of Denmark

Jens Nielsen
Technical University of Denmark and Chalmers University of Technology

25.1 Introduction
25.2 System-Wide Approaches
Genomics • Transcriptomics • Proteomics • Metabolomics • Modeling
25.3 Examples
Citric Acid Production by A. niger • Production of Xylitol in T. reesei • Glucoamylase Production in A. niger and A. oryzae • α-Amylase Production in A. oryzae and A. niger • Cellulase Production in T. reesei • Heterologous Protein Production in A. niger, A. oryzae, and T. reesei • Lovastatin Production in Aspergillus terreus • Penicillin Production in P. chrysogenum
25.4 Perspectives
References
25.1 Introduction

Filamentous fungi have been workhorses in the service of humanity for a long time. Fermented Asian foods such as shoyu (soy sauce), miso, and sake have been produced using koji fungi (domesticated versions of Aspergillus oryzae) in China and Japan for more than a thousand years. However, the modern, well-defined industrial production of specific compounds was spurred in 1894 by the Japanese-American Jokichi Takamine, who used cultures of Aspergillus oryzae to produce a complex mixture of enzymes sold as Takadiastase™ (Hjort, 2003). Since then, the fermentation industry has expanded, and filamentous fungi are now used to produce a wide range of commercially available products, from simple organic acids in Aspergillus niger, through more complex molecules such as statins and β-lactams (in A. terreus and P. chrysogenum, respectively), to proteins in several fungal species. For several types of products, filamentous fungi dominate the market as production hosts (Nevalainen et al., 2005). Table 25.1 summarizes the most used cell factories and the ones discussed in this review.

Table 25.1 Filamentous Fungi Most Used as Cell Factories
Organism | Main Product(s) | Sequenced
A. oryzae | α-Amylase, heterologous enzymes | Yes
A. niger | Citric acid, glucoamylase, heterologous enzymes | Yes
A. terreus | Statins | Yes
P. chrysogenum | β-Lactams | Yes (proprietary of DSM)
T. reesei | Cellulase | Yes

There are several different definitions of metabolic engineering, but here we use the definition of Bailey (1991), who defined metabolic engineering as “the improvement of cellular activities by manipulations of enzymatic, transport, and regulatory functions of the cell with the use of recombinant DNA technology.” The examples section of this chapter presents studies of all these types of manipulation. However, as several of the examples described in this chapter make clear, the robustness of the cellular machinery is a challenge for the successful use of directed genetic modifications. Redundancies and complex regulatory circuits within the cell often counteract the modifications, as discussed by Friboulet and Thomas (2005). Additionally, regulation can occur on the transcriptomic, proteomic, and metabolomic levels, and a more holistic view than the traditional metabolic engineering strategy is, therefore, often required.

For this reason, the use of a system-based approach to metabolic engineering is gaining ground. Traditionally, the only technique capable of making changes on a system-wide level was classical strain improvement through mutagenesis and screening. However, this is an indirect approach, and the resulting changes are difficult to identify and interpret. A system-wide approach may assist in deliberately engineering features while still taking the complexity of the system into account. By combining several different “omic” techniques, the resulting large datasets, and mathematical modeling, one may hope to accumulate a critical mass of knowledge that crystallizes into an understanding of the system and thereby suggests a strategy for the metabolic engineer. Additionally, several studies using system-wide techniques in filamentous fungi have been published in recent years; these either are applied in metabolic engineering strategies or provide techniques that are relevant for such applications. The sudden burst in available fungal genomes (Nierman et al., 2005; Machida et al., 2005; Galagan et al., 2005a; Pel et al., 2006) makes the system-wide approach even more appealing for the future. For these reasons, we have chosen to include sections on genomic, transcriptomic, proteomic, metabolomic, and modeling studies in filamentous fungi in the context of metabolic engineering.
25.2 System-Wide Approaches

25.2.1 Genomics

Genomics is defined as the study of the whole genome sequence of an organism and the use of sequence information to identify genes, and fungal genomics has played an important role in the field (reviewed in Archer and Dyer, 2004; Hofmann et al., 2003). The fungal genomics era began in 1996, when the complete genome sequence of Saccharomyces cerevisiae was published (Goffeau et al., 1996). Following this, several other yeast species were sequenced, but there was a long gap before the release of the first filamentous fungal genome sequence. It was not until February 2001 that the first assembled version of the Neurospora crassa genome was released by the Center for Genome Research at the Whitehead Institute. The complete genome sequence of N. crassa was published in 2003 (Galagan et al., 2003). The first draft genome sequence of an Aspergillus species, Aspergillus nidulans, was made available with restricted public access by Cereon Genomics (Monsanto) in April 2001, but the sequence was not assembled and only low-coverage sequencing had been performed. In March 2003 the Broad Institute released a complete sequence of the A. nidulans genome. Additionally, in December 2001 the Dutch company DSM announced the successful sequencing of the A. niger genome in a press release, but the sequence was kept as proprietary information by the company. At the same time, the genome projects of A. oryzae and other fungi were also under way. Furthermore, the Fungal Genome Initiative (FGI) has drafted a “white paper” proposing 15 fungal genome projects (Fungal Genome Initiative, 2002). The candidates for sequencing, which included human and plant pathogens as well as biotechnologically interesting fungi, were selected to enable evolutionary studies based on comparative genomics.
With the increasing number of fungal genomes, a comparative genomics approach will facilitate the discovery of genes as well as other important functional sequences such as promoter structures.
To date the genomes of a significant number of fungi have been, or are presently being, sequenced. The summary of each genome project can be found in the Genome OnLine Database (GOLD) homepage (http://www.genomesonline.org). So far a few species of filamentous fungi have complete published genome sequences which include N. crassa (Galagan et al., 2003), A. oryzae (Machida et al., 2005), A. fumigatus (Nierman et al., 2005), A. niger (Pel et al., 2006; Joint Genome Institute (JGI), see Table 25.2), and T. reesei (JGI, see Table 25.2), whereas several genome projects are ongoing. In this review, the genomics information of the five most biotechnological interesting fungi, namely A. oryzae, A. niger, A. terreus, P. chrysogenum, and T. reesei, will be discussed in details. Furthermore, the genomics information of a well-known model organism, A. nidulans, will also be described. A summary of genomics information of the mentioned strains is presented in Table 25.2. 25.2.1.1 Aspergillus nidulans A. nidulans is one of the most popular filamentous fungi in genetics and cell biology. It has served as a model organism for genetic research for more than half a century. The main reasons for choosing A. nidulans as a genetic model are the well-establishment of molecular tools and techniques. Furthermore, it is has served as a model organism for Aspergilli, which encompass many industrial and medical important species, (e.g., A. niger, A. oryzae, and A. fumigatus). Unlike these other Aspergilli, A. nidulans can produce both asexual spores (conidia) and sexual spores (ascospores). According to its perfect stage Table 25.2 Summary of Genome Information of A. nidulans, A. oryzae, A. niger, A. terreus, P. chrysogenum and T. reesei.a Species
Strain
Estimated Genome Size
A. nidulans
FGSC A4
30.1 Mb 10,701 ORFs
A. oryzae
RIB40
37 Mb 14,063 ORFs
A. niger
CBS 513.88
35.9 Mb 14,097 ORFs 37.1 Mb 11,200 ORFs 32 Mb
ATCC 1015 ATCC 9029 A. terreus
NIH 2624
35 Mb 10,406 ORFs
ATCC 20542
35 Mb
P. chrysogenum
N/A
34.1 Mb
T. reesei
QM9414
33 Mb 9,997 ORFs
Genomics Status complete (draft released in 2003) (Galagan et al, 2005) http://www.broad.mit.edu/annotation/genome/ aspergillus_nidulans/Home.html complete (Machida et al, 2005) http://www.bio.nite.go.jp/dogan/ MicroTop?GENOME_ID=ao complete (2001)b http://www.dsm.com/dfs/innovation/ genomics complete (2007) http://genome.jgi-psf.org/Aspni5 incompletec http://www.integratedgenomics.com/products.html complete (draft released in 2005) http://www.broad.mit.edu/annotation/genome/ aspergillus_terreus/Home.html incomplete http://www.microbia.com/index.phpd incomplete http://www.genomics.org.cn/bgi/english/2. htma in finishing (draft released in 2005) http://gsphere.lanl.gov/trire1/trire1.home.html
Number of EST 16,849e
21,735f
15,360e
-
107e 44,964e
Information in this table is summarized from the data in Genome OnLine Database (GOLD) website (http://www. genomesonline.org). b The sequencing data are not publicly available. c The sequencing data are available on request. d Data available at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=genomeprj&cmd=Retrieve&dopt=Overview& list_uids=9561 e EST databases are available at http://www.ncbi.nlm.nih.gov/projects/dbEST/. f EST database is available at http://www.nrib.go.jp/ken/EST/db/index.html. N/A: data are not available. a
or teleomorph, A. nidulans should be correctly called Emericella nidulans. The whole genome sequencing of A. nidulans was initiated by Cereon Genomics (Monsanto) in 1998, and a 3× coverage genome sequence was released with restricted public access in April 2001. In March 2003, a fully assembled 13× coverage of the A. nidulans genome, incorporating the 3× coverage provided by Monsanto, was publicly released by the Broad Institute (a research collaboration between the Whitehead Institute, MIT, and Harvard University). In June 2003, a fully automated genome annotation and genome visualization became available, whereas publication of the genome paper followed two years later (Galagan et al., 2005a). The A. nidulans genome is approximately 30.1 megabases (Mb) in size. It consists of eight chromosomes containing 9,541 protein-coding genes in the original automated annotation. In March 2006, an extensive manual revision of the gene predictions originally made by automated annotation was released. This resulted in an update of the number of predicted protein-encoding genes to 10,701. The full genome sequence is available for download via the Broad Institute website (http://www.broad.mit.edu/annotation/genome/aspergillus_nidulans/Home.html).
25.2.1.2 Aspergillus oryzae
A. oryzae has been used in traditional Japanese food fermentation since ancient times. Today, A. oryzae is also used as a cell factory for the production of enzymes and heterologous proteins. The whole genome sequencing of A. oryzae RIB40 was initiated in 1996 at the National Institute of Advanced Industrial Science and Technology (AIST), Japan. The project was extended from 1998 to 2001 and was carried out as a collaboration between the Japanese National Institute of Technology and Evaluation (NITE), AIST, and other members of the A. oryzae Genome Analysis Consortium (Machida, 2002). The complete genome sequence was publicly released in 2005 (http://www.bio.nite.go.jp/dogan/MicroTop?GENOME_ID=ao). The total A.
oryzae genome size is 37 Mb, consisting of eight chromosomes containing 14,063 protein-coding genes (Machida et al., 2005). A comparative analysis of the A. oryzae genome with those of two related Aspergilli, the model organism A. nidulans and the human pathogen A. fumigatus, revealed that the A. oryzae genome is 7–9 Mb larger and contains a higher number of open reading frames (Galagan et al., 2005b). In addition to syntenic blocks, the A. oryzae genome contains specific blocks that are absent in A. nidulans and A. fumigatus. Both syntenic and A. oryzae-specific blocks are scattered in a mosaic manner throughout the A. oryzae genome. Notably, the genes involved in the synthesis of secondary metabolites are enriched in A. oryzae-specific regions. The comparative genome analysis of the three Aspergilli provided valuable information about fungal biology and evolution.
25.2.1.3 Aspergillus niger
A. niger has been used extensively in the production of organic acids (e.g., citric acid) and industrial enzymes (e.g., α-amylase). The genome sequencing of A. niger was initiated by DSM/Gene Alliance in July 2000, and the complete genome sequence was announced by DSM in a press release in December 2001. The 35.9 Mb genome of A. niger contains 14,097 predicted genes distributed over eight chromosomes (Archer and Dyer, 2004). The genome sequence has been proprietary information held by DSM, but is planned to be released in connection with publication of the sequence (Pel et al., 2006). The lack of an available genome sequence has had a major impact on the application of metabolic engineering approaches to strain improvement in A. niger. In 2005, Integrated Genomics, Inc. provided restricted public access to their A. niger genome sequence, which was based on a 4× coverage of the genome of A. niger ATCC 9029. Meanwhile, the JGI, through support from the US Department of Energy (DOE), undertook complete genome sequencing of A.
niger ATCC 1015 with a final coverage of 8×. In November 2005, the 8.9× draft genome assembly from whole genome shotgun sequencing of A. niger ATCC 1015 was publicly released (http://genome.jgi-psf.org/Aspni1/Aspni1.home.html). The total genome is 37.1 Mb in size. The current draft release contains 11,200 predicted protein-coding genes and is being functionally annotated using the JGI annotation pipeline. Manual revision of the gene predictions is in progress.
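The fold-coverage figures quoted for these projects (3×, 8.9×, 13×) can be given a rough quantitative meaning. The sketch below uses the idealized Lander-Waterman (Poisson) model, which is not part of the original text and which assumes reads land uniformly at random, ignoring repeats, cloning bias, and assembly problems. Under that assumption, the expected fraction of bases never sampled at coverage c is exp(-c). The function name and the project labels are illustrative only.

```python
import math

def uncovered_fraction(coverage):
    """Expected fraction of bases never sampled at the given fold coverage,
    assuming uniformly random reads (idealized Poisson model)."""
    return math.exp(-coverage)

# (project label, genome size in Mb, fold coverage) as quoted in the text
PROJECTS = [
    ("A. nidulans, restricted 2001 release", 30.1, 3.0),
    ("A. nidulans, public 2003 release", 30.1, 13.0),
    ("A. niger ATCC 1015, 2005 draft", 37.1, 8.9),
]

for name, size_mb, coverage in PROJECTS:
    # Convert Mb to kb, then scale by the expected unsampled fraction
    missing_kb = size_mb * 1e3 * uncovered_fraction(coverage)
    print(f"{name}: {coverage}x coverage, ~{missing_kb:.2f} kb expected unsampled")
```

Under this idealization, a 3× release leaves roughly 5% of the genome unsampled, whereas the higher-coverage drafts leave a negligible fraction. Real assemblies typically fall short of this because of repeats and sampling bias.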
25.2.1.4 Aspergillus terreus
A. terreus is an important industrial organism for producing cholesterol-lowering agents and itaconic acid (Bonnarme et al., 1995; Manzoni and Rollini, 2002). Besides its biotechnological applications, A. terreus has been reported to be a potential human pathogen causing aspergillosis (Birren et al., 2004). The complete genome sequence of A. terreus will provide useful information from both industrial and medical points of view. Thus, A. terreus was chosen as one of the original 15 target organisms by the Steering Committee of the Fungal Genome Initiative (FGI) for genome sequencing. The complete genome of A. terreus will then be used for comparative analysis with the genomes of other pathogenic Aspergilli, e.g., A. fumigatus, A. clavatus, and A. fischerianus, to improve the understanding of fungal evolution and pathogenesis (Birren et al., 2004). A. terreus has an estimated genome size of 35 Mb distributed over eight chromosomes. In August 2005, a draft genome sequence of a clinical strain of A. terreus (NIH 2624) with 11.05× coverage was publicly released by the Broad Institute. The assembly consists of 267 contigs in 26 supercontigs (scaffolds) that cover 29.2 Mb. The automated annotation of A. terreus was released later, in May 2006. It contained 10,406 protein-coding genes spanning eight chromosomes. The genome sequence is available via the Broad Institute website (http://www.broad.mit.edu/annotation/genome/aspergillus_terreus/Home.html). Furthermore, the genome of an A. terreus industrial strain is in the process of being sequenced by Microbia. The availability of the complete genome sequence will be a useful tool for the identification of novel genes that have expression patterns associated with secondary metabolite production.
25.2.1.5 Penicillium chrysogenum
Although P. chrysogenum has been extensively used for the production of β-lactam antibiotics for more than half a century, sequencing of its genome is still in its infancy.
According to the GOLD website, the genome of P. chrysogenum is in the process of being sequenced by the Beijing Genomics Institute, but the data are not yet available. Additionally, the FGI has listed P. chrysogenum as one of the next candidates for whole genome sequencing. So far, no time frame has been set for initiating the P. chrysogenum genome project. Recently, DSM announced that the company has sequenced this organism, but the genome sequence is kept as proprietary information. The estimated genome size of P. chrysogenum is 34.1 Mb, distributed over four chromosomes. Recently, a physical map of the P. chrysogenum genome was constructed using capillary electrophoresis (CE) technology (Xu et al., 2005). This physical map will be a valuable tool for further analysis of the P. chrysogenum genome.
25.2.1.6 Trichoderma reesei
T. reesei (the asexual form of Hypocrea jecorina) is an industrially important fungus owing to its capacity to produce and secrete large amounts of cellulose- and hemicellulose-hydrolyzing enzymes used in various industrial applications. The whole genome sequencing of T. reesei QM9414 has been carried out as a collaboration between the JGI and the University of Wisconsin under DOE sponsorship. The estimated genome size of T. reesei is 33 Mb, organized in seven chromosomes. The draft assembly currently consists of 170 scaffolds and was publicly released in April 2005. Using the automated JGI annotation pipeline, a total of 9,997 protein-coding genes were predicted. The sequencing data can be downloaded from the JGI website (http://gsphere.lanl.gov/trire1/trire1.home.html).
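The genome sizes and gene counts summarized in Table 25.2 can be compared on a common scale by computing predicted ORFs per megabase. The following is a minimal sketch using only figures quoted in this section; the dictionary layout and the function name `orfs_per_mb` are introduced here for illustration. Differences between strains may reflect differences in annotation pipelines as much as in biology, so the numbers should be read with caution.

```python
# (estimated genome size in Mb, predicted ORFs), copied from Table 25.2
GENOMES = {
    "A. nidulans FGSC A4": (30.1, 10701),
    "A. oryzae RIB40": (37.0, 14063),
    "A. niger CBS 513.88": (35.9, 14097),
    "A. niger ATCC 1015": (37.1, 11200),
    "A. terreus NIH 2624": (35.0, 10406),
    "T. reesei QM9414": (33.0, 9997),
}

def orfs_per_mb(size_mb, orfs):
    """Predicted open reading frames per megabase of estimated genome size."""
    return orfs / size_mb

density = {name: orfs_per_mb(*values) for name, values in GENOMES.items()}

# List the genomes from the highest to the lowest predicted gene density
for name, value in sorted(density.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.1f} ORFs/Mb")
```

P. chrysogenum is omitted because no ORF count is given for it in the table.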
25.2.2 Transcriptomics
Studies on the transcriptomics of filamentous fungi described in the literature have not been directly targeted toward application in metabolic engineering. However, the topics studied are relevant to metabolic engineering, and we will therefore provide an overview of these studies, particularly as the methods are relevant for metabolic engineering. For this reason, the following sections emphasize the methods used and their perspectives for metabolic engineering.
The main groupings of transcriptional studies are the use of microarrays for functional annotation of genes, studies of carbon flow regulatory mechanisms, EST database mining for new targets, and direction of metabolic engineering using association analysis. Additionally, a few new uses reported in 2006 deserve mention.
25.2.2.1 Finding New Targets: Functional Assignment of Genes Using Microarrays
With the increasing number of genomes of filamentous fungi being sequenced and published, the number of potential targets for metabolic engineering is increasing rapidly (Nierman et al., 2005; Machida et al., 2005; Galagan et al., 2005a; Pel et al., 2007). However, the speed of sequencing and the relatively small community for genetics in filamentous fungi compared to yeast make high-quality annotation for all of the metabolic genes hard to come by. Automatic annotation is being improved, as described in Section 25.2.1, but manual curation is still needed for the less studied genes. One approach is to use transcriptomics, as done for A. nidulans by Sims et al. (2004a,b), who presented two case studies with slightly different approaches. One study elucidates the malate dehydrogenase isoenzymes and the other examines 20 secretion-related genes. However, the approaches could be used for any group of genes of interest for which there is some a priori knowledge of the regulation in the examined organism or in another. For the identification and examination of malate dehydrogenases, the first three possible targets were found by using the three S. cerevisiae MDH genes as "in silico probes" and comparing these to the genome using BLASTn. This gave three distinct regions of the A. nidulans genome with significant homology. Clustering was used to examine the homology between the two sets of three nucleotide sequences but, probably owing to introns in the nucleotide sequences, there was no pairwise clustering.
By translating the coding regions of the A. nidulans sequences and using hierarchical clustering, where the three putative malate dehydrogenase sequences were probed with each of the yeast sequences in turn, it was found that the three A. nidulans sequences each paired with a distinct yeast malate dehydrogenase. This functional assignment was confirmed by examining the regulation of the three A. nidulans genes, and it was indeed found that the genes behaved according to the functional prediction in a glucose-shift experiment (Sims et al., 2004b). The other study by Sims et al. (2004a) identified 20 known genes from A. niger related to the secretory pathway. Using ClustalW, homologous sequences were identified in A. nidulans. The putative exons of these sequences were examined on a new array and found to be regulated at the transcriptional level in a manner mirroring their behavior in A. niger. Additionally, hypothetical orthologues of the ER chaperone were found to be up-regulated, which supported the functional assignment.
25.2.2.2 Examining Regulation of Carbon Metabolism
Knowledge of the regulation of the central carbon fluxes is useful for understanding the potential of metabolic engineering to improve the production of a specific product. Microarrays can be a useful tool for this purpose. A study of T. reesei by Chambergo et al. (2002) is based on an initial sequencing of a constructed EST library. The library was obtained from cells grown on glycerol. Of the 4,320 clones examined, 2,835 unique ESTs were sequenced. Clustering revealed the presence of 1,151 genes in the sequences, and cDNA microarrays were constructed from this set of genes. The comparative analysis of glucose-grown and glucose-deprived cells elucidated a number of fundamental differences between the responses of S. cerevisiae and T. reesei. These studies could be used to predict the responses of T. reesei to engineering of its central pathways. Maeda et al.
(2004) performed a similar analysis with A. oryzae, but with a special emphasis on the excreted hydrolytic enzymes. An EST library of >16,000 sequences was prepared from mRNA derived from mycelia grown under several different culture conditions, including glucose-rich and carbon-deprived media. In this case approximately 6,000 nonredundant sequences were
identified (compared to the 1,151 in Chambergo et al. (2002)). Of these sequences, 2,070 highly expressed sequences were used for the construction of cDNA microarrays. By comparing the expression profiles of cultures grown on different media, it was found that the production of hydrolytic enzymes is largest on a complex wheat-bran medium. Examination of the transcription levels of the EMP pathway and the TCA cycle showed that both pathways are active during growth on glucose. The use of EST libraries is a very useful alternative when a complete genome sequence is not available. Another use of EST libraries is to make new functional assignments, analogous to the studies by Sims et al. (2004a) described above. In a recent study, David et al. (2006) used oligonucleotide arrays covering the majority of the genes identified in the A. nidulans genome. Their analysis also involved a reconstruction of the metabolic network from genomic sequences, and in this process they annotated more than 500 metabolic genes. By combining the reconstructed metabolic network with transcription profiling during growth on three different carbon sources (glucose, glycerol, and ethanol), they identified coregulated metabolic genes and mapped the operation of the metabolic network during growth on these three carbon sources.
25.2.2.3 EST Mining
An example of functional annotation using an EST library is presented by Chigira et al. (2002). The work describes the cloning and sequencing of a new chitin synthase from Aspergillus oryzae based on studies of an EST database. Initially, candidates were found in the database and cloned from the available sequence. Afterward, the expression of the isolated gene was compared to the levels of other chitin synthases using microarrays.
25.2.2.4 Association Analysis: Directing the Metabolic Engineering
Perhaps one of the best examples of the use of transcriptomics in the context of metabolic engineering was presented by Askenazi et al.
(2003). The authors developed a method, referred to as association analysis, which has the potential to decode the relationship between gene expression and metabolite levels. The method was applied to lovastatin and (+)-geodin production in A. terreus. Initially, 21 strains with different yields of lovastatin and (+)-geodin were constructed, and their gene expression was examined on microarrays containing approximately 21,000 random genetic elements with an average length of 2 kb. Additionally, metabolite levels were determined using HPLC and mass spectrometry (MS). Association analysis was performed on the combined metabolic and transcriptional data. In what is termed the ordinal approach, the complexity of the data was reduced to up-regulation, down-regulation, or no significant change for the transcription data, and to a decrease or increase in level for the metabolite data. For the transcriptional data, the high variance in signals with low intensity was taken into account. Using Goodman and Kruskal’s gamma algorithm, it was then determined which genetic elements were significantly associated with metabolite production. This algorithm gave a better result than Pearson correlation. The association analysis successfully identified the lovastatin biosynthetic cluster as being associated with lovastatin production. Additionally, it identified several previously unknown genetic elements correlated with geodin production, including the polyketide synthase (PKS). In addition to functional assignment, association analysis also provides information on the regulation of central pathways relative to the production of secondary metabolites. In the context of metabolic engineering, association analysis proves to be a valuable tool. In addition to the discovery of new targets to engineer, several other uses have been proposed and tried. Firstly, the information on regulation of the central pathways gives pointers to which regulations it could be profitable to uncouple.
Secondly, if the analysis is performed with the yield of an unwanted by-product, one could find targets for elimination, and thus potentially improve yields of the wanted compound and make downstream processing less cumbersome. Thirdly, as demonstrated by Askenazi et al. (2003), promoters from genes correlated with metabolite yields can be used for designing efficient reporter systems.
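The ordinal approach described above can be illustrated with a small sketch of Goodman and Kruskal's gamma, defined as (C - D) / (C + D), where C and D are the numbers of concordant and discordant pairs of observations and tied pairs are ignored. This is not the implementation of Askenazi et al. (2003): the significance testing and the handling of low-intensity signals are omitted, and the example calls for a single genetic element across four strains are hypothetical.

```python
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """Goodman and Kruskal's gamma for two equally long ordinal sequences.

    Pairs tied in either variable are ignored. Returns None when every
    pair is tied, because gamma is undefined in that case.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical ordinal calls across four strains:
# expression: -1 = down-regulated, 0 = no significant change, 1 = up-regulated
# metabolite: -1 = decreased level, 1 = increased level
expression_calls = [-1, 0, 1, 1]
metabolite_calls = [-1, -1, 1, 1]

print(goodman_kruskal_gamma(expression_calls, metabolite_calls))
```

Because gamma depends only on the ordering of the calls and not on their magnitudes, it suits data that have been reduced to up, down, or unchanged, which may be one reason it performed better than Pearson correlation in the study.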
25.2.2.5 New Uses in Filamentous Fungi
A few studies using novel methods of transcriptome analysis deserve mention here. One is the work by Castillo et al. (2006), where the transcriptome of P. chrysogenum is examined without the use of a genome sequence or an EST database. Another is the work by Bok et al. (2006), who investigated the production of secondary metabolites. Castillo et al. (2006) investigated carbon repression in P. chrysogenum by the use of suppression subtractive hybridization. The technique allows the experimenter to amplify cDNA of mRNA that is expressed uniquely or preferentially under one condition compared to another. The two conditions compared were glucose medium (carbon-repressing) and lactose medium (carbon-nonrepressing). When comparing the conditions, a number of cDNAs with differential expression were found and sequenced. This gave the sequences of a number of novel genes and showed that the isopenicillin N synthase gene is strongly repressed by glucose. The study by Bok et al. (2006) examines the potential of Aspergillus nidulans to produce novel secondary metabolites, but the method can be applied to any filamentous fungus with a sequenced genome. Secondary metabolites differ from other metabolites in filamentous fungi in that their genes are often arranged in clusters. By examining up- or down-regulation as a function of position in the genome, clusters that are coregulated can be found. Bok et al. (2006) used this to find the cluster for the biosynthesis of a novel metabolite, terrequinone A.
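The idea of locating coregulated clusters by genome position can be sketched as a scan for runs of neighboring genes that change in the same direction. The code below is a simplified illustration and not the procedure of Bok et al. (2006): the threshold, the minimum run length, the gene names, and the log2 ratios are all hypothetical.

```python
def coregulated_runs(genes, threshold=1.0, min_len=3):
    """Find runs of adjacent genes regulated in the same direction.

    genes: list of (name, log2_ratio) tuples in chromosomal order.
    A gene counts as regulated when |log2_ratio| >= threshold.
    Returns a list of runs, each a list of gene names.
    """
    runs, current, current_sign = [], [], 0
    for name, ratio in genes:
        if ratio >= threshold:
            sign = 1
        elif ratio <= -threshold:
            sign = -1
        else:
            sign = 0
        if sign != 0 and sign == current_sign:
            current.append(name)  # extend the current run
        else:
            if len(current) >= min_len:
                runs.append(current)  # close a sufficiently long run
            current = [name] if sign != 0 else []
            current_sign = sign
    if len(current) >= min_len:
        runs.append(current)  # a run may extend to the chromosome end
    return runs

# Hypothetical genes in chromosomal order with log2 expression ratios
example = [
    ("g1", 0.1), ("g2", 2.0), ("g3", 1.5), ("g4", 3.1),
    ("g5", -0.2), ("g6", -1.4), ("g7", -2.0), ("g8", 0.0),
]
print(coregulated_runs(example))
```

In this toy input only the three adjacent up-regulated genes form a candidate cluster; the two adjacent down-regulated genes fall below the assumed minimum run length.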
25.2.3 Proteomics
Proteomics is the simultaneous study of the function of all protein components in an organism. Proteomics is crucial for many aspects of both fundamental research and industrial application of filamentous fungi. However, proteome analysis is much more complicated than transcriptome analysis, as the proteome is highly dynamic and the expression levels of proteins depend on complex regulation systems. Although the field is relatively new, several innovative methods have been developed for comprehensive proteome analysis, such as two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) combined with MS (Bader and Hogue, 2002), the yeast two-hybrid system (Auerbach et al., 2002), protein arrays (Zhu et al., 2001), and isotope-coded affinity tags (Gygi et al., 1999). 2-D PAGE combined with MS is the most popular technique used for separation and quantification in proteomics. However, low-abundance proteins and specific classes of proteins are known to be absent or underrepresented with this method. The yeast two-hybrid system is a powerful tool for studying the protein–protein interactions of a single protein at a time, which is often time-consuming. Recently, a high-throughput yeast two-hybrid system has been developed that allows the investigation of a large number of protein–protein interactions. With the increasing availability of entire genome sequences, protein arrays are rapidly becoming a powerful high-throughput technique to detect proteins, monitor their expression levels, identify their functions, and investigate protein–protein interactions. This technique can complement other techniques, e.g., MS and the yeast two-hybrid system, to identify thousands of protein–protein interactions. The isotope-coded affinity tag is an approach for accurate quantification of individual proteins in complex mixtures.
This method utilizes stable isotope labeling to perform quantitative analysis of paired protein samples, followed by separation and identification of proteins by liquid chromatography and MS. Since proteomics requires information on possible gene products, the increasing availability of fungal genomes will enable a wider use of proteomics for studying filamentous fungi. So far, progress in proteomics has mainly been derived from E. coli and yeast, and only a few proteomics studies have been performed with filamentous fungi. Oda et al. (2003) used proteome analysis to identify several extracellular proteins isolated from solid-state cultures of A. oryzae. At that time only 50 genes encoding extracellular enzymes had been cloned and characterized, and many observed proteins separated by 2-D electrophoresis therefore remained uncharacterized. Using a combination of sequence databases and MS, 41 proteins from extracts of solid-state cultures were identified. Notably, 35 proteins had not previously
been identified (Oda et al., 2003). In addition, proteome analysis was also used to compare the secreted proteins from A. oryzae in solid-state cultures and submerged cultures (Oda et al., 2005). Using 2-D electrophoresis and MALDI-TOF MS, the protein secretion profile was revealed. Based on the secretion pattern, the identified proteins were classified into four groups. Furthermore, data generated from proteome analysis and northern blot analyses suggested that protein secretion is controlled at posttranscriptional and translational levels. The use of proteomics has also been reported for T. reesei. Using 2-D PAGE combined with MS, proteins associated with the cell wall were investigated from T. reesei grown on cellulose-inducing and cellulose-repressing media (Lim et al., 2001). Furthermore, the proteomics approach was also used for comparative analysis of extracellular and intracellular proteins produced by T. reesei (Pakula et al., 2003). The major cellulases from extracellular and intracellular samples of T. reesei can be recognized by 2-D gel electrophoresis. Moreover, a large number of intracellular proteins affected by secretion stress were identified.
25.2.4 Metabolomics
Metabolomics is the study of the metabolome, defined as the whole set of metabolites in the cell. To date, several high-throughput methods for quantitative analysis of the metabolome have been developed, such as NMR, gas chromatography-mass spectrometry (GC-MS), gas chromatography time-of-flight mass spectrometry (GC-TOF), and liquid chromatography-mass spectrometry (LC-MS) (Villas-Boas et al., 2005). Despite advances in analytical methods, this field is still in its infancy in the context of metabolic engineering. So far, only a few research papers have been published, and most of the research is focused on identification and classification. Allen et al. (2003) used a metabolic footprinting approach for high-throughput classification of yeast mutants. By combining MS data of extracellular metabolites with appropriate clustering and machine learning techniques, the differences between several physiological states of wild-type yeast and between yeast single-gene deletion mutants can be distinguished. This approach was proposed to be an effective strategy to classify unknown mutants by genetic defect. Smedsgaard and Nielsen (2005) also demonstrated the use of metabolite profiling for the identification and classification of filamentous fungi and yeast. Using direct infusion MS and HPLC with diode array detection integrated with automated data handling, novel compounds could be discovered. In addition, it is possible to differentiate between closely related fungal species and gain a better insight into yeast and fungal phenotypic behaviors.
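The classification step in metabolic footprinting can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-centroid rule on normalized intensity profiles, which is not the clustering and machine learning procedure used by Allen et al. (2003). The class labels and the three-peak intensity profiles are hypothetical.

```python
import math

def normalize(profile):
    """Scale a peak-intensity profile so that its intensities sum to 1.
    Assumes at least one nonzero intensity."""
    total = sum(profile)
    return [value / total for value in profile]

def centroid(profiles):
    """Element-wise mean of a list of equally long profiles."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]

def distance(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(profile, centroids):
    """Return the label whose centroid is nearest to the normalized profile."""
    normalized = normalize(profile)
    return min(centroids, key=lambda label: distance(normalized, centroids[label]))

# Hypothetical extracellular MS intensities at three m/z peaks
training = {
    "wild type": [[10, 50, 40], [12, 48, 40]],
    "mutant A": [[60, 20, 20], [55, 25, 20]],
}
centroids = {
    label: centroid([normalize(p) for p in profiles])
    for label, profiles in training.items()
}

print(classify([58, 22, 20], centroids))
```

Normalizing each profile to unit total intensity is one simple way to reduce the influence of overall signal strength; real footprinting data would need many more peaks and replicates, and a validated classifier.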
25.2.5 Modeling
Generally, there are two modeling methodologies applicable to metabolic engineering (Nielsen, 1998). One is metabolic flux analysis (MFA), characterized by the calculation of carbon fluxes using a list of biochemical reactions available to the cell (a stoichiometric model) and applying mass balances for each metabolite. Typically, carbon fluxes into and out of the cell are estimated by measuring extracellular concentrations and biomass, and are used to calculate the fluxes in the cell. Stoichiometric models for the central metabolism of A. niger and P. chrysogenum already exist (David et al., 2003; Henriksen et al., 1996, respectively), and several genome-scale models are under construction for Aspergilli (Vongsangnak et al., unpublished; David et al., unpublished; Andersen et al., unpublished). The other approach to modeling is metabolic control analysis (MCA). This approach focuses on quantifying the flexibility of the enzyme activities in a pathway by defining two sets of parameters for each reaction in the network, called elasticity coefficients and flux control coefficients (FCCs). The first expresses the dependence of the reaction rate on metabolite concentrations, and the FCC is an expression of the sensitivity of the flux toward the enzyme activities in the metabolic network. In order to derive the elasticity coefficients and the FCCs, it is necessary either to titrate the enzyme level inside the cells or to set up a kinetic model for the individual enzymatic reactions in the metabolic network under study. The
latter requires extensive information on the kinetics of the individual enzymes as well as solid data on the intracellular metabolite levels, and is therefore laborious to obtain. To circumvent this problem, it has been proposed to use a generalized kinetic expression, the so-called S-system, where the kinetics of each reaction in the network is described as a power-law function of all the metabolites in the network (Voit, 2005).
25.2.5.1 Stoichiometric Modeling
MFA has several useful properties for metabolic engineering. Its main strength is the power to elucidate the distribution of carbon fluxes and to identify key branch points within the metabolic network. This clearly improves target selection, but it may also be used to guide the design of new strains (Patil et al., 2004). One also has the possibility of examining whether the insertion of new enzymes or entire pathways has the potential to improve the yield. Yet another use is the calculation of maximum theoretical yields (Nielsen, 1998). Price et al. (2004) describe a toolbox available for the analysis of stoichiometric models of microbial cells. One example of MFA in a filamentous fungus is the study of penicillin production in P. chrysogenum (Jørgensen et al., 1995; Henriksen et al., 1996). In both studies a model of the central metabolism of P. chrysogenum is applied to examine the growth energetics during fed-batch and continuous cultivation. Jørgensen et al. (1995) applied MFA to determine the maximum theoretical yield of penicillin V during growth on glucose. Additionally, it was found that if a biosynthetic pathway for cysteine is present that utilizes direct sulfhydrylation rather than transsulfuration, the yield of penicillin V could increase by up to 20%. It was also found that the yield could increase by 20% upon addition of α-aminoadipate, valine, and cysteine. A somewhat larger model for A. niger central carbon metabolism was presented by David et al. (2003).
In this case, the model was applied to search for strategies to improve succinate production. When testing all combinations of two gene deletions, it was found that a fruitful strategy might be a deletion of ATP:citrate oxaloacetate-lyase and pyruvate decarboxylase, giving a yield of at least 1.12 mol succinate per mol glucose.
25.2.5.2 Kinetic Modeling
Torres et al. have addressed the kinetic modeling of glucose metabolism in A. niger in several articles (Torres et al., 1993; Torres, 1994a,b; Torres et al., 1998). In Torres et al. (1993) and Torres (1994a) the model is presented and tested for stability. Torres (1994b) analyzes the system and comes to the conclusion that an up-regulation of the hexokinase transporter would be beneficial for citrate production; however, this analysis used lumped expressions for the largest part of glycolysis. In a further analysis using newer algorithms (a variation of MCA), Torres et al. (1998) found the requirements for optimal production of citrate while still holding the metabolite pools approximately constant. However, this solution required the simultaneous modulation of seven or more enzymes. The model was further improved with increased complexity and used in the study by Alvarez-Vasquez et al. (2000). This model represents the main pathways involved in citric acid production, including the mitochondrial reactions, which were not included in the study by Torres et al. (1998). When applied to the optimization of citric acid production, it was found that the maximum potential was not reached. However, a further increase would require the modification of a minimum of 13 enzymes. It was determined that the rate of citrate production could increase 3- to 50-fold. The implications of this result are discussed later (Section 25.3.1). Another dynamic pathway modeling study in A. niger was performed by de Groot et al. (2005), where MCA was employed to examine the potential to grow on L-arabinose.
The goal was to find targets for metabolic engineering so that organic waste may be used for production processes involving A. niger. By examining the FCCs, it was found that the first three enzymes of the pathway, L-arabinose reductase, L-arabitol dehydrogenase, and L-xylulose dehydrogenase, would be the most profitable targets. A technique to insert the L-arabinose pathway into fungal species, and thus expand their production range, has been patented by Londesborough et al. (2002).
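The flux control coefficients used in MCA can be made concrete with a toy calculation. The sketch below assumes a hypothetical two-step pathway, X0 -> S -> product, with simple linear kinetics and made-up parameter values; it is not the model of de Groot et al. (2005) or of Torres et al. Each FCC is estimated as d ln J / d ln e by a finite difference, that is, by perturbing one enzyme level slightly and recording the relative change in the steady-state flux J.

```python
import math

def steady_state_flux(e1, e2, x0=1.0, k1=1.0, k1r=0.5, k2=2.0):
    """Steady-state flux through the assumed pathway X0 -> S -> product.

    Assumed kinetics: v1 = e1 * (k1 * x0 - k1r * s) and v2 = e2 * k2 * s.
    Setting v1 = v2 gives the steady-state level of S analytically.
    """
    s = e1 * k1 * x0 / (e1 * k1r + e2 * k2)
    return e2 * k2 * s

def flux_control_coefficient(enzyme, e1=1.0, e2=1.0, rel_step=1e-6):
    """Estimate d ln J / d ln e for enzyme 1 or 2 by a central difference."""
    def flux(scale):
        if enzyme == 1:
            return steady_state_flux(e1 * scale, e2)
        return steady_state_flux(e1, e2 * scale)
    d_ln_flux = math.log(flux(1 + rel_step)) - math.log(flux(1 - rel_step))
    d_ln_enzyme = math.log(1 + rel_step) - math.log(1 - rel_step)
    return d_ln_flux / d_ln_enzyme

c1 = flux_control_coefficient(1)
c2 = flux_control_coefficient(2)
print(f"FCC enzyme 1: {c1:.3f}, FCC enzyme 2: {c2:.3f}, sum: {c1 + c2:.3f}")
```

With these assumed parameters most of the control lies with the first enzyme, and the two coefficients sum to one, as required by the summation theorem of MCA. Ranking enzymes by FCC in this way is the kind of reasoning used to select the most profitable targets.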
25.3 Examples
25.3.1 Citric Acid Production by A. niger
Citric acid is used for several purposes in households and industry. Originally, production of citric acid was initiated due to a shortage of Italian lemons during World War I, and it remains in use as a substitute for lemons. As an acidifier and flavoring agent (E-number 330), citric acid is added to food, candy, and soft drinks. Since citric acid is a tricarboxylic acid (pKa1−3: 3.15, 4.77, 6.40) and an excellent buffer, it is widely used in household cleaners. The make-up of the molecule makes it capable of chelating metals (from calcium to heavy metals), which is a useful property in detergents (Karaffa et al., 2001; Karaffa and Kubicek, 2003). Additionally, it finds use in the biotechnology and pharmaceutical industries for cleaning stainless steel pipes, where it is considered to be an environmentally friendly alternative to inorganic acids. With its versatility, the annual world market for citric acid reached $1.2 billion in 2002. By far the majority of this market is covered by fermentation with A. niger. Traditionally, strain improvement for the production of citric acid has been done by classical methods, e.g., mutagenesis and screening (Hjort, 2003), leading to concentrations of citrate in the fermentation broth of >100 g/L (Netik et al., 1997). Modern production strains can yield up to 95% citrate from sucrose in controlled fermentations (Karaffa and Kubicek, 2003). This leaves little room for improvement by metabolic engineering, but several investigations have been performed in research strains where the extracellular concentration of citrate is as high as 55 g/L (Ruijter et al., 1998). The early attempts at metabolic engineering of citric acid production are presented in an excellent review by Ruijter et al. (1998), where several general strategies are presented. One such strategy is decreased by-product formation. A.
niger is, as other Aspergilli, characterized by producing a number of organic acids and polyol as an overflow metabolism Witteveen et al. (1993). Among these, gluconic acid and oxalic acid are the main problems. Gluconic acid is produced outside the cell by glucose oxidase, and reduces the carbon flow toward citrate (Witteveen et al., 1992; Ruijter et al., 1998). The production of oxalic acid is a carbon drain as well, but oxalic acid also precipitates with calcium-ions, thus, complicating the purification of citrate in the down-stream processing (oxalic acid is toxic and not wanted in food grade chemicals). A glucose oxidase deficient A. niger strain has been constructed by Witteveen et al. (1990), but whether the citrate yield increased was not measured in this study. The use of a production strain without oxaloacetate hydrolase, and thus, without the capability of producing oxalic acid has been patented by Hjort and Pedersen (2000) based on the characterization of Pedersen et al. (2000b). The elimination of oxaloacetate hydrolase is also desirable in the production of polypeptides as discussed later. Torres et al. (1993) used sensitivity analysis of the citrate system to show that there are at least three control steps in the production of citric acid: (1) hexose uptake and/or phosphorylation, (2) mitochondrial citrate export and (3) export through the cytoplasmic membrane. Initial efforts to understand the hexose-phosphorylation and uptake in the context of citrate production came from studies by Schreferl-Kunar et al. (1989), where mutagenized strains were screened for enhanced growth and increased production of citrate. Mutants with >20% increase in citrate production were found to have faster sucrose uptake and the activities of hexokinase and phosphofructokinase were increased. Torres (1994a,b) and Torres et al. 
(1998) examined the entire glucolysis by kinetic modeling, and in their study it was demonstrated that the level of as many as ten enzymes has to be regulated to maximize the flux through glycolysis, including the hexokinase and the sugar transport system. Additionally, as the author recognizes, this may be difficult to achieve due to the interconnectivity of the regulation of the pathway with other parts of the metabolism. Some of the enzymes suggested modulated by Torres et al. (1998) were overexpressed in a study by Ruijter and Visser (1997), where the levels of phosphofructokinase and pyruvate kinase levels was increased.
25-12
Developing Appropriate Hosts for Metabolic Engineering
However, as Torres et al. (1998) suggest, these modifications alone did not improve citrate production significantly. The mitochondrial part of the citrate system consists at a minimum of the pyruvate importer, pyruvate dehydrogenase, citrate synthase, and the citrate-malate antiporter (see Figure 25.1). Although, to our knowledge, no direct modulation of the pyruvate importer or pyruvate dehydrogenase has been performed, citrate synthase has been overexpressed 11-fold by Ruijter et al. (2000). However, as Ratledge (2000) points out, the mitochondrial citrate synthase already has a large capacity, and overexpression of citrate synthase therefore has no effect on citrate production, as also found by Ruijter et al. (2000). The malate-citrate antiporter has not been studied directly in A. niger, but results from other organisms suggest that export is largely dependent on the cytosolic concentration of malate. The citrate export system has been examined by Netik et al. (1997), who found that export depends on the absence of manganese and that citrate is likely symported with protons. A recent study by Burstaller (2006) found that passive transport would be sufficient at an extracellular pH as low as 3. However, both studies agree that the gradient over the cytoplasmic membrane is the limiting factor.
25.3.1.1 Discussion
Metabolic engineering of the citrate production system in A. niger, as described above, has been attempted by several researchers, and several of these attempts resulted in increased citrate production. However, as the studies by Torres et al. have shown, the enzymes of the glycolytic pathway are difficult to engineer, and a large number of manipulations are necessary. Additionally, it was not possible to unearth any patents for improving the citrate production of fungal strains. One might therefore assume that strain improvement has been performed largely, if not exclusively, by classical methods.
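The pH dependence of citrate export mentioned in Section 25.3.1 is easier to picture with the protonation state of citrate in hand. The sketch below computes the equilibrium fractions of the four citrate species from the pKa values quoted earlier (3.15, 4.77, 6.40). The choice of pH 7 as a stand-in for the cytosol is an assumption made for illustration only, and the calculation says nothing about the transport mechanism itself.

```python
# Equilibrium speciation of a triprotic acid from its pKa values.
# pKa values are those quoted in the text; pH 7.0 is an assumed
# illustrative value for the cytosol, not a measured one.
PKA = (3.15, 4.77, 6.40)
LABELS = ("H3Cit", "H2Cit-", "HCit2-", "Cit3-")


def citrate_species_fractions(ph, pka=PKA):
    """Return fractions of H3Cit, H2Cit-, HCit2-, Cit3- at the given pH."""
    h = 10.0 ** (-ph)
    ka1, ka2, ka3 = (10.0 ** (-p) for p in pka)
    # Each term is proportional to the concentration of one species.
    terms = [h ** 3, h ** 2 * ka1, h * ka1 * ka2, ka1 * ka2 * ka3]
    total = sum(terms)
    return [t / total for t in terms]


for ph in (3.0, 7.0):
    fractions = citrate_species_fractions(ph)
    summary = ", ".join(f"{name}: {f:.2f}" for name, f in zip(LABELS, fractions))
    print(f"pH {ph}: {summary}")
```

Under these assumptions, most citrate at pH 3 is fully protonated or singly charged, whereas at neutral pH the trivalent anion dominates.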
25.3.2 Production of Xylitol in T. reesei
An example of metabolic engineering of T. reesei has been described by Ojamo et al. (2002), who demonstrate a process for the production of xylitol from xylan-containing raw material. The process combines a genetically engineered strain of T. reesei with a yeast strain that efficiently converts xylose to xylitol. The system takes advantage of the inherently high xylanolytic activity of the proteins secreted by T. reesei and enhances it further through the introduction of a recombinant xylanase. Productivity is increased further by deleting xylitol dehydrogenase, which removes the ability of T. reesei to catabolize the desired product and by itself already leads to the accumulation of xylitol (Wang et al., 2005). The process may substantially reduce the cost of xylitol production, which is currently a chemical process with xylose as the raw material.
25.3.3 Glucoamylase Production in A. niger and A. oryzae
Glucoamylase (EC 3.2.1.3) is used in the food industry for converting starch into glucose syrup. This conversion was earlier performed by acid hydrolysis using hydrochloric acid, which gave several by-products and resulted in the formation of salts when the reaction mixture was neutralized and concentrated. The use of glucoamylase made the process less costly and gave a higher-quality product. It is one of the first large-scale enzymatic processes, and even today the starch industry uses the largest amounts of enzymes by volume (Hjort, 2003). The main fungal workhorses for this production are A. niger and A. oryzae. The optimization of glucoamylase production can be divided into two main categories: gene dosage effects and morphological engineering. The first mainly increases the production of the enzyme, whereas the second enhances its secretion.
Figure 25.1 The citrate system. Compartments shown: medium, cytosol, and mitochondrion. Transporters shown: pyruvate transporter, citrate/malate antiporter, and citrate exporter. Enzymes shown: pyruvate carboxylase, pyruvate dehydrogenase, citrate synthase, citrate lyase, and malate dehydrogenase (cytosolic and mitochondrial). Pyruvate enters from glycolysis.
25.3.3.1 Effect of Gene Dosage
A rather crude but effective method of improving glucoamylase production by A. niger was used by Wallis et al. (1999), who inserted 80 extra copies of the gene into the genome. The enhanced strain secreted five- to eight-fold more protein than the parent strain. An interesting observation was that the glycosylation of the produced protein was similar to that of the wild type, owing to an up-regulation of glycosylation enzymes.
25.3.3.2 Morphological Engineering
Some studies of the anatomy of the fungal cell indicate that protein secretion is highly localized to the hyphal tip (Peberdy, 1994). Several studies have therefore attempted to improve protein secretion by changing the morphology of the fungus, and glucoamylase production has often been used as a model system. Dunn-Coleman et al. (2000) report that certain mutations involved in the regulation of hyphal branching can increase the production of glucoamylase. The same group continued this work in the patent of Pollerman and Memmott (2005), which describes the improvement of protein production by introducing a mutation in the hyphal branching regulator HbrA. This mutation induces hyper-branching when the strain is grown at 42°C. Exact values for the increase are not provided. In another study, Akin et al. (2002) present a truncation of the cotA gene that gives a compact morphology. This method is also patented for improving protein secretion.
25.3.4 α-Amylase Production in A. oryzae and A. niger
Another enzyme important for the starch processing industry is α-amylase (EC 3.2.1.1), produced by A. oryzae and A. niger. This enzyme has been the subject of several metabolic engineering attempts. Specifically, hybrids and variants of the molecule have been patented to improve starch processing. Additionally, morphological engineering has been performed to improve α-amylase production.
25.3.4.1 Enzymatic Engineering
Svendsen et al. (2005) patent a number of variants of fungal α-amylase constructed using 3D modeling of the protein. Spatial examination of the structure of the enzyme allowed changes to be engineered in the catalytic site. The method was applied to α-amylase from both A. oryzae and A. niger, and in both cases the variants exhibited increased exo-activity on a starch substrate. Taira et al. (2005) patent a hybrid polypeptide for starch processing. The polypeptide is a three-part construct composed of a catalytic module from α-amylase, a linker sequence, and a carbohydrate-binding domain (CBD), which in most applications is a starch-binding domain. Different combinations of catalytic unit, linker, and CBD from several Aspergilli are presented and screened for activity.
25.3.4.2 Morphological Engineering
Although morphology has previously been improved mostly by mutagenesis, as patented by Shuster and Royer (1997), Müller et al. (2002) present a targeted approach to manipulating the morphology of A. oryzae. Chitin synthase B and chitin synthesis myosin A were disrupted, and the result was increased branching of the mycelium. This did not increase the production of α-amylase, but pelleting of the mycelium was decreased, which reduced the broth viscosity, a desirable property for large-scale fermentations.
25.3.5 Cellulase Production in T. reesei
Fungi from the genus Trichoderma are members of the ascomycetes and are well known for their saprophytic lifestyle, particularly for their ability to degrade partially destroyed wood and other dead plant material (Kubicek-Pranz, 1998). Trichoderma reesei (asexual form of Hypocrea jecorina) in particular, which was isolated from decaying cotton fabrics during World War II, is highly specialized for the degradation of the most abundant polysaccharide on earth, cellulose, which is composed of several thousand glucose residues that are interlinked via beta-1,4 glycosidic bonds and form crystalline microfibrils. This fungus produces a whole range of enzymes to utilize this carbohydrate, which is difficult to degrade. What makes T. reesei interesting for industrial applications is that it also secretes these enzymes into the environment in relatively large quantities. Secreted protein titers of up to 35 g/l (with half of that being the major cellobiohydrolase I) have been reported (Durand et al., 1988), which reflects the outstanding capability of the T. reesei secretion system.
Classical strain improvement of T. reesei to increase the production of cellulases has already demonstrated the enormous potential of this fungus for protein production. Genetic engineering has been applied not to improve the overall cellulase productivity but rather to tailor the spectrum of enzymatic activities to a specific application through overexpressing or reducing certain cellulolytic enzymes (Harkki et al., 1991). Other examples of this approach have been published for specific applications, namely the enhancement of the cellulolytic activities for the biofinishing of cotton (Miettinen-Oinonen et al., 2005) and the improvement of the stonewashed effect of denim fabrics (Miettinen-Oinonen et al., 2001; Miettinen-Oinonen et al., 2005). The induction mechanism of the cellulases of T. reesei has been a very active field of research over the past decade and might hold great promise for further improvement of the production of endogenous as well as recombinant proteins.
The puzzling question is how an insoluble polysaccharide like cellulose triggers the initial expression of its catabolic enzymes. It has been shown that not cellulose itself but rather low molecular weight oligosaccharides are the physiological inducers of the cellulose degradation machinery. Besides cellobiose and sophorose, which can be directly derived from cellulose, lactose is also a very potent inducer of cellulases (Morikawa et al., 1995). This disaccharide consists of one glucose and one galactose moiety connected by a beta-1,4 glycosidic linkage. The outstanding property of lactose compared with other cellulase inducers is that it is water soluble and can therefore be applied in large-scale industrial fermentation processes for the production of homologous as well as heterologous proteins that are under the control of the very strong promoter of the cbh1 gene (encoding cellobiohydrolase I). Furthermore, lactose is a relatively cheap substrate since it is a by-product of cheese production. This triggered the investigation of lactose catabolism, with the goal of obtaining knowledge that could be exploited in metabolic engineering strategies to increase the potency of lactose for the induction of the cellulolytic system. Seiboth et al. investigated the galactose metabolism of T. reesei in detail (Seiboth et al., 2002a,b, 2004, 2005). It was shown that, even though growth on galactose alone does not induce cellulase formation, the formation of galactose-1-phosphate is essential for this process: a disruption of the catabolic pathway after this phosphorylation step does not have a major impact on cellulase production (Seiboth et al., 2002b), whereas the deletion of galactokinase abolishes it completely (Seiboth et al., 2004).
In addition, beta-galactosidase activity is involved in the regulation of cellulolytic genes. It was found that, although overexpression of the major beta-galactosidase-encoding gene bga1 leads to a loss of expression of the genes cbh1 and cbh2, which encode the two cellobiohydrolases, the overall cellulolytic activity is not significantly affected (Seiboth et al., 2005). Even though the mutant strain with the enhanced beta-galactosidase activity showed faster growth and a shorter lag phase, both properties that would be highly appreciated in an enzyme production process, the abolition of cbh1 transcription greatly hampers its applicability, since many processes rely on the promoter of the cbh1 gene.
Yet another interesting aspect of the regulation of the cellulolytic system of T. reesei was discovered recently. By investigating differentially expressed transcripts of a wild-type strain and a cellulase-negative strain of T. reesei with the rapid subtractive hybridization approach, it was found that a gene with similarity to a light-responsive gene is not expressed in the cellulase-negative strain (Schmoll et al., 2004). This led to the investigation of the influence of light in general on cellulase production, and revealed that induction of cellulase transcription is indeed enhanced by light and that the identified gene (named envoy) links the light signaling pathway and the cellulase regulatory system (Schmoll et al., 2005). The same method that led to the discovery of envoy was also used to compare the transcriptomes of T. reesei after cellulase induction with cellulose and with sophorose, in order to identify genes that are expressed exclusively in the presence of cellulose (Schmoll and Kubicek, 2005). It was shown that a set of at least ten genes, comprising both previously characterized and uncharacterized genes, is expressed only after induction with cellulose. This established that the elements involved in the signaling pathways of these two cellulase inducers are not identical, which could lead to new strategies for the improvement of protein production in T. reesei.
25.3.6 Heterologous Protein Production in A. niger, A. oryzae, and T. reesei
The uses of enzymes as biocatalysts in industry are many and diverse. A few examples are food products, cosmetics and care products, the textile industry, and detergents. The world market for industrial enzymes was close to 1.6 billion US dollars in 2001, and in 2005 the sales of enzymes by Novozymes A/S, the world's largest enzyme producer, were approximately one billion US dollars (Novozymes, 2006). Because filamentous fungi have a large potential for the production of secreted proteins and can be cultivated in submerged cultures, enzymes produced by filamentous fungi accounted for as much as half the market in 2001 (Yoder and Lehmbeck, 2004). Numbers for 2005 suggest that this percentage still holds true (Novozymes, 2006).
For the production of recombinant protein, several fungal hosts can be utilized. Owing to its potential to secrete large amounts of protein, T. reesei has been used as an industrial host for the production of a range of recombinant proteins (Nevalainen et al., 2005; Hjort, 2003; Penttilä, 1998). A relatively new host is F. venenatum, known for the production of the single-cell protein Quorn™, used as a nonanimal source of protein for humans. This organism has been adapted for heterologous protein production, as it has the benefit of already having been evaluated as safe for human consumption (Yoder and Lehmbeck, 2004). However, the main workhorses in the field have traditionally been A. oryzae and A. niger, as processes involving these organisms have been classified as generally recognized as safe (GRAS) by the FDA for many years (Hjort, 2003) (for an extensive review of hosts, see Yoder and Lehmbeck (2004)). In this section we will therefore focus on the use of A. niger or A. oryzae for recombinant protein production. A large number of different enzymes have been expressed in these two hosts.
Examples referred to in this section include Candida antarctica lipase B and Scytalidium thermophilum catalase (Conelly and Brody, 2004), Trametes versicolor laccase (Valkonen et al., 2003), bovine preprochymosin (Valkonen et al., 2003), A. niger α-galactosidase (Knap et al., 1994), A. oryzae stereo-selective esterase, and chymosin (Hjort, 2003), but the number of actual products is much larger and the potential is virtually endless. A review by Yoder and Lehmbeck (2004) provides an extensive list of heterologous proteins and hosts. Metabolic engineering of strains for protein production allows for the production of enzymes from potentially any organism.
When optimizing the production of heterologous proteins, several general strategies are used, either alone or in combination. One strategy is the use of genetic engineering to enhance promoter activity, copy number, gene properties, or other features of the gene itself. Another involves more system-wide changes, in which regulators of gene expression or competing enzymes are targeted. The final group of strategies targets metabolic enzymes involved in the biosynthesis of by-products, precursors, and/or cofactors. In several cases, the yield or the production process benefits from such modifications. This can be due to improved physical or chemical properties of the medium, which improve downstream processing, or to the removal of secreted enzymes that compete with the wanted product (by-product formation).
25.3.6.1 Gene Modifications
To ensure high production levels and efficient transport of the protein out of the cell, artificial constructs can be used. The up- and downstream sequences of highly expressed and exported enzymes can be recombined with the heterologous gene. In A. oryzae, the TAKA α-amylase is often used as a dependable system for high-level expression (Yoder and Lehmbeck, 2004; Hjort, 2003), but other sequences are in use as well. For example, in T. reesei the fusion of the desired product with parts of the highly expressed cellobiohydrolase I (CBHI) provides a very successful strategy for improving the yields of secreted heterologous proteins. It was initially presented by Harkki et al. (1989), where a short part of the CBHI coding region was fused to the N-terminus of chymosin. This strategy was subsequently optimized further, e.g., for the secretion of antibody molecules (Nyyssonen et al., 1993; Nyyssonen and Keranen, 1995; Nyyssonen et al., 1995). An overview of the most used sequences in the examined patents is presented in Figure 25.2. Use of these systems has allowed for yields higher than 5 g protein/l in commercial processes (Yoder and Lehmbeck, 2004).
The use of synthetic promoters is found in Yaver and Nham (2004), where variations of the Fusarium venenatum glucoamylase promoter are patented. The method regulates the expression of fungal genes including glucoamylase and lipase. Specifically, the expression of the gene downstream of the promoter was increased six-fold by adding copies of a promoter domain required for high-level expression. The method is patented for several fungal species, but the most preferred embodiments are with Aspergillus or Fusarium host cells.
25.3.6.2 Modified Regulators
The targeting of transcription factors can in some cases be a fruitful way of stimulating the production host to produce more protein. One such example is the overexpression of hacA (Valkonen et al., 2003; patented by Penttilä et al., 2001).
hacA has a role as an inducer of the unfolded protein response (UPR) in fungi. This process allows for the refolding of unfolded protein, which would otherwise have been degraded by the cell. In the production of Trametes versicolor laccase and bovine preprochymosin by A. niger, production was improved 2.9-fold and seven-fold, respectively, by inserting hacA with an A. niger glaA promoter. In T. reesei it was also shown that the UPR is activated if heterologous proteins such as fragments of an IgG antibody are expressed (Saloheimo et al., 1999), and furthermore that the induction of the UPR decreases the production of endogenous secreted proteins by downregulating the expression of the corresponding genes, e.g., cbh1, via distinctive DNA regions in their promoters. Since this promoter is often used to drive the expression of heterologous genes, the engineering of this mechanism represents a successful strategy for improving the production of recombinant proteins, as described in a patent of Pakula et al. (2002).
Figure 25.2 Overview of the different cassettes used for heterologous gene production in A. niger, A. oryzae, and T. reesei. The cassette is built around the gene of interest (YFG). Promoters: A. oryzae TAKA amylase, A. niger glucoamylase, a NA2-tpi hybrid, T. reesei cellobiohydrolase. Leader sequences: A. oryzae TAKA amylase, A. nidulans triose phosphate isomerase. Signal peptides: A. oryzae TAKA amylase, A. niger neutral alpha-amylase, A. niger glucoamylase, R. miehei aspartic proteinase, H. insolens cellulase, H. lanuginosa lipase. Terminators: A. oryzae TAKA amylase, A. niger alpha-glucosidase, A. niger glucoamylase, A. nidulans anthranilate synthase. Polyadenylation signals: A. oryzae TAKA amylase, A. niger alpha-glucosidase, A. niger glucoamylase, A. nidulans anthranilate synthase, F. oxysporum trypsin-like protease. (Based on information from M.Conely and H.Brody. Patent US 2004/0191864, 2004 and C. Hjort. Patent WO 00/10064, 2000.)
A series of patents (Hjort et al., 2000, 2001) concerns the activator of fungal proteases (PrtT). In unmodified cells, a high level of proteases is found in the medium. These degrade the heterologous product and thus lower the yield. PrtT induces these activities and is therefore a target for deletion. Hjort et al. (2000) patent the use of PrtT in the context of producing polypeptides. The deletion and overexpression of prtT are provided as examples. Hjort et al. (2001) patent the modification of transcriptional regulators of proteases to increase the formation of heterologous proteins. The transcriptional regulator prtT is preferentially obtained from A. niger, but disruption of the gene was performed in both A. niger and A. oryzae. None of these patents presents data on the increase in yield, but the technique is used for the production of C. antarctica lipase B in Conelly and Brody (2004). In this case, protease activity was reduced to a fifth of the wild-type level and the product yield increased by 200%.
25.3.6.3 Deletion of Enzymes
A number of enzymes interfere with the production of heterologous protein. One enzyme that deserves special mention is oxaloacetate hydrolase, which produces oxalate and acetate. Oxalate is excreted into the medium.
Oxalate precipitates with calcium, which causes difficulties in the downstream processing of the fermentation product. Additionally, this property causes oxalate to be toxic for humans. Finally, the production of oxalate results in a loss of carbon in the production organism. Hjort and Pedersen (2000) patent the use of fungal strains with a deletion of the oxaloacetate hydrolase gene, particularly for the production of heterologous polypeptides. However, the deletion of oxaloacetate hydrolase apparently decreases glucoamylase production by 50% (Pedersen et al., 2000a).
Other strategies include the deletion of genes coding for by-products, e.g., other secreted enzymes. The patent of Conelly and Brody (2004) provides perhaps the most comprehensive overview of the improvement of recombinant protein production in A. niger. It includes a comprehensive study of the effect of combinations of enzyme deletions on the production of Candida antarctica lipase B and Scytalidium thermophilum catalase. The deleted enzymes and genes include oxaloacetate hydrolase (oah), glucoamylase (gla), acid-stable α-amylase (asa), prtT, and neutral α-amylases (amyA and amyB). Strains with the prtT deletion and combinations of enzyme deletions were used to produce the above-mentioned lipase and catalase. The yield of lipase was improved three-fold over an eight-day fermentation, mainly due to a drop in protease activity to 20% of the wild-type level. The catalase yield increased by 40% in the best of the mutant strains.
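The lipase result from Conelly and Brody (2004) is quoted in this section both as a 200% increase in yield and as a three-fold improvement. The short sketch below shows that the two statements describe the same change, and converts the 40% catalase increase to a fold change for comparison. The function names are introduced here for illustration only.

```python
def percent_increase_to_fold(percent_increase):
    """Convert a percent increase over baseline into a fold change."""
    return 1.0 + percent_increase / 100.0


def fold_to_percent_increase(fold_change):
    """Convert a fold change into a percent increase over baseline."""
    return (fold_change - 1.0) * 100.0


# Lipase B: a 200% increase is the same as a three-fold yield.
print(percent_increase_to_fold(200.0))

# Catalase: a 40% increase expressed as a fold change.
print(round(percent_increase_to_fold(40.0), 2))

# Going the other way: three-fold expressed as a percent increase.
print(fold_to_percent_increase(3.0))
```

Keeping the two conventions apart matters when comparing studies: a "two-fold" improvement is a 100% increase, not a 200% one.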
25.3.7 Lovastatin Production in Aspergillus terreus
A. terreus is a prolific producer of a number of secondary metabolites and mycotoxins including lovastatin (Greenspan and Yudkovitz, 1985), geodin (Askenazi et al., 2003), itaconate (Bonnarme et al., 1995), citrinin (Kiriyama et al., 1977), sulochrin (Couch and Gaucher, 2004), terrecyclic acid (Nakagawa et al., 1982), and butyrolactone I (Rao et al., 2000). Among these compounds, lovastatin is the most abundant secondary metabolite produced by A. terreus, and it has been widely used as a potent cholesterol-lowering agent. The commercial production of lovastatin from A. terreus has been carried out since 1980 (Manzoni and Rollini, 2002). Classical strain improvement, e.g., random mutation and screening strategies, has been extensively used to obtain overproducing strains (Kumar et al., 2000b; Vinci et al., 1991). Furthermore, process development programs such as fermentation optimization, medium design, and scale-up have been applied to increase the productivity and yield of lovastatin (Bradamante et al., 2002; Casas Lopez et al., 2003; Hajjaj et al., 2001; Kumar et al., 2000a; Lai et al., 2002). At present, a new A. terreus mutant strain produces up to 16 g/l lovastatin, whereas only 30–60 mg/l was produced by the initial strain (Metkinen News, February 2006; Metkinen Oy, Finland). This hyperproducer was obtained by combining chemically and physically induced mutagenesis with a process development program.
During strain development, considerable effort was put into investigating and understanding the biosynthetic pathway of lovastatin using nuclear magnetic resonance and MS (Chan et al., 1983; Moore et al., 1985; Yoshizawa et al., 1994). These results revealed that lovastatin is derived from two distinct polyketide chains, a nonaketide and a diketide, joined through an ester linkage. Recently, the PKS gene involved in lovastatin biosynthesis was isolated and characterized (Hendrickson et al., 1999). This PKS gene is required for the biosynthesis of the nonaketide backbone of lovastatin and was renamed LNKS (lovB). Based on the knowledge that genes involved in secondary metabolism are organized in clusters, the entire lovastatin gene cluster was cloned and characterized using lovB as a probe (Kennedy et al., 1999). Nine genes (lovA-lovH and lvrA) were assumed to encode polypeptides required for lovastatin biosynthesis (Auclair et al., 2001; Hutchinson et al., 2002; Kennedy et al., 1999).
Improved knowledge of the lovastatin biosynthetic gene cluster enabled metabolic engineering for improvement of lovastatin productivity. Consequently, several studies have focused on improving lovastatin production by manipulating the regulatory control. Introducing extra copies of the specific regulatory gene lovE into an A. terreus wild-type strain increased the amount of lovastatin at least two-fold.
Some transformants were found to produce five- to seven-fold more lovastatin than the wild-type strain. Recently, a novel Aspergillus nuclear protein (LaeA), which is assumed to act as a global regulator of secondary metabolism in Aspergillus sp., was discovered by complementation of a sterigmatocystin mutant (Bok and Keller, 2004). The role of LaeA was investigated by deletion and overexpression of laeA in A. nidulans, A. terreus, and A. fumigatus. Deletion of laeA resulted in a decrease in sterigmatocystin, penicillin, and lovastatin gene expression. Conversely, an increase in the transcription of genes involved in penicillin and lovastatin biosynthesis, and a subsequent increase in product formation, was observed in laeA overexpression strains. The proposed model for the regulation of secondary metabolite production by LaeA is shown in Figure 25.3.
Another example is the heterologous expression of a truncated lovastatin gene cluster in an organism that does not produce lovastatin, A. nidulans. Two A. nidulans transformants were found to be significantly more resistant to lovastatin than the reference strain. Furthermore, both transformants were able to produce the lovastatin intermediates monacolin J and monacolin L (Hutchinson et al., 2002; Kennedy et al., 1999).
Figure 25.3 The proposed model of secondary metabolite regulation by LaeA. Elements shown: LaeA, AflR, secondary metabolism, hyphal pigments, gliotoxin, lovastatin, penicillin, and sterigmatocystin. (Based on information from J.W. Bok and N.P. Keller. Eukaryot. Cell, 3(2): 527–535, 2004.)
25.3.8 Penicillin Production in P. chrysogenum 25.3.8.1 Kanchana Rueksomtawin The industrial production of penicillin by P. chrysogenum was the first biotechnological process for production of a pharmaceutical and it represents a good example of how an industrial fermentation process can develop as this process has been used for more than 50 years. Despite an increase in microbial resistance toward penicillin, penicillin is still extensively used for treatment of bacterial infections. Thus, several research groups are still focused on improving the penicillin productivity and searching for penicillins with improved properties. Traditionally, classical strain improvement programs, such as the Panlabs Penicillin Strain Improvement Program, have been used to increase yield and productivity of penicillin (reviewed in Demain and Elander (1999); Nielsen (1997); Thykaer and Nielsen (2003)). By applying multiple rounds of mutation and selection, superior production strains were isolated. Additionally, physiological characterization, media design, and process optimizations were performed to improve penicillin productivity in the superior strains. However, the drawback of classical strain improvement is the unknown mechanisms underlying the improve properties of the production strains. Therefore, several efforts were tried to elucidate the biosynthetic pathway leading to penicillins (reviewed in Brakhage (1998); Thykaer and Nielsen (2003)). The penicillin biosynthetic pathway in P. chrysogenum is divided into three steps and the overview of the pathway is shown in Figure 25.4. In the first step,
Figure 25.4 Biosynthetic pathway for penicillins. L-α-aminoadipic acid, L-cysteine, and L-valine are condensed by ACV synthetase (pcbAB) to LLD-ACV, which IPN synthetase (pcbC) converts to IPN; acyl-CoA:isopenicillin N acyltransferase (penDE) then exchanges the L-α-AAA side chain, via 6-APA, for the side chain of phenoxyacetyl-CoA, releasing HSCoA and yielding penicillin V. Abbreviations: LLD-ACV, L-α-aminoadipyl-L-cysteinyl-D-valine; IPN, isopenicillin N; 6-APA, 6-aminopenicillanic acid. (Modified from J. Thykaer and J. Nielsen, Metabol. Eng., 5:56–69, 2003.)
three amino acids, L-α-aminoadipate (L-α-AAA), L-cysteine, and L-valine, are condensed to the tripeptide α-aminoadipyl-cysteinyl-valine (LLD-ACV) by the enzyme ACV synthetase, which is encoded by the pcbAB gene. In the second step, LLD-ACV is cyclized by isopenicillin N synthetase (IPNS), encoded by the pcbC gene, to form isopenicillin N (IPN). The final step is the replacement of the hydrophilic L-α-AAA side chain with the CoA-thioester-activated form of a hydrophobic acyl group, e.g., phenylacetic acid or phenoxyacetic acid, yielding penicillin G and penicillin V, respectively. This step is catalyzed in two conversion steps by a single enzyme, acyl-CoA:IPN acyltransferase (IAT), encoded by penDE.
The availability of detailed information on the penicillin biosynthetic gene cluster has allowed the use of metabolic engineering to improve penicillin production. The first application of metabolic engineering to improve penicillin production was reported in 1991 (Veenstra et al., 1991). Amplification of the copy numbers of pcbC and penDE led to a 40% increase in penicillin production in the single-copy strain Wis 54-1255 of P. chrysogenum. Similarly, Theilgaard et al. (2001) introduced different combinations of the three structural genes (pcbAB, pcbC, and penDE) into Wis 54-1255. Transformants carrying the entire penicillin gene cluster showed the largest increase in specific penicillin productivity, 176% (Theilgaard et al., 2001).
Several research groups have investigated the molecular basis of the improved penicillin production of industrial strains derived from classical strain improvement programs (Christensen et al., 1995; Fierro et al., 1995; Newbert et al., 1997). Amplification of the copy number and increased transcription of the entire penicillin gene cluster were observed. These changes were suggested to be the main driving force behind the improved production in the industrial strains.
However, there seems to be an upper limit to the benefit of copy number, as a linear relationship between copy number and penicillin titer is seen only for cluster copy numbers below 5 (Newbert et al., 1997; Smith et al., 1989). It has been argued that at high gene dosages other factors become limiting, such as the supply of precursors or cofactors required for penicillin biosynthesis (Thykaer and Nielsen, 2003). It is, therefore, of great interest to investigate alternative targets for metabolic engineering of penicillin production.
Casqueiro et al. (1999) reported a new strategy for increasing penicillin production. Since α-aminoadipic acid is the branch-point intermediate of the lysine and penicillin pathways, blocking the lysine biosynthetic branch should increase the pool of α-aminoadipic acid available for penicillin production. The lys2 gene, which encodes the first enzyme of the lysine biosynthetic branch, was disrupted in P. chrysogenum Wis 54-1255. Penicillin levels were twofold higher in the lys2-disrupted mutants than in the reference strain.
A more recent application of metabolic engineering, reported by Kiel et al. (2005), opens up another possibility for improving penicillin production without amplifying the penicillin gene dosage. The experiment was based on the observation that certain high-penicillin-producing strains had considerably high numbers of microbodies. Additionally, microbodies are reported to be essential organelles for the penicillin biosynthetic pathway, as the enzymes required in the final stage of penicillin biosynthesis are localized in the microbody matrix (Muller et al., 1992). Microbody abundance was increased by overexpressing the Pc-pex11 gene, which encodes Pc-Pex11p, a peroxin involved in microbody abundance. A 2- to 2.5-fold increase in penicillin productivity was observed in Pc-Pex11p-overproducing strains.
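The improvements quoted above mix two conventions: percent increases (40%, 176%) and fold changes (twofold, 2- to 2.5-fold). The short Python sketch below puts them on one scale, reading "an increase of p%" as a final level of (1 + p/100) times the reference. That reading and the function names are assumptions made for illustration. The underlying studies measured different quantities in different strains, so the values are placed side by side only to make the arithmetic explicit, not to rank the strategies.

```python
def percent_increase_to_fold(percent: float) -> float:
    """Convert a percent increase over a reference into a fold change."""
    return 1.0 + percent / 100.0


def fold_to_percent_increase(fold: float) -> float:
    """Convert a fold change relative to a reference into a percent increase."""
    return (fold - 1.0) * 100.0


# Values quoted in the text, stored as fold changes (a range has two entries).
reported_folds = {
    "pcbC and penDE amplification (Veenstra et al., 1991)": [percent_increase_to_fold(40)],
    "entire gene cluster (Theilgaard et al., 2001)": [percent_increase_to_fold(176)],
    "lys2 disruption (Casqueiro et al., 1999)": [2.0],
    "Pc-Pex11p overproduction (Kiel et al., 2005)": [2.0, 2.5],
}

for strategy, folds in reported_folds.items():
    as_fold = " to ".join(f"{f:.2f}" for f in folds)
    as_pct = " to ".join(f"{fold_to_percent_increase(f):.0f}" for f in folds)
    print(f"{strategy}: {as_fold}-fold (+{as_pct}%)")
```

On this scale, a 176% increase corresponds to a 2.76-fold level, slightly above the upper end of the 2- to 2.5-fold range reported for Pc-Pex11p overproduction.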
Another successful example of metabolic engineering in P. chrysogenum was the introduction of a hybrid cephalosporin pathway into a penicillin-producing strain (Crawford et al., 1995). Since the biosyntheses of penicillins and cephalosporins have the first two steps in common, introducing an expandase gene into a penicillin-producing strain of P. chrysogenum should lead to expansion of the five-membered ring of penicillin to the six-membered ring of cephalosporin. The expandase gene (cefE) from Streptomyces clavuligerus or the expandase-hydroxylase gene (cefEF) from Acremonium chrysogenum was transformed into an industrial strain of P. chrysogenum. When adipate was fed as the side-chain precursor, the recombinant strains produced the cephalosporin intermediates adipyl-7-aminodeacetoxycephalosporanic acid (ad-7-ADCA) and adipyl-7-aminocephalosporanic acid (ad-7-ACA), respectively (Figure 25.5).
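The logic of this pathway extension can be made concrete by writing the pathway as an ordered list of gene-catalyzed steps. The Python sketch below is only an illustration: the `Step` type and the function names are hypothetical, the ad-6-APA intermediate is taken from the labels of Figure 25.5, and the route from the ring-expanded product to ad-7-ACA is collapsed into a single step because the text reports only the end product for cefEF-carrying strains.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    gene: str
    substrates: Tuple[str, ...]
    product: str


# Product of the penDE step for each side-chain precursor named in the text
# (the side chain enters this step in its CoA-activated form).
ACYLATED = {
    "phenylacetic acid": "penicillin G",
    "phenoxyacetic acid": "penicillin V",
    "adipate": "ad-6-APA",
}

# End product reported for each heterologous gene when adipate is fed.
# Intermediate steps drawn in Figure 25.5 are collapsed here for brevity.
EXPANDED = {"cefE": "ad-7-ADCA", "cefEF": "ad-7-ACA"}


def build_pathway(side_chain: str, extension_gene: str = "") -> List[Step]:
    """Return the ordered steps for a side chain and an optional cef gene."""
    steps = [
        Step("pcbAB", ("L-alpha-AAA", "L-cysteine", "L-valine"), "LLD-ACV"),
        Step("pcbC", ("LLD-ACV",), "IPN"),
        Step("penDE", ("IPN", side_chain), ACYLATED[side_chain]),
    ]
    if extension_gene:
        if side_chain != "adipate":
            raise ValueError("ring expansion is described only with adipate feeding")
        steps.append(Step(extension_gene, (steps[-1].product,), EXPANDED[extension_gene]))
    return steps


def fed_precursors(steps: List[Step]) -> List[str]:
    """Substrates that no earlier step produces, i.e. what must be supplied."""
    made = set()
    needed = []
    for step in steps:
        for substrate in step.substrates:
            if substrate not in made and substrate not in needed:
                needed.append(substrate)
        made.add(step.product)
    return needed


for gene in ("", "cefE", "cefEF"):
    path = build_pathway("adipate", gene)
    print(" -> ".join(s.product for s in path), "| genes:", [s.gene for s in path])
```

The first two steps are identical in every variant, which is the shared portion of the penicillin and cephalosporin routes that the hybrid pathway relies on; only the side chain and the added gene change the end product.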
Figure 25.5 The pathway leading to ad-7-ADCA and ad-7-ACA in P. chrysogenum carrying the S. clavuligerus expandase gene and the A. chrysogenum expandase-hydroxylase gene, respectively. Pathway labels: L-α-aminoadipic acid, L-cysteine, and L-valine → ACV (pcbAB) → IPN (pcbC) → PEN V (penDE, phenoxyacetic acid) or ad-6-APA (penDE, adipate) → ad-7-ADCA (cefE, cefEF) → ad-7-ADAC (cefEF) → ad-7-ACA. (Based on L. Crawford et al., Bio/Technology, 13:58–62, 1995.)
25.4 Perspectives
Filamentous fungi have a very diversified metabolism, and many valuable natural products are produced naturally by this group of organisms. This has been exploited for industrial production of these natural products. Traditionally, high-producing strains were isolated through classical strain improvement programs, but recently metabolic engineering has been used to design strains with improved production of natural products. Furthermore, high-producing strains used for penicillin production have been exploited for the production of more valuable antibiotic precursors like adipyl-7-ADCA through pathway extension. In the future, the potential of filamentous fungi to produce different natural products is likely to be further exploited, both by identifying new products and by using these organisms as hosts for the production of natural products that can serve as scaffolds for pharmaceuticals and active food ingredients. This development is strongly supported by advances in the genomics of filamentous fungi: several genomes have been sequenced, and many genomics technologies are being developed for the study of filamentous fungi.
The metabolic diversity of filamentous fungi is also valuable for the production of bulk chemicals used as solvents, polymer building blocks, fuels, resins, etc. A. niger is currently used for very efficient production of citric acid, and if it becomes possible to redirect the carbon fluxes toward other organic acids like succinic acid or malic acid, it will be possible to
develop highly efficient bioprocesses for the production of these desirable chemical building blocks. Filamentous fungi are particularly well suited for production of organic acids, as they tolerate very low pH; hence it is possible to produce the free acids, which substantially reduces purification costs. A further advantage of filamentous fungi is their diversified metabolism, particularly with respect to using biomass as raw material. Many filamentous fungi can grow on complex substrates like lignocellulose, so it may be possible to develop integrated bioprocesses for the conversion of biomass to commodity chemicals.
Another classical application of filamentous fungi is the production of enzymes. They naturally produce a wide range of industrially important enzymes, and this group of microorganisms still represents a valuable source of new enzymes to tap into. Besides serving as a source of new enzymes, filamentous fungi also represent one of the most efficient groups of cell factories for the production of enzymes, and their role in this business segment is likely to remain, as the enzyme business continues its double-digit annual growth rates.
Given their traditional use for the production of enzymes, it has been natural to explore the use of filamentous fungi for the production of pharmaceutical proteins, in particular proteins that must be produced at high levels. It has generally been challenging to produce human proteins using filamentous fungi as cell factories, owing to proteolytic degradation and difficulties with post-translational processing. However, there has been some progress recently, and there is, therefore, hope for the use of filamentous fungi for the production of, e.g., antibodies that need to be produced in large quantities.
Here it may be desirable to express human post-translational pathways in fungal cell factories, as has been demonstrated to be possible for yeast cells (Hamilton et al., 2006).
From the above it is clear that filamentous fungi have the potential to play a prominent role in the future of biotech production of chemicals and proteins. However, a major challenge is still that it is more cumbersome to perform directed genetic modifications in these organisms than in other cell factories, such as many bacteria and S. cerevisiae. Furthermore, the diversified metabolism is also a challenge, as it may lead to formation of by-products when heterologous pathways are expressed in a fungal cell factory. In connection with protein production, the rather complex protein processing in filamentous fungi may also cause problems when a new protein is to be produced.
All the above-mentioned problems may, however, be solved through the exploitation of functional genomics tools, in particular through further use of these tools in the context of systems biology. We have already seen how genome-scale metabolic models can be derived directly from genomic sequences, and these models represent integrated information on metabolism. Through further expansion of these models to include secondary metabolism, it may be possible to exploit the complex secondary metabolism of filamentous fungi to produce completely new chemical compounds. Furthermore, systems biological analysis of protein synthesis and secretion in filamentous fungi may lead to mapping of key processes that can be engineered for improved production not only of homologous enzymes but also of heterologous proteins. Thus, the current development in genomics and systems biology of filamentous fungi may drive further use of these organisms for industrial production of many new valuable products.
References
Akin, A.R., E.A. Bodie, S. Burrow, N. Dunn-Coleman, G. Turner, and M. Ward. Regulatable growth of filamentous fungi. Patent WO 02/079399, 2002.
Allen, J., H.M. Davey, D. Broadhurst, J.K. Heald, J.J. Rowland, S.G. Oliver, and D.B. Kell. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nat. Biotechnol., 21(6):692–696, 2003.
Alvarez-Vasquez, F., C. González-Alcon, and N.V. Torres. Metabolism of citric acid production by Aspergillus niger: model definition, steady-state analysis and constrained optimization of citric acid production rate. Biotechnol. Bioeng., 70(1):82–108, 2000.
Archer, D.B. and P.S. Dyer. From genomics to post-genomics in Aspergillus. Curr. Opin. Microbiol., 7:499–504, 2004.
Askenazi, M., E.M. Driggers, D.A. Holtzman, T.C. Norman, S. Iverson, D.P. Zimmer, and M.-E. Boers et al. Integrating transcriptional and metabolite profiles to direct the engineering of lovastatin-producing fungal strains. Nat. Biotechnol., 21(2):150–156, 2003.
Auclair, K., J. Kennedy, C.R. Hutchinson, and J.C. Vederas. Conversion of cyclic nonaketides to lovastatin and compactin by a lovC deficient mutant of Aspergillus terreus. Bioorg. Med. Chem. Lett., 11:1527–1531, 2001.
Auerbach, D., S. Thaminy, M.O. Hottiger, and I. Stagljar. The post-genomic era of interactive proteomics: facts and perspectives. Proteomics, 2:611–623, 2002.
Bader, G.D. and C.W. Hogue. Analyzing yeast protein-protein interaction data obtained from different sources. Nat. Biotechnol., 20:991–997, 2002.
Bailey, J.E. Towards a science of metabolic engineering. Science, 252:1668–1674, 1991.
Birren, B., D. Denning, and B. Nierman. Comparative analysis of an emerging fungal pathogen, Aspergillus terreus. White paper, 2004.
Bok, J.W. and N.P. Keller. LaeA, a regulator of secondary metabolism in Aspergillus spp. Eukaryot. Cell, 3(2):527–535, 2004.
Bok, J.W., D. Hoffmeister, L.A. Maggio-Hall, R. Murillo, J.D. Glasner, and N.P. Keller. Genomic mining for Aspergillus natural products. Chem. Biol., 13(1):31–37, 2006.
Bonnarme, P., B. Gillet, A.M. Sepulchre, C. Role, J.C. Beloeil, and C. Ducrocq. Itaconate biosynthesis in Aspergillus terreus. J. Bacteriol., 177:3573–3578, 1995.
Bradamante, S., L. Barenghi, G. Beretta, M. Bonfa, M. Rollini, and M. Manzoni. Production of lovastatin examined by an integrated approach based on chemometrics and DOSY-NMR. Biotechnol. Bioeng., 80:589–593, 2002.
Brakhage, A.A. Molecular regulation of beta-lactam biosynthesis in filamentous fungi. Microbiol. Mol. Biol. Rev., 62:547–585, 1998.
Burstaller, W. Thermodynamic boundary conditions suggest that a passive transport step suffices for citrate excretion in Aspergillus and Penicillium. Microbiology, 152(3):887–893, 2006.
Casas Lopez, J.L., J.A. Sanchez Perez, J.M. Fernandez Sevilla, F.G. Acien Fernandez, E. Molina Grima, and Y. Chisti. Production of lovastatin by Aspergillus terreus: effects of the C:N ratio and the principal nutrients on growth and metabolite production. Enzyme Microb. Technol., 33:270–277, 2003.
Casqueiro, J., S. Gutierrez, O. Banuelos, M.J. Hijarrubia, and J.F. Martin. Gene targeting in Penicillium chrysogenum: disruption of the lys2 gene leads to penicillin overproduction. J. Bacteriol., 181:1181–1188, 1999.
Castillo, N.I., F. Ferro, S. Gutiérrez, and J.F. Martin. Genome-wide analysis of differentially expressed genes from Penicillium chrysogenum grown with a repressing or a non-repressing carbon source. Curr. Genet., 49:85–96, 2006.
Chambergo, F.S., E.D. Bonaccorsi, A.J.S. Ferreira, A.S.P. Ramos, J.R. Ferreira Jr., J. Abrahão-Neto, J.P.S. Farah, and H. El-Dorry. Elucidation of the metabolic fate of glucose in the filamentous fungus Trichoderma reesei using expressed sequence tag (EST) analysis and cDNA microarrays. J. Biol. Chem., 277(16):13983–13988, 2002.
Chan, J.K., R.N. Moore, T.T. Nakashima, and J.C. Vederas. Biosynthesis of mevinolin. Spectral assignment by double-quantum coherence NMR after high carbon-13 incorporation. J. Am. Chem. Soc., 105:3334–3336, 1983.
Chigira, Y., K. Abe, K. Gomi, and T. Nakajima. chsZ, a gene for a novel class of chitin synthase from Aspergillus oryzae. Curr. Genet., 41:261–267, 2002.
Christensen, L.H., C.M. Henriksen, J. Nielsen, J. Villadsen, and M. Egel-Mitani. Continuous cultivation of Penicillium chrysogenum. Growth on glucose and penicillin production. J. Biotechnol., 42:95–107, 1995.
Conelly, M. and H. Brody. Method for producing biological substances in enzyme-deficient mutants of Aspergillus. Patent US 2004/0191864, 2004.
Couch, R.D. and G.M. Gaucher. Rational elimination of Aspergillus terreus sulochrin production. J. Biotechnol., 108:171–178, 2004.
Crawford, L., A.M. Stepan, P.C. McAda, J.A. Rambosek, M.J. Conder, V.A. Vinci, and C.D. Reeves. Production of cephalosporin intermediates by feeding adipic acid to recombinant Penicillium chrysogenum strains expressing ring expansion activity. Bio/Technology, 13:58–62, 1995.
David, H., M. Akesson, and J. Nielsen. Reconstruction of the central carbon metabolism of Aspergillus niger. Eur. J. Biochem., 270:4243–4253, 2003.
David, H., G. Hofmann, A.P. Oliveira, H. Jarmer, and J. Nielsen. Metabolic network driven analysis of genome-wide transcription data from Aspergillus nidulans. Genome Biol., 7(11):R108, 2006.
de Groot, M.J., W. Prathumpai, J. Visser, and G.J. Ruijter. Metabolic control analysis of Aspergillus niger L-arabinose catabolism. Biotechnol. Prog., 21(6):1610–1616, 2005.
Demain, A.L. and R.P. Elander. The beta-lactam antibiotics: past, present, and future. Antonie Van Leeuwenhoek, 75:5–19, 1999.
Dunn-Coleman, N., G. Turner, S.E. Pollerman, and S.D. Memmott. Hyphal growth in fungi. Patent WO 00/56893, 2000.
Durand, H., M. Clanet, and G. Tiraby. Genetic improvement of Trichoderma reesei for large scale cellulase production. Enzyme Microb. Technol., 10(6):341–346, 1988.
Fierro, F., J.L. Barredo, B. Diez, S. Gutierrez, F.J. Fernandez, and J.F. Martin. The penicillin gene cluster is amplified in tandem repeats linked by conserved hexanucleotide sequences. Proc. Natl. Acad. Sci. USA, 92:6200–6204, 1995.
Friboulet, A. and D. Thomas. Systems Biology: an interdisciplinary approach. Biosens. Bioelectron., 20:2404–2407, 2005.
Fungal Genome Initiative. White paper developed by the fungal research community. White paper, 2002. URL: http://www-genome.wi.mit.edu/seq/fgi/
Galagan, J.E., S.E. Calvo, K.A. Borkovich, E.U. Selker, N.D. Read, D. Jaffe, and W. FitzHugh et al. The genome sequence of the filamentous fungus Neurospora crassa. Nature, 422:859–868, 2003.
Galagan, J.E., S.E. Calvo, C. Cuomo, L-J. Ma, J.R. Wortman, S. Batzoglou, and S-I. Lee et al. Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature, 438(7071):1105–1115, 2005a.
Galagan, J.E., M.R. Henn, L-J. Ma, C.A. Cuomo, and B. Birren. Genomics of the fungal kingdom: Insights into eukaryotic biology. Genome Res., 15(12):1620–1631, 2005b.
Goffeau, A., B.G. Barrell, H. Bussey, R.W. Davis, B. Dujon, H. Feldmann, and F. Galibert et al. Life with 6000 genes. Science, 274(5287):546, 563–567, 1996.
Greenspan, M.D. and J.B. Yudkovitz. Mevinolinic acid biosynthesis by Aspergillus terreus and its relationship to fatty acid biosynthesis. J. Bacteriol., 162:704–707, 1985.
Gygi, S.P., B. Rist, S.A. Gerber, F. Turecek, M.H. Gelb, and R. Aebersold. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol., 17:994–999, 1999.
Hajjaj, H., P. Niederberger, and P. Duboc. Lovastatin biosynthesis by Aspergillus terreus in a chemically defined medium. Appl. Environ. Microbiol., 67:2596–2602, 2001.
Hamilton, S.R., R.C. Davidson, N. Sethuraman, J.H. Nett, Y. Jiang, S. Rios, and P. Bobrowicz et al. Humanization of yeast to produce complex terminally sialylated glycoproteins. Science, 313(5792):1441–1443, 2006.
Harkki, A., J. Uusitalo, M. Bailey, M. Penttilä, and J.K.C. Knowles. A novel fungal expression system: secretion of active calf chymosin from the filamentous fungus Trichoderma reesei. Bio/Technology, 7(6):596–600, 603, 1989.
Harkki, A., A. Mantyla, M. Penttilä, S. Muttilainen, R. Buhler, P. Suominen, J. Knowles, and H. Nevalainen. Genetic engineering of Trichoderma to produce strains with novel cellulase profiles. Enzyme Microb. Technol., 13(3):227–233, 1991.
Hendrickson, L., C.R. Davis, C. Roach, D.K. Nguyen, T. Aldrich, P.C. McAda, and C.D. Reeves. Lovastatin biosynthesis in Aspergillus terreus: characterization of blocked mutants, enzyme activities and a multifunctional polyketide synthase gene. Chem. Biol., 6:429–439, 1999.
Henriksen, C.M., L.H. Christensen, J. Nielsen, and J. Villadsen. Growth energetics and metabolic fluxes in continuous cultures of Penicillium chrysogenum. J. Biotechnol., 45(2):149–164, 1996.
Hjort, C. Polypeptides with protein disulfide reducing properties. Patent WO 00/70064, 2000.
Hjort, C. and H. Pedersen. Oxaloacetate hydrolase deficient fungal host cells. Patent WO 00/50576, 2000.
Hjort, C., C.A.M.J.J. van den Hondel, P.J. Punt, and F. Schuren. Fungal transcriptional activator useful in methods for producing polypeptides. Patent WO 00/20596, 2000.
Hjort, C., C.M.J.J. van den Hondel, P.J. Punt, F.H.J. Schuren, and T. Christensen. Fungal transcriptional activator useful in methods for producing polypeptides. Patent WO 01/68864, 2001.
Hjort, C. Production of food additives using filamentous fungi. In Genetically engineered food. Wiley-VCH, Germany, 86–99, 2003.
Hofmann, G., M. McIntyre, and J. Nielsen. Fungal genomics beyond Saccharomyces cerevisiae? Curr. Opin. Biotechnol., 14:226–231, 2003.
Hutchinson, C.R., J. Kennedy, and C. Park. Method of producing antihypercholesterolemic agents. US Patent 6,391,538, 2002.
Jørgensen, H., J. Nielsen, J. Villadsen, and H. Møllgaard. Metabolic flux distributions in Penicillium chrysogenum during fed-batch cultivations. Biotechnol. Bioeng., 46(2):117–131, 1995.
Karaffa, L. and C.P. Kubicek. Aspergillus niger citric acid accumulation: do we understand this well working black box? Appl. Microbiol. Biotechnol., 61(3):189–196, 2003.
Karaffa, L., E. Sándor, E. Fekete, and A. Szentirmai. The biochemistry of citric acid accumulation by Aspergillus niger. Acta Microbiol. Immunol. Hungarica, 48(3–4):429–440, 2001.
Kennedy, J., K. Auclair, S.G. Kendrew, C. Park, J.C. Vederas, and C.R. Hutchinson. Modulation of polyketide synthase activity by accessory proteins during lovastatin biosynthesis. Science, 284(5418):1368–1372, 1999.
Kiel, J.A., I.J. van der Klei, M.A. van den Berg, R.A. Bovenberg, and M. Veenhuis. Overproduction of a single protein, Pc-Pex11p, results in 2-fold enhanced penicillin production by Penicillium chrysogenum. Fungal Genet. Biol., 42:154–164, 2005.
Kiriyama, N., K. Nitta, Y. Sakaguchi, Y. Tagushi, and Y. Yamamoto. Studies on the metabolic products of Aspergillus terreus. III. Metabolites of the strain IFO 8835. Chem. Pharm. Bull. (Tokyo), 25:2593–2601, 1977.
Knap, I.H., C.M. Hjort, T. Halkier, and L.V. Kofod. An alpha-galactosidase enzyme. Patent WO 94/23022, 1994.
Kubicek-Pranz, E.M. Nutrition, cellular structure and basic metabolic pathways in Trichoderma and Gliocladium. In Eds Christian P. Kubicek and Gary E. Harman, Trichoderma and Gliocladium. Taylor & Francis, London, 95–119, 1998.
Kumar, M.S., S.K. Jana, V. Senthil, V. Shashanka, S.V. Kumar, and A.K. Sadhukhan. Repeated fed-batch process for improving lovastatin production. Process Biochem., 36:363–368, 2000a.
Kumar, M.S., P.M. Kumar, H.M. Sarnaik, and A.K. Sadhukhan. A rapid technique for screening of lovastatin-producing strains of Aspergillus terreus by agar plug and Neurospora crassa bioassay. J. Microbiol. Methods, 40:99–104, 2000b.
Lai, L.S., T.H. Tsai, and T.C. Wang. Application of oxygen vectors to Aspergillus terreus cultivation. J. Biosci. Bioeng., 94:453–459, 2002.
Lim, D., P. Hains, B. Walsh, P. Bergquist, and H. Nevalainen. Proteins associated with the cell envelope of Trichoderma reesei: a proteomic approach. Proteomics, 1:899–909, 2001.
Londesborough, J., M. Penttilä, and P. Richard. Engineering fungi for the utilisation of L-arabinose. Patent WO 2002066616, 2002.
Machida, M. Progress of Aspergillus oryzae genomics. Adv. Appl. Microbiol., 51:81–106, 2002.
Machida, M., K. Asai, M. Sano, T. Tanaka, T. Kumagai, G. Terai, and K-I. Kusumoto et al. Genome sequencing and analysis of Aspergillus oryzae. Nature, 438(7071):1157–1161, 2005.
Maeda, H., M. Sano, Y. Maruyama, T. Tanno, T. Akao, Y. Totsuka, and M. Endo et al. Transcriptional analysis of genes for energy catabolism and hydrolytic enzymes in the filamentous fungus Aspergillus oryzae using cDNA microarrays and expressed sequence tags. Appl. Microbiol. Biotechnol., 65(1):74–83, 2004.
Manzoni, M. and M. Rollini. Biosynthesis and biotechnological production of statins by filamentous fungi and application of these cholesterol-lowering drugs. Appl. Microbiol. Biotechnol., 58:555–564, 2002.
Miettinen-Oinonen, A., L. Heikinheimo, J. Buchert, J. Morgado, L. Almeida, P. Ojapalo, and A. Cavaco-Paulo. The role of Trichoderma reesei cellulases in cotton finishing. AATCC Rev., 1(1):33–35, 2001.
Miettinen-Oinonen, A., M. Paloheimo, R. Lantto, and P. Suominen. Enhanced production of cellobiohydrolases in Trichoderma reesei and evaluation of the new preparations in biofinishing of cotton. J. Biotechnol., 116(3):305–317, 2005.
Müller, C., M. McIntyre, K. Hansen, and J. Nielsen. Metabolic engineering of the morphology of Aspergillus oryzae by altering chitin synthesis. Appl. Environ. Microbiol., 68(4):1827–1836, 2002.
Moore, R.N., G. Bigam, J.K. Chan, A.M. Hogg, T.T. Nakashima, and J.C. Vederas. Biosynthesis of the hypocholesterolemic agent mevinolin by Aspergillus terreus. Determination of the origin of carbon, hydrogen, and oxygen atoms by 13C NMR and mass spectrometry. J. Am. Chem. Soc., 107:3694–3701, 1985.
Morikawa, Y., T. Ohashi, O. Mantani, and H. Okada. Cellulase induction by lactose in Trichoderma reesei PC-3-7. Appl. Microbiol. Biotechnol., 44(1–2):106–111, 1995.
Muller, W.H., R.A. Bovenberg, M.H. Groothuis, F. Kattevilder, E.B. Smaal, L.H. van der Voort, and A.J. Verkleij. Involvement of microbodies in penicillin biosynthesis. Biochim. Biophys. Acta, 1116:210–213, 1992.
Nakagawa, M., A. Hirota, H. Sakai, and A. Isogai. Terrecyclic acid A, a new antibiotic from Aspergillus terreus. I. Taxonomy, production, and chemical and biological properties. J. Antibiot. (Tokyo), 35:778–782, 1982.
Netik, A., N.V. Torres, J-M. Riol, and C.P. Kubicek. Uptake and export of citric acid by Aspergillus niger is reciprocally regulated by manganese ions. Biochim. Biophys. Acta, 1326(2):287–294, 1997.
Nevalainen, K.M.H., V.S.J. Te’o, and P.L. Bergquist. Heterologous protein expression in filamentous fungi. Trends Biotechnol., 23(9):468–474, 2005.
Newbert, R.W., B. Barton, P. Greaves, J. Harper, and G. Turner. Analysis of a commercially improved Penicillium chrysogenum strain series: involvement of recombinogenic regions in amplification and deletion of the penicillin biosynthesis gene cluster. J. Ind. Microbiol. Biotechnol., 19:18–27, 1997.
Nielsen, J. Physiological Engineering Aspects of Penicillium chrysogenum. World Scientific Publishing Co. Pte. Ltd, Singapore, 1997.
Nielsen, J. Metabolic engineering: techniques for analysis of targets for genetic manipulations. Biotechnol. Bioeng., 58(2–3):125–132, 1998.
Nierman, W.C., A. Pain, M.J. Anderson, J.R. Wortman, H.S. Kim, J. Arroyo, and M. Berriman et al. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature, 438(7071):1151–1156, 2005.
Novozymes. The Novozymes Report 2005. www.novozymes.com, 2006. Web.
Nyyssonen, E. and S. Keranen. Multiple roles of the cellulase CBHI in enhancing production of fusion antibodies by the filamentous fungus Trichoderma reesei. Curr. Genet., 28(1):71–79, 1995.
Nyyssonen, E., M. Penttilä, A. Harkki, A. Saloheimo, J.K.C. Knowles, and S. Keranen. Efficient production of antibody fragments by the filamentous fungus Trichoderma reesei. Bio/Technology, 11(5):591–595, 1993.
Nyyssonen, E., J. Demolder, R. Contreras, S. Keranen, and M. Penttilä. Protein production by the filamentous fungus Trichoderma reesei: Secretion of active antibody molecules. Can. J. Bot., 73(Suppl. 1, Sect. E–H):S885–S890, 1995.
Oda, K., K. Iwashita, D. Kakizono, H. Iefuji, and O. Akita. Proteome analysis of extracellular proteins from solid-state culture of Aspergillus oryzae. Poster, Genomics, 2003.
Oda, K., D. Kakizono, O. Yamada, H. Iefuji, O. Akita, and K. Iwashita. Proteome analysis of secreted proteins from Aspergillus oryzae in submerged and solid-state culture conditions. Poster, Genomics and Proteomics, 2005.
Ojamo, H., M. Penttilä, H. Heikkila, J. Uusitalo, M. Ilmen, M.-L. Sarkki, and M.-L. Vehkomaki. Method for the production of xylitol. Patent WO 2002006504, 2002.
Pakula, T., M. Saloheimo, J. Uusitalo, A. Huuskonen, A. Watson, D. Jeenes, D. Archer, and M. Penttilä. Improved method for heterologous production of secreted proteins in fungi based on transcription enhancement of secreted protein genes by modified promoter. Patent WO 2002064624, 2002.
Pakula, T., M. Saloheimo, and M. Penttilä. Proteomics studies in Trichoderma reesei. Poster, Genomics, 2003.
Patil, K.R., M. Akesson, and J. Nielsen. Use of genome-scale microbial models for metabolic engineering. Curr. Opin. Biotechnol., 15(1):64–69, 2004.
Peberdy, J.F. Protein secretion in filamentous fungi–trying to understand a highly productive black box. Trends Biotechnol., 12(2):50–57, 1994.
Pedersen, H., B. Christensen, C. Hjort, and J. Nielsen. Construction and characterization of an oxalic acid nonproducing strain of Aspergillus niger. Metab. Eng., 2(1):34–41, 2000a.
Pedersen, H., C. Hjort, and J. Nielsen. Cloning and characterization of oah, the gene encoding oxaloacetate hydrolase in Aspergillus niger. Mol. Gen. Genet., 263(2):281–286, 2000b.
Pel, H.J., J.H. de Winde, D.B. Archer, P.S. Dyer, G. Hofmann, P.J. Schaap, and G. Turner et al. Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat. Biotechnol., 25(2):221–231, 2007.
Penttilä, M. Heterologous protein production in Trichoderma. In Trichoderma and Gliocladium, volume 2. Taylor & Francis, London, 322–333, 1998.
Penttilä, M.E., M. Ward, H. Wang, M.J. Valkonen, and M.L.A. Saloheimo. Increased production of secreted proteins by recombinant eukaryotic cells. Patent US 2001/0034045, 2001.
Pollerman, S. and S. Memmott. Hyphal growth in fungi. Patent WO 2005/080420, 2005.
Price, N.D., J.L. Reed, and B.O. Palsson. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2(11):886–897, 2004.
Rao, K.V., A.K. Sadhukhan, M. Veerender, V. Ravikumar, E.V. Mohan, S.D. Dhanvantri, and M. Sitaramkumar et al. Butyrolactones from Aspergillus terreus. Chem. Pharm. Bull., 48:559–562, 2000.
Ratledge, C. Look before you clone. A comment on ‘Properties of Aspergillus niger citrate synthase and effects of citA overexpression on citric acid production’. FEMS Microbiol. Lett., 189(2):317–319, 2000.
Ruijter, G.J.G. and J. Visser. Carbon repression in Aspergilli. FEMS Microbiol. Lett., 151:103–114, 1997.
Ruijter, G.J.G., H. Panneman, and J. Visser. Metabolic engineering of the glycolytic pathway in Aspergillus niger. Food Technol. Biotechnol., 36(3):185–188, 1998.
Ruijter, G.J.G., H. Panneman, D.-B. Xu, and J. Visser. Properties of Aspergillus niger citrate synthase and effects of citA overexpression on citric acid production. FEMS Microbiol. Lett., 184:35–40, 2000.
Saloheimo, M., M. Lund, and M.E. Penttilä. The protein disulphide isomerase gene of the fungus Trichoderma reesei is induced by endoplasmic reticulum stress and regulated by the carbon source. Mol. Gen. Genet., 262(1):35–45, 1999.
Schmoll, M. and C. Kubicek. ooc1, a unique gene expressed only during growth of Hypocrea jecorina (anamorph: Trichoderma reesei) on cellulose. Curr. Genet., 48(2):126–133, 2005.
Schmoll, M., S. Zeilinger, R.L. Mach, and C.P. Kubicek. Cloning of genes expressed early during cellulase induction in Hypocrea jecorina by a rapid subtraction hybridization approach. Fungal Genet. Biol., 41(9):877–887, 2004.
Schmoll, M., L. Franchi, and C.P. Kubicek. Envoy, a PAS/LOV domain protein of Hypocrea jecorina (anamorph Trichoderma reesei), modulates cellulase gene transcription in response to light. Eukaryot. Cell, 4(12):1998–2007, 2005.
Schreferl-Kunar, G., M. Grotz, M. Roehr, and C.P. Kubicek. Increased citric acid production by mutants of Aspergillus niger with increased glycolytic capability. FEMS Microbiol. Lett., 59(3):297–300, 1989.
Seiboth, B., G. Hofmann, and C.P. Kubicek. Lactose metabolism and cellulase production in Hypocrea jecorina: the gal7 gene, encoding galactose-1-phosphate uridylyltransferase, is essential for growth on galactose but not for cellulase induction. Mol. Genet. Genomics, 267(1):124–132, 2002a.
Seiboth, B., L. Karaffa, E. Sandor, and C. Kubicek. The Hypocrea jecorina gal10 (uridine 5’-diphosphate-glucose 4-epimerase-encoding) gene differs from yeast homologues in structure, genomic organization and expression. Gene, 295(1):143–149, 2002b.
Seiboth, B., L. Hartl, M. Pail, E. Fekete, L. Karaffa, and C.P. Kubicek. The galactokinase of Hypocrea jecorina is essential for cellulase induction by lactose but dispensable for growth on d-galactose. Mol. Microbiol., 51(4):1015–1025, 2004.
Seiboth, B., L. Hartl, N. Salovuori, K. Lanthaler, G.D. Robson, J. Vehmaanpera, M.E. Penttilä, and C.P. Kubicek. Role of the bga1-encoded extracellular beta-galactosidase of Hypocrea jecorina in cellulase induction by lactose. Appl. Environ. Microbiol., 71(2):851–857, 2005.
Shuster, J.R. and J.C. Royer. Obtaining mutant filamentous fungal cells with improved polypeptide production by examination for restricted colonial phenotype and more extensive hyphal branching than parent fungal cells. Patent WO9726330, 1997.
Sims, A.H., M.E. Gent, G.D. Robson, N.S. Dunn-Coleman, and S.G. Oliver. Combining transcriptome data with genomic and cDNA sequence alignments to make confident functional assignments for Aspergillus nidulans genes. Mycol. Res., 108(8):853–857, 2004a.
Sims, A.H., G.D. Robson, D.C. Hoyle, S.G. Oliver, G. Turner, R.A. Prade, H.R. Russell, N.S. Dunn-Coleman, and M.E. Gent. Use of expressed sequence tag analysis and cDNA microarrays of the filamentous fungus Aspergillus nidulans. Fungal Genet. Biol., 41:199–212, 2004b.
Smedsgaard, J. and J. Nielsen. Metabolite profiling of fungi and yeast: from phenotype to metabolome by MS and informatics. J. Exp. Bot., 56:273–286, 2005.
Smith, D.J., J.H. Bull, J. Edwards, and G. Turner. Amplification of the isopenicillin N synthetase gene in a strain of Penicillium chrysogenum producing high levels of penicillin. Mol. Gen. Genet., 216:492–497, 1989.
Svendsen, A., L. Beier, J. Vind, T. Spendler, and M. Jensen. Fungal alpha-amylase variants. Patent WO 2005/019443 A2, 2005.
Taira, R., S. Tkagi, C. Hjort, A. Virksø-Nielsen, E. Allain, and H. Udagawa. Enzymes for starch processing. Patent WO 2005/003311, 2005.
Theilgaard, H., M. van Den Berg, C. Mulder, R. Bovenberg, and J. Nielsen. Quantitative analysis of Penicillium chrysogenum Wis54-1255 transformants overexpressing the penicillin biosynthetic genes. Biotechnol. Bioeng., 72:379–388, 2001.
Thykaer, J. and J. Nielsen. Metabolic engineering of β-lactam production. Metabol. Eng., 5:56–69, 2003.
Torres, N.V. Modeling approach to control of carbohydrate metabolism during citric acid accumulation by Aspergillus niger: I. Model definition and stability of the steady state. Biotechnol. Bioeng., 44(1):104–111, 1994a.
Torres, N.V. Modeling approach to control of carbohydrate metabolism during citric acid accumulation by Aspergillus niger: II. Sensitivity analysis. Biotechnol. Bioeng., 44(1):112–118, 1994b.
25-30
Developing Appropriate Hosts for Metabolic Engineering
Torres, N.V., C. Regalado, A. Sorribas, and M. Cascante. Modeling of cell processes with applications to biotechnology and medicine: Quality assessment of a metabolic model and system analysis of citric acid production by Aspergillus niger in Modern Trends in Biothermokinetics. Plenum Press, New York, 115–124, 1993. Torres, N.V., E.O. Voit, C. Glez-Alcón, and F. Rodríguez. A novel approach to design of overexpression strategy for metabolic engineering. Application to the carbohydrate metabolism in the citric acid producing mould Aspergillus niger. Food Technol. Biotechnol., 36(3):177–184, 1998. Valkonen, M., M. Ward, H. Wang, M. Pentillä, and M. Saloheimo. Improvement of foreign-protein production in Aspergillus niger var. awamori by constitutive induction of the unfolded-protein response. Appl. Environ. Microbiol., 69(12):6979–6986, 2003. Veenstra, A.E., P. van Solingen, R.A. Bovenberg, and L.H. van der Voort. Strain improvement of Penicillium chrysogenum by recombinant DNA techniques. J. Biotechnol., 17:81–90, 1991. Villas-Boas, S.G., S. Mas, M. Akesson, J. Smedsgaard, and J. Nielsen. Mass spectrometry in metabolome analysis. Mass Spectrom. Rev., 24:613–646, 2005. Vinci, V.A., T.D. Hoerner, A.D. Coffman, T.G. Schimmel, R.L. Dabora, A.C. Kirpekar, C.L. Ruby, and R.W. Stieber. Mutants of a lovastatin-hyperproducing Aspergillus terreus deficient in the production of sulochrin. J. Ind. Microbiol. Biotechnol., 8:113–119, 1991. Voit, E.O. Smooth bistable S-systems. Syst. Biol. (Stevenage), 152(4):207–213, 2005. Wallis, G.L.F., R.J. Swift, F.W. Hemming, A.P.J. Trinci, and J.F. Peberdy. Glucoamylase overexpression and secretion in Aspergillus niger: analysis of glycosylation. Biochimica Biophysica Acta, 1472:576–586, 1999. Wang, T.H., Y.H. Zhong, W. Huang, T. Liu, and Y.W. You. Antisense inhibition of xylitol dehydrogenase gene, xdh1 from Trichoderma reesei. Lett. Appl. Microbiol., 40(6):424–429, 2005. Witteveen, C.F.B., P. van de Vondervoort, K. Swart, and J. Visser. 
Glucose oxidase overproducing and negative mutants of Aspergillus niger. Appl. Microbiol. Biotechnol., 33:683–686, 1990. Witteveen, C.F.B., M. Veenhuis, and J. Visser. Localization of glucose oxidase and catalase activities in Aspergillus niger. Appl. Environ. Microbiol., 58 (4):1190–1194, 1992. Witteveen, C.F.B., P.J. Van de Vondervoort, H.C. van den Broeck, A.C. van Engelenburg, L.H. de Graaff, M.H. Hillebrand, P.J. Schaap, and J. Visser. Induction of glucose oxidase, catalase, and lactonase in Aspergillus niger. Curr. Genet., 24(5):408–416, 1993. Xu, Z., M.A. van den Berg, C. Scheuring, L. Covaleda, H. Lu, F.A. Santos, and T. Uhm et al. Genome physical mapping from large-insert clones by fingerprint analysis with capillary electrophoresis: a robust physical map of Penicillium chrysogenum. Nucleic Acids Res., 33:e50, 2005. Yaver, D. and P. Nham. Promoter variants for expressing genes in a fungal cell. Patent WO 2004/046373 A2, 2004. Yoder, W.T. and J. Lehmbeck. Heterologous expression and protein. In Advances in fungal biotechnology for industry, agriculture and medicine. In Jan S. Tkacz and Lene Lange. Kluwer Academic/ Plenum Publishers, 201–219, 2004. Yoshizawa, Y., D.J. Witter, Y. Liu, and J.C. Vederas. Revision of the biosynthetic origin of oxygens in mevinolin (lovastatin), a hypocholesterolemic drug from Aspergillus terreus MF 4845. J. Am. Chem. Soc., 116:2693–2694, 1994. Zhu, H., M. Bilgin, R. Bangham, D. Hall, A. Casamayor, P. Bertone, and N. Lan et al. Global analysis of protein activities using proteome chips. Science, 293:2101–2105, 2001.
26 Metabolic Engineering of Mammalian Cells
Lake-Ee Quek and Lars Keld Nielsen, The University of Queensland
26.1 Introduction
26.2 Use of Mammalian Cell Culture in Recombinant Protein Production
26.3 Conventional Metabolic Engineering
Increasing VCI • Increasing Specific Productivity • Product Quality
26.4 Future Directions: Systems Biology
Component Analysis • Metabolic Flux Analysis • Tools for Advanced Gene Expression
26.5 Summary
References
26.1 Introduction
Mammalian cell culture is necessary for the production of several complex biopharmaceutical products, including complex proteins such as monoclonal antibodies (MAbs), viral vaccines, and gene therapy vectors. Metabolic engineering of mammalian cells involves several unique challenges. Mammalian cells are slow growing, have complex nutritional requirements, and are cumbersome to engineer genetically, which greatly reduces throughput. The key pathways involved in complex biopharmaceutical production (folding, glycosylation, and secretion) remain poorly characterized. Moreover, mammalian cells respond to most stresses by undergoing programmed cell death (apoptosis). Because the system is complex, more sophisticated systems-level assessment and intervention tools are needed to gain physiological insight into the dynamic interactions among a large array of cellular components and to identify molecular mechanisms that confer greater productivity and robustness on cells. This chapter reviews the progress made in metabolic engineering of mammalian cells for the production of recombinant proteins, highlighting the challenges and the potential strategies for engineering more robust and productive host cell lines. The discussion covers conventional metabolic engineering strategies to increase product titer by enhancing viable cell index (VCI) and cell-specific productivity, as well as to rectify quality issues related to product heterogeneity. It then explores systems biology as a principal investigative approach, describing the use of complementary omics datasets (transcriptomics, proteomics, metabolomics, and fluxomics), with an emphasis on the challenges posed by model-based estimation of the fluxome for mammalian systems.
26.2 Use of Mammalian Cell Culture in Recombinant Protein Production
About 60–70% of all recombinant protein biopharmaceuticals are produced in mammalian cells,1 despite their high cost, complex nutritional demands, fragility, and sub-optimal growth and productivity. This reflects the need for complex post-translational modifications to yield biologically active product for complex proteins such as MAbs. Metabolic engineering efforts in mammalian cells strive to improve global and specific cellular properties, such as the central metabolism, apoptosis and cell-cycle regulatory pathways, and protein expression and glycosylation machineries.2–4 Key objectives are to increase productivity and product quality. Conventional fed-batch optimization of feed composition and rate has dramatically increased the product titers achieved over the past two decades.5–7 By keeping cells productive at a higher cell density and for a much longer period, typical MAb titers have improved by a factor of 20–50, while specific productivity has remained largely unchanged. Final titers in excess of 2 g/L have been reported in fed-batch culture of hybridoma and GS-NS0 cell lines due to a significant enhancement of integral viable cell density.8,9 Titers as high as 5–10 g/L have been reported by companies. The emergence of blockbuster MAbs like Rituxan, Enbrel, Remicade, and Herceptin used at high doses has led to product demands in excess of 100 kg/annum and has again created a push for improving productivities through strain and process design. The need for better strains and processes, however, has to be balanced against the massive cost of extending "time to market." Apart from the potential loss of several billion dollars per additional year spent in development, delayed introduction could greatly affect market penetration in a competitive market. 
Therefore, metabolic engineering has been focused on improving a few well-characterized and approved cell lines, such as Chinese hamster ovary (CHO), mouse myeloma (NS0), baby hamster kidney (BHK), human embryonic kidney (HEK-293), and human retina-derived (PER.C6) cells. These cell lines are adapted to serum-free media and grown in suspension culture, have rapid growth and authentic post-translational modification compatible with human use, and have access to a variety of transfection and clonal selection platforms. The procedures for developing a producer cell based on these cell lines are well established, and the associated issues have been reviewed extensively.1,10,11 The development cycle may span three to five years, involving transfection of the host cell, expansion of clones, selection of high producers, adaptation to serum-free and suspension culture, and finally optimization of bioprocessing conditions. There is no room for modifying the inherent properties of the producer cell lines to achieve greater productivity. Metabolic engineering does, however, offer the opportunity of creating a new generation of host cells that carry improved phenotypes, preferably ones that are generically suited for production of different recombinant proteins. In particular, new host cells for MAb production are of great interest since MAbs represent the majority of new biopharmaceuticals and are typically required in large quantities.
26.3 Conventional Metabolic Engineering
Conventional metabolic engineering is employed to increase product titers as well as to enhance product quality. The final product titer is given by the integral over time of the product of viable cell density and specific productivity. Thus, the titer can be improved by increasing the specific productivity, p, and/or by increasing the so-called VCI, i.e., the integral over cultivation time of viable cell density.
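As a minimal sketch of this relation, the following Python fragment estimates VCI from sampled viable cell densities by the trapezoidal rule and multiplies it by a specific productivity that is assumed constant over the culture. The function names, the sample data, and the value of p are hypothetical and chosen only for illustration; they are not taken from the chapter or its references.

```python
from typing import Sequence


def viable_cell_index(times_h: Sequence[float], densities: Sequence[float]) -> float:
    """Trapezoidal estimate of the integral of viable cell density over time.

    times_h   : sampling times in hours (strictly increasing)
    densities : viable cell density at each sampling time, in cells/mL
    Returns the VCI in cell*h/mL.
    """
    if len(times_h) != len(densities) or len(times_h) < 2:
        raise ValueError("need at least two matching samples")
    vci = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of one trapezoid between consecutive samples.
        vci += 0.5 * (densities[i] + densities[i - 1]) * dt
    return vci


def titer_from_vci(specific_productivity: float, vci: float) -> float:
    """Titer = p * VCI; valid only if p is constant over the culture (assumption)."""
    return specific_productivity * vci


# Hypothetical data for illustration only.
times = [0.0, 24.0, 48.0, 72.0]        # h
cells = [2.0e5, 6.0e5, 1.8e6, 1.6e6]   # cells/mL
vci = viable_cell_index(times, cells)  # cell*h/mL
p = 1.0e-9                             # mg/(cell*h), i.e., 1 pg/(cell*h)
titer = titer_from_vci(p, vci)         # mg/mL
```

If p varies with time, the product p·x would have to be integrated point by point instead of factoring p out of the integral.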
26.3.1 Increasing VCI
26.3.1.1 Process Considerations
The most appropriate strategies used to increase VCI depend on the mode of cultivation. Fed-batch and perfusion are the mainstream bioreactor configurations currently used in industry.12
Fed-batch culture is the most common approach adopted by the industry because it enables control of cell specific growth rate and residual nutrient concentrations. Concentrated nutrient-feeding schemes are typically used to maximize viable cell density and product titer. Cell densities in the 10^7 cells/ml range can be attained in fed-batch culture, and these cells can be maintained in a productive state over an extended time frame. Fed-batch cultures typically display a biphasic growth profile consisting of an initial growth phase and a final decline phase. Assuming exponential growth and decline phases, the titer is determined as
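\[ \text{titer} = p \times \left( CI_{\text{growth}} + CI_{\text{decline}} \right) \]
\[ CI_{\text{growth}} = \int x\,dt = \frac{1}{\mu}\left( x_{\text{peak}} - x_{\text{initial}} \right) \]
\[ CI_{\text{decline}} = \int x\,dt = \frac{x_{\text{peak}}}{k_D} \]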
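For readers who want the intermediate step, the two closed forms can be recovered from the stated assumption of exponential growth and exponential decline. The sketch below introduces a growth-phase duration t_g, a symbol that does not appear in the chapter, and assumes the decline phase is integrated until the viable cell density has effectively fallen to zero.

```latex
% Growth phase: x(t) = x_initial * exp(mu t) for 0 <= t <= t_g, with x(t_g) = x_peak
\[
CI_{\text{growth}} = \int_0^{t_g} x_{\text{initial}}\, e^{\mu t}\, dt
  = \frac{x_{\text{initial}}}{\mu}\left( e^{\mu t_g} - 1 \right)
  = \frac{1}{\mu}\left( x_{\text{peak}} - x_{\text{initial}} \right)
\]
% Decline phase: x(t) = x_peak * exp(-k_D (t - t_g)) for t >= t_g
\[
CI_{\text{decline}} = \int_{t_g}^{\infty} x_{\text{peak}}\, e^{-k_D (t - t_g)}\, dt
  = \frac{x_{\text{peak}}}{k_D}
\]
```

If the culture is harvested before the viable cell density approaches zero, the decline integral is truncated and x_peak/k_D is an upper bound rather than the exact value.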
Thus, production during the growth phase is proportional to peak cell density (x_peak) and inversely proportional to cell specific growth rate (µ). While feeding is used to elevate peak cell density, the achievable peak is limited by accumulation of inhibitory by-products such as ammonia and by increased osmolarity due to base addition to neutralize lactic acid. The rate of change, as much as the actual level, appears to define the limit of growth,13 providing an added incentive for slowing down growth. Limiting growth by imposing a glucose- and/or glutamine-limiting feeding regime also provides a means of reducing overflow metabolism, which reduces the accumulation of by-products and thereby enables a higher peak density. For animal cells, however, there is a fine line between limiting growth and inducing apoptosis, i.e., programmed cell death. This translates into a high death rate (k_D) during the decline phase and a significant loss of production. Rather than trying to extend the growth phase through nutrient limitation, it is common to supply nutrients in excess during the exponential phase and then deliberately halt growth through cold-shock or addition of growth inhibitors before the cells enter stress-induced apoptosis. The metabolic engineering strategies that have been pursued are mostly ones that (a) reduce overflow metabolism even in rich medium, (b) suppress apoptosis in response to stress, and/or (c) effect proliferation arrest. In a continuous process, the productivity is given by the steady-state viable cell density (x_ss). Thus, perfusion culture is gaining popularity because high cell density culture can be conducted, achieving productivity equivalent to that of fed-batch culture in a much smaller (up to ten-fold) bioreactor volume. The cell retention device is the pivotal component in perfusion culture, capable of retaining cells to a high density of at least 10^7 cells/ml. 
Acoustic and alternating tangential flow filters are considered the best potential cell retention devices for large-scale production.12 At steady state, the perfusion culture is specified by the following equations.
\[ \mu = k_D \]
\[ x_{ss} = \frac{D\,(s_{in} - s)}{Y_{SX}\,\mu} \approx \frac{D\,s_{in}}{Y_{SX}\,k_D} \]
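A small numerical sketch may help make the steady-state relation concrete. It assumes that the yield coefficient is expressed as substrate consumed per cell formed (an interpretation of the notation, not something the chapter states) and that growth rate equals death rate at steady state. The function name and all parameter values are hypothetical.

```python
def perfusion_steady_state_density(dilution_rate: float,
                                   s_in: float,
                                   s_residual: float,
                                   y_sx: float,
                                   death_rate: float) -> float:
    """Steady-state viable cell density in a perfusion culture with full cell retention.

    dilution_rate : D, perfusion rate per reactor volume (1/day)
    s_in          : substrate concentration in the feed (mmol/L)
    s_residual    : residual substrate concentration in the reactor (mmol/L)
    y_sx          : substrate consumed per cell formed (mmol/cell); assumed definition
    death_rate    : k_D, specific death rate (1/day)
    Returns x_ss in cells/L.
    """
    if death_rate <= 0 or y_sx <= 0:
        raise ValueError("death_rate and y_sx must be positive")
    mu = death_rate  # steady state: growth balances death
    return dilution_rate * (s_in - s_residual) / (y_sx * mu)


# Hypothetical values for illustration only.
x_ss = perfusion_steady_state_density(1.0, 25.0, 0.0, 5.0e-9, 0.2)        # cells/L
x_ss_robust = perfusion_steady_state_density(1.0, 25.0, 0.0, 5.0e-9, 0.1)  # halved k_D
```

With these inputs, halving the death rate doubles the steady-state density, which is the sensitivity that motivates engineering cells for stress tolerance.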
i.e., cell density and product titer are proportional to feed rate and feed concentration, and inversely proportional to death rate. Productivity is limited by the capacity of the cell retention device and, significantly, by the degree to which perfusion induces cell death. For many cell lines, approaching the equilibrium at which growth equals death is more a case of cell death through apoptosis increasing than of growth rate decreasing. The result is a bioreactor producing large amounts of dead cells, and a bleed line is often used to enable operation at a healthy growth rate, in order to reduce problems associated with
accumulation of debris. High death rate and/or bleeding, however, is associated with greatly reduced performance, and selection or engineering of cells with increased stress tolerance is the single most important strategy for improving the performance of perfusion bioreactors.
26.3.1.2 Improved Metabolic Phenotype
Mammalian cells exhibit a wasteful overflow metabolism, in which high levels of glycolysis and glutaminolysis lead to production of the by-products lactate and ammonia. As a result of the overflow metabolism, the TCA cycle tends to operate in a truncated fashion, in which a lower pyruvate-to-citrate flux is decoupled from a higher α-ketoglutarate-to-oxaloacetate flux. Most pyruvate is converted into lactate, while glutamine is partially oxidized by transamination reactions into ammonia and alanine or aspartate. The accumulation of ammonia and lactate causes a decrease in viable cell density because ammonia is cytotoxic, while lactate inhibits cell growth through an osmolarity effect.14,15 A partial knock-out of lactate dehydrogenase significantly reduced lactate formation despite high glucose concentration in the medium and led to a doubling in MAb productivity.16 Glucose consumption was significantly reduced while glutamine consumption remained high, suggesting that increased flux of glucose into the TCA cycle did not occur. This suggests that pyruvate dehydrogenase could be a rigid node, or that the limitation may be related to a bottleneck in the import of NADH from the cytoplasm into mitochondria.17,18 To counteract this bottleneck, BHK, HEK-293, and CHO cells have been engineered to overexpress the yeast pyruvate carboxylase gene, allowing the conversion of pyruvate to oxaloacetate, and subsequently to malate in the cytoplasm, along with the regeneration of NAD+.19–22 Pyruvate carboxylase activity has not been detected in cultured mammalian cells, despite the gene being present.23,24 The engineered cells showed marked improvement of the enzymatic connectivity between glycolysis and the TCA cycle. 
This finding was based on a four-fold reduction in glucose consumption and a two-fold reduction in glutamine consumption, a reduction in lactate accumulation, and an improvement in cell yield on glucose.20 Subsequently, the BHK cell line was transfected to express recombinant human erythropoietin (rhEPO) in order to examine the effects on cell productivity.19 The productivity of rhEPO in BHK cells overexpressing yeast pyruvate carboxylase was significantly enhanced under glucose-limiting conditions when compared to the control cells. The maximum productivity was reached at glucose concentrations above 0.2 g/L, whereas the control only showed the same productivity at glucose concentrations above 4.5 g/L. The results suggest that yeast pyruvate carboxylase can increase the efficiency of pyruvate utilization. In terms of bioprocessing, this provides the opportunity to carry out a glucose-limited feeding regime without compromising cell specific productivity. Apart from glycolysis, the glutaminolysis pathway has been targeted too. Glutamine metabolism was modified to reduce ammonia production by introduction of the glutamine synthetase (GS) gene into NS0 myeloma and hybridoma cell lines.25–27 The engineered cell line is able to grow in glutamine-free media, with glutamate provided as a substitute for glutamine. 
This strategy not only eliminated the accumulation of ammonia due to excessive glutaminolysis, but also increased antibody cell specific productivity by up to 50%.27 The GS system also forms an effective selectable marker for high-producer cells.25 Other metabolic engineering approaches include cloning the complete bacterial threonine synthesis pathway into CHO cells, enabling CHO cells to grow in threonine-free medium,28 as well as transfecting CHO cells with the urea cycle enzymes carbamoyl phosphate synthetase I and ornithine transcarbamoylase in order to convert free ammonia into urea.29 It is interesting to note that the efficiency of mammalian metabolism can be improved significantly without genetic intervention. For a fixed set of inputs, mammalian metabolism can display multiple steady states depending on the path leading to the steady state. This is rarely seen in bacteria or yeast. The mammalian flux distribution shifts to a more efficient state when the cells are pre-adapted to limiting nutrient conditions. The increase in efficiency is characterized by an increase in cell concentration (up to two-fold), a decrease in the yield of lactate on glucose (up to ten-fold), a decrease in glucose, glutamine, and amino acid consumption rates (up to 75%), an increase in anaplerotic flux (oxaloacetate to pyruvate),
and an overall decrease of glycolysis and TCA cycle fluxes.30–33 No clear justification has been provided to explain this behavior, apart from the fact that the steady state is not fixed or discrete,33 and that the efficient metabolism is easily destabilized by a short exposure to high glucose concentration.32 Surprisingly, the large change in metabolism is not associated with large changes in transcriptional or translational activity of individual components as judged by transcriptome and proteome analysis.34 While the existence of multiple steady states indicates that mammalian cell metabolism lacks a clear objective function, steady-state multiplicity has been predicted using a cybernetic modeling approach, in which various enzyme systems compete from a short-term perspective in response to environmental inputs and subsequently generate the steady-state outcome.35
26.3.1.3 Apoptosis
Metabolic engineering of apoptotic pathways is a well-established strategy to generate stress-resistant production cell lines by blocking the apoptotic signal transduction pathways. Apoptosis is a controlled physiological response to environmental stress factors, represented by a cascade of events leading to cell death. Apoptosis is especially problematic in large-scale mammalian cell bioreactors, where localized nutrient or oxygen limitation, or hydrodynamic stress, induces an irreversible commitment to cell death.36 Strategies employed to enhance cell robustness and survivability have been extensively reviewed.37,38 Suppression of apoptosis is critical for sustaining high viable cell density during the entire productive phase of cell culture by counteracting stress signals. Overexpression of the bcl-2 or bcl-xL gene is the predominant strategy used to suppress apoptosis. These genes act by inhibiting the release of pro-apoptotic molecules from the mitochondria. 
CHO, hybridoma, BHK, and NS0 cell lines engineered to overexpress bcl-2 or bcl-xL show an increase in viability and viable cell density during the decline phase of culture or under stressful culture conditions.39–45 The cells are more robust and are able to tolerate harsher perfusion or fixed-bed bioreactor operation.46,47 Cells overexpressing anti-apoptotic genes are also more tolerant of sub-optimal medium conditions. Engineered hybridomas expressing the survival genes bcl-2 and/or bcl-xL showed enhanced adaptability to media lacking glutamine or growth factor (serum),48,49 to hyperosmolar medium,45,50 and to deprivation of any single amino acid during culture.40 Similar results were observed in CHO and NS0 myeloma cells engineered to overexpress bcl-2.44,51 There have been documented cases where engineered cells retained their proliferative capabilities even after exposure to a prolonged decline phase.43,44 This confers the advantage of surviving unexpected nutrient depletion during culture, and the opportunity to adapt a particular cell line to commercial media. The effect of apoptosis suppression on recombinant protein titer is better appreciated in perfusion and fed-batch culture than in batch culture. 
In batch studies, anti-apoptotic strategies had no significant effect on the final MAb titer in NS0 myeloma cells; the cells maintained high viability throughout the culture until exhaustion of key nutrients required for both growth and protein production terminated the culture.51,52 On the other hand, marked increases in the final MAb titer and MAb concentration were reported in fed-batch and perfusion culture, respectively.46,51,52 This is consistent with the earlier discussion of the process equations, where cell death becomes the dominant factor when cells grow slowly as a result of nutrient limitation during steady-state perfusion and in extended fed-batch.53,54 Especially during the decline phase of fed-batch, a high fraction of engineered cells that were retained in G1 phase remained viable and productive, and displayed a marked reduction of nutrient consumption rates.55 Anti-apoptosis strategies provide the means to achieve a biphasic culture profile in extended fed-batch, by tailoring separate nutrient feeding strategies for the proliferative and the productive phases, in order to maximize the final product titers. The level of protection conferred by overexpression of a particular anti-apoptotic gene varies between different cell lines.48,56,57 For example, upon infection by an alphavirus vector, CHO was more protected by Bcl-xL, while BHK was more protected by Bcl-2.42 Similarly, NS0 cells are protected by overexpression of Bax and Bcl-xL but not Bcl-2.56 Overexpression of a single anti-apoptotic gene can delay apoptosis and reduce susceptibility to a stressful environment, but cannot completely block apoptosis. At least three
pathways (the mitochondrial-mediated pathway, the ER stress-induced pathway, and the cell surface-mediated signal transduction pathway) lead to apoptosis,38 and each of these pathways displays redundancy. Other strategies of apoptosis suppression are available as well. CHO cells overexpressing the IGF-I receptor, in combination with IGF-I and transferrin supplementation of the media, have better survival and proliferation upon withdrawal of serum.58 The inhibition of pro-apoptotic proteins, such as bax, bcl-sx, bad, bak, and caspases, has been very effective too.37 A new avenue of apoptosis suppression is to modify the structural properties of anti-apoptotic proteins to increase their potency. A mutant XIAP (X-linked inhibitor of apoptosis), the most potent caspase inhibitor, was found to confer better protection because the mutant protein distributed uniformly in the cytosol, instead of forming aggregates like the wild-type XIAP.59 Similarly, a mutant Bcl-2 lacking the loop domain was found to be more resistant to proteolytic degradation than the wild-type Bcl-2, and thus conferred better protection.60 The major drawback of anti-apoptosis strategies is the accumulation of variant cells during long-term culture. The constitutive overexpression of anti-apoptotic genes can cause genomic instability by preventing p53-induced deletion of cells bearing genetic abnormalities.61 In terms of biopharmaceutical production, this may represent a major setback because it introduces heterogeneity into the producer cell population and possibly affects product quality. The use of an inducible expression system could be a solution, at least for extended fed-batch culture. 
An inducible bcl-xL expression system using a mammalian metallothionein gene promoter was engineered into a B-cell hybridoma, thus restricting bcl-xL expression to the later phase of batch and fed-batch cultures, when ZnSO4 was added.62 There are yet to be effective strategies that can selectively target only apoptosis induced by environmental stress, without interfering with crucial cell cycle checkpoints. Overall, the results of suppressing apoptosis by metabolic engineering have been very positive with regard to increasing the robustness and extending the productive state of various cell lines. However, we are still limited in our understanding of the regulation of apoptotic pathways, of how different stress factors induce specific sets of apoptotic pathways, and of what contributes to the variation in sensitivity of different cell lines toward stress factors. The opportunity to execute multiple points of modification simultaneously using multicistronic expression vectors will aid progress toward the rational design of apoptosis-resistant production cells.
26.3.1.4 Proliferation
The concept of increasing cell specific productivity by reducing cell specific growth rate is well established, pioneered by reports showing an increase in the specific productivity of hybridoma cells whose growth was inhibited by chemical additives.63,64 The motivation for applying proliferation control is the need to modify the physiology of producer cell lines, from rapid and indefinite proliferation to a state that maximizes recombinant protein production. 
Most proliferation arrest strategies involve blocking the G1 to S phase transition using overexpression of the cyclin-dependent kinase inhibitors (CDIs) p21 and p27,65–68 modified tumor suppressor p53175P,65 or interferon regulatory factor (IRF-1).69,70 Other approaches involve manipulating cell culture conditions, such as applying the cytostatic agent thymidine,64 a temperature shift,66,71–74 or a carbon source shift from glucose to galactose.75 All of the above studies have reported an enhancement of cell specific productivity, although the exact physiological basis for this effect is not clear. The increase in cell specific productivity could be attributed to accumulation of cells in the G1 phase.76,77 This is reinforced by the finding that the specific antibody production rate was at its maximum during G1 phase. Alternatively, cell specific productivity may increase because of the increase in cell size, mitochondrial mass and activity, and ribosome biogenesis when cell proliferation is arrested.68 Cell-cycle arrest tends to result in a reduction of cell viability. Cell-cycle progression and apoptosis are linked through molecular overlaps in their signaling pathways.78 The G1 to S checkpoint represents the main molecular intersection of cell cycle and apoptosis, where the decision to undergo apoptosis is made during the late G1 phase in response to DNA damage. Separate studies of arresting cells in the G1 phase using thymidine, temperature shift from 34° to 39°C and estrogen-induced
expression of IRF-1 showed similar results of a rapid decrease in viable cell density upon prolonged exposure to the applied stimulus.64,69,71 This led to the switching technique, in which addition and withdrawal of the stimulus are alternated in order to counteract the toxic effects. Alternatively, overexpression of the survival genes bag-1 and/or bcl-2 was shown to suppress the apoptotic effects of thymidine.57,79 The combination of bag-1 and bcl-2 conferred better protection than a single gene.79 Another interesting point raised is that cells overexpressing bcl-2 exhibited not only an increase in cell viability, but also a reduction of growth rate due to extension of the G1 phase duration, although these effects were only observed during sub-optimal (low) growth conditions.54 Fussenegger and coworkers published a series of papers describing their progressive efforts to enhance a CHO cell line's cell specific productivity of the model product secreted alkaline phosphatase (SEAP) by overexpressing the key cytostatic genes p21, p27, and p53175P (a mutant p53 deficient in apoptotic function). These cytostatic genes inhibit the progress of the cell cycle, particularly at the G1 phase. CHO cells were separately transiently transfected and stably transformed using an inducible di-cistronic expression vector containing one of the cytostatic genes and the SEAP gene.80,81 All three cytostatic genes in the transient study generated a similar effect on proliferation arrest and gave a similar enhancement of SEAP production of up to four-fold.80 In addition, although CHO cells were retained in G1 phase for a prolonged duration, the cells remained viable and recovered their ability to proliferate upon repression of the cytostatic genes. From this, they proposed that higher productivity may be an intrinsic feature of G1 phase arrested CHO cells. 
However, the results obtained with transiently transfected CHO cells did not carry over to the stable transformants.81 Apoptosis appeared to be induced in cells expressing p53175P, while p21 overexpression did not result in proliferation arrest. Only p27 overexpression led to a 10- to 15-fold increase in SEAP production. Later, p21 function in stably transformed CHO cells was restored by using a tri-cistronic expression vector combining SEAP, p21, and the differentiation factor CCAAT/enhancer-binding protein α, which induces p21 expression and stabilizes p21 protein.80 More interestingly, when a tri-cistronic expression vector containing SEAP, p27, and bcl-xL was used to stably transform CHO cells, the cell specific productivity increased by another three-fold compared to the di-cistronic counterpart.80 However, the contribution of bcl-xL is not clear. The authors originally hoped to use bcl-xL to reduce the potential for apoptosis in cytostatic cell culture, but found that the increase in cell specific productivity could not be attributed to the anti-apoptotic function of bcl-xL. This is comparable to another study showing that Bcl-2 had no impact on the viability of an arrested NS0 myeloma cell line that constitutively expresses a chimeric IgG4 antibody and inducibly expresses p21 and Bcl-2.67 Overall, these varying results highlight the complexity of, and the interaction between, the apoptosis and cell-cycle regulation architectures. Proliferation control strategies are not necessarily limited to growth arrest. CHO cells engineered to express the E2F-1 transcription factor or cyclin E showed the capacity to continue proliferating without exogenous growth factors.82,83 These CHO cells were able to proliferate in serum-free and protein-free basal medium.
Both E2F-1 and cyclin E control cell-cycle progression from G1 to S phase; overexpression of these proteins therefore allows cells to progress through the cell-cycle checkpoint, bypassing the requirement for the exogenous mitogenic stimulation found in serum. CHO cells expressing cyclin E displayed phenotypes, such as the loss of anchorage dependence, as if they had been stimulated by basic fibroblast growth factor.82
26.3.2 Increasing Specific Productivity
Conventional host cell engineering and vector design have increased cell specific productivities to as much as 60 pg/cell/day for monoclonal antibody (MAb) production by CHO and NS0 cells, though most current producer lines produce in the 10–25 pg/cell/day range. In comparison, hybridoma cells can theoretically produce up to 170 pg/cell/day (assuming 8000 MAb molecules per cell per second), suggesting there is great potential to increase productivity.176 Possible bottlenecks in recombinant protein production can be found at the transcriptional, translational, and post-translational processing levels. However, most metabolic engineering strategies attempting to
Developing Appropriate Hosts for Metabolic Engineering
overcome these rate-limiting steps individually have had limited success. At the transcriptional level, it was found that, in stable producer cells, recombinant transcript abundance is above the saturating capacity of the downstream translational machinery.84 The key rate-limiting steps appear to be located in the protein processing and secretion pathway. Thus, it is intuitive to expand the protein expression machinery by up-regulating folding and assembly enzymes. A proteomic analysis of a GS-NS0 cell line showed that cell specific antibody productivity is correlated with the abundance of ER-resident proteins (the molecular chaperones BiP, protein disulfide isomerase (PDI), and endoplasmin) and of cytosolic (HS7C) and mitochondrial (HSP60) molecular chaperones.85 These chaperones are known to associate with nascent Ig heavy chain in the ER. Results also suggested that heavy-chain abundance limits productivity, and that a greater light-chain to heavy-chain abundance ratio is required to drive the antibody folding and assembly reactions forward.85–87 However, up-regulation of discrete molecular chaperones appears to be counter-productive. Overexpression of chaperones such as BiP and PDI can cause proteins that associate with these chaperones to be retained in the ER.88,89 Because protein expression involves a series of bottlenecks, targeting individual rate-limiting steps tends to generate only marginal improvement.90 Therefore, there is a need for a “systems” approach to cell engineering that globally expands the folding, assembly, and secretion pathways to achieve a concerted increase in recombinant protein production.
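As a rough check on the theoretical hybridoma figure quoted at the start of this section (about 170 pg/cell/day at 8000 MAb molecules per cell per second), the unit conversion can be sketched in a few lines of Python. The molar mass of roughly 150 kDa for an IgG is an assumption introduced here, not a value given in the text, and the function name is purely illustrative.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
SECONDS_PER_DAY = 24 * 60 * 60


def secretion_rate_pg_per_cell_day(molecules_per_second,
                                   molar_mass_g_per_mol=150_000.0):
    """Convert a secretion rate in molecules/cell/s to pg/cell/day.

    The default molar mass (~150 kDa, typical of IgG) is an assumption.
    """
    molecules_per_day = molecules_per_second * SECONDS_PER_DAY
    grams_per_day = molecules_per_day * molar_mass_g_per_mol / AVOGADRO
    return grams_per_day * 1e12   # 1 g = 1e12 pg


rate = secretion_rate_pg_per_cell_day(8000)
print(f"{rate:.0f} pg/cell/day")
```

Under this assumption the conversion gives roughly 172 pg/cell/day, in line with the quoted figure; a different assumed molar mass scales the result proportionally.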
The differentiation of B-lymphocytes into plasma cells, the professional antibody secretors, can be used as a model to guide metabolic engineering strategies.90 Hundreds of genes are differentially regulated during B-cell differentiation, and different modes of differentiation can be induced for production of different antibodies.91 The core signaling pathway involved in the differentiation process is the unfolded protein response (UPR) pathway.92 Classic activation of the UPR involves three major pathways: IRE1 (inositol-requiring enzyme 1), ATF6 (activating transcription factor 6), and PERK (dsRNA-activated protein kinase-like ER kinase).93 These pathways sense the accumulation of unfolded nascent protein in the ER through the dissociation of BiP from the transducers, as BiP preferentially binds unfolded polypeptide. Activation of the UPR leads to a broad-spectrum up-regulation of ER chaperones and foldases, as well as the attenuation of mRNA translation. Chronic signaling ultimately leads to enhanced ER-associated degradation (ERAD) and growth arrest, finally culminating in apoptosis.
All three pathways are rapidly activated during the artificial induction of ER stress using the glycosylation inhibitor tunicamycin or the ER calcium-pump inhibitor thapsigargin.93 Interestingly, differentiation into plasma cells involves an initial expansion of metabolic capacity and secretory machinery, which precedes the production of IgM.94 The differentiation process invokes specific UPR pathways (IRE1 and ATF6, but not PERK) independently of unfolded protein and by unknown mechanisms, leading to ER and Golgi expansion without attenuation of translation or induction of ERAD.95 Nevertheless, toward the end stage of differentiation, maximum induction of ER chaperones is attained as a consequence of a feedback UPR signal in response to the accumulation of IgM in the ER.94 Since it is not practical to target individual genes to emulate the B-cell differentiation process, metabolic engineering strategies have resorted to targeting key controlling factors involved in the event. Central to both B-cell differentiation and the ER-stress mediated UPR is the transcription factor XBP-1 (X-box binding protein 1). It induces the expression of genes responsible for many secretory pathway components via promoters containing either ER-stress responsive elements (ERSE) or UPR responsive elements (UPRE).96 In addition, XBP-1 induces the physical expansion of the ER, as well as increases in cell size, lysosome content, mitochondrial mass and activity, ribosome numbers, and total protein synthesis. XBP-1 is thus the master regulator that directs the programming of cellular architecture toward protein synthesis, and it represents a key target for metabolic engineering strategies.
CHO-derived cell lines expressing human XBP-1 showed an overall increase in protein synthesis capacity, resulting from the expansion of the ER and Golgi apparatus.97 However, XBP-1 based engineering increased cell specific productivity only to the same extent as low-temperature cultivation or controlled proliferation strategies, and failed to complement
or enhance them. This raised the possibility that these strategies may act through a similar generic mechanism.
26.3.3 Product Quality
Complex biologicals produced by mammalian cells are inherently heterogeneous due to post-translational modifications such as glycosylation, deamidation, and oxidation. Many modifications are undesired, and otherwise suitable candidates often have to be discarded because undesired product variants cannot be removed from the final product. Glycosylation is generally desired but results in a heterogeneous product. The exact glycosylation pattern dictates immunogenicity, product half-life, and in some cases functionality (e.g., MAb effector function). While glycosylation does not appear to limit productivity, glycosylation by mammalian cells remains sub-optimal in terms of quality and has been the focus of much research. Cross-species examination showed that the key variations between human and producer cells lie in the proportions of terminal galactose, core fucose, and bisecting N-acetyl-glucosamine (GlcNAc).98 For example, CHO cells lack the functional α2,6-sialyltransferase (α2,6-ST) and β1,4-GlcNAc transferase III (GnTIII) activities that are observed in human cells, but these defects are well tolerated in humans. However, glycosylation patterns that are foreign to the human body tend to be immunogenic. For example, mouse- and hamster-derived cells produce a small proportion of N-glycolyl-neuraminic acid (NGNA) sialylation in addition to the normal N-acetyl-neuraminic acid (NANA). Similarly, mouse cell lines such as hybridoma and NS0 tend to produce glycoproteins with α-linked rather than exclusively β-linked galactose residues. The basic glycan structure of normal human IgG is a bi-antennary heptasaccharide with terminal GlcNAc residues, but other common glycoforms include the addition of core fucose, terminal galactose, and/or NANA.99 The aim of glycosylation engineering is therefore to expand and control the host cell glycosylation machinery such that a product with better efficacy and safety is produced.
More importantly, an increase in product potency will allow a substantial reduction in drug dosage. Complete terminal sialylation is required to increase the half-life of glycoproteins administered into the body. Human cells produce both α2,6- and α2,3-linked sialic acid, but only the latter is present in CHO cells. Nevertheless, CHO cells have been engineered to express α2,6-ST, with the α2,6-ST preferentially sialylating the α1,3-mannose branch of the bi-antennary complex.100,101 Although there is no clear indication of how the ratio of sialylation linkages affects product efficacy, the ratio can be manipulated by the controlled expression of the two types of ST. The proportion of α2,3- to α2,6-linked sialic acid was found to be determined by the activity ratio of α2,3- to α2,6-ST.102 Thus, in theory, the terminal sialylation of a recombinant glycoprotein can be altered to reflect the natural composition. The addition of a bisecting GlcNAc increases the efficiency of antibody-dependent cellular cytotoxicity (ADCC) by increasing the affinity of the Fc region of IgG for Fc-γ receptor III.89 CHO cells expressing β1,4-GlcNAc transferase III (GnTIII) produced IgG antibody bearing bisecting GlcNAc, which exhibited a 15- to 20-fold improvement in ADCC.103,104 The same has been achieved in CHO cells expressing recombinant interferon-β.105 However, one study has suggested that inducible expression of the glycosyltransferase is preferable to constitutive overexpression, because a high level of gene expression can cause growth inhibition.106 Similarly, gene knock-out or silencing has been used successfully to modify glycosylation in mammalian cells.
Earlier studies demonstrated that fucosyltransferase activity can be silenced using anti-sense technology in order to modulate cell-cell adhesion.107 siRNA targeting of α1,6-fucosyltransferase (FUT8) in CHO cells was performed successfully, yielding a 60% defucosylated antibody fraction and an over 100-fold increase in ADCC.108 Subsequently, a stable CHO cell line with a FUT8 knockout was generated.109,110 The increase in ADCC is due to an increase in affinity for FcγRIII.111 Defucosylation was shown to enhance ADCC more potently than the addition of a bisecting GlcNAc residue.112 Rituxan and Herceptin, IgG1 antibodies currently on the market for the treatment of non-Hodgkin’s lymphoma and metastatic breast cancer, respectively, are conventionally fucosylated products whose ADCC-dependent activity is expected to benefit from defucosylation.
The glycosylation process consists of a complex series of enzyme-catalyzed reactions and transport steps across the Golgi cisternae. Glycosylation is dictated by many parameters, including the local peptide and oligosaccharide sequence, the activity of the host cell glycosylation machinery, and precursor (nucleotide-sugar) availability.113,114 Glycosylation is also affected by culture conditions, including the presence of serum and the time of harvest.115–118 Heterogeneity arises from the accumulation of possible variations in site occupancy, number of antennae, type of terminal sugar residue, presence of bisecting GlcNAc, type of intra-saccharide linkage, and percentage of terminal sialylation during the glycosylation process. Modulation of glycosylation activities is still far from delivering homogeneous glycoproteins. Nevertheless, there has been some progress. For example, the overexpression of both β1,4-galactosyltransferase and α2,3-sialyltransferase in CHO cells has significantly improved the homogeneity of the N-linked oligosaccharide structure.119 It appears that glycan heterogeneity can be reduced by driving the glycosylation processes to completion through high activity of both enzymes. Both enzymes are required to achieve the best outcome because galactosylation promotes the subsequent sialylation, while a sialylated terminus is capped against further reactions.
26.4 Future Directions: Systems Biology
As is evident from the preceding section, metabolic engineering of mammalian cells for recombinant protein production requires a multifaceted approach and the balancing of conflicting objectives. For example, slowing growth or increasing expression would be advantageous to overall yields, provided it is not associated with an increase in cell death through apoptosis. Similarly, pushing more product through the secretion pathway may affect the quality of the product. There are many cases where metabolic engineering has successfully addressed the specific objective targeted, yet yielded cell lines that were inferior performers overall. It is thus not surprising that much of the current effort is directed toward a systems biology approach. This area is still very much in its infancy, with the focus being on getting the tools to work. Omics tools (transcriptomics, proteomics, and metabolomics) used for component analysis are evolving fast, but their optimal use in metabolic engineering remains to be realized. Tools for estimating the metabolic phenotype, i.e., metabolic fluxes, have been pursued for the last couple of decades, but still suffer from the complexity of the culture media used. Even with the analytical tools in place, a major bottleneck is the tools for engineering mammalian cells, in particular the disconnect between what is possible in the cell lines used for transient expression (e.g., HEK293) and what is possible in the cell lines used for stable production (e.g., CHO).
26.4.1 Component Analysis
Omics analysis provides the means for determining capacities (transcriptome, proteome) and driving forces (metabolome) in the metabolic network in a (semi-)global manner. Transcriptomics and proteomics have been applied to gain physiological insights into the molecular mechanisms that confer higher productivity or better potential for adaptation to large-scale culture conditions. The emerging consensus is that even large changes in productivity or culture conditions elicit only moderate changes in gene expression.120 When sodium butyrate (a histone deacetylase inhibitor) was added to hybridoma and CHO cells to increase cell-specific productivity, a very profound physiological response was observed, but the vast majority of transcripts did not change significantly.120,121 Similarly, a metabolic shift in hybridoma cells, which caused dramatic down-regulation of central metabolism and an increase in the efficiency of substrate usage, was reflected only moderately in changes of gene expression level and protein abundance.34,122 In the absence of lead gene candidates, a number of groups have compared expression between low and high producers on a broad functional basis.123–126 For example, the high producer phenotype has
been linked to an upregulation of genes involved in protein folding (the molecular chaperones BiP, endoplasmin, and PDI).85 However, the utility of this crude approach for metabolic engineering is doubtful, as illustrated by the fact that overexpression of the molecular chaperones BiP and PDI led to a reduction of recombinant protein synthesis.88,127 In order to realize the benefits of having global capacity data (transcriptome, proteome), it is likely that these data have to be integrated with data for driving forces (metabolomics) and fluxes (fluxomics) using network-based rather than naïve mining algorithms. Metabolomics for mammalian cell metabolic engineering is still in its infancy. Small-scale metabolomic studies have been performed to quantify the availability of intracellular metabolites for oligosaccharide biosynthesis. Nyberg et al. measured intracellular concentrations of nucleotides and nucleotide sugars, and subsequently demonstrated that glucose- and glutamine-limited culture can reduce the N-linked glycosylation site occupancy of interferon-γ due to depletion of the intracellular UDP-N-acetyl-glucosamine (UDP-GlcNAc) and UDP-N-acetyl-galactosamine (UDP-GalNAc) pools, which is linked to reduced synthesis of ribose and amino sugars, respectively.114 Similarly, supplementation of the culture medium with glucosamine can lead to increased branching in the glycosylation of recombinant glycoproteins produced in BHK cells.128 On the other hand, media supplementation with mannose, glucosamine, and dolichol phosphate did not increase site occupancy in CHO cells.129 Systematic metabolomics is required to gain a better understanding of driving forces in mammalian cell metabolism. The fact that large phenotypic changes occur with only moderate changes in capacities (as measured by transcriptomics and proteomics) suggests that much of the regulation occurs at the enzyme level via metabolites.
As with transcriptomics and proteomics, it is likely that the usefulness of metabolomics can only be realized through integration with other omics data, using network-based rather than naïve mining algorithms.
26.4.2 Metabolic Flux Analysis
The fluxome is the key indicator of the cellular phenotype in terms of metabolism. Flux analysis allows a quantitative survey of a cell’s metabolism, providing a global perspective of integrated regulation at the transcriptional, translational, and enzymatic levels.130 Fluxes through a metabolic network cannot be determined directly. Pulse-chase analysis with labeled compounds can be used to determine peripheral fluxes, but as the pulses broaden through each step of the network it gradually becomes impossible to determine fluxes accurately. Hence, indirect, model-based approaches are used to resolve fluxes at the global level. Metabolic flux analysis exploits the fact that the accumulation of individual metabolites is trivial compared to the fluxes going to and from them, and hence the sum of fluxes to a given metabolite approximately equals the sum of fluxes away from it. The resulting set of linear equations (one for each intracellular metabolite considered) can be used to determine the unknown intracellular (net) fluxes from the much smaller set of measured exchange fluxes with the environment. A complete model of all reactions in a cell will always be underdetermined, i.e., the exchange rates are insufficient to resolve all fluxes. Three approaches have been used to address this issue: simplifying the models (flux balance analysis (FBA)), using additional information from labeling (13C flux analysis), and optimization (constraint-based analysis).
26.4.2.1 Flux Balance Analysis
FBA has been used to determine the metabolic response of mammalian cells to different culture environments, including glucose and glutamine metabolism,24,131,132 optimization of media composition and feeding strategy,6,133 and assessment of osmolarity and oxidative stress.134–136 FBA cannot be used to resolve fluxes through parallel (e.g., EMP and PPP) or circular (TCA) pathways without making major assumptions.
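The balancing idea can be illustrated with a deliberately small, hypothetical network that does not represent any real cell. A metabolite A is taken up at a measured rate and converted to B, B is secreted at a measured rate, and the remaining A leaves through a lumped drain. The sketch below solves the steady-state balances exactly with a hand-written Gauss-Jordan elimination, then adds a second, parallel route from A to B to show why exchange measurements alone no longer fix the fluxes. All reaction names and numerical values are assumptions made for illustration, and the solver does not check redundant balances for consistency.

```python
from fractions import Fraction


def solve_balances(S_unknown, rhs):
    """Solve S_unknown @ v = rhs exactly by Gauss-Jordan elimination.

    Returns the unique flux vector, or None when the balances leave at
    least one unknown flux free (an underdetermined network).
    """
    n_unknown = len(S_unknown[0])
    aug = [[Fraction(x) for x in row] + [Fraction(b)]
           for row, b in zip(S_unknown, rhs)]
    pivot_cols = []
    r = 0
    for c in range(n_unknown):
        if r == len(aug):
            break
        p = next((i for i in range(r, len(aug)) if aug[i][c] != 0), None)
        if p is None:
            continue                      # no pivot here: this flux is free
        aug[r], aug[p] = aug[p], aug[r]
        aug[r] = [x / aug[r][c] for x in aug[r]]
        for i in range(len(aug)):
            if i != r and aug[i][c] != 0:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[r])]
        pivot_cols.append(c)
        r += 1
    if len(pivot_cols) < n_unknown:
        return None
    return [float(aug[i][-1]) for i in range(n_unknown)]


# Unknown fluxes [v_AB, v_drain]; measured: uptake of A = 10, secretion of B = 6
determined = solve_balances(
    [[-1, -1],    # balance on A: uptake - v_AB - v_drain = 0
     [1, 0]],     # balance on B: v_AB - secretion = 0
    [-10, 6],
)
print(determined)

# Add a parallel route A -> B: unknowns are now [v_AB, v_drain, v_AB_alt]
underdetermined = solve_balances([[-1, -1, -1], [1, 0, 1]], [-10, 6])
print(underdetermined)
```

The first call returns `[6.0, 4.0]`; the second returns `None`, because only the sum of the two parallel routes is constrained by the balances.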
For example, the NADPH balance is required to specify flux through the oxidative branch of the pentose phosphate pathway, but this requires further assumptions on the contribution of transhydrogenation reactions and NADP+ linked malic enzyme. In eukaryotes, the situation is further complicated by the duplication of many reactions in cytosol and mitochondria,
a compartmentalization that is known to play an important role in transformed cells, but cannot be captured in FBA models. Furthermore, low biomass density complicates the measurement of biomass composition, oxygen uptake rate, and carbon dioxide evolution rate.137,138 Arguably, FBA does not provide more information than can be gleaned directly from the experimental data, though employing the approach can ensure a systematic assessment of data quality.
26.4.2.2 13C Flux Analysis
Isotopic tracer based methods are commonly used to determine the intracellular fluxes of under-determined metabolic networks. The redistribution of the tracer among metabolic intermediates and end products is a “unique signature” that depends on the network connectivity, the flux values, and the type of tracer used. Therefore, in principle, fluxes through reversible reactions and through cyclic or parallel pathways can be resolved given a known network and tracer. In addition, the model can be validated using redundant measurements derived from the 13C isotopic tracer.18,139 13C flux analysis has been used to resolve fluxes in the central carbon metabolism of several microorganisms by simultaneous metabolite and isotopomer balancing, using measured input/output rates and 13C enrichment data of multiple metabolites, respectively. Examples of this approach include E. coli (32 reactions),140,141 C. glutamicum (47 reactions),142,143 and S. cerevisiae (37 reactions).144,145 In contrast, 13C flux analysis has not yet been applied to fully resolve central carbon metabolism in mammalian cells. The main problem relates to the complexity of mammalian culture media, which in addition to glucose contain all amino acids and often other metabolites such as nucleotides. It is thus impossible to use the preferred approach of inferring metabolite labeling from the labeling pattern in proteinogenic amino acids.
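As a toy illustration of how a tracer can resolve a split between parallel routes, consider a metabolite pool fed by two routes that deliver different 13C enrichments. At isotopic steady state the pool enrichment is the flux-weighted mean of the two, so the split follows from a single mixing balance. This is only the simplest two-source case; actual 13C flux analysis balances full isotopomer distributions over the whole network. The function name and all enrichment and flux values below are hypothetical.

```python
def route_fraction(enrichment_pool, enrichment_route_a, enrichment_route_b):
    """Fraction of a metabolite pool supplied by route A at isotopic steady state.

    Solves x_pool = f * x_a + (1 - f) * x_b for f.
    """
    if enrichment_route_a == enrichment_route_b:
        raise ValueError("routes deliver identical labeling; split is unresolvable")
    f = ((enrichment_pool - enrichment_route_b)
         / (enrichment_route_a - enrichment_route_b))
    if not 0.0 <= f <= 1.0:
        raise ValueError("pool enrichment lies outside the range of the two routes")
    return f


# Hypothetical numbers: route A delivers 50% labeled product, route B unlabeled
# product, and the measured pool enrichment is 35%.
total_flux = 6.0                       # combined flux through both routes
f_a = route_fraction(0.35, 0.50, 0.0)
flux_a = f_a * total_flux
flux_b = (1.0 - f_a) * total_flux
print(f"route A: {flux_a:.2f}, route B: {flux_b:.2f}")
```

With these assumed values about 70% of the pool arrives through route A. The sketch also makes the dilution problem concrete: if unlabeled medium components feed the pool through additional, unmodeled routes, the two-source balance no longer holds.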
Moreover, it is difficult to quantify the degree of label dilution caused by transamination, the high rate of glutaminolysis, and the catabolism of essential amino acids such as leucine, isoleucine, valine, threonine, and lysine.146,147 13C tracer studies do, however, play a critical role in testing the assumptions made in FBA models.139,148 Labeled glucose can be used to assess fluxes through the pentose phosphate pathway, but fails when assessing flux partitioning in the TCA cycle, since most 13C atoms tend to be excreted in the form of lactate and alanine. To overcome this, 3-13C glutamine can be used to better resolve the activity of the TCA cycle and the malate shunt. This approach was used to assess the activity of the TCA cycle (from citrate to 2-oxoglutarate) relative to glutaminolysis, and to demonstrate negligible anaplerotic activity by pyruvate carboxylase and the diversion of most glutamine to pyruvate via the malate shunt in hybridoma cells.23
26.4.2.3 Constraint-Based Flux Analysis
Flux balancing reduces the dimensionality of the problem of determining intracellular fluxes. While some fluxes may be fully determined by balancing, the remaining fluxes can take on any value from minus to plus infinity. The solution space can be greatly narrowed by incorporating constraints such as irreversibility and maximum rates. Exploring the feasible solution space, as opposed to identifying the single correct flux solution, enables the consideration of genome-scale models.
Genome-scale reconstruction of the metabolic network is the most extensive and successful approach so far for in silico simulation of whole-cell metabolism.149 Genome-scale metabolic reconstruction has been performed for Escherichia coli,150–152 Saccharomyces cerevisiae,153–155 Haemophilus influenzae Rd,156 Helicobacter pylori,157 Lactococcus lactis,158 Staphylococcus aureus N315,159 Mus musculus,160 and most recently Homo sapiens.161 The metabolic network is derived from the organism’s genome sequence and thus reflects the full potential of the organism. This reduces the reliance on user-defined assumptions about network components and connectivity. By establishing a framework from gene (as the invariant basis) to reaction (as the functional basis), a genome-scale model potentially allows one to reconcile heterogeneous “omics” datasets on one platform, creating an environment in which hypotheses about genotype-phenotype relationships can be tested efficiently at the systems level. The success of genome-scale modeling draws upon the capability to predict physiologically meaningful states using known cellular constraints and objective function(s), without involving a large number of
parameters. The metabolic phenotype of E. coli under normal or perturbed (gene deletion, sub-optimal substrates) conditions can be consistently predicted by the “maximum growth” objective function.162–164 These findings are significant because they support the notion that the complex biological interactions involved in growth and homeostasis can be accounted for once a valid set of metabolic constraints and objectives is applied. This is the key driving force behind rational metabolic engineering strategies using genome-scale models. For mammalian cells, the maximum growth rate objective makes little sense, as cell metabolism continues to reflect features essential at an organismal rather than cellular level. No single objective function can completely describe the metabolic behavior of mammalian cells in culture.17 The situation may be helped by “second generation” genome-scale models, which apply adjustable biological constraints, such as transcriptional regulation, to further restrict the solution space in a condition-dependent manner.152,155,165–168 While this is unlikely to fully resolve the problem, it will at least address the fact that the mammalian genome encodes vastly different metabolic phenotypes, which cannot coexist. For metabolic engineering of mammalian cells, constraint-based modeling may first and foremost facilitate the analysis of heterogeneous high-throughput data and the design of experiments. Recently, constraint-based analysis169 was used to analyze TCA intermediate isotopomer data from a 13C labeling study performed on perfused mouse heart.170 While the reduced model in this case was constructed by hand, one can foresee this task being automated through transcriptome analysis. At present, the greatest weakness of the constraint-based approach is the difficulty of assessing whether the solutions found are meaningful, given that the optimization algorithms will always return solutions.
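The effect of irreversibility and maximum-rate constraints on the solution space can be sketched for the simplest possible case, a network whose balances leave exactly one degree of freedom. Every steady-state flux vector can then be written as a particular solution plus a multiple of one null-space direction, and each bound clips the admissible range of that multiple. Genome-scale models have many degrees of freedom and are explored with linear programming instead; the network, bounds, and function name below are hypothetical.

```python
INF = float("inf")


def flux_ranges(v0, direction, lower, upper):
    """Feasible (min, max) of each flux for v = v0 + t * direction.

    v0 is any flux vector satisfying the balances, direction spans the
    one-dimensional null space, and lower/upper are per-flux bounds.
    """
    t_min, t_max = -INF, INF
    for x0, d, lo, hi in zip(v0, direction, lower, upper):
        if d == 0:
            if not lo <= x0 <= hi:
                raise ValueError("a fully determined flux violates its bounds")
            continue
        a, b = (lo - x0) / d, (hi - x0) / d
        t_min = max(t_min, min(a, b))
        t_max = min(t_max, max(a, b))
    if t_min > t_max:
        raise ValueError("constraints are mutually infeasible")
    ranges = []
    for x0, d in zip(v0, direction):
        if d == 0:
            ranges.append((x0, x0))       # fixed by balancing alone
        else:
            ranges.append(tuple(sorted((x0 + t_min * d, x0 + t_max * d))))
    return ranges


# Fluxes [v_AB, v_drain, v_AB_alt]: two parallel routes A -> B carry 6 units
# in total; shifting flux from one route to the other keeps all balances.
v0 = [6, 4, 0]
direction = [-1, 0, 1]

# Irreversibility (all fluxes >= 0) plus an assumed capacity of 2 on the
# alternative route.
ranges = flux_ranges(v0, direction, lower=[0, 0, 0], upper=[INF, INF, 2])
print(ranges)
```

With these assumed bounds the first route is confined to between 4 and 6 units and the alternative route to between 0 and 2, whereas without bounds both would be unbounded. The result is a range rather than a single flux solution, which is the sense in which the approach explores a feasible space.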
26.4.3 Tools for Advanced Gene Expression
In a complex system, it is inevitable that multiple, optimized modifications are required to achieve a desired phenotype. Metabolic engineering of mammalian cells is expected to demand sophisticated multi-gene strategies, and is likely to require the coordinated and controlled expression of several transgenes. Multicistronic expression vectors are the principal technology for one-step cloning of several transgenes into mammalian cells. The Bailey group pioneered the pTRIDENT vector platform, which enables multicistronic expression of up to four genes.171–175 The key to multicistronic expression technology is internal translation initiation using the internal ribosome entry sites (IRES) of poliovirus and the cap-independent translational enhancer (CITE) of encephalomyocarditis virus. Translation of the second and subsequent transgenes is initiated via a specialized secondary mRNA structure that assembles the ribosomes. This allows the coordinated and simultaneous expression of up to four independent transgenes in a mammalian cell under the control of a single promoter.
26.5 Summary
Metabolic engineering offers the opportunity to perform targeted modification of host cells so that they are better equipped for large-scale production of recombinant protein. The metabolic engineering of mammalian cells draws upon different but complementary strategies to create a new generation of host cells that perform better in culture. We have identified that culture performance is dominated by a significant loss of productive cells, especially in the commonly used fed-batch and perfusion cultures. The loss of productive cells by apoptosis can be attributed to the stress response to harsh culture conditions, slow growth, or depletion of nutrients. Slow growth kinetics is desirable for increasing peak cell density because it concomitantly reduces the accumulation of inhibitory by-products. Metabolic pathway engineering explores means of diverting the precursors of these by-products toward biomass or product more effectively, instead of relying on stringent manipulation of extrinsic cell culture parameters, which tends to induce cell death. Apoptosis engineering provides the opportunity to delay or reduce the cell’s susceptibility to these apoptotic stimuli, although we are not yet able to fully abolish apoptosis.
Lastly, proliferation arrest strategies are useful for diverting biosynthetic resources and capacities from biomass to product. It is particularly important to induce the stationary growth phase independently of nutrient depletion, to reduce the cell’s tendency to undergo apoptosis and to ensure a consistent supply of nutrients for recombinant protein synthesis and cell maintenance. Hypothetically, it would be very attractive to engineer a cell that grows rapidly and efficiently in nutrient-rich media, can be switched from rapid proliferation to high productivity when peak cell density is achieved, and is subsequently maintained in a stationary but productive state for an extended period of time. It is thus apparent that pathway, apoptosis, and proliferation engineering are all necessary to achieve an ideal bi-phasic fed-batch culture. Ideally, the host cell is engineered to display a global metabolic phenotype of enhanced robustness, metabolic efficiency, and productivity. We also explored the opportunities for using the differentiation of B-cells into plasma cells as a model for increasing cell-specific productivity. At the current state of the art, we must rely on the perturbation of upstream regulatory elements to elicit numerous, and frequently non-intuitive, downstream changes that enhance the global biosynthetic capabilities of the cell. The bottlenecks in protein synthesis are distributed; targeting individual rate-limiting steps thus tends to be futile and impractical. Glycosylation engineering is necessary to design host cells that produce a homogeneous product with consistent bioactivity and efficacy, especially for clinical settings. Glycosylation is not guided by a template, and engineering strategies have resorted to manipulating the glycosylation machinery available to the cell.
Although the control of the glycosylation pattern at this level is very coarse, the modification of terminal glycans has led to a few successful biopharmaceutical products on the market. The key motivation for adopting systems biology is that the complexity of the cell cannot be resolved on the basis of a single component alone, such as transcriptomics or proteomics. In mammalian cells, we have observed that large physiological changes are accompanied by only marginal changes in transcriptional or translational activity. We are still far from using omics data for targeted engineering of the cell, though preliminary results from differential studies of producer cells may be incorporated into existing clonal selection programs to better harness genetic heterogeneity and improve screening heuristics. Ultimately, we need to explore the integrated use of global measurements of capacity (transcriptome, proteome), driving forces (metabolomics), and flows (fluxomics) to generate meaningful interpretations of the cellular physiology that confers higher productivity. A large gap in the systems biology approach is the inadequacy of the tools for metabolomics and fluxomics. For fluxomics in particular, it is difficult to generate reliable flux data because of the nutritional complexity of mammalian cells. To overcome this, we must recognize that it may not be necessary to pin metabolic activity to one specific state; it may suffice to obtain a narrow range of metabolic states that the cell could take on. Consequently, constraint-based modeling is deemed an adequate platform for model-based generation of fluxomics data, incorporating intelligent or experimentally derived constraints such as thermodynamics (reaction reversibility and maximum capacity), transcriptional regulation, and 13C-based tracer data. To conclude, intensive culture conditions and high productivity expectations have pushed mammalian cells further toward unnatural physiological demands.
Consequently, we are faced with a new prospect of performing metabolic engineering of mammalian cell at the systems level, as we begin to rationally look into the intrinsic elements of the cell to improve its performance.
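The constraint-based idea discussed above, in which added constraints narrow the range of metabolic states a cell could take on rather than fixing a single state, can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the function name flux_ranges, the single pyruvate node, the lumped stoichiometry (one glucose giving two pyruvate), and the numerical values are hypothetical and are not taken from the chapter or from any published model. A steady-state mass balance, irreversibility, and a maximum capacity bound the two fluxes leaving pyruvate; an optional split ratio, of the kind a 13C tracer experiment might supply, narrows those bounds further.

```python
def flux_ranges(glc_uptake, tca_capacity, tca_fraction=None):
    """Feasible flux ranges at a toy pyruvate node (all names hypothetical).

    Steady state: 2 * glc_uptake = v_ldh + v_tca, with both fluxes >= 0
    (irreversible) and v_tca <= tca_capacity. An optional (low, high)
    tca_fraction bounds the share of pyruvate entering the TCA cycle.
    """
    pyr_supply = 2.0 * glc_uptake  # assumed lumped glycolysis: 1 glucose -> 2 pyruvate
    # Irreversibility and maximum capacity bound the TCA flux.
    tca_lo, tca_hi = 0.0, min(tca_capacity, pyr_supply)
    if tca_fraction is not None:
        # An experimentally derived split ratio tightens both bounds.
        frac_lo, frac_hi = tca_fraction
        tca_lo = max(tca_lo, frac_lo * pyr_supply)
        tca_hi = min(tca_hi, frac_hi * pyr_supply)
    if tca_lo > tca_hi:
        raise ValueError("constraints leave no feasible steady state")
    # The mass balance fixes v_ldh once v_tca is chosen.
    ldh_range = (pyr_supply - tca_hi, pyr_supply - tca_lo)
    return {"v_ldh": ldh_range, "v_tca": (tca_lo, tca_hi)}


# Illustrative values only (arbitrary units).
loose = flux_ranges(glc_uptake=1.0, tca_capacity=1.5)
tight = flux_ranges(glc_uptake=1.0, tca_capacity=1.5, tca_fraction=(0.25, 0.5))
print(loose)
print(tight)
```

With only the mass balance and capacity bound, the lactate flux may lie anywhere in a wide interval; adding the split-ratio constraint shrinks both intervals without committing to a single flux value. A genome-scale model would replace this closed-form interval arithmetic with linear programming over the full stoichiometric matrix.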
Index
A
Acetyl coenzyme A synthetase (acs)
  acs operon and promoter architecture, 12-14 to 12-15
  acsP2, activation, 12-16 to 12-17
  acsP1, repression of, 12-15 to 12-16
  and transcription, 12-13 to 12-14
Actinomycetes
  γ-butyrolactone in, 13-9
  pleiotropic and global regulators in
    autoregulator signaling systems, 13-7 to 13-9
    signal transduction, 13-2 to 13-7
  secondary metabolism in, 13-2
    regulation of, 13-2 to 13-3
Activator of fungal proteases (PrtT), 25-18
Adenosine structure, 2-6
Amino acids
  A. vinosum PhaC, sequence homology search by BLAST, 4-5
  biosynthesis, 3-3
    aromatic nonpolar, 3-10, 3-12
    aspartate family and branch-chained, 3-6, 3-8
    building blocks formation interconnections, 3-4
    histidine, 3-12
    2-oxoglutarate derived, 3-4, 3-6
    serine-glycine family, 3-8, 3-10
  rare conversions using, 5-14
Anabolic and catabolic reaction networks in living cells, 6-6 to 6-7
Anaplerotic reactions, 2-21 to 2-22
Anoxygenic photosynthesis, 2-14 to 2-15
Antiporters proteins, 1-11; see also Porter proteins
Aquaporins transport proteins, 1-5
Archaea spp.
  cell membranes structure, 1-5
  cytoplasmic membrane architecture, 1-4
  Na+ transporting methyl tetra hydro methanopterin/coenzyme M methyltransferase (NaTMMM), 1-12 to 1-13
Arginine biosynthesis from glutamate, 3-6
Aromatic amino acids biosynthetic pathways, 3-11
Artemisin in Plasmodium falciparum, 22-27
Aspartate family of amino acids
  aspartate carbamoyl transferase and pyrimidine synthesis, 3-17
  biosynthesis, 3-7
Aspergillus spp., 25-3 to 25-5
  α-amylase production
    enzymatic engineering and morphological engineering, 25-14
  citric acid production, 25-11 to 25-12
    citrate system, 25-13
  glucoamylase production, 25-12
    gene dosage, effect of, 25-14
    morphological engineering, 25-14
  heterologous protein production
    enzymes, deletion of, 25-18
    gene modifications, 25-17
    modified regulators, 25-17 to 25-18
  lovastatin production
    LaeA, proposed model of, 25-19
Autoregulator signaling systems, 13-7 to 13-9
Autotrophic growth, anabolic reaction calculation, 11-3 to 11-4
B Bacillus subtilis biotin biosynthesis pathways and regulatory genes, 23-18 to 23-19 dethiobiotin (DTB), 23-19 pathway engineering, 23-19 to 23-21 physiological and commercial significance, 23-18 general host properties, metabolic engineering of, 23-23 generally regarded as safe (GRAS), 23-1 to 23-2 genetic engineering methods chromosomal gene amplification and large scale, stable chromosomal cloning, 23-5 genes by use of plasmid vectors, expression of, 23-3 gene transfer and, 23-3
marker-free chromosomal modifications, 23-4 to 23-5 marker-retaining chromosomal modifications, 23-4 strain history, 23-2 to 23-3 genome resequencing, 23-23 to 23-24 omics and systems biology, 23-24 to 23-25 pantothenate biosynthesis pathway of, 23-13 qualified presumption of safety (QPS), 23-1 riboflavin biosynthesis and regulation of rib gene expression, 23-7 decoupled fermentation process, development of, 23-10 to 23-11 pathway engineering, 23-7 to 23-10 physiological and commercial significance, 23-6 to 23-7 transketolase mutants with GTP cyclohydrolase II and DHBP synthase, 23-10 and salvage pathways of, 23-16 thiamin biosynthetic pathway, 23-15 to 23-16 physiological and commercial significance, 23-15 and regulatory genes, 23-17 thiamin-TMP-TPP pathway, 23-17 Thi biosynthetic enzymes, 23-18 vitamin B6 biosynthesis of, 23-21 pathway engineering, 23-22 Bacteria cyclic AMP, second messenger in, 13-10 to 13-11 electron transport, 2-27 to 2-28 plasma membranes diffusion-controlled and carrier mediated solute fluxes across, 1-5 β-Barrel porins, 1-10 Batch model for growth and product formation, 7-11 to 7-13 µmax and qs,max, 7-14 bcl-2/bcl-xL gene overexpression and apoptosis, 26-5 Binding-protein-dependent (ABC) transport system, 3-2 Biological atoms incorporation in cells nitrogen, 3-2 one-carbon units and oxygen, 3-3 phosphate, 3-3 sulfur, 3-2 Biomass C-molX produced/mol substrate, 7-17 doubling time, 7-15 to 7-16 elemental composition organic and inorganic fraction, 6-3 to 6-4 q-rates and stoichiometry, 7-17 to 7-20 specific conversion rates, 7-8 to 7-9
Biotin affinity constant K values, 9-4 biosynthesis of, 23-18 and regulatory genes, 23-19 dethiobiotin (DTB), 23-19 pathway engineering, 23-19 to 23-21 physiological and commercial significance, 23-18 Black box kinetic functions under single nutrient (substrate) limited conditions substrate consumption for maintenance, 9-11 chemical maintenance coefficients, 9-12 to 9-13 maintenance Gibbs energy, 9-12 substrate uptake rate, 9-9 to 9-11 Branched-chain amino acids biosynthesis, 3-9 Branch point control coefficients and dependency relations, analysis, 16-26 to 16-27 BRENDA database, 18-4; see also Constraint-based genome-scale models N-Butanoyl-L-homoserine lactone (C4-HSL) molecular structure, 13-14 γ-Butyrolactone in actinomycetes, 13-7 to 13-9
C Calvin cycle in plants, 3-3 Candida utilis predicted and measured biomass yields, 10-7 Carbamoyl phosphate synthase (CPS) and pyrimidine synthesis, 3-17 Carbohydrates synthesis for building cells aldolases, 3-25 biotransformation, 5-9 to 5-10 disaccharides involved in, 3-26 glycosyl transfer, 3-23 interconversion of, 3-22 kinases for, 3-24 NDP-activated sugars transport across membranes, 3-23 to 3-24 regiospecific esterification of, 5-9 structure of, 3-20 to 3-21 Carbon catabolite repression (CCR), 1-15 to 1-16 Carbon fixation, 2-15 Carotenoids, 3-30 to 3-31; see also Neutral lipids synthesis Carrier-mediated solute-H+ symport, 1-10 Catabolites catabolic pathways, 2-28 to 2-29 catabolism and redox half reactions in cells, 6-7 repression in microbes, 2-29 CellAnalyzer tool, 18-12; see also Genome-scale metabolic models Cellobiose, 22-4 cell surface engineering with, 22-3 Cells catabolic reactions in, 6-6 to 6-7 composition
elemental biomass, 6-3 to 6-4 organic compounds in, 6-2 cultivation media, composition of, 6-8 growth quantification, 6-2 to 6-3 medium anabolic and catabolic reaction networks, 6-6 to 6-7 Fe3+ needed for growth, 6-5 inorganic compounds, 6-5 to 6-6 organic compounds, 6-4 signal transducing complexes, 6-7 membrane cholesterol structure in, 1-3 function, 1-4 to 1-5 lipidic bonds in, 1-4 phospholipids of, 1-2, 1-4 proteins, 1-2 structure, 1-1 to 1-4 viscosity of, 1-4 Cellulose fibers, 22-6 coexpression and cell surface engineering with, 22-5 Central dogma in molecular biology, 17-1 to 17-2 multiscale nature, 19-2 to 19-4 Central pathways ED pathway, 2-18 to 2-19 glycolysis, 2-15 to 2-16 PPP pathway, 2-16 to 2-18 pyruvate and, 2-19 to 2-20 TCA cycle, 2-20 Cephalosporins antibiotics biosynthesis, 5-16 13C flux analysis, 26-12 Channel and pore-mediated passive diffusion, 1-10 Chemostat experiments, estimation of parameters of kinetic model from free variables, 9-20 to 9-21 kinetics and stoichiometric model from, 9-21 to 9-22 lysine production, 9-24 to 9-26 model parameters, 9-21 q-rates using three independent reactions, 9-24 real conversion rates, 9-24 specific rates for, 9-23 steady state of, 9-23 Chemotroph organisms, 2-2 Chloroperoxidases (CPO) enzyme, 5-13 C40 isoprenoids, 3-30 to 3-31; see also Neutral lipids synthesis Clostridium spp. Clostridium kluyveri 4-hydroxybutyryl-CoA/CoA transferase gene (cat2) in E. coli and P(4HB) homopolymer production, 4-5 strict/obligate anaerobes, 2-2 thioredoxin-type ferredoxins in, 2-12 Cofactor-dependent enzymes and regeneration, 5-16 Complex media in industrial fermentation processes, 6-4 Constraint-based flux analysis, 26-12 to 26-13
Constraint-based genome-scale models, 18-1 development and validation biomass reaction representation, 18-4 to 18-5 curated reaction and metabolite database, 18-2 to 18-3 maintenance requirements, 18-5 to 18-6 metabolic network reconstruction methods, 18-3 to 18-4 validation and refinement, integration with physiology data, 18-6 Conventional metabolic engineering cell specific productivities, 26-7 to 26-9 product quality, 26-9 to 26-10 VCI strategies used, 26-2 to 26-7 apoptotic pathways, 26-5 to 26-6 metabolic phenotype, 26-4 to 26-5 process, 26-2 to 26-4 proliferation, 26-6 to 26-7 Corynebacterium glutamicum, 19-2 aerobic growth and lysine production, 9-18 to 9-19 Cybernetic models, 19-7 to 19-8 Cyclic AMP receptor protein (CRP) dependent activation by nucleoid proteins modulation FIS, negative regulation by, 12-17 to 12-18 IHF, negative and positive regulation by, 12-18 to 12-20 upstream elements, role of, 12-20 dependent promoter transcription from pnrfA, 12-13 and DNA site affinity, 12-7 Cystathionine formation, 3-8 Cytochrome P450 oxidoreductases, 5-15 to 5-16 Cytochrome proteins and ETC, 2-12 Cytokinin isopentenyladenine (IP) from Arabidopsis thaliana, 14-12
D Data reconciliation, 8-1 basic tests in, 8-8 to 8-9 in biotechnology, 8-11 to 8-12 computations and, 8-11 general optimization problem, 8-4 to 8-5 hard and soft relations, 8-4 linear data reconciliation problem, 8-7 maximum power tests in, 8-9 to 8-10 missing observables, 8-8 statistical properties and identifiability, 8-5 to 8-6 Deacetoxycephalosporin/deacetylcephalosporin C synthase catalytic plasticity, 5-16 Decarboxylation-driven transporters, 1-12 Deletion analysis methods, 18-8; see also Metabolic network 1-Deoxygalactonojirimycin facile synthesis, 5-10 Detailed stoichiometric metabolic model, 10-3 ATP stoichiometry parameters
amino acid overproduction of, 10-13 to 10-15 estimation of, 10-4 maximum product yields, limit functions for, 10-15 to 10-17 maximum yields of biomass and product, calculation, 10-4 to 10-8 in metabolic networks, 10-4 to 10-5 metabolic network topology for growth on mixed substrates calculation, 10-8 to 10-13 Deviations missing data, 8-8 testing of, 8-8 to 8-10
E EF-Tu synthesis and P(3HB) biosynthesis, 4-9 Electrochemical potential-driven transporters, 1-11 Electron transport energy conversion and, 2-28 in mitochondria, 2-24 complex I and II, 2-24 complex III, 2-25 complex IV, 2-26 mitochondrial Q cycle, 2-25 Elementary mode analysis, 18-9; see also Metabolic network Embden-Meyerhof-Parnas (EMP) pathway, 2-4, 2-16 glycolytic pathway and PTS, 1-14 Enantio-transformations, hydrolytic enzymes in, 5-8 to 5-9 Enterobacter aerogenes α-acetolactate decarboxylase, expression, 22-22, 22-24 Entner-Doudoroff (ED) pathway, 2-4 Enzymes bioconversions and, 5-6 to 5-8 biotechnology and implementation, challenges for, 5-5 catalytic functions of, 5-16 to 5-18 classification, 5-2 to 5-3 kinetics in presence of conserved moieties, 16-24 to 16-26 metagenomic libraries construction isolation and, 5-20 steps in, 5-17 oxidoreductions and, 5-16 preparative and industrial applications, 5-8 rare conversions aliphatic groups by, 5-13 to 5-14 by amino acids, 5-14 carbohydrate modifying enzymes, 5-9 to 5-12 cofactor-dependent enzymes and, 5-16 with flavonoids and steroids, 5-14 to 5-15 hormones and vitamins, 5-15 to 5-16 hydrolytic enzymes in enantiotransformations, 5-8 to 5-9
oxidative enzymes for organic pollutants in industrial effluents, 5-12 rare domino and MCRs, 5-16 using fatty acids, 5-12 to 5-13 xenobiotics of, 5-15 screening and putative progression of conversion in industry, 5-17 used in chemical conversions, 5-6 to 5-8 Escherichia coli, 21-1 ACC encoded by accABCD in, 3-28 acetyl-CoA ACP transacylase (fabH), 3-28 artificial cell–cell communication in, 14-11 biotin carboxyl carrier protein (BCCP), 3-28 biotin biosynthesis pathways of, 23-19 to 23-21 carbon catabolic repression mechanisms in, 1-17 Clostridium acetobutylicum butyrate kinase (buk) and phosphotransbutyrylase (ptb) expression in, 4-5 data from aerobic carbon limited chemostat culture of, 8-2 to 8-3 2-deoxyribose 5-phosphate (DERA) enzyme from, 3-25 diauxic growth elimination of, 2-29 dynamic single cell models of, 19-2 E. coli FabH and 3-ketoacyl-ACP:CoA transferase activity, 4-12 fumarate nitrate reductase (FNR) and ArcA/B global regulatory systems in, 2-28 glucose transport and metabolism in, 1-15 glucose PTS complex in, 1-16 metabolism and products, 21-3 to 21-4 metabolism, breakdown of, 21-6 phosphoenolpyruvate/sugar phosphotransferase system from, 1-12 pnrfA promoter and, 12-12 as production organism, 21-3 products obtained from, 21-4 to 21-5 R. eutropha PHA synthase gene and P(4HB) homopolymer production, 4-5 sbm and ygfG genes expression, 4-11 4HB-CoA from succinyl-CoA in recombinant, 4-11 historical developments as model, 21-2 IIAGlc protein and CCR, 1-16 intercellular communication network in, 14-10 K12 and B strains, use in metabolic engineering, 21-9 to 21-10 KS272 strain, ED pathway and P(3HB) biosynthesis, 4-9 lactose operon in, 14-4 MCL-PHA biosynthesis pathway from fatty acids, 4-12 melAB promoter, cooperative interactions, 12-11 metabolic engineering, fundamentals cultivation strategies, 21-12 gene expression, selected promoter elements used, 21-7
gene material, methods used for, 21-8 genetic elements and tools, 21-6 lac-promoter and araBAD promoter, 21-8, 21-11 plasmid stability, 21-7 to 21-8 shikimate pathway, metabolic engineering of, 21-12 by-products, minimization of, 21-17 flux, increase of, 21-14 non-oxidative pentose phosphate pathway and glycolysis, connected reactions of, 21-16 precursors, supply of, 21-14, 21-16 to 21-17 principal targets for, 21-15 1,3-propanediol production, 21-17 to 21-18 reactions and structures of, 21-13 in silico E. coli metabolic network model, 4-7 strains and improvements, 21-11 to 21-12 Eubacterial transcription hard-wired regulation, 12-4 initiation pathway, 12-2 promoter recognition, 12-3 to 12-4 regulation adaptive, 12-5 complex, 12-9 to 12-13 mechanism, 12-6 resultant transcription elongation complex (TEC), 12-3 RNA polymerase (RNAP), 12-1 to 12-2 codependence mechanism, 12-10 promoter elements and interactions with, 12-3 signaling, 12-6 activation of, 12-8 to 12-9 covalent modification, 12-7 to 12-8 mechanisms of, 12-7 Eukaryotic Ser/Thr protein kinase (ESTPK) genes in bacterial species, 13-6 Extreme pathway analysis, 18-9, 20-6; see also Metabolic network ExPA for analysis of, 20-7
F Factor for inversion stimulation (FIS) nucleoid proteins, 12-12 Fatty acid biosynthesis cycle for long chain polysaccharides, 3-29 Fermentation pathways, 2-23 Fermentor transport mechanisms as tool extracellular concentrations control biomass, 9-6 conversion rates in chemostat, 9-7 to 9-9 product, 9-7 substrate, 9-5 Ferredoxins, see Iron-sulfur proteins Filamentous fungi, 25-1 cell factories, 25-2 transcriptomics, studies, 25-5
carbon metabolism, examining regulation of, 25-6 to 25-7 EST mining and metabolic engineering, directing, 25-7 genes using microarrays, functional assignment of, 25-6 metabolomics, 25-9 proteomics, 25-8 to 25-9 uses in, 25-8 Flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) reduction, 2-13 Flavonoids rare conversion with, 5-14 to 5-15 Flavoproteins, 2-12; see also Redox potentials and mobile electron carriers Fluid mosaic model, 1-4 Fluorescence activated cell sorting (FACS) technology, 5-3 Fluoroacetate synthesis by Streptomyces cattleya bacterium, 5-13 to 5-14 Flux balance analysis (FBA), 18-7 to 18-8, 20-5, 26-11 to 26-12; see also Metabolic network Corynebacterium glutamicum, 19-2 13C-NMR/GC-MS labeling techniques, 19-2 metabolic flux analysis markup language (MFAML), 20-7 systems biology markup language (SBML), 20-7 Flux control coefficients (FCC), 25-9 to 25-10 Flux coupling analysis, 17-11 Fluxome application, 17-15; see also Network functionality at metabolite level Flux variability analysis (FVA), 18-8; see also Metabolic network Fructose 1,6-diphosphate (FDP) aldolase in metabolism of monosaccharides, 3-25 Fuculose 1-phosphate Type II aldolases in microorganisms, 3-25 Fueling reactions, 2-2 energy obtained from, 2-3 products from ATP hydrolysis, 2-4 to 2-5 NADH and NADPH, 2-6 to 2-9 precursor metabolites, 2-4
G Genetic perturbations, 20-9 gene amplification, 20-11 gene deletion, 20-10 to 20-11 Genome-scale metabolic models, 18-2 commercial and academic software, 18-13 computational analysis of, 18-7 development from biological data, 18-3 features of, 18-13 flux distributions calculations, 10-3 for in silico analysis, 10-2, 20-4 software and databases for, 18-12
survey of, 18-12 to 18-14 validation and iterative refinement of, 20-2 Genomics, 25-2 to 25-3 Aspergillus nidulans, 25-3 to 25-4 Aspergillus niger, 25-4 Aspergillus oryzae, 25-4 Aspergillus terreus, 25-5 Penicillium chrysogenum, 25-5 Trichoderma reesei, 25-5 Gibbs energy based coupling of catabolism and anabolism, 11-6 Gluconeogenesis and C6 carbon base molecules generation, 3-3 Glutamate family amino acids biosynthetic pathways, 3-5 Glycolysis, 2-15 to 2-17 Glycosyl transfer and carbohydrates synthesis, 3-23 Glycosynthases directed-mutated glycosidases, 5-12 Glyoxylate bypass, 2-21 to 2-22 Gram-positive and gram-negative bacteria, cell membranes, 1-3
H Haloperoxidase enzyme, 5-13 Herbert-Pirt equation electron donor operational stoichiometry of growth process, 11-14 to 11-15 for substrate distribution, product formation kinetics, 9-13 categories of, 9-15 to 9-16 consumed substrate, 9-14 growth in, 9-16 to 9-18 qP(µ) function, 9-15 theoretical maximum yields, 9-14 to 9-15 Hexose 1-phosphates activation pathways from central precursor metabolites, 3-15 High potential ferredoxins (HiPIP), 2-12 Histidine biosynthesis, 3-12 to 3-13 Hopanoids structure, 1-3 3-Hydroxydecanoate (3HD) and 3-hydroxydodecanoate (3HDD) synthesis in fadB mutant E. coli, 4-13 β-Hydroxydecanoyl-ACP dehydrogenase (HDD) dehydration, 3-30 Hyphal branching regulator (HbrA), 25-14
I Inosine monophosphate (IMP) synthesis, 3-14 Integration host factor (IHF) nucleoid proteins, 12-12 dependent inhibition, 12-18 roles at acsP2, 12-19 Ion-gradient-driven energizers, 1-11 to 1-12 Iron-sulfur proteins, 2-12; see also Redox potentials and mobile electron carriers
K Kanamycin biosynthetic cluster, 24-14 KEGG online database, pathway definitions, 17-7 3-Ketoacyl-ACP synthase isoenzymes and carbon chain elongation, 3-28 Ketoacyl-ACP synthase (KAS) I/isoenzyme KAS II dehydration, 3-30 Klebsiella terrigena α-acetolactate decarboxylase, expression, 22-22, 22-24
L lac operon, 14-2 Lactic acid formation and Saccharomyces cerevisiae, 22-25 to 22-26 Leloir and non-Leloir glycosyltransferases, hydrolytic activity, 5-10 Lesch–Nyhan syndrome, 3-19 to 3-20 Light absorption-driven transporters, 1-13 Lipids biosynthesis, 3-25 classification of, 3-26 diversity in microorganisms, 3-26 to 3-27 Lipogenesis, metabolites sources, 3-27 Lithotroph organisms, 2-2 Long-chain fatty acids synthesis, 3-27 to 3-30 Long chain polysaccharides and fatty acid biosynthesis cycle, 3-29 Long-chain polyunsaturated fatty acids (PUFAs) and biosynthesis, 5-12 to 5-13 Lysine biosynthesis, 3-8
M Maintenance thermodynamics, 10-13 maintenance energy, procedure for estimation, 20-9 Mass balances batch cultivation, 7-4 mathematical model for growth estimation, 7-14 to 7-15 modeling cycle, 7-9 to 7-10 organism growth, 7-5 to 7-7 q rates calculations, 7-8 to 7-9 conversion rate, 7-5 experimental design, 7-4 setting up NH3, 7-3 process knowledge, 7-4 stoichiometric coefficients, 7-16 to 7-17 structure of, 7-2 system boundary, 7-2 to 7-3 Maximal concentration electron donor thermodynamic prediction, 11-16 to 11-17 Maximum specific growth rate estimation, 11-15 to 11-16
Mechanistic model dynamic perturbation data, 16-24 Melibiose, 22-14 to 22-16 Menaquinone in electron transport chain (ETC), 2-11 to 2-12 Metabolic control analysis (MCA), linear kinetic approximation, 25-9 to 25-10 metabolite steady state solution, 16-9 to 16-10 problems of, 16-11 reference steady state and linearized kinetic relation using elasticity parameters, 16-6 to 16-7 features of, 16-8 kinetic equations in matrix notation, 16-8 to 16-9 steady state flux solution, 16-10 Metabolic engineering modeling, methodologies kinetic modeling of glucose metabolism, 25-10 MCA, 25-9 to 25-10 MFA, 25-9 stoichiometric modeling, 25-10 Metabolic flux analysis (MFA), 15-1, 25-9 estimation from metabolomics, 15-13 to 15-14 framework, 15-10 graph theory, 15-3 isotopomer distribution analysis, 15-6 to 15-10 metabolite balancing analysis (MBA), 15-4 to 15-6 quantification methods network reconstruction, 15-2 under transient physiological conditions, 15-14 vector, 15-3 to 15-4 Metabolic modeling, 20-1 to 20-2 in post-genome era, 20-3 to 20-4 in pre-genome era, 20-2 to 20-3 Metabolic model validation, 20-4 environmental perturbation by energetic parameters, 20-8 to 20-9 gas composition, 20-10 medium composition and essential metabolites, 20-10 genetic perturbations by, 20-9 gene amplification, 20-11 gene deletion, 20-10 to 20-11 tools for FBA, 20-5 metabolic pathway analysis, 20-5 to 20-7 simulation of, 20-7 validation of, 20-7 Metabolic network analysis methods biased, 18-7 to 18-9 flux-based, 18-6 to 18-7 unbiased, 18-9 to 18-10 depictions of, 17-5 to 17-6 elementary flux modes for, 17-8 FBA approach, 17-15 global structure and flux analysis, 17-3
minimization of metabolic (flux) adjustments (MOMA), 17-15 modular approach, 10-2 to 10-3 omics data, 17-7 optimization techniques bilevel programming, 18-11 to 18-12 integer programming, 18-11 pathway representation, 17-4 regulatory and dynamic extensions, 18-10 representation of, 17-2 stoichiometric matrix, 17-7 to 17-8 structure–function relationship bow tie architecture, 17-10 fluxes and metabolic network structure, 17-10 to 17-11 plug and play type modularity of nutrient intake, 17-10 structure and regulation, 17-11 to 17-12 topological and functional features of elements, 17-8 to 17-10 transcriptionally responsive sub networks, 17-12 thermodynamic and metabolic extensions, 18-10 to 18-11 Metabolic reaction network model parameterisation dynamic perturbation experiments measurement problem for, 16-20 metabolite measurements, 16-21 to 16-23 steady state perturbation experiments flux relation, 16-16 linear pathway, 16-13 to 16-16 lin-log relations, 16-17 noncharacterized perturbation, 16-17 to 16-18 steady state branched pathway, 16-18 to 16-20 Metabolic reaction networks mathematical models of independent metabolite mass balances and simplification, 16-4 to 16-5 metabolite mass balances, 16-4 steady state metabolite and flux functions, 16-5 to 16-6 structural features of, 16-3 to 16-4 vector and matrix definitions, 16-2 notation, 16-3 MetaFluxNet tool, 18-12; see also Genome-scale metabolic models Metagenomics activity-based mining, success rate of, 5-19 biocatalysts detection in, 5-18 enzymes isolation from libraries, 5-20 metaSHARK metabolic network reconstruction tools, 18-4 METATOOL for elementary modes of metabolic network, 20-7 Methionine biosynthesis from homoserine, 3-8 Methyltransfer-driven transporters, 1-12 to 1-13
Michaelis and Menten enzyme elasticity, 16-7 Michaelis and Menten kinetics of transport-mediated solute diffusion process, 1-7 to 1-8 Microbial growth stoichiometry thermodynamics anabolic reaction for biomass synthesis, 11-2 to 11-3 electron donor needed for anabolism using balance of degree of reduction calculation, 11-4 to 11-5 Gibbs energy from catabolic reaction anabolism and calculation of amount of electron donor, 11-10 to 11-12 biomass yield on electron donor, 11-12 to 11-13 and enthalpy of formation, 11-8 under nonstandard conditions, 11-7 to 11-10 under standard conditions, 11-6 to 11-7 Microorganisms kinetic behavior of, 9-2 function study, 9-2 to 9-4 Minimal concentration electron donor thermodynamic prediction, 11-16 to 11-17 Mitochondrial Q cycle, 2-25 Mitogen-activated protein kinase (MAPK) cascade signal transduction system, 14-4 Molasses, MEL1 gene expression, 22-14 Monosaccharide enzymatic transformations, 5-10 to 5-11 Multiscale constraints based models Boolean programs, 19-7 Q-dimensional vector of, 19-6 to 19-7 Multiscale mathematical modeling, 19-1 FBA, 19-2
N NarP/NarL, homologous transcription factors, 12-12 Na+ -transporting carboxylic acid decarboxylase (NaT-DC) system, 1-12 NDP-sugars as building blocks, 3-23 to 3-24 Network functionality at metabolite level, 17-12 fluxes experimental estimation of, 17-13 to 17-14 fluxome application in, 17-15 kinetic models for simulations, 17-16 regulatory on/off minimization (ROOM), 17-15 in silico prediction, 17-14 to 17-15 Neutral lipids synthesis, 3-30 to 3-31 Nonlinear approximate kinetics lin-log kinetics, 16-11 to 16-12 steady state metabolite and flux function for large perturbations, 16-13 Nonribosomally synthesized channels and porters, 1-11; see also Porter proteins Nonribosomal peptide synthases (NRPS), 24-20 to 24-21 Nucleoprotein complex activation by remodeling, 12-12 Nucleotides as building blocks
biosynthesis of de novo purine synthesis, 3-15 to 3-17 2´-deoxyribonucleotides, 3-17 to 3-19 pyrimidines, 3-17 metabolic conditions, 3-19 to 3-20 structure and organization, 3-12 to 3-15 phosphodiester and glycosidic bonds, 3-14 Nucleotidyl transfer and carbohydrates synthesis, 3-23
O Observability, redundancy and sensitivity analysis, 15-10 to 15-13 Oligosaccharide synthesis, 5-9 with glycosyltransferases and glycosidases, 5-10 Operational yield, 9-26 catabolic product formation, 9-27 to 9-29 noncatabolic product formation, 9-30 to 9-31 overall growth and product reaction, 9-29 to 9-30 q p(µ) function, 9-27 Yij(µ) functions derivation, 9-27 OptKnock approach, 17-15; see also Network functionality at metabolite level Organisms catabolic processes in photosynthesis, 2-14 to 2-15 classifications of, 2-3 life cycle of, 13-2 and light energy, 2-2 Oxidative enzymes for conversion of organic pollutants in industrial effluents, 5-12 Oxidative phosphorylation, 2-22 to 2-24 Oxidoreduction-driven transporters, 1-13 N-(3-Oxododecanoyl)-L-homoserine lactone (3-oxoC12-HSL) molecular structure, 13-14 Oxygenase enzymes catalytic plasticity, 5-16 Oxygenic photosynthesis, 2-14
P Pantothenic acid pantothenate, biosynthesis of, 23-12 B. subtilis, pantothenate biosynthesis pathway of, 23-13 pantothenate biosynthetic and regulatory genes, 23-14 pantothenate pathway engineering, 23-12, 23-14 physiological and commercial significance coenzyme A (CoA) and acyl carrier protein, biosynthesis of, 23-11 microbial production processes, 23-11 to 23-12 Pathway specific regulators, 13-3 in streptomycetes, 13-4 TetR-family transcriptional regulators, 13-5
Penicillin and biomass on carbon source, calculated yield and maintenance parameters, 10-9 to 10-10 Penicillium chrysogenum, 25-5 ad-7-ADCA and ad-7-ACA pathway in, 25-22 ATP-stoichiometry parameters for, 10-8 to 10-9 biosynthetic pathway for, 25-20 β-lactam production effect on, 10-10 penicillin production in, 25-20 to 25-21 Pentose phosphate pathway (PPP), 2-4 oxidative routes for NADPH and pentose formation, 3-22 to 3-23 Pentose sugars aerobic arabinose utilization by, 22-8 Peptides-polyketides rare conversions using aliphatic groups, 5-13 to 5-14 Phenazines production, 13-15 PhoR-PhoP system, 13-5 mutation in, 13-11 to 13-12 Phosphoenol pyruvate (PEP), 21-6 dependent phosphorylation by PTS, 1-14 phosphoenolpyruvate/sugar phosphotransferase system, 1-10 phosphoenolpyruvate:sugar (PTS) transporters transport mechanism, 1-13 Phosphogluconate pathway, 2-16 to 2-17 Phosphohistidine carrier protein (HPr), 1-13 Phospholipids in cell membrane, 1-2, 1-4 Phosphotransacetylase (Pta) reversible enzyme for conversion of acetyl coenzyme A, 14-8 Phosphotransfer-driven group translocators, 1-13 to 1-16 Phototroph organisms, 2-2 Pichia pastoris glycosylation machinery, 5-11 to 5-12 Polyhydroxyalkanoates (PHAs) E. coli and Salmonella in, 4-8 metabolic engineering strategies for, 4-10 formation by polymerization, 4-1 mutagenesis and, 4-6 PHA synthases classes, 4-2, 4-4 evolution of, 4-4 to 4-5 genetic organization of, 4-3 3HB and 3-hydroxyhexanoate (3HHx), 4-3 substrate specificity of, 4-4 P(3HB) biosynthesis, 4-7 R. eutropha PHA synthase, 4-3 lipase box (G-X-[S/C]-X-G) in, 4-5 short chain length (SCL) and medium chain length (MCL), 4-2 β-oxidation pathway and de novo fatty acid biosynthesis pathways, 4-12 MCL-PHA synthase, 4-13 precursors of, 4-11 synthesis by glucose, 4-12 synthesis in E. coli strain, 4-13
fadB yfcX and fadB maoC double mutant E. coli and, 4-13 from gluconate, 4-13 to 4-14 microorganisms, metabolic engineering of, 4-5, 4-7, 4-9 to 4-11 (R)-hydroxyacyl-CoAs (RHA-CoAs), 4-2 YfcX crotonase superfamily and enoyl-CoA hydratase MaoC, 4-13 Polymerization, 4-1 Pore forming toxins, 1-11 Porins channel type proteins, 1-9 to 1-10 Porter proteins, 1-11 P-P-Bond-hydrolysis-driven transporters, 1-12 pqsABCDE and phnAB operons transcription, 13-14 Predator (lacI)-prey (glnG) model, 14-6 Primary active uptake ABC transporter driven by ATP hydrolysis, 1-10 Product formation kinetics categories of, 9-15 to 9-16 growth in, 9-16 to 9-18 qP(µ) function, 9-15 thermodynamics and stoichiometry anabolic product, 11-19 catabolic products, 11-17 to 11-19 Prokaryotic translational program models, 19-5 to 19-6 Proline biosynthesis from glutamate, 3-6 Protein N- and acyl-phosphorylation in prokaryotes, 13-5 Protein O-phosphorylation in eukaryotes, 13-6 Proteome analysis and 2-keto-3-deoxy-6-phosphogluconate aldolase (Eda) in P(3HB) biosynthesis, 4-7, 4-9 Pseudomonas aeruginosa MCL-PHA biosynthesis pathway from fatty acids, 4-12 phaG gene in, 4-12 QS system in, 13-14 to 13-15 quorum-sensing hierarchy in, 13-15 Pseudomonas quinolone signal (PQS) controlled phz1 operon, 13-15 molecular structure, 13-14 pTRIDENT vector platform, 26-13 Purine synthesis from precursor 5-phosphoribosyl 1-pyrophosphate (PRPP), 3-15 to 3-17 Pyocyanin (PYO) MexGHI-OpmD efflux pump and, 13-16 molecular structure of, 13-14 QS-signal homeostasis and, 13-16 in regulation of mexGHI-opmD and PA2274, 13-17 as secondary metabolite in Pseudomonas aeruginosa, 13-12 to 13-16 Pyrimidine synthesis, 3-17 by orotate, 3-18 Pyrophosphoryl transfer and carbohydrates synthesis, 3-23
Q Quinone-mediated electron shuttling process, 2-10 to 2-12 Quorum-sensing (QS) inducer in S. natalensis, 13-9
R Rare sugars, 5-10 Recombinant protein production mammalian cell culture use in, 26-2 REDIRECT method, 24-8, 24-10 Redox potentials and mobile electron carriers, 2-9 to 2-10 flavoproteins, 2-12 iron-sulfur proteins, 2-12 quinones, 2-10 to 2-12 Repressilator oscillatory transcription network, 14-6 Rhamnulose 1-phosphate Type II aldolases in microorganisms, 3-25 Riboflavin biosynthesis and regulation of rib gene expression, 23-7 decoupled fermentation process, development of, 23-10 to 23-11 pathway engineering, 23-7 to 23-8 B. subtilis rib operon, 23-9 RB50::[pRF69]n, cultivation of, 23-10 physiological and commercial significance, 23-7 life cycle impact assessment of, 23-6 transketolase mutants with GTP cyclohydrolase II and DHBP synthase, 23-10 Ribokinase and pentose phosphate pathway, 3-24 Ribosome-associated ppGpp synthetase (RelA), Red and Act production, 13-11
S Saccharomyces cerevisiae, 21-18 amino acid production in, 10-15 artificial cell–cell communication in, 14-11 as biocatalyst L-malic acid degradation, 22-35 to 22-36 xylitol production, 22-36 to 22-37 bi-partite graph for genome-scale metabolic network of, 17-9 carbon and redox equivalents glycerol production, 22-16, 22-18 protein complexes, 22-17 TAM strain, 22-19 cellular properties, 22-34 evolutionary engineering, 22-35 diacetyl, 22-22 to 22-23 diauxie shift in, 19-1 to 19-2 ethanol production, modulation of, 22-19 to 22-20 NADH, dissipation, 22-21
growth on glucose metabolic network, 10-6 heterologous protein production, 22-34 lactic acid formation, 22-25 to 22-26 metabolic engineering, 22-30 design of, 22-2 nicotianamine (NA), 22-32 optimal metabolic flux patterns for aerobic growth of, 10-11 to 10-13 positive feedback loop in, 14-5 predicted and measured biomass yields, 10-7 sequencing of, 22-1 to 22-2 stereoselective bioreduction, 22-37 substrate range, extension, 22-3 cellobiose, 22-3 to 22-4, 22-6 cellulose fibers, 22-5 consolidated bio processing (CBP), 22-2 galactose, 22-10 to 22-12 lactose, 22-12 to 22-13 melibiose, 22-14 to 22-16 pentose sugars, 22-8 starch, 22-5, 22-7 to 22-8 xylan, 22-8 to 22-10 succinic acid, 22-21 terpenoids Erwinia uredovora for biosynthesis of, 22-26 to 22-27 metabolic engineering for, 22-26 pathway, 22-27 use and industrial exploitation of, 22-2 vitamin C, 22-34 S-Adenosylmethionine (SAM) function in secondary metabolism, 13-12 Salmonella enterica serovar Typhimurium propionyl-CoA synthetase (PrpE) and 3-hydroxypropionic acid conversion, 4-7 Salvage reactions, 3-19 Scale-free networks, 17-8; see also Metabolic network Secondary metabolism, 13-1 carbon metabolism and cyclic AMP, 13-10 to 13-11 light induction, 13-12 phosphate control, 13-11 to 13-12 ppGpp and nitrogen-limitation, 13-11 S-adenosylmethionine activation, 13-12 Sensitivity analysis methods, 18-8 to 18-9; see also Metabolic network Serine-glycine family biosynthesis from 3-phosphoglycerate, 3-8, 3-10 Severe combined immunodeficiency disease (SCID), 3-19 to 3-20 Shikimate pathway, 3-11 SimPheny tool, 18-12; see also Genome-scale metabolic models Single-cell measurement, 14-12 to 14-13 Solute transport system kinetics, 1-5 concentration of external solute Sx, 1-6 diffusion across lipid bilayer membrane, 1-6
Fick's law, 1-6 flux of Sx, 1-7 Lineweaver–Burk plots, 1-7 Michaelis and Menten model, 1-7 presence and absence of transporter, comparison of, 1-6 transporters, classes and mechanisms, 1-10 SoxS transcription factor concentration and signaling mechanism, 12-7 Squalene, 3-30 to 3-31; see also Neutral lipids synthesis ssrA tagged protein degradation kinetics, 14-8 Starch, 22-7 cell surface engineering, 22-5 cellulolytic and amylolytic yeast strains, 22-5, 22-8 Statistical framework theory linear constraints, 8-5 to 8-6 nonlinear constraints, 8-7 to 8-8 optimization formulation least squares method (LSM) problem, 8-4 to 8-5 variables and relations, 8-3 to 8-4 Steroids rare conversion with, 5-14 to 5-15 Sterols in membrane, 1-2 structure of, 1-3 Stoichiometric metabolic model, 10-1 to 10-2 modeling, 25-10 Streptomyces spp., 24-1 AfsR and AfsK signal transduction, 13-6 analytical techniques fluxome, 24-11 to 24-12 metabolome analysis, 24-11 proteome, 24-11 transcriptome, 24-10 to 24-11 carotenoid biosynthesis gene clusters (crt) in, 13-12 genome and its modification Streptomyces coelicolor chromosome, 24-4 to 24-5 industrial processes, examples of, 24-2 life cycle, 24-3 mechanism of light induction for carotenoid biosynthesis in, 13-13 metabolic engineering in biosynthetic gene clusters, expression of, 24-14 to 24-15 central transcriptional regulators, changing, 24-19 to 24-20 cofactor supply and, 24-15 morphology changes, 24-15 to 24-16 optimized hosts for, 24-13 to 24-14 oxygen supply during, 24-16 products modification, 24-20 to 24-21 recombinant protein, reducing degradation and, 24-16 to 24-19 molecular biology, 24-5 to 24-6 gene disruption, flowchart of, 24-7 genes, overexpression and replacement of, 24-6 REDIRECT method, 24-8
transposon-based mutagenesis techniques, 24-8 to 24-10 mutation of PhoR-PhoP system in, 13-11 to 13-12 PHO boxes in, 13-11 ppGpp-independent antibiotic production in, 13-11 ptpA gene, regulation of secondary metabolism from, 13-6 regulatory cascade involving A-factor in, 13-7 strains, modeling and design genome-scale metabolic models, 24-11 model guided strain design, 24-12 useful links for, 24-4 Structural observability analysis, 15-12 SWISSPROT database, 18-4; see also Constraint-based genome-scale models Symporters proteins, 1-11; see also Porter proteins Synthetic approach and biological motivation design and construction, 14-2 oocytes development, 14-4 process flow for design and construction, 14-3 Synthetic intercellular circuits artificial cell–cell communication system, 14-10 to 14-12 intercellular communication networks, 14-11 natural cell–cell communication systems, 14-9 Synthetic intracellular circuits integrated circuits integrated dynamic feedback controller, 14-7 to 14-8 integrated gene and metabolic oscillator, 14-8 plug and play circuits, 14-7 modular circuits negative and positive feedback loops, 14-5 oscillatory transcription network, 14-6 synthetic gene-metabolic circuits, 14-7 synthetic transcriptional feedback controllers, 14-5 synthetic transcription oscillators, 14-6 Systems biology component analysis, 26-10 to 26-11 gene expression, tools for, 26-13 metabolic flux analysis 13C flux analysis, 26-12 constraint-based flux analysis, 26-12 to 26-13 FBA, 26-11 to 26-12 recombinant protein production, culture use in, 26-2
T Terpenoids, 22-28 to 22-9 Erwinia uredovora for biosynthesis of, 22-26 to 22-27 metabolic engineering for, 22-26 pathway, 22-27
Thiamin biosynthetic pathway and regulatory genes, 23-15 to 23-16 pathway engineering thiamin-TMP-TPP pathway, 23-17 Thi biosynthetic enzymes, 23-18 physiological and commercial significance, 23-15 and regulatory genes, 23-17 Thiobacillus ferrooxidans, threshold concentration in electron donor for catabolic reaction, 11-16 TonB family auxiliary proteins, 1-11 to 1-12 Toy metabolic network, 16-23 Transcriptional programs first-principle and reversed engineered models construction functions for, 19-5 studies of, 19-4 to 19-5 Transglycosidases biocatalysts for oligosaccharide synthesis in vitro, 5-11 to 5-12 Transketolase (TK), glyceraldehyde 3-phosphate and sedoheptulose 7-phosphate formation, 3-23 Transmembrane electron flow systems, 1-16 to 1-18 Transporter classification (TC) system channels/pores, 1-9 to 1-11 classes and subclasses, 1-9 electrochemical potential-driven transporter, 1-11 to 1-12 group translocators, 1-13 to 1-15 International Union of Biochemistry and Molecular Biology (IUBMB) approved, 1-8 primary active transporters, 1-12 to 1-13 transmembrane electron flow systems, 1-16 to 1-18 transporter classification database (TCDB), 1-8 Transposon mutagenesis techniques in vitro, 24-10 in vivo, 24-8 to 24-10 Tricarboxylic acid (TCA/citric acid/Krebs) cycle, 2-17 Trichoderma reesei, 25-5 cellulase production in, 25-14 to 25-15 cellulolytic system, regulation of, 25-15 to 25-16 classical strain improvement and induction mechanism of, 25-15 heterologous protein production, 25-16 enzymes, deletion of, 25-18 gene modifications, 25-17
modified regulators, 25-17 to 25-18 xylitol, production of, 25-12 Triosephosphate isomerase (TpiA) and fructose-bisphosphate aldolase (FbaA) amplification effects on P(3HB) biosynthesis, 4-7 Tryptophan biosynthesis, 3-12 Two-component signal transduction (2CST), 12-7 to 12-8 α-Type channels, 1-10 Thyroid hormones rare routes for synthesis, 5-15 to 5-16
U Ubiquinone reduction, 2-11 Unfolded protein response (UPR), 25-17 to 25-18 Uniporters proteins, 1-11; see also Porter proteins
V Vancomycin glycosylation by glycorandomization strategies, 5-11 Vitamin B6 biosynthesis of, 23-21 pathway engineering in B. subtilis, 23-22 Vitamins rare routes for synthesis, 5-15 to 5-16
X Xenobiotics rare conversion with, 5-15 Xylan water-soluble polysaccharide degradation and yeast, 22-8 to 22-10
Z zwf and gnd genes overexpression, pentose phosphate (PP) pathways, 4-7 Zymomonas mobilis, 21-16 pyruvate decarboxylase (PDC) and metabolic engineering of fermentation pathways, 2-19 to 2-20
Figure 19.1 Schematic of the central dogma of molecular biology. Genetic information is transcribed into mRNA, which is then translated into protein machines. The layers and programs of metabolism are coupled and hierarchical; transcriptional programs influence translation, which then drives metabolic programs. Metabolic programs in turn influence transcription, thus forming feedback loops that integrate the metabolic layers.
Figure 21.6 Principal targets for metabolic engineering for shikimate production. Genes that have been overexpressed are shown by thick red arrows and are written with a large font size. Genes that have been deleted or inactivated are shown by dotted blue arrows, crossed by a line, and written with a small font size. Proteins (genes in parentheses): Glf (glf) = glucose facilitator, PTS (crr, ptsH, ptsI, ptsG) = phosphotransferase system, GalP (galP) = galactose MFS-transporter. Genes: aroG, aroF, aroH, aroB, aroE, aroL, aroK (see Figure 21.5), tktA = transketolase, pps, ppc, pck, pykF, pykA, glk = glucose kinase. Note: not all of the modifications have been carried out simultaneously. The most productive shikimate strain contains this combination of modifications: a nonfunctional PTS system (∆crrptsHptsI), nonfunctional shikimate kinases (∆aroLaroK), and a multicopy plasmid containing aroFfbr, aroB, aroE, tktA, and Z. mobilis glf and glk [26].
Figure 24.1 Streptomyces life cycle. The pictures show cross-sections of S. coelicolor colonies on agar medium. In the upper right corner of each picture the development stage is drawn schematically. Labels shown: spore, spore germination, growth, substrate mycelium, aerial mycelium and secondary metabolites, colored secondary metabolite (actinorhodin), spore maturation, agar.
Figure 24.2 Streptomyces coelicolor chromosome. The outer scale is numbered anticlockwise in megabases and indicates the core (dark blue) and arm (light blue) regions of the chromosome. Circles 1 and 2 (from the outside in), all genes (reverse and forward strand, respectively) color-coded by function (black, energy metabolism; red, information transfer and secondary metabolism; dark green, surface associated; cyan, degradation of large molecules; magenta, degradation of small molecules; yellow, central or intermediary metabolism; pale blue, regulators; orange, conserved hypothetical; brown, pseudogenes; pale green, unknown; grey, miscellaneous); circle 3, selected “essential” genes (for cell division, DNA replication, transcription, translation and amino-acid biosynthesis, color coding as for circles 1 and 2); circle 4, selected “contingency” genes (red, secondary metabolism; pale blue, exoenzymes; dark blue, conservon; green, gas vesicle proteins); circle 5, mobile elements (brown, transposases; orange, putative laterally acquired genes); circle 6, G + C content; circle 7, GC bias ((G – C)/(G + C); khaki indicates values > 1, purple < 1). The origin of replication (Ori) and terminal protein (blue circles) are also indicated. Labels shown: Streptomyces coelicolor, 8,667,507 bp; Ori. (From Bentley, S.D., et al., Nature, 417, 141, 2002. With permission.)