Proteases of Infectious Agents


Author: Ben Dunn (Editor)


PREFACE

As the decade of the 1990s draws to a close, it is appropriate to assess the changes that have taken place in the drug discovery process. Previously, an infectious agent was identified and cultured for two purposes: (1) so that tests could be established to screen compounds for inhibition of growth of the infectious organism, and (2) to permit isolation of proteins that could serve as drug targets in favorable cases where sufficient quantities could be obtained. In this decade a new paradigm has emerged. Isolation of the genetic material of the infectious organism, followed by sequence analysis at the DNA level by rapid, automated methods, can reveal the entire genomic structure in a short time. Straightforward analysis using sequence-searching algorithms can lead to the identification of possible critical functional activities. Cloning of the specific region of the genome can yield the target for new drug development. With the exploding database of protein structure, and with homology modeling programs, it is now possible to predict structure and initiate the drug discovery process while waiting for solution of the X-ray structure. During the period following the identification of sequences within the HIV genome that fit the template for the structure of an aspartic protease, drugs were designed and synthesized, and the effect on viral growth was demonstrated. Now that FDA-approved drugs have been shown to suppress viral growth and improve the health of thousands of patients, we can conclude that the new paradigm for drug discovery has been successful. Of course, this account should not fail to point out the continuing problems of patient noncompliance, low bioavailability, and rapid development of resistant viral strains. Nonetheless, the value of protease-directed drugs, as well as the best pathway to finding them, is clear. Fittingly then, this book begins with an account of the development of antiviral agents targeted for AIDS. 
Important lessons derived from this work include the demonstration, through catalytic mutation, that an active protease was essential for viral replication. Also, the value of the early determination of the three-dimensional structure of the enzyme and enzyme-inhibitor complexes has been established. Next, the iterative design process, based on structure determination, marks the HIV-1 protease case as a defining example for future work. Finally, the problems associated with drug metabolism and the resistance question complete the catalog of lessons learned in this case. Other chapters in this book summarize current knowledge regarding new drug targets from other infectious organisms. A significant new effort is directed toward a serine protease encoded by the hepatitis C virus. Due to the relatively recent identification of this virus and the lack of a cell-culture system, progress was very slow. In the past three years, however, expression of the protease and the solution of its structure have dramatically stimulated the search for drugs in this area. In addition, a shift occurred when many pharmaceutical companies cut back their HIV efforts because of the success of the Roche, Abbott, and Merck compounds. Hepatitis C was the next disease likely to have a significant public health impact in the United States and elsewhere due to the spread of the infection through transfusion before its discovery. As detailed in the chapter by Urbani, De Francesco, and Steinkühler on this subject, a strategy similar to that in the HIV case was followed. A huge body of work, conducted largely in pharmaceutical companies, is summarized by Qiu and Abdel-Meguid in their chapter on the human herpesviruses. Here again, the paradigm of discovery of a viral sequence, cloning, expression, and structure determination has proven to be the route to drug design. In this case especially, the structural insights reveal novel mechanisms of action that could provide clues for potent and selective inhibitor design.
The Candida genus provides an example of an infectious agent that is not a problem for a healthy human. However, in the case of a patient whose immune system has been impacted by infectious agents such as HIV or through treatment with immune-suppressing agents to avoid transplant rejection, severe systemic infections can be the ultimate cause of death. In the chapter by Stewart, Goldman, and Abad-Zapatero, several related crystal structures derived from protease variants cloned from C. albicans are described. Unique insights that point the way toward selective inhibitor design are derived. Work in the picornavirus arena has been under way for some time; however, the new information described by Bergmann and James seems likely to stimulate this area significantly. Progress in this field will have widespread impact, as the picornavirus family includes the rhinoviridae, which bring us the common cold, as well as many others. The chapters by Berry, on malaria, and by Cazzulo, on Chagas disease, represent the field of protozoan infectious species. Berry provides a thorough summary of current knowledge on hemoglobin degradation during the blood-borne stage of malaria. The parasite is somewhat unique in presenting two targets for drug discovery: a cysteine protease (falcipain) and an aspartic protease (plasmepsin). The interplay of these two enzymes in the complicated process of breakdown of the globin chain provides several strategies for attack. In the case of T. cruzi, the organism that causes Chagas disease, the major antigen is a cysteine protease, cruzipain (or cruzain). Cazzulo describes the involved life cycle, the properties of cruzipain, and other putative serine proteases of the organism. Successful infection by foreign agents frequently requires some function of the host. In the chapter by Kido, Chen, Murakami, Beppu, and Towatari, the role of cellular proteases in infection by influenza A and Sendai viruses in the respiratory tract and tryptase TL2 in T lymphocytes is described. The involvement of a cellular enzyme in the entry of HIV-1 into cells is also discussed. The implications for future therapeutic intervention are clear, but the complications of attempting to alter the function of a normal cellular enzyme are also significant. While it has been clearly established that polyprotein processing in the case of viruses is an essential feature of their life cycle, the world of bacteria is not so straightforward. One point of attack is the necessity for processing newly synthesized proteins targeted by the bacteria for export. Lively discusses the bacterial signal peptidases in a chapter that precedes structure determination. Nonetheless, the ground is well-prepared for the anticipated structural information. In this case as in the others, the differences between the bacterial enzyme and the related human enzyme will be critical to development of selective inhibitors. The world of plants is represented by the chapter of García, Fernández-Fernández, and López-Moya describing plant viruses. This area is unique in that a wide range of protease mechanisms are found, lacking only the metalloproteases.
Given the huge impact of plant viruses on crop production, this is a field with a large potential for future growth. While this compilation is not encyclopedic, the chapters presented cover a wide range of infectious agents and mechanisms. There also are obvious differences in the state of knowledge of the different proteases described. This is appropriate, as efforts in the area of proteases will continue to expand into research in new diseases and infectious agents as they are discovered. The task before us is clear: Find the critical protease and develop a potent and selective inhibitor. Ben M. Dunn

CONTRIBUTORS

Numbers in parentheses indicate the pages on which the authors' contributions begin.

CELE ABAD-ZAPATERO (117) Protein Crystallography D-46Y, Abbott Laboratories, Abbott Park, IL 60064

SHERIN S. ABDEL-MEGUID (93) Department of Macromolecular Science, SmithKline Beecham, King of Prussia, PA 19406

YOSHIHITO BEPPU (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan

ERNST M. BERGMANN (139) Department of Biochemistry, University of Alberta/Edmonton, Edmonton, Alberta T6G 2H7, Canada

COLIN BERRY (165) Cardiff School of Biosciences, Cardiff University, Cardiff CF1 3US, Wales, UK

JUAN JOSÉ CAZZULO (189) Instituto de Investigaciones Biotecnológicas, Universidad Nacional de General San Martín, San Martín 1650, Buenos Aires, Argentina

YE CHEN (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan

RAFFAELE DE FRANCESCO (61) Istituto di Ricerche di Biologia Molecolare, 00040 Pomezia, Rome, Italy

MICHAEL A. EISSENSTAT (1) Structural Biochemistry Program, NCI-FCRDC, Frederick, MD 21702

JOHN W. ERICKSON (1) Structural Biochemistry Program, NCI-FCRDC, Frederick, MD 21702

MARÍA ROSARIO FERNÁNDEZ-FERNÁNDEZ (233) Centro Nacional de Biotecnología, Campus de la Universidad Autónoma, 28049-MADRID, Spain

JUAN ANTONIO GARCÍA (233) Centro Nacional de Biotecnología, Campus de la Universidad Autónoma, 28049-MADRID, Spain

ROBERT C. GOLDMAN (117) Anti-infective Group, D-47, AP-9A, Abbott Laboratories, Abbott Park, IL 60064

MICHAEL N. G. JAMES (139) Department of Biochemistry, University of Alberta/Edmonton, Edmonton, Alberta T6G 2H7, Canada

HIROSHI KIDO (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan

MARK O. LIVELY (219) Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC 27157

JUAN JOSÉ LÓPEZ-MOYA (233) Centro Nacional de Biotecnología, Campus de la Universidad Autónoma, 28049-MADRID, Spain

MEIKO MURAKAMI (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan

XIAYANG QIU (93) Department of Macromolecular Science, SmithKline Beecham, King of Prussia, PA 19406

CHRISTIAN STEINKÜHLER (61) Istituto di Ricerche di Biologia Molecolare, 00040 Pomezia, Rome, Italy

KENT STEWART (117) Molecular Modeling Group, D-46Y, AP-10, Abbott Laboratories, Abbott Park, IL 60064-3500

TAKAE TOWATARI (205) Department of Enzyme Chemistry, Tokushima University of Medical School, Tokushima 770, Japan

ANDREA URBANI (61) Istituto di Ricerche di Biologia Molecolare, 00040 Pomezia, Rome, Italy

HIV Protease as a Target for the Design of Antiviral Agents for AIDS

JOHN W. ERICKSON AND MICHAEL A. EISSENSTAT

Structural Biochemistry Program, National Cancer Institute-Frederick Cancer Research and Development Center, Frederick, Maryland 21702

I. Introduction
II. Retroviruses, HIV, and AIDS
III. HIV Protease: Biology, Biochemistry, and Structure
IV. Design of HIV Protease Inhibitors
V. Drug Resistance
VI. Future Challenges in HIV Protease Inhibitor Design
VII. Summary and Conclusions
References

I. INTRODUCTION

The design of clinically effective inhibitors of HIV protease has been a major success story for modern antiviral therapy and has raised the awareness and attractiveness of viral proteases in general as drug design targets (see footnote 2). Current interest in viral proteases as drug design targets is highlighted by the chapters in this book devoted to the proteases of herpesviruses, adenoviruses, plant viruses, picornaviruses, hepatitis C viruses, and HIV. A common strategy in the life cycle of viruses is the utilization of polycistronic messenger RNAs that can be translated into precursor polyproteins, which subsequently are processed enzymatically into mature, functional proteins during virus assembly (Kräusslich et al., 1988; Kay and Dunn, 1990). Processing enzymes may be either cellular or viral encoded and offer novel targets for intervention. Virus-encoded proteases afford particularly attractive therapeutic targets for the design of antiviral agents that are highly specific and nontoxic to their host cells. Viral proteases can be classified according to their mechanism of action as serine, cysteine, aspartic, or metalloproteinases. This chapter discusses the key biological and structural features of HIV protease (HIV PR) that define its usefulness as an antiviral target. A brief review of, as well as an update on, efforts to discover and design inhibitors of HIV PR is presented. Finally, the problem of drug resistance to HIV PR inhibitors is discussed with an emphasis on viral strategies for resistance and on implications for future drug discovery efforts in this area.

Footnote 1. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

Footnote 2. There are two distinct types of human immunodeficiency virus, HIV-1 and HIV-2. In addition, multiple subtypes have been identified for HIV-1. For clarity, HIV will be used throughout this chapter to denote HIV-1, subtype B unless otherwise indicated.

Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

II. RETROVIRUSES, HIV, AND AIDS

The human immunodeficiency virus type 1 (HIV) causes acquired immunodeficiency syndrome (AIDS) in humans. This disease was recognized only recently in the U.S., around 1981, as a unique clinical syndrome manifested by opportunistic infections or malignancies associated with an underlying defect of the immune system characterized by the progressive loss of CD4 helper T cells. AIDS is nearly always fatal unless treated. Reported cases currently number upward of 2.0 million worldwide, and the true figure may be closer to 10 million based on recent estimates from the 1998 AIDS meeting held in Geneva, Switzerland. The number of HIV-infected individuals is much higher, around 30 million, and current projections of global prevalence predict 40-100 million HIV-infected individuals by the year 2000.

Human immunodeficiency virus is a lentivirus and belongs to the family Retroviridae, which are enveloped, positive-sense, single-stranded RNA viruses. Retroviruses induce a variety of neoplastic diseases and are widely distributed among vertebrate species. A defining characteristic of these viruses is their use of an RNA-dependent DNA polymerase, or reverse transcriptase (RT), for replication of the viral RNA. Many of the molecular events specific to HIV-1 infection have been characterized, including functions common to the retrovirus life cycle (Fig. 1). Infection by a retrovirus results in the synthesis of one or more double-stranded DNA intermediates by the action of RT, and at least one DNA copy of the viral genome is integrated into the host DNA. This proviral DNA serves to direct its own transcription, translation, and assembly of new virions. This ability of retroviruses to incorporate their genomes into that of their host cells endows them with the capability to be stably maintained during the life of the host and even to be transmitted through the germ line.
While most germline "endogenous" retroviruses found in animals and humans are believed to be nonpathogenic, tumor-causing retroviruses are fairly common in animals, and their discovery dates back to 1901. However, the first definitive reports that associated a retrovirus with disease in humans would not appear until 1980, when it was discovered that adult T-cell leukemia (ATL) was caused by human T-cell leukemia virus (HTLV-I) (Poiesz et al., 1980). First described in 1977, ATL is highly malignant; median survival is measured in months (Kawano et al., 1985). Current prevalence estimates of ATL are around 10-20 million cases worldwide. There is currently no effective therapy for ATL. Human T-cell leukemia virus has also been identified as the cause of a myelopathy as well as an aggressive form of non-Hodgkin's T-cell lymphoma. Shortly after the discovery of HTLV-I, another C-type retrovirus closely related to HTLV-I, named HTLV-II, was isolated from a patient with hairy cell leukemia (Kalyanaraman et al., 1982). Two years later a third human retrovirus, initially called HTLV-III, was linked to AIDS (Gonda et al., 1985) and was later found to be nearly identical to lymphadenopathy-associated virus isolated a year earlier (Wain-Hobson et al., 1991). When the capsid morphology and genetic structure clearly identified HTLV-III as a lentivirus (Gonda et al., 1985), its name was changed to HIV-1. The lentiviruses comprise a subfamily of retroviruses that characteristically cause chronic infections and disease in animals and include bovine, feline, and simian immunodeficiency viruses, equine infectious anemia virus, and visna virus of sheep. A second human AIDS virus, HIV-2, was discovered in the mid-1980s (Clavel et al., 1986). Human immunodeficiency virus type 2 has a geographic distribution distinct from that of HIV-1 and is found mainly in West Africa.
Human immunodeficiency virus type 2 appears to have a lower morbidity and longer clinical latency than HIV-1 and is closely related in sequence to SIV, a lentivirus that causes an AIDS-like disease in macaques (Gao et al., 1994). This brief history underscores the fact that the identification of disease-associated retroviruses in humans is fairly recent. Thus, much of our current understanding of HIV infection and, in particular, the identification and characterization of targets for antiviral drug design have drawn heavily on basic virology studies obtained with animal retroviruses prior to the discovery of HIV (for a comprehensive review, see Coffin et al., 1997). The discovery of HIV-1 in 1984 led to a parallel explosion of research on the molecular virology of this infectious agent and to an intensive, global search for a cure for this fatal disease that continues to the present. Efforts to inhibit HIV continue to represent one of the most active areas of antiviral research today, and nearly every aspect of the viral life cycle has become a target for antiviral drug discovery (Mitsuya and Broder, 1987; De Clercq, 1995). Modern antiviral strategies are turning more and more to the use of structure- and mechanism-based approaches for the design of safer, more specific, and effective drugs. The successful introduction of HIV PR inhibitors for AIDS treatment is a testimonial to the importance of structure- and mechanism-based approaches in the discovery and design of more effective, less toxic antiviral therapies. To date, there are four FDA-approved PR inhibitors: Saquinavir, Ritonavir, Indinavir, and Nelfinavir. These compounds are highly potent and play a central role in the development of the highly active antiretroviral therapies that comprise the current standard of care and that, for the first time in the brief history of HIV infection, provide dramatic and durable suppression of HIV replication. It is probably safe to assume that much of the attention currently being paid by the pharmaceutical industry to other viral proteases as drug design targets is a "coattail" effect based on the HIV success story. It can be argued that much of the current focus on viral proteases as targets for antiviral therapy really stems from the collision of two disparate disciplines: protease biochemistry and virology. Thus, it is worth reflecting on the scientific developments that led up to the identification of HIV PR as an attractive target for drug design.

III. HIV PROTEASE: BIOLOGY, BIOCHEMISTRY, AND STRUCTURE

A. ROLE OF HIV PROTEASE IN THE VIRAL LIFE CYCLE

The HIV genome, like all other retroviral genomes, is a single-stranded, positive-sense RNA molecule that is organized into three major coding elements: the gag, pol, and env genes. The gag and pol gene products are translated from a single unspliced polycistronic mRNA that encodes both genes (Fig. 2). A stop codon in the unspliced RNA leads to the translation of a 55-kDa Gag polyprotein, Pr55gag, that contains sequences of the structural proteins of the virion, namely matrix (MA), capsid (CA), and nucleocapsid (NC), along with the peptides p2, p1, and p6 that are involved in the assembly and morphogenesis of mature capsids. The pol gene encodes the viral enzymes necessary for replication: protease (PR), reverse transcriptase (RT), and integrase (IN). These proteins are also translated as part of a larger polyprotein precursor, Pr160gag-pol, which results from ribosomal frameshift and readthrough during translation of the gag gene. The frameshift site has been mapped to lie between the NC and p6 coding sequences. The gag and gag-pol gene products in mature virions are found in a ratio of 20:1, which represents the frequency of ribosomal frameshifting, about 5%. Thus, frameshifting is used as a regulatory mechanism to ensure that large numbers of the structural proteins of the virion are synthesized relative to the viral enzymes, which are needed in only catalytic amounts.

FIGURE 1 The HIV life cycle. Human immunodeficiency virus infects a T cell via recognition of the CD4 receptor on the cell surface. Fusion of the viral envelope and cell membranes leads to cytoplasmic invasion by the nucleoprotein core of the virus. Proviral DNA is synthesized using the virion-associated reverse transcriptase and tRNA as a primer. Integration of proviral DNA is mediated by a viral integrase, also present in the infecting virion, and host factors. Transcription of proviral DNA into spliced and unspliced RNAs provides mRNAs for translation of the gag, pol, and env gene products as well as viral RNA for packaging. Assembly of the Gag and Gag-Pol precursor proteins and packaging of the viral RNA occurs at the cell membrane. Extracellular budding of virions results in the acquisition of an envelope which contains the viral env proteins required for subsequent rounds of receptor recognition and fusion. Processing of the Gag and Gag-Pol polyproteins occurs during budding and release and is mediated by a viral protease. Reprinted from Coffin (1996) with permission.

FIGURE 2 Genome organization and translational strategy for HIV. Structural (gag, pol, and env) genes are shaded; regulatory (tat and rev) and accessory (vif, nef, vpr, and vpu) genes are clear. Common to all retroviruses, the gag and pol gene products are translated on free ribosomes in the cytoplasm from newly synthesized unspliced viral RNA. Translation usually occurs through to a stop codon at the 3' end of the gag gene, resulting in the structural polyprotein Pr55gag. About 5% of the time, ribosome frameshifting during translation of gag results in the synthesis of a Gag-Pol fusion protein, Pr160gag-pol. The frameshift site (fs) is located upstream of the Gag p6 protein such that a transframe polypeptide, TF, is incorporated into Gag-Pol in place of p6. The functions of the p6 and TF proteins are unclear. The total number of amino acids contained by each polyprotein is indicated at the end of each molecule. See text for individual protein abbreviations. Reprinted from Swanstrom and Wills (1997) with permission.

The N-termini of Pr55 and Pr160 both contain a covalently attached myristic acid moiety that is added cotranslationally and targets the polyproteins to the cellular membrane where virus assembly and budding take place (Fig. 3). The Pr55 precursor protein is believed to play a central role in directing virion assembly and RNA packaging, based on studies with other retroviruses that show that enveloped nucleoprotein core particles can form from Gag precursor proteins in the absence of pol and env gene products. Proteolytic processing of Pr55gag and Pr160gag-pol during virus assembly and maturation is performed by the viral PR, which is itself encoded by the pol gene (Swanstrom and Wills, 1997, and references therein). The env gene product, gp160, is processed into gp120 and gp41 by a cellular protease. The processing products of HIV PR include the gag-encoded structural proteins and peptides (MA, CA, NC, p1, p2, and p6) and the pol enzymes (RT, IN, and PR). All of these products are found in mature infectious viral particles and result from cleavages at unique amino acid sequences that span the N- and C-termini of the mature proteins (Debouck, 1992) (Fig. 4).

FIGURE 3 Budding and maturation of HIV-1 from an infected cell. Both immature (third panel from left) and mature (fourth panel from left) forms are shown. The mature virion exhibits the elongated capsid morphology typical of most lentiviruses. Inactivating mutations in the viral protease gene, or the presence of protease inhibitors, block the morphogenesis of immature to mature virion. Reprinted from Swanstrom and Wills (1997) with permission.

The sequences recognized by HIV PR are diverse, but certain general features emerge. Hydrophobic amino acids are preferred at the P1-P1' residues that flank the scissile peptide bond, aliphatic and Glu/Gln residues are often found at P2', aromatic residues are almost never found at P3', and small residues are preferred at P2. Several sequences contain an aromatic residue at P1 followed by a Pro at P1'. Although all retroviral proteases appear to be structurally and functionally related, their cleavage site preferences vary widely. Efforts to predict cleavage sites for HIV PR have met with limited success (Poorman et al., 1991; Chou and Zhang, 1993) and our understanding of the basis of PR specificity is incomplete (Dunn et al., 1994; Katz and Skalka, 1994). However, identification of cleavage site sequences quickly led to the successful generation of a variety of synthetic substrates that facilitated the design of rapid and quantitative assays of PR activity (Hellen, 1994; Krafft and Wang, 1994).

PROCESSING SITES FOR HIV-1 PROTEASE

Site            P4  P3  P2  P1 / P1' P2' P3' P4'
MA/CA          -Ser-Gln-Asn-Tyr/Pro-Ile-Val-Gln-
CA/p2          -Ala-Arg-Val-Leu/Ala-Glu-Ala-Met-
p2/NC          -Ala-Thr-Ile-Met/Met-Gln-Arg-Gly-
NC/p1          -Arg-Gln-Ala-Asn/Phe-Leu-Gly-Lys-
p1/p6          -Pro-Gly-Asn-Phe/Leu-Gln-Ser-Arg-
TF/PR          -Ser-Phe-Asn-Phe/Pro-Gln-Ile-Thr-
PR/RT          -Thr-Leu-Asn-Phe/Pro-Ile-Ser-Pro-
RT/IN          -Arg-Lys-Val-Leu/Phe-Leu-Asp-Gly-
RT (internal)  -Ala-Glu-Thr-Phe/Tyr-Val-Asp-Gly-

FIGURE 4 Cleavage site sequences in Gag and Gag-Pol polyproteins recognized by HIV PR. Cleavage occurs between the residues in the P1 and P1' positions, which are separated by a slash. The RT (internal) site represents a PR-mediated cleavage at the junction of the p51/RNase H domains, which yields the active p66/p51 heterodimer found in isolated virus particles (Tomasselli et al., 1993). The nomenclature of Schechter and Berger (1967) is used to designate residue positions in the substrate sequence.

The HIV PR has long been known to be toxic to cells, and this has prompted a search to identify cellular proteins that may be cleaved by HIV PR. Several investigators have shown that key proteins, such as NF-κB and certain cytoskeletal proteins, are cleaved in HIV-infected cells (Shoeman et al., 1993). The possible involvement of PR in the early stages of retroviral replication was suggested initially on the basis of observations with equine infectious anemia virus (Roberts and Oroszlan, 1989) and later with HIV (Baboonian et al., 1991; Nagy et al., 1994). However, recent data from several groups have demonstrated that PR inhibitors fail to block the synthesis of proviral DNA, its integration into cellular DNA, and transcription (Jacobsen et al., 1992; Uchida et al., 1997). Similar conclusions were reached using conditional lethal HIV-1 PR mutants as a probe (Kaplan et al., 1996).

In 1988, the late I. Sigal and co-workers observed that deletion mutagenesis of the HIV PR gene resulted in the production of virus particles that had an immature morphology and were noninfectious (Kohl et al., 1988). These results were confirmed by mutation of the active site aspartic acids and subsequently by chemical inhibition with PR inhibitors (Seelmeier et al., 1988; McQuade et al., 1990).
This seminal experiment provided conclusive proof that HIV PR is essential for the life cycle of HIV and defined this enzyme as an important target for the design of specific antiviral agents for AIDS. A similar conclusion had been reached for the PR of murine leukemia virus in 1985 (Crawford and Goff, 1985; Katoh et al., 1985). However, the HIV PR studies provided the boost needed by many groups to launch drug discovery programs for this target.
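The sequence preferences described for the Figure 4 cleavage sites can be checked by tabulating the P1 and P1' residues directly. The following is only an illustrative sketch: the dictionary layout and the function names are hypothetical, the P4 residue of the RT (internal) site is read as Ala, and P2'-P4' of the PR/RT site are assumed to be Ile-Ser-Pro because that part of the table is not legible in this copy.

```python
from collections import Counter

# Cleavage sites from Figure 4, written as (P4 P3 P2 P1, P1' P2' P3' P4').
# Assumption: P2'-P4' of PR/RT (Ile-Ser-Pro) and the P4 Ala of the
# RT (internal) site are filled in; they are not legible in this copy.
CLEAVAGE_SITES = {
    "MA/CA": ("Ser Gln Asn Tyr", "Pro Ile Val Gln"),
    "CA/p2": ("Ala Arg Val Leu", "Ala Glu Ala Met"),
    "p2/NC": ("Ala Thr Ile Met", "Met Gln Arg Gly"),
    "NC/p1": ("Arg Gln Ala Asn", "Phe Leu Gly Lys"),
    "p1/p6": ("Pro Gly Asn Phe", "Leu Gln Ser Arg"),
    "TF/PR": ("Ser Phe Asn Phe", "Pro Gln Ile Thr"),
    "PR/RT": ("Thr Leu Asn Phe", "Pro Ile Ser Pro"),
    "RT/IN": ("Arg Lys Val Leu", "Phe Leu Asp Gly"),
    "RT (internal)": ("Ala Glu Thr Phe", "Tyr Val Asp Gly"),
}

AROMATIC = {"Phe", "Tyr", "Trp"}


def p1_pairs(sites):
    """Map each site name to its (P1, P1') residues flanking the scissile bond."""
    return {
        name: (left.split()[-1], right.split()[0])
        for name, (left, right) in sites.items()
    }


def p1_counts(sites):
    """Count how often each residue occupies the P1 position."""
    return Counter(p1 for p1, _ in p1_pairs(sites).values())


def aromatic_pro_sites(sites):
    """List sites with an aromatic P1 residue followed by Pro at P1'."""
    return [
        name
        for name, (p1, p1_prime) in p1_pairs(sites).items()
        if p1 in AROMATIC and p1_prime == "Pro"
    ]


if __name__ == "__main__":
    print("P1 residue counts:", dict(p1_counts(CLEAVAGE_SITES)))
    print("Aromatic/Pro sites:", aromatic_pro_sites(CLEAVAGE_SITES))
```

Only the P1 and P1' columns enter the tally, so the two filled-in residues noted above do not affect the result.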

B. STRUCTURE AND MECHANISM OF HIV PROTEASE

Identification of the mechanistic family that a viral protease belongs to is the key to predicting its structure and function and may unlock strategies for inhibitor design that were previously developed for homologous members of the family. This concept was extremely valuable for the design of HIV PR inhibitors and led directly to the design of the first approved drug, Saquinavir, long before the structure of the enzyme was known. Homology modeling and biochemical inhibition studies on retroviral proteases had led to the hypothesis that these enzymes were related mechanistically to the aspartic proteinase family, typified by pepsin (Toh et al., 1985; Katoh et al., 1987). The active site of these bilobed enzymes contains two aspartic acids, one from each lobe, that participate in the catalysis of peptide bond breakage. Crystal structures of aspartic proteinases from cellular organisms revealed that the N- and C-domains associate to form an active site with approximate twofold symmetry at the protein backbone level (Davies, 1990). These observations led to the suggestion that the cellular enzymes had evolved via a duplication event of a primordial aspartic protease gene (Tang et al., 1978). Since the sequence length of retroviral proteases is about one-third that of typical aspartic proteases, it was proposed that the former enzymes are composed of two identical subunits, each of which contributes a single aspartic acid to the active site (Pearl and Taylor, 1987). This hypothesis was verified by the crystal structure determination of Rous sarcoma virus (RSV) protease by Wlodawer and colleagues (Miller et al., 1989) and subsequently of HIV PR by several laboratories (Lapatto et al., 1989; Navia et al., 1989; Wlodawer et al., 1989; Spinelli et al., 1991). So far, retroviruses are the only family of viruses whose proteases have adopted an aspartic proteinase mechanism.

The HIV PR dimer consists of two identical, noncovalently associated subunits of 99 amino acid residues associated in a twofold (C2) symmetric fashion (Wlodawer and Erickson, 1993) (Fig. 5a). The dimer is stabilized by a four-stranded antiparallel β-sheet formed by the interlocking N- and C-termini of each subunit. The active site of the enzyme is actually formed at the dimer interface and contains two conserved catalytic aspartic acid residues, one from each monomer.
The substrate binding cleft is composed of equivalent residues from each subunit and is bounded on one side by the active site aspartic acids, Asp25 and Asp125, and on the other by a pair of twofold related, antiparallel β-hairpin structures, or "flaps." Comparison of the structure of HIV PR with that of a complex with a peptide-based inhibitor (Miller et al., 1989) shows that the flap undergoes significant structural changes upon binding and that it makes several direct interactions with inhibitor (Fig. 5b). Molecular dynamics studies indicate that the flaps are highly flexible and must undergo large localized conformational changes during the binding and release of inhibitors and substrates (Collins et al., 1995). Crystal packing forces apparently maintain the flap in a conformation that is unsuitable for substrate binding in the structure of the uncomplexed form of the enzyme (York et al., 1993). The crystal structures of RSV and HIV PR revealed that, despite the apparent lack of sequence homology, aspartic proteinases and retroviral proteases display considerable structural homology at the backbone level (Rao et al., 1991). Fully one-third of the main chain atoms of RSV PR can be superposed onto the backbone of porcine pepsin to within 1.5 Å root-mean-square deviation. As expected, most of the structural correspondence is in the active site region. However, the overall chain topologies of the two families of enzymes are more similar than a simple superposition analysis reveals and are indicative of a distant but definite relationship to a common, ancestral aspartic proteinase gene.

FIGURE 5 Chain tracings of HIV PR from crystal structures of (A) unbound and (B) inhibited forms of the enzyme. Inhibitor is drawn as stick figure.
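The 1.5 Å figure quoted above is a root-mean-square deviation over paired backbone atoms after the two structures have been superposed. As a minimal sketch of that quantity (the function name `rmsd` and the toy coordinates are illustrative assumptions, not data from the RSV PR or pepsin structures, and the fitting step that finds the best superposition is assumed to have been done beforehand):

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation over paired atoms, in the units of the input.

    Assumes both coordinate sets are already superposed and list the
    equivalent atoms in the same order; no fitting is performed here.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and equally long")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(a, b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(a))


# Hypothetical coordinates in angstroms: every atom displaced by 1.5 along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.5, y, z) for (x, y, z) in reference]
print(rmsd(reference, shifted))
```

Because the value is an average over all paired atoms, a low overall RMSD can coexist with larger local deviations, which is one reason a simple superposition can understate topological similarity.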


C. STRUCTURE OF HIV PROTEASE INHIBITOR COMPLEXES

To date, several hundred crystal structures have been solved for various HIV protease/inhibitor complexes, a testimony to the importance placed on structural information in the process of inhibitor design (Fitzgerald and Springer, 1991; Huff, 1991; Tomasselli et al., 1991; Meek, 1992; Abdel-Meguid, 1993; Appelt, 1993; Erickson, 1993; Wlodawer and Erickson, 1993). Structural comparison of the inhibitor complexes reveals certain common features (Fig. 6). The inhibitor and enzyme make a pattern of complementary hydrogen bonds between their backbone atoms. In some instances, these hydrogen bonds are mediated by bridging water molecules. A unique feature found in the structure of HIV PR/inhibitor complexes is the presence of a water molecule that forms bridging hydrogen bonds between the NH atoms of Ile50 and Ile150 in the two flaps and the P2 and P1' backbone carbonyl groups of the inhibitor. This water is close to the twofold axis of the enzyme and is distinct from the water molecule that has been identified in the active site of uncomplexed structures of aspartic proteinases, including HIV PR. The latter has been implicated in substrate catalysis. The enzyme also contains a number of well-defined pockets, or subsites, in its active site region into which inhibitor side-chains protrude, resulting in tight binding interactions between enzyme and inhibitor. Since a similar pattern of hydrogen bonds is believed to be made for both substrates and peptidomimetic inhibitors, specificity is believed to reside in the pattern of largely nonpolar subsite interactions between inhibitor and enzyme side-chain atoms. Overall, knowledge of the structure and function of HIV PR and its relationship to other aspartic proteinases has led to the successful development of a wide variety of potent and chemically diverse inhibitors, as is discussed in the next section.
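The residue pairs named above (Asp25/Asp125, Ile50/Ile150) follow a convention in which the second subunit of the dimer is numbered 100 higher than the first. The sketch below makes that bookkeeping explicit; the helper names are hypothetical, and the primed form (for example Ile50') is a commonly used alternative notation for the same second-subunit residue.

```python
SUBUNIT_LENGTH = 99           # residues per HIV PR monomer, as stated in the text
SECOND_SUBUNIT_OFFSET = 100   # numbering convention implied by Asp25 / Asp125


def symmetry_mate(residue_number: int) -> int:
    """Return the number of the twofold-related residue in the other subunit."""
    if 1 <= residue_number <= SUBUNIT_LENGTH:
        return residue_number + SECOND_SUBUNIT_OFFSET
    upper = SECOND_SUBUNIT_OFFSET + SUBUNIT_LENGTH
    if SECOND_SUBUNIT_OFFSET + 1 <= residue_number <= upper:
        return residue_number - SECOND_SUBUNIT_OFFSET
    raise ValueError(f"residue number {residue_number} is outside both subunits")


def primed_label(name: str, residue_number: int) -> str:
    """Format a residue in primed notation, e.g. ('Ile', 150) gives Ile50'."""
    if residue_number > SECOND_SUBUNIT_OFFSET:
        return f"{name}{residue_number - SECOND_SUBUNIT_OFFSET}'"
    return f"{name}{residue_number}"


print(symmetry_mate(25), primed_label("Ile", 150))
```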

IV. DESIGN OF HIV PROTEASE INHIBITORS

The design of clinically effective HIV PR inhibitors has been a major success story for structure-based design (for reviews on clinically effective protease inhibitors see Vacca and Condra, 1997; Flexner, 1998). Several HIV PR inhibitors are currently in widespread use for the treatment of patients with AIDS (Mous et al., 1994; Ho et al., 1995; Wei et al., 1995; Markowitz et al., 1998). These compounds represent a new class of therapeutic agents that complement the already-licensed antivirals (AZT, ddI, ddC, and d4T), all of which inhibit HIV RT. Although the initial lead compounds have been generated in various ways, the availability of protein crystal structures of these leads with HIV PR facilitated structure-based approaches to the optimization of interactions to increase potency. Once potent inhibitors of the enzyme were generated, considerations such as improving antiviral potency, improving bioavailability (Kempf, 1994), and reducing cost could be addressed within the known structural limitations provided by these crystal structures. We make no attempt to describe in detail the enormous amount of medicinal chemistry that has been done over the past decade and that has led to the development of these successful drugs. Rather, we describe the various approaches that have been used to generate potent HIV PR inhibitors, with an emphasis on recent developments and in particular on how they might address the two major challenges remaining for drugs of this class: pharmacokinetics and effectiveness against resistant mutants. The inhibitors can be classified on the basis of the rationale used for their design as substrate-based, structure-based, nonpeptidic, and irreversible inhibitors. These distinctions are somewhat arbitrary since many inhibitors can fall into more than one category. We apologize in advance for placing compounds into a class that others may not agree with. One indication of the enormous activity in this field is the large number of cores or templates that have been used for elaboration of HIV protease inhibitors. Table I lists the various cores along with representative examples of potent inhibitors. Cores are arranged approximately by size according to the portion delimited by a heteroatom (usually nitrogen) or carbonyl connected by a chain of atoms to another heteroatom or carbonyl, allowing for a shorthand nomenclature for grouping purposes. When looking at the compounds in this way one sees that there are close similarities between the structures of the series pursued by the various groups.
As will become apparent (see below), the clinically successful compounds often resulted from incorporating pieces discovered by various groups onto a proprietary core or introducing a proprietary group onto an established core. In the discussions below, structural references to a specific core in Table I are made whenever possible.
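The shorthand used in Table I (NCCN, NCCCN, NCCCCN, and so on) lends itself to a simple size ordering. The following is only a sketch of that idea: the helper names are hypothetical, and the size measure (counting the atom symbols in the shorthand) is an assumption that works only because every atom in these labels has a one-letter symbol. Table I itself is ordered only approximately by size.

```python
import re
from typing import List


def backbone_atom_count(core: str) -> int:
    """Count atom symbols (capital letters) in a shorthand core such as 'NCCCN'."""
    return len(re.findall(r"[A-Z]", core))


def sort_cores_by_size(cores: List[str]) -> List[str]:
    """Order cores from shortest to longest; ties keep their input order."""
    return sorted(cores, key=backbone_atom_count)


# A few shorthand labels taken from Table I, deliberately shuffled.
cores = ["NCCCCN", "NCCN", "NCCCCC=O", "NCCCN", "NCPCN"]
print(sort_cores_by_size(cores))
```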

A. SUBSTRATE-BASED INHIBITORS

The close structural and functional relationships between the retroviral and cellular aspartic proteases, together with knowledge of the HIV PR cleavage site sequences on the Gag and Gag-Pol polyproteins, immediately opened the avenue of peptidomimetic substrate-based approaches that had been developed for designing inhibitors of human renin, an aspartic protease that was a popular target for the design of antihypertensive agents in the 1980s. Screening compounds generated in these programs rapidly identified reasonably potent inhibitors of HIV PR. Many of the cores shown in Table I had been previously generated as substrate mimetics for renin programs. Substrate peptidomimetic

TABLE I HIV Inhibitor Cores (structure drawings not reproduced)

Core | Formula | Example compound | Source
NCCN | NH(CH2)2NH | - | Glaxo (Humber et al., 1993)
NCCC=O | NHCH(R)CH(OH)CO | KNI-272 | Japan Energy (Mimoto et al., 1992; Kageyama et al., 1993; El-Farrash et al., 1994); Syntex (Tam et al., 1992); Takeda (Kitazaki et al., 1994)
NCCC=O | NHCH(R)COCO | - | Takeda (Kitazaki et al., 1994); Scripps (Slee et al., 1995)
NCCCN | NHCH(R)CH(OH)CH(R')NH | A-74704 | Abbott (Erickson et al., 1990; Kempf et al., 1990)
NCCCN | NHCH(R)CH(OH)CH2N(R)CO | SC-52151 | Monsanto-Searle (Getman et al., 1993)
NCCCN | NHCH(R)CH(OH)CH2N(R)SO2 | VX-478 (Amprenavir) | Monsanto-Searle (Getman et al., 1993); Vertex (Kim et al., 1995)
NCCCN | NHCH(R)CH(OH)CH2NR' | Ro 31-8959 (Saquinavir); AG 1343 (Nelfinavir); BILA 2011 (Palinavir) | Roche (Roberts et al., 1990; Thomas et al., 1994); Boehringer, Biomega ([citation illegible]; Beaulieu et al., 1997); Agouron, Lilly (Kalish et al., 1995; Kaldor et al., 1997)
NCCCN | NHCH(R)CH(OH)CH2NR'NH | - | Abbott (Sham et al., 1993, 1995); Ciba-Geigy (Fassler et al., 1993); Narhex (Grobelny et al., 1997)
NCCCN | NHCH(R)CH(OH)CH2NR'R''NH | AQ-148 | UCSF (Rutenber et al., 1996)
NCPCN | NHCH(R)PO(OH)CH(R')NH | SB-204,144 | SKB (Abdel-Meguid et al., 1993a)
NCCCC=O | NHCH(R)CH(OH)CH2CO | - | Glaxo (Holmes et al., 1993)
NCCCC=O | NHCH(R)CH(OH)CH(NHR')CO | - | Sandoz (Billich et al., 1994; Scholz et al., 1994)
NCCCC=O | NHCH(R)COCF2CO | MDL-74695 | MMD (Taylor et al., 1997)
NCCCS | NHCH(R)CH(OH)CH2S | - | Lilly (Cho et al., 1994)
OCCCO | OCH(R)CH(OH)CH(R')O | - | Lederle (Babine et al., 1992)
NCCCCN | NHCH(R)CH(OH)CH(OH)CH(R')NH | A-77003 | Abbott (Kempf et al., 1990, 1993, 1995a); SKB (Dreyer et al., 1993); Hoechst, Bayer (Budt et al., 1995; Lange-Savage et al., 1997); NCI (Hosur et al., 1994; Randad et al., 1996)
NCCCCN | NHCH(R)CH(OH)CH2CH(R')NH | A-80987; A-84538 (ABT-538, Ritonavir) | Abbott (Kempf et al., 1993)
NCCCCN | NHCH(R)CH(OH)CF2CH(R')NH | - | Abbott (Kempf et al., 1991)
NCCCCN | NHCH(R)COCF2CH(R')NH | A-79295 | Abbott (Kempf et al., 1991)
NCCCCN | NHCH(R)CH(O)CHCH2NH | - | Lilly (Munroe and Hornback, 1993); LG Chemical (Park et al., 1995)
NCPCCN | NHCH(R)P(O)(OH)CHOHCH(R')NH | - | Hoechst (Stowasser et al., 1992)
NCCCCC=O | NHCH(R)CH(OH)CH2CH(R')CO | - | Merck (Vacca et al., 1991); SKB (Dreyer et al., 1992); Upjohn (Mulichak et al., 1993)
NCCCCC=O | NR2CH2CH(OH)CH2CH(R')CO | MK-639 (L-735,524, Indinavir) | Merck (Dorsey et al., 1994a,b)
NCCCCC=O | NHCH(R)CH(O)CHCH2CO | LB-71350 | LG Chemical (Lee et al., 1996; Yoon et al., 1997)
NCCNCC=O | NHCH(R)CH2NHCH(R')CO | - | Czech Academy of Science (Urban et al., 1992)
NCCCCCN | NHCH(R)CH(OH)CH2CH2CH(R')NH | - | U. Montreal (Hanessian and Devasthale, 1996)
O=CCCCCCC=O | COCH(R)CH2CH(OH)CH2CH(R')CO | L-700417 | Merck (Bone et al., 1991); Lederle (Babine et al., 1992)
NCCCNCCCN | NHCH(R)CH(OH)CH2NHCH2CH(OH)CH(R)NH | BMS-186318 | BMS (Patick et al., 1995)
Nonpeptide | NHCH(R)CH(OH)CH2R' | AG-1254 | Agouron (Kalish et al., 1995); Roche (Thomas et al., 1994)
Nonpeptide | NHCH(R)CH(OH)CH2CH(R')Ar | SB-206343 | SKB (Thompson et al., 1994)
Nonpeptide | NHCH(R)CH(O)CH2CH2SO2NR' | - | LG Chemical (Choy et al., 1997)
Nonpeptide | ArCH2CH(OH)CH2CH2Ar | AG-1284 | Agouron, Lilly (Kaldor et al., 1994; Reich and Pino, 1994; Kalish et al., 1995; Reich et al., 1995; Melnick et al., 1996)
Imidazolone | - | - | DuPont-Merck (De Lucca, 1997)
Pyrimidin-2-one | - | - | DuPont-Merck (De Lucca et al., 1997a)
1,3-Diazepan-2-one | - | DMP-323 (XM-323); DMP-450 (XM-412); XV-652 | DuPont-Merck (Lam et al., 1994, 1996; Nugiel et al., 1996; Rodgers et al., 1996; Wilkerson et al., 1996; Jadhav et al., 1997; Smallheer et al., 1997; Wilkerson et al., 1997; Han et al., 1998; Patel et al., 1998); Uppsala (Hulten et al., 1997)
1,3-Diazepan-2-imine | - | - | DuPont-Merck (Jadhav et al., 1998)
1,4-Diazepan-2,3-dione | - | - | DuPont-Merck (Jadhav and Man, 1996)
1,2,4-Triazepan-3-one | - | A-98881 | Abbott (Sham et al., 1996)
1,3-Diazepan-2-sulfone | - | - | Uppsala (Backbro et al., 1997)
Dibenzcycloheptanone | - | - | Bayer (Wild et al., 1993)
Thiepane-dioxide | - | - | Gilead (Kim et al., 1996)
2-Pyrone | - | U-96988; PD-153103 | Upjohn (Romines and Thaisrivongs, 1995; Thaisrivongs et al., 1995, 1996); W-LPD (Lunney et al., 1994; Prasad et al., 1994, 1995a,b; Tummino et al., 1994)
Fused pyrones | - | U-103017 | Upjohn (Thaisrivongs et al., 1994, 1995; Romines and Thaisrivongs, 1995; Romines et al., 1995a,b, 1996; Skulnick et al., 1995, 1997); W-LPD (Lunney et al., 1994; Tummino et al., 1994)
Dihydropyrones | - | PNU-140690 | Upjohn (Thaisrivongs et al., 1996a,b; Janakiraman et al., 1998); W-LPD (Tait et al., 1996, 1997; Tummino et al., 1996; Hagen et al., 1997)
Fullerene | - | - | UCSF (Friedman et al., 1998)

inhibitors are essentially peptide substrate analogs in which the scissile peptide bond has been replaced by a noncleavable, transition-state analog or isostere. Many transition state-based inhibitors had been investigated that would mimic the hydrated amide presumed to represent the intermediate in amide hydrolysis catalyzed by aspartic acids (Szelke, 1985). A hydroxy group contained in nearly all HIV protease inhibitors is believed to mimic the transition state diol structure and to interact with the catalytic aspartates, residues 25 and 125. While much of the early work addressed peptide analogs with a core transition state mimetic (Dreyer et al., 1989; Norbeck, 1990; Norbeck and Kempf, 1991), it was not long before truncated analogs that removed much of the peptide character of the molecules were discovered. Although the early investigations centered on identifying suitable cores, the subsequent refinement of these structures to reduce molecular weight and peptidic character while maintaining potency became the focus of intensive efforts by many research groups. An early success story from these efforts was that of Saquinavir, which was reported by the Hoffmann-La Roche group in 1990 (Roberts et al., 1990) and was eventually approved for clinical treatment of AIDS in 1996. This hydroxyethylamine-based compound contains the NCCCN core, which has provided a rich source of potent HIV protease inhibitors (Table I). A unique aspect of the structure of Saquinavir was the introduction of the decahydroisoquinoline moiety in the P1' region, replacing the proline found in substrates. This substituent allows for maintenance of excellent enzyme inhibitory potency in the absence of a lipophilic core substituent normally required to make interactions with the S1' hydrophobic pocket. The oral bioavailability of this compound is low, although it did reach a mean plasma concentration above the in vitro antiviral EC90.
The poor bioavailability may be due to a combination of poor absorption and ready metabolism. These properties, together with high plasma protein binding (98%), limited the effectiveness of Saquinavir as a first-line therapy and provided incentive for the design of second-generation bioavailable drugs. Merck reported analogs of Saquinavir in which the decahydroisoquinoline has been replaced by a substituted piperazine and a tetrahydrothiophene derivative placed at P2 to provide analogs that are potent but more bioavailable (Kim et al., 1994, 1995a,b). Hydroxyethylene analogs that contain the NCCCCC=O core were developed by Merck (Lyle et al., 1991). These compounds retain a hydroxyethylamine subunit but on the P side rather than the P' side. The amine was introduced to remove peptide character as well as to provide a basic group for improved solubility. Piperazine analogs eventually proved to be optimal, with the additional nitrogen providing a site for introducing a substituent to interact with the S3 subsite as well as providing a weak base for better solubility (Vacca et al., 1994; Holloway et al., 1995). At the C-terminus an aminohydroxy indane was introduced as a phenylglycine surrogate (Lyle et al., 1991). The indane group fits into the lipophilic P2 pocket and the hydroxy end formed a hydrogen bond with the NH group of Asp29. This compound had much better oral bioavailability than Saquinavir, retained comparable potency (Ki = 0.5 nM), and eventually was approved for clinical use as Indinavir. As indicated in Table I, there are many other cores that were developed into potent HIV PR inhibitors. Among the more noteworthy is the hydroxymethyl carbonyl analog (NCCC=O core) first described by Kiso and co-workers (Mimoto et al., 1992) and later developed by Japan Energy Corporation (Kageyama et al., 1993). This shortened core retained considerable potency. The main additional structural variation was the introduction of novel proline mimetics in the P' region.
A variation on the hydroxyethylamine (NCCCN) core is the addition of a second nitrogen to provide a hydroxyethyl hydrazine, which also yielded potent analogs (Fassler et al., 1993; Sham et al., 1995). Another NCCCN core that became important is the hydroxyethyl urea core originally explored by Monsanto-Searle workers. This core comes from the same family as Saquinavir except that instead of being part of a basic amine the P' nitrogen is incorporated into a urea or sulfonamide moiety (Getman et al., 1993; Bryant et al., 1995; Vazquez et al., 1995). SC-52151 was taken to the clinic but failed to demonstrate antiviral activity, probably because its high protein binding decreased cellular uptake (Fischl et al., 1997). This core was eventually optimized by Vertex workers using structure-based approaches leading to a new clinical candidate, VX-478 (Amprenavir), vide infra. Much of the early work on the peptide-based inhibitors led to the realization that, unlike the case with other proteolytic enzymes such as renin, HIV PR readily tolerates dramatic departures from typical substrate side-chains in inhibitors. However, the usefulness of peptidomimetics as drug candidates has been hampered by their generally poor pharmacological properties such as oral bioavailability, metabolic stability, and pharmacokinetics.
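The plasma protein binding figures quoted in this section (for example 98% for Saquinavir) matter because only the unbound drug is generally considered available to act. A minimal arithmetic sketch, assuming a simple non-saturable binding picture and a purely hypothetical total plasma concentration of 1000 nM (neither the function names nor that concentration come from the source):

```python
def free_fraction(percent_bound: float) -> float:
    """Fraction of drug not bound to plasma protein, from percent bound."""
    if not 0.0 <= percent_bound <= 100.0:
        raise ValueError("percent_bound must lie between 0 and 100")
    return 1.0 - percent_bound / 100.0


def free_concentration(total_concentration: float, percent_bound: float) -> float:
    """Unbound concentration, in the same units as the total concentration."""
    return total_concentration * free_fraction(percent_bound)


# Hypothetical total concentration of 1000 nM at two binding levels.
for percent_bound in (98.0, 99.0):
    print(percent_bound, free_concentration(1000.0, percent_bound))
```

Under these assumptions, moving from 98% to 99% bound halves the free concentration, which is why seemingly small differences in protein binding can be significant.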


B. STRUCTURE-BASED INHIBITORS

The crystal structure of HIV PR immediately provided a structural basis for the development of a new generation of inhibitors that did not need to rely on substrate or peptide mimicry. Virtually all of the different approaches to improving potency have incorporated some aspect of structure into their design efforts, and numerous protein crystal structures of the enzyme with inhibitors have been generated (Wlodawer and Erickson, 1993; Gait and Karn, 1995). Structure-based design has been used to identify templates for elaboration as well as to optimize the interactions found in the original peptide-based substrates and to replace peptide moieties by nonpeptide groups. Combined with medicinal chemistry and, in some cases, target-based screening efforts, these structural investigations have created a structurally diverse compendium of inhibitors that ranges from AG-1343 (Nelfinavir), which was almost purely a structure-based design, to MK-639 (Indinavir) and VX-478 (Amprenavir), the designs of which were a blend of medicinal chemistry and structural insights. An early contribution to the structure-based design of HIV protease inhibitors was the symmetry-based inhibitors originally described by the Abbott group. They postulated that HIV protease, being a symmetrical dimer, might be inhibited by symmetrical inhibitors that were complementary to the twofold symmetry of the enzyme (Erickson et al., 1990; Kempf et al., 1990, 1991, 1993). This approach led to potent diamino diol inhibitors with the core NCCCCN. Examination of the hydroxy substituents on these compounds demonstrated that these symmetrical compounds were making unsymmetrical interactions with the enzyme, with one of the core hydroxy groups forming more hydrogen bonds to the catalytic aspartates than the other (Hosur et al., 1994). This allowed the rational removal of one of these hydroxyls with no loss in potency.
As with many of the peptide-based inhibitors that have been described, bioavailability was a significant problem with the Abbott compounds. The first clinical candidate, A-77003, had poor oral bioavailability. The second clinical candidate, A-80987, designed to have better solubility, was well absorbed but had a short plasma half-life as well as high plasma protein binding. Significant exploration of substituents was done to identify a potent, bioavailable analog. One focus was to replace the readily metabolized pyridine groups. Eventually ABT-538 was identified, whose bioavailability was high despite its relatively high molecular weight (721 Da) (Graul and Castaner, 1996). Plasma protein binding remained high (99%). Further investigation of this compound revealed that its improved bioavailability was at least in part a consequence of inhibition of cytochrome P450 enzymes. While inhibiting metabolizing enzymes is generally avoided in drug candidates, in this case it allowed the compound to achieve sufficient blood levels to be an effective drug clinically and it was approved as Ritonavir. An interesting sidelight of the P450 inhibitory effects of this compound is that it is being examined in combination with less bioavailable drugs such as Saquinavir and has been demonstrated to improve their bioavailability also (Kempf et al., 1997). Since both of these compounds are also substrates of the P-glycoprotein efflux system, this may also contribute to the improved bioavailability when they are used together (Alsenz et al., 1998; Lee et al., 1998). A problem with inhibition of P450 enzymes to improve bioavailability is that it can lead to multiple drug interactions and untoward side-effects. Later efforts in structure-based design have utilized some of the previously identified cores and used key interactions identified in inhibitor-enzyme crystal structures to design nonpeptidyl substituents that would make energetically favorable interactions with the enzyme. One such effort was by the Vertex group, who took the hydroxyethyl urea core (NCCCN class) described previously by Monsanto-Searle workers, incorporated a sulfonamide moiety into the P' region (also reported by the Monsanto-Searle workers), and used the 3-tetrahydrofuryloxycarbonyl substituent originally identified as a useful P2 substituent by Merck workers to generate VX-478 (Kim et al., 1995). This compound is being developed clinically as Amprenavir in a joint effort with GlaxoWellcome. It is an attractive compound in that the reduced molecular weight translates into good bioavailability while retaining significant potency (St. Clair et al., 1996). Plasma protein binding does not appear to be a major problem (Livingston et al., 1995). The Agouron group has been involved both in structure-based design to reduce the peptide nature of substrate-based analogs and in de novo design. Some of this work has been done in collaboration with Eli Lilly workers.
In one approach they utilized the Lilly discovery that the P' amine of Saquinavir could be replaced by a benzamide subunit while retaining significant potency. They then set out to remove the peptidic character of the P region by using an orthobenzamide to make a hydrogen bond to the flap water. Although this inhibitor was very weak, addition of other substituents provided single-digit nanomolar enzyme inhibitory potency (AG 1284) without any amide substituents in the core or any amino acid-derived substituents in the molecule. Remarkably, this potency was achieved with only four hydrogen bonds with either protein side-chains or ordered waters (Reich et al., 1995). Several groups virtually simultaneously reported their structure-based design of benzamide P2 substituents (Kaldor et al., 1995; Kalish et al., 1995; Kempf et al., 1995; Randad et al., 1995a, 1996). The Lilly group had previously demonstrated the utility of t-butyl benzamides and the Roche group t-butyl cyclohexylcarboxamides as P1'P2' substituents as part of a nonpeptidyl hydroxyethylaryl (alkyl) core (Kaldor et al., 1994; Thomas et al., 1994).
The Agouron group incorporated this substituent into the P' aromatic analogs mentioned above to generate low-molecular-weight, potent inhibitors (AG 1254). However, their poor aqueous solubility and weak antiviral activity precluded their development. Introduction of these P2 substituents back into the original Saquinavir hydroxyethyl amine analog, along with the additional replacement of the P1 benzyl by a phenylthiomethyl (Kaldor et al., 1995) to increase hydrophobic interactions with S1 and S3, provided Nelfinavir (NCCCN core), a potent analog with good bioavailability (Shetty et al., 1996; Rabasseda et al., 1997), which was recently approved for therapeutic use (Kaldor et al., 1997). Abbott and NCI workers independently studied the benzamide P2 substituent on the symmetrical core described above. The initial NCI paper (Randad et al., 1995a) describes the introduction of benzamide substituents at both P2 and P2' positions and the process by which the HIV protease structure was used to design the 3-hydroxy substituent. A crystal structure of this compound bound to the enzyme led to the introduction of the 2-methyl substituent to stabilize a conformation of the aromatic ring out of plane with the amide carbonyl and resulted in the same optimum substituent identified by the Agouron workers (Kalish et al., 1995). Abbott workers described similar compounds where the P and P' substituents were individually varied using both rational design and combinatorial approaches to arrive at a similar optimum substituent pattern (Kempf et al., 1995a). These substituted benzamides were also shown to be effective substituents on the Searle-Vertex core (Freskos et al., 1996). In subsequent work, the NCI group modified the aromatic substituents to incorporate an anthranilamide moiety, which allows for interactions at S3 and S3' and a significant boost in potency (Randad et al., 1995b). Working initially from the hydroxyethyl amine template, Ghosh et al.
designed a bis-THF P2 substituent to replace the P3/P2 spanning quinolinecarbonyl-asparagine moieties found in the earlier compound (Ghosh et al., 1994). The resulting compound was 8× less potent in the enzyme assay but only 2× less active in the antiviral assay. An X-ray structure showed that the bis-THF moiety forms hydrogen bonds with both the Asp29 NH and the Asp30 NH backbone amides. This analog has the advantages of improved aqueous solubility, decreased logP, and decreased MW (Ghosh et al., 1994, 1996). More recently Ghosh has reported that this P2 substituent is highly effective when placed in the Monsanto-Vertex core, with a Ki of 1.1 nM and an antiviral EC50 of 1.4 nM, which is an order of magnitude more effective than either the Vertex or Roche compounds (Ghosh et al., 1998). Profiling this compound against resistant mutants is in progress. The Biomega group discovered that a 2,6-dimethylphenoxyacetyl P2 substituent provided potent analogs (Tong et al., 1995). The DuPont-Merck group took a fundamentally different approach to designing an HIV protease inhibitor. They reasoned that if one could replace the flap water by an appropriate group while retaining the interactions with the catalytic Asps, one could design potent nonpeptidic HIV PR inhibitors. The initial hits from screening of their database for these interactions were hydroquinones and cyclohexanediols with micromolar potency (Lam et al., 1994). They and others also took the approach of cyclizing the Abbott inhibitor core, with appropriate stereochemical modifications, to provide a cyclic urea. The six-membered ring pyrimidinone analogs were weakly active (Randad et al., 1994), but the seven-membered ring analogs had subnanomolar potency, MW < 600, antiviral potency, and bioavailability in rats and dogs (Lam et al., 1994). The first of these compounds to enter clinical study was DMP-323, a bis-hydroxymethylbenzyl analog (Grubb et al., 1994). However, this compound was withdrawn because of low and variable drug levels. Many crystal structures of these compounds complexed with HIV protease have been generated (Ala et al., 1998). The seven-membered ring is roughly perpendicular to the catalytic aspartates and the diol unit interacts with these residues. The carbonyl oxygen replaces the commonly found flap water molecule and accepts H-bonds from Ile50 and Ile50'. The unsubstituted benzyl groups bind to S1 and S1' as in the acyclics and the N-hydroxymethylbenzyl groups bind to S2 and S2'. The four contiguous asymmetric centers are in the RSSR orientation, which contrasts with the stereochemistry seen in the acyclic analogs. Further research on these cyclic urea analogs has yielded many potent variations by both this group and others (Sham et al., 1996). The second-generation clinical candidate DMP-450 improved solubility and bioavailability by replacing the 4-hydroxymethyl substituents by 3-amino substituents.
However, the high (90%) plasma protein binding of this compound reduced its effective blood plasma concentration to a level where it might not be effective against mutant variants (Hodge et al., 1996). Since there is a high tolerance for variation at P2 and P2', which are also synthetically more accessible, much effort has focused on these substituents (Lam et al., 1996). Nonsymmetric analogs tended to have better solubilities (Wilkerson et al., 1997). Heterocycles have been incorporated which form additional H-bonds to the Asp30 carbonyl and NH (XV-652) (Rodgers et al., 1996; Han et al., 1998). Smaller ring-size ureas have been revisited and potency can be improved by replacing one of the P1 benzyl groups by a P1 phenethyl group, which allows both aromatic P1 substituents to occupy the preferred pseudoaxial positions (De Lucca et al., 1997a). A review of some of this work and that of others on cyclic inhibitors has appeared recently (De Lucca et al., 1997b). Interestingly, the interaction of the ring hydroxyls with the catalytic aspartates is less critical in the cyclic analogs since nonhydroxylated 5-, 6-, or 7-membered ring cyclic ureas retain inhibitory activity in the 100-nM range (De Lucca, 1997). The carbonyl of the urea has been replaced to generate a cyanoguanidine which also replaced the structural water molecule (Jadhav et al., 1998). Recent emphasis in their work has been on identifying analogs that retain good bioavailability while remaining effective against drug-resistant mutants (Jadhav et al., 1997). The Gilead and Uppsala groups extended this work into the realm of carbohydrate-derived cyclic analogs (Kim et al., 1996; Backbro et al., 1997; Hulten et al., 1997). The Gilead group set out to design cyclic analogs that were devoid of peptide character but retained the favorable solubility and low molecular weight of some of the DuPont-Merck analogs. In the resulting thiepane dioxides a sulfonyl group is used in place of a carbonyl to replace the structural water molecule, and the diol interactions with the catalytic aspartates are retained. The nitrogens of the DuPont-Merck compounds are replaced by carbons to which are attached the P2 benzyls. Benzyloxy or thiazol-4-methoxy substituents are used to generate the hydrophobic interactions with S1/S1'. The compounds were potent in the enzyme and antiviral assays and a meta-amino analog showed improved solubility and high bioavailability. The Uppsala group generated structures that were intermediate between the DuPont-Merck and Gilead compounds. D-Mannitol provided the required stereochemistry at all of the asymmetric centers. A key to using this starting material efficiently was the use of phenoxymethyl as the P1/P1' substituent. The nitrogens required for urea formation were brought in by double azide displacement of two of the hydroxy groups. Not surprisingly, the RSSR stereochemistry was required. Sulfamide (1,3-diazepan-2-sulfone) analogs were also potent. Surprisingly, the crystal structure of a sulfamide showed that the presumed P2' benzyl group actually binds in the S1' pocket and the (P1') phenoxymethyl group binds in the S2' pocket. This was proposed to be due to the difference in ring geometry between the cyclic urea and the sulfamide.

C. NONPEPTIDE INHIBITORS FROM SCREENING

Two groups have been pursuing a different cyclic nonpeptide template for the development of HIV protease inhibitors. Reviews of these efforts have appeared (Lee et al., 1996). Both Parke-Davis and Upjohn identified 4-hydroxypyrone derivatives as weak HIV protease inhibitors during mass screening of their chemical inventories. The Parke-Davis group identified coumarin and pyrone derivatives with Ki's in the micromolar range (Tummino et al., 1994). The low molecular weights, nonpeptide nature, and high solubility of these compounds encouraged extensive SAR studies. The crystal structure of the initial 4-hydroxy coumarin lead showed that the hydroxy group was H-bonded to the catalytic Asps (Lunney et al., 1994). The rest of the molecule existed in two different binding modes, both of which had the lactone oxygens interacting with the Ile50 NH group in the flap. Because the coumarin system did not allow efficient occupancy of the S2 subsite, work focused on the pyrone analogs, which showed similar binding interactions with the catalytic aspartates and the flap (Prasad et al., 1994). A highly potent (Ki = 3 nM) analog which only spanned the S1-S2' subsites was generated by introducing a 2-o-t-butylphenylthio substituent to occupy S1' and S2' and a 6-phenyl substituent to occupy S1 (Prasad et al., 1995a,b). Dihydropyrone analogs were generated since they gave additional flexibility for substitution that would provide favorable interactions (Hamilton et al., 1996; Tait et al., 1996, 1997). However, while this variation allowed for improvement in enzyme inhibitory potency to the single-digit nanomolar range, it did not translate into potent antiviral activity. It was found that polar substituents had to be introduced into the 3-arylthio group and the 6-phenylethyl group before submicromolar antiviral activity could be observed (Hagen et al., 1997). As anticipated from their structures, these compounds showed very high oral bioavailabilities.

The Upjohn series has been developed along a somewhat parallel path. High throughput screening of their chemical database uncovered the coumarin derivative warfarin as a weak HIV protease inhibitor. They rapidly identified a pyrone analog with aralkyl substituents at the 3- and 6-positions which had a Ki of 36 nM against the protease, an EC50 of 3 µM in the antiviral assay, and high oral bioavailability. This low-molecular-weight (362 Da), easily synthesized (three steps) compound, U-96988, was advanced to the clinic. Another series of compounds which have been explored by Upjohn are 5,6-cycloalkylpyrones, which also have greater flexibility than the coumarins. The cyclooctyl analog was significantly more potent, with the eight-membered ring effectively folding into the S1' pocket.
A potent analog containing an a-cyclopropylbenzyl substituent at C-3 was identified (Ki = 15 nM) with high bioavailability, but poor antiviral activity (Romines and Thaisrivongs, 1995; Romines et al., 1995a). Substitution of the benzyl substituent at the meta-position by a p-cyanobenzenesulfonamide improved enzyme potency to Ki = 0.8 nM and antiviral activity to IC50 = 1.5 µM. This compound (U-103017) had high bioavailability and was advanced to the clinic (Skulnick et al., 1997). A similar variation allowed for the preparation of potent coumarin analogs, which, however, lacked antiviral activity. A similar variation in the pyrone series showed similar potency. The X-ray structure showed that the sulfonamide oxygens formed bifurcated H-bonds with the Asp30 OH and NH groups. The imidazole NH group is H-bonded to the Asp29 NH group (Thaisrivongs et al., 1996). Potent dihydropyrones were also identified (Thaisrivongs et al., 1996b). The compound PNU-140690 has been shown to be effective against Ritonavir-resistant mutants in cell culture (Lee et al., 1996; Poppe et al., 1997).

The parallels between these two independent design efforts are striking. In both cases, the core 4-hydroxy-2-pyranone ring system was discovered from high throughput enzyme-based screening. In each case, the potency of the initial lead compounds was too low to possess measurable antiviral activity. Further optimization was guided by X-ray crystal structure analysis combined with modeling to identify novel binding modes and key positions of the molecules that could be used to incorporate substituents that would improve potency, retain bioavailability, and inhibit resistant mutants.

D. IRREVERSIBLE INHIBITORS

Irreversible enzyme inhibitors are often ignored as therapeutic agents because of toxicity and bioavailability concerns. A number of weak irreversible inhibitors of HIV protease have been described, and crystal structures of the resulting alkylated enzyme have been generated (Yu et al., 1996, and references therein). More recently a Korean group has described a series of potent irreversible HIV protease inhibitors where the reactive moiety is an epoxide functionality (Park et al., 1995; Lee et al., 1996; Choy et al., 1997). It was expected that the epoxide would bind in the catalytic site with one aspartate activating the epoxide via protonation while the other (as the carboxylate anion) would nucleophilically attack an epoxide carbon. Key to the reactivity of these compounds was the presence of only the P1 benzyl group (NCCCCN core). The presence of the P1' benzyl group commonly found in the reversible inhibitors destroys activity in the irreversible series. These compounds have subnanomolar apparent Ki values in the protease assay. Noteworthy is the good bioavailability observed with LB-71350 (NCCCCCC~O core), suggesting that the inherent reactivity of the epoxide is only unveiled once it is bound in the enzyme pocket (Jeong et al., 1997). We have generated a crystal structure of one of these compounds after reaction with HIV protease, which verifies the expected reaction mode and suggests a unique mode of binding in the P' region (T. N. Bhat, unpublished data). Irreversible inhibition of the enzyme may be an approach that proves useful in attacking protease mutants.

E. PHARMACOKINETICS OF HIV PROTEASE INHIBITORS

Unlike the case for the majority of RT inhibitors, most HIV PR inhibitors have serious pharmacokinetic limitations. Poor oral absorption, serum protein binding, liver enzyme metabolism, and other factors can all but eliminate the antiviral benefits of many potent protease inhibitors. Currently approved protease inhibitors need to be taken often and in large quantities to maintain effective antiviral concentrations in the blood. Of the currently available antiviral drugs for AIDS, protease inhibitors are among the most effective, but are costly and require difficult treatment regimens. As a result, the failure or drop-out rate of patients on protease inhibitor therapy tends to be relatively high, compliance with protease inhibitor treatment is likely to be poor, and the development of resistance to these highly effective drugs is a growing problem (Deeks et al., 1997).

V. DRUG RESISTANCE

Although many promising new anti-HIV drugs have been developed, their effectiveness has been hampered by the emergence of drug-resistant variants. Mutant viruses emerge in the presence of antiviral agents whenever the balance of mutant virus replication is favorable, i.e., the mutant provides a selective advantage to the virus in the presence of the drug. Biological selection of HIV mutants is particularly favorable owing to the nature of HIV infection, which is chronic, persistent, and characterized by a high steady-state rate of replication. Pharmacodynamic studies using potent HIV PR inhibitors have revealed that the combined half-life of plasma virus and virus-producing cells in the body is on the order of 2 days or less, with new virus being produced at a rate of 10^8-10^9 virions per day (Ho et al., 1995; Wei et al., 1995). These conditions, coupled with the high error rate of HIV-1 reverse transcriptase, less than 1 in 10,000 bases, favor rapid mutation and selection of drug-resistant virus (Coffin, 1995). Clinical resistance to every newly introduced anti-HIV agent has become the rule, and resistance to PR inhibitors is no exception (Erickson, 1995; Richman, 1996). Recent reports of the selection and transmission of multidrug-resistant HIV strains that contain multiple PR and RT mutations in the pol gene and also in cleavage site sequences in the gag gene have raised new alarms in the AIDS community. An understanding of the biological and structural mechanisms of resistance to HIV PR inhibitors is necessary in order to predict optimal salvage therapies; indeed, the full implications of drug resistance for disease outcome have not yet been realized. A great deal of progress has been made at unraveling the structural, biochemical, and virologic mechanisms of resistance to PR and RT inhibitors in light of the three-dimensional atomic structures of these proteins (Richman, 1996).
An important goal of these studies is to gain insights that may lead to new strategies to combat resistance and to eventually develop paradigms for structure-based strategies to combat resistance to antiinfectious disease agents.
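As a rough illustration of the arithmetic behind the replication figures quoted in this section, the sketch below computes how quickly a virus pool with a 2-day half-life decays and how many copying errors might be expected per genome per replication cycle. The genome length of about 9,700 nucleotides is an assumption that is not stated in this chapter, the error rate is taken at the quoted upper bound of 1 in 10,000 bases, and the function names are purely illustrative.

```python
def fraction_remaining(days: float, half_life_days: float = 2.0) -> float:
    """Fraction of the initial virus pool left after `days`,
    assuming simple first-order decay with the given half-life."""
    return 0.5 ** (days / half_life_days)


def expected_errors_per_genome(error_rate: float = 1e-4,
                               genome_length: int = 9_700) -> float:
    """Expected misincorporations per genome copy per replication cycle.

    error_rate: upper bound quoted in the text (1 in 10,000 bases).
    genome_length: ASSUMED approximate HIV-1 genome size in nucleotides.
    """
    return error_rate * genome_length


if __name__ == "__main__":
    print(f"pool left after 14 days: {fraction_remaining(14):.4f}")
    print(f"errors per genome per cycle: {expected_errors_per_genome():.2f}")
```

Under these assumptions, up to roughly one error is introduced per genome per cycle, which, combined with daily production of 10^8-10^9 virions, suggests why single-step resistance mutations are likely to be present before therapy begins.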

A. MECHANISMS

Over 57 mutations have been observed in at least 27 positions in the 99-residue HIV PR monomer in response to drug selection pressure (Hammond et al., 1997) (Fig. 7). Many of these mutations were first observed in vitro and presaged the emergence of mutants in the clinical setting in vivo, although the actual number and order of appearance of multiple mutations varied (Condra et al., 1995; Molla et al., 1996; Ives et al., 1997). They can be classified as active site vs nonactive site mutations according to whether they occur inside or outside the inhibitor binding subsites. A variety of resistance mechanisms to HIV PR inhibitors have been proposed based on our understanding of the structural biochemistry of the PR and on the nature of inhibitor binding to the enzyme (Erickson, 1995). Mutations of specificity-determining residues that would directly interfere with inhibitor binding and lead to loss of potency constitute an obvious mechanism for resistance to HIV PR inhibitors. Active site mutations are necessary but often are not sufficient for high-level resistance in the clinical setting (Condra et al., 1995). An explanation for this observation comes from biochemical studies that reveal a negative impact of many resistance-conferring active site mutations on enzyme activity (Sardana et al., 1994; Condra et al., 1995; Gulnik et al., 1995), suggesting that such mutations result in suboptimal virus. A second mechanism of resistance involves nonactive site mutations that indirectly alter the active site architecture via long-range structural perturbations. Mutations that enhance enzyme catalysis in the presence of inhibitors could constitute a third mechanism. Finally, mutations that affect dimer stability, cleavage site mutations that lead to altered processing kinetics by mutant enzymes, and "regulatory" mutations elsewhere in the genome that lead to improved viral growth in the presence of PR inhibitors comprise additional resistance pathways.
Any mutation that influences the binding of a specific inhibitor can be expected to have an effect on substrate cleavage kinetics as well as perhaps on substrate recognition. For this reason, combinations of two or more mutations in HIV PR may lead to a variety of additive, synergistic, or compensatory effects, depending on which property is being measured.

B. ACTIVE SITE MUTANTS

The first-described resistance mutation for HIV PR was a single substitution of a valine residue by alanine at position 82 (V82A mutation) that was selected using a symmetric diol inhibitor (Otto et al., 1993). Since then, resistance mutations have been observed in each of the specificity pockets, S3, S2, S1 and, by symmetry, S1', S2', and S3'. However, only a subset of all residues that constitute a particular subsite mutates in response to a particular drug. The structural effects of mutations on drug binding have been modeled using the crystal structures of the appropriate wild-type enzyme/inhibitor complexes and used to rationalize the effects of specific mutations on drug binding (Otto et al., 1993; Ho et al., 1994; Kaplan et al., 1994; Markowitz et al., 1995; Baldwin et al., 1995a). Most of the subsite mutations, like I84V and V82A, -I, or -F, affect hydrophobic and van der Waals' interactions and can be considered to be "packing" mutants, somewhat analogous to hydrophobic mutations in a protein (Markowitz et al., 1995; Baldwin et al., 1995a,b). Crystallographic analyses of HIV PR/inhibitor complexes show that most of the surface of an inhibitor and its immediate protein environment are solvent inaccessible. Some mutations, such as V82I, are more effective when combined with a second active site mutation, such as V32I, owing to synergistic effects on enzyme activity (Kaplan et al., 1994).

Other mutations can affect electrostatic contacts. The effect of the R8Q mutation on binding to A-77003 was shown to be due to loss of a charge-induced dipole interaction between the guanidinium side-chain of the enzyme and the pyridine ring of the inhibitor (Ho et al., 1994). While this mutation resulted in a dramatic loss of inhibition, it also produced a virus with severely impaired growth kinetics. A possible explanation for this defect is the symmetric loss of the intersubunit salt bridge between Arg8 and Asp129, which may decrease the stability of the enzyme. Despite the demonstration of strong cross-resistance of the R8Q PR to Ritonavir, Indinavir, and Saquinavir (Gulnik et al., 1995), this mutation has not yet been observed clinically. Thus, the design of inhibitors that selectively interact with Arg8 appears to be a useful strategy for the development of "resistance-repellent" drugs (Erickson, 1995; Mo et al., 1996). While numerous crystal structures of wild-type HIV PR/inhibitor complexes have been published, crystal structures of mutant HIV PR/inhibitor complexes have appeared with less frequency in the literature.
The structure determination of the V82A mutant of HIV PR complexed to the C2 symmetry-based inhibitor A-77003 revealed unexpected backbone changes that resulted in the repacking of enzyme and inhibitor atoms in the S1/S1' and S3/S3' subsites (Baldwin et al., 1995b). The structure of the V82I mutant with DMP-323, a cyclic urea-based inhibitor, exhibited a loss of interaction between the CG1 methyl group of valine and the inhibitor despite the larger bulk of the mutant isoleucine side-chain (Chang et al., 1994). This was because the energetically favored side-chain rotamer for isoleucine resulted in a repositioning of the CD1 methyl away from the P1 group of the inhibitor. In both examples, the observed structural changes were unexpected based on modeling studies. The protein residue Val82 resides in a structurally flexible loop that can exhibit a variety of different conformations in response to different conditions (Erickson, 1993). This flexibility is consistent with the fact that Val82 can be replaced by both larger (Ile and Phe) and smaller (Ala) residues in viable mutant enzymes. Crystal structures of cyclic ureas complexed with V82F, I84V, and V82F/I84V mutants provide further support for the flexibility of this region and also demonstrate that the conformation of the 82F mutant side-chain depends on the nature of the side-chain at position 84 (I or V) (Ala et al., 1997). The structure of Indinavir complexed to the cross-resistant multiple mutant M46I/L63P/V82T/I84V demonstrated little discernible change from the wild-type enzyme complex, although the mutant showed a 70-fold (~2.5 kcal/mol) increase in Ki (Chen et al., 1995). These studies point out the pitfalls inherent in reliance on pure modeling approaches and underscore the need for continued experimental studies of inhibitors complexed with resistant mutants of HIV PR.
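The correspondence between a 70-fold increase in Ki and roughly 2.5 kcal/mol follows from the standard thermodynamic relation between a ratio of binding constants and a free energy difference, ΔΔG = RT ln(Ki,mutant/Ki,wild type). The sketch below is a minimal check of that conversion; the temperature of 298.15 K is an assumption, since the chapter does not state the temperature behind its estimate, and the function name is illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # gas constant R in kcal/(mol*K)


def binding_energy_change(fold_change_ki: float,
                          temperature_k: float = 298.15) -> float:
    """Return ddG = R*T*ln(Ki_mutant / Ki_wild_type) in kcal/mol.

    temperature_k is an ASSUMED value (25 C); the source does not
    specify the temperature used for its ~2.5 kcal/mol figure.
    """
    if fold_change_ki <= 0:
        raise ValueError("fold change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change_ki)


if __name__ == "__main__":
    # Energy equivalent of the 70-fold loss of binding quoted in the text.
    print(f"{binding_energy_change(70):.2f} kcal/mol")
```

Because the relation is logarithmic, a modest energy penalty of a few kcal/mol, comparable to the loss of one or two good contacts, is enough to produce a large loss of potency even when the crystal structure shows little visible change.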

C. NON-ACTIVE SITE MUTANTS

While the precise structural mechanism of drug resistance can often be pinpointed for active site mutations that directly affect inhibitor binding, the evaluation of non-active site mutants is more challenging. Some mutations might act in concert with active site mutations by compensating for a functional deficit caused by the latter. For example, the R8Q mutation is found almost exclusively in combination with one or more mutations outside the active site region, such as M46I (Ho et al., 1994). Mutations of Met46 to Ile, Leu, or Phe are often found in the presence or absence of other active site mutations, such as V82I, -A, or -F and I84V. The residue Met46 is in the flap of HIV PR and molecular dynamics simulations on flap movement have shown that the M46I mutant exhibits a markedly different dynamical behavior than the wild-type enzyme (Collins et al., 1995) and presumably exhibits altered enzyme kinetics. A role for Met46 in polyprotein substrate recognition is also possible. Other non-active site mutations may indirectly alter the structure of the active site region alone or in combination with one or more active site mutations. Modeling studies of a hextuple mutant selected using KNI-272 (B. Anderson and H. Mitsuya, personal communication) reveal that mutations far from the active site, such as A71V (Fig. 7, see color plate), may influence inhibitor or substrate binding via a concerted "domino" effect (T. N. Bhat, unpublished data). A similar mechanism may be operative for the Saquinavir-resistant L90M mutant. The side-chain of this residue can affect the conformation of the adjacent active site loop that contains the catalytic aspartic acids. Most of the non-active site mutations result in a larger hydrophobic side-chain which can be expected to perturb packing interactions within the enzyme core. 
This perturbation may be transferred to the active site where the internal strain energy in the protein may be relieved through alterations in the interaction energy with the inhibitor. In some cases, the introduction of drug-selected non-active site mutations alone does not lead to a measurable reduction in drug binding, in contrast to the case for active site mutations (Gulnik et al., 1995). However, the fact that these

mutations are only observed in the presence of drug means that they must provide the mutant virus with a competitive growth advantage over wild-type HIV. At least one engineered HIV PR mutant, G48Y, exhibited a greater catalytic efficiency than the wild-type enzyme toward artificial peptide substrates (Lin et al., 1995). However, this mutant has not been observed either in vitro or in vivo in the presence or absence of inhibitors.

D. CLEAVAGE SITE MUTANTS AND SUBSTRATE SPECIFICITY

Active site mutations can strongly affect catalytic efficiency of HIV PR (Gulnik et al., 1995), but the magnitude of the effect depends on the substrate sequence (Ridky et al., 1998). The addition of one or more non-active site mutations may compensate for a catalytically defective active site mutation (Gulnik, 1995; Schock, 1996). These conclusions are based on enzymology studies with recombinant HIV PR mutants. Recent longitudinal studies of individual drug-treated patients demonstrate an ordered accumulation of mutations in which one or two active site mutations usually occur early and are followed by numerous nonactive site mutations (Molla et al., 1996; Zhang et al., 1997). Thus, the clinical evolution of drug resistance to protease inhibitors seems to qualitatively mirror expectations based on the enzymology studies. Since active site mutations may be expected to alter the rate of one or more cleavages that must occur during viral maturation, one may imagine that compensating mutations in the cleavage sites on the Gag or Gag-Pol polyproteins might result in better substrates for particular mutant enzymes. Recent studies identified a Leu449Phe mutation in the p1/p6 Gag polyprotein cleavage site that can synergize with the I84V PR mutation to produce a virus with 350- to 1500-fold decreased sensitivity to substrate-based inhibitors Bila-1906 and Bila-2185 (Doyon et al., 1996). The mutation altered the p1/p6 cleavage site from Phe-Leu to Phe-Phe. A synthetic peptide containing the mutant Phe-Phe cleavage site was cleaved at higher catalytic efficiency by the I84V HIV PR mutant than the corresponding peptide with the wild-type sequence (Lamarre et al., 1994). Salzman and co-workers subsequently identified mutations at the NC/p1 Gag cleavage site in breakthrough resistant virus isolated from patients on indinavir therapy (Zhang et al., 1997).
This is an important finding since it confirms the possibility for this drug resistance mechanism to be operative in the clinical setting. Several groups have followed the lead of these investigators and have confirmed the presence of cleavage site mutations in clinical isolates from multidrug-resistant HIV in preliminary reports (Larder et al., 1998; Mammano et al., 1998).

VI. FUTURE CHALLENGES IN HIV PROTEASE INHIBITOR DESIGN

A. PREDICTION OF RESISTANCE PATHWAYS: DEFINING BIOCHEMICAL FITNESS FOR VARIANT PROTEASES

A key issue in combating resistance is to first define the most important resistance mutations for a particular inhibitor. While there is no guarantee that in vitro selection studies will accurately mirror in vivo results, a recent study using a panel of over a dozen recombinant PR mutants suggests that those mutations that were selected in vivo by a particular inhibitor displayed the highest enzyme "vitality," or fitness, defined as the ratio $(K_i \cdot k_{\mathrm{cat}}/K_m)_{\mathrm{mutant}} / (K_i \cdot k_{\mathrm{cat}}/K_m)_{\mathrm{wild\ type}}$ for a given inhibitor (Gulnik et al., 1995). A number of studies indicate that cross-resistance is a problem for most clinically useful PR inhibitors, Saquinavir being a notable exception (Sardana et al., 1994; Condra et al., 1995; Gulnik et al., 1995; Wilson et al., 1997). However, this is to be expected since all the inhibitors were highly optimized to bind the wild-type enzyme. Mutations have been found in every subsite of HIV PR. However, not every subsite residue has been found to mutate for a particular drug. Further, only a limited number of the total possible single-step mutations that could occur have actually been found at each variant position. These results suggest that there are constraints that limit the potentially available resistance pathways. Besides the usual factors that limit viable mutations in enzymes, such as proper folding, stability, and catalytic function, HIV PR has several additional and unique constraints. The homodimeric nature of the enzyme means that every mutation at the genetic level will result in a double mutant at the enzyme level (unless heterodimer formation occurs in vivo). The requirement for recognition and cleavage of at least nine specific polyprotein cleavage sites should limit mutations to ones that will not diminish processing below some tolerable threshold.
Enzymology and virus infectivity studies with recombinant HIV PR mutants independently suggested that this threshold may be between 5 and 10% of wild-type activity (Babe et al., 1995; Gulnik et al., 1995; McPhee et al., 1996). The relative cleavage rates for certain Gag polyprotein processing sites, such as p2-CA, have also been shown to be important for ordered assembly and maturation of infectious virus (Pettit et al., 1994; Weigers et al., 1998). Finally, the fact that some subsite

amino acids, such as Arg8, also participate in the dimer interface suggests that some subsite mutants may adversely affect dimer stability and reduce inhibitor potency through thermodynamic linkage of inhibitor binding and dimer formation (Xie et al., 1998). It is interesting to note that many of the resistance mutations in HIV-1 PR correspond to the wild-type residues in HIV-2 or in other retroviral proteases at structurally equivalent locations. Thus, there may be a limited number of solutions to the PR sequence/substrate specificity combinatorial puzzle for retroviruses, and mutational strategies for resistance tend to evolve toward one or more of them.
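The vitality measure discussed in this section, the product of Ki and catalytic efficiency (kcat/Km) for the mutant divided by the same product for the wild-type enzyme, can be sketched in a few lines. The class and function names below are illustrative, and the kinetic constants in the usage example are hypothetical values chosen only to show the arithmetic; they are not data from Gulnik et al. (1995).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnzymeKinetics:
    ki: float    # inhibition constant for one inhibitor (e.g., nM)
    kcat: float  # turnover number (e.g., 1/s)
    km: float    # Michaelis constant (e.g., uM)

    @property
    def catalytic_efficiency(self) -> float:
        """kcat/Km for the substrate used in the assay."""
        return self.kcat / self.km


def vitality(mutant: EnzymeKinetics, wild_type: EnzymeKinetics) -> float:
    """Ratio (Ki * kcat/Km)_mutant / (Ki * kcat/Km)_wild_type.

    Values above 1 mean the mutant gains more from weakened inhibitor
    binding than it loses in catalytic efficiency. Both enzymes must be
    measured with the same inhibitor, substrate, and units.
    """
    return (mutant.ki * mutant.catalytic_efficiency) / (
        wild_type.ki * wild_type.catalytic_efficiency
    )


if __name__ == "__main__":
    # HYPOTHETICAL numbers: Ki rises 10-fold, catalytic efficiency halves.
    wild_type = EnzymeKinetics(ki=1.0, kcat=10.0, km=20.0)
    mutant = EnzymeKinetics(ki=10.0, kcat=5.0, km=20.0)
    print(vitality(mutant, wild_type))
```

Expressing resistance this way makes the trade-off explicit: a mutation that weakens inhibitor binding tenfold but also cuts catalytic efficiency tenfold has a vitality of 1 and offers the enzyme no net advantage by this measure.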

B. IMPLICATIONS FOR INHIBITOR DESIGN

There are two major challenges that lie ahead for HIV PR inhibitor design. One concerns the issue of pharmacokinetics. As potent and bioavailable as these inhibitors are, they need to be even more potent and more bioavailable to maximize their effectiveness and usefulness. The daily requirement for PR inhibitors is massive, at upward of 800 mg/kg two to three times/day. Side-effects can be troublesome, if not debilitating, for most of these compounds. Indinavir therapy leads to painful recurrence of kidney stones in about 20% of individuals. Ritonavir has serious drug interaction issues since it is a highly potent inhibitor of a major liver enzyme, cytochrome P450 3A4, involved in drug metabolism. A syndrome of peripheral lipodystrophy (fat wasting), hyperlipidemia, and insulin resistance is emerging as a frequent side-effect of long-term PR inhibitor therapy (Carr et al., 1998). Finally, these drugs tend to be costly to make. Coupled with the massive dosages required for maximal viral suppression to be maintained, the economics of PR inhibitor therapy is not a trivial consideration. The solution lies in the design of more potent, less costly, and more bioavailable inhibitors. This in turn raises the question of what the theoretical limit of potency for HIV PR inhibitors is, and how close we are to this limit. A related question is whether we can design inhibitors with a better potency/MW ratio and thereby achieve significant improvements in bioavailability and ease of synthesis.

The second major challenge confronting the field of HIV PR inhibitors is the problem of how to effectively combat drug resistance. Given the unusual constraints on HIV PR and our understanding of inhibitor binding and resistance at an atomic level of detail, it should be possible at a minimum to design inhibitors that target different mutants. These inhibitors could then be used together in multidrug combination therapy approaches.
It may even be possible to design inhibitors that target multiple mutants simultaneously. The potency of Saquinavir against multiple active site mutants suggests that this may be feasible. In certain instances, these second-line drugs might prompt the selection

of wild-type revertants that can then be treated with the original drug. Classical sequential therapy approaches have been unsuccessful in controlling infectious diseases and have given way to multidrug treatment. However, one can even envision a therapeutic strategy for HIV PR inhibitors that could apply specific, structure-guided selection pressure to influence clinical outcome (Erickson, 1995). Such strategies would represent a fundamental departure from currently held concepts of infectious disease treatment and will require a great deal of experimental virology assuming that the appropriate inhibitors can be designed in the first place.

VII. SUMMARY AND CONCLUSIONS

Human immunodeficiency virus proteases have proven to be valuable instructional targets for the design of mechanism- and structure-based drugs and for providing novel strategies for designing antiviral therapeutics. The introduction of PR inhibitors to the armamentarium of AIDS drugs has had a profoundly successful impact on the AIDS community where these drugs are prolonging life and providing new hope for the long-term maintenance of relatively good health until such time as a more effective cure or vaccine becomes available. However, the success of PR inhibitor therapy is a qualified one. A growing number of individuals are developing multidrug-resistant strains of HIV and there is emerging evidence that these strains can be readily transmitted. At present, there are no effective treatment options available for these patients. This brings us to the brink of a new forefront in antiviral research that requires novel approaches to deal with the problem of drug resistance. This problem poses different challenges than we faced in the design of the first-line drugs, as it forces us to think about selection pressure mechanisms in addition to the usual issues of potency, pharmacology, safety, and mechanism of drug action. Unfortunately, those very features that contribute to the specificity and efficacy of PR inhibitors also provide the virus with strategies to develop drug resistance. The resistance picture for HIV PR inhibitors is not limited to this class of compounds; cross-resistance to both the nucleoside and non-nucleoside HIV RT inhibitors is common. The experience obtained with HIV antiviral therapy should be a warning to those targeting other viruses. The rapid replication rate of HIV in vivo coupled with the long duration of viral infection favors the emergence of resistant mutants to virtually any targeted antiviral agent.
Computer simulation studies show that a mutant virus with a selective advantage of only 5% (i.e., a replication rate 1.05 × that of wild-type virus) will propagate to become the dominant genotype within ~100 virus generations (Coffin, 1995). Human immunodeficiency virus undergoes 100 replication cycles in several months. These same considerations imply that those viral mutants that only arise under

drug pressure and arise early should be less robust than their wild-type progenitors. Thus, effective strategies to combat resistance should be ones that can limit the selection of mutational pathways to a small number of mutants and not drive the virus far from wild type. A somewhat more optimistic picture of resistance can be painted with viruses that cause acute infections that resolve within a few weeks, since they may have less of a tendency to develop dominant resistant strains than HIV. However, treatment of HIV infection has taught us that, under the appropriate selection pressure, rapid and robust resistance is an ever-present danger. Finally, with the growing number of potential resistance mutations, improved approaches to structure-based drug design and the development of quantitative methods of modeling and binding affinity prediction take on added significance. Meanwhile, it is important to continue to search for combinations of drugs with complementary resistance profiles and to explore new treatment modalities such as structure-based approaches that can be used to exert specific selection pressures on HIV in the hopes that at least a stalemate, if not an end game, strategy can be found to control the progression of AIDS. Lessons learned with HIV PR may provide valuable strategies for dealing effectively with other retroviral infections, such as HIV-2 and HTLV-I, as well as with important animal diseases caused by retroviruses, through targeting of retroviral proteases.
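The claim that a 5% selective advantage leads to dominance within about 100 generations can be checked with a very simple deterministic model. The sketch below is not the simulation reported by Coffin (1995); it assumes discrete generations, a constant relative fitness, no new mutation or back-mutation, and a starting mutant frequency of 1%, which is a hypothetical value chosen for illustration.

```python
import math


def mutant_fraction(initial_fraction: float, advantage: float,
                    generations: int) -> float:
    """Mutant share of the population after `generations` rounds.

    Each generation the mutant:wild-type odds are multiplied by
    (1 + advantage), e.g. 1.05 for a 5% selective advantage.
    """
    odds = initial_fraction / (1.0 - initial_fraction)
    odds *= (1.0 + advantage) ** generations
    return odds / (1.0 + odds)


def generations_to_majority(initial_fraction: float,
                            advantage: float) -> int:
    """Smallest number of generations after which the mutant exceeds 50%."""
    if not 0.0 < initial_fraction < 0.5 or advantage <= 0.0:
        raise ValueError("need 0 < initial_fraction < 0.5 and advantage > 0")
    start_odds_inverse = (1.0 - initial_fraction) / initial_fraction
    return math.ceil(math.log(start_odds_inverse) / math.log(1.0 + advantage))


if __name__ == "__main__":
    # ASSUMED starting frequency of 1%; 5% advantage as quoted in the text.
    n = generations_to_majority(0.01, 0.05)
    print(n, round(mutant_fraction(0.01, 0.05, n), 3))
```

With these assumptions the mutant passes 50% after 95 generations, in line with the figure of roughly 100 quoted above; a rarer starting mutant takes longer, but only logarithmically so.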

ACKNOWLEDGMENTS

The authors thank Ms. Christine Ray for help with the preparation of this chapter. This project has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. NO1-CO-56000.

REFERENCES

Abdel-Meguid, S. S. (1993). Inhibitors of aspartyl proteinases. Med. Res. Rev. 13(6), 731-778.
Abdel-Meguid, S. S., Zhao, B., Murthy, K. H., Winborne, E., Choi, J. K., DesJarlais, R. L., Minnich, M. D., Culp, J. S., Debouck, C., and Tomaszek, T. A., Jr. (1993a). Inhibition of human immunodeficiency virus-1 protease by a C2-symmetric phosphinate: Synthesis and crystallographic analysis. Biochemistry 32(31), 7972-7980.
Ala, P. J., Huston, E. E., Klabe, R. M., McCabe, D. D., Duke, J. L., Rizzo, C. J., Korant, B. D., DeLoskey, R. J., Lam, P. Y., Hodge, C. N., and Chang, C. H. (1997). Molecular basis of HIV-1 protease drug resistance: structural analysis of mutant proteases complexed with cyclic urea inhibitors. Biochemistry 36(7), 1573-1580.
Ala, P., DeLoskey, R., Huston, E., Jadhav, P., Lam, P., Eyermann, C., Hodge, C., Schadt, M., Lewandowski, F., Weber, P., McCabe, D., Duke, J., and Chang, C. (1998). Molecular recognition of cyclic urea HIV-1 protease inhibitors. J. Biol. Chem. 273(20), 12325-12331.
Alsenz, J., Steffen, H., and Alex, R. (1998). Active apical secretory efflux of the HIV protease inhibitors saquinavir and ritonavir in Caco-2 cell monolayers. Pharm. Res. 15, 423-428.

HIV Protease

Appelt, K. (1993). Crystal structures of HIV-1 protease-inhibitor complexes. Perspect. Drug Disc. Design 1, 23-48. Babe, L. M., Rose, J., and Craik, C. S. (1995). Trans-dominant inhibitory human immunodeficiency virus type 1 protease monomers prevent protease activation and virion maturation. Proc. Natl. Acad. Sci. USA 92(22), 10069-10073. Babine, R. E., Zhang, N., Jurgens, A. R., Schow, S. R., Desai, P. R., James, J. C., and Semmelhack, M. F. (1992). The use of HIV-1 protease structure in inhibitor design. Bioorg. Med. Chem. Lett. 2, 541-546. Baboonian, C., Dalgleish, A., Bountiff, L., Gross, J., Oroszlan, S., Rickett, G., Smith-Burchnell, C., Troke, P., and Merson, J. (1991). HIV-1 proteinase is required for synthesis of pro-viral DNA. Biochem. Biophys. Res. Commun. 179(1), 17-24. Backbro, K., Lowgren, S., Osterlund, K., Atepo, J., Unge, T., Hulten, J., Bonham, N. M., Schaal, W., Karlen, A., and Hallberg, A. (1997). Unexpected binding mode of a cyclic sulfamide HIV-1 protease inhibitor. J. Med. Chem. 40(6), 898-902. Baldwin, E. T., Bhat, T. N., Gulnik, S., Liu, B., Topol, I. A., Kiso, Y., Mimoto, T., Mitsuya, H., and Erickson, J. W. (1995a). Structure of HIV-1 protease with KNI-272, a tight-binding transition-state analog containing allophenylnorstatine. Structure 3(6), 581-590. Baldwin, E. T., Bhat, T. N., Liu, B., Pattabiraman, N., and Erickson, J. W. (1995b). Structural basis of drug resistance for the V82A mutant of HIV-1 proteinase. Nat. Struct. Biol. 2(3), 244-249. Beaulieu, P. L., Wernic, D., Abraham, A., Anderson, P. C., Bogri, T., Bousquet, Y., Croteau, G., Guse, I., Lamarre, D., Liard, F., Paris, W., Thibeault, D., Pav, S., and Tong, L. (1997). Potent HIV protease inhibitors containing a novel (hydroxyethyl)amide isostere. J. Med. Chem. 40(14), 2164-2176. Billich, A., Charpiot, B., Fricker, G., Gstach, H., Lehr, P., Peichl, P., Scholz, D., and Rosenwirth, B. (1994). 
HIV proteinase inhibitors containing 2-aminobenzylstatine as a novel scissile bond replacement: biochemical and pharmacological characterization. Antiviral Res. 25, 215-233. Bone, R., Vacca, J. P., Anderson, P. S., and Holloway, M. K. (1991). X-ray crystal structure of the HIV protease complex with L-700,417, an inhibitor with pseudo C2 symmetry. J. Am. Chem. Soc. 113(24), 9382-9385. Bryant, M., Getman, D., Smidt, M., Marr, J., Clare, M., Dillard, R., Lansky, D., DeCrescenzo, G., Heintz, R., Houseman, K. (1995). SC-52151, a novel inhibitor of the human immunodeficiency virus protease. Antimicrob. Agents Chemother. 39(10), 2229-2234. Budt, K. H., Peyman, A., Hansen, J., Knolle, J., Meichsner, C., Paessens, A., Ruppert, D., and Stowasser, B. (1995). HIV protease inhibitor HOE/BAY 793, structure-activity relationships in a series of C2-symmetric diols. Bioorg. Med. Chem. 3(5), 559-571. Carr, A., Samaras, K., Burton, S., Law, M., Freund, J., Chisholm, D., and Cooper, D. (1998). A syndrome of peripheral lipodystrophy, hyperlipidaemia and insulin resistance in patients receiving HIV protease inhibitors. AIDS 12, F51-F58. Chang, C.-H., DeLoskey, R. L., Lam, P., Schadt, M., Duke, J., and Weber, P. C. (1994). Structures of cyclic ureas complexed with native and V82I mutant HIV-1 protease. Abstract 110A from the Tenth International Conference on AIDS and STD, Yokohama, Japan. Chen, Z., Li, Y., Schock, H. B., Hall, D., Chen, E., and Kuo, L. C. (1995). Three-dimensional structure of a mutant HIV-1 protease displaying cross-resistance to all protease inhibitors in clinical trials. J. Biol. Chem. 270(37), 21433-21436. Cho, S. Y., Jungheim, L., and Baxter, A. (1994). Novel HIV-1 protease inhibitors containing a β-hydroxy sulfide isostere. Bioorg. Med. Chem. Lett. 4(5), 715-720. Chou, K.-C., and Zhang, C.-T. (1993). Studies on the specificity of HIV protease: An application of Markov chain theory. J. Protein Chem. 12(6), 709-724. Choy, N., Choi, H., Jung, W. H., Kim, C. R., Lee, T. 
G., and Koh, J. S. (1997). Synthesis of irreversible HIV-1 protease inhibitors containing sulfonamide and sulfone as amide bond isosteres. Bioorg. Med. Chem. Lett. 7, 2635-2638.

John W. Erickson and Michael A. Eissenstat

Clavel, F., Guetard, D., Brun-Vezinet, F., Chamaret, S., Rey, M. A., Santos-Ferreira, M. O., Laurent, A. G., Dauguet, C., Katlama, C., and Rouzioux, C. (1986). Isolation of a new human retrovirus from West African patients with AIDS. Science 233, 343-346. Coffin, J. M. (1995). HIV population dynamics in vivo: Implications for genetic variation, pathogenesis and therapy. Science 267, 483-489. Coffin, J. M. (1996). Retroviridae: The viruses and their replication. In "Virology" (B. N. Fields, D. M. Knipe, and P. M. Howley, Eds.), Vol. 2, pp. 1767-1847. Lippincott, Philadelphia. Coffin, J. M., Hughes, S. H., and Varmus, H. E. (Eds.) (1997). "Retroviruses." Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Collins, J. R., Burt, S. K., and Erickson, J. W. (1995). Flap opening in HIV-1 protease simulated by "activated" molecular dynamics. Nat. Struct. Biol. 2(4), 334-338. Condra, J. H., Schleif, W. A., Blahy, O. M., Gabryelski, L. J., Graham, D. J., Quintero, J. C., Rhodes, A., Robbins, H. L., Roth, E., Shivaprakash, M., Titus, D., Yang, T., Teppler, H., Squires, K. E., Deutsch, P. J., and Emini, E. A. (1995). In vivo emergence of HIV-1 variants resistant to multiple protease inhibitors. Nature 374, 569-571. Crawford, S., and Goff, S. P. (1985). A deletion mutation in the 5' part of the pol gene of Moloney murine leukemia virus blocks proteolytic processing of the gag and pol polyproteins. J. Virol. 53, 899-907. Davies, D. R. (1990). The structure and function of the aspartic proteinases. Annu. Rev. Biophys. Biophys. Chem. 19, 189-215. De Clercq, E. (1995). Toward improved anti-HIV chemotherapy: Therapeutic strategies for intervention with HIV infections. J. Med. Chem. 38, 2491-2517. De Lucca, G. V. (1997). Synthesis and evaluation of imidazolidinones as nonpeptide HIV-protease inhibitors. Bioorg. Med. Chem. Lett. 7, 495-500. De Lucca, G. V., Liang, J., Aldrich, P. E., Calabrese, J., Cordova, B., Klabe, R. M., Rayner, M. M., and Chang, C. H. (1997a). 
Design, synthesis, and evaluation of tetrahydropyrimidinones as an example of a general approach to nonpeptide HIV protease inhibitors. J. Med. Chem. 40(11), 1707-1709. De Lucca, G. V., Erickson-Viitanen, S., and Lam, P. Y. S. (1997b). Cyclic HIV protease inhibitors capable of displacing the active site structural water molecule. DDT 2, 6-18. Debouck, C. (1992). The HIV-1 protease as a therapeutic target for AIDS. AIDS Res. Hum. Retroviruses 8(2), 153-164. Deeks, S. G., Smith, M., Holodniy, M., and Kahn, J. O. (1997). HIV-1 protease inhibitors: A review for clinicians. J. Am. Med. Assoc. 277(2), 145-153. Dorsey, B., McDaniel, S., Levin, R., Vacca, J., Darke, P., Zugay, J., Emini, E., Schleif, W., Lin, J., Chen, I., Holloway, K., Anderson, P., and Huff, J. (1994a). Synthesis and evaluation of pyridyl analogs of L-735,524: Potent HIV-1 protease inhibitors. Bioorg. Med. Chem. Lett. 4(23), 2769-2774. Dorsey, B. D., Levin, R. B., McDaniel, S. L., Vacca, J. P., Guare, J. P., Darke, P. L., Zugay, J. A., Emini, E., Schleif, W. A., Quintero, J. C., Lin, J. H., Chen, I.-W., Holloway, M. K., Fitzgerald, P. M. D., Axel, M. G., Ostovic, D., Anderson, P. S., and Huff, J. R. (1994b). L-735,524: The design of a potent and orally bioavailable HIV protease inhibitor. J. Med. Chem. 37(21), 3443-3451. Doyon, L., Croteau, G., Thibeault, D., Poulin, F., Pilote, L., and Lamarre, D. (1996). Second locus involved in human immunodeficiency virus type 1 resistance to protease inhibitors. J. Virol. 70(6), 3763-3769. Dreyer, G. B., Boehm, J. C., Chenera, B., DesJarlais, R. L., Hassell, A. M., Meek, T. D., and Tomaszek, T. A., Jr. (1993). A symmetric inhibitor binds HIV-1 protease asymmetrically. Biochemistry 32, 937-947. Dreyer, G. B., Lambert, D. M., Meek, T. D., Carr, T. J., Tomaszek, T. A., Jr., Fernandez, A. V., Bartus, H., Cacciavillani, E., Hassell, A. M., Minnich, M., Petteway, S. R., Jr., Metcalf, B. W., and Lewis,

M. (1992). Hydroxyethylene isostere inhibitors of human immunodeficiency virus-1 protease: Structure-activity analysis using enzyme kinetics, X-ray crystallography, and infected T-cell assays. Biochemistry 31(29), 6646-6659. Dreyer, G. B., Metcalf, B. W., Tomaszek, T. A., Jr., Carr, T. J., Chandler, A. C., III, Hyland, L., Fakhoury, S. A., Magaard, V. W., Moore, M. L., Strickler, J. E., Debouck, C., and Meek, T. D. (1989). Inhibition of human immunodeficiency virus 1 protease in vitro: Rational design of substrate analogue inhibitors. Proc. Natl. Acad. Sci. USA 86, 9752-9756. Dunn, B. M., Gustchina, A., Wlodawer, A., and Kay, J. (1994). Subsite preferences of retroviral proteinases. In "Methods in Enzymology" (L. C. Kuo and J. A. Shafer, Eds.), Vol. 241, pp. 254-278. Academic Press, San Diego, CA. El-Farrash, M. A., Kuroda, M. J., Kitazaki, T., Masuda, T., Kato, K., Hatanaka, M., and Harada, S. (1994). Generation and characterization of a human immunodeficiency virus type 1 (HIV-1) mutant resistant to an HIV-1 protease inhibitor. J. Virol. 68(1), 233-239. Erickson, J., Neidhart, D. J., VanDrie, J., Kempf, D. J., Wang, X. C., Norbeck, D. W., Plattner, J. J., Rittenhouse, J. W., Turon, M., Wideburg, N., Kohlbrenner, W. E., Simmer, R., Helfrich, R., Paul, D. A., and Knigge, M. (1990). Design, activity, and 2.8 Å crystal structure of a C2 symmetric inhibitor complexed to HIV-1 protease. Science 249, 527-533. Erickson, J. W. (1993). Design and structure of symmetry-based inhibitors of HIV-1 protease. Perspect. Drug Disc. Design 1, 109-128. Erickson, J. W. (1995). The not-so-great escape. Nat. Struct. Biol. 2(7), 523-529. Fassler, A., Rosel, J., Grutter, M., Tintelnot-Blomley, M., Alteri, E., Bold, G., and Lang, M. (1993). Novel pseudosymmetric inhibitors of HIV-1 protease. Bioorg. Med. Chem. Lett. 3, 2837-2842. Fischl, M., Richman, D., Flexner, C., Para, M., Haubrich, R., Karim, A., Yeramian, P., Holden-Wiltse, J., and Meehan, P. (1997). 
Phase I/II study of the toxicity, pharmacokinetics, and activity of the HIV protease inhibitor SC-52151. J. Acquir. Immune Defic. Syndr. 15, 28-34. Fitzgerald, P. M. D., and Springer, J. P. (1991). Structure and function of retroviral proteases. Annu. Rev. Biophys. Biophys. Chem. 20, 299-320. Flexner, C. (1998). HIV-protease inhibitors. New Engl. J. Med. 338, 1281-1292. Freskos, J. N., Bertenshaw, D. E., Getman, D. P., Heintz, R. M., Mitchke, B. V., Blystone, L. W., Bryant, M. L., Funckes-Shippy, C., Houseman, K. A., Kishore, N. N., Kocan, G. P., and Mehta, P. P. (1996). (Hydroxyethyl) sulfonamide HIV-1 protease inhibitors: Identification of the 2-methylbenzoyl moiety at P-2. Bioorg. Med. Chem. Lett. 6, 445-450. Friedman, S., Ganapathi, P., Rubin, Y., and Kenyon, G. (1998). Optimizing the binding of fullerene inhibitors of the HIV-1 protease through predicted increases in hydrophobic desolvation. J. Med. Chem. 41, 2424-2429. Gait, M. J., and Karn, J. (1995). Progress in anti-HIV structure-based drug design. Trends Biotech. 13, 430-438. Gao, F., Yue, L., Robertson, D. L., Hill, S. C., Hui, H., Biggar, R. J., Neequaye, A. E., Whelan, T. M., Ho, D. D., Shaw, G. M., Sharp, P. M., and Hahn, B. H. (1994). Genetic diversity of human immunodeficiency virus type 2: Evidence for distinct sequence subtypes with differences in virus biology. J. Virol. 68, 7433-7447. Getman, D. P., DeCrescenzo, G. A., Heintz, R. M., Reed, K. L., Talley, J. J., Bryant, M. L., Clare, M., Houseman, K. A., Marr, J. J., Mueller, R. A., Vazquez, M. L., Shieh, H.-S., Stallings, W. C., and Stegeman, R. A. (1993). Discovery of a novel class of potent HIV-1 protease inhibitors containing the (R)-(hydroxyethyl)urea isostere. J. Med. Chem. 36(2), 288-291. Ghosh, A. K., Kincaid, J. F., Cho, W., Walters, D. E., Krishnan, K., Hussain, K. A., Koo, Y., Cho, H., Rudall, C., Holland, L., and Buthod, J. (1998). 
Potent HIV protease inhibitors incorporating high-affinity P2-ligands and (R)-(Hydroxyethylamino)sulfonamide isostere. Bioorg. Med. Chem. Lett. 8, 687-690. Ghosh, A. K., Kincaid, J. F., Walters, D. E., Chen, Y., Chaudhuri, N. C., Thompson, W. J., Culberson, C., Fitzgerald, P. M., Lee, H. Y., McKee, S. P., Munson, P. M., Duong, T. T., Darke, P. L., Zugay, J. A., Schleif, W. A., Axel, M. G., Lin, J., and Huff, J. R. (1996). Nonpeptidal P2 ligands for HIV protease inhibitors: Structure-based design, synthesis, and biological evaluation. J. Med. Chem. 39(17), 3278-3290. Ghosh, A. K., Thompson, W. J., Fitzgerald, P. M. D., Culberson, J. C., Axel, M. G., McKee, S. P., Huff, J. R., and Anderson, P. S. (1994). Structure-based design of HIV-1 protease inhibitors: Replacement of two amides and a 10π-aromatic system by a fused bis-tetrahydrofuran. J. Med. Chem. 37, 2506-2508. Gonda, M. A., Wong-Staal, F., Gallo, R. C., Clements, J. E., Narayan, O., and Gilden, R. V. (1985). Sequence homology and morphologic similarity of HTLV-III and visna virus, a pathogenic lentivirus. Science 227, 173-177. Graul, A., and Castaner, J. (1996). Ritonavir. Drugs Future 21, 700-705. Grobelny, G., Chen, Q., Tyssen, D., Tachedjian, G., Sebire, K., Buchanan, L., and Birch, C. (1997). Antiviral activity of Dg-35-VIII, a potent inhibitor of the protease of human immunodeficiency virus. Antiviral Chem. Chemother. 8(2), 99-106. Grubb, M. F., Wong, Y. N., Burcham, D. L., Saxton, P. L., Quon, C. Y., and Huang, S.-M. (1994). Pharmacokinetics of HIV protease inhibitor DMP 323 in rats and dogs. Drug Metab. Dispos. 22, 709-712. Gulnik, S. V., Suvorov, L. I., Liu, B., Yu, B., Anderson, B., Mitsuya, H., and Erickson, J. W. (1995). Kinetic characterization and cross-resistance patterns of HIV-1 protease mutants selected under drug pressure. Biochemistry 34(29), 9282-9287. Hagen, S. E., Prasad, J. V., Boyer, F. E., Domagala, J. M., Ellsworth, E. L., Gajda, C., Hamilton, H. W., Markoski, L. J., Steinbaugh, B. A., Tait, B. D., Lunney, E. A., Tummino, P. J., Ferguson, D., Hupe, D., Nouhan, C., Gracheck, S. J., Saunders, J. M., and VanderRoest, S. (1997). 
Synthesis of 5,6-dihydro-4-hydroxy-2-pyrones as HIV-1 protease inhibitors: the profound effect of polarity on antiviral activity. J. Med. Chem. 40(23), 3707-3711. Hamilton, H. W., Tait, B. D., Gajda, C., Hagen, S. E., Ferguson, D., Lunney, E. A., Pavlovsky, A., and Tummino, P. J. (1996). 6-Phenyl-6-alkylamido-5,6-dihydro-2H-pyran-2-ones: Novel HIV protease inhibitors. Bioorg. Med. Chem. Lett. 6, 719-724. Hammond, J., Larder, B. A., Schinazi, R. F., and Mellors, J. W. (1997). Mutation in retroviral genes associated with drug resistance. In "Human Retroviruses and AIDS" (B. Korber, Ed.), pp. 207-249. Los Alamos, NM. Han, Q., Chang, C.-H., Li, R., Ru, Y., Jadhav, P. K., and Lam, P. Y. S. (1998). Cyclic HIV protease inhibitors: Design and synthesis of orally bioavailable, pyrazole P2/P2' cyclic ureas with improved potency. J. Med. Chem. 41, 2019-2028. Hanessian, S., and Devasthale, P. V. (1996). Design and synthesis of novel, pseudo C2 symmetric inhibitors of HIV protease. Bioorg. Med. Chem. Lett. 6(18), 2201-2206. Hellen, C. U. T. (1994). Assay methods for retroviral proteases. In "Methods in Enzymology" (L. C. Kuo and J. A. Shafer, Eds.), Vol. 241, pp. 46-58. Academic Press, San Diego, CA. Ho, D. D., Neumann, A. U., Perelson, A. S., Chen, W., Leonard, J. M., and Markowitz, M. (1995). Rapid turnover of plasma virions and CD4 lymphocytes in HIV-1 infection. Nature 373(6510), 123-126. Ho, D. D., Toyoshima, T., Mo, H., Kempf, D. J., Norbeck, D., Chen, C.-M., Wideburg, N. E., Burt, S. K., Erickson, J. W., and Singh, M. K. (1994). Characterization of human immunodeficiency virus type 1 variants with increased resistance to a C2-symmetric protease inhibitor. J. Virol. 68(3), 2016-2020. Hodge, C. N., Aldrich, P. E., Bacheler, L. T., Chang, C. H., Eyermann, C. J., Garber, S., Grubb, M., Jackson, D. A., Jadhav, P. K., Korant, B., Lam, P. Y., Maurin, M. B., Meek, J. L., Otto, M. J., Rayner, M. M., Reid, C., Sharpe, T. R., Shum, L., Winslow, D. L., and Erickson-Viitanen, S.

(1996). Improved cyclic urea inhibitors of the HIV-1 protease: Synthesis, potency, resistance profile, human pharmacokinetics and X-ray crystal structure of DMP 450. Chem. Biol. 3(4), 301-314. Holloway, M. K., Wai, J. M., Halgren, T. A., Fitzgerald, P. M., Vacca, J. P., Dorsey, B. D., Levin, R. B., Thompson, W. J., Chen, L. J., deSolms, S. J., Gaffin, N., Ghosh, A. K., Giuliani, E. A., Graham, S. L., Guare, J. P., Hungate, R. W., Lyle, T. A., Sanders, W. M., Tucker, T. J., Wiggins, M., Wiscount, C. M., Woltersdorf, O. W., Young, S. D., Darke, P. L., and Zugay, J. A. (1995). A priori prediction of activity for HIV-1 protease inhibitors employing energy minimization in the active site. J. Med. Chem. 38(2), 305-317. Holmes, D. S., Bethell, R. C., Hann, M. M., Kitchin, J., Starkey, I. D., and Storer, R. (1993). The design and synthesis of novel hydroxyproline inhibitors of HIV-1 proteinase. Bioorg. Med. Chem. Lett. 3(8), 1485-1491. Hosur, M. V., Bhat, T. N., Kempf, D. J., Baldwin, E. T., Liu, B., Gulnik, S. V., Wideburg, N. E., Norbeck, D. W., Appelt, K., and Erickson, J. W. (1994). Influence of stereochemistry on activity and binding modes for C2 symmetry-based diol inhibitors of HIV-1 protease. J. Am. Chem. Soc. 116, 847-855. Huff, J. R. (1991). HIV protease: A novel chemotherapeutic target for AIDS. J. Med. Chem. 34(8), 2305-2314. Hulten, J., Bonham, N. M., Nillroth, U., Hansson, T., Zuccarello, G., Bouzide, A., Aqvist, J., Classon, B., Danielson, U. H., Karlen, A., Kvarnstrom, I., Samuelsson, B., and Hallberg, A. (1997). Cyclic HIV-1 protease inhibitors derived from mannitol: Synthesis, inhibitory potencies, and computational predictions of binding affinities. J. Med. Chem. 40(6), 885-897. Humber, D., Bamford, M., Bethell, R., Cammack, N. K. C., Evans, D., Gray, N., Hann, M., Orr, D., Saunders, J., Shenoy, S., Storer, R., Weingarten, G., and Wyatt, P. (1993). 
A series of penicillin derived C2-symmetric inhibitors of HIV-1 proteinases: synthesis, mode of interaction, and structure-activity relationships. J. Med. Chem. 36(21), 3120-3128. Ives, K. J., Jacobsen, H., Galpin, S. A., Garaev, M. M., Dorrell, L., Mous, J., Bragman, K., and Weber, J. N. (1997). Emergence of resistant variants of HIV in vivo during monotherapy with the proteinase inhibitor saquinavir. J. Antimicrob. Chemother. 39(6), 771-779. Jacobsen, H., Ahlborn, L., Gugel, R., and Mous, J. (1992). Progression of early steps of human immunodeficiency virus type 1 replication in the presence of an inhibitor of viral protease. J. Virol. 66, 5087-5091. Jadhav, P. K., Ala, P., Woerner, F. J., Chang, C. H., Garber, S. S., Anton, E. D., and Bacheler, L. T. (1997). Cyclic urea amides: HIV-1 protease inhibitors with low nanomolar potency against both wild type and protease inhibitor resistant mutants of HIV. J. Med. Chem. 40(2), 181-191. Jadhav, P. K., and Man, H. W. (1996). Synthesis of 7-membered cyclic oxamides: Novel HIV-1 protease inhibitors. Tetrahed. Lett. 37, 1153-1156. Jadhav, P. K., Woerner, F. J., Lam, P. Y. S., Hodge, C. N., Eyermann, C. J., Man, H.-W., Daneker, W. F., Bacheler, L. T., Rayner, M. M., Meek, J. L., Erickson-Viitanen, S., Jackson, D. A., Calabrese, J. C., Schadt, M., and Chang, C.-H. (1998). Nonpeptide cyclic cyanoguanidines as HIV-1 protease inhibitors: Synthesis, structure-activity relationships, and X-ray crystal structure studies. J. Med. Chem. 41, 1446-1455. Janakiraman, M., Watenpaugh, K., Tomich, P., Chong, K., Turner, S., Tommasi, R., Thaisrivongs, S., and Strohbach, J. (1998). Non-peptidic HIV protease inhibitors: C2-symmetry-based design of bis-sulfonamide dihydropyrones. Bioorg. Med. Chem. Lett. 8, 1237-1242. Jeong, Y., Seo, M., Choi, Y., Kim, I., and Lee, Y. (1997). High-performance liquid chromatographic assay of a new HIV-1 protease inhibitor, LB71350, in the plasma of dogs. J. Chrom. B 703, 284-288. 
Kageyama, S., Mimoto, T., Murakawa, Y., Nomizu, M., Ford, H., Jr., Shirasaka, T., Gulnik, S., Erickson, J., Takada, K., Hayashi, H., Broder, S., Kiso, Y., and Mitsuya, H. (1993). In vitro anti-human immunodeficiency virus (HIV) activities of transition state mimetic HIV protease inhibitors containing allophenylnorstatine. Antimicrob. Agents Chemother. 37(4), 810-817. Kaldor, S. W., Appelt, K., Fritz, J. E., Hammond, M., Crowell, T. A., Baxter, A. J., Hatch, S. D., Wiskerchen, M., and Muesing, M. A. (1995). A systematic study of the P1-P3 spanning sidechains for the inhibition of HIV-1 protease. Bioorg. Med. Chem. Lett. 5, 715-720. Kaldor, S. W., Hammond, M., Dressman, B. A., Fritz, J. E., Crowell, T. A., and Hermann, R. A. (1994). New dipeptide isosteres useful for the inhibition of HIV-1 protease. Bioorg. Med. Chem. Lett. 4(11), 1385-1390. Kaldor, S. W., Kalish, V. J., Davies, J. F., II, Shetty, B. V., Fritz, J. E., Appelt, K., Burgess, J. A., Campanale, K. M., Chirgadze, N. Y., Clawson, D. K., Dressman, B. A., Hatch, S. D., Khalil, D. A., Kosa, M. B., Lubbehusen, P. P., Muesing, M. A., Patick, A. K., Reich, S. H., Su, K. S., and Tatlock, J. H. (1997). Viracept (nelfinavir mesylate, AG1343): A potent, orally bioavailable inhibitor of HIV-1 protease. J. Med. Chem. 40(24), 3979-3985. Kalish, V. J., Tatlock, J. H., Davies, J. F., II, Kaldor, S. W., Dressman, B. A., Reich, S., Pino, M., Nyugen, D., Appelt, K., Musick, L., and Wu, B. (1995). Structure-based drug design of nonpeptidic P2 substituents for HIV-1 protease inhibitors. Bioorg. Med. Chem. Lett. 5, 727-732. Kalyanaraman, V. S., Sarngadharan, M. G., and Robert-Guroff, M. (1982). A new subtype of human T-cell leukemia virus (HTLV-II) associated with a T-cell variant of hairy cell leukemia. Science 218, 571-573. Kaplan, A. H., Manchester, M., Smith, T., Yang, Y. L., and Swanstrom, R. (1996). Conditional human immunodeficiency virus type 1 protease mutants show no role for the viral protease early in virus replication. J. Virol. 70(9), 5840-5844. Kaplan, A. H., Michael, S. F., Wehbie, R. S., Knigge, M. 
F., Paul, D. A., Everitt, L., Kempf, D. J., Norbeck, D. W., Erickson, J. W., and Swanstrom, R. (1994). Selection of multiple human immunodeficiency virus type 1 variants that encode viral proteases with decreased sensitivity to an inhibitor of the viral protease. Proc. Natl. Acad. Sci. USA 91, 5597-5601. Katoh, I., Yasunaga, T., Ikawa, Y., and Yoshinaka, Y. (1987). Inhibition of retroviral protease activity by an aspartyl proteinase inhibitor. Nature 329, 654-656. Katoh, I., Yoshinaka, Y., Rein, A., Shibuya, M., Odaka, T., and Oroszlan, S. (1985). Murine leukemia virus maturation: Protease region required for conversion from "immature" to "mature" core form and for virus infectivity. Virology 145, 280-292. Katz, R. A., and Skalka, A. M. (1994). The retroviral enzymes. Annu. Rev. Biochem. 63, 133-173. Kawano, F., Yamaguchi, K., Nishimura, H., Tsuda, H., and Takatsuki, K. (1985). Variation in the clinical courses of adult T-cell leukemia. Cancer 55, 851-856. Kay, J., and Dunn, B. M. (1990). Viral proteinases: Weakness in strength. Biochim. Biophys. Acta 1048, 1-18. Kempf, D. J. (1994). Progress in the discovery of orally bioavailable inhibitors of HIV protease. Perspect. Drug Discov. Design 2, 427-436. Kempf, D. J., Codacovi, L., Wang, X. C., Kohlbrenner, W. E., Wideburg, N. E., Saldivar, A., Vasavanonda, S., Marsh, K. C., Bryant, P., Sham, H. L., Green, B. E., Betebenner, D. A., Erickson, J., and Norbeck, D. W. (1993). Symmetry-based inhibitors of HIV protease: Structure-activity studies of acylated 2,4-diamino-1,5-diphenyl-3-hydroxypentane and 2,5-diamino-1,6-diphenylhexane-3,4-diol. J. Med. Chem. 36(3), 320-330. Kempf, D. J., Flentge, C. A., Wideburg, N. E., Saldivar, A., Vasavanonda, S., and Norbeck, D. W. (1995a). Evaluation of substituted benzamides as P2 ligands for symmetry-based inhibitors of HIV protease. Bioorg. Med. Chem. Lett. 5, 2725-2728. Kempf, D. J., Marsh, K. C., Denissen, J. F., McDonald, E., Vasavanonda, S., Flentge, C. A., Green, B. 
E., Fino, L., Park, C. H., Kong, X. P., Wideburg, N. E., Saldivar, A., Ruiz, L., Kati, W. M., Sham, H. L., Robins, T., Stewart, K. D., Hsu, A., Plattner, J. J., Leonard, J. M., and Norbeck,

D. W. (1995). ABT-538 is a potent inhibitor of human immunodeficiency virus protease and has high oral bioavailability in humans. Proc. Natl. Acad. Sci. USA 92(7), 2484-2488. Kempf, D. J., Marsh, K. C., Kumar, G., Rodrigues, A. D., Denissen, J. F., McDonald, E., Kukulka, M. J., Hsu, A., Granneman, G. R., Baroldi, P. A., Sun, E., Pizzuti, D., Plattner, J. J., Norbeck, D. W., and Leonard, J. M. (1997). Pharmacokinetic enhancement of inhibitors of the human immunodeficiency virus protease by coadministration with ritonavir. Antimicrob. Agents Chemother. 41(3), 654-660. Kempf, D. J., Marsh, K. C., Paul, D. A., Knigge, M. F., Norbeck, D. W., Kohlbrenner, W. E., Codacovi, L., Vasavanonda, S., Bryant, P., Wang, X. C., Wideburg, N. E., Clement, J. J., Plattner, J. J., and Erickson, J. W. (1991). Antiviral and pharmacokinetic properties of C2 symmetric inhibitors of the human immunodeficiency virus type 1 protease. Antimicrob. Agents Chemother. 35, 2209-2214. Kempf, D. J., Norbeck, D. W., Codacovi, L., Wang, X. C., Kohlbrenner, W. E., Wideburg, N. E., Paul, D. A., Knigge, M. F., Vasavanonda, S., Craig-Kennard, A., Saldivar, A., Rosenbrook, W., Jr., Clement, J. J., Plattner, J. J., and Erickson, J. (1990). Structure-based, C2 symmetric inhibitors of HIV protease. J. Med. Chem. 33(10), 2687-2689. Kim, B. M., Guare, J. P., Vacca, J. P., Michelson, S. R., Darke, P. L., Zugay, J. A., Emini, E. A., Schleif, W., Lin, J. H., Chen, I. W., Vastag, K., Anderson, P. S., and Huff, J. R. (1995a). Thiophene derivatives as extremely high affinity P3' ligands for the hydroxyethylpiperazine class of HIV-1 protease inhibitors. Bioorg. Med. Chem. Lett. 5, 185-190. Kim, B. M., Hanifin, C. M., Zartman, C. B., Vacca, J. P., Michelson, S. R., Lin, J. H., Chen, I.-W., Vastag, K., Darke, P. L., Zugay, J. A., Emini, E. A., Schleif, W., Anderson, P. S., and Huff, J. R. (1995b). 
Substituted alkylpyridines as P3' ligands for the hydroxyethylpiperazine class of HIV-1 protease inhibitors: Improved pharmacokinetic profiles. Bioorg. Med. Chem. Lett. 5, 2239-2244. Kim, B. M., Vacca, J. P., Guare, J. P., Hanifin, C. M., Michelson, S. R., Darke, P. L., Zugay, J. A., Emini, E. A., Schleif, W., Lin, J. H., Chen, I.-W., Vastag, K., Ostovic, D., Anderson, P. S., and Huff, J. R. (1994). A new hydroxyethylamine class of HIV-1 protease inhibitors with high antiviral potency and oral bioavailability. Bioorg. Med. Chem. Lett. 4, 2273-2278. Kim, C. U., McGee, L. R., Krawczyk, S. H., Harwood, E., Harada, Y., Swaminathan, S., Bischofberger, N., Chen, M. S., Cherrington, J. M., Xiong, S. F., Griffin, L., Cundy, K. C., Lee, A., Yu, B., Gulnik, S., and Erickson, J. W. (1996). New series of potent, orally bioavailable, non-peptidic cyclic sulfones as HIV-1 protease inhibitors. J. Med. Chem. 39(18), 3431-3434. Kim, E. E., Baker, C. T., Dwyer, M. D., Murcko, M. A., Rao, B. G., Tung, R. D., and Navia, M. A. (1995). Crystal structure of HIV-1 protease in complex with VX-478, a potent and orally bioavailable inhibitor of the enzyme. J. Am. Chem. Soc. 117, 1181-1182. Kitazaki, T., Asano, T., Kato, K., Kishimoto, S., and Itoh, K. (1994). Synthesis and human immunodeficiency virus (HIV)-1 protease inhibitory activity of tripeptide analogues containing a dioxoethylene moiety. Chem. Pharm. Bull. 42(12), 2636-2640. Kohl, N. E., Emini, E. A., Schleif, W. A., Davis, L. J., Heimbach, J. C., Dixon, R. A. F., Scolnick, E. M., and Sigal, I. S. (1988). Active human immunodeficiency virus protease is required for viral infectivity. Proc. Natl. Acad. Sci. USA 85, 4686-4690. Krafft, G. A., and Wang, G. T. (1994). Synthetic approaches to continuous assays of retroviral proteases. In "Methods in Enzymology" (L. C. Kuo and J. A. Shafer, Eds.), Vol. 241, pp. 70-86. Academic Press, San Diego, CA. Kräusslich, H.-G., Schneider, H., Zybarth, G., Carter, C. A., and Wimmer, E. (1988). 
Processing of in vitro-synthesized gag precursor proteins of human immunodeficiency virus (HIV) type 1 by HIV proteinase generated in Escherichia coli. J. Virol. 62(11), 4393-4397. Lam, P. Y., Jadhav, P. K., Eyermann, C. J., Hodge, C. N., Ru, Y., Bacheler, L. T., Meek, J. L., Otto, M. J., Rayner, M. M., Wong, Y. N., Chang, C.-H., Weber, P. C., Jackson, D. A., Sharpe, T. R., and

Erickson-Viitanen, S. (1994). Rational design of potent, bioavailable, nonpeptide cyclic ureas as HIV protease inhibitors. Science 263, 380-384. Lam, P. Y., Ru, Y., Jadhav, P. K., Aldrich, P. E., DeLucca, G. V., Eyermann, C. J., Chang, C. H., Emmett, G., Holler, E. R., Daneker, W. F., Li, L., Confalone, P. N., McHugh, R. J., Han, Q., Li, R., Markwalder, J. A., Seitz, S. P., Sharpe, T. R., Bacheler, L. T., Rayner, M. M., Klabe, R. M., Shum, L., Winslow, D. L., Kornhauser, D. M., and Hodge, C. N. (1996). Cyclic HIV protease inhibitors: Synthesis, conformational analysis, P2/P2' structure-activity relationship, and molecular recognition of cyclic ureas. J. Med. Chem. 39(18), 3514-3525. Lamarre, D., Croteau, G., Pilote, L., Rousseau, P., and Doyon, L. (1994). Molecular characterization of HIV-1 variants resistant to specific viral protease inhibitors. Proceedings of the Third International Workshop on HIV Drug Resistance. Lange-Savage, G., Berchtold, H., Liesum, A., Budt, K. H., Peyman, A., Knolle, J., Sedlacek, J., Fabry, M., and Hilgenfeld, R. (1997). Structure of HOE/BAY 793 complexed to human immunodeficiency virus (HIV-1) protease in two different crystal forms-structure/function relationship and influence of crystal packing. Eur. J. Biochem. 248(2), 313-322. Lapatto, R., Blundell, T., Hemmings, A., Overington, J., Wilderspin, A., Wood, S., Merson, J. R., Whittle, P. J., Danley, D. E., Geoghegan, K. F., Hawrylik, S. J., Lee, S. E., Scheld, K., and Hobart, P. M. (1989). X-ray analysis of HIV-1 proteinase at 2.7 Å resolution confirms structural homology among retroviral enzymes. Nature 342, 299-302. Larder, B. A., Bloor, S., Hertogs, K., Van den Eynde, C., and Pauwels, R. (1998). HIV-1 Gag cleavage site changes are associated with specific protease mutations in plasma HIV-1 RNA but are not always retained in replication-competent, protease inhibitor-resistance recombinant viruses. 
Abstract 23 from the 2nd International Workshop on HIV Drug Resistance Treatment Strategies, Lake Maggiore, Italy. Lee, C. G. L., Gottesman, M. M., Cardarelli, C. O., Ramachandra, M., Jeang, K.-T., Ambudkar, S. V., Pastan, I., and Dey, S. (1998). HIV-1 protease inhibitors are substrates for the MDR1 multidrug transporter. Biochemistry 37, 3594-3601. Lee, C. S., Choy, N., Park, C., Choi, H., Son, Y. C., Kim, S., Ok, J. H., Yoon, H., and Kim, S. C. (1996). Design, synthesis, and characterization of dipeptide isostere containing cis-epoxide for the irreversible inactivation of HIV protease. Bioorg. Med. Chem. Lett. 6, 589-594. Lin, Y., Lin, X., Hong, L., Foundling, S., Heinrikson, R. L., Thaisrivongs, S., Leelamanit, W., Raterman, D., Shah, M., and Dunn, B. M. (1995). Effect of point mutations on the kinetics and the inhibition of human immunodeficiency virus type 1 protease: Relationship to drug resistance. Biochemistry 34(4), 1143-1152. Livingston, D. J., Pazhanisamy, S., Porter, D. J., Partaledis, J. A., Tung, R. D., and Painter, G. R. (1995). Weak binding of VX-478 to human plasma proteins and implications for anti-human immunodeficiency virus therapy. J. Infect. Dis. 172(5), 1238-1245. Lunney, E. A., Hagen, S. E., Domagala, J. M., Humblet, C., Kosinski, J., Tait, B. D., Warmus, J. S., Wilson, M., Ferguson, D., Hupe, D., Tummino, P. J., Baldwin, E. T., Bhat, T. N., Liu, B., and Erickson, J. W. (1994). A novel nonpeptide HIV-1 protease inhibitor: Elucidation of the binding mode and its application in the design of related analogs. J. Med. Chem. 37, 2664-2677. Lyle, T. A., Wiscount, C. M., Guare, J. P., Thompson, W. J., Anderson, P. S., Darke, P. L., Zugay, J. A., Emini, E. A., Schleif, W. A., Quintero, J. C., Dixon, R. A. F., Sigal, I. S., and Huff, J. R. (1991). Benzocycloalkyl amines as novel C-termini for HIV protease inhibitors. J. Med. Chem. 34, 1228-1230. Mammano, F., de la Carriere, L. C., Petit, C., and Clavel, F. (1998). 
Multiple impacts of HIV resistance to protease inhibitors on Gag and reverse transcriptase function. Abstract 47 from the 2nd International Workshop on HIV Drug Resistance Treatment Strategies, Lake Maggiore, Italy. Markowitz, M., Conant, M., Hurley, A., Schluger, R., Duran, M., Peterkin, J., Chapman, S., Patick, A., Hendricks, A., Yuen, G.J., Hoskins, W., Clendeninn, N., and Ho, D. D. (1998). A prelimi-

HIV Protease

53

nary evaluation of nelfinavir mesylate, an inhibitor of human immunodeficiency virus (HIV)-1 protease, to treat HIV infection.J. Infect. Dis. 177(6), 1533-1540. Markowitz, M., Mo, H., Kempf, D. J., Norbeck, D. W., Bhat, T. N., Erickson, J. W., and Ho, D. D. (1995). Selection and analysis of human immunodeficiency virus type i variants with increased resistance to ABT-538, a novel protease inhibitor.J. Virol. 69(2), 701-706. McPhee, F., Good, A. C., Kuntz, I. D., and Craik, C. S. (1996). Engineering human immunodeficiency virus 1 protease heterodimers as macromolecular inhibitors of viral maturation. Proc. Natl. Acad. Sci. USA 93(21), 11477-11481. McQuade, T. J., Tomasselli, A. G., Liu, L., Karacostas, V., Moss, B., Sawyer, T. K., Heinrikson, R. L., and Tarpley, W. G. (1990). A synthetic HIV-1 protease inhibitor with antiviral activity arrests HIV-like particle maturation. Science 247,454-456. Meek, T. D. (1992). Inhibitors of HIV-1 protease.J. Enzym. Inhib. 6, 65-98. Melnick, M., Reich, S. H., Lewis, K. K., Mitchell, L. J., Jr., Nguyen, D., Trippe, A. J., Dawson, H., Davies, J. F., 2nd, Appelt, K., Wu, B. W., Musick, L., Gehlhaar, D. K., Webber, S., Shetty, B., Kosa, M., Kahil, D., and Andrada, D. (1996). Bis tertiary amide inhibitors of the HIV-1 protease generated via protein structure-based iterative design. J. Med. Chem. 39(14), 2795-2811. Miller, M., Jask61ski, M., Rao, J. K. M., Leis, J., and Wlodawer, A. (1989). Crystal structure of a retroviral protease proves relationship to aspartic protease family. Nature 337, 576-579. Mimoto, T., Imai, J., Kisanuki, S., Enomoto, H., Hattori, N., Akaji, K., and Kiso, Y. (1992). Kynostatin (KNI)-227 and -272, highly potent anti-HIV agents: Conformationally-constrained tripeptide inhibitors of HIV protease containing allophenylnorstatine. Chem. Pharm. Bull. 40(8), 2251-2253. Mitsuya, H., and Broder, S. (1987). Strategies for antiviral therapy in AIDS. Nature 325,773-778. Mo, H., Markowitz, M., Majer, P., Burt, S. 
K., Gulnik, S. V., Suvorov, L. I., Erickson, J. W., and Ho, D. D. (1996). Design, synthesis, and resistance patterns of MP-134 and MP-167, two novel inhibitors of HIV type 1 protease. AIDS Res. Hum. Retroviruses 12(1), 55-61. Molla, A., Korneyeva, M., Gao, Q., Vasavanonda, S., Schipper, P. J., Mo, H. M., Markowitz, M., Chernyavskiy, T., Niu, P., Lyons, N., Hsu, A., Granneman, G. R., Ho, D. D., Boucher, C. A., Leonard, J. M., Norbeck, D. W., and Kempf, D.J. (1996). Ordered accumulation of mutations in HIV protease confers resistance to ritonavir. Nat. Med. 2(7), 760-766. Mous, J., Brun-Vezinet, F., Duncan, I. B., Hanggi, M., Jacobsen, H., and Vella, S. (1994). Characterisation of in vivo selected HIV-1 variants with reduced sensitivity to proteinase inhibitor Saquinavir. Tenth International Conference on AIDS and STD, Yokohama, Japan. Mulichak, A. M., Hui, J. O., Tomasselli, A. G., Heinrikson, R. L., Curry, K. A., Tomich, C. S., Thaisrivongs, S., Sawyer, T. K., and Watenpaugh, K. D. (1993). The crystallographic structure of the protease from human immunodeficiency virus type 2 with two synthetic peptidic transition state analog inhibitors.J. Biol. Chem. 268, 13103-13109. Munroe, J. E., and Hornback, W. J. (1993). 2,3-Bis-carboxamidomethyl substituted oxiranes as inhibitor of HIV protease and their use for the treatment of AIDS. EP575097, 1-55. Nagy, K., Young, M., Baboonian, C., Merson, J., Whittle, P., and Oroszlan, S. (1994). Antiviral activity of human immunodeficiency virus type 1 protease inhibitors in a single cycle of infection: Evidence for a role of protease in the early phase. J. Virol. 68, 757-765. Navia, M. A., Fitzgerald, P. M. D., McKeever, B. M., Leu, C.-T., Heimbach, J. C., Herber, W. K., Sigal, I. S., Darke, P. L., and Springer, J. P. (1989). Three-dimensional structure of aspartyl protease from human immunodeficiency virus HIV-1. Nature 337,615-620. Norbeck, D. W. (1990). Recent advances in anti-retroviral chemotherapy for AIDS. 
In "Annual Reports in Medicinal Chemistry" (J. A. Bristol, Ed.), Vol. 25, pp. 149-158. Academic Press, San Diego, CA. Norbeck, D. W., and Kempf, D.J. (1991). HIV protease inhibitors. In "Annual Reports in Medicinal Chemistry" (J. A. Bristol, Ed.), Vol. 26, pp. 141-150. Academic Press, San Diego, CA.

54

John W. Erickson and Michael A. Eissenstat

Nugiel, D. A., Jacobs, K., Cornelius, L., Chang, C. H., Jadhav, P. K., Holler, E. R., Klabe, R. M., Bacheler, L. T., Cordova, B., Garber, S., Reid, C., Logue, K. A., Gorey-Feret, L. J., Lam, G. N., Erickson-Viitanen, S., and Seitz, S. P. (1997). Improved P1/PI' substituents for cyclic urea based HIV-1 protease inhibitors: Synthesis, structure-activity relationship, and X-ray crystal structure analysis.J. Med. Chem. 40(10), 1465-1474. Nugiel, D. A., Jacobs, K., Kahenbach, R. F., Worley, T., Patel, M., Meyer, D. T., Jadhav, P. K., De Lucca, G. V., Smyser, T. E., Klabe, R. M., Bacheler, L. T., Rayner, M. M., and Seitz, S. P. (1996). Preparation and structure- activity relationship of novel P 1/P 1'-substituted cyclic ureabased human immunodeficiency virus type- 1 protease inhibitors. J. Med. Chem. 39(11), 2156 2169. Otto, M.J., Garber, S., Winslow, D. L., Reid, C. D., Aldrich, P., Jadhav, P. K., Patterson, C. E., Hodge, C. N., and Cheng, Y.-S. E. (1993). In vitro isolation and identification of human immunodeficiency virus (HIV) variants with reduced sensitivity to C-2 symmetrical inhibitors of HIV type 1 protease. Proc. Natl. Acad. Sci. USA 90, 7543-7547. Park, C., Koh, S. J., Son, Y. C., Choi, H., Lee, C. S., Choy, N., Moon, K. Y., Jung, W. H., Kim, S. C., and Yoon, H. (1995). Rational design of irreversible, pseudo-C2-symmetric HIV-1 protease inhibitors. Bioorg. Med. Chem. Lett. 5, 1843-1848. Patel, M., Kaltenbach, R., Nugiel, D., McHugh, R., Jadhav, P., Bacheler, L., Cordova, B., Klae, R., Viitanen, S., Garber, S., Reid, C., and Seitz, S. (1998). The synthesis of symmetrical and unsymmetrical P1/PI' cyclic ureas as HIV protease inhibitors. Bioorg. Med. Chem. Lett., 8, 1077-1082. Patick, A. K., Rose, R., Greytok, J., Bechtold, C. M., Hermsmeier, M. A., Chen, P. T., Barrish, J. C., Zahler, R., Colonno, R.J., and Lin, P.-F. (1995). Characterization of a human immunodeficiency virus type 1 variant with reduced sensitivity to an aminodiol protease inhibitor. J. Virol. 
69, 2148 -2152. Pearl, L. H., and Taylor, W. R. (1987). A structural model for the retroviral proteases. Nature 329, 351-354. Pettit, S., Moody, M., Wehbie, R., Kaplan, A., Nantermet, P., Klein, C., and Swanstrom, R. (1994). The p2 domain of human immunodeficiency virus type 1 Gag regulates sequential proteolytic processing and is required to produce fully infectious virions.J. Gen. Virol. 68, 8017-8027. Poiesz, B. J., Ruscetti, F. W., and Gazdar, A. F. (1980). Detection and isolation of type C retrovirus particles from fresh and cultured lymphocytes of a patient with cutaneous T-cell lymphoma. Proc. Natl. Acad. Sci. USA 77, 7415-7419. Poorman, R. A., Tomasselli, A. G., Heinrikson, R. L., and K6zdy, F.J. (1991). A cumulative specificity model for proteases from human immunodeficiency virus types 1 and 2, inferred from statistical analysis of an extended substrate data base. J. Biol. Chem. 266, 14554-14561. Poppe, S. M., Slade, D. E., Chong, K. T., Hinshaw, R. R., Pagano, P. J., Markowitz, M., Ho, D. D., Mo, H., Gorman, R. R., Dueweke, T. J., Thaisrivongs, S., and Tarpley, W. G. (1997). Antiviral activity of the dihydropyrone PNU-140690, a new nonpeptidic human immunodeficiencyvirus protease inhibitor. Antimicrob. Agents Chemother. 41 (5), 1058-1063. Prasad, J. V. N. V., Para, K. S., Lunney, E. A., Ortwine, D. F., Dunbar, J. B., Jr., Ferguson, D., Tummino, P. J., Hupe, D., Tait, B. D., Domagala, J. M., Humblet, C., Bhat, T. N., Liu, B., Guerin, D. M. A., Baldwin, E. T., Erickson, J. W., and Sawyer, T. K. (1994). Novel series of achiral, low molecular weight, and potent HIV-1 protease inhibitors. J. Am. Chem. Soc. 116(15), 69896990. Prasad, J. V. N., Lunney, E. A., Ferguson, D., Tummino, P. J., Rubin, R. J., Reyner, E. L., Stewart, B. H., Guttendorf, R. J., Domagala, J. M., Suvorov, L. I., Gulnik, S. V., Topoi, I. A., Bhat, T. N., and Erickson,J. W. (1995a). 
HIV protease inhibitors possessing a novel, high affinity and achiral P1/P2 ligand with unique pattern of in vitro resistance: Importance of conformationallyrestricted template in the design of enzyme inhibitors.J. Am. Chem. Soc. 117, 11070-11074. Prasad, J. V. N. V., Para, K. S., Tummino, P.J., Ferguson, D., McQuade, T.J., Lunney, E. A., Rapun-

HIV Protease

55

dalo, S. T., Batley, B. L., Hingorani, G., Domagala, J. M., Gracheck, S. J., Bhat, T. N., Liu, B., Baldwin, E. T., Erickson, J. W., and Sawyer, T. K. (1995b). Nonpeptidic potent HIV-1 protease inhibitors: (4-hydroxy-6-phenyl-2-oxo-2H-pyran-3-yl) thiomethanes which span the P1 to P2, subsites in a unique mode of binding. J. Med. Chem. 17,898-905. Rabasseda, X., Martel, A. M., and Castaner, J. (1997). Nelfinavir mesylate. Drugs Future 22,371377. Randad, R., Lubkowska, L., Bujacz, A., Gulnik, S., Yu, B., Silva, A., Munshi, S., Lynch, T., Clanton, D., Bhat, T. N., and Erickson, J. (1995b). Structure-based design of achiral anthranilamides as P2/P2' surrogates for symmetry-based HIV protease inhibitors: Design, synthesis, X-ray structure, enzyme inhibition and antiviral activity. Bioorg. Med. Chem. Lett. 5(21), 2557-2562. Randad, R. S., Lubkowska, L., Bhat, T. N., Munshi, S., Gulnik, S. V., Yu, B., and Erickson, J. W. (1995a). Symmetry-based HIV protease inhibitors: Rational design of 2-methylbenzamides as novel P2/P2' ligands. Bioorg. Med. Chem. Lett. 5(15), 1707-1712. Randad, R. S., Lubkowska, L., Silva, A. M., Guerin, D. M., Gulnik, S. V., Yu, B., and Erickson, J. W. (1996). Structure-based design of achiral, nonpeptidic hydroxybenzamide as a novel P2/P2' replacement for the symmetry-based HIV protease inhibitors. Bioorg. Med. Chem. Lett. 4(9), 1471-1480. Randad, R. S., Pan, W., Gulnik, S. V., Burt, S., and Erickson, J. W. (1994). De novo design of nonpeptidic HIV-1 protease inhibitors: Incorporation of structural water. Bioorg. Med. Chem. Lett. 4, 1247-1252. Rao, J. K. M., Erickson, J. W., and Wlodawer, A. (1991). Structural and evolutionary relationships between retroviral and eucaryotic aspartic proteinases. Biochemistry 30, 4663-4671. Reich, S. H., Melnick, M., Davies, J. F., II, Appelt, K., Lewis, K. K., Fuhry, M. A., Pino, M., Trippe, A. J., Nguyen, D., and Dawson, H. (1995). 
Protein structure-based design of potent orally bioavailable, nonpeptide inhibitors of human immunodeficiency virus protease. Proc. Natl. Acad. Sci. USA 92(8), 3298-3302. Reich, S. H., and Pino, J. J. (1994). HIV protease inhibitors and their preparation. WO94/15906, 1-194. Richman, D. D. (1996). Antiretroviral drug resistance: Mechanisms, pathogenesis, clinical significance. Adv. Exp. Med. Biol. 394, 383-395. Ridky, T. W., Kikonyogo, A., Leis, J., Gulnik, S., Copeland, T., Erickson, J., Wlodawer, A., Kurinov, I., Harrison, R. W., and Weber, I. T. (1998). Drug-resistant HIV-1 proteases identify enzyme residues important for substrate selection and catalytic rate. Biochemistry 37(39), 1383513845. Roberts, M., and Oroszlan, S. (1989). The preparation and biochemical characterization of intact capsids of equine infectious anemia virus. Biochem. Biophys. Res. Commun. 160,486-494. Roberts, N. A., Martin, J. A., Kinchington, D., Broadhurst, A. V., Craig, J. C., Duncan, I. B., Galpin, S. A., Handa, B. K., Kay, J., KrOhn, A., Lambert, R. W., Merrett, J. H., Mills, J. S., Parkes, K. E. B., Redshaw, S., Ritchie, A.J., Taylor, D. L., Thomas, G. J., and Machin, P.J. (1990). Rational design of peptide-based HIV proteinase inhibitors. Science 248, 358-361. Rodgers, J. D., Johnson, B. L., Wang, H., Greenberg, R. A., Erickson-Viitanen, S., Klabe, R. M., Cordova, B. C., Rayner, M. M., Lam, G. N., and Chang, C.-H. (1996). Potent cyclic urea HIV protease inhibitors with benzofused heterocycles as P2/P2' groups. Bioorg. Med. Chem. Lett. 6, 2919-2924. Romines, K. R., Morris, J. K., Howe, W. J., Tomich, P. K., Horng, M. M., Chong, K. T., Hinshaw, R. R., Anderson, D. J., Strohbach, J. W., Turner, S. R., and Mizsak, S. A. (1996). Cycloalkylpyranones and cycloalkyldihydropyrones as HIV protease inhibitors: Exploring the impact of ring size on structure-activity relationships.J. Med. Chem. 39(20), 4125-4130. Romines, K. R., and Thaisrivongs, S. (1995). 
Analogs of 4-hydroxypyrone: Potent, non-peptidic HIV protease inhibitors. Drugs Future 204, 377-382.

56

John W. Erickson and Michael A. Eissenstat

Romines, K. R., Watenpaugh, K. D., Howe, W. J., Tomich, P. K., Lovasz, K. D., Morris, J. K., Janakiraman, M. N., Lynn, J. C., Horng, M. M., and Chong, K. T. (1995a). Structure-based design of nonpeptidic HIV protease inhibitors from a cyclooctylpyranone lead structure. J. Med. Chem. 38(22), 4463-4473. Romines, K. R., Watenpaugh, K. D., Tomich, P. K., Howe, W. J., Morris, J. K., Lovasz, K. D., Mulichak, A. M., Finzel, B. C., Lynn, J. C., and Horng, M. M. (1995b). Use of medium-sized cycloalkyl rings to enhance secondary binding: Discovery of a new class of human immunodeficiency virus (HIV) protease inhibitors.J. Med. Chem. 38(11), 1884-1891. Rutenberg, E. E., McPhee, F., Kaplan, A. P., Gallion, S. L., Hogan, J. C., Jr., Craik, C. S., and Stroud, R. M. (1996). A new class of HIV-1 protease inhibitor: The crystallographic structure, inhibition and chemical synthesis of an aminimide peptide isostere. Bioorg. Med. Chem. 4(9), 1545-1558. Sardana, V. V., Schlabach, A. J., Graham, P., Bush, B. L., Condra, J. H., Culberson, J. C., Gotlib, L., Graham, D. J., Kohl, N. E., LaFemina, R. L., Schneider, C. L., Wolanski, B. S., Wolfgang, J. A., and Emini, E. A. (1994). Human immunodeficiency virus type I protease inhibitors: Evaluation of resistance engendered by amino acid substitutions in the enzyme's substrate binding site. Biochemistry 33, 2004-2010. Schechter, I., and Berger, A. (1967). On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 27(2), 157-162. Scholz, D., Billich, A., Charpiot, B., Ettmayer, P., Lehr, P., Rosenwirth, B., Schreiner, E., and Gstach, H. (1994). Inhibitors of HIV-1 proteinase containing 2-heterosubstituted 4-amino-3-hydroxy5-phenylpentanoic acid: Synthesis, enzyme inhibition, and antiviral activity. J. Med. Chem. 37, 3079-3089. Seelmeier, S., Schmidt, H., Turk, V., and vonder Helm, K. (1988). Human immunodeficiencyvirus has an aspartic-type protease that can be inhibited by pepstatin A. Pro. Natl. Acad. Sci. 
USA 85, 6612-6616. Sham, H. L., Betebenner, D. A., Zhao, C., Wideburg, N. E., Saldivar, A., Kempf, D. J., Plattner, J. J., and Norbeck, D. W. (1993). Facile synthesis of potent HIV-1 protease inhibitors containing a novel pseudo-symmetric dipeptide isostere.J. Chem. Soc. Chem. Commun. 13, 1052-1053. Sham, H. L., Zhao, C., Marsh, K. C., Betebenner, D. A., Lin, S., McDonald, E., Vasavanonda, S., Wideburg, N., Saldivar, A., and Robins, T. (1995). Potent inhibitors of the HIV-1 protease with good oral bioavailabilities. Biochem. Biophys. Res. Commun. 211 (1), 159-165. Sham, H. L., Zhao, C., Marsh, K. C., Betebenner, D. A., Lin, S., Rosenbrook, W., Jr., Herrin, T., Li, L., Madigan, D., Vasavanonda, S., Molla, A., Saldivar, A., McDonald, E., Wideburg, N. E., Kempf, D., Norbeck, D. W., and Plattner, J. J. (1996). Novel azacyclic ureas that are potent inhibitors of HIV-1 protease. Biochem. Biophys. Res. Commun. 225(2), 436-440. Shetty, B. V., Kosa, M. B., Khalil, D. A., and Webber, S. (1996). Preclinical pharmacokinetics and distribution to tissue of AG1343, an inhibitor of human immunodeficiency virus type i protease. Antimicrob. Agents Chemother. 40(1), 110 -114. Shoeman, R. L., Sachse, B., Honer, E., Mothes, E., Kaufmann, M., and Traub, P. (1993). Cleavage of human and mouse cytoskeletal and sarcomeric proteins by human immunodeficiency virus type i protease: Actin, desmin, myosin, and tropomyosin. Am. J. Pathol. 142,221-230. Skulnick, H. I., Johnson, P. D., Aristoff, P. A., Morris, J. K., Lovasz, K. D., Howe, W. J., Watenpaugh, K. D., Janakiraman, M. N., Anderson, D. J., Reischer, R. J., Schwartz, T. M., Banitt, L. S., Tomich, P. K., Lynn, J. C., Horng, M. M., Chong, K. T., Hinshaw, R. R., Dolak, L. A., Seest, E. P., Schwende, F. J., Rush, B. D., Howard, G. M., Toth, L. N., Wilkinson, K. R., and Romines, K. R. (1997). Structure-based design of nonpeptidic HIV protease inhibitors: the sulfonamidesubstituted cyclooctylpyranones. J. Med. Chem. 40(7), 1149-1164. 
Skulnick, H. I., Johnson, P. D., Howe, W. J., Tomich, P. K., Chong, K. T., Watenpaugh, K. D., Janakiraman, M. N., Dolak, L. A., McGrath, J. P., and Lynn, J. C. (1995). Structure-based design of

HIV Protease

57

sulfonamide-substituted non-peptidic HIV protease inhibitors. J. Med. Chem. 38(26), 49684971. Slee, D. H., Laslo, K. L., Elder, J. H., Ollmann, I. R., Gustchina, A., Kervinen, J., Zdanov, A., Wlodawer, A., and Wong, C. H. (1995). Selectivity in the inhibition of HIV and FIV protease: Inhibitory and mechanistic studies of pyrrolidine-containing cr-keto amide and hydroxyethylamine core structures.J. Am. Chem. Soc. 117, 11867-11878. Smallheer, J. M., Mchugh, R.J., Chang, C. H., Kaltenbach, R. F., Worley, T. V., Klabe, R. M., Bacheler, L. T., Rayner, M. M., Erickson-Viitanen, S., and Seitz, S. P. (1997). Functionalized aliphatic P2/P2' analogs of HIV-1 protease inhibitor DMP323. Bioorg. Med. Chem. Lett. 7(11), 13651370. Spinelli, S., Liq, Q. Z., Alzari, P. M., Hirel, P. H., and Poljak, R.J. (1991). The three-dimensional structure of the aspartyl protease from the HIV-1 isolate BRU. Biochimie 73, 1391-1393. St. Clair, M. H., Millard, J., Rooney, J., Tisdale, M., Parry, N., Sadler, B. M., Blum, M. R., and Painter, G. (1996). In vitro antiviral activity of 141W94 (VX-478) in combination with other antiretroviral agents. Antiviral Res. 29(1), 53-56. Stowasser, B., Budt, K. H., Qi, L. J., Peyman, A., and Ruppert, D. (1992). New hybrid transition state analog inhibitors of HIV protease with peripheric C2-symmetry. Tetrahed. Lett. 33, 66256627. Swanstrom, R., and Wills, J. W. (1997). Synthesis, assembly, and processing of viral proteins. In "Retroviruses" (J. M. Coffin, S. H. Hughes, and H. E. Varmus, Eds.), pp. 263-334. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. Szelke, M. (1985). Chemistry of renin inhibitors. In "Aspartic Proteinases and Their Inhibitors" (V. Kostka, Ed.), pp. 421-441. W. de Gruyter, New York. Tait, B. D., Domagala, J., Ellsworth, E. L., Ferguson, D., Gajda, C., Hupe, D., Lunney, E. A., and Tummino, P. J. (1996). Inhibitors of HIV protease: Unique non-peptide active site templates. J. Mol. Recognit. 9(2), 139-142. Tait, B. 
D., Hagen, S., Domagala, J., Ellsworth, E. L., Gajda, C., Hamilton, H. W., Prasad, J. V., Ferguson, D., Graham, N., Hupe, D., Nouhan, C., Tummino, P. J., Humblet, C., Lunney, E. A., Pavlovsky, A., Rubin, J., Gracheck, S. J., Baldwin, E. T., Bhat, T. N., Erickson, J. W., Gulnik, S. V., and Liu, B. (1997). 4-hydroxy-5,6-dihydropyrones. 2. Potent non-peptide inhibitors of HIV protease.J. Med. Chem. 40(23), 3781-3792. Tam, T., Carriere, J., MacDonald, I. D., Castelhano, A. L., Pliura, D. H., Dewdney, N. J., Thomas, E. M., Bach, C., Barnett, J., Chan, H., and Krantz, A. (1992). Intriguing structure-activity relations underlie the potent inhibition of HIV protease by norstatine-based peptides. J. Med. Chem. 35, 1318-1320. Tang, J., James, M. N. G., Hsu, I. N., Jenkins, J. A., and Blundell, T. L. (1978). Structural evidence for gene duplication in the evolution of the acid proteases. Nature 271,618-621. Taylor, D. L., Ahmed, P. S., Brennan, T. M., Bridges, C. G., Tyms, A. S., Van Dorsselaer, V., Tamus, C., Homsperger, J.-M., and Schirlin, D. (1997). Anti-human immunodeficiency virus activity, bioavailability and drug resistance profile of the novel proteinase inhibitor MDL 74,695. Antiviral Chem. Chemother. 8, 205-214. Thaisrivongs, S., Janakiraman, M. N., Chong, K. T., Tomich, P. K., Dolak, L. A., Turner, S. R., Strohbach, J. W., Lynn, J. C., Horng, M. M., Hinshaw, R. R., and Watenpaugh, K. D. (1996). Structure-based design of novel HIV protease inhibitors: Sulfonamide-containing 4-hydroxycoumarins and 4-hydroxy-2-pyrones as potent non-peptidic inhibitors. J. Med. Chem. 39(12), 2400 -2410. Thaisrivongs, S., Skulnick, H. I., Turner, S. R., Strohbach, J. W., Tommasi, R. A., Johnson, P. D., Aristoff, P. A., Judge, T. M., Gammill, R. B., Morris, J. K., Romines, K. R., Chrusciel, R. A., Hinshaw, R. R., Chong, K. T., Tarpley, W. G., Poppe, S. M., Slade, D. E., Lynn, J. C., Horng,

58

John W. Erickson and Michael A. Eissenstat

M. M., Tomich, P. K., Seest, E. P., Dolak, L. A., Howe, W. J., Howard, G. M., Schwende, F. J., Toth, L. N., Padbury, G. E., Wilson, G. J., Shiou, L., Zipp, G. L., Wilkinson, K. F., Rush, B. D., Ruwart, M. J., Koeplinger, K. A., Zhao, Z., Cole, S., Zaya, R. M., Kakuk, T. J., Janakiraman, M. N., and Watenpaugh, K. D. (1996a). Structure-based design of HIV protease inhibitors: Sulfonamide-containing 5,6-dihydro-4-hydroxy-2-pyrones as non-peptidic inhibitors. J. Med. Chem. 39(22), 4349-53. Thaisrivongs, S., Romero, D. L., Tommasi, R. A., Janakiraman, M. N., Strohbach, J. W., Turner, S. R., Biles, C., Morge, R. R., Johnson, P. D., Aristoff, P. A., Tomich, P. K., Lynn, J. C., Horng, M. M., Chong, K. T., Hinshaw, R. R., Howe, W.J., Finzel, B. C., and Watenpaugh, K. D. (1996b). Structure-based design of HIV protease inhibitors: 5,6-Dihydro-4-hydroxy-2-pyrones as effective, nonpeptidic inhibitors.J. Med. Chem. 39(23), 4630-4642. Thaisrivongs, S., Tomich, P. K., Watenpaugh, K. D., Chong, K.-T., Howe, W. J., Yang, C.-P., Strohbach, J. W., Turner, S. R., McGrath, J. P., Bohanon, M. J., Lynn, J. C., Mulichak, A. M., Spinelli, P. A., Hinshaw, R. R., Pagano, P. J., Moon, J. B., Ruwart, M. J., Wilkinson, K. F., Rush, B. D., Zipp, G. L., Dalga, R. J., Schwende, F. J., Howard, G. M., Padbury, G. E., Toth, L. N., Zhao, Z., Koeplinger, K. A., Kakuk, T. J., Cole, S. L., Zaya, R. M., Piper, R. C., and Jeffrey, P. (1994). Structure-based design of HIV protease inhibitors: 4-Hydroxycoumarins and 4-hydroxy-2pyrones as non-peptidic inhibitors. J. Med. Chem. 37, 3200-3204. Thaisrivongs, S., Watenpaugh, K. D., Howe, W. J., Tomich, P. K., Dolak, L. A., Chong, K. T., Tomich, C. C., Tomasselli, A. G., Turner, S. R., Strohbach, J. W., Mulichak, A. M., Janakiraman, M. N., Moon, J. B., Lynn, J. C., Horng, M. M., Hinshaw, R. R., Curry, K. A., Rothrock, D. J. (1995). 
Structure-based design of novel HIV protease inhibitors: Carboxamide-containing 4-hydroxycoumarins and 4-hydroxy-2-pyrones as potent nonpeptidic inhibitors. J. Med. Chem. 38(18), 3624-3637. Thomas, G. J., Bushnell, D. J., and Martin, J. A. (1994). Carbocyclic analogues of hydroxyethylamine containing inhibitors of HIV proteinase. Bioorg. Med. Chem. Lett. 4, 2759-2762. Thompson, S. K., Murthy, K. H. M., Zhao, B., Winborne, E., Green, D. W., Fisher, S. M., DesJarlais, R. L., Tomaszek, T. A.,Jr., Meek, T. D., Gleason, J. G., and Abdel-Meguid, S. S. (1994). Rational design, synthesis, and crystallographic analysis of a hydroxyethylene-based HIV-1 protease inhibitor containing a heterocyclic PI,-P2, amide bond isostere. J. Med. Chem. 37, 3100-3107. Toh, H., Ono, M., Saigo, K., and Miyata, T. (1985). Retroviral protease-like sequence in the yeast transposon Tyl. Nature 315,691-692. Tomasselli, A. G., Howe, W. J., Sawyer, T. K., Wlodawer, A., and Heinrikson, R. L. (1991). The complexities of AIDS: an assessment of the HIV protease as a therapeutic target. Chim. Oggi 9, 6-27. Tomasselli, A. G., Sarcich, J. L., Barrett, L. J., Reardon, I. M., Howe, W. J., Evans, D. B., Sharma, S. K., and Heinrikson, R. L. (1993). Human immunodeficiency virus type-1 reverse transcriptase and ribonuclease H as substrates of the viral protease. Prot. Sci. 2, 2167-2176. Tong, L., Pav, S., Mui, S., Lamarre, D., Yoakim, C., Beaulieu, P., and Anderson, P. C. (1995). Crystal structures of HIV-2 protease in complex with inhibitors containing the hydroxyethylamine dipeptide isostere. Structure 3(1), 33-40. Tummino, P. J., Ferguson, D., Hupe, L., and Hupe, D. (1994). Competitive inhibition of HIV-1 protease by 4-hydroxy-benzopyran-2-ones and by 4-hydroxy-6-phenylpyran-2-ones. Biochem. Biophys. Res. Commun. 200, 1658-1664. Tummino, P. J., Prasad, J. V., Ferguson, D., Nouhan, C., Graham, N., Domagala, J. M., Ellsworth, E., Gajda, C., Hagen, S. E., Lunney, E. A., Para, K. S., Tait, B. 
D., Pavlovsky, A., Erickson, J. W., Gracheck, S., McQuade, T.J., and Hupe, D.J. (1996). Discovery and optimization of nonpeptide HIV-1 protease inhibitors. Bioorg. Med. Chem. 4(9), 1401-1410. Uchida, H., Maeda, Y., and Mitsuya, H. (1997). HIV-1 protease does not play a critical role in the early stages of HIV-1 infection. Antiviral Res. 36(2), 107-113.

HIV Protease

59

Urban, J., Konvalinka,J., Stehlikova,J., Gregorova, E., Majer, E, Soucek, M., Andreansky, M., Fabry, M., and Strop, E (1992). Reduced bond tight-binding inhibitors of HIV-1 protease: Fine tuning of the enzyme subsite specificity. FEBS Lett. 298, 9-13. Vacca, J. E, and Condra, J. H. (1997). Clinically effective HIV-1 protease inhibitors. DDT 2, 261272. Vacca, J. E, Dorsey, B. D., Schleif, W. A., Levin, R. B., McDaniel, S. L., Darke, E L., Zugai, J., Quintero, J. C., Blahy, O. M., Roth, E., Sardana, V. V., Schlabach, A. J., Graham, E I., Condra, J. H., Gotlib, L., Holloway, M. K., Lin, J., Chen, I.-W., Vastag, K., Ostovic, D., Anderson, E S., Emini, E. A., and Huff, J. R. (1994). L-735,524: An orally bioavailable human immunodeficiency virus type i protease inhibitor. Proc. Natl. Acad. Sci. USA 91, 4096-4100. Vacca, J. E, Guare, J. E, deSolms, S. J., Sanders, W. M., Giuliani, E. A., Young, S. D., Darke, E L., Zugay, J., Sigal, I. S., Schleif, W. A., Quintero, J. C., Emini, E. A., Anderson, E S., and Huff, J. R. (1991). L-687,908: A potent hydroxyethylene-containing HIV protease inhibitor. J. Med. Chem. 34, 1225-1228. Vazquez, M. L., Bryant, M. L., Clare, M., DeCrescenzo, G. A., Doherty, E. M., Freskos, J. N., Getman, D. E, Houseman, K. A.,Julien, J. A., and Kocan, G. E (1995). Inhibitors ofHIV-1 protease containing the novel and potent (R)-(hydroxyethyl)sulfonamide isostere. J. Med. Chem. 38(4), 581-584. Wain-Hobson, S., Vartanian, J. E, Henry, M., Chenciner, N., Cheynier, R., Delassus, S., Martins, L. E, Sala, M., Nugeyre, M. T., and Guetard, D. (1991). LAV visited: Origins of the early HIV-1 isolates from Institut Pasteur. Science 252,961-965. Wei, X., Ghosh, S. K., Taylor, M. E., Johnson, V. A., Emini, E. A., Deutsch, E, Lifson, J. D., Bonhoeffer, S., Nowak, M. A., Hahn, B. H., Saag, M. S., and Shaw, G. M. (1995). Viral dynamics in human immunodeficiency virus type 1 infection. Nature 373, 117-122. 
Weigers, K., Rutter, G., Kottler, H., Tessmer, U., Hohenberg, H., and Krausslich, H.-G. (1998). Sequential steps in human immunodeficiency virus particle maturation revealed by alterations of individual gag polyprotein cleavage sites. J. Virol. 72, 2846-2854. Wild, H.,Jutta, H., Lautz,J., and Paessens, A. (1993). 5-Oxo-dibenzo[a,d]cyclohepta-l,4-dieneund ihre Verwneundg als retrovirale Mittel. O589322A1, 1-26. Wilkerson, W. W., Akamike, E., Cheatham, W. W., Hollis, A. Y., Collins, R. D., DeLucca, I., Lam, E Y., and Ru, Y. (1996). HIV protease inhibitory bis-benzamide cyclic ureas: A quantitative structure-activity relationship analysis. J. Med. Chem. 39(21), 4299-4312. Wilkerson, W. W., Dax, S., and Cheatham, W. W. (1997). Nonsymmetrically substituted cyclic urea HIV protease inhibitors.J. Med. Chem. 40(25), 4079-4088. Wilson, S. I., Phylip, L. H., Mills,J. S., Gulnik, S. V., Erickson,J. W., Dunn, B. M., and Kay,J. (1997). Escape mutants of HIV-1 proteinase-enzymic efficiency and susceptibility to inhibition. Biochim. Biophys. Acta Prot. Struct. Mol. Enzymol. 1339(1), 113-125. Wlodawer, A., and Erickson, J. W. (1993). Structure-based inhibitors of HIV-1 protease. Annu. Rev. Biochem. 62,543-585. Wlodawer, A., Miller, M., Jask61ski, M., Sathyanarayana, B. K., Baldwin, E., Weber, I. T., Selk, L. M., Clawson, L., Schneider, J., and Kent, S. B. H. (1989). Conserved folding in retroviral proteases: Crystal structure of a synthetic HIV-1 protease. Science 245(4918), 616-621. Xie, D., Gulnik, S., Gustchina, E., Yu, B., Shao, W., Qoronfleh, W., Nathan, A., and Erickson, J. W. (1998). Linkage of dimer stability and inhibitor binding for drug resistant HIV-1 protease mutants at neutral pH. Abstract 25 from the 2nd International Workshop on HIV Drug Resistance Treatment Strategies, Lake Maggiore, Italy. Yoon, H., Choy, N., Kim, S. C., Choi, H., Park, C. H., Moon, K. Y., Jung, W., Kim, C. R., Lee, C. S., Koh, J. S., and Kim, S. S. (1997). 
Irreversible HIV protease inhibitors, intermediates, compositions and processes for the preparation thereof. US 5696134. York, D. M., Darden, T. A., Pedersen, L. G., and Anderson, M. W. (1993). Molecular dynamics

60

John W. Erickson and Michael A. Eissenstat

simulation of HIV-1 protease in a crystalline environment and in solution. Biochemistry 32, 1443-1453. Yu, Z., Caldera, P., McPhee, F., De Voss, J. J., Jones, P. R., Burlingame, A. L., Kuntz, I. D., Craik, C. S., and Ortiz de Montellano, P. R. (1996). Irreversible inhibition of the HIV-1 protease: Targeting alkylating agents to the catalytic aspartate groups. J. Am. Chem. Soc. 118, 5846-5856. Zhang, Y. M., Imamichi, H., Imamichi, T., Lane, H. C., Falloon, J., Vasudevachari, M. B., and Salzm man, N. P. (1997). Drug resistance during indinavir therapy is caused by mutations in the protease gene and in its Gag substrate cleavage sites.J. Virol. 71(9), 6662-6670.

Proteases of the Hepatitis C Virus

ANDREA URBANI, RAFFAELE DE FRANCESCO, AND CHRISTIAN STEINKÜHLER
Istituto di Ricerche di Biologia Molecolare (IRBM) "P. Angeletti," 00040 Pomezia, Rome, Italy

I. Introduction
II. Genomic Organization
III. The NS2-NS3 Protease
IV. The NS3 Protease
V. Conclusions
References

Proteases of Infectious Agents
Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

I. INTRODUCTION

After the development of serological tests for the hepatitis A and B viruses in the 1970s, it became clear that an additional agent accounted for approximately 90% of transfusion-associated hepatitis (non-A non-B hepatitis, NANBH) (Houghton, 1996). Initial experiments indicated that the causative agent of NANBH was probably a small, enveloped virus that could be transmitted to chimpanzees. Its very low titer in infected sera made its identification by conventional methods impossible. In 1989 Houghton and co-workers at Chiron succeeded in identifying an NANBH-specific cDNA clone by immunoscreening a cDNA expression library obtained from an infected chimpanzee plasma pool, using a panel of presumptively infected human sera as a specific source of anti-NANBH immunoglobulin (Choo et al., 1989). The cloned cDNA was shown to be derived from a single RNA molecule of 9.5 kb with homologies to the genomes of Flavi- and Pestiviruses (Miller and Purcell, 1990; Francki et al., 1991). The novel agent, hence termed hepatitis C virus (HCV), was classified within the Flaviviridae family. Diagnostic tests for anti-HCV antibodies developed thereafter proved that HCV was indeed the predominant cause of NANBH (Kuo et al., 1989).

The prevalence of the infection varies between 0.01 and 2% according to the geographic region, with peaks as high as 14% (Mast and Alter, 1993; Cuthbert, 1994; Houghton, 1996). It is estimated that about 0.5-1.5% of the world's total population is infected with HCV. The main route of transmission is parenteral, with a high incidence among intravenous drug users and, before the implementation of screening regimens, among recipients of blood products and renal dialysis patients. Other risk factors are tattooing and needlestick injuries. Sexual and congenital transmissions appear less frequently (Alter, 1995). The incidence of new infections has greatly diminished as a consequence of the development of reliable diagnostic assays and effective screening of blood and blood products. Nevertheless, between 20 and 50% of HCV cases are not associated with classic risk factors. These "sporadic" infections correlate with a low socioeconomic background, which may be associated with enhanced contact with risk groups such as intravenous drug users.

Infection with HCV often occurs without overt clinical symptoms, with only 10% of infected individuals developing an acute illness. About 50% of the infections, however, lead to chronic hepatitis, with elevated serum transaminases, persistence of HCV RNA, and histologic liver lesions. Liver cirrhosis, characterized by an overgrowth of connective tissue replacing the damaged hepatocytes, develops in 20% of the cases within 20 years after the infection (MacDonnell and Lucey, 1995; Houghton, 1996). At this stage, the disease may further progress into liver failure. Furthermore, chronic HCV infection has been associated with hepatocellular carcinoma (Saito et al., 1990; Shimotohno, 1993).
At present, α-interferon is the only licensed treatment for chronic HCV infection in the UK and in the U.S. The efficacy of this therapy is, however, rather low: the response rate to α-interferon treatment is about 50%. Roughly half of these responders relapse after the end of the treatment, resulting in a sustained response of only 25%, with disappearance of HCV RNA from liver tissue, serum, and peripheral blood mononuclear cells (Davies et al., 1989; Alberti, 1995; Blair et al., 1996). The efficacy of α-interferon appears to be enhanced in a combination therapy with ribavirin (DiBisceglie et al., 1992; Kakumu et al., 1993; Reichard et al., 1993; Koskinas et al., 1995). Even though a very significant improvement of sustained response over monotherapy has recently been reported for the interferon-ribavirin combination therapy, this treatment still results in less than 50% of sustained responders (McHutchison et al., 1998; Davies et al., 1998). Furthermore, α-interferon therapy has a number of side effects such as flu-like symptoms, irritability, fatigue, depression, anorexia, nausea, rashes, alopecia, thrombocytopenia, and leukopenia. Additional side effects may arise during the combination therapy with ribavirin. In view of the lack of vaccines against HCV, there is an urgent need for an efficacious treatment of the disease by an effective antiviral drug. This necessity has boosted research on the biology of HCV, with the primary focus being to identify possible targets for pharmaceutical intervention.

II. GENOMIC ORGANIZATION

The investigation of HCV biology was facilitated by the homology of its genome structure to that of Flavi- and Pestiviruses. As observed in these two genera, HCV has a single-stranded RNA genome of 9.5 kb with positive polarity containing a single, large open reading frame (ORF). This ORF encodes a polyprotein of 3010-3033 amino acids. The ORF is flanked by 5' and 3' untranslated regions (UTR). The 5' UTR harbors a very well-conserved sequence that is thought to play a role in translation initiation through a mechanism of internal ribosomal entry (Tsukiyama-Kohara et al., 1992; Wang et al., 1993; Fukushi et al., 1994). The 3' UTR contains a poly(U) stretch flanked upstream by a variable region of 28-42 nucleotides and downstream by a highly conserved 98-nucleotide sequence predicted to form a stem-and-loop structure (Tanaka et al., 1995, 1996; Kolykhalov et al., 1996). Owing to the lack of an efficient in vitro propagation system for HCV, our knowledge about the fate of the translation product of the viral genome is based on transient transfection or in vitro translation of portions of the viral genome (Hijikata et al., 1991, 1993a; Grakoui et al., 1993a; Selby et al., 1993; Tomei et al., 1993; Lin et al., 1994a; Mizushima et al., 1994a,b). These studies have shown that, upon its synthesis, the polyprotein precursor is co- and posttranslationally processed into at least 10 individual structural and nonstructural mature viral proteins. These gene products are arranged as shown in Fig. 1, following the order C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B. The structural proteins C, E1, E2, and p7 are located at the N-terminus of the polyprotein, followed by the nonstructural proteins at its C-terminus. The core protein (C), located at the amino terminus of the polyprotein, is considered to be the viral capsid protein. It is very basic and was shown to bind to RNA (Santolini et al., 1994).
The proteins E1 and E2 are the viral envelope glycoproteins, whereas the function of p7 is unknown. The mature structural proteins C, E1, and E2 are released from the nascent polypeptide chain through cotranslational cleavage by host cell signal peptidases (Hijikata et al., 1991; Grakoui et al., 1993a; Lin et al., 1994a; Mizushima et al., 1994a,b). In contrast, generation of the free C- and N-termini of p7 appears to be a posttranslational event, as judged by the occurrence of E2-p7 and p7-NS2 precursors after in vitro translation of E2-NS2 precursor polyproteins (Lin et al., 1994a; Mizushima et al., 1994a,b).

All cleavages downstream of NS2 are mediated by virally encoded proteases. The mature C-terminus of NS2 is generated by an intramolecular cleavage catalyzed by a protease encoded within NS2 and the N-terminal third of the NS3 protein (Grakoui et al., 1993c; Hijikata et al., 1993b; Santolini et al., 1995). The N-terminal ~180 amino acids of NS3 also form a chymotrypsinlike serine protease domain that is responsible for all the cleavages downstream of NS3, i.e., at the NS3-NS4A, NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B junctions (Bartenschlager et al., 1993; Eckart et al., 1993; Grakoui et al., 1993b; Hijikata et al., 1993b; Tomei et al., 1993; D'Souza et al., 1994; Manabe et al., 1994). The C-terminal two-thirds of the NS3 protein show conserved sequence motifs that are characteristic of RNA helicases (Gorbalenya et al., 1989, 1990; Kim et al., 1997). Deletion experiments have shown that the isolated domains of NS3 retain their enzymatic activities, and various forms of recombinant proteins encompassing the C-terminal domain of NS3 have been shown to possess RNA-stimulated NTPase activity (Suzich et al., 1993; Gwack et al., 1995; Preugschat et al., 1996) as well as RNA-helicase activity (Kim et al., 1995; Jin and Peterson, 1995; Tai et al., 1996; Gwack et al., 1996; Hong et al., 1996). Recently, the truncated protease and helicase domains of NS3 were crystallized and their three-dimensional structures were solved (Kim et al., 1996; Love et al., 1996; Yao et al., 1997; Yan et al., 1998). The NS4A protein was shown to be a cofactor of the NS3 protease activity (Failla et al., 1994; Bartenschlager et al., 1994; Lin et al., 1994b; Tanji et al., 1995a) and to stimulate the hyperphosphorylation of NS5A (Tanji et al., 1995b; Asabe et al., 1997). Although the function of the NS5A protein is unknown, there is accumulating evidence that this protein may be involved in the mechanism of viral resistance to interferon therapy (Herion and Hoofnagle, 1997). In fact, sequence comparisons between α-interferon-resistant and -sensitive HCV isolates suggested a correlation between interferon response and mutations in a discrete region of NS5A (Enomoto et al., 1995, 1996; Chayama et al., 1997; Kurosaki et al., 1997). Furthermore, a recent study has shown that NS5A is a potent inhibitor of PKR, an interferon-induced protein kinase (Gale et al., 1997). NS5B was demonstrated to be the viral RNA-dependent RNA polymerase (Behrens et al., 1996; Yuan et al., 1997; Lohmann et al., 1997). No specific function has so far been assigned to the NS4B protein. Since the generation of the mature nonstructural HCV proteins, including the viral polymerase, relies on cleavages mediated by the virally encoded NS2-NS3 and NS3 proteases, specific inhibition of their proteolytic activities is presently regarded as a promising strategy to combat HCV infection (Bartenschlager, 1997; Neddermann et al., 1997). In the following, the progress made toward the understanding of the structure and the mechanism of action of the two HCV proteases is summarized.

FIGURE 1 Organization of the HCV polyprotein. Cleavage sites for signal peptidase (*), NS2/NS3 protease (A), and NS3 serine protease (X) are indicated.
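The gene order and the division of labor among the processing enzymes described in this section can be summarized as a small lookup. The following Python sketch is only an illustration: the function name `junction_protease` is hypothetical, and the assignment of the E2/p7 and p7/NS2 junctions to the host signal peptidase is an assumption made here for simplicity (the text describes these two cleavages as posttranslational events).

```python
# Gene order along the HCV polyprotein, N- to C-terminus, as listed in the text.
GENE_ORDER = ["C", "E1", "E2", "p7", "NS2", "NS3", "NS4A", "NS4B", "NS5A", "NS5B"]


def junction_protease(upstream: str, downstream: str) -> str:
    """Return the enzyme described in the text for the upstream/downstream junction."""
    i = GENE_ORDER.index(upstream)  # raises ValueError for unknown names
    if i + 1 >= len(GENE_ORDER) or GENE_ORDER[i + 1] != downstream:
        raise ValueError(f"{upstream}/{downstream} is not an adjacent junction")
    if i < GENE_ORDER.index("NS2"):
        # All junctions upstream of NS2 (assumption: including E2/p7 and p7/NS2).
        return "host signal peptidase"
    if upstream == "NS2":
        # Intramolecular cleavage by the NS2-NS3 protease.
        return "NS2-NS3 protease"
    # Every junction downstream of NS3.
    return "NS3 serine protease"


if __name__ == "__main__":
    for up, down in zip(GENE_ORDER, GENE_ORDER[1:]):
        print(f"{up}/{down}: {junction_protease(up, down)}")
```

Ten gene products give nine junctions, of which only the five downstream of NS2 depend on virally encoded proteases.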

III. THE NS2-NS3 PROTEASE

The mature N-terminus of NS2 was reported to be generated by a signalase-mediated cleavage (Mizushima et al., 1994b). Conversely, it was concluded that cleavage at the NS2-NS3 junction had to be mediated by a virally encoded protease, since processing at this site could be observed in mammalian cells as well as in Escherichia coli and in plant and animal cell-free translation systems in the absence of microsomal membranes (Bartenschlager et al., 1993; Grakoui et al., 1993c; Hijikata et al., 1993b; Komoda et al., 1994a). Radiosequencing of the mature cleavage products showed that processing at the NS2-NS3 junction occurs between Leu1026 and Ala1027, within the sequence GWRLL-API (Grakoui et al., 1993c). Cleavage at the 2/3 site is remarkably resistant to single amino acid substitutions, the only dramatically inhibiting mutations being those likely to cause conformational alterations, simultaneous deletion of the residues flanking the cleavage site, or their concomitant substitution with alanine residues (Reed et al., 1995). Although not essential, membranes were shown to enhance cleavage efficiency at the NS2-NS3 junction in a strain-specific way (Santolini et al., 1995), and membrane activation could be mimicked by detergent micelles (Pieroni et al., 1997). In vitro translation studies showed that the NS2-NS3 precursor is targeted to the ER membrane through interaction with the signal recognition particle receptor, and NS2 derived from the NS2-NS3 cleavage was demonstrated to be a transmembrane protein, having its C-terminus translocated into the ER lumen and at least part of its N-terminus exposed to the cytosol (Santolini et al., 1995). An active NS3 serine protease is not required for processing of the NS2-NS3 junction, since mutations of the residues that constitute the catalytic triad of the NS3 serine protease affect all cleavages downstream of NS3 but do not influence processing at the NS2-NS3 cleavage site (Hijikata et al., 1993b; Bartenschlager et al., 1993; Grakoui et al., 1993c; Tomei et al., 1993). A deletion mutagenesis analysis performed to map the minimum domain required for efficient processing at the NS2-NS3 site indicated an N-terminal boundary between residues 898 and 923 within NS2, whereas a sharp drop in cleavage efficiency was observed upon C-terminal truncations beyond residue 1207 within NS3 (Grakoui et al., 1993c; Hijikata et al., 1993b; Reed et al., 1995; Santolini et al., 1995). These findings indicate that the presence of the NS3 protease domain, but not its serine protease activity, is specifically required for processing at the NS2-NS3 site. In fact, the NS3 sequence could not be substituted by other parts of the HCV polyprotein without inactivating the NS2-NS3 protease (Santolini et al., 1995). It is remarkable that the NS3 serine protease and the 2/3 protease overlap, with the N-terminus of the NS3 protein contributing to two different enzymatic functions. As there is no obvious sequence homology between the NS2-NS3 protease and other known cellular or viral enzymes, the nature of this enzyme was investigated using classic protease inhibitors. Thiol-reactive agents such as iodoacetamide, N-ethylmaleimide, or N-tosyl-L-phenylalanine chloromethyl ketone were found to inhibit the enzyme, which was also shown to be redox sensitive (Pieroni et al., 1997).
Also, metal chelators such as EDTA or phenanthroline abolished cleavage, and subsequent addition of zinc or cadmium was able to restore activity (Hijikata et al., 1993b; Pieroni et al., 1997). Based on these findings, Hijikata and co-workers (1993b) proposed that NS2-NS3 might be a zinc-dependent metalloprotease. In order to identify possible zinc ligands, they mutagenized all conserved histidine, cysteine, and glutamic acid residues within the minimum core of the NS2-NS3 protease and identified His952 and Cys993, both located within NS2, as essential for proteolytic activity. Although these residues were originally proposed to participate in the coordination of the zinc ion, the finding of a structural zinc binding site in the NS3 protease domain (De Francesco et al., 1996; Kim et al., 1996; Love et al., 1996; Yan et al., 1998) suggests that zinc could in fact be required for the NS2-NS3 protease activity because it stabilizes the fold of NS3. The residues His1175, Cys1123, Cys1125, and Cys1171 were identified as the zinc ligands in NS3. Interestingly, mutation of these cysteine residues into alanine decreased both the NS3 serine protease activity and processing at the 2/3 site (Hijikata et al., 1993b). In contrast to these findings, Reed et al. (1995) reported that a truncated precursor protein spanning residues 827-1137 of the HCV polyprotein still possesses some NS2-NS3 cleavage activity, even though the truncation has eliminated the zinc-ligating Cys1171 and His1175 in NS3. It would be interesting to investigate whether the activity of this truncated protein is still influenced by zinc. Several hypotheses about the cleavage mechanism of the 2/3 junction are compatible with the experimental findings at this point.

1. There is one zinc ion bound in the NS3 protease domain that promotes its proper folding. The cleavage is catalyzed by a cysteine protease having His952 and Cys993 as a catalytic dyad. In accord with this view, Gorbalenya and Snijder (1996) have recently classified NS2-NS3 as a cysteine protease.

2. The zinc ion bound in NS3 participates in the catalysis. Usually, the metal coordination sphere in zinc-dependent hydrolases involves nitrogen and oxygen ligands and seldom sulfur ligands such as those observed in the NS3 zinc binding site (Vallee and Auld, 1990a,b), which has the characteristics of a structural, hydrolytically inactive or poorly active zinc site. Even though in the structure of the NS3 protease domain the metal binding site is quite close to the N-terminus of the enzyme (Kim et al., 1996; Yan et al., 1998; Wu et al., 1998), His1175, which is coordinated to the zinc via a bridging water molecule, shields the metal ion from the bulk solvent. However, both X-ray crystallography (Love et al., 1996) and NMR studies (Urbani et al., 1998) have shown that this histidine residue is endowed with considerable flexibility, compatible with a switch of the metal binding site between an "open" and a "closed" conformation. It could be speculated that a facile movement of His1175, possibly leading to the exposure of a metal-activated water molecule, might play some role in the proteolysis of the 2/3 junction.

3. There are two zinc ions present in NS2-NS3, one being ligated to His952 and Cys993 and having a catalytic function.
In this case, a dimerization of the 2/3 precursor would have to be postulated in order to provide a stable coordination for the second zinc ion.

The answer to the question of which mechanism is operative in the cleavage of the 2/3 junction would be facilitated by a trans-cleavage assay using purified components. Reed et al. (1995) investigated this possibility. They concluded that bimolecular cleavage is only possible if the substrate polypeptide contributes a functional domain to the formation of an active protease. Precursors that contain a 2/3 site that is cleavable in trans are capable of supplying either the N-terminal domain of NS3 or a functional NS2 region to form the NS2-NS3 protease. This behavior suggests that dimer formation is a prerequisite for cleavage of the 2/3 junction. Attempts at demonstrating this interaction by immunoprecipitation failed, indicating that, if dimer formation occurs, the affinity between the single subunits must be rather low. A further indication for possible dimerization of 2/3 precursors comes from the observation that autocleavage of the 2/3 site could be inhibited by the addition of polypeptides capable of participating in bimolecular cleavage reactions, i.e., those precursors having either a functional NS2 portion or an intact NS3 region (Reed et al., 1995). This latter observation indicates that cleavage at the 2/3 site can potentially be inhibited in trans. Whether this inhibition requires protein-protein interactions involving a large surface area or is also amenable to low-molecular-weight compounds will hopefully be addressed by future studies aimed at the elucidation of this intriguing aspect of HCV polyprotein processing.
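All residue numbers quoted in this section refer to positions in the HCV polyprotein. Because the NS2-NS3 cleavage occurs between Leu1026 and Ala1027, a residue located in NS3 can be renumbered relative to the mature NS3 N-terminus by subtracting 1026. The short Python sketch below illustrates this conversion for the zinc ligands; it assumes that Ala1027 is counted as NS3 residue 1, and the helper names `to_ns3_numbering` and `relabel` are hypothetical.

```python
import re

# Leu1026 is the last residue of NS2, so Ala1027 is taken as NS3 residue 1
# (assumption made for this illustration).
NS3_OFFSET = 1026


def to_ns3_numbering(polyprotein_position: int) -> int:
    """Convert a polyprotein position into a position counted from the NS3 N-terminus."""
    if polyprotein_position <= NS3_OFFSET:
        raise ValueError("position lies upstream of NS3")
    return polyprotein_position - NS3_OFFSET


def relabel(residue: str) -> str:
    """Turn a label such as 'His1175' into its NS3-relative form."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", residue)
    if match is None:
        raise ValueError(f"unrecognized residue label: {residue}")
    return f"{match.group(1)}{to_ns3_numbering(int(match.group(2)))}"


if __name__ == "__main__":
    # Zinc ligands of the NS3 protease domain named in the text.
    for label in ["Cys1123", "Cys1125", "Cys1171", "His1175"]:
        print(label, "->", relabel(label))
```

Residues that lie within NS2, such as His952 and Cys993, are rejected by the conversion because they precede the NS3 N-terminus.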

IV. THE NS3 PROTEASE

The N-terminal portion of the NS3 protein was predicted to contain a serine protease domain as judged from conserved sequence patterns and by homology to Flavi- and Pestiviruses (Miller and Purcell, 1990; Francki et al., 1991; Rice, 1996). Within this region, residues His1083, Asp1107, and Ser1165 constitute the catalytic triad of the enzyme, and mutagenesis of any of these residues was found to abolish proteolytic activity (Bartenschlager et al., 1993; Eckart et al., 1993; Grakoui et al., 1993b; Tomei et al., 1993; Manabe et al., 1994; Hijikata et al., 1993b). Transient transfection and in vitro transcription/translation experiments have been extensively employed as tools for the characterization of the role of NS3 in the maturation of the viral polyprotein. Its activity was required for the processing of all junctions downstream of NS3, i.e., at the NS3-NS4A, NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B boundaries (Bartenschlager et al., 1993; Grakoui et al., 1993b; Hijikata et al., 1993b; Tomei et al., 1993). There is a temporal hierarchy with which these sites are processed by the enzyme (Fig. 2). Several lines of evidence, such as insensitivity to dilution, lack of detectable NS3-NS4A precursors, and failure to observe trans cleavage at this site, suggested that processing at the NS3-NS4A boundary is an intramolecular reaction, which precedes all other cleavages. The remaining cleavage sites were found to be processed in an intermolecular fashion following the preferred order NS5A-NS5B > NS4A-NS4B > NS4B-NS5A (Fig. 2) (Bartenschlager et al., 1994; Failla et al., 1995; Lin et al., 1994b; Tanji et al., 1994a). Based on the known structures of trypsinlike serine proteases as well as on the conservation pattern of protease sequences among different HCV strains, a homology model of the specificity pocket of the NS3 protease was built (Pizzi et al., 1994). According to this model, the presence of a phenylalanine side chain in the pocket was predicted to determine the requirement for small, hydrophobic residues in the P1 positions of NS3 substrates, with a preference for cysteine residues. Radiosequencing of the N-termini of NS3 cleavage products has confirmed this prediction (Grakoui et al., 1993b; Pizzi et al., 1994), yielding the NS3 substrate consensus sequence D/E-X-X-X-X-Cys/Thr↓Ser/Ala (Table I). Cleavage was demonstrated to occur as predicted after a cysteine residue in all trans cleavage sites, whereas the intramolecular site between NS3 and NS4A was shown to differ from this consensus, having a threonine residue in its P1 position. Other features are a conserved negatively charged residue in the P6 positions and a serine or alanine residue in the P1' positions.

FIGURE 2 Kinetics of HCV polyprotein processing by the NS3 serine protease.

TABLE I Sequences of the NS3-Dependent Cleavage Sites

Cleavage site     Sequence               Position
NS3/4A (cis)      DLEVVT-STWV            1658
NS4A/4B           DEMEEC-ASHL            1706
NS4B/5A           DCSTPC-SGSW            1967
NS5A/5B           EDVVCC-SMSY            2414
Consensus         D/E-X-X-X-X-C/T-S/A
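The consensus rule stated in the text (an acidic residue in P6, cysteine or threonine in P1, serine or alanine in P1') can be checked against the four sites of Table I with a few lines of Python. This is only a sketch: the sequences are transcribed from the table, with the NS5A/5B P side read as the six-residue EDVVCC, and the names `CONSENSUS`, `SITES`, and `matches_consensus` are hypothetical.

```python
import re

# P6-P1 followed by P1': acidic residue, four arbitrary residues, Cys/Thr, then Ser/Ala.
CONSENSUS = re.compile(r"^[DE].{4}[CT][SA]$")

# (P side, P' side) for each NS3-dependent cleavage site, transcribed from Table I.
SITES = {
    "NS3/4A": ("DLEVVT", "STWV"),
    "NS4A/4B": ("DEMEEC", "ASHL"),
    "NS4B/5A": ("DCSTPC", "SGSW"),
    "NS5A/5B": ("EDVVCC", "SMSY"),  # assumption: six-residue P side
}


def matches_consensus(p_side: str, p_prime_side: str) -> bool:
    """Check the P6..P1 residues plus P1' against the consensus pattern."""
    window = p_side[-6:] + p_prime_side[:1]
    return CONSENSUS.match(window) is not None


def p1_residue(p_side: str) -> str:
    """Return the residue immediately preceding the scissile bond."""
    return p_side[-1]


if __name__ == "__main__":
    for name, (p_side, p_prime) in SITES.items():
        print(name, "P1 =", p1_residue(p_side), "consensus:", matches_consensus(p_side, p_prime))
```

Under these assumptions all four sites satisfy the pattern, and only the intramolecular NS3/4A site carries threonine rather than cysteine in P1.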

A. THE NS3-NS4A INTERACTION

Using a transient expression system, cleavage of an NS4B-NS5B precursor protein by the NS3 protease supplied in trans from another plasmid was found to occur only at the NS5A-NS5B site, whereas efficient cleavage at all sites was observed with a precursor that also harbored the NS4A protein (Bartenschlager et al., 1994; Failla et al., 1994; Lin et al., 1994b; Tanji et al., 1995a). This observation led to the discovery that NS3 is necessary but not sufficient for the efficient processing of the HCV polyprotein and that the NS4A protein is required as a cofactor of the NS3 serine protease, the active protease being a heterodimer consisting of both NS3 and NS4A. The conclusion drawn from these findings was that NS4A is a functional analog of the Flavivirus NS2B and Pestivirus p10 proteins (Failla et al., 1994). Heterodimer formation in cells can be accomplished by providing NS4A either in cis or in trans and can be demonstrated by coimmunoprecipitation of the two polypeptides, suggesting the formation of a very tight complex (Hijikata et al., 1993a; Bartenschlager et al., 1995b; Failla et al., 1995; Lin et al., 1995; Satoh et al., 1995). This complex formation is absolutely required for the processing of the NS4B-NS5A and NS3-NS4A junctions, whereas cleavage at the NS5A-NS5B site could also be observed in the absence of the NS4A protein. Besides acting as a cofactor of the serine protease, NS4A has also been shown to target NS3 to membranes and to increase its metabolic stability (Tanji et al., 1995a). Deletion mutagenesis showed that the N-terminal 22 residues of NS3 are involved in cofactor binding (Failla et al., 1995; Satoh et al., 1995; Koch et al., 1996), with N-terminal deletions in NS3 leading both to a decrease in cleavage efficiency at the NS4B-NS5A site and to an impairment of complex formation, as detected by immunoprecipitation experiments.
A quite complicated picture concerning the generation of a mature NS3 N-terminus engaged in a complex with NS4A emerges from these observations. As outlined above, cleavage at the 2/3 site is mediated by the NS2-NS3 protease. It is at present not clear what the exact temporal sequence of these events is, but it is likely that processing at the 2/3 site precedes NS4A binding to the N-terminus of NS3. On the other hand, cleavage at the intramolecular NS3-NS4A site was demonstrated to rely on the previous activation of the NS3 protease by its cofactor (Failla et al., 1994), implying that this processing event occurs posttranslationally after intramolecular complex formation between NS4A and the N-terminal region of NS3. The likely sequence of events therefore is: (1) cleavage of the 2/3 site and insertion of NS2 into the ER membrane, (2) intramolecular complex formation between NS4A and the mature N-terminus of NS3, and (3) intramolecular cleavage of the NS3-NS4A junction and generation of the activated, heterodimeric NS3-NS4A protease bound to the ER membrane. It remains to be determined whether interference with a correct sequence of events might impair either 2/3 processing itself or the formation of a functional NS3-NS4A heterodimer. While NS3 constructs extending into NS2 were found to be activated by NS4A, implying that correct binding of the cofactor occurs even without processing of the 2/3 site (Failla et al., 1994), D'Souza and co-workers (1994) reported that constructs deficient in NS2-NS3 protease activity also showed an impaired processing of the NS3-dependent cleavage sites that could be restored upon deletion of NS2 sequences. These latter findings suggest that correct processing in this region may be a prerequisite for the activation of the NS3 protease by its cofactor. Further investigations are needed to define the role of efficient processing at the 2/3 site in the formation of the NS3-NS4A complex.

The protein NS4A is composed of 54 residues. According to hydropathy plots and secondary structure algorithms it can be subdivided into three portions (Fig. 3): residues 1-34 are highly hydrophobic, with the first 20 residues predicted to have a high propensity to form an α-helix, whereas residues 20-34 are assumed to preferentially form β-strands. The remaining 20 C-terminal residues of NS4A are hydrophilic, with a preference for adopting a helical conformation. Deletion mutagenesis experiments have shown that residues 20-34, corresponding to the extended, hydrophobic region of NS4A, are sufficient for eliciting full activation of the NS3 protease (Lin et al., 1995; Shimizu et al., 1996; Tomei et al., 1996; Koch et al., 1996; Butkiewicz et al., 1996). Peptides encompassing this region were also shown to bind to the enzyme and to activate it (Lin et al., 1995; Shimizu et al., 1996; Steinkühler et al., 1996a,b; Tomei et al., 1996; Bianchi et al., 1997).

FIGURE 3 Hydropathy plot of the NS4A protein. The amino acid sequence of NS4A and a secondary structure prediction are shown (B = β-strand; H = helix; T = turn). The central region involved in NS3 activation is highlighted in the plot and shown in boldface in the sequence.
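The three-part subdivision of NS4A described above can be written down as a simple segment map, which also confirms that the stated segment lengths add up to the 54 residues of the protein. The Python sketch below is illustrative only: it assumes that the central region starts at residue 21, so that it does not overlap the first 20 residues, and the names `SEGMENTS` and `segment_of` are hypothetical.

```python
NS4A_LENGTH = 54

# (first residue, last residue, description); boundaries follow the text, with the
# central region taken as 21-34 (assumption) to avoid overlap with residues 1-20.
SEGMENTS = [
    (1, 20, "hydrophobic, predicted alpha-helix"),
    (21, 34, "hydrophobic, predicted beta-strand, sufficient for NS3 activation"),
    (35, 54, "hydrophilic, predicted helix"),
]


def segment_of(residue: int) -> str:
    """Return the description of the NS4A segment that contains the given residue."""
    for first, last, description in SEGMENTS:
        if first <= residue <= last:
            return description
    raise ValueError(f"residue {residue} is outside NS4A (1-{NS4A_LENGTH})")


if __name__ == "__main__":
    total = sum(last - first + 1 for first, last, _ in SEGMENTS)
    print("segments cover", total, "of", NS4A_LENGTH, "residues")
    print("residue 23:", segment_of(23))
```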
Many publications have reported deletion mutagenesis experiments on the NS3 protein that mapped the minimum protease domain to the ~200 N-terminal residues (Bartenschlager et al., 1994; Failla et al., 1995; Tanji et al., 1994b; Han et al., 1995; Lin et al., 1994b; Steinkühler et al., 1996a,b). Several pieces of evidence furthermore suggested that both the basal and the NS4A-stimulated protease activity of a truncated protein encompassing this region of NS3 were indistinguishable from those obtained with a full-length 70-kDa NS3 protein. This has prompted many groups to adopt a "minimalist" approach, focusing on the characterization of a system composed of a 20-kDa NS3 protease domain and a synthetic 14-mer peptide as an NS4A cofactor mimic. Purification of the truncated enzyme using several different heterologous expression systems has been reported (Suzuki et al., 1995; Shoji et al., 1995; Mori et al., 1996; Steinkühler et al., 1996a,b; Markland et al., 1997; Vishnuvardhan et al., 1997), and the effort has culminated in the crystallization of the free NS3 protease domain and of its complex with a cofactor peptide (Kim et al., 1996; Love et al., 1996; Yan et al., 1998). Recently, the functional independence of the protease and helicase domains has been questioned in a report by Morgenstern et al. (1997), who observed polynucleotide modulation of the protease activity of the full-length NS3-NS4A complex but not of the truncated NS3 protease domain in the presence of an NS4A peptide. These conclusions are based on rather indirect findings and should be substantiated by a more rigorous quantitative study using peptidic substrates. Such studies could be easily performed, since several reports have now appeared on expression and purification procedures for both the full-length NS3 protein and the NS3-NS4A complex (D'Souza et al., 1995; Kakiuchi et al., 1995; Hong et al., 1996; Hamarake et al., 1996; Gallinari et al., 1998; Sali et al., 1998). The native, noncovalent complex between NS3 and NS4A was shown to be formed by in vivo cleavage of an NS3-NS4A precursor during protein expression and to be stable during purification procedures.
The topology and the three-dimensional structure of the NS3 protease domain complexed with an NS4A peptide are shown in Figs. 4 and 5. Analysis of the X-ray structure revealed that the NS3 protease adopts a canonical chymotrypsinlike fold, consisting of two β-barrel-like domains, each containing a Greek key motif (Fig. 4). The C-terminal domain contains a six-stranded β-barrel common to most members of the chymotrypsin family of serine proteases, which ends with a structurally conserved helix. The residues of the catalytic triad (His1083, Asp1107, and Ser1165) are located in a groove at the interface of the two domains (Fig. 5, see color plate). The N-terminal domain contains eight β-strands, including one strand contributed by NS4A (Kim et al., 1996; Yan et al., 1998). The central region of NS4A, spanning residues 21-34, is embedded into the core of the NS3 protease domain, with a total of 2400 Å² of surface area buried by the interaction between the two molecules. The N-terminus of NS3 that interacts with this region of NS4A has a peculiar β-α-β fold, with the β-strand contributed by the cofactor intercalating into this structure. In the structure without NS4A, this N-terminus interacts with neighboring molecules, binding to hydrophobic surface patches (Love et al., 1996).

FIGURE 4 Secondary structure topology of the NS3 protease domain-NS4A peptide complex. The α-helices are marked as α1-3; β-strands are marked A0-F2. The strand contributed by the cofactor is shown in black. Greek key motifs are shaded. Numeration starts with residue 1026 of the HCV polyprotein.

These interactions are likely to be peculiar features of the crystallized enzyme; in solution, the N-terminal region of NS3 is probably disordered in the absence of NS4A, thus providing an explanation for the enhanced metabolic stability of the NS3-NS4A complex with respect to the uncomplexed enzyme (Tanji et al., 1995a). Complex formation between the truncated NS3 protease and an NS4A peptide spanning residues 21-34 was shown to be accompanied by changes in the near-UV CD spectrum of the protein and in its tryptophan fluorescence spectrum (Bianchi et al., 1997). These spectroscopic changes were interpreted in terms of changes in the environment of Trp1111, which is engaged in an interaction with Val23 of the cofactor peptide, and have allowed calculation of complex dissociation constants in the low micromolar range and of a complex half-life of 3.5 min (Bianchi et al., 1997). Therefore, the complex with the peptide analog of NS4A appears to be significantly less stable than what can be extrapolated for the native complex with the full-length cofactor. Systematic deletion experiments in transfected cells showed that, while the central domain of NS4A is important in the activation of the protease, truncations affecting the N-terminal hydrophobic sequence of NS4A impair the coimmunoprecipitation of NS3 and NS4A (Bartenschlager et al., 1995b; Lin et al., 1995; Tanji et al., 1995a; Koch et al., 1996). In the light of these findings, either "loose" or "tight" complexes can be formed, depending on the integrity of the N-terminus of NS4A. Since this region of NS4A is likely to be responsible for the membrane anchoring of NS3-NS4A complexes, tight complex formation probably involves membrane association. This process may serve functions other than protease activation alone, such as membrane targeting of the helicase and subsequent formation of a membrane-associated replication complex. The proximity, shown by the X-ray crystal structure, of the N-terminus of the protease domain to the C-terminus of the activator peptide has suggested that both molecules may be linked within a single polypeptide chain (Pasquo et al., 1998; Taremi et al., 1998; Dimasi et al., 1998). The single-chain proteases were shown to be constitutively activated molecules, indistinguishable in all enzymatic and physicochemical properties from the NS3 protease complexed with its cofactor peptide when analyzed under identical conditions. The single-chain design has thus led to a tightening of the NS3-cofactor peptide interaction even in the absence of the N-terminal portion of NS4A. Despite the wealth of structural information now available, relatively little is known about the mechanism by which NS4A activates the enzymatic functions of the NS3 protease. The NS3 protease is also active in the absence of its cofactor, albeit with greatly reduced catalytic efficiency. Active site titrations performed in the absence of cofactor yielded 94% of catalytically active enzyme molecules (Urbani et al., 1997), indicating the presence of a homogeneous enzyme population with reduced specific activity.
In the crystal structure obtained in the absence of the NS4A peptide, the catalytic Asp1107, which is expected to provide charge stabilization to His1083 after deprotonation of Ser1165, is oriented away from His1083 and forms an ion pair with Arg1181 (Love et al., 1996). In contrast, a correct orientation of Asp1107 is observed in the structure obtained in the presence of NS4A (Kim et al., 1996). Nevertheless, relatively minor readjustments would suffice to correctly position the side-chain of Asp1107 in the absence of the cofactor, suggesting that in this case, too, a classic catalytic triad configuration will probably exist during catalysis (Love et al., 1996). This hypothesis is in line with the finding that complex formation with NS4A does not alter the pK value of His1083, as determined both by activity titration and by NMR (Landro et al., 1997; Urbani et al., 1998). The kinetic consequences of complex formation depend on both the assay conditions and the substrate peptide used. Thus, either an enhancement of kcat alone or a simultaneous decrease in Km and increase in kcat has been observed (Shimizu et al., 1996; Steinkühler et al., 1996a,b; Bianchi et al., 1997; Landro et al., 1997; Urbani et al., 1997). These findings are
suggestive of an activation of the catalytic machinery and of altered binding modes experienced by the substrate in the NS3-NS4A complex. In a recent steady-state kinetic analysis, Landro and co-workers (1997) proposed a kinetic scheme according to which an ordered, sequential binding of NS4A and substrate occurs. They further reported evidence for the influence of NS4A on binding of inhibitors to prime-side subsites, especially to S1' and S4', suggesting that the correct formation of these subsites is influenced by the cofactor. This is in line with the crystal structure data, which show that NS4A binding occurs in the N-terminal β-barrel, which is predicted to accommodate the prime-side residues of the substrate. It is still not clear whether interference with NS4A binding is a viable strategy for the development of protease inhibitors. Although the contact surface between the two molecules is quite extended, Kim et al. (1996), reasoning that there are unique, highly conserved features in this interaction, suggested that small, predominantly hydrophobic molecules may be designed that could compete with NS4A binding and activation.

B. SUBSTRATE SPECIFICITY OF THE NS3 PROTEASE

The NS3 cleavage sites have the consensus sequence D/E-X-X-X-X-Cys/Thr-Ser/Ala (Table I), with cleavage occurring after cysteine in all trans cleavage sites (i.e., NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B), or after threonine in the intramolecular cleavage site between NS3 and NS4A. Preference for cysteine residues in the P1 positions of NS3 substrates was rationalized on the basis of the peculiar structure of the S1 pocket of the enzyme. This pocket is hydrophobic and shallow, being occluded by the aromatic ring of Phe1180 (Pizzi et al., 1994; Kim et al., 1996; Love et al., 1996; Yan et al., 1997). Furthermore, the sulfhydryl group of cysteine has been shown to engage in favorable interactions with the aromatic ring system of phenylalanine. Mutagenesis of residues flanking the S1 pocket was used to engineer proteases with altered substrate specificities (Failla et al., 1996; Koch and Bartenschlager, 1997), confirming the role of Phe1180 and a minor importance of Ala1183 in determining the selectivity of the NS3 protease for substrates harboring cysteine as a P1 residue. It is noteworthy that threonine, the residue found in the P1 position of the intramolecular cleavage site between NS3 and NS4A, cannot be optimally fitted into the S1 pocket and was shown to decrease cleavage efficiency when introduced into the P1 positions of the other cleavage sites (Bartenschlager et al., 1995a; Kolykhalov et al., 1994; Komoda et al., 1994b). Komoda and co-workers (1994b) were able to demonstrate trans cleavage of a chimeric protein harboring the NS3/NS4A cleavage site expressed in Escherichia coli when the wild-type threonine residue was replaced by cysteine, suggesting that threonine is
suboptimal also in the context of the NS3-NS4A junction. Using transient expression in eukaryotic cells or in vitro translation of an NS2-NS4 precursor, the NS3-NS4A cleavage was shown to be remarkably tolerant to mutations in the consensus sequence (Bartenschlager et al., 1995a; Leinbach et al., 1994; Kolykhalov et al., 1994). This has led to the conclusion that processing at the cis cleavage site is determined primarily by polyprotein folding (Bartenschlager et al., 1995a). In any case, the reason for the counterselection of an optimal P1 residue in the NS3-NS4A junction is still not understood. Susceptibility to mutations of the trans cleavage sites depends on the sequence context with a gradient of increasing sensitivity following the order of NS4A-NS4B ... NS4A-NS4B > NS4B-NS5A), which was taken as evidence for primary structure being a major determinant for trans cleavage efficiency (Landro et al., 1997; Steinkühler et al., 1996b). Also, the extent of cleavage activation by NS4A, determined using peptide substrates, paralleled the hierarchy observed during polyprotein processing, with the NS4B-NS5A junction being absolutely dependent on NS4A and the NS5A-NS5B site being efficiently processed also in the absence of the cofactor. Catalytic efficiencies, expressed as kcat/Km values, obtained with peptide substrates are rather low and vary between 40 and 20,000 M⁻¹ sec⁻¹, depending on the substrate sequence. These values can be improved using ester substrates in which the P1 cysteine residue was replaced by 2-amino-butyric acid (Bianchi et al., 1996). In particular, incorporation of an ester linkage as scissile bond in the context of peptides harboring P' residues yielded very efficient substrates. This principle was used to generate internally quenched depsipeptide substrates with kcat/Km values of 345,000 M⁻¹ sec⁻¹ that permit continuous monitoring of enzymatic activity (Taliani et al., 1996). Alanine scanning experiments performed on peptide substrates based either on the NS4A-NS4B (Urbani et al., 1997) or on the NS5A-NS5B (Zhang et al., 1997) cleavage sites came to the conclusion that, besides P1, the P6, P3, and P4' residues also contribute to efficient substrate recognition by the NS3 protease. In the context of peptide substrates, a more pronounced sensitivity of the NS3 protease to P1 substitutions was observed than with polyprotein substrates in transient transfection experiments (Landro et al., 1997; Urbani et al., 1997; Zhang et al., 1997). The P1 substitutions in the context of an NS4A-NS4B-based substrate gave the following order of decreasing efficiencies: cysteine > homocysteine > allylglycine > 2-amino-butyric acid > threonine > norvaline > valine. Serine, alanine, glycine, or leucine in the P1 position yielded uncleavable substrates (Urbani et al., 1997). Similar results were obtained by introducing modifications in the P1 position of substrates based on the sequence of the NS5A-NS5B junction (Landro et al., 1997; Zhang et al., 1997). These findings have to be compared with the results from mutagenesis experiments on the P1 residue in the polyprotein NS4A-NS4B junction, where only Arg and Asp completely abolished cleavage (Kolykhalov et al., 1994), while efficient processing was still observed with residues such as glycine, serine, or leucine that were incompatible with peptide cleavage. These differences may reflect an increased activity of the NS3 protease on polyprotein substrates that renders it less discriminative against suboptimal P1 residues.
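The catalytic efficiencies quoted above can be put side by side with a short calculation. The following is a minimal sketch, not a description of any published assay: it assumes the standard Michaelis-Menten low-substrate approximation, v ≈ (kcat/Km)·[E]·[S] for [S] much smaller than Km, and the enzyme and substrate concentrations in the usage lines are hypothetical values chosen only for illustration. The three kcat/Km constants are the figures given in the text.

```python
# kcat/Km values quoted in the text, in M^-1 s^-1.
PEPTIDE_LOW = 40.0          # least efficient peptide substrate
PEPTIDE_HIGH = 20_000.0     # most efficient peptide substrate
DEPSIPEPTIDE = 345_000.0    # internally quenched depsipeptide substrate


def initial_rate(kcat_over_km, enzyme_molar, substrate_molar):
    """Approximate initial velocity in M/s, valid only when [S] << Km.

    Under that assumption the Michaelis-Menten equation reduces to
    v = (kcat/Km) * [E] * [S].
    """
    return kcat_over_km * enzyme_molar * substrate_molar


def fold_gain(reference, improved):
    """Ratio of two catalytic efficiencies."""
    return improved / reference


if __name__ == "__main__":
    enzyme = 10e-9      # hypothetical enzyme concentration: 10 nM
    substrate = 1e-6    # hypothetical substrate concentration: 1 uM
    for name, value in [("peptide (low)", PEPTIDE_LOW),
                        ("peptide (high)", PEPTIDE_HIGH),
                        ("depsipeptide", DEPSIPEPTIDE)]:
        print(name, initial_rate(value, enzyme, substrate), "M/s")
    print("gain over best peptide:", fold_gain(PEPTIDE_HIGH, DEPSIPEPTIDE))
```

On these figures the depsipeptide substrate is roughly 17-fold more efficient than the best amide substrate, which is consistent with its use for continuous monitoring of activity.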
In peptide substrates, all cysteine substitutes so far reported were found to decrease cleavage efficiency by at least one order of magnitude, suggesting that it will be difficult to eliminate the P1 cysteine residue in peptide-based inhibitors of NS3 protease without incurring a substantial loss in potency. It was concluded that ground-state binding of peptide substrates to the NS3 protease is mediated by multiple, weak interactions involving distal residues spanning at least from P6 through P4', whereas the efficiency with which the bound substrate will proceed through the transition state is strongly influenced by the nature of the residue in the P1 position (Urbani et al., 1997). This requirement for relatively long peptide substrates and the apparent lack of strong specific interactions with distal subsites can be rationalized by analyzing the structure of the enzyme. Apart from the S1 pocket, the surface of the enzyme involved in substrate recognition is relatively flat and featureless. All major loops that in other serine proteases contact the P2, P3, and P4 moieties of substrates are absent in NS3 (Kim et al., 1996; Love et al., 1996; Yan et al., 1998). Two positively charged residues, Arg1187 and Lys1191, have been proposed to interact with the conserved negatively charged residue in the P6 position of NS3 substrates (Love et al., 1996). Possibly, substrates are anchored to the active site via the side-chain of the charged P6 residue and
through main-chain interactions between P5 and P2, with P'-site residues, especially P1' and P4', also contributing to ground-state binding (Landro et al., 1997). This peculiar substrate recognition mechanism, requiring an extended interaction network, renders the development of low-molecular-weight inhibitors of the NS3 protease a formidable challenge.
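The consensus sequence discussed in this section, D/E-X-X-X-X-Cys/Thr-Ser/Ala, can be expressed as a simple pattern and used to flag candidate sites in a protein sequence. The sketch below is illustrative only: the function name and the test sequence are hypothetical and do not come from any HCV isolate, and a pattern match is at best a necessary condition, since the text makes clear that cleavage efficiency also depends on sequence context, NS4A, and polyprotein folding.

```python
import re

# P6 = Asp/Glu, P5-P2 = any residue, P1 = Cys/Thr, P1' = Ser/Ala.
# The lookahead makes the search report overlapping candidates as well.
CONSENSUS = re.compile(r"(?=([DE][A-Z]{4}[CT][SA]))")


def find_candidate_sites(sequence):
    """Return (P6 index, P1 residue, matched P6-P1' segment) for each match.

    Indices are 0-based positions of the P6 residue in `sequence`.
    """
    hits = []
    for match in CONSENSUS.finditer(sequence.upper()):
        segment = match.group(1)
        hits.append((match.start(), segment[5], segment))
    return hits


if __name__ == "__main__":
    toy = "MKEDVVPCSLAADGGGGTAKK"  # made-up sequence for illustration
    for p6_index, p1, segment in find_candidate_sites(toy):
        kind = "trans-like (P1 Cys)" if p1 == "C" else "cis-like (P1 Thr)"
        print(p6_index, segment, kind)
```

In this toy sequence the pattern finds one site with cysteine in P1 and one with threonine in P1, mirroring the distinction the text draws between the trans sites and the NS3-NS4A junction.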

C. THE METAL BINDING SITE OF THE NS3 PROTEASE

A homology model of the NS3 protease domain based on the known structures of serine proteases and on the sequence similarities between HCV isolates and the recently described HCV-related viruses GBV-A, GBV-B, and HGV (Simons et al., 1995; Muerhoff et al., 1995; Ohba et al., 1996) first predicted the presence of a metal binding site in the NS3 protease (De Francesco et al., 1996). This prediction was made on the basis of a clustering in space of three cysteines and one histidine residue in the model and was subsequently demonstrated by atomic absorption spectroscopy of the native enzyme and electronic spectroscopy of the Co(II) and Cd(II) derivatives of the NS3 protease (De Francesco et al., 1996; Stempniak et al., 1997). Zinc was shown to be essential for the structural integrity of the protein, its removal leading to unfolding and aggregation of the enzyme. Mutagenesis experiments showed that mutations affecting any of the three cysteine residues predicted to be involved in metal chelation resulted in an impaired NS3 protease activity (Hijikata et al., 1993b). Later, X-ray crystallography confirmed the assignment of the metal ligands predicted by the model. The zinc ion is in fact tetrahedrally coordinated by Cys1123, Cys1125, Cys1171, and, through a water molecule, by His1175, which shields the metal from bulk solvent (Fig. 6) (Kim et al., 1996; Love et al., 1996; Yan et al., 1998). The indirect coordination role of His1175 is consistent with the relatively weak effects of mutations in this position. A recent NMR study suggested that not water itself, but rather an OH⁻, is the bridging ligand between the metal and His1175 (Urbani et al., 1998). This study also showed that His1175 is endowed with significant flexibility, transiently exposing the metal to bulk solvent in a pH-dependent fashion.
This flexibility is also evident from the analysis of the crystal structures: in the absence of NS4A, coordination by His1175 is observed only in two of the three monomers in the asymmetric unit. In the third monomer, His1175 is 4 Å away from the zinc and does not participate in metal chelation. In contrast, a homogeneous situation is observed in the presence of NS4A, with His1175 always pointing toward the metal in both published crystal structures (Kim et al., 1996; Yan et al., 1998). A spectroscopic study using a Co(II)-substituted protein found that NS4A binding is indeed accompanied by changes in the coordination geometry of the metal binding site (Urbani et al., 1998). Interestingly, changes in the coordination geometry were also observed with small, anionic ligands, and these changes went along with a modest enhancement of the enzymatic activity of the NS3 protease (Urbani et al., 1998). The results are suggestive of an influence of the coordination geometry of the metal binding site on the conformation of the active site. A possible explanation for these findings resides in the topological location of the zinc ion in the NS3 protease. The metal-ligating residues are located in a long loop connecting the two β-barrel domains of the protease and in a short hairpin loop in the second domain. Since the residues of the catalytic triad are distributed between the two domains, the relative orientation of the two β-barrels is expected to affect enzymatic activity. Extracellular serine proteases often contain disulfide bridges that are believed to serve the purpose of stabilizing the relative positions of the two domains. Since disulfide bridges are not stable in the reducing intracellular milieu, the metal binding site of the NS3 protease might serve as a reductant-stable surrogate of an S-S bridge. In this context, it is interesting to note that in picornavirus 3C proteases, which have neither disulfides nor a structural zinc, stability in this region may be provided by the N-terminal α-helix that packs against the interbarrel loop (Love et al., 1996). In contrast, picornavirus 2A proteases contain two sequence motifs, Cys-X-Cys and Cys-X-His, that are located in a region which is topologically similar to the one harboring the metal binding site of HCV NS3 protease. Picornavirus 2A protease has indeed been found to rely on zinc for the formation of a native structure (Voss et al., 1995). This is even more remarkable since 2A proteases belong structurally to the serine protease family but utilize cysteine as nucleophile, implying that the metal binding site is more conserved than the catalytic site.

FIGURE 6 Three-dimensional structure of the zinc binding site of the NS3 protease.

D. INHIBITORS OF NS3 PROTEASE

Since the definition of the role of the NS3 protease in HCV polyprotein processing, considerable effort has been devoted to the development of expression systems, purification protocols, and in vitro activity assays for this enzyme. These efforts have proven successful despite the poor solubility and low catalytic efficiency of the enzyme and the intrinsic difficulties in assay design arising from the necessity to supply a cofactor in order to get a reasonable substrate turnover. Inhibition studies have shown that NS3 is only modestly inactivated by classic serine protease inhibitors such as chloromethylketones or phenylmethylsulfonyl fluoride (Bouffard et al., 1995; Hahm et al., 1995; Lin and Rice, 1995; Steinkühler et al., 1996a; Sudo et al., 1996; Shoji et al., 1995; Mori et al., 1996; Markland et al., 1997). It is, however, very sensitive to inhibition by copper ions (Han et al., 1995; Stempniak et al., 1997), which might inactivate the enzyme by catalyzing the oxidation of exposed cysteine residues. The assay systems that have been developed are presently used to screen for selective NS3 protease inhibitors. Some reports on the identification of such compounds have appeared in the literature and are briefly reviewed below.

1. Macromolecular Inhibitors

Quite a few reports about macromolecular inhibitors of NS3 protease activity have appeared. Using phage display technology, Martin and co-workers (1997) reported the affinity selection of a "camelized" variable domain antibody fragment that inhibited NS3 protease in a competitive way (Ki = 150 nM). The same strategy was used in the search for inhibitors by sampling repertoires of a minimized antibody-like molecule ("minibody") and of human pancreatic secretory trypsin inhibitor (HPSTI) (Dimasi et al., 1997). A low-micromolar, noncompetitive minibody inhibitor and a competitive HPSTI NS3 protease inhibitor (Ki = 360 nM) were selected.
A design approach using an NS3-directed macromolecular inhibitor has been recently reported by Martin et al. (1998). In this approach the active site binding loop of eglin C was engineered to incorporate NS3 P5-P4' consensus residues without affecting the integrity of the eglin scaffold. Competitive inhibitors with potencies in the low nanomolar range were obtained. The interaction between NS3 and the designed eglin inhibitors was characterized in detail and compared to the inhibition of other serine proteases by the parent eglin C molecule. The complexes of a series of serine proteases with wild-type eglin C are characterized by slow off-rates and long half-lives. In contrast, inhibition of NS3 by designed eglin inhibitors was characterized by very fast association and dissociation processes, possibly reflecting the peculiar, poorly structured substrate recognition region of the HCV serine protease. Using a different approach, Kumar and co-workers (1997) selected, from a pool of random RNA molecules, aptamers that inhibited the NS3 protease in a competitive way (Ki = 3 µM). All these macromolecular inhibitors may prove useful as tools for cocrystallization with the enzyme and as a starting point for structure-based drug design.

2. Nonpeptide Inhibitors

Chu and co-workers (1996) reported the isolation of a phenanthrenequinone during a natural product screen which inhibited the protease activity in an in vitro translation assay. A series of thiazolidine, benzamide, and benzanilide derivatives were identified in an HPLC-based assay using an NS3-NS4A complex fused to maltose binding protein at the N-terminus of NS3 and a peptide substrate based on the sequence of the NS5A-NS5B cleavage site (Sudo et al., 1997a,b; Kakiuchi et al., 1998). The compounds were found to inhibit the protease in a noncompetitive manner and showed poor selectivity.

3. Peptide Inhibitors

Landro and co-workers (1997) first reported on the development of peptide-based inhibitors of the NS3 protease. A competitive hexapeptide aldehyde, having the sequence EDVVAbuV-CHO, inhibited the NS3 protease in an NS4A-independent fashion (Ki = 50 µM).
In their substrate specificity studies, Landro et al. (1997) identified P1' substitutions such as proline, tetrahydroisoquinoline-3-carboxylic acid, or pipecolinic acid that abolished cleavage but retained a high affinity of the corresponding peptides for the protease. The decapeptide EDVVLCTicNleSY was a competitive inhibitor of the protease with Ki = 340 nM in the presence of NS4A and Ki = 28 µM in the absence of the cofactor, suggesting that NS4A influences binding of inhibitors extending into the P' side, but does not affect the affinity of ligands that are confined to the P side. Two groups have reported that the NS3 protease undergoes significant
inhibition by its N-terminal hexamer cleavage products (Steinkühler et al., 1998; Llinas-Brunet et al., 1998a). The Ki values of the hexamer products arising from hydrolysis of the NS4A-NS4B, NS4B-NS5A, and NS5A-NS5B junctions turned out to be up to 1 order of magnitude lower than the Km values of the corresponding substrate peptides. The physiological significance of this pronounced product inhibition is unknown at present. Interestingly, the N-terminal hexapeptide product of the intramolecular cleavage site between NS3 and NS4A was shown not to inhibit the protease up to a concentration of 500 µM. Since this is the only cleavage site containing a suboptimal P1 threonine residue, it has been argued that the counterselection of an optimal P1 residue at this junction has occurred in order to avoid premature inhibition of the enzyme by the first intramolecular cleavage product. Based on the observation of product inhibition, two groups (Ingallinella et al., 1998; Llinas-Brunet et al., 1998a,b) have sequentially optimized the sequences of the natural hexapeptide products to set up structure-activity relationships. These studies have shown that the main contribution to the binding energy derives from the P1 amino acid through both its side-chain and its α-carboxylic acid function. Molecular modeling, pH-dependence studies, and site-directed mutagenesis suggested that the P1 α-carboxylic acid is anchored in the active site via interactions with the protonated ε-N of the catalytic His1083, the backbone amide groups of Ser1165 and Gly1163 (forming the oxyanion hole of the enzyme), and with the ζ-N of the conserved Lys1162 (Steinkühler et al., 1998). Deletion of the acidic residue in the P6 position led to a >10-fold drop in potency, whereas an acidic residue is not strictly required in the P5 position. In this position amino acids with D chirality are also well accepted and even preferred.
In the P4 position a preference for large, hydrophobic residues was found, whereas either Glu or the hydrophobic Val or Leu were best accepted in P3. Both negatively charged and hydrophobic residues were favored in the P2 position. A fully optimized hexapeptide that inhibited the protease with IC50 = 1.5 nM was reported (Ingallinella et al., 1998). Llinas-Brunet and co-workers (1998b) have explored the possibility of substituting the P1 α-carboxylic acid of hexapeptide inhibitors with activated carbonyl groups. This was done in the context of hexapeptides containing a suboptimal but chemically inert P1 norvaline residue. Trifluoromethylketones, pentafluoroethylketones, and α-ketoamides were synthesized and showed only a modest improvement in potency with respect to peptide acids. The best compound, a hexapeptide α-ketoamide (IC50 = 0.64 µM), showed poor selectivity. Whereas replacement of the P1 α-carboxylic acid with a primary amide resulted in a complete loss of activity, methyl- and benzylamides showed inhibition in the low micromolar range. However, these compounds were equally potent against different serine proteases.
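The effect of NS4A on the decapeptide inhibitor described in this section (Ki = 340 nM with the cofactor, 28 µM without) can be restated as a fold change and as an approximate difference in binding free energy. The sketch below uses the textbook relation ΔΔG = RT ln(Ki,weak / Ki,tight). The temperature of 298.15 K is an assumption made here for illustration, since the assay temperature is not stated in the text, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

# Ki values quoted in the text for the decapeptide inhibitor, in molar units.
KI_WITH_NS4A = 340e-9
KI_WITHOUT_NS4A = 28e-6


def fold_change(ki_weak, ki_tight):
    """How many times tighter the inhibitor binds under the 'tight' condition."""
    return ki_weak / ki_tight


def delta_delta_g(ki_weak, ki_tight, temperature_k=298.15):
    """Approximate binding free energy difference in kcal/mol.

    temperature_k defaults to 298.15 K, an assumed value rather than
    the temperature of the original assays.
    """
    return R_KCAL * temperature_k * math.log(ki_weak / ki_tight)


if __name__ == "__main__":
    print("fold change:", round(fold_change(KI_WITHOUT_NS4A, KI_WITH_NS4A), 1))
    print("ddG (kcal/mol):", round(delta_delta_g(KI_WITHOUT_NS4A, KI_WITH_NS4A), 2))
```

Under these assumptions the cofactor tightens binding of the P'-extended inhibitor by roughly 80-fold, corresponding to about 2.6 kcal/mol, whereas the hexapeptide aldehyde confined to the P side is reported to be NS4A-independent.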


V. CONCLUSIONS

Only 4 years elapsed between the first description of the enzymatic activity of the NS3 protease in transfected cells and the determination of its three-dimensional structure by X-ray crystallography. On the way from the characterization of the cleavage products by immunoprecipitation to the atomic resolution of the active site of the enzyme, a great deal of information has also been gained on functional aspects of substrate recognition and on the catalytic mechanism. The news was often rather sobering: the requirement of extended interactions spanning at least 10 residues in the substrate, the need for a cysteine as optimum P1 residue, and the absence of P2, P3, and P4 binding loops make the development of potent and selective NS3 protease inhibitors a challenging project. The same effort devoted by the scientific community to unraveling the biology and enzymology of the NS3 protease will have to converge on the discovery of inhibitors. If this can be accomplished, there is reason for optimism that efficacious anti-HCV therapeutics will be developed, with the ultimate goal of curing a potentially life-threatening disease that affects 100 million individuals worldwide.

ACKNOWLEDGMENTS

We thank Youwei Yan, Sanjeev Munshi, Zhongguo Chen, and Lawrence Kuo for letting us use the coordinates of the NS3-NS4A complex prior to publication. Special thanks to Giovanni Migliaccio and Licia Tomei for helpful discussions and critical reading of the manuscript and to Uwe Koch for the preparation of Figs. 5 and 6.

REFERENCES

Alberti, A. (1995). Interferon therapy of acute hepatitis C. Viral Hepatitis 1, 37-45.
Alter, H. J. (1995). To C or not to C: These are the questions. Blood 85, 1681-1695.
Asabe, S.-I., Tanji, Y., Satoh, S., Kaneko, T., Kimura, K., and Shimotohno, K. (1997). The N-terminal region of hepatitis C virus-encoded NS5A is important for NS4A-dependent phosphorylation. J. Virol. 71, 790-796.
Bartenschlager, R. (1997). Molecular targets in inhibition of the hepatitis C virus replication. Antiviral Chem. Chemother. 8, 281-301.
Bartenschlager, R., Ahlborn-Laake, L., Mous, J., and Jacobsen, H. (1993). Nonstructural protein 3 of the hepatitis C virus encodes a serine-type proteinase required for cleavage at the NS3/4 and NS4/5 junctions. J. Virol. 67, 3835-3844.
Bartenschlager, R., Ahlborn-Laake, L., Mous, J., and Jacobsen, H. (1994). Kinetic and structural analyses of hepatitis C virus polyprotein processing. J. Virol. 68, 5045-5055.
Bartenschlager, R., Ahlborn-Laake, L., Yasargil, K., Mous, J., and Jacobsen, H. (1995a). Substrate determinants for cleavage in cis and in trans by the hepatitis C virus NS3 proteinase. J. Virol. 69, 198-205.
Bartenschlager, R., Lohman, V., Wilkinson, T., and Koch, J. O. (1995b). Complex formation between the NS3 serine-type proteinase of the hepatitis C virus and NS4A and its importance for polyprotein maturation. J. Virol. 69, 7519-7528.
Behrens, S.-E., Tomei, L., and De Francesco, R. (1996). Identification and properties of the RNA-dependent RNA polymerase of hepatitis C virus. EMBO J. 15, 12-22.
Bianchi, E., Steinkühler, C., Taliani, M., Urbani, A., De Francesco, R., and Pessi, A. (1996). Synthetic depsipeptide substrates for the assay of human hepatitis C virus protease. Anal. Biochem. 237, 239-244.
Bianchi, E., Urbani, A., Biasiol, G., Brunetti, M., Pessi, A., De Francesco, R., and Steinkühler, C. (1997). Complex formation between the hepatitis C virus serine protease and a synthetic NS4A cofactor peptide. Biochemistry 36, 7890-7897.
Blair, C. S., Haydon, G. H., and Hayes, P. C. (1996). Current perspectives on the treatment and prevention of hepatitis C infection. Exp. Opin. Invest. Drugs 5, 1657-1671.
Bouffard, P., Bartenschlager, R., Ahlborn-Laake, L., Mous, J., Roberts, N., and Jacobsen, H. (1995). An in vitro assay for hepatitis C virus NS3 serine proteinase. Virology 209, 52-59.
Chayama, K., Tsubota, A., Kobayashi, M., Okamoto, K., Hashimoto, M., Miyano, Y., Koike, H., Koida, I., Arase, Y., Saitoh, S., Suzuki, Y., Murashima, N., Ikeda, K., and Kumada, H. (1997). Pretreatment virus load and multiple amino acid substitutions in the interferon sensitivity-determining region predict the outcome of interferon treatment in patients with chronic genotype 1b hepatitis C virus infection. Hepatology 25, 745-749.
Choo, Q.-L., Kuo, G., Weiner, A. J., Overby, L. R., Bradley, D. W., and Houghton, M. (1989). Isolation of a cDNA clone derived from a blood-borne non-A non-B viral hepatitis genome. Science 244, 359-362.
Chu, M., Mierzwa, R., Truumees, I., King, A., Patel, M., Berrie, R., Hart, A., Butkiewicz, N., DasMahapatra, B., Chan, T. M., and Puar, M. S. (1996). Structure of Sch 68631: A new hepatitis C virus proteinase inhibitor from Streptomyces sp. Tetrahedr. Lett. 37, 7229-7232.
Cuthbert, J. A. (1994). Hepatitis C: Progress and problems. Clin. Microbiol. Rev. 7, 505-532.
Davies, G. L., Balatt, L. A., Schiff, E. R., Lindsay, K., Bodenheimer, H. C., Perillo, R. P., Carey, W., Jacobson, J. M., Payne, J., Dienstag, J. L., Van Thiel, D. H., Tamburro, C., Lefkowitch, J., Albert, J., Meschiewitz, C., Orrego, T. J., Gibas, A., and The Hepatitis Interventional Therapy Group (1989). Treatment of acute hepatitis C with recombinant interferon alpha: A multicenter, randomized, controlled trial. New Engl. J. Med. 321, 1501-1506.
Davies, G. L., Esteban-Mur, R., Rustgi, V., Hoefs, J., Gordon, S. C., Trepo, C., Shiffman, M. C., Zeuzem, S., Craxi, A., Ling, M. H., and Albrecht, J. (1998). Interferon-α 2b alone or in combination with ribavirin for the treatment of relapse of chronic hepatitis C. New Engl. J. Med. 339, 1493-1499.
De Francesco, R., Urbani, A., Nardi, M. C., Tomei, L., Steinkühler, C., and Tramontano, A. (1996). A zinc binding site in viral serine proteinases. Biochemistry 35, 13282-13287.
Di Bisceglie, A. M., Shindo, M., Fong, T. L., Friend, M. W., Swain, M. G., Gergasa, N. V., Axiotis, C. A., Waggoner, J. G., Park, Y., and Hoofnagle, J. H. (1992). A pilot study of ribavirin therapy for chronic hepatitis C. Hepatology 16, 649-654.
Dimasi, N., Martin, F., Volpari, C., Brunetti, M., Biasiol, G., Altamura, S., Cortese, R., De Francesco, R., Steinkühler, C., and Sollazzo, M. (1997). Characterization of engineered hepatitis C virus NS3 protease inhibitors affinity selected from human pancreatic secretory trypsin inhibitor and minibody repertoires. J. Virol. 71, 7461-7469.
Dimasi, N., Pasquo, A., Martin, F., Di Marco, S., Steinkühler, C., Cortese, R., and Sollazzo, M. (1998). Engineering, characterization, and phage display of hepatitis C virus NS3 protease and NS4A cofactor peptide as a single-chain protein. Protein Eng. 11, 1257-1265.
D'Souza, E. D. A., Grace, K., Sangar, D. V., Rowlands, D. J., and Clarke, B. E. (1995). In vitro cleavage of hepatitis C virus polyprotein substrates by purified recombinant NS3 protease. J. Gen. Virol. 76, 1729-1736.
D'Souza, E. D. A., O'Sullivan, E., Amphlett, E. M., Rowlands, D. J., Sangar, D. V., and Clarke, B. E. (1994). Analysis of NS3-mediated processing of the hepatitis C virus non-structural region in vitro. J. Gen. Virol. 75, 3469-3476.
Eckart, M. R., Selby, M., Masiarz, F., Lee, C., Berger, K., Crawford, K., Kuo, G., Houghton, M., and Choo, Q. L. (1993). The hepatitis C virus encodes a serine protease involved in processing of the putative nonstructural proteins from the viral polyprotein precursor. Biochem. Biophys. Res. Commun. 192, 399-406.
Enomoto, N., Sakuma, I., Asahina, Y., Kurosaki, M., Murakami, T., Yamamoto, C., Izumi, N., Marumo, F., and Sato, C. (1995). Comparison of full-length sequences of interferon-sensitive and resistant hepatitis C virus 1b. Sensitivity to interferon is conferred by amino acid substitutions in the NS5A region. J. Clin. Invest. 96, 224-230.
Enomoto, N., Sakuma, I., Asahina, Y., Kurosaki, M., Murakami, T., Yamamoto, C., Ogura, Y., Izumi, N., Marumo, F., and Sato, C. (1996). Mutations in the nonstructural protein 5A gene and response to interferon in patients with chronic hepatitis C virus 1b infection. N. Engl. J. Med. 334, 77-81.
Failla, C., Pizzi, E., De Francesco, R., and Tramontano, A. (1996). Redesigning the substrate specificity of the hepatitis C virus protease. Fold. Design 1, 35-42.
Failla, C., Tomei, L., and De Francesco, R. (1994). Both NS3 and NS4A are required for proteolytic processing of hepatitis C virus nonstructural proteins. J. Virol. 68, 3753-3760.
Failla, C., Tomei, L., and De Francesco, R. (1995). An amino-terminal domain of the hepatitis C virus NS3 proteinase is essential for interaction with NS4A. J. Virol. 69, 1769-1777.
Francki, R. I. B., Fauquet, C. M., Knudson, D. L., and Brown, F. (1991). Classification and nomenclature of viruses: Fifth report of the International Committee on Taxonomy of Viruses. Arch. Virol. Suppl. 2, 223-233.
Fukushi, S., Katayama, K., Kurihara, C., Ishiyama, N., Hoshino, F.-B., Ando, T., and Oya, A. (1994). Complete 5' noncoding region is necessary for the efficient internal initiation of hepatitis C virus RNA. Biochem. Biophys. Res. Commun. 199, 425-432.
Gale, M. J., Jr., Korth, M. J., Tang, N. M., Tan, S.-L., Hopkins, D. A., Dever, T. E., Polyak, S. J., Gretch, D. R., and Katze, M. G. (1997). Evidence that hepatitis C virus resistance to interferon is mediated through repression of the PKR protein kinase by the nonstructural 5A protein. Virology 230, 217-227.
Gallinari, P., Brennan, D., Nardi, C., Brunetti, M., Tomei, L., Steinkühler, C., and De Francesco, R. (1998). Multiple enzymatic activities associated with recombinant NS3 protein of hepatitis C virus. J. Virol. 72, 6758-6769.
Gorbalenya, A. E., and Koonin, E. V. (1993). Helicases: amino acid sequence comparison and structure-function relationship. Curr. Opin. Struct. Biol. 3, 419-429.
Gorbalenya, A. E., and Snijder, E. J. (1996). Viral cysteine proteinases. Perspect. Drug Discov. Design 6, 64-86.
Gorbalenya, A. E., Koonin, E. V., and Wolf, Y. I. (1990). A new family of putative NTP-binding domains encoded by genomes of small DNA and RNA viruses. FEBS Lett. 262, 145-148.
Grakoui, A., Wychowski, C., Lin, C., Feinstone, S. M., and Rice, C. M. (1993a). Expression and identification of hepatitis C virus polyprotein cleavage products. J. Virol. 67, 1385-1395.
Grakoui, A., McCourt, D. W., Wychowski, C., Feinstone, S. M., and Rice, C. (1993b). Characterization of the hepatitis C virus-encoded serine proteinase: Determination of proteinase-dependent polyprotein cleavage sites. J. Virol. 67, 2832-2843.
Grakoui, A., McCourt, D. W., Wychowski, C., Feinstone, S. M., and Rice, C. M. (1993c). A second hepatitis C virus-encoded proteinase. Proc. Natl. Acad. Sci. USA 90, 10583-10587.
Gwack, Y., Kim, D. W., Han, J. H., and Choe, J. (1996). Characterization of RNA binding activity and RNA helicase activity of the hepatitis C virus NS3 protein. Biochem. Biophys. Res. Commun. 225, 654-659.
Gwack, Y., Wook, D., Han, J. H., and Choe, J. (1995). NTPase activity of hepatitis C virus NS3 protein expressed in insect cells. Mol. Cells 5, 171-175.
Hahm, B., Han, D. S., Back, S. H., Song, O. K., Cho, M. J., Kim, C. J., Shimotohno, K., and Jang, S. K. (1995). NS3-4A of hepatitis C virus is a chymotrypsin-like protease. J. Virol. 69, 2534-2539.
Hamarake, R., Wang, H. G. H., Butcher, J. A., Bifano, M., Clark, G., Hernandez, D., Zhang, D., Racela, J., Standring, D., and Colonno, R. (1996). Establishment of an in vitro assay to characterize hepatitis C virus NS3-4A protease trans-processing activity. Intervirology 39, 249-258.
Han, D. S., Hahm, B., Rho, H. M., and Jang, S. K. (1995). Identification of the serine protease domain in NS3 of the hepatitis C virus. J. Gen. Virol. 76, 985-993.
Herion, D., and Hoofnagle, J. H. (1997). The interferon sensitivity determining region: All hepatitis C virus isolates are not the same. Hepatology 25, 769-771.
Hijikata, M., Kato, N., Ootsuyama, Y., Nakagawa, M., and Shimotohno, K. (1991). Gene mapping of the putative structural region of the hepatitis C virus genome by in vitro processing analysis. Proc. Natl. Acad. Sci. USA 88, 5547-5551.
Hijikata, M., Mizushima, H., Tanji, Y., Komoda, Y., Hirowatari, Y., Akagi, T., Kato, N., Kimura, K., and Shimotohno, K. (1993a). Proteolytic processing and membrane association of putative nonstructural proteins of hepatitis C virus. Proc. Natl. Acad. Sci. USA 90, 10733-10737.
Hijikata, M., Mizushima, H., Akagi, T., Mori, S., Kakiuchi, N., Kato, N., Tanaka, T., Kimura, K., and Shimotohno, K. (1993b). Two distinct proteinase activities required for the processing of a putative nonstructural precursor protein of hepatitis C virus. J. Virol. 67, 4665-4675.
Hong, Z., Ferrari, E., Wright Minogue, J., Chase, R., Risano, C., Seelig, G., Lee, C. G., and Kwong, A. D. (1996). Enzymatic characterization of hepatitis C virus NS3/4A complexes expressed in mammalian cells by using the herpes simplex virus amplicon system. J. Virol. 70, 4261-4268. Houghton, M. (1996). Hepatitis C viruses. In "Fields' Virology" (B. N. Fields, D. M. Knipe, and P. M. Howley, Eds.), 3rd ed., pp. 1035-1058. Lippincott-Raven, Philadelphia/New York. Ingallinella, P., Altamura, S., Bianchi, E., Taliani, M., Ingenito, R., Cortese, R., De Francesco, R., Steink~ihler, C., and Pessi, A. (1998). Potent peptide inhibitors of human hepatitis C virus NS3 protease are obtained by optimizing the cleavage products. Biochemistry 37, 8906-8914. Jin, L., and Peterson, D. L. (1995). Expression, isolation, and characterization of the hepatitis C virus ATPase/RNA helicase. Arch. Biochem. Biophys. 323, 47-53. Kakiuchi, N., Hijikata, M., Komoda, Y., Tanji, Y., Hirowatari, Y., and Shimotohno, K. (1995). Bacterial expression and analysis of cleavage activity of HCV serine proteinase using recombinant and synthetic substrate. Biochem. Biophys. Res. Commun. 210, 1059-1065. Kakiuchi, N., Komoda, Y., Komoda, K., Takeshita, N., Okada, S., Tani, T., and Shimotohno, K. (1998). Non-peptide inhibitors of HCV serine protease. FEBS Lett. 421,217-220. Kakumu, S., Yoshioka, K., Wakita, T., Ishikawa, T., Takayanagi, M., and Higashi, Y. (1993). A pilot study of ribavirin and interferon beta for the treatment of chronic hepatitis C. Gastroenterology 105,507-512. Kim, D. W., Gwack, Y., Han, J. H., and Choe, J. (1995). C-terminal domain of the hepatitis C virus NS3 protein contains an RNA helicase activity. Biochem. Biophys. Res. Commun. 215, 160-166. Kim, D. W., Kim, J., Gwack, Y., Han, J. H., and Choe, J. (1997). Mutational analysis of the hepatitis C virus RNA helicase.J. Virol. 71, 9400-9409. Kim, J. L., Morgenstern, K. A., Lin, C., Fox, T., Dwyer, M. D., Landro, J. A., Chambers, S. 
P., Markland, W., Lepre, C. A., O'Malley, E. T., Harbeson, S. L., Rice, C. M., Murcko, M. A., Caron, P. R., and Thomson, J. A. (1996). Crystal structure of the hepatitis virus NS3 proteinase domain complexed with a synthetic NS4A cofactor peptide. Cell 87,343-355.

HCV Proteases

87

Koch, J. O., and Bartenschlager, R. (1997). Determinants of substrate specificity in the NS3 serine proteinase of the hepatitis C virus. Virology 237, 78-88. Koch, J. O., Lohmann, V., Herian, U., and Bartenschlager, R. (1996). In vitro studies on the activation of the hepatitis C virus NS3 proteinase by the NS4A cofactor. Virology 221, 54-66. Kolykhalov, A. A., Agapov, E. V., and Rice, C. M. (1994). Specificity of the hepatitis C virus NS3 serine proteinase: Effects of substitutions at the 3/4A, 4A/4B, 4B/5A, and 5A/5B cleavage sites on polyprotein processing.J. Virol. 68, 7525-7533. Kolykhalov, A. A., Feinstone, S. M., and Rice, C. M. (1996). Identification of a highly conserved sequence element at the 3' terminus of hepatitis C virus genome RNA.J. Virol. 70, 3363-3371. Komoda, Y., Hijikata, M., Tanji, Y., Hirowatari, Y., Mizushima, H., Kimura, K., and Shimotohno, K. (1994a). Processing of hepatitis C viral protein in Escherichia coli. Gene 145,221-226. Komoda, Y., Hijikata, M., Sato, S., Asabe, S. I., Kimura, K., and Shimotohno, K. (1994b). Substrate requirements of hepatitis C virus serine proteinase for intermolecular polypeptide cleavage in Escherichia coli. J. Virol. 68, 7351-7357. Koskinas, J., Tibbs, C., Saleh, M. G., Pereira, U M., McFarlaine, I. G., and Williams, R. (1995). Effects of ribavirin on intrahepatic and extrahepatic expression of hepatitis C virus in interferon nonresponsive patients. J. Med. Virol. 45, 29-34. Kumar, P. K. R., Machida, K., Urvil, P. T., Kakiuchi, N., Vishnuvardhan, D., Shimotohno, K., Taira, K., and Nishikawa, S. (1997). Isolation of RNA aptamers specific to the NS3 protein of hepatitis C virus from a pool of completely random RNA. Virology 237, 270-282. Kuo, G., Choo, Q. L., Alter, H. J., Gitnick, G. L., Redecker, A. G., Purcell, R. H., Myamura, T., Dienstag, J. L., Alter, M. J., Syevens, C. E., Tagtmeyer, G. E., Bonino, F., Colombo, M., Lee, W. S., Kuo, C., Berger, K., Shister, J. R., Overby, L. R., Bradley, D. 
W., and Houghton, M. (1989). An assay for circulating antibodies to a major etiologic virus of human non-A non-B hepatitis. Science 244, 362-364. Kurosaki, M., Enomoto, N., Murakami, T., Sakuma, I., Asahina, Y., Yamamoto, C., Ikeda, T., Tozuka, S., Izumi, N., Marumo, F., and Sato, C. (1997). Analysis of genotypes and amino acid residues 2209 to 2248 of the NS5A region of hepatitis C virus in relation to the response to interferon-beta therapy. Hepatology 25,750-753. Landro, J. A., Raybuck, S. A., Luong, Y. P. C., O'Malley, E. T., Harbeson, S. L., Morgenstern, K. A., Rao, G., and Livingston, D. J. (1997). Mechanistic role of an NS4A peptide cofactor with the truncated NS3 protease of hepatitis C virus: Elucidation of the NS4A stimulatory effect via kinetic analysis and inhibitor mapping. Biochemistry 36, 9340-9348. Leinbach, S. S., Bhat, R. A., Xia, S. M., Hum, W. T., Stauffer, B., Davis, A., Hung, P. P., and Mizutani, S. (1994). Substrate specificity of the NS3 serine proteinase of hepatitis C virus as determined by mutagenesis at the NS3/NS4A junction. Virology 204, 163-169. Lin, C., and Rice, C. M. (1995). The hepatitis C virus NS3 proteinase and NS4A co-factor: establishment of a cell-free trans-processing assay. Proc. Natl. Acad. Sci. USA 92, 7622-7626. Lin, C., Lindenbach, B. D., Pragai, B. M., McCourt, D. W., and Rice, C. M. (1994a). Processing in the hepatitis C virus E2-NS2 region: Identification of p7 and two distinct E2-specific products with different C-termini. J. Virol. 68, 5063-5073. Lin, C., Pragai, B. M., Grakoui, A., Xu, J., and Rice, C. M. (1994b). Hepatitis C virus NS3 serine proteinase: Trans-cleavage requirements and processing kinetics. J. Virol. 68, 8147-8157. Lin, C., Thomson, J. A., and Rice, C. M. (1995). A central region in the hepatitis C virus NS4A protein allows formation of an active NS3-NS4A serine proteinase complex in vivo and in vitro. J. Virol. 69, 4373-4380. 
Llinas-Brunet, M., Bailey, M., Fazal, G., Goulet, S., Halmos, T., Laplante, S., Maurice, R., Poirer, M., Poupart, M. A., Thibeault, D., Wernic, D., and Lamarre, D. (1998a). Peptide-based inhibitors of the hepatitis C virus serine protease. Bioorg. Med. Chem. Lett. 8, 1713-1718. Llinas-Brunet, M., Bailey, M., Deziel, R., Fazal, G., Gorys, V., Goulet, S., Halmos, T., Maurice, R.,

88

Urbani et al.

Poirer, M., Poupart, M. A., Rancourt, J., Thibeault, D., Wernic, D., and Lamarre, D. (1998b). Studies on the C-terminal of hexapeptide inhibitors of the hepatitis C virus serine protease. Bioorg. Med. Chem. Lett. 8, 2719-2724. Lohmann, V., K6rner, F., Herian, U., and Bartenschlager, R. (1997). Biochemical properties ofhepatitis C virus NS5B RNA-dependent RNA polymerase and identification of amino acid sequence motifs essential for enzymatic activity.J. Virol. 71, 8416-8428. Love, R. A., Parge, H. E., Wickersham, J. A., Hostomsky, Z., Habuka, N., Moomaw, E. W., Adachi, T., and Homstomska, Z. (1996). The crystal structure of hepatitis C virus NS3 proteinase reveals a trypsin-like fold and a structural zinc binding site. Cell 87, 331-342. Manabe, S., Fuke, I., Tanishita, O., Kaji, C., Gomi, Y., Yoshida, S., Mori, C., Takamizawa, A., Yosida, I., and Okayama, H. (1994). Production of nonstructural proteins of hepatitis C virus requires a putative viral proteinase encoded by NS3. Virology 198,636-644. Markland, W., Petrillo, R. A., Fitzgibbon, M., Fox, T., McCarrick, R., McQuaid, T., Fulghum, J. R., Chen, W., Fleming, M. A., Thomson, J. A., and Chambers, S. P. (1997). Purification and characterization of the NS3 serine protease domain of hepatitis C virus expressed in Saccharomyces cerevisiae. J. Gen. Virol. 78, 39-43. Martin, F., Dimasi, N., Volpari, C., Perrera, C., Di Marco, S., Brunetti, M., Steink~ihler, C., De Francesco, R., and Sollazzo, M. (1998). Design of selective eglin inhibitors ofHCV NS3 proteinase. Biochemistry 37, 11459-11468. Martin, F., Volpari, C., Steink~ihler, C., Dimasi, N., Brunetti, M., Biasiol, G., Altamura, S., Cortese, R., De Francesco, R., and Sollazzo, M. (1997). Affinity selection of a camelized VH domain antibody inhibitor of hepatitis C virus NS3 protease. Prot. Engineer. 10, 607-614. Mast, E. E., and Alter, M.J. (1993). Epidemiology of viral hepatitis: An overview. Semin. Virol. 4, 273-283. McDonnell, W. M., and Lucey, M. R. (1995). 
Hepatitis C infection. Curr. Opin. Infect. Dis. 8, 384390. McHutchison, J. G., Gordon, S. C., Schiff, E. R., Shiffman, M. L., Lee, W. M., Rustgi, V. K., Goodman, Z. D., Ling, M. H., Cort, S., and Albrecht, J. K. (1998). Interferon-a 2b alone or in combination with ribavirin as initial treatment for chronic hepatitis C. New Engl.J. Med. 339,14851492. Miller, R. H., and Purcell, R. H. (1990). Hepatitis C virus shares amino acid sequence similarity with pestiviruses and flaviviruses as well as members of two plant virus supergroups. Proc. Natl. Acad. Sci. USA 87, 2057-2061. Mizushima, H., Hijikata, M., Asabe, S. I., Hirota, M., Kimura, K., and Shimotohno, K. (1994a). Two hepatitis C virus glycoprotein E2 products with different C termini. J. Virol. 68, 6215-6222. Mizushima, H., Hijikata, M., Tanji, Y., Kimura, K., and Shimotohno, K. (1994b). Analysis of Nterminal processing of hepatitis C virus nonstructural protein 2. J. Virol. 68, 2731-2734. Morgenstern, K. A., Landro, J. A., Hsiao, K., Lin, C., Gu, Y., Su, M. S. S., and Thomson, J. A. (1997). Polynucleotide modulation of the protease, nucleoside triphosphatase, and helicase activities of a hepatitis C virus NS3-NS4A complex isolated from transfected COS cells. J. Virol. 71, 37673775. Mori, A., Yamada, K., Kimura, J., Koide, T., Yuasa, S., Yamada, E., and Miyamura, T. (1996). Enzymatic characterization of purified NS3 serine proteinase of hepatitis C virus expressed in Escherichia coli. FEBS Lett. 378, 37-42. Muerhoff, S. A., Leary, T. P., Simons, J. N., Pilot-Matias, T. J., Dawson, G.J., Erker, J. C., Chalmers, M. L., Schlauder, G. G., Desai, S. M., and Mushahwar, I. K. (1996). Genomic organization of GB viruses A and B: Two new members of the Flavivirideae associated with GB agent hepatitis. J. Virol. 69, 5621-5630. Neddermann, P., Tomei, L., Steink~hler, C., Gallinari, P., Tramontano, A. and De Francesco, R. (1997). The nonstructural proteins of the hepatitis C virus: structure and functions. Biol. Chem. 
378,469-476.

HCV Proteases

89

Ohba, K. I., Mizokami, M., Lau,J. Y. N., Orito, E., Ikeo, K., and Gojobori, T. (1996). Evolutionary relationship of hepatitis C, pesti, flavi, plantviruses, and the newly discovered GB hepatitis agents. FEBS Lett. 378, 232-234. Pasquo, A., Nardi, M. C., Dimasi, N., Tomei, U, Steink(ihler, C., Delmastro, E, Tramontano, A., and De Francesco, R. (1998). Rational design and functional expression of a constitutively active single chain NS4A-NS3 proteinase. Fold. Design. 3,433-441. Preugschat, F., Averett, D. R., Clarke, B. E., and Porter, D. J. T. (1996). A steady-state and presteady state kinetic analysis of the NTPase activity associated with the hepatitis C virus NS3 helicase domain.J. Biol. Chem. 271, 24449-24457. Pieroni, L., Santolini, E., Fipaldini, C., Pacini, U, Migliaccio, G., and La Monica, N. (1997). In vitro study of the NS2-3 protease of hepatitis C virus.J. Virol. 71, 6373-6380. Pizzi, E., Tramontano, A., Tomei, U, La Monica, N., Failla, C., Sardana, M., Wood, T., and De Francesco, R. (1994). Molecular model of the specificity pocket of the hepatitis C virus proteinase: Implications for substrate recognition. Proc. Natl. Acad. Sci. USA 91,888-892. Reed, K. E., Grakoui, A., and Rice, C. M. (1995). Hepatitis C virus-encoded NS2-3 proteinase: Cleavage-site mutagenesis and requirements for bimolecular cleavage. J. Virol. 69, 4127-4136. Reichard, O., Yun, Z. B., Sonnerborg, A., and Weiland, O. (1993). Hepatitis C viral RNA titres in serum prior to, during and after oral treatment with ribavirin for chronic hepatitis C.J. Med. Virol. 41, 99-102. Rice, C. M. (1996). Flavivirideae: The viruses and their replication. In "Fields' Virology" (B. N. Fields, D. M. Knipe, and E M. Howley, Eds.), 3rd ed. pp. 931-959. Lippincott-Raven, Philadelphia/New York. Saito, I., Miyamura, T., Ohbayashi, A., Harada, H., Katayama, T., Kikuchi, S., Watanabe, Y., Koi, S., Onji, M., Ohta, Y., Choo, Q. L., Houghton, M., and Kuo, G. (1990). 
Hepatitis C virus infection is associated with the development of hepatocellular carcinoma. Proc. Natl. Acad. Sci. USA 87, 6547-6549. Sali, D. U, Ingram, R., Wendel, M., Gupta, D., McNemar, C., Tsarbopoulos, A., Chen, J. W., Hong, Z., Chase, R., Risano, C., Zhang, R., Yao, N., Kwong, A. D., Ramanathan, L., Le, H. V., and Weber, E C. (1998). Serine protease of hepatitis C virus expressed in insect cells as the NS3/4A complex. Biochemistry 37, 3392-3401. Santolini, E., Migliaccio, G., and La Monica, N. (1994). Biosynthesis and biochemical properties of the hepatitis C virus core protein. J. Virol. 68, 3631-3641. Santolini, E., Pacini, L., Fipaldini, C., Migliaccio, G., and Monica, N. (1995). The NS2 protein of hepatitis C virus is a transmembrane polypeptide.J. Virol. 69, 7461-7471. Satoh, S., Tanji, Y., Hijikata, M., Kimura, K., and Shimotohno, K. (1995). The N-terminal region of hepatitis C virus nonstructural protein 3 (NS3) is essential for stable complex formation with NS4A.J. Virol. 69, 4255-4260. Selby, M. J., Choo, Q. L., Berger, K., Kuo, G., Glazer, E., Eckart, M., Lee, C., Chien, D., Kuo, C., and Houghton, M. (1993). Expression, identification and subcellular localization of the proteins encoded by the hepatitis C viral genome.J. Gen. Virol. 74, 1103-1113. Shimotohno, K. (1993). Hepatocellular carcinoma in Japan and its linkage to infection with hepatitis C virus. Semin. Virol. 4, 305-312. Shimizu, Y., Yamaji, K., Masuho, Y., Yokota, T., Inoue, H., Sudo, K., Satoh, S., and Shimotohno, K. (1996). Identification of the sequence of NS4A required for enhanced cleavage of the NS5A/5B site by hepatitis C virus NS3 proteinase.J. Virol. 70, 127-132. Shoji, I., Suzuki, T., Chieda, S., Sato, M., Harada, T., Chiba, T., Matsuura, Y., and Miyamura, T. (1995). Proteolytic activity of NS3 serine protease of hepatitis C virus efficiently expressed in Escherichia coli. Hepatology 22, 1648-1655. Sirnons, J. N., Leary, T. E, Dawson, G. J., Pilot-Matias, T.J., Muerhoff, A. 
S., Schlauder, G. G., Desai, S. M., and Mushahwar, I. K. (1995). Isolation of novel virus-like sequences associated with human hepatitis. Nat. Med. 1,564-569.

90

Urbani et al.

Steink~ihler, C., Biasiol, G., Brunetti, M., Urbani, A., Koch, U., Cortese, R., Pessi, A., and De Francesco, R. (1998). Product inhibition of the hepatitis C virus NS3 protease. Biochemistry 37, 8899-8905. Steinkflhler, C., Tomei, L., and De Francesco, R. (1996a). In vitro activity of hepatitis C virus protease NS3 purified from recombinant baculovirus-infected Sf9 cells. J. Biol. Chem. 271, 63676373. Steinkflhler, C., Urbani, A., Tomei, L., Biasiol, G., Sardana, M., Bianchi, E., Pessi, A., and De Francesco, R. (1996b). Activity of purified hepatitis C virus proteinase NS3 on peptide substrates. J. Virol. 70, 6694-6700. Stempniak, M., Hostomska, Z., Nodes, B. R., and Hostomsky, Z. (1997). The NS3 proteinase domain of hepatitis C virus is a zinc-containing enzyme. J. Virol. 71, 2881-2886. Sudo, K., Inoue, H., Shimizu, Y., Yamaji, K., Konno, K., Shigeta, S., Kaneko, T., Yokota, T., and Shimotohno, K. (1996). Establishment of an in vitro assay system for screening hepatitis C virus protease inhibitors using high performance liquid chromatography. Antiviral Res. 32, 9-18. Sudo, K., Matsumoto, Y., Matsushima, M., Fujiwara, M., Konno, K., Shimotohno, K., Shigeta, S., and Yokota, T. (1997a). Novel hepatitis C virus protease inhibitors: Thiazolidine derivatives. Biochem. Biophys. Res. Commun. 238, 643-647. Sudo, K., Matsumoto, Y., Matsushima, M., Konno, K., Shimotohno, K., Shigeta, S., and Yokota, T. (1997b). Novel hepatitis C virus proteinase inhibitors: 2,4,6-Trihydroxy, 3-nitrobenzamide derivatives. Antiviral Chem. Chemother. 8, 541-544. Suzich, J. A., Tamura, J. K., Palmer Hill, F., Warrener, P., Grakoui, A., Rice, C. M., Feinstone, S. M., and Collett, M. S. (1993). Hepatitis C virus NS3 protein polynucleotide-stimulated nucleoside triphosphatase and comparison with the related pestivirus and flavivirus enzymes. J. Virol. 67, 6152-6158. Suzuki, T., Sato, M., Chieda, S., Shoji, I., Harada, T., Yamakawa, Y., Watabe, S., Matsuura, Y., and Miyamura, T. (1995). 
In vivo and in vitro trans~ activity of hepatitis C virus serine proteinase expressed by recombinant baculoviruses. J. Gen. Virol. 76, 3021-3029. Tai, C.~ Chi, W.-K., Chen, D.-S., and Hwang, L.~ (1996). The helicase activity associated with hepatitis C virus nonstructural protein 3 (NS3).J. Virol. 70, 8477-8484. Taliani, M., Bianchi, E., Narjes, F., Fossatelli, M., Urbani, A., Steink~ihler, C., De Francesco, R., and Pessi, A. (1996). A continuous assay of hepatitis C virus protease based on resonance energy transfer depsipeptide substrates. Anal. Biochem. 240, 60-67. Tanaka, T., Kato, N., Cho, M. J., and Shimotohno, K. (1995). A novel sequence found at the 3' terminus of the hepatitis C virus genome. Biochem. Biophys. Res. Commun. 215,744-749. Tanaka, T., Kato, N., Cho, M. J., and Shimotohno, K. (1996). Structure of the 3' terminus of the hepatitis C virus.J. Virol. 70, 3307-3312. Tanji, Y., Hijikata, M., Hirowatari, Y., and Shimotohno, K. (1994a). Hepatitis C virus polyprotein processing: Kinetics and mutagenic analysis of serine proteinaseodependent cleavage. J. Virol. 68, 8418-8422. Tanji, Y., Hijikata, M., Hirowatari, Y., and Shimotohno, K. (1994b). Identification of the domain required for trans-cleavage activity of hepatitis C viral serine proteinase. Gene 145,215-219. Tanji, Y., Hijikata, M., Satoh, S., Kaneko, T., and Shimotohno, K. (1995a). Hepatitis C virusencoded nonstructural protein NS4A has versatile functions in viral protein processing. J. Virol. 69, 1575-1581. Tanji, Y., Kaneko, T., Satoh, S., and Shimotohno, K. (1995b). Phosphorylation of hepatitis C virusencoded nonstructural protein NS5A. J. Virol. 69, 3980-3986. Taremi, S. S., Beyer, B., Maher, M., Yao, N., Prosise, W., Weber, P., and Malcolm, B. A. (1998). Construction, expression and characterization of a novel fully activated recombinant singlechain hepatitis C virus protease. Protein Sci. 7, 2143-2149. Tomei, L., Failla, C., Santolini, E., De Francesco, R., and La Monica, N. (1993). 
NS3 is a serine proteinase required for processing of hepatitis C virus polyprotein.J. Virol. 67, 4017-4026.

HCV Proteases

91

Tsukiyama-Koharak, D., Iizukam, N., Kohara, M., and Nomoto, A. (1992). Internal ribosome entry site within hepatitis C virus RNA.J. Virol. 66, 1476-1483. Urbani, A., Bazzo, R., Nardi, M. C., Cicero, D., De Francesco, R., Steinkflhler, C., and Barbato, G. (1998). The metal binding site of the hepatitis C virus NS3 protease. A spectroscopic study. J. Biol. Chem. 272, 9204-9209. Urbani, A., Bianchi, E., Narjes, F., Tramontano, A., De Francesco, R., SteinkC~hler, C., and Pessi, A. (1997). Substrate specificity of the hepatitis C virus serine protease NS3. J. Biol. Chem. 272, 9204-9209. Vallee, B., and Auld, D. S. (1990a). Zinc coordination, function and structure of zinc enzymes and other proteins. Biochemistry 29, 5647-5659. Vallee, B., and Auld, D. S. (1990b). Active site zinc ligands and activated H20 of zinc enzymes. Proc. Natl. Acad. Sci. USA 87, 220-224. Vishnuvardan, D., Kakiuchi, N., Urvil, P. T., Shimotohno, K., Kumar, P. K. R., and Nishikawa, S. (1997). Expression of highly active recombinant NS3 protease domain of hepatitis C virus in E. coll. FEBS Lett. 402, 209-212. Wang, C., Sarnow, P., and Siddiqui, A. (1993). Translation of human hepatitis C virus RNA in cultured cells is mediated by an internal ribosome binding mechanism.J. Virol. 67, 3338-3344. Wu, Z., Yao, N., Le, H. V., and Weber, P. (1998). Mechanism of autoproteolysis at the NS2-NS3 junction of the hepatitis C virus polyprotein. Trends Biochem. Sci. 23, 92-94. Yan, Y., Li, Y., Munshi, S., Sardana, V., Cole, J., Sardana, M., Steinkuhler, C., Tomei, L., De Francesco, R., Kuo, L., and Chen, Z. (1998). Complex of NS3 protease and NS4A peptide of BK strain hepatitis C virus: A 2.2 A resolution structure in a hexagonal crystal form. Prot. Sci. 17, 837-847. Yao, N., Hesson, T., Cable, M., Hong, Z., Kwong, A. D., Le, H. V., and Weber, P. C. (1997). Structure of the hepatitis C virus RNA helicase domain. Nat. Struct. Biol. 4, 463-467. Yuan, Z. H., Kumar, U., Thomas, H. C., Wen, Y. M., and Monjardino, J. 
(1997). Expression, purification and partial characterization of HCV RNA polymerase. Biochem. Biophys. Res. Commun. 232, 231-235. Zhang, R., Durkin, J., Windsor, W. T., McNemar, C., Ramanathan, L., and Le, H. V. (1997). Probing the substrate specificity of hepatitis C virus NS3 serine protease by using synthetic peptides. J. Virol. 71, 6208-6213.

Human Herpesvirus Proteases

XIAYANG QIU AND SHERIN S. ABDEL-MEGUID
Department of Structural Biology, SmithKline Beecham Pharmaceuticals, King of Prussia, Pennsylvania 19406

I. Introduction
II. Role of the Protease in the Virus Life Cycle
III. Primary Structures
IV. Enzymatic Activity
V. Three-Dimensional Structures
VI. Substrate Specificity
VII. Mechanism of Action
VIII. Inhibitors
IX. Ligand Binding
References

I. INTRODUCTION

Human herpesviruses are responsible for a variety of diseases, from subclinical infections to fatal diseases in the immunocompromised and immunosuppressed; the clinical manifestations of the diseases they cause are very different (Table I). They are divided into three subfamilies designated α, β, and γ (Table I). The α-subfamily includes herpes simplex viruses types 1 and 2 (HSV-1 and HSV-2) and varicella-zoster virus (VZV); the β group includes cytomegalovirus (CMV) and human herpesviruses 6 and 7 (HHV-6 and HHV-7); and the γ group includes Epstein-Barr virus (EBV) and human herpesvirus 8 (HHV-8), also known as Kaposi's sarcoma-associated herpesvirus. These viruses vary greatly in their biological properties and remain latent in a specific set of cells (Table I). The α-subfamily replicates relatively fast, destroys infected cells efficiently, and establishes latency primarily in sensory ganglia (thus called neurotropic). The β-subfamily replicates slowly, often causes enlargement of the infected cells, and can be latent in secretory glands, lymphoreticular cells, and other tissues. The γ-subfamily is specific for either B- or T-lymphocytes (lymphotropic) and exists at either a latent or lytic stage but does not produce infectious progeny.

Viruses of the α-subfamily are among those causing serious diseases. The herpes simplex viruses were the first of the human herpesviruses to be discovered and are among the most extensively investigated of all viruses. Herpes simplex virus type 1 is the virus responsible mainly for herpes labialis (cold sores), while HSV-2 causes genital herpes. The latter disease is of increasing public health importance. The recurrent nature of the infection, its differing clinical manifestations, and complications such as aseptic meningitis and neonatal infection are of great concern to patients and health care providers. Varicella-zoster virus is responsible for chickenpox, shingles, and postherpetic neuralgia. Primary exposure to VZV results in chickenpox, reactivation of the virus following a period of latency gives rise to shingles, and postherpetic neuralgia is probably the result of nerve damage during the active replication phase of shingles. Cytomegalovirus is a ubiquitous opportunistic pathogen that can cause life-threatening infections in congenitally infected infants, immunocompromised individuals, and immunosuppressed transplant patients. Both HHV-6 and HHV-7 have been associated with childhood diseases such as roseola. Epstein-Barr virus can cause mononucleosis, and HHV-8 has recently been linked to the development of Kaposi's sarcoma.

II. ROLE OF THE PROTEASE IN THE VIRUS LIFE CYCLE

Herpesviruses are enveloped double-stranded DNA viruses that share a common pathway of assembly. The DNA is packaged into an icosahedral capsid in the nucleus of infected cells. The icosahedral capsid is surrounded by an amorphous material referred to as the tegument and enclosed within a lipid envelope of cellular origin that is acquired while the virus buds from the infected cell. Packaging of the viral DNA requires processing of an assembly protein precursor designated ICP35 in HSV-1. The processed ICP35 appears to form an inner scaffold that supports the proper assembly of the capsid. It is found in the capsid prior to DNA packaging, but is absent in the mature virions. The precursor ICP35 is processed through removal of 25 amino acid residues from its C-terminus by a virally encoded 635-amino-acid serine protease that contains the assembly protein at its C-terminus (Fig. 1). This site of cleavage is known as the maturation (M) site. The protease is also capable of catalyzing its own cleavage at the release (R) site to produce a 247-residue N-terminal domain that has full catalytic activity (Fig. 1).
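The fragment sizes implied by these two cleavages can be worked out with a short sketch. It uses only the lengths quoted above (635, 247, and 25 residues). The function name `cleavage_fragments` is hypothetical, and the position of the M-site within the full-length protease is an assumption: because the protease carries the assembly protein at its C-terminus, the M-site is taken to lie 25 residues from the C-terminal end.

```python
# Lengths quoted in the text for the HSV-1 protease (UL26 gene product).
FULL_LENGTH = 635     # residues in the full-length protease
R_SITE_AFTER = 247    # R-site cleavage releases a 247-residue N-terminal domain
M_SITE_TAIL = 25      # M-site cleavage removes 25 C-terminal residues


def cleavage_fragments(length, cut_after):
    """Return fragment lengths after cutting a chain of `length` residues.

    `cut_after` lists 1-based residue numbers after which the chain is cut.
    """
    cuts = sorted(set(cut_after))
    if any(c <= 0 or c >= length for c in cuts):
        raise ValueError("each cut must fall strictly inside the chain")
    bounds = [0] + cuts + [length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Assumption: the M-site of the full-length protease sits 25 residues from
# its C-terminus, because it shares that C-terminus with the assembly protein.
m_site_after = FULL_LENGTH - M_SITE_TAIL
fragments = cleavage_fragments(FULL_LENGTH, [R_SITE_AFTER, m_site_after])
print(fragments)  # N-terminal catalytic domain, middle segment, C-terminal tail
```

Under these assumptions the chain splits into a 247-residue catalytic domain, a 363-residue middle segment, and the 25-residue C-terminal tail.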


FIGURE 1 HSV-1 protease (UL26 gene product) and its substrate (UL26.5 gene product). Maturation (M-site) and release (R-site) cleavages are indicated.

Liu and Roizman (1991) and Welch et al. (1991) were the first to report the identification of these serine proteases from herpesviruses; the former identified the protease from HSV-1 and the latter from CMV. These two enzymes are the most studied of all human herpesvirus proteases. Gao et al. (1994) showed, using a null mutant virus, that the HSV-1 protease is essential for capsid formation and production of infectious virus, making it an attractive target for therapeutic intervention. This and other studies showing that herpesvirus proteases are essential for the virus life cycle have been summarized in recent reviews by Holwerda (1997) and Gibson and Hall (1997).

III. PRIMARY STRUCTURES

The full-length proteases of the various human herpesviruses range from 512 amino acids in HHV-7 to 708 amino acids in CMV. The N-terminal (catalytic) domains range from 226 amino acids in HHV-7 to 256 amino acids in CMV, indicating that most of the variability in the size of these proteases is in the C-terminal (assembly protein) portion. The catalytic domains show significant sequence homology within each subfamily of herpesvirus, but only limited homology between the different subfamilies (Fig. 2 and Table II). For example, the amino acid sequence of the HSV-1 protease catalytic domain is 91 and 50% identical to that of HSV-2 and VZV, respectively, while it is only 26% identical to that of CMV protease. Extensive homology searches against all known sequence databases revealed little homology to any other protein.

TABLE II Amino Acid Sequence Identities between Human Herpesvirus Proteases (a)

(%)      HSV-1  HSV-2  VZV  CMV  HHV-6  HHV-7  EBV  HHV-8
HSV-1     --     91    50   26    23     19    26    27
HSV-2     91     --    51   26    23     19    27    28
VZV       54     54    --   26    21     20    23    26
CMV       30     30    30   --    38     35    31    33
HHV-6     24     24    21   41    --     59    31    33
HHV-7     25     28    24   39    60     --    27    28
EBV       31     31    26   34    29     29    --    45
HHV-8     30     30    30   37    35     30    45    --

(a) Boldface used as in Fig. 2, where the alignment was improved based on crystal structures; italic from GCG BESTFIT.

Liu and Roizman (1992) showed that the HSV-1 protease was inhibited by some serine protease inhibitors, but not by inhibitors of metallo-, acidic, or thiol proteases. This was surprising, given the absence in herpesvirus proteases of the conserved G-X-S/C-G-G motif of chymotrypsin-like proteases and the G-T-S-M/A motif of subtilisin-like proteases, and was an early clue that these enzymes could be a novel class of serine proteases.
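A check for the two motifs named above can be sketched with regular expressions. This is only an illustration of how such a motif scan might be written, not the procedure used in the cited work; the function name `find_motifs` and both toy sequences are hypothetical and made up for the example.

```python
import re

# Motifs named in the text, written as regular expressions.
# G-X-S/C-G-G: chymotrypsin-like; G-T-S-M/A: subtilisin-like.
MOTIFS = {
    "chymotrypsin-like": r"G.[SC]GG",
    "subtilisin-like": r"GTS[MA]",
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]} for one sequence."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        lookahead = re.compile("(?=" + pattern + ")")
        hits[name] = [match.start() + 1 for match in lookahead.finditer(seq)]
    return hits


# Hypothetical toy sequences, invented for illustration only.
toy_with_motif = "MKTAGDSGGPLVAA"
toy_without_motif = "MKTAAVSLLPEVAA"

print(find_motifs(toy_with_motif))
print(find_motifs(toy_without_motif))
```

A sequence that returns empty lists for both motifs, as the second toy sequence does, corresponds to the situation described for the herpesvirus proteases.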

IV. ENZYMATIC ACTIVITY

All human herpesvirus proteases cleave a peptide bond between an alanine and a serine, with CMV protease being the most extensively characterized in terms of its enzymatic activity (Burck et al., 1994; Pinko et al., 1995; Liang et al., 1998). The purified wild-type CMV protease shows a clip between Ala143 and Ala144. This site of cleavage is referred to as the internal (I) site. The CMV protease is the only member of the family known to show this I-site clip; all others have been purified as a single chain. A number of mutations to ablate cleavage at the I-site have been engineered (Pinko et al., 1995; Qiu et al., 1996), with the resulting single-site mutant having catalytic activity nearly identical to that of the wild-type (clipped) enzyme (Table III). This suggests that the immediate vicinity of the I-site is involved neither in catalysis nor in substrate binding, as was confirmed later from the crystal structure of CMV protease (the I-site being part of a disordered loop on the surface of the enzyme).

The CMV protease differs in its catalytic activity toward the R- and M-sites. The turnover rate of the M-site (GVVNA↓SCRLA) cleavage is an order of magnitude faster than that of the R-site (SYVKA↓SVSPE), while the Km values are similar. Moreover, the hydrolysis of the R-site peptide has an optimal pH of about 7, while the hydrolysis of the M-site peptide has a biphasic pH dependence with optima of approximately 7 and 9, probably due to the protonation of the arginines and the lysine in the peptide substrate (Burck et al., 1994). Unlike CMV protease, HSV-1 protease does not have a preference for the M- over the R-site. Its pH optimum is approximately 8.0, and it is about 10 times slower than CMV protease (DiIanni et al., 1993; Darke et al., 1994; Hall and Darke, 1995).

The catalytic activity of the herpesvirus proteases is influenced significantly by the nature and concentration of cosolvents. For CMV protease, the maximal specific activity is increased by about 10-fold in the presence of 30% glycerol (Margosiak et al., 1996). Similar enhancement in catalytic activity has also been reported for the HSV-1 protease (Hall and Darke, 1995). The most dramatic cosolvent effect is observed with sodium citrate. The CMV protease catalytic efficiency increases by 290-fold (kcat/Km of 1.24 min⁻¹ µM⁻¹) in the presence of 0.8 M citrate over that in the absence of the cosolvent (0.0044 min⁻¹ µM⁻¹). Similarly, HSV-1 protease has a kcat/Km of 0.25 min⁻¹ µM⁻¹ (kcat 4 min⁻¹, Km 16 µM) in 0.8 M citrate, compared to the kcat/Km value of 0.0003 min⁻¹ µM⁻¹ in the absence of citrate, an increase of 800-fold in catalytic efficiency.

The cosolvent effects led to the discovery that both CMV and HSV-1 proteases are active as homodimers (Margosiak et al., 1996; Schmidt and Darke, 1997). For CMV protease, the dissociation constant is 6.6 or 0.55 µM in 10 or 20% glycerol, respectively (Darke et al., 1996). For HSV-1 protease, the dissociation constants are 0.96 or 0.23 µM in 20% glycerol and 0.2 or 0.5 M citrate, respectively (Schmidt and Darke, 1997). The dimer association is rather weak, probably an artifact of working with only the catalytic domain. The cosolvent effects on herpesvirus protease activity are likely due to their ability to enhance dimerization and stabilize the conformation of active dimers.

Another important observation is the apparent low catalytic efficiency of herpesvirus protease when peptide substrates were used (Table III).
Using authentic M-site derived peptides in 50% glycerol (Burck et al., 1994), CMV protease catalytic efficiency is about 104 worse than that of chymotrypsin and about 30 times worse than that of HIV protease. The apparent low efficiencies of the herpesvirus proteases may be an important property of their biological functions in the well-orchestrated events during capsid maturation (Babe and Craik, 1997). Although data support the view that herpesvirus proteases are less active than other viral proteases, this may just be an artifact of working with only the catalytic domains. In fact, when the assembly protein is used as substrate, the K~ is enhanced by over 100-fold (with similar kcat) e v e n in the absence of glycerol (Burke et al., 1994). It is also known that HSV-1 protease, when coexpressed with ICP35, exhibits greater catalytic efficiency (Deckman et al., 1992). This suggests that substrate binding may involve interactions beyond the catalytic domain, i.e., with the C-terminal domain in the full-length enzyme.
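The citrate enhancements quoted above follow from simple ratios of catalytic efficiencies. The sketch below recomputes them from the rounded values given in the text; the function names are illustrative only, and because the inputs are rounded the ratios come out near, rather than exactly at, the 800-fold and 290-fold figures.

```python
# Catalytic efficiency (kcat/Km) and fold enhancement by citrate,
# recomputed from the rounded values quoted in the text.

def catalytic_efficiency(kcat_per_min, km_um):
    """Return kcat/Km in min^-1 uM^-1."""
    return kcat_per_min / km_um


def fold_change(with_cosolvent, without_cosolvent):
    """Ratio of two catalytic efficiencies (dimensionless)."""
    return with_cosolvent / without_cosolvent


# HSV-1 protease in 0.8 M citrate: kcat 4 min^-1, Km 16 uM
hsv1_citrate = catalytic_efficiency(4.0, 16.0)
hsv1_fold = fold_change(hsv1_citrate, 0.0003)

# CMV protease: 1.24 vs 0.0044 min^-1 uM^-1
cmv_fold = fold_change(1.24, 0.0044)

print(f"HSV-1 kcat/Km in citrate: {hsv1_citrate:.2f} min^-1 uM^-1")
print(f"HSV-1 fold enhancement:   {hsv1_fold:.0f}")
print(f"CMV fold enhancement:     {cmv_fold:.0f}")
```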

Xiayang Qiu and Sherin S. Abdel-Meguid

V. THREE-DIMENSIONAL STRUCTURES

A. NOVEL FOLD

The crystal structure of CMV protease has been determined by several groups (Qiu et al., 1996; Shieh et al., 1996; Tong et al., 1996; Chen et al., 1996), followed by those of VZV, HSV-1, and HSV-2 proteases (Qiu et al., 1997; Hoog et al., 1997). Unlike classic serine proteases, which have two distinct β-barrel domains, the CMV protease is a single-domain protein. Its overall fold can be described as a seven-stranded orthogonally packed β-barrel core surrounded by seven to eight α-helices and connecting loops (Fig. 3, see color plate). The core β-barrel is mostly antiparallel, except for strands B2 and B6, which are parallel. To our knowledge, the three-dimensional fold of the herpesvirus proteases has not been observed in any other protein.

The overall fold of the four herpesvirus proteases with known three-dimensional structures is very similar. As expected, the structures of the three α-herpesvirus (VZV, HSV-1, and HSV-2) proteases are nearly identical, and despite limited sequence identity (Table II), the overall fold of the α-herpesvirus proteases is similar to that of the β-herpesvirus CMV protease (Qiu et al., 1997; Hoog et al., 1997). Superposition of the 197 pairs of equivalent α-carbon atoms of HSV-2 and VZV proteases (Fig. 4) gives an rms deviation of 0.9 Å, while superposition of 142 pairs of those of VZV and CMV protease (Fig. 4) gives 1.3 Å. The core β-barrels of the VZV and CMV proteases superimpose even better (Fig. 4), with rms differences between the 52 pairs of α-carbon atoms being only 0.7 Å.

FIGURE 4 Superimposition of α-carbon atoms of VZV and CMV proteases (left), and VZV and HSV-2 proteases (right). The VZV protease structures are in light and thick lines and CMV protease or HSV-2 protease structure is in dark and thin lines.

As anticipated, most of the significant structural differences between CMV and VZV protease are in the loops (Qiu et al., 1997). Moreover, an additional loop containing a small α-helix (referred to as the AA loop) has been observed in the VZV protease structure (Fig. 3) but not in the CMV protease. The corresponding segment in the CMV protease was either totally or partially disordered in apo-CMV protease structures.
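The rms deviations quoted for superimposed α-carbon pairs are root-mean-square distances over equivalent atoms. A minimal sketch of that final step is given below; it assumes the two coordinate sets have already been optimally superimposed (the fitting itself is not shown), and the coordinates are hypothetical values chosen only to exercise the function.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D points (same units as the input)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already-superimposed alpha-carbon positions in angstroms; not real data.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, 0.0, 0.0)]

print(f"rmsd over {len(a)} pairs: {rmsd(a, b):.2f} A")
```

Because the value is an average over the chosen pairs, restricting the comparison to the best-conserved atoms (such as the 52 β-barrel pairs) naturally gives a lower figure than the full set of equivalent atoms.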

B. NOVEL CATALYTIC TRIAD

Unlike all previously known serine proteases, which have a catalytic triad composed of a serine, a histidine, and an aspartic acid (Perona and Craik, 1995), the herpesvirus proteases have a catalytic triad consisting of a serine and two histidines. The catalytic serine of HSV-1 protease was identified by DFP-modification experiments (DiIanni et al., 1994). Site-directed mutagenesis identified the catalytic serine and histidine of cytomegalovirus protease (Welch et al., 1993), but failed to identify the third member of the catalytic triad. It was only after the determination of the crystal structure of the CMV protease (Qiu et al., 1996; Shieh et al., 1996; Tong et al., 1996; Chen et al., 1996) that the novel Ser-His-His catalytic triad for the herpesvirus proteases was identified. In CMV protease, the catalytic triad residues are Ser132, His63, and His157. In this chapter, we use CMV protease numbering (Qiu et al., 1997; Fig. 2) to describe all herpesvirus protease residues. This should eliminate confusion and help to standardize numbering of catalytic triad residues as has been done with the chymotrypsin family of serine proteases.

The active site of CMV protease is situated in the only region of the core β-barrel that is not sheltered by helices and flanking loops (Fig. 3). The active site region is very shallow with the catalytic residues exposed to solvent. This shallowness is not unreasonable considering that the P1-P1' residues (Ala-Ser) are small. Residues of the triad are absolutely conserved among all human herpesvirus proteases (Fig. 2) and superimpose almost perfectly on the Ser195-His57-Asp102 catalytic triad of trypsin (Fig. 5; Qiu et al., 1996; Qiu et al., 1997; Hoog et al., 1997). When the second histidine (His157) is mutated to an alanine, the CMV protease is nearly inactive (Welch et al., 1993) and the HSV-1 protease is completely inactive (Liu and Roizman, 1992). The residual activity in CMV H157A mutant protease is reminiscent of that seen in classical serine proteases when the catalytic aspartate is mutated (Perona and Craik, 1995).

FIGURE 5 Stereoview of the superimposition of catalytic triad residues of CMV protease (light) and trypsin (dark).

C. DIMER INTERFACE

As indicated above, it has been shown that CMV and HSV-1 proteases are active as homodimers (Margosiak et al., 1996; Darke et al., 1996; Schmidt and Darke, 1997). A dimer interface related by a twofold crystallographic axis was first identified in the structure of CMV protease (Qiu et al., 1996; Fig. 6). The most distinct structural element in the CMV dimer interface is helix A6, with the two A6 helices almost parallel and A6 of each monomer interacting with helices A1, A2, A3, and A6 of the other (Qiu et al., 1996). Subsequently, similar dimer interfaces have also been found in the structures of α-herpesvirus proteases (Qiu et al., 1997; Hoog et al., 1997; Fig. 6). The interface area between the two monomers of these proteases is about 1300 Å². This is in general agreement with the reported (Margosiak et al., 1996; Darke et al., 1996; Schmidt and Darke, 1997) micromolar dissociation constants of herpesvirus proteases and suggests that the crystallographically observed dimer is indeed the biologically active dimer.

Although the herpesvirus proteases are active as homodimers, each monomer has a well-defined active site containing all the residues necessary for catalytic activity. The two active sites are on opposite sides of the dimer (Fig. 6, see color plate). Since the dimer interface is distal to the catalytic triad (Fig. 6), dimerization can influence enzymatic activity only indirectly. We had speculated (Qiu et al., 1996) that this occurs by stabilization of the conformation of helix A6. This helix is part of a shallow groove running across the catalytic site. One side of the groove is relatively wide and deep. It is defined by the end of helix A6, the end of strand B6, His63, and the highly conserved Gly164-Arg165-Arg166 segment. This side of the groove has been proposed (Qiu et al., 1997) to be the S' subsite. In the absence of dimer formation, helix A6, the core of the dimer interface, could move toward the active site to block substrate access, thus rendering the enzyme inactive or much less active.
Although dimer formation appears to be a common feature for all herpesvirus proteases, there are notable differences between the dimer interfaces of α- and β-herpesvirus proteases. The two A6 helices of the dimer are almost parallel in the CMV protease structure but show an approximately 30° twist in the VZV and HSV protease structures (Qiu et al., 1997; Hoog et al., 1997; Fig. 6). Helix A2 adopts a very different conformation in VZV protease compared with that in CMV protease. It is interesting to note that the amino acid residues of helices at the dimer interface are less conserved than those of the β-strands, with the least conserved being those of helix A2 (Fig. 2). These differences further support the notion that the dimer interfaces contribute to catalysis only indirectly.
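To see what micromolar dissociation constants imply for the population of active dimer, the following sketch solves a simple monomer-dimer equilibrium. It assumes Kd = [M]²/[D] and a total protomer concentration Ct = [M] + 2[D]; the 1 μM enzyme concentration is a hypothetical assay condition, not a value from the studies cited, and only the two CMV Kd values are taken from the text.

```python
import math


def dimer_fraction(total_um, kd_um):
    """Fraction of protomers present as dimer for 2M <-> D with Kd = [M]^2/[D].

    Mass balance: total = [M] + 2[D]  ->  2[M]^2/Kd + [M] - total = 0.
    """
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0) / 4.0
    return 1.0 - monomer / total_um


ENZYME_UM = 1.0  # assumption: illustrative total enzyme concentration

# CMV protease Kd values quoted in the text for 10% and 20% glycerol
for label, kd in (("10% glycerol", 6.6), ("20% glycerol", 0.55)):
    print(f"{label}: Kd {kd} uM -> {dimer_fraction(ENZYME_UM, kd):.0%} dimer")
```

Under these assumptions a tighter Kd shifts a substantially larger share of the enzyme into the dimeric form at the same total concentration, which is consistent with the view that cosolvents activate the enzyme by promoting dimerization.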

VI. SUBSTRATE SPECIFICITY

As indicated above, human herpes protease precursor molecules undergo autoproteolytic cleavage at two sites (R and M), and they all cleave between an alanine and a serine. Studies with both CMV (Sardana et al., 1994) and HSV-1 (DiIanni et al., 1993; McCann et al., 1994) proteases have identified the P4-P1' residues of the M-site as highly conserved between the two proteases and as having sequence-specific interactions with the S and S' subsites of the protease. However, inspection of all sequences at the R- and M-sites (Table IV) shows a consensus of (V,L,I)-X-A↓S, where X is a hydrophilic residue. This recognition sequence is unique among known serine proteases.

TABLE IV Recognition Sites of Several Human Herpesvirus Proteases

Virus    M-site             R-site
HSV-1    GALVNA*SSAAHVDV    HTYLQA*SEKFKMWG
HSV-2    GALVNA*SSAAHVNV    HTYLQA*SEKFKIWG
VZV      VNAVEA*SSKAPLIQ    HVYLQA*STGYGLAR
CMV      AGVVNA*SCRLATAS    ESYVKA*SVSPEARA
HHV-6    PSILNA*SLAPETVN    CTYIKA*SEPPVEII
HHV-7    PSVVNA*SLTPGQDR    STYIKA*SENLTANN
EBV      KKLVQA*SASGVAQS    ESYLKA*SDAPDLQK
HHV-8    SNRLEA*SSRSSPKS    PVYLKA*SQFPAGIQ
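The consensus (V,L,I)-X-A followed by Ser can be checked mechanically against the cleavage sites. The sketch below uses only the P6-P1' window of each site in Table IV, with `*` marking the scissile bond. The set of residues treated as "hydrophilic" is an assumption made for illustration, since the chapter does not enumerate them.

```python
import re

# P6-P1' windows from Table IV; '*' marks the scissile bond.
SITES = {
    ("HSV-1", "M"): "GALVNA*S", ("HSV-1", "R"): "HTYLQA*S",
    ("HSV-2", "M"): "GALVNA*S", ("HSV-2", "R"): "HTYLQA*S",
    ("VZV", "M"): "VNAVEA*S", ("VZV", "R"): "HVYLQA*S",
    ("CMV", "M"): "AGVVNA*S", ("CMV", "R"): "ESYVKA*S",
    ("HHV-6", "M"): "PSILNA*S", ("HHV-6", "R"): "CTYIKA*S",
    ("HHV-7", "M"): "PSVVNA*S", ("HHV-7", "R"): "STYIKA*S",
    ("EBV", "M"): "KKLVQA*S", ("EBV", "R"): "ESYLKA*S",
    ("HHV-8", "M"): "SNRLEA*S", ("HHV-8", "R"): "PVYLKA*S",
}

# Assumption: residues counted as hydrophilic for position X (P2).
HYDROPHILIC = "DEKRNQSTH"

# P3 = V/L/I, P2 = hydrophilic, P1 = Ala, then the cut, then P1' = Ser.
CONSENSUS = re.compile(rf"[VLI][{HYDROPHILIC}]A\*S")


def matches_consensus(site):
    """True if the site contains the (V,L,I)-X-A*S motif."""
    return CONSENSUS.search(site) is not None


for (virus, site_name), seq in SITES.items():
    print(f"{virus:6s} {site_name}  {seq}  {matches_consensus(seq)}")
```

Every window listed satisfies the pattern, with Asn, Gln, Glu, or Lys at the X position.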


Despite sharing the same core M-site sequence (VNA↓S), HSV-1 protease does not cleave at the CMV protease M-site; however, CMV protease does cleave at the corresponding HSV-1 site (Welch et al., 1995). Likewise, the EBV protease can process both EBV and CMV assembly proteins, but CMV protease cannot process the EBV assembly protein (Donaghy and Jupp, 1995). The high selectivity of the HSV-1 protease is consistent with studies using peptide substrates. The smallest peptide mimic of the CMV protease M-site that is cleaved by that protease is P4-P4' (Sardana et al., 1994), whereas 13 residues from P5-P8' are required for cleavage by HSV-1 protease (DiIanni et al., 1993). This is surprising given the high sequence homology of residues lining the active site cavity of the two enzymes (Hoog et al., 1997). Therefore, it was suggested that HSV-1 protease has a more extended substrate binding pocket and that differences in substrate specificity between the two enzymes result from differences in loop conformations around the active site cavity (Hoog et al., 1997). These loops show low sequence homology and are of differing lengths.
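The P and P' labels used here count residues outward from the scissile bond (P1 and P1' flank the cut). The sketch below labels positions for a cleavage site and extracts a window such as P4-P4' or P5-P8'. The helper names are hypothetical, and the sequences are the M-sites listed in Table IV, with `*` marking the cut and the HSV-1 entry written as one continuous sequence.

```python
def label_positions(site, cut="*"):
    """Map labels P1, P2, ... and P1', P2', ... to residues around the cut marker."""
    left, right = site.split(cut)
    labels = {}
    for i, residue in enumerate(reversed(left), start=1):
        labels[f"P{i}"] = residue          # non-prime side, counting back from the cut
    for i, residue in enumerate(right, start=1):
        labels[f"P{i}'"] = residue         # prime side, counting forward from the cut
    return labels


def window(site, n_nonprime, n_prime, cut="*"):
    """Return the Pn ... Pm' stretch of a site, keeping the cut marker."""
    left, right = site.split(cut)
    return left[-n_nonprime:] + cut + right[:n_prime]


cmv_m = label_positions("AGVVNA*SCRLATAS")
print("CMV M-site P1/P1':", cmv_m["P1"], cmv_m["P1'"])
print("CMV P4-P4':", window("AGVVNA*SCRLATAS", 4, 4))

hsv1_span = window("GALVNA*SSAAHVDV", 5, 8)
print("HSV-1 P5-P8':", hsv1_span, "residues:", len(hsv1_span) - 1)
```

The P5-P8' window contains 5 + 8 = 13 residues, matching the minimum substrate length reported for HSV-1 protease, against 8 residues for the P4-P4' window of CMV protease.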

VII. MECHANISM OF ACTION

A. CATALYTIC MECHANISM

The herpesvirus protease structures suggest a novel Ser-His-His catalytic triad, while all other known serine proteases carry a Ser-His-Asp triad. There are two common models for the mechanism of serine proteases (Perona and Craik, 1995). In the "two-proton transfer" model, the negatively charged aspartic acid accepts a second proton to become uncharged during the transition state. In the second model, the most important role for the Asp is the ground-state stabilization of the required tautomer and rotamer of the catalytic histidine. In the crystal structures of herpesvirus protease, His63 forms hydrogen bonds with both Ser132 and His157 (Fig. 5). At the optimal pH (7 to 8) of these enzymes and with the two histidines mostly exposed to solvent, it is most probable that they are neutral. Thus, it is unlikely that in the transition state His157 would accept the complete transfer of the proton it shares with His63, unless it could transfer a proton to a negatively charged residue. On the other hand, His157 can assume the role of properly orienting His63 for catalysis. Therefore, the existence of an active Ser-His-His triad seems to support the second model.

An aspartic acid (Asp65) was found in the CMV protease structure near His157, suggesting the possibility of a catalytic tetrad composed of Ser132, His63, His157, and Asp65 (Qiu et al., 1996). The likelihood of a catalytic tetrad was, however, diminished because Asp65 of the CMV protease is not conserved among all human herpes proteases; for example, it is a lysine in the VZV protease and an alanine in the HSV-1 and HSV-2 proteases (Fig. 2). Moreover, the activity of the D65A mutant CMV protease is reduced by only 35% (Liang et al., unpublished data).

Another important element of serine protease catalysis is the existence of an oxyanion hole to stabilize the negatively charged oxygen of the substrate in the transition state. Overlays of the catalytic triad of any of the herpesvirus protease structures with that of trypsin suggested that the highly conserved Arg165 and Arg166 are involved in stabilization of the oxyanion intermediate. Such overlays resulted in superimposition of the Arg165 backbone atoms with those of Gly193 of trypsin. The latter is known to stabilize the oxyanion intermediate through a hydrogen bond with its backbone NH. The Ser195 of trypsin is also known to stabilize the enzyme active site oxyanion intermediate through a hydrogen bond with its backbone NH. The equivalent residue in the herpes proteases is absent. Instead, a water molecule (Fig. 7) held by the side chain of Arg166 in the viral proteases was proposed to form a hydrogen bond with the oxyanion (Qiu et al., 1996, 1997; Hoog et al., 1997). The roles of Arg165 and Arg166 in catalysis are further supported by the fact that both residues are absolutely conserved among all herpes proteases (Fig. 2) and the fact that the CMV R165A mutant protease still has 30% of the wild-type activity while the R166A mutant is about four orders of magnitude less active (Liang et al., 1998).
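The argument that solvent-exposed histidines are mostly neutral at pH 7 to 8 can be illustrated with the Henderson-Hasselbalch relation. The sketch below assumes an imidazole pKa of 6.5, a textbook-style value for a solvent-exposed histidine; it is not a measured pKa for His63 or His157, which could be shifted by the protein environment.

```python
def neutral_fraction(ph, pka):
    """Fraction of an ionizable base (here imidazole) in its neutral, deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumption: generic pKa for a solvent-exposed histidine side chain.
ASSUMED_HIS_PKA = 6.5

for ph in (7.0, 7.5, 8.0):
    frac = neutral_fraction(ph, ASSUMED_HIS_PKA)
    print(f"pH {ph:.1f}: {frac:.0%} of histidine side chains neutral")
```

With this assumed pKa, roughly three quarters of the side chains are neutral at pH 7 and well over 90% at pH 8, which is the sense in which the text treats both histidines as neutral.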

B. MODE OF PROTEOLYTIC PROCESSING

There is no direct experimental evidence to support dimer formation of full-length herpesvirus proteases. However, the extent and the intricate nature of the dimer interface of the catalytic domain suggest that the full-length proteases will also form dimers. Although greater enzymatic activity of the catalytic domain is attained with the influence of cosolvents that are thought to enhance dimer formation, the full-length protease is quite active in the absence of cosolvents. This implies that dimer formation may be further stabilized by the assembly protein or the C-terminal portion of the full-length protease.

Little is known about autoprocessing of the herpes proteases. Inspection of the crystal structure shows the active site and the C-terminus are on opposite sides within a protease monomer (Fig. 6). Within a dimer, the C-terminus of one monomer is on the same side as the active site of the other monomer and they are connected by a well-defined groove (Fig. 6). However, not only are they 29 Å apart, but also considerable conformational change must occur in order for the C-terminus to position itself properly and assume the correct orientation in the active site. Thus, the structure suggests that the protease is unlikely to act in cis, at least at the R-site.


FIGURE 7 The active site of HSV-2 protease with di-isopropyl phosphate (DIP) bound. Key hydrogen bonds are connected by dashed lines. The oxyanion hole is predicted to be between Arg165 N and Wat2.

VIII. INHIBITORS

Knowledge that the protease is essential for viral replication (Gao et al., 1994) has stimulated research by a number of pharmaceutical companies to identify inhibitors that can be used as drugs to combat diseases caused by herpesviruses, particularly CMV. As with most drug discovery programs, the goal of these programs is to design novel, potent, low-molecular-weight molecules devoid of peptide character. Although the bulk of the work to develop inhibitors of herpesvirus proteases has yet to be made public, some of it has been recently reported (see below). Most of the inhibitors reported to date have been designed prior to knowledge of the three-dimensional structures of any of the herpesvirus proteases. They have been either derived from a substrate or designed based on molecules that are previously known to be classical inhibitors of serine proteases and act by covalently and reversibly binding to the active site serine hydroxyl. Given the shallowness of the active site cavity of the herpesvirus proteases, such molecules may offer an advantage over those that do not bind covalently.

A. PEPTIDE INHIBITORS

Studies to identify peptide inhibitors were initiated to define the minimal element in the substrate that can act as a competitive inhibitor, i.e., the smallest peptide that binds but is not processed. This peptide inhibitor can then be used as a core structure against which nonpeptide inhibitors can be designed. Using peptides encompassing the sequence of the natural M-site substrate of CMV protease (GVVNA↓SCRLA), LaFemina et al. (1996) identified VVNA (P4-P1 of the substrate) as such a minimal element. This peptide was shown to have a Ki of 1.36 mM against CMV protease. They also reported that substitution of the P1' serine of an M-site P6-P5' peptide by an alanine improved the Ki by about threefold over the unaltered peptide and gave them their most potent peptide inhibitor, having a Ki of 72 μM.
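For a purely competitive inhibitor, the practical meaning of a Ki can be seen from the ratio of inhibited to uninhibited Michaelis-Menten rates. The sketch below compares the two Ki values quoted above (1.36 mM for VVNA and 72 μM for the P1' alanine peptide); the substrate concentration, Km, and inhibitor concentration are hypothetical assay conditions chosen for illustration.

```python
def fractional_activity(s_um, km_um, i_um, ki_um):
    """v_i / v_0 for a purely competitive inhibitor.

    v_0 = Vmax*S/(Km + S) and v_i = Vmax*S/(Km*(1 + I/Ki) + S),
    so Vmax and S cancel in the ratio.
    """
    return (km_um + s_um) / (km_um * (1.0 + i_um / ki_um) + s_um)


# Assumptions: illustrative assay conditions, not values from the cited studies.
SUBSTRATE_UM = 100.0
KM_UM = 100.0
INHIBITOR_UM = 500.0

for name, ki in (("VVNA (Ki 1.36 mM)", 1360.0), ("P1' Ala peptide (Ki 72 uM)", 72.0)):
    remaining = fractional_activity(SUBSTRATE_UM, KM_UM, INHIBITOR_UM, ki)
    print(f"{name}: {remaining:.0%} activity remaining")
```

Under the same conditions the tighter-binding peptide leaves far less residual activity, which is why even a modest improvement in Ki matters when choosing a core structure for inhibitor design.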

B. AMINOMETHYLENE ISOSTERES (REDUCED PEPTIDE BOND)

Holskin et al. (1995) reported the first peptidomimetic inhibitor designed for CMV protease. Also starting with the M-site of CMV protease, they prepared a reduced peptide bond inhibitor (RGVVNAψ[CH2NH]SSRLA-OH) having an inhibition constant of ~500 μM against CMV protease. Reduced peptide bond inhibitors are secondary amines in which the carbonyl group of the scissile peptide bond (-CO-NH-) is reduced to yield the methylene-containing group (-CH2-NH-). This peptide, spanning P6 to P5', differs from the amino acid sequence of the M-site at two positions, namely P6 (an arginine for alanine to increase solubility) and P2' (a serine for cysteine to prevent disulfide bond formation).

C. KETONES

A number of micromolar and submicromolar CMV protease inhibitors containing an activated carbonyl moiety (Fig. 8), such as fluoromethyl ketones and α-ketoamides, were recently reported (Bonneau et al., 1997). Molecules of this type are classic serine protease inhibitors that act by reversibly forming covalent hemiketal adducts with the active site serine hydroxyl. These inhibitors were used to study the effect of ligand binding on the intrinsic fluorescence and CD properties of the enzyme and to suggest that inhibition of CMV protease by peptidyl ketones involves a conformational change of the protease (Bonneau et al., 1997).

FIGURE 8 Some of the known inhibitors of human herpesvirus proteases. [Structure drawings not reproduced. Panels: ketones, benzoxazinones, thienoxazinones, spirocyclopropyl oxazolones, imidazolones, fungal metabolite, bripiodionen.]

D. BENZOXAZINONES

Benzoxazinones are a class of heterocyclic molecules (Fig. 8) initially identified as general mechanism-based inhibitors of serine proteases (Teshima et al., 1982) and later developed as specific inhibitors of human leukocyte elastase (Uejima et al., 1993). They inhibit by acylation of the active site serine through their carbonyl group (Radhakrishnan et al., 1987). Recently, Jarvest et al. (1996) reported inhibition of HSV-1 protease by benzoxazinones at micromolar potency. These inhibitors appear to interact specifically with HSV-1 protease, as suggested by SAR trends and stereoselectivity, and were shown to have a wide range of half-lives (1 to 171 h) at pH 7.5 in aqueous solutions. They showed that their most stable compound was one of their most potent, with an IC50 of 5 μM.
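The reported range of aqueous half-lives translates directly into how much intact inhibitor survives over the course of an experiment. The sketch below assumes simple first-order decay, which the chapter does not state explicitly, and uses a hypothetical 24 h incubation to compare the two extremes of the 1 to 171 h range.

```python
import math


def rate_constant(half_life_h):
    """First-order rate constant (h^-1) from a half-life in hours."""
    return math.log(2.0) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """Fraction of compound left after t_h hours of first-order decay."""
    return 0.5 ** (t_h / half_life_h)


INCUBATION_H = 24.0  # assumption: illustrative incubation time

for half_life in (1.0, 171.0):
    k = rate_constant(half_life)
    left = fraction_remaining(INCUBATION_H, half_life)
    print(f"t1/2 {half_life:>5.0f} h: k = {k:.4f} h^-1, {left:.1%} left after {INCUBATION_H:.0f} h")
```

On these assumptions the least stable compound is essentially gone within a day, whereas the most stable one is largely intact, which helps explain why stability and usable potency go together for this class.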

E. THIENOXAZINONES

Jarvest et al. (1997) also reported the design and synthesis of a number of thienoxazinones (Fig. 8) and showed that they are potent, selective, mechanism-based inhibitors of the herpes proteases with good aqueous stability. These compounds were found to be submicromolar inhibitors of HSV-1 and HSV-2 proteases and moderate inhibitors of CMV protease.

F. OXAZOLONES AND IMIDAZOLONES

Targeted screening of compounds that can acylate the active site serine of the herpes proteases identified the spirocyclopropyl oxazolones (Fig. 8) as submicromolar inhibitors of HSV-2 and CMV proteases (Pinto et al., 1996). These compounds were shown to be better inhibitors of herpesvirus proteases than of other enzymes of the chymotrypsin superfamily. To enhance the stability of these compounds, the imidazolones (Fig. 8) were prepared and found to be selective for CMV protease, with little inhibition of HSV-2 protease, elastase, trypsin, and chymotrypsin (Pinto et al., 1996).

G. NATURAL PRODUCT INHIBITORS

Two natural product inhibitors of CMV protease have been identified. A fungal metabolite (Fig. 8) isolated from an unidentified fungus was found to inhibit the enzyme with an IC50 of 9.8 μg/ml (Chu et al., 1996). A second inhibitor, bripiodionen (Fig. 8), was isolated from Streptomyces and shown to have an IC50 of 30 μM (Shu et al., 1997). Furthermore, a cycloartanol sulfate from the green alga Tuemoya sp. was identified as a 4- to 7-μM inhibitor of both VZV and CMV proteases (Patil et al., 1997).

IX. LIGAND BINDING

With the known three-dimensional structures of herpesvirus proteases, it is possible to speculate about the general characteristics of ligand binding. As mentioned above, despite a totally different protein fold of the herpesvirus proteases when compared to the classic serine proteases, residues of the catalytic triad as well as those of the oxyanion hole are quite superimposable. This suggests that the mode of substrate binding of herpesvirus proteases could be similar to that of classic serine proteases.

As indicated above, the active site cavity of the herpes proteases is shallow with the catalytic residues mostly exposed to solvent. The prime side (right in Fig. 9, see color plate) of the groove is relatively wide, suggesting lack of specific recognition of the substrate. It is defined by the end of helix A6, the end of strand B6, His63, and the highly conserved Gly164-Arg165-Arg166 segment. Overlay of the Ser-His catalytic dyad of the VZV protease to that of the human leukocyte elastase from the crystal structure containing the turkey ovomucoid inhibitor shows that this part of the groove is analogous to the S' cavity of elastase, with the P1'-P3' residues of the inhibitor able to fit well in the VZV protease groove. It was speculated that the HSV-1 protease P2'-P8' residues may play a structural role that is more length dependent than sequence dependent (DiIanni et al., 1993), which agrees with the shape of the S' cavity that is large enough to accommodate a folded peptide.

The nonprime side (left in Fig. 9) of the groove is rather narrow, suggesting more substrate specificity than the prime side. The nonprime region is delineated by strand B5, the Gly164-Arg165-Arg166 segment and the beginning of the AA loop. Since strand B5 is almost parallel to this groove, it is possible that the substrate could be inserted into the groove with its main chain in an extended conformation forming an antiparallel β-sheet with strands B5 and B6, a mode that is almost identical to that of classic serine proteases. We had speculated that the S1 site is between Arg166 and Leu32, the S2 site is at Leu133 and at a conserved water molecule bound to Arg166, and the S1' site is near His63 (Hoog et al., 1997; Fig. 7). However, the precise mode of substrate binding has yet to be determined experimentally.

Surface loops are known to be important for the substrate specificity of serine proteases (Perona and Craik, 1997). Protruding from the herpesvirus protease structures are two large surface loops: one contains the AA helix and is called the AA loop, and the other contains the I-site in CMV protease and is called the I-loop. The sequences of these two regions are not very conserved among the herpesvirus proteases, with CMV protease having multiresidue insertions in both regions (Fig. 2). Figure 9 shows an approximate position of the M-site peptide in the VZV protease active site cavity. In this model the AA loop is important for forming the S2-S4 subsites and the I-loop could be important for recognizing substrate residues P4 and further. Therefore, the difference in substrate specificity between α- and β-herpesvirus proteases could be explained by the large differences in these loop regions. It is interesting that an "I-site" deletion mutant of CMV protease was shown to have altered substrate specificity (Welch et al., 1993). Structures of protease-ligand complexes are needed to support this model.

While one can only speculate on the nature of interactions between herpesvirus proteases and the various classes of known inhibitors, the current knowledge of the enzymatic and structural properties is critical for future success in identifying drug candidates. Knowledge of the novel structural framework and active site, as well as the delineation of the substrate binding groove, is particularly important in providing a template for rational approaches to the design of novel, potent inhibitors.

REFERENCES

Babe, L. M., and Craik, C. S. (1997). Viral proteases: Evolution of diverse structural motifs to optimize function. Cell 91, 427-430.
Bonneau, P. R., Grand-Maitre, C., Greenwood, D. J., Lagace, L., LaPlante, S. R., Massariol, M. J., Ogilvie, W. W., O'Meara, J. A., and Kawai, S. H. (1997). Evidence of a conformational change in the human cytomegalovirus protease upon binding of peptidyl-activated carbonyl inhibitors. Biochemistry 36, 12644-12652.
Burck, P. J., Berg, D. H., Luk, T. P., Sassmannshausen, L. M., Wakulchik, M., Smith, D. P., Hsiung, H. M., Becker, G. W., Gibson, W., and Villarreal, E. C. (1994). Human cytomegalovirus maturational proteinase: Expression in Escherichia coli, purification, and enzymatic characterization by using peptide substrate mimics of natural cleavage sites. J. Virol. 68, 2937-2946.
Chen, P., Tsuge, H., Almassy, R. J., Gribskov, C. L., Katoh, S., Vanderpool, D. L., Margosiak, S. A., Pinko, C., Matthews, D. A., and Kan, C.-C. (1996). Structure of the human cytomegalovirus protease catalytic domain reveals a novel serine protease fold and catalytic triad. Cell 86, 835-843.
Darke, P. L., Chen, E., Hall, D. L., Sardana, M. K., Veloski, C. A., LaFemina, R. L., Shafer, J. A., and Kuo, L. C. (1994). Purification of active herpes simplex virus-1 protease expressed in Escherichia coli. J. Biol. Chem. 269, 18708-18711.
Darke, P. L., Cole, J. L., Waxman, L., Hall, D. L., Sardana, M. K., and Kuo, L. C. (1996). Active human cytomegalovirus protease is a dimer. J. Biol. Chem. 271, 7445-7449.
Deckman, I. C., Hagen, M., and McCann, P. J., III (1992). Herpes simplex virus type 1 protease expressed in Escherichia coli exhibits autoprocessing and specific cleavage of the ICP35 assembly protein. J. Virol. 66, 7362-7367.
DiIanni, C. L., Mapelli, C., Drier, D. A., Tsao, J., Natarajan, S., Riexinger, D., Festin, S. M., Bolgar, M., Yamanaka, G., Weinheimer, S. P., Meyers, C. A., Colonno, R. J., and Cordingley, M. G. (1993). In vitro activity of the herpes simplex virus type 1 protease with peptide substrates. J. Biol. Chem. 268, 25449-25454.
DiIanni, C. L., Stevens, J. T., Bolgar, M., O'Boyle, D. R., II, Weinheimer, S. P., and Colonno, R. J. (1994). Identification of the serine residue at the active site of the herpes simplex virus type 1 protease. J. Biol. Chem. 269, 12672-12676.
Donaghy, G., and Jupp, R. (1995). Characterization of the Epstein-Barr virus proteinase and comparison with the human cytomegalovirus proteinase. J. Virol. 69, 1265-1270.
Gao, M., Matusick-Kumar, L., Hurlburt, W., DiTusa, S. F., Newcomb, W. W., Brown, J. C., McCann, P. J., III, Deckman, I., and Colonno, R. J. (1994). The protease of herpes simplex virus type 1 is essential for functional capsid formation and viral growth. J. Virol. 68, 3702-3712.
Gibson, W., and Hall, M. R. T. (1997). Assemblin, an essential herpesvirus proteinase. Drug Des. Discov. 15, 39-47.
Hall, D. L., and Darke, P. L. (1995). Activation of the herpes simplex virus type 1 protease. J. Biol. Chem. 270, 22697-22700.
Holskin, B. P., Bukhtiyarova, M., Dunn, B. M., Baur, P., de Chastonay, J., and Pennington, M. W. (1995). A continuous fluorescence-based assay of human cytomegalovirus protease using a peptide substrate. Anal. Biochem. 227, 148-155.
Holwerda, B. C. (1997). Herpesvirus proteases: Targets for novel antiviral drugs. Antiviral Res. 35, 1-21.
Hoog, S. S., Smith, W. W., Qiu, X., Janson, C. A., Hellmig, B., McQueney, M. S., O'Donnell, K., O'Shannessy, D., DiLella, A. G., Debouck, C., and Abdel-Meguid, S. S. (1997). Active site cavity of herpesvirus proteases revealed by the crystal structure of herpes simplex virus protease/inhibitor complex. Biochemistry 36, 14023-14029.
Jarvest, R. L., Connor, S. C., Gorniak, J. G., Jennings, L. J., Serafinowska, H. T., and West, A. (1997). Potent selective thienoxazinone inhibitors of herpes proteases. Bioorg. Med. Chem. Lett. 7, 1733-1738.
Jarvest, R. L., Parratt, M. J., Debouck, C. M., Gorniak, J. G., Jennings, L. J., Serafinowska, H. T., and Strickler, J. E. (1996). Inhibition of HSV-1 protease by benzoxazinones. Bioorg. Med. Chem. Lett. 6, 2463-2466.
Kraulis, P. J. (1991). MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946-950.
LaFemina, R. L., Bakshi, K., Long, W. J., Pramanik, B., Veloski, C. A., Wolanski, B. S., Marcy, A. I., and Hazuda, D. J. (1996). Characterization of a soluble stable human cytomegalovirus protease and inhibition by M-site peptide mimics. J. Virol. 70, 4819-4824.
Liang, P.-H., Doyle, M. L., Brun, K. A., O'Donnell, K., Green, S. M., Baker, A. E., Feild, J. A., Blackburn, M. N., and Abdel-Meguid, S. S. (1998). Site-directed mutagenesis probing the catalytic role of arginines 165 and 166 of human cytomegalovirus protease. Biochemistry 37, 5923-5929.
Liu, F., and Roizman, B. (1991). The herpes simplex virus 1 gene encoding a protease also contains within its coding domain the gene encoding the more abundant substrate. J. Virol. 65, 5149-5156.
Liu, F., and Roizman, B. (1992). Differentiation of multiple domains in the herpes simplex virus 1 protease encoded by the UL26 gene. Proc. Natl. Acad. Sci. USA 89, 2076-2080.
Margosiak, S. A., Vanderpool, D. L., Sisson, W., Pinko, C., and Kan, C.-C. (1996). Dimerization of the human cytomegalovirus protease: Kinetic and biochemical characterization of the catalytic homodimer. Biochemistry 35, 5300-5307.
McCann, P. J., III, O'Boyle, D. R., II, and Deckman, I. C. (1994). Investigation of the specificity of the herpes simplex virus type 1 protease by point mutagenesis of the autoproteolysis sites. J. Virol. 68, 526-529.
Nicholls, A., and Honig, B. H. (1991). GRASP. J. Comp. Chem. 12, 435-445.
Patil, A., Freyer, A. J., Killmer, L., Breen, A., and Johnson, R. K. (1996). A cycloartanol sulfate from the green alga Tuemoya sp.: An inhibitor of VZV protease. Bioorg. Med. Chem. Lett. 6, 2467-2472.
Perona, J. J., and Craik, C. S. (1995). Structural basis of substrate specificity in the serine proteases. Prot. Sci. 4, 337-360.
Perona, J. J., and Craik, C. S. (1997). Evolutionary divergence of substrate specificity within the chymotrypsin-like serine protease fold. J. Biol. Chem. 272, 29987-29990.
Pinko, C., Margosiak, S. A., Vanderpool, D. L., Gutowski, J. C., Condon, B., and Kan, C.-C. (1995). Single-chain recombinant human cytomegalovirus protease. J. Biol. Chem. 270, 23634-23640.
Pinto, I. L., West, A., Debouck, C. M., DiLella, A. G., Gorniak, J. G., O'Donnell, K. C., O'Shannessy, D. J., Patel, A., and Jarvest, R. L. (1996). Novel, selective mechanism-based inhibitors of the herpes proteases. Bioorg. Med. Chem. Lett. 6, 2467-2472.
Qiu, X., Culp, J. S., DiLella, A. G., Hellmig, B., Hoog, S. S., Janson, C. A., Smith, W. W., and Abdel-Meguid, S. S. (1996). Unique fold and active site in cytomegalovirus protease. Nature 383, 275-279.
Qiu, X., Janson, C. A., Culp, J. S., Richardson, S. B., Debouck, C., Smith, W. W., and Abdel-Meguid, S. S. (1997). Crystal structure of varicella-zoster virus protease. Proc. Natl. Acad. Sci. USA 94, 2874-2879.
Radhakrishnan, R., Presta, L. G., Meyer, E. F., Jr., and Wildonger, R. (1987). Crystal structures of the complex of porcine pancreatic elastase with two valine-derived benzoxazinone inhibitors. J. Mol. Biol. 198, 417-424.
Sardana, V. V., Wolfgang, J. A., Veloski, C. A., Long, W. J., LeGrow, J., Wolanski, B., Emini, E. A., and LaFemina, R. L. (1994). Peptide substrate cleavage specificity of the human cytomegalovirus protease. J. Biol. Chem. 269, 14337-14340.
Schmidt, U., and Darke, P. L. (1997). Dimerization and activation of the herpes simplex virus type 1 protease. J. Biol. Chem. 272, 7732-7735.
Shieh, H. S., Kurumbail, R. G., Stevens, A. M., Stegeman, R. A., Sturman, E. J., Pak, J. Y., Wittwer, A. J., Palmier, M. O., Wiegand, R. C., Holwerda, B. C., and Stallings, W. C. (1996). Three-dimensional structure of human cytomegalovirus protease. Nature 383, 279-282.
Shu, Y. Z., Ye, Q., Kolb, J. M., Huang, S., Veitch, J. A., Lowe, S. E., and Manly, S. P. (1997). Bripiodionen, a new inhibitor of human cytomegalovirus protease from Streptomyces sp. WC76599. J. Nat. Prod. 60, 529-532.
Teshima, T., Griffin, J. C., and Powers, J. C. (1982). A new class of heterocyclic serine protease inhibitors. Inhibition of human leukocyte elastase, porcine pancreatic elastase, cathepsin G, and bovine chymotrypsin A alpha with substituted benzoxazinones, quinazolines, and anthranilates. J. Biol. Chem. 257, 5085-5091.
Tong, L., Qian, C., Massariol, M. J., Bonneau, P. R., Cordingley, M. G., and Lagace, L. (1996). A new serine-protease fold revealed by the crystal structure of human cytomegalovirus protease. Nature 383, 272-275.
Uejima, Y., Kokubo, M., Oshida, J., Kawabata, H., Kato, Y., and Fujii, K. (1993). 5-Methyl-4H-3,1-benzoxazin-4-one derivatives: Specific inhibitors of human leukocyte elastase. J. Pharmacol. Exp. Ther. 265, 516-523.
Welch, A. R., McNally, L. M., Hall, M. R., and Gibson, W. (1993). Herpesvirus proteinase: Site-directed mutagenesis used to study maturational, release, and inactivation cleavage sites of precursor and to identify a possible catalytic site serine and histidine. J. Virol. 67, 7360-7372.
Welch, A. R., Villarreal, E. C., and Gibson, W. (1995). Cytomegalovirus protein substrates are not cleaved by the herpes simplex virus type 1 proteinase. J. Virol. 69, 341-347.
Welch, A. R., Woods, A. S., McNally, L. M., Cotter, R. J., and Gibson, W. (1991). A herpesvirus maturational proteinase, assemblin: Identification of its gene, putative active site domain, and cleavage site. Proc. Natl. Acad. Sci. USA 88, 10792-10796.

The Secreted Proteinases from Candida: Challenges for Structure-Aided Drug Design

KENT STEWART,* ROBERT C. GOLDMAN,† AND CELE ABAD-ZAPATERO‡

*Molecular Modeling Group, †Anti-infective Group, and ‡Laboratory of Protein Crystallography, Abbott Laboratories, Abbott Park, Illinois 60064-3500

I. Introduction
II. Pathogenic Spectrum and Current Therapies
III. Secreted Aspartic Acid Proteases as Virulence Factors
IV. Secreted Aspartic Acid Protease Substrates and Early Inhibitors
V. Structural Characterization
VI. Candida Genomics
VII. Summary and Conclusions
VIII. Methods
References

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

I. INTRODUCTION

Candida albicans is a diploid, dimorphic fungus that exists in two dominant morphological forms, the yeast and the hyphal. As a human pathogen, Candida can cause diseases ranging from mild, transient, and readily curable infections to chronic, severe, and frequently fatal systemic conditions. Over the past decade the number of patients diagnosed with severe, life-threatening infection with C. albicans and other Candida species has increased dramatically in the hospital setting, especially in patients at risk due to underlying immunosuppression as a result of cancer chemotherapy, organ transplantation, and AIDS. At present Candida is the fourth most frequent infectious organism isolated from blood cultures in hospital settings, reflecting the increasing

frequency of severe infection. Prominent members of this genus are Candida albicans, Candida tropicalis, and Candida parapsilosis. Secreted aspartic (acid) proteases (SAPs) from these and other fungal pathogens with unusually broad substrate specificity have been implicated as virulence factors (Cutler, 1991; Douglas, 1988; Ray et al., 1991; Ruchel et al., 1992; White et al., 1995). Although Candida strains were initially believed to express a single SAP, further research has documented the existence of at least seven distinct genes in C. albicans grouped into two subfamilies represented by SAP1-3 and SAP4-6, with SAP7 being the most divergent in sequence (Monod et al., 1994). Characterization of two additional members, SAP8 and SAP9, is still under way (Hube et al., 1997a).

Prior research (Goldman et al., 1995) discussed the choice of the SAPs as antifungal targets, the discovery of the early specific inhibitors using a fluorogenic substrate, and the pharmacokinetics and in vivo activity of the inhibitors. In this chapter, we focus on the most recent data on pathogenicity and differential expression of the various SAPs, and also on the unique structural differences encountered in the proximity of the active site among the different members of SAPs from C. albicans and C. tropicalis. A detailed analysis reveals that although the various secreted enzymes are highly homologous, the residues next to the active site in SAP1-6 of C. albicans show significant variability. This diversity can be initially divided into two groups: SAP1-3 and SAP4-6. However, further analysis indicates that individual subsite specificity also exists within each SAP. A successful antifungal effort directed to the inhibition of the SAPs from Candida should be based on understanding this microdiversity at the different active site pockets, and on combining this information with the complex pattern of expression and regulation of the individual SAPs in the host.

II. PATHOGENIC SPECTRUM AND CURRENT THERAPIES

Although C. albicans can exist as a mostly harmless colonizer of mucosal surfaces, even a slight weakening of the host immune system can lead to disease of varying severity. Colonization of mucosal surfaces can lead to diseases such as thrush, oral and esophageal infection, and vaginitis, many cases of which can be cured with appropriate antifungal therapy (see below) when there are no underlying factors weakening the host immune system. Predisposing factors include cellular immunodeficiency, prolonged neutropenia, diabetes mellitus, use of broad-spectrum antibacterial therapy, and the presence of intravascular catheters. Candida spp. septicemia is frequent in patients with leukemia, organ transplant patients, and others receiving immunosuppressive

therapy. Candida albicans causes much more severe disease upon penetration of mucosal barriers, leading to dissemination within the host and colonization of various organ systems wherein functional damage occurs (for a detailed medical and microbiological review see Odds, 1987). In specific populations, mortality can be as high as 30% or more in spite of aggressive prophylactic and therapeutic intervention with antifungal agents (Lortholary and Dupont, 1997).

Current therapies for the treatment of life-threatening Candida infection are limited, but include the use of the following approved chemotherapeutic agents: amphotericin B and its lipid formulations; azoles (fluconazole and itraconazole); and flucytosine, used much more infrequently and usually in combination with another antifungal drug. Amphotericin B must be administered intravenously, and such use is limited by renal toxicity and other infusion-related toxicities. The lipid formulations of amphotericin B are less toxic and as efficacious as amphotericin B, but require higher doses and are quite costly. Given the limited alternatives currently in use for the treatment of severe Candida infections, the higher-than-acceptable failure rates with subsequent fatality, and the rising incidence of resistance to some agents, newer drug and treatment modalities are actively being sought.

III. SECRETED ASPARTIC ACID PROTEASES AS VIRULENCE FACTORS

Many Candida species produce SAPs and the evidence that they contribute to virulence is substantial (Cutler, 1991; Data, 1994; Douglas, 1988; Odds, 1985; Ray et al., 1991; Ruchel et al., 1992; White et al., 1995). Secreted aspartic acid proteases produced by C. albicans are coded by a family of related aspartic proteinase genes (de Viragh et al., 1993; Hube et al., 1991; Magee et al., 1993; Miyasaki et al., 1994; Monod et al., 1994; Morrow et al., 1992; Mukai et al., 1992; White et al., 1993; Wright et al., 1992), and most of these genes are subsequently expressed (Hube et al., 1994; Hube et al., 1997a; Miyasaki et al., 1994; Morrow et al., 1992; White and Agabian, 1995; White et al., 1993, 1995; Wright et al., 1992). Three subfamilies of SAPs were defined on the basis of sequence similarity: SAP1-3, SAP4-6, and SAP7 (Monod et al., 1994). The SAP8 gene was also identified (Morrison et al., 1993) and expression of the SAP8 and SAP9 genes is being investigated (Sanglard et al., 1997) (M. Monod, personal communication). The most recent data implicating Candida SAPs in disease are outlined below.

New evidence supports the hypothesis that SAPs play a significant role early in the process of Candida dissemination (Fallon et al., 1997). The role of C. albicans SAPs in early disease progression was examined using a neutropenic

murine model of dissemination following intranasal inoculation with C. albicans. A significant dose-dependent protection against a subsequent lethal intranasal dose of C. albicans was observed by pretreatment of neutropenic mice with the aspartic protease inhibitor pepstatin A, and the efficacy was comparable to protection obtained with amphotericin B. The reduced mortality provided by pepstatin A also correlated with a reduction in the numbers of C. albicans recovered in the lungs, liver, and kidneys. Pepstatin A did not provide protection against C. albicans inoculated intravenously into mice. Within the limitations of this experimental system, these data are consistent with Candida SAPs playing a significant role in the early spread of the infection.

Additional studies have implicated Candida SAPs in the invasion process. A possible role for SAPs in the invasion of the intestinal wall following oral-intragastric inoculation of infant mice has been reported (Colina et al., 1996). Digestion of labeled mucin was examined using a plate assay method devised for quantitation of protease and glycosidase activities. Culture filtrates of C. albicans contained proteolytic activity capable of digesting mucin. The activity was inhibited by pepstatin A, thus implicating Candida SAPs in the degradation of gastric mucin. Consequently, Candida SAPs may play a role in dissemination from colonization sites in the gastrointestinal tract. Moreover, induction of C. albicans acid proteinase caused degradation of extracellular matrix proteins produced by a human endothelial cell line, and this degradation was inhibited by pepstatin A (Morschhauser et al., 1997). Thus, a role is also suggested for C. albicans SAPs in the degradation of the subendothelial extracellular matrix, which could facilitate dissemination via the circulatory system.

Candida SAP is known to degrade specifically the heavy chain of IgA and IgG (Ruchel, 1984, 1986). Recent data (Kaminishi et al., 1995) revealed that killing of Staphylococcus aureus by human polymorphonuclear leukocytes was greatly reduced when bacteria were opsonized with human serum pretreated with Candida protease. Degradation of the Fc portion of immunoglobulin G by the action of C. albicans proteinase was observed, indicating that this was the cause of reduced bactericidal activity. Decreased bactericidal activity of human serum against Escherichia coli was also observed, and reduction of serum bactericidal activity was apparently due to proteolysis of complement proteins.

Recent data further support a role for SAP in the pathogenicity of vulvovaginal candidiasis. Passive transfer of vaginal washes from rats recovering from Candida vaginitis was able to enhance clearance in the receiving animals. Likewise, monoclonal antibody to SAP2, intravaginal immunization with SAP2, and use of pepstatin also led to more rapid clearance of Candida (De Bernardis et al., 1997). The SAP was localized to contact points between the fungal cell wall and the vaginal epithelial cell layer by immunoelectron microscopy (Stringaro et al., 1997).

Candida albicans SAP could also contribute to atopic asthma (Akiyama et al., 1994, 1996). Among patients with positive skin response to C. albicans acid protease, IgE antibodies were detected in 37% of the cases. The SAP also induced T-cell proliferation in 71% of patients showing a positive response to crude C. albicans antigen, and high levels of serum IgE correlated with the histamine-release response of peripheral blood leukocytes to protease. Patients with high levels of serum IgE antibodies against the SAP showed positive conjunctival and immediate bronchial responses when challenged with protease. These data suggest that the protease is a significant allergen in mucosal allergy caused by C. albicans.

If the C. albicans SAP gene family evolved to perform various functions during the process of establishing infection, one would assume that the expression of these genes would be regulated in response to the host environment. Secreted aspartic acid protease production is inducible by environmental stimuli, and our own work (Lerner and Goldman, 1993) clearly indicated that it is "protein" rather than small "peptides" that acts as the inducer, where small peptides are defined as those small enough to be transported by C. albicans peptide transport systems. Secreted aspartic acid protease was also induced by bovine serum albumin (BSA) in the absence of protein hydrolysis by SAP, indicating that some form of signal transduction event was occurring. In addition, SAPs are also regulated by the morphological switch pathway (Morrow et al., 1992; White and Agabian, 1995) and serum-induced hyphal formation (Homma et al., 1993; Traub, 1985). The SAP antigen is also present in infected tissue, as previously reviewed (Goldman et al., 1995); however, at the present time, except for vaginal infection, the mechanisms of tissue-specific expression or recognition of the various SAPs are not known.

Most recently, targeted gene disruption in Candida was used to substantiate and further elucidate the roles of SAP expression in virulence (Hube et al., 1997b; Sanglard et al., 1997). Lethality was reduced significantly in both mice and guinea pigs when Candida containing the triple disruption of SAP4-6 were injected intravenously (Sanglard et al., 1997). Slight attenuation was also observed when singly disrupted SAP1, SAP2, or SAP3 strains were examined. These two studies used iv administration of Candida, thus bypassing many of the events occurring during the normal process of colonization, invasion, and dissemination. Nonetheless, one can argue that SAP4-6 seem to be implicated as major elements of Candida virulence. Further studies using SAP gene disruptants are currently in progress (Hube, personal communication) to elucidate the effects of the deficiency of individual proteases on specific stages of the infection process. The outcome of these experiments should clarify the roles of the individual proteases during infection, dissemination, and invasion of host tissues.

IV. SECRETED ASPARTIC ACID PROTEASE SUBSTRATES AND EARLY INHIBITORS

Secreted aspartic acid proteases can cleave a variety of substrates that function as barriers and host defense molecules (Goldman et al., 1995). A preference for cleavage of a His-Thr or Lys-Thr bond by SAP2 was observed, and SAP2 was quite active on the His-Thr bond at physiological pH. Cleavage site specificity of Candida SAPs was also observed using other synthetic substrates (Fusek et al., 1994).

Degradation of cytoskeletal proteins of mammalian cells by SAP2 also occurs (Goldman et al., 1995). The intermediate filament protein vimentin is involved in the crosslinking of filaments (microtubules, microfilaments, and intermediate filaments) to themselves and other cell organelles. The primary cleavage site in vimentin was between Lys436 and Thr437. Cell morphology was altered, and growth was severely reduced when human skin fibroblasts were electroporated in the presence of SAP2. The addition of a potent inhibitor of SAP2 (A-70450, see below) at 20 nM restored normal growth and morphology in SAP-treated cells. These results indicate that host intermediate filament proteins, which form an essential structural scaffold, serve as substrate for Candida SAP2. Identification of additional host substrates may provide important clues to the role of SAPs during colonization and infection.

Given the volume of data implicating Candida SAPs in virulence, several groups have investigated the effects of standard inhibitors of SAPs on virulence. Two earlier studies reported the activity of pepstatin A in vivo in mouse models of intravenous infection (Ruchel et al., 1990; Zotter et al., 1990). Weak effects were observed in both studies. The lack of effects of pepstatin in the Candida mouse model was briefly mentioned in a third report (Edison and Manning-Zweerink, 1988). Pepstatin does not have the optimum toxicity and pharmacokinetic profile, and this may explain the limited degree of activity observed (Ruchel et al., 1990). Other novel inhibitors of SAP were reported (Sato et al., 1994a,b), but testing of in vivo efficacy was not mentioned. Preliminary work suggests that certain synthetic inhibitors of SAPs decrease C. albicans adherence to endothelial cells in vitro (Frey et al., 1990).

The development of a sensitive, rapid assay system for SAP activity based on fluorogenic substrates was critical in the identification of the first subnanomolar inhibitors of SAP2 (Capobianco et al., 1992). It allowed the investigation of the biochemical properties of SAP2, its interaction with putative peptide sequences, and analysis of various kinetic parameters (Goldman et al., 1995). One inhibitor, A-70450 (Fig. 1), inhibited SAP2 with a Ki of 0.17 nM, a significant improvement when compared to a Ki of 2.9 nM for pepstatin (Capobianco et al., 1992). Additional analogs were synthesized with variations at either end of the molecule. One analog, A-79912, retained potent activity

[Figure 1: schematic of the inhibitor A-70450 with its P3b, P3a, P2, P1, P1', and P2' positions labeled, each paired with the enzyme subsite and residues that contact it: S3b subsite (51, 86, 118, 120); S3a subsite (12, 13, 119, 220, 222); S2 subsite (221, 225, 301, 303, 305); S1 subsite (30, 84, 88, 119, 123); S1' subsite (193, 195, 305); S2' subsite (35, 82, 131, 133).]

FIGURE 1 Schematic representation of the A-70450 inhibitor with the different enzyme pockets corresponding to the inhibitor subsites (Table II).

against SAP2 and reduced activity against key host aspartic proteinases. In this compound the C-terminal butyl group of A-70450 is replaced with a 3-morpholinopropyl substituent (compound no. 6 of Fig. 5 in Abad-Zapatero et al., 1996). Yet, no efficacy was observed when Candida cells were injected iv into mice (bypassing the normal infection pathways) and treated with A-70450, A-79912, or pepstatin A.

The lack of in vivo antifungal activity by the SAP2 inhibitors discussed previously should not be used to dismiss Candida SAPs as possible targets for therapeutic intervention. Quite to the contrary, new evidence is accumulating from basic investigations of virulence of Candida. Of special interest is the systematic evaluation of the effects of targeted gene disruption of SAP genes, which should define specific roles for each SAP in the process of Candida pathogenesis. An important tool for targeting inhibition of specific SAPs as virulence factors will be potent inhibitors with specific SAP activity. To date, we know the most about SAP2. Unfortunately, the isoforms SAP1 and SAP3-7 are also involved in the pathogenesis of Candida and their inhibition characteristics are still unknown. We thus undertook a comparative study of the most closely homologous members (SAP1-6) of the SAP gene family by molecular modeling. This analysis has identified significant differences between the various SAP structures and active site residues which are relevant for the future of structure-based drug design.
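To put the inhibition constants quoted in this section in perspective (a Ki of 0.17 nM for A-70450 against 2.9 nM for pepstatin; Capobianco et al., 1992), the following minimal Python sketch converts the ratio of two Ki values into a fold improvement and an approximate difference in binding free energy, using the standard relation ΔΔG = RT ln(Ki,ref / Ki,inh). The function names and the temperature of 298.15 K are assumptions made for illustration; they are not taken from the chapter.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_improvement(ki_reference_nm: float, ki_inhibitor_nm: float) -> float:
    """Ratio of the reference Ki to the inhibitor Ki (same units)."""
    return ki_reference_nm / ki_inhibitor_nm


def binding_energy_gain(ki_reference_nm: float, ki_inhibitor_nm: float,
                        temperature_k: float = 298.15) -> float:
    """Approximate gain in binding free energy (kcal/mol).

    Uses ddG = RT * ln(Ki_ref / Ki_inh); the temperature is an assumed
    value, not one reported in the text.
    """
    return R_KCAL * temperature_k * math.log(ki_reference_nm / ki_inhibitor_nm)


# Ki values quoted in the text: pepstatin 2.9 nM, A-70450 0.17 nM.
print(round(fold_improvement(2.9, 0.17), 1))  # 17.1
print(round(binding_energy_gain(2.9, 0.17), 2))
```

Under these assumptions the roughly 17-fold lower Ki corresponds to a gain of somewhat under 2 kcal/mol in binding free energy.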

V. STRUCTURAL CHARACTERIZATION

Structural studies have focused on SAP2, which is the most abundantly secreted protein in vitro when BSA is used as the sole nitrogen source in fermentations. The gene for SAP2 codes for a 398-residue preproprotein, which is cleaved to a mature single polypeptide chain of 342 residues with a deduced molecular mass of 35,880 Da (Wright et al., 1992). Detailed reports of the three-dimensional structures of SAP2 (Cutfield et al., 1995) and a closely related clinical isolate (Abad-Zapatero et al., 1996) (referred to as SAP2X) complexed with the same potent inhibitor, A-70450 (Ki = 0.17 nM), have been published. In addition, Foundling and co-workers have reported the molecular structure of the secreted aspartic protease from C. tropicalis (SAPT) in complex with an unknown tetrapeptide tentatively identified as Thr-Ile-Thr-Ser (Symersky et al., 1997). A detailed discussion of the variations on the aspartic proteinase fold found in the SAPs from C. albicans and C. tropicalis has been presented elsewhere, together with the structural differences among SAP2, SAP2X, and SAPT. A brief outline of the implications and possible strategies for the structure-based design of antifungal agents has been discussed (Abad-Zapatero et al., 1998).

A. VARIATIONS WITHIN THE ASPARTIC PROTEINASE FOLD

The overall architecture of the SAPs from Candida conforms with the classic aspartic protease fold represented in all members of the group whose three-dimensional structure has been determined (Abad-Zapatero et al., 1990; Dealwis et al., 1994; Hsu et al., 1977; Sielecki et al., 1990; Subramanian, 1978; Subramanian et al., 1977; Tang et al., 1978). Available three-dimensional structures for mammalian and fungal aspartic proteinases as well as some of their complexes have been summarized by Aguilar and coworkers with tabulations of sequence identity and root-mean-square deviations among the different members (Aguilar et al., 1997). Detailed differences between the structure of the SAPs and the canonical pepsin fold have been illustrated elsewhere (Abad-Zapatero et al., 1998; Goldman et al., 1995) and discussed in detail by two groups (Cutfield et al., 1995; Abad-Zapatero et al., 1996). Briefly, SAPs from Candida present: (1) an 8-residue insertion near the first disulfide (Cys47-Cys59, SAP2) that results in a "broad" flap extending toward the active site; (2) a 7-residue deletion near helix hN2 (Ser118-Gln121), which enlarges the S3 pocket; (3) a short polar connection between the two rigid body domains that alters their relative orientation and projects a small "specificity ridge" into the active site; and (4) an ordered 12-residue addition at the carboxy terminus (Fig. 2, see color plate).

B. COMPARISON OF THE DIFFERENT SAP1-6 STRUCTURES

There is an intriguing and potentially significant trend in the total electrostatic charge of the SAP1-6 enzymes. SAP2 and SAP3 were the most negative, with net charges of -21 and -20, respectively. The SAPT from C. tropicalis also has a very large total negative charge (-20), consistent with its higher sequence identity with SAP2 and SAP3 (Table I, see color plate). SAP1 and SAP4-6 were significantly more positive, with net charges of -8, -5, +2, and +2, respectively. Inspection of the electrostatic potential surfaces indicates that the variation in charge was mainly due to changes over the entire surface rather than locally concentrated variation. This overall variation is illustrated in Figs. 3A and 3B (see color plate) for SAP3 and SAP6. We wish to suggest that the net charge of the SAP1-6 enzymes plays an important role in the tissue distribution and mode of action of these enzymes. The increase in positive charge from SAP1 to SAP6 might be related to different optimum activities, different distribution in various host tissues, or to other environmental factors which have been shown to affect expression levels and isoenzyme distribution (White and Agabian, 1995).

While this net charge refers to the entire protein, the variation in charge is also reflected in the active sites. For the purpose of the discussion, the residue numbers are given for SAP2 (Cutfield et al., 1995) as presented in reference (Abad-Zapatero et al., 1996). Because insertions/deletions in the SAP sequences cause different numbers for equivalent positions, residues in other SAP enzymes are listed without numbers to avoid confusion (Table I; see color plate).
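The chapter does not state how the net charges were calculated. As a minimal sketch of one common convention, the Python function below sums formal side-chain charges over a one-letter sequence, counting Asp and Glu as -1 and Lys and Arg as +1, and treating His and the chain termini as neutral. These conventions, the function name, and the toy sequence are assumptions for illustration only; the toy sequence is not a SAP sequence.

```python
# Formal side-chain charges near neutral pH (an assumed convention):
# Asp (D) and Glu (E) count as -1, Lys (K) and Arg (R) as +1.
# His and the chain termini are treated as neutral in this sketch.
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def net_formal_charge(sequence: str) -> int:
    """Sum formal side-chain charges over a one-letter protein sequence."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence.upper())


toy_sequence = "DDEEKRGASTH"  # hypothetical peptide, not a SAP sequence
print(net_formal_charge(toy_sequence))  # -2
```

Applied to the mature sequences of the individual isoforms, a count of this kind gives a quick first comparison of overall charge, although it ignores pH-dependent protonation and the local environment of each residue.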

C. DESCRIPTION OF SAP1-6 ACTIVE SITES: CONSERVED REGIONS

The central portion of each active site of SAP1-6 bears high similarity to other fungal proteases such as rhizopuspepsin and endothiapepsin. Specifically, the Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr signature sequence for aspartyl proteases is intact for each isoform (residues 32-36 and 218-222 in SAP2) (Abad-Zapatero et al., 1996; Cutfield et al., 1995). In addition, the 85-loop, which comprises one of the active site flaps of SAP1-6, has similar counterparts in other fungal proteases. The Tyr residue (Tyr84 in SAP2) provides a boundary between the S1 and S2' binding pockets and is conserved in the Candida proteases. The Tyr-Gly-Asp-Gly-Ser sequence that follows (residues 84-88 in SAP2) shows only minor variation in the SAP1-6 family: Gly85 at the tip of the flap is replaced by Ala in SAP4 and SAP6 (Table II). This provides a slightly expanded hydrophobic region that falls at the interface of the S2 and S1' pockets in these two isoforms. The Asp at position 86 is conserved in SAP1-6 and

TABLE II Summary of Amino Acid Substitutions at the Different Active Site Subsites among the Various SAPs from Candida^a

Subsite  Residue  SAP1  SAP2  SAP3  SAP4  SAP5  SAP6  SAPT  Comment
S1       30       Ile   Ile   Val   Ile   Ile   Ile   Val   Before signature motif
S1       84       Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   85-loop (act. site flap)
S1       88       Ser   Ser   Thr   Ser   Ser   Ser   Thr   Active site flap
S1       119      Ile   Ile   Val   Ala   Ala   Ala   Val   hN2 deletion
S1       123      Ile   Ile   Ile   Ile   Ile   Ile   Ile   After hN2 deletion
S2       221      Thr   Thr   Thr   Thr   Thr   Thr   Thr   Signature motif
S2       225      Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr
S2       301      Ser   Asn   Ser   Ser   Ser   Ser   Asn   Specificity ridge
S2       303      Ala   Ala   Tyr   Asp   Asp   Asp   Ala   Specificity ridge
S2       305      Ile   Ile   Ile   Ile   Ile   Ile   Ile   Specificity ridge
S3a      12       Val   Val   Val   Ile   Ile   Ile   Pro   First β-turn
S3a      13       Ser   Thr   Ser   Thr   Thr   Thr   Ser   First β-turn
S3a      119      Ile   Ile   Val   Ala   Ala   Ala   Val   hN2 deletion
S3a      220      Gly   Gly   Gly   Gly   Gly   Gly   Gly   Signature motif
S3a      222      Thr   Thr   Thr   Thr   Thr   Thr   Thr   Signature motif
S3b      51       Arg   Tyr   -     Trp   Trp   Trp   Tyr   Additional, broad flap
S3b      86       Asp   Asp   Asp   Asp   Asp   Asp   Asp
S3b      118      Ser   Ser   Ser   Ser   Ser   Ser   Ser   Consensus before hN2
S3b      120      Pro   Asp   Asp   His   Arg   His   Asp   hN2 deletion
S4       223      Ile   Ile   Ile   Ile   Ile   Ile   Ile
S4       225      Tyr   Tyr   Tyr   Tyr   Tyr   Tyr   Tyr
S4       295      Gln   Gln   Gln   Glu   Glu   Glu   Tyr
S4       297      Leu   Leu   Leu   Arg   Arg   Arg   Gly
S4       281      Ala   Ala   Ala   Tyr   Tyr   Tyr   Leu
S4       299      Gly   Asp   Gly   Arg   Arg   Arg   Ser
S1'      193      Glu   Glu   Glu   Thr   Lys   Thr   Glu
S1'      195      Arg   Arg   Arg   Ser   Thr   Ser   Arg
S1'      216      Leu   Leu   Leu   Leu   Leu   Leu   Val
S1'      303      Ala   Ala   Tyr   Asp   Asp   Asp   Ala
S1'      305      Ile   Ile   Ile   Ile   Ile   Ile   Ile
S2'      35       Ser   Ser   Ser   Ser   Ser   Ser   Ser
S2'      82       Ile   Ile   Ile   Ile   Ile   Ile   Ile
S2'      131      Asn   Asn   His   Asn   Gly   Asn   Asp
S2'      133      Ala   Ala   Ala   Ala   Ala   Ala   Ala

Comments for the S4, S1', and S2' rows, in the order printed: Domain interface; Domain interface; Domain interface; Possible salt bridge; Signature motif; Specificity ridge; Specificity ridge; Signature motif; Act. site flap.

^a Data summarized from references (Abad-Zapatero et al., 1996, 1998; Cutfield et al., 1995; Symersky et al., 1997).

together with Asp32 and Asp218, provides the high-negative-charge character of the center of the active sites of each of SAP1-6 (Fig. 4, see color plate). Residue 87 is conserved as Gly in SAP1-6, but is Leu in SAPT. The side-chain of this residue is solvent exposed in SAPT and does not comprise an active site residue. The residue Ser88 is replaced by Thr in SAP3 and SAPT. Relative to Ser, the extra methyl of Thr extends into the S1 binding pocket and would serve to slightly decrease the available volume for an inhibitor side-chain in the P1 position. In summary, residues 32-36, 84-88, and 218-222 in SAP1-6 and SAPT represent regions of high sequence homology (Table I). They comprise a conserved core of three regions similar to other aspartyl proteinases and correspond to the sequences containing the two catalytic aspartates and the active site flap.
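The Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr signature described in this section translates directly into a one-letter pattern, D[TS]G[ST][ST]. The short Python sketch below shows one way to scan a sequence for it. The function name and the toy sequence are hypothetical, and the positions returned are simple 1-based offsets into the string supplied, not the SAP2 numbering used in the text.

```python
import re

# Asp-Thr/Ser-Gly-Ser/Thr-Ser/Thr written in one-letter amino acid code.
SIGNATURE = re.compile(r"D[TS]G[ST][ST]")


def find_signature_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each motif found."""
    return [(match.start() + 1, match.group())
            for match in SIGNATURE.finditer(sequence.upper())]


toy_sequence = "AAADTGSSLLLLDSGTTKK"  # hypothetical, not a SAP sequence
print(find_signature_motifs(toy_sequence))  # [(4, 'DTGSS'), (13, 'DSGTT')]
```

A two-domain aspartic proteinase would be expected to give two hits, one for each catalytic aspartate.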

D. DESCRIPTION OF SAP1-6 ACTIVE SITES: VARIABLE REGIONS

In contrast to the above-described core positions showing minor variations within the SAP1-6 family, more significant structural variation is observed at other positions. The discussion below moves clockwise through the active site, covering the S1, S3a, S3b, S4, S2, S1', S3', and finally the S2' subsites (Fig. 1). A tabulation of the relevant residues is given in Table II.

The S1 pocket comprises residues 30, 119, and 123 in addition to several conserved residues listed above. While residue 123 is a conserved Ile for SAP1-6, positions 30 and 119 show more hydrophobic variation, which will slightly alter the size of the S1 pocket. Residue 30 is Ile for SAP1, -2, and -4-6 and Val for SAP3 and SAPT. Residue 119 is Ile for SAP1 and -2, Val for SAP3 and SAPT, and Ala for SAP4-6. In summary, the position of largest variation within the S1 pocket is at residue 119, with medium hydrophobic side chains for SAP1-3 and a smaller Ala side chain for SAP4-6. Thus, one may expect corresponding substrate/inhibitor P1 residues to be larger hydrophobic residues for SAP4-6 than for SAP1-3.

As described previously (Abad-Zapatero et al., 1996; Cutfield et al., 1995), the S3 subsite in SAP2 was found to be rather large and was operationally divided into sections a and b (Abad-Zapatero et al., 1998). This separation is extended now to the models of SAP1 and SAP3-6. The presence of an additional broad flap (residues 47-59, Table I; Fig. 2) is tantamount to a "second" active site loop. This insertion creates a large binding region in the S3 region and the inhibitor A-70450 explores this entire site through a unique ketopiperidine backbone amide bond replacement. The proteases SAP1, SAP2, SAP4-6, and SAPT possess the same loop length for the 50-loop, while SAP3 possesses a loop shorter by 1 residue. Residue 51 in this loop has the potential for contacting both inhibitors and substrates: the Tyr residue at position 51 in SAP2 was

observed to provide a long van der Waals contact with the P3b terminal piperazine group. As aligned, residue 51 corresponds to an Arg in SAP1 and it is conceivable that Tyr51 in SAP2 coincides spatially with Arg in SAP1. However, Arg51 in SAP1 is both preceded and followed by Pro residues, which could impart some special loop conformation not modeled well by the simple loop replacement routine used in our work. As stated, SAP3 possesses a deletion in this region and as modeled does not have a residue that corresponds to Tyr51. Given the uncertainties of the 50-loop structure in both SAP1 and SAP3, experimental structure studies would be required for a complete understanding. Residue 51 is Trp in SAP4-6 and we suggest that the Trp has a similar rotamer preference to Tyr51 in the SAP2 crystal structure. If so, then the S3b subsite will have diminished volume due to the larger aromatic residue in SAP4-6, relative to SAP2 (and SAP1 and -3). Therefore, the S3b region is a site of significant variation within the active site of SAP1-6 due to the different residues at position 51.

Residues 119-121 and 12-13 are near residue 51 and comprise the remainder of the S3a and S3b subsites. At residues 119-121, the Ile-Asp-Gln of SAP2 are replaced by Ile-Pro-Gln in SAP1, Val-Asp-Gln in SAP3 and SAPT, Ala-His-Lys in SAP4 and -6, and Ala-Arg-Lys in SAP5. These three residue changes, together with the variants at residue 51, enlarge the S1 pocket and both shrink and add positive character to the S3b pocket in SAP4-6, relative to SAP1-3 and SAPT. The Lys121 of SAP4-6 is positioned well for electrostatic interaction with the conserved Glu10. Residues 12 and 13 comprise the S3a subsite and show minor variation among the isoforms: Val-Ser in SAP1 and -3, Val-Thr in SAP2, and Ile-Thr in SAP4-6. Since this variation at residues 12 and 13 involves only changes of a methyl group (Ser to Thr and Val to Ile), no large impact upon the S3a subsite volume is observed in the models of SAP1-6. The SAPT possesses a Pro and Ser at positions 12 and 13, respectively. The crystal structure shows that the Pro12 of SAPT overlays well with Val12 of SAP2X, providing a similar hydrophobic surface in the S3a subsite between SAP2X and SAPT. In summary, the S3 subsite of the SAPs may be further divided into two regions: the S3a region is relatively similar among SAP1-6 and SAPT, while the S3b region is more varied, being larger and neutral in SAP1-3 and SAPT and smaller and positively charged in SAP4-6.

The S4 subsite provides a very clear-cut point of variation between the SAP1-3 and SAP4-6 subgroups. Residues Ile223, Tyr225, Gln295, Leu297, and Ala281 comprise this subsite in SAP2. Of these residues, Ile223 and Tyr225 are conserved in SAP1-6 and SAPT. While Gln295, Leu297, and Ala281 are conserved in SAP1-3, these residues are replaced by Glu, Arg, and Tyr, respectively, in SAP4-6. At the far end of the S4 pocket lies residue 299. This residue is Asp299 in SAP2, Gly in SAP1 and -3, and Arg in SAP4-6. The crystal structure of SAPT shows that residues 281, 295, 297, and 299 of SAP2 correspond

to Leu, Tyr, Gly, and Ser in SAPT, respectively, which differ from those of any of SAP1-6. In summary, the S4 subsite will be extremely different between the two subgroups of SAPs, both in shape and in polarity: SAP1-3 will have neutral residues in this pocket, such as Ala at 281, Gln at 295, Leu at 297, and Gly at 299 (an exception is Asp299 in SAP2), while SAP4-6 will have an aromatic residue, Tyr at 281, and charged residues, Glu at 295 and Arg at 297 and 299. Because of the uncertainty in side-chain conformation prediction, particularly for Arg, we cannot predict the exact shape of the S4 subsite, but the residue changes described here strongly suggest that the P4 position of either substrates or inhibitors will show a correspondingly large variation in character for SAP1-6 and SAPT.

The S2 subsite is formed by the side-chains of four residues at positions 225, 301, 303, and 305, the last three forming the "specificity ridge" (Abad-Zapatero et al., 1996). Residues 225 and 305 are conserved as Tyr and Ile, respectively, in SAP1-6 and SAPT. Residue 301 is Ser or Asn in the SAP1-6 family. Residue 303 forms the interface between the S2 and S1' subsites and is discussed further below. This residue is either hydrophobic (Ala in SAP1 and -2, Tyr in SAP3) or polar (Asp in SAP4-6). The steric impact of the large Tyr residue is described below in the S1' subsite description. The negatively charged Asp at residue 303 in SAP4-6 will make the S1' subsite significantly more negative than in SAP1-3. SAPT possesses S2 subsite residues identical to those of SAP2, and comparison of the two crystal structures shows these residues to be identically oriented. In summary, the S2 subsites of SAP1-6 and SAPT appear to be well filled sterically by substituents the size of the P2 nor-Leu of A-70450, but the presence of a polar residue at position 301 and the variation in polarity at residue 303 suggest that both hydrophobic and hydrophilic substituents may be accommodated.

The S1' pocket is composed of the conserved residues Leu216 and Ile305 for SAP1-6 and varied residues at positions 193, 195, and 303. As mentioned above in the S2 subsite description, one large steric difference among the SAP enzymes occurs at position 303, within the specificity ridge (Abad-Zapatero et al., 1996). This residue is Ala in SAP1 and -2, Asp in SAP4-6, and Tyr in SAP3. The Tyr residue in SAP3 may limit access to this pocket to substrates whose P1' residues have small side-chains, such as Val, Ala, and Thr. In addition, the Tyr at position 303 in SAP3 has the potential to contact the 85-loop described above and close access to the active site. This effect may be seen in Fig. 4c, where the protein surface appears to be continuous from the flap to the carboxy domain and partially covers the inhibitor. The Asp303 in SAP4-6 is predicted to be very close to the side-chain of the P1' substituent and should disfavor binding of hydrophobic moieties. Residues Glu193 and Arg195 form a salt bridge in SAP1-3. We have previously commented on the special role that Arg195 may play in the observed backbone orientation of residues 301-305 (Abad-Zapatero et al., 1996). Importantly, we do not know
how removing this Arg195 side-chain will affect the position of residues 301-305 or the orientation of the "mobile subdomain" (Abad-Zapatero et al., 1990; Sali et al., 1992; Sielecki et al., 1990). As modeled here, the Asp residue at 303 in SAP4-6 will likely interact directly with the P1' substituent of substrates and inhibitors. The ionic residues at 193 and 195 are replaced with the small, neutral polar residues Thr and Ser in both SAP4 and SAP6, thus opening up a large volume for extended P1' residues of both substrates and inhibitors. Interestingly, SAP5 diverges from SAP4 and SAP6 in this S1' region in that residues 193 and 195 are Lys and Thr in SAP5, with Lys193 having the potential to form a salt bridge with Asp303. This proposed salt bridge between 193 and 303 in SAP5 would partially block the S1' pocket and decrease its available volume relative to SAP4 and SAP6. In summary, SAP1, SAP2, and SAPT are predicted to have very similar S1' subsites; SAP3 has less volume available than SAP2; and SAP4-6 have polar S1' subsites.

The S2' pocket is composed of the conserved residues Ile82 and Ser35 and two residues from a variable loop. The 131-loop is an irregularly shaped turn that links the β-strand of residues 122-127 (βh1, central strand of the ψ1 loop) and the conserved α-helix of residues 139-148 (helix αh2) (Abad-Zapatero et al., 1990; Sielecki et al., 1990). The length of this loop differs among the Candida proteases, with SAP4-6 and SAPT possessing a loop of 12 residues and SAP1-3 a loop of 11 residues. Fortunately, crystal structures are available for enzymes with each of these loop lengths (SAP2 and SAPT), so suitable protein backbone templates were available for the homology modeling of SAP1 and SAP3-6. The loop in SAPT served as a reasonable template for SAP4-6 in this region, as no clashes occurred during the modeling of the SAP4-6 structures. Importantly, the 1-residue insertion at position 134 in the SAP4-6 loop does not directly contact the S2' pocket and instead projects into solvent, relative to the corresponding residues in SAP1-3. The two residues at positions 131 and 133, which precede the insertion, make up the S2' subsite in both the SAPT and SAP2 crystal structures, and the side-chains overlap well. Therefore, we predict that the protein backbone forming the S2' subsite is conserved among the Candida enzymes, and that only variation in side-chain structure will determine any subsite variation. Position 131 is Asn in SAP1, -2, -4, and -6; His in SAP3; Gly in SAP5; and Asp in SAPT. Residue 133 is Ala in all the Candida enzymes. Thus, the sole residue that generates variation in the S2' subsite is residue 131 and, with the exception of SAP5, this side-chain has polar character (Asn or His). The special case of SAP5 should be mentioned: the Gly residue at position 131 has no side-chain to form a surrounding surface of the S2' subsite, so the S2' subsite in SAP5 will be much larger than in the other Candida enzymes. Should it ever be desirable to discover SAP5-specific inhibitors, this variation in the S2' subsite might be a suitable region to
start such a search. In the crystal structure with A-70450, a butyl group from the inhibitor P2' group fills the S2' subsite, with larger groups likely projecting into solvent. Thus, the S2' subsite in SAP1-4 and -6 would appear to be well filled sterically by groups like the P2' butyl group of A-70450, but the presence of polar side-chains nearby at residue 131 suggests that purely hydrophobic groups may not be optimal for occupying the S2' subsite. The S2' subsite of SAP5 diverges from those of SAP1-4 and -6 and SAPT because of its much larger volume.

In summary, the central portion of the active site of the SAP enzymes appears to be similar among all the fungal proteases. More variation is observed in the residues outlining the boundaries of the subsites. SAP1 and -2 are the most similar pair among the SAP enzymes, with their most significant difference located at position 51 in the S3b subsite. The protease SAP3 is distinguished by Tyr303 within the specificity ridge, which affects the S1' subsite. The subgroup SAP4-6 clearly diverges from the SAP1-3 subgroup. This is most evident in the larger S1 subsites; the Trp at residue 51 in the S3b subsite, which contracts the volume of this subsite; a charge differential in the S3b subsite at residues 120-121; a charge differential in the S4 subsite at residues 295, 297, and 281; and a charge differential in the S1' subsite at residue 303. As illustrated in Figs. 4A-4F, the active sites of SAP4-6 possess significantly more positive character than the active sites of SAP1-3. Overall, SAPT is most similar to SAP2-3 but diverges from all of SAP1-6 in the S4 subsite.
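The residue-by-residue comparison in this section can be collected into a small lookup table. The sketch below is illustrative only: it transcribes a handful of the positions quoted in the text (119, 51, 281, 295, 297, 303, and 131) for SAP1-6 and lists where two isoforms differ. The names `ISOFORMS`, `SUBSITE_RESIDUES`, and `differing_positions` are hypothetical, and the table is not a substitute for Table II.

```python
# Residues at selected active-site positions, transcribed from the text.
# The data layout and function name are hypothetical, for illustration only.
ISOFORMS = ("SAP1", "SAP2", "SAP3", "SAP4", "SAP5", "SAP6")

# position -> (subsite, residue in each isoform, in ISOFORMS order)
# "-" marks the deletion modeled at position 51 in SAP3.
SUBSITE_RESIDUES = {
    119: ("S1",     ("Ile", "Ile", "Val", "Ala", "Ala", "Ala")),
    51:  ("S3b",    ("Arg", "Tyr", "-",   "Trp", "Trp", "Trp")),
    281: ("S4",     ("Ala", "Ala", "Ala", "Tyr", "Tyr", "Tyr")),
    295: ("S4",     ("Gln", "Gln", "Gln", "Glu", "Glu", "Glu")),
    297: ("S4",     ("Leu", "Leu", "Leu", "Arg", "Arg", "Arg")),
    303: ("S2/S1'", ("Ala", "Ala", "Tyr", "Asp", "Asp", "Asp")),
    131: ("S2'",    ("Asn", "Asn", "His", "Asn", "Gly", "Asn")),
}


def differing_positions(isoform_a, isoform_b):
    """Return the sorted positions at which two isoforms carry different residues."""
    i = ISOFORMS.index(isoform_a)
    j = ISOFORMS.index(isoform_b)
    return sorted(
        pos
        for pos, (_subsite, residues) in SUBSITE_RESIDUES.items()
        if residues[i] != residues[j]
    )


if __name__ == "__main__":
    for pair in (("SAP1", "SAP2"), ("SAP2", "SAP3"), ("SAP2", "SAP4"), ("SAP4", "SAP5")):
        print(pair, differing_positions(*pair))
```

Within this reduced set of positions, SAP1 and SAP2 differ only at residue 51 and SAP4 and SAP5 only at residue 131, whereas SAP2 and SAP4 differ at six of the seven positions, which mirrors the subgroup split described above.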

E. STRUCTURE-ACTIVITY RELATIONSHIPS OF INHIBITORS OF THE SECRETED ASPARTIC ACID PROTEASE 2

In spite of the differences between SAP2 and other aspartic proteinases, pepstatin A binds in a canonical mode, although less tightly than to other well-characterized fungal proteinases such as rhizopuspepsin, endothiapepsin, and penicillopepsin (Cutfield et al., 1995). In addition to the structure of SAP2 inhibited with pepstatin A, structures of SAP2 and of a clinical isolate variant of C. albicans SAP2 (SAP2X) have been determined in complex with the same subnanomolar inhibitor, A-70450 (Cutfield et al., 1995; Abad-Zapatero et al., 1996). Both complexes established that the wedge-shaped inhibitor binds in an extended conformation with the broad side occupying the S3 subsite. Details of the interactions of the branched inhibitor with the enlarged S3 subsite have been presented elsewhere (Abad-Zapatero et al., 1996; Cutfield et al., 1995). Structure-activity relationships of some structural analogs of A-70450 as bound to SAP2X have been analyzed previously (Abad-Zapatero et al., 1996) and are briefly discussed here. In view of the conservation of residues
between SAP2 and SAP2X in the proximity of the active site (Abad-Zapatero et al., 1998), the conclusions are applicable to SAP2 and can be used as a framework for further structure-based design.

The compound A-70450 possesses a hydroxyethylene peptide bond isostere as a transition state mimic, with the hydroxyl-bearing carbon in the S-configuration (Fig. 1). The hydroxy group is located between Asp32 and Asp218, equivalent to the pepsin active site residues Asp32 and Asp215. Analogs of A-70450 have been described that identify several structural features required for potency. Changing the configuration at the P2 nor-Leu residue caused a significant drop in potency, whereas changing the configuration of the P3a benzyl led to an analog equipotent to A-70450. These changes in inhibitor potency are consistent with the respective subsites: the S2 site is small and restrictive, and the S3 site is much more open. A reduced-bond analog of A-70450 at the P1'-P2' linkage (C=O replaced by -CH2-) showed an almost 700-fold drop in potency relative to the parent. Most probably, this was caused by the loss of a hydrogen bond between the inhibitor and the amide nitrogen of Gly85 in the protein backbone. The urea carbonyl of A-70450 makes no direct contact with the protein and could be replaced by a sulfonyl linkage with only a twofold loss in potency. Finally, the P2' butyl group of the original A-70450 may be replaced with 3-morpholinopropyl, dimethylaminoethyl, or dimethylaminopropyl groups with only a three- to fourfold drop in potency. The enzyme selectivity of these last three compounds is particularly interesting, as they show reduced activity against renin and sharply decreased affinity for cathepsin D (Abad-Zapatero et al., 1996). In addition, the conformation of the terminal methylpiperazine ring was found to be different in the two crystal forms. Possible reasons for the alternate conformation have been suggested.
These are the intrinsic flexibility of the methylpiperazine ring and the different pH of the crystals, which could alter the water structure in the proximity of residues Thr50 and Gln121 (Abad-Zapatero et al., 1998). The dual conformation of A-70450 in the crystals confirms the existence of a very large S3 subsite in the SAP2 isoenzyme, which could be modulated by the individual differences among SAPs discussed previously.

Although active in vitro at the nanomolar level against SAP2, neither A-70450 nor A-77912 performed well when compared with approved chemotherapeutic agents such as amphotericin B and fluconazole (Goldman et al., 1995). Two more inhibitors of the SAP family have been isolated from the fermentation broth of Streptomyces sp. (Sato et al., 1994a,b) with IC50 values in the low millimolar range; their future performance is uncertain. Yet the detailed analysis of the different subsites for the various SAPs (Section V,D) suggests that the study of the specific interactions of SAP2 with A-70450 should provide a valuable structural framework for future structure-based drug design against any of the SAP variants. In particular, the suboptimal polarity match of A-70450 at its P2 and P2' sites is of note. In each of the S2 and S2' subsites, a hydrophobic butyl group from the inhibitor terminates near a region of polarity within the SAP enzymes, including SAP2, the enzyme used for cocrystallization. Since A-70450 has subnanomolar potency for SAP2 inhibition, this polarity mismatch is not completely detrimental to affinity; however, it is possibly not optimal for affinity either. It is straightforward to imagine analogs of A-70450 that might exploit these polar interactions within the S2 and S2' subsites. However, further experimentation will be required to learn whether these additional polar interactions will lead to more potent inhibitors or to effective anti-Candida agents.
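The fold changes in potency quoted in this section can be put on an energy scale. The sketch below is a rough illustration, assuming that a fold change in measured potency approximates the ratio of dissociation constants, so that the binding free-energy penalty is RT ln(fold). The temperature of 298.15 K and the function name `ddg_from_fold_change` are assumptions made here; the source does not state assay conditions.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold, temperature_k=298.15):
    """Free-energy penalty (kcal/mol) implied by a fold loss in potency.

    Assumes the potency ratio approximates a ratio of dissociation
    constants, so that ddG = R * T * ln(fold).
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold)


if __name__ == "__main__":
    # Fold changes quoted in the text for analogs of A-70450.
    examples = [
        ("reduced P1'-P2' bond", 700),
        ("sulfonyl for urea carbonyl", 2),
        ("P2' butyl replacements (upper bound)", 4),
    ]
    for label, fold in examples:
        print(f"{label}: {ddg_from_fold_change(fold):.2f} kcal/mol")
```

Under these assumptions the 700-fold loss corresponds to roughly 3.9 kcal/mol, while the two- to fourfold losses correspond to well under 1 kcal/mol, which is in keeping with the interpretation that only the reduced-bond analog loses a specific hydrogen bond.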

VI. CANDIDA GENOMICS

An additional resource in the search for anti-Candida agents should be mentioned. The complete genome of C. albicans is currently being sequenced. Information about this sequencing effort may be found at the World Wide Web (WWW) address http://alces.med.umn.edu/Candida.html at the University of Minnesota. The goal of the C. albicans sequencing project is to provide 1.5x mean sequence coverage of the haploid genome of C. albicans strain SC5314. This effort should provide at least partial sequence for most genes in C. albicans and give a preliminary identification of function based on similarity to other species, especially S. cerevisiae. This Internet site also provides additional information on Candida biology and is a useful resource for Candida research. A search of the genome database for the keyword "protease" or "proteinase" yields 15 different proteases identified in C. albicans, including proteases that are not secreted and not part of the SAP family. A special section of this WWW site is devoted to the SAP gene family: http://alces.med.umn.edu/candida/proteinase.html. Another Internet resource valuable in Candida research is the mailing list called "candidanews" originating from the University of Otago, New Zealand. Information on subscription to this listserver may be found at http://alces.med.umn.edu/candida/candidannews.html.
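As a rough check on what the 1.5x coverage target mentioned above implies, the idealized Poisson (Lander-Waterman) model gives the expected fraction of bases sampled at least once as 1 - exp(-c) for mean coverage c. The sketch below applies that formula; the model (uniformly random reads, no cloning bias) and the function name `expected_fraction_covered` are assumptions made here, not statements from the sequencing project.

```python
import math


def expected_fraction_covered(mean_coverage):
    """Expected fraction of bases covered at least once under a Poisson model.

    Assumes reads land uniformly at random, so the number of reads covering
    a base is Poisson-distributed with the given mean.
    """
    if mean_coverage < 0:
        raise ValueError("mean coverage must be non-negative")
    return 1.0 - math.exp(-mean_coverage)


if __name__ == "__main__":
    for c in (0.5, 1.0, 1.5, 3.0):
        print(f"{c:.1f}x coverage: {expected_fraction_covered(c):.1%} of bases sampled")
```

Under this idealized model, 1.5x mean coverage samples a little over three quarters of the bases, which is consistent with the expectation of at least partial sequence for most genes.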

VII. SUMMARY AND CONCLUSIONS

Important experimental results were gathered during the past few years with respect to Candida SAPs. Among the more noteworthy are: (1) the identification of multigene families (SAP1-3 and SAP4-6) for SAPs in Candida species and the elucidation of some aspects of differential transcriptional control; (2) induction of SAPs by intact proteins, potentially via contact sensing of substratum surfaces; (3) demonstration of SAP activity under physiological conditions of pH and ionic strength; (4) detection of circulating SAP antigen during
infection in humans; (5) discovery of novel inhibitors of SAPs; and (6) perhaps most importantly for structure-based drug design and discovery efforts, the determination of the crystal structure of an SAP family member bound to a potent inhibitor. Yet the poor efficacy of the leading SAP2 inhibitors A-70450 and A-79912 in murine models of Candida infection was both discouraging and unexpected. Several reasons have been invoked elsewhere (Abad-Zapatero et al., 1998) to rationalize these negative results. Some of them can be mentioned again: inability to penetrate cell barriers, insufficient potency, and, likely, the inability to inhibit all members of the SAP family, or at least those most important for virulence. We had ruled out lack of bioavailability and lack of activity at physiological pH as explanations early in our studies. However, we are now faced with the possibility that, in addition to the various isoenzymes present, we have to consider differential transcriptional control.

The structural data and the modeling results reviewed here suggest that the active sites of the various SAPs from C. albicans are indeed sufficiently different to allow for different specificities at the different protein subsites. In particular, the dichotomy between the active sites of SAP1-3 and SAP4-6 has been clearly documented. Also, the microdiversity among the individual amino acid residues bordering the substrate in the individual SAPs has been discussed and opens the way toward targeting either the individual SAPs or a certain subset of them. Chemical synthesis strategies to exploit the differences at the S3 and S4 subsites were outlined previously (Abad-Zapatero et al., 1998), including the possibility of "freezing" into one compound the two inhibitor conformations found in the crystalline complexes of SAP2 (Abad-Zapatero et al., 1996; Cutfield et al., 1995).
The large differences in overall charge among the different members of the family (-21 for SAP2 to +2 for SAP5-6) open the possibility that the different isozymes target different tissues during the course of the infection, providing a different kind of specificity and targeting. In our view, the challenge for a successful antifungal program resides in combining the biological and genetic data on the differential expression, regulation, and virulence of the SAPs from Candida with the structural data on the microheterogeneity of the various enzyme subsites. No matter how long and difficult the road, the discovery of effective prophylactic and therapeutic antifungal agents targeted at the fungal SAPs will be our best course of action.

VIII. METHODS

Three-dimensional models of SAP1 and SAP3-6 were created by appropriate residue replacement and loop-searching techniques using the SAP2X structure as a template. Sequences of the SAP enzymes were taken from the GenBank database using the following entries: SAP1, X56867; SAP3, L22358; SAP4, L25388; SAP5, Z30191; SAP6, Z30192. The SAP2X sequence used in this work is a variant of SAP2 with 96% identity, GenBank entry M83663. The overall percentage identity between the SAP enzymes ranges from 40 to 75%, well above the problematic 20-30% range, so the SAP2X structure is believed to be a good template for the homology modeling of SAP enzymes. The SAP2X structure was used as the template for the entire protein structure of SAP1 and SAP3-6 except as noted. Protein Data Bank entry 1EAG for SAP2 was used as the template for the 245-loop, and the C. tropicalis SAPT protein was used as the template for the 131-loop of SAP4-6. Loop searches were carried out for residues 50-52 of SAP3 and the 210-loop of SAP1 and -3-6. After manual adjustment of side-chains that were in inappropriate rotamers or that clashed with other protein residues, energy minimization of the protein for 200 cycles with a gradually decreasing restraint on the protein atoms produced a final structure with no gross van der Waals clashes. These final models were used in the subsequent DelPhi analysis of the charge distribution and are available to academic investigators upon request from the authors. DelPhi calculations were carried out with 1 Å spacing using dielectric constants of 2 and 80 for the protein and bulk solvent, respectively.
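The percentage identity figures quoted above depend on how identity is counted. The sketch below shows one common convention, assuming the two sequences are already aligned and that columns containing a gap in either sequence are excluded from the denominator. The function name `percent_identity` and the toy sequences are hypothetical; they are not the SAP sequences and this is not necessarily the procedure the authors used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Columns with a gap in either sequence are skipped; other conventions
    (for example, dividing by the full alignment length) give lower values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    seq_a = "VIDQA-T"
    seq_b = "VADQAKS"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

For real sequences the alignment itself would come from a dedicated alignment program; only the counting step is sketched here.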

ACKNOWLEDGMENTS

We appreciate the critical reading of the manuscript by Drs. J. Greer and C. Hutchins and the support of Drs. D. Norbeck and A. Rosenthal within the Pharmaceutical Products Division at Abbott Laboratories.

REFERENCES

Abad-Zapatero, C., Goldman, R. C., Muchmore, S. W., Hutchins, C., Oie, T., Stewart, K., Cutfield, S. M., Cutfield, J. F., Foundling, S. I., and Ray, T. L. (1998). In "Advances in Experimental Medicine and Biology" (M. G. N. James, Ed.). Aspartic Proteinases, pp. 297-313. Plenum, New York.
Abad-Zapatero, C., Goldman, R., Muchmore, S. W., Hutchins, C., Stewart, K., Navaza, J., Payne, C. D., and Ray, T. L. (1996). Structure of a secreted aspartic protease from C. albicans complexed with a potent inhibitor: Implications for the design of antifungal agents. Prot. Sci. 5, 640-652.
Abad-Zapatero, C., Rydel, T. J., and Erickson, J. W. (1990). Revised 2.3 Å structure of porcine pepsin: Evidence for a flexible subdomain. Prot. Struct. Funct. Genet. 8, 62-81.
Aguilar, C. F., Cronin, N. B., Badasso, M., Dreyer, T., Newman, M. P., Cooper, J. B., Hoover, D. J., Wood, S. P., Johnson, M. S., and Blundell, T. L. (1997). The three-dimensional structure at 2.4 Å resolution of glycosylated proteinase A from the lysosome-like vacuole of Saccharomyces cerevisiae. J. Mol. Biol. 267, 899-915.
Akiyama, K., Shida, T., Yasueda, H., Mita, H., Yamamoto, T., and Yamaguchi, H. (1994). Atopic asthma caused by Candida albicans acid protease: Case reports. Allergy 49, 778-781.
Akiyama, K., Shida, T., Yasueda, H., Mita, H., Yanagihara, Y., Hasegawa, M., Maeda, Y., Yamamoto, T., Takesako, K., and Yamaguchi, H. (1996). Allergenicity of acid protease secreted by Candida albicans. Allergy 51, 887-892.
Capobianco, J. O., Lerner, C. G., and Goldman, R. C. (1992). Application of a fluorogenic substrate in the assay of proteolytic activity and in the discovery of a potent inhibitor of Candida albicans aspartic proteinase. Anal. Biochem. 204, 96-102.
Colina, A. R., Aumont, F., Belhumeur, P., and de Repentigny, L. (1996). Development of a method to detect secretory mucinolytic activity from Candida albicans. J. Med. Vet. Mycol. 34, 401-406.
Cutfield, S. M., Dobson, E. J., Anderson, B. F., Moody, P. C. E., Marshall, C. J., Sullivan, P. A., and Cutfield, J. F. (1995). The crystal structure of a major secreted aspartic proteinase from Candida albicans in complexes with two inhibitors. Structure 3, 1261-1271.
Cutler, E. (1991). Putative virulence factors of Candida albicans. Ann. Rev. Microb. 45, 187-218.
Data, A. (1994). Pathogenecity of Candida albicans: Quest for a molecular switch. Brazilian J. Med. Biol. Res. 27, 2721-2732.
De Bernardis, F., Boccanegra, M., Adriani, D., Sprechini, E., Santoni, G., and Cassone, A. (1997). Protective role of antimannan and anti-aspartyl proteinase antibodies in an experimental model of Candida albicans vaginitis in rats. Infect. Immun. 65, 3399-3405.
de Viragh, P. A., Sanglard, D., Togni, G., Falchetto, R., and Monod, M. (1993). Cloning and sequencing of two Candida parapsilosis genes encoding acid proteases. J. Gen. Microbiol. 139, 335-342.
Dealwis, C. G., Frazao, C., Badasso, M., Cooper, J. B., Tickle, I. J., Driessen, H., Blundell, T. L., Murakami, H., Sueiras-Diaz, J., Jones, D. M., and Szelke, M. (1994). X-ray analysis at 2.0 Å resolution of mouse submaxillary renin complexed with a decapeptide inhibitor CH-667, based on the 4-16 fragment of rat angiotensinogen. J. Mol. Biol. 236, 342-360.
Douglas, L. J. (1988). Candida proteinases and candidosis. Crit. Rev. Biotechnol. 8, 121-129.
Edison, A. M., and Manning-Zweerink, M. (1988). Comparison of the extracellular proteinase activity produced by a low-virulence mutant of Candida albicans and its wild-type parent. Infect. Immun. 56, 1388-1390.
Fallon, K., Bausch, K., Noonan, J., Huguenel, E., and Tamburini, P. I. (1997). Role of aspartic proteases in disseminated Candida albicans infection in mice. Infect. Immun. 65, 551-556.
Frey, C. L., Barone, J. M., Dreyer, G., Koltin, Y., Petteway, S. R., and Drutz, D. J. (1990). Synthetic protease inhibitors inhibit Candida albicans extracellular protease activity and adherence to endothelial cells. Abst. Ann. Meet. Am. Soc. Microb. poster no. F-102.
Fusek, M., Smith, E. A., Monod, M., Dunn, B. M., and Foundling, S. I. (1994). Extracellular aspartic proteinases from Candida albicans, Candida tropicalis and Candida parapsilosis differ substantially in their specificities. Biochemistry 33, 9791-9799.
Goldman, R. C., Frost, D. J., Capobianco, J. O., Kadam, S., Rasmussen, R. R., and Abad-Zapatero, C. (1995). Antifungal drug targets: Candida secreted aspartyl protease and fungal wall β-glucan synthesis. Infect. Agents Dis. 4, 228-247.
Homma, M., Chibana, H., and Tanaka, K. (1993). Induction of extracellular protease in Candida albicans. J. Gen. Microbiol. 139, 1187-1193.
Hsu, I., Delbaere, L. T. J., James, M. N. G., and Hoffmann, T. (1977). Penicillopepsin from Penicillium janthinellum crystal structure at 2.8 Å and sequence homology with porcine pepsin. Nature 266, 140-145.
Hube, B., Monod, M., Schofield, D. A., Brown, A. J., and Gow, N. A. (1994). Expression of seven members of the gene family encoding secretory aspartyl proteinases in Candida albicans. Mol. Microbiol. 14, 87-99.
Hube, B., Sanglard, D., Monod, M., Brown, J. P., and Gow, N. A. R. (1997a). In "Host-Fungus Interplay: Proceedings of the Fifth Symposium on Topics in Mycology" (Bossche, H. V., Stevens, D. A., and Odds, F. C., Eds.), pp. 109-122. National Foundation for Infectious Diseases, Bethesda, MD.
Hube, B., Sanglard, D., Odds, F. C., Hess, D., Monod, M., Schafer, W., Brown, A. J. P., and Gow, N. A. R. (1997b). Disruption of each of the secreted aspartyl proteinase genes SAP1, SAP2, and SAP3 of Candida albicans attenuates virulence. Infect. Immun. 65, 3529-3538.
Hube, B., Turver, C. J., Odds, F. C., Eiffert, H., Boulnois, G. J., Kochel, H., and Ruchel, R. (1991). Sequence of the Candida albicans gene encoding the secretory aspartate proteinase. J. Med. Vet. Mycol. 29, 129-132.
Kaminishi, H., Miyaguchi, H., Tamaki, T., Suenaga, N., Hisamatsu, M., Mihashi, I., Matsumoto, H., Maeda, H., and Hagihara, Y. (1995). Candida SAP is known to degrade immunoglobulins. Infect. Immun. 63, 984-988.
Lerner, C. G., and Goldman, R. C. (1993). Stimuli that induce production of Candida albicans extracellular aspartyl proteinase. J. Gen. Microbiol. 139, 1643-1651.
Lortholary, O., and Dupont, B. (1997). Antifungal prophylaxis during neutropenia and immunodeficiency. Clin. Microb. Rev. 10, 477-504.
Magee, B. B., Hube, B., Wright, R. J., Sullivan, P. J., and Magee, P. T. (1993). The genes encoding the secreted aspartyl proteinases of Candida albicans constitute a family with at least three members. Infect. Immun. 61, 3240-3243.
Miyasaki, S. H., White, T. C., and Agabian, N. (1994). A fourth secreted aspartyl proteinase gene (SAP4) and a CARE2 repetitive element are located upstream of the SAP1 gene in Candida albicans. J. Bacteriol. 176, 1702-1710.
Monod, M., Togni, G., Hube, B., and Sanglard, D. (1994). Multiplicity of genes encoding secreted aspartic proteinases in Candida species. Mol. Microbiol. 13, 357-368.
Morrison, C. J., Hurst, S. F., Bragg, S. L., Kuykendall, R. J., Diaz, H., Pohl, J., and Reiss, E. (1993). Heterogeneity of the purified extracellular aspartyl proteinase from Candida albicans: Characterization with monoclonal antibodies and N-terminal amino acid sequence analysis. Infect. Immun. 61, 2030-2036.
Morrow, B., Srikantha, T., and Soll, D. R. (1992). Transcription of the gene for a pepsinogen, PEP1, is regulated by white-opaque switching in Candida albicans. Mol. Cell Biol. 12, 2997-3005.
Morschhauser, J., Virkola, R., Korhonen, T. K., and Hacker, J. (1997). Degradation of human subendothelial extracellular matrix by proteinase-secreting Candida albicans. FEMS Microbiol. Lett. 153, 349-355.
Mukai, H., Takeda, O., Asada, K., Kato, I., Murayama, S., and Yamaguchi, H. (1992). cDNA cloning of an aspartic proteinase secreted by Candida albicans. Mol. Cell Biol. 12, 2997-3005.
Odds, F. C. (1985). Candida albicans proteinase as a virulence factor in the pathogenesis of Candida infection. Zentralbl. Bakteriol. Mikrobiol. Hyg. A 260, 539-542.
Odds, F. C. (1987). Candida infections: An overview. Crit. Rev. Microb. 15, 1-5.
Ray, T. L., Payne, C. D., and Morrow, B. J. (1991). Candida albicans acid proteinase: Characterization and role in candidiasis. Adv. Exp. Med. Biol. 306, 173-183.
Ruchel, R. (1984). A variety of Candida proteinases and their possible targets of proteolytic attack in the host. Zentralbl. Bakteriol. Mikrobiol. Hyg. A 257, 266-274.
Ruchel, R. (1986). Cleavage of immunoglobulins by pathogenic yeasts of the genus Candida. Microbiol. Sci. 3, 316-319.
Ruchel, R., de Bernardis, F., Ray, T. L., Sullivan, P. A., and Cole, G. T. (1992). Candida acid proteinases. J. Med. Vet. Mycol. 30, 123-132.
Ruchel, R., Ritter, B., and Schaffrinski, M. (1990). Modulation of experimental systemic murine candidosis by intravenous pepstatin. Int. J. Med. Microbiol. 273, 391-403.
Sali, A., Veerapandian, B., Cooper, J. B., Moss, D. S., Hofmann, T., and Blundell, T. L. (1992). Domain flexibility in aspartic proteinases. Prot. Struct. Funct. Genet. 12, 158-170.
Sanglard, D., Hube, B., Monod, M., Odds, F. C., and Gow, N. A. R. (1997). A triple deletion of secreted aspartyl proteinase genes SAP4, SAP5, and SAP6 of Candida albicans causes attenuated virulence. Infect. Immun. 65, 3539-3546.
Sato, T., Nagai, K., Shibazaki, M., Abe, K., Takebayashi, Y., Lumanau, B., and Rantiatmodjo, R. M. (1994a). Novel aspartyl protease inhibitors, YF-0200R-A and B. J. Antibiot. (Tokyo) 47, 566-570.
Sato, T., Shibazaki, M., Yamaguchi, H., Abe, K., Matsumoto, H., and Shimizu, M. (1994b). Novel Candida albicans aspartyl protease inhibitor. II. A new pepstatin-ahpatinin group inhibitor, YF044P-D. J. Antibiot. (Tokyo) 47, 588-590.
Sielecki, A. R., Fedorov, A. A., Boodhoo, A., Andreeva, N. S., and James, M. N. G. (1990). The molecular and crystal structure of monoclinic porcine pepsin refined at 1.8 Å resolution. J. Mol. Biol. 214, 143-170.
Stringaro, A., Crateri, P., Pellegrini, G., Arancia, G., Cassone, A., and de Bernardis, F. (1997). Ultrastructural localization of the secreted aspartyl proteinase in Candida albicans cell wall in vitro and in experimentally infected rat vagina. Mycopathology 95, 105.
Subramanian, E. (1978). Molecular structure of acid-proteases. Trends Biochem. Sci. 3, 1-3.
Subramanian, E., Swan, I. D. A., Lie, M., Davies, D. R., Jenkins, J. A., Tickle, I. J., and Blundell, T. L. (1977). Homology among acid proteases: Comparison of crystal structures at 3 Å resolution of acid proteases from Rhizopus chinensis and Endothia parasitica. Proc. Natl. Acad. Sci. USA 74, 556-559.
Symersky, J., Monod, M., and Foundling, S. I. (1997). High-resolution structure of the extracellular aspartic proteinase from Candida tropicalis yeast. Biochemistry 36, 12700-12710.
Tang, J., James, M. N. G., Hsu, I. N., Jenkins, J. A., and Blundell, T. L. (1978). Structural evidence for gene duplication in the evolution of the acid proteases. Nature 271, 618-621.
Traub, P. (1985). Are intermediate filament proteins involved in gene expression? Ann. N.Y. Acad. Sci. 455, 68-78.
White, T. C., and Agabian, N. (1995). Expression of Candida albicans secreted aspartyl protease isoenzymes is determined by cell type, and levels are determined by environmental factors. J. Bacteriol. 177, 5215-5221.
White, T. C., Kohler, G. A., Miyasaki, S. H., and Agabian, N. (1995). Expression of virulence factors in Candida albicans. Can. J. Bot. 73(Suppl. 1), 1058-1064.
White, T. C., Miyasaki, S. H., and Agabian, N. (1993). Three distinct secreted aspartyl proteinases in Candida albicans. J. Bacteriol. 175, 6126-6133.
Wright, R. J., Carne, A., Hieber, A. D., Lamont, I. L., Emerson, G. W., and Sullivan, P. A. (1992). A second gene for a secreted aspartate proteinase in Candida albicans. J. Bacteriol. 174, 7848-7853.
Zotter, C., Haustein, U. F., Schonborn, C., Grimmecke, H. D., and Wand, H. (1990). Die wirkung von pepstatin A auf die Candida-albicans-infektion der maus. Dermatol. Monatsschr. 176, 189-198.

Proteolytic Enzymes of the Viruses of the Family Picornaviridae
ERNST M. BERGMANN AND MICHAEL N. G. JAMES
Department of Biochemistry and Medical Research Council of Canada Group in Protein Structure and Function, University of Alberta, Edmonton, Alberta, T6G 2G7 Canada

I. Picornaviridae
II. Viral Replication and Polyprotein Processing
III. Picornaviral Proteinases
IV. Conclusions and Implications for Antiviral Strategies
References

I. PICORNAVIRIDAE

The Picornaviruses constitute a large family of positive-sense, single-stranded RNA viruses (+RNA viruses) (Rueckert, 1996). More than 200 known viruses belong to this family; they are classified into six genera (Table I). These viruses share the major features of the viral replication cycle, including the central role of the specific proteolytic processing of a viral polyprotein (Palmenberg, 1990). Individual details of the viral replication and of the polyprotein processing distinguish the genera of the family Picornaviridae (Ryan and Flint, 1997).

Picornaviruses are small icosahedral viruses. There are examples of atomic resolution structures of individual viruses from four of the six genera (rhino-, entero-, cardio-, and aphtho-). The assembly of the precursor of the capsids is regulated by successive proteolytic cleavages of the structural proteins. The final assembly of the protomers into the procapsids requires the RNA genome and is not completely understood (Rueckert, 1996). A nonenzymatic, so-called maturation cleavage within the assembled procapsids then yields the infectious virus (Palmenberg, 1990).

Proteases of Infectious Agents
Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

TABLE I The Family Picornaviridae

Genus | Number of serotypes | Examples | Associated disease | Proteolytic enzymes
Entero- | 93 (hnPV 1-3; Cox A1-A23, A24, B1-B6; EV 1-9, 11-21, 24-28, 68-71) | Polio, Coxsackie, Echo | Myelitis, carditis, meningitis, encephalitis, herpangina, myalgia, pleurodynia, pneumonia | 2A, 3C
Rhino- | 105 | Rhino | Common cold | 2A, 3C
Aphtho- | 7 | FMDV | Foot-and-mouth disease of cloven-hoofed animals | L, 3C
Cardio- | 2 | EMCV | ? | 3C
Hepato- | 1 | HAV | Hepatitis A | 3C
Parecho- | 2 (EV22, EV23) | Echo 22, Echo 23 | Myocarditis | 3C
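The proteinase column of Table I can be restated as a small lookup, which makes it easy to ask which genera carry a second proteinase besides 3C. This is only an illustrative sketch; the names `PROTEINASES_BY_GENUS`, `genera_with`, and `second_proteinase` are hypothetical and not part of the original chapter.

```python
# Proteolytic enzymes per genus, transcribed from Table I.
# The dictionary and function names are illustrative assumptions.
PROTEINASES_BY_GENUS = {
    "Entero-": ("2A", "3C"),
    "Rhino-": ("2A", "3C"),
    "Aphtho-": ("L", "3C"),
    "Cardio-": ("3C",),
    "Hepato-": ("3C",),
    "Parecho-": ("3C",),
}


def genera_with(proteinase):
    """Return the genera (sorted by name) whose Table I entry lists the proteinase."""
    return sorted(
        genus
        for genus, enzymes in PROTEINASES_BY_GENUS.items()
        if proteinase in enzymes
    )


def second_proteinase(genus):
    """Return the proteinase listed besides 3C for a genus, or None if there is none."""
    others = [enzyme for enzyme in PROTEINASES_BY_GENUS[genus] if enzyme != "3C"]
    return others[0] if others else None


if __name__ == "__main__":
    print(genera_with("2A"))
    print(second_proteinase("Aphtho-"))
```

Keeping 3C in every entry mirrors the table: 3C is listed for all six genera, and only the second enzyme differs.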

Picornaviruses cause a wide variety of diseases in humans and animals (Couch, 1996; Hollinger and Ticehurst, 1996; Melnik, 1996). These range from relatively mild and widespread infections, such as the common cold and hepatitis A, to rare but often severe enteroviral diseases (Table I). There is some evidence that picornavirus infections are also involved in the onset of severe autoimmune diseases such as myocarditis, diabetes, and multiple sclerosis (Carthy et al., 1997; Steinmann and Conlon, 1997). Recently, a mouse model for a demyelinating disease that resembles multiple sclerosis provided a plausible immunological mechanism for the triggering of such diseases by a viral infection (Miller et al., 1997).

II. VIRAL REPLICATION AND POLYPROTEIN PROCESSING

A. THE PICORNAVIRAL LIFE CYCLE

The Picornaviruses release their single-stranded, mRNA-like genome into the cytosol of the host cell, where it is translated into large polyproteins. The resulting polyproteins are then cleaved by specific viral proteinases into the structural and nonstructural proteins of the virus (Kräusslich and Wimmer, 1988). Figure 1 shows a simplified scheme of the life cycle of a typical Picornavirus.

[Figure 1. Simplified scheme of the life cycle of a typical picornavirus.]

The virus binds to a specific receptor on the cell surface. The specific receptors are different for various Picornaviruses and in some cases are not yet known. Following attachment, the virus particles lose the VP4 protein and undergo a change that allows them to release their RNA genome into the cytosol of the host cell. Picornaviral RNA genomes have a small viral protein (VPg, the 3B gene product) covalently attached at the 5' terminus. After the RNA has been released into the cytosol of a host cell, VPg is cleaved off by a cellular enzyme to yield a functional mRNA. The RNA is then translated into a large polyprotein, which is cotranslationally processed by proteolysis into the individual viral proteins (Palmenberg, 1990). Under normal conditions the full-length polyprotein is never found. Proteolytic processing of the polyprotein is accomplished by one or two viral proteinases, which are themselves part of the polyprotein. The first proteolytic cleavage usually separates the structural proteins (P1 in Picornaviruses) from the remainder of the polyprotein.

The RNA genome is also replicated to yield negative-sense RNAs. The negative-sense RNAs then serve as templates for the production of viral genomes. Picornaviral RNA replication is accomplished by a multienzyme complex that includes the viral RNA-dependent RNA polymerase (3D), the putative viral RNA helicase (2C), and the picornaviral 3C proteinase (Porter, 1993; Wimmer et al., 1993). Also part of this complex is the 3AB gene product. The 3A gene product presumably anchors the picornaviral RNA replication complex to the membrane of the smooth ER. Modification of the membrane structure of the host cell is a common feature of picornavirus infection (Bienz et al., 1983, 1990; Teterina et al., 1997a,b). The 3B gene product constitutes VPg (the viral genome-associated protein), which remains covalently bound to the 5' end of the viral RNA genome (Wimmer, 1982).
There is also evidence that some cellular proteins or their proteolytic cleavage products form part of the viral RNA replicase complex (Andino et al., 1993; Xiang et al., 1995; Gamarnik et al., 1996; Parsley et al., 1996). Picornaviral 3C proteinases possess an RNA binding site and an RNA binding activity that is distinct from their proteolytic activity (Hämmerle et al., 1992; Andino et al., 1993; Porter et al., 1993; Walker et al., 1995; Kusov and Gauss-Müller, 1997). It is common for the limited number of gene products of small RNA viruses to perform multiple, distinct functions. The exact function of the 3C proteinase within the RNA replicase complex is not clear. Apparently its RNA binding activity is required for the initiation of RNA replication. Certain proteolytic cleavages in some picornaviruses could be essential steps during RNA replication within the RNA replicase complex; for example, the 3C-mediated cleavage of the RNA-associated VPg (3B) from the membrane anchor (3A) may be necessary to release the RNA from the membrane-bound replicase complex.


It is also believed that in some picornaviruses the cleavage of 3CD within the replicase complex after binding of the RNA is required to allow 3D to perform RNA replication (Harris et al., 1992; Molla et al., 1995).

Initially the structural proteins (P1) are cleaved from the viral polyprotein. Two more 3C-mediated cleavages (1AB|1C and 1C|1D, or VP0|VP3 and VP3|VP1, respectively) are required within the capsid precursor. The resulting protomer then assembles into pentamers, and the pentamers form the provirions by a poorly understood pathway that requires the VPg-linked RNA genome. A final maturation cleavage within the provirion (VP0 → VP2 + VP4) yields the infectious virus particles. This maturation cleavage is believed to be nonenzymatic and to require the presence of the packaged RNA genome (Palmenberg, 1990; Rueckert, 1996).

Many details, such as the composition of the RNA replicase complex, the function of its individual components, and the pathway of provirion assembly, are not completely understood even in the best studied viruses. There are also differences in many aspects of the viral life cycle between the individual genera of the Picornaviridae.
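The order of capsid-protein cleavages described above can be written down as a short ordered list of precursor-to-product steps. The sketch below is a bookkeeping illustration only: it ignores the assembly steps (protomer, pentamer, provirion) that occur between the cleavages, and the names `CAPSID_STEPS` and `process` are hypothetical.

```python
# Each step is (precursor, products, description). The protein names follow the
# text; the data layout and function name are illustrative assumptions.
CAPSID_STEPS = [
    ("P1", ("VP0", "VP3", "VP1"), "two 3C-mediated cleavages: VP0|VP3 and VP3|VP1"),
    ("VP0", ("VP2", "VP4"), "maturation cleavage, believed to be nonenzymatic"),
]


def process(start, steps):
    """Apply the cleavage steps in order and return the resulting list of proteins."""
    pool = [start]
    for precursor, products, _description in steps:
        if precursor not in pool:
            raise ValueError(f"precursor {precursor!r} is not available for cleavage")
        position = pool.index(precursor)
        # Replace the precursor by its cleavage products.
        pool[position:position + 1] = list(products)
    return pool


if __name__ == "__main__":
    print(process("P1", CAPSID_STEPS[:1]))  # after the 3C-mediated cleavages
    print(process("P1", CAPSID_STEPS))      # after the maturation cleavage
```

Listing the steps in order makes the dependency explicit: the maturation cleavage can only act on VP0, which exists only after P1 has been processed.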

B. POLYPROTEIN PROCESSING AND OTHER FUNCTIONS OF THE PICORNAVIRAL PROTEINASES

The genomes of all picornaviruses carry at least one, and more often two, genes encoding proteolytic enzymes (Ryan and Flint, 1997). The 3C gene product is the major processing proteinase in all picornaviruses. The primary function of the picornaviral proteinases is the cotranslational, specific cleavage of the viral polyprotein into the structural and nonstructural proteins. The individual proteolytic cleavages by the 3C proteinases within the picornaviral polyproteins are sequential; some sites are cleaved faster than others. The cleavage sites are identified by the sequence of the residues immediately preceding and following the scissile bond (approximately P4 to P2' in the nomenclature of Schechter and Berger, 1967). The P1 residue, immediately preceding the scissile peptide bond, is almost always a glutamine. The 3C proteinases of the individual picornaviruses also have sequence preferences for the residues at the P4, P2, P1', and P2' sites of a cleavage site (Nicklin et al., 1988; Long et al., 1989; Pallai et al., 1989; Weidner and Dunn, 1991; Malcolm, 1995). However, what distinguishes the preferred cleavage sites from those that are cleaved more slowly is not apparent from the peptide sequence. It is very likely that other factors, such as accessibility and local conformation, play a part in determining the order of the cleavages.

The details of the polyprotein processing are one factor that distinguishes the six different genera of the Picornaviridae (Ryan and Flint, 1997). Only a single proteinase, the 3C gene product, is present in the cardio-, hepato-, and parechoviruses. In the entero- and rhinoviruses the 2A gene product is a second proteolytic enzyme. An L proteinase at the amino-terminus of the polyprotein is a unique feature of the aphthoviruses.

The separation of the structural and nonstructural proteins is usually the primary cleavage event, but this is accomplished quite differently in the individual genera. In entero- and rhinoviruses 2A is a separate proteolytic activity. It performs the primary cleavage at its own amino-terminus, which separates P1 from the nonstructural proteins. In the hepato- and parechoviruses the primary cleavage is a 3C-mediated cleavage at the amino-terminus of the 2B gene product (Jia et al., 1993; Schultheiss et al., 1994; Martin et al., 1995; Schultheiss et al., 1995a). It is not clear if the small 2A gene product has any function. In the aphtho- and cardioviruses the primary cleavage is at the carboxy-terminus of 2A. The 2A gene product is not a proteolytic enzyme in these viruses. The cleavage is presumably nonenzymatic and requires the carboxy-terminal residues of 2A (Palmenberg et al., 1992; Donnelly et al., 1997). The second proteinase present in the aphthoviruses, the L proteinase, only cleaves itself from the amino-terminus of the polyprotein (Strebel and Beck, 1986).

Larger precursors of the 3C proteinase, such as 3CD or 3ABC, are also catalytically active proteinases (Ypma-Wong et al., 1988; Harris et al., 1992; Davis et al., 1997). It has been shown in some systems that the presence of an additional domain can change the efficiency and specificity of the proteolytic activity. In poliovirus, 3CD is a proteinase with a distinct specificity (Ypma-Wong et al., 1988).
It cleaves at least some of the cleavage sites within the viral polyprotein more efficiently and is presumably required for the processing of the cleavage sites within the structural protein. In other picornaviruses other precursors may play a similar role. It has been suggested that 3ABC is a proteolytically active precursor of 3C in HAV (Harmon et al., 1992; Schultheiss et al., 1994).

Structural proteins of viruses in general are designed to form large assemblies, such as viral capsids. Therefore, they have to be synthesized as precursors, which are covalently modified before they can assemble. One very common form of modification, not only in small +RNA viruses, is proteolytic processing of the precursor (Kay and Dunn, 1990). This is one reason why proteolytic enzymes are among the most ubiquitous enzymatic activities expressed by viruses (Dougherty and Semler, 1993).

In the Picornaviruses the structural proteins are further proteolytically processed after they are separated from the nonstructural proteins. Two sequential, 3C-mediated cleavages are required within the capsid protein before the resulting 5S protomers can assemble into larger (14S) pentamers (Fig. 1). Final capsid assembly then requires the presence of RNA. In poliovirus the cleavages of the capsid proteins within the protomer require the proteolytic activity of the precursor 3CD (Ypma-Wong et al., 1988).

The picornaviral 3C proteinase cleaves itself out of the polyprotein. In experimental systems it was shown that this can be accomplished both in cis, when 3C is expressed as part of the polyprotein, and in trans, when 3C is expressed separately (Kräusslich and Wimmer, 1988; Harmon et al., 1992). Kinetic evidence obtained with encephalomyocarditis virus (EMCV) also suggests that the cleavage in cis at both the amino and carboxy termini of 3C is intramolecular (Palmenberg and Rueckert, 1982). Another possible interpretation of these data is that the cleavages are performed by another 3C proteinase within a tight dimer or larger polymer. Further evidence for an intramolecular, autocatalytic cleavage of the 3C proteinase was provided by Hanecak et al. (1984). The crystal structures of 3C proteinases have made it possible to deduce a structural model for an intramolecular cleavage of 3C at its own amino-terminus (Matthews et al., 1994; Bergmann et al., 1997). This model proposes that the amino-terminal helix, which is a unique feature of the 3C proteinase, folds out of the active site of 3C after 3C has cleaved its own amino-terminus. Whether and how the 3C proteinase could cleave its own carboxy-terminus in an intramolecular reaction is much less obvious.

An additional function of picornaviral proteinases is the inhibition or at least down-regulation of specific host cell functions that compete with the viral replication cycle (Ryan and Flint, 1997). The entero- and rhinoviral 2A proteinases specifically cleave one of the cellular proteins that forms part of the cap-recognition complex (eIF4G) (Lamphear et al., 1993; Sommergruber et al., 1994a; Haghighat et al., 1996). This serves to down-regulate the translation of capped host cell mRNAs, which competes with the translation of the picornaviral RNA genome.
It is remarkable that the L proteinase of aphthoviruses, in spite of being a different proteinase, performs the same function. The entero-/rhinoviral 2A proteinase and the aphthoviral L proteinase cleave eIF4G at different places on the molecule (Kirchweger et al., 1994). The picornaviruses that do not have a second proteolytic activity besides 3C do not cleave eIF4G or inhibit host cell translation by this mechanism. Hepatitis A virus even requires intact eIF4G for the translation of its own genome (Borman and Kean, 1997).

There are also reports of host cell proteins being substrates of the picornaviral 3C proteinases. Most of these cellular substrates of the 3C proteinases are involved in some aspect of cellular translation or replication (Ryan and Flint, 1997; Yalamanchili et al., 1997).

Thus, there are three main functions of picornaviral proteinases: the specific processing of the viral polyprotein, covalent modification of the precursors of the viral capsid, and down-regulation of host cell processes by proteolytic cleavage of host cell proteins (Gorbalenya and Snijder, 1996; Kay and Dunn, 1990; Kräusslich and Wimmer, 1988; Ryan and Flint, 1997). Cotranslational, specific processing of a viral polyprotein by a specific viral proteinase is an essential part of viral replication in +RNA viruses. This is true even for some families of +RNA viruses that have developed additional strategies to generate individual gene products from a single RNA genome, e.g., subgenomic RNAs or multiple ORFs. Proteolytic cleavage as a covalent modification of the precursors of viral structural proteins is even more common and occurs even in DNA viruses. Down-regulation of the host cell metabolism by specific cleavage of cellular proteins is a mechanism that is not found in all viruses. Given these important functions, it is not surprising that proteolytic enzymes are ubiquitous gene products in all +RNA and many other viruses.

III. PICORNAVIRAL PROTEINASES

A. THE 3C PROTEINASE

1. Structure

The major processing proteinase of the picornaviruses, the 3C proteinase, belongs to a new family of proteolytic enzymes: the chymotrypsin-like cysteine proteinases (Gorbalenya and Snijder, 1996). This had initially been predicted based on analysis of the sequence of the 3C gene product (Gorbalenya et al., 1986, 1989; Bazan and Fletterick, 1988). This prediction was shown to be correct by the first crystal structure of a 3C proteinase (Allaire et al., 1994). Refined crystal structures of 3C proteinases have now been published for the enzymes from hepatitis A virus (HAV), poliovirus (PV), and human rhinovirus (HRV) (Matthews et al., 1994; Bergmann et al., 1997; Mosimann et al., 1997).

The three-dimensional structures of the 3C proteinases from HAV and PV are shown in Figs. 2 and 3, respectively. The two enzymes differ in size and belong to two subclasses of the 3C proteinases. The 3C gene product of HAV consists of 219 residues and the molecule from PV consists of 183 residues. In spite of the size difference, the cores of the enzymes superimpose surprisingly well. The rms difference for the Cα atoms of 154 residues that superimpose closely is 1.85 Å. This indicates that the core of the two-domain structure of the 3C proteinase is fairly well conserved. Differences between the various 3C proteinases manifest in the length of the secondary structure elements and in the turns and loops that connect the β-strands and protrude from the core of the β-barrel domains.

In spite of being cysteine proteinases, the 3C proteinases belong structurally to the superfamily of chymotrypsin-like proteinases (Gorbalenya and Snijder, 1996). The structures of chymotrypsin-like proteinases are formed by two antiparallel β-barrels with the proteolytic active site at the domain interface. Both domains contribute to the catalytic residues in the active site. Both domains also participate in the binding of peptide substrates. The N-terminal domain is mostly involved in binding the substrate residues following the scissile peptide bond (P1' to P3'), whereas the C-terminal domain forms the specific subsites for the substrate residues preceding the scissile bond (P4 to P1) (Perona and Craik, 1995).

The two domains of the chymotrypsin-like proteinases are usually described as six-stranded, antiparallel β-barrels, with the individual β-strands labeled aI-fI and aII-fII (Figs. 2 and 3, see color plates). An alternative description of the β-barrels is that of a sandwich of two orthogonal, four-stranded, antiparallel β-sheets (Chotia, 1984). The β-strands that form the edges of the sheets belong to both sheets and continue, sometimes uninterrupted, from one sheet to the other. As a result the two corners of the "β-sandwich" that are formed by the edge strands are closed, while the other two corners are splayed (Chotia, 1984).

In both β-barrel domains of the HAV 3C proteinase one of the edge strands is interrupted while the other continues from one β-sheet to the other (Fig. 2) (Bergmann et al., 1997). In the N-terminal domain β-strand eI is interrupted by a single helical turn. β-strand bI forms a β-bulge at Val 28, allowing it to bend from one β-sheet to the other. The residue Val 28 is involved in the binding of peptide substrates by HAV 3C. In the C-terminal domain of HAV 3C the bII strand is interrupted by a short stretch of random coil structure, whereas the eII strand continues from one sheet to the other. There are seven defined β-strands in both of the domains of HAV 3C.
In the two smaller β-barrels that form the domains of the polio 3C, the edge strands continue uninterrupted from one sheet to the other (Fig. 3). There are six defined β-strands in each domain of the polio 3C proteinase (Mosimann et al., 1997).

Two of the β-strands, bII and cII, of the C-terminal domain of the 3C proteinases are extended past the C-terminal β-barrel (Bergmann et al., 1997). From the point where the two strands are no longer part of the β-barrel they form an antiparallel, two-stranded β-ribbon (light gray in Figs. 2 and 3). A defined β-bulge introduces a bend into this β-ribbon, which causes it to curl back toward the active site. The longer β-ribbon in the HAV 3C proteinase contributes to the residues involved in the catalytic mechanism and also to the binding of peptide substrates (see below). Because the β-ribbon is shorter in the poliovirus 3C, it only contributes to the P4 binding pocket, and the proteolytic active site of polio 3C is much more accessible (Mosimann et al., 1997). This β-ribbon is a unique feature of the 3C proteinase and replaces the "methionine loop" of the chymotrypsin-like serine proteinases. The corresponding topological feature is somewhat similar, but smaller, in some bacterial proteinases (e.g., α-lytic proteinase; Fujinaga et al., 1985).
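The rms difference of 1.85 Å quoted earlier for the 154 closely superimposing Cα atoms of HAV and PV 3C is a root-mean-square deviation over paired atoms. The sketch below shows only that final arithmetic step and assumes the two coordinate sets have already been paired residue by residue and optimally superimposed; the superposition itself (for example by a least-squares fit) is not implemented here. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the points are already paired and superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms, not taken from any real structure.
    enzyme_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    enzyme_b = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    print(rmsd(enzyme_a, enzyme_b))
```

Because the deviations are squared before averaging, a few poorly matching atoms raise the value sharply, which is why such comparisons are usually restricted to the residues that superimpose closely.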


There are helices at the N- and C-termini of the 3C proteinases. The N-terminal helix packs against the C-terminal β-barrel and the C-terminal helix packs against the surface of the N-terminal domain. The two helices stabilize the structure like two latches (Bergmann et al., 1997). The N-terminal α-helix is a unique feature of the 3C proteinases among all chymotrypsin-like proteinases (Gorbalenya and Snijder, 1996). It has been speculated that it is important for the mechanism of a proposed intramolecular cleavage at the N-terminus of 3C (Matthews et al., 1994; Bergmann et al., 1997). In the proposed model for the N-terminal, intramolecular proteolytic cleavage, this helix is folded after 3C cleaves its own N-terminus. The favorable free energy of the folding of this stable helix may be required to fold the new N-terminus out of the active site in order to create the active proteinase (Bergmann et al., 1997). The sequence of the residues that form the last turn of this helix is highly conserved throughout the picornaviral 3C genes (K/RR/KNL/I).

It is interesting that the structural and functional details of the proteolytic active site of the 3C proteinases are not the most conserved part of the 3C structure (Gorbalenya et al., 1988). The 3C gene product constitutes one subunit of the picornaviral RNA replicase complex and has a distinct RNA binding site (Hämmerle et al., 1992; Andino et al., 1993; Leong et al., 1993; Kusov et al., 1997). The sequence of the residues that have been implicated in this second activity, KFRDI, is located in the domain connection of 3C, on the opposite side of the molecule from the proteolytic active site (Figs. 2b and 3b). It is completely conserved throughout the 3C gene sequences of all picornaviruses (Ryan and Flint, 1997). It was first shown for poliovirus that mutations within this sequence are deleterious for viral replication and show two different phenotypes (Hämmerle et al., 1992).
These results can now be interpreted in light of the structures. The three charged residues within the consensus sequence (K82, R84, and D85 in poliovirus and K95, R97, and D98 in HAV) form part of the surface of the RNA binding site and are probably directly involved in RNA binding. The side-chains of the two highly conserved hydrophobic residues (F83 and I86 in poliovirus 3C and F96 and I99 in HAV) are packed into the interior of the molecule. They are part of the internal hydrophobic interactions that maintain the structure in this region and are important for this reason. The side-chain of the conserved phenylalanine interacts with a conserved glycine at the end of β-strand bI inside the N-terminal β-barrel. The sequence surrounding this glycine, LGVK/,D, is also highly conserved within the 3C genes. The residues in this sequence motif, from His 31 in poliovirus 3C and Lys 35 in HAV 3C on, form a reverse turn and connect β-strands bI and cI (Figs. 2b and 3b). They contribute to the surface of the RNA binding site of 3C. Presumably, the turns that connect β-strands dI and eI and dII and eII also contribute to the molecular surface of the RNA binding site (Figs. 2b and 3b). The latter connection forms a single turn of a helix in HAV 3C (Fig. 2b).
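The two conserved sequence motifs discussed above lend themselves to a simple pattern search. The sketch below reads the motif K/RR/KNL/I as "K or R, then R or K, then N, then L or I", which is an interpretation of the notation rather than something stated explicitly in the text. The function name `find_motifs`, the dictionary keys, and the toy sequence are hypothetical; the sequence is not a real 3C sequence.

```python
import re

# Motif patterns. The character-class reading of K/RR/KNL/I is an assumption.
MOTIFS = {
    "helix_last_turn": re.compile(r"[KR][RK]N[LI]"),
    "rna_binding": re.compile(r"KFRDI"),
}


def find_motifs(sequence):
    """Return, for each motif, the 1-based start positions of all non-overlapping matches."""
    return {
        name: [match.start() + 1 for match in pattern.finditer(sequence)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    toy_sequence = "MGAKRNIAAAAKFRDIGG"  # invented for illustration only
    print(find_motifs(toy_sequence))
```

Reporting 1-based positions matches the residue numbering used in the text, where KFRDI spans K82 to I86 in poliovirus 3C and K95 to I99 in HAV 3C.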


One face each of the N- and C-terminal helices flanks the conserved residues within the domain connection and probably contributes to the RNA binding site. It appears likely, as was proposed by Ryan and Flint (1997), that the binding of RNA to this site would have an influence on the proteolytic processing of both the N- and C-termini of 3C. On the other hand, it is not known if binding of RNA to the RNA binding site of 3C affects the proteolytic activity. Only structural work on a complex of 3C and bound RNA could provide a definite answer to this question. Because the RNA binding site is on the opposite side of the molecule from the proteolytic active site, the structures suggest that the two activities could be independent. However, the 3C structures also suggest a possible mechanism whereby binding of RNA in the RNA binding site of 3C could influence the proteolytic activity. The turns that connect β-strands bI and cI and dII and eII are probably involved in the specific binding of RNA. At the other end of each of these strands are residues that play important roles in the proteolytic activity. Slight conformational changes involving these β-strands could have a dramatic effect on the proteolytic activity.

2. Activity and Specificity

The picornaviral 3C proteinases are relatively slow enzymes when compared to some of the mammalian, extracellular serine proteinases. They have evolved to be very specific enzymes (Malcolm, 1995; Gorbalenya and Snijder, 1996; Ryan and Flint, 1997; Bergmann, 1998 and references therein). The chymotrypsin-like proteinases belong to a large group of proteolytic enzymes in which the nucleophile is the oxygen or sulfur atom of the side-chain of a serine or cysteine residue, respectively. In these enzymes the general acid-base catalyst is a conserved histidine residue.
It is generally accepted that the mechanism of these enzymes involves an acyl-enzyme intermediate formed between the nucleophile and the carbonyl of the P1 residue of the substrate. Additional, so-called tetrahedral intermediates occur both during formation and hydrolysis of the acyl-enzyme intermediate. The tetrahedral intermediates carry a negative charge on the oxygen atom of the scissile peptide bond. In the catalytic reaction of the chymotrypsin-like serine proteinases the transition states leading to the tetrahedral intermediates are rate limiting and structurally resemble the tetrahedral intermediates.

Three chemical groups with distinct functions are typically found in the active sites of proteolytic enzymes (James, 1993; Ryan and Flint, 1997): a nucleophile, which attacks the carbonyl of the scissile peptide bond; a general acid-base catalyst, which assists in the attack and protonates the leaving group; and an electrophilic structure, which stabilizes the developing negative charge on the carbonyl. The latter structure is usually referred to as the oxyanion hole. In the chymotrypsin-like serine proteinases it consists of a stretch of seven residues with a consensus sequence XGDSGG, where the serine is the nucleophile. The main-chain conformation of this structure orients the first and third peptide bonds so that they donate hydrogen bonds to the carbonyl of the scissile bond. These two hydrogen bonds of the oxyanion hole help to stabilize the developing negative charge on the carbonyl oxygen during the reaction (Whiting and Peticolas, 1994).

There are additional chemical groups in the active site of proteinases, the function of which is less clear. In the chymotrypsin-like serine proteinases the carboxylate of an aspartate residue interacts with the edge of the imidazole of the histidine general acid-base catalyst that is opposite from the nucleophile (Nδ1). Originally it was thought that this "third member of the catalytic triad" participates in a proton transfer, but it is now generally believed that its function is to maintain the orientation of the histidine general acid-base catalyst and possibly to stabilize its developing positive charge. There are chemical groups in similar positions to the carboxylate of the third member of the catalytic triad in the 3C proteinases, but the interactions with the histidine general acid-base catalyst are different (Fig. 4, see color plate).

It is generally accepted that the active sites of cysteine proteinases, such as the enzymes of the papain family, contain a thiolate-imidazolium ion pair that is stabilized over a wide pH range (Storer and Ménard, 1994). The active site of the 3C proteinases features a thiol and an imidazole, but in the structural context of a chymotrypsin-like proteinase. There is no direct experimental evidence for the charge and protonation state in the active site of the 3C proteinases. Even though the 3C proteinases belong to the superfamily of the chymotrypsin-like proteinases, they are cysteine proteinases.
Whether the mechanism of 3C proteinases more closely resembles the mechanism of chymotrypsin-like serine proteinases or that of other cysteine proteinases, or is unique, is not clear.

Figure 4 shows the details of the active site residues of the 3C proteinases from HAV (Bergmann et al., 1997) and PV (Mosimann et al., 1997). The arrangement of the cysteine-histidine dyad and the oxyanion hole is similar to that observed in the chymotrypsin-like serine proteinases, but due to the size of the sulfur nucleophile the active site is larger and the chemical groups are further apart. The conserved glycine residue in the oxyanion hole of the wild-type 3C proteinases shows a conformation that is similar to the one seen for the corresponding residue in the chymotrypsin-like serine proteinases. This left-handed 3₁₀-helical conformation requires a glycine in this position (φ = 95°, ψ = -5°) (Bergmann et al., 1997). In the chymotrypsin-like serine proteinases this conformation is maintained by interactions of the carbonyl of this peptide bond with other groups in the structure. The carbonyl of the corresponding residue in the 3C proteinases (Pro 169 in HAV and Ala 144 in PV) does not make any interactions in the crystal structures. In the crystal structures of mutants of the nucleophilic cysteine of the HAV 3C proteinase this peptide bond is indeed flipped and the oxyanion hole has collapsed to a lower energy main-chain conformation (Allaire et al., 1994). It appears that the presence of the nucleophilic sulfur atom itself is required to maintain the proper conformation of the oxyanion hole in the 3C proteinases. We believe that it is a negative charge on the nucleophilic sulfur that orients the peptide bonds of the oxyanion hole and take this as partial evidence for a mechanism involving a thiolate-imidazolium ion pair. Direct experimental evidence for the protonation state of the residues in the active site and the mechanism of the 3C proteinases is, however, lacking.

An aspartate or glutamate residue, which is in an equivalent position to the third member of the catalytic triad of the chymotrypsin-like serine proteinases, is present and conserved throughout the 3C proteinases (Gorbalenya et al., 1988; Ryan and Flint, 1997). However, the interaction typical for the third member of the catalytic triad is not observed. In HAV 3C the side-chain of Asp 84 points away from the imidazole of the general acid-base His 44. It is locked in interactions with other regions of the structure (Bergmann et al., 1997). The position of the carboxylate of the third member of a catalytic triad is taken up by a water molecule in HAV 3C, which forms a hydrogen bond to the Nδ1 of His 44 (Fig. 4a). This water molecule, the imidazole of His 44, and the nucleophilic Sγ atom of Cys 172 of HAV 3C are in a common plane. Perpendicular to this plane and 3.0 Å above it is the side-chain of Tyr 143, which is located in the antiparallel β-ribbon of HAV 3C. Mutational studies have shown that Tyr 143 is important for the catalytic activity of HAV 3C.
The side-chain of Tyr 143 does not form a hydrogen bond in the crystal structure of HAV 3C. It is perpendicular to the plane of the imidazole and it is 3.5 Å away from the water molecule. We believe the side-chain of Tyr 143 is deprotonated and negatively charged in the structure of the HAV 3C proteinase. Presumably, an electrostatic interaction between His 44 and Tyr 143 helps to maintain the side-chain conformation of His 44 with the imidazole in the same plane as the nucleophilic Sγ atom of Cys 172 and helps to stabilize a positive charge on the His 44 imidazole. The conserved glutamate, which is present in the position that corresponds to the third member of a putative catalytic triad in poliovirus 3C (Glu 71), does interact with the imidazole of the general acid-base catalyst His 40 (Mosimann et al., 1997). The interaction is, however, unusual. The hydrogen bond from the Nδ1 atom of His 40 is accepted by the anti lone electron pair of the carboxylate of Glu 71. This is similar to the structure of the 3C proteinase from rhinovirus (Matthews et al., 1994). Thus, the conserved features in the proteolytic active site of the picornaviral 3C proteinase are a cysteine-histidine dyad and an oxyanion hole that resembles that of the chymotrypsin-like serine proteinases remarkably well. Additional chemical groups have been shown to be important by mutagenesis experiments and interact mostly with the histidine general acid-base catalyst. Their function is probably to maintain the orientation of the active site residues and to provide a specific electrostatic environment. Chymotrypsin-like serine proteinases bind specific substrates in a canonical mode and with a specific conformation of the bound peptide substrate (Read and James, 1986; Bode and Huber, 1992). We now have evidence from structures of enzyme-inhibitor complexes that the 3C proteinases bind peptide substrates in a similar conformation (Bergmann and James, manuscript in preparation). Furthermore, the specific recognition of the cleavage sites within the viral polyprotein by the 3C proteinases can be rationalized if one assumes a similar binding mode. Chymotrypsin-like proteinases specifically bind 4-5 residues that precede the scissile bond and 2-3 residues that follow it in the sequence (i.e., P5 to P3′). The residues from P5 to P2 of the substrate are usually in a β-strand conformation. The P1 residue adopts a main-chain conformation that corresponds to a tight 3₁₀ helix. This places the carbonyl of the scissile peptide bond into the oxyanion hole. The P1′ and P2′ residues are usually also in a β-conformation. This main-chain conformation of a peptide substrate orients at least some of the peptide side-chains into specificity pockets, which are formed by the surface of the enzyme. While interactions between the enzyme and the main-chain of a bound substrate contribute significantly to the binding of a substrate, most of the specificity is provided by the interactions of the peptide side-chains in the specificity pockets of the enzyme (Fig. 5, see color plate).
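The P/P′ numbering used in this discussion can be made concrete with a short Python sketch. It is offered only as an illustration and is not part of the structural analysis: the function name `label_subsites`, its parameters, and the example peptide are all hypothetical. Residues are counted outward from the scissile bond, P1, P2, ... toward the N-terminus and P1′, P2′, ... toward the C-terminus, with the default window set to the P5 to P3′ range quoted above.

```python
def label_subsites(peptide, bond_after, n_before=5, n_after=3):
    """Map residues around a scissile bond to P / P' position labels.

    peptide    : one-letter amino-acid string, N- to C-terminus
    bond_after : 0-based index of the P1 residue; the scissile bond lies
                 between peptide[bond_after] and peptide[bond_after + 1]
    n_before   : how many residues to label on the N-terminal side (P1..Pn)
    n_after    : how many residues to label on the C-terminal side (P1'..Pn')
    """
    if not 0 <= bond_after < len(peptide) - 1:
        raise ValueError("scissile bond must lie inside the peptide")

    labels = {}
    # N-terminal side: P1 is adjacent to the bond, numbers grow toward the N-terminus.
    for k in range(1, n_before + 1):
        i = bond_after - k + 1
        if i >= 0:
            labels[f"P{k}"] = peptide[i]
    # C-terminal side: P1' is adjacent to the bond, numbers grow toward the C-terminus.
    for k in range(1, n_after + 1):
        i = bond_after + k
        if i < len(peptide):
            labels[f"P{k}'"] = peptide[i]
    return labels


# Made-up octapeptide (not a real cleavage site), cleaved after the Gln at index 4.
print(label_subsites("GLATQSVA", 4))
```

Calling the same function with `n_before=4, n_after=2` on a six-residue peptide labels exactly the P4 to P2′ positions.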
The minimum size for a good substrate of a 3C proteinase is a hexapeptide with the specific P4 to P2′ residues. Sequence preferences for certain residues that distinguish the 3C cleavage sites in the picornaviral polyprotein can also be found for the P4 to P2′ residues (Pallai et al., 1989; Weidner and Dunn, 1991; Bergmann, 1998 and references therein). Figure 5 shows a model of a hexapeptide substrate with the sequence of the primary cleavage site of the HAV polyprotein in the active site of HAV 3C. With the exception of the two hydrogen bonds that the carbonyl of the scissile peptide bond makes in the oxyanion hole, the main-chain interactions between the HAV 3C proteinase and the bound peptide are β-sheet interactions. In HAV 3C the substrate residues from P5 to P2 form an antiparallel β-sheet with β-strand eII. This form of substrate binding is common to all chymotrypsin-like proteinases. In HAV 3C there is also a parallel β-sheet interaction between the P4 to P2 residues of the substrate and enzyme residues from the extension of β-strand bII. This extension is not present in the smaller enteroviral 3C. Third, the P1′ and P2′ residues of the substrate of HAV 3C form an antiparallel β-interaction with the residues of the enzyme, which form the β-bulge in strand bI of the enzyme. The conformation of β-strand bI is different in polio 3C and presumably forces a different main-chain conformation on the substrate. This could explain the unique preference of polio 3C for a P1′ glycine residue. All 3C proteinases share the preference for a glutamine residue in the P1 position of a substrate. Sequence preferences for the other residues of a substrate from P4 to P2′ differ among the enzymes from the different genera. The major determinant of the primary specificity is a conserved histidine residue which is positioned inside the S1 pocket of the enzymes (Fig. 4). This histidine residue (191 in HAV and 161 in PV) is conserved throughout the 3C genes of all Picornaviruses (Ryan and Flint, 1997). Models of substrates bound to 3C proteinases agree that this histidine residue donates a hydrogen bond to the carbonyl oxygen atom of the side-chain of a glutamine residue in the S1 pocket. The environment of this histidine, which must contribute to the specific distinction between glutamine and glutamate, is, however, different in the crystal structures of HAV 3C and PV 3C. In HAV 3C His191 interacts, via a buried water molecule, with the side-chain of Glu 132. Bergmann et al. (1997) suggest that the deprotonation of this Glu 132, which is buried in the interior of the C-terminal domain, would be energetically expensive and unfavorable. Because the two residues interact, the protonation of His191 would also be unfavorable. In the entero- and rhinoviral enzymes a buried tyrosine residue performs a similar function (Mosimann et al., 1997). Comparison of the sequences of the natural cleavage sites of the HAV polyprotein reveals distinct sequence preferences for the residues in the P4, P2, and P1 positions of a 3C substrate (Bergmann, 1998). All natural cleavage sites in the HAV polyprotein have large hydrophobic residues (preferably Leu or Ile) in P4, serine or threonine in P2, and glutamine in P1.
The 3C proteinase from poliovirus has a sequence preference for a small, hydrophobic residue in P4, glutamine in P1, and glycine in the P1′ position of a peptide substrate. Bergmann et al. (1997) suggest that His145 in HAV 3C can form a hydrogen bond to a serine or threonine residue in P2 and is therefore responsible for the P2 specificity (Fig. 5). The β-ribbon that contributes His145 in HAV 3C is shorter in the poliovirus enzyme, so there is no equivalent residue in polio 3C. This correlates well with the fact that polio 3C does not show a sequence preference for a P2 residue. The hydrophobic S4 pocket of the 3C proteinases is a cleft formed by β-strands eII and fII and the β-ribbon formed by the extension of β-strands bII and cII (Fig. 5). It is quite large in the HAV 3C proteinase. In polio 3C several of the hydrophobic residues that form this pocket are substituted by larger ones (e.g., Ala141 and Val200 in HAV 3C correspond to Leu125 and Phe170 in polio 3C). Therefore, the S4 pocket in polio 3C is smaller and polio 3C prefers smaller, hydrophobic side-chains in P4.
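The subsite preferences described for the HAV and poliovirus enzymes can be written down as a small lookup table and used to scan a sequence for candidate sites. The Python sketch below is purely illustrative: the names `PREFERENCES`, `OFFSETS`, and `candidate_sites` and the test sequence are hypothetical, the HAV P4 set is narrowed to the two preferred residues (Leu, Ile), and the encoding of "small, hydrophobic" as Ala or Val for poliovirus P4 is an assumption rather than a value taken from the text.

```python
# Preference table transcribed from the text; the residue sets are simplifications.
PREFERENCES = {
    "HAV 3C": {"P4": set("LI"), "P2": set("ST"), "P1": set("Q")},
    # "Small, hydrophobic" is encoded as Ala/Val here: an assumption for illustration.
    "PV 3C": {"P4": set("AV"), "P1": set("Q"), "P1'": set("G")},
}

# Position of each subsite relative to the index of the P1 residue.
OFFSETS = {"P4": -3, "P3": -2, "P2": -1, "P1": 0, "P1'": 1, "P2'": 2}


def candidate_sites(sequence, enzyme):
    """Return 0-based indices of P1 residues whose neighbours match the table.

    A hit means only that the listed subsite preferences are satisfied;
    cleavage would occur after the returned residue.
    """
    prefs = PREFERENCES[enzyme]
    hits = []
    for p1 in range(len(sequence)):
        matches = True
        for subsite, allowed in prefs.items():
            i = p1 + OFFSETS[subsite]
            # Reject windows that run off either end of the sequence.
            if i < 0 or i >= len(sequence) or sequence[i] not in allowed:
                matches = False
                break
        if matches:
            hits.append(p1)
    return hits


# Made-up test sequence containing one site of each kind.
seq = "MGLATQSVAEAVKNQGPT"
print(candidate_sites(seq, "HAV 3C"), candidate_sites(seq, "PV 3C"))
```

Such a scan can only flag sequences that are compatible with the stated preferences. It says nothing about which sites are actually cleaved or in what order, since factors such as accessibility within the folded polyprotein also play a part.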

The larger 3C proteinase from HAV thus forms more extensive interactions with peptide substrates, both main-chain and side-chain. However, it is important to keep in mind that in the enteroviruses the 3CD precursor is a more active proteinase and is required for some of the cleavages of the polyprotein (Ypma-Wong et al., 1988). Bergmann et al. (1997) suggested a model for the interactions of 3C with the 3D and 3AB domains in a larger precursor. In this model the 3D part of a 3CD precursor would be in a position to interact with the residues in the P2 and P3 position of a substrate and could influence the proteolytic activity (top left of the 3C molecule in Figs. 2a and 3a). While the models of substrate binding to the 3C proteinases allow one to rationalize how the 3C proteinases recognize the specific cleavage sites within the polyprotein, it is not possible to explain the preference of some of the cleavage sites over others during the polyprotein processing. Presumably other factors besides the subsite specificity, such as the accessibility within the folded polyprotein, must play a part in the determination of the sequential polyprotein processing.

3. Inhibition
Inhibitors of the 3C proteinases usually combine a chemical functionality that covalently attaches to the nucleophilic thiol in the active site with other groups that target some of the specific interactions between the proteinase and its substrates. Typical cysteine proteinase inhibitors such as iodoacetamide, N-ethylmaleimide, epoxides, and aldehydes are also effective against the 3C proteinases (Malcolm, 1995). More promising inhibitors are the fluoromethylketones and γ-aminovinylsulfones (Rasnick, 1996). Some of the best inhibitors available to date combine the latter functionalities with a peptidic specificity address which mimics the natural peptide specificity.
A tetrapeptide fluoromethylketone inhibitor with the sequence Acetyl-Leu-Ala-Ala-Gln-FMK has been shown to be an effective inhibitor of the HAV 3C proteinase in vitro and in vivo (Morris et al., 1997). It covalently attaches to the HAV 3C proteinase and is capable of reducing the production of progeny virus in infected cells. Other functionalities that are now being investigated as inhibitors of the chymotrypsin-like cysteine proteinases include α,β-unsaturated carboxyl esters, β- and γ-lactones, lactams, isatins (2,3-dioxindoles), and triterpene sulfates (Skiles and McNeil, 1990; Brill et al., 1996). A cocrystal structure of the rhinovirus 3C proteinase with an isatin analog inhibitor in the active site shows that some of these compounds also covalently attach to the active site thiol and mimic the P1 specificity determinant of a natural substrate (Webber et al., 1996). While the details of the specific enzyme-substrate interactions gleaned from the crystal structures of 3C proteinases provide valuable information for the design of effective inhibitors, there is little experimental evidence for the mechanism of the chymotrypsin-like cysteine proteinases. This kind of information would, however, be of great value in identifying potential chemical functionalities and inhibitors.

B. THE ENTERO- AND RHINOVIRAL 2A PROTEINASE
The primary cleavage at the N-terminus of the 2A gene product that separates the structural and nonstructural proteins in the entero- and rhinoviruses is performed by the 2A gene product. Analysis of the sequence, mutational studies, and model-building studies have shown that the entero- and rhinoviral 2A proteinase is also a chymotrypsin-like cysteine proteinase but distinct from and only distantly related to 3C. The 2A proteinase is a smaller enzyme of 142 residues. It has been suggested, based on the results of mutational studies and structural models, that the active site of the 2A proteinases contains a catalytic triad of Cys106, His18, and Asp35, which more closely resembles that of the serine proteinases (Sommergruber et al., 1989; Hellen et al., 1991). The sequence alignments also seem to indicate a closer relationship of 2A to the small, bacterial serine proteinases (Sommergruber et al., 1997). The 2A proteinase has less stringent specificity requirements than the 3C proteinases (Skern et al., 1991). Presumably this reflects the fact that its major function is to perform an intramolecular cleavage at its own N-terminus. Indeed, it has been found that amino acid changes in a substrate affect a trans activity but not the intramolecular cis activity (Hellen et al., 1992). The 2A proteinase is a zinc protein and the tightly bound zinc ion presumably plays a structural role (Sommergruber et al., 1994b; Voss et al., 1995). In the model of Sommergruber et al. (1997) the N-terminal β-barrel has fewer strands and presumably the zinc ion is therefore needed to stabilize the N-terminal domain of the small 2A proteinase. A second function of the entero- and rhinoviral 2A proteinase is the specific cleavage of one of the proteins of the eukaryotic cap-binding complex, eIF4G (Sommergruber et al., 1994a; Haghighat et al., 1996).
This results in inhibition of the translation of capped, cellular mRNAs and preferential translation of the viral RNA. The 2A gene product of entero- and rhinoviruses also has functions in addition to its proteolytic activity (Belsham and Sonnenberg, 1996). There is evidence that 2A forms a complex with other viral proteins and is involved in viral RNA translation and other aspects of viral replication (Molla et al., 1993; Lu et al., 1995; Cuconati et al., 1998).

There is little experimental evidence for the catalytic mechanism of the 2A proteinase. The chemical functionalities which provide good inhibitors of the 3C proteinases, such as the fluoromethylketones, are also effective against other chymotrypsin-like cysteine proteinases. Because the specificity requirements of the 2A proteinases are less stringent, the design of specific inhibitors against this class of enzymes could be more difficult.

C. THE L PROTEINASE OF THE APHTHOVIRUSES
The aphthoviruses have another distinct proteolytic activity besides the 3C proteinase. The gene coding for the L proteinase is located at the N-terminus of the polyprotein and precedes the structural proteins (Ryan and Flint, 1997). The L proteinase cleaves its own C-terminus (Strebel and Beck, 1986). In vitro this cleavage can occur in cis and trans (Medina et al., 1993; Cao et al., 1995). The aphthoviral L proteinase also cleaves the cellular eIF4G and thus causes inhibition of the translation of capped, cellular mRNAs. This function of the aphthoviral L proteinase is similar to the one performed by the entero- and rhinoviral 2A proteinases. However, the cleavage of eIF4G by the L proteinase occurs in a different position (Kirchweger et al., 1994). In spite of these functions the L proteinase is not essential for the replication of the virus (Piccone et al., 1995a). The L proteinase is a cysteine proteinase and analysis of the sequence suggests that it belongs to the family of papain-like proteinases (Gorbalenya et al., 1991). The enzyme is present in two forms which differ in size and originate from two different initiation codons in the viral genome. The Lb proteinase of foot-and-mouth disease virus (FMDV) consists of 173 residues and the Lab proteinase is 28 residues longer. Sequence analysis, site-directed mutagenesis, and modeling studies identified the nucleophile, the general acid-base catalyst, and the third member of the catalytic triad as Cys51, His148, and Asp164, respectively (Gorbalenya et al., 1991; Piccone et al., 1995b; Roberts and Belsham, 1995; Skern et al., 1998). Sequence alignments also suggest that the side-chain of Asn46 contributes to the oxyanion hole (similar to Gln19 of papain) (Ryan and Flint, 1997). The C-terminus of the L proteinase has an extension, compared to the structure of papain, which has been predicted to adopt a helical conformation. Skern et al. (1998) suggest that this additional helix plays an important role in the mechanism of the intramolecular cleavage at the C-terminus of the L proteinase. Crystallization of the Lb proteinase from FMDV has been reported but the crystal structure has not been published (Guarné et al., 1996).

IV. CONCLUSIONS AND IMPLICATIONS FOR ANTIVIRAL STRATEGIES
The picornaviral 3C proteinases constitute an ideal target for the rational design of antiviral drugs. There is now a considerable amount of structural information for both enzymes and enzyme-inhibitor complexes. The details of the molecular interactions that are responsible for the specific substrate binding are reasonably well understood. Furthermore, the chymotrypsin-like cysteine proteinases constitute a unique class of enzymes with a distinct substrate specificity and are so far only found in +RNA viruses. Within these viruses the 3C proteinases perform a central and indispensable role during the viral life cycle and 3C proteinase inhibitors have the potential to limit the spread of viral infections (Morris et al., 1997). Neither the 2A nor the L proteinases are as attractive as targets for antiviral strategies. The activity of the L proteinase is apparently not as critical for viral replication. There is not as much structural information available for the entero-/rhinoviral 2A proteinase. The fact that the 2A proteinase activity is also less stringently specific could make the design of inhibitors more difficult. While there are many Picornaviruses, and viruses from related families, that cause disease in humans, few of these are considered important targets for the design of antiviral drugs. Rhinoviruses cause at least half of all common colds in humans. But because there are other families of unrelated viruses that cause upper respiratory tract infections, which are essentially indistinguishable, effective drugs against rhinoviruses would only be useful in combination with simple analytical procedures to unambiguously identify rhinoviral infections. Such simple analytical procedures are not available at present (Couch, 1996). There are safe and effective vaccines available against poliovirus and HAV.
As a result of extensive worldwide vaccination, the incidence of poliomyelitis has been decreasing and currently there are realistic efforts underway to eradicate the disease completely. The introduction of an effective vaccine against HAV was very recent and it is too early to predict its effect. Widespread vaccination against HAV is currently not planned as hepatitis A is usually not a life-threatening disease. Because co-infections of chronic carriers of hepatitis B, C, and G with HAV appear to be dangerous, the observed increase in chronic infections with other forms of hepatitis may have an influence on future strategies to control hepatitis A. It would be desirable to have antiviral drugs available against the more severe enteroviral infections. While most of the enteroviral infections are rare, they can have serious consequences. Because enteroviral infections occur infrequently, this is not considered an economically important target. Several other families of +RNA viruses also carry 3C or 3C-like proteinases (Wirblich et al., 1995; Martin Alonso et al., 1996; Tibbles et al., 1996). Most notably, the Corona- and Caliciviridae cause upper respiratory tract infections and intestinal infections in humans. The viruses of these families are less well studied than the Picornaviruses but are distantly related to them. The design of 3C proteinase inhibitors would in all likelihood also be useful toward the development of antiviral drugs against the 3C-like proteinases of the viruses of these families. At present the mechanisms by which some +RNA viruses, most notably the enteroviruses, can trigger severe autoimmune diseases are not well understood. It is also questionable whether inhibition of viral replication would prevent the disastrous consequences of the immune response at a later stage of an infection. Therefore, it is not clear whether antiviral drugs would be useful in the prevention of these diseases. In conclusion, there is a wealth of experimental information available for the best-studied examples of the viruses of the Picornaviridae. This information provides an opportunity to design inhibitors against the viral 3C proteinase. Effective inhibitors of the picornaviral 3C proteinase have the potential to become effective antiviral drugs against human diseases such as the common cold, hepatitis A, enteroviral infections, and diseases caused by related +RNA viruses.

REFERENCES
Allaire, M., Chernaia, M. M., Malcolm, B. A., and James, M. N. G. (1994). Picornaviral 3C cysteine proteinases have a fold similar to chymotrypsin-like serine proteinases. Nature 369, 72-76.
Andino, R., Rickhof, G. E., Achacoso, P. L., and Baltimore, D. (1993). Poliovirus RNA synthesis utilizes an RNP complex formed around the 5'-end of viral RNA. EMBO J. 12, 3587-3598.
Bazan, J. F., and Fletterick, R. J. (1988). Viral cysteine proteinases are homologous to the trypsin-like family of serine proteinases: Structural and functional implications. Proc. Natl. Acad. Sci. USA 85, 7872-7876.
Belsham, G. J., and Sonnenberg, N. (1996). RNA-protein interactions in regulation of picornavirus RNA translation. Microbiol. Rev. 60, 499-511.
Bergmann, E. M. (1998). Hepatitis A virus picornain 3C. In "Handbook of Proteolytic Enzymes" (A. D. Barrett, N. J. Rawlings, and F. Woesner, Eds.). Academic Press, London.
Bergmann, E. M., Mosimann, S. C., Chernaia, M. M., Malcolm, B. A., and James, M. N. G. (1997). The refined crystal structure of the 3C gene product from hepatitis A virus: Specific proteinase activity and RNA recognition. J. Virol. 71, 2436-2448.
Bienz, K., Egger, D., Rasser, Y., and Bossart, W. (1983). Intracellular distribution of poliovirus proteins and the induction of virus-specific cytoplasmic structures. Virology 131, 39-48.
Bienz, K., Egger, D., Troxler, M., and Pasamontes, L. (1990). Structural organization of poliovirus RNA replication is mediated by viral proteins of the P2 genomic region. J. Virol. 64, 1156-1163.
Bode, W., and Huber, R. (1992). Natural protein proteinase inhibitors and their interactions with proteinases. Eur. J. Biochem. 204, 433-451.
Borman, A. M., and Kean, K. M. (1997). Intact eukaryotic initiation factor 4G is required for hepatitis A virus internal initiation of translation. Virology 237, 129-136.
Brill, B. M., Kati, W. M., Montgomery, D., Karwowski, J. P., Humphrey, P. E., Jackson, M., Clement, J. J., Kadam, S., Chen, R. H., and McAlpine, J. B. (1996). Novel triterpene sulfates from Fusarium compactum using a rhinovirus 3C protease inhibitor screen. J. Antibiot. 49, 541-546.
Cao, X., Bergman, I. E., Füllkrug, R., and Beck, E. (1995). Functional analysis of the two alternative initiation sites of foot-and-mouth disease virus. J. Virol. 69, 560-563.
Carthy, C. M., Yang, D., Anderson, D. R., Wilson, J. E., and McManus, B. M. (1997). Myocarditis as systemic disease: New perspectives on pathogenesis. Clin. Exp. Pharmacol. Physiol. 24, 997-1003.
Chothia, C. (1984). Principles that determine the structure of proteins. Annu. Rev. Biochem. 53, 537-572.
Couch, R. B. (1996). Rhinoviruses. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia.
Cuconati, A., Xiang, W., Lahser, F., Pfister, T., and Wimmer, E. (1998). A protein linkage map of the P2 nonstructural proteins of poliovirus. J. Virol. 72, 1297-1307.
Davis, G. J., Wang, Q. M., Cox, G. A., Johnson, R. B., Wakulchik, M., Datson, C. A., and Villarreal, E. C. (1997). Expression and purification of recombinant rhinovirus 14 3CD proteinase and its comparison to the 3C proteinase. Arch. Biochem. Biophys. 346, 125-130.
Donnelly, M. L. L., Gani, D., Flint, M., Monaghan, S., and Ryan, M. D. (1997). The cleavage activity of aphtho- and cardiovirus 2A proteins. J. Gen. Virol. 78, 13-21.
Dougherty, W. G., and Semler, B. L. (1993). Expression of virus-encoded proteinases: Functional and structural similarities with cellular enzymes. Microbiol. Rev. 57, 781-822.
Fujinaga, M., Delbaere, L. T. J., Brayer, G., and James, M. N. G. (1987). Refined crystal structure of α-lytic protease at 1.7 Å resolution. J. Mol. Biol. 184, 479-502.
Gamarnik, A. V., and Andino, R. (1997). Two functional complexes formed by KH domain containing proteins with the 5' noncoding region of poliovirus RNA. RNA 3, 882-892.
Gorbalenya, A. E., and Snijder, E. J. (1996). Viral cysteine proteinases. Perspect. Drug Disc. Design 6, 64-86.
Gorbalenya, A. E., Blinov, V. M., and Donchenko, A. P. (1986). Poliovirus-encoded proteinase 3C: A possible evolutionary link between cellular serine and cysteine proteinase families. FEBS Lett. 194, 253-257.
Gorbalenya, A. E., Donchenko, A. P., Blinov, V. M., and Koonin, E. V. (1989). Cysteine proteinases of positive strand RNA viruses and chymotrypsin-like serine proteinases: A distinct protein superfamily with a common structural fold. FEBS Lett. 243, 103-114.
Gorbalenya, A. E., Koonin, E. V., and Lai, M. M. C. (1991). Putative papain-related thiol protease of positive strand RNA viruses: Identification of rubi- and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, alpha- and coronaviruses. FEBS Lett. 288, 201-205.
Guarné, A., Kirchweger, R., Verdaguer, R., Liebig, H. D., Blaas, D., Skern, T., and Fita, I. (1996). Crystallization and preliminary X-ray diffraction studies of the Lb proteinase of foot-and-mouth disease virus. Prot. Sci. 5, 1931-1933.
Haghighat, A., Svitkin, Y., Novoa, I., Küchler, E., Skern, T., and Sonnenberg, N. (1996). The eIF4G-eIF4E complex is the target for direct cleavage by the rhinovirus 2A proteinase. J. Virol. 70, 8444-8450.
Hämmerle, T., Molla, A., and Wimmer, E. (1992). Mutational analysis of the proposed FG loop of poliovirus proteinase 3C identified amino acids that are necessary for 3CD cleavage and might be determinants of a function distinct from proteolytic activity. J. Virol. 66, 6028-6034.
Hanecak, R., Semler, B. L., Ariga, H., Anderson, C. W., and Wimmer, E. (1984). Expression of a cloned gene segment of poliovirus in E. coli: Evidence for autocatalytic production of the viral proteinase. Cell 37, 1063-1073.
Harmon, S. A., Updike, W., Xi-Ju, J., Summers, D. F., and Ehrenfeld, E. (1992). Polyprotein processing in cis and in trans by hepatitis A virus 3C protease cloned and expressed in E. coli. J. Virol. 66, 5242-5247.
Harris, K. S., Xiang, W., Alexander, L. S., Lane, W. S., Paul, A. V., and Wimmer, E. (1994). Interactions of poliovirus polypeptide 3CDpro with the 5' and 3' termini of the poliovirus genome. J. Biol. Chem. 269, 27004-27014.
Hellen, C. U. T., Fache, M., Kräusslich, H. G., Lee, C., and Wimmer, E. (1991). Characterization of poliovirus 2A proteinase by mutational analysis: Residues required for autocatalytic activity are essential for induction of eukaryotic initiation factor 4F polypeptide p220. J. Virol. 65, 4226-4231.
Hellen, C. U. T., Lee, C., and Wimmer, E. (1992). Determinants of substrate recognition by poliovirus 2A proteinase. J. Virol. 66, 3330-3338.
Hollinger, F. B., and Ticehurst, J. R. (1996). Hepatitis A virus. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia.
James, M. N. G. (1993). Convergence of active-centre geometries among the proteolytic enzymes. In "Proteolysis and Protein Turnover" (J. S. Bond and A. J. Barrett, Eds.). Portland Press, London.
Jia, X.-Y., Summers, D. F., and Ehrenfeld, E. (1993). Primary cleavage of the HAV capsid protein precursor in the middle of the proposed 2A coding region. Virology 193, 515-519.
Kay, J., and Dunn, B. M. (1990). Viral proteinases: weakness in strength. Biochim. Biophys. Acta 1048, 1-18.
Kirchweger, R., Ziegler, E., Lamphear, B. J., Waters, D., Liebig, H. D., Sommergruber, W., Sobrino, F., Hohenadl, C., Blaas, D., Rhoads, R. E., and Skern, T. (1994). Foot-and-mouth disease virus leader proteinase: Purification of the Lb form and determination of its cleavage site on eIF-4γ. J. Virol. 68, 5677-5684.
Kräusslich, H.-G., and Wimmer, E. (1988). Viral proteinases. Ann. Rev. Biochem. 57, 701-754.
Kusov, Y. Y., and Gauss-Müller, V. (1997). In vitro RNA binding of the hepatitis A virus proteinase 3C (HAV 3Cpro) to secondary structure elements within the 5' terminus of the HAV genome. RNA 3, 291-302.
Lamphear, B. J., Yan, R., Yang, F., Waters, D., Liebig, H.-D., Klump, H., Küchler, E., Skern, T., and Rhoads, R. E. (1993). Mapping the cleavage site in protein synthesis initiation factor eIF-4γ of the 2A proteases from human coxsackie virus and rhinovirus. J. Biol. Chem. 268, 19200-19203.
Leong, L. E. C., Walker, P. A., and Porter, A. G. (1993). Human rhinovirus 14 protease 3C (3Cpro) binds specifically to the 5'-noncoding region of the viral RNA. J. Biol. Chem. 268, 25735-25739.
Long, L. A., Orr, D. C., Cameron, J. M., Dunn, B. M., and Kay, J. (1989). A consensus sequence for substrate hydrolysis by rhinovirus 3C proteinase. FEBS Lett. 258, 75-78.
Lu, H. H., Li, X., Cuconati, A., and Wimmer, E. (1995). Analysis of picornavirus 2A (pro) proteins: Separation of proteinase from translation and replication functions. J. Virol. 69, 7445-7452.
Malcolm, B. A. (1995). The picornaviral 3C proteinases: Cysteine nucleophiles in serine proteinase folds. Prot. Sci. 4, 1439-1445.
Martin Alonso, J. M., Casais, R., Boga, J. A., and Parra, F. (1996). Processing of rabbit hemorrhagic disease virus polyprotein. J. Virol. 70, 1261-1265.
Martin, A., Escriou, N., Chao, S. F., Girard, M., Lemon, S. M., and Wychowski, C. (1995). Identification and site-directed mutagenesis of the primary (2A/2B) cleavage site of the hepatitis A virus polyprotein: Functional impact on the infectivity of HAV RNA transcripts. Virology 213, 213-222.
Matthews, D. A., Smith, W. W., Ferre, R. A., Condon, B., Budahazi, G., Sisson, W., Villafranca, J. E., Janson, C. A., McElroy, H. E., Gribskov, C. L., and Worland, S. (1994). Structure of human rhinovirus 3C protease reveals a trypsin-like polypeptide fold, RNA-binding site and means for cleaving precursor polyprotein. Cell 77, 761-771.

Medina, M., Domingo, E., Brangwun, J. K., and Belsham, G. J. (1993). The two species of the foot-and-mouth disease virus leader protein expressed individually, exhibit the same activities. Virology 194, 355-359.
Melnick, J. L. (1996). Enteroviruses: Polioviruses, coxsackie viruses, echoviruses, and newer enteroviruses. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia.
Miller, S. D., Vanderlugt, C. L., Smith-Begolka, W., Pao, W., Yauch, R. L., Neville, K. L., Katz-Levy, Y., Carrizosa, A., and Kim, B. S. (1997). Persistent infection with Theiler's virus leads to CNS autoimmunity via epitope spreading. Nat. Med. 3, 1133-1136.
Molla, A., Paul, A. V., Schmid, M., Jang, S. K., and Wimmer, E. (1993). Studies on dicistronic polioviruses implicate viral proteinase 2Apro in RNA replication. Virology 196, 739-747.
Morris, T. S., Frormann, S., Shechosky, S., Lowe, C., Lall, M. S., Gauss-Müller, V., Purcell, R. H., Emerson, S. U., Vederas, J. C., and Malcolm, B. A. (1997). In vitro and ex vivo inhibition of hepatitis A virus 3C proteinase by a peptidyl monofluoromethyl ketone. Bioorg. Med. Chem. 5, 797-807.
Mosimann, S. C., Chernaia, M. M., Sia, S., Plotch, S., and James, M. N. G. (1997). Refined X-ray crystallographic structure of the poliovirus 3C gene product. J. Mol. Biol. 273, 1032-1047.
Nicklin, M. J., Harris, K. S., Pallai, P. V., and Wimmer, E. (1988). Poliovirus proteinase 3C: Large-scale expression, purification and specific cleavage activity on natural and synthetic substrates in vitro. J. Virol. 62, 4586-4593.
Pallai, P. V., Burkhardt, F., Shoog, M., Schreiner, K., Bax, P., Cohen, K. A., Hansen, G., Palladino, D. E., Harris, K. S., Nicklin, M. J., and Wimmer, E. (1989). Cleavage of synthetic peptides by purified poliovirus 3C proteinase. J. Biol. Chem. 264, 9738-9741.
Palmenberg, A. C. (1990). Proteolytic processing of picornaviral polyprotein. Annu. Rev. Microbiol. 44, 602-623.
Palmenberg, A. C., and Rueckert, R. R. (1982). Evidence for intramolecular self-cleavage of picornaviral replicase precursors. J. Virol. 41, 244-249.
Palmenberg, A. C., Parks, G. D., Hall, D. J., Ingraham, R. H., Seng, T. W., and Pallai, P. V. (1992). Proteolytic processing of the cardioviral P2 region: Primary 2A/2B cleavage in clone-derived precursors. Virology 190, 754-762.
Parsley, T. B., Towner, J. S., Blyn, L. B., Ehrenfeld, E., and Semler, B. L. (1997). Poly (rC) binding protein 2 forms a ternary complex with the 5'-terminal sequences of poliovirus RNA and the viral 3CD proteinase. RNA 3, 1124-1134.
Perona, J. J., and Craik, C. S. (1995). Structural basis of substrate specificity in the serine proteinases. Prot. Sci. 4, 337-360.
Piccone, M. E., Rieder, E., Mason, P. W., and Grubman, M. J. (1995a). The foot-and-mouth disease leader proteinase gene is not required for viral replication. J. Virol. 69, 5376-5382.
Piccone, M. F., Zellner, M., Kumosinski, T. F., Mason, P. W., and Grubman, M. J. (1995b). Identification of the active-site residues of the L proteinase of foot-and-mouth disease virus. J. Virol. 69, 4950-4956.
Porter, A. G. (1993). Picornavirus nonstructural proteins: Emerging roles in virus replication and inhibition of host cell functions. J. Virol. 67, 6917-6921.
Rasnick, D. (1996). Small synthetic inhibitors of cysteine proteinases. Perspect. Drug Disc. Design 6, 47-63.
Read, R. J., and James, M. N. G. (1986). Introduction to the Protein Inhibitors: X-ray Crystallography. In "Proteinase Inhibitors" (A. J. Barrett and G. Salvesen, Eds.). Elsevier, Amsterdam.
Roberts, P. J., and Belsham, G. J. (1995). Identification of critical amino acids within the foot-and-mouth disease virus leader protein, a cysteine protease. Virology 213, 140-146.
Rueckert, R. R. (1996). Picornaviridae: The Viruses and their Replication. In "Fields Virology" (B. N. Fields, D. M. Knipe, P. M. Howley, R. M. Channock, J. L. Melnick, T. P. Monath, B. Roizmann, and S. E. Straus, Eds.). Lippincott-Raven, Philadelphia.

162

Ernst M. Bergmann and Michael N. G. James

Ryan, M. D., and Flint, M. (1997). Virus-encoded proteinases of the picornavirus super-group. J. Gen. Virol. 78, 699-723. Schechter, I., and Berger, A. (1967). On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 27, 157-162. Schultheiss, T., Kusov, Y. Y., and Gauss-MOiler, V. (1994). Proteinase 3C of hepatitis A virus (HAV) cleaves the HAV polyprotein P2-P3 at all sites including VP1/2A and 2A/2B. Virology 198, 275-281. Schultheiss, T., Emerson, S. U., Purcell, R. H., and Gauss-MOiler, V. (1995a). Polyprotein processing in echovirus 22--A first assessment. Biochem. Biophys. Res. Commun. 219, 1120-1127. Schultheiss, T., Sommergruber, W., Kusov, Y. Y., and Gauss-MOiler, V. (1995b). Cleavage specificity of purified recombinant hepatitis A virus 3C proteinase on natural substrates.J. Virol. 69,17271733. Skern, T., Fita, I., and Guarn6, A. (1998). A structural model of picornavirus leader proteinase based on papain and bleomycin hydrolase. J. Gen. Virol. 79, 301-307. Skiles, J. W., and McNeil, D. (1990). Spiro indolinone/3-1actams, inhibitors of poliovirus and rhinovirus 3C-proteinases. Tetrahedr. Lett. 31, 7277-7280. Sommergruber, W., Zorn, M., Blaas, D., Fessel, F., Volkmann, E, Mauser-Fogy, I., Pallai, E, Merluzzi, V., Matteo, M., Skern, T., and K~chler, E. (1989). Polypeptide 2A of human rhinovirus type 2: Identification as a proteinase and characterization by mutational analysis. Virology 169, 68-77. Sommergruber, W., Ahorn, H., Klump, H., Zoephel, A., Fessl, F., Blaas, D., KOchler, E., Liebig, H.-D., and Skern, T. (1994a). 2A proteinases of coxsackie- and rhinovirus cleave peptides derived from eIF-4y via a common recognition motif. Virology 198, 741-745. Sommergruber, W., Casari, G., Fessl, F., Seipelt, J., and Skern, T. (1994b). The 2A proteinase of human rhinovirus is a zinc containing enzyme. Virology 204, 815-818. Sommergruber, W., Seipelt, J., Fessl, F., Skern, T., Liebig, H.-D., and Casari, G. (1997). 
Mutational analyses support a model for the HRV2 2A proteinase. Virology 234, 203-214. Steinmann, L., and Conlon, E (1997). Viral damage and the breakdown of self-tolerance. Nature Med. 3, 1085-1087. Storer, A. C., and M6nard, R. (1994). Catalytic mechanism in papain family of cysteine peptidases. Meth. Enzymol. 244,486-500. Strebel, K., and Beck, E. (1986). A second proteinase of foot-and-mouth disease virus. J. Virol. 58, 893-899. Teterina, N. L., Bienz, K., Egger, D., Gorbalenya, A. E., and Ehrenfeld, E. (1997a). Induction of intracellular membrane rearrangements by HAV proteins 2C and 2BC. Virology 237, 66-77. Teterina, N. L., Gorbalenya, A. E., Egger, D., Bienz, K., and Ehrenfeld, E. (1997b). Poliovirus 2C protein determinants of membrane binding and rearrangements in mammalian cells. J. Virol. 71, 8962-8972. Tibbles, K. W., Brierley, I., Cavanagh, D., and Brown, T. D. K. (1996). Characterization in vitro of an autocatalytic processing activity associated with the predicted 3C-like proteinase domain of the Coronavirus avian infectious bronchitis virus. J. Virol. 70, 1923-1930. Voss, T., Meyer, R., and Sommergruber, W. (1995). Spectroscopic characterization of rhinoviral protease 2A: Zn is essential for structural integrity. Prot. Sci. 4, 2526-2531. Walker, E A., Leong, L. E. C., and Porter, A. G. (1995). Sequence and structural determinants of the interaction between the 5'-noncoding region of picornavirus RNA and rhinovirus protease 3C.J. Biol. Chem. 270, 14510-14516. Webber, S. E., Tikhe, J., Worland, S. T., Fuhrmann, S. A., Hendrickson, T. F., Matthews, D. A., Love, R. A., Patick, A. K., Meador, J. W., Ferre, E A., Brown, E. L., Delisle, D. M., Ford, C. E., and Binford, S. L. (1996). Design synthesis and evaluation of nonpeptide inhibitors of human rhinovirus 3C proteinase. J. Med. Chem. 39, 5072-5882.

Picornaviral Proteinases

163

Weidner, J. R., and Dunn, B. M. (1991). Development of synthetic peptide substrates for the poliovirus 3C proteinase. Arch. Biochem. Biophys. 286,402-408. Whiting, A. K., and Peticolas, W. L. (1994). Details of the acyl-enzyme intermediate and the oxyanion hole in serine protease catalysis. Biochemistry 33,552-561. Wimmer, E. (1982). Genome linked proteins of viruses. Cell 28, 199-201. Wimmer, E., Hellen, C. U. T., and Cao, X. (1993). Genetics of poliovirus. Ann. Rev. Genet. 27, 353-436. Wirblich, C., Sibilia, M., Boniotti, M. B., Rossi, C., Thiel, H.-J., and Meyers, G. (1995). 3C-like protease of rabbit hemorrhagic disease virus: identification of cleavage sites in the ORF1 polyprotein and analysis of cleavage specificity. J. Virol. 69, 7159-7169. Xiang, W. S., Harris, K. S., Alexander, L., and Wimmer, E. (1995). Interaction between the 5'terminal cloverleaf and 3AB/3CD Pr~of poliovirus is essential for RNA application. J. Virol. 69, 3658-3667. Yalamanchili, D., Weidman, K., and Dasgupta, A. (1997). Cleavage of transcriptional activator Oct-1 by poliovirus encoded protease 3C pro. Virology 239, 176-185. Ypma-Wong, M. F., Dewalt, E G., Johnson, V. H., Lamb, J. G., and Semler, B. L. (1988). Protein 3CD is the major poliovirus proteinase responsible for cleavage of the P1 capsid precursor. Virology, 166, 265-270.

Proteases as Drug Targets for the Treatment of Malaria

COLIN BERRY

Cardiff School of Biosciences, Cardiff University, Cardiff CF1 3US, Wales, UK

I. Introduction
II. Proteases in Malaria Parasites
III. Current Antimalarial Agents with Effects in Parasite Proteolytic Enzymes
IV. Concluding Remarks
References

I. INTRODUCTION

A. OCCURRENCE OF MALARIA

Every minute, approximately four children die from malaria. Globally, this culminates in 3 million deaths per year from up to 500 million clinical cases and it has been estimated that more than 2 billion people, over 40% of the world's population, are at risk from the disease (Najera and Hempel, 1996). This means that malaria causes almost as many fatalities each year as the total AIDS death toll over the past 15 years. Malaria occurs throughout the tropics (Fig. 1), limited by the distribution of mosquitoes of the genus Anopheles, the intermediate vectors, which spread the parasites from person to person. The incidence of this disease is now increasing owing to several factors including (1) increased resistance of the parasites to current antimalarial drugs, (2) increased resistance of mosquitoes to insecticides, and (3) increased size of endemic regions (e.g., because of deforestation and movement of populations into cities).

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

FIGURE 1 Regions of the world where malaria is endemic. Legend: affected regions.

The problem of malaria can be tackled on two fronts: (1) by attacking the mosquito vector to reduce rates of transmission and (2) by prevention and treatment of infection in the human host. Measures such as insecticide spraying and the use of insecticide-impregnated bednets have produced some reduction in transmission although insect resistance is a developing problem. The prospects for human immunization against malaria have received much attention but suitable vaccines have not been produced as yet. At present, therefore, our defence against the parasite relies on the use of several classes of prophylactic or curative drugs. The mechanism of action of some of these drugs is known (e.g., the inhibitors of folate synthesis, the sulfonamides and sulfones, and the folate antagonists proguanil and pyrimethamine). However, the activity of many agents, including chloroquine, the mainstay of antimalarial chemotherapy for over 40 years, is poorly understood. Unfortunately, the effectiveness of our current arsenal of antimalarial compounds is increasingly compromised by the spread of resistant parasites. This makes it essential that new targets are sought to intervene in crucial biochemical pathways in the parasite. Inhibitors for the specific blockade of such targets may then be designed to develop the next generation of drugs to combat this scourge of human health.

B. MALARIAL LIFE CYCLE

Malaria in humans is caused by four species of protozoan parasites in the genus Plasmodium. Plasmodium ovale and Plasmodium malariae are relatively uncommon infections. Plasmodium vivax and Plasmodium falciparum are the most common and P. falciparum accounts for by far the greatest number of deaths. The life cycle of the malarial parasites is complex and has many distinct phases. In humans, where antimalarial drug intervention must occur, these stages can be summarized briefly as follows. With a bite from an infected Anopheles mosquito, the parasite in its sporozoite stage is injected into the human host. In this form, the parasite migrates through the blood stream to the liver where it invades hepatocytes to begin the intrahepatic phase of the cycle. All P. falciparum and P. malariae cells and many of the cells of P. vivax and P. ovale then develop into hepatic trophozoites, which grow and divide to release the merozoite stage into the blood stream. However, P. vivax and P. ovale have an extra life cycle stage that may occur in the liver and some of the parasites of these species may enter the dormant hypnozoite stage. The hypnozoites may remain in the liver for months or years after initial infection before they, in turn, develop to release merozoites into the blood. The merozoites invade red blood cells to initiate the intraerythrocytic phase of the life cycle. After invasion, the parasites are termed "ring stage." These grow and develop into trophozoites that divide to form schizonts, burst the red blood cells, and release more daughter merozoites, which can in turn invade further erythrocytes. It is during the erythrocytic cycle of infection that symptoms of malaria first appear. The lysis of red cells to release merozoites tends to become synchronized with the consequent release of pyrogens, giving rise to the characteristic cyclic fevers of this disease. Following release from red blood cells, a few merozoites go on to develop into male and female gametocytes that cannot develop further in the human host and will die if they are not taken up by another Anopheles mosquito to complete the life cycle.

II. PROTEASES IN MALARIA PARASITES

Like other eukaryotes, malarial parasites contain a range of proteolytic enzymes that play important roles in functions such as protein processing. In addition, the complex life cycle of these protozoa gives rise to a variety of specialized functions that may be mediated by endo- and exopeptidases. These functions include processing of major parasite surface antigens, host invasion, morphological changes between the distinct stages of the parasite life cycle, digestion of host-derived proteins to obtain nutrients for growth, and release from host cells (reviewed by Schrevel et al., 1990; McKerrow et al., 1993). All such specific parasite processes are potential targets for new antimalarial interventions and therefore parasite proteases have received much attention. To date, the proteolysis of hemoglobin has been examined most intensively and the findings from such studies are reviewed below.


A. PROTEINASES AND HEMOGLOBIN DEGRADATION

During the intraerythrocytic phase, parasites engulf red blood cell cytoplasm, which is then transported via a double-membrane-enclosed cytosome to the food vacuole (also known as the digestive vacuole), which has a single membrane. (For an excellent review of the metabolic role of the food vacuole, see Olliaro and Goldberg, 1995.) Within the latter lysosome-like acidic organelle, breakdown of the major red blood cell protein hemoglobin occurs to provide nutrients for parasite growth and development. In an established infection, this digestive process occurs on a very large scale such that 20% of red cells may be parasitized with 75% of host cell hemoglobin degraded. This can lead to the destruction of an estimated 100 g of hemoglobin during each cycle of erythrocyte infection (Goldberg et al., 1990). Heme is released as a by-product of hemoglobin destruction. Free heme lyses malaria parasites (Orjih et al., 1981) so the parasite detoxifies the high concentrations accumulated in the food vacuole by polymerizing the heme to form the so-called malaria pigment hemozoin. Three proteinases have been isolated from the food vacuoles of P. falciparum (Goldberg et al., 1991; Gluzman et al., 1994), one cysteine proteinase (falcipain) and two aspartic proteinases (plasmepsin I, EC 3.4.23.38, and plasmepsin II, EC 3.4.23.39). The individual roles of these enzymes in the pathway are still controversial but inhibition of either cysteine or aspartic proteinase activity has been shown to cause growth inhibition and death of P. falciparum in red blood cells in culture (Rosenthal et al., 1988; Francis et al., 1994; Moon et al., 1997). As a result of the success of these and related studies, the Plasmodium cysteine and aspartic proteinases have been accepted as important drug targets by the World Health Organization (WHO, 1996).
Proteolysis in the food vacuole leads to the accumulation of a series of discrete peptides (Kolakovich et al., 1997). It appears, therefore, that a specific system must exist to transport these peptide products from the food vacuole to the parasite cytoplasm where further degradation to individual amino acids may be mediated (at least in part) by a cytosolic aminopeptidase (Vander Jagt et al., 1984; Curley et al., 1994; Kolakovich et al., 1997). The roles of each of the food vacuole enzymes and the aminopeptidase will be discussed below with reference to their potential as targets for novel antimalarial intervention.
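As a rough plausibility check on the figure of about 100 g of hemoglobin destroyed per cycle, the estimate can be reproduced from the parasitemia and degradation fractions quoted above. The sketch below assumes a typical adult blood volume of 5 L and a hemoglobin concentration of 150 g/L; these two values are illustrative assumptions and are not taken from Goldberg et al. (1990).

```python
# Rough check of the hemoglobin consumed per erythrocytic cycle.
# ASSUMPTIONS (not from the text): adult blood volume and hemoglobin
# concentration are typical textbook values chosen for illustration.
BLOOD_VOLUME_L = 5.0          # assumed adult blood volume (litres)
HEMOGLOBIN_G_PER_L = 150.0    # assumed hemoglobin concentration (g/L)

# Fractions quoted in the text for an established infection.
FRACTION_CELLS_PARASITIZED = 0.20
FRACTION_HEMOGLOBIN_DEGRADED = 0.75


def hemoglobin_degraded_g(blood_volume_l, hb_g_per_l, parasitized, degraded):
    """Return grams of hemoglobin degraded in one cycle of infection."""
    total_hb_g = blood_volume_l * hb_g_per_l
    return total_hb_g * parasitized * degraded


estimate_g = hemoglobin_degraded_g(
    BLOOD_VOLUME_L,
    HEMOGLOBIN_G_PER_L,
    FRACTION_CELLS_PARASITIZED,
    FRACTION_HEMOGLOBIN_DEGRADED,
)
print(f"Estimated hemoglobin degraded per cycle: {estimate_g:.1f} g")
```

Under these assumptions the estimate comes to a little over 110 g, which is of the same order as the 100 g cited in the text.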

B. FALCIPAIN: A MALARIAL CYSTEINE PROTEINASE

1. Effect of Cysteine Proteinase Inhibitors on Parasites

To investigate the role of various proteinases in hemoglobin degradation within trophozoite stage parasites, Rosenthal et al. (1988) studied the effect of inhibitors on P. falciparum growing in red blood cells in culture. Food vacuoles with abnormal morphology were observed in parasites after 6 h incubation with the cysteine proteinase inhibitors leupeptin (20-100 µM) or L-trans-epoxysuccinylleucylamido-(4-guanidino)-butane (E64, 140 µM). Parasite differentiation and multiplication were also inhibited so that very few parasites progressed to reinvade further red blood cells to form new ring stages. Further examination of the vacuoles showed that they were completely filled with undigested erythrocyte cytoplasm (Rosenthal et al., 1988), suggesting a crucial role for cysteine proteinase(s) in the digestive pathway. The food vacuole abnormality was a specific consequence of the inhibition of cysteine proteinase activity rather than a general symptom of toxicity in Plasmodium, since other compounds including aspartic proteinase inhibitors (Rosenthal et al., 1988) and antimalarial drugs including chloroquine, mefloquine, and quinine (Rosenthal, 1995), did not produce the same morphological changes.

2. Possible Roles for Falcipain in Hemoglobin Degradation

That falcipain plays a crucial role in hemoglobin degradation and that its inhibition is fatal to malarial parasites is not in doubt. The precise role of the enzyme in the degradative pathway is, however, the subject of some speculation. Rosenthal and his co-workers have proposed that falcipain is involved in the early stages of the catabolic pathway and have proposed a role in initial hemoglobin denaturation and heme release (Gamboa de Dominguez and Rosenthal, 1996). This may be supported by the findings of Asawamahasakda et al. (1994), who showed that E64 inhibited the formation of hemozoin to a greater extent than the aspartic proteinase inhibitor pepstatin.
Subsequent involvement of falcipain in the first stages of globin digestion has also been inferred from the accumulation of undigested globin in the food vacuoles of parasites treated with cysteine proteinase inhibitors (Rosenthal et al., 1991; Rosenthal, 1995). Support for this model is derived from experiments indicating that falcipain is able to degrade native hemoglobin (Salas et al., 1995). These assays were performed in the presence of reducing agents (typically 10 mM dithiothreitol) and this was considered to mimic the reducing effects of glutathione, which might be provided by the erythrocyte cytoplasm and which also stimulates the activity of falcipain (Rosenthal et al., 1988). Francis et al. (1996) also showed that naturally occurring falcipain isolated from parasite food vacuoles could degrade native hemoglobin in the presence of reducing agents. In contrast, in the absence of reducing agents, native hemoglobin is not digested by falcipain, although acid-denatured globin is still broken down and the specific peptide fragments produced have been characterized (Gluzman et al., 1994). This suggests that although reducing agents do produce some increase in activity of falcipain (Rosenthal et al., 1988), their major importance in the cleavage of native hemoglobin may be due to their effects on the structure of the substrate (Francis et al., 1996). Indeed, Gamboa de Dominguez and Rosenthal (1996) and Francis et al. (1996) have shown that at the pH of the food vacuole (pH 5.0 to 5.4) hemoglobin is denatured only in the presence of reducing agents. The reducing potential of the food vacuole is therefore an important factor which influences the susceptibility of hemoglobin to attack by falcipain. Salas et al. (1995) proposed that ingested erythrocyte glutathione may provide the reducing environment necessary for hemoglobin to be cleaved. However, Francis et al. (1996) suggested that the levels of catalase present in the food vacuoles may be sufficient to protect hemoglobin from thiol-mediated denaturation, a process which is peroxide mediated. In the latter model, initial cleavage of the substrate is attributed to the action of aspartic proteinases (see below). Accumulation of undegraded hemoglobin in parasites treated in culture with cysteine proteinase inhibitors is then explained as follows. The action of the aspartic proteinases may lead to a build-up of peptide fragments (Kolakovich et al., 1997) that may no longer be broken down into amino acids and peptides for export from the vacuole while the cysteine proteinase is inhibited. As a consequence, these peptides may cause a hyperosmotic potential in the food vacuole which would in turn bring about the influx of water, swelling of the vacuole, and finally lead to a dysfunctional organelle in which catabolism no longer occurred so that native hemoglobin would accumulate. These differing views of the role of falcipain in the hemoglobin catabolic pathway remain to be resolved. However, it is clear from the studies of both Rosenthal and Goldberg that inhibition of falcipain activity is lethal to malarial parasites and thus this enzyme is a potential target for antimalarial drug design.

3. Characterization of Falcipain

Cysteine proteinase activity in trophozoite stage parasites was initially analyzed by nonreducing, gelatin-substrate PAGE, with and without the inhibitors E64 or leupeptin (Rosenthal et al., 1988). These experiments confirmed earlier observations of a trophozoite cysteine proteinase (TCP) of approximately 28 kDa (Rosenthal et al., 1987). The food vacuole was identified as the location of this activity by demonstrating the accumulation of [3H]leupeptin in these organelles. Rosenthal et al. (1988) and Gluzman et al. (1994) have isolated a cysteine proteinase, falcipain, from the food vacuoles of P. falciparum. Data on the localization of falcipain and TCP and their substrate specificities have indicated that they are likely to be the same enzyme (Salas et al., 1995; Francis et al., 1996), although this remains to be proven rigorously. In vitro assays using fluorogenic peptides showed that the activity of falcipain was stimulated by the presence of sulfhydryl agents and inhibited reversibly by leupeptin and irreversibly by E64. The peptide Z-Phe-Arg-AMC was the best substrate tested in these studies (Table I) and this substrate preference led to the conclusion that falcipain might be a cathepsin L-like proteinase (Rosenthal et al., 1988, 1989). Subsequently, a gene encoding a 569-amino-acid proenzyme in the cathepsin L family was identified in P. falciparum (Rosenthal and Nelson, 1992) and other Plasmodium species (Rosenthal, 1993, 1996; Rosenthal et al., 1993b). This zymogen is predicted to be activated to form a 26.8-kDa enzyme which is believed to be falcipain. Although other genes encoding cysteine proteinases are known in P. falciparum (Knapp et al., 1989, 1991; Li et al., 1989; Berti and Storer, 1995; Francis et al., 1996), Northern blot analysis has shown that only the falcipain gene has an expression pattern consistent with the trophozoite cysteine proteinase, as its mRNA is expressed during the ring stage and, at much lower levels, in the trophozoite stage (Rosenthal and Nelson, 1992; Francis et al., 1996). The availability of the gene encoding the falcipain precursor permitted the production of active recombinant falcipain in a baculovirus expression system (Salas et al., 1995). The recombinant protein had a pH profile of activity similar to trophozoite cysteine proteinase with an optimum in the range pH 5.5 to 6.0 and was shown to be able to degrade hemoglobin in the presence of reducing agents. Nevertheless, the protein produced in baculovirus, as assessed by gelatin PAGE, migrated as two bands (consistent with molecular weights of 55 and 45 kDa, respectively) rather than the 28-kDa band characteristic of naturally occurring falcipain from P. falciparum. These higher-molecular-weight forms may be a result of incomplete processing of the zymogen (Salas et al., 1995) and may explain the very different kinetic properties (Table II) of the recombinant protein (Salas et al., 1995) and the naturally occurring enzymes (Rosenthal et al., 1989; Francis et al., 1996).

TABLE I Relative Rates of Cleavage of AMC Peptide Substrates by Trophozoite Extract a

AMC peptide        Relative activity
Z-Phe-Arg          100
Z-Val-Leu-Arg      37
Z-Arg-Arg          20
Z-Leu              7
Z-Phe-Pro-Arg      7
Z-Phe              4
Z-Ala-Arg-Arg      3

a Results are normalized to 100 for the most effective substrate. Data from Rosenthal et al. (1988).


TABLE II Kinetic Constants for the Hydrolysis of Peptide Substrates by Naturally Occurring (N.O.) and Recombinant (Recomb.) Forms of Falcipain a

Substrate            Falcipain source   Km (µM)   kcat (sec^-1)   kcat/Km (M^-1 sec^-1)
Z-Val-Leu-Arg-AMC    Recomb.            4         0.25            62,500
                     N.O.               5         0.01            2,000
Z-Phe-Arg-AMC        Recomb.            28        0.02            720
                     N.O.               43        0.09            2,000

a Adapted from Francis et al. (1996).
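The catalytic efficiencies in Table II follow directly from the tabulated Km and kcat values, and recomputing them is a useful consistency check on the table. The short sketch below does this. It assumes that the values are read in the order in which they appear (Z-Val-Leu-Arg-AMC first, then Z-Phe-Arg-AMC, with the recombinant enzyme listed before the naturally occurring one), and the function and variable names are illustrative only.

```python
# Recompute catalytic efficiency (kcat/Km) from the Km and kcat values in Table II.
# Km is tabulated in micromolar, so it is converted to molar before dividing.
# ASSUMPTION: rows are assigned to substrates in the order they appear in the table.

TABLE_II = [
    # (substrate, falcipain source, Km in uM, kcat in 1/sec, reported kcat/Km in 1/(M sec))
    ("Z-Val-Leu-Arg-AMC", "Recomb.", 4.0, 0.25, 62_500),
    ("Z-Val-Leu-Arg-AMC", "N.O.", 5.0, 0.01, 2_000),
    ("Z-Phe-Arg-AMC", "Recomb.", 28.0, 0.02, 720),
    ("Z-Phe-Arg-AMC", "N.O.", 43.0, 0.09, 2_000),
]


def catalytic_efficiency(km_micromolar, kcat_per_sec):
    """Return kcat/Km in M^-1 sec^-1 for a Km given in micromolar."""
    km_molar = km_micromolar * 1e-6
    return kcat_per_sec / km_molar


for substrate, source, km, kcat, reported in TABLE_II:
    computed = catalytic_efficiency(km, kcat)
    print(f"{substrate:18s} {source:8s} computed={computed:9.0f} reported={reported}")
```

Each recomputed value agrees with the tabulated kcat/Km to within rounding, which supports this reading of the table. On that reading, the recombinant enzyme is about 30-fold more efficient than the naturally occurring enzyme on Z-Val-Leu-Arg-AMC but less efficient on Z-Phe-Arg-AMC.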

4. Antimalarial Action of Falcipain Inhibitors

Initial studies with the relatively nonspecific inhibitors E64 and leupeptin showed that these compounds could cause the death of malaria parasites in red blood cells in culture (Rosenthal et al., 1988). Subsequently, the activity of peptide fluoromethyl ketone cysteine proteinase inhibitors was assessed against trophozoite extracts and against parasites in culture (Rosenthal et al., 1991). The ability of each of these inhibitors to kill parasites correlated well with their effectiveness at inhibiting cysteine proteinase activity in the trophozoite extracts. The compound Z-Phe-Arg-CH2F, a potent inhibitor of cathepsin L, was the most effective inhibitor tested. Thus, the identification of falcipain as a cathepsin L family proteinase was further confirmed. With the identification of the falcipain homolog in the murine malarial parasite Plasmodium vinckei (Rosenthal, 1993), an animal model was developed for the testing of cysteine proteinase inhibitors as antimalarial agents (Rosenthal et al., 1993a). Despite the fact that the enzyme from P. vinckei was generally less susceptible to a range of inhibitors than the P. falciparum falcipain, inhibitors such as morpholine urea-Phe-Hphe-CH2F (Mu-Phe-Hphe-CH2F) were still effective against P. vinckei parasite extracts in vitro. As a result, this inhibitor was administered to infected mice to assess its effect on the activity of falcipain in vivo and on parasitemia. Falcipain activity isolated from parasites from treated animals was shown to be reduced by >90%, 2 h posttreatment with this irreversible inhibitor. Furthermore, after 4 days at 100 mg/kg 4 times per day, 80% of mice were cured of parasitemia (Rosenthal et al., 1993a). The inhibitor Mu-Phe-Hphe-CH2F is not active specifically against falcipain (IC50 3 and 5 nM against the P. falciparum and P. vinckei enzymes, respectively); it also inhibits the host enzymes cathepsin L (IC50 3 nM) and cathepsin B (IC50 3 nM). Nevertheless, the effects of this compound on the murine host appeared to be relatively mild; lethargy was noted and skin ulcers occurred at the site of subcutaneous administration but both of these side-effects resolved quickly when treatment was discontinued (Rosenthal et al., 1993a). Therefore, although Mu-Phe-Hphe-CH2F is clearly not usable as a drug to tackle human malaria, the principle that inhibition of falcipain can lead to a cure for parasitemia in vivo was established by these studies. The lack of correlation between toxicity in the host and inhibition of host cysteine proteinases for the fluoromethyl ketone inhibitors led Rosenthal et al. (1996) to speculate that the side-effects may have resulted from the production of toxic metabolites rather than host proteinase inhibition. Therefore, a new series of peptide-based inhibitors was tested in which the fluoromethyl ketone leaving group was replaced by a vinyl sulfone group (VSPh). The compound Mu-Phe-Hphe-VSPh was a weaker inhibitor (IC50 80 nM) of P. falciparum falcipain than the fluoromethyl ketone analog Mu-Phe-Hphe-CH2F (3 nM). Substitution of Leu for Phe in the compound Mu-Leu-Hphe-VSPh produced an inhibitor with an IC50 of 3 nM for falcipain, which caused hemoglobin accumulation and inhibition of parasite development in culture in the 10-30 nM range (Rosenthal et al., 1996). This vinyl sulfone-containing peptidomimetic showed no apparent toxicity or pathology in rats given up to 30 mg/kg daily for 28 days and thus would appear to be safer for use than the fluoromethyl ketone compounds. The above studies (Rosenthal et al., 1993a, 1996), using the cysteine proteinase from P. falciparum and P. vinckei, raise an important issue for the development of antimalarial inhibitors. Although P. falciparum causes the most deadly form of malaria in humans, the financial investment necessary for drug development would be likely to dictate that any compound produced should have the widest possible application and therefore should be effective not only against P. falciparum but also against the three other parasites that cause human malaria (P. vivax, P. malariae, and P. ovale). Comparison of inhibitor binding to falcipain and its homolog from P.
vinckei shows a variation in IC50 of more than 50-fold in some cases. It remains to be seen whether an inhibitor can be developed with characteristics to allow effective inhibition of falcipains from all four human Plasmodium parasites and still be selective enough to cause no host toxicity. The production and assay of the falcipain homologs from P. vivax, P. malariae, and P. ovale will be essential in future inhibitor development studies.

5. Computer-Aided Design of Novel Falcipain Inhibitors

Characterization of the gene encoding the falcipain precursor and derivation of the amino acid sequence of the protein (Rosenthal and Nelson, 1992) paved the way for generation of a computer model for mature falcipain, based on the X-ray structures of papain and actinidin (Ring et al., 1993). Identification of profalcipain genes from different Plasmodium species (Rosenthal et al., 1993b; Rosenthal, 1996) has allowed the identification of conserved residues that appear to be characteristic of the falcipains and that are not present in other papain family proteinases. This information facilitates design steps to produce the lead compounds for drug development. The model of falcipain allowed the use of the DOCK 3.0 program to screen a small molecule database for moieties that might fit the active site of the enzyme (Ring et al., 1993). Over 2000 compounds were selected from an initial screening and were then judged manually for those most likely to produce a good interaction. Finally, 31 compounds were chosen for assay to determine their abilities to inhibit cysteine proteinase activity in trophozoite extracts. Four showed IC50 values of <100 µM and the best of these, oxalic bis[(2-hydroxy-1-naphthylmethylene)hydrazide], inhibited with an IC50 of 6 µM. When tested against parasites growing in culture, this compound inhibited the incorporation of hypoxanthine, a marker of parasite metabolism, indicating that it was also effective against P. falciparum growing in a culture system. Oxalic bis[(2-hydroxy-1-naphthylmethylene)hydrazide] was therefore used as a lead for further structure-based inhibitor design (Li et al., 1995). Various modifications of this parent compound were made with the intention of enhancing water solubility, stability, and electrostatic interactions with His67 in the S2 subsite pocket of the enzyme. Assay of the resulting compounds (chalcones) against P. falciparum in culture identified several compounds with IC50s in the low micromolar range. They were poor inhibitors of mammalian cathepsin B and were also found to be reversible inhibitors of a cysteine proteinase (cruzain) from the Chagas' disease parasite Trypanosoma cruzi. Analysis of the modeled structures predicted that the S2 and S3 subsites would be sufficiently different between cathepsin B and the parasite enzymes to explain this selectivity. In related studies to develop antimalarial agents with activity against falcipain, Dominguez et al.
(1997) produced new phenothiazine derivatives of acridinediones, compounds already shown to have antimalarial activity. A number of the new compounds blocked the degradation of hemoglobin and inhibited falcipain. A dual mechanism of action was proposed to account for the action of these compounds against parasites: the as-yet-unknown effect of the parent acridinediones and inhibition of falcipain.

6. Summary: Falcipain as a Potential Antimalarial Target

The studies described above have clearly shown that inhibition of falcipain is lethal to malarial parasites by preventing breakdown of hemoglobin; the potential of falcipain inhibitors as antimalarial agents has thus been established. Good progress has been made in identifying moieties which are able to inhibit this enzyme with some degree of specificity. Some of these compounds have been tested in vivo and have demonstrated limited host toxicity. Work is currently underway to produce further inhibitors with enhanced selectivity against

Malaria


parasite enzymes to reduce host side-effects. Further modification of the resulting lead compounds to enhance pharmacokinetic properties will be necessary to produce viable drugs for use against malaria.

C. THE PLASMEPSINS: MALARIAL ASPARTIC PROTEINASES

1. Effect of Aspartic Proteinase Inhibitors on Parasites

In the same studies that showed a cysteine proteinase was important in hemoglobin degradation, Rosenthal et al. (1988) demonstrated that pepstatin, a specific inhibitor of aspartic proteinases, was also able to kill parasites growing in red blood cells in culture. The morphology of parasites treated with pepstatin differed from those treated with E64; cells treated with aspartic proteinase inhibitors became condensed and misshapen after 6 h (Rosenthal et al., 1988; Rosenthal, 1995). Further studies, described below, have attempted to elucidate the cause of these physical changes by defining the role of the aspartic proteinases in Plasmodium.

2. Possible Roles for the Plasmepsins in Hemoglobin Degradation

Aspartic proteinase activity capable of hydrolyzing hemoglobin was identified in extracts from the avian malaria parasite P. lophurae by Sherman and Tanigoshi (1981). Later, the same authors purified the enzyme and showed that it was also capable of degrading erythrocyte membrane proteins, suggesting a possible role in release of parasites from red blood cells at the end of the intraerythrocytic cycle (Sherman and Tanigoshi, 1983). Elucidation of the role of aspartic proteinase activity in the hemoglobin breakdown pathway was given a major boost when Goldberg et al. (1990) developed a method for the isolation of P. falciparum food vacuoles to allow study of their contents in isolation. The major hemoglobin degrading activity in these vacuoles could be inhibited by pepstatin and thus was attributable to aspartic proteinase(s). Subsequently, an aspartic proteinase (now called plasmepsin I) was purified from these organelles and laser desorption ionization mass spectrometry was used to show how it cleaved hemoglobin at a series of discrete sites located in the α-chain (Goldberg et al., 1991).
The first cleavage, which appears to be necessary to initiate the whole degradative process, occurs between the Phe33 and Leu34 residues in the α-globin chain. This sequence is located in an α-helical structure in the hinge region of hemoglobin where intermolecular contacts help to keep the tetrameric structure intact during the conformational changes that occur during binding and dissociation of oxygen. The structural importance of this


region means that this sequence is highly conserved and is not altered in any known hemoglobinopathies. This suggests that the malarial proteinase is striking at an important target in the molecule that the human host may be unable to mutate and thus develop resistance to the parasite. Cleavage of this initial peptide bond appears to disrupt the integrity of the hemoglobin tetramer, causing dissociation and heme release, and renders the subunits susceptible to further degradation. A second vacuolar aspartic proteinase (plasmepsin II) was later identified from gene sequencing (Dame et al., 1994) and shown to be present in isolated food vacuoles (Francis et al., 1994). While both plasmepsins were able to cleave the Phe33-Leu34 bond in α-globin, a different pattern of secondary cleavage sites was defined for each of the enzymes (Gluzman et al., 1994). Similar results were obtained by Vander Jagt et al. (1992), who assessed aspartic proteinase activities by peptide mapping using hemoglobin as a substrate. Very similar patterns were derived from two of the fractions, suggesting that these might represent different forms of the same enzyme. The other fraction gave a different peptide map, consistent with a total of two aspartic proteinase activities in P. falciparum. As a result of the above studies, researchers in Goldberg's group proposed an ordered pathway for hemoglobin degradation in which the initial cleavage event is performed by aspartic proteinase(s) and subsequent digestion within the food vacuole is the result of the synergistic action of plasmepsin I, plasmepsin II, and falcipain (Goldberg and Slater, 1992; Gluzman et al., 1994).

3. Characterization of the Plasmepsins

a. Features of the Plasmepsin Sequences

Characterization of the plasmepsins has been facilitated by the elucidation of the sequence of the proplasmepsin genes (Francis et al., 1994; Dame et al., 1994; Berry et al., 1995). The genes are transcribed at distinct phases of the intraerythrocytic life cycle, with proplasmepsin I message confined to ring stages and proplasmepsin II message produced mainly in the trophozoite stage (Moon et al., 1997; Francis et al., 1997). The deduced amino acid sequences identify the two plasmepsins as members of the aspartic proteinase family of zymogens. Proparts of the aspartic proteinases are generally less well conserved, but those of the plasmepsins are particularly unusual in being very long (approximately 125 amino acids rather than the approximately 50 residues of typical zymogens). The proparts also lack the signal peptide that characterizes the pre-pro forms of many monomeric aspartic proteinases. Instead, the plasmepsins have a hydrophobic stretch of 20 amino acids, beginning 35 residues from the N-terminus of the propeptide. It was postulated that this hydrophobic region, located within the propart, may


be involved in the membrane localization of the proplasmepsins (Francis et al., 1994; Dame et al., 1994) and it has since been shown that the proplasmepsins are type II integral membrane proteins (Francis et al., 1997). This accounts for earlier findings that showed aspartic proteinase activity from P. falciparum associated with the parasite membrane fractions (Vander Jagt et al., 1987, 1992). The mature plasmepsins share 73% amino acid identity with each other and approximately 30% identity with each of the five known human aspartic proteinases: renin, pepsin, gastricsin, cathepsin D, and cathepsin E. The similarities with other members of the aspartic proteinase family allowed the plasmepsins to be modeled on the known X-ray structures of cathepsin D (Dame et al., 1994; Moon et al., 1997). Elucidation of the crystal structure of plasmepsin II (Silva et al., 1996) (Fig. 2, see color plate) and the model of plasmepsin I facilitate comparisons of the plasmepsins with other aspartic proteinases. Several features of the plasmepsins are atypical. The second active site motif, located in the C-terminal domain, is Asp-Ser-Gly rather than the more usual Asp-Thr-Gly sequence. Intriguingly, this Thr-to-Ser replacement is also found in plant aspartic proteinases such as that of barley (Runeberg-Roos et al., 1991). The residue at position 13 in most aspartic proteinases (pig pepsin numbering), which influences S3 subsite interactions, is usually Glu or Gln but in the plasmepsins (and no other aspartic proteinases as yet characterized), a Met residue is present at this location. This may produce a more hydrophobic S3 pocket in these enzymes (Dame et al., 1994). Indeed, the active site cleft of each of the plasmepsins is generally hydrophobic in nature and this is reflected in the substrate specificities of the enzymes (see below).
The atypical features of the plasmepsins also encourage the belief that specific inhibitors can be designed to block the action of these enzymes without effect on the human counterparts.

b. Activation of the Proplasmepsins

Naturally occurring plasmepsins I and II can only be isolated in small quantities from P. falciparum. Therefore, the zymogens proplasmepsin I and proplasmepsin II have been expressed to high levels in recombinant form in Escherichia coli (Hill et al., 1994; Luker et al., 1996; Moon et al., 1997) to allow the enzymes to be studied in detail. Proplasmepsin II autoactivates readily at acid pH (Hill et al., 1994) to give a mature enzyme 12 amino acids longer than the form of plasmepsin II that is isolated from food vacuoles. The extra 12 N-terminal amino acids may be removed by the action of other proteolytic enzymes either in the vacuole or during purification of the enzymes from P. falciparum. In contrast to proplasmepsin II, proplasmepsin I does not undergo autocatalytic conversion from zymogen to mature enzyme nor is it processed by plasmepsin II (Moon et al., 1997) or falcipain (Francis et al., 1997). This


implies the presence of another proteolytic activity in vivo, which is responsible for activating proplasmepsin I (and possibly cleaving proplasmepsin II to give the 12-residue shorter form of the enzyme isolated from parasites). Investigations into the nature of proplasmepsin I processing have indicated (1) that an acid pH is required and (2) that the activation can be blocked by N-acetyl-L-leucyl-L-leucyl-norleucinal and N-acetyl-L-leucyl-L-leucyl-methional but not by E64, phenylmethylsulphonyl fluoride, or pepstatin (Francis et al., 1997). Both of these tripeptide inhibitors are known to block the activity of some cysteine proteinases including calpain, but this enzyme acts at neutral pH. The nature of the putative proplasmepsin processing protease remains to be determined: it may be produced by P. falciparum or may be a host enzyme that the parasite uses to mediate the activation process. In any case, the importance of the activity of this enzyme in the production of mature plasmepsin I makes it another potential future target for therapeutic intervention with proteinase inhibitors. In order to produce recombinant, mature plasmepsin I, Moon et al. (1997) generated a mutant protein that was capable of autoactivation. The scissile peptide bond (*) cleaved during autoactivation of proplasmepsin II is in the sequence N-F*L-N, whereas the analogous sequence in proplasmepsin I is K-F-F-K. Activation of proplasmepsin I at this point would thus require a lysine residue to be accepted in the S2 subsite during the self-activation process. The molecular model of plasmepsin I indicated that a lysine would be poorly accommodated in this position and that a valine might be a better fit. Consequently, a mutant proplasmepsin I was produced, containing the sequence V-F*F-K, and was shown to be capable of autoactivation at acid pH.

c. Substrate Specificity of the Plasmepsins

Plasmepsins I and II are very similar enzymes, sharing 73% identity, and both are present in food vacuoles. A question therefore arises: are these enzymes performing distinct roles in the parasite or are they degenerate in their function? The activity of both enzymes (determined with substrates including hemoglobin and fluorogenic and chromogenic peptides) shows similar pH profiles (Goldberg et al., 1991; Dame et al., 1994; Moon et al., 1997), but this is to be expected of enzymes that must function in the Plasmodium food vacuole at pH 5.0. The first indication that they might perform different roles was gained by study of the hemoglobin cleavage sites for each enzyme (Gluzman et al., 1994). This investigation showed that while both enzymes could carry out the cleavage of the important Phe33*Leu34 bond in α-globin to initiate hemoglobin degradation, subsequent cleavage sites in α- and β-globins were distinct for each enzyme (Table III). Plasmepsin II was also shown to hydrolyze denatured globin three times faster than plasmepsin I (Luker et al., 1996), indicating a different susceptibility to attack and altered specificity for each enzyme.


TABLE III
Hemoglobin Cleavage Sites for Plasmepsins I and II and for Falcipain (a)

Enzyme           Globin chain    Cleavage site    Cleavage sequence
Plasmepsin I     α               33/34            RMF*LSF
                                 46/47            PHF*DLS
                                 98/99            VNF*KLL
                 β               31/32            GRL*LVV
                                 41/42            QRF*FES
                                 129/130          QAA*YQK
Plasmepsin II    α               33/34            RMF*LSF
                                 108/109          LVT*LAA
                                 136/137          TVL*TSK
                 β               32/33            RLL*VVY
Falcipain        α               31/32            LER*MFL
                                 33/34            RMF*LSF
                                 82/83            LSA*LSD
                 β               32/33            RLL*VVY
                                 69/70            VLG*AFS

(a) Adapted from Gluzman et al. (1994).
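
The relationship between the site numbers and the six-residue sequences in Table III can be illustrated with a short Python sketch that extracts the P3 to P3' window around a scissile bond. The α-globin sequence used here is the standard mature human chain, supplied as an assumption for illustration and not taken from this chapter; the function name cleavage_window is likewise hypothetical.

```python
# Mature human alpha-globin (141 residues, numbered from Val1).
# Assumption: standard reference sequence, included for illustration only.
ALPHA_GLOBIN = (
    "VLSPADKTNV" "KAAWGKVGAH" "AGEYGAEALE" "RMFLSFPTTK" "TYFPHFDLSH"
    "GSAQVKGHGK" "KVADALTNAV" "AHVDDMPNAL" "SALSDLHAHK" "LRVDPVNFKL"
    "LSHCLLVTLA" "AHLPAEFTPA" "VHASLDKFLA" "SVSTVLTSKY" "R"
)


def cleavage_window(sequence: str, p1: int, flank: int = 3) -> str:
    """Return residues P3-P1 and P1'-P3' around a scissile bond.

    p1 is the 1-based number of the residue on the N-terminal side of the
    bond, so p1=33 describes the 33/34 site. The bond is marked with '*'.
    """
    if not flank <= p1 <= len(sequence) - flank:
        raise ValueError("site too close to a chain terminus")
    return f"{sequence[p1 - flank:p1]}*{sequence[p1:p1 + flank]}"


# Alpha-chain sites listed for the plasmepsins in Table III.
for site in (33, 46, 98, 108, 136):
    print(f"alpha {site}/{site + 1}: {cleavage_window(ALPHA_GLOBIN, site)}")
```

Under these assumptions the windows for the α-chain sites agree with the cleavage sequences listed in the table; the same function could be applied to a β-globin sequence for the remaining sites.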

Assays with a fluorogenic substrate DABCYL-Glu-Arg-Met-Phe*Leu-Ser-Phe-Pro-EDANS, based on the Phe33*Leu34 cleavage site, extended the results gained using hemoglobin by showing that both kcat and Km for plasmepsin II were 4-5 times higher than those for plasmepsin I (Luker et al., 1996). This, however, results in specificity constants (kcat/Km) for the two enzymes that are very similar. Recombinant plasmepsin II showed the same kinetics as the naturally occurring enzyme with the above substrate but the recombinant proplasmepsin I used in this work was not able to self-activate and this zymogen did not have comparable kinetics to the naturally occurring, mature plasmepsin I. Assays which have been performed with chromogenic peptide substrates are difficult to compare directly with those using fluorogenic substrates as both the substrate type and the pH of assay were different. However, as in the studies above, assays with chromogenic substrates have also shown that plasmepsin II has a higher kcat than plasmepsin I (Moon et al., 1997). In contrast, these experiments showed that the Km values for each plasmepsin with two related peptides (varied only in the P2 position) were of the same order. However, plasmepsin I showed the highest affinity for the P2 Ile peptide while plasmepsin II showed highest affinity for the P2 Val substrate (Table IV), indicating different substrate specificity for each enzyme. Further studies of the recombinant plasmepsins using more extensive series of substrates (Tyas, 1997; Westling et al., 1997) have shown that overall the plasmepsins of P. falciparum


TABLE IV
Kinetic Parameters for the Cleavage of Chromogenic Peptide Substrates by Recombinant Plasmepsins I and II (a)

Substrate    Enzyme           Km (µM)    kcat (sec⁻¹)    kcat/Km (mM⁻¹ sec⁻¹)
Xaa = Ile    Plasmepsin I     10         1               120
             Plasmepsin II    25         15              610
Xaa = Val    Plasmepsin I     30         2               50
             Plasmepsin II    20         9               470

(a) Peptides were of the form Leu-Glu-Arg-Xaa-Phe*Nph-Ser-Phe. Reproduced from Moon et al. (1997).

have very similar subsite preferences, generally preferring hydrophobic residues. Some differences do exist, however, as illustrated in Table IV. These alterations in specificity between the two enzymes can be explained with reference to the structure of plasmepsin II (Silva et al., 1996) and the model of plasmepsin I (Moon et al., 1997). Several residues in the enzyme contribute to the architecture of the S2 subsite pocket but only amino acid 289 (pepsin numbering) is different in the two plasmepsins. Plasmepsin I has a valine in this position, whereas plasmepsin II has a leucine. This may make the plasmepsin I S2 site more open and better able to accommodate the P2 isoleucine substituent while plasmepsin II prefers a valine in P2 (Table IV). Similar variations in substrate specificity between the plasmepsins are also reflected in other subsites (Tyas, 1997) and these disparate subsite preferences may also account for the different affinities of plasmepsins I and II for a variety of inhibitors (see below).

4. Selective Plasmepsin Inhibitors

Several aspartic proteinase inhibitors have been found to act against Plasmodium parasites growing in culture. The pepstatins have similar Ki values against both plasmepsins (Luker et al., 1996), but three other aspartic proteinase inhibitors (Table V), which have been found to kill parasites and have been assayed against the individual plasmepsins, have much lower Ki against plasmepsin I than plasmepsin II (Francis et al., 1994; Moon et al., 1997). This difference in inhibitor susceptibility confirms that the two plasmepsins have distinct subsite preferences. It also indicates that they must perform distinct roles in P. falciparum since the concentrations of inhibitors used to kill parasites in culture would only cause inhibition of plasmepsin I; therefore, inhibition of plasmepsin I alone is sufficient to cause parasite death and plasmepsin II cannot compensate for the blockade of plasmepsin I function.
This, along with evidence that the proplasmepsin genes are transcribed at different points in the life cycle (Moon


TABLE V
Effects of Inhibitors on Purified Plasmepsin I (PM I), Plasmepsin II (PM II), and P. falciparum (Strain *K1 or *HB3) Growing in Red Blood Cells in Culture (a)

Inhibitor    PM I    PM II         Parasite
Ro40-4388    9       700           250 *
Ro40-5576    8       250           900 *
SC-50083     500     IC50 > 106    2,000

[Chemical structures of the inhibitors not reproduced.]

(a) From Francis et al. (1994); Moon et al. (1997).


et al., 1997) indicates that the two enzymes are likely to have distinct functions in P. falciparum. Other compounds with activity against P. falciparum in culture have been shown to inhibit plasmepsin II but their activity against plasmepsin I has not been determined (Silva et al., 1996). Thus, it remains possible that specific inhibition of plasmepsin II may also cause parasite death, though presumably by interfering with a process distinct from that blocked by the specific plasmepsin I inhibitors.
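
The quantitative comparisons drawn between the two plasmepsins (specificity constants in Table IV, relative Ki values in Table V) can be reproduced with a few lines of Python. This is only a sketch: the numbers are the rounded values as read from the tables, the function names are hypothetical, and the Ki ratio assumes that the PM I and PM II columns of Table V share the same units.

```python
# Values as read from Tables IV and V of this chapter (rounded as printed).
TABLE_IV = {  # (P2 residue, enzyme): (Km in uM, kcat in 1/s)
    ("Ile", "PM I"): (10, 1),
    ("Ile", "PM II"): (25, 15),
    ("Val", "PM I"): (30, 2),
    ("Val", "PM II"): (20, 9),
}
TABLE_V_KI = {  # inhibitor: (value for PM I, value for PM II), same units assumed
    "Ro40-4388": (9, 700),
    "Ro40-5576": (8, 250),
}


def specificity_constant(km_um: float, kcat_per_s: float) -> float:
    """Return kcat/Km in mM^-1 s^-1 (1 uM^-1 equals 1000 mM^-1)."""
    return kcat_per_s / km_um * 1000.0


def fold_selectivity(ki_pm1: float, ki_pm2: float) -> float:
    """Return how many times lower the Ki is for PM I than for PM II."""
    return ki_pm2 / ki_pm1


for (p2, enzyme), (km, kcat) in TABLE_IV.items():
    value = specificity_constant(km, kcat)
    print(f"P2 {p2}, {enzyme}: kcat/Km ~ {value:.0f} mM^-1 s^-1")

for name, (ki1, ki2) in TABLE_V_KI.items():
    print(f"{name}: ~{fold_selectivity(ki1, ki2):.0f}-fold preference for PM I")
```

Because the tabulated Km and kcat values are rounded, the recomputed specificity constants only approximate the printed kcat/Km column; the point of the sketch is the order of magnitude of the differences, not the exact figures.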

5. Plasmepsins from Other Plasmodium Species

As discussed above (Section II,B,4), for a new antimalarial drug to be commercially viable, activity against all four human malaria parasites is necessary. Therefore it is essential that the proteinases from these other species are also characterized. The genes encoding proplasmepsin zymogens have been cloned from the human parasites P. vivax, P. malariae, and P. ovale and also from the rodent parasite Plasmodium berghei. It is important to note that in each of these species only one functional aspartic proteinase gene appears to exist. This raises the question: why does P. falciparum appear to require two such enzymes? At present this remains a mystery. The zymogens from the other three human parasites all possess autoactivation sites similar to that of proplasmepsin II, whereas the P. berghei zymogen, like proplasmepsin I, is not able to self-activate. Therefore in P. berghei, as in P. falciparum, the presence of a proplasmepsin processing proteinase must be postulated. Production and assay of recombinant plasmepsins from P. vivax (PvPM) and P. malariae (PmPM) have shown that these enzymes are more similar to each other than to plasmepsin II in their interactions with a range of substrates and inhibitors (Westling et al., 1997). Furthermore, a compound previously shown (Luker et al., 1996) to be a subnanomolar inhibitor of plasmepsin I but a poor inhibitor of plasmepsin II (Ki ~ 1 µM) was also found to have subnanomolar potency in its ability to inhibit PvPM and PmPM (Westling et al., 1997). This is an encouraging sign that inhibitors of plasmepsin I, which have been shown to kill parasites in culture, may also prove effective against other malaria parasites.

D. SYNERGISTIC EFFECTS OF THE INHIBITION OF FALCIPAIN AND THE PLASMEPSINS

Independent studies of the action of cysteine and aspartic proteinase inhibitors on malaria parasites in red blood cells in culture have shown that the simultaneous presence of E64 and pepstatin is more effective than either inhibitor alone (Rosenthal et al., 1988) or may be synergistic (Bailly et al., 1992). Synergism was also observed between E64 and the more specific inhibitor of plasmepsin I, SC-50083 (Gluzman et al., 1994). These results suggest that a combination drug therapy to impose a double blockade on the hemoglobin digestion pathway by inhibition of both falcipain and plasmepsin I (and II) may be possible. This might provide better control of parasitemia by synergistic action and also help to prevent or delay the onset of resistance by attacking two target enzymes at once.

E. A PLASMODIUM AMINOPEPTIDASE

It has been shown that the action of cysteine and aspartic proteinases in the P. falciparum food vacuole results in the production of a series of distinct peptide products from the digestion of hemoglobin (Kolakovich et al., 1997). The final stages of the catabolic pathway to yield free amino acids must therefore take place outside the food vacuole, mediated by exopeptidase activity. Several groups have described aminopeptidase activity from trophozoite stage parasites from various Plasmodium species (Charet et al., 1980; Vander Jagt et al., 1987; Curley et al., 1994). Although Charet et al. (1980) detected up to four aminopeptidase activities following isoelectric focusing of extracts from the rodent malaria Plasmodium chabaudi, Curley et al. (1994) found evidence for only one aminopeptidase of approximately 80 kDa in this species, in another rodent malaria P. berghei, and in P. falciparum. These enzymes were active with Leu-, Ala-, Arg-, Lys-, and Gly-AMC substrates but not with His- or Pro-AMC. A preference for the Leu- and Ala-AMC substrates was established and, given that these amino acids together account for more than 25% of the residues in hemoglobin, this selectivity is consistent with their proposed role in the final stages of hemoglobin degradation. After further purification of the enzyme from P. chabaudi, it was shown (Nankya-Kitaka et al., 1998) that the inhibitors bestatin and nitrobestatin inhibited the enzyme at concentrations in the nanomolar range (Ki = 50 and 2.5 nM, respectively). Nankya-Kitaka et al. (1998) also demonstrated that these inhibitors blocked the growth of intraerythrocytic P. chabaudi and P. falciparum in culture with IC50 values against the latter parasite of 1.7 µM for bestatin and 0.4 µM for nitrobestatin. Bestatin has a low oral toxicity for mammals with an LD50 > 4 g/kg in mice (Sakakibara et al., 1983); however, toxicity was seen following subcutaneous or intraperitoneal administration, which limits the suitability of this compound as a lead for the design of drugs against the malarial aminopeptidases. The aminopeptidases from Plasmodium were shown to be approximately 80-kDa metallopeptidases, were moderately thermophilic, and had pH optima


of 7.2, consistent with activity in the parasite cytoplasm (Curley et al., 1994). However, Curley et al. (1994) suggested that, in contrast to other metalloaminopeptidases, these enzymes may not be Zn2+ dependent. This is of interest, as it may mean that malarial aminopeptidases are distinct from their human counterparts and this, in turn, would facilitate the design of specific antimalarial inhibitors of these Plasmodium enzymes.

III. CURRENT ANTIMALARIAL AGENTS WITH EFFECTS ON PARASITE PROTEOLYTIC ENZYMES

The preceding sections have described the effects of protease inhibitors on parasites and the rationale for producing more selective ligands for development into novel antimalarial drugs. Evidence exists, however, to suggest that some of the current arsenal of antimalarial drugs may exert at least part of their effect by inhibition of proteolysis. Moulder and Evans (1946), the first investigators to demonstrate protease activity in Plasmodium, showed that the drugs quinine and atabrine caused significant inhibition of parasite proteolytic activity at concentrations of 1 µM. Chloroquine, like the falcipain inhibitors, leads to an accumulation of globin in the parasite food vacuole (although at a much lower level) and inhibits falcipain with an IC50 of 400 µM (Rosenthal, 1995). While this would not normally be considered to be a very potent inhibition, this antimalarial drug appears to accumulate in food vacuoles in millimolar concentrations (Geary et al., 1986), which would therefore be high enough to cause effects on falcipain. Charet et al. (1980) have also reported inhibition of the aminopeptidase by chloroquine. This finding was confirmed by Vander Jagt et al. (1987), who reported noncompetitive inhibition of the aminopeptidase by chloroquine with a Ki of 410 µM. Nevertheless, no aminopeptidase inhibition by chloroquine was seen by Nankya-Kitaka et al. (1998) and since this enzyme appears to be cytosolic, therapeutic levels of the drug in this cell compartment are in any case unlikely to be sufficient to cause significant aminopeptidase inhibition. In addition to possible direct effects of chloroquine on proteases, chloroquine is known to cause accumulation of heme in parasite food vacuoles, and heme or chloroquine-heme complexes are inhibitory to both falcipain and the plasmepsins (Vander Jagt et al., 1987, 1992; Gluzman et al., 1994).
Thus, food vacuole proteinase inhibition may form part of the complex mechanism of action of chloroquine. This is supported by the finding that, when P. falciparum in culture was treated with plasmepsin I inhibitors (Ro40-4388 or Ro40-5576) in combination with chloroquine, the proteinase inhibitors were antagonistic to the action of the drug (Moon et al., 1997). This is thought to reflect the fact that the formation of a chloroquine-heme complex is necessary for the drug to have its maximum antiparasitic effect. In the presence of inhibitors of hemoglobin degradation, however, little heme will be released for incorporation into such a complex.
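
The argument that a 400 µM IC50 becomes relevant once chloroquine reaches millimolar levels in the food vacuole can be put in rough numbers. The Python sketch below assumes a simple one-site dose-response with a Hill slope of 1, which is an assumption made here for illustration and not a model stated in the studies cited; the concentrations tested are likewise illustrative, since the text specifies only "millimolar" accumulation.

```python
def fraction_inhibited(conc_um: float, ic50_um: float) -> float:
    """Fractional inhibition for a one-site dose-response (Hill slope 1).

    Assumption: inhibition follows conc / (conc + IC50); real assays may
    deviate from this simple hyperbolic form.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return conc_um / (conc_um + ic50_um)


FALCIPAIN_IC50_UM = 400.0  # chloroquine against falcipain (Rosenthal, 1995)

# Illustrative chloroquine concentrations (hypothetical choices).
for label, conc_um in (("1 uM", 1.0), ("1 mM", 1000.0), ("5 mM", 5000.0)):
    percent = 100.0 * fraction_inhibited(conc_um, FALCIPAIN_IC50_UM)
    print(f"{label}: about {percent:.1f}% inhibition of falcipain")
```

Under this simplified model, micromolar chloroquine would leave falcipain essentially uninhibited, whereas millimolar concentrations would inhibit most of the activity, which is the qualitative point made in the text.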

IV. CONCLUDING REMARKS

To overcome parasite resistance and to produce new generations of antimalarial drugs, parasite proteinases have been identified as important new targets, particularly in light of their crucial roles in the process of hemoglobin degradation and parasite nutrition. One cysteine proteinase, two aspartic proteinases, and an aminopeptidase play roles in this catabolic process and all have been shown to be potential targets for inhibitor-based drugs. A further, as-yet poorly characterized enzyme, the so-called proplasmepsin processing proteinase (see earlier citation), appears to be critical for the activation of plasmepsin I and is thus a further potential drug target. To date the cysteine proteinase, falcipain, and the aspartic proteinases, plasmepsins I and II, have been studied in the greatest detail. Substrate specificities have been determined, X-ray structures (plasmepsin II) or models (falcipain and plasmepsin I) have been produced, and several series of inhibitors of the enzymes have been tested. For the future, more screening for new lead inhibitors will be necessary and these new compounds, along with those already identified, must be modified and adapted for therapeutic use. It is clear that proteases are excellent drug targets in Plasmodium and could represent the Achilles heel of these parasites, which are major killers worldwide.

ACKNOWLEDGMENTS

Work carried out in the author's laboratory received financial support from the UNDP/WORLD BANK/WHO Special Programme for Research and Training in Tropical Diseases (TDR), The Royal Society, The Biotechnology and Biological Sciences Research Council, and F. Hoffmann-La Roche Ltd.

REFERENCES

Asawamahasakda, W., Ittarat, I., Chang, C.-C., McElroy, P., and Meshnick, S. R. (1994). Effects of antimalarials and protease inhibitors on plasmodial hemozoin production. Mol. Biochem. Parasitol. 67, 183-191.
Bailly, E., Jambou, R., Savel, J., and Jaureguiberry, G. (1992). Plasmodium falciparum: Differential sensitivity in vitro to E-64 and pepstatin A. J. Protozool. 39, 593-599.
Berry, C., Dame, J. B., Dunn, B. M., and Kay, J. (1995). Aspartic proteinases from the human malaria parasite Plasmodium falciparum. In "Aspartic Proteinases: Structure, Function, Biology and Biomedical Implications" (K. Takahashi, Ed.), pp. 511-518. Plenum, New York.
Berti, P. J., and Storer, A. C. (1995). Alignment/phylogeny of the papain superfamily of cysteine proteases. J. Mol. Biol. 246, 273-283.
Charet, P., Aissi, E., Maurois, P., Bouquelet, S., and Biguet, J. (1980). Aminopeptidases in rodent Plasmodium. Comp. Biochem. Physiol. B 65, 519-524.
Curley, G. P., O'Donovan, S. M., McNally, J., Mullally, M., O'Hara, H., Troy, A., O'Callaghan, S.-A., and Dalton, J. P. (1994). Aminopeptidases from Plasmodium falciparum, Plasmodium chabaudi chabaudi and Plasmodium berghei. J. Euk. Microbiol. 41, 119-123.
Dame, J. B., Reddy, G. R., Yowell, C. A., Dunn, B. M., Kay, J., and Berry, C. (1994). Sequence, expression and modeled structure of an aspartic proteinase from the human malaria parasite Plasmodium falciparum. Mol. Biochem. Parasitol. 64, 177-190.
Dominguez, J. N., Lopez, S., Charris, J., Iarruso, L., Lobo, G., Semenov, A., Olson, J. E., and Rosenthal, P. J. (1997). Synthesis and antimalarial effects of phenothiazine inhibitors of a Plasmodium falciparum cysteine protease. J. Med. Chem. 40, 2726-2732.
Francis, S. E., Banerjee, R., and Goldberg, D. E. (1997). Biosynthesis and maturation of the malaria aspartic hemoglobinases plasmepsins I and II. J. Biol. Chem. 272, 14961-14968.
Francis, S. E., Gluzman, I. Y., Oksman, A., Banerjee, D., and Goldberg, D. E. (1996). Characterisation of native falcipain, an enzyme involved in Plasmodium falciparum hemoglobin degradation. Mol. Biochem. Parasitol. 83, 189-200.
Francis, S. E., Gluzman, I. Y., Oksman, A., Knickerbocker, A., Mueller, R., Bryant, M. L., Sherman, D. R., Russell, D. G., and Goldberg, D. E. (1994). Molecular characterisation and inhibition of a Plasmodium falciparum aspartic hemoglobinase. EMBO J. 13, 306-317.
Gamboa de Dominguez, N. D., and Rosenthal, P. J. (1996). Cysteine proteinase inhibitors block early stages in hemoglobin degradation by cultured malaria parasites. Blood 87, 4448-4454.
Geary, T. G., Jensen, J. B., and Ginsburg, H. (1986). Uptake of [3H]chloroquine by drug-sensitive and -resistant strains of the human malaria parasite Plasmodium falciparum. Biochem. Pharmacol. 35, 3805-3812.
Gluzman, I. Y., Francis, S. E., Oksman, A., Smith, C. E., Duffin, K. L., and Goldberg, D. E. (1994). Order and specificity of the Plasmodium falciparum hemoglobin degradation pathway. J. Clin. Invest. 93, 1602-1608.
Goldberg, D. E., and Slater, A. F. G. (1992). The pathway of haemoglobin degradation in malaria parasites. Parasitol. Today 8, 280-282.
Goldberg, D. E., Slater, A. F. G., Beavis, R., Chait, B., Cerami, A., and Henderson, G. B. (1991). Hemoglobin degradation in the human malaria pathogen Plasmodium falciparum: A catabolic pathway initiated by a specific aspartic protease. J. Exp. Med. 173, 961-969.
Goldberg, D. E., Slater, A. F. G., Cerami, A., and Henderson, G. B. (1990). Hemoglobin degradation in the malaria parasite Plasmodium falciparum: An ordered process in a unique organelle. Proc. Natl. Acad. Sci. USA 87, 2931-2935.
Hill, J., Tyas, L., Phylip, L. H., Kay, J., Dunn, B. M., and Berry, C. (1994). High level expression and characterisation of plasmepsin II, an aspartic proteinase from Plasmodium falciparum. FEBS Lett. 352, 155-158.
Knapp, B., Hundt, E., Nau, U., and Kuepper, H. A. (1989). Molecular cloning, genomic structure and localization in a blood stage antigen of Plasmodium falciparum characterized by a serine stretch. Mol. Biochem. Parasitol. 32, 73-83.
Knapp, B., Nau, U., Hundt, E., and Kuepper, H. A. (1991). A new blood stage antigen of Plasmodium falciparum highly homologous to the serine-stretch protein SERP. Mol. Biochem. Parasitol. 44, 1-14.
Kolakovich, K. A., Gluzman, I. Y., Duffin, K. L., and Goldberg, D. E. (1997). Generation of hemoglobin peptides in the acidic digestive vacuole of Plasmodium falciparum implicates peptide transport in amino acid production. Mol. Biochem. Parasitol. 87, 123-135.
Li, W. B., Bzik, D. J., Horii, T., and Inselburg, J. (1989). Structure and expression of the Plasmodium falciparum SERA gene. Mol. Biochem. Parasitol. 33, 13-25.
Li, R., Kenyon, G. L., Cohen, F. E., Chen, X., Gong, B., Dominguez, J. N., Davidson, E., Kurzban, G., Miller, R. E., Nuzum, E. O., Rosenthal, P. J., and McKerrow, J. H. (1995). In vitro antimalarial activity of chalcones and their derivatives. J. Med. Chem. 38, 5031-5037.
Luker, K. E., Francis, S. E., Gluzman, I. Y., and Goldberg, D. E. (1996). Kinetic analysis of plasmepsins I and II, aspartic proteases of the Plasmodium falciparum digestive vacuole. Mol. Biochem. Parasitol. 79, 71-78.
McKerrow, J. H., Sun, E., Rosenthal, P. J., and Bouvier, J. (1993). The proteinases and pathogenicity of parasitic protozoa. Ann. Rev. Microbiol. 47, 821-853.
Moon, R. P., Tyas, L., Certa, U., Rupp, K., Bur, D., Jaquet, C., Matile, H., Loetscher, H.-R., Grueninger-Leitch, F., Kay, J., Dunn, B. M., Berry, C., and Ridley, R. G. (1997). Expression and characterisation of plasmepsin I from Plasmodium falciparum. Eur. J. Biochem. 244, 552-560.
Moulder, J. W., and Evans, E. A. (1946). The biochemistry of the malaria parasite. VI. Studies on the nitrogen metabolism of the malaria parasite. J. Biol. Chem. 164, 145-157.
Nájera, J. A., and Hempel, J. (1996). The burden of malaria. World Health Organisation, Geneva, Switzerland.
Nankya-Kitaka, M. F., Curley, G. P., Gavigan, C. S., Bell, A., and Dalton, J. P. (1998). Plasmodium chabaudi chabaudi and Plasmodium falciparum: Inhibition of aminopeptidase and parasite growth by bestatin and nitrobestatin. Parasitol. Res. 84, 552-558.
Olliaro, E L., and Goldberg, D. E. (1995). The Plasmodium digestive vacuole: Metabolic headquarters and choice drug target. Parasitol. Today 11,294-297. Orjih, A. U., Banyal, H. S., Chevli, R., and Fitch, C. D. (1981). Hemin lyses malaria parasites. Science 214, 667-669. Ring, C. S., Sun, E., McKerrow, J. H., Lee, G. K., Rosenthal, E J., Kuntz, I. D., and Cohen, F. E. (1993). Structure-based inhibitor design by using protein models for the development of antiparasitic agents. Proc. Natl. Acad. Sci. USA 90, 3583-3587. Rosenthal, E J. (1993). A Plasmodium vinckei cysteine proteinase shares unique features with its Plasmodium falciparum analogue. Biochim. Biophys. Acta 1173, 91-93. Rosenthal, E J. (1995). Plasmodium falciparum: Effects of proteinase inhibitors on globin hydrolysis by cultured malaria parasites. Exp. Parasitol. 80, 272-281. Rosenthal, P.J. (1996). Conservation of key amino acids among the cysteine proteinases of multiple malarial species. Mol. Biochem. Parasitol. 75,255-260. Rosenthal, E J., and Nelson, R. G. (1992). Isolation and characterisation of a cysteine proteinase gene of Plasmodium falciparum. Mol. Biochem. Parasitol. 51,143-152. Rosenthal, E J., Kim, K., McKerrow, J. H., and Leech, J. H. (1987). Identification of three stage specific proteinases of Plasmodium falciparum. J. Exp. Med. 166, 816-821. Rosenthal, E J., Lee, G. K., and Smith, R. E. (1993a). Inhibition of Plasmodium vinckei cysteine proteinase cures murine malaria.J. Clin. Invest. 91, 1052-1056. Rosenthal, E J., McKerrow, J. H., Aikawa, M., Nagasawa, H., and Leech, J. H. (1988). A Malarial cysteine proteinase is necessary for hemoglobin degradation by Plasmodium falciparum. J. Clin. Invest. 82, 1560-1566. Rosenthal, E J., McKerrow, J. H., Rasnick, D., and Leech, J. H. (1989). Plasmodium falciparum:

188

Colin Berry

Inhibitors of lysosomal cysteine proteinases inhibit a trophozoite proteinase and block parasite development. Mol. Biochem. Parasitol. 35, 177-183. Rosenthal, P. J., Olson, J. E., Lee, G. K., Palmer, J. T., Klaus, J. L., and Rasnick, D. (1996). Antimalarial effects of vinyl sulfone cysteine proteinase inhibitors. Antimicrob. Agents Chemother. 40, 1600-1603. Rosenthal, P. J., Ring, C. S., Chen, X., and Cohen, F. E. (1993b). Characterisation of a Plasmodium vivax cysteine proteinase gene identifies uniquely conserved amino acids that may mediate the substrate specificity of malarial hemoglobinases. J. Mol. Biol. 241, 312-316. Rosenthal, P. J., Wollish, W. S., Palmer, J. T., and Rasnick, D. (1991). Antimalarial effects of peptide inhibitors of a Plasmodium falciparum cysteine proteinase. J. Clin. Invest. 88, 1467-1472. Runeberg-Roos, P., Tormakangas, K., and Ostman, A. (1991). Primary structure of a barley-grain aspartic proteinase: A plant aspartic proteinase resembling cathepsin D. Eur. J. Biochem. 202, 1021-1027. Sakakibara, T., Ito, K., Irie, Y., Hagiwara, T., Sakai, Y., Hayashi, M., Kishi, H., Sakamoto, M., and Suzuki, M. (1983). Toxicological studies on bestatin. I. Acute toxicity test in mice, rats and dogs. Jpn. J. Antibiot. 36, 2971-2984. Salas, F., Fichmann, J., Lee, G. K., Scott, M. D., and Rosenthal, P.J. (1995). Functional expression of falcipain, a Plasmodium falciparum cysteine proteinase, supports its role as a malarial hemoglobinase. Infect. Immun. 63, 2120-2125. Silva, A. M., Lee, A. Y., Gulnik, S. V., Majer, P., Collins, J., Bhat, T. N., Collins, P.J., Cachau, R. E., Luker, K. E., Gluzman, I. Y., Francis, S. E., Oksman, A., Goldberg, D. E., and Erickson, J. W. (1996). Structure and inhibition of plasmepsin II, a hemoglobin-degrading enzyme from Plasmodium falciparum. Proc. Natl. Acad. Sci. USA 93, 10034-10039. Schrevel, J., Deguercy, A., Mayer, R., and Monsigny, M. (1990). Proteases in malariaqnfected red blood cells. Blood Cells 16, 563-584. 
Sherman, I. W., and Tanigoshi, L. (1981). The proteases of Plasmodium: A cathepsin D-like enzyme from Plasmodium lophurae. In "The Biochemistry of Parasites" (G. M. Slutzky, Ed.), pp. 137149. Pergamon, New York. Sherman, I. W., and Tanigoshi, L. (1983). Purification of Plasmodium lophurae cathepsin D and its effects on erythrocyte membrane proteins. Mol. Biochem. Parasitology 8, 207-226. Tyas, L. (1997). Ph.D. thesis. University of Wales, Cardiff. Vander Jagt, D. L., Baack, B. R., and Hunsaker, L. A. (1984). Purification and characterisation of an aminopeptidase from Plasmodium falciparum. Mol. Biochem. Parasitol. 10, 45-54. Vander Jagt, D. L., Hunsaker, L. A., and Campos, N. M. (1987). Comparison of proteases from chloroquine-sensitive and chloroquine-resistant strains of Plasmodium falciparum. Biochem. Pharmacol. 36, 3285-3291. Vander Jagt, D. L., Hunsaker, L. A., Campos, N. M., and Scaletti, J. V. (1992). Localization and characterization of hemoglobin-degrading aspartic proteinases from the malarial parasite Plasmodium falciparum. Biochim. Biophys. Acta 1122, 256-264. Westling, J., Yowell, C. A., Majer, P., Erickson, J. W., Dame, J. B., and Dunn, B. M. (1997). Plasmodium falciparum, P. vivax and P. malariae: A comparison of the active site properties of plasmepsins cloned and expressed from three different species of the malaria parasite. Exp. Parasitol. 87, 185 -193. World Health Organization (1996). P.falciparum proteinases inhibitor Development. World Health Organization, Geneva, Switzerland.

Chagas Disease

JUAN JOSE CAZZULO
Instituto de Investigaciones Biotecnológicas, Universidad Nacional de General San Martín, Buenos Aires, Argentina

I. Introduction
II. The Proteinases of Trypanosoma cruzi
III. Prospects for Novel Drugs Taking Trypanosoma cruzi Proteinases as Targets
References

I. INTRODUCTION

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

A. CHAGAS DISEASE

The American trypanosomiasis, Chagas disease, is an endemic disease prevalent in most of Latin America, where it affects an estimated 16-18 million people (WHO, 1997). Its etiological agent is a flagellated protozoan, Trypanosoma cruzi, transmitted by a triatomine insect vector. The geographic limits of the disease are often set at the Rio Grande, at the United States-Mexican border, in the North, and the north of Patagonia, in the South. However, indigenous cases have been reported in some North American states, and the disease is becoming a health problem in the United States because of the possibility of transmission through blood transfusions. The disease was first described by the Brazilian physician Carlos Justiniano Ribeiro Chagas (1879-1934) in 1909. It is a remarkable feat that Chagas described a new disease, its etiological agent, and its vector within 2 years (Lewinsohn, 1981).

The disease has an acute phase, which may be mistaken for the flu, although occasionally it can lead to fatal meningoencephalitis or acute myocarditis (mostly in very young children); an indeterminate asymptomatic phase, which can last for about 10-20 years or even for the whole lifetime of the infected person; and finally a chronic phase, characterized by heart symptoms, which eventually lead to sudden death, and by enlargement of hollow viscera such as the esophagus and the colon (megaesophagus and megacolon), which can also be fatal (WHO, 1997).

Diagnosis of Chagas disease can be parasitological, by direct detection of the parasite in blood, during the acute phase; however, because parasitemia is very low during the indeterminate and chronic phases, serodiagnosis is the usual way to detect infected individuals. At present, modern techniques are being introduced; the presence of the parasite can be detected by DNA amplification by the polymerase chain reaction (PCR), and new recombinant antigens are being applied for serodiagnosis (Pastini et al., 1994).

Acute-phase patients can be successfully treated with benznidazole (Radanil) or with nifurtimox (Lampit), although the latter is no longer easily available. The drugs, however, show little or no activity in chronic patients, and their use in this phase of the disease is not recommended, since they have serious side effects and are potentially mutagenic and carcinogenic. There is an urgent need, therefore, to develop new drugs effective against Chagas disease. Over the past decade, there have been trials with allopurinol with poor results; there is hope that cytochrome P450 inhibitors, like ketoconazole, able to block ergosterol biosynthesis, may be useful for treatment (WHO, 1997).
The major form of transmission of the disease is by a triatomine insect vector, such as Triatoma infestans (Argentina and part of Brazil and Chile), Panstrongylus megistus (Brazil), Rhodnius prolixus (the Caribbean area), and Dipetalogaster maximus (Mexico). The infective form of T. cruzi is found in the posterior end of the triatomine's gut and is released with the feces. The parasite then penetrates the mammalian skin through the puncture left by the bite, helped by instinctive scratching, or actively when left on a mucosal membrane, such as the conjunctiva.

B. TRYPANOSOMA CRUZI

Trypanosoma cruzi belongs to the Order Kinetoplastida, characterized by the presence of a single mitochondrion branching throughout the cell. This mitochondrion contains a specialized part, the kinetoplast (usually located in the vicinity of the basal body of the flagellum) containing the mitochondrial DNA (kDNA). In most species of the Order, kDNA may account for as much as 30% of the total DNA content of the cell (Vickerman, 1976). The parasite has an obligate intracellular replicative form, the amastigote (A), and a nonreplicative one, invasive for host cells, the trypomastigote (BT), found in the mammalian host's bloodstream. The major forms present in the insect vector are also a replicative stage, the epimastigote (E), and a nonreplicative one, the infective metacyclic trypomastigote (MT) (Fig. 1). These forms differ in size, the epimastigote being larger; in the relative position of the kinetoplast and the nucleus; and in antigenic and some metabolic properties.

II. THE PROTEINASES OF TRYPANOSOMA CRUZI

Since the identification of a number of proteolytic activities in cell-free extracts of epimastigotes (Itow and Camargo, 1977; Avila et al., 1979), several enzymes have been purified from the parasite and characterized. They include cysteine proteinases (CPs), serine peptidases (SPs), metalloproteinases (MPs), and the proteasome.

FIGURE 1 Life cycle of Trypanosoma cruzi. Amastigotes (A) and bloodstream trypomastigotes (BT) are present in the mammalian host, whereas epimastigotes (E) and metacyclic trypomastigotes (MT) are present in the insect vector.
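
The four forms shown in Fig. 1 can be summarized as a small lookup table. The following Python sketch is only an illustration: the names `Stage`, `stages_in`, and `replicative_stages` are hypothetical, and the entries simply restate the host and replicative status given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One developmental form of T. cruzi, as described in the text."""
    name: str
    abbreviation: str
    host: str          # "mammal" or "insect"
    replicative: bool  # True if the form divides


# Entries restate the text; nothing here is new data.
STAGES = [
    Stage("amastigote", "A", "mammal", True),
    Stage("bloodstream trypomastigote", "BT", "mammal", False),
    Stage("epimastigote", "E", "insect", True),
    Stage("metacyclic trypomastigote", "MT", "insect", False),
]


def stages_in(host):
    """Return the abbreviations of the forms found in the given host."""
    return [s.abbreviation for s in STAGES if s.host == host]


def replicative_stages():
    """Return the abbreviations of the dividing forms."""
    return [s.abbreviation for s in STAGES if s.replicative]


if __name__ == "__main__":
    print("Mammalian host:", stages_in("mammal"))
    print("Insect vector:", stages_in("insect"))
    print("Replicative forms:", replicative_stages())
```

Each host thus carries one replicative and one nonreplicative form, which is the symmetry that Fig. 1 depicts.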

A. CYSTEINE PROTEINASES

Cysteine proteinases are the most frequently reported and best-characterized peptidases in trypanosomatids. Following the pioneering work of Coombs, North, Mottram, and their co-workers in Leishmania mexicana (summarized in Coombs and Mottram, 1997) and Trypanosoma brucei (Mottram et al., 1989), CPs have been identified and partially or completely characterized in T. cruzi (see below), Trypanosoma rangeli (Labriola and Cazzulo, 1995), and Crithidia fasciculata (Cazzulo et al., 1995), as well as in other Leishmania spp. (Coombs and Mottram, 1997). The best-characterized CP in T. cruzi is cruzipain (Cazzulo et al., 1990), also known as cruzain (Eakin et al., 1992) or GP57/51 (Murta et al., 1990); recently, however, proteolytic activities ascribed to other, probably minor, CPs have been described.

1. Cruzipain

Cruzipain was first reported in cell-free extracts of epimastigotes by Itow and Camargo (1977) and purified to homogeneity by Bontempi et al. (1984); the CP purified by Rangel et al. (1981) is likely to be the same enzyme, despite some discrepancies (Cazzulo, 1984), and probably also the same as the 50-kDa proteinase described by Greig and Ashall (1990). Cruzipain is differentially expressed in the four main stages of the parasite's life cycle; its activity is 10- to 100-fold higher in epimastigotes than in the other parasite forms (Campetella et al., 1990; Franke de Cazzulo et al., 1994). The bulk of the enzyme is lysosomal; its location in an epimastigote-specific prelysosomal organelle called the "reservosome," which contains protein that is digested during differentiation to metacyclic trypomastigotes, has been reported (Soares et al., 1992). Trypomastigotes are able to excrete CPs, most probably cruzipain among them, into the medium (Yokoyama-Yasunaka et al., 1994). Cruzipain (3-4% of the total soluble protein of the cell in axenic culture epimastigotes of T.
cruzi, Tul 2 strain) can be readily purified by conventional methods (Cazzulo et al., 1989) or by affinity chromatography on ConA-Sepharose and chicken egg cystatin-Sepharose (Labriola et al., 1993). An active truncated form of the enzyme has been expressed in Escherichia coli (Eakin et al., 1992; see below).

Cruzipain is an endoproteinase able to digest proteins such as casein, bovine serum albumin, and denatured hemoglobin (optimal pH 3-5), and synthetic blocked chromogenic and fluorogenic substrates (optimal pH 7-9). In the latter case, it prefers Arg or Lys at the P1 position, and a hydrophobic or a positively charged residue at P2 (Cazzulo et al., 1990). When acting on the oxidized A and B chains of insulin, however, it acts better on peptide bonds having bulky hydrophobic residues at P2 and P3 (Raimondi et al., 1991). The enzyme also hydrolyzes peptides containing the sequence VVG#GPG, present in the boundary between its catalytic and C-terminal domains (see below); this makes it highly probable that the enzyme processes itself, both at this boundary and at the boundary between the pro- and the catalytic domain (VVG#APA) (Cazzulo et al., 1996). As predicted from former specificity studies (Cazzulo et al., 1990), a modified peptide VVR#GPG was a considerably better substrate than VVG#GPG, and this information could be applied to synthesize a better substrate for cruzipain (Z-VVRpNA) than the commercial ones used so far (Cazzulo et al., 1996). Recent and very thorough studies of the hydrolysis of peptides with sequences derived from the active site of cystatins (Serveau et al., 1996), as well as of peptides from combinatorial libraries (Del Nery et al., 1997a), have shown the importance of a Pro residue at P2' for the substrate specificity of cruzipain. Recently Del Nery et al. (1997b) have shown that natural cruzipain, unlike truncated recombinant cruzain and cruzipain 2, was able to cleave human kininogen to give Lys-bradykinin and also to activate plasma pre-kallikrein to kallikrein. The peculiar substrate specificity of cruzipain, somewhat intermediate between those of cathepsins L and B, since the enzyme is able to accommodate either a hydrophobic or a positively charged residue at P2, is consistent with the presence of Glu at position 205, as first pointed out by Lima et al. (1992).

The enzyme is inhibited by organomercurial reagents, E-64, TLCK, leupeptin, and a number of peptidyl chloromethane and peptidyl fluoromethane derivatives (Bontempi et al., 1984; Cazzulo et al., 1990; Meirelles et al., 1992; Harth et al., 1993; Franke de Cazzulo et al., 1994). Lalmanach et al.
(1996) used the cystatin-based synthetic substrates (Serveau et al., 1996) to develop peptidyl diazomethane inhibitors far more specific for cruzipain than any tested so far. Cystatins, stefins, and kininogens (Stoka et al., 1995; Serveau et al., 1996; Turk et al., 1996) are also strong inhibitors of the enzyme.

Cruzipain is encoded by a high number of genes (up to 130 in the Tul 2 strain of the parasite), arranged in tandem arrays located on two to four chromosomes, depending on the parasite clone or strain (Campetella et al., 1992). The genes, which contain no introns, encode a signal peptide, a propeptide, and a mature enzyme, consisting of a catalytic moiety, with high sequence homology with some cathepsins, particularly cathepsin S, and a 130-amino-acid-long C-terminal extension (C-T), which seems so far restricted to Type I CPs from trypanosomatids (Coombs and Mottram, 1997). In contrast to the similar CPs from L. mexicana and T. brucei, and as in the CPs from T. rangeli and C. fasciculata, the C-T is retained in the natural mature form of the enzyme (Cazzulo et al., 1995). The C-T consists of a "core" of 76 amino acids stabilized by disulfide bridges; an N-terminal segment of 27 amino acid residues up to the only Met residue, containing 7 modified Thr residues and 7 Pro residues, which probably acts as a "hinge" linking the "core" to the catalytic moiety; and a highly hydrophilic C-terminal "tail" of 27 residues, containing 7 Arg, 2 His, 2 Asp, and 5 Ser residues. The Thr modifications are still unidentified, but they seem not to be due to either phosphorylation or O-glycosylation (Cazzulo et al., 1992). This "normal" C-T is absent in the last gene of the tandem, being replaced by a highly hydrophobic 49-amino-acid-residue extension (Tomas and Kelly, 1996); this last gene, however, seems not to be expressed. It is noteworthy that in the case of the Type I CP tandems of L. mexicana, Mottram et al. (1997) have found a similar situation, namely the last gene in the tandem is a nonexpressed pseudogene, in this case lacking the portion encoding the C-T.

The only N-glycosylation site in the C-T (Asn255), as well as the first potential N-glycosylation site in the catalytic moiety (Asn33), are glycosylated in vivo; the latter bears only high mannose-type oligosaccharides. There is still no evidence on the N-glycosylation status of the second potential site (Asn169) (Metzner et al., 1996).

The X-ray crystallographic structure of a recombinant truncated form of the enzyme (cruzain Δc), lacking the C-T and the last 9 amino acid residues of the catalytic moiety, in complex with the synthetic inhibitor Z-Phe-Ala fluoromethane has been determined (McGrath et al., 1995). The catalytic moiety of cruzipain consists of one polypeptide chain of 215 amino acid residues folded into two distinct domains which interact, creating the active site cleft; the overall folding pattern and the arrangement of the active site residues are similar to those in papain (Fig. 2, see color plate) (McGrath et al., 1995).
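
The preference reported earlier for synthetic substrates (Arg or Lys at P1, and a hydrophobic or positively charged residue at P2) can be expressed as a simple scan over a peptide sequence. The Python sketch below is a hedged illustration only, not a published prediction method: the function names are hypothetical, the set of residues treated as hydrophobic is an assumption, and `#` is taken to mark the scissile bond as in the peptides quoted in the text. The rule flags only preferred sites, so it does not capture the slower cleavage of VVG#GPG that the text also reports.

```python
# Residues counted as positively charged (Arg, Lys), as stated in the text.
POSITIVE = set("RK")
# Assumption: a conventional, illustrative choice of hydrophobic residues.
HYDROPHOBIC = set("AVLIFMW")


def preferred_cut_positions(peptide):
    """Return positions (number of residues before the scissile bond)
    where P1 is Arg/Lys and P2 is hydrophobic or positively charged."""
    peptide = peptide.upper()
    cuts = []
    # i indexes the P1 residue; a P2 (i - 1) and a P1' (i + 1) must exist.
    for i in range(1, len(peptide) - 1):
        p2, p1 = peptide[i - 1], peptide[i]
        if p1 in POSITIVE and (p2 in HYDROPHOBIC or p2 in POSITIVE):
            cuts.append(i + 1)
    return cuts


def mark_cut(peptide, position):
    """Insert '#' at the given cut position, e.g. VVRGPG -> VVR#GPG."""
    return peptide[:position] + "#" + peptide[position:]


if __name__ == "__main__":
    for seq in ("VVRGPG", "VVGGPG"):
        sites = preferred_cut_positions(seq)
        print(seq, [mark_cut(seq, p) for p in sites])
```

Under these assumptions the modified peptide VVRGPG yields the single preferred site VVR#GPG, whereas the native boundary sequence VVGGPG yields none, in line with the observation that the Arg-containing peptide was the better substrate.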
The molecular weight of cruzipain, estimated from sequence studies and considering two high-mannose oligosaccharide chains, is approximately 40 kDa; however, the enzyme exhibits an anomalous behavior in SDS-PAGE, yielding apparent molecular weight values of 35 to 60 kDa, depending on the experimental conditions (Martinez and Cazzulo, 1992). Upon isoelectric focusing, purified preparations present up to 12 bands, with pI values ranging from 3.7 to 5.1 (Stoka et al., 1995). Natural cruzipain is a complex of isoforms, as judged from Mono Q chromatography, reversed-phase HPLC, and isoelectric focusing (Cazzulo et al., 1995). This heterogeneity is probably due both to the simultaneous expression of several genes bearing amino acid substitutions and to the presence, in different cruzipain molecules, of either high mannose-type, hybrid monoantennary-type, or complex biantennary-type oligosaccharide chains at the only N-glycosylation site in the C-T, Asn255 (Parodi et al., 1995). The latter is probably caused by the presence of nonconservative amino acid substitutions at the C-T, which change the predicted isoelectric point of the protein and are likely to result in structural variants at this level (Martinez et al., 1998).

Most cruzipain genes completely or partially sequenced so far are highly homologous; the exception, however, is cruzipain 2 (Lima et al., 1994), which has only 86% identity (Fig. 3) with other sequenced genes (Campetella et al., 1992; Eakin et al., 1992; Lima et al., 1994). Whereas the other genes sequenced, encoding what we may call the "cruzipain 1" isoforms complex, differ mostly at the C-T, presenting only a few conservative amino acid replacements at the catalytic domain, cruzipain 2 is less homologous to the other isoforms in the catalytic domain than in the C-T. The amino acid residue substitutions found in the catalytic domain include substitutions at the S2, S1', and S2' subsites and the absence of the first potential N-glycosylation site; those in the C-T include the absence of the last Cys residue (Lima et al., 1994), thus suggesting the possible lack of a disulfide bridge. It seems likely, therefore, that cruzipain 2 may have a different substrate specificity compared with "cruzipain 1." Recent evidence shows that this is indeed the case, since truncated recombinant cruzain and cruzipain 2, both lacking the C-terminal domain, had different specificity when acting on peptides derived from human kininogen (Del Nery et al., 1997b). The situation in T. cruzi may, therefore, be similar to that in L. mexicana, where Mottram et al. (1997) have recently reported the differential expression of CPs with different specificity in different parasite stages.

Cruzipain is an immunodominant antigen, recognized by most sera from patients with Chagas disease (Scharfstein et al., 1988; Murta et al., 1990; Martinez et al., 1991); most antibodies in natural infections and in immunized animals are directed against the C-T, and enzyme molecules with antibodies bound to this domain are still active, at least against small peptidic substrates (Martinez et al., 1993).
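
The estimate of about 40 kDa quoted earlier for cruzipain can be roughly reproduced from the figures given in the text. The following Python sketch is a back-of-the-envelope check, not the authors' calculation: the residue counts come from the text, whereas the average residue mass (110 Da) and the mass assigned to each high-mannose chain (about 1.9 kDa, roughly a Man9GlcNAc2 glycan) are assumptions introduced here for illustration.

```python
# Figures taken from the text.
CATALYTIC_RESIDUES = 215                               # catalytic moiety
CT_SEGMENTS = {"hinge": 27, "core": 76, "tail": 27}    # C-terminal extension
N_GLYCAN_CHAINS = 2                                    # high-mannose chains

# Assumptions (not from the text): rule-of-thumb average masses.
AVG_RESIDUE_MASS_DA = 110.0
HIGH_MANNOSE_CHAIN_DA = 1900.0


def ct_length():
    """Total length of the C-terminal extension (should be 130 residues)."""
    return sum(CT_SEGMENTS.values())


def estimated_mass_kda():
    """Approximate mass of mature cruzipain: polypeptide plus N-glycans."""
    residues = CATALYTIC_RESIDUES + ct_length()
    protein_da = residues * AVG_RESIDUE_MASS_DA
    glycan_da = N_GLYCAN_CHAINS * HIGH_MANNOSE_CHAIN_DA
    return (protein_da + glycan_da) / 1000.0


if __name__ == "__main__":
    print("C-T length:", ct_length(), "residues")
    print("Estimated mass: %.1f kDa" % estimated_mass_kda())
```

With these assumed averages the three C-T segments add up to the stated 130 residues, and the total comes to roughly 42 kDa, the same order as the value given in the text and well below the upper apparent values seen in SDS-PAGE.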
In addition to its obvious role in parasite nutrition, as its major lysosomal proteinase, cruzipain has been proposed to be involved in the penetration of the trypomastigote into the mammalian cell (Piras et al., 1985); in an escape mechanism from the immune response of the host by digestion of immunoglobulins at the "hinge," leaving the F(ab)2 fragment protecting the antigen on the surface of the parasite, instead of opsonizing it for phagocytosis or activating the complement cascade ("fabulation") (Krettli et al., 1980; see also Cazzulo et al., 1997); and in the differentiation steps of the life cycle of the parasite, which are blocked by permeant irreversible inhibitors of the enzyme (Meirelles et al., 1992; Harth et al., 1993; Franke de Cazzulo et al., 1994). The possible participation of cruzipain in metacyclogenesis, first proposed by Bonaldo et al. (1991) and supported by the results of Franke de Cazzulo et al. (1994), has received further support from the work of Tomas et al. (1997), who showed that overexpression of cruzipain enhanced the differentiation of epimastigotes to metacyclics.

2. Other Cysteine Proteinases

Starting with the first study of the subcellular localization of cruzipain by immunoelectron microscopy (Souto-Padrón et al., 1990), evidence suggesting a second localization of enzyme isoforms at the plasma membrane, in addition to the lysosomal localization, has been found. Fresno et al. (1994), in a preliminary report, indicated the presence of a CP bound to the plasma membrane by a glycosyl phosphatidyl inositol (GPI) anchor. The enzyme was not completely characterized, however, and it is thus not possible to conclude whether it is a cruzipain isoform bearing a different C-terminal domain, to be partially replaced by the GPI anchor, or a more divergent CP. Recently, the possibility of membrane-bound CP isoforms has been addressed again, using extraction of the parasites with Triton X-114 followed by partition. A set of CP isoforms, with a different pattern as compared with the usual hydrophilic isoforms, was consistently found in the detergent phase (Parussini et al., 1998). These isoforms, which were present in epimastigotes, amastigotes, and trypomastigotes, and reacted with polyclonal anticruzipain sera, were labeled with a biotin derivative in vivo under conditions where only surface molecules reacted, and were shown, by their response to inhibitors and binding to cystatin-Sepharose, to be CPs (Parussini et al., 1998). Because these CPs are present in much lower amounts than the common hydrophilic isoforms, peptide sequences allowing a direct comparison with the other CP isoforms reported so far in the parasite have not been obtained.

Nóbrega et al. (1996) have reported the presence in the three major developmental stages of T. cruzi of a 30-kDa, acidic, ATP-activated CP with broad substrate specificity. Cloning and sequencing of the gene encoding the enzyme showed high homology with cathepsin B. This enzyme has recently been shown to be antigenic in human Chagas disease (Fernandes et al., 1997).

B. SERINE PROTEINASES

1. Oligopeptidase B

An alkaline peptidase of high apparent molecular mass (150-200 kDa), initially classified by Ashall (1990) as a CP, was suggested afterward to be a serine peptidase, based on its inhibition by di-isopropyl fluorophosphate and low sensitivity to E-64, and considered to be a possible processing proteolytic enzyme (Ashall et al., 1990). A peptidase, which seems to be the same enzyme, was purified to homogeneity by Santana et al. (1992); it has recently been shown to be indirectly involved in the penetration of the trypomastigote into the mammalian cell by triggering a Ca2+-signaling mechanism (Burleigh and Andrews, 1995). The gene encoding the enzyme has been cloned and sequenced, revealing a novel enzyme, oligopeptidase B, homologous to members of the prolyl oligopeptidase family of SPs, which are known in higher organisms to participate in the maturation of biologically active peptides (Burleigh et al., 1997). It is possible, therefore, that oligopeptidase B processes a T. cruzi protein producing a factor responsible for the Ca2+-signaling activity for mammalian cells, necessary for parasite penetration.

2. Other Putative Serine Proteinases

A cytosolic SP activity, able to cleave Boc-Ala-Ala-pNA, has been reported in cell-free extracts of epimastigotes of T. cruzi by Healy et al. (1992), but has not been characterized. A gene fragment apparently encoding a chymotrypsin-like SP has been amplified by PCR from T. cruzi DNA (Sakanari et al., 1989). Recently, a secreted 80-kDa proteinase with collagenase activity (specific for human collagen types I and IV) has been purified and partially characterized from T. cruzi (Santana et al., 1997). Only one peptide sequence was determined, and this was most similar to the sequence of a metalloproteinase from Crithidia fasciculata. However, the inhibition pattern suggests that the enzyme could be either a CP or an SP. Further characterization of this enzyme, which is secreted by the mammalian forms of the parasite, is necessary before it can be unambiguously placed in one of the major proteinase classes.

C. METALLOPROTEINASES

Metalloproteinases have been described in trypanosomatids, the most notable being the surface metalloproteinase (gp63) of Leishmania spp. (Bouvier et al., 1995). Lowndes et al. (1996) have reported the expression of a complex array of metalloproteinases in T. cruzi, all of them membrane bound, as suggested by their partition into the detergent fraction upon Triton X-114 extraction. These proteinases showed considerable qualitative and quantitative variation in different strains and developmental stages of the parasite. Their possible relevance in the parasite's biology is not known.

D. THE PROTEASOME

The proteasome, ubiquitous in all cells from Archaebacteria to higher eukaryotes, was reported in 1996 in both T. brucei (Hua et al., 1996) and T. cruzi (González et al., 1996). Proteasomes (20S) were purified from epimastigotes of T. cruzi. The 670-kDa intact molecules are made up of subunits of 25-35 kDa, with pI values ranging from 4.5 to 8.5 (González et al., 1996). In both trypanosomatids the proteasomes presented their characteristic electron microscopy images. In the case of T. cruzi the activity of the proteasome was specifically inhibited by lactacystin, which was able to prevent the differentiation of trypomastigotes to amastigotes and the intracellular differentiation of amastigotes to trypomastigotes (González et al., 1996).

III. PROSPECTS FOR NOVEL DRUGS TAKING TRYPANOSOMA CRUZI PROTEINASES AS TARGETS

The multiple roles suggested for T. cruzi proteolytic activities make them attractive potential targets for chemotherapeutic attack. However, the functions of cruzipain and other, quantitatively minor proteinases in the biology of the parasite are still not clear. Null mutants, like those obtained by Mottram and his co-workers (Coombs and Mottram, 1997), have not yet been attained, despite many efforts, mainly by Kelly's group (Tomas and Kelly, 1996). It cannot be clearly determined, therefore, if the highly promising results obtained for the inhibition of the parasite life cycle at the differentiation steps (Meirelles et al., 1992; Harth et al., 1993; Franke de Cazzulo et al., 1994) are actually due to the inhibition of cruzipain or to some other, minor, and highly specific CP. The effects observed might even be due to proteinases belonging to other classes, showing an atypical response to inhibitors: oligopeptidase B, for instance, was strongly inhibited by Z-Phe-Arg-fluoromethylketone, despite its belonging to the SP class (Burleigh et al., 1997). McKerrow et al. (1995) have reported a considerable decrease in parasitemia in infected mice treated with some CP inhibitors. It can be concluded that the proteinases of T. cruzi are highly promising targets for chemotherapeutic attack against Chagas disease, although they still need proper validation.

REFERENCES

Ashall, F. (1990). Characterization of an alkaline peptidase of Trypanosoma cruzi. Mol. Biochem. Parasitol. 38, 77-88.
Ashall, F., Harris, D., Roberts, H., Healy, N., and Shaw, E. (1990). Substrate specificity and inhibitor sensitivity of a trypanosomatid alkaline peptidase. Biochim. Biophys. Acta 1035, 293-299.
Avila, J. L., Casanova, M. A., Avila, A., and Bretaña, A. (1979). Acid and neutral hydrolases in Trypanosoma cruzi: Characterization and assay. J. Protozool. 26, 304-311.
Bonaldo, M. C., D'Escoffier, L. N., Salles, J. M., and Goldenberg, S. (1991). Characterization and expression of proteases during Trypanosoma cruzi metacyclogenesis. Exp. Parasitol. 73, 44-51.
Bontempi, E., Franke de Cazzulo, B. M., Ruiz, A. M., and Cazzulo, J. J. (1984). Purification and some properties of an acidic protease from epimastigotes of Trypanosoma cruzi. Comp. Biochem. Physiol. B 77, 599-604.
Bouvier, J., Schneider, P., and Etges, R. (1995). Leishmanolysin: Surface metalloproteinase of Leishmania. Meth. Enzymol. 248, 614-633.
Burleigh, B. A., and Andrews, N. W. (1995). A 120-kDa alkaline peptidase from Trypanosoma cruzi is involved in the generation of a novel Ca2+-signalling factor for mammalian cells. J. Biol. Chem. 270, 5172-5180.
Burleigh, B. A., Caler, E. V., Webster, P., and Andrews, N. W. (1997). A cytosolic serine endopeptidase from Trypanosoma cruzi is required for the generation of Ca2+ signalling in mammalian cells. J. Cell Biol. 136, 609-620.
Campetella, O., Henriksson, J., Aslund, L., Frasch, A. C. C., Pettersson, U., and Cazzulo, J. J. (1992). The major cysteine proteinase (Cruzipain) from Trypanosoma cruzi is encoded by multiple polymorphic tandemly organized genes located on different chromosomes. Mol. Biochem. Parasitol. 50, 225-234.
Campetella, O., Martinez, J., and Cazzulo, J. J. (1990). A major cysteine proteinase is developmentally regulated in Trypanosoma cruzi. FEMS Microbiol. Lett. 67, 145-150.
Cazzulo, J. J. (1984). Protein and aminoacid catabolism in Trypanosoma cruzi. Comp. Biochem. Physiol. B 79, 309-320.
Cazzulo, J. J., Cazzulo Franke, M. C., Martinez, J., and Franke de Cazzulo, B. M. (1990). Some kinetic properties of a cysteine proteinase (Cruzipain) from Trypanosoma cruzi. Biochim. Biophys. Acta 1037, 186-191.
Cazzulo, J. J., Couso, R., Raimondi, A., Wernstedt, C., and Hellman, U. (1989). Further characterization and partial amino acid sequence of a cysteine proteinase from Trypanosoma cruzi. Mol. Biochem. Parasitol. 33, 33-41.
Cazzulo, J. J., Labriola, C., Parussini, F., Duschak, V., Martinez, J., and Franke de Cazzulo, B. M. (1995). Cysteine proteinases in Trypanosoma cruzi and other Trypanosomatid parasites. Acta Chim. Slovenica 42, 409-418.
Cazzulo, J. J., Martinez, J., Parodi, A. J. A., Wernstedt, C., and Hellman, U. (1992). On the posttranslational modifications at the C-terminal domain of the major cysteine proteinase (cruzipain) from Trypanosoma cruzi. FEMS Microbiol. Lett. 100, 411-416.
Cazzulo, J. J., Stoka, V., and Turk, V. (1997). Cruzipain, the major cysteine proteinase from the protozoan parasite Trypanosoma cruzi. Biol. Chem. 378, 1-10.
Coombs, G. H., and Mottram, J. C. (1997). Proteinases of Trypanosomes and Leishmania. In "Trypanosomiasis and Leishmaniasis" (G. Hide, J. C. Mottram, G. H. Coombs, and P. H. Holmes, Eds.). CAB International, Oxford.
Del Nery, E., Juliano, M. A., Meldal, M., Svendsen, I., Scharfstein, J., Walmsley, A., and Juliano, L. (1997a). Characterization of the substrate specificity of the major cysteine proteinase (cruzipain) from Trypanosoma cruzi using a portion-mixing combinatorial library and fluorogenic peptides. Biochem. J. 323, 427-433.
Del Nery, E., Juliano, M. A., Lima, A. P. C. A., Scharfstein, J., and Juliano, L. (1997b). Kininogenase activity by the major cysteinyl proteinase (cruzipain) from Trypanosoma cruzi. J. Biol. Chem. 272, 25713-25718.
Eakin, A. E., Mills, A. A., Harth, G., McKerrow, J. H., and Craik, C. S. (1992). The sequence, organization, and expression of the major cysteine proteinase (cruzain) from Trypanosoma cruzi. J. Biol. Chem. 267, 7411-7420.
Fernandes, L. C., Bastos, I. M. D., Lauria-Pires, L., Grellier, P., Schrével, J., Teixeira, A. R. L., and Santana, J. M. (1997). The oligopeptidase B and the cathepsin B-like protease of Trypanosoma cruzi are antigenic in human and rabbit infections. Mem. Inst. Oswaldo Cruz 92, 262.
Franke de Cazzulo, B. M., Martinez, M., North, M. J., Coombs, G. H., and Cazzulo, J. J. (1994). Effect of proteinase inhibitors on growth and differentiation of Trypanosoma cruzi. FEMS Microbiol. Lett. 124, 81-86.
Fresno, M., Hernandez-Munain, C., de-Diego, J., Rivas, L., Scharfstein, J., and Bonay, P. (1994). Trypanosoma cruzi: Identification of a membrane cysteine proteinase linked through a GPI anchor. Brazilian J. Med. Biol. Res. 27, 431-437.
González, J., Ramalho-Pinto, F. J., Frevert, U., Ghiso, J., Tomlinson, S., Scharfstein, J., Corey, E. J., and Nussenzweig, V. (1996). Proteasome activity is required for the stage-specific transformation of a protozoan parasite. J. Exp. Med. 184, 1909-1918.
Greig, S., and Ashall, F. (1990). Electrophoretic detection of Trypanosoma cruzi peptidases. Mol. Biochem. Parasitol. 39, 31-38.

Harth, G., Andrews, N., Mills, A. A., Engel, J. C., Smith, R., and McKerrow, J. H. (1993). Peptide-fluoromethyl ketones arrest intracellular replication and intercellular transmission of Trypanosoma cruzi. Mol. Biochem. Parasitol. 58, 17-24.
Healy, N., Greig, S., Enahoro, H., Roberts, H., Drake, L., Shaw, E., and Ashall, F. (1992). Detection of peptidases in Trypanosoma cruzi epimastigotes using chromogenic and fluorogenic substrates. Parasitology 104, 315-322.
Hua, S.-B., To, W.-Y., Nguyen, T. T., Wong, M.-L., and Wang, C. C. (1996). Purification and characterization of proteasomes from Trypanosoma brucei. Mol. Biochem. Parasitol. 78, 33-46.
Itow, S., and Camargo, E. P. (1977). Proteolytic activities in cell extracts of Trypanosoma cruzi. J. Protozool. 24, 591-595.
Krettli, A., Thomas, N., and Eisen, H. (1980). Escape mechanisms of Trypanosoma cruzi from the host immune system. In "Les Colloques de l'INSERM: Cancer Immunology and Parasite Immunology" (L. Israel, P. Lagrange, and J. C. Salomon, Eds.), Vol. 97, pp. 553-558. INSERM, Paris.
Labriola, C., and Cazzulo, J. J. (1995). Purification and partial characterization of a cysteine proteinase from Trypanosoma rangeli. FEMS Microbiol. Lett. 129, 143-148.
Labriola, C., Sousa, M., and Cazzulo, J. J. (1993). Purification of the major cysteine proteinase (cruzipain) from Trypanosoma cruzi by affinity chromatography. Biological Res. 26, 101-107.
Lalmanach, G., Mayer, R., Serveau, C., Scharfstein, J., and Gauthier, F. (1996). Biotin-labelled peptidyl diazomethane inhibitors derived from the substrate-like sequence of cystatin: Targeting of the active site of cruzipain, the major cysteine proteinase of Trypanosoma cruzi. Biochem. J. 318, 395-399.
Lewinsohn, R. (1981). Carlos Chagas and the discovery of Chagas's disease (American trypanosomiasis). J. Roy. Soc. Med. 74, 451-455.
Lima, A. P. C. A., Scharfstein, J., Storer, A. C., and Ménard, R. (1992). Temperature-dependent substrate inhibition of the cysteine proteinase (GP 57/51) from Trypanosoma cruzi. Mol. Biochem. Parasitol. 56, 335-338.
Lima, A. P. C. A., Tessier, D. C., Thomas, D. Y., Scharfstein, J., Storer, A. C., and Vernet, T. (1994). Identification of new cysteine protease gene isoforms in Trypanosoma cruzi. Mol. Biochem. Parasitol. 67, 333-338.
Lowndes, C. M., Bonaldo, M. C., Thomaz, N., and Goldenberg, S. (1996). Heterogeneity of metalloproteinase expression in Trypanosoma cruzi. Parasitology 112, 393-399.
Martinez, J., Campetella, O., Frasch, A. C. C., and Cazzulo, J. J. (1991). The major cysteine proteinase (Cruzipain) from Trypanosoma cruzi is antigenic in human infections. Infect. Immun. 59, 4275-4277.
Martinez, J., Campetella, O., Frasch, A. C. C., and Cazzulo, J. J. (1993). The reactivity of sera from chagasic patients against different fragments of cruzipain, the major cysteine proteinase from Trypanosoma cruzi, suggests the presence of defined antigenic and catalytic domains. Immunol. Lett. 35, 191-196.
Martinez, J., and Cazzulo, J. J. (1992). Anomalous electrophoretic behaviour of the major cysteine proteinase (cruzipain) from Trypanosoma cruzi in relation to its apparent molecular weight. FEMS Microbiol. Lett. 95, 225-230.
Martinez, J., Henriksson, J., Rydåker, M., Pettersson, U., and Cazzulo, J. J. (1998). Polymorphisms of the genes encoding cruzipain, the major cysteine proteinase of Trypanosoma cruzi, in the region encoding the C-terminal domain. FEMS Microbiol. Lett. 159, 35-39.
McGrath, M. E., Eakin, A. E., Engel, J. C., McKerrow, J. H., Craik, C. S., and Fletterick, R. J. (1995). The crystal structure of cruzain: A therapeutic target for Chagas' disease. J. Mol. Biol. 247, 251-259.
McKerrow, J. H., McGrath, M. E., and Engel, J. C. (1995). The cysteine protease of Trypanosoma cruzi as a model for antiparasite drug design. Parasitol. Today 11, 279-282.
Meirelles, M. N. L., Juliano, L., Carmona, E., Silva, S. G., Costa, E. M., Murta, A. C. M., and Scharfstein, J. (1992). Inhibitors of the major cysteinyl proteinase (GP57/51) impair host cell invasion and arrest the intracellular development of Trypanosoma cruzi in vitro. Mol. Biochem. Parasitol. 52, 175-184.
Metzner, S. I., Sousa, M. C., Hellman, U., Cazzulo, J. J., and Parodi, A. J. (1996). The use of UDP-Glc:glycoprotein glucosyltransferase for radiolabelling protein-linked high mannose-type oligosaccharides. Cell. Mol. Biol. 42, 631-635.
Mottram, J. C., North, M. J., Barry, J. D., and Coombs, G. H. (1989). A cysteine proteinase cDNA from Trypanosoma brucei predicts an enzyme with an unusual C-terminal extension. FEBS Lett. 258, 211-215.
Mottram, J., Frame, M. J., Brooks, D. R., Tetley, L., Hutchison, J. E., Souza, A. E., and Coombs, G. H. (1997). The multiple cpb cysteine proteinase genes of Leishmania mexicana encode isoenzymes that differ in their stage regulation and substrate preferences. J. Biol. Chem. 272, 14285-14293.
Murta, A. C. M., Persechini, P. M., de Souto Padron, T., de Souza, W., Guimaraes, J. A., and Scharfstein, J. (1990). Structural and functional identification of GP57/51 antigen of Trypanosoma cruzi as a cysteine proteinase. Mol. Biochem. Parasitol. 43, 27-38.
Nóbrega, O. T., Dos Santos Silva, M. A., Teixeira, A. R. L., and Santana, J. M. (1996). Further characterization and molecular cloning of an ATP-activated cysteine protease of Trypanosoma cruzi. Mem. Inst. Oswaldo Cruz 91, 271.
Parodi, A. J., Labriola, C., and Cazzulo, J. J. (1995). The presence of complex-type oligosaccharides at the C-terminal domain glycosylation site of some molecules of cruzipain. Mol. Biochem. Parasitol. 69, 247-255.
Parussini, F., Duschak, V. G., and Cazzulo, J. J. (1998). Membrane-bound cysteine proteinase isoforms in different developmental stages of Trypanosoma cruzi. Cell. Mol. Biol. 44, 513-519.
Pastini, A. C., Iglesias, S. R., Carricarte, V. C., Guerin, M. E., Sanchez, D. O., and Frasch, A. C. C. (1994). Immunoassay with recombinant Trypanosoma cruzi antigens potentially useful for screening donated blood and diagnosing Chagas disease. Clin. Chem. 40, 1893-1897.
Piras, M. M., Henriquez, D., and Piras, R. (1985). The effect of proteolytic enzymes and protease inhibitors on the interaction of Trypanosoma cruzi and fibroblasts. Mol. Biochem. Parasitol. 14, 151-163.
Raimondi, A., Wernstedt, C., Hellman, U., and Cazzulo, J. J. (1991). Degradation of oxidized insulin A and B chains by the major cysteine proteinase (Cruzipain) from Trypanosoma cruzi epimastigotes. Mol. Biochem. Parasitol. 49, 341-344.
Rangel, H. A., Araújo, P. M. F., Repka, D., and Costa, M. G. (1981). Trypanosoma cruzi: Isolation and characterization of a proteinase. Exp. Parasitol. 52, 199-209.
Sakanari, J. A., Staunton, C. E., Eakin, A. E., Craik, C. S., and McKerrow, J. H. (1989). New serine proteases from nematode and protozoan parasites: isolation of sequence homologues using generic molecular probes. Proc. Nat. Acad. Sci. USA 86, 4863-4867.
Santana, J. M., Grellier, P., Rodier, M.-H., Schrével, J., and Teixeira, A. (1992). Purification and characterization of a new 120-kDa alkaline proteinase of Trypanosoma cruzi. Biochem. Biophys. Res. Commun. 187, 1466-1473.
Santana, J. M., Grellier, P., Schrével, J., and Teixeira, A. R. L. (1997). A Trypanosoma cruzi-secreted 80 kDa proteinase with specificity for human collagen types I and IV. Biochem. J. 324, 129-137.
Scharfstein, J., Schechter, M., Senna, M., Peralta, J. M., Mendonça-Previato, L., and Miles, M. M. (1988). Trypanosoma cruzi: Characterization and isolation of a 57/51,000 m.w. surface glycoprotein (GP57/51) expressed by epimastigotes and bloodstream trypomastigotes. J. Immunol. 137, 1336-1341.
Serveau, C., Lalmanach, G., Juliano, M. A., Scharfstein, J., Juliano, L., and Gauthier, F. (1996). Investigation of the substrate specificity of cruzipain, the major cysteine proteinase of Trypanosoma cruzi, through the use of cystatin-derived substrates and inhibitors. Biochem. J. 313, 951-956.

Soares, M. J., Souto-Padrón, T., and De Souza, W. (1992). Identification of a large pre-lysosomal compartment in the pathogenic protozoon Trypanosoma cruzi. J. Cell Sci. 102, 157-167.
Souto-Padrón, T., Campetella, O., Cazzulo, J. J., and de Souza, W. (1990). Cysteine proteinase in Trypanosoma cruzi: Immunocytochemical localization and involvement in parasite-host cell interaction. J. Cell Sci. 96, 485-490.
Stoka, V., Nycander, M., Lenarcic, B., Labriola, C., Cazzulo, J. J., Björk, I., and Turk, V. (1995). Inhibition of cruzipain, the major cysteine proteinase of the protozoan parasite, Trypanosoma cruzi, by proteinase inhibitors of the cystatin superfamily. FEBS Lett. 370, 101-104.
Tomas, A. M., and Kelly, J. M. (1996). Stage regulated expression of cruzipain, the major cysteine proteinase of Trypanosoma cruzi, is independent of the level of RNA. Mol. Biochem. Parasitol. 76, 91-103.
Tomas, A. M., Miles, M. M., and Kelly, J. M. (1997). Overexpression of cruzipain, the major cysteine proteinase of Trypanosoma cruzi, is associated with enhanced metacyclogenesis. Eur. J. Biochem. 244, 596-603.
Turk, B., Stoka, V., Turk, V., Johansson, G., Cazzulo, J. J., and Björk, I. (1996). High-molecular-weight kininogen binds two molecules of cysteine proteinases with different rate constants. FEBS Lett. 391, 109-112.
Vickerman, K. (1976). The diversity of the kinetoplastid flagellates. In "Biology of the Kinetoplastida" (W. H. R. Lumsden and D. A. Evans, Eds.), Vol. 1, pp. 1-34. Academic Press, New York.
World Health Organization (1997). Chagas disease. In "Tropical Disease Research," pp. 112-123. WHO, Geneva, Switzerland.
Yokoyama-Yasunaka, J. K. U., Pral, E. M. F., Oliveira, O. C., Jr., Alfieri, S. C., and Stolf, A. M. S. (1994). Trypanosoma cruzi: Identification of proteinases in shed components of trypomastigote forms. Acta Tropica 57, 307-315.

Cellular Proteinases and Viral Infection: Influenza Virus, Sendai Virus, and HIV-1

HIROSHI KIDO, YE CHEN, MEIKO MURAKAMI, YOSHIHITO BEPPU, AND TAKAE TOWATARI
Division of Enzyme Chemistry, Institute for Enzyme Research, The University of Tokushima, Tokushima 770, Japan

I. Introduction
II. Tryptase Clara in the Airway Epithelium Proteolytically Triggers the Infectivity of the Influenza A and Sendai Viruses
III. Human T-Cell Proteinase, Tryptase TL2, Binds to HIV-1 Envelope Glycoprotein gp120 Following Limited Cleavage of the gp120 V3 Loop
References

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

I. INTRODUCTION

Post-translational proteolytic cleavage of the precursors of the fusion glycoproteins of enveloped viruses is a prerequisite for viral fusion activity and infectivity. Proteolytic modification is therefore indispensable for effective virus spread in the infected host and is a prime determinant of the pathogenicity of enveloped viruses. Although receptors capable of binding pneumotropic viruses such as the influenza and Sendai viruses are widely distributed among various cell types in the lungs and other organs (Homma and Ohuchi, 1973; Rott, 1979; Tashiro and Homma, 1983; Klenk and Rott, 1988), the target of these viral infections is restricted to the airway epithelial cells because the processing protease is restricted to those cells or that compartment and determines the infectious tropism of these enveloped viruses. In contrast, virulent pantropic paramyxo- and orthomyxoviruses and retroviruses mostly have multiple basic residues at the cleavage sites (Garten et al., 1994), and the processing takes place in the cis or medial cisternae of the rough endoplasmic reticulum-Golgi complex or the late Golgi region through intracellular processing proteinases (Stein and Engleman, 1990; Kamoshita et al., 1995). Human immunodeficiency virus type 1 (HIV-1) requires proteolytic processing of the envelope glycoprotein precursor gp160 by an intracellular processing proteinase before virus budding. Furthermore, several lines of evidence suggest that HIV-1 also requires the processing of gp120 by an extracellular processing proteinase before virus entry (Hattori et al., 1989; Moore and Nara, 1991; Werner and Levy, 1993). In this chapter we describe the current understanding of the processing proteinase in the airway, which determines the infectious tropism of the influenza and Sendai viruses, as well as the cellular processing proteinase that may play a role in the extracellular processing of HIV-1 gp120 before virus entry.

II. TRYPTASE CLARA IN THE AIRWAY EPITHELIUM PROTEOLYTICALLY TRIGGERS THE INFECTIVITY OF THE INFLUENZA A AND SENDAI VIRUSES

Processing proteases of the fusion glycoprotein precursor (F0) of Sendai virus and/or the hemagglutinin (HA) of human influenza virus have been reported. The F0 protein of Sendai virus is cleaved by trypsin in vitro (Homma and Ohuchi, 1973; Scheid and Choppin, 1974) and by blood clotting factor Xa in the chorioallantoic fluid (Gotoh et al., 1990), but there is no evidence that these proteases also proteolytically activate viruses in the mammalian respiratory tract. We have isolated a trypsin-type serine proteinase, designated tryptase Clara, from rat lung (Kido et al., 1992) and have also partially purified it from surgically removed human inferior nasal conchae. This proteinase is localized in and secreted by secretory epithelial cells of the airway and extracellularly cleaves the precursor F0 protein of progeny Sendai virus and the HA of human influenza virus (Kido et al., 1992; Tashiro et al., 1992). Figure 1 shows the immunohistochemical localization of tryptase Clara. Light micrographs revealed heavy deposits of reaction products in nonciliated cuboidal epithelial cells lining the lobar bronchi and bronchioles. Other cells, including ciliated epithelial cells and endocrine cells of the bronchi, subepithelial tissues, alveolar cells, lung mast cells, and endothelial cells of the alveolar capillaries, were not labeled. On immunoelectron micrography, gold particles indicating the antigenic site of tryptase Clara were found mainly in secretory granules (Sakai et al., 1993). Tryptase Clara is constitutively secreted from the granules (Kido et al., 1993), and the secretion is stimulated by infection with Sendai virus. The accumulation of tryptase Clara on the luminal surface of bronchiolar epithelial cells and/or in the airway lumen may provide favorable conditions for proteolytic viral activation and multiplication.

FIGURE 1 Immunohistochemical localization of tryptase Clara. (A) High-magnification light micrograph of a respiratory bronchiole and adjacent alveoli stained with antibodies against tryptase Clara. Positive staining can be seen in Clara cells (bar, 20 μm). (B) Immunoelectron micrograph of the peripheral region of a Clara cell. Immunoelectron labeling with gold particles indicated tryptase Clara antigenicity. Granules can be seen protruding from the cell surface (arrows) (bar, 5 μm).

Purified tryptase Clara from rat lungs exhibits molecular masses of 30 ± 1.5 kDa on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) under reducing conditions and 180 ± 16 kDa on gel-permeation high-performance liquid chromatography. The enzyme was active toward substrates with a single arginine at the P1 position, whereas it poorly cleaved peptides with basic amino acids at both the P1 and P2 positions. The enzyme was not affected by divalent metal ions, including Ca²⁺ and Mg²⁺, by chelating agents, or by dithiothreitol. Of the compounds tested, Boc-Gln-Ala-Arg-MCA was the best substrate; it fits the consensus cleavage motif, Gln(Glu)-X-Arg, of the fusion glycoprotein precursors of human influenza, Sendai, and Newcastle disease viruses (Kido et al., 1992).

Nonactivated influenza A/Aichi/2/68 (H3N2) viruses grown in MDCK cells were exposed in vitro to various concentrations of tryptase Clara, and the restoration of infectivity was compared with that obtained by trypsin digestion. Tryptase Clara triggered the infectivity of the virus in a dose-dependent manner, as shown in Fig. 2A. Although the induction of infectivity by tryptase Clara was less efficient than that by trypsin, tryptase Clara, unlike trypsin, did not decrease the viral infectivity at higher concentrations. The results suggested that tryptase Clara specifically cleaves at the activation cleavage site of the fusion glycoprotein HA and not at other residues, which may be cleaved by trypsin at high concentrations. The hemolytic activity of the virus was also enhanced by tryptase Clara, but its hemagglutinating activity was not affected. SDS-PAGE separation of [³H]glucosamine-labeled influenza virus revealed that tryptase Clara cleaved HA into subunits HA1 and HA2 (Fig. 2B). Direct amino acid sequencing of the amino-terminus of the HA2 subunit revealed the residues Gly-Leu-Phe-Gly-Ala-Ile-Ala-Gly-, indicating that the cleavage site of HA for tryptase Clara was between Arg325 and Gly326. The enzyme also cleaved the precursor polypeptide F0 into subunits F1 and F2 at a site between Arg116 and Phe117. The results support our notion that tryptase Clara specifically recognizes and cleaves at a single arginine residue in the consensus cleavage motif, Gln(Glu)-X-Arg, in the precursors of these viral envelope fusion glycoproteins.

FIGURE 2 Proteolytic activation of influenza virus A/Aichi/2/68 (H3N2) in vitro by tryptase Clara. (A) Activation of virus infectivity. Nonactivated virus propagated in Madin-Darby canine kidney (MDCK) cells was suspended in PBS- (pH 7.2) and then digested with tryptase Clara (0) or trypsin (o) for 15 min at 37 °C. The infectivity of the activated virus was assayed by immunofluorescent cell counting, using MDCK cells in the absence of trypsin, and is shown as cell-infecting units (CIU) (Homma and Ohuchi, 1993). (B) Sodium dodecyl sulfate-polyacrylamide gel electrophoresis of the viral glycoproteins. [³H]Glucosamine-labeled nonactivated virus grown in MDCK cells (lane 1) was treated with trypsin (10 μg/ml) for 15 min (lane 2), tryptase Clara (10 μg/ml) for 15 min (lane 3), tryptase Clara (50 μg/ml) for 15 min (lane 4), or tryptase Clara (50 μg/ml) for 30 min (lane 5) at 37 °C. The SDS-PAGE was performed in a 13% acrylamide gel under reducing conditions.

To evaluate the role of tryptase Clara in infection by these pneumotropic viruses in vivo, rats that had been infected with Sendai virus were treated intranasally and intraperitoneally with anti-tryptase Clara antibodies, which specifically bind to the enzyme and inhibit its activity (Fig. 3) (Tashiro et al., 1992). Without the antibodies, progeny virus in the lungs was produced in the activated form and underwent multiple cycles of replication until days 5 to 6, when viral replication began to stop, presumably because of the host immunological response. Severe inflammatory lesions were evident in the lungs (Fig. 3B), and more than 90% of the infected rats had died by 9 days after infection (Fig. 3C). In contrast, when the anti-tryptase Clara antibodies were administered intranasally and intraperitoneally to infected rats, activation of the progeny virus in the lungs was reduced to 10 to 20% and viral replication was suppressed accordingly. Lung lesions were induced, but only slightly compared with those in untreated animals (Fig. 3B). The mortality rate of infected rats was reduced, and half the animals survived even a fatal dose of the virus (Fig. 3C). Immunoglobulins from normal rabbits had no preventive effect (data not shown). These results indicate that tryptase Clara extracellularly activates progeny virus in the lungs of rats and is a primary host protease that activates progeny Sendai virus at that site.

The physiological role of tryptase Clara is unknown, but the activity of the enzyme, like those of many proteases, may be strictly regulated by endogenous protease inhibitors. Its activity was inhibited by reported inhibitors of trypsin-type serine proteases, such as aprotinin, human mucus protease inhibitor (MPI), leupeptin, antipain, and Kunitz-type soybean trypsin inhibitor, but was less strongly inhibited by α1-antitrypsin, benzamidine, and Bowman-Birk soybean trypsin inhibitor.
To identify the endogenous inhibitor of tryptase Clara that acts as a defensive compound in the airway, we examined human bronchial lavage fluid and found two inhibitors of this enzyme. One was pulmonary surfactant (Kido et al., 1993), a lipoprotein complex that coats the alveolar epithelium to lower the surface tension at the air-liquid interface and increases phagocytosis by mononuclear cells and alveolar macrophages. The other inhibitor was human mucus protease inhibitor (MPI) (also named antileukoprotease and secretory leukoprotease inhibitor) (Wallner and Fritz, 1974; Seemüller et al., 1986; Fritz, 1988), a nonglycosylated granulocyte elastase inhibitor (Beppu et al., 1997). Both compounds are produced by nonciliated secretory airway epithelial cells, such as Clara and goblet cells (Mooren et al., 1983; Puchelle et al., 1985), and have been found in bronchial lavage fluid as well as in the alveolar walls (Willems et al., 1989). We found a novel role for these compounds in defense against infection by pneumotropic viruses under physiological conditions. These compounds significantly inhibited the proteolytic activation of the viruses by tryptase Clara in a dose-dependent manner in vitro and in vivo (Kido et al., 1993; Beppu et al., 1997). However, the concentrations of the constitutively secreted inhibitors may be insufficient for complete inhibition of infection, because tryptase Clara activity was observed in rat bronchial lavage despite the presence of these inhibitors in the airway fluid. Furthermore, pneumotropic virus infection stimulates secretion of tryptase Clara (Kido et al., 1993) and may induce an imbalance between the amount of tryptase Clara and that of the inhibitors. These data suggest that the administration of these endogenous inhibitors at therapeutic doses would prevent proteolytic activation and inhibit multiple cycles of viral replication in vivo (Fig. 4).

[Figure 3, panels A-C; horizontal axis: days after infection]
FIGURE 3 Effects of anti-tryptase Clara antibodies on viral replication in the lungs, pulmonary pathology, and mortality rates of rats infected with Sendai virus. Three-week-old rats were infected intranasally with 2 × 10⁴ p.f.u. of Sendai virus (D). One group was administered 10-fold-diluted anti-tryptase Clara immunoglobulins intranasally (arrowheads) and intraperitoneally (arrows). (A) Viral replication and activation in the lungs in the absence (solid lines) or presence (broken lines) of antibodies. Lung homogenates were assayed differentially for total yield (○) and activated viruses (●). There were 20 rats in the antibody-treated group and 21 in the untreated group. Each plot represents the mean value for 2 to 4 animals that died or were sacrificed on the indicated days. (B) Lung pathology of infected rats without (●) or with (○) administration of antibodies. Lung lesions were scored from 1 to 4 according to the extent of macroscopic consolidation of the lung surface. Each plot represents one of the rats described above. (C) Mortality rates of infected rats without (solid line) or with (broken line) administration of antibodies. Each group contained 22 rats.
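
The substrate specificity described in this section, a single arginine at P1 within the consensus motif Gln(Glu)-X-Arg, can be expressed as a simple pattern search over a one-letter amino acid sequence. The following is only a minimal sketch: the function names and the toy sequence are hypothetical, and a pattern match is of course no substitute for the cleavage experiments reported above.

```python
import re

# Consensus motif quoted in the text: Gln(Glu)-X-Arg, cleaved after the Arg (P1).
# The lookahead allows overlapping matches to be reported.
CONSENSUS = re.compile(r"(?=[QE].R)")


def predicted_cleavage_sites(sequence):
    """Return 0-based indices of the residue just after each Gln(Glu)-X-Arg match."""
    return [match.start() + 3 for match in CONSENSUS.finditer(sequence)]


def has_basic_p2(sequence, site):
    """True if P2, the residue preceding the P1 Arg, is Lys or Arg."""
    return sequence[site - 2] in "KR"


# Hypothetical toy sequence: an invented Gln-Thr-Arg site placed in front of the
# HA2 amino-terminus reported in the text (Gly-Leu-Phe-Gly-Ala-Ile-Ala-Gly).
toy = "AAQTRGLFGAIAG"
for site in predicted_cleavage_sites(toy):
    print(toy[:site], "|", toy[site:], "basic P2:", has_basic_p2(toy, site))
```

For the toy sequence the search splits the chain into AAQTR and GLFGAIAG. The separate P2 check reflects the observation that peptides with basic residues at both P1 and P2 were poorly cleaved; a sequence made up only of basic residues before the scissile bond, such as the multibasic sites of pantropic viruses, contains no Gln or Glu at P3 and is not matched at all.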

III. HUMAN T-CELL PROTEINASE, TRYPTASE TL2, BINDS TO HIV-1 ENVELOPE GLYCOPROTEIN gp120 FOLLOWING LIMITED CLEAVAGE OF THE gp120 V3 LOOP

The initial stage of infection of human CD4⁺ cells by HIV is virus-host cell fusion mediated through high-affinity binding between the envelope glycoprotein, gp120, and the CD4 receptor, together with chemokine receptors as coreceptors on the cell surface (Dalgleish et al., 1984; D'Souza and Harden, 1996).

[Figure 4 labels: Ciliated Cell, Clara Cell]
FIGURE 4 Hypothetical scheme of the proteolytic activation of influenza A and Sendai viruses in the respiratory tract, and inhibition of the activation by endogenous inhibitors of tryptase Clara.

In the process of HIV-1 entry, envelope glycoprotein gp120 is cleaved, within the third variable domain (V3) loop, into two fragments (Werner and Levy, 1993). The V3 domain of gp120, which contains a disulfide-linked loop of about 35 amino acids, is an important determinant of HIV-1 infectivity, membrane fusion, and infectious tropism (Moore and Nara, 1991; Javaherian et al., 1989; Freed et al., 1991). Antibodies to this domain inhibit virus entry after the virus binds to its cellular receptor, CD4, and chemokine receptors, and mutations introduced into the loop induce a loss of membrane fusion activity. Although the role of the V3 loop cleavage in virus entry has not been fully elucidated, the findings suggest that the proteotytic modification of the V3 loop allows a change in the conformation of gp120 and thereby plays some fundamental role in virus entry after the virus has bound to the CD4 and chemokine receptors. Because of the sequence similarity between the crowns of most V3 loop sequences (GPGRAF) and the reactive site of Kunitz type II protease inhibitors (GPCRAF), we speculated that the V3 loop interacts with a membrane-bound or membrane-associated cellular protease and then is cleaved (Hattori et al., 1989; Kido et al., 1990). We previously isolated a membrane-associated serine proteinase, tryptase TL2, from a human CD4 + T-cell line, Molt-4 Clone 8 cells. Tryptase TL2 exhibits both trypsin- and chymotrypsinlike endopeptidase activities, and these activities are inhibited by Kunitz type II protease inhibitors with a GPCRAF sequence in their reactive sites, V3 loop peptides, and gp120 of HIV-1 (Kido et al., 1990). Tryptase TL2 binds to envelope glycoprotein gp120 by interacting with its V3 domain (Kido et al., 1991), with an apparent dissociation constant (Kd) of 3.8 X 10-8 M. 
Although this value indicates that gp120 exhibits high affinity for tryptase TL2, the affinity is lower than that for soluble CD4 (Kd = 0.72-4.0 × 10⁻⁹ M) (Smith et al., 1987), suggesting that gp120 initially binds to the CD4 molecule on the cell surface, with subsequent interaction with tryptase TL2 (Inoue et al., 1994). Figure 5 shows the proteolytic cleavage by tryptase TL2 of recombinant gp120 from the HIV-1 SF2 strain (V3 loop sequence: IYIGPGRAFHTTGRIIGDIRKA) expressed in CHO cells, with or without recombinant soluble CD4 (rsCD4) (Niwa et al., 1996). Tryptase TL2 cleaved the full-length gp120 into two protein species of about 70 and 50 kDa, respectively (Fig. 5A, lanes 2 and 3), and the cleavage was suppressed by neutralizing antibodies against the V3 loop, such as anti-gp120N and HIV-1 IIIB-V3-13, both of which react with the center tip of the V3 domain and markedly neutralize the HIV-1 IIIB, MN, and SF2 strains (Laman et al., 1992) (Fig. 5B, lanes 3 and 5, respectively). The cleavage was highly specific, since there was no additional cleavage of breakdown products, even on incubation for over 12 h. Processing of gp120 by thrombin was also examined, and the cleavage products are shown in Fig. 5A, lanes 4 and 5, and in Fig. 5B, lane 4, as markers of the V3 loop cleavage products.

FIGURE 5 Proteolytic cleavage of recombinant HIV-1 SF2 gp120 by various proteases in the presence or absence of rsCD4, and blocking of the cleavage by neutralizing antibodies. (A) Purified recombinant HIV-1 SF2 gp120 (1.5 µg) from CHO cells (lane 1) was incubated with tryptase TL2 (1.5 µg) (lanes 2 and 3), thrombin (0.2 µg) (lanes 4 and 5), or cathepsin G (0.2 µg) (lanes 6 and 7) in the absence (lanes 2, 4, and 6) or presence (lanes 3, 5, and 7) of rsCD4 (0.5 µg) at 37 °C for 8 h. Digested samples were Western immunoblotted with anti-HIV-1 gp120 polyclonal antibodies. Immunoreactive fragments were detected by ECL chemiluminescence. (B) Human immunodeficiency virus type 1 SF2 gp120 (1.5 µg) (lane 1) was incubated with tryptase TL2 (1.5 µg) in the absence (lane 2) or presence of neutralizing monoclonal antibodies against the V3 domain, such as anti-HIV-1 gp120 N (10 µg) (lane 3) and IIIB-V3-13 (10 µg) (lane 5), or with thrombin (0.2 µg) alone (lane 4). Digested samples were treated as described in A.

Thrombin cleaves gp120 at the V3 loop (GPGRSAFVT), and forms 70 and 50 kDa products in vitro (Clements et al., 1991), although the cleavage of gp120 does not trigger the infectivity of HIV-1. We also analyzed the effect of cathepsin G, a membrane-associated protease with trypsin- and chymotrypsin-like activities, on gp120 because the activity of cathepsin G is inhibited by V3 peptides and HIV-1 gp120 (Avril et al., 1993). However, cathepsin G did not produce limited cleavage of gp120; instead, it converted recombinant gp120 into various small fragments with apparent molecular masses below 40 kDa (Fig. 5A, lane 6). The proteolytic cleavage by tryptase TL2 was slightly enhanced by rsCD4 in the reaction mixture (Fig. 5A, lane 3), but the effect of rsCD4 on the processing by thrombin or cathepsin G was not remarkable (lanes 5 and 7, respectively). Considering the variability of the V3 loop in the HIV-1 strains isolated so far, the V3 loop cleavage sites for tryptase TL2 and/or other protease(s) may differ among strains. To examine the diversity of the cleavage site(s) for tryptase TL2 in various V3 loops, we determined the cleavage sites of gp120 of several strains. The V3 loop of HIV-1 MN, which is the most common genotype in
North America, was preferentially cleaved by tryptase TL2 at the chymotryptic cleavage site in the V3 loop (GPGRAFSYTTKN). The V3 loop of HIV-1 SF2 gp120 was cleaved by tryptase TL2 at a different site (GPGRAFHTTSGR) from that for thrombin. The cleavage site of the V3 loop of HIV-1 IIIB gp120 for tryptase TL2 was (SIRQSRGPGRAF). These data suggest that the most frequent cleavage sites for the enzymes are located around the center tip sequence, GPGRAF, but not in that sequence. The multisubstrate specificity of tryptase TL2 may correspond to the variability of the V3 loop, and the cleavage diversity may correspond to the V3-based viral tropism and infectivity (Niwa et al., 1996). The effects of inhibitors of tryptase TL2 on HIV-1 replication have been reported. The inhibitors of tryptase TL2, such as leech-derived tryptase inhibitor form C, recombinant variant [Arg 15, Phe 17, Glu 52] aprotinin and [Leu 15, Phe 17, Glu 52] aprotinin, and the modified C-terminal domain of bikunin ([Arg 94] bikunin), completely inhibit the syncytium formation induced by HIV-1 and also inhibit virus replication at concentrations between 2 and 20 µM (Auerswald et al., 1994; Brinkmann et al., 1997). Mucus protease inhibitor (MPI), which inhibits the activity of tryptase Clara and influenza virus infection in the airway (Beppu et al., 1997), has also been found to contribute to the important antiviral activity of saliva associated with the infrequent oral transmission of HIV-1 (McNeely et al., 1995). Since interaction of MPI with viral proteins has not been demonstrated, MPI may interact with a host cell-associated molecule, such as tryptase TL2, and thereby inhibit the processing of gp120 and viral internalization. The two host proteases discussed in this chapter, tryptase Clara and tryptase TL2, illustrate the role of cellular factors in the infection process. Clearly, other viral infectious agents may also exploit features of the host's target cells in their strategy for cell entry and activation.
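
The affinity comparison made earlier in this section (Kd of 3.8 × 10⁻⁸ M for tryptase TL2 versus 0.72-4.0 × 10⁻⁹ M for soluble CD4) can be made concrete with a short calculation. The following is only an illustrative sketch: it assumes simple 1:1 equilibrium binding with the binding partner in excess, and the 1 nM partner concentration is a hypothetical value chosen for the example, not a figure from the chapter.

```python
# Dissociation constants quoted in the text (mol/L).
KD_GP120_TRYPTASE_TL2 = 3.8e-8
KD_GP120_SOLUBLE_CD4 = (0.72e-9, 4.0e-9)  # reported range

# Hypothetical free concentration of the binding partner; NOT from the chapter.
ASSUMED_PARTNER_CONC_M = 1.0e-9


def fraction_bound(partner_conc_m: float, kd_m: float) -> float:
    """Fraction of gp120 occupied for simple 1:1 binding: [P] / ([P] + Kd)."""
    if partner_conc_m < 0 or kd_m <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return partner_conc_m / (partner_conc_m + kd_m)


def fold_difference(kd_weaker_m: float, kd_tighter_m: float) -> float:
    """How many times larger the weaker Kd is than the tighter Kd."""
    return kd_weaker_m / kd_tighter_m


# Compare the TL2 value against both ends of the reported CD4 range.
fold_min = fold_difference(KD_GP120_TRYPTASE_TL2, max(KD_GP120_SOLUBLE_CD4))
fold_max = fold_difference(KD_GP120_TRYPTASE_TL2, min(KD_GP120_SOLUBLE_CD4))

print(f"gp120 binds soluble CD4 about {fold_min:.1f}- to {fold_max:.1f}-fold "
      "more tightly than tryptase TL2")
for label, kd in [
    ("tryptase TL2", KD_GP120_TRYPTASE_TL2),
    ("soluble CD4 (weakest reported)", max(KD_GP120_SOLUBLE_CD4)),
    ("soluble CD4 (tightest reported)", min(KD_GP120_SOLUBLE_CD4)),
]:
    occupancy = fraction_bound(ASSUMED_PARTNER_CONC_M, kd)
    print(f"{label}: fraction of gp120 bound at 1 nM partner = {occupancy:.3f}")
```

Under these assumptions the two Kd values differ by roughly one to two orders of magnitude, which is in line with the suggestion that gp120 engages CD4 first and tryptase TL2 afterward.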

REFERENCES

Auerswald, E. A., Morenweiser, R., Sommerhoff, C. P., Piechottka, G. P., Eckerskorn, C., Gürtler, L. G., and Fritz, H. (1994). Recombinant leech-derived tryptase inhibitor: Construction, production, protein chemical characterization and inhibition of HIV-1 replication. Biol. Chem. Hoppe-Seyler 374, 695-703.
Avril, U-M., Martino-Ferrer, M. D., Barin, F., and Gauthier, F. (1993). Interaction between a membrane-associated serine proteinase of U-937 monocytes and peptides from the V3 loop of the human immunodeficiency virus type 1 (HIV-1) gp120 envelope glycoprotein. FEBS Lett. 317, 167-172.
Barber-Morel, C. L., Oeltmann, T. N., Edwards, K. M., and Wright, P. F. (1987). Role of respiratory tract proteases in infectivity of influenza A virus. J. Infect. Dis. 155, 667-672.
Beppu, Y., Imamura, Y., Tashiro, M., Towatari, T., Ariga, T., and Kido, H. (1997). Human mucus protease inhibitor in airway fluids is a potential defensive compound against infection with influenza A and Sendai viruses. J. Biochem. (Tokyo) 121, 309-316.
Blumberg, B. M., Giorgi, C., Rose, K., and Kolakofsky, D. (1985). Sequence determination of the Sendai virus fusion gene. J. Gen. Virol. 66, 317-331.
Brinkmann, T., Schäfers, J., Gürtler, L., Kido, H., Niwa, Y., Katunuma, N., and Tschesche, H. (1997). Inhibition of tryptase TL2 from human T4+ lymphocytes, and inhibition of HIV-1 replication in H9 cells by recombinant aprotinin and bikunin homologues. J. Protein Chem. 16, 651-660.
Clements, G. J., Price-Jones, J. M. J., Stephens, P. E., Sutton, C., Schultz, T. F., Clapham, P. R., McKeating, J. A., McClure, M. O., Thomson, S., Marsh, M., Kay, J., Weiss, R. A., and Moore, J. P. (1991). The V3 loops of the HIV-1 and HIV-2 surface glycoproteins contain proteolytic cleavage sites: A possible function in viral fusion? AIDS Res. Hum. Retroviruses 7, 3-16.
Dalgleish, A. G., Beverley, P. C. L., Clapham, P. R., Crawford, D. H., Greaves, M. F., and Weiss, R. A. (1984). The CD4 (T4) antigen is an essential component of the receptor for the AIDS retrovirus. Nature 312, 763-767.
D'Souza, M. R., and Harden, V. A. (1996). Chemokines and HIV-1 second receptors. Nature Med. 2, 1293-1300.
Freed, E. O., Myers, D. J., and Risser, R. (1991). Identification of the principal neutralizing determinant of human immunodeficiency virus type 1 as a fusion domain. J. Virol. 65, 190-194.
Fritz, H. (1988). Human mucus proteinase inhibitor (human MPI): human seminal inhibitor (HUSI-I), antileukoprotease (ALP), secretory leukocyte protease inhibitor (SLPI). Biol. Chem. Hoppe-Seyler 369, 79-82.
Garten, W., Hallenberger, S., Ortmann, D., Schäfer, W., Vey, M., Angliker, H., Shaw, E., and Klenk, H. D. (1994). Processing of viral glycoproteins by the subtilisin-like protease, furin, and its inhibition by specific peptidylchloroalkylketones. Biochimie 76, 217-225.
Gething, M.-J., Bye, J., Skehel, J., and Waterfield, M. (1980). Cloning and DNA sequences of double-stranded copies of haemagglutinin genes from H2 and H3 strains elucidates antigenic shift and drift in human influenza virus. Nature 287, 301-306.
Gotoh, B., Ogasawara, T., Toyoda, T., Inocencio, N., Hamaguchi, M., and Nagai, Y. (1990). An endoprotease homologous to the blood clotting factor X as a determinant of viral tropism in chick embryo. EMBO J. 9, 4189-4195.
Hattori, T., Koito, A., Takatsuki, K., Kido, H., and Katunuma, N. (1989). Involvement of tryptase-related cellular protease(s) in human immunodeficiency virus type 1 infection. FEBS Lett. 248, 48-52.
Homma, M., and Ohuchi, M. (1973). Trypsin action on the growth of Sendai virus in tissue culture cells. J. Virol. 12, 1457-1465.
Inoue, M., Hoshino, T., Fukuma, T., Niwa, Y., and Kido, H. (1994). Close co-localization of CD4 and a serine esterase tryptase TL2 on the cell-surface of human monocytoid and CD4+ lymphoid cells. Biochem. Biophys. Res. Commun. 201, 1390-1395.
Javaherian, K. A. J., Langlois, C., McDanal, K. L., Ross, L. I., Eckler, C. L., Jellis, A. T., Profy, J. R., Rusche, D. P., Bolognesi, D. P., Putny, S. D., and Matthews, T. J. (1989). Principal neutralization domain of the human immunodeficiency virus type 1 envelope protein. Proc. Natl. Acad. Sci. USA 86, 6768-6772.
Kamoshita, K., Shiota, M., Sakai, M., Sasaki, M., Koga, Y., Okumura, Y., and Kido, H. (1995). Calcium requirement and inhibitor spectrum for intracellular HIV type 1 gp160 processing in cultured HeLa cells and CD4+ lymphocytes: Similarity to those of viral envelope glycoprotein maturase. J. Biochem. (Tokyo) 117, 1244-1253.
Kido, H., Fukutomi, A., and Katunuma, N. (1990). A novel membrane-bound serine esterase in human T4+ lymphocytes immunologically reactive with antibody inhibiting syncytia induced by HIV-1: Purification and characterization. J. Biol. Chem. 265, 21979-21985.
Kido, H., Fukutomi, A., and Katunuma, N. (1991). Tryptase TL2 in the membrane of human T4+ lymphocytes is a novel binding protein of the V3 domain of HIV-1 envelope glycoprotein gp120. FEBS Lett. 286, 233-236.
Kido, H., Yokogoshi, Y., Sakai, K., Tashiro, M., Kishino, Y., Fukutomi, A., and Katunuma, N. (1992). Isolation and characterization of a novel trypsin-like protease found in rat bronchiolar epithelial Clara cells: Possible activator of the viral fusion glycoprotein. J. Biol. Chem. 267, 13573-13579.
Kido, H., Sakai, K., Kishino, Y., and Tashiro, M. (1993). Pulmonary surfactant is a potential endogenous inhibitor of proteolytic activation of Sendai virus and influenza virus. FEBS Lett. 322, 115-119.
Klenk, H.-D., and Rott, R. (1988). The molecular biology of influenza virus pathogenicity. Adv. Virus Res. 34, 247-281.
Laman, J. D., Schellekens, J. M. M., Abacioglu, Y. H., Lewis, G. K., Tersmette, M., Fouchier, R. A. M., Langedijk, J. P. M., Classen, E., and Boersma, M. J. A. (1992). Variant-specific monoclonal and group-specific polyclonal human immunodeficiency virus type 1 neutralizing antibodies raised with synthetic peptides from the gp120 third variable domain. J. Virol. 66, 1823-1831.
McNeely, T. B., Dealy, M., Dripps, D. J., Orenstein, J. M., Eisenberg, S. P., and Wahl, S. M. (1995). Secretory leukocyte protease inhibitor: A human saliva protein exhibiting anti-human immunodeficiency virus 1 activity in vitro. J. Clin. Invest. 96, 456-464.
Moore, J. P., and Nara, P. L. (1991). The role of the V3 loop in HIV infection. AIDS 5(Suppl.), 521-533.
Mooren, H. W. D., Kramps, J. A., Franken, C., Meijer, C. J. L. M., and Dickman, J. A. (1983). Localization of a low-molecular-weight bronchial protease inhibitor in the peripheral human lung. Thorax 38, 180-183.
Niwa, Y., Yano, M., Futaki, S., Okumura, Y., and Kido, H. (1996). T cell membrane associated serine protease, tryptase TL2, binds HIV-1 gp120 and cleaves the third-variable-domain loop of gp120: Neutralizing antibodies of human immunodeficiency virus type 1 inhibits cleavage of gp120. Eur. J. Biochem. 237, 64-70.
Puchelle, E. J., Hinnraski, J., Tournier, J. M., and Adnet, J. J. (1985). Ultrastructural localization of bronchial inhibitor in human airways using protein A-gold technique. Biol. Cell 55, 151-154.
Rott, R. (1979). Molecular basis of infectivity and pathogenicity of myxovirus. Arch. Virol. 59, 285-298.
Sakai, K., Kawaguchi, Y., Kishino, Y., and Kido, H. (1993). Electron immuno-histochemical localization in rat bronchiolar epithelial cells of tryptase Clara, which determines the pneumotropism and pathogenicity of Sendai virus and influenza virus. J. Histochem. Cytochem. 41, 89-93.
Scheid, A., and Choppin, P. W. (1974). Identification of biological activities of paramyxovirus glycoproteins: Activation of cell fusion, hemolysis, and infectivity by proteolytic cleavage of an inactive precursor protein of Sendai virus. Virology 57, 475-490.
Seemüller, U., Arnhold, M., Fritz, H., Wiedenmann, K., Machleit, W., Heinzel, R., Appelhans, H., Gassen, H. G., and Lottspeich, F. (1986). The acid-stable proteinase inhibitor of human mucus secretions (HUSI-I, antileukoprotease). FEBS Lett. 199, 43-48.
Smith, D. H., Byrn, R. A., Marsters, S. A., Gregory, T., Groopman, J. E., and Capon, D. J. (1987). Blocking of HIV-1 infectivity by a soluble, secreted form of the CD4 antigen. Science 238, 1704-1707.
Stein, B. S., and Engleman, E. G. (1990). Intracellular processing of the gp160 HIV-1 envelope precursor endoproteolytic cleavage occurs in a cis or medial compartment of the Golgi complex. J. Biol. Chem. 265, 2640-2649.
Tashiro, M., and Homma, M. (1983). Pneumotropism of Sendai virus in relation to protease-mediated activation in mouse lungs. Infect. Immun. 39, 879-888.
Tashiro, M., Yokogoshi, Y., Tobita, K., Seto, J. T., Rott, R., and Kido, H. (1992). Tryptase Clara, an activating protease for Sendai virus in rat lungs, is involved in pneumopathogenicity. J. Virol. 66, 7211-7216.
Wallner, O., and Fritz, H. (1974). Characterization of an acid-stable proteinase inhibitor in human cervical mucus. Hoppe-Seyler's Z. Physiol. Chem. 355, 709-715.
Werner, A., and Levy, J. A. (1993). Human immunodeficiency virus type 1 envelope gp120 is cleaved after incubation with recombinant soluble CD4. J. Virol. 67, 2566-2574.
Willems, L. N. A., Kramps, J. A., Stijnen, T., Sterk, P. J., Weening, J. J., and Dijkman, J. M. (1989). Antileukoprotease-containing bronchiolar cells: Relationship with morphologic disease of small airways and parenchyma. Am. Rev. Respir. Dis. 139, 1244-1250.

Bacterial Type I Signal Peptidases

MARK O. LIVELY
Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, North Carolina 27157

I. Introduction
II. Biology, Biochemistry, Genetics, and Structure
III. Known Inhibitors
IV. Prospects for Novel Drugs
References

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

I. INTRODUCTION

The emergence of new strains of bacteria that are resistant to currently available antibiotics is increasing the difficulty of treating common bacterial infections. Emergent drug-resistant organisms include Neisseria gonorrhoeae, Mycobacterium tuberculosis, Staphylococcus aureus, Streptococcus pneumoniae, enterococci, and Escherichia coli O157:H7, among others. Infectious diseases remain the leading cause of death worldwide (WHO, 1992) and the third leading cause of death in the United States (NAIAD, 1992). New strategies and new drug targets are required for development of antibacterial agents to treat diseases due to drug-resistant pathogens. Toward that goal, bacterial protein secretion pathways can provide logical targets for identification of new antibacterial drugs (Misra and Silhavy, 1992). Type I bacterial signal peptidase is an enzyme essential for viability in E. coli (Date, 1983), Staphylococcus aureus (Cregg et al., 1996), and probably all bacteria (Dalbey et al., 1997); thus, this enzyme is a potentially important drug target. Recognizing this opportunity, a number of pharmaceutical companies have conducted research programs over the past decade to identify drugs that would kill bacteria by inactivating signal peptidases, but these efforts have apparently not yet led to the identification of useful drug candidates. Nevertheless, signal peptidase remains a viable drug target because recent progress toward understanding its catalytic mechanism and structure should enhance efforts to identify effective inhibitors. Effective anti-signal peptidase drugs could provide a novel class of antibiotics effective against a broad range of bacterial species. This chapter discusses the type I bacterial signal peptidases, enzymes that catalyze cleavage of secretory signal peptides from proteins exported from the cytoplasm (Dalbey et al., 1997).

II. BIOLOGY, BIOCHEMISTRY, GENETICS, AND STRUCTURE

A. BIOLOGY

Most proteins to be exported from bacterial cells, as well as from eukaryotic cells, are synthesized as precursors with N-terminal extensions (signal peptides) of 15 to 30 amino acids (Nielsen et al., 1997). Secreted soluble proteins and many membrane proteins follow this pathway. The hydrophobic signal peptide targets proteins to export sites in the cytoplasmic membrane, where the proteins are then translocated through the lipid bilayer to periplasmic or extracellular locations (Rapoport et al., 1996). Removal of the signal peptides is catalyzed by type I signal peptidase. Although removal of the signal peptide is not required for translocation of proteins to occur, removal is essential for proper release of the secreted protein as well as for proper localization of membrane proteins synthesized with cleaved signal peptides. Uncleaved precursor proteins that retain their signal peptides typically remain bound in the membrane, with the signal peptide acting as a membrane anchor (Koshland et al., 1982; Fikes and Bassford, 1987; Kuhn and Wickner, 1985). In a conditional lethal strain of E. coli in which the signal peptidase gene lep is under control of the arabinose promoter, uncleaved precursors were translocated across the membrane in the absence of signal peptidase but remained anchored to the periplasmic surface (Dalbey and Wickner, 1985). Signal peptidase thus plays a pivotal role in the export of proteins from bacteria.

B. BIOCHEMISTRY

Escherichia coli type I signal peptidase is the best characterized bacterial signal peptidase. The E. coli enzyme was the first signal peptidase to be cloned and characterized biochemically (Date and Wickner, 1981; Dalbey and Wickner, 1985; Tschantz and Dalbey, 1994; Zwizinski and Wickner, 1980; Wolfe et al., 1982, 1983). Subsequently, signal peptidases from other species have been studied: Streptococcus pneumoniae (Zhang et al., 1997), Bacillus subtilis (Tjalsma et al., 1997), Salmonella typhimurium (van Dijl et al., 1990), and Staphylococcus aureus (Cregg et al., 1996). As a result of the explosion in genomic sequencing of bacterial species, more than 25 different bacterial type I signal peptidase gene sequences are now known. Figure 1 shows a multiple amino acid sequence alignment of signal peptidases from six selected pathogenic bacteria. These sequences are representative of the currently known bacterial signal peptidase genes. The enzyme sequences aligned in Fig. 1 were selected because infections caused by each of these bacteria are significant health problems, and these particular signal peptidases are perhaps the most important targets for drugs directed against this enzyme. Biochemical studies using site-directed mutagenesis have shown that E. coli signal peptidase has an essential Ser (Sung and Dalbey, 1992) and an essential Lys (Black, 1993; Tschantz et al., 1993; Paetzel et al., 1997). Interestingly, none of the His or Cys residues are required for activity (Sung and Dalbey, 1992). The direct involvement of Ser90 of E. coli signal peptidase in catalysis was established when Tschantz and coworkers (1993) showed that Ser90 could be replaced by Cys without complete loss of catalytic activity. The Cys90 signal peptidase was inhibited by N-ethylmaleimide, a Cys-specific reagent that does not inhibit wild-type signal peptidase, providing additional evidence that the substituted Cys had assumed the role of the wild-type Ser. These studies have led to the conclusion that the type I signal peptidase is an unusual serine protease unlike the classic serine protease family (Sung and Dalbey, 1992; Paetzel and Dalbey, 1997). Instead of the classic catalytic triad of Ser, His, and Asp, the proposed mechanism of catalysis involves a Ser-Lys dyad without the apparent involvement of aspartic acid (van Dijl et al., 1995).
This mechanism is analogous to that proposed for autoproteolytic cleavage by the LexA repressor (Slilaty and Little, 1987) and E. coli UmuD' (Peat et al., 1996) and to the mechanism of β-lactam cleavage by E. coli β-lactamase (Strynadka et al., 1992). In this mechanism, Ser acts as the nucleophile to attack the carbonyl of the scissile peptide bond. The Lys acts as a general base to accept the proton from Ser to promote attack on the peptide bond. Enzymes using this mechanism are typically less efficient than classic serine proteases (Paetzel and Dalbey, 1997). This observation is consistent with the generally low catalytic efficiency of signal peptidases. This unusual catalytic mechanism may explain why none of the commonly used inhibitors of proteolytic enzymes inhibit the enzyme (see below). The best substrates for signal peptidases in vitro are full-length precursor proteins. In early studies of signal peptidase, assays were performed using secretory precursor proteins synthesized by cell-free protein synthesis in the presence of radioactive amino acids (Jackson, 1983). Those assays are time consuming and difficult to quantify for kinetic analyses. They do not permit

rapid screening of candidate inhibitors as is required by drug discovery groups for identification of active compounds. Small pentapeptide analogs of signal peptides are cleaved, but only very slowly, and subsequent detection requires separation of the products by a method such as high-performance liquid chromatography (HPLC) (Dev et al., 1990). A decapeptide containing an internally quenched fluorescent group has been reported that is cleaved by E. coli signal peptidase (kcat/Km = 71.1 M⁻¹ sec⁻¹). This peptide is the first signal peptidase substrate that can be directly detected during the enzymatic reaction (Zhong and Benkovic, 1998). Although this peptide is also cleaved slowly, it or similar compounds could provide the necessary leads for development of more effective assay substrates. The best signal peptidase substrate currently in use is a hybrid precursor protein composed of the signal peptide of the E. coli outer membrane protein A (OmpA) fused to the catalytic domain of Staphylococcus aureus nuclease A (Chaterjee et al., 1995; Suciu et al., 1997). The rate of cleavage of this substrate is approximately six orders of magnitude greater than that of the peptide substrates. Although the pro-OmpA-nuclease A fusion protein is a much more efficient substrate, cleavage must still be detected by separation of the precursor from the cleavage product either by SDS-gel electrophoresis or by HPLC. Assays can be completed within 2 h with analysis by SDS-gel electrophoresis, but this substrate is still not well suited for high-capacity screening of candidate inhibitors.
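
To give a feel for what a kcat/Km of 71.1 M⁻¹ sec⁻¹ means for an assay, the sketch below estimates how long half of a substrate would take to be cleaved. It is a rough illustration rather than a description of any published assay: it assumes the substrate concentration is far below Km (so that cleavage is pseudo-first-order with rate constant (kcat/Km)[E]), it uses a hypothetical enzyme concentration of 1 µM, and it reads "six orders of magnitude greater" as a 10⁶-fold higher kcat/Km for the pro-OmpA-nuclease A fusion protein.

```python
import math

# Value quoted in the text for the fluorescent decapeptide (M^-1 s^-1).
KCAT_OVER_KM_DECAPEPTIDE = 71.1

# Assumption: "six orders of magnitude" is applied to kcat/Km.
FOLD_FASTER_FUSION_PROTEIN = 1.0e6

# Hypothetical enzyme concentration (mol/L); NOT taken from the chapter.
ASSUMED_ENZYME_CONC_M = 1.0e-6


def observed_rate_constant(kcat_over_km: float, enzyme_conc_m: float) -> float:
    """Pseudo-first-order rate constant (s^-1), valid only when [S] << Km."""
    if kcat_over_km <= 0 or enzyme_conc_m <= 0:
        raise ValueError("kcat/Km and enzyme concentration must be positive")
    return kcat_over_km * enzyme_conc_m


def substrate_half_life_s(kcat_over_km: float, enzyme_conc_m: float) -> float:
    """Time (s) for half of the substrate to be cleaved: ln(2) / k_obs."""
    return math.log(2) / observed_rate_constant(kcat_over_km, enzyme_conc_m)


half_life_peptide_s = substrate_half_life_s(
    KCAT_OVER_KM_DECAPEPTIDE, ASSUMED_ENZYME_CONC_M
)
half_life_fusion_s = substrate_half_life_s(
    KCAT_OVER_KM_DECAPEPTIDE * FOLD_FASTER_FUSION_PROTEIN, ASSUMED_ENZYME_CONC_M
)

print(f"decapeptide half-life: {half_life_peptide_s / 3600:.1f} h")
print(f"fusion protein half-life: {half_life_fusion_s * 1000:.1f} ms")
```

With these assumed conditions the decapeptide has a half-life of a few hours, which illustrates why such slowly cleaved peptides are awkward for rapid inhibitor screening even when cleavage can be followed directly.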

FIGURE 1 Multiple sequence alignment of type I signal peptidases from pathogenic bacteria. The sequences of six selected bacterial signal peptidases were aligned using the multiple sequence alignment program, MSA, version 2.1, implemented at the Biology Workbench site (http://biology.ncsa.uiuc.edu/) of the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign. Conserved amino acids are shown in boldface in the figure. Putative active site amino acid residues are indicated by asterisks above the alignments. Conserved Domains B through E are indicated by lines above the aligned sequences. The sequences are as follows, with their NCBI (http://www.ncbi.nlm.nih.gov/Entrez/protein.html) GenPept sequence ID number in parentheses: M. tuberc, Mycobacterium tuberculosis (1708796); H. influen, Haemophilus influenzae Rd (1572959); E. coli, Escherichia coli (2506809); S. typhim, Salmonella typhimurium (126189); S. aureus, Staphylococcus aureus (1595810); S. pneumo, Streptococcus pneumoniae (2149614).

C. GENETICS

More than 25 bacterial signal peptidase gene sequences are currently known (Dalbey et al., 1997). A BLAST search (Altschul et al., 1997) at the National Center for Biotechnology Information (NCBI) web site (www.ncbi.nlm.nih.gov) using the E. coli signal peptidase sequence as a query revealed more than 35 signal peptidase sequences in the database (with some repeated entries). The bacterial signal peptidases have no amino acid sequence similarity to any other type of proteolytic enzyme. The enzymes range in size from 174 amino acids (S. aureus, GenBank Accession No. U65000, PID 1595810) to 349 amino acids (Haemophilus influenzae Rd, GenBank Accession No. U32687, PID 1572959). Representative signal peptidase sequences aligned in Fig. 1 reveal that the enzymes are not highly conserved globally, but there are four conserved domains (labeled Domains B-E, Fig. 1; Domain A, not indicated, designates the N-terminal membrane spanning domains). These conserved domains can be found in all of the currently known signal peptidases, including eukaryotic signal peptidases, which appear to be distantly related to the bacterial enzymes (Dalbey et al., 1997). Domain B (Fig. 1) includes Ser90 of E. coli signal peptidase. This residue is the essential Ser required for enzyme activity (Tschantz et al., 1993; Sung and Dalbey, 1992). A highly conserved Met residue of unknown function follows the active site Ser. The catalytically essential Lys is found in Domain D. While the position of this Lys in M. tuberculosis signal peptidase was not aligned directly with the essential Lys in the other five enzymes, the sequence alignment performed by the MSA program places a Lys in the next position, suggesting that this Lys may be the essential amino acid in M. tuberculosis signal peptidase. The roles of Domains C and E are not yet established. Preliminary data suggest that Domain C plays an important role in substrate binding (C. A. Ashwell and M. O. Lively, unpublished results). It is interesting to note that Domain E contains two very highly conserved Asp residues. The Gly-Asp sequence found in the middle of Domain E is invariant in all known bacterial signal peptidase sequences, whereas the second Asp is absent from some sequences.
The currently proposed catalytic mechanism does not predict the involvement of Asp in the catalytic mechanism but the evolutionary conservation of this Asp residue suggests that it plays an important role in enzyme function or structure. It is also important to note that there are no invariant His residues in this alignment, giving additional support to the conclusion that signal peptidases are not typical serine proteases that require a His at the active site. The complete DNA sequence of the genome of Mycoplasma genitalium revealed that this organism does not have any gene with sequence similarity to the type I signal peptidase family (Fraser et al., 1995). This pathogen is a parasite for a wide range of hosts, including humans, and is believed to represent a minimal life form. Although the genomic sequence does not have a recognizable signal peptidase gene, the organism does encode 11 possible precursor proteins with apparent type I signal peptides. The absence of a recognizable signal peptidase in the genome suggests that this organism, and perhaps others, can function effectively without a type I signal peptidase. Alternatively, an unrecognized signal peptidase gene may be present in M. genitalium.
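
The kind of reasoning used above (looking for residues that are invariant across an alignment, and noting that no His is among them) can be sketched in a few lines of code. The example below is only a minimal illustration: the four aligned strings are a made-up toy alignment, not the sequences of Fig. 1, and the function names are hypothetical.

```python
from collections import Counter

GAP = "-"

# Hypothetical toy alignment for illustration only; NOT the Fig. 1 sequences.
TOY_ALIGNMENT = [
    "GDSMEP-K",
    "GDSMLPTK",
    "GNSMEPSK",
    "GDSM-PTR",
]


def _check_alignment(aligned: list[str]) -> None:
    """An alignment must be non-empty and all rows must have equal length."""
    if not aligned:
        raise ValueError("alignment is empty")
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")


def column_conservation(aligned: list[str]) -> list[float]:
    """Per column, the fraction of sequences sharing the most common residue.

    Gaps never count as a residue, so a gapped column scores below 1.0.
    """
    _check_alignment(aligned)
    scores = []
    for column in zip(*aligned):
        counts = Counter(residue for residue in column if residue != GAP)
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


def invariant_columns(aligned: list[str]) -> list[tuple[int, str]]:
    """(0-based column index, residue) for columns identical in every sequence."""
    _check_alignment(aligned)
    result = []
    for index, column in enumerate(zip(*aligned)):
        if GAP not in column and len(set(column)) == 1:
            result.append((index, column[0]))
    return result


invariant = invariant_columns(TOY_ALIGNMENT)
has_invariant_his = any(residue == "H" for _, residue in invariant)

print("invariant columns:", invariant)
print("any invariant His:", has_invariant_his)
```

Applied to a real alignment, the same two functions would list candidate catalytic or structural residues (such as the conserved Ser, Lys, and Gly-Asp discussed above) and make the absence of an invariant His easy to check.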

D. STRUCTURE

Type I signal peptidases are membrane-bound enzymes anchored in the cytoplasmic membrane of bacterial cells via one or more membrane anchors (Fig. 2). As represented in Fig. 2, E. coli signal peptidase (also called leader peptidase) is a single-chain protein with 324 amino acids anchored in the cytoplasmic membrane by two membrane anchors such that the active site domain is positioned in the periplasm (Moore and Miura, 1987; Whitley et al., 1993; Bilgin et al., 1990). Based on amino acid sequence alignments with E. coli signal peptidase, Salmonella typhimurium signal peptidase is also bound to the cytoplasmic membrane by two anchors (van Dijl et al., 1990). Other bacterial signal peptidases, including the enzymes from S. aureus (Cregg et al., 1996) and B. subtilis (van Dijl et al., 1995), are anchored to the membrane via a single membrane spanning domain (Fig. 2, left). Haemophilus influenzae signal peptidase appears to have three membrane anchors that would result in the orientation depicted on the right side of Fig. 2. In all cases, Domain B, which includes the active site Ser, is positioned near the C-terminus of the last membrane spanning anchor, suggesting that the active site is located near the surface of the membrane.

FIGURE 2 Topology model for bacterial signal peptidases. The orientation of bacterial signal peptidases in the cytoplasmic membrane. The hatched rectangles represent membrane spanning domains. The black ellipse represents Domain B that contains the active site Ser. Labels in the figure: Cytoplasm, Membrane, Extracellular space, N, Domain B, Membrane anchor (Domain A).

Progress has been made toward the solution of an X-ray crystal structure of E. coli signal peptidase. Because signal peptidase is a membrane protein, it has not yet been possible to obtain crystals of the full-length enzyme for X-ray diffraction analysis. However, creation of a soluble form of E. coli signal peptidase that lacks the two N-terminal membrane anchors (Kuo et al., 1994) has allowed the formation of diffraction-quality crystals (Paetzel et al., 1995). Although a completed structure has not yet been published, such a structure should be available in the near future to guide structure-function studies. In the absence of an X-ray structure, Paetzel and coworkers (1997) have used the X-ray crystal structure of E. coli UmuD' to model the possible active site geometry of E. coli signal peptidase. UmuD' is an essential protein involved in DNA repair and its active form is produced by autoproteolysis (Peat et al., 1996). The autoproteolytic mechanism involves a Ser and a Lys as proposed for signal peptidase. Although the sequence similarity between UmuD' and signal peptidase is very low, a model can be constructed using the signal peptidase sequence that places the proposed catalytic Ser and Lys residues in the same orientation as the Ser and Lys residues of UmuD'. This model predicts the presence of a hydrophobic cleft next to the active site amino acids.

III. KNOWN INHIBITORS

A. CLASSIC INHIBITORS

Bacterial signal peptidases are resistant to inhibition by the usual reagents that inactivate the typical mechanistic classes of proteolytic enzymes. The following inhibitors of typical protease families do not inhibit bacterial type I signal peptidases: 1-chloro-3-tosylamido-7-amino-2-heptanone; 2,6-pyridine dicarboxylic acid; 4-(amidinophenyl)methanesulfonyl fluoride; antipain; aprotinin; bestatin; chymostatin; diazoacetyl-DL-norleucine methyl ester; dichloroisocoumarin; elastatinal; ethylenediamine tetraacetic acid; iodoacetamide; leupeptin; L-trans-epoxysuccinyl-leucylamido(4-guanidino)butane; 1,2-epoxy-3-(p-nitrophenoxy)propane; N-carbobenzyloxy-L-phenylalanyl chloromethyl ketone; N-ethylmaleimide; o-phenanthroline; pepstatin; phenylmethylsulfonyl fluoride; phosphoramidon; and tosylamido-2-phenylethyl chloromethyl ketone (Black et al., 1992; Kuo et al., 1993; Vehmaanperä et al., 1993; Zwizinski et al., 1981).

B. PENEMS AND β-LACTAMS

Ironically, the only reported inhibitors of bacterial signal peptidases are β-lactam compounds similar to known antibiotics. A series of 6-(substituted oxyethyl) penem compounds (Fig. 3) discovered by SmithKline Beecham effectively inhibits E. coli and S. aureus signal peptidases (Allsop et al., 1995, 1996). Compound 1b (Fig. 3) has an IC50 of less than 1 µM. Unfortunately, none of the penem derivatives reported by the SmithKline group showed antibacterial activity at a useful concentration against a number of different bacteria (Allsop et al., 1996).

Workers at Merck Research Laboratories have reported a series of β-lactams with inhibitory activity against E. coli signal peptidase (Kuo et al., 1994). Merck compound L-684,248 (Fig. 3) is a moderately effective inhibitor of E. coli signal peptidase. Inhibition by the series of compounds related to L-684,248 depended on the β-lactam structure and on the stereochemistry, so the group concluded that inhibition was specific. Interestingly, a related series of β-lactams effectively inhibits human leucocyte elastase (Knight et al., 1992; Green et al., 1995), a classic serine protease. This result suggests some similarity of mechanism between the classic serine proteases and the signal peptidases. The Merck group did not report data on the effects of these compounds on bacteria.

FIGURE 3 Known inhibitors of bacterial signal peptidase. [Chemical structures not reproduced: SmithKline Beecham compound 1b and Merck L-684,248.] Two different classes of inhibitors of bacterial signal peptidases have been reported. SmithKline Beecham compound 1b (Allsop et al., 1996) is a representative of a group of 6-(substituted oxyethyl) penems. Merck L-684,248 is representative of a series of β-lactam derivatives that inhibit both bacterial signal peptidase and human leukocyte elastase (Kuo et al., 1994).

IV. PROSPECTS FOR NOVEL DRUGS

Any agent that blocks the processing activity of signal peptidase will lead to the accumulation of precursor proteins in the cytoplasmic membrane and perhaps within the cytoplasm. In cases of overexpression of recombinant proteins in E. coli where abnormally large amounts of precursors are synthesized, precursors often accumulate within the cytoplasm. This accumulation presumably results because the capacity of the protein export pathway has been exceeded and newly synthesized proteins are unable to leave the cell (for example, see Chatterjee et al., 1995; Suciu et al., 1997). If caused by inhibition of signal peptidase, such an accumulation will compromise cellular function by blocking protein export pathways as unprocessed precursors remain bound in the cytoplasmic membrane. Additionally, cellular function would be compromised because proteins required for cell function would not reach their correct sites in the periplasm, outer cell membrane, or the surrounding extracellular environment. Signal peptidase plays a central role in the protein secretion pathway of all cells.

A number of pharmaceutical research teams have attempted to develop specific inhibitors of signal peptidase, but apparently none has successfully identified a class of compound that could lead to full drug development. Several factors may have contributed to this failure. First, signal peptidase assays have not been readily adaptable to full-scale screening protocols that permit rapid and sensitive detection of inhibitory activity, so some potential leads may have been missed using existing screens. Second, until recently, knowledge of the mechanism of signal peptidase action has been limited, so rational approaches to drug design have not been available. With the foreseeable completion of the X-ray structure of E. coli signal peptidase, it will become possible to use molecular modeling to design potential inhibitors, and then perhaps promising leads can be developed and tested.

The development of any potential inhibitor of bacterial signal peptidase must also actively consider the potential cross-reactivity with the eukaryotic microsomal signal peptidase (Dalbey et al., 1997).
The microsomal signal peptidases appear to be distantly related to the bacterial signal peptidases and may have the same or a similar mechanism of action. The conserved amino acid sequence domains shown in Fig. 1 are also found in the Sec11 family of microsomal signal peptidases. The active site Ser residue in Domain B is also invariant in the eukaryotic signal peptidase. Consequently, any anti-signal peptidase drugs developed should be tested for reactivity with the eukaryotic signal peptidase. Even if cross-reactivity is observed between bacterial signal peptidase inhibitors and the microsomal enzyme, inhibition of the microsomal enzyme in vivo would not necessarily occur if the drug did not reach the enzyme. The microsomal signal peptidase is localized within the lumen of the endoplasmic reticulum. For inhibition to occur, the antibacterial agent would have to cross two lipid bilayers to react with the enzyme. In contrast, the bacterial signal peptidases are located on the exterior of the cytoplasmic membrane, where they should be readily accessible to anti-signal peptidase agents. This enzyme remains a potentially significant drug target for the development of a new generation of antibacterial drugs.

REFERENCES

Allsop, A. E., Brooks, G., Bruton, G., Coulton, S., Edwards, P. D., Hatton, I. K., Kaura, A. C., McLean, S. D., Pearson, N. D., Smale, T. C., and Southgate, R. (1995). Penem inhibitors of bacterial signal peptidase. Bioorg. Med. Chem. Lett. 5, 443-448.
Allsop, A., Brooks, G., Edwards, P. D., Kaura, A. C., and Southgate, R. (1996). Inhibitors of bacterial signal peptidase: A series of 6-(substituted oxyethyl) penems. J. Antibiot. 49, 921-928.
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
Bilgin, N., Lee, J. I., Zhu, H. Y., Dalbey, R. E., and von Heijne, G. (1990). Mapping of catalytically important domains in Escherichia coli leader peptidase. EMBO J. 9, 2717-2722.
Black, M. T. (1993). Evidence that the catalytic activity of prokaryote leader peptidase depends upon the operation of a serine-lysine catalytic dyad. J. Bacteriol. 175, 4957-4961.
Black, M. T., Munn, J. G., and Allsop, A. E. (1992). On the catalytic mechanism of prokaryotic leader peptidase 1. Biochem. J. 282, 539-543.
Chatterjee, S., Suciu, D., Dalbey, R. E., Kahn, P. C., and Inouye, M. (1995). Determination of Km and kcat for signal peptidase I using a full length secretory precursor, pro-OmpA-nuclease A. J. Mol. Biol. 245, 311-314.
Cregg, K. M., Wilding, E. I., and Black, M. T. (1996). Molecular cloning and expression of the spsB gene encoding an essential type I signal peptidase from Staphylococcus aureus. J. Bacteriol. 178, 5712-5718.
Dalbey, R. E., and Wickner, W. (1985). Leader peptidase catalyzes the release of exported proteins from the outer surface of the Escherichia coli plasma membrane. J. Biol. Chem. 260, 15925-15931.
Dalbey, R. E., Lively, M. O., Bron, S., and van Dijl, J. M. (1997). The chemistry and enzymology of the type I signal peptidases. Prot. Sci. 6, 1129-1138.
Date, T. (1983). Demonstration by a novel genetic technique that leader peptidase is an essential enzyme of Escherichia coli. J. Bacteriol. 154, 76-83.
Date, T., and Wickner, W. (1981). Isolation of the Escherichia coli leader peptidase gene and effects of leader peptidase overproduction in vivo. Proc. Natl. Acad. Sci. USA 78, 6106-6110.
Dev, I. K., Ray, P. H., and Novak, P. (1990). Minimum substrate sequence for signal peptidase I of Escherichia coli. J. Biol. Chem. 265, 20069-20072.
Fikes, J. D., and Bassford, P. J., Jr. (1987). Export of unprocessed precursor maltose-binding protein to the periplasm of Escherichia coli cells. J. Bacteriol. 169, 2352-2359.
Fraser, C. M., Gocayne, J. D., White, O., Adams, M. D., Clayton, R. A., Fleischmann, R. D., Bult, C. J., Kerlavage, A. R., Sutton, G., Kelley, J. M., Fritchman, J. L., Weidman, J. F., Small, K. V., Sandusky, M., Fuhrmann, J., Nguyen, D., Utterback, T. R., Saudek, D. M., Phillips, C. A., Merrick, J. M., Tomb, J.-F., Dougherty, B. A., Bott, K. F., Hu, P.-C., Lucier, T. S., Peterson, S. N., Smith, H. O., Hutchison, C. A., III, and Venter, J. C. (1995). The minimal gene complement of Mycoplasma genitalium. Science 270, 397-403.
Green, B. G., Chabin, R., Mills, S., Underwood, D. J., Shah, S. K., Kuo, D., Gale, P., Maycock, A. L., Liesch, J., Burgey, C. S., Doherty, J. B., Dorn, C. P., Finke, P. E., Hagman, W. K., Hale, J. J., MacCoss, M., Westler, W. M., and Knight, W. B. (1995). Mechanism of inhibition of human leucocyte elastase by beta-lactams. 2. Stability, reactivation kinetics, and products of beta-lactam-derived E-I complexes. Biochemistry 34, 14331-14343.
Jackson, R. C. (1983). Quantitative assay for signal peptidase. Methods Enzymol. 96, 784-794.
Knight, W. B., Green, B. G., Chabin, R. M., Gale, P., Maycock, A. L., Weston, H., Kuo, D. W., Westler, W. M., Dorn, C. P., Finke, P. E., Hagman, W. K., Liesch, J., MacCoss, M., Navia, M. A., Shah, S. K., Underwood, D., and Doherty, J. B. (1992). Specificity, stability, and potency of monocyclic beta-lactam inhibitors of human leucocyte elastase. Biochemistry 31, 8160-8170.
Koshland, D., Sauer, R. T., and Botstein, D. (1982). Diverse effects of mutations in the signal sequence on the secretion of beta-lactamase in Salmonella typhimurium. Cell 30, 903-914.
Kuhn, A., and Wickner, W. (1985). Conserved residues of the leader peptide are essential for cleavage by leader peptidase. J. Biol. Chem. 260, 15914-15918.
Kuo, D., Weidner, J., Griffin, P., Shah, S. K., and Knight, W. B. (1994). Determination of the kinetic parameters of Escherichia coli leader peptidase activity using a continuous assay: The pH dependence and time-dependent inhibition by beta-lactams are consistent with a novel serine protease mechanism. Biochemistry 33, 8347-8354.
Misra, R., and Silhavy, T. J. (1992). In "Emerging Targets in Antibacterial and Antifungal Chemotherapy" (J. Sutcliffe and N. H. Georgopapadakou, Eds.), pp. 163-175. Chapman and Hall, New York.
Moore, K. E., and Miura, S. (1987). A small hydrophobic domain anchors leader peptidase to the cytoplasmic membrane of Escherichia coli. J. Biol. Chem. 262, 8806-8813.
NIAID (1992). "Report of the Task Force on Microbiology and Infectious Disease." NIH Publication 92-3320, U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, Bethesda, MD.
Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997). Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Prot. Engineer. 10, 1-6.
Paetzel, M., and Dalbey, R. E. (1997). Catalytic hydroxyl/amine dyads within serine proteases. Trends Biochem. Sci. 22, 28-31.
Paetzel, M., Chernaia, M., Strynadka, N., Tschantz, W., Cao, G., Dalbey, R. E., and James, M. N. (1995). Crystallization of a soluble, catalytically active form of Escherichia coli leader peptidase. Prot. Struct. Funct. Genet. 23, 122-125.
Paetzel, M., Strynadka, N. C. J., Tschantz, W. R., Casareno, R., Bullinger, P. R., and Dalbey, R. E. (1997). Use of site-directed chemical modification to study an essential lysine in Escherichia coli leader peptidase. J. Biol. Chem. 272, 9994-10003.
Peat, T. S., Frank, E. G., McDonald, J. P., Levine, A. S., Woodgate, R., and Hendrickson, W. A. (1996). Structure of the UmuD' protein and its regulation in response to DNA damage. Nature 380, 727-730.
Rapoport, T. A., Jungnickel, B., and Kutay, U. (1996). Protein transport across the eukaryotic endoplasmic reticulum and bacterial inner membranes. Ann. Rev. Biochem. 65, 271-303.
Slilaty, W. R., and Little, J. W. (1987). Lysine-156 and serine-119 are required for LexA repressor cleavage: a possible mechanism. Proc. Natl. Acad. Sci. USA 84, 3987-3991.
Strynadka, N. C., Adachi, H., Jensen, S. E., Johns, K., Sielecki, A., Betzel, C., Sutoh, K., and James, M. N. (1992). Molecular structure of the acyl-enzyme intermediate in beta-lactam hydrolysis at 1.7 Å. Nature 359, 700-705.
Suciu, D., Chatterjee, S., and Inouye, M. (1997). Catalytic efficiency of signal peptidase I of Escherichia coli is comparable to that of members of the serine protease family. Prot. Engineer. 10, 1057-1060.
Sung, M., and Dalbey, R. E. (1992). Identification of potential active-site residues in the Escherichia coli leader peptidase. J. Biol. Chem. 267, 13154-13159.
Tjalsma, H., Noback, M. A., Bron, S., Venema, G., Yamane, and van Dijl, J. M. (1997). Bacillus subtilis contains four closely related type I signal peptidases with overlapping substrate specificities. J. Biol. Chem. 272, 25983-25992.
Tschantz, W. R., and Dalbey, R. E. (1994). Bacterial leader peptidase I. Methods Enzymol. 244, 285-301.
Tschantz, W. R., Sung, M., Delgado-Partin, V. M., and Dalbey, R. E. (1993). A serine and a lysine residue implicated in the catalytic mechanism of the Escherichia coli leader peptidase. J. Biol. Chem. 268, 27349-27354.
van Dijl, J. M., de Jong, A., Venema, G., and Bron, S. (1995). Identification of the potential active site of the signal peptidase SipS of Bacillus subtilis: Structural and functional similarities with LexA-like proteases. J. Biol. Chem. 270, 3611-3618.
van Dijl, J. M., van den Bergh, R., Reversma, T., Smith, H., Bron, S., and Venema, G. (1990). Molecular cloning of the Salmonella typhimurium lep gene in Escherichia coli. Mol. Gen. Genet. 223, 233-240.
Vehmaanperä, J., Gorner, A., Venema, G., Bron, S., and van Dijl, J. M. (1993). In vitro assay for the Bacillus subtilis signal peptidase SipS: Systems for efficient in vitro transcription-translation and processing of precursors of secreted proteins. FEMS Microbiol. Lett. 114, 207-214.
Whitley, P., Nilsson, L., and von Heijne, G. (1993). Three-dimensional model for the membrane domain of Escherichia coli leader peptidase based on disulfide mapping. Biochemistry 32, 8534-8539.
Wolfe, P. B., Silver, P., and Wickner, W. (1982). The isolation of homogeneous leader peptidase from a strain of Escherichia coli which overproduces the enzyme. J. Biol. Chem. 257, 7898-7902.
Wolfe, P. B., Wickner, W., and Goodman, J. M. (1983). Sequence of the leader peptidase gene of Escherichia coli and the orientation of leader peptidase in the bacterial envelope. J. Biol. Chem. 258, 12073-12080.
World Health Organization (1992). "Global Health Situations and Projections, Estimates." WHO, Geneva.
Zhang, Y.-B., Greenberg, B., and Lacks, S. A. (1997). Analysis of a Streptococcus pneumoniae gene encoding signal peptidase I and overproduction of the enzyme. Gene 194, 249-255.
Zhong, W., and Benkovic, S. J. (1998). Development of an internally quenched fluorescent substrate for Escherichia coli leader peptidase. Anal. Biochem. 255, 66-73.
Zwizinski, C., and Wickner, W. (1980). Purification and characterization of leader (signal) peptidase from Escherichia coli. J. Biol. Chem. 255, 7973-7977.
Zwizinski, C., Date, T., and Wickner, W. (1981). Leader peptidase is found in both the inner and outer membranes of Escherichia coli. J. Biol. Chem. 256, 3593-3597.

Proteinases Involved in Plant Virus Genome Expression

JUAN ANTONIO GARCÍA, MARÍA ROSARIO FERNÁNDEZ-FERNÁNDEZ, AND JUAN JOSÉ LÓPEZ-MOYA
Centro Nacional de Biotecnología (CSIC), Campus de la Universidad Autónoma de Madrid, 28049 Madrid, Spain

I. Introduction
II. Picornalike Supergroup
III. Alphalike Supergroup
IV. Sobemolike Supergroup
V. Plant Pararetroviruses
VI. Concluding Remarks and Perspectives
References

Proteases of Infectious Agents. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

I. INTRODUCTION

Viruses that infect plants have rather small genomes that are expressed following very different strategies, which in many cases involve proteolytic processing of protein precursors (Zaccomer et al., 1995; Maia et al., 1996b). Host proteinases have been shown to participate in the maturation of animal virus proteins that are targeted to the viral envelope (see chapter by Kido). Although these events should also take place in plants, they have not yet been studied in detail in viruses infecting plants, probably due to the scarcity of enveloped plant viruses. For this reason, this chapter focuses only on virus-encoded proteinases.

The genome of most plant viruses consists of one or several single-stranded (ss) RNA molecules of positive (+) polarity. Despite their large diversity of genome organization and virion morphology, nucleotide sequence data have revealed that most eukaryotic (+)RNA viruses (irrespective of whether they infect plants or animals) can be classified into two large "supergroups" (picornalike and alphalike) and a limited number of less well-defined minor "supergroups" (such as carmolike and sobemolike) (Goldbach et al., 1991). Table I lists plant virus groups in which virus-encoded proteinases have been identified, or their existence has been suggested on the basis of sequence analysis. Cysteine, serine, and serinelike (with a cysteine in the active center) proteinases, but not aspartic proteinases or metalloproteinases, have been shown to be involved in the proteolytic processing of protein precursors encoded by plant (+)RNA viruses (Maia et al., 1996b).

The genomic RNAs of picornalike viruses encode large polyproteins that are processed by one or more virus-encoded proteinases. Proteolytic processing of polyproteins might not only determine the time of appearance of each gene product but also regulate its activity or subcellular localization by removal of functional domains. Genes of alphalike, carmolike, and sobemolike viruses are expressed through different combinations of strategies acting at the level of transcription (synthesis of subgenomic (sg) RNAs) and/or translation (alternative translation initiation sites, frameshifting, and readthrough at suppressible termination codons). In some cases, gene expression is completed posttranslationally by proteolysis mediated by a virus-encoded proteinase.

Proteolytic processing of protein precursors has also been observed in plant pararetroviruses. These viruses have double-stranded (ds) DNA genomes that replicate via RNA intermediates (Hohn and Fütterer, 1997). In this case, the virus-encoded proteinase belongs to the aspartic family, and a host cysteine proteinase seems to be involved in the maturation of a viral protein. Here, we have attempted to give an overall view of the use of proteolytic processing by different plant virus groups for the expression of their genomes.

TABLE I Genera of Plant Viruses That Code for Demonstrated or Putative Proteinases

Virus genus                      Proteinase class(a)
Picornalike
  Potyviridae
    Potyvirus                    Ser, Cys, Ser-like
    Rymovirus                    Ser, Cys, Ser-like
    Bymovirus                    Cys, Ser-like
  Comoviridae
    Comovirus                    Ser-like
    Nepovirus                    Ser-like
  Sequiviridae
    Sequivirus                   Ser-like
    Waikavirus                   Ser-like
Alphalike
    Tymovirus                    Cys
    Marafiviruses                Cys
    Carlavirus                   Cys
    Capillovirus                 Cys
    Trichovirus                  Cys
    Benyvirus                    Cys
    Closterovirus                Cys
Sobemolike
    Luteovirus (subgroup II)     Ser
    Sobemovirus                  Ser
Pararetrovirus
    Caulimoviridae               Asp

(a) The class of proteinase(s) that the viral protein(s) is associated (or proposed to be associated) with is presented. Abbreviations: Ser, serine; Cys, cysteine; Ser-like, serine-like with cysteine in the active center; Asp, aspartic.
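The content of Table I can also be held in a small lookup structure, which makes questions such as "which groups are proposed to encode cysteine proteinases?" easy to answer. The following Python sketch simply transcribes the table; the names TABLE_I and genera_with are illustrative choices and do not come from the chapter.

```python
# Table I transcribed as: supergroup -> genus (or family) -> proteinase classes.
# The variable and function names are hypothetical, chosen for this sketch only.
TABLE_I = {
    "Picornalike": {
        "Potyvirus": ("Ser", "Cys", "Ser-like"),
        "Rymovirus": ("Ser", "Cys", "Ser-like"),
        "Bymovirus": ("Cys", "Ser-like"),
        "Comovirus": ("Ser-like",),
        "Nepovirus": ("Ser-like",),
        "Sequivirus": ("Ser-like",),
        "Waikavirus": ("Ser-like",),
    },
    "Alphalike": {
        "Tymovirus": ("Cys",),
        "Marafiviruses": ("Cys",),
        "Carlavirus": ("Cys",),
        "Capillovirus": ("Cys",),
        "Trichovirus": ("Cys",),
        "Benyvirus": ("Cys",),
        "Closterovirus": ("Cys",),
    },
    "Sobemolike": {
        "Luteovirus (subgroup II)": ("Ser",),
        "Sobemovirus": ("Ser",),
    },
    "Pararetrovirus": {
        "Caulimoviridae": ("Asp",),
    },
}


def genera_with(proteinase_class):
    """Return sorted (supergroup, genus) pairs whose entry lists the given class."""
    return sorted(
        (group, genus)
        for group, genera in TABLE_I.items()
        for genus, classes in genera.items()
        if proteinase_class in classes
    )


if __name__ == "__main__":
    for group, genus in genera_with("Ser-like"):
        print(group, genus)
```

Because the classes are stored as whole strings, a query for "Ser" does not also return the "Ser-like" entries.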

II. PICORNALIKE SUPERGROUP

A. POTYVIRIDAE

The family Potyviridae includes three definite genera (the monopartite poty- and rymoviruses and the bipartite bymoviruses) and some still unclassified viruses, such as those tentatively designated ipomoviruses and macluraviruses (Murphy et al., 1995). All potyviruses have similar genome organizations (Fig. 1). Each potyviral genomic RNA encodes all its proteins in a single open reading frame (Shukla et al., 1994). The proteolytic processing of the polyprotein into which potyvirus RNA is translated has been extensively studied both in vitro (using cell-free translation systems) and in heterologous (Escherichia coli) or homologous (infected and transgenic plants) in vivo systems (reviewed by Riechmann et al., 1992). Two potyviral proteinases located at the N-terminus of the polyprotein, P1 (Carrington et al., 1990; Verchot et al., 1991) and HC (Carrington et al., 1989a), autocatalytically cleave at their respective C-termini, whereas the NIa proteinase processes the rest of the polyprotein sites intra- and intermolecularly (Carrington and Dougherty, 1987; Hellmann et al., 1988; Garcia et al., 1989b; Ghabrial et al., 1990). The three potyviral proteinases have RNA binding activity that, in contrast with the RNA affinity of the picornaviral 3C proteinase, seems to be nonspecific (Brantley and Hunt, 1993; Maia and Bernardi, 1996; Daros and Carrington, 1997). Most potyviral proteins have been shown to be multifunctional. Thus, RNA binding and proteolysis might be independent activities of the potyviral proteinases.

1. P1 Proteinase

The P1 proteinase is derived from the amino-terminal region of the polyprotein (Fig. 1). The proteolytic activity of the P1 protein resides in its C-terminal half. In spite of the high variability found among the potyviral P1 proteins, amino acids characteristic of serine proteinases have been found to be very well conserved, showing the signature Hx8D/Ex29-32GxSG (D predominates over E) (Riechmann et al., 1992; Ryan and Flint, 1997). The same sequence, although with slight differences in spacing, can be identified in brome streak mosaic rymovirus (BrSMV) and sweetpotato mild mottle ipomovirus (SPMMV), but not in the bipartite bymoviruses. Site-directed mutagenesis of His214, Asp223, and Ser256 of P1 from tobacco etch potyvirus (TEV) supported the assumption that this conserved motif corresponds to the active center of the enzyme (Verchot et al., 1992). The sequence GxSG is identical to the consensus motif around the Ser in the active site of trypsin- and chymotrypsinlike serine proteinases (Barrett, 1986). However, the distances between His, Asp, and Ser of the proposed catalytic triad of P1 proteinases are considerably shorter than those separating the active site residues in the cellular enzymes, indicating a quite broad evolutionary distance between them.

In vitro processing experiments in a wheat germ system have shown that P1 cleaves at its carboxyl end and is unable to act in trans (Verchot et al., 1992). Sequence comparisons among potyviral polyproteins show a nonstrictly conserved consensus sequence H/Q-Y/F↓S for the P1 cleavage site, which resembles those of the cellular cathepsin C and chymotrypsin serine proteinases. An unusual feature of the P1 proteinase is its inability to function in an in vitro rabbit reticulocyte system (Mavankal and Rhoads, 1991; Verchot et al., 1991). Addition of relatively small amounts of wheat germ extract to the reticulocyte lysate promoted P1 proteinase activity, suggesting that inactivity in the latter is probably due to the lack of a host factor, rather than to the presence of an inhibitor (Verchot et al., 1992). This factor should also be present in insect cells, since P1 proteinase is active when expressed using a baculovirus vector (Thornbury et al., 1993). However, the fact that a chimeric plum pox virus (PPV)-TEV P1 proteinase is active in a reticulocyte lysate system (P. Saenz and J. A. G., unpublished results) seems to indicate that the host factor is not strictly required for P1 proteolytic activity.

It has been shown that, in TEV, P1 acts in trans as an accessory factor for genome amplification (Verchot and Carrington, 1995b). The infectivity of P1 proteinase-debilitated TEV mutants was restored by second-site mutations that inserted a cleavage site recognized by the NIa proteinase (Verchot and Carrington, 1995a). This result indicates that proteolytic separation of P1 from the next gene product, HC, but not P1 proteolytic activity per se, is essential for virus viability in plants.
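The conserved P1 signature (His, eight residues, Asp or Glu, 29 to 32 residues, then Gly-x-Ser-Gly) can be written as a regular expression. The following Python sketch is only an illustration of how such a motif could be searched for in a protein sequence; the function name and the test sequence are hypothetical, and a match is no more than a candidate that would still need experimental support.

```python
import re

# His, 8 residues, Asp/Glu, 29-32 residues, Gly-x-Ser-Gly.
# The lookahead allows overlapping candidates to be reported.
P1_SIGNATURE = re.compile(r"(?=(H.{8}[DE].{29,32}G.SG))")


def find_p1_signature(seq):
    """Yield 1-based (His, Asp/Glu, Ser) positions for each candidate match."""
    for m in P1_SIGNATURE.finditer(seq):
        start = m.start(1)           # 0-based index of the His
        hit = m.group(1)
        his = start + 1
        acid = his + 9               # 8 residues lie between His and Asp/Glu
        ser = start + len(hit) - 1   # Ser is the second-to-last residue of the hit
        yield his, acid, ser


if __name__ == "__main__":
    # Hypothetical sequence with the same spacing as TEV P1 (9 and 33 residues).
    demo = "A" * 5 + "H" + "A" * 8 + "D" + "A" * 30 + "GASG" + "A" * 3
    for his, acid, ser in find_p1_signature(demo):
        print(his, acid, ser, "spacing:", acid - his, ser - acid)
```

The TEV residues named in the text (His214, Asp223, Ser256) give spacings of 9 and 33 residues, which fits this pattern. For comparison, the chymotrypsin triad is conventionally numbered His57, Asp102, Ser195, a much wider spacing.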

2. HC Proteinase

The HC protein is adjacent to the C-terminus of P1 proteinase (Fig. 1). This protein, which often aggregates into amorphous inclusion bodies (De Mejia et al., 1985), was first identified as a factor required for aphid transmission of the virus (helper component) (Berger et al., 1989), but it was later shown to be involved in efficient genome amplification (Atreya et al., 1992) and in cell-to-cell (Rojas et al., 1997) and long-distance (Cronin et al., 1995; Kasschau et al., 1997) movement (reviewed by Maia et al., 1996a). Recent reports indicate that HC behaves as a broad-range pathogenicity enhancer (Pruss et al., 1997; Shi et al., 1997).

A papainlike proteinase domain has been localized to the C-terminal half of HC protein (Carrington et al., 1989a). In TEV, the catalytic dyad was shown by sequence comparisons and mutagenesis analysis to be composed of Cys649 and His722 (Oh and Carrington, 1989). The HC proteinase domain is well conserved among potyviruses and may be easily aligned, especially at the sequences around the catalytic residues and at an LGxWP motif, with homologous polyprotein regions in the mite-transmitted BrSMV rymovirus and the whitefly-transmitted SPMMV ipomovirus (Fig. 7). A considerable similarity was also detected between the HC proteinase domain and the N-terminal 28K protein encoded by the RNA-2 of fungus-transmitted bymoviruses. Other cysteine proteinases that bear striking resemblance to HC are encoded by closteroviruses (Agranovsky et al., 1994; Karasev et al., 1995) (see Section III,A) and by ORFA and ORFB of the hypovirulence-associated dsRNA virus (HyAV) of chestnut blight fungus (Choi et al., 1991; Shapira and Nuss, 1991) (Fig. 7). Viral cysteine proteinases more distantly related to HC have been identified as products of different plant alphalike viruses (see Section III,B), aphthoviruses, alphaviruses, rubiviruses, arteriviruses, and coronaviruses (Gorbalenya et al., 1991; Rozanov et al., 1995).

The HC proteinase is responsible for processing at its own C-terminus via an autocatalytic mechanism. When analyzed in vitro, it exhibits little or no proteolytic activity in trans (Carrington et al., 1989a,b). The TEV mutant genomes modified at the HC proteinase active site were amplification defective in protoplasts and plants. Introduction of a heterologous cleavage site recognized by the NIa proteinase at the HC C-terminus was not sufficient to restore genome amplification. In addition, an active-site mutant was not complemented by wild-type HC protein supplied in trans in transgenic plants. These results suggest that an active HC proteinase is required in cis for virus amplification (Kasschau and Carrington, 1995).

Tobacco etch potyvirus HC proteinase cleavage in vitro occurs at a Gly-Gly dipeptide (aa 763-764). There is a good consensus, YxVG↓G, at the presumed cleavage sites of different potyviruses, SPMMV ipomovirus, and BrSMV rymovirus. The marked preference for specific amino acids at this site has been demonstrated by site-directed mutagenesis (Carrington and Herndon, 1992). With the exceptions of Tyr to Phe (P4) and Val to Leu (P2) changes, which were partially tolerated, even very conservative substitutions at the P4, P2, P1, and P1' positions were found to eliminate or nearly eliminate proteolysis. Substitutions at the P5, P3, and P2' positions permitted processing to occur, although in some cases at reduced rates. This level of specificity, which is also shown by the potyviral NIa proteinase (see below), is not usual in viral proteinases. Interestingly, the sequence around the putative cleavage site at the C-end of the 28K proteinase of bymoviruses differs from the potyvirus HC consensus sequence, reflecting the divergence between the potyvirus HC and the bymovirus 28K proteinases.
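The position labels used above follow the usual convention in which residues are numbered outward from the scissile bond: P1, P2, P3 and so on toward the N-terminus, and P1', P2' toward the C-terminus. The Python sketch below shows how these labels map onto a sequence. The peptide is hypothetical (built only to match the YxVG↓G consensus), and the two sets merely restate the mutagenesis summary given in the text.

```python
def label_positions(seq, p1_index, n_left=5, n_right=2):
    """Map cleavage-site labels (P5..P1, P1'..P2') to residues.

    p1_index is the 0-based index of the P1 residue; the scissile bond
    lies between seq[p1_index] and seq[p1_index + 1].
    """
    labels = {}
    for k in range(n_left, 0, -1):
        i = p1_index - (k - 1)
        if 0 <= i < len(seq):
            labels[f"P{k}"] = seq[i]
    for k in range(1, n_right + 1):
        i = p1_index + k
        if 0 <= i < len(seq):
            labels[f"P{k}'"] = seq[i]
    return labels


# Summary of the TEV HC mutagenesis results as described in the text:
# substitutions at these positions eliminate or nearly eliminate cleavage ...
STRICT_POSITIONS = {"P4", "P2", "P1", "P1'"}
# ... whereas substitutions here still allow processing, sometimes more slowly.
PERMISSIVE_POSITIONS = {"P5", "P3", "P2'"}

if __name__ == "__main__":
    peptide = "AKYAVGGLT"  # hypothetical peptide matching YxVG|G, not a viral sequence
    for label, residue in label_positions(peptide, p1_index=5).items():
        kind = "strict" if label in STRICT_POSITIONS else "permissive"
        print(label, residue, kind)
```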

3. NIa Proteinase

The NIa (nuclear inclusion protein a) forms, together with the RNA replicase NIb, crystalline inclusions within the nucleus of cells infected with some, but not all, potyviruses (Hiebert et al., 1984). Nuclear inclusion protein a consists of an amino domain that constitutes the genome-linked protein (VPg) and a carboxyl domain that is associated with the inter- and intramolecular proteolytic activity responsible for most of the potyvirus polyprotein processing (Carrington and Dougherty, 1987; Hellmann et al., 1988; Dougherty and Parks, 1991; Garcia and Lain, 1991). Nuclear inclusion protein a belongs to a family of viral proteinases whose archetypal member is the picornavirus 3C protein. These proteins are related to the trypsinlike family of cellular serine proteinases, but Cys replaces Ser in their active center (Bazan and Fletterick, 1988, 1990; Gorbalenya et al., 1989). Sequence comparison analysis and site-directed mutagenesis enabled localization of the probable catalytic triad, composed of His, Asp, and Cys, and the His residue in the substrate-binding pocket characteristic of a Gln-x substrate specificity (Fig. 4) (Carrington et al., 1988; Dougherty et al., 1989b; Gorbalenya et al., 1989; Bazan and Fletterick, 1990; Garcia et al., 1990; Ghabrial et al., 1990). Similar catalytic triads formed by His, an acidic residue (Asp or Glu), and Cys have been suggested for other viral 3C-like proteinases (Ryan and Flint, 1997). However, the crystal structure of hepatitis A virus (HAV) 3C proteinase casts some doubt on the catalytic relevance of the acidic residue and suggests that viral 3C proteinases may have a catalytic dyad rather than a triad (Bergmann et al., 1997; see chapter by Bergmann and James). Interestingly, a substitution of Glu for the proposed catalytic Asp in the TEV and PPV NIa proteinases had different effects on proteolytic activity depending on the cleavage site analyzed (Dougherty et al., 1989b; Garcia et al., 1990). These results suggest that there may be different structural requirements in the active center of the proteinase for processing at different cleavage sites. The proposed catalytic triad is well conserved among genera of the family Potyviridae, although the sequences around its residues have diverged considerably (Fig. 4). It is remarkable that the place of the His involved in recognition of Gln at position P1 of the cleavage site is occupied in SPMMV ipomovirus by an Asn residue; in agreement with this fact, cleavage sites of SPMMV NIa proteinase seem to differ from those of the rest of the NIa proteinases (see below).

The NIa cleavage sites are defined by conserved heptapeptide sequences (Carrington and Dougherty, 1987, 1988; Garcia et al., 1989a; Martin et al., 1990). The requirement for such an extended sequence motif is a peculiarity of the potyvirus NIa proteinases, and it is not shared by the 3C-like proteinases of other picornalike viruses. Although there is no extended conserved cleavage motif used by all potyviruses, the NIa cleavage sites share enough features to be easily identified by sequence analysis. Position P1 is always occupied by Gln or, in far fewer sites, by Glu. Val is the preferred residue for position P4, which on rare occasions is occupied by another hydrophobic residue. The other positions of the heptapeptide are less conserved, although the P1' residue is most often Gly, Ala, or Ser, while His or an aromatic residue predominates at the P2 position, and acidic or amide residues (especially Glu) are quite common at the P6 position. Site-directed mutagenesis studies of two TEV cleavage sites demonstrated that the presence of particular amino acids at positions P6, P4, P3, P1, and P1' is essential for cleavage site functionality (Dougherty et al., 1988), whereas the amino acids at positions P5 and P2 influence the cleavage reaction profile (Dougherty et al., 1989a; Dougherty and Parks, 1989).
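The positional preferences just described can be turned into a rough screen for candidate NIa sites in a polyprotein sequence. The Python sketch below is a hedged illustration only: the weights, the threshold, and the function names are arbitrary assumptions made for this example, not a published scoring scheme, and the test heptapeptide ENLYFQS is the widely used TEV recognition sequence rather than a sequence quoted in this chapter.

```python
def score_heptapeptide(hepta):
    """Score a P6-P1' heptapeptide against the preferences described in the text.

    Returns None when P1 is neither Gln nor Glu. The weights are arbitrary
    illustrative values, not measured quantities.
    """
    if len(hepta) != 7:
        raise ValueError("expected 7 residues (P6 P5 P4 P3 P2 P1 P1')")
    p6, _p5, p4, _p3, p2, p1, p1_prime = hepta
    if p1 not in "QE":
        return None
    score = 2 if p1 == "Q" else 1   # Gln is far more common than Glu at P1
    if p4 == "V":
        score += 2                  # Val is the preferred P4 residue
    elif p4 in "ILMF":
        score += 1                  # another hydrophobic residue (assumed set)
    if p1_prime in "GAS":
        score += 1
    if p2 in "HFYW":
        score += 1                  # His or an aromatic residue at P2
    if p6 in "EDQN":
        score += 1                  # acidic or amide residue at P6
    return score


def candidate_sites(seq, min_score=5):
    """Return (1-based P1 position, heptapeptide, score) for qualifying windows."""
    hits = []
    for i in range(len(seq) - 6):
        hepta = seq[i:i + 7]
        score = score_heptapeptide(hepta)
        if score is not None and score >= min_score:
            hits.append((i + 6, hepta, score))
    return hits


if __name__ == "__main__":
    print(candidate_sites("MAENLYFQSGT"))
```

A screen of this kind would over-predict, since sequence and conformational context outside the heptapeptide also affect cleavage, and each potyvirus proteinase appears to recognize mainly its own sites.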
It is important to note that the P6 and P3 positions are more conserved in TEV than the P4 and P2 ones, whereas the opposite situation is observed in other potyviruses; thus, the relevance of each residue of the heptapeptide sequence may vary depending on the particular potyvirus. Artificial NIa cleavage sites have been constructed by inserting the appropriate heptapeptide sequences into nonspecific protein regions (Carrington and Dougherty, 1988; Garcia et al., 1989a). However, sequences and/or conformational context outside the conserved heptapeptide have been shown to modulate the efficiency of the cleavage reaction (Garcia et al., 1992).

Sequences similar to those of the potyvirus NIa cleavage sites are present at the expected positions of the polyproteins encoded by BrSMV rymovirus RNA and by the RNA-1 molecules of the bymoviruses barley yellow mosaic virus (BaYMV) and barley mild mosaic virus (BaMMV). However, we have not been able to identify putative cleavage sites for the NIa-like protein of SPMMV ipomovirus, suggesting that their sequences have diverged from those recognized by the potyvirus NIa. This supposition agrees with the lack in SPMMV of the typical His of the 3C-like proteinase substrate-binding pocket (Fig. 4).

Although both NIa proteinases and their recognition sequences have considerable similarity among different potyviruses, the NIa cleavage sites of each potyvirus seem to be efficiently recognized only by its own proteinase (Garcia et al., 1989a; Garcia and Lain, 1991; Parks and Dougherty, 1991). Results from chimeric NIa proteinases (TEV-TVMV and TEV-PPV) suggest that the NIa proteinase recognition and catalytic sites are closely interlinked. Several protein domains, one of them including the substrate-binding pocket His residue, appear to be important in determining substrate specificity (Garcia and Lain, 1991; Parks and Dougherty, 1991).

The main task of the NIa proteinase is to generate the final processing products of the polyprotein. Nevertheless, regulation of the processing pathway is probably essential to (1) synthesize the required product at the right time and place, (2) maintain functional partially processed products, and (3) control protein activity by cleavage of functional domains. Little is known about this level of regulation in potyviruses. However, some processing steps whose objective goes beyond producing the final protein products have been proposed. Cleavage between the PPV P3 protein and a putative 6K1 peptide has been shown to occur in vitro (Garcia et al., 1992). However, in vitro cleavage at the equivalent TEV site (discernible by sequence alignments) has not been detected (Parks et al., 1992), and processing of the PPV polyprotein P3-6K1 junction seems not to be essential for virus viability, although it affected virus infectivity and symptom induction. Thus, it has been suggested that the role of cleavage at the P3-6K1 site, rather than producing two proteins, P3 and 6K1, is to regulate the activity of a single functional protein, P3-6K1 (Riechmann et al., 1995). Another small peptide, 6K2, is located upstream of NIa in the potyviral polyprotein.
It has been shown that, whereas TEV NIa is transported to the nucleus, the 6K2-NIa precursor is directed to membranous structures, where potyvirus replication takes place (Restrepo-Hartwig and Carrington, 1992, 1994; Schaad et al., 1997); thus, 6K2 probably lacks activity by itself and plays its role in the context of the unprocessed 6K2-NIa product. In this scenario, cleavage at the 6K2-NIa junction would be involved in the control of NIa activity.

Another clear example of a regulated proteolytic event is the internal cleavage that splits the VPg and proteinase domains of the NIa protein (Dougherty and Parks, 1991; Laliberte et al., 1992). Although the unprocessed NIa product can readily function as a proteinase, and has also been found covalently attached to a fraction of the TEV genomic RNAs examined by Murphy et al. (1990), a mutation that inhibited internal cleavage of NIa abolished TEV infectivity, indicating that proteolytic separation of the VPg and Pro domains of NIa is essential for viral viability (Carrington et al., 1993). The NIa internal cleavage site is processed incompletely in infected cells and inefficiently in vitro. Thus, whereas TEV NIa is a very abundant protein that accumulates in nuclear inclusions in infected cells, the 21K VPg is found only linked to the viral RNA, and the 27K proteinase domain is found neither in nuclear inclusions nor in total protein extracts from infected tissues (Dougherty and Parks, 1991; Carrington et al., 1993). While most sites recognized by the NIa proteinases have a Gln at the P1 position, a Glu residue is present at this position in all proposed VPg-Pro cleavage sites. The residue at position P3 of the TEV VPg-Pro cleavage site also deviates from the heptapeptide consensus sequence. These changes might be involved in slowing cleavage. Mutations that converted the VPg-Pro junction into a consensus cleavage site greatly accelerated internal processing in vitro. Genome amplification was drastically disturbed by these substitutions, suggesting that the slow-processing feature may fulfill an important regulatory function (Schaad et al., 1996).

The proteolytic processing strategy of gene expression provides the opportunity to use partially processed forms of viral proteinases to play alternative roles (Dessens and Lomonossoff, 1992; Hellen and Wimmer, 1992; Margis et al., 1994). However, when the cleavage profiles of precursor and processed forms of the TEV NIa proteinase were analyzed, most substrates were processed in a similar fashion by all proteolytic forms. Only at the 6K1-CI site could slight processing differences be observed (Parks et al., 1992).

Further autoprocessing at specific positions of the NIa C-terminal region has been described in turnip mosaic potyvirus (TuMV) (Kim et al., 1995; Ménard et al., 1995; Kim et al., 1996) and TEV (Parks et al., 1995). The sequences around these cleavage sites were not similar to the typical heptapeptide recognition signals.
Whereas a truncated TuMV 25K product (lacking the last 20 aa) was as active as the complete 27K proteinase for cleavage at the 6K1-CI site (Kim et al., 1995), a TuMV 24K protein (lacking the last 30 aa) did not cleave at this site (Kim et al., 1996), and a TEV 25K protein (lacking the last 24 aa) was approximately one-twentieth as efficient in proteolysis of the NIb-CP site as the full-length form (Parks et al., 1995). The functional relevance of sequences at the C-terminal region of the NIa proteinase (which may depend on the substrate analyzed) has also been shown by deletion and site-directed mutagenesis (Garcia et al., 1989b; Kim et al., 1996). Thus, although it has not been demonstrated that trimming at the NIa C-end takes place in vivo, the possibility that activity is regulated by removal of C-terminal sequences of the NIa proteinase must be carefully considered.

B. COMOVIRIDAE

The family Comoviridae includes the genera Comovirus, Nepovirus, and Fabavirus (Murphy et al., 1995). Since data on the molecular biology of fabaviruses are not yet available, only como- and nepovirus proteinases are discussed. In both cases, the genome is split into two RNA molecules, which encode large polyproteins that are proteolytically cleaved by viral proteinases. Whereas comovirus RNA-B and nepovirus RNA-1 code for all the proteins required for RNA replication, including the viral proteinases, comovirus RNA-M and nepovirus RNA-2 encode the capsid proteins (CPs) and proteins involved in virus movement. Comovirus RNA-M differs from nepovirus RNA-2 in being translated from two alternative in-frame AUG codons, giving rise to two coterminal polyproteins (Fig. 2).

The best studied member of the Comoviridae family is cowpea mosaic comovirus (CPMV). Its RNA-B encodes a 24K proteinase homologous to the 3C-like proteinases. The genomic location of the proteinase gene, as part of a VPg-proteinase-replicase segment, is consistent with that of the other picornalike viruses. Sequence alignments (Gorbalenya et al., 1989; Bazan and Fletterick, 1990; Shanks and Lomonossoff, 1990) and site-directed mutagenesis analysis (Dessens and Lomonossoff, 1991) have identified the probable catalytic triad, composed of His987, Glu1023, and Cys1113. Sequence analysis also predicts that His1131 is the residue of the substrate-binding pocket that interacts with Gln at the P1 position of the cleavage site. The CPMV 24K proteinase recognizes Q-G, Q-S, and Q-M cleavage sites (Wellink and van Kammen, 1988). Similar dipeptides are present at the putative cleavage sites of other sequenced comoviruses (Chen and Bruening, 1992a,b; Shanks and Lomonossoff, 1992). A weak consensus for Ala at positions P2 and P4 is the only other feature observed at the comovirus cleavage sites.

Comovirus polyprotein processing differs from that of other picornalike viruses in being regulated by a viral product without proteolytic activity. After in vitro translation, an N-terminal 32K protein is released from the CPMV 200K RNA-B-encoded polyprotein by an intramolecular cleavage and remains associated with the remaining 170K protein, probably by interaction with its 58K domain (Peters et al., 1992b).
The same complex can be formed if the 32K and 170K proteins are translated simultaneously from different RNA molecules, but the proteins cannot associate if they are translated separately and mixed later (Peters et al., 1992b). When the 170K polyprotein, which contains the 24K proteinase, is associated with the 32K cofactor, further 170K self-cleavage is very slow (Peters et al., 1992b). Also, trans cleavage between the two RNA-M-encoded CPs mediated by the 32K-170K complex is very inefficient (Vos et al., 1988). In contrast, this complex (or noncleaved 200K protein) accomplishes the processing at the Gln-Met site that separates the 48- and 58-kDa proteins from the 60K CP precursor (Vos et al., 1988; Peters et al., 1992b). In the absence of the 32K factor, the 170K protein was efficiently processed, essentially by cis cleavages, following three different pathways that start with the synthesis of 60K + 110K, 80K + 87K, and 58K + 112K proteins (Peters et al., 1992a). The puzzling final pattern includes not only the fully processed products, but also stable intermediates that are not efficiently processed because of the cis preference of the 24K proteinase (Peters et al., 1992a) and the requirement of upstream sequences for efficient cleavage of the 24K proteinase at its C-end (Dessens and Lomonossoff, 1992). This complex control of the proteolytic processing seems to be essential even for the replication of CPMV RNA-B alone, since a mutated RNA-B that lacks the sequences coding for the 32K protein did not replicate in cowpea protoplasts (Peters et al., 1992b). It has been suggested that the 32K protein might act as a molecular chaperone, blocking a certain folding pathway for the 170K protein that could lead to the formation of abortive structures as a result of premature self-cleavage (Peters et al., 1992b).

Sequence alignments of the 24K proteinases from different comoviruses show extensive sequence conservation, even at the residues predicted to be involved in substrate binding by Bazan and Fletterick (1988). In spite of this fact, the 24K proteinase of one comovirus is not able to process cleavage sites of another comovirus either in trans (at the 58/48K-60K junction) (Goldbach and Krijt, 1982) or in cis (at the 32K-170K junction) (Shanks et al., 1996). The fact that changing the Gln-Gly dipeptide at the 37K-23K junction of the M polyprotein into either Gln-Ser or Gln-Met resulted in a dramatic decrease in proteolysis efficiency indicates that the cleavage specificity of the 24K proteinases is determined in part by the amino acid sequence of the junction site (Vos et al., 1988). However, cleavage by the CPMV 24K proteinase at the CPMV 32K-170K junction is not prevented when the Gln-Ser site is changed into a His-Met site (Peters et al., 1992b), whereas the CPMV 24K proteinase is not able to cleave at the Gln-Gly site of the red clover mottle comovirus 32K-170K junction (Shanks et al., 1996), suggesting that in this case the specificity is determined by tertiary structure interactions between the substrate and the substrate-binding pocket of the proteinase.
The proteinase involved in the proteolytic processing of the nepovirus polyproteins occupies the same genomic position and has a similar size to the comovirus 24K proteinase (Figs. 2 and 4). The catalytic triad of the grapevine fanleaf nepovirus (GFLV) 24K proteinase, predicted from sequence alignments and studied by site-directed mutagenesis, is formed by His1284, Glu1328, and Cys1420 (Margis and Pinck, 1992). The GFLV 24K proteinase differs from the 3C-like proteinases of poty- and comoviruses in two main aspects. First, its active site Cys can be mutated to Ser without loss of activity and, second, Leu1438 substitutes for the typical His of the 3C proteinase substrate-binding pocket (Margis and Pinck, 1992). Similar Leu residues are found in the 24K proteinases of tomato black ring nepovirus (TBRV) and grapevine chrome mosaic nepovirus (GCMV). All these data suggest that the substrate specificity of nepovirus enzymes is more similar to that of cellular serine proteinases than to that of their viral counterparts. In agreement with this assumption, some nepovirus proteinase cleavages take place at Arg-Ala, Arg-Gly, and Arg-Ala dipeptides, although cleavages at Cys-Ala, Cys-Ser, and Gly-Glu dipeptides have also been described (Brault et al., 1989; Pinck et al., 1991; Margis et al., 1993). The 24K proteinase of tomato ringspot nepovirus (TomRSV), which probably forms part of a distinct subgroup, seems to be different from those of GFLV and other related nepoviruses, being more similar to the CPMV 24K proteinase, since it has a His in the putative substrate-binding pocket and probably cleaves Gln-x sites (Rott et al., 1995).

Although the proteolytic processing of nepovirus polyproteins should also be tightly regulated, the control mechanisms seem to be quite different from those of comoviruses. A virus-encoded cofactor is not required for in vitro trans processing of the RNA-2-derived polyprotein (Margis et al., 1993). However, the identification at the N-terminal region of some nepovirus polyproteins of sequence motifs also present in the comovirus 32K protein (Ritzenthaler et al., 1991; Rott et al., 1995) might suggest that, at least in some cases, this protein could collaborate with the 24K protein in nepovirus proteolytic processing. As with the comovirus proteinase, the activity of the nepovirus 24K proteinase is modulated by the sequences surrounding it. However, while sequences upstream of the comovirus proteinase enhance in vitro cleavage at its C-terminus (Dessens and Lomonossoff, 1992), the GFLV 24K-92K precursor is better cleaved than the VPg-24K-92K intermediates (Margis et al., 1994).

C. SEQUIVIRIDAE

The family Sequiviridae consists of the genera Sequivirus and Waikavirus (Murphy et al., 1995). The monopartite genome of sequi- and waikaviruses differs from that of the potyviruses in encoding three capsid proteins (like animal picornaviruses), located internally near the N-terminus of the large polyprotein (Fig. 3). Particular features of the waikaviruses are long AUG-containing sequences upstream of the large genomic ORF and small 3' ORFs that might be expressed by subgenomic RNAs (Shen et al., 1993; Reddick et al., 1997). The genomic RNA of parsnip yellow fleck sequivirus (PYFV) lacks a poly-A tail (Turnbull-Ross et al., 1992).

Although experimental data on the activity of the Sequiviridae 3C-like proteinases have not yet been obtained, these proteinases can be clearly identified by sequence alignments (Fig. 4) (Shen et al., 1993; Turnbull-Ross et al., 1993; Reddick et al., 1997). The predicted catalytic triads of PYFV sequivirus and rice tungro spherical waikavirus (RTSV) are formed by His, Glu, and Cys (Figs. 3 and 4). However, the proposed acidic active site residue of maize chlorotic dwarf waikavirus (MCDV) is an Asp residue (Reddick et al., 1997). Differences are also observed at the putative substrate-binding pockets. The typical His of proteinases that cleave after a Gln residue can be identified in the RTSV and MCDV waikavirus sequences, but it is replaced by Leu (as in most nepoviruses) in PYFV. In agreement with these data, N-terminal amino acid sequencing has revealed that the RTSV (Shen et al., 1993) and MCDV (Reddick et al., 1997) CPs probably arise from proteolytic cleavages at Gln-x dipeptides, whereas no consensus sequence can be found for the cleavage sites that give rise to the PYFV CPs (Ser-Pro, Asn-Ala, and Gln-Ala) (Turnbull-Ross et al., 1993). However, it cannot be ruled out that the observed CP N-termini might result from secondary proteolytic degradation rather than from the primary cleavage event carried out by the virus proteinase.

III. ALPHALIKE SUPERGROUP

A. CLOSTEROVIRUS

The closterovirus genomes, which can be mono- or bipartite, are the largest among all (+)ssRNA plant viruses (Murphy et al., 1995). The organization and expression of the closterovirus genome resemble those of coronaviruses, with polyprotein processing, translational frameshifting, and multiple sgRNA formation. However, the closterovirus RNA replicase belongs to the alphalike lineage, and the mechanism of sgRNA transcription of citrus tristeza closterovirus (CTV) is similar to that of other alphalike viruses, differing clearly from that of coronaviruses (Karasev et al., 1997). The closterovirus large replication proteins are encoded by ORF1a and ORF1b, the second one probably being translated by ribosomal frameshifting as a fusion product with the upstream protein (Fig. 5). The 5' end of beet yellows closterovirus (BYV) ORF1a encodes a papainlike cysteine proteinase domain that has been proven to be active in an in vitro assay (Agranovsky et al., 1994). A similar domain has been identified in other closteroviruses (Agranovsky, 1996), and it is duplicated in CTV (Karasev et al., 1995). The size of these closterovirus leader proteinases is around 60K. In spite of the similarity of their proteinase domains, the regions upstream of these domains are poorly conserved among different closterovirus leader proteins. Leader papainlike proteinases have been described for very different (+)ssRNA viruses, such as coronaviruses, arteriviruses, aphthoviruses, and hypoviruses (Gorbalenya et al., 1991; Rozanov et al., 1995). Moreover, duplication of leader papainlike proteinases also seems to be a common phenomenon, whose selective advantage is still unknown (Lee et al., 1991; Shapira and Nuss, 1991; Godeny et al., 1993).

FIGURE 5 Genome map of citrus tristeza closterovirus depicted as explained in the legend to Fig. 1. (&) Indicates a methyltransferase-like domain.

The proteinase domain of the closterovirus leader protein is more similar to those of the HC proteins of different genera of the family Potyviridae than to those of other papainlike proteinases (Section II,A,2; Fig. 7) (Agranovsky et al., 1994). Sequence alignments and site-directed mutagenesis experiments indicate that Cys509 and His569 constitute the catalytic dyad of the BYV leader proteinase (Agranovsky et al., 1994). The closterovirus cysteine proteinases resemble other viral papainlike proteinases in cleaving at u G S x sites, where u is a bulky hydrophobic residue and x is usually a Gly. Some conservation of a negatively charged residue at the P4' position has also been observed (Jelkmann et al., 1997).

Little is known about the functional role of closterovirus leader proteins. A putative function in aphid transmission has been suggested on the basis of analogy to the potyvirus HC protein. However, the multifunctional nature of the HC protein and the conservation of the cysteine proteinase domain in Potyviridae (see Section II,A,2) and closteroviruses (Jelkmann et al., 1997) that are not transmitted by aphids raise doubts about this hypothesis. Recently, it has been reported that the leader proteinase of BYV suppresses potyvirus infection in the BYV nonhost Nicotiana tabacum, but does not affect potyvirus replication in N. tabacum protoplasts or systemic infection in the BYV host Nicotiana benthamiana (Dolja et al., 1997). The authors suggest that the potyviral HC and the BYV leader proteinases have analogous structural and functional organization and that the two proteins may compete for interaction with the same cellular target. The complex formed by the BYV protein would be functional in the BYV host plant but would interfere with normal potyvirus infection in the BYV nonhost plant.

B. TYMOVIRUS-LIKE

The plant alphalike viruses are characterized by replication proteins that contain an ordered series of domains: methyltransferaselike (MTr), helicaselike (Hel), and polymeraselike (Pol) (Goldbach et al., 1991). In some cases two proteins, one containing the MTr and Hel domains and the other containing the Pol domain, are encoded by separate RNA molecules. Other viruses encode a single polyprotein that contains the three domains. However, most of the alphalike viruses have developed special strategies to produce different amounts of a protein containing the three domains and a Pol-containing protein. Suppression of termination at leaky stop codons (readthrough), ribosomal frameshifting, and proteolytic processing can play that role. The proteinases involved in the proteolytic processing of the plant alphalike replication proteins have been denoted tymolike because they share some particular features with the best studied member of the group, the tymovirus proteinase (for a recent review, see Rozanov et al., 1995).

The proteinase responsible for the intramolecular cleavage of the 206K replication protein of turnip yellow mosaic virus (TYMV) has been mapped just upstream of the Hel domain (Fig. 6), and deletion analysis has delimited the proteinase domain to residues 731-885 (Bransom and Dreher, 1994). Sequence alignments and site-directed mutagenesis analysis indicate that the TYMV proteinase is a papainlike cysteine proteinase with a predicted catalytic dyad formed by Cys783 and His869 (Bransom and Dreher, 1994; Rozanov et al., 1995). In contrast with the closterovirus leader and the potyvirus HC proteinases, the TYMV proteinase does not cleave at the end of the proteinase domain but further downstream, between the Hel and the Pol domains. N-Terminal sequencing of the C-terminal cleavage product derived from autoprocessing of the TYMV 206K polyprotein synthesized in vitro (Bransom et al., 1996) or in E. coli (Kadaré et al., 1995) has shown that cleavage occurs between Ala1259 and Thr1260. The sequences at the cleavage sites predicted for different tymoviruses are poorly conserved, although a small amino acid is always present at positions P1 (resembling other viral papainlike cysteine proteinases, see Sections II,A,2 and III,A) and P2 (Kadaré et al., 1995).

FIGURE 6 Genome maps of members of virus groups that have tymolike proteinases depicted as explained in the legends to Figs. 1 and 5. Viruses not mentioned in the text are: potato carlavirus M (PVM), apple chlorotic leaf spot trichovirus (ACLSV), and apple stem grooving capillovirus (ASGV). Tymolike proteinase sequences are represented by crossed patterns. Dashed vertical lines indicate putative cleavage sites predicted by sequence analysis. A box inside another one indicates sequences that are thought to be expressed both as a fusion with preceding in-frame ones and by a subgenomic RNA.

Predicted cysteine proteinase domains similar to the TYMV one have been identified in the replication proteins of carlaviruses, capilloviruses, trichoviruses, marafiviruses, and apple stem pitting virus (ASPV) (Rozanov et al., 1995; Edwards et al., 1997) (Figs. 6 and 7). Although the genome structures of these viruses are very different, their large replication proteins always show the same modular organization: MTr-proteinase-Hel-Pol. Interestingly, the capsid protein of capilloviruses (Ohira et al., 1995) and one of the capsid proteins of oat blue dwarf marafivirus (OBDV) (Edwards et al., 1997) are translated as the C-terminal part of their large polyproteins, but it is unknown whether they are proteolytically processed by their tymolike cysteine proteinases. Experimental evidence for proteolytic activity of tymolike proteinases other than the TYMV one has been reported only for that of blueberry scorch carlavirus (BBScV) (Lawrence et al., 1995).
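The catalytic residues quoted so far in this chapter use 1-based polyprotein numbering, so they can be checked mechanically against any sequence a reader has at hand. The sketch below is illustrative only: the dictionary simply collects the positions given in the text, while the helper names and the poly-Ala placeholder sequence are assumptions introduced for the example and do not correspond to real viral sequences.

```python
# Catalytic residues as quoted in the text (1-based polyprotein numbering).
CATALYTIC_RESIDUES = {
    "CPMV 24K proteinase": {987: "H", 1023: "E", 1113: "C"},
    "GFLV 24K proteinase": {1284: "H", 1328: "E", 1420: "C"},
    "BYV leader proteinase": {509: "C", 569: "H"},
    "TYMV proteinase": {783: "C", 869: "H"},
}


def mismatched_positions(sequence, expected):
    """Return the 1-based positions where `sequence` lacks the expected residue."""
    mismatches = []
    for position, residue in sorted(expected.items()):
        too_short = position > len(sequence)
        if too_short or sequence[position - 1] != residue:
            mismatches.append(position)
    return mismatches


def placeholder_with(expected, length):
    """Build a poly-Ala placeholder carrying only the expected residues (illustration only)."""
    residues = ["A"] * length
    for position, residue in expected.items():
        residues[position - 1] = residue
    return "".join(residues)


tymv = CATALYTIC_RESIDUES["TYMV proteinase"]
toy = placeholder_with(tymv, 900)
print(mismatched_positions(toy, tymv))        # empty list: both residues present
print(mismatched_positions("A" * 900, tymv))  # both positions reported as mismatches
```

A check like this only confirms that the numbering of a given sequence record matches the numbering used in the cited papers; it says nothing about catalytic activity.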

FIGURE 7 Sequence alignments of regions around the catalytic residues (signaled by *) of plant virus papainlike proteinases. The HyAV papainlike proteinases were also included in the alignment. Black or gray backgrounds indicate highly or moderately conserved residues, respectively, either in the HC-like or in the tymolike groups. Virus names are explained in the text and in the legend to Fig. 6.

The protein encoded by beet necrotic yellow vein benyvirus (BNYVV) RNA-1, which contains the information necessary for replication of the viral genome, has been shown to undergo autocatalytic processing (Hehn et al., 1997). A domain with sequence similarity to the papainlike TYMV proteinase has been found in the BNYVV RNA-1-encoded product (Figs. 6 and 7). In contrast with other tymolike proteinase domains, that of BNYVV is located between the Hel and Pol domains of the polyprotein; that is, close to the cleavage site upstream of the Pol domain (Rozanov et al., 1995; Hehn et al., 1997).

IV. SOBEMOLIKE SUPERGROUP

The sobemolike supergroup is a small one formed by the genus Sobemovirus and subgroup II of the genus Luteovirus, whose members have been proposed to have emerged by a recombination event between a sobemovirus and a subgroup I luteovirus (Goldbach et al., 1991). The genome organization of these rather small viruses is quite complex, and their genome expression employs sgRNAs, frameshifting, readthrough, and, probably, proteolytic processing. Gorbalenya et al. (1988) proposed some years ago that a serine proteinase is encoded by the sobemovirus genome. Although the proteinase domain has been tentatively identified in newly sequenced sobemoviruses and luteoviruses, direct experimental evidence of proteolytic activity associated with a gene product from these viruses is not yet available. Recently, indirect evidence for a proteinase activity associated with potato leafroll luteovirus (PLRV) has been reported. The experimentally determined N-terminal amino acid sequence of the PLRV VPg has been shown to map to the carboxyl region of the PLRV ORF1 product, downstream of the putative proteinase domain (van der Wilk et al., 1997) (Fig. 8). Since the RNA replicase is thought to form the carboxyl part of the readthrough product of ORF1 and ORF2, the position of the proteinase on the luteoviral polyprotein would differ from the picornalike VPg-proteinase-Pol arrangement, which prevails in all other ssRNA viruses with a VPg. According to sequence alignment analysis, the proteolytic processing site at the N-terminus of the subgroup II luteovirus VPg is predicted to be E↓S/T (van der Wilk et al., 1997).

FIGURE 8 Genome maps of a luteovirus (PLRV) and a sobemovirus (SBMV) depicted as explained in Figs. 1 and 5.

V. PLANT PARARETROVIRUSES

Aspartyl proteinases from animal retroviruses have been studied in great detail in recent years (Dougherty and Semler, 1993). In contrast, few experimental data are available on plant pararetrovirus proteinases. Plant pararetroviruses are now classified as a novel family, Caulimoviridae (Pringle, 1998). Two genera, Caulimovirus and Badnavirus, were first defined using capsid morphology as the main classification criterion. However, the larger number of different plant pararetrovirus genome organizations demands a more complex classification (Hohn and Fütterer, 1997; Pringle, 1998). All plant pararetroviruses encapsidate dsDNA, contain genes homologous to the gag and pol genes of animal retroviruses, and seem to share the same replication strategy involving reverse transcription. In contrast, different plant pararetroviruses use quite diverse gene expression mechanisms (Hohn and Fütterer, 1997).

Cauliflower mosaic caulimovirus (CaMV) is the best-studied plant pararetrovirus (Fig. 9). The CaMV genome encodes CP, proteinase, reverse transcriptase, and RNase H in the same order as animal retroviruses; however, CaMV differs from them in producing independent polyproteins for the CP (ORF4) and the enzymatic functions (ORF5) (Schultze et al., 1990). In vitro translation of ORF5 has shown that its primary translation product is processed to yield an N-terminal protein containing the proteinase domain and a C-terminal one containing the reverse transcriptase and RNase H domains (Torruella et al., 1989). The CaMV proteinase has the characteristic DTG active site (involvement of the Asp residue in the proteolytic activity has been shown by site-directed mutagenesis) and the conserved Gly in the typical IIGD context of aspartyl proteinases. Its 20K size and the fact that it contains only one copy of the proteinase motifs suggest that, like its animal retrovirus counterparts, the CaMV proteinase is active as a dimer. Experiments in plant protoplasts and in E. coli have demonstrated that the ORF5-encoded proteinase is also involved in the processing of the ORF4 product. The resulting 44K protein undergoes further, not well characterized, posttranslational modifications and forms the viral capsids (Martínez-Izquierdo and Hohn, 1987).
There are also experimental data on the aspartyl proteinase of rice tungro bacilliform virus (RTBV). In this virus, the CP, proteinase, reverse transcriptase, and RNase H are synthesized as part of a single polyprotein that includes additional sequences of unknown function (Qu et al., 1991). Making use of a baculovirus expression system, Laco et al. (1994, 1995) have demonstrated that the RTBV proteinase is able to cleave upstream of the reverse transcriptase domain and downstream of the RNase H domain. Cleavage at this second site is not required for reverse transcriptase activity, but it is needed for RNase H activity. The sequences at the two cleavage sites, GYSKN and LK↓CL, are not similar to those described for animal retroviruses. Immunoelectron microscopy experiments have shown the presence of the RTBV proteinase on the surface of virus particles; however, it is not known whether it is present as a free protein or as part of a larger precursor (Hay et al., 1994).

It has been suggested that a virion-associated cysteine proteinase is involved in the processing of the CaMV ORF3 product, a minor component of the virus particles (Guidasci et al., 1992; Dautel et al., 1994). More information is required to assess the relevance of this second proteinase, probably of cellular origin, in the infection cycle of CaMV.

FIGURE 9 Linear representation of the circular genomes of CaMV caulimovirus and RTBV depicted as explained in the legends to Figs. 1 and 5. The aspartyl proteinase is represented by a bricked pattern. Only cleavage sites for which there is experimental evidence are indicated. Dashed vertical lines indicate that the exact place of the cleavage site has not been identified. The cleavage site of the noncharacterized cysteine proteinase in the ORF3 product is not shown.

VI. CONCLUDING REMARKS AND PERSPECTIVES

In recent years, advances in plant virus genome sequencing and the availability of in vitro and in vivo heterologous experimental systems have permitted the identification and characterization of a large number of plant virus-encoded proteinases. The development of full-length cDNA clones from which infectious transcripts can be produced either in vitro or in vivo has facilitated the functional analysis of the plant virus proteinases. However, at present nearly nothing is known about how the different proteolytic processing pathways are controlled (by viral and host factors) to engender the required protein products in the appropriate place, amount, and time. In spite of the high specificity of the viral proteinases, cellular substrates for animal virus proteinases have been described (for instance, Devaney et al., 1988; Clark et al., 1993; Novoa et al., 1997). At least some of these cellular substrates are proteins involved in the control of cell transcription and translation. Thus, the activity of the viral proteinases can interfere with important cellular processes to favor virus replication. Although many plant virus-encoded proteinases act only in cis, and cleavage of plant cell proteins by viral proteinases has not been described, it is tempting to speculate that virus-induced proteolytic activities could affect the basic plant cell machinery and/or its defensive responses. These events could be especially relevant to explaining the ability of the virus to infect particular hosts and the development of disease symptoms. Finally, the high specificity of the plant virus-encoded proteinases makes them of great interest as potential biotechnological tools and targets. As an example of their use in biotechnology, the potyvirus NIa proteinase has been reported to be helpful for the purification of tag-linked proteins synthesized in heterologous systems (Parks et al., 1994) and for the production in transgenic plants of multiple proteins through translation of a single self-processing polypeptide (Marcos and Beachy, 1997). On the other hand, the recent use of proteinase inhibitors in AIDS therapy has emphasized the suitability of virus-encoded proteinases as targets of antiviral action. Van Rompaey et al. (1995) have designed a mutant protein able to inhibit the activity of the TEV proteinase by manipulation of the α2-macroglobulin bait region. The expression of appropriately designed proteinase inhibitors might provide transgenic plants with suitable virus resistance.

REFERENCES

Agranovsky, A. A. (1996). Principles of molecular organization, expression and evolution of closteroviruses: Over the barriers. Adv. Virus Res. 47, 119-158.
Agranovsky, A. A., Koonin, E. V., Boyko, V. P., Maiss, E., Frötschl, R., Lunina, N. A., and Atabekov, J. G. (1994). Beet yellows closterovirus: Complete genome structure and identification of a leader papain-like thiol protease. Virology 198, 311-324.
Atreya, C. D., Atreya, P. L., Thornbury, D. W., and Pirone, T. P. (1992). Site-directed mutations in the potyvirus HC-Pro gene affect helper component activity, virus accumulation, and symptom expression in infected tobacco plants. Virology 191, 106-111.
Barrett, A. J. (1986). An introduction to the proteinases. In "Proteinase Inhibitors" (A. J. Barrett and G. Salvesen, Eds.), pp. 3-22. Elsevier, New York.
Bazan, J. F., and Fletterick, R. J. (1988). Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: Structural and functional implications. Proc. Natl. Acad. Sci. USA 85, 7872-7876.
Bazan, J. F., and Fletterick, R. J. (1990). Structural and catalytic models of trypsin-like viral proteases. Semin. Virol. 1, 311-322.
Berger, P. H., Hunt, A. G., Domier, G. M., Hellman, G. M., Stram, Y., Thornbury, D. W., and Pirone, T. P. (1989). Expression in transgenic plants of a viral gene product that mediates insect transmission of potyviruses. Proc. Natl. Acad. Sci. USA 86, 8402-8406.
Bergmann, E. M., Mosimann, S. C., Chernaia, M. M., Malcolm, B. A., and James, M. N. G. (1997). The refined crystal structure of the 3C gene product from hepatitis A virus: Specific proteinase activity and RNA recognition. J. Virol. 71, 2436-2448.
Bransom, K. L., and Dreher, T. W. (1994). Identification of the essential cysteine and histidine residues of the turnip yellow mosaic virus protease. Virology 198, 148-154.
Bransom, K. L., Wallace, E., and Dreher, T. W. (1996). Identification of the cleavage site recognized by the turnip yellow mosaic virus protease. Virology 217, 404-406.
Brantley, J. D., and Hunt, A. G. (1993). The N-terminal protein of the polyprotein encoded by the potyvirus tobacco vein mottling virus is an RNA-binding protein. J. Gen. Virol. 74, 1157-1162.
Brault, V., Hibrand, L., Candresse, T., Le Gall, O., and Dunez, J. (1989). Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2. Nucleic Acids Res. 17, 7809-7819.
Carrington, J. C., and Dougherty, W. G. (1987). Small nuclear inclusion protein encoded by a plant potyvirus genome is a protease. J. Virol. 61, 2540-2548.
Carrington, J. C., and Dougherty, W. G. (1988). A viral cleavage site cassette: Identification of amino acid sequences required for tobacco etch virus polyprotein processing. Proc. Natl. Acad. Sci. USA 85, 3391-3395.
Carrington, J. C., and Herndon, K. L. (1992). Characterization of the potyviral HC-Pro autoproteolytic cleavage site. Virology 187, 308-315.
Carrington, J. C., Cary, S. M., and Dougherty, W. G. (1988). Mutational analysis of tobacco etch virus polyprotein processing: Cis and trans proteolytic activities of polyproteins containing the 49-kilodalton proteinase. J. Virol. 62, 2313-2320.
Carrington, J. C., Cary, S. M., Parks, T. D., and Dougherty, W. G. (1989a). A second proteinase encoded by a plant potyvirus genome. EMBO J. 8, 365-370.
Carrington, J. C., Freed, D. D., and Sanders, T. C. (1989b). Autocatalytic processing of the potyvirus helper component proteinase in Escherichia coli and in vitro. J. Virol. 63, 4459-4463.
Carrington, J. C., Freed, D. D., and Oh, C.-S. (1990). Expression of potyviral polyproteins in transgenic plants reveals three proteolytic activities required for complete processing. EMBO J. 9, 1347-1353.
Carrington, J. C., Haldeman, R., Dolja, V. V., and Restrepo-Hartwig, M. A. (1993). Internal cleavage and trans-proteolytic activities of the VPg-proteinase (NIa) of tobacco etch potyvirus in vivo. J. Virol. 67, 6995-7000.
Chen, X., and Bruening, G. (1992a). Cloned DNA copies of cowpea severe mosaic virus genomic RNAs: Infectious transcripts and complete nucleotide sequence of RNA 1. Virology 191, 607-618.
Chen, X., and Bruening, G. (1992b). Nucleotide sequence and genetic map of cowpea severe mosaic virus RNA 2 and comparisons with RNA 2 of other comoviruses. Virology 187, 682-692.
Choi, G. H., Pawlyk, D. M., and Nuss, D. L. (1991). The autocatalytic protease p29 encoded by a hypovirulence-associated virus of the chestnut blight fungus resembles the potyvirus-encoded protease HC-Pro. Virology 183, 747-752.
Clark, M. E., Lieberman, P. M., Berk, A. J., and Dasgupta, A. (1993). Direct cleavage of human TATA-binding protein by poliovirus protease 3C in vivo and in vitro. Mol. Cell. Biol. 13, 1232-1237.
Cronin, S., Verchot, J., Haldeman-Cahill, R., Schaad, M. C., and Carrington, J. C. (1995). Long-distance movement factor: A transport function of the potyvirus helper component proteinase. Plant Cell 7, 549-559.
Daros, J. A., and Carrington, J. C. (1997). RNA binding activity of NIa proteinase of tobacco etch potyvirus. Virology 237, 327-336.
Dautel, S., Guidasci, T., Pique, M., Mougeot, J.-L., Lebeurier, G., Yot, P., and Mesnard, J. M. (1994). The full-length product of cauliflower mosaic virus open reading frame III is associated with the viral particle. Virology 202, 1043-1045.
De Mejia, M. V. G., Hiebert, E., Purcifull, D. E., Thornbury, D. W., and Pirone, T. P. (1985). Identification of potyviral amorphous inclusion protein as a nonstructural virus-specific protein related to helper component. Virology 142, 34-43.
Dessens, J. T., and Lomonossoff, G. P. (1991). Mutational analysis of the putative catalytic triad of the cowpea mosaic virus 24K protease. Virology 184, 738-746.
Dessens, J. T., and Lomonossoff, G. P. (1992). Sequence upstream of the 24K protease enhances cleavage of the cowpea mosaic virus B-RNA-encoded polyprotein at the junction between the 24K and 87K proteins. Virology 189, 225-232.
Devaney, M. A., Vakharia, V. N., Lloyd, R. E., Ehrenfeld, E., and Grubman, M. J. (1988). Leader protein of foot-and-mouth disease virus is required for cleavage of the p220 component of the cap-binding protein complex. J. Virol. 62, 4407-4409.
Dolja, V. V., Hong, J., Keller, K. E., Martin, R. R., and Peremyslov, V. V. (1997). Suppression of potyvirus infection by coexpressed closterovirus protein. Virology 234, 243-252.
Dougherty, W. G., and Parks, T. D. (1989). Molecular genetic and biochemical evidence for the involvement of the heptapeptide cleavage sequence in determining the reaction profile at two tobacco etch virus cleavage sites in cell-free assays. Virology 172, 145-155.
Dougherty, W. G., and Parks, T. D. (1991). Post-translational processing of the tobacco etch virus 49-kDa small nuclear inclusion polyprotein: Identification of an internal cleavage site and delimitation of VPg and proteinase domains. Virology 183, 449-456.
Dougherty, W. G., and Semler, B. L. (1993). Expression of virus-encoded proteinases: Functional and structural similarities with cellular enzymes. Microbiol. Rev. 57, 781-822.
Dougherty, W. G., Carrington, J. C., Cary, S. M., and Parks, T. D. (1988). Biochemical and mutational analysis of a plant virus polyprotein cleavage site. EMBO J. 7, 1281-1287.
Dougherty, W. G., Cary, S. M., and Parks, T. D. (1989a). Molecular genetic analysis of a plant virus polyprotein cleavage site: A model. Virology 171, 356-364.
Dougherty, W. G., Parks, T. D., Cary, S. M., Bazan, J. F., and Fletterick, R. J. (1989b). Characterization of the catalytic residues of the tobacco etch virus 49-kDa proteinase. Virology 172, 302-310.
Edwards, M. C., Zhang, Z., and Weiland, J. J. (1997). Oat blue dwarf marafivirus resembles the tymoviruses in sequence, genome organization, and expression strategy. Virology 232, 217-229.
Garcia, J. A., and Lain, S. (1991). Proteolytic activity of plum pox virus-tobacco etch virus chimeric proteins. FEBS Lett. 281, 67-72.
Garcia, J. A., Lain, S., Cervera, M. T., Riechmann, J. L., and Martin, M. T. (1990). Mutational analysis of plum pox potyvirus polyprotein processing by the NIa protease in Escherichia coli. J. Gen. Virol. 71, 2773-2779.
Garcia, J. A., Martin, M. T., Cervera, M. T., and Riechmann, J. L. (1992). Proteolytic processing of the plum pox potyvirus polyprotein by the NIa protease at a novel cleavage site. Virology 188, 697-703.
Garcia, J. A., Riechmann, J. L., and Lain, S. (1989a). Artificial cleavage site recognized by plum pox potyvirus protease in Escherichia coli. J. Virol. 63, 2457-2460.
Garcia, J. A., Riechmann, J. L., and Lain, S. (1989b). Proteolytic activity of the plum pox potyvirus NIa-like protein in Escherichia coli. Virology 170, 362-369.
Ghabrial, S. A., Smith, H. A., Parks, T. D., and Dougherty, W. G. (1990). Molecular genetic analyses of the soybean mosaic virus NIa protease. J. Gen. Virol. 71, 1921-1927.
Godeny, E. K., Chen, L., Kumar, S. N., Methven, S. L., Koonin, E. V., and Brinton, M. A. (1993). Complete sequence and phylogenetic analysis of the lactate dehydrogenase-elevating virus. Virology 194, 585-596.
Goldbach, R., and Krijt, J. (1982). Cowpea mosaic virus-encoded protease does not recognise primary translation products of mRNAs from other comoviruses. J. Virol. 43, 1151-1154.
Goldbach, R., Le Gall, O., and Wellink, J. (1991). Alpha-like viruses in plants. Semin. Virol. 2, 19-25.
Gorbalenya, A. E., Donchenko, A. P., Blinov, V. M., and Koonin, E. V. (1989). Cysteine proteases of positive strand RNA viruses and chymotrypsin-like serine proteases: A distinct protein superfamily with a common structural fold. FEBS Lett. 243, 103-114.
Gorbalenya, A. E., Koonin, E. V., Blinov, V. M., and Donchenko, A. P. (1988). Sobemovirus genome appears to encode a serine protease related to cysteine proteases of picornaviruses. FEBS Lett. 236, 287-290.
Gorbalenya, A. E., Koonin, E. V., and Lai, M. M.-C. (1991). Putative papain-related thiol proteases of positive-strand RNA viruses: Identification of rubi- and aphthovirus proteases and delineation of a novel conserved domain associated with proteases of rubi-, α-, and coronaviruses. FEBS Lett. 288, 201-205.
Guidasci, T., Mougeot, J. L., Lebeurier, G., and Mesnard, J. M. (1992). Processing of the minor capsid protein of the cauliflower mosaic virus requires a cysteine proteinase. Res. Virol. 143, 361-370.
Hay, J., Grieco, F., Druka, A., Pinner, M., Lee, S.-C., and Hull, R. (1994). Detection of rice tungro bacilliform virus gene products in vivo. Virology 205, 430-437.
Hehn, A., Fritsch, C., Richards, K. E., Guilley, H., and Jonard, G. (1997). Evidence for in vitro and in vivo autocatalytic processing of the primary translation product of beet necrotic yellow vein virus RNA 1 by a papain-like proteinase. Arch. Virol. 142, 1051-1058.
Hellen, C. U. T., and Wimmer, E. (1992). Maturation of poliovirus capsid proteins. Virology 187, 391-397.
Hellmann, G. M., Shaw, J. G., and Rhoads, R. E. (1988). In vitro analysis of tobacco vein mottling virus NIa cistron: Evidence for a virus encoded protease. Virology 163, 554-562.
Hiebert, E., Purcifull, D. E., and Christie, R. G. (1984). Purification and immunological analysis of plant viral inclusion bodies. In "Methods in Virology" (K. Maramorosch and H. Koprowski, Eds.), Vol. 8, pp. 225-279. Academic Press, New York.
Hohn, T., and Fütterer, J. (1997). The proteins and functions of plant pararetroviruses: Knowns and unknowns. Crit. Rev. Plant Sci. 16, 133-161.
Jelkmann, W., Fechtner, B., and Agranovsky, A. A. (1997). Complete genome structure and phylogenetic analysis of little cherry virus, a mealybug-transmissible closterovirus. J. Gen. Virol. 78, 2067-2071.
Kadaré, G., Rozanov, M., and Haenni, A.-L. (1995). Expression of the turnip yellow mosaic virus proteinase in Escherichia coli and determination of the cleavage site within the 206 kDa protein. J. Gen. Virol. 76, 2853-2857.
Karasev, A. V., Boyko, V. P., Gowda, S., Nikolaeva, O. V., Hilf, M. E., Koonin, E. V., Niblett, C. L., Cline, K., Gumpf, D. J., Lee, R. F., Garnsey, S. M., Lewandowski, D. J., and Dawson, W. O. (1995). Complete sequence of the citrus tristeza virus RNA genome. Virology 208, 511-520.
Karasev, A. V., Hilf, M. E., Garnsey, S. M., and Dawson, W. O. (1997). Transcriptional strategy of closteroviruses: Mapping the 5' termini of the citrus tristeza virus subgenomic RNAs. J. Virol. 71, 6233-6236.
Kasschau, K. D., and Carrington, J. C. (1995). Requirement for HC-Pro processing during genome amplification of tobacco etch potyvirus. Virology 209, 268-273.
Kasschau, K. D., Cronin, S., and Carrington, J. C. (1997). Genome amplification and long-distance movement functions associated with the central domain of tobacco etch potyvirus helper component-proteinase. Virology 228, 251-262.
Kim, D. H., Hwang, D. C., Kang, B. H., Lew, J., Han, J. S., Song, B. O. D., and Choi, K. Y. (1996). Effects of internal cleavages and mutations in the C-terminal region of NIa protease of turnip mosaic potyvirus on the catalytic activity. Virology 226, 183-190.
Kim, D.-H., Park, Y. S., Kim, S. S., Lew, J., Nam, H. G., and Choi, K. Y. (1995). Expression, purification, and identification of a novel self-cleavage site of the NIa C-terminal 27-kDa protease of turnip mosaic potyvirus C5. Virology 213, 517-525.
Laco, G. S., Kent, S. B. H., and Beachy, R. N. (1994). Rice tungro bacilliform virus encodes reverse transcriptase, DNA polymerase, and ribonuclease H activities. Proc. Natl. Acad. Sci. USA 91, 2654-2658.
Laco, G. S., Kent, S. B. H., and Beachy, R. N. (1995). Analysis of the proteolytic processing and activation of the rice tungro bacilliform virus gene products reverse transcriptase. Virology 208, 207-214.
Laliberté, J.-F., Nicolas, O., Chatel, H., Lazure, C., and Morosoli, R. (1992). Release of a 22-kDa protein derived from the amino-terminal domain of the 49-kDa NIa of turnip mosaic potyvirus in Escherichia coli. Virology 190, 510-514.
Lawrence, D. M., Rozanov, M. N., and Hillman, B. I. (1995). Autocatalytic processing of the 223-kDa protein of blueberry scorch carlavirus by a papain-like proteinase. Virology 207, 127-135.
Lee, C.-J., Shieh, C.-K., Gorbalenya, A. E., Koonin, E. V., La Monica, N., Tuler, J., Bagdzhadzhyan, A., and Lai, M. M. C. (1991). The complete sequence (22 kilobases) of murine coronavirus gene 1 encoding the putative proteases and RNA polymerases. Virology 180, 567-582.
Maia, I. G., and Bernardi, F. (1996). Nucleic acid-binding properties of a bacterially expressed potato virus Y helper component-proteinase. J. Gen. Virol. 77, 869-877.
Maia, I. G., Haenni, A.-L., and Bernardi, F. (1996a). Potyviral HC-Pro: A multifunctional protein. J. Gen. Virol. 77, 1335-1341.
Maia, I. G., Seron, K., Haenni, A. L., and Bernardi, F. (1996b). Gene expression from viral RNA genomes. Plant Mol. Biol. 32, 367-391.
Marcos, J. F., and Beachy, R. N. (1997). Transgenic accumulation of two plant virus coat proteins on a single self-processing polypeptide. J. Gen. Virol. 78, 1771-1778.
Margis, R., and Pinck, L. (1992). Effects of site-directed mutagenesis on the presumed catalytic triad and substrate-binding pocket of grapevine fanleaf nepovirus 24-kDa proteinase. Virology 190, 884-888.
Margis, R., Ritzenthaler, C., Reinbolt, J., Pinck, M., and Pinck, L. (1993). Genome organization of grapevine fanleaf nepovirus RNA2 deduced from the 122K polyprotein P2 in vitro cleavage products. J. Gen. Virol. 74, 1919-1926.
Margis, R., Viry, M., Pinck, M., Bardonnet, N., and Pinck, L. (1994). Differential proteolytic activities of precursor and mature forms of the 24K proteinase of grapevine fanleaf nepovirus. Virology 200, 79-86.
Martin, M. T., López-Otín, C., Lain, S., and Garcia, J. A. (1990). Determination of polyprotein processing sites by amino terminal sequencing of nonstructural proteins encoded by plum pox potyvirus. Virus Res. 15, 97-106.
Martínez-Izquierdo, J., and Hohn, T. (1987). Cauliflower mosaic virus coat protein is phosphorylated in vitro by a virion-associated protein-kinase. Proc. Natl. Acad. Sci. USA 84, 1825-1828.
Mavankal, G., and Rhoads, R. (1991). In vitro cleavage at or near the N-terminus of the helper component protein in the tobacco vein mottling virus polyprotein. Virology 185, 721-731.
Ménard, R., Chatel, H., Dupras, R., Plouffe, C., and Laliberté, J.-F. (1995). Purification of turnip mosaic potyvirus viral protein genome-linked proteinase expressed in Escherichia coli and development of a quantitative assay. Eur. J. Biochem. 229, 107-112.
Murphy, F. A., Fauquet, C. M., Bishop, D. H. L., Ghabrial, S. A., Jarvis, A. W., Martelli, G. P., Mayo, M. A., and Summers, M. D. (1995). Virus taxonomy: Sixth Report of the International Committee on Taxonomy of Viruses. In "Archives of Virology: Supplement 10." Springer-Verlag, Wien/New York.
Murphy, J. F., Rhoads, R. E., Hunt, A. G., and Shaw, J. G. (1990). The VPg of tobacco etch virus RNA is the 49-kDa proteinase or the N-terminal 24-kDa part of the proteinase. Virology 178, 285-288.
Novoa, I., Martinez Abarca, F., Fortes, P., Ortin, J., and Carrasco, L. (1997). Cleavage of p220 by purified poliovirus 2A(pro) in cell-free systems: Effects on translation of capped and uncapped mRNAs. Biochemistry 36, 7802-7809.
Oh, C. S., and Carrington, J. C. (1989). Identification of essential residues in potyvirus proteinase HC-Pro by site-directed mutagenesis. Virology 173, 692-699.
Ohira, K., Namba, S., Rozanov, M., Kusumi, T., and Tsuchizaki, T. (1995). Complete sequence of an infectious full-length cDNA clone of citrus tatter leaf capillovirus: Comparative sequence analysis of capillovirus genomes. J. Gen. Virol. 76, 2305-2309.
Parks, T. D., and Dougherty, W. G. (1991). Substrate recognition by the NIa proteinase of two potyviruses involves multiple domains: Characterization using genetically engineered hybrid proteinase molecules. Virology 182, 17-27.
Parks, T. D., Howard, E. D., Wolpert, T. J., Arp, D. J., and Dougherty, W. G. (1995). Expression and purification of a recombinant tobacco etch virus NIa proteinase: Biochemical analysis of the full-length and a naturally occurring truncated proteinase form. Virology 210, 194-201.
Parks, T. D., Leuther, K. K., Howard, E. D., Johnston, S. A., and Dougherty, W. G. (1994). Release of proteins and peptides from fusion proteins using a recombinant plant virus proteinase. Anal. Biochem. 216, 413-417.
Parks, T. D., Smith, H. A., and Dougherty, W. G. (1992). Cleavage profiles of tobacco etch virus (TEV)-derived substrates mediated by precursor and processed forms of the TEV proteinase. J. Gen. Virol. 73, 149-155.
Peters, S. A., Voorhorst, W. G. B., Wellink, J., and van Kammen, A. (1992a). Processing of VPg-containing polyproteins encoded by the B-RNA from cowpea mosaic virus. Virology 191, 90-97.
Peters, S. A., Voorhorst, W. G. B., Wery, J., Wellink, J., and van Kammen, A. (1992b). A regulatory role for the 32K protein in proteolytic processing of cowpea mosaic virus polyproteins. Virology 191, 81-89.
Pinck, M., Reinbolt, J., Loudes, A. M., Le Ret, M., and Pinck, L. (1991). Primary structure and location of the genome-linked protein (VPg) of grapevine fanleaf nepovirus. FEBS Lett. 284, 117-119.
Pringle, C. R. (1998). The universal system of virus taxonomy of the International Committee on Virus Taxonomy (ICTV), including new proposals ratified since publication of the sixth ICTV report in 1995. Arch. Virol. 143, 203-210.
Pruss, G., Ge, X., Shi, X. M., Carrington, J. C., and Vance, V. B. (1997). Plant viral synergism: The potyviral genome encodes a broad-range pathogenicity enhancer that transactivates replication of heterologous viruses. Plant Cell 9, 859-868.
Qu, R., Bhattacharyya-Pakrasi, M., Laco, G. S., De Kochko, A., Subba Rao, B. L., Kaniewska, M. B., Elmer, J. S., Rochester, D. E., Smith, C., and Beachy, R. N. (1991). Characterization of the genome of rice tungro bacilliform virus: Comparison with commelina yellow mottle virus and caulimoviruses. Virology 185, 354-364.
Reddick, B. B., Habera, L. F., and Law, M. D. (1997). Nucleotide sequence and taxonomy of maize chlorotic dwarf virus within the family Sequiviridae. J. Gen. Virol. 78, 1165-1174.
Restrepo-Hartwig, M. A., and Carrington, J. C. (1992). Regulation of nuclear transport of a plant potyvirus protein by autoproteolysis. J. Virol. 66, 5662-5666.
Restrepo-Hartwig, M. A., and Carrington, J. C. (1994). The tobacco etch potyvirus 6-kilodalton protein is membrane associated and involved in viral replication. J. Virol. 68, 2388-2397.
Riechmann, J. L., Lain, S., and Garcia, J. A. (1992). Highlights and prospects of potyvirus molecular biology. J. Gen. Virol. 73, 1-16.
Riechmann, J. L., Cervera, M. T., and Garcia, J. A. (1995). Processing of the plum pox virus polyprotein at the P3-6K1 junction is not required for virus viability. J. Gen. Virol. 76, 951-956.
Ritzenthaler, C., Viry, M., Pinck, M., Margis, R., Fuchs, M., and Pinck, L. (1991). Complete nucleotide sequence and genetic organization of grapevine fanleaf nepovirus RNA1. J. Gen. Virol. 72, 2357-2365.
Rojas, M. R., Zerbini, F. M., Allison, R. F., Gilbertson, R. L., and Lucas, W. J. (1997). Capsid protein and helper component proteinase function as potyvirus cell-to-cell movement proteins. Virology 237, 283-295.
Rott, M. E., Gilchrist, A., Lee, L., and Rochon, D. (1995). Nucleotide sequence of tomato ringspot virus RNA1. J. Gen. Virol. 76, 465-473.
Rozanov, M. N., Drugeon, G., and Haenni, A.-L. (1995). Papain-like proteinase of turnip yellow mosaic virus: A prototype of a new viral proteinase group. Arch. Virol. 140, 273-288.
Ryan, M. D., and Flint, M. (1997). Virus-encoded proteinases of the picornavirus supergroup. J. Gen. Virol. 78, 699-723.
Schaad, M. C., Haldeman-Cahill, R., Cronin, S., and Carrington, J. C. (1996). Analysis of the VPg-proteinase (NIa) encoded by tobacco etch potyvirus: Effects of mutations on subcellular transport, proteolytic processing, and genome amplification. J. Virol. 70, 7039-7048.
Schaad, M. C., Jensen, P. E., and Carrington, J. C. (1997). Formation of plant RNA virus replication complexes on membranes: Role of an endoplasmic reticulum-targeted viral protein. EMBO J. 16, 4049-4059.
Schultze, M., Hohn, T., and Jiricny, J. (1990). The reverse transcriptase gene of cauliflower mosaic virus is translated separately from the capsid gene. EMBO J. 9, 1177-1185.
Shanks, M., and Lomonossoff, G. P. (1990). The primary structure of the 24K protease from red clover mottle virus: Implications for the mode of action of comovirus proteases. J. Gen. Virol. 71, 735-738.
Shanks, M., and Lomonossoff, G. P. (1992). The nucleotide sequence of red clover mottle virus bottom component RNA. J. Gen. Virol. 73, 2473-2477.
Shanks, M., Dessens, J. T., and Lomonossoff, G. P. (1996). The 24 kDa proteinases of comoviruses are virus-specific in cis as well as in trans. J. Gen. Virol. 77, 2365-2369.
Shapira, R., and Nuss, D. L. (1991). Gene expression by a hypovirulence-associated virus of the chestnut blight fungus involves two papain-like protease activities. J. Biol. Chem. 266, 19419-19425.
Shen, P., Kaniewska, M., Smith, C., and Beachy, R. N. (1993). Nucleotide sequence and genomic organization of rice tungro spherical virus. Virology 193, 621-630.
Shi, X. M., Miller, H., Verchot, J., Carrington, J. C., and Vance, V. B. (1997). Mutations in the region encoding the central domain of helper component-proteinase (HC-Pro) eliminate potato virus X/potyviral synergism. Virology 231, 35-42.
Shirako, Y., and Wilson, T. M. A. (1993). Complete nucleotide sequence and organization of the bipartite RNA genome of soil borne wheat mosaic virus. Virology 195, 16-32.
Shukla, D. D., Ward, C. W., and Brunt, A. A. (1994). Genome structure, variation and function. In "The Potyviridae" (D. D. Shukla, C. W. Ward, and A. A. Brunt, Eds.), pp. 74-112. CAB International, Cambridge.
Thornbury, D. W., van den Heuvel, J. F. J. M., Lesnaw, J. A., and Pirone, T. P. (1993). Expression of potyvirus proteins in insect cells infected with a recombinant baculovirus. J. Gen. Virol. 74, 2731-2735.
Torruella, M., Gordon, K., and Hohn, T. (1989). Cauliflower mosaic virus produces an aspartic proteinase to cleave its polyproteins. EMBO J. 8, 2819-2825.
Turnbull-Ross, A. D., Mayo, M. A., Reavy, B., and Murant, A. F. (1993). Sequence analysis of the parsnip yellow fleck virus polyprotein: Evidence of affinities with picornaviruses. J. Gen. Virol. 74, 555-561.
Turnbull-Ross, A. D., Reavy, B., Mayo, M. A., and Murant, A. F. (1992). The nucleotide sequence of parsnip yellow fleck virus: A plant picorna-like virus. J. Gen. Virol. 73, 3203-3211.
van der Wilk, F., Verbeek, M., Dullemans, A. M., and van den Heuvel, J. F. J. M. (1997). The genome-linked protein of potato leafroll virus is located downstream of the putative protease domain of the ORF1 product. Virology 234, 300-303.
Van Rompaey, L., Proost, P., Van den Berghe, H., and Marynen, P. (1995). Design of a new protease inhibitor by the manipulation of the bait region of α2-macroglobulin: Inhibition of the tobacco etch virus protease by mutant α2-macroglobulin. Biochem. J. 312, 191-195.
Verchot, J., and Carrington, J. C. (1995a). Debilitation of plant potyvirus infectivity by P1 proteinase-inactivating mutations and restoration by second-site modifications. J. Virol. 69, 1582-1590.
Verchot, J., and Carrington, J. C. (1995b). Evidence that the potyvirus P1 proteinase functions in trans as an accessory factor for genome amplification. J. Virol. 69, 3668-3674.
Verchot, J., Herndon, K. L., and Carrington, J. C. (1992). Mutational analysis of the tobacco etch potyviral 35-kDa proteinase: Identification of essential residues and requirements for autoproteolysis. Virology 190, 298-306.
Verchot, J., Koonin, E. V., and Carrington, J. C. (1991). The 35-kDa protein from the N-terminus of a potyviral polyprotein functions as a third virus-encoded proteinase. Virology 185, 527-535.
Vos, P., Verver, J., Jaegle, M., Wellink, J., van Kammen, A., and Goldbach, R. (1988). Two viral proteins involved in the proteolytic processing of the cowpea mosaic virus polyproteins. Nucleic Acids Res. 16, 1967-1985.
Wellink, J., and van Kammen, A. (1988). Proteases involved in the processing of viral polyproteins. Arch. Virol. 98, 1-26.
Zaccomer, B., Haenni, A.-L., and Macaya, G. (1995). The remarkable variety of plant RNA virus genomes. J. Gen. Virol. 76, 231-247.

INDEX

A (-) strand synthesis, 141 ( + ) strand synthesis, 141 cel-antitrypsin, 209 a/~/-unsaturated carboxylesters, 154 a-cyclopropylbenzyl, 34 a-globin, 175 a-interferon, 62, 65 a-ketoamides, 82, 110 ~/- and/z-lactones, 154 ~-barrel domain, 72, 146 fl-hairpin structures, 9 ~-lactamase, E. coli, 221 ~-lactams, 221,226, 227 ~-sandwich, 147 y-aminovinyl sulfones, 154 1,2-epoxy-3-(pnitrophenyloxy)propane, 226 1-chloro-3-tosylamido-7-amino-2-heptane, 226 2,6-dimethylphenoxyacetyl, 31 2,6-pyrdine dicarboxylic acid, 226 2A proteinase, 155 2-amino-butyric acid, 76 2-o-t-butylphenylthio, 23 31o helical conformation, 150 3ABC, 144 3C gene product, 143 3C proteinase, 146, 147, 153 3CD, 144 3C-like proteinase, 157, 236, 243 3D and 3AB domains, 154 3-morpholinopropyl, 132

4- (amidinophenyl)methanesulfonyl fluoride, 226 4-hydroxypyrone, 33 5,6-cycloalkylpyrones, 34 55-kDa Gag polyprotein, 5 6-(substituted oxyethyl)penem, 226 A-70450, 122, 123, 124, 127, 131,132 A-74704 A-77003, 18, 29, 38 A-79295, 19 A-79912, 122, 123, 132 A-80987, 18, 29 A-84538 (ABT-538) Ritonavir, 18 A-98881, 25 ABT-538, 29 Acquired immunodeficiency syndrome, 2 Acridinediones, 174 Actinidin, 173 Activation cleavage site, 207 Active site, 9, 11,148, 150 aspartic acid, 8 cavity, 112 flap, 126 interactions in HIV PR, 12 mutants, 37, 39 residues, 150 subsites, secreted aspartic proteases, 126 thiol, 154 titrations, 74 Activity and specificity, 149 Acute myocarditis, 190 Acyl-enzyme intermediate, 149

265

266 Adsorption to specific receptor, 4 Affinity chromatography, 192 AG-1254, 22, 31 AG- 1284, 23, 30 AG-1343 (Nelfinavir), 15, 29 AIDS, 1 Airway epithelial cells, 205 Airway, 214 Ala-AMC substrate, 183 Alanine scanning, 77 Aldehydes, 154 Allopurinol, 190 Alphalike viruses, 238 Alphalike, 233 - 235 Alveolar capillaries, 206 cells, 206 epithelium, 209 macrophages, 209 Amastigotes, 191,197, 198 AMC peptide substrate, 171 American trypanosomiasis, 189 Amino acid sequence alignment, 221 Aminohydroxy indane, 28 Aminomethylene isostere, 109 Aminopeptidase inhibition, 184 Aminopeptidase, 184, 185 cytosolic, 168 Plasmodium, 183 Amino-terminal helix, 145 Amphotericin B, 119, 120, 132 Amprenavir, 30 Ancestral aspartic proteinase gene, 10 Animal retroviruses, 3,253-255 Animal virus proteinases, 256 Anthranilamide, 31 Antibacterial activity, 227 agents, 219 drugs, 229 therapy, 118 Antibiotics, 219 Antifungal activity, 123 targets, 118 Antigenic site, 206 Antihypertensive agents, 13 Antileukoprotease, 209 Antimalarial activity, 174

Index agents, 172 drugs, 165, 184, 185 Antipain, 209, 226 Antiparallel, 102 B-barrels, 147 ~/-sheet, 9, 152 Antitryptase Clara antibodies, 209, 210 Antiviral drug discovery, 3 EC90, 27 therapeutics, 43 therapy, 1 Aphthovirus, 140, 144, 238, 249 Apple chlorotic leaf spot tricho virus, 251 Apple stem grooving capillovirus, 251 pitting virus, 252 Aprotinin, 209,214, 226 Aptamers, 81 AQ-148, 16 Arabinose promoter, 220 Arteriviruses, 238, 249 Aseptic meningitis, 96 Aspartic proteinase, 11,168, 170, 176, 177, 185,234, 235,253,254 family, 9 fold, 124 gene, 182 ancestral, 10 inhibitors, 175, 180, 182 mechanism, 9 malarial, 175 secreted aspartic proteinases, 118, 119, 122 Assembly of capsid precursor, 139 of virions, 2 protein precursor, 96 Atomic absorption spectroscopy, 78 Atomic resolution structure, 139 Atopic asthma, 121 ATP-activated carboxyl proteinase, 197 Attachment, 141 Autoactivation, 178 Autocatalytic, 177 mechanisms, 238 processing, 253 Autoprocessing, 242 of herpes proteases, 107 Autoproteolysis, 226

Autoproteolytic cleavage, 105, 221 Avian malaria parasite, 175

B B- or T-lymphocytes, 96 Bacillus subtilis, 221 Bacteria, 219 Bacterial cells, 220 infections, 219 signal peptidase, 155, 219, 220, 224-226, 228 type I signal peptidase gene type I signal peptidases, 219-229 Baculovirus expression system, 171, 255 vector, 237 Badnavirus, 254 Barley mild mosaic virus, 240 Barley yellow mosaic virus, 240 BaYMV, 236 Beet necrotic yellow vein benyvirus, 253 Beet yellows closterovirus, 249 Benyvirus, 234 Benzamide, 81 Benzamidine, 209 Benzanilide, 81 Benznidazole (Radanil), 190 Benzoxazinones, 110, 111 Bestatin, 183, 226 Biantennary-type oligosaccharide, 194 Bifurcated H-bond, 34 Bikunin, 214 BILA 2011 Palinavir, 16 Bila-1906, 40 Bila-2185, 40 Bioavailability, 13, 30, 42 oral, 27, 28, 29 Biochemical fitness, 41 Bipartite bymoviruses, 235, 237 Biphasic pH dependence, 99 Bis-hydroxymethylbenzyl analog, 32 Blood plasma concentration, 32 Blueberry scorch carlavirus, 252 BMS-186318, 21 Boc-Gln-Ala-Arg-MCA, 207 Bovine serum albumin, 192 Bowman-Birk soybean trypsin inhibitor, 209 Bripiodionen, 110, 111

Brome streak mosaic rymovirus, 237 Bronchiolar epithelial cells, 207 BrSMV rymovirus, 238 RNA, 240 Budding, 4 and maturation of HIV-1, 7 Bymovirus, 234, 236, 239

C CA, 6 CA/p2, 7 Ca2+ signaling mechanism, 197 Calpain, 178 CaMV proteinase, 254

Candida albicans, 118 dissemination, 119 genomics, 133 proteases, 117-135 tropicalis, 118 virulence, 121 parapsilosis, 118 Capillovirus, 234, 252 Cap-recognition complex, 145 Capsid (CA) protein, 5 assembly, 4 formation, 97 maturation, 101 morphology, 3, 7 processing, 141 Carboxyl proteinase inhibitors, 199 Carboxylate, aspartate residue, 150 Cardiovirus, 140 Carditis meningitis, 140 Carlavirus, 234, 252 Carmolike, 234, 235 Casein, 192 Catalase, 170 Catalytic activity, 99, 101 aspartates, 27 aspartic acid residues, 9 domain, 196 dyad, 238, 251 efficiency, 76, 221 mechanism, 106, 221, 224 moiety, 193, 195 residues, 238, 248, 252

Catalytic triad, 68, 79, 103, 237, 239, 240, 243, 245, 246 NS3 serine protease, 66 Cathepsin B, 172, 174, 197 C, 237 D, 177 E, 177 G, 213 L and B, 193 L, 172 L-like proteinase, 171 S, 193 Cauliflower mosaic caulimovirus, 254 Caulimoviridae, 234, 254, 254 CD4, 212, 213 helper T cells, 2 receptor, 5 Cellular aspartic proteinase, 13 factors,1214 immunodeficiency, 118 processing proteinase, 206 protease, 6 receptor, 212 serine proteinase, 245 substrates, 256 Cellular proteases and viral infection, 205-214 Chagas disease, 189, 190 Chalcones, 174 Chelating agents, 207 Chemokine receptor, 212 Chemotherapeutic attack, 199 Chestnut blight fungus, 238 Chicken egg-white cystatin-Sepharose, 192 Chickenpox, 94, 96 Chloromethyl ketone, 80 TPCK, 66 Chloroquine, 166, 184 Chloroquine-heme complex, 184 CHO cells, 213 Chorioallantoic fluid, 206 Chromogenic peptide substrate, 178, 179, 180, 192 Chronic hepatitis, 62 Chymostatin, 226 Chymotrypsin serine proteinase, 237 chymotrypsinlike cysteine proteinase, 146, 154, 155 fold, 72


serine proteinase, 99, 149, 150, 152, 198, 237 Chymotryptic cleavage site, 214 Ciliated epithelial cells, 206 Cis cleavage site, 76, 243 Citrus tristeza closterovirus, 249 Cleavage site, 242, 243, 251, 255 activation, 207 mutants, 40 preferences, 8 sequences, 7, 36 specificity, 245 Cleavage, 4 autoproteolytic, 105, 221 of NS3-NS4A precursor, 72 Clinical manifestations, 94 Closterovirus cysteine proteinases, 250 leader protein, 250 leader proteinase, 249 Closterovirus, 234, 238, 249 CMV, 108 CMV protease, 113 Coat protein, 236 Cocrystal structure, 154 Cold sores, 96 Collagenase activity, 198 Colonization of mucosal surfaces, 118 Common cold, 140 Como- and nepovirus proteinase, 242 Comoviridae, 234 Comovirus, 234, 242, 244-246 polyprotein processing, 243 Competitive inhibitor, 109 Complement cascade, 196 Complex formation, kinetic consequences, 74 Computer-aided design, 173 ConA-Sepharose, 192 Conditional lethal strain, 220 Consensus cleavage motif sequence, 207 Conserved domain, 224 Coordination geometry, 78, 79 Core protein, 63 Co-receptors, 211 Coronaviruses, 238, 249 Coumarin, 33 Cowpea mosaic comovirus, 243 Cowpea protoplasts, 245 Coxsackievirus, 140 CPMV, 244

Crithidia fasciculata, 192, 193, 198 Cross-resistance, 41, 43 Cruzain, 174, 194 Cruzipain, 192 i isoform, 196 genes, 194 Crystal packing forces, 9 structure (HIV), 10, 29 structure determination, 9 structure of CMV protease, 102 structure, 145, 146 structure, plasmepsin II, 177 Crystallization, 156 C-terminal domain, 195 extension, 193 tail, 194 CTV, 249 Cuboidal epithelial cells, 206 Cyanoguanidine, 32 Cycloartanol sulfate, 112 Cyclohexanediols, 32 Cystatin-based substrates, 193 Cystatins, 193 Cystatin-Sepharose, 197 Cysteine proteinase, 67, 150, 156, 168-171, 173-175, 178, 185, 191, 192, 196, 234, 238, 252 domain, 250 inhibitors, 168, 172, 182 host, 235 Cysteine residues in P1 position, 75 Cysteine-histidine dyad, 152 Cytochrome P450 3A4, 42 enzymes, 29 inhibitors, 190 Cytomegalovirus, 93, 95, 96 Cytoplasm, 220, 227 Cytoplasmic invasion, 5 membrane, 220, 227, 228 Cytoskeletal proteins, 8 Cytosome, 168

D DABCYL-Glu-Arg-Met-Phe#Leu-Ser-Phe-Pro-EDANS, 178 Decahydroisoquinoline moiety, 27, 28

Decamer peptide substrates, 76 Decapeptide, 223 Defensive responses, 256 Deletion analysis, 251 DelPhi analysis, 135 Denatured hemoglobin, 192 Design of antiviral agents, 1 of HIV protease inhibitors, 11 Diabetes mellitus, 118 Diamino diol inhibitors, 29 Diazoacetyl-DL-norleucine methyl ester, 226 Dichloroisocoumarin, 226 Differentiation of trypomastigotes, 198 Digestion of hemoglobin, 183 of host-derived proteins, 167 Digestive vacuole, 168 Di-isopropyl fluorophosphate, 197 Dimethylaminopropyl, 132 Dimer formation, 67 interface, 103 Dimethylaminoethyl, 132 Dipetalogaster maximus, 190 Disease, 94 symptoms, 256 in humans and animals, 140 Dissociation constant, 212 Disulfide bridge, 196 Dithiothreitol, 207 Divalent metal ions, 207 D-Mannitol, 33 DMP-323 (XM-323), 23, 32, 38 DMP-450 (XM-412), 24, 32 DOCK 3.0 program, 174 Domain interface, 126 Dominant genotype, 43 Drug design targets, 1 discovery program, 8, 108 discovery, 223 resistance, 2, 36, 40, 42 selection pressure, 36 target, 219 Drug-resistant mutants, 33 organisms, 219 pathogens, 219 variants, 36

E E. coli leader peptidase, 223, 225, 226, 227 E-64, 172, 175, 178, 183, 193, 197 EC50, 31, 34 Echovirus, 140 EDTA, 66 Eglin C, 80 Elastatinal, 226 Electron microscopy, 198 Electronic spectroscopy, 78 Electrophilic structure, 140 Electrostatic potential surface, 125 EMCV, 140 Encephalitis, 94, 140 Endocrine cells of the bronchi, 206 Endoplasmic reticulum, 228 Endothelial cells, 206 Enterococci, 219 Enteroviral 3C, 152 diseases, 140 Enterovirus, 140 Envelope glycoprotein, 212 gp120, 212 Enveloped viruses, 205 Enzyme mechanism, 149 acyl-enzyme intermediate, 149 Epimastigote, 191, 192, 196-198 Epoxide carbon, 35 Epoxides, 154 Epstein-Barr virus, 95 Equine infectious anemia virus, 8 ER membrane, 65, 70 Ergosterol biosynthesis, 190 Erythrocyte membrane proteins, 175 Escherichia coli, 223 O157:H7, 219 type I signal peptidase, 220, 221 Essential Lys residue, 221 Ser residue, 221 Ethylenediamine tetraacetic acid, 226 Eukaryotic CAP-binding complex, 155 cells, 220 microsomal signal peptidase, 228 signal peptidases, 224 Evolutionary distance, 237 Exopeptidase, 183 Extracellular localization, 220


F F(ab)2 fragment, 196 F0 protein of Sendai virus, 206 Fabavirus, 242 Factor Xa, 206 Falcipain, 169-173, 176, 182-5 homolog, 172 inhibitors, 172, 184 Flagellated protozoan, 189 Flagellum, 190 Flap, 9, 125 Flavivirus NS2B, 70 Fluconazole, 119, 132 Flucytosine, 119 Fluorogenic Ala-AMC substrate, 183 AMC peptide substrate, 171 DABCYL-Glu-Arg-Met-Phe#Leu-Ser-Phe-Pro-EDANS, 178 Leu-AMC substrate, 183 peptide, 170 substrates, 122, 178, 179, 192 Fluoromethyl ketones, 110, 154, 173 FMDV, 140 Folate antagonists, 166 Food vacuole, 168, 176, 177, 183 proteinase inhibition, 184 Foot-and-mouth disease, 140, 156 Fungal metabolite, 110 Fungus-transmitted bymoviruses, 238 Fusion glycoprotein HA, 208 precursor, 207 Pr160 gag-pol, 6

G Gag, 254 and pol gene products, 6 polyprotein, 5, 13 precursor protein, 5 Gag, pol, and env genes, 5 Gag-pol polyprotein, 13 precursor protein, 5 Gastricsin, 177 GenBank, 134 Gene disruption, 121 expression, regulation, 121 sequencing, 176

General acid-base catalyst, 149, 151 base, 221 Genetic structure, 3 Genital herpes, 94, 96 Genome amplification, 237, 238, 242 maps, 236, 244, 247, 249, 252, 254 of Candida albicans, 133 organization, 6, 233, 253 structure, 252 -linked protein, 239 sequencing, 221 GFLV, 244 proteinase, 245 Gingivostomatitis, 94 Globin, 184 Glutathione, 169, 170 Glycerol, 101 Glycosyl phosphatidyl inositol (GPI) anchor, 197 Golgi complex, 206 Gp120 of HIV-1, 206, 212 Gp160, 206 Granulocyte elastase inhibitor, 209 Grapevine chrome mosaic nepovirus, 245 Grapevine fanleaf nepovirus, 245 Greek key motif, 72 Ground state stabilization, 106 Ground-state binding, 77

H

Haemophilus influenzae, 223, 224, 225 Hairy cell leukemia, 3 HAV, 140 HC proteinase, 238 Helicase-like, 236, 250 Hemagglutinating activity, 208 Heme release, 169 Heme, 168 Hemiketal adducts, 110 Hemoglobin, 170, 173, 174, 178, 179, 183 degradation, 168, 175, 178, 185 degrading activity, 175 denaturation, 169 digestion pathway, 183 substrate, 176 tetramer, 176 Hemoglobinopathies, 176 Hemolytic activity, 208

Hemozoin, 168, 169 Hepatic trophozoites, 167 Hepatitis A virus, 140, 239, 248 Hepatitis C proteases, 61-83 Hepatitis C virus, 61 needle stick, 62 prevalence, 62 risk factors, 62 route of transmission, 62 non-A, non-B, 61 transfusion-associated, 61 Hepatocellular carcinoma, 62 Hepatovirus, 140 Heptapeptide sequences, 240 Herpangina, 140 Herpes labialis, 96 Herpes simplex virus, 93, 94 -2, 94 protease amino acid sequence alignment, 98 proteases, 93-113 Hexamer cleavage products, 82 Hexapeptide aldehyde, 81 Hexapeptide, 152 High mannose-type oligosaccharide, 194 Highly active antiretroviral therapies (HAART), 5 HIV protease, 1-44 HIV acquired immunodeficiency syndrome, 2 AIDS, 1 genome, 5 Gp120 of HIV-1, 206, 212 Gp160, 206 inhibitor cores, 14-27 life cycle, 5 multidrug-resistant strains of HIV, 36, 43 PR dimer, 9 PR monomer, 36 protease (PR) inhibitors, 3, 8, 13, 27, 28, 34, 35, 43 cross-resistance, 41, 43 protease (PR), 2, 11, 13 protease inhibitor complexes, 11 protease structure, 31 protease, 1, 5 L90M mutant, 39 V82A mutation, 37 infectivity, 212

HIV reverse transcriptase, 36 SF2, 214 HIV-2, 3 Homodimers, 101, 103 Homology modeling, 68, 130 Host cell functions, inhibition, 145 signal peptidases, 63 cysteine proteinase, 235 invasion, 167 proteases, 214 HSV-1 protease, 97 Human bronchial lavage fluid, 209 Human herpesvirus proteases, 93-113 kinetic parameters of, 100 -6, 95 -7, 95 -8, 95 6 and 7, 93 Human immunodeficiency virus, 206 proteases, 43 type I (HIV), 2 Human influenza virus, 206, 207 kininogen, 196 leucocyte elastase, 227 mucus protease inhibitor, 209 pancreatic secretory trypsin inhibitor, 80 renin, 13 T-cell leukemia virus (HTLV-1), 3 T-cell proteinase, 211 Hydrated amide, 27 Hydrogen bonds, 11 Hydrophobic signal peptide, 220 hydroquinones, 32 Hydroxyethyl hydrazine, 28 urea core, 28 Hydroxyethylamine, 28 Hydroxyethylamine-based compound, 27 Hydroxyethylene analogs, 28 peptide bond isostere, 132 Hydroxymethyl carbonyl analog, 28 Hyperlipidemia, 42 Hyperosmotic potential, 170 Hypnozoite state, 167 Hypovirulence-associated dsRNA virus, 238

Hypoviruses, 249 Hypoxanthine, 174

I IC50 values, 132, 172-4, 181, 183, 225 Icosahedral capsid, 96 ICP35, 96 Imidazolones, 110, 111 Immature morphology, 8 Immune response, 196 Immunodominant antigen, 196 electron microscopy, 196, 206, 225 globins, 209 histochemical localization, 206 reactive fragments, 213 suppressive therapy, 118 IN, 6 Inclusion bodies, amorphous, 238 Indinavir, 5, 28, 38, 39, 40, 42 Infection with Sendai virus, 207 Infectious diseases, 219 tropism, 205, 206, 212 virus, 97, 139 Infective metacyclic trypomastigote, 191 Infectivity, 205 Infectosome, 141 Influenza virus, 206 infection, 214 Inhibition of falcipain, 172 of infection, 209 of P450 enzymes, 30 of signal peptidase, 228 Inhibitors, 108, 154, 181, 197 aprotinin, 209, 214, 226 binding subsites, 37 design, 11, 42 di-isopropyl fluorophosphate, 197 iodoacetamide, 66, 154, 226 irreversible, 35, 172 leupeptin, 169, 170, 172, 193, 209, 226 N-ethylmaleimide, 66, 154, 221, 226 pepstatin, 120, 122, 169, 175, 178, 180, 183 Saquinavir, 5, 8, 30, 31, 38, 41, 42 of plasmepsin I, 182 of folate synthesis, 166 of HIV PR, 2


of human renin, 13 of NS3 protease, 80 of proteases, 226 of tryptase TL2, 214 secreted aspartic proteinases, 131 Inhibitory activity, 228 Insect vector, triatomine, 189 Insecticide-impregnated bednets, 166 Insecticides, 165 Insulin resistance, 42 Integrase (IN), 5 Integration of proviral DNA, 5 Intermediate filaments, 122 Intermediate in amide hydrolysis, 27 Internal (I) site, 99 Internal cleavage, NIa, 241 Internally quenched, 76 Internet resources, 133 Intestinal infections, 157 Intestinal wall, invasion, 120 Intracellular processing proteinases, 206 replicative form, 191 Intraerythrocytic cycle, 175 Intrahepatic phase, malaria, 167 Intramolecular cleavage, 148, 243 complex formation, 70 NS23-NS4A site, 70 reaction, 68 site, 69 Intraperitoneal, 183 Intravascular catheters, 118 Intraerythrocytic life cycle, 176 Invasion process, 120 Iodoacetamide, 66, 154, 226 Ipomoviruses, 235 Irreversible inhibitors, 35, 172 Isatins (2,3-dioxindoles), 154 Isoelectric focusing, 183, 194 point, 194 Isostere, 27 Itraconazole, 119

K Kallikrein, 193 Kaposi's sarcoma, 95 -associated herpesvirus, 93

kcat, 179 Keratitis, 94 Ketoconazole, 190 Ketones, 110 Ki, 28, 31, 33, 34, 39, 122, 180-184 Kidney stones, 42 Kinetic analysis, 221 parameters, 122, 180 Kinetoplast, 190, 191 Kininogen, 193 Km, 179 KNI-272, 12, 39 Kunitz type II protease, 212 Kunitz-type soybean trypsin inhibitor, 209

L L proteinase, 156 L-700417, 21 L90M mutant, 39 Labialis, 94 Lactacystin, 198 Lactams, 154 Laser desorption ionization mass spectrometry, 175 Late golgi, 206 LB-71350, 20, 35 Leader proteinase, BYV, 250 Leech-derived tryptase inhibitors, 214 Leishmania mexicana, 192, 193 spp., 198 Lentiviruses, 2, 3, 7 Leu-AMC substrate, 183 Leukemia, 118 Leupeptin, 169, 170, 172, 193, 209, 226 Luteovirus, 234 LexA repressor, 221 ligand binding, 112 Lipid bilayer, 220, 228 Lipodystrophy, 42 Liver cirrhosis, 62 L-trans-epoxysuccinyl-leucylamido-(4-guanidino)-butane, (E-64), 169, 226 Lung mast cells, 206 Luteovirus, 253, 254 Lymphadenopathy-associated virus, 3 Lymphoreticular cells, 93

Lymphotropic, 96 Lys-bradykinin, 193 Lysosomal, 192 localization, 197

M

M. tuberculosis signal peptidase, 224 MA, 6 MA/CA, 7 Macluraviruses, 235 Macromolecular inhibitors, 80 Maize chlorotic dwarf waikavirus, 246 Malaria digestive vacuole, 168 intraerythrocytic life cycle, 176 parasites, 172, 182 proteases, 165-185 proteinase, 176 occurrence, 165 Male and female gametocytes, 167 Marafiviruses, 234, 252 Matrix (MA) protein, 5 Maturation (M) site, 96 cleavage, 139, 139, 143 Mature enzyme, 193 MDL-74695, 17 Mechanism acyl-enzyme intermediate, 149 -based inhibitors, 110 of HIV protease, 8 oxyanion hole, 82, 107, 149, 151, 156 Medial cisternae, 206 Membrane associated serine proteinase, 212 bound enzymes, 225 fusion, 212 localization, proplasmepsins, 177 protein, 220, 225 Meningoencephalitis, 190 Merozoite stage, 167 Metabolic stability, 28 Metacyclic trypomastigotes, 192 Metacyclics, 196 Metacyclogenesis, 196 Metal binding site, 78 Metalloaminopeptidase, 184 Metallopeptidases, 183 Metalloproteinases, 191, 198, 234


Methylpiperazine, 132 Methyltransferaselike, 250 Microfilaments, 122 Microtubules, 122 Minibody, 80 Minimized antibody, 80 Mitochondrion, 190 MK-639 (L-735,524) Indinavir, 20, 29 Molecular chaperone, 245 dynamics simulations, 9, 39 virology, 3 Monoantennary-type oligosaccharide, 194 Monoclonal antibody, 213 Mononuclear cells, 209 Mononucleosis, 95 Monopartite potyviruses, 235 rymoviruses, 235 Morpholine urea-Phe-Hphe-CH2F, 172 Morphological changes, 167 switch pathway, 121 Morphology, 175 Mosquitoes, Anopheles, 165 MrSMV rymovirus, 238 M-site sequence, 106 Mucous protease inhibitor, 214 Mu-Leu-Hphe-VSPh, 173 Multidrug-resistant strains of HIV, 36, 43 Multiple sclerosis, 140 Mu-Phe-Hphe-CH2F, 173 Mu-Phe-Hphe-VSPh, 173 Murine leukemia virus, 8 malaria, 172 Mutational strategies for resistance, 42 Mycobacterium tuberculosis, 219, 223 Mycoplasma genitalium, 224 Myelitis, 140 Myelopathy disorder, 3 Myalgia, 140 Myristic acid, 6

N N-acetyl-L-leucyl-leucyl-methional, 178 N-acetyl-L-leucyl-L-leucyl-norleucinal, 178 Nasal conchae, 206 National Center for Biotechnology Information, NCBI, 223

NC, 6 NC/p1, 7 N-carbobenzyloxy-L-phenylalanylchloromethyl ketone, 226 Neisseria gonorrhoeae, 219 Nelfinavir, 5, 15, 29, 31 Neonatal herpes, 94 Neoplastic disease, 2 Nepovirus, 242 Nepovirus 24K proteinase, 246 polyprotein, 245, 246 Nepovirus, 234, 244, 246 N-ethylmaleimide, 66, 154, 221, 226 Neutralizing antibodies, 212 Newcastle disease virus, 207 NF-κB, 8 N-glycosylation, 194 N-hydroxymethylbenzyl, 32 NIa, 236 cleavage sites, 240 proteinase, 235, 237, 238, 240, 241 NIb, 239 Nicotiana tabacum, 250 Nifurtimox (Lampit), 190 Nitrobestatin, 183 Noncovalent complex, 72 Non-Hodgkins T-cell lymphoma, 3 nonpeptidic HIV PR inhibitors, 32 Nonprime region, 112 Nonstructural proteins, 144 Northern blot, 171 NS2, 63 NS2-NS3 junction, 65 NS2-NS3 junction, processing, 65 NS2-NS3 protease, 70 NS2-NS3, 67 NS3 protease, 64, 68, 72, 78 secondary structure topology of, 73 substrate specificity of, 75 three-dimensional structure of, 72 protease, cofactor, 70 substrate consensus sequence, 69 NS3-NS4A, 63, 70 complex, 74 heterodimer formation, 71 interaction, 70 NS4A, 64, 70, 71, 78 NS4A-NS4B, 64, 70 NS4B-NS5A, 64, 70 NS5A-NS5B, 64

N-terminal amino acid sequencing, 249, 253 extensions, 220 membrane spanning domain, 224 NTPase activity, 64 Nucleocapsid (NC) protein, 5 Nucleophile, 149 Nucleophilic sulfur atom, 151 Nucleoprotein core, 5 particles, 6 Nucleotide sequence data, 233

O Oat blue dwarf marafivirus, 252 O-glycosylation, 194 Oligopeptidase B, 197, 199 OmpA, 223 o-phenanthroline, 226 Opportunistic infections, 2 Opsonizing, 196 Optimal pH, 99, 171 Oral and esophageal infection, 118 Oral bioavailability, 27, 28, 29 Organ transplant patients, 118 Organomercurial reagents, 193 Orphanovirus, 140 Orthomyxoviruses, 206 Oxalic bis(2-hydroxy-1-naphthylmethylene)hydrazide, 174 Oxidized A and B chains of insulin, 192 Oxyanion hole, 82, 107, 149, 151, 156

P Plasmodium berghei, 183 lophurae, 175 malariae, 182 ovale, 182 vinckei, 173 vivax, 182 P. falciparum, 171, 173, 174, 176, 177, 182-4 food vacuole, 175 P', 76 P1 benzyl group, 31, 35 cleavage site, 237 glutamine, 153 position, 192, 207 cysteine residues in, 75

P1 proteinase, 237 residue, 143 P1', 7, 11, 27, 32, 33, 38, 239, 240 position, 69 P1, 7, 76, 127, 239, 240, 242, 243, 252 P1/P1', 33 p1/p6, 7 P1-P1' residues, 7 P1'P2', 30 P2, 7, 11, 28, 30, 31-33, 193, 207, 239, 240, 243 position, 179, 192 substituents, 30, 31 p2/NC, 7 P2', 7, 31-33, 193, 239 P3, 193, 239, 240, 242 P3', 7 P3/P2, 31 P4, 230, 240, 243 binding pocket, 147 P4', 250 P5, 239, 240 P6, 240 Panstrongylus megistus, 190 Papain, 173, 194 Papain-like, 236 cysteine proteinase, 156, 249, 250, 251 proteinase domain, 238 TYMV proteinase, 253 Paramyxoviruses, 206 Pararetrovirus, 234, 235 Parasite, 168, 173, 175, 178, 184, 190, 193, 197, 198, 224 cytoplasm, 184 food vacuole, 184 life cycle, 199 membrane fractions, 177 metabolism, 174 resistance, 165, 185 surface antigens, processing, 167 Parasitemia, 172, 173, 183, 199 Parasites in culture, 172 Parsnip yellow fleck sequivirus, 246 Pathogen, 224 Pathogenesis, 94 Pathogenic bacteria, 221, 223 Pathogenicity enhancer, broad-range, 238 of enveloped viruses, 205

p-cyanobenzenesulfonamide, 34 PD-153103, 26 Penems, 226 Pentafluoroethylketones, 82 Piperazine, 28 Pepsin, 9, 177 Pepstatin, 120, 122, 169, 175, 178, 180, 183 phenylmethylsulfonyl fluoride, 226 Peptide bond breakage, 9 fluoromethyl ketone, 172 inhibitors, 109 specificity, 154 substrate, conformation, 152 substrates, 106, 147, 172 -based inhibitors, 9, 77, 81 Peptidomimetic, 173 inhibitors, 11 substrate-based approach, 13 Peptidyl chloromethane derivatives, 193 Peptidyl fluoromethane derivatives, 193 Periplasm, 225 Periplasmic localization, 220 surface, 220 Pestivirus p 10, 70 P-glycoprotein efflux system, 30 pH optimum, 101 Phagocytosis, 196 Pharmacodynamics, 36 Pharmacokinetics, 13, 28, 35 Phenanthrenequinone, 81 Phenanthroline, 66 Phenothiazine, 174 Phenylglycine surrogate, 28 Phosphoramidon, 226 Phosphorylation, 194 PI value, 198 Picornalike viruses, 234 Picornaviral proteases, 139-158 2A proteinase, 155 3C proteinase, 142, 146, 147, 153, 235 3C-like proteinase, 157, 236, 243 Picornalike, 233, 234 Picornaviral 3C proteinase, 142, 235 polyprotein, 152 Picornaviridae, 140 Picornavirus 3ABC, 144


3C gene product, 143 3C protein, 239 3C proteinase, 146, 147, 153 3CD, 144 3C-like proteinase, 157, 236, 243 3D and 3AB domains, 154 life cycle of, 140, 141 Piperazine analogs, 28 PKR, 65 Plant aspartic proteinase, 177 Plant pararetrovirus genome, 254 proteinases, 253 Plant virus 3C-like proteinases, 248 genome sequencing, 255 papainlike proteinases, 252 proteases, 233-256 proteinase, 255 -encoded proteinases, 255 Plasma concentration, 27 membrane, 197 protein binding, 27 Plasmepsins, 182, 184 I, 168, 175-180, 182-185 II, 168, 176, 178-180, 182, 184, 185 I and II, 183 Plasmodium, 166 berghei, 182 chabaudi, 183 falciparum, 166, 172 malariae, 166 ovale, 166 vinckei, 172 vivax, 166 PLRV, 254 Plum pox virus, 237 PMSF, 80, 178 Pneumonia, 140 Pneumotropic virus infection, 211 Pneumotropic viruses, 208, 209 PNU-140690, 26, 34 Pockets, 11 Pol, 254 Polio 3C, 153 Poliomyelitis, 157 Poliovirus, 140 Polyacrylamide gel electrophoresis, 207 Polycistronic messenger RNA, 1, 5

Polyclonal antibodies, 213 anticruzipain sera, 197 Polymerase chain reaction, 190 -like, 250 Polyproteins, 234, 237 cleavage sites, 41 folding, 76 precursor, 6, 63 processing, 141, 249 proteolytic processing, 236 substrate recognition, 39 Porcine pepsin, 10 Postherpetic neuralgia, 94, 96 Positive-sense RNA molecule, 5 Posttranslational modifications, 254 Potato carlavirus M (PVM), 251 leafroll luteovirus, 253 Poty- and comoviruses, 245 Potyviral NIa proteinase, 239 polyproteins, 237, 241 proteinases, 235 Potyviridae, 234, 235, 240 Potyvirus, 234, 236, 238-241 HC proteinase, 251 infection, 250 NIa cleavage site, 240 NIa proteinase, 240, 256 polyprotein processing, 239 PPV P3 protein, 241 PPV polyprotein, 241 PR, 6 PR inhibitors, 8 PR specificity, 8 PR/RT, 7 Precursors, 220 polypeptide, 1, 208 proteins, 220 Pre-kallikrein, plasma, 193 prime side, 112 subsites, 75 Primordial aspartic proteinase gene, 9 Processing activity of signal peptidase, 227 enzymes, 2 in cis, 145 in trans, 145

Processing of gag and gag-pol polyproteins, 5 protease, 205 sites for HIV-1 protease, 7 Proenzyme, 171 Profalcipain gene, 173 Proguanil, 166 Proline mimetics, 28 Prolonged neutropenia, 118 Prolyl oligopeptidase family, 197 Propeptide, 193, 195 Proplasmepsins, 177 genes, 176 I, 176-178 II, 176-178, 182 processing proteinase, 185 Protease (PR), 5 and helicase domains, 72 biochemistry, 5 inhibitors, 7, 122, 184, 209 substrates, 122 Protein Databank, 135 Protein precursors, proteolytic processing, 234 secretion pathway, 228 Proteinases, 250 domain, 236, 253, 254 gene, 243 inhibitors, 178 Protein-protein interaction, 68 Proteolytic activities, 191, 237 cleavage, 205, 213, 249 efficiency, 245 enzyme, 224 modification, 212 processing site, 253 processing, 6, 107, 233, 234, 242 of polyproteins, 235 of viral polyprotein, 139 Proteasome, 191, 198 Proviral DNA, 2, 5, 8 Provirion, 141 Pulmonary surfactant, 209 Putative proteinase, 234 PYFV sequivirus, 246, 247 Pyrimethamine, 166 Pyrimidinone analogs, 32 Pyrogens, 167 Pyrone analogs, 33, 34

Q Quinoline-carbonyl-asparagine, 31

R Ro40-4388, 184 Rabbit reticulocyte lysate system, 237 Rat lung, 206 Rational approach to drug design, 228 Receptors, 205 Recognition sequence, 105 Recombinant falcipain, 171 HIV-1 SF2 gp120, 213 plasmepsin I and II, 178-180, 182 protein, 227 Red blood cells, 182 Red clover mottle comovirus, 245 Reduced peptide bond inhibitor, 109 Regulation of processing pathway, 241 Release (R) site, 96 Release from host cells, 167 Renin, 28, 177 Replicase-like, 236 Replication cycles, 141, 209 steady-state rate of replication, 36 Replicative intermediate, 141 Reservosome, 192 Resistance, 176, 183 pathways, 41 mutants, 13 Respiratory tract, 206 Retroviral animal, 3, 253-255 aspartic proteinase, 13 proteases, 9 replication, 8 Retroviridae, 2 Reverse transcriptase (RT), 2, 5, 254 activity, 255 Reverse transcription, 4 Rhinovirus, 140 3C, 154 Rhodnius prolixus, 190 Ribavirin, 62 Ribosomal frameshifting, 6 Rice tungro bacilliform virus, 254 spherical waikavirus, 246 Ring stage, 167, 176

Ritonavir, 5, 18, 30, 38, 42 -resistant mutants, 34 RNA binding activity, 235 binding site, 142, 148 genome, single-stranded, 63 helicases, 64, 142 molecules, 81, 242 positive polarity, 233 replicase, 239, 253 replicative complex, 142 -dependent DNA polymerase, 2 -dependent RNA polymerase, 142 -helicase activity, 64 RNase H, 7 activity, 255 domain, 254 Ro31-8959 Saquinavir, 15 Ro40-4388, 181 Ro40-4476, 184 Ro40-5576, 181 Root-mean-square deviations, 124 Roseola infantum, 95 Roseola, 96 Rough endoplasmic reticulum, 206 Rous sarcoma virus (RSV) protease, 9 RT, 6 (internal), 7 RT/IN, 7 RTBV, 247 proteinase, 255 RTSV, 247 Rubiviruses, 238 Rymovirus, 234

S S. aureus signal peptidase, 226 S1', 27, 32-34, 37, 125, 126, 196 pocket, 34, 129 S1, 31, 32, 37, 125, 126, 127 pocket, 125 histidine, 153 S1/S1', 33, 38 S1-S2', 34 S2, 32, 34, 37, 125, 126, 196 and S3 subsites, 174 subsite pocket, 129, 174, 178, 180, 196 S2', 32-34, 37, 125, 126, 196 pocket, 125, 129 S2'S3 pocket, 124

S3, 31, 37 subsite pocket, 129, 174, 178, 180, 196 S3', 37 S3/S3', 38 S3A, 126 S3B, 126 S4, 126 subsite, 128, 153 Salmonella typhimurium, 221, 223, 225 Saquinavir, 5, 8, 30, 31, 38, 41, 42 -resistant, 39 SB-204,144, 16 SB-206343, 22 SBMV, 254 SC-50083, 181, 183 SC-52151, 15 Schizonts, 167 Scissile peptide bond, 7, 147, 221 S-configuration, 132 Screening protocols, 228 SDS-PAGE, 207 Secreted aspartic proteinases, 118, 119, 122 Secretory epithelial cells, 206 glands, 93 granules, 206 leukoprotease inhibitor, 209 signal peptides, 220 Sendai virus, 206, 207, 210 Sensory ganglia, 93 Septicemia, 118 Sequence amino acid sequence alignment, 221 alignment, type I signal peptidases, 223 alignments, 155, 243, 248, 250-242 comparison, cruzipains, 195 homology, 193 identity, 124 Sequiviridae, 234, 246 Sequiviridae 3C-like proteinase, 246 Sequivirus, 234, 246, 247 Ser-His catalytic dyad, 112 -His triad, 103, 106 Serine protease domain, 68 inhibitor, 110 Serine proteinases, 96, 155, 191, 197, 221, 234, 237, 253 Serinelike proteinase, 234 Serine-type, 236

Ser-Lys dyad, 221 Serum-induced hyphal formation, 121 Seven-stranded orthogonally packed β-barrel, 102 Shingles, 94, 96 Signal peptidase, 228 assay, 228 gene, 220 substrate, 221, 223 Signal peptide, 193, 195, 220 transduction event, 121 Signature motif, 126 Site-directed mutagenesis, 221, 237, 239, 240, 242, 243, 250, 251, 254 SIV, 3 Sobemolike, 234, 235, 253 Sobemovirus genome, 253 Sobemovirus, 234, 253, 254 Sodium citrate, 101 Sodium dodecyl sulfate, 207 Specific substrates, 152 Specificity constants (kcat/Km), 179 pockets, 152 ridge, 124, 126 -determining residues, 37 Spirocyclopropyl oxazolones, 110, 111 SPMMV ipomovirus, 238, 240 NIa proteinase, 240 Sporozoite stage, 167 Staphylococcus aureus, 120, 219, 221, 223-225 nuclease A, 223 Steady-state rate of replication, 36 Stefins, 193 Stereochemistry, 227 Streptococcus pneumoniae, 219, 221, 223 Structure-based drug design, 11 Structural homology, 9 polyprotein Pr55 gag, 6 protein processing, 144 proteins, 63, 142, 144 proteins, virion, 6 variation, 127 Structure of HIV protease, 8 -activity relationship, 82, 13' -aided drug design, 117, 123

-based approaches, 13 -based design, 30 -based inhibitors, 13, 29 Subcutaneous, 172, 183 Subendothelial extracellular matrix, 120 Subepithelial tissues, 206 Subsites, 11, 123 Substrate binding pocket, 246 binding residues, 236 binding, 224, 245 mimetics, 13 P1 amino acid, 248 peptidomimetic inhibitors, 13 sequence, 7 specificity, 40, 113, 178, 193, 196, 197, 241 -based inhibitors, 40 -binding pocket, 239, 241, 243 for signal peptidases, 221 Subtilisinlike, 99 Superposition, 102, 107 Surface loops, 112 SV-652, 24 Sweetpotato mild mottle ipomovirus, 237 Symmetry-based inhibitors, 29 Synergistic action, 183 effects, 38 Synthetic inhibitor, 194 substrates, 8

T T. brucei, 198 T. cruzi, 197, 198 Target-based screening, 29 Targets for antiviral drug design, 3 t-butyl benzamides, 30 t-butyl cyclohexylcarboxamides, 30 T-cell leukemia, 3 proliferation, 121 Tegument, 96 Temporal hierarchy, processing, 68 Tetrahedral intermediate, 149 Tetrahydrothiophene, 28 TEV, 236 site, 241


NIa, 241 proteinase, 242 VPg-Pro, 242 TF/PR, 7 Thiepane, 33 Therapeutic targets, 2 Thiazolidine, 81 thienoxazinones, 110, 111 Thiepane dioxides, 33 Thiolate-imidazolium ion pair, 150, 151 Three-dimensional structure, 124, 146 Thrombin, 212, 213 Thrush, 118 TLCK, 193 Tobacco etch potyvirus, 237, 238 Tomato black ring nepovirus, 245 Tomato ringspot nepovirus, 246 Tosylamido-2-phenylethyl chloromethyl ketone, 226 Trans cleavage site, 69 Transcription of proviral DNA, 5 Transcription, 235 Transition-state analog, 27 Translation, 4, 141, 235 initiation site, 244 strategy, 6 Translocations of proteins, 220 Transmission through blood transfusions, 189 Triatoma infestans, 190 Triatomine insect vector, 189, 190 Trichovirus, 234, 252 Trifluoromethylketones, 82 Triterpene sulfates, 154 Triton X-114, 197, 198 Trophozoite, 183 cysteine proteinase, 170, 171 extract, 174 stages, 171, 176

Trypanosoma brucei, 192, 193 cruzi, 174, 189, 190 life cycle, 191 proteases, 189-199 rangeli, 192, 193 Trypanosomatids, 193, 198 Trypanosomiasis, 189 Trypomastigote, 191, 192, 196, 197 Trypsin- and chymotrypsinlike activities, 213 endoproteinase, 212

Trypsin, 206 cleavage, 207 -like serine protease, 68, 209, 237, 239 -type serine proteinase, 206 Tryptase Clara, 206, 207-209, 214 TL2, 211, 212, 213 Tryptophan fluorescence spectrum, 73 Turnip mosaic potyvirus, 242 yellow mosaic virus (TYMV), 251 Turnover rate, 99 Two-fold crystallographic axis, 104 Two-proton transfer, 106 Tymolike, 250 cysteine proteinase, 252 proteinase, 251, 253 Tymovirus, 234, 252 proteinase, 251 TYMV proteinase, 251 Type I signal peptidase, 224 Type II integral membrane protein, 177

U U-103017, 26, 34 U-96988, 25, 34 Upper respiratory tract infections, 157

V V3 loop, 212, 214 peptides, 212, 213 V82A mutation, 37 Vaginal candidiasis, 120 Vaginitis, 118 Van der Waals interactions, 38 Variable domain, 212 Variant protease, 41 Varicella-zoster virus, 93, 94, 96 Vector, mosquito, 166 Vimentin, 122 Vinyl sulfone, 173 Viral 3C-like proteinases, 239 activation, proteolytic, 207 capsid protein, 63 capsids, 144 envelope glycoproteins, 63 enzymes, 5

Viral fusion, 205 integrase, 5 life cycle, 5 papainlike cysteine proteinase, 252 polysome, 141 PR, 6 protease as drug design targets, 5 protease, 1, 5 proteinase, 239, 242, 243, 256 resistance, 65 RNA-dependent RNA polymerase, 65 Virion assembly, 141 morphology, 233 -associated cysteine proteinase, 255 -associated reverse transcriptase, 5 Virulence factors, 118, 119 Virulent pantropic viruses, 205 Virus assembly, 2 and budding, 6 infectivity, 207, 241 -encoded protease, 2 -encoded proteinase, 233, 234, 235 infecting plants, 233 Vitality, 41 Vomovirus proteinase, 246 VP4, 141 VPg, 236, 239, 241 3B gene product, 142 -proteinase-replicase segment, 243 VX-478 (Amprenavir), 15, 28, 29, 30

VZV
  and CMV protease, 102
  protease, 112

W

Waikavirus, 234, 246
Warfarin, 34
Western blot, 213
Whitefly transmitted SPMMV ipomovirus, 238

X

XC-52151, 28
X-ray crystal structure, 74, 78, 177, 185, 194, 225, 226
  analysis, 35
X-ray diffraction analysis, 225
XV-652, 32

Z

Zinc
  ion, 67, 155
  ligands, 66
  -dependent metalloprotease, 66
Z-Phe-Ala-fluoromethane, 194
Z-Phe-Arg-AMC, 170
Z-Phe-Arg-CH2F, 172
Z-Phe-Arg-fluoromethylketone, 199
Zymogens, 176, 177, 182


CHAPTER 2, FIGURE 5 Three dimensional structure of the NS3 protease domain (green) in complex with an NS4A cofactor peptide (red). The residues of the catalytic triad are shown. The structural zinc ion is in magenta. The termini of the NS4A peptide are indicated by C and N.

CHAPTER 3, FIGURE 6 The structures of the CMV (top) and VZV (bottom) protease dimers. (Red) catalytic residues; N- and C-termini are labeled.

CHAPTER 3, FIGURE 3 Ribbon diagrams of CMV protease, VZV protease, chymotrypsin and subtilisin. (Light blue) helices and loops, (yellow) strands, and (red) catalytic residues. (Dark blue) the AA loop observed in the VZV protease structure. The diagrams are drawn using MOLSCRIPT (Kraulis, 1991).

CHAPTER 3, FIGURE 9 Molecular surface of the VZV protease looking into the postulated substrate binding groove. The surface is color coded by electrostatic potentials (blue for positive and red for negative) and calculated with the program GRASP (Nicholls and Honig, 1991). The M-site substrate is modeled. The arrow indicates the scissile bond position.

CHAPTER 4, FIGURE 2 Ribbon diagram of the structure of the Candida albicans secreted aspartic proteinase bound to inhibitor A-70450 (Abad-Zapatero et al., 1996). The protein backbone of SAP2X is shown as a ribbon color coded according to the position along the polypeptide chain, which allows easier viewing of the backbone. The C- and N-termini are marked as C and N, respectively. The protein chain begins with red color and passes through orange, yellow, green, blue-green, light-blue, and, finally, dark blue. The traditional aspartic proteinase active site flap (the 85-loop) corresponds to yellow and the broader unique Candida "second active site flap" (insertion at the 45-loop) is displayed in orange. The deletion of helix hN2 in pepsin corresponds to the light green region next to the two flaps. The C-terminal extension is in dark blue.

CHAPTER 4, FIGURE 3 View of the overall electrostatic potential surface of (A) SAP3 and (B) SAP6. The total charge of SAP3 is -21 and that of SAP6 is +2. Note that the predominantly red surface of SAP3 has been changed to a blue-peppered surface in SAP6. The view is from the P2' site toward the interior of the protein.

CHAPTER 4, FIGURE 4 Electrostatic potential surfaces of SAP1-6 (A-E respectively). The orientation shows the inhibitor in the orientation of Fig. 1 partly covered by the 85-loop. On the left of the 85-loop, the P3b piperazine, P3a benzyl, and the P2 nor-Leu are visible. On the right of the 85-loop, the P1' Val isopropyl group and the P2' butyl group are visible. Note that in SAP3, the P1' Val is obscured by Tyr303 as described in the text. The blue and red refer to positively and negatively charged regions within the protein. While the large amount of red surface is reflective of the significantly negatively charged character of the active sites of SAP1-6, there is an increasing amount of positive character in SAP4-6 relative to SAP1-3, as indicated by the additional blue regions for SAP4-6.

CHAPTER 4, TABLE I

Comparison of the Different SAP1-6 Structures

STRUCTURAL ALIGNMENT OF SAPS FROM Candida

      1                                                       50
                                           *
SAP1  QAIPVTLNNE  LVSYAADITI  GSNKQKFNVI  VDTGSSDLWV  PDASVTCDKP
SAP2  QAVPVTLHNE  QVTYAADITV  GSNNQKLNVI  VDTGSSDLWV  PDVNVDCQVT
SAP3  QTVPVKLINE  QVSYASDITV  GSNKQKLTVV  IDTGSSDLWV  PDSQVSCQAG
SAP4  GPVAVKLDNE  IITYSADITI  GSNNQKLSVI  VDTGSSDLWV  PDSNAVCIPK
SAP5  GPVAVTLHNE  AITYTADITV  GSDNQKL~/I  VDTGSSDLWI  PDSNVICIPK
SAP6  GPVAVKLDNE  IITYSADITV  GSNNQKLSVI  VDTGSSDLWI  PDSKAICIPK
SAPT  SDVPTTLINE  GPSYAADIVV  GSNQQKQTVV  IDTGSSDLWV  VDTDAECQVT
CONS    V V L NE  !!%Y aDItv  GSn@QKl Vi  vDTGSSDLWv  pD% ! C

      51                                                     100
SAP1  RPGQSADFCK  GKGIYTPKSS  TTSQNLGSPF  YIGYGDGSSS  QGTLYKDTVG
SAP2  YSDQTADFCK  QKGTYDPSGS  SASQDLNTPF  KIGYGDGSSS  QGTLYKDTVG
SAP3  .QGQDPNFCK  NEGTYSPSSS  SSSQNLNSPF  SIEYGDGTTS  QGTWYKDTIG
SAP4  WPGDRGDFCK  NNGSYSPAAS  STSKNLNTPF  EIKYADGSVA  QGNLYQDTVG
SAP5  WRGDKGDFCK  SAGSYSPASS  RTSQNLNTRF  DIKYGDGSYA  KGKLYKDTVG
SAP6  WRGDCGDFCK  NNGSYSPAAS  STSKNLNTRF  EIKYADGSYA  KGNLYQDTVG
SAPT  YSGQTNNFCK  QEGTFDPSSS  SSAQNLNQDF  SIEYGDLTSS  QGSFYKDTVG
CONS  # g  !dFCK  @ G%y%P  S  s%sqnLn% F   I YgDgs    qG lYkDTvG

      101                                                    149
SAP1  FGGASITKQV  FADITKTSIP  QGILGIGYKT  NEA.AGDYDN  VPVTLKNQGV
SAP2  FGGVSIKNQV  LADVDSTSID  QGILGVGYKT  NEA.GGSYDN  VPVTLKKQGV
SAP3  FGGISITKQQ  FADVTSTSVD  QGILGIGYKT  HEA.EGNYDN  VPVTLKNQGI
SAP4  IGGVSVRDQL  FANVRSTSAH  KGILGIGFQS  NEATRTPYDN  LPITLKKQGI
SAP5  IGGVSVRDQL  FANVWSTSAR  KGILGIGFQS  GEATEFDYDN  LPISLRNQGI
SAP6  IGGASVKNQL  FANVWSTSAH  KGILGIGFQT  NEATRTPYDN  LPISLKKQGI
SAPT  FGGISIKNQQ  FADVTTTSVD  QGIMGIGFTA  DEAGYNLYDN  VPVTLKKQGI
CONS   GG!S!$ Q!  fA v sTS!    GIiGiG# %   EA    YDN  !P!tLk Qgi

      150                                                    199
SAP1  IAKNAYSLYL  NSPNAATGQI  IFGGVDKAKY  SGSLIAVPVT  SDRELRITLN
SAP2  IAKNAYSLYL  NSPDAATGQI  IFGGVDNAKY  SGSLIALPVT  SDRELRISLG
SAP3  ISKNAYSLYL  NSRQATSGQI  IFGGVDNAKY  SGTLIALPVT  SDNELRIHLN
SAP4  ISKNAYSLFL  NSPEASSGQI  IFGGIDKAKY  SGSLVDLPIT  SDRTLSVGLR
SAP5  IGKAAYSLYL  NSAEASTGQI  IFGGIDKAKY  SGSLVDLPIT  SEKKLTVGLR
SAP6  IAKNAYSLFL  NSPEASSGQI  IFGGIDKAKY  SGSLVELPIT  SDRTLSVGLR
SAPT  INKNAYSLYL  NSEDASTGKI  IFGGVDNAKY  TGTLTALPVT  SSVELRVHLG
CONS  I KnAYSLyL  NS -A%%GqI  iFGG!D AKY  sGsL! lP!T  Sd$ L ! L

      200                                                    249
                          *
SAP1  SLKAVGKNIN  G.NIDVLLDS  GTTITYLQQD  VAQDIIDAFQ  AELKLDGQGH
SAP2  SVEVSGKTIN  TDNVDVLLDS  GTTITYLQQD  LADQIIKAFN  GKLTQDSNGN
SAP3  TVKVAGQSIN  A.DVDVLLDS  GTTITYLQQG  VADQVISAFN  GQETYDANGN
SAP4  SVNVMGQNVN  V.NAGVLLDS  GTTISYFTPN  IARSIIYALG  GQVHYDSSGN
SAP5  SVNVRGRNVD  A.NT~!LLDS  GTTISYFTRS  IVRNILYAIG  AQMKFDSAGN
SAP6  SVNVMGRNVN  V.NAGVLLDS  GTTISYFTPS  IARSIIYALG  GQVHFDSAGN
SAPT  SINFDGTSVS  T.NADVVLDS  GTTITYFSQS  TADKFARIVG  A..TWDSRNE
CONS  sv v G  in    n! VlLDS  GTTI%Y      !a  ii a      ! #Ds gn

      250                                                    299
SAP1  TFYVTDCQTS  GTVDFNFDNN  AKISVPASEF  TAPLSYANGQ  PYPKCQLLLG
SAP2  SFYEVDCNLS  GDVVFNFSKN  AKISVPASEF  AASLQGDDGQ  PYDKCQLLFD
SAP3  LFYLVDCNLS  GSVDFAFDKN  AKISVPASEF  TAPLYTEDGQ  VYDQCQLLFG
SAP4  EAYVADCKTS  GTVDFQFDRN  LKISVPASEF  LYQLYYTNGE  PYPKCEIRVR
SAP5  KVYVADCKTS  GTIDFQFGNN  LKISVPVSEF  LFQTYYTSGK  PFPKCEVRIR
SAP6  KAYVADCKTS  GTVDFQFDKN  LKISVPASEF  LYQLYYTNGK  PYPKCEIRVR
SAPT  IYRLPSCDLS  GDAVFNFDQG  VKITVPLSEL  ILKDSDSS..  ...ICYFGIS
CONS    y!!dC  S  G%vdF@Fd n  !KIsVPaSEf  !           py kC ! !

      300                                             342
SAP1  ISDANILGDN  FLRSAYLVYD  LDDDKISLAQ  VKYTSASNIA  ALT  56%
SAP2  VNDANILGDN  FLRSAYIVYD  LDDNEISLAQ  VKYTSASSIS  ALT  63%
SAP3  TSDYNILGDN  FLRSAYIVYD  LDDNEISLAQ  VKYTTASNIA  ALT  62%
SAP4  ESEDNILGDN  FMRSAYIVYD  LDDRKISMAQ  VKYTSQSNIV  GIN  50%
SAP5  ESEDNILGDN  FLRSAYVVYN  LDDKKISMAP  VKYTSESDIV  AIN  50%
SAP6  ESEDNILGDN  FMRSAYIVYD  LDDKKISMAQ  VKYTSESNIV  AIN  51%
SAPT  RNDANILGDN  FLRRAYIVYD  LDDKTISLAQ  VKYTSSSDIS  AL.  -
CONS   s- NILGDN  FlRsAYiVYd  LDD  IS Aq  VKYTs S I!  a!

Abbreviations are defined in the text. The numbering system is based on SAP2 (residues 1-342). Note that the alignment in Fig. 2 (Abad-Zapatero et al., 1996) for residues 134-141 was in error. The percentages listed indicate percent identity with SAPT. The two active site aspartate residues are denoted by an asterisk. CONS indicates the consensus sequence with the following symbols: -, acidic; !, hydrophobic; @, amido; #, aromatic; $, basic; %, hydroxyl-containing. Uppercase denotes strict identity; lowercase indicates at least 70% conservation.
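The legend's two quantitative conventions, percent identity against a reference sequence and the uppercase/lowercase consensus rule, can be illustrated with a minimal Python sketch. The function names, the treatment of "." as a gap, and the choice to skip gapped columns when counting identity are assumptions made for illustration; the source does not state how the printed percentages were computed. The residue-class symbols (-, !, @, #, $, %) are not implemented here.

```python
from collections import Counter

GAP = "."  # assumption: "." marks an alignment gap, as in the table rows


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned, gap-free columns in which a and b carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def consensus_char(column: str, threshold: float = 0.70) -> str:
    """Uppercase for strict identity, lowercase for >= threshold conservation, else blank."""
    residue, count = Counter(column).most_common(1)[0]
    if residue == GAP:
        return " "
    if count == len(column):
        return residue.upper()
    if count / len(column) >= threshold:
        return residue.lower()
    return " "


# First ten aligned residues of SAP1 and SAPT, as listed in the table
print(percent_identity("QAIPVTLNNE", "SDVPTTLINE"))  # 50.0 for this segment only
# A column in which all seven sequences carry Asn
print(consensus_char("NNNNNNN"))  # N
```

The 50% figure applies only to the ten-residue segment used in the example, not to the full-length comparison reported in the table.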

CHAPTER 5, FIGURE 2 Ribbon representation of the three-dimensional structure of the 3C gene product from hepatitis A virus in stereo. (a) The view is into the proteolytic active site. Included are the side-chains of the residues that contribute to the proteolytic activity and to the primary, P1, specificity. The N-terminal β-barrel domain is on top (cyan) and the C-terminal domain on the bottom (lilac). The part of the main-chain that forms the oxyanion hole is in black. Several water molecules are represented by royal blue-colored spheres. (b) The same structure rotated approximately 130° to the left to show the RNA binding site of the 3C gene product. The main-chain of the conserved sequence motif KFRDI is in black. Charged side-chains within the putative RNA-binding site are in blue (Lys and Arg) and red (Asp and Glu).

CHAPTER 5, FIGURE 3 Ribbon representation of the three-dimensional structure of the 3C gene product from poliovirus. (a) The proteolytic active site with the residues involved in the proteolytic activity and in the recognition of the P1 glutamine residue. (b) The RNA binding site. The orientation of the molecules and the coloring scheme is similar to that in Fig. 2.

CHAPTER 5, FIGURE 4 The proteolytic active site of the picornaviral 3C proteinases, hepatitis A 3C (a) and polio 3C (b). The view is the same in a and b. The oxyanion hole is on the top right and the S1 specificity pocket below it. The spheres represent water molecules in the structure of HAV 3C. Hydrogen bonds are represented by broken lines.
