Toxicology Testing Handbook
Principles, Applications, and Data Interpretation
edited by
David Jacobson-Kram BioReliance Corporation Rockville, Maryland
Kit A. Keller Toxicology Consultant Washington, D.C.
MARCEL DEKKER, INC.
NEW YORK, BASEL
ISBN: 0-8247-0073-2 This book is printed on acid-free paper.
Headquarters
Marcel Dekker, Inc.
270 Madison Avenue, New York, NY 10016
tel: 212-696-9000; fax: 212-685-4540

Eastern Hemisphere Distribution
Marcel Dekker AG
Hutgasse 4, Postfach 812, CH-4001 Basel, Switzerland
tel: 41-61-261-8482; fax: 41-61-261-8896

World Wide Web
http://www.dekker.com

The publisher offers discounts on this book when ordered in bulk quantities. For more information, write to Special Sales/Professional Marketing at the headquarters address above.

Copyright © 2001 by Marcel Dekker, Inc. All Rights Reserved.

Neither this book nor any part may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage and retrieval system, without permission in writing from the publisher.

Current printing (last digit): 10 9 8 7 6 5 4 3 2 1
PRINTED IN THE UNITED STATES OF AMERICA
This book is dedicated to the memory of Dr. Donald L. Putman, colleague, mentor, and valued friend.
Preface
This text provides practical guidance to persons responsible for developing toxicology data, evaluating results from toxicology studies, and performing risk assessments. It will be particularly useful to those using outside laboratories to perform studies for regulatory submission. Individuals charged with developing a safety profile on a new material may be nontoxicologists, inexperienced entry-level toxicologists, or toxicologists with expertise in a particular subdiscipline but lacking experience in other areas. Individuals with responsibility for assuring safety of new products and materials are found in an array of businesses including the pharmaceutical industry, biotechnology companies, medical device manufacturers, formulators of cosmetics and personal care products, and the chemical, pesticide, and petroleum industries.

This text serves as a guide for proper study design to help ensure regulatory acceptance. It addresses such issues as species selection, dose level and dosing regimen, animal number, routes of exposure, and proper statistical evaluation. Chapters focused on particular subdisciplines examine the purpose of the study, choice of species and the conditions under which the animals are maintained, experimental design, route of exposure, the duration of the study, choice of vehicle, and endpoints evaluated. The final chapters present insights into future directions of this field. These chapters discuss how new techniques in molecular biology, such as use of transgenic animals, will impact the practice of this discipline in the twenty-first century.

The current business environment is dominated by down-sized workforces, consolidated corporate functions, and lean start-ups. Often companies have a single individual responsible for all regulatory compliance issues. If that person has a background in toxicology at all, it is often general and cursory. This text can help lessen the dependence on outside consultants to design product safety studies and facilitate regulatory approval. It can serve as a principal textbook for courses in regulatory affairs and quality assurance and will complement other texts in basic and advanced toxicology courses. Since similarities and differences in regulatory requirements in the United States, Europe, and Japan are an important topic in each chapter, it could serve as a resource to individuals responsible for registering products in overseas markets.
David Jacobson-Kram
Kit A. Keller
Contents

Preface
Contributors

1. Use of Laboratory Animals in Toxicology Studies
   Kit A. Keller
2. Toxicity Associated with Single Chemical Exposures
   Andrew I. Soiefer and Elmer J. Rauckman
3. Multidose Toxicity and Carcinogenicity Studies
   Christopher Banks and Kit A. Keller
4. Metabolism and Toxicokinetics
   J. Caroline English
5. Inhalation Toxicity Studies
   Raymond M. David
6. Genetic Toxicology
   Donald L. Putman, Ramadevi Gudi, Valentine O. Wagner III, Richard H. C. San, and David Jacobson-Kram
7. Developmental and Reproductive Toxicology
   Kit A. Keller
8. Neurotoxicology
   Walter P. Weisenburger
9. Toxicological Assessment of the Immune System
   Gary J. Rosenthal and Dori R. Germolec
10. Toxicological Pathology Assessment
   Lynda L. Lanning
11. Assessment of Laboratories for Good Laboratory Practice Compliance
   Linda J. Frederick
12. Use of Transgenic Animals for the Assessment of Mutation and Cancer
   Robert Young and David Jacobson-Kram
13. Health Risk Assessment of Environmental Agents: Incorporation of Emerging Scientific Information
   Vicki L. Dellarco, William H. Farland, and Jeanette A. Wiltse

Index
Contributors
Christopher Banks, D.A.B.T. Director of Scientific Operations, Toxicology Department, ClinTrials BioResearch, Senneville, Quebec, Canada

Raymond M. David, Ph.D., D.A.B.T. Senior Toxicologist, Toxicological Sciences Department, Eastman Kodak Company, Rochester, New York

Vicki L. Dellarco, Ph.D. Senior Geneticist, Office of Pesticide Programs, U.S. Environmental Protection Agency, Washington, D.C.

J. Caroline English, Ph.D., D.A.B.T. Manager, Biochemical Toxicology, Health and Environment Laboratories, Eastman Kodak Company, Rochester, New York

William H. Farland Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Linda J. Frederick, B.S., RQAP Manager, Research Quality Assurance Department, Pharmaceutical Products Division, Abbott Laboratories, Abbott Park, Illinois

Dori R. Germolec, Ph.D. Environmental Immunology Department, National Institutes of Environmental Health Sciences, Research Triangle Park, North Carolina

Ramadevi Gudi, Ph.D. Laboratory Director, Genetic Toxicology Division, BioReliance Corporation, Rockville, Maryland

David Jacobson-Kram, Ph.D., D.A.B.T. Vice President, Toxicology and Laboratory Animal Health Division, BioReliance Corporation, Rockville, Maryland

Kit A. Keller, Ph.D. Toxicology Consultant, Washington, D.C.

Lynda L. Lanning, D.V.M., D.A.B.T. Director of Veterinary Pathology and Clinical Pathology, Mammalian Toxicology Division, BioReliance Corporation, Rockville, Maryland

Donald L. Putman† Genetic Toxicology Division, BioReliance Corporation, Rockville, Maryland

Elmer J. Rauckman, Ph.D. Consulting Toxicologist, Flemington, New Jersey

Gary J. Rosenthal, Ph.D., D.A.B.T. Vice President, Drug Development, RxKinetix, Inc., Louisville, Colorado

Richard H. C. San, Ph.D. Scientific Director, Genetic Toxicology Division, BioReliance Corporation, Rockville, Maryland

Andrew I. Soiefer, Ph.D., D.A.B.T. North Jersey Toxicology Associates, L.L.C., Succasunna, New Jersey

Valentine O. Wagner III, M.S. Laboratory Director, Genetic Toxicology Division, BioReliance Corporation, Rockville, Maryland

Walter P. Weisenburger, Ph.D. Senior Research Investigator, Drug Safety Evaluation Department, Reproductive and Developmental Toxicology, Pfizer Global Research and Development, Pfizer, Inc., Groton, Connecticut

Jeanette A. Wiltse Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Robert Young Toxicology Division, BioReliance Corporation, Rockville, Maryland

† Deceased.
Use of Laboratory Animals in Toxicology Studies
Kit A. Keller
Toxicology Consultant, Washington, D.C.

I. INTRODUCTION
Basic scientific research in the field of toxicology includes investigations utilizing a wide range of animal species, from fruitfly to fish to sheep. Over the years, in regulatory toxicology, a smaller number of "purpose-bred" species (rat, mouse, guinea pig, rabbit, dog, monkey) have become the generally accepted test models to investigate and extrapolate for human risk assessment (Table 1). Initially, species selection had more to do with availability, cost, and ease of use. As the years progressed and knowledge accumulated, selection factors such as comparative metabolism and pharmacokinetics, available historical control databases, and other species-specific factors also began to play roles in species selection.

The use of animals in research is generally regulated to some extent in Europe, Japan, and the United States, as well as in other countries around the world. In the United States, the primary statutory rules on the use and care of animals are contained in the Federal Animal Welfare Act (Animal Welfare Act, Public Law 89-544, 1966, as amended 91-279 and 99-198; implementing regulations published in the Code of Federal Regulations (CFR), Title 9, Chapter 1, Subchapter A, Parts 1 to 3). The law is administered by the U.S. Department of Agriculture. In general, this law mandates a basic standard for the care of animals used in research. It stresses adequate institutional oversight and veterinary care, including the appropriate use of anesthetics, analgesics, and tranquilizers to reduce pain and distress. A Guide for the Care and Use of Laboratory Animals is published by the U.S. Department of Health and Human Services (NIH Pub. No. 74-23) and a copy can be obtained on NIH's website (http://www.nih.gov).
Table 1 Animal Species Commonly Utilized in Regulatory Toxicology Studies

Study type                        Primary species                  Common alternative species
Acute toxicity                    Rat, mouse
Multidose toxicity                Rodent (rat), nonrodent (dog)    Mouse, monkey
Carcinogenicity                   Rat, mouse
In vivo mutagenicity              Mouse
Development and reproduction      Rat, rabbit                      Mouse, hamster, monkey
Neurotoxicology                   Rat                              Mouse
Immunotoxicology/sensitization    Mouse, guinea pig                Rat
This chapter provides a brief overview of the various commonly used laboratory animals as well as some recommended guidelines for their care and use under current regulations.
II. COMMON SPECIES IN REGULATORY TOXICOLOGY

There is no such thing as the perfect animal species for testing potential human toxins [1]. No single species is predictive for man in all possible circumstances. Each species has its pros and cons, and no one species matches human physiology, organ function, and morphology, or reaction to exogenous chemicals consistently. For example, human skin is more resistant to dermal absorption of compounds than most animal models because of a thickened stratum corneum (the pig is considered the closest model). How chemicals are handled in the body can also vary markedly in each species. For example, rats and dogs are considered relatively efficient biliary excreters in comparison with the guinea pig and monkey. Indeed, extrapolation of toxicology findings from animals to humans in risk assessment requires expertise and a thorough knowledge of each species and the field of toxicology [2-4].

Most of the common laboratory animals used in biomedical research are purpose bred by vendors who supply "disease-free" animals. Mice, rats, guinea pigs, other rodents, and rabbits should be "specific-pathogen free" (SPF). In addition, rats and mice should also be "virus-antibody free" (VAF). This is important because it ensures the general health and quality of the animals and limits factors that could complicate or interfere with the results of a study. It is also important that large animals, such as the beagle dog and monkey, are supplied by reputable vendors. This is especially true of primates, which can harbor a number of pathogens that are harmful to humans. Maintaining such standards in a laboratory facility prevents a constant influx of viruses, bacteria, and parasitic organisms that could threaten the health of the entire facility's animal population and study viability. Many laboratories order additional animals for selected studies to serve as a nontreated sentinel population that is evaluated at scheduled intervals for possible viral infection.

Animals should always be examined for signs of disease when they first arrive, preferably by a veterinarian, prior to accepting any animals from an outside vendor. The animals are always allowed an acclimation period and kept separate from other animals when arriving at a facility. This period can range from a few days to many weeks depending on the species and type of study for which they are to be utilized. The only exception to this can be circumstances when time-mated (pregnant) animals are being used in a reproductive or developmental toxicity study.

Generally, more animals than needed for a study are ordered, which allows for prestudy selection of the most suitable animals for the study. Criteria for selection into the study for rodents, guinea pigs, and rabbits are usually based on acceptable body weight range (i.e., ±2 standard deviations from the mean), normal food consumption, and acceptable health examination. When used in acute testing, the condition of the eyes and skin are also considered. For large animals, one or two pretest measurements of clinical chemistry, hematology, and electrocardiographic and ophthalmological parameters are also factored into the decision. All animals must be "randomized" into the study treatment groups to eliminate bias.

All animal rooms and caging must be cleaned on a regular basis. For Good Laboratory Practice (GLP) regulatory studies there is a required species-specific range of allowable room temperature and humidity that must be maintained and documented. In addition, all animal rooms require a minimum number of air changes within each room, most often 12 to 15 changes each hour.
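The prestudy selection and randomization steps described above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not a validated laboratory procedure: the function names, animal IDs, and group names are hypothetical, the body weight criterion is taken as the mean ±2 sample standard deviations of the animals received, and the allocation is a simple seeded shuffle (many facilities instead use weight-stratified schemes, which are not shown here).

```python
import random
import statistics


def eligible_by_body_weight(weights_g, n_sd=2.0):
    """Return IDs of animals whose body weight is within mean +/- n_sd sample SDs.

    weights_g maps a (hypothetical) animal ID to its pretest body weight in grams.
    """
    values = list(weights_g.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [animal_id for animal_id, w in weights_g.items() if low <= w <= high]


def randomize_to_groups(animal_ids, group_names, per_group, seed=None):
    """Shuffle eligible animals and deal them into equal-sized treatment groups.

    Assumption: simple (unstratified) randomization with a recorded seed so the
    allocation can be reproduced and documented.
    """
    needed = per_group * len(group_names)
    if len(animal_ids) < needed:
        raise ValueError("not enough eligible animals for the requested design")
    rng = random.Random(seed)
    pool = list(animal_ids)
    rng.shuffle(pool)
    chosen = pool[:needed]  # surplus animals remain unassigned spares
    n_groups = len(group_names)
    return {name: sorted(chosen[i::n_groups]) for i, name in enumerate(group_names)}
```

Ordering more animals than the design needs, as the text recommends, is what lets the weight filter exclude outliers while still filling every group.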
A. General Considerations

1. Rodents

Mice and rats are relatively low in cost for both purchasing and housing. In addition, they are readily available in healthy condition and are considered time efficient and relatively docile in handling and dosing with consistent results in toxicology testing. Their many years of use and their genetic stability have ensured that extensive background data are available [5,6]. Both inbred and outbred strains may be used, but the former breeding system offers less genetic variation [7]. Examples of rodent strains that are commonly used in toxicology experimentation are presented in Table 2. Rodents have a short gestation period, high fertility rate, and large litters, making them particularly economical to produce in large numbers for research as well as very useful in studying potential reproductive or developmental toxicity. Their short life spans also make them useful in chronic and lifetime studies (i.e., carcinogenicity).

On the less favorable side, rats and mice have a relatively fast metabolic rate, can be stress sensitive, lack a gallbladder, have no emetic reflex, are able to produce ascorbic acid internally, and have CYP2C as their primary P450 metabolizing enzymes (compared with CYP3A in humans). Rats are also obligate nasal breathers and do not generally make a good model for humans with regard to inhalation studies. Their size can be an advantage or disadvantage depending on what end point one is considering. Mice are particularly favored when test material is in short supply, as they generally require much less material for a comparable study than any other commonly used species (Table 3). However, if one is trying to optimize the volume and number of blood samples for a study, rodents, particularly mice, will have some limitations. Rodents have also been found to be unsuitable test models for specific types of chemicals. For example, they are unsuitable for testing dopamine agonists for potential reproductive toxicity due to their dependence on prolactin in early pregnancy. In general, the monkey and guinea pig are better models than rodents for investigations into the potential toxic effects on gonadotropic and ovarian function.
Table 2 Rodent Strains Frequently Used in Toxicology Studies

Strains            Description
Rat
  Sprague-Dawley   An outbred albino strain, frequently used with a very large background database, but prone to obesity and mammary neoplasms. Propensity for geriatric renal disease may limit the utility of the SD rat for studying nephrotoxic compounds.
  Wistar           An outbred albino strain. Good survival for 2-yr bioassays, but prone to mammary and pituitary neoplasms.
  Fischer 344      An inbred albino strain, small and often used, but prone to leukemia and testicular and pituitary neoplasms.
Mouse
  CD-1             An outbred albino strain, most frequently used in safety studies, but prone to liver neoplasms and amyloidosis.
  C3H              An inbred agouti strain, commonly used in government laboratories, but prone to liver neoplasms.
  C57BL            An inbred black strain.
  BALB/c           An inbred albino strain prone to testicular atrophy.
Table 3 Hypothetical Test Material Requirements: Comparison Among Species

Mouse                         Rat                           Monkey (cyno)                 Dog (beagle)

Assumptions
30 g body wt                  250 g body wt                 2.5 kg body wt                9 kg body wt
10/sex/group                  10/sex/group                  4/sex/group                   4/sex/group
4 groups                      4 groups                      4 groups                      4 groups
25, 250, and 500 mg/kg/day    25, 250, and 500 mg/kg/day    25, 250, and 500 mg/kg/day    25, 250, and 500 mg/kg/day
28 days' treatment            28 days' treatment            28 days' treatment            28 days' treatment
+20%                          +20%                          +20%                          +20%

Test material required
~62 g                         ~520 g                        ~2080 g                       ~7500 g
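The arithmetic behind Table 3 can be sketched as follows. This is an illustrative reconstruction, not the author's stated method: the function name and parameters are hypothetical. Summing body weight × animals per group × dose over the three dosed groups, multiplying by 28 days, and adding the 20% overage gives roughly one quarter of each tabulated figure. The tabulated values are reproduced when that total is also multiplied by the number of groups (4). The table does not state which convention it intends, so the multiplier is exposed as an explicit parameter.

```python
def material_required_g(body_wt_kg, animals_per_group, doses_mg_per_kg_day,
                        days, overage=0.20, multiplier=1):
    """Estimate total test material (grams) for a repeat-dose study.

    body_wt_kg          -- assumed constant body weight per animal
    animals_per_group   -- both sexes combined (e.g., 10/sex/group -> 20)
    doses_mg_per_kg_day -- one entry per dosed group (a control group adds 0)
    overage             -- fractional excess, 0.20 for the table's +20%
    multiplier          -- hypothetical extra factor; 4 reproduces Table 3
    """
    daily_mg = sum(dose * body_wt_kg * animals_per_group
                   for dose in doses_mg_per_kg_day)
    return daily_mg * days * (1 + overage) * multiplier / 1000.0


DOSES = [25, 250, 500]  # mg/kg/day, from the table's assumptions

# (body weight in kg, animals per group) taken from Table 3
SPECIES = {
    "mouse": (0.03, 20),
    "rat": (0.25, 20),
    "monkey": (2.5, 8),
    "dog": (9.0, 8),
}

for name, (bw, n) in SPECIES.items():
    dosed_only = material_required_g(bw, n, DOSES, 28)
    as_tabulated = material_required_g(bw, n, DOSES, 28, multiplier=4)
    print(f"{name}: {dosed_only:.1f} g (dosed groups only), "
          f"{as_tabulated:.1f} g (x4)")
```

Whichever convention is used, the ratio between species is unchanged: under these assumptions the dog study needs about 120 times the material of the mouse study.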
The rat is the most commonly used species for all types of toxicology studies, including acute, multidose, developmental and reproductive, carcinogenicity, and neurotoxicity studies (see Table 1). The mouse is mostly used for acute toxicity, in vivo mutagenicity, and carcinogenicity studies, although it can be used as an alternative to the rat when applicable. There are numerous publications on historical control data for rats and mice, including growth and development patterns and spontaneous disease [8-16]. Additional background data may also be obtained from suppliers.
2. Guinea Pig

The guinea pig, while relatively docile and easy to handle, is generally not as well studied as other laboratory species in toxicology and thus historical control data as well as pharmacokinetic data are often lacking. In addition, they can be very susceptible to disturbance of the alimentary tract by orally administered test materials, intravenous administration can be very difficult, and, historically, guinea pigs have been found to provide poor comparison with human metabolism of many compounds [17,18]. More recently, in view of their hypersensitivity, guinea pigs have been suggested as a second species when testing biopharmaceutical products that consist of proteins or peptides.

Currently, the guinea pig is used almost exclusively for acute sensitization/immunotoxicity studies (see Table 1). They have been used in special cases for reproduction studies since they are similar to humans and other primates in that the placenta takes over the hormonal control of pregnancy very early in gestation and thus is not dependent on active corpora lutea as are rodents and rabbits. However, their long gestation period (65 to 72 days) has generally limited their use in developmental toxicity studies [19].
3. Rabbit

Rabbits are generally docile but not as easy to handle as rodents due to their larger size and easily injured back and limbs [20]. Rabbits are more expensive than rodents and require larger caging. The primary strains of rabbits used in regulatory toxicology are the New Zealand White or the Dutch Belted rabbit. While "clean" rabbit suppliers are generally available, historically pasteurellosis and coccidiosis have been problems (manifested as nasal discharge, diarrhea, and congested lungs, as well as pitted kidneys and brain lesions). In addition, rabbits are very susceptible to disturbances of the gastrointestinal (GI) tract, especially when testing antibacterial agents. Clinical signs and body weight changes in rabbits can be erratic and difficult to interpret.

Rabbits are used exclusively in eye and dermal irritation studies and in developmental toxicity studies, and thus often lack other available toxicity or pharmacokinetic data unless specifically generated in range-finding studies (see Table 1). Although their use is somewhat limited, some background and historical control data are available in the literature [20-23].
4. Dog

There are more than 300 varieties of the domestic dog (Canis familiaris), but the beagle has become the most commonly used in nonclinical safety assessments. Though significantly more expensive than rodent species, toxicology studies with beagle dogs offer an economical, large, and often-required second nonrodent animal species that is docile and easy to work with [24]. The beagle is generally consistent in its genetic profile and extensive background data are available. For inhalation studies, dogs, unlike rats, are not obligate nasal breathers and therefore more closely resemble humans. In addition, aerosol deposition in the alveolar region of the dog lung is closer to humans [25].

Dog studies generally utilize much smaller group sizes than small animal studies and thus are less powerful statistically. In shorter studies, the usual young age of dogs can complicate interpretation of histopathology findings in the reproductive organs as sexual maturity in dogs usually does not occur until at least 9 mo of age.

Dogs are used almost exclusively for general toxicology studies ranging in duration from 2 wk to 1 yr (see Table 1). They can be highly sensitive to a number of chemical classes, including cardiovascular active compounds and nonsteroidal anti-inflammatory agents. Laboratory beagles show a high spontaneous incidence of polyarteritis, which can make it difficult to decide whether such a lesion, if found, represents a potential human hazard or is not clinically relevant. In general, dogs tend to have slower drug clearance than rats, more closely resembling humans. However, as with every species, there are marked exceptions (e.g., dissimilar pharmacokinetics with organic acids). Dogs cannot acetylate primary arylamino groups, which can also result in large differences in pharmacokinetics compared with humans. For example, dogs demonstrate better tolerance to the renal effects of sulphonamides than that seen in humans due to their lack of acetylation ability.
5. Nonhuman Primates

The four most commonly used primate species in preclinical toxicology studies are cynomolgus monkeys, rhesus monkeys, baboons, and marmosets. The cynomolgus monkey is usually the first choice unless the test compound can be more relevantly tested in another species. Baboons and rhesus monkeys are more difficult to obtain in numbers required for preclinical toxicity testing, and marmosets, though offering a smaller model, are less hardy and experience is required in their specialized husbandry requirements [26]. In all cases, they are expensive animals to purchase and house, and only relatively recently have organized breeding programs been instituted. Wild caught animals have always had the potential problem that the interpretation of any toxic effects could be complicated by the presence of parasites, preexisting disease, precapture lesions, and the absence of genetic consistency.

Similar to the dog, the primate would be used almost exclusively for general toxicology studies ranging in duration from 2 wk to 1 yr (see Table 1). Due to the expense and risk of primate studies, such studies are initiated rather than dog studies only when there is a definite need due to such factors as poor pharmacokinetics in the dog, known overt sensitivity or toxicity to certain classes of compounds in dogs, or known antigenicity issues as often seen with biopharmaceuticals. It should be noted that pharmacokinetics in nonhuman primates can differ from humans as much as with other species and preliminary studies should be conducted before proceeding with large primate studies.

Other alternatives do exist, such as the mini-pig or the ferret, but their use is uncommon.
B. Housing and Husbandry

1. Rodents

Single housing of rodents is routine in North America. Gang housing (five rats per cage or four mice per cage) is the norm in Europe. Both husbandry systems have their positive and negative aspects. Gang housing of rodents does not allow individual food consumption values and moribund animals have to be isolated, which introduces variability, or, if left group housed, may result in the loss of tissue samples by cannibalism. For male mice especially, there is the problem of fighting during the first weeks of a study. However, the housing of more than one animal per cage does contribute to animal survival (predominantly on chronic or carcinogenicity studies) and reduces the obesity evident in older rodents, particularly rats. Studies on restricting food consumption (diet "optimization") of individually housed rodents have shown similar benefits in body weight gain, morbidity, and survival. Some form of food restriction is now generally recommended on chronic and carcinogenicity studies as well as the preceding range-finding studies. To date, no single restricted diet regimen has been identified. This can present problems for interpretation of data in relation to historical controls. It should be noted that much of the background (historical control) data collected on rodent studies are specific to the type of husbandry used and, in these respects, are not interchangeable.

Caging for rodents is usually solid or mesh stainless steel. Holes or mesh flooring allow urine and feces to fall through onto a collection tray/mat beneath the cage. This tray/mat must be changed regularly. The cages themselves and entire animal room should be on a regular cleaning schedule as well. The minimum space recommendations for single rodent caging are presented in Table 4. A compromise when designing caging is usually made between allowing easy access and examination of the animals and the prevention of escape or injury. The cages can be suspended from permanent wall mountings or in movable racks/batteries that allow flexibility for room cleaning and periodic repositioning of the animals around the housing room to minimize environmental influences on the experimental results. Top-of-the-line caging systems are totally enclosed with full rack ventilation. In reproduction studies which require litter rearing, a larger, solid-floor cage with bedding for "nesting" material is required.

Table 4 Minimum Space Recommendations for Laboratory Rodents

                               Floor area/animal      Height
Species   Body weight (g)      in²       cm²          in      cm
Mouse     <10                  6         39           5       13
          10-15                8         52           5       13
          15-25                12        78           5       13
          >25                  15        97           5       13
Rat       <100                 17        110          7       18
          100-200              23        149          7       18
          200-300              29        188          7       18
          300-400              40        259          7       18
          400-500              60        388          7       18
          >500                 70        452          7       18

Rodents are extremely hardy and capable of survival in an extremely wide range of temperatures and humidity, but within the toxicology laboratory, a controlled environment is essential to eliminate variables. Mice and rats should be maintained at temperatures of 18 to 26°C (64 to 79°F) and relative humidity of 30 to 70%. Room illumination is usually controlled to 12 hr of light and 12 hr of darkness. Water is supplied ad libitum by either an automatic watering system or individually filled "water bottles." Feeders supplying either certified diet in the form of pellets or powdered meal, ad libitum or controlled amounts, are either attached to the caging or are free standing.
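The banded values of Table 4 lend themselves to a simple lookup, sketched below. The function name and data layout are hypothetical, and the handling of the band edges is an assumption: the table lists adjoining ranges such as 10-15 and 15-25 without saying which band a boundary weight belongs to, so this sketch places a boundary weight in the lower of the two bands, except that the first band is treated as strictly "less than" as printed.

```python
import math

# (upper body weight limit in g, floor area in in^2, floor area in cm^2), per Table 4
RODENT_SPACE = {
    "mouse": {
        "height_in": 5, "height_cm": 13,
        "bands": [(10, 6, 39), (15, 8, 52), (25, 12, 78), (math.inf, 15, 97)],
    },
    "rat": {
        "height_in": 7, "height_cm": 18,
        "bands": [(100, 17, 110), (200, 23, 149), (300, 29, 188),
                  (400, 40, 259), (500, 60, 388), (math.inf, 70, 452)],
    },
}


def min_rodent_space(species, body_wt_g):
    """Return the Table 4 minimum floor area and cage height for one animal."""
    entry = RODENT_SPACE[species]
    bands = entry["bands"]
    if body_wt_g < bands[0][0]:
        band = bands[0]  # e.g., mouse < 10 g, rat < 100 g
    else:
        # Assumption: a weight on a shared boundary falls in the lower band.
        band = next(b for b in bands[1:] if body_wt_g <= b[0])
    return {
        "floor_in2": band[1],
        "floor_cm2": band[2],
        "height_in": entry["height_in"],
        "height_cm": entry["height_cm"],
    }
```

Because animals grow during a study, a check like this would typically be repeated against the expected terminal body weight rather than only the weight at arrival.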
2. Guinea Pig

Guinea pigs grow to just over one kg in weight and have husbandry requirements that are similar to those of rats, though the diet should be supplemented with vitamin C and solid-floored cages with bedding that offer at least 700 cm² floor area and a height of 17.78 cm should be provided. Food and water should be provided ad libitum similar to rodents.
3. Rabbit

Rabbits are also caged in stainless steel cages made of mesh or bars that are close enough together to prevent injury to the animals. The cages are usually on racks and stacked 3 to 4 high. Depending on the strain of rabbit, body weights of sexually mature adults range from 3.0 to 6.0 kg. Minimal space recommendations for rabbits are outlined in Table 5. Rabbits generally prefer a somewhat cooler environment than rodents: the room temperature should be kept within a range of 16 to 22°C (61 to 72°F), relative humidity should be 30 to 70%, and a 12-hr light/dark cycle should be maintained. Rabbits can be maintained on certified dry rabbit chow pellets but generally do better in laboratory settings when their food consumption is controlled to a set daily amount rather than allowing ad libitum feeding [27].

Table 5 Minimum Space Recommendations for Laboratory Rabbits

                       Floor area/animal      Height
Body weight (kg)       ft²       m²           in      cm
<2                     1.5       0.14         14      35.56
Up to 4                3.0       0.27         14      35.56
Up to 5.4              4.0       0.36         14      35.56
>5.4                   ≥5.0      0.45         14      35.56

4. Dog

Most dog cages are stainless steel with a floor area of at least 0.75 m² and a height of 82 cm for dogs up to 15 kg. These cages are usually attached to the wall or mounted in racks in tiers of two. Dogs are usually housed individually, but the exercise requirements of the animals must be considered. Communal areas, usually large joined pens, are essential to allow exercise and social interaction. Males and females should never be exercised together and mixing of dogs from different dose groups is usually avoided. The temperature of the dog room should be kept within a range of 18 to 29°C (64 to 84°F), relative humidity should be 30 to 70%, and a 12-hr light/dark cycle should be maintained.

Dogs are supplied with water on an ad libitum basis, but receive a daily measured ration of diet. For an average weight dog on toxicity studies (8 to 12 kg), 400 g of certified pelleted food is usually provided and a 1-hr feeding period is sufficient after training. A commercially available certified dried feed provides a satisfactory diet, but during the acclimation period or if an animal is in poor condition, this should be supplemented by moistening the pelleted food with water or by adding commercially available nutritional supplements or moist canned diet. Quantitative measurement of food and water consumption is not normally attempted due to the active nature of the species. A more qualitative assessment of food consumption (i.e., all, 3/4, 1/2, 1/4, or none) is usually sufficient. Any diet modification or supplementation should be documented.
5. Nonhuman Primates

Primate caging is similar to that used to house dogs, with some additions. Primates have complex social behavior and are prone to developing abnormal behavior patterns (e.g., stereotyping or self-mutilation) if not cared for properly. When singly housed for toxicology studies, some form of environmental enrichment must be provided in the form of interactive toys and regular entertainment, such as videos. The cages used to house primates should have a screened area to allow animals to break contact with facial displays of other aggressive animals. Primates can be dangerous when handled without adequate training. A perch and a system to move the animal to the front of the cage to allow removal for examinations or study-related procedures without injury to the animal or handler can be a useful addition. Minimal space recommendations for primates are presented in Table 6.

The temperature in the husbandry area of nonhuman primates is usually maintained at 18 to 29°C (64 to 84°F), the humidity at 30 to 70% with a 12-hr light/dark cycle. Water is provided ad libitum and a fixed amount of diet is given each day as well as a fruit supplement. Similar to dogs, quantitative measurement of food and water consumption is not normally attempted, due to their active nature and fruit supplementation. A more qualitative assessment of food consumption (i.e., all, 3/4, 1/2, 1/4, or none) is usually sufficient.
Table 6 Minimum Space Recommendations for Laboratory Primates

                                 Floor area/animal    Height
Species      Body weight (kg)    m²       ft²         cm     in
Marmosets    <1                  0.15     1.6         51     20
Macaques     3-10                0.4      4.3         77     30
Baboons      <25                 0.74     8.0         92     36
C. Animal Identification
Identification of animals ensures the continuity of records and interpretation throughout the study. In the past, it was considered adequate to rely on the cage labeling and only discriminate among animals when gang housed. Today, each animal should be individually identified and, if possible, the specific study identification should also be indicated. Methods of identification should be permanent (this does not include “permanent” ink). Table 7 outlines historical and current methods for identification of the common laboratory species. The use of microchip transponders implanted subcutaneously has greatly enhanced the efficiency and accuracy of animal identification in many animal facilities. Invasive methods, such as toe clipping and ear punching, are no longer used at the majority of facilities due to the pain and stress associated with such methods. Tattooing is also losing favor for many species in which it requires constant shaving of the area. In some cases, animals arrive with a supplier’s identification, which is often different from the identification used in the actual study.
D. Routes of Administration
Laboratory animals can be administered test material by a wide variety of routes. Oral administration can be by gavage with a tube or specialized “gavage needle,” capsule, or diet. Test materials can be injected as an intravenous (IV), intraperitoneal (IP), intramuscular (IM), subcutaneous (SC), or intradermal (ID) syringe injection. Lung as the route of exposure can entail such test systems as whole-body chambers, nose-only exposures, aerosol canisters, or intratracheal instillations. Other miscellaneous routes, such as sublingual, rectal, intravaginal, and topical, are also used in specialized situations. Prior to the commencement of dosing by any route, the animals should be familiarized or habituated to any restraint procedures that may be necessary.
Table 7 Animal Identification Methods

Species       Methods
Mouse         Toe clip (fetus/neonate), ear punch, tattoo, implant
Rat           Toe clip (fetus/neonate), ear punch, tattoo, ear tag, implant
Guinea pig    Ear punch, ear tag, tattoo, implant
Rabbit        Ear tag, tattoo, implant
Dog           Tattoo, collar, implant
Primate       Tattoo, collar, implant
Usually the material being tested is administered in a vehicle such as water, saline, methylcellulose, or corn oil. There are published recommendations for maximum dosing volumes for oral gavage and injection routes for a variety of laboratory animals (Table 8). When volume limits are a problem, systemic exposure can often be increased by increasing the frequency of administration. Common abbreviations include q.d. (once per day), b.i.d. (twice per day), t.i.d. (three times per day), and q.o.d. (every other day). In general, for studies using an injection as the route of administration, the rate should be adjusted to avoid pain and local tissue damage, the temperature of the dosing solution/suspension should be close to the body temperature of the animal, and alternative injection sites should be used sequentially. Ideally, fluids for parenteral administration should be isotonic; however, nonisotonic fluids can be dosed by the intraperitoneal or intravenous routes. A test article with any degree of irritancy is likely to produce local reactions at the dose site and, in some cases, ultimately prevent dosing in the area or by that particular route. An alternative method for administering irritant solutions, or for repeated-dose infusion studies, is to use implanted catheters. Preliminary studies, such as a vein irritation test, muscle irritation test, in vitro hemolysis, or paw-lick test, can anticipate problems. For oral studies, pathology of the GI tract from early range-finding studies will usually identify problems. For studies on environmental chemicals or food additives, the test material is usually mixed into diet or drinking water. Such exposures are commonly expressed in terms of concentration (parts per million, or ppm) or in terms of the actual test material received by the animal (mg/kg/day; milligram per kilogram body weight per day) based on the amount of diet or water consumed.
In order to ensure consistent dosing in growing animals, the investigator is required to predict both increases in body weight and expected food consumption based on historical control records in order to prepare the next week's admixtures.
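The relationship between dietary concentration and achieved dose described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: it assumes that ppm in diet means mg of test material per kg of diet, and the function and parameter names are hypothetical.

```python
def dietary_dose_mg_per_kg_day(conc_ppm, food_g_per_day, body_weight_g):
    """Achieved dose (mg/kg body weight/day) from a dietary concentration.

    Assumption: ppm in diet is taken as mg test material per kg diet.
    """
    food_kg_per_day = food_g_per_day / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    return conc_ppm * food_kg_per_day / body_weight_kg


def dietary_conc_ppm_for_target(target_mg_per_kg_day,
                                predicted_food_g_per_day,
                                predicted_body_weight_g):
    """Dietary concentration (ppm) needed to reach a target dose, using
    predicted food consumption and body weight for the coming week."""
    # The gram units cancel in the ratio, so no conversion to kg is needed.
    return (target_mg_per_kg_day * predicted_body_weight_g
            / predicted_food_g_per_day)


# Example: a 250 g rat eating 20 g/day of a 1000 ppm diet.
dose = dietary_dose_mg_per_kg_day(1000, 20, 250)
ppm = dietary_conc_ppm_for_target(80, 20, 250)
print(round(dose, 2), round(ppm, 1))
```

Because growing animals gain weight faster than they increase food intake, the second function would typically be rerun each week with updated predictions to keep the mg/kg/day dose roughly constant.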
Table 8 Recommended Maximum Dosing Volumes (ml/kg)a
Oral
2-2.5
gavage
Species
10-15
Mouse Rat 5 Guinea pig 2 Rabbit Dog Primate
20
IV
IP
10 0.05b
20 10 10
5
20
20 10
4
2.5
10
0.5
1
2
a Values of British Pharmaceutical Industry [28].
b Represents recommended total ml volume, not ml/kg.
IM 0.1b 0.1b 0.5 0.25 0.25 0.5
SC
ID
20 5 5
0.5 0.5
1 1
0.5 0.5
Table 9 Blood Sample Collection Techniques in Laboratory Animals

Technique-site                     Species                            Volume obtainable (ml)                                       Terminal procedure  Comments
Retro-orbital plexus               Rat, mouse                         0.35 (mouse); 0.35-3.0 (rat)                                 No                  Moderate volumes; relatively “dirty” samples; possible injury to eye; anesthesia should be used
Cardiac puncture                   Rat, mouse                         1 (mouse); 0.7-5 (rat)                                       Yes                 Large volumes; anesthesia should be used
Lateral tail vein                  Rat                                0.35-2                                                       No                  Heating tail aids in sampling but increases collection time; less trauma to animal; repeated collections
Jugular                            Rat, rabbit                        1-2 (rat); 0.35-5 (rabbit)                                   No                  Quick method; repeated collections
Posterior vena cava                Rat, mouse, rabbit, dog, primates  1 (mouse); 0.7-3 (rat); 50-75 (rabbit); >75 (large animals)  Yes                 Quick method; anesthesia should be used
Marginal ear vein                  Rabbit                             0.35-3                                                       No                  Quick method; repeated collections
Vein-saphenous, femoral, cephalic  Dog, primates                      1.5-5                                                        No                  Quick method; repeated collections
E. Biological Sampling
Analysis of urine and blood samples from rodents is an integral part of a toxicology study. Obtaining urine samples using metabolism caging is a straightforward, noninvasive technique that requires the animals to be transferred to special cages in which urine is separated from feces and collected over a specified time period. Due to the low volumes, samples from mice may be pooled. Very “clean” urine samples may also be obtained at necropsy using a syringe if the bladder contains urine. In the larger animals, urine may be collected in live animals or using a catheter.
Blood samples can be obtained by a wide variety of techniques and sites from live animals and at necropsy (Table 9). These procedures all require technical expertise to avoid harm to the animals and obtain adequate samples. Anesthetics should be used in many cases to avoid unnecessary pain and stress. The selection of the most appropriate analgesic or anesthetic is dependent upon the procedure. Commonly used products include ketamine, carbon dioxide/oxygen, sodium pentobarbital, halothane, isoflurane, and sevoflurane [29]. The volume of blood samples taken at any one time should not be harmful to the animal. Multiple sampling from the larger animal species is generally not a problem or confounding factor in data interpretation. However, for smaller species, separate satellite groups are usually added to the study specifically for blood sample collections for pharmacokinetic and/or clinical pathology procedures.
F. Euthanasia
Euthanasia is the “act of killing animals by methods that induce rapid unconsciousness and death without pain or distress.” In the United States, the selected euthanasia method is usually consistent with published recommendations [30], unless justified for scientific or medical reasons. The specific method will depend upon the species involved and the objectives of the study. Both inhalants and noninhalant chemicals, such as barbiturates, fluorinated anesthetics, and carbon dioxide/oxygen, are generally preferable to more historical physical techniques, such as cervical dislocation or decapitation.
REFERENCES
1. S.C. Gad and C.P. Chengelis, Animal Models in Toxicology, Marcel Dekker, NY, 1992.
2. G. Zbinden, The concept of multispecies testing in industrial toxicology. Regul. Toxicol. Pharmacol. 17; 85, 1993.
3. E.J. Calabrese, Suitability of animal models for predictive toxicology: Theoretical and practical considerations. Drug Metab. Rev. 15; 505, 1984.
4. I.W.F. Davidson, J.C. Parker and R.P. Beliles, Biological basis for extrapolation across mammalian species. Regul. Toxicol. Pharmacol. 6; 211, 1986.
5. H.J. Baker, J.R. Lindsey and S.H. Weisbroth, The Laboratory Rat, Vol. 1, Biology and Disease, Academic Press, NY, 1979.
6. H.L. Foster, J.D. Small and J.G. Fox, The Mouse in Biomedical Research, Vol. II, Disease, Academic Press, NY, 1982.
7. M. Festing, Use of genetically heterogeneous rats and mice in toxicological research: A personal perspective. Toxicol. Appl. Pharmacol. 102; 197, 1990.
8. M.A. Attia, Neoplastic and non-neoplastic lesions in the mammary gland, endocrine and genital organs in aging male and female Sprague-Dawley rats. Arch. Toxicol. 70; 461, 1996.
9. L.E. Lillie, N.J. Temple and L.Z. Florence, Reference values for young normal Sprague-Dawley rats: Weight gain, hematology and clinical chemistry. Hum. Exp. Toxicol. 15; 612, 1996.
10. J.K. Haseman, J. Bourbina and S.L. Eustis, Effect of individual housing and other experimental design factors on tumor incidence in B6C3F1 mice. Fund. Appl. Toxicol. 23; 44, 1994.
11. P.L. Lang and W.J. White, Growth, development and survival of the Crl:CD(SD)BR stock and CDF(F344)/CrlBR strain. In: Pathology of the Aging Rat (U. Mohr, D.L. Dungworth and C.C. Capen, Eds.), Vol. 2, ILSI Press, Washington DC, p. 587, 1994.
12. D.N. McMartin, P.S. Sahota, D.E. Gunson, H.H. Hsu and R.H. Spaet, Neoplasms and related proliferative lesions in control Sprague-Dawley rats from carcinogenicity studies. Historical data and diagnostic considerations. Toxicol. Pathol. 20; 212, 1992.
13. M. Chandra and C.H. Frith, Spontaneous neoplasms in aged CD-1 mice. Toxicol. Lett. 61; 67, 1992.
14. K. Maita, M. Hirano, T. Harada et al., Mortality, major cause of morbidity, and spontaneous tumors in CD-1 mice. Toxicol. Pathol. 26; 340, 1988.
15. G.J. Turnbull, P.N. Lee and F.J.C. Roe, Relationship of body-weight gain to longevity and to risk of development of nephropathy and neoplasia in Sprague-Dawley rats. Food Chem. Toxicol. 23; 355, 1985.
16. B.L. Oser, The rat as a model for human toxicological evaluation. J. Toxicol. Environ. Health 8; 521, 1981.
17. J.E. Wagner and P.J. Manning, The Biology of the Guinea Pig, Academic Press, NY, 1976.
18. R.L. Smith and J. Caldwell, Drug metabolism in non-human primates. In: Drug Metabolism From Microbe to Man (D.Z. Parke and R.L. Smith, Eds.), Taylor and Francis, Ltd., London, p. 331, 1977.
19. C.H. Phoenix, Guinea pigs. In: Reproduction and Breeding Techniques for Laboratory Animals (E.S.E. Hafez, Ed.), Academic Press, NY, p. 244, 1970.
20. S.H. Weisbroth, R.E. Flatt and A.L. Kraus, The Biology of the Laboratory Rabbit, Academic Press, NY, 1974.
21. J.P. Gibson, Use of rabbit in teratogenicity studies. Toxicol. Appl. Pharmacol. 9; 398, 1966.
22. Y. Kameyama, T. Tanimura and M. Yasuda, Spontaneous malformations in laboratory animals-photographic atlas and reference data. Rabbit. Cong. Anom. 20; 64, 1980.
23. A. Bortolotti, D. Castelli and M. Bonati, Hematology and serum chemistry values of adult, pregnant and newborn New Zealand rabbits. Lab. Anim. Sci. 39; 437, 1989.
24. C. Parkinson and P. Grasso, The use of the dog in toxicity tests on pharmaceutical compounds. Hum. Exp. Toxicol. 12; 99, 1993.
25. R.B. Schlesinger, Comparative deposition of inhaled aerosols in experimental animals and humans: A review. J. Toxicol. Environ. Health 15; 197, 1985.
26. B.T. Bennett, C.R. Abee and R. Henrickson, Nonhuman Primates in Biomedical Research, Academic Press, NY, 1995.
27. R.L. Clark, J.M. Antonello, J.D. Wenger, K. Deyerle-Brooks and D.M. Duchai, Selection of food allotment for New Zealand White rabbits in developmental toxicity studies. Fund. Appl. Toxicol. 17; 584, 1991.
28. R.M. Hull, Guideline limit volumes for dosing animals in the preclinical stage of safety evaluation. Hum. Exp. Toxicol. 14; 305, 1995.
29. D.F. Kohn, S.K. Wixson, W.J. White and G.J. Benson, Anesthesia and Analgesia in Laboratory Animals, Academic Press, NY, 1997.
30. American Veterinary Medical Association, Report of the AVMA panel on euthanasia. J. Am. Vet. Med. Assoc. 202; 229, 1993.
Toxicity Associated with Single Chemical Exposures
Andrew I. Soiefer North Jersey Toxicology Associates, L.L.C., Succasunna, New Jersey
Elmer J. Rauckman Consulting Toxicologist, Flemington, New Jersey
I. ACUTE TOXICITY STUDIES
Acute exposure may be defined as exposure to a toxicant for a short period, usually less than 24 hr. Toxicity tests designed to explore adverse chemical effects following brief exposures are useful in classifying toxic agents, protecting workers, and safeguarding the community against accidental chemical release.
A. Study Objectives
The general goal of acute toxicity tests is to determine the toxic potential of a test chemical following a single exposure. A single dose of the chemical under study is administered to groups of laboratory animals that are then held and observed for a defined period to assess adverse outcomes of exposure.
B. Routes of Exposure
Animals in the laboratory may be exposed to chemicals in a variety of ways. The route of exposure is usually chosen to reflect conditions under which humans may encounter the chemical under study. Exposure by the oral route is common
and data from acute oral toxicity studies often form the basis by which chemicals are compared with each other. In the workplace, humans are more likely to be exposed to chemicals through skin contact or by inhalation. Therefore, acute toxicity tests by the dermal and inhalation routes of exposure provide the safety information most critical to the occupational environment. For many chemicals, acute toxicity data by the dermal or inhalation routes may not be available. Several possible explanations may account for this. For one, dermal studies are more easily performed on larger laboratory species, increasing the cost. Second, whichever species is selected, special equipment is always required to expose laboratory animals to chemicals by the inhalation route.
C. Animal Species
A wide variety of animal species may be handled in the laboratory and utilized in acute toxicity testing. Generally, tests include equal numbers of male and female animals to ensure that gender-specific adverse effects are recognized. Rodents are the most commonly utilized animals for acute toxicity testing due to considerations of size, cost, and relevance of results. Testing guidelines for U.S. and international regulatory agencies routinely recommend the rat as the preferred species for acute oral and inhalation toxicity tests [1]. Rabbits are often recommended for dermal studies [2]. Since these animal species are commonly chosen for human safety assessment, a large historical information base exists with governmental agencies, contract laboratories, animal suppliers, and in the published scientific literature. Animals for acute toxicity testing are healthy, outbred strains obtained from U.S. Department of Agriculture-approved and -regulated suppliers. Upon receipt at the laboratory, animals are usually quarantined for 7 to 14 days. Holding test animals for this period allows them to become acclimated to the laboratory. During this time any stress from transport should resolve and careful observation should reveal any unwanted and confounding disease states. Test animals are usually housed individually in suspended stainless steel cages or other suitable housing. Animal room temperature and humidity are carefully controlled within standard ranges to ensure animal health and comfort. Temperature is maintained at 72°F and humidity ranges from 40 to 60%. Room lighting is controlled to maintain a 12 hr light/dark cycle and room ventilation is adjusted to produce a minimum of 10 air changes per hour. Standards for animal care are specified in the Guide for the Care and Use of Laboratory Animals [3]. Animals are selected for study and assigned to test groups in a strictly random manner to avoid introducing potential bias.
The stock of available animals is culled to include only healthy animals that have passed a detailed pretest evaluation. All female animals are nulliparous and nonpregnant.
D. Experimental Design
Acute toxicology tests have undergone significant design changes in recent years. Although the output of these tests is somewhat the same, an attempt has been made to derive the key data set using as few animals as possible. In many cases in which full dose-response evaluations might have been conducted in the past, current designs provide limited data that only establish a rough range of effect. In current test designs, materials that show little acute toxicity would not be tested above 5000 mg/kg.
In evaluating a chemical whose toxic potential is unknown, it is often useful to begin the assessment by conducting a range-finding study. A typical range-finding test might call for one male and one female animal at each of five selected doses. Animals are administered the test chemical based on their weight. Typical range-finding doses are 500, 1000, 2000, 3000, and 5000 milligrams of chemical per kilogram of body weight. Animals are randomly assigned to treatment groups and fasted overnight prior to chemical administration. Fasting overnight assures an empty stomach; this allows absorption of the test material without the confounding influence of food. Animals are weighed immediately prior to dosing. Treated animals are observed for adverse effects of chemical administration immediately following chemical exposure and daily for 7 days. In range-finding studies, only clinical signs of toxicity and mortality information are collected; gross necropsy examinations are usually not performed. Animals that survive to day 7 are humanely euthanized.
If the chemical under study is known (by range-finding or other data) or suspected (by comparison to other similar toxicants) to be of low toxicity, a limit test design may be chosen to assess acute toxicity. In this design, a single, high dose of chemical is administered to five male and five female animals. A chemical that produces no mortality in a sufficiently sized group of animals at 5000 mg/kg would be considered practically nontoxic and would require no further evaluation for acute effects. The LD50, which is the lethal dose for 50% of exposed animals, would be reported as “greater than 5000 mg/kg.” The experimental procedure is similar to the range-finding test. In a limit study, more detailed clinical observations would be made and animals would be observed for 14 days following chemical exposure. Signs of toxicity might include changes in eyes, mucous membranes, fur, skin, respiratory function, circulatory system, or nervous system. As a general indicator of health, body weights are measured and recorded on days 0, 1, 2, 4, 7, and 14. All test animals are necropsied and evaluated for gross pathological effects. If one or two deaths occur during the limit test, a second limit study may be performed using a reduced dose level.
In situations in which limit data are not sufficient, the LD50 can be determined. A larger number of animals is required. A minimum of three dose groups
is typically studied. As before in the limit test, five males and five females make up each dose group. If time is not a limiting factor, the chemical may be administered dose group by dose group, allowing sufficient time between groups to gauge the effect. In order to accurately estimate the LD50, doses producing a range of mortality (25 to 75%) need to be documented. Given a sufficient data set, the LD50 and its confidence limits can be calculated using standard statistical methods [4].
Descriptive ratings for toxicity based on the oral LD50 have been described by many authors. Generally, chemicals having an oral LD50 above 5000 mg/kg are considered only slightly toxic. Chemicals with an oral LD50 in the range from 500 to 5000 mg/kg are considered moderately toxic. A chemical with an LD50 between 50 and 500 mg/kg would be considered very toxic, and chemicals whose LD50 is below 50 mg/kg are considered extremely toxic.
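The descriptive ratings above, together with a rough LD50 estimate from grouped mortality data, can be sketched as follows. This is only an illustration under stated assumptions: the interpolation on log dose is a crude stand-in for the standard statistical methods cited in the text (it gives no confidence limits), the handling of values falling exactly on 500 or 50 mg/kg is an assumption because the text does not assign them, and all function names are hypothetical.

```python
import math


def interpolate_ld50(doses_mg_per_kg, deaths, group_size):
    """Crude LD50 estimate: linear interpolation of mortality on log10(dose).

    Returns None when the data do not bracket 50% mortality.
    """
    points = sorted(zip(doses_mg_per_kg, deaths))
    # A dose group with exactly 50% mortality is taken as the estimate.
    for dose, dead in points:
        if 2 * dead == group_size:
            return float(dose)
    # Otherwise interpolate between adjacent doses that bracket 50%.
    for (d_lo, k_lo), (d_hi, k_hi) in zip(points, points[1:]):
        p_lo, p_hi = k_lo / group_size, k_hi / group_size
        if p_lo < 0.5 < p_hi:
            frac = (0.5 - p_lo) / (p_hi - p_lo)
            log_ld50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_ld50
    return None


def oral_ld50_rating(ld50_mg_per_kg):
    """Descriptive rating from the oral LD50 (boundary handling is assumed)."""
    if ld50_mg_per_kg > 5000:
        return "slightly toxic"
    if ld50_mg_per_kg >= 500:
        return "moderately toxic"
    if ld50_mg_per_kg >= 50:
        return "very toxic"
    return "extremely toxic"


# Example: 2 of 10 animals die at 100 mg/kg and 8 of 10 at 1000 mg/kg.
estimate = interpolate_ld50([100, 1000], [2, 8], 10)
print(round(estimate), oral_ld50_rating(estimate))
```

For a regulatory submission a probit or comparable method with confidence limits would replace the interpolation; the sketch only shows how mortality data and the rating scale connect.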
II. EYE IRRITATION STUDIES
These safety tests are critical for chemicals that may be accidentally or intentionally introduced into the eye. In the workplace, accidental ocular exposure by chemical splash may occur, leading to serious and irreversible injury. Components of cosmetic and shampoo formulations frequently contact the eye in the course of everyday use.
A. Study Objectives
The goal of the primary eye irritation safety test is to determine the irritant or corrosive potential of a chemical following a single application to the mammalian eye. Information derived from this study may be used as the basis for classification and labeling of the test material. On rare occasions, a material may be so toxic that systemic effects up to and including death may result following a single ocular exposure.
B. Toxicology of the Eye
Humans and selected members of the animal kingdom are principally visual creatures. That is to say, sight or sightedness is our dominant sensory modality. Irreversible loss of vision in one or both eyes is a crippling injury, one that leaves the organism impaired, unable to efficiently respond to the environment that surrounds it. The visual apparatus is complex, containing peripheral and central elements. The eye is the critical sensory end organ of the visual system, and the cornea and conjunctiva are the structures that are directly exposed to external insults. The surface of the eye is bathed in fluid to protect it, and reflexive blinking
acts to maintain this liquid coating. The cornea has special features that increase its vulnerability and susceptibility to irreversible injury. For one, unlike other parts of the body, the cornea must maintain transparency to remain functional. Whereas other parts of the body may be restored by the mechanisms of wound healing, the cornea will be functionally destroyed by vascularization and scar formation.
C. Safety Tests in Animals
The New Zealand White rabbit is the most commonly used animal species for eye irritation studies. Rabbits may be handled safely in the laboratory by trained personnel, and the albino rabbit eye is pigmentless, making adverse change easy to observe. Historical data demonstrate that the New Zealand White rabbit is sensitive to a wide variety of ocular irritants and is a suitable model for human safety assessment. The New Zealand White rabbit is recommended as the test species for eye irritation studies by a number of U.S. and international regulatory agencies [5]. Rabbits for study are chosen randomly from healthy, acclimated stock animals to avoid any unintentional selection bias. Males or females may be used since chemically initiated irritant or corrosive eye injury shows no relationship to gender. Stock females are nulliparous and nonpregnant. All animals chosen for study receive a baseline eye examination to ensure normal ocular health prior to test chemical exposure. The rabbit eye is typically evaluated macroscopically with indirect light and with long-wave ultraviolet (UV) light for fluorescein dye retention.
Experimental designs of eye irritation studies vary. Current research protocols test fewer animals than in the past, and test compounds are screened to eliminate materials with very high or low pH. These chemicals are assumed to be corrosive based on chemical properties and may be labeled without subjecting animals to irreversible eye injury. Materials of unknown eye irritancy may be tested with and without a rinse procedure. Three or six animals are usually used for each treatment group (no rinse, rinse). For liquids, gels, and pastes, a volume of 0.1 ml is instilled into the conjunctival sac of one eye. The eyelid is then gently held shut for 1 or 2 sec to limit test article loss. The contralateral eye is not treated and serves as a control. Powders and other solids are administered at a weight equivalent of 0.1 ml. For animals in a rinse group, the test and control eyes are rinsed with 0.9% physiological saline 30 sec after test chemical administration. Animals must be carefully observed following ocular exposure to an unknown chemical. In the event a severe local reaction takes place, the animal may experience pain and should be humanely euthanized.
In a standard eye irritation study design, the treated eyes will be evaluated macroscopically by indirect light at 1, 24, 48, and 72 hours postexposure. Ocular
change is graded based on a scoring system developed by Draize et al. [6]. Scores for corneal opacity (O, 0 to 4), area of corneal involvement (A, 0 to 4), iritis (I, 0 to 2), conjunctival redness (R, 0 to 3), conjunctival swelling (S, 0 to 4), and conjunctival discharge (D, 0 to 3) are combined to yield a maximum score of 110. The mathematical equation to calculate the Draize ocular score is as follows:

\[ \text{Ocular score} = (O \times A \times 5) + (I \times 5) + 2(R + S + D) \tag{1} \]
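Equation (1) can be checked with a short calculation. The sketch below simply evaluates the formula for one eye at one time point and rejects grades outside the ranges given in the text; the function name and argument names are illustrative, not part of the original method.

```python
# Allowed grade ranges for each component, as stated in the text.
GRADE_LIMITS = {
    "opacity": 4,     # O, corneal opacity
    "area": 4,        # A, area of corneal involvement
    "iritis": 2,      # I
    "redness": 3,     # R, conjunctival redness
    "swelling": 4,    # S, conjunctival swelling
    "discharge": 3,   # D, conjunctival discharge
}


def draize_ocular_score(opacity, area, iritis, redness, swelling, discharge):
    """Draize ocular score for one eye: (O x A x 5) + (I x 5) + 2(R + S + D)."""
    grades = {
        "opacity": opacity, "area": area, "iritis": iritis,
        "redness": redness, "swelling": swelling, "discharge": discharge,
    }
    for name, value in grades.items():
        if not 0 <= value <= GRADE_LIMITS[name]:
            raise ValueError(f"{name} grade {value} outside 0-{GRADE_LIMITS[name]}")
    cornea = opacity * area * 5
    iris = iritis * 5
    conjunctiva = 2 * (redness + swelling + discharge)
    return cornea + iris + conjunctiva


# The maximum grades reproduce the stated maximum score of 110.
print(draize_ocular_score(4, 4, 2, 3, 4, 3))
```

The three terms show how heavily the scale weights the cornea: corneal findings contribute up to 80 of the 110 points, the iris up to 10, and the conjunctiva up to 20.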
At 24 hours, all treated eyes are also routinely evaluated using fluorescein dye retention to assess corneal damage. If ocular irritation is observed at any time point, the test may be extended up to 21 days to record recovery, should it occur. If no treatment-associated irritation is observed by 72 hours, the chemical is judged nonirritating and the test is ended.
Descriptive ratings for eye irritation based on the ocular score were suggested in the original test method [6]. A chemical was described as a “nonirritant” if its maximum mean irritation score was less than or equal to 1.0. A chemical producing a mean irritation score less than or equal to 6.0 at 24 hr, without corneal or iridal involvement, and reversible by day 7 was described as a “slight irritant.” A “moderate irritant” produced an ocular score greater than 6.0 at 24 hr but less than 30.0 at maximum, and irritation was reversible by day 14. A maximum mean irritation score greater than 30.0 with corneal or iridal findings that persist beyond day 14 was described as a “severe irritant.” “Corrosive” was reserved for chemicals that produced irreversible eye injury. Another useful set of criteria for evaluating ocular irritation has been described by Kay and Calandra [7]. In this scheme, descriptive rating and class are assigned based on the maximum mean ocular score over the first 4 days, the time to reach maximum, and score persistence. Using eye irritation data from tests with standard Draize design, one can assign up to eight levels of irritation, from “nonirritating” to “extremely severe irritant,” using this set of criteria.
Sensible modifications of the standard eye irritation test exist and offer several advantages. Alternate protocols may be employed to reduce animal usage, and some may even more accurately predict irritant effects to the human eye. In certain situations, a screening approach is appropriate and the irritation test is initiated with a single animal. If the material produces severe irritation, no more animals are tested and the material is labeled a severe eye irritant. If the screening test results indicate moderate or slight irritation, additional animals are tested to confirm the result, thereby reducing the possibility of underclassifying the material. Used in this manner, the screening approach is conservative, in that if an error is made it tends to be in the direction of overclassification rather than underclassification. The downside of any overclassification that results from using this conservative screening protocol is outweighed by the gains made in reducing animal discomfort and usage.
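The descriptive ratings attributed to the original Draize method can be written as a simple decision sequence. The sketch below is an interpretation, not the published scheme: it assumes that irritation never observed to clear (reversal day of None) stands for irreversible injury, it returns None for combinations the text does not explicitly cover, and the function name and parameters are hypothetical. The Kay and Calandra criteria are not implemented here.

```python
def draize_descriptive_rating(max_mean_score, mean_score_24h,
                              corneal_or_iridal_findings, reversal_day):
    """Descriptive eye irritation rating from summary study results.

    reversal_day: study day by which irritation had cleared, or None if
    it never cleared (treated here as irreversible injury, an assumption).
    """
    if max_mean_score <= 1.0:
        return "nonirritant"
    if reversal_day is None:
        return "corrosive"
    if (mean_score_24h <= 6.0 and not corneal_or_iridal_findings
            and reversal_day <= 7):
        return "slight irritant"
    if mean_score_24h > 6.0 and max_mean_score < 30.0 and reversal_day <= 14:
        return "moderate irritant"
    if max_mean_score > 30.0 and corneal_or_iridal_findings and reversal_day > 14:
        return "severe irritant"
    return None  # combination not covered by the criteria as stated


# Example: mean score of 15 at 24 hr, maximum 20, cleared by day 10.
print(draize_descriptive_rating(20.0, 15.0, True, 10))
```

Returning None for uncovered combinations keeps the sketch from silently assigning a class the text does not support; in practice such cases would call for expert judgment.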
A low-volume eye test (LVET) has also been described and validated [8]. This procedure uses only one-tenth the material of the standard Draize protocol and the test substance is applied directly to the cornea. Data show that the LVET is actually more predictive of human response [9], and it generally produces less irritation and hence discomfort. Since the irritation scores are lower, the classification scheme is shifted in order to produce results comparable to the Draize test. Recent work has shown that the number of animals necessary to produce reasonably dependable results can be reduced from six to three [10]. In situations in which legal requirements do not specify the exact protocol or required number of animals, the use of these types of alternative protocols should be considered.
D. Alternatives to Animal Testing
Chemical manufacturers, consumer product producers, and others have the responsibility of identifying chemicals that have the potential to induce severe eye irritation or irreversible eye injury. Incidents involving chemicals hazardous to visual function are greatly reduced by the proper use of the information developed in the basic ocular irritation test. Despite the utility of this information, the ocular irritation test has become the focal point of considerable controversy because it is viewed by some as inhumane. No one disputes that introducing an irritating chemical into the eye of a laboratory animal produces a significant amount of distress. Although there are numerous reasons for developing a sensitive and reliable in vitro test as an alternative to using live animals, it is pressure from society that has put the development of alternative procedures on the fast track.
The complex structure of the eye makes finding and validating an in vitro alternative for the rabbit ocular irritancy test challenging. One approach to this problem is to consider the various steps by which chemicals cause corrosive damage to the eye at a cellular and subcellular level. Regardless of the precise molecular mechanism, chemicals that can disrupt the physical structure of the eye do so by killing cells and disrupting membranes. In an attempt to model corrosivity, chemical effects on isolated cells and membranes have been explored and compared with the available literature on ocular corrosion. Measures of cell viability include growth inhibition, colony-forming efficiency, cell detachment, total protein, and dye-binding studies. Chemical effects on membrane integrity have been assessed by isotope release, dye release, and other assays that measure membrane breakthrough. Ocular irritation has been more difficult to model in vitro since it involves a cellular response from the immune system with the induction of inflammation. One of the more successful model systems to observe this effect is the chorioallantoic membrane of the chick embryo. Although many methods have shown potential, only a few have been validated across more than
one class of chemical. At present, no single in vitro test captures all aspects of the whole animal response to an ocular irritant.
III. DERMAL IRRITATION STUDIES
Our skin is our boundary with the environment, and the environment is composed of chemicals. Dermal contact with chemicals occurs on a daily basis in the home, the workplace, and everywhere in between. Dermal irritation studies are safety tests that predict the irritant and/or corrosive effects of chemicals that may accidentally or intentionally contact the skin. In the workplace, accidental dermal exposure can lead to serious and irreversible injury. In the home, chemicals that comprise a wide variety of common consumer products contact the skin in the course of everyday use.
A. Study Objectives
The goal of the primary skin irritation safety test is to determine the irritant and/or corrosive potential of a chemical following a single application to mammalian skin. Information derived from this study may be used as the basis for classification and labeling of the test material. Some chemicals may be rapidly absorbed across the dermis, producing systemic poisoning following a single dermal exposure. On these rare occasions, information regarding adverse systemic effects may be identified from a primary skin irritation test.
B. Toxicology of the Skin

Our skin covers the surface of our body and provides a boundary between our internal structures and the environment. Skin is a unique structure meeting the criteria of a specialized organ system. As such, our skin is one of the largest organs of the body, constituting as much as 10% of total body weight. In addition to protection from the environment, our skin provides other important physiological functions including the regulation of body temperature and water retention. The skin is also metabolically active and can be an important site for the biotransformation of chemicals.

The skin is multilayered in structure with each of two main layers arising from different embryological cell masses. The thinner, outer layer is called the epidermis, which, as indicated by its name, has epithelial characteristics. The thicker, inner layer is composed of connective tissue and is called the dermis. The thickness of the skin varies in different regions of the body. The palms of the hand and the soles of the feet are areas where the skin is relatively thick.
The skin covering the scrotum is relatively thin. Skin thickness is an important variable, particularly in relation to chemical absorption across the dermal layer.
C. Safety Tests in Animals

The New Zealand White rabbit is the most commonly used animal species for skin irritation testing. The New Zealand White rabbit has no dermal pigment and, due to its size and proportions, the dorsal surface of its back is relatively large. With these characteristics, dermal application of test materials is accurate, and adverse chemical effects may be easily observed. In a standard laboratory setting, rabbits may be handled easily and safely by trained personnel. Also, the New Zealand White rabbit has been shown to be sensitive to the irritant/corrosive effects of a wide range of chemicals. This combination of properties makes the New Zealand White rabbit a rational alternative to larger mammals, providing a suitable dermal model for human safety assessment. The New Zealand White rabbit is recommended as the test species of choice for primary irritation testing by a number of U.S. and international regulatory agencies [11].

As in ocular irritation testing, rabbits for study are chosen randomly from healthy, acclimated stock animals to avoid selection bias. Males and females may be used; females are nulliparous and nonpregnant. All selected animals receive a detailed pretest observation prior to dosing. Animals with preexisting skin irregularities are identified and removed from the study group.

Experimental design of skin irritation tests is based on the work of Draize [12]. As in ocular testing, current protocols test fewer animals than in the past and test compounds are screened to eliminate materials with very high or low pH. Strong acids or bases are assumed to be corrosive and may be labeled without subjecting animals to irreversible dermal injury. Test materials with structural similarities to chemicals known to be damaging to skin may be screened using one or two animals in a pilot test. Materials of unknown irritancy are typically tested using six animals, three male and three female.
On the day prior to testing, animals selected for skin irritancy studies have the fur clipped from the dorsal surface of their backs. Care is taken to avoid abrading the skin with the animal clipper. The next day, the test is initiated by applying the test material to the dorsal skin surface. For liquids, gels, or pastes, a 0.5-ml dose of test material is administered. Powdered test materials are similarly applied at a dose of 0.5 g and are moistened with 0.5 ml of distilled water. After administration of the test material, the test area is covered with a 1 × 1 inch square of 4-ply gauze held in place with nonirritating tape. To prevent removal and ingestion of the test material, the test area is further covered with a semiocclusive elastic bandage. The elastic bandage is wrapped around the trunk of the rabbit and the ends are secured with adhesive tape. Specially designed animal collars may also be utilized to prevent treated animals from removing the bandage coverings and disturbing the
test site. In the standard test design, the test material is allowed to remain in contact with the skin for 4 hr. After this exposure period, the elastic bandage and gauze are removed. The boundary of the test site is marked and any residual test material is gently wiped from the test site. The condition of test animals exposed to a chemical of undefined dermal irritancy must be carefully monitored. Should the test material produce severe injury to the skin, exposed animals may experience unacceptable levels of discomfort and should be humanely euthanized.

In the standard dermal irritation design, treated animals are evaluated for erythema and edema at 1, 24, 48, and 72 hr postexposure. In situations in which irritation is still present at the 72-hr examination interval, observations may be extended for up to 7 days postexposure. Dermal irritation is graded based on a scoring system developed by Draize [12]. Scores for erythema (E, 0 to 4) and edema (Ed, 0 to 4) are combined at each evaluation interval to yield a maximum score of 8. The primary irritation index is then calculated using the following equation:

\[
\text{Primary irritation index} = \frac{\sum (E + Ed)}{\text{No. of test sites} \times 4} \tag{2}
\]
Descriptive ratings for skin irritation based on the primary irritation index have been described. Materials producing a primary irritation index of 0.00 are classified as nonirritating. Scores from 0.01 to 2.00 are rated as slightly irritating. Scores from 2.01 to 5.00 are rated as moderately irritating, and scores above 5.01 are classified as severely irritating. In addition to erythema and edema, the following descriptors of dermal injury may be recognized and reported with or without an associated severity: beet redness, desquamation, fissuring, blanching, eschar formation, eschar exfoliation, ulceration, and necrosis.
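The index calculation and the descriptive ratings can be sketched in a few lines of Python. This is an illustrative sketch only, not a validated scoring tool. It assumes that the factor of 4 in the denominator corresponds to the four evaluation intervals (1, 24, 48, and 72 hr), that the index is rounded to two decimals before rating, and that any value above 5.00 is rated as severely irritating. The function names and the example scores are hypothetical.

```python
from typing import Dict, List, Tuple

# One observation is (erythema, edema), each graded 0 to 4.
Score = Tuple[int, int]


def primary_irritation_index(scores_by_site: Dict[str, List[Score]],
                             n_intervals: int = 4) -> float:
    """Sum (E + Ed) over all test sites and evaluation intervals, then
    divide by (number of test sites x number of intervals).
    Assumption: the 4 in the equation is the number of intervals."""
    if not scores_by_site:
        raise ValueError("at least one test site is required")
    total = 0
    for site, scores in scores_by_site.items():
        if len(scores) != n_intervals:
            raise ValueError(f"{site}: expected {n_intervals} observations")
        for erythema, edema in scores:
            if not (0 <= erythema <= 4 and 0 <= edema <= 4):
                raise ValueError(f"{site}: scores must lie between 0 and 4")
            total += erythema + edema
    return total / (len(scores_by_site) * n_intervals)


def irritation_rating(pii: float) -> str:
    """Map an index to the descriptive rating given in the text.
    Assumption: values above 5.00 (after rounding) count as severe."""
    pii = round(pii, 2)
    if pii == 0.0:
        return "nonirritating"
    if pii <= 2.0:
        return "slightly irritating"
    if pii <= 5.0:
        return "moderately irritating"
    return "severely irritating"


# Hypothetical scores: six rabbits, one site each, read at 1, 24, 48, 72 hr.
example = {
    "rabbit_1": [(2, 1), (2, 1), (1, 0), (0, 0)],
    "rabbit_2": [(1, 1), (1, 0), (1, 0), (0, 0)],
    "rabbit_3": [(2, 2), (2, 1), (1, 1), (1, 0)],
    "rabbit_4": [(1, 0), (1, 0), (0, 0), (0, 0)],
    "rabbit_5": [(2, 1), (1, 1), (1, 0), (0, 0)],
    "rabbit_6": [(1, 1), (1, 1), (1, 0), (0, 0)],
}

pii = primary_irritation_index(example)
print(f"{pii:.2f}", irritation_rating(pii))
```

For the hypothetical data the summed scores are 34 over 24 site-interval readings, which gives an index of about 1.42 and a rating of slightly irritating.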
D. Alternatives to Animal Testing

In the last 15 years, significant effort has been directed toward developing in vitro alternatives to Draize-type irritation tests. Methods developed for in vitro skin testing have progressed on a parallel track to ocular irritation screening (see above) but are more advanced. Some of the more successful techniques have employed coculture of relevant cell types on various support matrices, and artificial membrane systems that develop color upon test material breakthrough. These test systems have been validated in discrete classes of chemicals and are useful for research and development (R & D) applications and some regulatory classifications. At present, the information provided by these techniques can be extremely useful but cannot completely substitute for whole-animal testing.
IV. TESTING STRATEGIES

Policies vary considerably among companies regarding toxicology testing of new and existing materials. New and existing substances in the United States are regulated under the Toxic Substances Control Act (TSCA). There are no specific data requirements specified under this act; thus, decisions concerning acquisition of toxicity data are generally left up to the producer/importer/user. A good testing strategy will protect workers, customers, and the business. The studies will be custom-tailored for each test material such that the maximum amount of information is obtained at the lowest cost and using the smallest number of animals.

Testing prior to significant production has several advantages:

Allows for proper hazard communication
Allows proper classification for shipping
Supports recommendations for engineering controls and personal protective equipment
Establishes safety of the product relative to competitive materials
Permits early identification of unacceptable chemicals
Speeds the premanufacture notification (PMN) process for new materials

Likewise, testing presents some challenges:

Can be expensive
Can slow product development if in critical path
May be perceived as unnecessary by business interests
Can result in TSCA Section 8(e) notifications

Cost of testing for new products is often the factor that determines how much testing can be conducted. New product development teams are often working on a limited budget and testing costs must compete with other development costs.
A. Why Do Safety Testing?

Safety testing is conducted for both moral-ethical reasons and for regulatory compliance. Although the United States does not have a minimum data set requirement under TSCA, the U.S. Environmental Protection Agency (EPA) can and will use structural alerts and structure-activity relationship (SAR) data to argue that a material presents an unreasonable risk. In the absence of test data, it is reasonable to assume that a material is as, or more, toxic than materials with a similar structure. The EPA has the responsibility to protect public health, and conservative estimates of toxicity are reasonable and justifiable. Actual animal test data provide a realistic basis from which to estimate human hazard.
It is important to have enough information to be confident that a new material does not pose an unreasonable health risk. All available information should be used in hazard evaluation of a new material and in design of the appropriate testing program. Information from known toxicity of similar chemicals (i.e., SAR), knowledge of likely metabolic pathways, physical properties of the test material, in vitro data, proposed manufacturing procedures, and potential end uses should all be used to design a testing program that gives the most information possible using the smallest feasible number of animals.
B. Phasing

Phasing refers to implementing a testing program in a time- and cost-effective manner. When considering acute tests, there is no obvious sequence, such as subacute prior to subchronic. However, logical considerations are potential human exposure and “knock-out” effects based on structural alerts. In addition, limitations of resources, including test material availability, can be a reason for the phasing of acute tests. For example, if one were in the early stages of compound development with limited sample, it might make sense to conduct an acute oral toxicity study in rats and a single irritation test in rabbits (skin or eye). These two studies together would give a good indication of the potential of a material to cause fatalities or severe irritant effects. The oral study in rats is sensitive and will generally be required for a PMN or standard classification schemes. The eye or skin test will also show if the material happens to be unusually toxic by the dermal route of exposure and, thus, gives limited, but possibly important, systemic toxicity data in a second species. Completion of the remainder of the standard series of acute tests could be logically postponed until more material is available or the material is found to be acceptable by these two initial tests.
C. Communications

Rapid communication of test results that suggest high hazard is required if current activity with the material could result in worker or customer harm. There are both legal and ethical issues involved in the decisions of what, when, and to whom test results are communicated. Legal communication requirements exist under the EPA’s TSCA Section 8(e), and the U.S. Occupational Safety and Health Administration’s (OSHA) regulations. The EPA’s TSCA Section 8(e) requires that information be provided to the agency for any material that is considered a “substantial risk” within 15 working days after the information is known by a company manufacturing or distributing the material (including research and development work prior to actual manufacture). There have been many debates over the TSCA Section 8(e) regulation and most companies have developed their own set of reporting criteria. The EPA has issued more specific guidance and
most companies use the same bright line for LD50 values. There is a wider range of opinions regarding when to report strong irritants and sensitizers to the EPA. If the effect noted in animals could result in an irreversible loss of a bodily function or death, it is clearly reportable under TSCA Section 8(e). Thus, irritants that may cause loss of vision, or sensitizers that may cause anaphylactic shock, are reportable.

Hazard communication under OSHA regulations stipulates that results of new toxicity tests appear on the Material Safety Data Sheet for the material within 90 days of test report receipt. Practically, however, if test results are received indicating that a material presents an unusual hazard, immediate and appropriate notification of potentially affected persons is encouraged. Notification should be made in clear language explaining what was found and how it might affect someone. Occupational health professionals should be closely involved, and should determine if the handling procedures and personal protective equipment are adequate. Follow-up notification should be made if additional information is obtained showing that the material is more or less hazardous than indicated by the preliminary data.
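The two time limits mentioned above (15 working days for a TSCA Section 8(e) submission and 90 days for the Material Safety Data Sheet update) lend themselves to a simple date calculation. The sketch below is illustrative only. It assumes that working days are Monday through Friday and that public holidays are ignored, which may not match how a given company or agency counts days. The function names are hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date reached after counting `working_days` weekdays
    following `start`. Assumption: only Saturdays and Sundays are
    skipped; public holidays are not considered."""
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday = 0 ... Friday = 4
            remaining -= 1
    return current


def tsca_8e_due(known_on: date) -> date:
    """Latest submission date: 15 working days after the information is known."""
    return add_working_days(known_on, 15)


def msds_update_due(report_received: date) -> date:
    """Latest data sheet update: 90 calendar days after test report receipt."""
    return report_received + timedelta(days=90)


# Hypothetical example: information known and report received on the same day.
received = date(2024, 1, 1)
print("TSCA 8(e) due:", tsca_8e_due(received))
print("MSDS update due:", msds_update_due(received))
```

Keeping the two rules in separate functions makes it easy to swap in a holiday calendar for the working-day count without touching the calendar-day rule.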
ACKNOWLEDGMENT

The authors would like to thank Dr. Marie Beauregard for her assistance in verifying the references for this article and Ms. Nancy S. Soiefer for proofreading the text.
REFERENCES

1. Health Effects Test Guidelines, OTS, 798.1175 & 798.1150; OECD Guideline 401 & 403.
2. Health Effects Test Guidelines, OTS, 798.1100; OECD Guideline 402.
3. National Research Council, Guide for the Care and Use of Laboratory Animals. National Center for Research Resources, Bethesda, MD, Report No.: ISBN 0-309-05377-3, 141 pages, 1996.
4. Litchfield, J.T. and Wilcoxon, F., A simplified method of evaluating dose-effect experiments, J. Pharmacol. Exp. Ther. 96:99 (1949).
5. Health Effects Test Guidelines, OTS, 798.4500; OECD Guideline 405.
6. Draize, J.H., Woodward, G. and Calvery, H.O., Methods for the study of irritation and toxicity of substances applied topically to the skin and mucous membranes, J. Pharmacol. Exp. Ther. 82:377 (1944).
7. Kay, J.H. and Calandra, J.C., Interpretation of eye irritation tests, J. Soc. Cosmet. Chem. 13:281 (1962).
8. Griffith, J.F., Nixon, G.A., Bruce, R.D., Reer, P.J. and Bannan, E.A., Dose-response studies with chemical irritants in the albino rabbit eye as a basis for selecting optimum testing conditions for predicting hazard to the human eye, Toxicol. Appl. Pharmacol. 55:501 (1980).
9. Freeberg, F.E., Hooker, D.T. and Griffith, J.F., Correlation of animal eye test data with human experience for household products: an update, J. Toxicol. Cut. Ocular Toxicol. 5:115 (1986).
10. Bruner, L.H., Parker, R.D. and Bruce, R.D., Reducing the number of rabbits in the low-volume eye test, Fundam. Appl. Toxicol. 19:330 (1992).
11. Health Effects Test Guidelines, OTS, 798.4470; OECD Guideline 404.
12. Draize, J.H., Dermal toxicity, Assoc. Food and Drug Officials, U.S. Appraisal of the Safety of Chemicals in Food, Drugs and Cosmetics. Texas State Dept. of Health, Austin, Texas, 1959, p. 46-59.
Multidose Toxicity and Carcinogenicity Studies

Christopher Banks
ClinTrials BioResearch, Senneville, Quebec, Canada

Kit A. Keller
Toxicology Consultant, Washington, D.C.
I. INTRODUCTION
Whereas acute toxicity studies are performed to demonstrate toxic effects and the target organs associated with a single administration of an agent, multidose studies, often referred to as subacute, subchronic, and chronic studies, are performed to provide information on the effects of repeat administrations of a compound. The distinction between subacute and subchronic may sometimes be unclear, but it is generally accepted that subacute studies are of 28 days’ (1 mo) duration or less and subchronic studies are classically 13 wk (3 mo). A study duration of 13 wk is considered to approximate 10% of the life span of a rodent. Longer-duration toxicology studies are classed as chronic studies with a duration of 26 wk (6 mo), 38 wk (9 mo), or 52 wk (1 yr), allowing investigation of test-article-related effects over a larger proportion of the animal’s life span. Carcinogenicity studies are performed almost exclusively in rats and mice and assess the tumorigenic potential of an agent over the majority of the rodent life span (1.5 to 2 yr). Distinctions in the types of subacute or subchronic studies required can be made according to the type of agent to be tested. Table 1 lists some examples of subchronic and chronic testing requirements for various types of chemicals. When performed as part of the preclinical safety assessment of a pharmaceutical product by the new International Conference on Harmonization (ICH) guidelines,
Table 1 Regulatory Agencies Requiring Multidose Toxicity Testing

| Agency | Testing requirements |
|---|---|
| Industrial chemicals | |
| EEC Directives: Dangerous Substance Directive (67/548; 79/831; 92/32) | New chemicals from 10-1000 tons/yr require base set of studies from 28 days (rodent) up to chronic repeated-dose toxicity studies (rodent and nonrodent) |
| OECD Guidelines for Testing of Chemicals (#407, 408^a, 409^a, 410, 411, 412, 413, 451, 452, 453) | Tier testing: multidose studies from 21/28 days up to chronic and carcinogenicity studies (rodents and nonrodents) |
| U.S. EPA: Toxic Substance Control Act (TOSC, 409A) | Specific studies requested on a case-by-case basis |
| Japan Law 44 (jointly by MITI, MHW, and MOL) | New chemicals must undergo biodegradation studies; if bioaccumulation potential, then require mutagenicity and 28-day repeated-dose toxicity studies |
| Agrochemicals | |
| EEC Directives (79/117; 91/414); U.S. EPA: FIFRA (40 CFR 158) | Tier testing. May require up to a 3-mo dietary or inhalation study |
| Human Pharmaceuticals | |
| ICH Guidelines (Guideline Nos. M3, S4A^a, S6, Q3A, Q3B) | Requires 2-wk to 1-yr studies depending on the duration of human therapy required; also may be required for impurities or degradants |
| ICH Guidelines (Guideline Nos. S1A, S1B, S1C) | Carcinogenicity testing required when expected use is greater than 6 mo, for delivery systems that may prolong exposures, and where there is cause for concern about carcinogenic potential |
| Veterinary Pharmaceuticals | |
| EEC Directive (81/851 as amended by 92/18) Part III | 90 days in rodent and nonrodent; subject to interpretation |
| U.S. FDA (Fed. Reg. 52:49572, 1987) | 90-day feeding study in rodent and nonrodent; residue in animal products requires further 6-mo study in nonrodent; 2-yr carcinogenicity studies in rodents |
| Food Additives | |
| U.S. FDA (Red Book) | 2-yr feeding study in two species for new additives; 90-day study in two species |

^a Guidelines being promulgated or revised.
the general toxicity studies are usually performed in one rodent species and a second, nonrodent species, such as dogs or nonhuman primates. When performed for the U.S. Environmental Protection Agency (EPA) according to the requirements of the Toxic Substances Control Act (TSCA) or the Federal Insecticide, Fungicide and Rodenticide Act (FIFRA) to investigate the effects of chemicals or pesticides, the studies are predominantly performed in rodents only. However, for the registration of pesticides, a chronic study in beagle dogs, which may be preceded by a subchronic study, is required. The Organization for Economic Cooperation and Development (OECD) safety study requirements for chemical transportation are dependent upon the tonnage of chemical to be shipped. The standard 2-yr bioassay in rodents (rats and mice) remains the most common study design in carcinogenicity testing, although there is a trend at present to replace the 2-yr mouse study with alternative types of in vivo assays as discussed in Chapter 12 [1,2].

The historical emphasis in multidose studies has been on defining a “no observed effect level” (NOEL) or “no observed adverse effect level” (NOAEL). These are usually qualified in more than one species and for different durations. However, some investigators have argued that such a standard practice is not justified and that use of a nonrodent species (the dog) is unnecessary in either long-term studies or in any study in which there is a large difference between the NOEL and the expected human exposure [3,4]. More recent data surveys suggest that no additional clinically relevant information was obtained after 6 mo in repeat-dose animal tests (excluding those for carcinogenicity) [5-8]. A recent draft ICH guideline for testing pharmaceuticals reflects this new survey of the data and allows for a maximum duration of repeated-dose toxicity tests of 6 mo (excluding carcinogenicity studies) in the rodent and 9 mo in the nonrodent.
Such changing attitudes toward the design of toxicity studies, regulatory requirements, and human risk assessment will continue as the available database grows.
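The duration classes described in this introduction, and the statement that 13 wk approximates 10% of the rodent life span, can be expressed as a small lookup. The sketch below is illustrative only. The cut-offs between classes (for example, treating anything longer than 52 wk as carcinogenicity-length) are assumptions made for the example, since the text names typical durations and not strict boundaries. The life span of 130 wk is simply the value implied by the 10% statement, and the function names are hypothetical.

```python
# Assumption: 13 wk is about 10% of the rodent life span, so roughly 130 wk.
RODENT_LIFE_SPAN_WEEKS = 130.0


def classify_study(duration_weeks: float) -> str:
    """Assign a duration class using assumed cut-offs:
    up to 4 wk (28 days) subacute, up to 13 wk subchronic,
    up to 52 wk chronic, longer carcinogenicity-length."""
    if duration_weeks <= 0:
        raise ValueError("duration must be positive")
    if duration_weeks <= 4:
        return "subacute"
    if duration_weeks <= 13:
        return "subchronic"
    if duration_weeks <= 52:
        return "chronic"
    return "carcinogenicity-length"


def fraction_of_rodent_life_span(duration_weeks: float) -> float:
    """Approximate share of the rodent life span covered by the study."""
    return duration_weeks / RODENT_LIFE_SPAN_WEEKS


for weeks in (4, 13, 26, 52, 104):
    share = fraction_of_rodent_life_span(weeks)
    print(f"{weeks:>3} wk: {classify_study(weeks)}, {share:.0%} of life span")
```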
II. STUDY DESIGN AND PARAMETERS

Repeat-dose general toxicity studies in laboratory animals are designed to monitor as many bodily functions as possible to maximize the potential for identifying adverse effects and target organs. While there are some differences between the conduct of rodent and nonrodent studies, these are largely the consequence of practical considerations in handling of the different species. In general, the study designs have become standardized to a considerable degree irrespective of the species, the treatment duration, or the regulatory agency for which it is conducted, such that a regular set of measured parameters is routinely included (Table 2). This standardization allows for objective comparison of different test materials
Table 2 Basic Multidose Study Design in Rodents and Nonrodents

| | Rodent | Nonrodent |
|---|---|---|
| Number of animals | 10-15/sex/group | 3 or 4/sex/group |
| Number of PK satellites | 3-25/sex/group | Not needed |
| Age at initiation | ~6 wk | ~4-6 mo |
| Common dosing methods | Oral gavage, diet, IV, inhalation | Oral gavage or capsule, IV |
| Common blood sampling methods | Retro-orbital, cardiac puncture, lateral tail vein, jugular, posterior vena cava | Posterior vena cava, saphenous, cephalic, and femoral veins |
| Parameters measured | | |
| Clinical observations | At least weekly | At least weekly |
| Body weight | At least weekly | At least weekly |
| Food consumption | At least weekly for the 1st month | At least weekly for the 1st month |
| Ophthalmoscopic exam | Pretest, end of treatment | Pretest, end of treatment |
| Electrocardiographic exam | Not routine | Pretest, end of treatment |
| Hematology | At least once | Pretest, at least once during treatment |
| Clinical chemistry | At least once | Pretest, at least once during treatment |
| Urinalysis | Metabolism cage, at least once during treatment or at necropsy | Metabolism cage or catheterization, pretest, at least once |
| Gross necropsy | Standard | Standard |
| Organ weight | At least liver and kidney | At least liver and kidney |
| Histopathology | At least high-dose and control, target organs | At least high-dose and control, target organs |
| Statistics | Various methods applicable | Very limited due to small group size |
in similar studies and also aids in species-to-species extrapolation. However, one must be careful that this same standardization does not lead to the mechanical design of studies without regard to the known or projected activity of a test material or to the species under study. In contrast, the design of carcinogenicity studies is limited to variables applicable to identification of tumorigenicity (gross and microscopical) (Table
Table 3 Basic Carcinogenicity Design in Rodents

| | Rodent |
|---|---|
| Number of animals | At least 50/sex/group |
| Number of PK satellites | 3-25/sex/group |
| Age at initiation | ~6 wk |
| Common dosing methods | Oral gavage, diet |
| Common blood sampling methods | Retro-orbital, cardiac puncture, lateral tail vein |
| Parameters measured | |
| Clinical observations, including palpation for tumor masses | At least weekly |
| Body weight | At least weekly |
| Food consumption | At least weekly for the 1st month |
| Hematology | At least twice |
| Gross necropsy | Standard |
| Histopathology | At least high-dose and control |
| Statistics | Various methods applicable; often uses procedure for adjusting for intercurrent mortality |
3). Variables such as clinical pathology and organ weight are generally not evaluated in these studies due to the high variability of these parameters in older rodents [9].
A. Study Basics
All Good Laboratory Practice (GLP) studies require a signed and dated protocol. The protocol is the unique driving document for a study and is required by regulatory agencies to itemize all specifications and end points for a toxicology study. This is usually an extremely detailed document and is supplemented by standard operating procedures (SOPs). This combination ensures that variation in procedures and end points is kept to a minimum and there is a high level of confidence in the accuracy and consistency of any results. The requirements for a protocol are specified in GLP regulations (see Chapter 1) together with the procedures to be followed for changing or amending the document during a study. There are generally two key personnel responsible for a study, usually a toxicologist (who acts as the official study director and has final oversight of the study) and a veterinary pathologist.
1. Test Material Formulation and Administration

When embarking upon a series of toxicity studies, whenever possible, the test article to be investigated should be “technical grade” material or of similar composition to that expected for human exposure. The material should be well characterized with respect to purity, stability, and contaminant profile. The carrier or vehicle used in formulation of the test material should also be as well characterized as the test article, and any separate preparation processes should be clearly documented with expiration dates as appropriate. Dietary studies also require characterization of the vehicle. Not only must the diet be certified, but also lot numbers and expiration dates must be recorded.

It is essential that a clear “chain of custody” is established for the test material at the testing facility and that all activities involving the material are unambiguously documented. A Material Safety Data Sheet should accompany the compound on arrival, a description of the material should be recorded and matched to that documented by the supplier, storage conditions and expiration date should be recorded, and all usage should be monitored by weighing containers before and after use. In addition to the prestudy characterization mentioned above, the test article should be analyzed by the test facility or supplier on completion of the study to confirm stability during storage over the duration of use.

Routinely, the vehicle used in formulating the test article is also appropriate for use as the control. A solution for intravenous administration should ideally be prepared in physiological saline; however, a more complex, buffered vehicle containing preservatives may also be used, and this should also be used for administration to the control group, at the same dose volume and concentration as used in the high-dose group.
This is also true for gavage solutions or suspensions, dietary preparations, dermal formulations, and liquids to be nebulized for inhalation exposure. In the last case, should the inhalation study be an investigation of a chemical or pesticide (for the EPA), then the entire formulation is considered to be the test article and an air control (breathing room air) would be appropriate.

If the test material is insoluble in regular vehicles, there is a range of materials that can be used as vehicles or carriers to create suspensions. The most commonly used are methylcellulose and carboxymethylcellulose at concentrations of 0.1 to 1.0%, corn oil, or polyethylene glycol (commonly a molecular weight of 400 is used and designated as PEG 400). Polysorbate 80 (Tween 80) can be used at low concentrations in combination with polymers as a surfactant to improve dispersion. However, adverse effects, especially in the gastrointestinal (GI) tract (diarrhea or soft stool), are associated with these vehicles. In addition, care must be taken during administration, as aspiration of the vehicle into the lungs can be fatal for the animal.

Analytical assessment of all formulations (solutions, suspensions, and diets) should be made during the study. Solutions should be checked for concentration and stability. Suspensions and diets also require a check for homogeneity. Capsule dosing (usually in dogs and occasionally in primates) requires confirmation of the concentration and stability of the test article in the gelatin capsule. Inhalation studies require an initial check of dosing solutions prior to atmosphere generation as well as frequent checks of the concentration of the test material in the generated atmosphere. As a minimum, for a subacute study, checks at the start and end of treatment are recommended. For subchronic and chronic studies, checks at the start, middle, and end of treatment are the usually accepted minimum.

Studies can be performed using a variety of dose routes to accommodate various formulation types and exposure scenarios. Wherever possible, the intended route of administration or potential exposure in humans should be mimicked in these safety studies. Routes of administration that are frequently used include oral, intravenous, subcutaneous, inhalation, intramuscular, and dermal, with, less often, intrarectal, intranasal, intravaginal, or ocular. The oral route can be subdivided into active administration (i.e., gavage, intubation, or capsule) and passive dosing (i.e., adding the test article to the animal’s food or water).

The formulation type, the dose route, and dose regimen should be as close as possible to those expected for human exposure, but should also take into account practical limitations of bioavailability or humane considerations. For example, pharmaceuticals are dosed according to the expected clinical dosing regimen, and chemicals being safety tested for occupational exposure are usually administered 5 days per week. Routes such as intravenous, dermal, and inhalation are relatively easy to transfer from the animal studies to human risk assessment if toxicokinetic data are available. Oral gavage and dietary administration in rodents offer two widely differing oral presentations of the test article.
The gavage route is a rapid, accurate way to deliver a bolus dose of an agent similar to that when administering solutions, suspensions, or tablets to humans. Incorporation of the test article into the diet offers a slower rate of absorption, better mimics unintentional human exposure to some chemicals, such as pesticides or food additives, and helps avoid peak plasma effects associated with bolus dose administration.
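The minimum analytical checks described earlier in this section (which attributes to check for each formulation type, and at which points in the study) can be laid out as a simple plan. The sketch below is illustrative only and covers solutions, suspensions, diets, and capsules. Inhalation atmospheres are left out because the text calls for frequent checks without giving a schedule. The dictionary and function names are hypothetical.

```python
from typing import List, Tuple

# Attributes to verify for each formulation type, as described in the text.
CHECKS_BY_FORMULATION = {
    "solution": ["concentration", "stability"],
    "suspension": ["concentration", "stability", "homogeneity"],
    "diet": ["concentration", "stability", "homogeneity"],
    "capsule": ["concentration", "stability"],
}


def check_timepoints(study_type: str) -> List[str]:
    """Minimum sampling points during the treatment period."""
    if study_type == "subacute":
        return ["start", "end"]
    if study_type in ("subchronic", "chronic"):
        return ["start", "middle", "end"]
    raise ValueError(f"unknown study type: {study_type}")


def analysis_plan(formulation: str, study_type: str) -> List[Tuple[str, str]]:
    """List every (timepoint, check) pair for the given study."""
    if formulation not in CHECKS_BY_FORMULATION:
        raise ValueError(f"unknown formulation: {formulation}")
    return [(timepoint, check)
            for timepoint in check_timepoints(study_type)
            for check in CHECKS_BY_FORMULATION[formulation]]


# Hypothetical example: a gavage suspension in a 13-wk (subchronic) study.
for timepoint, check in analysis_plan("suspension", "subchronic"):
    print(timepoint, check)
```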
2. Test System

The various global regulatory testing guidelines generally require rodent studies at a minimum and, in many cases, also require nonrodent studies. The rat is considered the rodent species of choice for routine multidose toxicology and carcinogenicity studies (see Chapter 1). Its compact but practical size and relatively consistent genetic profile have encouraged its use and provided a comprehensive background database. Although mice offer economies of scale for test article use, their small size can render some routine procedures impractical. Thus, mice are usually limited to use in the carcinogenicity study and accompanying
range-finding type studies. Procedures requiring extensive handling and biological sampling are more suited to larger nonrodent species; the beagle dog is the most common choice. The dog provides a consistent, docile, and relatively cheap model. However, under select circumstances, the dog is an unsuitable nonrodent model and primate studies are usually conducted in its place. Primates (cynomolgus monkey, rhesus monkey, baboon, or marmoset) offer a species genetically closer to humans and, therefore, theoretically more likely to identify potential hazards of some compounds to humans. Substitution of a more unusual species as the second, nonrodent test model may be appropriate in some instances, though a justification must be given.

Preclinical evaluation of biotechnology products requires additional considerations for the selection of the appropriate species. Justification must be provided for the pharmacological relevance of the proposed species according to drug action or tissue specificity. Wherever possible, the testing program should include at least one species that demonstrates this specificity, and the use of transgenic animals or models of the disease should be considered when no cross-reactivity can be demonstrated in regular laboratory species. In practice, the majority of biologicals are tested in rats and beagle dogs or nonhuman primates, but sometimes in only one species depending upon the specificity, clinical indication, and the stage of compound development.

Generally, purpose-bred and “disease-free” animals are preferred (see Chapter 1). The study is initiated using young, growing animals, equal numbers of males and females per group (rats commonly 10 to 20/sex/group; dogs 4 to 8/group), and usually three treated groups and one vehicle control group. Most general toxicity studies utilize ad libitum fed rodents.
However, for chronic studies, including carcinogenicity studies, the use of diet restriction has been shown to increase life expectancy [10] and may become the norm for these studies.
It is important that the allocation of animals to the study groups results in only healthy animals being used in the study and that body weights at the start of the treatment period are similar (not statistically significantly different). For rodent studies, a subpopulation of the animals intended for the study may have clinical pathology and postmortem examinations performed to ensure suitable health status. Suppliers of rodents should provide regular results of viral health screens from representative batches of animals. Dogs and primates should undergo a range of in-life examinations, including clinical pathology, to ensure suitability for use in the study. Animals should be allocated randomly to groups after exclusion of any animals considered unsuitable.
The number of animals to be used on subacute and subchronic studies is a balance between ensuring a statistically meaningful population size and ethical considerations to use the minimum number of animals. Range-finding studies commonly use as few as 1 to 5 animals/sex/group. Subacute studies generally
employ 10 rodents/sex/group or 2 or 3 dogs/sex/group when the studies are to be submitted to the regulatory agencies. Subchronic studies performed using rodents for the U.S. Food and Drug Administration (FDA) employ 15 rats or mice of each sex; similar studies performed for the EPA or for Japanese regulatory authorities presently require only 10 animals of each sex. Dog studies can range from 3 to 8/sex/group. For submission to the FDA, the minimum number of large animals required for subacute toxicology studies to support clinical trials for pharmaceutical products is usually 3/sex/group and, for subchronic studies, 4/sex/group. For Japanese authorities, three animals per sex are sufficient for subacute and subchronic studies.
For carcinogenicity studies, dosing is usually initiated at 6 wk of age and much larger test groups (at least 50/sex/group) are used to ensure an adequate number of surviving animals at study termination (guidelines require 50 to 25% survival at the end of the study). The actual number per group is often dependent on the individual laboratory and the historical survival rates of rodents in that laboratory.
These minimum numbers are increased if one wants to include interim sacrifice periods, recovery animals, or satellite animals for biological sampling. In addition, when gavage dosing is initiated in young rodents, it can be wise to allow for replacement animals for deaths due to gavage accidents during the initial phase of the study.
The recovery or reversibility phase of a study follows the completion of the dosing period, and a number of the animals are retained from the control and one or more of the treated groups to assess any changes in the toxicologically relevant findings. It may not be necessary to add a recovery component to each of the treated groups. Sufficient information may be obtained from the examination of recovery animals in the high-dose group alone.
Nonetheless, it is important that recovery animals also be included in the control group for comparative purposes. The need for a recovery period may only be ascertained from the results of prior repeat-dose studies. For biological response modifiers, a period to assess recovery of normal physiological function is almost mandatory. For chemical entities, recovery periods are most often used to determine reversal of pharmacological effects rather than target organ toxicity. Clinical pathology and postmortem data at the end of the treatment period can give an indication of the end points that should be examined during the recovery phase. As stated earlier, the requirement for a recovery component increases the number of animals required in the study. For rodent studies, an additional five animals of each sex in each recovery group is normal. For studies using dogs or primates, one or two animals of each sex per group would suffice.
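The group sizes discussed above translate directly into the number of animals that must be ordered. The following Python sketch is illustrative only: the function name and the example design (four groups at 10/sex/group, with five recovery animals of each sex in the control and high-dose groups) are assumptions chosen to match the figures quoted in the text, not a prescribed design.

```python
def animals_per_sex(main_per_group, n_groups,
                    recovery_per_group=0, recovery_groups=0,
                    satellite_per_group=0, satellite_groups=0):
    """Animals of one sex needed for main, recovery, and satellite subsets."""
    return (main_per_group * n_groups
            + recovery_per_group * recovery_groups
            + satellite_per_group * satellite_groups)


# Hypothetical rodent subchronic design: control + 3 treated groups,
# 10/sex/group, recovery animals (5/sex) in control and high-dose groups only.
per_sex = animals_per_sex(10, 4, recovery_per_group=5, recovery_groups=2)
total = 2 * per_sex  # equal numbers of males and females
print(per_sex, total)  # 50 100
```

Replacement animals for gavage accidents, where allowed for, would be added on top of this total.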
3. Dosage Selection
Appropriate dosage selection is one of the hardest components of study design. In general, data gathered from acute single-dose studies will provide information
to select appropriate dosages for a subacute study; the subacute study will provide data to select appropriate dosages for a subchronic study; and finally, the subchronic data will aid in dosage selection for chronic studies. Key data points from each study that aid in dosage selection include the lowest observed effect level (LOEL) and the maximum tolerated dose (MTD).
To start the whole process, the current approach to acute range-finding studies uses a flexible protocol: either “up and down” for rodents or “ascending” for larger animals. This nomenclature refers to the regimen whereby the dose level is selected according to the signs of reaction at the previous dose level. The LOEL is identified as the lowest dosage producing the minimum adverse drug-related activities, while the MTD would be considered that dose of the agent that elicits clear evidence of toxicity, but allows the animal to survive without undue distress. It is necessary to avoid the use of severely toxic doses in order to prevent unnecessary suffering in animals and to eliminate confounding factors introduced by stressed or moribund animals.
The goal is to establish a quantitative relationship between toxicity in the animals and exposure to the test material. In the ideal study, the low-dose group is equal to or a small multiple of expected human exposures and is associated with no toxicity. The mid-dose group should be conducted at the expected LOEL. However, in practice, many investigators, particularly those in industry, prefer the mid-dose to be a no observed effect level (NOEL) for risk assessment purposes since the difference between the NOEL and the expected human exposure is considered the “margin of safety.” Finally, the high-dose group should show toxicity, but not enough to result in the death of more than 10% of the group. Many investigators target a 10% reduction in body weight gain as a marker of sufficient toxicity in the high-dose group.
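The two high-dose criteria just mentioned (about a 10% reduction in body weight gain, and no more than 10% mortality) can be checked with simple arithmetic. The sketch below is a minimal illustration; the function names, the default thresholds as parameters, and the example body weight gains are hypothetical and are not taken from any guideline.

```python
def mean(values):
    return sum(values) / len(values)


def gain_reduction_pct(control_gains_g, treated_gains_g):
    """Percent reduction in mean body weight gain relative to control."""
    control_mean = mean(control_gains_g)
    return 100.0 * (control_mean - mean(treated_gains_g)) / control_mean


def high_dose_summary(control_gains_g, treated_gains_g, deaths, group_size,
                      gain_target_pct=10.0, mortality_limit_pct=10.0):
    """Compare a high-dose group against the two rule-of-thumb criteria."""
    reduction = gain_reduction_pct(control_gains_g, treated_gains_g)
    mortality = 100.0 * deaths / group_size
    return {
        "gain_reduction_pct": reduction,
        "mortality_pct": mortality,
        "sufficient_toxicity": reduction >= gain_target_pct,
        "excessive_mortality": mortality > mortality_limit_pct,
    }


# Hypothetical body weight gains (g) over the dosing period.
summary = high_dose_summary([100, 110, 90, 100], [90, 95, 85, 90],
                            deaths=1, group_size=10)
print(summary)
```

A real evaluation would also weigh clinical signs and statistical significance; this only reproduces the percentage comparisons described in the text.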
Dosage selection is most accurate for risk assessment purposes when based on plasma exposure levels. If this is not possible, extrapolation is best calculated based on surface area (mg/m²) rather than just body weight (mg/kg).
Acute studies should investigate high-dose toxicity, and the subsequent subacute and subchronic studies should also yield some expression of toxic effect. However, there will be some instances in which a demonstration of toxic effect requires extremely high doses that are many orders of magnitude above the expected human exposure and may require quantities of test article far beyond what is reasonable or financially justifiable. In these cases, alternative factors should be considered. For example, the EPA already imposes a “limit” to the dose levels required when testing innocuous materials (2 g/kg body weight when dosing orally and 2 mg/l of atmosphere when dosing rats by the inhalation route). There may also be physical limitations to the highest dose, such as test material solubility or saturation of kinetics, that define the highest exposures possible.
Many pharmaceuticals, especially protein-based drugs, have no apparent high-dose limit of toxicity and therefore cannot be evaluated in the usual way (i.e., the high-dose level would represent the MTD). In this case, the highest dose is
usually justified on the basis of plasma levels, multiples of the intended maximum clinical dose, or on the establishment of the maximum pharmacological effect (provided that a sensitive species is identified). For any of these strategies, it is important that the regulatory agency is consulted prior to the start of the preclinical program.
Selection of dosage levels for carcinogenicity studies can also be very complicated and has been the subject of much debate [11-13]. Historically, the high-dose level has been set at the MTD, which is generally defined as the highest dosage that does not lead to clinical deterioration, influence longevity, or induce greater than a 10% reduction in body weight gain. However, it has been argued that tumorigenicity observed at an MTD may be a consequence of metabolic conditions that are not representative of lower-exposure conditions. The difficulty is differentiating between those test substances that are carcinogenic only at an MTD and those test substances with a carcinogenic response directly dependent on their toxicity. The selections of the mid- and low-dose levels are often dependent on the dose-response curve in the dose range-finding studies, identified no-effect levels, and pharmacokinetic factors. The international testing guidelines for pharmaceuticals (ICH S1C, available on the FDA website, http://www.fda.gov) contain an excellent review and guidance on dosage selection for carcinogenicity studies.
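When plasma exposure data are unavailable, the surface-area scaling mentioned earlier in this section is commonly approximated by multiplying the mg/kg dose by a species-specific factor (often called Km) to obtain mg/m². The sketch below assumes the rounded Km values widely tabulated in regulatory guidance on dose conversion (mouse 3, rat 6, monkey 12, dog 20, adult human 37). These values are not given in this chapter and should be checked against current guidance before use; the function names are hypothetical.

```python
# Assumed, rounded body-weight-to-surface-area factors (kg/m^2); not from this chapter.
KM = {"mouse": 3, "rat": 6, "monkey": 12, "dog": 20, "human": 37}


def mg_per_m2(dose_mg_per_kg, species):
    """Convert a body-weight dose (mg/kg) to a surface-area dose (mg/m^2)."""
    return dose_mg_per_kg * KM[species]


def equivalent_dose_mg_per_kg(dose_mg_per_kg, from_species, to_species):
    """mg/kg dose in `to_species` giving the same mg/m^2 as in `from_species`."""
    return mg_per_m2(dose_mg_per_kg, from_species) / KM[to_species]


rat_dose = 50  # mg/kg, hypothetical
print(mg_per_m2(rat_dose, "rat"))                                   # surface-area dose, mg/m^2
print(round(equivalent_dose_mg_per_kg(rat_dose, "rat", "dog"), 2))  # dog mg/kg at equal mg/m^2
```

On these assumptions, 50 mg/kg in the rat corresponds to 300 mg/m², which is matched in the dog by about 15 mg/kg, illustrating why equal mg/kg doses are not equal exposures across species of different size.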
4. Toxicokinetics
Toxicokinetic data are required in toxicology studies used in the development of pharmaceuticals. Such systemic exposure data are less often collected for toxicity studies used for other regulatory submissions. The relationship between exposure (plasma drug levels) and toxicity allows interpretation and extrapolation of the findings in different species, and relating these to potential or actual adverse effects.
The intravenous dose route represents 100% bioavailability. This reference point is used in bioavailability studies in pharmaceutical development and provides “worst-case” toxicity for exposure by other routes. For example, if dermal exposure is one of the expected routes of human exposure, then the intravenous route mimics the increased uptake potential for damaged or diseased skin. If inhalation exposure is expected, again the intravenous route can represent a worst-case scenario of diseased or damaged lungs. Such comparative data can be very useful for risk assessment analyses.
Blood drug analysis was initially included in toxicology studies merely to provide “proof of absorption.” Blood samples would be taken immediately after dosing to confirm the presence of test article and once again, immediately prior to the next dose, to assess potential for accumulation. It is now accepted that it is necessary to provide a much more complete picture of the behavior of the test
article in the body. On a 2-wk subacute study, multiple samples should be taken after the first dose and again after the final dose. Ideally, five or more samples are taken in the 24 hr after dosing to follow the absorption and elimination of the test article. Consideration has to be given to the route of administration and the likely speed of absorption, and hence the rapidity with which the maximal concentration (Cmax) is reached (see Chapter 2).
Toxicokinetics for rats or mice on dietary studies presents specific problems. Since rodents are nocturnal, they feed during the night and diurnal fluctuations in systemic levels of the test material can be expected. It is important to determine the magnitude of these variations to assess the overall systemic exposure.
Obtaining multiple blood samples on large animal studies is relatively simple. A 10-kg beagle has approximately 750 to 1000 ml of circulating blood, of which 15% can be safely withdrawn in a single day, in any 7-day period. Therefore, over 100 ml of blood can be taken for analysis. However, a 300-g rat has only 25 to 30 ml of circulating blood, and therefore only 2 to 3 ml of blood can be taken for analysis. In addition, repeat blood sampling is difficult in rats and almost impossible in mice without compromising the animals.
It is important that the blood sampling does not compromise the primary objective of the study, which is to assess the toxic potential of the test compound. Animals should not be lost prematurely during sampling and the condition of the animal should not be affected by the sampling or anesthetic procedures. For these reasons, satellite animals are often included in rodent toxicity studies. These animals are treated identically to the main study animals but no toxicology end points (clinical pathology, histopathology) are recorded. They are in the study solely to provide blood samples.
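The blood volume limits above are simple proportions of circulating volume. The following sketch makes the arithmetic explicit; the function name is hypothetical, and the withdrawable fraction is left as a parameter because the rat figures quoted in the text (2 to 3 ml of 25 to 30 ml) imply a fraction nearer 10% than the 15% quoted for the dog.

```python
def max_sample_volume_ml(circulating_blood_ml, fraction=0.15):
    """Volume that may be withdrawn, given an assumed safe fraction."""
    return circulating_blood_ml * fraction


# 10-kg beagle with 750 to 1000 ml of circulating blood, 15% limit from the text.
dog_low = max_sample_volume_ml(750)
dog_high = max_sample_volume_ml(1000)
print(round(dog_low, 1), round(dog_high, 1))

# 300-g rat with 25 to 30 ml; a 10% fraction is assumed here for illustration only.
rat_low = max_sample_volume_ml(25, fraction=0.10)
rat_high = max_sample_volume_ml(30, fraction=0.10)
print(round(rat_low, 1), round(rat_high, 1))
```

For the dog this gives roughly 112 to 150 ml, consistent with the “over 100 ml” stated above; the rat values come out at about 2.5 to 3 ml under the assumed 10% fraction.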
It is usually possible to obtain three blood samples from each rat: the first two samples from the orbital sinus under anesthetic, and the last taken from the abdominal aorta at euthanasia. An alternative method may be to catheterize the animals to obtain a greater number of serial samples. Mice are usually sacrificed at each blood sampling time point and sampled from the abdominal aorta. Blood drug levels in dogs and primates are more prone to individual variability than those in rodents, and the use of serial blood sampling for these animals is valuable. The standard group size of three or four animals of each sex provides sufficient values at each time point to evaluate group effects.
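Once serial concentrations are available, the basic exposure descriptors can be derived directly from the sampled points. The sketch below is a minimal noncompartmental illustration, assuming the linear trapezoidal rule for the area under the curve (AUC) between the first and last samples; the function name, time points, and concentrations are hypothetical, and no extrapolation beyond the sampled interval is attempted.

```python
def tk_summary(times_hr, conc):
    """Return (Cmax, Tmax, AUC) from paired sampling times and concentrations.

    AUC uses the linear trapezoidal rule over the sampled interval only.
    """
    if len(times_hr) != len(conc) or len(times_hr) < 2:
        raise ValueError("need at least two paired time/concentration values")
    cmax = max(conc)
    tmax = times_hr[conc.index(cmax)]  # time of the first occurrence of Cmax
    auc = 0.0
    for i in range(1, len(times_hr)):
        auc += (times_hr[i] - times_hr[i - 1]) * (conc[i] + conc[i - 1]) / 2.0
    return cmax, tmax, auc


# Hypothetical profile: six samples in the 24 hr after dosing (hr, ug/ml).
times = [0.5, 1, 2, 4, 8, 24]
levels = [2.0, 5.0, 4.0, 2.0, 1.0, 0.0]
print(tk_summary(times, levels))  # (5.0, 1, 26.25)
```

With only three samples per rat, as described above, such a profile is usually assembled from several satellite animals sampled at staggered times rather than from one individual.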
B. Parameters to Be Measured
1. Clinical Signs
The examination of animals for ill health and morbidity (referred to as clinical signs or clinical observations) is routinely performed in all toxicology studies.
This usually entails at least a twice-daily check for mortality or “overt” clinical signs (at the beginning and the end of the working day). It is not normally necessary to remove the animal from the cage for this examination, but it should be sufficiently detailed to identify animals that are in distress or are showing apparently treatment-related signs. This “AM/PM check” minimizes unnecessary loss of data due to autolysis of animals that have died or loss due to cannibalism in gang-housed animals. In addition, a more detailed examination is performed less often, usually at a stated time after dosing, and involves removing the animal from the cage and performing a full examination of the entire animal, including palpation for masses. In most cases, the examination can be performed by adequately trained technical staff. However, a veterinarian may be able to provide a more thorough assessment, especially in dogs and primates.
The observation of the animals before, during, and/or following the dosing procedures makes it more likely that transient effects, particularly pharmacologically related events, will be identified. It may also document findings related to the particular dosing procedure and help in the interpretation of findings at the end of the study. An attempt should be made to ascertain the onset of the adverse event as well as the recovery time, as appropriate.
2. Body Weights
Body weight measurements before and during a toxicology study are an integral part of all toxicology studies. The body weights are used in the allocation of animals to dose groups, pretreatment health screening, and the monitoring of the animal’s condition throughout the dosing period. Body weight values are necessary for the prediction of diet concentrations, for dose per kg calculations in capsule studies, and calculation of dose volumes for gavage or injection studies. Body weights should be recorded on a weekly basis at a minimum for studies up to 13 weeks. After 13 weeks, in longer-term chronic studies, body weights may be recorded less often. More frequent measurements may be necessary to monitor treatment-related effects or the health of a particular animal.
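As an illustration of how the most recent body weight feeds into a gavage dose volume, the following sketch computes the volume to administer from the dose level and the formulation concentration. The function name and the example values (a 250-g rat, 100 mg/kg, a 20 mg/ml formulation) are hypothetical.

```python
def dose_volume_ml(body_weight_g, dose_mg_per_kg, concentration_mg_per_ml):
    """Volume of formulation delivering the target dose to one animal."""
    body_weight_kg = body_weight_g / 1000.0
    return dose_mg_per_kg * body_weight_kg / concentration_mg_per_ml


# Hypothetical example: 250-g rat, 100 mg/kg, formulation at 20 mg/ml.
volume = dose_volume_ml(250, 100, 20)
print(round(volume, 2))  # 1.25 ml, i.e., 5 ml/kg
```

Because the result scales with body weight, volumes are normally recalculated each time the animals are reweighed.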
3. Food and Water Consumption
This should be monitored at least weekly in subacute and subchronic studies; in conjunction with body weight, it is a sensitive indicator of ill health or treatment-related effects. The measurement of food consumption can be complicated in rodent studies when animals are gang housed. Weekly food consumption values are divided by the number of individuals in the cage to estimate individual food consumption. However, this calculated number gives no indication of individual food consumption status. Thus, the values may need to be adjusted or excluded from the report when animals are moribund or die during any particular week.
Food intake data are also required in the calculation of test article concentrations in dietary studies. Food spillage should also be recorded (this is the amount that is discarded or dropped as the animal feeds and may be an indication of palatability problems) and taken into account when calculating dietary concentrations of test articles to avoid underdosing. Spillage may be measured or visually assessed from the food scattered on the tray paper under the cage.
Water consumption measurements are no longer standard on many toxicology studies. This measurement requires the filling and weighing of water bottles on a regular basis, and thus can be labor intensive in these days of available automatic watering systems. In rodent and dog studies in which the data are considered necessary, it is usually sufficient to measure intake for 1 wk every 6 to 8 wk. It can often be difficult to generate accurate data, as the animals frequently play with the drinking tubes, causing considerable spillage.
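The food consumption and dietary concentration calculations described above can be sketched as follows. The function names and example values are hypothetical; the per-animal figure is only a cage mean, as noted in the text, and spillage is subtracted so that intake is not overestimated (which would lead to a dietary concentration that is too low).

```python
def food_per_animal_per_day_g(offered_g, remaining_g, spilled_g, n_animals, days):
    """Mean daily food intake per animal for a gang-housed cage (cage mean only)."""
    consumed_g = offered_g - remaining_g - spilled_g
    return consumed_g / (n_animals * days)


def dietary_concentration_mg_per_kg_diet(target_dose_mg_per_kg_day,
                                         body_weight_kg, food_g_per_day):
    """Test article concentration in diet (mg/kg diet) for a target dose."""
    food_kg_per_day = food_g_per_day / 1000.0
    return target_dose_mg_per_kg_day * body_weight_kg / food_kg_per_day


# Hypothetical cage of 5 rats over 7 days: 1000 g offered, 230 g left, 70 g spilled.
intake = food_per_animal_per_day_g(1000, 230, 70, n_animals=5, days=7)
# Hypothetical target of 50 mg/kg/day for a 0.4-kg rat.
conc = dietary_concentration_mg_per_kg_diet(50, 0.4, intake)
print(round(intake, 1), round(conc))  # 20.0 g/animal/day, 1000 mg/kg diet
```

Ignoring the 70 g of spillage in this example would raise the apparent intake to 22 g/animal/day and lower the calculated concentration, so the animals would receive less than the target dose.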
4. Ophthalmoscopic Examination
Examination of the eye and adjacent structures for abnormalities or treatment-related effects by a trained ophthalmologist should be a routine assessment in multidose toxicology studies. The examination is performed in both the rodent and nonrodent, although the pigmented dog eye is generally considered a better model than the eye of the albino rat. This is of particular importance when testing substances known to bind to melanin. The examinations should be performed prior to treatment and again prior to termination of the study as a minimum. Interim examinations should be considered, especially if the compound may accumulate in the retina or affect the blood pressure.
The examination should include indirect and slit lamp examinations. All regions of the eye should be included (cornea, lens, retina, conjunctiva, sclera, iris, and fundus) [14]. Prior to examination, a mydriatic can be administered to allow visualization of the deeper structures without pupillary constriction.
5. Cardiovascular Assessments
Electrocardiography and measurement of blood pressure are generally confined to nonrodents but can, when necessary, be assessed during rodent studies as well. Similar to ophthalmic examination, this procedure should be performed prior to the study start to establish individual baselines and again before termination, with interim assessments if appropriate. It is important that measurements be performed at similar times of day and, ideally, prior to dosing, to avoid transient pharmacological effects, which can be assessed by additional measurements following dosing.
Each species has different requirements for the placement and even the type of electrode employed. In busy laboratory settings, electrocardiograms are often recorded using limb leads alone. Although additional information can be
obtained by the use of chest leads, the time involved in preparing the animal often makes it impractical. Systolic blood pressure can be measured indirectly by using a pediatric cuff and systemic blood pressure can be measured directly in dogs by an arterial catheter in the ear artery. It is not feasible to measure systemic pressure in rats in standard toxicity studies.
6. Clinical Pathology
All multidose toxicity studies should include hematology, blood biochemistry, and urinalysis at least once during the study, usually near the end of treatment. Studies longer than 13 wk also usually include one or two interim samples. In addition, prior to treatment, all nonrodents should be sampled and analyzed to ensure suitability for testing and to establish baseline values. For rodent studies, a subpopulation can be sampled prior to initiation of dosing as a general health screen.
As discussed in the section above on toxicokinetics, it is important that the blood sampling methodology and the volume of blood taken do not compromise the health status of the animals. Blood samples are usually obtained from the jugular vein or femoral vein of conscious dogs or primates, respectively. Rodents may be sampled by a variety of methods depending on the blood volumes required. During the study, rodents can be lightly anesthetized and sampled via the jugular vein or orbital sinus, and conscious rodents can be restrained and bled from the lateral tail vein. At termination of the study, rodents can be anesthetized and blood obtained from the abdominal aorta or by cardiac puncture. However, both the sampling methodology and fasting state of the animal need to be factored into interpretation of the data. For example, blood samples from tail cuts can give anomalous results due to the extrusion of extracellular fluids and damaged cell constituents into the samples. It is also important that the blood be sampled at approximately the same time of day and that the sampling be randomized among the treatment groups to avoid confounding changes in blood constituents due to circadian rhythms.
Urine samples are obtained passively by housing the animals in “metabolism cages” over a given period of time. Dog and primate urine can also be collected by catheterization of the urinary bladder.
Table 4 presents a list of clinical pathology parameters often measured in multidose studies (see also Chapter 6 for more details). Specific parameters for analysis may differ slightly according to the regulatory agency and type of compound being tested. These parameters can be considered a basic screen and additional analyses should be considered when investigating specific pharmacological effects or concerns.
Table 4 Standard Hematology, Clinical Chemistry, and Urinalysis Parameters
Hematology
  Hemoglobin
  Hematocrit or packed cell volume
  Red blood cell count
  Reticulocyte count
  Erythrocytic indices (MCV, MCH, MCHC, RDW)
  White blood cell count
  Differential white blood cell count
  Platelet count
  Mean platelet volume
  Prothrombin time
  Activated partial thromboplastin time
Clinical chemistry
  Urea nitrogen
  Creatinine
  Electrolytes (Na, K, Cl)
  Calcium, phosphorus
  Total CO2
  ALT
  AST
  GGT
  ALP
  LDH
  Total protein
  Albumin
  Globulin (calculated)
  Total bilirubin
  Cholesterol
  Triglycerides
  Glucose
  Bile acids
Urinalysis
  Color and turbidity
  Specific gravity
  Osmolality
  pH
  Protein
  Urobilinogen
  Bilirubin
  Electrolytes (Na, K, Cl)
  Nitrites
  Glucose
  Ketones
  Occult blood
  Creatinine
  Microscopic sediment
An analysis of the hematopoietic system provides an overall assessment of the circulating blood, focusing primarily on the erythrocytes, the leukocytes, and
clotting ability. Parameters associated with the status of the erythron include the erythrocyte (red blood cell) count, hematocrit, and hemoglobin concentration, complemented by the morphology assessment and the erythrocyte indices. For standard clinical pathology assessments, the total number of leukocytes is counted, a blood smear prepared, and a white cell differential count performed, including identification of any abnormal morphology. The leukocytes are categorized into lymphocytes, neutrophils (segmented and nonsegmented), eosinophils, basophils, and monocytes as a percentage of the total leukocyte count. Absolute values for the leukocytes should be calculated routinely to evaluate changes in cell numbers. In carcinogenicity studies, red and white blood cell counts are used primarily as a tool for identifying hematopoietic cancers (e.g., leukemia).
The blood clotting system is the third item of the hematopoietic system to be evaluated. A platelet count may be the only investigation possible when the relatively large blood volumes for clotting time assessments cannot be obtained. Any reduction in platelet numbers (thrombocytopenia) can have significant effects on the well-being of the animal. The total bleeding time is a means of assessing overall coagulation efficiency that can be applied to a range of species. It requires the production of a standardized incision and the monitoring of the
subsequent bleeding time. When samples can be taken into citrate anticoagulant, further investigation can be performed on activated partial thromboplastin times or prothrombin times, which assess the intrinsic and extrinsic coagulation pathways, respectively.
Blood chemistry analysis can be especially useful to help identify target organs of toxicity as well as the general health status of the animal. It is important that animals be fasted prior to sampling to prevent dietary influences on glucose levels and, to a lesser extent, potassium or phosphorus concentrations. In addition, an assessment of the degree of hemolysis in the sample should be made, as potassium concentration and lactate dehydrogenase activity may be affected and assays employing colorimetric end points, such as bilirubin, may be compromised.
Urinalysis is easy to perform but interpretation is fraught with difficulty due to the crude procedures often used in collection. Possible contamination of the samples with blood, hair, dust, bacteria, or food should be taken into account as well.
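Two of the derived hematology values discussed in this section, absolute leukocyte counts and the erythrocyte indices, follow from standard formulas. The sketch below assumes conventional reporting units (WBC in 10³/µl, RBC in 10⁶/µl, hemoglobin in g/dl, hematocrit in %); the function names and example values are hypothetical, and RDW is omitted because it is derived by the analyzer from the cell size distribution rather than from these inputs.

```python
def absolute_counts(wbc_total, differential_pct):
    """Convert a differential (percent of leukocytes) to absolute counts."""
    if abs(sum(differential_pct.values()) - 100.0) > 0.5:
        raise ValueError("differential percentages should sum to about 100")
    return {cell: wbc_total * pct / 100.0 for cell, pct in differential_pct.items()}


def erythrocyte_indices(hgb_g_dl, hct_pct, rbc_million_per_ul):
    """Return MCV (fl), MCH (pg), and MCHC (g/dl) from the measured values."""
    mcv = hct_pct * 10.0 / rbc_million_per_ul
    mch = hgb_g_dl * 10.0 / rbc_million_per_ul
    mchc = hgb_g_dl * 100.0 / hct_pct
    return {"MCV": mcv, "MCH": mch, "MCHC": mchc}


# Hypothetical rat sample.
diff = {"lymphocytes": 70, "neutrophils": 25, "monocytes": 3,
        "eosinophils": 2, "basophils": 0}
counts = absolute_counts(10.0, diff)
indices = erythrocyte_indices(hgb_g_dl=15.0, hct_pct=45.0, rbc_million_per_ul=7.5)
print({k: round(v, 2) for k, v in counts.items()})
print({k: round(v, 1) for k, v in indices.items()})
```

Absolute counts matter because a change in one cell type alters the percentages of all the others, even when their numbers are unchanged.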
7. Postlife Observations: Necropsy and Pathology
On completion of the dosing and/or recovery period, the animals are euthanized and a postmortem (necropsy) examination performed. The primary purpose is to obtain tissue samples for histopathological examination. Significant information can also be obtained from gross observations and organ weights. If animals are killed before the scheduled study termination or die unexpectedly, a postmortem examination should be performed and tissues retained. Organ weights usually are not recorded for animals that die during study due to inherent inaccuracies induced by autolysis.
A number of methods can be used for euthanasia, including lethal chemical injection, asphyxiation, cervical dislocation, and decapitation. The selection of a particular method can be dependent on the type of postlife parameters needed. For example, an injectable anesthetic is more suitable for inhalation studies so that there is no interference in lung histopathology assessment. The order of necropsy should be randomized to minimize variations in the data related to individuals or the time of sacrifice. Immediately after euthanasia, any blood samples and/or urine samples should be taken for clinical pathology and a bone marrow smear prepared. For animals killed prematurely, blood samples should be taken whenever possible as they may provide valuable information for the interpretation of the cause of death. However, such data are not included in the statistical analyses for the treatment group. Following euthanasia and fluid sampling, the animals are usually exsanguinated, which helps reduce the variability in subsequent organ weights.
A proper necropsy should include a gross evaluation of all major organ systems. Examination should also include an external evaluation of body condition, such as staining or hair loss, and should correlate with any in-life findings. Even in short-term studies, the animals should be palpated for subcutaneous masses. All findings should be recorded in a precise, descriptive, and consistent manner, indicating location, size, shape, color, and type of change or lesion. It is important that descriptive terms rather than diagnostic or interpretive terms be included in the raw data. Any possible artifacts induced by the method of euthanasia or method of prosection (i.e., incision, cutting, removal, and trimming of tissues) should be noted.
A full list of tissues together with any macroscopic abnormalities should be placed into fixative for future processing and examination. The list of tissues retained on completion of toxicity studies varies according to species and regulatory authority. Table 5 presents a generic list that satisfies the majority of cases.
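One way to randomize the order of necropsy while keeping the groups evenly spread across the working day is block randomization: each block contains one animal from every group, in random order. This is an assumed scheme offered for illustration, not a method prescribed by the chapter; the function name, group labels, and animal identifiers are hypothetical.

```python
import random


def necropsy_order(groups, seed=None):
    """Return a list of (group, animal_id) in block-randomized order.

    groups: dict mapping group name -> list of animal IDs.
    Each block holds one randomly chosen animal per group, shuffled.
    """
    rng = random.Random(seed)
    # Shuffle animals within each group without modifying the caller's lists.
    shuffled = {g: rng.sample(ids, len(ids)) for g, ids in groups.items()}
    order = []
    for i in range(max(len(ids) for ids in shuffled.values())):
        block = [(g, ids[i]) for g, ids in shuffled.items() if i < len(ids)]
        rng.shuffle(block)
        order.extend(block)
    return order


# Hypothetical dog study: 3 animals per group, one sex shown.
study = {
    "control": ["C1", "C2", "C3"],
    "low": ["L1", "L2", "L3"],
    "mid": ["M1", "M2", "M3"],
    "high": ["H1", "H2", "H3"],
}
for group, animal in necropsy_order(study, seed=1):
    print(group, animal)
```

Recording the seed with the study data allows the order to be reproduced later; a simple shuffle of all animals would also be random but could, by chance, cluster one group at the end of the day.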
Table 5 Tissues for Histopathological Processing and Examination
  Abnormalities observed at necropsy
  Animal identification (not processed)
  Adrenals
  Aorta (thoracic)
  Bone and marrow (sternum)
  Brain (cerebrum, cerebellum, midbrain and medulla oblongata)
  Cecum
  Cervix
  Colon
  Duodenum
  Epididymides
  Esophagus
  Eyes
  Harderian glands
  Heart (including section of aorta)
  Ileum
  Jejunum
  Kidneys
  Lacrimal glands
  Liver (sample of 2 lobes)
  Lungs (all lobes)
  Lymph nodes (mandibular and mesenteric)
  Mammary gland (inguinal)
  Optic nerves
  Ovaries
  Pancreas
  Pituitary
  Prostate
  Rectum
  Salivary gland
  Sciatic nerve
  Seminal vesicles
  Skeletal muscle
  Skin
  Spinal cord
  Spleen
  Stomach
  Testes
  Thymus
  Thyroid lobes (and parathyroids)
  Tongue
  Trachea
  Urinary bladder
  Uterine horns
  Uterus
  Vagina
(a) Organ Weights. Some regulatory guidelines require that key organs be weighed during the necropsy of each animal in standard multidose toxicology studies. Regardless of requirements, most laboratories make this a routine part
of their necropsy procedures since organ weight changes can be very sensitive indicators in many toxic events [15]. However, variability in organ weights of rodents in studies of longer than 1 yr indicates little scientific rationale for recording organ weights in these studies [16]. The organs most commonly weighed include the liver, lungs, kidneys, spleen, and gonads, as well as suspected or known target organs.
In order to minimize spurious variation in organ weights, it is important to use consistent and careful dissection techniques. Common confounding factors include remaining tissue or fat adhering to the organ, entrapped blood, and variations in cutting (e.g., where the spinal cord is cut when weighing the brain). Careful technique is also necessary for ensuring the integrity and quality of the tissue for subsequent histopathological evaluation. Other experimental factors that can influence organ weights and that often are overlooked include local environmental conditions affecting organ hydration, the method of euthanasia, the extent of time between death and necropsy, the fasting state of the animal, and exsanguination state.
Most organ weights are recorded to the nearest 1 mg. Smaller organs, such as the thyroid, adrenal, and pituitary, are weighed to the nearest 0.1 mg. An evaluation of the accuracy of organ weights revealed that reporting too many significant figures tends to limit their interpretive usefulness, whereas recording the weights with too few figures can also limit their usefulness [16,17].
(b) Histopathology. The list of tissues that should be examined varies according to species and the regulatory agency. The site from which some tissue samples are taken can be critical to accurate interpretation of histopathology data. For example, it is important to select the same site for bone sampling in all animals within a particular study since it is known that even minor differences can influence histomorphological variables. For dog or primate studies, tissues from all groups are usually processed and examined (this corresponds to approximately 1500 tissues). Due to the larger number of rodents per group, the initial examination normally comprises all tissues from the control and high-dose groups and any animals that died prematurely, plus any abnormal tissues from the intermediate and low-dose groups. If any treatment-related changes are identified in the high-dose group, these target organs in the low and intermediate groups or recovery animals should then be examined. A valid histopathological evaluation should always include a peer review process for any study conducted for regulatory submission. This involves a check of a percentage of the diagnoses in a particular study by a second pathologist. The proper fixation, slide preparation, and staining of individual organs or tissues are critical to accurate interpretation of any histopathological finding. The majority of tissues are fixed in neutral buffered 10% formalin. The lungs should be infused with fixative after weighing. Tissues such as eyes, testes, and epididymides, which possess a thick capsule and are preserved whole, require a faster-penetrating fixative to avoid degradation during the fixation process. Commonly used fixatives for these organs include Bouin's, Davidson's, and Zenker's solutions. Tissues are routinely prepared for histopathological examination by embedding in paraffin wax, sectioning onto slides, and staining. Hematoxylin and eosin stain is used routinely for tissues from toxicity studies and requires considerable experience to produce consistently stained sections. The hematoxylin serves as a powerful nuclear stain (blue) and the counterstain (or secondary stain) eosin differentiates the cytoplasm into various shades of pink. In addition to the routine staining, specialized staining methods can provide help in discerning specific compound-related changes in particular organs. Table 6 contains a list of some of the more specialized staining techniques and what they are used to identify.

Table 6 Specialized Tissue Staining Methods

Stain                           Used to identify
Congo Red                       Amyloid
Ziehl-Neelsen                   Acid fast bacteria
Brown-Brenn                     Gram + and - bacteria
Giemsa                          Blood cells
Periodic acid-Schiff (PAS)      Carbohydrates/glycogen
Verhoeff's                      Elastin
van Gieson's                    Collagen
Oil red O                       Fat deposits
Toluidine blue                  Mast cells
Gridley's                       Fungi
Gomori's method                 Iron deposits
Gomori's stain                  Pancreatic islet cells
Gomori's silver impregnation    Reticulum
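The tiered examination scheme described above for rodent studies can be sketched as a small rule set. This is only an illustration: the group labels, the `Animal` record, and the function names are hypothetical and do not come from any guideline or laboratory system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Animal:
    animal_id: str
    group: str  # hypothetical labels: "control", "low", "mid", "high"
    died_prematurely: bool = False


def initial_exam_scope(animal):
    """First-pass scope: all tissues for control, high-dose, and premature
    deaths; only abnormal tissues for the remaining groups."""
    if animal.group in ("control", "high") or animal.died_prematurely:
        return "all tissues"
    return "abnormal tissues only"


def follow_up_organs(high_dose_findings):
    """Organs with treatment-related changes at the high dose are then
    examined in the low, intermediate, and recovery animals."""
    return sorted(set(high_dose_findings))


print(initial_exam_scope(Animal("R101", "low")))
print(follow_up_organs(["liver", "kidney", "liver"]))
```

Keeping the first-pass rule separate from the follow-up rule mirrors the two-step logic in the text: the second step is triggered only by high-dose findings.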
(c) Electron Microscopy. Electron microscopy, which allows evaluation of tissues at the cellular level, can be used to aid in the interpretation of any histopathological findings. Although not routinely used, consideration should be given to its use for at least the major target organs, including the liver and kidneys.
III. DATA REPORTING, ANALYSIS, INTERPRETATION, AND RISK ASSESSMENT
A. Documentation and Reporting
A study report written for submission to regulatory agencies must contain, at a minimum: the title and purpose of the study, the facility performing the experiment, the key personnel involved in the study, and details of the timing of various study activities. All data collected should be included in the report in the form of individual data tables and summary data tables with the results of any statistical analyses. The bulk of the report text is divided between a description of the study design and the experimental methods used and the presentation and interpretation of the data. The experimental methods section is usually a reflection of the study protocol. All protocol amendments and deviations from the original protocol should be identified. Standard subacute or subchronic toxicity tests should present detailed results and discussion sections as well as an outline of the study in the form of a summary. The summary or “study abstract” should be an independent document that provides critical details of the methods and results. It should include only those findings considered related to treatment, and it is important that the conclusion reflects the purpose of the study as defined in the protocol.
B. Data Interpretation and Risk Assessment
One of the most common reactions encountered when presenting toxicology data to “non-experts” is panic. They react to the word “toxic” as if it is a lethal virus that is sure to kill a new project or product. The most important concept that the reader needs to understand here is that EVERYTHING is toxic under the right conditions and at high enough concentrations. The primary purpose of conducting animal toxicology studies is not to prove that an agent is nontoxic. Its purpose is to evaluate the particular form of toxicity that a specific compound manifests (i.e., targets of toxicity) and to ascertain the “margin of safety” of a compound (i.e., the difference between the expected exposure level under real-life conditions and the exposure levels inducing significant levels of toxicity). The definitions of four distinct characteristics describing the toxicity of a compound are important to understand. The toxic potential is the actual ability of a chemical to disrupt normal physiology, morphology, and/or function. The toxic potency refers to the dosage and frequency of exposure at which the toxicity manifests itself. It has also been used to describe the severity and incidence rate of a particular toxic effect. The toxic hazard is a description of the potential for danger (i.e., the steepness of the dose-response curve or, for instance, the distance between a therapeutic and a toxic dosage level). The toxic risk describes the likelihood of a toxic effect at expected (real-life) exposure levels and conditions. Thus, a compound that has a high or dangerous hazard profile may pose no risk to humans if exposure levels are low and the toxic potency is low. Key dosage levels identified in toxicology studies include the “no observed effect level” (NOEL) or “no observed adverse effect level” (NOAEL), the “lowest observed effect level” (LOEL) or “lowest observed adverse effect level” (LOAEL), and the “maximum tolerated dose” (MTD).
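As a minimal sketch, the margin of safety can be written as a ratio of an animal reference level (for example a NOAEL) to the expected real-life exposure. Expressing the “difference” as a ratio is an assumption made here for illustration, and the function name and all numbers are hypothetical. Both values must be in the same units.

```python
def margin_of_safety(reference_level, expected_exposure):
    """Ratio of a toxicological reference level (e.g., NOAEL) to the
    expected exposure. Both arguments must share the same units."""
    if expected_exposure <= 0:
        raise ValueError("expected exposure must be positive")
    return reference_level / expected_exposure


# Hypothetical values, for illustration only (mg/kg/day).
noael = 50.0
expected_human_exposure = 0.5
print(margin_of_safety(noael, expected_human_exposure))  # 100.0
```

A ratio computed on administered dose can differ markedly from one computed on measured systemic exposure, so the basis of the comparison should always be stated alongside the number.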
Using the LOEL and/or NOEL, various types of margins can be calculated that estimate expected safe exposure levels. The size of the “safety margin” can vary according to the route of administration, the toxicokinetics of the test compound, and the biological activity of the test compound in each species. Extrapolation of animal data to human risk assessment can use a variety of measures of exposure including the administered dosage (mg/kg/day), dosage corrected for surface area differences (mg/m²), and actual measured systemic exposure (plasma area under the curve [AUC] and Cmax) [18-20]. The particular calculation method for estimating safety margins, safe exposures, or considering risk is dependent upon whether the test compound is a pharmaceutical, food additive, agricultural chemical, industrial chemical, or environmental pollutant [21-25]. Most toxicological effects exhibit a normal bell-shaped (Gaussian) curve for dosage vs. effect. The slope of the curve can give an indication of some of the intrinsic characteristics of a test compound. Two test compounds can have the same NOEL but different slopes, indicating quite different characteristics. For example, a steep curve may indicate rapid onset of action or faster absorption. When the slope is relatively flat, a larger margin of safety can be anticipated, since a large increase in dosage would be expected to produce only small increases in adverse effects. The dose-response slope is often used for extrapolation to very low and even no observable effect levels. The use of statistical tests on data from studies with group sizes of at least 10 animals is routine and a valuable tool in data interpretation [26]. Statistical analysis almost always uses comparison to concurrent control data and/or positive control data. Numerous different methods of statistical analyses can be used. It is common to follow a decision tree with the data first assessed for homogeneity of variance followed by an appropriate analysis of variance such as in the example shown below:

Bartlett's test for homogeneity of variance
    Nonsignificant: (parametric) ANOVA; if significant, Dunnett's test
    Significant: (nonparametric) Kruskal-Wallis test; if significant, Dunn's test
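The routing logic of this decision tree can be sketched as follows. The sketch deliberately does not compute any statistics: the p-values are assumed to come from a statistics package, and the function name, the `omnibus_p` callback, and the 0.05 threshold are illustrative assumptions rather than part of any guideline.

```python
def analysis_plan(bartlett_p, omnibus_p, alpha=0.05):
    """Return the ordered list of tests to run.

    bartlett_p: p-value from Bartlett's test (computed elsewhere).
    omnibus_p:  callable taking the chosen omnibus test name and
                returning its p-value (computed elsewhere).
    """
    steps = ["Bartlett's test"]
    if bartlett_p < alpha:
        # Variances differ: take the nonparametric branch.
        omnibus, follow_up = "Kruskal-Wallis test", "Dunn's test"
    else:
        # Variances homogeneous: take the parametric branch.
        omnibus, follow_up = "ANOVA", "Dunnett's test"
    steps.append(omnibus)
    if omnibus_p(omnibus) < alpha:
        # Only a significant omnibus result triggers group comparisons.
        steps.append(follow_up)
    return steps


# Hypothetical p-values, for illustration only.
p_values = {"ANOVA": 0.01, "Kruskal-Wallis test": 0.20}
print(analysis_plan(0.30, p_values.get))
```

Passing the omnibus p-value as a callback reflects the order of operations in the tree: which omnibus test is run is only known after the variance check.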
While statistical analysis is an effective way of assessing large amounts of data, it is just as important to consider the biological significance of the effect along with an understanding of the limitations of the study design. The biological, toxicological, and statistical significance of a particular effect are often interpreted differently. The use of compiled data from the control groups of previous
studies (historical control data) can be useful when trying to ascertain whether any particular data point is within expected biological variation. The presence or lack of a dose response can also be a useful tool in deciding whether an effect is test material related. Generally, if an effect occurs in several test groups without a dose relationship, the same effect is usually found in the control group as well. This is a strong indication that the effect is not caused by the test material. Trend analysis is useful in evaluating dose responses. For nonrodent species, the use of statistical analysis is usually not appropriate, due to the small group sizes. Therefore, findings in nonrodent studies tend to be viewed on an individual basis rather than relying on group results. Most often the test values obtained during the treatment period are compared with individual pretreatment values or the laboratory background data. In this way, the individual animal becomes its own control. Baseline data are particularly useful when assessing the results of hematological and clinical chemistry data. As much weight is given to comparison of clinical data from a dog during treatment with that of the same animal before treatment started, as is given to comparing the results of treated animals with the controls. This approach is more difficult for the rat, mainly because it is not practical to take pretreatment samples for the large number of animals in the study. In rodent studies, it is more common to give more weight to group findings than to individual findings, since it is possible to apply appropriate statistical methods to large-group-size rodent data. For many end points, the number of affected animals vs. the number of unaffected animals in each group can also be a measure of severity. For analysis of tumor rates, carcinogenicity risk assessment guidelines generally require adjustments for intercurrent mortality in the study.
More recently, adjustments for rodent body weight are also under consideration.
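For the nonrodent case described above, where each animal serves as its own control, a minimal sketch of the two comparisons might look like this. The function names and all values are hypothetical, and using the observed minimum and maximum of historical control values as the “expected” range is a simplifying assumption.

```python
def percent_change_from_baseline(pretreatment, on_treatment):
    """Change in a parameter relative to the same animal's
    pretreatment value, in percent."""
    if pretreatment == 0:
        raise ValueError("pretreatment value must be nonzero")
    return 100.0 * (on_treatment - pretreatment) / pretreatment


def within_historical_range(value, historical_controls):
    """True if the value lies inside the observed range of
    historical control values (simple min/max check)."""
    return min(historical_controls) <= value <= max(historical_controls)


# Hypothetical enzyme activity (U/L) for one dog, for illustration only.
baseline, week_4 = 40.0, 60.0
historical = [25.0, 38.0, 52.0, 61.0]
print(percent_change_from_baseline(baseline, week_4))  # 50.0
print(within_historical_range(week_4, historical))     # True
```

The example shows why both views are useful: a value can rise substantially relative to the animal's own baseline and still fall inside the laboratory's historical range.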
1. Target Organs
The general toxicology of a test agent is usually discussed in terms of identified target organs. Target organs or systems are identified using all of the measured parameters on the study (i.e., clinical chemistry, electrocardiography, ophthalmological examination, organ weights, and histopathology). In a review of published toxicology literature, Heywood [15] estimated that, in most cases, target organ toxicity was generally seen at 5 to 6 times the NOEL in the rat, dog, and monkey. In addition, he found that the NOEL in rodents was usually threefold higher than in the dog and monkey. Most target organ toxicity was identified by 13 weeks of treatment. The most likely additional target organ to manifest itself after this duration of exposure was the eye. He also warned that extrapolation of safety across species should be based on the absence of toxic signs rather than on the demonstrated target organ toxicity since there was less than a 20% correlation in target organs across species. Some correlation between dog and human target organs was established for the gastrointestinal tract, urinary tract, central nervous system, and skin. Correlation between nonhuman primates and humans was found for the gastrointestinal tract, hematopoietic system, liver, central nervous system, and skin. Effects on a particular organ may be a primary effect of the agent or a secondary effect induced by changes elsewhere in the body. For example, an agent that injured the liver resulting in altered hepatic steroid metabolism may have secondary effects in the reproductive organs, due to subsequent changes in circulating hormones. Various confounding factors must also be considered when evaluating possible target organs. The vehicle used in the study may have adverse effects of its own that need to be separated from effects caused by the test compound. A good example of this is the vehicle cyclodextrin, which can cause renal lesions. This is one reason why it is important to include a vehicle control group in a study. Other method-related effects should also be considered. If the animal is put under too much stress in the study, this can result in increased cortisol levels leading to acute involution of the thymus and other lymphoid tissues. Lesions can be caused by the method of administration rather than the test agent, especially in repeat-dose studies. Species differences in organ morphology and weight are also important to consider. For example, the kidney weight relative to body weight is greater in rodents (0.65% of body weight) than in the dog (0.5%) or monkey (0.42%) and could have an influence on species susceptibility to renal toxins. Some of the most common target organs are discussed below. The reproductive, nervous, immune, and respiratory systems as target organs are not included as they are covered in Chapters 5 through 9.
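The relative kidney weights quoted above are organ weight expressed as a percentage of body weight. A minimal sketch of that calculation follows; the function name and the example weights are hypothetical and were chosen only to reproduce the rodent figure from the text.

```python
def relative_organ_weight(organ_weight_g, body_weight_g):
    """Organ weight as a percentage of body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return 100.0 * organ_weight_g / body_weight_g


# Hypothetical rat: 300 g body weight, 1.95 g combined kidney weight.
print(round(relative_organ_weight(1.95, 300.0), 2))
```

Because the denominator is body weight, any treatment-related change in body weight will shift relative organ weights even when the absolute organ weight is unchanged.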
(a) Liver. The liver system (including the gallbladder and common bile duct) is the most common target organ for chemical toxicity [27]. This is due to the fact that most chemicals are metabolized in the liver before being eliminated from the body, often through the bile. The centrilobular cells are generally the most susceptible due to the fact that they have higher metabolic rates and fewer detoxification enzymes than periportal cells. Numerous different mechanisms of action have been shown to induce hepatotoxicity. The type of liver injury is often dependent upon not only the particular agent and its mechanism of action but also on the length of exposure. Subacute exposures are usually associated with such findings as single cell necrosis, lipid accumulation, or cholestasis. Fibrosis and cirrhotic changes usually require longer-term chronic exposures. In addition to morphological changes, possible early signs of liver injury can be identified by changes in clinical chemistry. Hepatocellular injury is associated with increases in the enzymes alanine aminotransferase (ALT), aspartate aminotransferase (AST), and sorbitol dehydrogenase (SDH) (Table 7). Hepatobiliary injury is associated with increases in the enzymes γ-glutamyltransferase (GGT) and alkaline phosphatase (ALP), as well as changes in bilirubin, blood urea nitrogen (BUN), and bile acids (Table 7). The incidence of hepatic injury can vary among species and can be reversible in some cases. In addition, such injury is not always apparent from the dose-response relationship. Standard toxicology study designs are not able to identify agents associated with immunotoxic or idiosyncratic hepatic injury, due to the small number of animals. This is a large problem, since such injury can occur in man, and methods are being developed to help in the identification of such agents. The liver is one of the most common sites for chemical-induced tumor formation in rodent carcinogenicity studies, while this is relatively rare in humans. The majority of test substances that have been identified as hepatocellular carcinogens in rodents are not considered to be hepatocellular carcinogens in humans [28-30].

Table 7 Chemistry Changes in Blood and Urine Associated with Specific Organ Toxicity

Hepatocellular injury
    End points: Increased ALT, AST, and SDH
    Comments: SDH has greater predictive value than ALT; accuracy increased when both ALT and SDH showed changes. Hepatic ALT in dogs is 5 times that of other tissues. Proportion of AST isozymes (cytosolic vs. mitochondrial) may indicate extent of injury.
Hepatobiliary injury
    End points: Increased GGT, ALP, total bilirubin, and bile acids (urinary bilirubin); decreased amylase
    Comments: GGT is very good marker of cholestasis. ALP in rat liver is low, limiting its predictability. Food intake in rat affects proportion of intestinal ALP in plasma. Bile acid changes confounded by relation to food intake. Increased urinary bilirubin may occur in dog before changes in plasma total bilirubin.
Renal injury
    End points (tubular): Increased serum ALP and glucose (transient); increased urinary glucose, proteins, BUN, and creatinine
    End points (papillary): Increased urinary NAG, proteins
    Comments: Increased urinary protein usually concomitant with reductions in serum albumin.
Myocardial cell death
    End points: Increased AST, CK, and LDH with no change in ALT
Congestive heart failure
    End points: Increased AST, ALT, and LDH with no change in CK
Muscle necrosis
    End points: Increased AST, CK, and LDH with no change in ALT
Pancreatitis
    End points: Changes in plasma lipids, increased calcium; increased amylase levels
    Comments: In severe injury, feces may contain high levels of undigested fat.
Lung embolism
    End points: Increase in LDH with minimal increase in CK and no change in AST or ALT
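Several of the serum enzyme patterns in Table 7 can be encoded as a simple lookup that lists which injuries are consistent with an observed profile. This is a sketch only: the data structure and function name are hypothetical, only the rows defined by the enzymes shown are included, and a match is a screening hint rather than a diagnosis.

```python
# Serum enzyme patterns taken from Table 7: (must be increased, must be unchanged).
PATTERNS = {
    "Hepatocellular injury": ({"ALT", "AST", "SDH"}, set()),
    "Myocardial cell death": ({"AST", "CK", "LDH"}, {"ALT"}),
    "Congestive heart failure": ({"AST", "ALT", "LDH"}, {"CK"}),
    "Muscle necrosis": ({"AST", "CK", "LDH"}, {"ALT"}),
}


def candidate_injuries(increased, unchanged):
    """Return the Table 7 rows whose required increases and required
    'no change' enzymes are all present in the observed profile."""
    matches = []
    for injury, (need_up, need_flat) in PATTERNS.items():
        if need_up <= set(increased) and need_flat <= set(unchanged):
            matches.append(injury)
    return matches


# Hypothetical profile: AST, CK, and LDH increased; ALT unchanged.
print(candidate_injuries({"AST", "CK", "LDH"}, {"ALT"}))
```

As printed, the table gives the same serum pattern for myocardial cell death and muscle necrosis, so the lookup returns both and the distinction has to come from other findings.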
(b) Kidney. The kidney is the second most common target of chemical injury [31]. This is due to the fact that many chemicals are excreted through the urine. In addition, the kidney has an extensive blood supply and the glomeruli have a large surface area available for exposure. The kidney's ability to concentrate solutes and substances also enhances its susceptibility to chemical injury. The identification of the potential to cause renal injury in animal studies is of great importance since, in humans, acute nephrotoxicity usually presents minimal early function changes and the progression to chronic renal failure is silent. As with hepatotoxicity, there are numerous mechanisms of action causing nephrotoxicity. One of the most common sites of chemical injury is the proximal tubules, since most blood flow is delivered to this area of the kidney, which functions to concentrate and reabsorb solutes. Tubular necrosis and tubular hyper- or hypoplasia are common findings. Other sites of chemical injury can include the loop of Henle and the glomeruli. Clinical chemistry changes in the blood and urine that can signal tubular damage include increases in serum alkaline phosphatase (ALP) and glucose, and urinary glucose, proteins, BUN, and creatinine (see Table 7). Injury to renal papillary tissues can be accompanied by increases in urinary N-acetylglucosaminidase (NAG) and proteins (see Table 7). Some findings in the kidney are not applicable to human risk assessment. For example, α2u-globulin nephropathy is an important species-specific toxicological syndrome that occurs in male rats following exposure to a number of industrial and environmental chemicals [32]. In female rats, mineral deposits at the corticomedullary junction are relatively common and attributed to reduced tubular phosphorus resorption compared with males. This finding is also not considered to predict similar changes in humans.
Tumor incidence in the kidney is nearly always accompanied by increases in the incidence or severity of nonneoplastic lesions indicative of kidney toxicity [33].
(c) Cardiovascular System. Histopathological changes in heart tissue due to chemical exposure in standard toxicology studies are relatively rare. When present, injuries are usually manifested in the myocardium as degenerative lesions and, with chronic exposure, fibrosis [34]. Prolonged insult can result in cardiac hypertrophy. Chemicals can also target the blood vessels, producing degenerative and/or inflammatory lesions. Clinical chemistry changes that have been associated with injury to cardiac tissues include increases in serum enzymes AST, ALT, lactate dehydrogenase (LDH), and creatine kinase (CK) (see Table 7). In most cases, cardiotoxicity is manifested as acute, transient, functional responses. If the animal survives, the effect is usually reversible. These functional responses include bradycardia, tachycardia, and various forms of arrhythmia. In pharmacological safety testing, such cardiovascular changes are generally considered to be exaggerated pharmacological effects rather than toxicological effects. Agents that induce such functional changes in the heart are described as having chronotropic effects (act on heart rate), dromotropic effects (act on conductivity), bathmotropic effects (act on excitability), or inotropic effects (act on contractility). Cardiovascular effects of chemicals can also be manifested as hypotension, hypertension, hemorrhage, thrombosis, and embolism. Nutritional status, electrolyte imbalance, anemia, and thyroid dysfunction can be secondary factors causing cardiac dysfunction.
(d) Adrenal Gland. The adrenal cortex is one of the most vulnerable of the endocrine glands to chemical-induced injury [35,36]. Its susceptibility is due in part to the lipophilic nature of the organ, which can promote deposition and accumulation of hydrophobic xenobiotics. The gland also has a good blood supply and relatively high concentrations of metabolizing enzymes, making it more likely to be exposed to toxic intermediates. Alterations in adrenal gland function can have numerous secondary effects in the body, including changes in metabolism, electrolyte balance, cardiac function, and steroidogenesis. An important consideration when extrapolating adrenal findings in the rat to risk in man is that such lesions in the rat are usually not hyperfunctional (i.e., do not produce symptoms of hypertension). However, in humans this is usually the case.
(e) Hematopoietic System. The hematopoietic system is often overlooked as a possible target for chemical injury [37]. This complex system includes the bone marrow, circulating blood cells, spleen, lymph nodes, and reticuloendothelial tissue in various organs. The bone marrow, which is responsible for the production of all blood components, is particularly vulnerable to cytotoxic agents
due to the rapid cell division within the marrow. Many chemicals can injure the system without damaging the marrow by such mechanisms as oxidative hemolysis within the circulation and immunotoxic reactions with blood components. Effects on the hematopoietic system can be manifested as a reduction in the number of all formed elements (pancytopenia), reductions in circulating red cells (anemia), white cells (granulocytopenia), or thrombocytes (thrombocytopenia), or reductions in individual components (e.g., leukopenia, lymphocytopenia).
(f) Gastrointestinal Tract. There are very few agents that specifically target the gastrointestinal tract [38]. However, components of this system, especially the intestines, are particularly vulnerable to cytotoxic agents due to the high rate of cell division. Stomach lesions, such as ulcerations and inflammations, are also relatively common in studies using oral gavage administration with agents that have an irritant effect. Disturbances of the GI tract are easily observed and usually manifested as vomiting, excessive salivation, or changes in feces (soft stool, diarrhea, absence of stool, bloody stool). These effects will often result in changes in blood and urine electrolytes due to fluid loss. Prolonged or extreme fluid loss will also affect the hematocrit, plasma proteins, and urine osmolality.
(g) Eye. The eye contains numerous targets for chemical damage, due to its complexity and wide range of cell types, and is considered a highly sensitive organ system [39]. Chemical-induced lesions can occur in many structures of the eye with systemic exposure. The most common targets include the retina, cornea, and lens. Chemical compounds with a high affinity for melanin are often associated with some level of retinopathy. Cataracts are also fairly common, but their significance needs to be considered in relation to the age of the animal.
2. Interpretation of Specific Study Parameters
(a) Clinical Signs. Clinical signs are most often thought of as a gross indication of an animal's general health. These findings are often overlooked as an important component in identifying target organ effects and more care needs to be taken when reviewing these data. Changes such as cyanosis, flushing, and weakness are often the first indication of cardiotoxic effects. Piloerection of the fur and/or lacrimation (tearing of the eyes) can be indicative of disturbances of the autonomic nervous system. Discharges from the nostrils can be a sign of pulmonary edema. Changes in the normal glossy coat can be indicative of effects on sebaceous glands. In some instances, clinical signs can help identify very specific targets. For example, agents adversely affecting the cellular division of skin cells have been shown to cause abnormally high incidences of ulceration and inflammation in areas subjected to minor trauma of everyday life, such as the feet and tail.
(b) Body Weight and Food Consumption. Body weight and food consumption data are considered gross indicators of general systemic toxicity. Body weight is most often evaluated as changes in body weight gain since most toxicology studies are conducted with young adult animals, which are still in their growth phase. Reduction in body weight may be directly related to reductions in food consumption, but not in all cases. In studies in which the test material is admixed with diet, it is important not only to monitor dietary intake to calculate exposure to the test material, but also to document possible palatability problems. Evaluation of body weights in rodent studies can use group mean values supplemented by a statistical assessment. In larger toxicology studies, body weight and food consumption changes will most often demonstrate a dose response. Preliminary interpretation of body weight effects on studies using dogs or primates can also be performed using the group mean data, but should include a review of the individual data as interindividual variation is more common in nonrodents. It is often helpful to calculate weekly and total body weight gains or changes and compare the treated and control groups. This is especially helpful in large animal studies showing greater individual variation in body weights at the start of treatment.
(c) Water Consumption. Water consumption is not considered a standard end point in general toxicology studies at this point in time. This is primarily due to the widespread use of automatic watering systems. If the test material is administered in the drinking water, consumption should be recorded at regular intervals to assess actual test material exposures.
(d) Ophthalmological Examination. A trained veterinary ophthalmologist may be necessary for proper analysis of the data.
Ophthalmological findings in treated animals should be compared with possible findings from the preexposure examination for that same individual before concluding that there is a chemical effect. Microscopic examination can also confirm the findings seen with gross examination. In many instances, the interpretation of findings in some structures of the eye in terms of human risk assessment may be unclear because of the species specificity of many ocular structures.
(e) Cardiovascular Assessment. Properly employed and analyzed, electrocardiograms (ECGs) can be one of the most sensitive indicators of cardiovascular dysfunction. There are marked differences among species and between sedated and nonsedated animals. The proper interpretation of ECGs usually requires a trained veterinary cardiologist. Computer-assisted analysis programs are available that function in data extraction and calculation of the necessary quantitative data. However, a significant amount of information from ECGs is qualitative rather than quantitative, such as pattern recognition, and requires extensive experience in cardiac pathophysiology to identify [40,41].
While blood pressure and heart rate measures are noninvasive, do not require sedation, and are easy to perform, they can be subject to significant short-term variability, greatly reducing both their sensitivity and reliability. Any interpretation of these data needs to keep such considerations in mind. Electrolyte balance and nutritional status can confound data interpretation as well.
(f) Hematology. The most common hematology change across species in safety testing is reduction in red blood cells (anemia). Increases in reticulocytes (reticulocytosis) and, in severe cases, erythroblasts (erythroblastemia) are an indication of accelerated production in the bone marrow following injury or hypoxic conditions. Splenic enlargement is another sign of hematopoietic injury, although spleen weight in dogs is not considered a reliable indicator of blood cellularity. Increases in blood components can also be a manifestation of injury, such as an increase in red blood elements (polycythemia), which is often interpreted as a regenerative overcompensation to injury. Identification of chemical injury to the hematopoietic system is complicated by the constant cell turnover and production of new cellular components. Each cell type has a particular life span within the system, and species differences in these life spans further complicate risk assessment. In most instances, the sequence of cells disappearing from the peripheral blood will be consistent with known transit time and the origin of the cell. For example, in rodents, neutrophils (originating in the bone marrow with a life span of 10 hr) are the first to decrease in the event of myeloid injury. On the other hand, lymphocytes are less affected by bone marrow injury because they predominantly originate in lymphoid tissues. Reductions in thrombocytes after bone marrow injury generally take much longer to appear due to their relatively long life span in the circulation.
In addition, there are many factors to consider when interpreting changes in blood elements. Mild exercise, stress, inflammation, nutritional status, anorexia, and hemorrhage can all influence blood components. The amount and timing of blood sampling and fasting can also have a marked impact on reticulocyte count and hemoconcentration.
(g) Clinical Chemistry. When interpreting clinical chemistry data the distinction between a chemically induced change and natural variation needs to include consideration of the species, strain, gender, age, study conditions, time of sampling (cyclic biorhythms), diet and water intake, stress, and the route of administration of the test compound. Several environmental factors, including caging density, lighting, room temperature, humidity, and even cage bedding, can influence study results. Circadian variations for some parameters, such as hormones and blood glucose levels, can be dramatic and need to be factored in to any data interpretation. One of the most easily controlled factors is diet. Fasting prior to sample collection can produce marked differences in plasma enzymes, urea, creatinine, and glucose.
Biochemical changes associated with stress (caging, restraint, handling, and invasive measurements) include changes in plasma corticosterone/cortisol, catecholamines, and electrolytes.
Electrolytes. Many confounding factors need to be taken into account when interpreting changes in plasma electrolyte levels. Excessive stress and use of restraining procedures during blood collection can markedly affect potassium, calcium, and magnesium levels. In rodents, the blood collection site and anesthetic method can also influence electrolyte values. In many instances, the electrolyte changes are secondary to adverse effects on renal function, dehydration, and/or anorexia. Changes in electrolytes, especially magnesium and calcium, may in turn also cause secondary effects, such as increased cardiac tissue sensitivity, arrhythmias, or significant changes in vascular permeability. Changes in plasma anion concentrations (chloride, bicarbonate, and inorganic phosphate) are of lesser significance for cardiac function.
Sodium. Hypernatremia (excess sodium) or hyponatremia (deficient sodium) may be observed in cardiac failure, depending on the volemic (i.e., hydration) state of the animal.
Calcium and magnesium. When considering any possible effects that increases in calcium and magnesium may have caused in cardiac function, the concentration not bound to proteins (free concentration) is more important than total plasma concentration. Depending on the species, approximately 40% of calcium and 30% of magnesium is bound. Hypercalcemia may be masked if there is a concomitant reduction of plasma albumin, as often seen in renal injury.
Osmolality and acid-base balance. Plasma osmolality and acid-base measurements generally have limited use in standard toxicology studies. While there are several formulas for the calculation of osmolality from plasma concentrations of sodium, urea, and glucose that can be used with human blood samples, these formulas have limited applicability with other species, due to the variability of these components. In addition, these parameters should be interpreted with caution for small laboratory animals because of a high variability due to blood collection procedures. Plasma osmolality changes can be indicative, for example, of cardiac conditions such as congestive heart failure, which can be manifested as an increase in plasma sodium levels and extracellular fluid volume with evidence of hyponatremia.
Enzymes. When tissues are injured, damaged cells leak intracellular enzymes into the systemic circulation, and these enzymes may also be found in the urine. Changes in specific enzyme levels are one of the most common markers of target organ toxicity used
in general toxicology studies. Commonly measured enzymes include aspartate aminotransferase (AST, formerly SGOT), alanine aminotransferase (ALT, formerly SGPT), alkaline phosphatase (ALP), γ-glutamyltransferase (GGT), glutamate dehydrogenase (GLD), lactate dehydrogenase (LDH), sorbitol dehydrogenase (SDH), creatine kinase (CK, formerly CPK), and N-acetylglucosaminidase (NAG). These enzymes are differentiated not only by the cells they come from but also by where they are normally located in the cell (i.e., brush border membrane, lysosomes, mitochondria, cytoplasm). Some examples of specific changes or patterns in enzyme levels, as well as other clinical chemistry end points reported in the literature to be associated with organ toxicity, are summarized above in Table 7.
As with other end points, certain methodological considerations need to be taken into account when interpreting any changes. Many investigators do not realize that these enzymes have a relatively short half-life in the blood as well as differing rates of clearance. In addition, different dosages of an agent can cause damage and subsequent enzyme release at different time intervals. Thus, the presence of enzymes in the blood can often be a hit-or-miss proposition, and the absence of marker enzymes should not be considered an indication that organ damage has not occurred. Also to be considered is that chemical-induced organ dysfunction that is not associated with morphological damage to the organ will not be detected by the presence of serum enzymes, since these enzymes are released only by dead or dying cells. Finally, some enzyme changes may be due to enzyme induction rather than morphological injury, such as the increase in ALP that occurs with glucocorticosteroids. The identification of toxicity-related increases in enzyme levels can be confounded by the normal presence of these enzymes in blood components.
For example, both CK and LDH should be measured in plasma rather than serum, due to the relatively high concentrations of these enzymes in platelets. Blood samples with visible signs of hemolysis should not be used, again because of extraneous enzymes from the various damaged blood cells. Damaged tissue at the site of blood collection can also lead to erroneous enzyme levels.
Creatine kinase (CK). Creatine kinase is a cytosolic enzyme with three major isoenzymes: CK-MM is the “muscle type” isoenzyme, CK-MB is the “myocardial type” isoenzyme, and the third dimer, CK-BB, is the “brain type” isoenzyme. Confounding factors include administration route, study conditions, and animal age. Intramuscular injections cause increased plasma CK. Creatine kinase values may also be affected by stress and severe exercise. The age of an animal also affects plasma CK, with levels generally being higher in younger animals. Changes in CK are observed with muscular, cardiovascular, and pulmonary injuries.
Lactate dehydrogenase (LDH). Lactate dehydrogenase is a cytosolic enzyme with five major isoenzymes. LDH is generally ubiquitous in tissues, with large variations in normal levels among and within species. Because of the broad range of normal plasma LDH levels in laboratory animals, significant changes in LDH levels are often difficult to detect and to interpret. This enzyme has a greater predictive value in humans, in whom there is less variability. When necessary, electrophoretic identification and quantification of the various isoenzymes can aid in interpretation. However, some drugs have been shown to modify the electrophoretic mobility of some LDH isoenzymes.
Aspartate aminotransferase (AST, formerly SGOT) and alanine aminotransferase (ALT, formerly SGPT). Each of these enzymes has two isoenzymes (cytosolic and mitochondrial). These enzymes are often altered following hepatic and myocardial damage. They are not tissue specific; however, in many laboratory animals, cardiac AST levels are higher than those in most other major tissues, whereas cardiac tissue ALT levels usually vary among species. In the rat, mouse, and dog, ALT levels are highest in liver. In primates, hepatic and cardiac levels are similar.
Alkaline phosphatase (ALP). There are two major isoenzymes, osseous and intestinal. The high variability in laboratory animals compared with humans limits the predictive value in some species. In young animals, the osseous ALP predominates; in older animals, intestinal ALP is highest.
γ-Glutamyltransferase (GGT). An increased level of GGT is a good marker for chemically induced cholestasis.
Amylase. Serum levels of this enzyme are generally elevated when pancreatitis or renal insufficiency is present. Reduction in amylase levels can be indicative of hepatobiliary toxicity.
Lipids. Adverse effects on lipid metabolism can be manifested as changes in plasma cholesterol, triglycerides, plasma lipoproteins, total lipids, phospholipids, apolipoproteins, and nonesterified fatty acids. The plasma lipid pattern can vary with animal age, sex, diet, and period of food withdrawal prior to sample collection. There are both qualitative and quantitative differences in lipid metabolism among the most commonly used laboratory animals. This is due to differences in rates and routes of absorption, synthesis, metabolism, and excretion. In rat, mouse, rabbit, guinea pig, ferret, and dog, the major plasma lipoprotein classes are the high-density lipoproteins (HDL). In contrast, the major classes in primates, including humans, are low-density lipoproteins (LDL). This factor has a significant impact on interpretation and extrapolation of lipid data from common toxicology test species to human risk.
Glucose. Changes in glucose concentrations are most often associated with renal injury. However, stress can cause marked elevations in plasma glucose levels as well. In addition, nutritional status can have a marked influence on glucose levels. Neonates are more susceptible than adults to fasting-induced (anorexic) hypoglycemia due to their relatively small store of glycogen.
Urea nitrogen and creatinine. Increased BUN, often accompanied by increased protein, can be indicative of renal injury, but is not considered the most sensitive indicator (i.e., renal function can be reduced by 50% before BUN levels are changed). On the other hand, increased creatinine levels are considered a much more sensitive indicator of renal injury. In animals, BUN levels are often reduced with severe liver injury and under overhydration conditions.
(h) Urinalysis. Urinalysis is the most useful noninvasive measure of kidney function. The conditions under which samples are collected are critical and must be considered when interpreting data. Reduced water consumption as well as severe emesis or diarrhea will affect urine output.
Volume and osmolality. If carefully collected over a specific period of time, the volume of urine may be useful for fluid balance assessments (e.g., diuresis, dehydration). Osmolality is an indicator of the ability of the kidney to concentrate urine.
pH. The pH is meaningless unless urine is collected directly from the bladder. This is due to the quick dissipation of dissolved carbon dioxide (CO2) after urination. Fasting, ketosis, and/or tubular dysfunction can increase the acidity of urine.
Electrolytes. Electrolyte levels in the urine are highly dependent on food intake and other extrarenal factors. They are not sensitive indicators of organ toxicity.
Ketones. The presence of increased ketones in the urine indicates disturbance in carbohydrate metabolism.
Glucose. Elevated glucose levels are either the result of increased blood glucose, which can be confirmed in the clinical chemistry testing, or an indication of damage to the proximal tubules. In tubular injury, changes in glucose can be a sensitive and early marker of injury.
Protein. Increased total urinary protein may indicate renal injury or extrarenal hemorrhage or inflammation. Increased proteins may also be seen with urinary tract infections. If renal injury is suspected, the ratio of high- to low-molecular-weight proteins may differentiate tubular from glomerular injury. A concomitant drop in albumin concentrations in the plasma may occur in cases of marked increases in urinary proteins.
Sediment. The sediment is examined microscopically for the presence of erythrocytes, leukocytes, renal epithelial cells, bladder cells, crystals, and spermatozoa. The presence of erythrocytes is indicative of hemoglobinuria, hematuria, or hepatic porphyria. Leukocytes may indicate renal or bladder bacterial infection. Crystalluria may be due to pH-dependent precipitation of urates or phosphates or may reflect high urinary concentrations of the test material and/or a metabolite. Urinary spermatozoa indicate male ejaculation dysfunction (retrograde ejaculation).
(i) Organ Weights. Organ weight data are commonly analyzed in relationship to body weight and brain weight. In many cases, changes in organ weights are reflective of reductions in body weight rather than direct effects of the test material. Liver, testes, adrenal, and thyroid weight changes are typical with suppression of body weight gain of 20% or more in rodents [15,42]. The relative weights of the brain, kidneys, heart, spleen, pituitary, and prostate are not as highly influenced by reductions in body weight in rodents. For dogs, it has been shown that marked weight gain suppression is associated with increased liver weight, variable effects on thyroid and adrenal glands, and reductions in gonad weights. In monkeys, reductions in body weight are associated with lower liver, kidney, and testes weights. Even in studies in which there are marked reductions in body weights, direct test material effects on organ weights cannot be dismissed. For example, testicular weight very often has a good correlation with testicular toxicity even in anorexic animals. The uterus and ovaries present a particularly difficult interpretive problem since their weights are normally highly variable as a consequence of cyclical reproductive functions. Increases or decreases in organ weights can be associated with test material-induced changes in function and/or morphology of an organ, including disturbances in phospholipid metabolism, induction of enzymes, cellular necrosis, hypo- and hyperplasia, and hypersecretion.
(j) Gross and Microscopic Pathology. The proper identification of pathological damage and interpretation of its biological significance in toxicology studies requires extensive training in veterinary pathology. It is not possible in the context of this chapter to delve deeply into this subject or any specific organ pathology. There are many excellent publications for those needing more detailed information [43-46].
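Returning to the organ weight data discussed under (i), the two normalizations mentioned there (organ weight relative to body weight and relative to brain weight) can be sketched as follows. This is only an illustrative sketch: the function name, the dictionary keys, and the animal weights are hypothetical and are not taken from any study or guideline.

```python
def relative_organ_weights(organ_g, body_g, brain_g):
    """Express one organ weight relative to body weight (%) and to brain weight (ratio)."""
    if body_g <= 0 or brain_g <= 0:
        raise ValueError("body and brain weights must be positive")
    return {
        "organ_to_body_pct": 100.0 * organ_g / body_g,
        "organ_to_brain_ratio": organ_g / brain_g,
    }


# Hypothetical values (grams): a control rat and a treated rat whose
# body weight is 20% lower while brain weight is unchanged.
control = relative_organ_weights(organ_g=12.0, body_g=400.0, brain_g=2.0)
treated = relative_organ_weights(organ_g=10.0, body_g=320.0, brain_g=2.0)

print(control)
print(treated)
```

In this made-up case the absolute organ weight and the organ-to-brain ratio both fall in the treated animal, while the organ-to-body percentage rises slightly, which is why the two normalizations are usually examined together when body weight is suppressed.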
There are often numerous descriptions of “nonlesions” in gross necropsy reports that can mask true test material-related changes. This can become a particular problem in studies in which many animals died during the study. Necropsy reports on found dead animals will contain descriptions of congestion in numerous organs that is simply a reflection of blood settling after death. Similarly, in found dead animals, accumulations of gas in the intestinal tract due to autolysis
should not be reported as “dilation of intestine.” Thus, many morphological findings in animals that died during the study will not be related to test material treatment. In general, lesions are evaluated in terms of their cause. The injury can be spontaneous, related to postmortem events, a direct effect of the test material, or a secondary effect of other test material-related changes. The evaluation should utilize comparison to findings in the concurrent control group and, as necessary, historical incidence rates in that strain and species. The incidence of spontaneous lesions increases with the age of the animal and can make interpretation of chronic studies particularly difficult. The functional significance of a pathological change can be quite different depending on the etiology of the lesion.
Toxicology hazard and risk assessment relies on group and cross-study comparison. Thus, it is important for the pathologist to use standardized classifications. Care must be taken that the same lesions are not described in such different ways that they cannot be related to one another in the final analysis. This is a particular problem when studies on a test material are conducted in more than one laboratory. In addition, deciding the significance of a particular finding can be difficult in situations in which given grades of severity (e.g., mild, moderate, severe) are not predefined and used in a consistent manner by the study pathologist. Subtle lesions are always the most difficult to interpret with regard to significance to human risk assessment.
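The comparison of lesion incidence with the concurrent control group can be illustrated with a simple calculation. The sketch below uses a one-sided Fisher exact test, which is a choice made here for illustration and is not a method prescribed by this chapter; the function name and the animal counts are hypothetical. It ignores differences in survival between groups and makes no use of historical control rates.

```python
from math import comb


def fisher_exact_one_sided(affected_treated, n_treated, affected_control, n_control):
    """Probability of seeing at least `affected_treated` affected animals in the
    treated group if treatment had no effect (hypergeometric upper tail)."""
    total_affected = affected_treated + affected_control
    total_animals = n_treated + n_control
    all_splits = comb(total_animals, n_treated)
    most_possible = min(total_affected, n_treated)
    favourable = sum(
        comb(total_affected, k) * comb(total_animals - total_affected, n_treated - k)
        for k in range(affected_treated, most_possible + 1)
    )
    return favourable / all_splits


# Hypothetical counts: 5 of 50 treated animals and 0 of 50 concurrent controls
# show the lesion.
p_value = fisher_exact_one_sided(5, 50, 0, 50)
print(round(p_value, 4))  # about 0.028 for these made-up counts
```

A small probability only says that the difference from the concurrent controls is unlikely to be chance; whether the lesion is spontaneous, postmortem, or treatment related remains a pathology judgment.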
3. Carcinogenic Risk Assessment
Survival is a key end point in defining the acceptability of a carcinogenicity study and in interpreting tumor incidence rates. When a dose-related increase in mortality is observed in a study, the lack of tumor development in the affected dose group(s) must be interpreted with much consideration. This occurrence of increased mortality lessens the number of animals available for evaluation, the length of their exposure, and the latency period needed for tumor development. The strain of rat and mouse also needs to be taken into account when interpreting results of carcinogenicity tests, as strains differ in both their susceptibility to tumor induction and their background tumor incidence rates.
Many changes have occurred in carcinogenic risk assessment in the last decade. A clearer understanding of the mechanisms of toxicity leading to tumor formation and the differentiation between those agents inducing carcinogenicity by genotoxic vs. nongenotoxic mechanisms has changed the way in which toxicologists and regulators view test agents identified as carcinogenic in animal studies [33,47-54]. A weight of evidence assessment now almost always includes both a probable mode of action and an effect level, which provides more accurate discrimination of risk as it applies to humans. Auxiliary information, such as mutagenicity data and pharmacokinetic data, is also considered. Flags
for possible carcinogenic action include increased incidence of tumors in multiple experiments and/or dose-responsive increase in the incidence of a tumor to an unusual degree with respect to site, type, and/or latency. Tumor induction in animals by a nongenotoxic mechanism is often species and sex specific [55]. The occurrence of a low incidence of a “rare” tumor does not automatically classify a compound as a carcinogen. Likewise, the induction of benign tumors only is also generally not considered sufficient for a conclusion of carcinogenicity.
Increases in tumorigenicity in animal studies have been shown to be species specific in some cases and are not always associated with increased risk in humans. For example, the occurrence of bladder tumors in the presence of urinary calculi and the occurrence of Leydig cell tumors in rodent carcinogenicity studies are generally believed to be irrelevant to human risk. Furthermore, the occurrence of a carcinogenic effect in multiple organs in a study has been suggested to be the manifestation of too high a dose level rather than a true carcinogenic response and thus has little predictive value for human exposure conditions [56].
IV. SUMMARY AND FUTURE TRENDS
The design of multidose toxicology studies has become standardized over the years for both rodents and nonrodents. These studies have an extremely broad responsibility for identifying target organs and NOEL/LOEL values using a small number of study parameters. The regulatory agencies are now asking for mechanistic information on test compounds to aid in risk assessment. In response to this, additional end points are often being added to the standard multidose studies. These can include end points such as cell division rates within an organ, measurement of hormone concentrations, and functional organ testing. Such end points, while added on a case-by-case basis, will continue to grow in popularity.
Investigations still continue into possible in vitro methodologies to replace in vivo animal studies. While these methodologies have been used extensively as screening tests and tools to study mechanisms of action, none has been found acceptable as a replacement in safety testing to date.
REFERENCES
1. G.A. Boorman, R.A. Maronpot and S.L. Eustis. Rodent carcinogenicity bioassay: Past, present and future. Toxicol. Rev. 12:5034, 1995.
2. E.E. McConnell. Historical review of the rodent bioassay and future directions. Regul. Toxicol. Pharmacol. 21:38, 1995.
3. C.S. Weil and D.D. McCollister. Relationship between short- and long-term feeding studies in designing an effective toxicity test. J. Agri. Food Chem. 11:486, 1963.
4. L.M. Appelman and V.J. Feron. Significance of the dog as “second animal species” in toxicity testing for establishing the lowest “no-toxic-effect level.” J. Appl. Toxicol. 6:271, 1986.
5. C.E. Lumley and S.R. Walker. The value of chronic animal toxicology studies of pharmaceutical compounds: A retrospective analysis. Fundam. Appl. Toxicol. 5:1007, 1985.
6. C.E. Lumley and S.R. Walker. A critical appraisal of the duration of chronic animal toxicity studies. Regul. Toxicol. Pharmacol. 6:66, 1986.
7. L.H. Speid, C.E. Lumley, S.R. Walker and D.K. Luscombe. How useful are 12-month toxicity tests in dogs? Toxicologist 10:143, 1990.
8. C.E. Lumley, C. Parkinson and S.R. Walker. An international appraisal of the minimum duration of chronic animal toxicity studies. Hum. Exp. Toxicol. 10:155, 1992.
9. G.G. Long and J.T. Symanowski. Appropriate parameters to be tested in rodent oncogenicity studies. Toxicol. Pathol. 26:319, 1998.
10. K.P. Keenan. The uncontrolled variable in risk assessment: Ad libitum overfed rodents-fat, facts and fiction. Toxicol. Pathol. 24:376, 1996.
11. E.E. McConnell. The maximum tolerated dose: The debate. J. Am. Coll. Toxicol. 8:1115, 1989.
12. C.J. Carr and A.C. Kolbye, Jr. A critique of the use of the maximum tolerated dose in bioassays to assess cancer risks from chemicals. Regul. Toxicol. Pharmacol. 14:78, 1991.
13. T.A.S. Davis and A. Monroe. The case for an upper dose limit of 1000 mg/kg in rodent carcinogenicity test. Cancer Lett. 95:69, 1995.
14. M.H. Kuiper, T. Boeve, M.W. Jansen, J. Roelofs-van Emden, J.W.G.M. Thuring and M.V.W. Wijnands. Ophthalmologic examination in systemic toxicity studies: An overview. Lab. Anim. 31:177, 1996.
15. R. Heywood. Target organ toxicity. Toxicol. Lett. 8:349, 1981.
16. G.G. Long, J.T. Symanowski and K. Roback. Precision in data acquisition and reporting of organ weights in rats and mice. Toxicol. Pathol. 26:316, 1998.
17. G.J. Carr and J.K. Maurer. Invited Commentary. Precision of organ and body weight data: Additional perspective. Toxicol. Pathol. 26:321, 1998.
18. I.W.F. Davidson, J.C. Parker, and R.P. Beliles. Biological basis for extrapolating across mammalian species. Regul. Toxicol. Pharmacol. 6:211, 1986.
19. H.W. Ruelius. Extrapolation from animals to man: Predictions, pitfalls and perspectives. Xenobiotica 17:255, 1987.
20. S.L. Brown, S.M. Brent, M. Gough, et al. Review of interspecies risk comparisons. Regul. Toxicol. Pharmacol. 8:191, 1988.
21. M.L. Dourson, S.P. Felter, and D. Robinson. Evolution of science-based uncertainty factors in noncancer risk assessment. Regul. Toxicol. Pharmacol. 24:108, 1996.
22. H.J. Kramer, W.A. Van Der Ham, W. Slob, and M.N. Pieters. Conversion factors estimating indicative chronic no-observed-adverse-effect levels from short-term toxicity data. Regul. Toxicol. Pharmacol. 23:249, 1996.
23. G. Atherley. A critical review of time-weighted average as an index of exposure and dose of its key elements. Am. Ind. Hyg. Assoc. J. 46:481, 1985.
24. M. Sharratt. Assessing risk from data on other exposure routes. Regul. Toxicol. Pharmacol. 8:399, 1988.
25. F.C. Lu. A review of the acceptable daily intakes of pesticides assessed by WHO. Regul. Toxicol. Pharmacol. 21:352, 1995.
26. S.C. Gad and C.S. Weil. Statistics and Experimental Design for Toxicologists. Telford Press, Caldwell, New Jersey, 1986.
27. G.L. Plaa and W.R. Hewitt. Toxicology of the Liver. Raven Press, New York, 1982.
28. R.W. Moch, P.N. Dua and F.A. Hines. Problems in consideration of rodent hepatocarcinogenesis for regulatory purposes. Toxicol. Pathol. 24:138, 1996.
29. J.M. Ward, M.A. Shibata and D.E. Devor. Emerging issues in mouse liver carcinogenesis. Toxicol. Pathol. 24:129, 1996.
30. Y. Dragan, J. Klaunig, R. Maronpot and T. Goldsworthy. Forum: Mechanisms of susceptibility to mouse liver carcinogenesis. Toxicol. Sci. 41:3, 1998.
31. J.B. Hook and R.S. Goldstein. Toxicology of the Kidney. Raven Press, New York, 1993.
32. J.A. Swenberg. α2u-Globulin nephropathy: Review of the cellular and molecular mechanisms involved and their implications for human risk assessment. Environ. Health Perspect. 101:39, 1993.
33. D.G. Hoel, J.K. Haseman, M.D. Hogan, J. Huff and E.E. McConnell. The impact of toxicity on carcinogenicity studies: Implications for risk assessment. Carcinogenesis 9:2045, 1988.
34. D. Acosta Jr. Cardiovascular Toxicology. Raven Press, New York, 1992.
35. E.S. Ribelin and M.T. Mosley. Effects of drugs and chemicals upon the structure of the adrenal gland. Fundam. Appl. Toxicol. 4:105, 1984.
36. H.D. Colby. Adrenal gland toxicity: Chemically induced dysfunction. J. Am. Coll. Toxicol. 7:45, 1988.
37. R.D. Irons. Toxicology of the Blood and Bone Marrow. Raven Press, New York, 1985.
38. R. Rosman and O. Hanninen. Gastrointestinal Toxicology. Elsevier, Amsterdam, 1986.
39. W.M. Grant. Toxicology of the Eye. Charles C Thomas, Springfield, Illinois, 1974.
40. R. Hamlin. Extracting “more” from cardiopulmonary studies on beagle dogs. In: The Canine as a Biomedical Model (M.R. Gilman, ed.), Am. Coll. Toxicol. and LRE, Bethesda, MD, p. 9, 1985.
41. J.D. Doherty and S.M. Cobbe. Electrophysiological changes in animal model of chronic cardiac failure. Cardiovasc. Res. 24:309, 1990.
42. K. Scharer. The effect of chronic underfeeding on organ weights of rats. Toxicology 7:45, 1977.
43. M.G. Farrow. Unique aspects of GLP pathology. J. Am. Coll. Toxicol. 6:389, 1987.
44. F.J.C. Roe. Toxicity testing: Some principles and some pitfalls in histopathologic evaluation. Hum. Toxicol. 7:405, 1988.
45. Z. Ruben and B.M. Wagner. Correlations between morphologic and functional changes induced by xenobiotics: Is every induced change a sign of toxicity? Toxicol. Appl. Pharmacol. 97:4, 1989.
46. W.M. Haschek and C.G. Rouseaux. Handbook of Toxicologic Pathology. Academic Press, New York, 1991.
47. D.B. Clayson. The need for biological risk assessment in reaching decisions about carcinogens. Mutat. Res. 185:243, 1987.
48. G.A. Gastel and T.R. Sutter. Biologically bounded risk assessment for receptor-mediated nongenotoxic carcinogens. Regul. Toxicol. Pharmacol. 2273, 1995.
49. G.B. Gori. Science, imaginable risk, and public policy: Anatomy of a mirage. Regul. Toxicol. Pharmacol. 23:304, 1996.
50. I.F.H. Purchase and T.R. Auton. Thresholds in chemical carcinogenesis. Regul. Toxicol. Pharmacol. 22:199, 1995.
51. U.S. FDA. U.S. Food and Drug Administration Advisory Committee on Protocols for Safety Evaluation: Panel on carcinogenesis report on cancer testing in the safety evaluation of food additives and pesticides. Toxicol. Appl. Pharmacol. 20:419, 1995.
52. U.S. EPA. U.S. Environmental Protection Agency proposed guidelines for carcinogen risk assessment. Fed. Register 61:17960, 1996.
53. J. Wiltse and V.L. Dellarco. The U.S. Environmental Protection Agency guidelines for carcinogenic risk assessment: Past and future. Mutat. Res. 365:3, 1996.
54. R.L. Melnick, M.C. Kohn and C.J. Porter. Implications for risk assessment of suggested nongenotoxic mechanisms of chemical carcinogens. Environ. Health Perspect. 104:123, 1996.
55. J. Ashby. Prediction of non-genotoxic carcinogenesis. Toxicol. Lett. 64/65:605, 1992.
56. J.M.M. Meijers, G.M.H. Swaen and L.J.N. Bloemen. The predictive value of animal data in human cancer risk assessment. Regul. Toxicol. Pharmacol. 25:94, 1997.
Metabolism and Toxicokinetics
J. Caroline English
Eastman Kodak Company, Rochester, New York
I. INTRODUCTION
A. Definitions and Scope
Metabolism and toxicokinetic studies constitute a part of the overall toxicological evaluation of chemicals. Studies are designed to obtain information on the fate of the compound in the organism, and so they focus more on the behavior and less on the effect of the compound. The information developed permits a more complete understanding of the relationship between chemical exposure and toxicity. The focus of this chapter is toxicokinetic study design, data analysis, and data interpretation as applied to the understanding of health effects, with particular reference to regulatory guidelines.
The term “toxicokinetics” has emerged to describe the generation of pharmacokinetic data in toxicity studies, or in separate studies patterned after toxicity study design. The overarching goal is to evaluate internal (i.e., systemic) exposure to a compound, and relate that exposure to toxicity. Toxicokinetics encompasses the extent and rate of uptake of a compound into the bloodstream and lymphatic system (absorption), its conversion to new entities (biotransformation or metabolism), distribution within the body, and excretion of the compound and its biotransformation products. Toxicity is often associated with metabolic conversion of the compound to a more toxic metabolite or reactive intermediate. For this reason, metabolism and toxicokinetic studies are inextricably linked, and characterizing exposure to metabolites may be equally or more important than characterizing exposure to the original compound.
The specific objectives of toxicokinetic studies depend upon the specific regulatory or safety need, but typically involve determining the magnitude, rate,
and duration of the internal dose, and its relationship to administered dose level. As the dose level is increased, the ability of the body to eliminate the compound can become overwhelmed. This occurs because most processes that govern chemical disposition (e.g., extraction from blood by organs, metabolism, biliary and urinary excretion) have a finite capacity. When this capacity is saturated or exceeded, as frequently occurs at high-dose levels, accumulation to toxic levels may result. The threshold dose of saturation can be determined by toxicokinetic studies performed over a range of dose levels; it is a critical parameter for limiting extrapolation from experimental high-dose levels to those levels encountered by people.
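One common way to look for the saturation described above is to compare systemic exposure per unit dose across dose levels. The following is a minimal sketch under stated assumptions, not a method taken from this chapter or from any guideline: exposure is summarized as the area under the plasma concentration-time curve (AUC) up to the last sampling time using the linear trapezoidal rule, and a dose is flagged when its dose-normalized AUC exceeds that of the lowest dose by more than an arbitrary tolerance. The function names, the tolerance of 1.25, and all concentration values are hypothetical.

```python
def auc_trapezoid(times_h, conc):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times_h) != len(conc) or len(times_h) < 2:
        raise ValueError("need matching time and concentration lists with at least two points")
    points = list(zip(times_h, conc))
    area = 0.0
    for (t0, c0), (t1, c1) in zip(points, points[1:]):
        area += (t1 - t0) * (c0 + c1) / 2.0
    return area


def doses_with_excess_exposure(profiles, times_h, tolerance=1.25):
    """Return doses whose AUC/dose exceeds the lowest dose's AUC/dose by the tolerance factor."""
    normalized = {dose: auc_trapezoid(times_h, conc) / dose for dose, conc in profiles.items()}
    reference = normalized[min(normalized)]
    return [dose for dose in sorted(normalized) if normalized[dose] > tolerance * reference]


# Hypothetical plasma concentrations at 0, 1, 2, 4, and 8 h for three dose levels (mg/kg).
times_h = [0, 1, 2, 4, 8]
profiles = {
    10: [0, 4, 3, 2, 0],
    100: [0, 40, 30, 20, 0],
    1000: [0, 500, 600, 700, 400],
}
flagged = doses_with_excess_exposure(profiles, times_h)
print(flagged)
```

With these made-up profiles, exposure is proportional to dose at 10 and 100 mg/kg, while the 1000 mg/kg group shows a disproportionate increase in AUC, the pattern expected once elimination capacity is exceeded.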
B. Guidance and Tier Approaches
Guidance in study design and conduct has been provided by: (1) the Organization for Economic Cooperation and Development (OECD) for the testing of chemicals [1,2]; (2) the European Economic Community (EEC) for testing of new chemicals [3]; (3) the U.S. Food and Drug Administration (FDA) for direct food additives and color additives used in food [4] and compounds used in food-producing animals [5]; (4) the International Conference on Harmonization (ICH) for human drugs [6,7]; (5) the U.S. Environmental Protection Agency (EPA) for industrial chemicals or agrochemicals [8-17]; and (6) Buchanan et al. [18] for studies within the National Toxicology Program (NTP). Some examples of data sought from these studies include the extent and rate of absorption by relevant routes of exposure, the biological half-life of the compound, the distribution pattern in organs and tissues, routes and rates of elimination, and amount and nature of metabolites. The extent to which these parameters are dependent upon sex, species, dose level, route of administration, and repeated vs. single administration may also be explored.
Comprehensive guidelines address these multiple aspects of toxicokinetic investigation, but rarely are all elements necessary for the evaluation of a given compound. For this reason, toxicokinetic data development is especially conducive to tiered or staged testing, and various approaches that incorporate this philosophy have been devised [8,18-20]. In a tiered testing approach, a minimal data set is first acquired and evaluated. Ideally, a set of criteria is defined that triggers the next tier or stage of testing, and several factors may influence the need for information beyond a minimal data set.
These factors include the need for toxicokinetic data to help design or interpret a toxicity study; the commercial use of the product and extent of consumer exposure; and significance of findings related to absorption, metabolism, or persistence revealed by the first tier studies. The need for additional data can be determined as the data unfold. Tiered approaches require that the expertise for data evaluation and final decision making is available, both to the regulatory authority and the industry responsible for performing studies. However, staging
the acquisition of toxicokinetic data may significantly reduce animal use, product development cycle time, and study costs.
Selected guidelines currently in use for toxicokinetic and metabolism studies are presented and compared in Table 1. The remainder of this chapter focuses on the development of basic toxicokinetic data and, where applicable, outlines first-stage or “tier 1” studies. It should be noted that, prior to any toxicokinetic studies, an initial assessment of a chemical’s toxicokinetic behavior should be made based on physicochemical properties, knowledge of the toxicokinetic behavior of structurally similar substances, or information gleaned from basic toxicity studies [19,21,22].
II. STUDY DESIGN AND STUDY PARAMETERS
A. Test Substance and Carrier
The test substance and any carrier used for administration should be comparable to that used in toxicity studies and should resemble the substance encountered by humans. Consistency with toxicity studies is also important for volume administered and formulation, both of which can influence absorption. Appropriate characterization (e.g., purity, identity, stability, homogeneity) should be performed.
The objectives of toxicokinetic studies are well served by the use of radiolabeled test substances. The principle underlying the use of radiolabeled material is that a substance possessing a radionuclide will behave identically to its corresponding “cold” compound, and not influence the behavior of the unlabeled compound in any way. Radiolabeled test substances facilitate the determination of a mass balance, and allow for a more complete accounting of the fate of all test substance-related material at the end of the study. Radiolabeling also allows for greater analytical sensitivity and specificity than can typically be attained with unlabeled material. Carbon 14 is commonly used for organic materials, but other moderately energetic beta-emitting isotopes having appropriate decay rates may be used. When the radiochemical synthesis is performed, the label should be placed within a metabolically stable position of the molecule. Dual labeling with two different isotopes may be performed when a compound is known to undergo hydrolysis and both portions of the molecule need to be tracked. If, however, one portion of the molecule is of greater concern from a structure-activity standpoint, then labeling of this portion alone may be adequate. The radiochemical purity of the test substance should typically be high (e.g., >95%), and significant impurities (e.g., ≥2%) need to be identified.
Unlabeled test substances are, in most guidelines, allowed as an alternative to radiolabeled ones, provided the objectives of the study can be met.
Indeed, use of unlabeled compound may be preferred in some circumstances. If the synthesis of radiolabeled test substance is unfeasible from an economic or technical
Table 1 Guidelines for Metabolism and Toxicokinetic Studies
OECD (1984) Toxicoliinetics guideline for testing chemicals Study Design Species Number of animals
Dose levels
Route of administration
One or more appropriate animal species. Four per group. Where sexual dimorphism exists use four animals of each sex. At least two for single dose studies. No observed toxic effect level and high level at which TK paratneters change or toxic effect occurs. Same as used in toxicity studies. Intravenous for absorption or distribution useful.
ICH (1995a.b) Toxicokinetics guideline for human drugs Same as toxicity test. Appropriate number to provide basis for risk assessment.
USEPA (1995) Metabolism and pharniacokinetics guideline for pesticides and toxic substances (tier 1) Rat; other or additional species if significant toxicity. At least four males. both sexes if evidence of sex difference in toxicity.
The same three dose levels used in toxicity study.
Single nontoxic dose level for each route of exposure.
Same as route(s) intended for product.
Oral gavage is customary method. other routes may be required.
Buchanan et al. ( 1 997) Guidelines for toxicokinetic studies within the NTP (minimal study design) Same as toxicity studies. Ensure three samples per time point of analysis. Initially one sex of one species. Same used or anticipated in toxicity studies. Three levels recommended as minimum (e.g., 0.1, 0.01 and 0.001 of LD5,,.) Same used in toxicity study or most common route of human exposure and IV.
Observations Absorption/Systemic Exposure
Distribution
Metabolism
Excretion
z Measure amount of test substance and/or metabolites in excreta and carcass; or compare with reference group-amount of dose excreted renally, area under plasma level vs. time curve of parent or metabolites, or biological response. Whole-body autoradiogrpahy and/or serial sacrifices with analysis of tissues and organs for test substance and/or metabolites.
Measure parent compound and/or metabolite in plasma, serum. or whole blood, ideally the same as used in clinical studies. As needed, assay genotoxicity indicator tissue. pregnant or lactating animals. embryos, fetuses or newbonis. Tissue distribution should be determined in some circumstances, especially for potential sites of action.
Elucidate structures of measured metabolites. Propose metabolic pathways. In vitro studies helpful to elucidate pathways. Other biochemical studies may be perfornied. Assay urine, feces, expired air, (sometimes bile) at intervals until 95% excreted or for 7 days.
Measure metabolite levels in plasma or other body fluid when compound is pro-drug, metabolized to active form, or is extensively metabolized and measurement of parent is impractical. Excreta measurements not included in standard study design.
Determine percentage of dose in excreta and. as necessary, tissues and residual carcass
Assay blood or plasma at 8- 12 time points for parent compound (major metabolite if rapid metabolism).
' 5'
3
3 Q
-I
0
i .
Collect liver, fat, kidney, spleen, blood, target organ, portal of entry for non-oral study, residual carcass at sacrifice; store frozen. Assay for radioactivity if significant amount of dose unaccounted for in excreta. Identify and quantitate unchanged test substance and metabolites comprising >5% of the dose in excreta. Provide metabolic scheme.
Assay urine, feces, expired air as appropriate for radioactivity at intervals until 90% recovered or for 7 days.
Tissue(s) other than plasma may be analyzed to assess systemic exposure. Tissue distribution not included in minimal study design.
3
$ u,
Limited metabolism knowledge assumed, e.g., when major metabolite is measured instead of parent. Metabolite identification not included in minimal study design. Excreta measurements not included in minimal study design. 4 4
. I
70
English
standpoint, then analytical methods should be developed for measuring stable isotopes of the test substance andkey metabolites in appropriate biological matrices (e.g., excreta and tissues). Such methods may also be needed if the animals to be used for toxicokinetic measurements are also involved in a toxicity study. This latter approach, also referred to as concomitant toxicokinetics, represents the ultimate integrationof toxicokinetic and toxicity studies, and can provide the most useful information for design and interpretation of toxicity studies [23].
B. Probe Studies
The investigator charged with designing a toxicokinetic study needs to select appropriate relative sampling times, sampling matrices, and analytical methods. Often, little or no information is available to guide the investigator beforehand, and probe or pilot studies are therefore recommended. Specimens collected during the probe will be of value in developing analytical methods for separating and quantitating parent compound and metabolites, and determining which biological matrices are most appropriate to collect during the definitive study.
C. Test System
1. Selection and Justification
Toxicokinetic studies are performed primarily to assist in the interpretation of toxicity studies, and as such, the test system selected will necessarily be the same as that used for the toxicity study. Most basic toxicokinetic studies are performed with young adult male and/or female rats of the appropriate strain. Weight variation of animals used should not exceed ±20% of the mean weight [1]. The use of one sex in initial toxicokinetic studies generally suffices. If there is evidence of a sex-related difference in toxicity, studies may be confined to the more sensitive sex. Additional test systems may be needed to address compound-specific objectives, for example: (1) an additional species for interspecies comparisons, (2) animals of different ages to examine age-dependent changes in toxicokinetic parameters, or (3) animals in a particular physiological state (e.g., pregnancy, diabetes, inborn metabolic errors) to examine susceptibility issues.
2. Animal Number
The conventional approach to the evaluation and interpretation of toxicokinetic data involves the collection of a sufficient number of serial samples to allow the time course of the chemical to be fully described in each individual. Guidelines that specify the number of animals to be used typically indicate three or four of a sex per group. Enough animals should be used such that data are obtained for no fewer than three animals per dose level [18]. Four or more animals may therefore be needed in anticipation of the occasional lost sample or analytical mishap. Where a high degree of interindividual variation in toxicokinetic behavior is displayed, more animals per group may be justified.
An alternative study design that addresses variability within the group of animals tested [24] involves the use of composite rather than serial sampling for the collection of data. This technique, which borrows from population pharmacokinetics [25], is used especially in the pharmaceutical industry to estimate pharmacokinetic parameter probability distributions when sample numbers are limited. As a rule, samples are obtained from more animals using fewer sampling times. Advantages cited for this approach include reduced sampling-related stress to animals and, when toxicokinetic studies are concomitant with toxicity studies, lower total animal use. The interested reader is referred to Vozeh et al. [26] for a comprehensive review and perspective on this subject.
D. Test Substance Administration
1. Routes
The routes of test substance administration used for toxicokinetic studies are generally those of common human exposure (i.e., oral, inhalation, and cutaneous). The route should be the same as that used in toxicity studies to aid their interpretation. While drinking water or feed exposure is commonly chosen for toxicity studies, the gavage method is often preferred for estimating absorption from the gastrointestinal tract. The toxicokinetic behavior of a chemical given in the feed or drinking water is likely to differ from that given by bolus administration; however, the latter offers advantages in ease of quantitative dose recovery for material balance studies as well as in simplifying the analysis and interpretation of toxicokinetic data. A common strategy is to perform toxicity studies principally with one route and method of administration, while conducting toxicokinetic studies via the same and other relevant routes/methods to obtain systemic (internal) exposure data for making route comparisons. Such data are useful for supporting route-dependent extrapolations in risk assessment.
When the cutaneous route is studied, animals are prepared 16 to 24 hr prior to dose administration by clipping the hair from the skin in the region of the shoulders and back. The test substance is applied evenly to the skin and protected with a suitable covering [2,8,27]. For the inhalation route, exposures usually are accomplished with a nose cone or head-only apparatus. Finally, some designs may warrant the inclusion of the intravenous (IV) route of administration. By definition, absorption following IV injection is considered to be complete, i.e., 100%, making it useful both for defining important toxicokinetic parameters and as a reference dose route for determining extent of absorption via other routes. Intravenous dosing may be accomplished via a lateral tail vein injection or an implanted venous cannula, commonly within the jugular or femoral vein.
2. Dose Levels
Guidelines vary both in the number of dose levels required (one to three) and the criteria to be used in their selection. Examples cited in Table 1 (see above) include a single nontoxic dose level; a minimum of two dose levels that includes a minimally toxic level (i.e., lowest observed effect level, or LOEL) and a nontoxic level (e.g., no observed effect level, or NOEL); and three levels anticipated or used in toxicity studies and selected on the basis of potency and slope factors. More dose levels allow better characterization of the dose-dependent behavior of a chemical, in particular, the dosage ranges of proportional toxicokinetic behavior and the dosage region where kinetics shift from linear to nonlinear.
When a radiolabeled test substance is administered for the purpose of determining a mass balance (generally as a bolus dose), gravimetric determination of the administered dose (dpm or µCi) is recommended. This is accomplished by determining the radioactivity in weighed aliquots of the dose preparation to obtain a mean concentration (e.g., dpm/mg), then measuring the weight of the dose preparation administered. Multiplying this mean concentration by the weight of dose administered yields the administered dose in dpm, which becomes the denominator in later calculations of the fraction or percentage of dose.
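The gravimetric dose determination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the aliquot weights, and the counts are hypothetical and do not come from any guideline.

```python
from statistics import mean

def administered_dose_dpm(aliquots, dose_weight_mg):
    """Return (mean concentration in dpm/mg, administered dose in dpm).

    aliquots: list of (aliquot_weight_mg, measured_dpm) pairs taken from
              the dose preparation.
    dose_weight_mg: weight of dose preparation actually administered.
    """
    # Concentration of each weighed aliquot, in dpm per mg of preparation.
    concentrations = [dpm / weight for weight, dpm in aliquots]
    mean_conc = mean(concentrations)
    # Mean concentration times the weight administered gives the dose in dpm.
    return mean_conc, mean_conc * dose_weight_mg

# Hypothetical example values (assumed for illustration only).
aliquots = [(10.0, 50_000.0), (12.0, 61_200.0), (8.0, 39_200.0)]
mean_conc, dose_dpm = administered_dose_dpm(aliquots, dose_weight_mg=500.0)
print(mean_conc, dose_dpm)
```

The returned dose in dpm is the value that would serve as the denominator when the fraction or percentage of dose in each matrix is later calculated.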
3. Dose Regimen
Single exposures are generally used for basic toxicokinetic studies. For the oral or intravenous routes, a single bolus administration is used. Where the inhalation or cutaneous route is needed, a single exposure of a defined period is typical. Other factors to be considered in the design of inhalation or dermal toxicokinetic studies are the length of time needed to attain steady-state concentrations in the body, the expected duration of relevant human exposure, and the need to minimize any discomfort or stress experienced by the animals. Guidelines [2,8] recommend a minimum cutaneous exposure period of 6 hr and a cleansing step at the end of the exposure period to recover unabsorbed test substance from the skin. For inhalation studies, a 4 to 6 hr exposure using a nose cone or head-only apparatus is specified [8]. This specialized apparatus prevents deposition of the test substance on the animal's coat, which could, in turn, result in ingestion during grooming or dermal uptake, confounding the interpretation of the results. However, if dermal penetration is sufficiently slow, and grooming of the coat is curtailed by the use of Elizabethan-style collars, the use of whole-body inhalation chambers may be justifiable.
A single administration of test substance at specified dose levels will usually be adequate when evaluating dose level-, route-, sex-, or species-dependent toxicokinetic behavior, and meets the basic requirements of most guidelines. Repeated-dose or infusion studies, while not routinely required, may yield important additional information. If repeated-dose toxicokinetic data are called for, the regimen should consist of daily exposures over a period ranging from 5 days to 3 wk. Circumstances in which such studies may be warranted have been well described [7,18,28] and include the following: (1) Interpretation of a repeated-dose toxicity study may require a repeated-dose toxicokinetic study, particularly when biochemical, morphological, or functional changes occur relative to the single-dose situation. (2) If the test substance causes enzyme induction or inactivation, changes will generally be manifested within several days of repeated dosing and commonly will be accompanied by changes in toxicokinetic behavior. (3) Substances that display long elimination half-lives from plasma or other tissues after a single dose may require repeated-dose toxicokinetic studies to accurately determine the extent of tissue accumulation or the potential for persistence within the body.
E. Sample Collection
1. Matrices
The types of excreta and tissue specimens that should be collected can best be determined with the help of information obtained in probe studies. Guidelines generally address the collection of excreta and blood or plasma, but differ with respect to how collection of those matrices should be prioritized in a testing scheme. For example, ICH and NTP guidelines do not call for routine excretion and mass balance data, whereas first-level studies required under EEC and EPA guidelines generally do.
Analysis of excreta at several postexposure time points is required both for information on extent of absorption and biotransformation, and routes and rates of excretion. Quantitative analysis of excreta is also needed to account for the entire mass of the dose and thereby the complete disposition of the test substance, as required in a mass balance study. The separate collection of urine and feces can be accomplished with metabolism chambers specially designed for this purpose. When appropriate, the expired air should be directed through traps suitable for the collection of exhaled volatile test substance and metabolites (e.g., activated charcoal) as well as carbon dioxide (e.g., 2.5 M potassium hydroxide) that is derived from test substance metabolism. Determination of the extent of absorption and the calculation of mass balance invariably require analysis of the carcass, and separate analysis of specified tissues is usually desirable. For a cutaneous exposure, the test substance washed from the skin and that associated with any containment devices or protective coverings must also be assayed. Mass balance information is not routinely obtained following inhalation studies because the dose absorbed from the lung is not known. When feasible, however, recovery data provide a good measure of the total inhalation dose.
To assess systemic exposure, collection of blood or plasma at several time points is usually required. Blood is chosen over plasma if the analyte possesses a high affinity for the cellular fraction. Collection of the target tissue for the measurement of analyte levels can also be done, and values may correlate better with organ-specific toxicity findings. Blood collection is preferred, however, since it allows for serial measurements to be made within the same animal, thereby reducing interindividual variation. The simplifying assumption is that the concentration of toxic substance in blood (or plasma) is a function of the concentration in target tissue(s).
When study objectives require a determination of tissue distribution of the test substance, whole-body autoradiography is a valuable tool for examining changes in radiolabel distribution within the same animal over a specified period of time. The alternative method for assessing tissue distribution is serial collection of appropriate tissues using greater numbers of animals. Collection and storage (frozen) of tissues and residual carcass at terminal sacrifice is required in one guideline [8] and helps ensure a complete material balance. While tissue distribution studies may be required for providing information on accumulation of the parent compound or metabolites, especially in relation to potential sites of action, views are split on the necessity of acquiring this information as a part of an initial data set. Tissue distribution studies will typically include known or suspected target tissues, organs of metabolism and excretion, site of action for therapeutics, and tissues associated with the accumulation of tested chemicals that are structurally related to the test substance.
2. Sampling Times
The quantitative collection of excreta and expired air specimens requires monitoring to begin immediately after administration of the test substance, or, for inhalation exposure, at the end of the exposure period. The period of collection continues for 7 days or until 90% [8] or 95% [1] of the administered dose is recovered, whichever occurs first. Percentage of dose recovered is generally determined by assay for the cumulative total radioactivity in excreta, including expired air when needed. Daily specimen collection is typically performed; however, several appropriate interim collections may also be required immediately following treatment. Many chemicals are excreted primarily by the kidneys into the urine following metabolism to water-soluble conjugates. Hence, protocols will commonly specify more frequent collection of urine (e.g., 0 to 6, 6 to 12, and 12 to 24 hr) within the first day of dose administration.
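The stopping rule for excreta collection (7 days, or 90% or 95% of the dose recovered, whichever occurs first) can be expressed as a short sketch. The function name and the daily recovery figures below are assumptions made for illustration; real protocols may also include interim collections within the first day.

```python
def collection_end_day(daily_percent_recovered, target_percent=90.0, max_days=7):
    """Return (day on which collection may stop, cumulative % recovered).

    daily_percent_recovered: percentage of administered dose recovered in
    excreta (and expired air, when collected) on each successive day.
    """
    cumulative = 0.0
    for day, percent in enumerate(daily_percent_recovered[:max_days], start=1):
        cumulative += percent
        if cumulative >= target_percent:
            # Recovery target reached before the maximum collection period.
            return day, cumulative
    # Target never reached: collection runs to the last available day,
    # capped at max_days.
    return min(len(daily_percent_recovered), max_days), cumulative

# Hypothetical daily recoveries (% of administered dose).
daily = [55.0, 25.0, 8.0, 4.0, 2.0, 1.0, 0.5]
print(collection_end_day(daily, target_percent=90.0))
print(collection_end_day(daily, target_percent=95.0))
```

With these assumed values, the 90% criterion is met earlier than the 95% criterion, which illustrates how the choice of guideline affects the length of the collection period.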
Sampling times also need to be chosen for serial sampling of body fluids, usually blood or plasma, and occasionally bile, urine, or milk from lactating animals. The goal is to define the concentration vs. time relationship for the parent compound and/or specific metabolites from the beginning of exposure through elimination. Therefore, sampling must begin at an appropriate interval after bolus administration, or both during and after infusion, dermal, or inhalation exposure, and continue until the detection limit is reached. The number and frequency of sampling times should be sufficient to define this concentration-time relationship, and the times will vary depending upon the route and method of administration, as well as the number of phases associated with the compound’s kinetic behavior. The kinetic profile of substances given via the IV route, for example, will lack an absorption phase. Likewise, substances may be eliminated from the body in one, two, or more distinct phases, each with its own half-life. Each phase should ideally be described by a minimum of three time points, which should be chosen with the help of pilot study data. For blood or plasma, Buchanan et al. [18] specify collection of 8 to 12 samples within a 24-hr period of bolus administration. For some compounds, as few as five samples within a 24-hr period [21] may provide reasonable estimates of two important toxicokinetic parameters, the maximum concentration ($C_{max}$) and the area under the concentration-vs.-time curve (AUC). Where the test substance is administered via feed or water, diurnal behavior patterns will influence systemic exposure, and four to eight evenly spaced samples should be obtained over 24 hr for derivation of the daily average level [29].
If excessive blood withdrawal is a concern because of the frequency or size of samples required, periodic monitoring of the hematocrit is advisable because both the health of the test system and the toxicokinetics can be impacted by excessive removal of blood from the body. For protocols that specify the determination of tissue distribution over time, whether by autoradiography or serial sacrifices, sampling times may be chosen using the same considerations described above for body fluids.
F. Analytical Methods
1. Selection of Analytes
Analytes to be measured in blood and other serial samples should include the test substance itself and/or one or more metabolites, especially those metabolites in the pathway leading to toxicity. Test substances may not be readily detectable if they undergo very rapid biotransformation, such as hydrolysis in the bloodstream, or first-pass metabolism in the liver or portal of entry. In this situation, analysis for an early metabolite as a substitute for the parent compound may be necessary. An understanding of the likely routes of metabolism for the compound under study is useful for predicting metabolites and developing analytical methods. Information on common metabolic pathways can be found in toxicology textbooks [e.g., 30] and from commercial metabolism databases which enable searches by chemical structure, substructure, and similarity features.
Developing an analytical method for selected analytes also requires some knowledge regarding the behavior expected of those compounds in the biological matrix. Substances that are too reactive to measure directly may be determined indirectly by trapping the metabolite or reactive intermediate to form a stable reaction product. Endogenous molecules, such as hemoglobin or glutathione, have been used effectively for this purpose. The tendency of a compound to bind reversibly to plasma and tissue proteins should also be considered during analytical method development. In general, the concentration of unbound or free compound in the plasma should be determined since it is this unbound fraction that is available for tissue uptake and biological activity. Where extensive protein binding occurs, the analytical method used should distinguish the unbound fraction from total compound in the sample. Where little protein binding occurs, measurement of total compound in the sample is a reasonable indicator of free compound.
Analytes to be measured in excreta include the test substance itself and any appreciable metabolites or breakdown products of the parent compound. For test substances containing a radiolabel, the total radioactivity in the specimen should also be determined. Specific kinetic parameters should not be calculated from total radioactivity data unless it is known that the substance does not undergo biotransformation [31]. Where a radiolabeled test substance has been used, tissue distribution can be determined by whole-body autoradiography or by preparation of collected tissue samples as appropriate (e.g., combustion, digestion, decolorization) for liquid scintillation spectrometry or other suitable method.
2. Metabolite Measurement and Identification
For the quantitative determination of parent compound and metabolites in biological specimens, high-performance liquid chromatography or gas chromatography is commonly used for analytical separation. For biotechnology-derived products, electrophoretic separations (e.g., sodium dodecyl sulfate-polyacrylamide gel electrophoresis [SDS-PAGE]) with immunochemical detection may be useful. Prior to the assay, sample preparation procedures, such as protein precipitation, extraction, or other technique appropriate for the matrix and analyte, are often needed to remove background interferences.
For the identification of parent compound and metabolites in excreta, a combination of methodologies is used. Metabolite identification is commonly required for each metabolite that comprises >5% [8] to >10% [18] of the administered dose. Cochromatography or coelution of standard material and unknown component provides the first level of metabolite characterization. Specifically, chromatographic retention time of an unknown substance (e.g., a radiolabeled component of urine) can be compared with the retention time for authentic materials. In a similar fashion, evidence for the presence of conjugated metabolites can be provided by chromatographic retention time shifts occurring after hydrolytic treatment of the sample with either specific enzymes for cleavage of phase II conjugates (e.g., β-glucuronidase, sulfatase) or nonspecific hydrolysis treatments (e.g., dilute hydrochloric acid [HCl]). Further metabolite characterization and structural confirmation is accomplished using an appropriate mass spectrometry and/or nuclear magnetic resonance spectroscopic technique.
3. Specificity, Sensitivity, and Precision
Methods used for the quantitative and qualitative analysis of all analytes need to be fully validated [32]. The method should be specific for the compound and any interference by endogenous components should be investigated. Stability of the analyte under experimental and storage conditions as well as recovery of the analyte from the biological matrix should be addressed as a part of the validation. The method should be appropriately sensitive with the lower limits of detection and quantification specifically defined. The precision of the method should be determined by examining the reproducibility of results over a suitable period of time.
III. DATA EVALUATION AND INTERPRETATION
A. Data and Statistical Analyses
The primary goal of toxicokinetic studies as a whole is to characterize the systemic exposure to a test substance with respect to both magnitude and duration. As described in the preceding section, the concentration of analyte, generally parent compound or specific metabolite(s), is measured over time in one or more matrices consisting of a body fluid or tissue. The following section is subdivided into three parts that address data analysis for (1) systemic absorption and elimination, (2) disposition, and (3) statistical summary. The stepwise approach presented was devised as a practical introduction for handling basic toxicokinetic data obtained from guideline studies. The reader is referred elsewhere for introductions to the basic concepts of toxicokinetics [33-35], and to Gibaldi and Perrier [36] for a comprehensive presentation of the mathematical framework for this discipline.
1. Systemic Data Analysis
The first step recommended for data analysis is to graphically examine the analyte concentration data obtained from each individual animal, against time, on log-linear axes. Visual inspection of this curve can provide useful toxicokinetic information, in particular, the number of phases associated with the elimination, the maximum concentration achieved ($C_{max}$), and the time of maximum concentration ($T_{max}$). The vast majority of elimination curves will possess a linear terminal portion, regardless of test substance, route, or dose level. The slope of this terminal linear segment of the curve (log10 concentration vs. time) is generally used to estimate the elimination rate constant $k$ by the relationship
$$k = -2.303 \times \text{terminal slope} \qquad (1)$$
This relationship assumes that, for extravascular routes of administration, the absorption rate is substantially faster than the elimination rate. Likewise, for metabolites, the rate of formation must be greater than the rate of elimination. Where the absorption rate is slower than the elimination rate, as perhaps occurs with sustained-release formulations of drugs or with slow percutaneous penetration, the terminal slope actually reflects the absorption rate; thus, the absorption rate constant $k_a$ is estimated by
$$k_a = -2.303 \times \text{terminal slope} \qquad (2)$$
Whether the absorption or elimination rate term is defined by the terminal slope is not necessarily evident from the data set, unless IV administration or other route known to result in rapid absorption was used to clearly define the elimination rate constant. Alternatively, an independent estimation of absorption rate or rate of metabolite formation may need to be made; in vitro systems can often be used for this purpose [37,38].
The elimination half-life ($t_{1/2}$) is the time required to decrease the analyte concentration by one-half, and is related to the elimination rate constant by the following relationship
$$t_{1/2} = \ln 2 / k \qquad (3)$$
Ideally, data will be available for a period of time equivalent to four or five half-lives postexposure, which is the time it takes for about 94 or 97%, respectively, of the elimination to occur.
Virtually all concentration-vs.-time curves will have, at least, a terminal log-linear segment reflecting first-order behavior. For intravascularly dosed substances, the slope of this linear phase may describe the entire curve (Figure 1), and a first-order mathematical expression of the concentration at time $t$ ($C_t$) can be readily obtained by determining the elimination rate constant and the ordinate intercept ($C_0$), which become exponent and coefficient, respectively, in the following equation
$$C_t = C_0 e^{-kt} \qquad (4)$$
Figure 1 Analyte concentration (logarithmic scale) vs. time for intravascularly dosed compound. Curves a, b, c, and d represent increasing dose levels. Saturation is evident at dose level c.
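As a worked illustration of the terminal-slope approach, the following Python sketch fits a straight line to log10 concentration vs. time and derives the elimination rate constant, half-life, and ordinate intercept. It assumes that the slope is taken on a log10 scale, so that $k = -2.303 \times$ slope, and it uses synthetic data generated from a hypothetical one-compartment curve; the function name and all numbers are assumptions for illustration.

```python
import math

def fit_log_linear(times, concentrations):
    """Least-squares line through (t, log10 C); returns (slope, intercept)."""
    logs = [math.log10(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean

# Synthetic terminal-phase data from an assumed C0 = 100 and k = 0.2 per hr.
times = [1.0, 2.0, 4.0, 8.0]
concs = [100.0 * math.exp(-0.2 * t) for t in times]

slope, intercept = fit_log_linear(times, concs)
k = -math.log(10) * slope        # ln 10 is approximately 2.303
half_life = math.log(2) / k      # elimination half-life
c0 = 10 ** intercept             # ordinate intercept on the linear scale

# Fraction of elimination completed after four and five half-lives.
fraction_eliminated = {n: 1 - 0.5 ** n for n in (4, 5)}
print(k, half_life, c0, fraction_eliminated)
```

The last line reproduces the figures quoted in the text: about 94% of elimination is complete after four half-lives and about 97% after five.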
For curves having additional phases approximating first-order behavior, such as faster elimination phases or an absorption phase, the rate constants (and half-lives) associated with each phase may be estimated by the method of residuals or by linear regression analysis, and comparable first-order expressions for $C_t$ obtained [34,36]. When the capacity of first-order processes governing toxicokinetic behavior is exceeded, as may occur at high exposure levels, curves will show systematic deviations from these linear mathematical expressions (see Figure 1).
Another parameter derived from the concentration-vs.-time curve is the area under the curve (AUC). The AUC may be measured by the trapezoidal rule, for example
$$\text{AUC} = \sum \left[\frac{t_2 - t_1}{2}\right](C_1 + C_2),\ \text{etc.} \qquad (5)$$
where $t_1$ and $t_2$ are the first two time points, $C_1$ and $C_2$ are the corresponding analyte concentrations, and areas are calculated and summed through the last sampling time and respective concentration ($C_{last}$). For calculation of an area under the curve that includes the triangular area beyond the last sampling time (i.e., extrapolated to infinity), the term $C_{last}/k$ is added to the sum; thus
$$\text{AUC}_{0-\infty} = \text{AUC} + C_{last}/k \qquad (6)$$
Some prefer the cut-and-weigh approach for determining AUC or $\text{AUC}_{0-\infty}$. In this case, the curve is plotted on linear graph paper and the area to be determined is carefully cut out and weighed. A rectangle of known area (e.g., µg/ml × min) is also cut out and weighed. The unknown AUC is solved by the proportionality between area and weight established by weighing the known area; thus
$$\text{AUC} = \frac{\text{weight of AUC} \times \text{known area}}{\text{weight of known area}} \qquad (7)$$
Systemic clearance (also referred to as total clearance, and usually blood or plasma clearance) represents the volume of blood or plasma that is cleared of compound per unit time, and is a favored descriptor of the body’s efficiency for elimination of a substance. Clearance (Cl) is given by the relationship
$$\text{Cl} = \frac{\text{Dose} \times F}{\text{AUC}_{0-\infty}} \qquad (8)$$
where $F$ is the fraction of the dose that is absorbed, and by definition, is equal to 1 for studies done by the IV route. $F$ is sometimes referred to as bioavailability, and for studies performed using an extravascular (Ex) route, it is calculated as
$$F = \frac{\text{AUC}_{Ex} \times \text{Dose}_{IV}}{\text{AUC}_{IV} \times \text{Dose}_{Ex}} \qquad (9)$$
where AUC and dose are known for both the IV and extravascular routes.
A parameter that conveys information on the extent of distribution of the analyte is the apparent volume of distribution ($V_D$), which can be readily determined for parent compound from an IV dose study. The $V_D$ represents the ratio of the amount of analyte in the body (i.e., $\text{Dose}_{IV}$) to a theoretical analyte concentration ($C_0$) obtained by extrapolation of the concentration curve to the time of dose administration, i.e., the ordinate intercept. Thus
$$V_D = \text{Dose}_{IV} / C_0 \qquad (10)$$
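The relationships for bioavailability, clearance, and apparent volume of distribution after an IV dose can be combined in a short sketch. All function names, doses, AUC values, and the intercept below are hypothetical, with units assumed to be mg/kg for dose, mg·hr/L for AUC, and mg/L for concentration.

```python
def bioavailability(auc_ex, dose_ex, auc_iv, dose_iv):
    """F = (AUC_Ex x Dose_IV) / (AUC_IV x Dose_Ex)."""
    return (auc_ex * dose_iv) / (auc_iv * dose_ex)

def clearance(dose, f, auc_inf):
    """Cl = Dose x F / AUC extrapolated to infinity."""
    return dose * f / auc_inf

def volume_of_distribution_iv(dose_iv, c0):
    """V_D = Dose_IV / C0, with C0 the extrapolated ordinate intercept."""
    return dose_iv / c0

# Hypothetical study values (assumed for illustration only).
dose_iv, auc_iv, c0 = 10.0, 50.0, 5.0   # IV arm
dose_ex, auc_ex = 20.0, 60.0            # extravascular (e.g., oral) arm

f = bioavailability(auc_ex, dose_ex, auc_iv, dose_iv)
cl_iv = clearance(dose_iv, 1.0, auc_iv)   # F is 1 by definition for IV
cl_ex = clearance(dose_ex, f, auc_ex)     # same clearance via the Ex arm
vd = volume_of_distribution_iv(dose_iv, c0)
print(f, cl_iv, cl_ex, vd)
```

In this constructed example the clearance computed from the extravascular arm agrees with the IV value, as expected when $F$ is derived from the same two AUCs and kinetics are dose proportional.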
For studies done by an extravascular route, the appropriate equation is
$$V_D = \frac{\text{Dose}_{Ex} \times F}{\text{AUC} \times k} \qquad (11)$$
2. Disposition Data Analysis
Studies requiring the collection and analysis of excreta are not generally useful for derivation of the above-described kinetic parameters due to the intervals separating the collection periods. Their purpose is more often to determine the routes and rates of excretion of total test substance equivalents and the extent of biotransformation, and to identify the end products of biotransformation. Absorption, defined as the amount or mass of test substance-related material transferred into the body [39], may also be obtained from disposition data. Disposition analysis may include the analysis of tissues for concentration of the test substance, specific metabolites, or total radiolabel. In addition, reversible and irreversible protein binding in plasma or other tissue may be evaluated. Excreta, tissue, and carcass measurements are of value for accounting for all of the administered dose at the end of the study, i.e., obtaining a mass balance, which, in turn, strengthens conclusions related to the compound’s disposition.

The use of radiolabeled test substance makes mass balance determinations relatively straightforward. Briefly, all excreta collected during the study are weighed, as are traps for expired air, any tissues collected, and the residual carcass. Metabolism chambers are rinsed during and after the study, and rinsings are quantitatively collected and weighed. For cutaneous route studies, containment devices, protective coverings, and postexposure skin washings are similarly collected. Aliquots of liquid matrices are weighed and analyzed directly by liquid scintillation counting; solid matrices are homogenized, extracted, or oxidized, as needed, prior to liquid scintillation counting. Each resultant activity value (dpm) is divided by the aliquot weight, and the average dpm/weight of replicate aliquots is determined. The average dpm/weight is multiplied by the corresponding total weight of the matrix (e.g., excreta or tissue) to obtain the total associated activity. Finally, the total dpm associated with each matrix are added together to obtain total recovered dose for each animal. Total recovered dose will ideally be equal to 100 ± 5% of the administered dose. The percentage of the administered dose in each matrix can be readily determined, if desired. The amount of dose absorbed is obtained from the total amount of radioactivity recovered from excreta, tissues, and carcass.
Radioactivity data can alternatively be expressed as µg or µmole equivalents by conversion of dpm using the specific radioactivity of the dose preparation. The presence of metabolites in urine and other excreta is typically determined by chromatography, with radiochemical or other method of detection used for quantification. The extent of test substance metabolism can be determined for each metabolite or for total metabolism by adding the amounts in each excretion matrix.
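The mass balance arithmetic described above can be sketched as follows. This is an illustrative outline only: the data layout, function names, and all numerical values are hypothetical, and the specific activity is assumed to be expressed in dpm per µg.

```python
from statistics import mean


def matrix_total_dpm(aliquots, total_weight_g):
    """Average dpm per g over replicate aliquots, scaled to the whole matrix.

    aliquots: list of (dpm, aliquot_weight_g) pairs.
    """
    dpm_per_g = mean(dpm / weight for dpm, weight in aliquots)
    return dpm_per_g * total_weight_g


def mass_balance(matrices, administered_dpm):
    """Return (percent of dose per matrix, total percent recovered).

    matrices: dict mapping matrix name -> (aliquots, total_weight_g).
    """
    percent = {
        name: 100.0 * matrix_total_dpm(aliquots, total_weight) / administered_dpm
        for name, (aliquots, total_weight) in matrices.items()
    }
    return percent, sum(percent.values())


def dpm_to_ug_equivalents(dpm, specific_activity_dpm_per_ug):
    """Convert activity to mass equivalents using the dose specific activity."""
    return dpm / specific_activity_dpm_per_ug


# Hypothetical single-animal data (only two matrices shown for brevity).
matrices = {
    "urine": ([(1000.0, 0.5), (1100.0, 0.5)], 20.0),
    "feces": ([(500.0, 0.25)], 10.0),
}
percent, recovered = mass_balance(matrices, administered_dpm=100000.0)
print(percent, recovered)
print(dpm_to_ug_equivalents(42000.0, specific_activity_dpm_per_ug=2000.0))
```

In a real study every collected matrix (expired air traps, cage rinses, tissues, carcass) would be included so that the total can be compared with the 100 ± 5% target.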
3. Statistical Analysis
Data evaluation typically involves the calculation of the mean and standard deviation of group data. Analysis of variance and logarithmic transformation of concentration data may be useful, as discussed by Igarashi et al. [29]. Computer mathematical curve fitting programs can be used to derive toxicokinetic parameters, providing the “best fit” of a given equation or model to a given data set. Some specialized programs additionally provide information about the “goodness of fit” (e.g., how closely observed and calculated values compare, whether deviations are systematic or random), and how well parameters are estimated. Statistical tests (e.g., F-test, Akaike criterion) can be used to choose between models. A confidence interval method is recommended when toxicokinetic parameters obtained for different treatment groups (e.g., treatment routes or formulations) need to be compared [36]. When statistical comparisons are required, the use of an adequate number of animals and consideration of interindividual variability become critical for the detection of differences that are both statistically and biologically significant. More recently, population approaches have been applied to data analysis and estimation of parameters using a statistical methodology [25,40]. The software available for this type of analysis, commonly known as nonlinear mixed-effects modeling, has become increasingly user-friendly, and population modeling has enjoyed a corresponding gain in popularity [22].
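As one illustration of model selection by the Akaike criterion, the sketch below compares two candidate fits to the same data. It assumes the unweighted least-squares form AIC = n ln(RSS/n) + 2p, which is one common convention and is not specified in the chapter; the observed and predicted values are hypothetical.

```python
import math


def aic_least_squares(observed, predicted, n_params):
    """AIC for an unweighted least-squares fit: n * ln(RSS / n) + 2 * p."""
    n = len(observed)
    rss = sum((obs - pred) ** 2 for obs, pred in zip(observed, predicted))
    if rss <= 0:
        raise ValueError("residual sum of squares must be positive")
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical concentrations and predictions from two candidate models.
observed = [10.0, 5.0, 2.5, 1.25]
one_compartment = [9.0, 5.5, 3.0, 1.5]   # 2 fitted parameters
two_compartment = [10.0, 5.0, 2.6, 1.2]  # 4 fitted parameters

aic_one = aic_least_squares(observed, one_compartment, n_params=2)
aic_two = aic_least_squares(observed, two_compartment, n_params=4)
# The model with the lower AIC is preferred under this criterion.
print(aic_one, aic_two)
```

The penalty term 2p means that an extra compartment is favored only when it reduces the residual error enough to justify the added parameters.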
B. Interpretation and Use of Data
Toxicokinetic parameters are critical determinants of the toxic response. Knowledge of the amount of chemical and the manner in which it exists in the test animal during a given interval allows a more meaningful correlation between dosage and effects observed. Studies describing the time course of systemic exposure to the compound and its metabolites also can be used to test any presumed association between a toxic chemical and a toxic outcome. Most importantly perhaps, toxicokinetic descriptions help predict under what circumstances toxicity is likely to occur, thereby improving the reliability of the safety evaluation or risk assessment for a given chemical.
1. Systemic Exposure
Systemic exposure is usually expressed as AUC and/or $C_{max}$. If a test substance is associated with a lack of toxicity, it may be important to demonstrate that systemic exposure has in fact occurred. This is particularly true for pharmaceutical agents, where showing systemic exposure is a validation of the toxicity test. Application of toxicokinetic data can help to ensure that all potential toxicities of a compound have been identified. This occurs by ensuring that the systemic levels of the test substance in animals under study are appreciably higher than systemic levels of the substance anticipated or measured in humans. On the other hand, demonstrating a lack of appreciable systemic exposure may be desirable for chemicals associated with nonintentional exposure. Slow or minimal absorption accompanied by rapid, efficient elimination are favorable attributes in this case. If systemic exposure is sufficiently low, it may be reasonable to limit or forgo certain longer-term toxicity studies, or modify uncertainty factors used in the derivation of a reference dose.
Systemic exposure information should be used in establishing dose levels for subsequent toxicity studies [23]. When the high-dose level is selected based solely on clinical or pathological end-points, a dosage may result that saturates absorption, metabolism, or excretion of the compound. The resulting data will rarely be relevant to human exposure and will therefore be of little value in safety evaluation or risk assessment. Dose selection does not lend itself to a standard formula or approach, but the reader is referred to Morgan et al. [41] for an excellent discussion of the factors that should be considered in applying toxicokinetic data to dosage selection in toxicity studies. Systemic exposure data are similarly useful for selecting initial human doses and for escalating doses in drug clinical trials [42].
2. Elimination
This term is used to describe the removal of a compound from the body, whether by metabolism or excretion. The overall rate can be expressed as either the clearance or the elimination half-life in plasma or alternative matrix, and terms can be derived for parent compound or specific metabolites. Persistent substances have relatively low clearances and long half-lives, and may be associated with bioaccumulation, depending upon the duration and frequency of exposure. For therapeutics, rapid clearance from the circulation might indicate a need for more frequent dosing in repeat-dose toxicity studies. Clearance and elimination half-life are useful for making predictions concerning steady state. When an organism is exposed to a chemical, steady state in the body is reached when the rate of chemical uptake is equivalent to its rate of elimination. In other words, for a given level of continuous exposure, the steady state concentration ($C_{ss}$) in a given matrix represents the maximum level that the chemical can attain, and will attain given a sufficient length of exposure. If a rate of uptake or input ($k_0$) is available (e.g., infusion rate, inhalation uptake rate, dermal penetration rate), the average steady state concentration can be predicted as

$$C_{ss} = k_0/Cl \qquad (12)$$

assuming clearance is unchanged by continuous exposure. Similarly, elimination half-life can be used to predict the time required to approach steady state upon repeated exposure. The time to reach 95% of average steady state concentrations is given by the formula

$$t_{95\%,ss} = -3.32 \times t_{1/2} \times \log(1 - 0.95) \qquad (13)$$

where log denotes the base-10 logarithm; the expression evaluates to approximately 4.3 half-lives.
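Equations (12) and (13) can be checked numerically with a short sketch. The factor 3.32 in Equation (13) is 1/log10(2), so the code computes it directly rather than using the rounded constant. Function names and input values are hypothetical.

```python
import math


def css_average(k0, clearance):
    """Equation (12): average steady state concentration = input rate / clearance."""
    return k0 / clearance


def time_to_fraction_of_ss(half_life, fraction=0.95):
    """Equation (13), generalized to any fraction of steady state.

    Uses 1 / log10(2) (about 3.32) in place of the rounded constant.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -half_life * math.log10(1.0 - fraction) / math.log10(2.0)


# Hypothetical values: infusion at 10 mg/h, clearance 2 l/h, half-life 6 h.
print(css_average(k0=10.0, clearance=2.0))     # mg/l
print(time_to_fraction_of_ss(half_life=6.0))   # h to reach 95% of steady state
```

Because the time depends only on half-life, a compound with a 6 h half-life needs roughly 26 h of continuous exposure to approach steady state, whatever the input rate.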
3. Dose-Dependent Kinetics
Several processes contribute to the removal of a compound from the body, active transport into the renal tubule, metabolism, and protein binding among them. These processes involve the occupation of a limited number of binding sites. Increasing concentrations of a substance will occupy a proportionately increasing number of available binding sites until all sites are occupied, i.e., saturation occurs. Still higher concentrations of the substance beyond saturation will result in a disproportionate increase in the concentration of free compound. When systemic exposure has been determined over the range of toxicity study dose levels, graphical representation of the data (i.e., AUC or $C_{max}$ vs. dose level) will reveal important information (Figure 2):
A linear relationship indicates dose proportionality (linear kinetics).
A superproportional increase in AUC or $C_{max}$ with dose level indicates nonlinear kinetics (e.g., saturation of metabolism or excretion).
A subproportional increase in AUC or $C_{max}$ with dose level may indicate poor absorption, autoinduction of metabolism, or other adaptive or high-dosage shifts in toxicokinetic behavior.

Figure 2 Relationships between external and internal doses. (a) A linear relationship results when internal dose is proportional to external dose; (b) a superproportional relationship results when elimination processes become saturated as dose level increases; (c) a subproportional relationship results when uptake processes become saturated, or other shifts in toxicokinetic behavior occur as dose level increases.
Two of the more sensitive parameters for detecting nonlinear or saturation kinetics are Cl and dose-normalized AUC (i.e., AUC/dose). These parameters remain constant across the dose range of linear kinetics, but change at dose levels that exceed the linear range. Revealing nonlinear toxicokinetics has important implications for the interpretation of toxicity studies [for review article, see 43]. Toxicity data obtained at dose levels that fall outside the linear or proportional range will be of limited use for extrapolation to low-dose levels.
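A simple screen for dose proportionality based on dose-normalized AUC might look like the sketch below. The 20% tolerance is an arbitrary illustrative threshold, not a value from the chapter, and the comparison of only the highest and lowest dose groups is a simplification; the dose and AUC values are hypothetical.

```python
def dose_normalized_auc(doses, aucs):
    """AUC/dose for each dose group."""
    return [auc / dose for dose, auc in zip(doses, aucs)]


def classify_dose_proportionality(doses, aucs, tolerance=0.2):
    """Compare AUC/dose at the highest dose with that at the lowest dose.

    doses must be in ascending order. tolerance is an illustrative
    threshold (assumption), not a regulatory criterion.
    """
    normalized = dose_normalized_auc(doses, aucs)
    ratio = normalized[-1] / normalized[0]
    if ratio > 1.0 + tolerance:
        return "superproportional"
    if ratio < 1.0 - tolerance:
        return "subproportional"
    return "proportional"


doses = [10.0, 30.0, 100.0]  # hypothetical dose levels, mg/kg
print(classify_dose_proportionality(doses, [5.0, 15.0, 50.0]))
print(classify_dose_proportionality(doses, [5.0, 20.0, 120.0]))
print(classify_dose_proportionality(doses, [5.0, 12.0, 25.0]))
```

A formal assessment would also account for variability within dose groups; this sketch only flags the direction of any departure from linearity.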
4. Absorption
The method used to determine the extent and rate of absorption of a test substance depends upon the route of administration of the test substance and the type of data acquired. Bolus IV injection results in complete (100%) absorption of the parent compound that is taken up nearly instantaneously. When use of an IV reference group is specified, either the amount of dose excreted or an AUC should be determined for comparison with the tested route. For oral and dermal administration, the fraction (F) or percentage (F × 100) of the parent compound absorbed can be determined by relating AUCs determined by the IV and extravascular routes (see section on systemic data analysis above). Without a reference dose, oral or dermal absorption may be expressed as a fraction or percentage of the dose administered (calculated from the total amount of dose-related material recovered from excreta, expired air, and carcass divided by the dose administered). Analysis of the parent compound concentration vs. time curve as described (see section on systemic data analysis above) may allow a determination of the absorption rate constant for oral, dermal, or inhalation absorption. For the cutaneous route of exposure, it may be appropriate to calculate an absorption rate for substances penetrating at a relatively constant rate. This often occurs when an excess of the test substance is applied to the skin for the entire exposure duration. Thus

$$\text{Average absorption rate} = \frac{\text{Total dose recovered in excreta and carcass}}{\text{Surface area of exposure} \times \text{Time of exposure}} \qquad (14)$$
Extent of cutaneous absorption may be calculated by subtracting the dose recovered from the skin surface from the dose administered. Similarly, rate of absorption following an inhalation exposure may be equivalent to the rate of loss of test substance from a static exposure system [44-46]. For inhalation exposures, extent of absorption is not routinely calculated due to the difficulty of determining dose administered. However, the experimentally determined ratio of inhalation and IV parent compound AUCs (first-order conditions) may provide an estimate of the inhalation dose ($Dose_{Inh}$)

$$Dose_{Inh} = AUC_{Inh} \times Dose_{IV}/AUC_{IV} \qquad (15)$$
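The absorption calculations in Equations (14) and (15), together with the cutaneous extent-of-absorption subtraction, can be sketched as below. Function names, units, and values are hypothetical and serve only to show how the quantities combine.

```python
def cutaneous_dose_absorbed(dose_administered, dose_on_skin_surface):
    """Extent of cutaneous absorption: administered minus unabsorbed surface dose."""
    return dose_administered - dose_on_skin_surface


def average_absorption_rate(dose_recovered, surface_area, exposure_time):
    """Equation (14): recovered dose per unit area per unit time."""
    return dose_recovered / (surface_area * exposure_time)


def inhalation_dose(auc_inh, auc_iv, dose_iv):
    """Equation (15): inhaled dose estimated from the AUC ratio (first-order conditions)."""
    return auc_inh * dose_iv / auc_iv


# Hypothetical dermal study: 120 ug recovered, 10 cm2 exposed for 6 h.
print(average_absorption_rate(120.0, surface_area=10.0, exposure_time=6.0))  # ug/cm2/h
print(cutaneous_dose_absorbed(100.0, dose_on_skin_surface=85.0))             # ug
# Hypothetical inhalation study referenced to a 5 mg/kg IV dose.
print(inhalation_dose(auc_inh=30.0, auc_iv=60.0, dose_iv=5.0))               # mg/kg
```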
5. Distribution
The apparent volume of distribution (VD) provides an indication of the extent of a compound's distribution out of the analyzed matrix (usually plasma). The VD is larger for substances that are more extensively distributed to tissues. It is common practice to relate apparent volume of distribution to known volumes of body fluids. Thus, a VD equal to 580 ml/kg body weight is suggestive of a distribution restricted to total body water (body water accounts for 58% of body weight) [34], and a lack of distribution to extravascular tissues. Conversely, VD can exceed 1000 ml/kg (i.e., 100% of body weight) for substances that have a high affinity for certain tissues relative to plasma. A high VD generally reflects some degree of tissue concentration. Tissue to plasma concentration ratios can be used to indicate the extent of tissue distribution or accumulation; a time-independent alternative is the relative residence, RR, which is calculated by determining the ratio of tissue to plasma AUCs [35]. When tissue distribution is determined by the measurement of total radioactivity as a surrogate for test substance and related metabolites, it should be recognized that radiolabel may become incorporated into endogenous biomolecules through normal catabolism. This conversion of a test substance and incorporation into the carbon pool represents an innocuous situation, in contrast with the irreversible binding of reactive metabolites to critical biomolecules that is often responsible for toxicity.
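The relative residence can be illustrated with a short sketch that computes tissue and plasma AUCs over a shared set of sampling times and takes their ratio. The linear trapezoidal rule is an assumption made for the example, and the concentration values are hypothetical.

```python
def auc_trapezoid(times, concs):
    """Linear trapezoidal area from the first to the last sampling time."""
    return sum(
        (t_next - t_prev) * (c_prev + c_next) / 2.0
        for t_prev, t_next, c_prev, c_next in zip(times, times[1:], concs, concs[1:])
    )


def relative_residence(times, tissue_concs, plasma_concs):
    """RR = tissue AUC / plasma AUC over the same sampling schedule."""
    plasma_auc = auc_trapezoid(times, plasma_concs)
    if plasma_auc <= 0:
        raise ValueError("plasma AUC must be positive")
    return auc_trapezoid(times, tissue_concs) / plasma_auc


# Hypothetical profiles: a tissue that retains compound longer than plasma.
times = [0.0, 1.0, 2.0]
plasma = [4.0, 2.0, 0.0]
tissue = [8.0, 8.0, 4.0]
print(relative_residence(times, tissue, plasma))
```

Unlike a single-time-point tissue to plasma ratio, this value does not depend on which sampling time happens to be chosen.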
6. Biotransformation and Bioactivation
Metabolism studies reveal what substances, in addition to the original (parent) compound, the organism is exposed to in a toxicity study. Excreted metabolites represent the end products of biotransformation; their measurement provides information on the extent of biotransformation and the number and relative amounts of different end products formed. In some cases, additional information can be inferred regarding the metabolic pathways and bioactivated intermediates that are involved in the conversion of parent compound to end products. Toxicity of a compound may be associated with metabolism or with one or more metabolites. When this is the case, the onset of toxicity can often be correlated with a certain concentration of metabolite in blood or target tissue. Dose-related comparisons are useful, since toxicity can arise from routes of metabolism that are minor at lower doses but become significant at dose levels that saturate competing elimination pathways. Comparing metabolite profiles across a range of dose levels sometimes provides evidence that a specific metabolite or pathway is associated with toxicity. Thus, the presence of a new or disproportionate increase in a specific metabolite as the dose level is increased to toxic levels would suggest a role for the metabolite in toxicity. Differences in metabolite profiles observed after single and repeated dosing may also be indicative of enzyme induction or inhibition, and may contribute to differences in acute and chronic toxicities.
7. Single vs. Repeat-Dose Comparisons
Repetitive dosing frequently alters a chemical’s toxicokinetic behavior, with implications for drug, environmental, or workplace exposure. Changes in some measures of systemic exposure, such as AUC or $C_{max}$, may occur following repetitive dosing due to induction or inhibition of the compound’s own metabolism or changes in extent of absorption. Enzyme induction and inhibition can be examined by comparison of metabolite profile after single and repeated treatment. More direct evaluation of enzymology might include measurements of enzyme activities in vitro or product formation rates using the tissues from exposed and naive animals. Repeated-dose studies are also useful for determining the extent and tissue(s) of bioaccumulation. Bioaccumulation potential is higher for compounds having a low systemic clearance or long elimination half-life, and an accurate description of these parameters requires a well-defined terminal slope of the concentration vs. time curve. The potential for bioaccumulation may be difficult to detect in single-dose studies if systemic concentrations drop quickly below limits of analytical sensitivity. Repeated-dose studies should focus on systemic measurements using an analytical method that is specific for the active form of the chemical found in target tissues or tissues suspected of the greatest degree of accumulation (e.g., fat).
8. Route and Species Comparisons
Route-dependent differences in absorption, distribution, biotransformation, and associated kinetics are well known, and are important modifiers of a chemical’s toxic potential. The oral route, in particular, can result in a “first-pass effect” where extensive metabolism by the liver modifies the systemic delivery of the compound. If hepatic metabolism serves to detoxify the compound, then the inhalation route will deliver a greater dose of the toxicant relative to the oral route. On the other hand, if bioactivation occurs in the liver, then hepatic toxicity may be more prevalent with an oral vs. alternate exposure route. In selecting the route of exposure for toxicity studies, primary consideration should be given to those routes expected in humans potentially exposed during the manufacture or use of the chemical. Route-dependent differences in toxicokinetics can explain route differences in toxicity and can have a significant impact on risk assessment [47].

Cross-species extrapolation from experimental animals to humans is a fundamental problem faced by toxicologists. False predictions of toxicity in humans need to be avoided by generating the best possible information on a chemical’s potential hazard. Data that reduce the uncertainty associated with cross-species extrapolation can be obtained from both toxicokinetic studies and studies of toxic mechanism of action.
To improve the reliability of cross-species extrapolations, data that relate exposure dose to internal tissue dose should be developed for more than one species of test animal, and, when feasible, humans. Species-related differences in toxic response can result from species-specific toxicokinetic differences (e.g., extent of absorption, degree of metabolic activation), and may also result from intrinsic species differences in sensitivity, sometimes referred to as toxicodynamic differences [48]. The test species appropriate for risk assessment or safety evaluation should predict metabolic and kinetic behavior in humans as well as the biological response. Certain mechanisms of toxicity are not applicable to human safety evaluation or risk assessment. An understanding of both the mechanism of toxic action in the test species and the potential for this mechanism to operate in humans will further strengthen predictions of human toxicity based on toxicological data. Comparative data showing significant species differences may help preserve a viable new product or prevent undue restriction of an existing product where its toxic effects are irrelevant to humans.
C. Reporting
Reporting requirements for toxicokinetic studies, like the guidance provided, tend to be flexible and dependent upon the type(s) of study performed. They range from the most basic of instructions, as the following from ICH [6]:

A comprehensive account of the toxicokinetic data generated, together with an evaluation of the results and of the implications for the interpretation of the toxicology findings, should be given. An outline of the analytical method should be reported or referenced. In addition, a rationale for the choice of the matrix analyzed and the analyte measured should be given.

to a comprehensive list of specifications for the body of the report [8], as summarized below:
Summary: A summarized analysis of results and conclusions drawn.
Introduction: Objectives, guideline references, regulatory history, if any, and rationale.
Materials and Methods: Test substance identification: physicochemical properties, vehicles or carriers, identification of radiolabeled material; test system: species, strain, age, sex, body weight, health status and husbandry; study design and methodology; statistical analysis.
Results: Tabulated radioactivity analysis results (typically dpm and µg equivalents), graphs, representative chromatograms and spectrometric data, proposed metabolic pathways, and molecular structure of metabolites. If applicable, justification of exposure conditions, dose levels; description of pilot studies; quantity and percentage recovery of radioactivity in urine, feces, and expired air, and other matrices as appropriate; tissue distribution (% dose and µg equivalents per g tissue); material balance; plasma concentrations and pharmacokinetic parameters; rate and extent of absorption; quantities of test substance and metabolites (% dose) in excreta; individual animal data.
Discussion and Conclusions: Provide a plausible explanation of the metabolic pathway for the test substance; emphasize species and sex differences whenever possible; discuss the nature and magnitude of metabolites, rates of clearance, bioaccumulation potential, and level of tissue residues, as appropriate; concise conclusion.
IV. CURRENT APPLICATIONS
Data developed in toxicokinetic and associated metabolism and mechanistic studies have improved human health risk estimates and reduced their uncertainty [49-52]. Dosimetry models (e.g., physiologically based pharmacokinetic, nasal air flow, dermal, and pulmonary uptake) are increasingly being used to estimate human kinetic parameters. These models incorporate the principal biological factors governing the disposition of the compound in the body, including blood flow rates, organ volumes, intrinsic tissue solubility of the compound, protein binding, and metabolic rate constants. These models enable a chemical’s disposition to be simulated and provide estimates of the dose delivered to a target tissue. The behavior of the compound can also be predicted under varying conditions, making possible high-to-low dose, cross-species, and route-to-route extrapolations. Pharmacokinetic models can also be developed for groups of compounds that are metabolically related, in which the model consists of linked submodels for each compound [53]. Models developed for such “families” of compounds quantify the internal dose for the parent compound as well as internal doses for each metabolite. When such models are used in conjunction with parent compound toxicity studies, it is possible to estimate potentially toxic levels for each of the metabolites. This approach to risk assessment, named the family approach, has been recommended as an efficient method for determining acceptable exposure limits for metabolically related compounds. Quantitative risk assessments rely on the knowledge or expectation that the response to the chemical is related to its concentration, but this is not universally the case, as discussed in the section on route and species comparisons (see above) and elsewhere [54-56].
While there are limits to the valid application of tissue concentration data in quantitative risk assessment, its use as a measure of exposure is preferable to the use of administered dose, which is standard practice in the absence of toxicokinetic data. The dosimetry model can be extended by incorporating mechanistic data as it becomes available, thus building a risk assessment model that uses available knowledge of a compound’s kinetics and dynamics. The application of dosimetry models to problems in toxicology and risk assessment has been well described in numerous publications [e.g., 57-62], and their use has been endorsed by industrial and regulatory scientists alike. When dosimetry models are not available for determining human kinetic parameters, certain parameters can be estimated by a method known as interspecies allometric scaling [63,64]. In this method, a parameter determined for the experimental animal is adjusted to human proportions, usually by a power function x of the body weight (bw), thus

$$\text{Parameter} \propto bw^{x} \qquad (16)$$
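Equation (16) implies that a parameter measured in one species can be scaled to another by the ratio of body weights raised to the power x. The sketch below shows that arithmetic only; the exponent is supplied by the user, and the exponents, body weights, and parameter values shown are hypothetical placeholders rather than recommended values.

```python
def allometric_scale(animal_value, animal_bw_kg, human_bw_kg, exponent):
    """Scale a parameter between species assuming Parameter is proportional to bw**x."""
    if animal_bw_kg <= 0 or human_bw_kg <= 0:
        raise ValueError("body weights must be positive")
    return animal_value * (human_bw_kg / animal_bw_kg) ** exponent


# Hypothetical example: a whole-body parameter of 2.0 (arbitrary units)
# measured in a 0.25 kg rat, scaled to a 70 kg human.
print(allometric_scale(2.0, animal_bw_kg=0.25, human_bw_kg=70.0, exponent=1.0))
print(allometric_scale(2.0, animal_bw_kg=0.25, human_bw_kg=70.0, exponent=0.5))
```

The result is highly sensitive to the exponent, which is why species comparisons of AUC and Cl are useful for choosing an appropriate scaling factor.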
AUC and Cl are useful parameters for making species comparisons, and can assist in the determination of appropriate allometric scaling factors. Allometry is not generally useful for extrapolating metabolic parameters across species, but other parameters in humans, such as the apparent volume of distribution and half-life, have been successfully predicted from similar parameters in rats [65]. The use of safety factors for interspecies and interindividual differences is common practice in determining acceptable human exposure levels for chemicals. Safety factors implicitly accommodate both the kinetic and dynamic aspects of inter- and intraspecies differences [66]. Renwick [67] has described a mechanism by which the usual 100-fold safety factor used for food additives could be modified by known or predicted differences in toxicokinetics between experimental animals and humans. The safety factor could be significantly reduced, for example, if a compound were shown to undergo complete metabolism (detoxication) prior to absorption. Such data-derived safety factors are an example of how toxicokinetic information has been used to set occupational exposure limits [68] and to establish acceptable or tolerable daily intakes [69,70].

This section has introduced several applications of toxicokinetic data for improving extrapolation to humans, risk assessment, and modifying safety factors. Besides these quantitative applications, toxicokinetic data have practical utility in the qualitative understanding of a compound’s behavior, as described throughout the chapter. The identification of key toxicokinetic characteristics, such as elimination half-life, first-pass metabolism, bioactivation, and dose saturation, is invaluable to understanding circumstances under which toxicity is likely to be expressed.
REFERENCES
1. OECD. OECD Guideline for Testing of Chemicals. “Toxicokinetics,” (Adopted April 4, 1984).
2. OECD. OECD Guideline for the Testing of Chemicals. Proposal for a new guideline, Percutaneous absorption: in vivo method, Draft, June 1996.
3. EEC 1988. Commission Directive of 18 November 1987 adapting to technical progress for the 9th time Council Directive 67/548/EEC on the approximation of laws, regulations and administrative provisions relative to the classification, packaging and labelling of dangerous substances (88/302/EEC). Off. J. L. 133, 30 May 1988, 1.
4. USFDA 1982. Toxicological Principles for the Safety Assessment of Direct Food Additives and Color Additives Used in Food. Bureau of Foods, FDA, 1986.
5. USFDA 1994. Guideline for metabolism studies and for selection of residues for toxicological testing. In: General principles for evaluating the safety of compounds used in food-producing animals. Revised July 1994, FDA Center for Veterinary Medicine.
6. ICH 1995a. Toxicokinetics: the assessment of systemic exposure in toxicity studies. Guideline for Industry S3A, March 1995.
7. ICH 1995b. Pharmacokinetics: guidance for repeated dose tissue distribution studies. Guideline for Industry S3B, March 1995.
8. USEPA 1998a. Health effects test guidelines OPPTS 870.7485 Metabolism and pharmacokinetics.
9. USEPA 1998b. Health effects test guidelines OPPTS 870.7600 Dermal penetration. Public draft.
10. USEPA 1996a. Health effects test guidelines OPPTS 870.8223 Pharmacokinetic test. Public draft.
11. USEPA 1996b. Health effects test guidelines OPPTS 870.8300 Dermal absorption for compounds that are volatile and metabolized to carbon dioxide. Public draft.
12. USEPA 1996c. Health effects test guidelines OPPTS 870.8320 Oral/dermal pharmacokinetics. Public draft.
13. USEPA 1996d. Health effects test guidelines OPPTS 870.8340 Oral and inhalation pharmacokinetic test. Public draft.
14. USEPA 1996e. Health effects test guidelines OPPTS 870.8500 toxicokinetic test. Public draft.
15. USEPA 1996f. Health effects test guidelines OPPTS 870.8245 Dermal pharmacokinetics of DGBE and DGBA. Public draft.
16. USEPA 1996g. Health effects test guidelines OPPTS 870.8360 Pharmacokinetics of isopropanol.
17. USEPA 1996h. Health effects test guidelines OPPTS 870.8380 Inhalation and dermal pharmacokinetics of commercial hexane.
18. J.R. Buchanan, L.T. Burka, and R.L. Melnick, Purpose and guidelines for toxicokinetic studies within the National Toxicology Program, Environ. Health Perspect. 105(5):468-471 (1997).
19. ECETOC 1992. EC 7th Amendment: Role of mammalian toxicokinetic and metabolic studies in the toxicological assessment of industrial chemical. Technical Report No. 46.
20. A.G.E. Wilson, S.W. Frantz, and L.C. Keifer, A tiered approach to pharmacokinetic studies, Environ. Health Perspect. 102(Suppl 11):5-11 (1994).
21. D.A. Smith, M.J. Humphrey, and C. Charuel, Design of toxicokinetic studies, Xenobiotica 20(11):1187-1199 (1990).
5 Inhalation Toxicity Studies
Raymond M. David
Eastman Kodak Company, Rochester, New York
I. INTRODUCTION
A. Inhalation as a Route of Exposure vs. Inhalation Toxicity
There are two ways to approach conducting an inhalation study: one is to use inhalation as a route of exposure and assess the effect on an organ system other than the respiratory tract, while the other is to conduct an inhalation toxicity study in which effects on the respiratory tract are most important. Inhalation as a route of exposure for animals is used extensively when it mimics the likely route of human exposure, but the effect to be evaluated may be on an organ system other than the respiratory tract. For example, developmental toxicity, reproductive toxicity, and neurotoxicity studies have been conducted in which animals were exposed by inhalation but the end points evaluated did not involve the respiratory tract. Quite often, these studies involved the exposure of animals to vapors of volatile liquids such as solvents. Gases or vapors can be readily absorbed into the blood and be distributed to other organs. Therefore, the likelihood that a vaporized test substance will affect other organs is high. Exposure to aerosols may have localized effects in the lungs or upper respiratory tract because the physics of the aerosol particle dictates where that particle is deposited. Particles may be deposited in only a specific part of the respiratory tract depending on their aerodynamic size. The respiratory tract will respond by trying to remove the particle via physical means such as sneezing or coughing, or via other defense mechanisms. Thus, most particles may not get absorbed into the blood at all. The basics of inhalation toxicity, effects on the respiratory tract, and factors that influence deposition and absorption of test substance have been reviewed in numerous texts [1-3].
B. Decisions, Decisions!
There are many options for how to conduct any toxicity study and what end points to evaluate. For many test substances and study types, these options are limited either by prevailing guidelines or by generally accepted methods for evaluating systemic toxicity. On the other hand, the number of options for evaluating inhaled test substances and the effects of inhalation studies is greater than for other study types because (1) there are so many variables that influence how the test substance gets to the lung, and (2) the lung or respiratory tract can be a target organ or just a portal of entry. As a result, it may be necessary to be creative and/or pragmatic in designing a study. The information and guidance provided in this chapter enumerate some of the variables to consider and what options are available based on the purpose of the study and the characteristics of the test substance. There may be options other than those discussed here, and it is important to work with the study director or contract laboratory to explore all the options. If the final study design chosen seems unorthodox compared with the current testing guidelines, it is recommended that the design and the design rationale be discussed with the appropriate individuals (scientists and regulators) in the government agency where the data are to be submitted. Although regulatory officials will generally not give approval for nonstandard study designs, making them aware of the issues and the logic in a study design before the study is conducted may be beneficial in the long run. If the data are for internal uses only, it is recommended that the study design be reviewed carefully to make sure that it addresses all the issues of concern. In either case, it is best to remember that a single study may not be able to address all the issues of concern. The logistics of exposure, the number of animals needed, or the manpower needed to assess all the end points desired may require more than one study.
II. CONTRACTING AN INHALATION STUDY
A. Just How Complicated Can This Be?
Why should studies conducted by inhalation exposure be any more complex than other studies? One reason is that there are several factors that influence the amount of the test substance that gets to the target organ (i.e., bioavailability). First, inhalation exposures are prolonged over several hours, much like slow infusion administration. As a result, there is a prolonged absorption phase that is limited by the availability of the test substance. Bioavailability is then dependent on absorption and the amount of the test substance that enters the body. One reason for this is that the test atmosphere (i.e., the mixture of test substance and air) is constantly being prepared or generated. Minor changes in the supply of
air or test substance can alter the concentration of the test atmosphere, and, thus, the amount of test substance to which the animal is exposed can vary from moment to moment. In addition, the test substance is not instantly delivered to the animal the moment the generator is turned on (except perhaps for some nose-only exposure designs). This delay in delivery is particularly true for whole-body exposure systems that use large chambers. Because these chambers are large relative to the amount of test substance entering the chamber, it takes time for the chamber to completely fill with the test substance. This equilibration time (and the same time for the termination) may pose logistical problems that are not associated with other routes of administration. A second factor influencing bioavailability, in the case of aerosols, is that the size of the particle influences where it is deposited in the respiratory tract. Particles that deposit in one area of the respiratory tract may have a different biological effect than particles that deposit in another area of the respiratory tract. Also, particles deposited in the upper respiratory tract or upper regions of the lung may be conveyed to the esophagus for ingestion. Thirdly, changes in the heart rate or respiration rate of the animal can affect the amount of test substance inhaled (and thus the amount available for absorption). For example, the test substance may be irritating to the mucous membrane or respiratory epithelium, and the animal may alter its breathing pattern to reduce the quantity inhaled. All the factors listed above influence the amount of the test substance absorbed during inhalation exposure because they can change the amount of test substance deposited in the respiratory tract or the location of the deposition and therefore, potentially, the biological response.
Because the quantity and characteristics of a test substance in the air are variable, constant checks and balances are necessary for the control of inhalation studies. In general, everything that can be measured or controlled is. Nothing, not even measurements from sophisticated instruments, is taken at face value. Each piece of equipment, flow meter, and thermometer should be calibrated, and each chamber should be tested. As a result of this checking and double checking, inhalation studies are time consuming, tedious, and generate volumes of data that assure whoever reviews the report that the biological responses observed could be reproduced because all the variables were quantified or controlled. As mentioned, it takes time for the chamber to completely fill with the test substance. Likewise, when the generator is turned off at the end of exposure, it takes time for the concentration of the test substance in the chamber to be reduced to the point that the doors can be opened without exposing personnel. Because it takes time for the chamber to be depleted of the test atmosphere, do not expect to perform complex end points requiring a lot of animal handling immediately after the end of exposure. Typically, the earliest that the animals can be handled is 30 min after the end of the exposure. This earliest handling time may vary
with the exposure system, so discuss this issue with the study director if there are concerns or questions.
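The equilibration and exhaust delays described above can be estimated with a simple model. The sketch below assumes an ideally mixed chamber with constant airflow, in which concentration rises toward the target (and later decays) exponentially with a time constant equal to chamber volume divided by airflow. This model, the function names, and the chamber dimensions are illustrative assumptions, not values from this chapter; a real chamber should be characterized empirically.

```python
import math

def time_to_fraction(chamber_volume_l, airflow_l_per_min, fraction):
    """Minutes for an ideally mixed chamber to reach `fraction` of the
    target concentration (assumed first-order build-up)."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    time_constant = chamber_volume_l / airflow_l_per_min  # minutes per air change
    return -time_constant * math.log(1.0 - fraction)

def concentration_during_exhaust(start_conc, chamber_volume_l,
                                 airflow_l_per_min, minutes):
    """Concentration remaining after the generator is turned off,
    under the same ideal-mixing assumption."""
    return start_conc * math.exp(-airflow_l_per_min * minutes / chamber_volume_l)

# Hypothetical whole-body chamber: 1000 L with 200 L/min airflow.
t99 = time_to_fraction(1000.0, 200.0, 0.99)
residual = concentration_during_exhaust(100.0, 1000.0, 200.0, t99)
print(f"t99 = {t99:.1f} min, residual = {residual:.2f}% of exposure level")
```

Under these assumptions the same time is needed to build up to 99% of target as to clear down to 1% of it, which is consistent with the chapter's point that both the start and the end of an exposure carry a delay.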
B. How to Select a Contract Laboratory for Testing
The basic Good Laboratory Practice (GLP) considerations in selecting a contract laboratory have been reviewed in Chapter 1. Selecting an appropriate laboratory for an inhalation study becomes even more complex. Questions such as “How can laboratories with inhalation capability be identified?” or “Do all laboratories that conduct inhalation studies have the same capability or experience?” are frequently asked. These are not easy questions to answer and a number of approaches, such as questionnaires, have been used to help in the selection process. It is difficult to pose all the correct questions, however, because subsequent questions may depend on the initial answers. Telephone calls followed by personal visits to potential contracting facilities are also recommended. Make a list of questions that need to be answered before calling. The kinds of questions to ask depend very much on the test substance and the study design contemplated. Figure 1 provides some basic questions that are appropriate to pose to prospective laboratories. When asking about experience with a specific type of test substance or test atmosphere, a negative response should not exclude a laboratory. A study director gains experience by working with a test substance (one does not get that knowledge otherwise), and there should be no reason why the test substance in question cannot be the first. On the other hand, it may take longer for the study to start if a laboratory does not have experience with a specific type of test substance. Bear in mind that the preliminary test period allows the study director to become completely familiar with the characteristics of the test substance. This period will be longer than normal if the laboratory has never worked with this type of test substance before. Such a delay may need to be factored into the selection process. Much of the process of selecting a laboratory depends on working with the study director.
The creativity of the study director working with the technical staff will be the key to a good study. Generally, the more input a study director has in the preliminary testing, especially if the test substance is unusual, the better the study will turn out. The study director needs to understand the problems surrounding generation of the test atmosphere. In addition, as the sponsor, one needs to feel that the study director has a good grasp of the technical difficulties involved with generating the test atmosphere and has considered methods to circumvent those difficulties. Once a laboratory is selected, discuss with the study director the time line for getting the study done. Have realistic expectations. Frequently, inexperienced sponsors underestimate the problems that can arise in trying to generate test atmospheres. Solving these problems takes time. It is not uncommon to take 2 to 4 wk to set up a study taking into account determining the method of measurement, generating a test atmosphere, determining the distribution in the chamber, etc. All of these considerations will need to be addressed even for a single-exposure study, so one needs to be patient. How much time it takes to complete the study (i.e., get the report) will depend on how well the laboratory can respond to the sponsor’s needs. Several weeks for an acute study is not uncommon. More time is required for an acute study in which subsequent exposure concentrations are determined by the results of the previous exposures (i.e., LC50 [concentration that is lethal to 50% of the population] studies or range-finding studies). These studies generally take longer because the study director does not set the next exposure level until the results of the previous exposure level are seen.

On the phone:
  Do you have inhalation chambers? What type?
    Whole-body: What size? How many chambers of each size? Maximum number of animals that can be exposed per chamber?
    Nose-only: Flow-past or other type? Number of animals per unit? What type of restraint tubes and what sizes are they?
  Could you conduct a study in which ___ number of animals are exposed simultaneously?
  What type of generation systems do you have?
  How much experience do you have with generating vapors or aerosols?
  What kind of aerosols, liquids or dusts?
  How many studies have you conducted with each?
  Can you monitor the concentration in each chamber or nose-only unit simultaneously?
  What kind of monitoring/analytical capability do you have?
  Have you ever tested a test substance with these particular physical chemical properties before?
During your visit:
  Is room air or outside air used for the chamber?
  Is the air HEPA and charcoal filtered?
  How close are the filters to the chamber?
  Where are the animals housed relative to the exposure room?
  Do the chambers look clean?
  Are the chambers constructed of a non-reactive material?
  Can all or most of the animals be easily observed during exposure?

Figure 1 Questionnaire to help select a laboratory for testing.
III. STUDY DESIGN
A. What Study Design Should Be Used?
In order to design a good study, the first step is to clearly identify the purpose of the study, i.e., is it for regulatory submission (agricultural chemical, industrial
chemical, or pharmaceutical), consumer protection, or occupational hazard assessment (respiratory or systemic toxicity). One of the primary reasons to do this is to determine under which regulatory guideline (if any) the study must be conducted. In spite of the best efforts of toxicologists worldwide to harmonize testing guidelines, there remain subtle differences among study designs that vary with the regulatory agency and country. Consult a recent copy of whichever regulatory guideline is appropriate. If the study report is to be submitted to several regulatory bodies or to regulatory bodies in different countries, select the most stringent guideline to use for the study to ensure that all the regulators will find the study acceptable. It is recommended that this issue of which guideline to use be discussed with the study director to find out what the study director’s latest experiences are. In general, the U.S. Environmental Protection Agency (EPA) and the Organization for Economic Cooperation and Development (OECD) guidelines are detailed and quite similar. They spell out the numbers of animals per group, the length of the exposure, what needs to be measured, etc. They also allow for the use of “limit tests,” in which only a single group of animals is exposed to a high concentration of the test substance. If no mortality occurs (actually, if the LC50 is greater than the “limit” concentration), then no further testing is required. The U.S. Food and Drug Administration (FDA) guidelines for pharmaceuticals provide rough guidance, and have not yet embraced the concept of a limit test. This lack of a limit test in the FDA guidelines means that no matter what, several groups of animals may need to be exposed to define the LC50. For some test substances, such as pharmaceuticals, that are administered via specialized devices, such as pulse inhalers, there are no specific testing guidelines. It may be advisable, then, to utilize a study design that resembles a standard guideline.
The only variation from that guideline may be in the means of administration (and characterization of the test atmosphere). Because each delivery device may pose unique problems for the investigator, it is difficult to provide general guidance concerning these types of studies. The investigator should discuss all aspects of the study with the study director.
B. How Long to Expose the Animals
If the study is not intended to meet regulatory requirements, then the number of options for the study design increases. The two most common alternative purposes for regulated studies are those for worker safety or consumer safety. In both cases, the use of accepted regulatory testing guidelines is recommended because it is easier to defend the study design when it conforms to an accepted testing methodology. On the other hand, using a study design patterned after a pesticide guideline may not adequately evaluate the test substance. For example, regulated studies use 4- or 6-hr exposures that can over- or underestimate the effects of the test substance depending on the exposed population. Workers may be exposed to low concentrations of a neat chemical for at least 6 hr, sometimes
longer, or very high concentrations for short periods such as 1 hr, whereas consumers are rarely exposed to the neat chemical and generally not for a long period of time. Thus, depending on the population to safeguard, it may be wise to conduct two studies: one in which animals are exposed to high concentrations for only 1 hr (a Department of Transportation type of study design), and another in which animals are exposed for longer periods of time. Consumer safety studies are even trickier because it is best to generate the test substance in a way that most closely mimics consumer exposure conditions. For example, if a product is sold in an aerosol spray can, it may be wise to expose animals to a test atmosphere generated from aerosol spray cans rather than one generated with a nebulizer used in the laboratory because the laboratory nebulizer is designed to produce aerosol particles of a specific size. Mass-produced nebulizers on spray cans may not produce particles with the same range of sizes as a laboratory nebulizer. Thus, the laboratory nebulizer may not truly mimic consumer exposure. In addition, most self-pressurized containers cannot generate consistently sized aerosol particles for long periods of time because the propellant tends to change the temperature of the can as it escapes. These containers are meant to be used for short bursts. Thus, the study design may need to mimic this intermittent exposure. For some consumer products, it is not the absolute or intrinsic toxicity of the test substance that is important, but its relative toxicity. For example, when testing product X123, the main question may be whether it is more or less toxic than product X122.
Should that study design be required, it is best to make sure that the reference product is something that is readily available, has a known toxicity, and has a composition or chemical identity that is not prone to alteration (i.e., does not deteriorate over time, or its composition is not changed by the producer).
C. An Acute Study
The complexity and cost of acute inhalation toxicity studies surprise most first-time sponsors of inhalation studies. They should not. All the characterization of the test atmosphere, problems with generation, and questions about how well the animals are exposed must be resolved during this brief 4- to 6-hr exposure period. Unlike repeated-exposure studies, during which there is time to make adjustments in the generation of the test substance, acute studies need to reach the desired concentration quickly and remain at that level for the duration of the exposure without substantial variation. Otherwise, the time-weighted average concentration (the average concentration over the total time of exposure) may not be within 10% of the target concentration desired. In addition, this is the first time that animals will have been exposed to the test substance, so the biological effects are unknown. Animals may need close monitoring to evaluate the potential lethal effects. Range-finding inhalation studies prior to LC50 studies are not common
because the effort required to expose one animal is the same as that required to expose 10 animals. However, if the test substance is very toxic by other routes (such as oral) or very irritating in dermal irritation tests, exposing a few animals for 1 to 2 hr may help establish the concentration range needed for a definitive study [4]. The duration of exposure is fixed by the testing guideline. The Toxic Substances Control Act (TSCA) and Federal Insecticide, Fungicide and Rodenticide Act (FIFRA) testing guidelines, as well as the OECD guidelines, require a 4-hr exposure for acute studies even though repeated-exposure studies require 6 hr of exposure. Some people find this difference inconsistent in the evaluation of safety to workers who may be exposed for up to 6 hr per day. On the other hand, the DOT testing requirements are for a 1-hr exposure! Exposure to pharmaceuticals can be for as long as 12 hr depending on how the test substance is administered to humans (see [1] for review). Although the exposure system should not influence the results of any inhalation study, it is possible to obtain seemingly different results using a nose-only exposure system compared with a whole-body exposure system [5], generally because of stress-related changes in the animal’s physiology or behavior. This may also occur in an acute toxicity study, i.e., lethal concentration values obtained in nose-only systems may be slightly lower (greater toxicity) than when whole-body chambers are used if animals are not acclimated to the restraint. Such problems can be overcome by conducting a mock exposure while exposing the animals to air for a few days prior to the initiation of the study. When using the whole-body system, acclimation does not seem to be necessary. Observation of the animals is an integral part of any study, especially acute studies.
Frequent (at least once per hr) descriptive observations during exposure can provide insight into the absorption and excretion of the test substance as well as its toxicity. Animals should also be observed prior to exposure to establish any preexisting conditions and at some point after exposure to establish the duration of any exposure-related effects. Because the exhaust period following whole-body exposure generally requires 30 min, it is unrealistic to perform postexposure observations before then. Animals can be observed as they are removed from the chamber, although recording the observations can be tedious. It may be more practical to observe the animals 1 hr after the end of the exposure (this includes the 30-min exhaust time) when all the animals are returned to their home cages. This time period will also give a better understanding of the excretion rate of the test substance.
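The time-weighted average criterion mentioned in this section (the average concentration over the total time of exposure, expected to fall within 10% of target) can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names, the sampling intervals, and the concentration readings are hypothetical and chosen only to illustrate a slow build-up at the start of a 4-hr exposure.

```python
def time_weighted_average(samples):
    """samples: list of (duration_min, concentration) pairs that together
    cover the whole exposure period."""
    total_time = sum(duration for duration, _ in samples)
    if total_time <= 0:
        raise ValueError("total exposure time must be positive")
    return sum(duration * conc for duration, conc in samples) / total_time

def within_tolerance(twa, target, tolerance=0.10):
    """True if the TWA lies within +/- tolerance (as a fraction) of target."""
    return abs(twa - target) <= tolerance * target

# Hypothetical 4-hr (240-min) acute exposure with a target of 500 units
# (e.g., ppm or mg/m3); the first 30 min reflect chamber build-up.
samples = [(30, 350.0), (60, 480.0), (60, 510.0), (60, 505.0), (30, 495.0)]
twa = time_weighted_average(samples)
print(twa, within_tolerance(twa, 500.0))
```

In this made-up example the low initial reading pulls the average below target but not outside the 10% band; a longer build-up or a shorter exposure would make the same start-up lag matter more.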
D. A Repeated-Exposure Study
Repeated-exposure studies (1, 2, 4, 13 wk, or longer) can be easier to conduct than single-exposure studies because the purpose is not to determine the acute
life-threatening concentration but, rather, to evaluate cumulative toxicity from repeated exposure to sublethal concentrations. In repeated-exposure studies, there is more time to characterize the test atmosphere and to make adjustments to keep it within 10% of the target concentration. Therefore, frequent determination of concentration (once per hr) may be excessive, especially if conducting a long-term study. One aspect of conducting a repeated-exposure study that is very important is the rotation scheme for the animal cages. Because the concentration is measured at a single location in the chamber and because there may be slight differences in the concentration at various locations within the chamber, moving the animals from one location to another (“cage rotation in the chamber”) tends to reduce the animal-to-animal variability that results from slight differences in concentration within the chamber. How frequently animals are moved to a new location in the chamber depends on the length of the study. For short-term studies of 4 wk or less, daily cage rotation may be advisable. For 13-wk studies, a weekly rotation scheme is probably adequate. Make sure that the laboratory has an adequate rotation scheme that will ensure that each animal moves through the corners of the chamber. The termination of the animals in repeated-exposure studies relative to the last exposure is another important aspect of the study design. It is common practice, even for oral toxicity studies, to terminate the animals 24 hr after the last exposure. Inhalation studies should not be different. It is not advisable to finish the last exposure on a Friday and wait until the next Monday to collect blood and tissues, especially if the study design is less than 13 wk of exposure (it is, however, acceptable for chronic studies). Waiting the weekend before terminating the animals allows recovery even though the purpose is to look for cumulative toxicity.
In addition to exposing the animals on the day prior to necropsy, some laboratories make it a practice to expose the animals at least 2 days prior to necropsy to ensure that blood levels of the test substance (or metabolite) are near steady state, especially if elimination is slow.
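One simple way to picture the cage rotation described in this section is a cyclic shift, in which every cage advances one position at each rotation (daily or weekly, depending on study length). The sketch below is an illustrative assumption, not a scheme prescribed by any guideline; the function names and the four-position chamber are hypothetical.

```python
def rotation_schedule(n_positions, n_periods):
    """Return schedule[period][position] = id of the cage at that position.
    Each cage advances by one position per rotation period."""
    return [[(pos - period) % n_positions for pos in range(n_positions)]
            for period in range(n_periods)]

def positions_visited(schedule, cage_id):
    """Set of chamber positions a given cage occupies over the schedule."""
    return {row.index(cage_id) for row in schedule}

# Hypothetical chamber with 4 cage positions rotated over 4 periods.
schedule = rotation_schedule(4, 4)
for period, row in enumerate(schedule):
    print(f"period {period}: {row}")
```

With at least as many rotation periods as positions, a cyclic shift places every cage in every position once, corners included. A laboratory's actual scheme may differ, so it is worth asking how its rotation guarantees that coverage.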
IV. THE TEST SUBSTANCE
A. Know the Test Substance
An inhalation study cannot be conducted without the sponsor and study director having a good grasp of the physical and chemical properties of the test substance. The test substance is going to be mixed or suspended in the air for long periods of time, so the study director needs to know the answer to several questions to generate a test atmosphere: Is it a gas, liquid, or solid at room temperature? Does it react with air even in the presence of high humidity? Is it reactive with any of the components of the exposure system? If a liquid, will the test substance vaporize at a reasonable temperature? Does it react with air to form peroxides
or a different chemical species? What is the lower explosive limit? If the liquid cannot be vaporized, is its viscosity low enough that it can be aerosolized (nebulized)? Is it soluble in water? If a solid, is it already a particulate and what is the size distribution of the particles? Can the test substance be compacted easily? Is it hygroscopic or water-insoluble? The answers to these questions will determine if the study will be of a gas, liquid aerosol, or dust. This determination will, in turn, influence the study director's decision on the exposure system to use, or if inhalation is even practical. So knowing the characteristics of the test substance ahead of time may make the whole process of planning for a study (and getting it started) go more smoothly. Figure 2 outlines some of the important questions about the test substance and how the answers influence the study and selection of exposure system.
[Figure 2: decision tree, not reproduced. It starts from the question "What is the physical state at room temperature?" (gas, liquid, solid, or mixture) and branches through questions on reactivity with air or any of the components of the exposure system, pressure at room temperature and the boiling point, flammability or peroxide formation, the lower explosive limit, solubility in water (leading to "nebulize solution"), and particle size (MMAD 10 µm). Footnotes: (1) if no, go directly to the bottom of the page; (2) if no, another route of administration, such as intratracheal instillation, may need to be considered; (3) if no, the biological relevance of inhalation may need to be considered; (4) if no, experimentation with the generation system may be needed; (5) if no, go to "Is it soluble in water?"]
Figure 2 Decision tree for selecting the nature of the test atmosphere and the exposure system.
To make matters more complex, the test substance may be any of the above (gas, liquid, or solid) packaged in a unique delivery system, such as an inhaler or canister. For these types of products, the device is the generator of the test atmosphere (usually an aerosol). That may seem to resolve many questions about how to generate the test atmosphere, but it does not. Canisters or inhalers are designed to deliver bursts of aerosol for short periods of time (seconds), not continuous delivery over hours. Therefore, a question must be faced: Should the device be attached to a continuous supply of test substance under constant pressure so it can generate the test atmosphere continuously over the exposure period, or should the device be triggered to deliver bursts of test atmosphere for the exposure period? There is no correct answer to this question.
B. How Much Test Material Will Be Needed for the Study?
Naturally, the characteristics of the test substance and the exposure system selected will determine how much test substance will be needed for the study. For example, for a 13-wk or chronic vapor study in large chambers, more than one 55-gal drum may be needed. Large-scale nose-only systems (25 to 75 animal ports) will not be much different. Smaller-scale nose-only systems (12 to 24 animal ports) will use 25% or less of what a whole-body chamber will use. If a liquid test substance is to be aerosolized, the amount used will be much lower than for vapor studies. Even if a large chamber is used, the amount needed for an aerosol study could be less than 100 kg for a 13-wk study. Table 1 can be used as a guide for deciding on the amount of test substance required. For each exposure concentration, the amount of test substance required can be estimated by multiplying column 4 by the concentration in mg/l and adding 25% for technical development.
Table 1 Quantities of Test Substance Needed for Inhalation Studies

Exposure system/      Maximum number     Likely air flow    Minimum amount of test
chamber size          of rats exposed    rate (lpm)         substance (g) per day(a)
Nose-only, 12-port    12                 10                 6
0.5 m³                20                 100                40
1 m³                  30                 200                80
4 m³                  100                800                300
8 m³                  300                1600               600

(a) Assumes a concentration of 1 mg/l for 6 hr.
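The estimate described above (column 4 of Table 1 multiplied by the concentration in mg/l, plus 25% for technical development) can be sketched as follows. The per-day values are those read from the last column of Table 1; multiplying by the number of exposure days is an assumption added here to scale the per-day figure to a whole study, and all function and variable names are hypothetical.

```python
# Minimum grams per day at 1 mg/l for a 6-hr exposure (Table 1, column 4).
GRAMS_PER_DAY_AT_1_MG_PER_L = {
    "nose-only, 12-port": 6,
    "0.5 m3": 40,
    "1 m3": 80,
    "4 m3": 300,
    "8 m3": 600,
}


def airborne_mass_g(air_flow_lpm, concentration_mg_per_l, hours=6):
    """Mass carried by the air stream in one exposure (a theoretical floor)."""
    liters_of_air = air_flow_lpm * hours * 60
    return liters_of_air * concentration_mg_per_l / 1000.0


def estimate_test_substance_g(system, concentrations_mg_per_l, exposure_days,
                              development_margin=0.25):
    """Estimate total test substance for all exposure groups in a study.

    Assumption: the per-day amount scales linearly with concentration and
    with the number of exposure days; the 25% margin follows the text.
    """
    per_day_at_1 = GRAMS_PER_DAY_AT_1_MG_PER_L[system]
    per_day_all_groups = sum(per_day_at_1 * c for c in concentrations_mg_per_l)
    return per_day_all_groups * exposure_days * (1 + development_margin)


# Hypothetical 13-wk study, 5 exposures per week, three groups in a 1 m3 chamber.
total_g = estimate_test_substance_g("1 m3", [0.1, 0.5, 1.0], exposure_days=65)
print(round(total_g), "g")
print(airborne_mass_g(200, 1.0), "g per day carried by the air at 200 lpm")
```

The second function is only a cross-check: at 200 lpm and 1 mg/l, the air stream carries 72 g in 6 hr, which sits just below the 80 g per day listed for the 1 m³ chamber.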
Dusts are a different story. Aerosolizing (or suspending as the case may be) a dust presents an entirely different situation for the inhalation toxicologist. Typically, dust particles are not generated in the same sense that a liquid aerosol is generated. The dust generator does not produce a particle that has a diameter that is much different from the diameter of the original test substance, whereas the size of a liquid aerosol particle is determined primarily by the generator (nebulizer) used. In addition, dusts are very electrostatic, and the charge needs to be neutralized (or at least reduced) to prevent the dust from electrostatically adhering to every surface with which it comes in contact. Because of losses on surfaces, the maximal concentration of an atmosphere of dust tends to be much lower than that which can be achieved with a liquid. So, in addition to the exposure system dictating the amount of test substance needed for a study, the amount will depend on whether the test atmosphere is a vapor, liquid aerosol, or dust and how it is generated.
C. Chemical Analysis
Good Laboratory Practice Regulations require that any test substance be analyzed by the performing laboratory for purity, identity, stability, concentration, and homogeneity in the vehicle. In order to do that, the laboratory will need to have a good analytical method. The concentration of the test substance in air will be determined during the exposure (analytically as well as gravimetrically). The method used will need to be specific enough to clearly identify the test substance in question. Absorption in the infrared range may not be adequate for this since the EPA does not consider the method to be specific enough for identification. Instead, wet chemical analysis or chromatography may be necessary. The homogeneity of the test substance concentration in air (or uniformity of the test atmosphere concentration in the chamber, as the case may be) will be determined prior to exposures because this is typically part of the preexposure trials. Inhalation laboratories routinely do most of these analyses.
D. Analyzing How Much Is in the Air
One of the single most important aspects of a study is the determination of the airborne concentration of the test substance. Nowadays, no one conducts inhalation studies without measuring concentration, but the scientific literature is filled with acute toxicity values that relied entirely on nominal concentrations rather than analytical concentrations. The distinction between these two methods of expressing the amount of test substance in the atmosphere is that the nominal concentration is calculated by dividing the total amount of test substance used in the exposure by the total amount of air flowing through the chamber, whereas
the analytical concentration is a mean of the results of sequential analyses of the test atmosphere. How one analyzes the test atmosphere depends on the physical and chemical properties of the test substance. For example, if the test substance is an organic liquid that can be vaporized or a gas, measuring the absorbance in the infrared (IR) or ultraviolet (UV) range can determine the concentration of organic vapors. The advantage of this method is that the measurement can be instantaneous (real-time) and continuous during the exposure. If the test substance does not absorb in the IR or UV ranges, one can use gas chromatography to quantitate the amount of test substance in the air. It is difficult to do this continuously, however; it is more common to "grab" a sample and analyze that "grab" sample. This procedure generally involves siphoning off some of the test atmosphere into a Teflon® or Tedlar® bag, then analyzing what is in the bag. The more frequently this is done, the better the information about instantaneous concentrations. If a grab sample cannot be taken or if the concentration is so low that the detector is limited in how much it can detect, then it may be necessary to "trap" the test substance by chemically reacting it with another substance. Alternatively, the test substance can be adsorbed onto an inert substance (e.g., activated charcoal) and the trapped amount determined. As mentioned earlier, IR is a great way to provide instantaneous (real-time) values, but trapping and chemically analyzing the atmosphere will also be necessary. The frequency of these analyses depends on whether an acute or repeated-exposure study is conducted, but it is safe to say that at least one measurement per day will be necessary no matter what. Grab sampling and determining the amount of test substance in the air gravimetrically are routinely used to confirm concentrations of aerosols (liquid and solid), but should be used for monitoring purposes only.
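The distinction between nominal and analytical concentration drawn at the start of this section reduces to two small calculations. The sketch below is illustrative only: the function names, the units (g, lpm, min, mg/l), and the example numbers are assumptions, and the 10% tolerance is the target-concentration band mentioned earlier in the chapter.

```python
from statistics import mean


def nominal_concentration_mg_per_l(mass_used_g, air_flow_lpm, duration_min):
    """Total test substance used divided by total air through the chamber."""
    total_air_l = air_flow_lpm * duration_min
    return mass_used_g * 1000.0 / total_air_l


def analytical_concentration_mg_per_l(measurements_mg_per_l):
    """Mean of the sequential analyses of the test atmosphere."""
    return mean(measurements_mg_per_l)


def within_tolerance(value, target, tolerance=0.10):
    """Check whether a concentration is within a fraction of the target."""
    return abs(value - target) <= tolerance * target


# Hypothetical 6-hr exposure at 200 lpm with a 1 mg/l target.
nominal = nominal_concentration_mg_per_l(90.0, 200.0, 360.0)
analytical = analytical_concentration_mg_per_l([0.95, 1.02, 0.98, 1.05, 1.00, 0.97])
print(f"nominal    = {nominal:.2f} mg/l")
print(f"analytical = {analytical:.3f} mg/l")
print("analytical within 10% of target:", within_tolerance(analytical, 1.0))
```

In this made-up example the nominal value is higher than the analytical value, which is the kind of gap that can arise when some of the material used never reaches the point where the atmosphere is sampled, for instance through losses on surfaces.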
Unlike grab samples of vapors that siphon off some of the test atmosphere in a matter of minutes, grab samples of aerosols are generally collected over a period of time because of the need to collect enough mass for accurate weighing. Although grab samples are taken intermittently and do not give an instantaneous atmospheric concentration of the test substance, this does not mean that samples need to be taken very frequently. Typically, measurements once an hour are sufficient, at least for short-term studies. For longer-term studies (28 days or longer), sampling once an hour yields too many numbers. There are instruments that can provide continuous, instantaneous measurements of aerosol concentration. These instruments are typically easily overloaded and cannot be used for continuous 6-hr readings, however, unless the airborne concentrations are low. Determining the concentration of an aerosol provides only a portion of the necessary information about exposure. Determining the particle size is equally important because the size of the particle influences where it will deposit in the respiratory tract. Having a particle size that is too big is one way to invalidate
a study. Typically, the median particle size (the size that is exactly at the midpoint of the distribution) should be 1 to 5 microns for the study to be acceptable. If the particles are not in this size range, the study director will need to come up with alternative generation methods that will make the study acceptable. The frequency of particle size measurements is not as critical as that of concentration measurements because the size distribution from the generator tends not to change once a steady state has been reached. Therefore, conducting particle size measurements twice during an acute study (single exposure) is more than adequate. For a repeated-exposure study, once per day is adequate to determine how the particle size might change over the course of a study. Particle size determinations, like grab samples on filter paper, have not changed substantially over the years. A number of instruments are commercially available to determine the particle size distribution using photobeam or laser technology, but some of these instruments have been rejected by the EPA as not being accurate. The EPA prefers tried-and-true methods, such as cascade impaction. One final comment about sampling for concentration and particle size: The guidelines require that measurements for these parameters be taken from the “breathing zone” of the animals. This provision is meant to ensure that the analytical concentration for the chamber accurately represents the exposure concentration for the animals. It should be recognized that the concentration varies within the chamber simply because each chamber has its dead spots. Therefore, taking a sample from one portion of the chamber may not reflect the concentration throughout the chamber. On the other hand, it is impractical to take samples from different areas of the chamber during exposure because one would never know if changes in the concentration represent fluctuations of the generator output or just variability within the chamber.
Therefore, most study directors pick a central spot in the chamber and take all samples from that spot. Ideally, the spot is as close to the animals as practical, but in a large chamber with lots of animals, that will be impossible. The next best thing is to determine how much variability there is in the chamber and if it can be minimized prior to initiation of the study. This procedure, typically referred to as homogeneity determination, compares the concentration at various points in the chamber to a reference point, which is usually the location where all measurements will be taken during the study. The sampling locations selected should probably correspond to the extremes in the chamber where animals might be placed during exposure. For example, if the chamber can hold 30 animals but the study uses only 10, then the sampling locations should cover the extremes of where those 10 animals will be placed in the chamber, not just the corners of the chamber. How much variability in concentration from one location to another is acceptable? It would be best if the difference between any given point and the reference point were less than 10%. If the difference is more than 10%, consult with the study director on how it can be minimized, or increase the target concentration so that the extremes are closer to the desired exposure concentration. For nose-only exposure, the question is where to take samples for concentration and particle size measurements. The concept of "breathing zone" is not applicable for nose-only systems because for most nose-only systems, the entire area is the breathing zone; in this case, sampling is generally done from any animal port. Drawing a sample from an animal port may influence the distribution of the test atmosphere to the other ports, however, so sampling from the inlet or outlet of the apparatus may be a better site. This issue can be worked out with the study director. The only reason to discuss the sampling issues here is to alert potential study monitors to the aspects of the exposure that are unique and need to be resolved. As already mentioned, characterization of the test atmosphere during exposure is time consuming yet is very important. Taking frequent measurements to determine the concentration of the test substance (at least once per hour or half-hour) is recommended. If dealing with an aerosol or other particulates, measure the particle size at least twice and remember to determine the test substance analytically as well as gravimetrically.
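The homogeneity determination described earlier in this section (comparing each sampling location with the reference point and looking for differences of less than 10%) can be sketched as below. The location labels, example concentrations, and function name are hypothetical; only the 10% criterion comes from the text.

```python
def homogeneity_report(reference_mg_per_l, samples_mg_per_l, limit_pct=10.0):
    """Compare each sampling location with the reference point.

    Returns a dict mapping location -> (percent difference, acceptable flag),
    where acceptable means the absolute difference is below limit_pct.
    """
    report = {}
    for location, conc in samples_mg_per_l.items():
        diff_pct = (conc - reference_mg_per_l) / reference_mg_per_l * 100.0
        report[location] = (diff_pct, abs(diff_pct) < limit_pct)
    return report


# Hypothetical preexposure trial: reference point at the chamber center.
samples = {
    "front-left": 0.95,
    "front-right": 1.04,
    "back-left": 0.91,
    "back-right": 1.12,
}
report = homogeneity_report(1.00, samples)
for location, (diff_pct, ok) in report.items():
    status = "ok" if ok else "discuss with study director"
    print(f"{location:12s} {diff_pct:+6.1f}%  {status}")
```

In this invented data set, one location differs from the reference by more than 10%, which is the situation in which the text suggests consulting the study director or adjusting the target concentration.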
V. THE TEST SYSTEM
A. Rats or Mice?
Selecting the test species is a relatively straightforward decision. Rats and mice are perhaps the most commonly used animals for inhalation studies. Other species, such as rabbits or guinea pigs, can be used if there is a good reason, such as when the end point to be evaluated requires a species other than rats and mice (e.g., a sensitization study that might be more appropriate in guinea pigs). However, under most regulatory protocols, the rat is the animal of choice because of its size. Mice may be unacceptable as an animal model because they have the ability to reduce their breathing patterns more than rats in response to inhaled irritants, thereby decreasing the inhaled dose. As a result, rats exposed to a test substance may demonstrate more severe biological responses than mice exposed to the same concentration; that is, mice may underpredict the biological effects of exposure to irritating substances. Even among rats, however, there may be subtle differences in strains that require the use of one strain instead of another. For example, if behavioral end points are used, the Sprague-Dawley or Long-Evans strains are better choices than is the F-344 strain. However, the opposite may be true if pulmonary end points are being studied. In either case, the laboratory should have a historical database on which to rely when evaluating the effects of the test substance. That
is not to say that the historical database should substitute for a control group, only that experience with a particular strain (either typical background lesions or responses) can be helpful to determine if the effects observed in the study are meaningful.
B. Nonrodents: Rabbits, Dogs, and Monkeys
Developmental and reproductive toxicity studies, and some FDA studies, require the use of nonrodent species. Large species such as rabbits, dogs, and primates pose problems to the inhalation toxicologist by virtue of their size. Exposure of rabbits to vapors as part of developmental or reproductive toxicity studies can be conducted in whole-body chambers. Special caging will be needed and the chamber size will need to exceed 3 m³ if body burden is not to exceed 10%. Many facilities built in the 1970s and early 1980s contained chambers of that size. Nose-only exposure systems for rabbits are rare. Exposure of other species, such as dogs or primates (exposing cats is not common), can be accomplished using masks (essentially nose-only). As with rodents, the animals need to be acclimated to wearing the mask. Primates can also be trained to accept treatment via pulse inhalers, while exposure of dogs can be performed via implanted tracheal tubes. Intratracheal tubes for exposure, such as those described by Halpern and Schlesinger [6] for exposure of rabbits, may also be available. These novel exposure systems may be valuable for unique types of test substances that have characteristics that make standard exposure systems impractical. However, it is probably best to let the study director suggest using them rather than trying to find a laboratory that specializes in these types of systems. If the standard exposure systems are inadequate, intratracheal instillation is an option. This method of administration has been used for many years to deliver known quantities of test substance to the lungs. There are drawbacks to using intratracheal instillation (bolus into the lungs as opposed to airborne exposure), such as the volume of liquid (usually saline) that is used as a carrier, but the advantage is that the lung burden (i.e., the amount of test substance delivered to the lungs) is known.
C. Well-Aged Is Better
Regardless of which species or strain is used, it is important to make sure that the animals are the correct age prior to initiation of exposure. Most of the testing guidelines specify the age of the rats as “young adult,” which may translate into 4 to 6 wk of age. However, older animals (6 to 8 wk), which are larger, tend to tolerate exposure better. The selection of older animals is especially important when a nose-only exposure system is used because small animals may not fit
well in the restraint tubes. It is not uncommon for rats to try to avoid inhaling irritating test substances simply by turning around in the tube or at least pulling their head back out of the exposure stream. The smaller the animal, the easier it is for it to turn completely around. Larger rats fill the tube better and are less likely to be able to reverse their position to avoid inhaling the test substance.
D. How Many?
When the species, strain, and age of the animal have been selected, make sure that sufficient numbers of animals are included to meet the testing guidelines. Short-term studies (acute and repeated-exposure, sometimes called subacute, studies; typically <13 wk) using rodents require 5 animals per sex per group. This number of animals tends to be the minimum needed to evaluate group differences by typical statistical methods. Studies using larger animals, such as rabbits, dogs, or primates, generally require fewer animals (e.g., 12 rabbits for segment II studies, and 5 to 7 dogs for short-term studies). Thirteen-week studies (subchronic) using rodents typically require 10 animals per sex per group, whereas oncogenicity studies need at least 50 animals per sex per group. If this number is inadequate to assess all the end points desired, add more animals to the group. It would be a waste of animals to use too few animals on additional end points and not retain sufficient numbers to conduct appropriate comparisons among groups to distinguish biologically significant differences in the response or effect. Remember that exposing a few extra animals has relatively little impact on the cost of the study unless it requires bigger chambers. It takes the same effort to expose 20 animals in a chamber as it does to expose one animal.
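The rodent group sizes quoted above lend themselves to a quick tally when planning chamber capacity. This is a rough sketch only: the study-type labels, the function name, and the idea of counting the control group as one of the groups are assumptions made for illustration, while the per-sex minimums are the figures given in the text.

```python
# Minimum rodents per sex per group, as quoted in the text.
MIN_PER_SEX_PER_GROUP = {
    "short-term": 5,     # acute and repeated-exposure studies, typically <13 wk
    "subchronic": 10,    # 13-wk studies
    "oncogenicity": 50,  # stated as "at least 50"
}


def total_rodents(study_type, n_groups, extra_per_sex_per_group=0):
    """Total animals for a study with both sexes in every group.

    n_groups is assumed to include the control group; extra animals can be
    added per sex per group to cover additional end points.
    """
    per_sex = MIN_PER_SEX_PER_GROUP[study_type] + extra_per_sex_per_group
    return per_sex * 2 * n_groups


# Hypothetical 13-wk study: control plus three exposure concentrations.
print(total_rodents("subchronic", n_groups=4))
# Same design with 5 extra animals per sex per group for added end points.
print(total_rodents("subchronic", n_groups=4, extra_per_sex_per_group=5))
```

Dividing the total by the number of groups gives the number of animals per chamber, which can then be compared with the chamber capacities in Table 1.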
VI. CHARACTERIZING THE EFFECTS OF EXPOSURE
A. Systemic Toxicity
Once the purpose of the study is defined, select the biological end points to evaluate that will help answer the questions and concerns about the test substance. For regulated studies, most biological end points are specified by the guideline that is most appropriate. Systemic end points, such as body weight, feed consumption, clinical pathology (clinical chemistry and hematology), and anatomical pathology are commonly included and, unless there are good reasons to exclude them, should be included in most study designs because they provide information about whether systemic toxicity is associated with other end points. One important caveat is to make sure that the histopathology is appropriate for the respiratory tract (i.e., include the nasal cavity, larynx, trachea, and lungs). The lungs
and nasal cavity should be sectioned to provide the best possible evaluation of the tissue. For the lungs, the planes of the cuts should be along the major bronchi for the best possible evaluation (see [7] for review).
B. Pulmonary Toxicity
Other end points specifically designed to evaluate pulmonary toxicity can be added because the lungs may not be a simple portal of entry. Traditional methods for evaluating pulmonary toxicity include weighing the lungs prior to preservation, and routine histopathology. Although weighing the lungs appears to be straightforward, there are occasional difficulties dissecting away extraneous tissue (the esophagus, for example), which leave the lungs or trachea perforated and make inflation with formalin fixative more difficult. Usually, experienced laboratories have learned to overcome such difficulties without sacrificing accuracy in the organ weight. Preserving the lungs for histological evaluation may also present some difficulties. The generally accepted method to fix the lungs is to inflate them with fixative. How much fixative to add may vary from laboratory to laboratory, but 1.5 to 2 times the collapsed size is not uncommon. Inflating the lungs with too much fixative may damage the architecture of the alveoli and make the pathologist’s evaluation difficult. Conversely, insufficient fixative may be equally harmful. An additional pulmonary end point is bronchoalveolar lavage, a procedure that washes out free lung cells, proteins, and debris from the lungs (see [8] for review). The washings are analyzed for total and differential cell counts, total protein and albumin concentrations, and the activities of lactate dehydrogenase, alkaline phosphatase, and N-acetylglucosaminidase, among others. Bronchoalveolar lavage has been used to evaluate the pulmonary toxicity of a variety of inhaled materials in animals and humans [9]. In most cases, increases in enzyme activity can be related directly to damage of specific cell types in the lung. Changes in free lung cell numbers can indicate inflammation.
The EPA has recently incorporated this procedure into an acute inhalation exposure guideline (OPPTS 870.1350), which could be used to help understand the effects of environmental air pollutants. The end points that are recommended in the guideline are minimal, so adding others will provide for a better understanding of the pulmonary response to exposure.
C. Pulmonary Sensitization
Another end point that is useful is the measurement of immunoglobulin E (IgE) levels in the blood as an indicator of pulmonary sensitization. Pulmonary sensitization can be a serious workplace issue, and a method for evaluating the potential
for pulmonary sensitization may help complete a workplace safety study [10]. Several methodologies have been developed to evaluate pulmonary sensitization [11-13], but care should be used in adapting any method to the test substance in question since the methods may be most appropriate only to specific classes of chemicals. However, IgE measurement is by no means the only method that has been used for detection of pulmonary sensitization. Discuss the possibilities with the study director or potential study director (if still in the interviewing stage) and find a method for determining pulmonary sensitization that is reliable and predictive.
D. Respiratory/Pulmonary Irritation
Evaluating the irritation potential of a test substance is often accomplished using standard dermal irritation studies. However, these studies do not adequately predict how the respiratory tract responds when exposed to irritating chemicals. Alarie [14] described the physiological responses following exposure to irritating chemicals. These responses can be characterized by measuring changes in respiration [15], and such changes have been proposed as a means of categorizing the toxicity of chemicals [16]. While using changes in respiration as a tool may be valuable, it is not easy to incorporate measurement of respiration into the standard inhalation toxicity study. Such evaluations are better performed within a specific study designed for only this purpose.
E. Nonpulmonary End Points
There are cases in which the health concern is not related to the lung, and inhalation exposure simply reflects the route of human exposure. For example, inhalation exposure can be used as a route of administration for neurotoxicity, reproductive, and developmental toxicity studies. When designing these studies, one must take into account the logistics of exposure, and how the exposure system may influence the study design. For example, labor- or time-intensive neurotoxicity end points, such as functional observational battery, cannot be conducted on a large number of animals 6 hr after a 6-hr exposure without interfering with the light/dark cycle. Likewise, when conducting a reproductive or developmental toxicity study of an aerosolized material, the use of nose-only inhalation may introduce some potential artifact caused by the stress of restraining the animals. So, it is best to be creative in addressing the end points selected and to work within the logistics of exposure. It may also be necessary to deviate from timetables that are spelled out in the guidelines if they do not make sense for the exposure regimen that you have established.
VII. EXPOSURE SYSTEMS
A. Big Chambers or Nose-Only?
Whole-body chambers are the most common system used and can vary in size from 0.5 to 18 m³ or larger. The size of the chamber selected for the study depends on the number of animals per concentration group to be exposed. The type of system selected and finding a laboratory with that capability depend on the properties of the test substance. Whole-body chambers are most appropriate for vapors and gases. If the test substance is an aerosol or fiber, however, a nose-only system may be more appropriate than whole-body chambers because substantial amounts of material can be deposited on the fur of the animal in whole-body chambers. Test substance deposited on the fur can then be ingested through grooming. Although ingested test substance may not greatly influence the results of the study, there are cases in which the effect of the test substance has been substantially altered [5] depending on the route of administration. Some laboratories have devised ingenious methods to overcome the problem of ingestion, including wrapping the animal so that only the head is exposed or washing the pelt following exposure. It seems easier to use a nose-only exposure system. Are there other factors that influence the selection of the exposure system? Perhaps. One factor is the amount of test substance needed for nose-only vs. whole-body exposure systems. As discussed earlier, nose-only systems use far less material than whole-body systems. The use of less material may be important if testing a biochemical pesticide or a bioengineered pharmaceutical that is available only in small quantities. But there are no other important parameters that influence the selection of exposure systems. Using a nose-only system usually causes great anxiety for the sponsor because the first question to ask is whether the regulatory agency will accept the data.
It is true that most regulators are accustomed to inhalation studies using whole-body chambers, but the testing guidelines do provide for the use of nose-only when appropriate, and most regulators now accept the advantages of nose-only exposure. Using a nose-only exposure system does bring with it unique issues and problems that may need to be considered. The first issue is that animals have to be restrained during exposure. Restraints vary from immobilizing the neck to placing the animal into a tube. Any form of restraint brings with it stress, which can result in an increased respiratory rate that could lead to greater amounts of test substance being inhaled. Acclimation to the restraint can reduce the severity of the stress response and can virtually eliminate it. So, if nose-only exposure systems are selected, make sure that the animals are acclimated for several days prior to exposure, even for an acute study. Another factor to contend with is heat stress from tube restraints. Most tube restraints are made from plastic or polycarbonate and can retain the animal’s body heat. Although personal experience has demonstrated that the temperature
inside the tube increased only 1 to 3°C, this may be unacceptable for some types of studies. Body heat can easily be dissipated, though, by keeping the tail out of the tube or even keeping it in cool water.
VIII. THE REPORT AND DATA INTERPRETATION
Any report sums up the study design and results. In general, reports for inhalation studies contain more detailed explanations of materials and methods than do most other studies because the methods vary significantly from laboratory to laboratory and from test substance to test substance. Make sure the report contains adequate details on the characterization of the test substance, test atmosphere generation, chamber design and sampling ports, analytical methods, etc. Table 2 provides some ideas of what information should be contained in the report. There is probably no such thing as too much information. It may also be helpful to have some description of validation/calibration procedures, especially if the results are going to be viewed as controversial (positive or negative). There is a Standard Evaluation Procedure (SEP) for inhalation toxicity studies written by the EPA that may be very helpful in making sure that the report will meet standards [17]. If the EPA or other government agency will review the study, it is recommended to
Table 2 Important Information for Material and Methods Section of Report

Aspect of study    Information to include
Chamber            Size
                   Construction
                   Shape
                   Placement of inlet and outlet
                   Location of animals
                   Rotation scheme
Air supply         Type of filtration
                   Source of air supply
Generation         Type of generator used
                   Diagram of system
                   Physical parameters of generation: air flow, air pressure, feed flow or reservoir
Analytical         Instrumentation and detection limits
                   Frequency of calibration check
                   How calibration curve established
                   Grab sampling procedures (if applicable)
                   Type of monitoring device and location of sample line
read the SEP. In it are examples of what might make a study invalid in the minds of the regulators and are therefore issues to avoid or address. Interpreting the data with respect to effects on the respiratory tract can be complex because of the variables that need to be taken into account about where the test substance was deposited or absorbed. Recently, the International Programme on Chemical Safety (IPCS) developed a document containing the “Scientific Principles and Methods for Assessing Respiratory Tract Injury Caused by Inhaled Substances,” which provides an excellent overview for evaluating study data. If bronchoalveolar lavage is used, the effects of exposure on individual parameters have been well studied and reviewed [8,9].
REFERENCES

1. G.L. Kennedy, Jr. and R. Valentine, Inhalation toxicology, in Principles and Methods of Toxicology, 3rd Edition (A.W. Hayes, ed.), Raven Press, New York, 1994.
2. S.C. Gad and C.P. Chengelis, eds., Acute inhalation testing, in Acute Toxicology Testing Perspectives and Horizons, The Telford Press, Caldwell, NJ, 1989.
3. R.O. McClellan and R.F. Henderson, eds., Concepts in Inhalation Toxicology, Hemisphere Publishing Corp., New York, 1989.
4. A. Zwart, J.H.E. Arts, W.F. ten Berge, and L.M. Appelman, Alternative acute inhalation toxicity testing by determination of the concentration-time-mortality relationship: Experimental comparison with standard LC50 testing, Reg. Toxicol. Pharmacol. 15: 278-290 (1992).
5. R.W. Tyl, B. Ballantyne, L.C. Fisher, D.L. Fait, D.E. Dodd, D.R. Klonne, I.M. Pritts, and P.E. Losco, Evaluation of the developmental toxicity of ethylene glycol aerosol in CD-1 mice by nose-only exposure, Fundam. Appl. Toxicol. 27: 49-62 (1995).
6. M. Halpern and R.B. Schlesinger, Simple oral delivery device for inhalation exposure of rabbits to aerosols, J. Toxicol. Environ. Health 6: 751-755 (1980).
7. W.M. Haschek and H.P. Witschi, Respiratory system, in Handbook of Toxicologic Pathology (W.M. Haschek and C.G. Rousseaux, eds.), Academic Press, New York, 1991.
8. J.A. Bond, L.A. Wallace, S. Osterman-Golkar, G.W. Lucier, A. Buckpitt, and R.F. Henderson, Assessment of exposure to pulmonary toxicants: Use of biological markers, Fundam. Appl. Toxicol. 18: 161-174 (1992).
9. H.Y. Reynolds, Bronchoalveolar lavage, Am. Rev. Respir. Dis. 135: 250-263 (1987).
10. I. Kimber and M.F. Wilks, Chemical respiratory allergy. Toxicological and occupational health issues, Hum. Exp. Toxicol. 13: 735-736 (1995).
11. K. Sarlo and E.D. Clark, A tier approach for evaluating the respiratory allergenicity of low molecular weight chemicals, Fundam. Appl. Toxicol. 18: 107-114 (1992).
12. R.J. Dearman, L.M. Spence, and I. Kimber, Characterization of murine immune responses to allergenic diisocyanates, Toxicol. Appl. Pharmacol. 112: 190-197 (1992).
13. H.L. Ritz, B.L.B. Evans, R.D. Bruce, E.R. Fletcher, G.L. Fisher, and K. Sarlo, Respiratory and immunological responses of guinea pigs to enzyme-containing detergents: A comparison of intratracheal and inhalation models of exposure, Fundam. Appl. Toxicol. 21: 31-37 (1993).
14. Y. Alarie, Sensory irritation by airborne chemicals, Crit. Rev. Toxicol. 2: 299-363 (1973).
15. L.E. Kane, C.S. Barrow, and Y. Alarie, A short-term test to predict acceptable levels of exposure to airborne sensory irritant, Am. Ind. Hyg. Assoc. J. 40: 207-229 (1979).
16. M. Schaper, Development of a database for sensory irritation and its use in establishing occupational limits, Am. Ind. Hyg. Assoc. J. 54: 488-544 (1993).
17. S.B. Gross and F.J. Vocci, Standard evaluation procedure. Inhalation toxicology testing, EPA-540/09-101, 1988.
Genetic Toxicology

Donald L. Putman,* Ramadevi Gudi, Valentine O. Wagner III, Richard H. C. San, and David Jacobson-Kram
BioReliance Corporation, Rockville, Maryland

* In memoriam.

I. INTRODUCTION

Genetic toxicology, unlike other disciplines in toxicology, does not study a specific adverse health effect. Rather, potential genotoxic effects are evaluated since they are considered important prequelae to the development of adverse health effects such as cancer. Additionally, the induction of mutations in germinal cells can result in increased frequencies of genetic diseases or even the introduction of new genetic diseases into the human gene pool. It has also been suggested, although perhaps not proven, that somatic cell mutation is important in the initiation of atherosclerotic plaques and may be the basis for the aging process.

A number of short-term test systems are available for assessment of genetic hazard. These systems are often categorized by the end points that they measure: gene mutation, chromosome damage, or deoxyribonucleic acid (DNA) damage. It is the close association of these well-characterized and easily quantified end points with known mechanisms of oncogene activation or loss of tumor suppressor gene function that places such importance on genotoxicity testing. Thus, short-term genetic toxicology tests may be used to identify chemicals that require further testing in long-term animal systems as well as provide support for the evaluation and interpretation of carcinogenicity findings from these animal systems.

While there is little question that genotoxicity testing should be part of a safety evaluation of all new chemical entities, the appropriate assay systems to be used, as well as the protocol design itself, have often been determined by national regulatory guidelines. Considerable progress has been made in attempts to standardize protocol requirements for genotoxicity testing, particularly through the efforts of the International Conference on Harmonization (ICH) and the Organization for Economic Cooperation and Development (OECD).
A. Pharmaceuticals

The ICH process has resulted in the promulgation of guidelines for a "Standard Battery for Genotoxicity Testing of Pharmaceuticals (ICH Harmonised Tripartite Guideline S2B)" [1] and "Guidance on Specific Aspects of Regulatory Genotoxicity Tests for Pharmaceuticals (ICH Harmonised Tripartite Guideline S2A)" [2]. These guidelines have standardized genotoxicity testing on pharmaceuticals in the United States, Europe, and Japan. The recommended standard battery is (1) a test for gene mutation in bacteria, (2) an in vitro test with cytogenetic evaluation of chromosomal damage with mammalian cells or an in vitro mouse lymphoma tk assay, and (3) an in vivo test for chromosomal damage using rodent hematopoietic cells. If all results in the standard battery are negative, it will usually provide a sufficient level of safety to demonstrate the absence of genotoxic activity. However, if one or more positive responses are seen, test materials may have to be evaluated more extensively.

Under certain circumstances, the standard battery may have to be altered. For example, the results of the bacterial assay may not be informative. This could happen when testing materials that are excessively toxic to bacteria, such as certain antibiotics, or materials that are specifically designed to be active in mammalian cells, such as topoisomerase inhibitors or nucleoside analogues. Under these circumstances the guidelines suggest performing the two in vitro mammalian tests (cytogenetics and mouse lymphoma) in addition to the bacterial mutation assay. A second circumstance in which the standard battery may be modified is for compounds that bear structural alerts for genotoxic activity but give negative results in the three standard assays. In such a situation, additional testing may be required. This testing should take into account the nature of the material, its known reactivity, and information on how it is metabolized.

The guidelines also note that the standard battery may have to be modified based on limitations in the use of in vivo tests. For example, pharmacokinetic data may suggest that a compound is not systemically absorbed and therefore not available to the target tissues, most often the bone marrow. Cited examples of such materials include radioimaging agents, aluminum-based antacids, and some dermally applied pharmaceuticals. If an in vivo route of exposure cannot be found that will provide target cell exposure, both in vitro mammalian cell tests should be performed.

The guidelines also indicate that additional genetic toxicology testing may be required if a material is negative in the three-test battery but gives positive responses in a carcinogenicity bioassay. Examples of additional tests include use of modified conditions for metabolic activation in in vitro tests and in vivo tests that measure DNA damage in tumor target organs. These can include unscheduled DNA synthesis, radioactive phosphorus (32P)-postlabeling, mutation induction in transgenes, or molecular characterization of genetic changes in tumor-related genes. Finally, the guidelines indicate that additional genetic toxicology testing may be required for compounds with unique chemical structures that have not been tested in carcinogenicity bioassays.

For products of recombinant DNA technology such as vaccines, monoclonal antibodies, blood-derived products, hormones, cytokines, and other regulatory factors, the U.S. Food and Drug Administration (FDA) Center for Biologics Evaluation and Research reviews the requirement for genotoxicity testing on the basis of product application.
B. Medical Devices

In July 1994, the FDA Center for Devices and Radiological Health (CDRH) announced that it would abandon the Tripartite Guidance Document of 1986 in favor of the 1992 International Organization for Standardization (ISO) guidelines for mutagenicity testing of device materials [3]. According to the ISO guideline, a three-test battery is required. The battery must include one test for gene mutation, one test for chromosomal aberrations, and one test for DNA effects; two of the three tests should use mammalian cells as the target. All tests must be conducted according to current OECD testing guidelines. Most recently, CDRH has been requesting the ICH mutagenicity battery in place of the ISO battery. At the present time it is not completely clear which guideline device manufacturers should follow.
C. Industrial Chemicals and Pesticides

1. United States

U.S. Environmental Protection Agency (EPA) authority for the regulation of chemicals requiring mutagenicity testing is covered under the Toxic Substances Control Act (TSCA) or the Federal Insecticide, Fungicide and Rodenticide Act (FIFRA). The TSCA regulates both new chemicals (Section 5) and those that were already in commerce at the time the act was passed (Section 4). While Section 5 does not require toxicology testing, the Office of Pollution Prevention and Toxics (OPPT) has established testing criteria based on human exposure or release into the environment [4]. For certain high-volume chemicals that reach the trigger for occupational or consumer exposure, the requirement includes two genetic toxicology assays, the Salmonella or Escherichia coli gene mutation assay and an in vivo micronucleus test.

Toxicology testing of existing chemicals is determined through test rules or consent orders. Because Section 4 chemicals have wide distribution in the environment or widespread human exposure, the mutagenicity battery includes, in addition to the bacterial mutation assay and in vivo micronucleus test, an in vitro gene mutation assay, preferably the mouse lymphoma mutation assay. Submitters have the option of substituting a CHO/hgprt gene mutation assay and an in vitro chromosomal aberration assay in place of a mouse lymphoma test. A positive response in either gene mutation assay triggers a study for interaction with gonadal DNA, which may include end points such as sister chromatid exchange (SCE), alkaline elution, or unscheduled DNA synthesis (UDS). Positive evidence of interaction with gonadal DNA triggers a specific locus test. A positive response in the in vivo bone marrow micronucleus test triggers a dominant lethal assay, and a positive dominant lethal triggers a heritable translocation assay. Chronic studies for carcinogenicity are triggered by a positive response in all three base-set assays, or a positive response in either in vitro mutation assay plus the micronucleus assay. Single positive responses or positive responses in the two in vitro assays result in a data review.

The Office of Pesticide Programs (OPP) revised its mutagenicity testing scheme for agricultural chemicals in 1991 [5]. Its three-test battery includes a bacterial mutagenicity assay, a mouse lymphoma gene mutation assay, and an in vivo bone marrow cytogenetics assay, either bone marrow metaphase analysis or micronucleus test. The CHO or V79/HGPRT gene mutation assay and an in vitro chromosomal aberration assay may be substituted for the mouse lymphoma assay.
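The Section 4 trigger scheme described above can be read as a small set of rules. The following is only a sketch of that reading, not an EPA procedure: the function name, argument names, and the returned labels are hypothetical, and the bacterial assay is treated as one of the two "in vitro mutation assays" mentioned in the text.

```python
def section4_followups(bacterial, mammalian, micronucleus,
                       gonadal_dna=False, dominant_lethal=False):
    """List follow-up actions for a set of positive (True) / negative (False) results.

    Hypothetical helper illustrating the trigger scheme in the text.
    bacterial, mammalian: the two in vitro gene mutation assays.
    micronucleus: the in vivo bone marrow micronucleus test.
    gonadal_dna, dominant_lethal: results of second-tier studies, if run.
    """
    actions = []
    in_vitro_positive = bacterial or mammalian

    # A positive in either gene mutation assay triggers a gonadal DNA study,
    # and a positive there triggers a specific locus test.
    if in_vitro_positive:
        actions.append("gonadal DNA interaction study")
        if gonadal_dna:
            actions.append("specific locus test")

    # A positive micronucleus test triggers a dominant lethal assay,
    # and a positive there triggers a heritable translocation assay.
    if micronucleus:
        actions.append("dominant lethal assay")
        if dominant_lethal:
            actions.append("heritable translocation assay")

    # In vitro positive plus micronucleus positive (which also covers
    # "all three positive") triggers chronic testing; any other positive
    # pattern leads to a data review.
    if micronucleus and in_vitro_positive:
        actions.append("chronic carcinogenicity study")
    elif in_vitro_positive or micronucleus:
        actions.append("data review")
    return actions


print(section4_followups(bacterial=True, mammalian=False, micronucleus=True))
```

Writing the rules this way makes it easy to see that the "all three positive" case is already covered by the "in vitro positive plus micronucleus positive" case.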
2. Europe

The Seventh Amendment to directive 79/831/EEC stipulates that a bacterial gene mutation assay using both Salmonella and E. coli and an in vitro cytogenetics assay are required for premanufacturing or preimport notification of new chemicals to be manufactured in quantities between 1 and 10 metric tons per year. Only the bacterial mutation assay is required for chemicals manufactured in quantities between 100 kg and 1 metric ton. For chemicals with more than 10 metric tons per year in commerce or where 50 metric tons in aggregate are in commerce, a mammalian gene mutation assay and in vivo cytogenetics assay are added to the base set.
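The tonnage tiers above can be sketched as a simple lookup. This is an illustrative reading of the paragraph only; the function name is hypothetical, and the handling of exact boundary values (1 and 10 metric tons) and of quantities below 100 kg, which the text does not address, is an assumption.

```python
def eu_required_assays(tonnes_per_year, aggregate_tonnes=0.0):
    """Return the mutagenicity assays implied by the tonnage tiers in the text.

    Hypothetical helper. Quantities are in metric tons (0.1 = 100 kg).
    Boundary handling is an assumption, not taken from the directive.
    """
    higher_tier = tonnes_per_year > 10 or aggregate_tonnes >= 50
    if tonnes_per_year < 0.1 and not higher_tier:
        return []  # below 100 kg: not covered by the text

    assays = ["bacterial gene mutation assay"]
    if tonnes_per_year >= 1 or higher_tier:
        assays.append("in vitro cytogenetics assay")
    if higher_tier:
        assays.append("mammalian gene mutation assay")
        assays.append("in vivo cytogenetics assay")
    return assays


for quantity in (0.5, 5, 20):
    print(quantity, eu_required_assays(quantity))
```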
3. Japan

Two mutagenicity studies are required for new chemicals and include the bacterial gene mutation assay and chromosomal aberration assay in cultured mammalian cells. The micronucleus test is also recommended as an additional screening test. The guidelines for screening new chemical substances were published in 1986 by the Ministry of Health and Welfare (MHW), the Ministry of International Trade and Industry (MITI), and the Agency of Environment.

The Ministry of Labor also regulates registration of new chemical substances, marketing of chemicals, raw materials, intermediates, by-products, and waste generated in the workplace. Only a single mutagenicity test, the bacterial gene mutation assay using S. typhimurium and E. coli, is required for these materials. A positive bacterial mutation assay requires performance of an in vitro chromosomal aberration assay.

The Ministry of Agriculture, Forestry and Fisheries (MAFF) requires a three-test battery including a bacterial gene mutation assay using S. typhimurium and E. coli, an in vitro chromosomal aberration assay, and a primary DNA damage assay as measured by a bacterial repair test, the Rec-assay [6].
II. BACTERIAL MUTATION ASSAYS

A. Introduction

In the early 1970s, several researchers affirmed the use of bacterial mutation assays as a simple and rapid means of detecting mutagens and carcinogens [7-9]. Mutations can be detected by either forward or reverse mechanisms. Forward mutation systems detect mutation as a change in the normal phenotype (appearance) of an organism, whereas reverse mutation systems detect mutation as a reversion from a mutant phenotype to a normal phenotype. Forward mutation systems theoretically possess a larger genetic target: alteration of any one of a number of nucleotides can disable the expression or function of a gene. Reverse mutation assays generally focus on one or a few specific sites, which when altered, restore function to a defective gene. However, the advantage of the forward mutation assays is more theoretical than practical. Although other microbial mutation assays have been developed, the two most popular and best-validated systems employ, either individually or in combination, S. typhimurium or E. coli.

In their early development, bacterial mutation assays were reported to detect as mutagens 85 to 90% of the carcinogens tested [10,11]. More recent validation studies, using the U.S. National Toxicology Program database of 264 chemicals, have found that the S. typhimurium assay has a sensitivity to rodent carcinogens of 58% and a specificity for noncarcinogens of 73% [12].

Bacteria used in mutation assays are primarily sensitive to three types of mutational events: base-pair substitution mutations, frame-shift mutations, and DNA cross-linking [13]. A frame-shift mutation alters the reading frame of the DNA by insertion or deletion of one or more bases. Base-pair substitution mutations retain the correct reading frame but one of the base pairs undergoes a substitution such that the resulting DNA molecule contains an incorrect base. Deoxyribonucleic acid cross-linking agents covalently bind the two strands of the DNA double helix.

Because of the importance of maintaining an organism's genetic integrity, nature has invested much energy in minimizing the chance for mutations to occur. All organisms, even the simplest bacteria, have elaborate DNA repair systems. Furthermore, mutations are extremely rare events. For these reasons, mutations are difficult to detect without the use of specialized bacteria that are hypersensitive to specific types of mutations.
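The sensitivity and specificity figures quoted above follow the usual definitions: sensitivity is the fraction of carcinogens that test positive, and specificity is the fraction of noncarcinogens that test negative. The short sketch below shows the arithmetic. The counts are invented for illustration and are not the National Toxicology Program data; only the resulting percentages are chosen to match the text.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of carcinogens that the assay calls mutagenic."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of noncarcinogens that the assay calls nonmutagenic."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 100 carcinogens and 100 noncarcinogens.
print(f"sensitivity = {sensitivity(58, 42):.0%}")
print(f"specificity = {specificity(73, 27):.0%}")
```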
1. Purpose of Study

The purpose of these studies is to evaluate the mutagenic potential of a test article and its metabolites by measuring and quantifying its ability to induce reverse mutations at selected loci of S. typhimurium and E. coli in the presence and absence of metabolic activation. This test system has been shown to detect a diverse group of chemical mutagens [10,11].
2. Cell Selection and Justification

The S. typhimurium and E. coli strains used in this assay each have a defect in one of the genes involved in histidine and tryptophan biosynthesis, respectively. The defect renders the cell dependent (auxotrophic) on exogenous histidine or tryptophan. Unless the cell experiences a mutation that reverts the dysfunctional gene back to the wild type (prototrophic), the cell becomes disabled when the exogenous histidine or tryptophan is exhausted. For this reason, this assay is referred to as a reverse or back mutation assay.

To increase their sensitivity to mutagens, several additional mutations have been incorporated into the strains used in the standard battery. Mutations in the uvrA gene of E. coli and in the uvrB gene of S. typhimurium are partial deletions of each respective gene [14]. The uvr genes code for a series of DNA excision repair enzymes involved in removal of T-T dimers induced by ultraviolet (UV) light. Cells with this type of mutation are unable to repair damage induced by UV light and other types of mutagens. The presence of either of these mutations can be detected by demonstrating sensitivity to UV light. The pKM101 plasmid codes for an error-prone DNA repair system [15]. Since cells carrying this plasmid cannot correctly repair DNA damage, they have increased sensitivity to mutagens. Cells containing this plasmid are resistant to ampicillin. The rfa wall mutation prevents the S. typhimurium cells from synthesizing an intact polysaccharide cell wall [14]. Therefore, large molecules such as benzo[a]pyrene (BaP) that are normally excluded are able to penetrate the cell. Cells containing this mutation are sensitive to crystal violet. The genotypes of some of the more commonly used tester strains are summarized in Table 1.

Table 1 Tester Strain Genotypes
Column headings: Histidine mutation (hisG428, hisC3076, hisD3052, hisD6610, hisG46; pAQ1); Tryptophan mutation (trpE); Additional mutations (LPS, Repair, R factor). Strains are grouped as frame-shift detectors and base-pair detectors.

Since a single strain of bacterium is capable of detecting only a specific type of genetic damage, several strains must be used to effectively screen for mutagenic potential. The S. typhimurium strains detect reversion from his- to his+ at a single site in one of the 12 steps of histidine biosynthesis. Strains TA98, TA1537, TA1538, and TA97 are reverted from histidine auxotrophy to histidine prototrophy by frame-shift mutagens, whereas TA100 is reverted by both frame-shift and base substitution mutagens, and TA1535 is reverted only by mutagens that cause base substitutions [16]. Strain TA102 possesses A-T base pairs at the site of the mutation, unlike other S. typhimurium tester strains that possess G-C base pairs at the mutation sites. In addition, TA102 has been shown to be useful for detecting oxidative mutagens such as bleomycin that are not detected by other tester strains. The uvrB mutation has not been introduced into this strain; therefore, with an intact excision repair system, it can detect cross-linking agents such as mitomycin C [17].

The E. coli strains detect reversion from trp- to trp+ at a site blocking tryptophan biosynthesis prior to the formation of anthranilic acid. These strains have A-T base pairs at the critical mutation site within the trpE gene [18]. Tryptophan revertants can arise due to a base change at the originally mutated site or elsewhere in the chromosome causing the original mutation to be suppressed. Thus, the reversion mechanism is sensitive to base-pair substitution mutations rather than frame-shift mutations [19].

To adequately assess the mutagenic potential of a chemical, it is important to select an appropriate battery of tester strains. Although the selection is often guided by the world's regulatory bodies, the nature of the chemical should always be considered prior to making the final selection. If the specific properties of a chemical make one strain particularly sensitive, that strain should most likely be included in the battery. The batteries used by the world's regulatory bodies have been selected to provide sensitivity for a diverse group of chemicals. The strain battery that was approved under the 1997 harmonization efforts of the OECD and the ICH allows the selection of one strain from each of five categories, as follows: (1) TA98; (2) TA100; (3) TA1535; (4) TA1537, TA97, or TA97a; (5) TA102, WP2 uvrA, or WP2 uvrA (pKM101). To detect cross-linking agents it is preferable to select TA102 or to add WP2 (pKM101) [2,20,23].
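The five-category rule lends itself to a quick consistency check when a study battery is being drawn up. The sketch below is illustrative only; the constant and function names are hypothetical, and the strain labels are plain strings taken from the list above.

```python
# One set per category, in the order given in the text.
STRAIN_CATEGORIES = [
    {"TA98"},
    {"TA100"},
    {"TA1535"},
    {"TA1537", "TA97", "TA97a"},
    {"TA102", "WP2 uvrA", "WP2 uvrA (pKM101)"},
]


def missing_categories(battery):
    """Return the category numbers (1-5) not covered by the chosen strains."""
    chosen = set(battery)
    return [number
            for number, strains in enumerate(STRAIN_CATEGORIES, start=1)
            if not (strains & chosen)]


battery = ["TA98", "TA100", "TA1535", "TA1537", "WP2 uvrA"]
print(missing_categories(battery))           # complete battery
print(missing_categories(["TA98", "TA100"]))  # incomplete battery
```

A check like this confirms coverage of the categories only; it does not replace the judgment about the nature of the chemical that the text calls for.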
3. Maintenance of Cells

To ensure that the cells are well characterized and traceable, it is best to obtain cultures from a recognized supplier. Most S. typhimurium strains may be obtained from Dr. Bruce Ames, University of California, Berkeley, CA. The E. coli strains may be obtained from the National Collection of Industrial and Marine Bacteria, Aberdeen, Scotland. The American Type Culture Collection (Manassas, VA) also supplies some of the strains. Shortly after receipt of new cultures, both frozen permanent stocks and working stocks should be prepared [16].

Master plates for S. typhimurium tester strains are prepared by inoculating onto minimal media supplemented with histidine and biotin. For those strains that contain the pKM101 or pAQ1 plasmid, ampicillin or tetracycline is added as appropriate [16]. Master plates for E. coli tester strains are prepared by inoculating onto Vogel-Bonner minimal medium E supplemented with 2.5% Oxoid Nutrient Broth No. 2. Master plates are incubated at 37°C for approximately 24 to 48 hr and are stored at 4°C. Master plates can serve as working stocks for up to 6 wk.

On occasion, one can experience the inherent problem of picking a revertant from the master plate for the overnight culture. When this occurs, the entire overnight culture contains revertant his+ or trp+ cells, resulting in confluent bacterial growth on the plates. By gently placing a replicate plating pad onto the surface of the master plate, a small quantity of each colony can be transferred to a biotin-only plate. After incubation, the revertant colonies are clearly identifiable as isolated areas of bacterial growth on the biotin-only plate. When selecting colonies for inoculating the overnight culture, these his+ or trp+ colonies can be avoided. This dramatically reduces the chances of selecting a revertant colony for inoculation of the overnight cultures [21].

Overnight cultures are inoculated from the appropriate master plate or from the appropriate frozen stock. To ensure that cultures are harvested in late log phase, the length of incubation should be controlled and monitored. The shaker/incubator is programmed to begin shaking at approximately 125 rpm at 37°C, approximately 12 hr before the anticipated time of harvest. All cultures should be harvested by spectrophotometric monitoring of culture turbidity rather than by duration of incubation. Cultures should be removed from incubation at a density of approximately 10^9 cells/ml. Since undergrowth or overgrowth of the cultures can cause loss of sensitivity, it is important to monitor the cultures to ensure a sufficient population of viable cells is used. The sensitivity of the bacterial mutation assays is enhanced by the large population of individual cells at risk, approximately 10 to 100 million cells per plate. Wagner and San [22] have reported that cultures with titer values between 10^8 and 10^9 yield comparable sensitivity.
B. Experimental Design

Bacterial mutation assays are commonly performed in two phases. The first phase, the preliminary toxicity assay or dose range-finding assay, is used to establish the dose range for the definitive assay. The second phase, the mutagenicity assay, which may include both an initial assay and a confirmatory assay, is used to evaluate the mutagenic potential of the test article. If a large number of similar chemicals are being screened, it is often possible to eliminate the preliminary toxicity assay once an appropriate dose range is established. In Japan, it is customary to conduct the preliminary experiment with all strains using duplicate plates per dose rather than just one or two representative strains and single plates per dose.
1. Dose Selection

In the preliminary toxicity assay, the maximum dose level tested should be 5 mg per plate, solubility or homogeneity permitting. If the test article cannot be dissolved at a sufficient concentration in a solvent compatible with the test system or if the test article cannot be suspended adequately in a vehicle compatible with the test system, the maximum dose tested should be the highest achievable dose up to 5 mg per plate. A suspension is not ideal because of the difficulty in preparing accurate dosing stocks and in delivering them to the test system. However, unlike mammalian systems, where the test article must be removed following a finite exposure period, suspensions will not otherwise interfere with the conduct of the assay. A sufficient number of lower dose levels should be tested in the preliminary toxicity assay to determine the appropriate dose range for the definitive assay.

In the mutagenicity assay, the maximum dose level for each strain and metabolic activation combination should be either 5 mg per plate or should be selected to demonstrate toxicity or mutagenicity. The criteria for assessing toxicity, precipitate, and mutagenicity are discussed in following sections. The selection criteria for the maximum dose level in the mutagenicity assay approved by the OECD and ICH are as follows: (1) for a nontoxic, nonmutagenic, and nonprecipitating test article, the maximum dose level should be 5 mg per plate; (2) for a toxic or mutagenic test article, the maximum dose level should demonstrate toxicity or mutagenicity irrespective of the precipitation profile; (3) for a precipitating, nontoxic, and nonmutagenic test article, the maximum dose level should be the lowest precipitating dose level [2,20].

While these criteria will meet regulatory requirements, other criteria may be applied based on the objective of the assay. It has been demonstrated that low-level mutagenic components may not be detected until the test article is tested at a sufficiently high concentration. This may necessitate testing at precipitating dose levels. If the testing is being conducted early in the development process, this may help identify potential low-level mutagenic contaminants in different preparations or analogues of the test material.
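The three OECD/ICH criteria for the maximum dose level can be expressed as a short decision function. This is a sketch under stated assumptions rather than a regulatory algorithm: the function and parameter names are hypothetical, doses are in mg per plate, and "a dose that demonstrates toxicity or mutagenicity" is simplified here to the lowest dose at which either effect was seen in the preliminary assay.

```python
from typing import Optional, Tuple

LIMIT_DOSE_MG_PER_PLATE = 5.0


def select_top_dose(lowest_effect: Optional[float] = None,
                    lowest_precipitating: Optional[float] = None
                    ) -> Tuple[float, str]:
    """Pick a maximum dose (mg/plate) and the criterion that was applied.

    lowest_effect: lowest dose showing toxicity or mutagenicity, or None.
    lowest_precipitating: lowest dose showing precipitate, or None.
    """
    # Criterion 2: toxicity or mutagenicity governs, whatever the precipitate.
    if lowest_effect is not None:
        return min(lowest_effect, LIMIT_DOSE_MG_PER_PLATE), "toxic or mutagenic dose"
    # Criterion 3: precipitating, nontoxic, nonmutagenic.
    if lowest_precipitating is not None:
        return min(lowest_precipitating, LIMIT_DOSE_MG_PER_PLATE), "lowest precipitating dose"
    # Criterion 1: none of the above, so the limit dose applies.
    return LIMIT_DOSE_MG_PER_PLATE, "limit dose"


print(select_top_dose())
print(select_top_dose(lowest_precipitating=1.5))
print(select_top_dose(lowest_effect=0.5, lowest_precipitating=0.1))
```

As the text notes, an early-development screen may deliberately go above the lowest precipitating dose, so a function like this reflects the regulatory default only.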
2. Number of Cultures

Since the preliminary toxicity assay is used primarily to determine the appropriate dose range for the definitive assay, it is usually conducted with only a single plate per dose level. The mutagenicity assay is usually conducted with three plates per dose level. This provides a reliable estimate of the response at each dose level without significantly influencing the mean by divergent plate counts.
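With triplicate plates, the response at each dose level is normally summarized as a mean and standard deviation of the revertant counts. A minimal sketch follows; the function name and the plate counts are made up for illustration.

```python
from statistics import mean, stdev


def summarize_plates(counts):
    """Return (mean, sample standard deviation) of revertant counts for one dose."""
    if len(counts) < 2:
        raise ValueError("at least two plates are needed for a standard deviation")
    return mean(counts), stdev(counts)


# Hypothetical revertant counts from three plates at one dose level.
average, spread = summarize_plates([21, 25, 23])
print(f"mean = {average}, SD = {spread:.1f}")
```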
3. Metabolic Activation System

Many mutagens and carcinogens are actually promutagens and procarcinogens that require activation or detoxification to their active forms, e.g., polycyclic aromatic hydrocarbons, aromatic amines, nitrosamines, and azo dyes. The major mammalian activation systems are multicomponent, membrane-bound, reduced nicotinamide adenine dinucleotide phosphate (NADPH)-requiring, molecular oxygen-requiring, cytochrome P450-dependent complexes of mixed-function oxygenases. Since S. typhimurium and E. coli, as well as many other in vitro test systems, lack these metabolic capabilities, an exogenous activation system must be provided. Ames and others developed an exogenous activation system by centrifuging homogenized mammalian liver at 9000 × g to recover the microsomal fraction commonly referred to as S9. To effectively detect a wide variety of promutagens and procarcinogens, it is necessary to stimulate production of these enzymes with an inducing agent, most commonly a polychlorinated biphenyl (PCB) [7]. Although other tissues, species, and inducing agents may be used, liver microsomes from rats induced with Aroclor 1254 have been found to be a convenient and efficient activating system [23]. Because of the carcinogenic properties of Aroclor 1254, the disposal problems associated with PCBs, and the difficulty in working with PCBs in some parts of the world, other inducing agents have been developed. Matsushima et al. [24] and Ong et al. [25] have reported that a combination of phenobarbital and β-naphthoflavone provides an alternative to PCB induction that is similar to Aroclor 1254.

Typically, S9 homogenate is prepared from male Sprague-Dawley rats induced with a single intraperitoneal injection of Aroclor 1254, at 500 mg/kg, 5 days prior to sacrifice. The S9 is prepared and stored frozen at approximately -70°C for up to 2 yr. To verify the activity, each batch of S9 homogenate must be assayed for its ability to metabolize 2-aminoanthracene and 7,12-dimethylbenzanthracene to forms mutagenic to S. typhimurium TA100 [26]. For the S9 homogenate to provide a NADPH-generating system, it must be combined with an appropriate mix of cofactors as follows. Immediately prior to use, the S9 is thawed and mixed with a cofactor pool to contain 10% S9 homogenate, 5 mM glucose-6-phosphate, 4 mM β-nicotinamide adenine dinucleotide phosphate (NADP), 8 mM MgCl2, and 33 mM KCl in a 100 mM phosphate buffer at pH 7.4. This mixture is referred to as S9 mix. Sham mix, containing 100 mM phosphate buffer at pH 7.4, is used in place of S9 mix for the nonactivated portion of the assay. For peak activity, some mutagens may require S9 mixes with between 4 and 30% S9 homogenate [27,28]. If specific metabolic requirements (e.g., azo compounds) are known about the test article, this information should be utilized in designing the assay. Furthermore, equivocal results should be clarified using an appropriate modification of the experimental design (e.g., dose levels, activation system, or treatment method) [2,20].
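The final concentrations given for S9 mix translate into pipetting volumes through the usual dilution relation (stock concentration × stock volume = final concentration × final volume). The sketch below works this through for an arbitrary batch size. The target concentrations come from the text; the stock concentrations, the dictionary names, and the function name are assumptions chosen for illustration and should be replaced by the stocks actually on hand.

```python
# Final concentrations in S9 mix, from the text (mM).
TARGET_MM = {
    "glucose-6-phosphate": 5.0,
    "NADP": 4.0,
    "MgCl2": 8.0,
    "KCl": 33.0,
    "phosphate buffer pH 7.4": 100.0,
}

# Hypothetical stock concentrations (mM); not taken from the text.
STOCK_MM = {
    "glucose-6-phosphate": 1000.0,
    "NADP": 100.0,
    "MgCl2": 400.0,
    "KCl": 1650.0,
    "phosphate buffer pH 7.4": 200.0,
}

S9_FRACTION = 0.10  # 10% (v/v) S9 homogenate, from the text


def s9_mix_volumes(final_ml):
    """Return the volume (ml) of each component for final_ml of S9 mix."""
    volumes = {"S9 homogenate": S9_FRACTION * final_ml}
    for name, target in TARGET_MM.items():
        # C_stock * V_stock = C_final * V_final
        volumes[name] = target / STOCK_MM[name] * final_ml
    water = final_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the target concentrations")
    volumes["water"] = water
    return volumes


for component, ml in s9_mix_volumes(10.0).items():
    print(f"{component:28s}{ml:6.2f} ml")
```

Changing `S9_FRACTION` to a value between 0.04 and 0.30 gives the volumes for the alternative S9 levels mentioned in the text, with the water volume adjusting to keep the total constant.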
4. Controls

To ascertain that the test system is functioning properly, appropriate controls must be included with each mutagenicity assay. To determine that the tester strains are functioning properly, a set of negative controls must be included for each tester strain-activation combination. If a routine vehicle (e.g., water, dimethyl sulfoxide, ethanol, or acetone) is being used, the negative control only needs to be the vehicle. If an unusual vehicle, with no historical database, is being used, the negative control should include both untreated and vehicle controls. This will allow one to determine any vehicle-induced effects on the test system. To determine that the tester strains are responding to known mutagens, positive controls that are direct-acting and those that require S9 activation must be included for each tester strain-activation combination. It is important to compile the control values and to monitor their performance. These historical values can be used to track performance of the test system over time and to provide a basis for determining outlier values for each experiment. Positive controls that may be used with each tester strain-activation combination are shown in Table 2. Sterility controls should also be included for the sham and S9 mixes and for the vehicle and test article dosing solutions by plating on selective agar with the same aliquot used in the assay.
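One simple way to use historical control values as a basis for spotting outliers is to compare each new control mean against a range built from the historical mean and standard deviation. The text does not prescribe a rule, so the mean ± 2 SD range used here is an assumption, as are the function names and the example values.

```python
from statistics import mean, stdev


def historical_range(values, k=2.0):
    """Return (low, high) as historical mean -/+ k sample standard deviations."""
    centre, spread = mean(values), stdev(values)
    return centre - k * spread, centre + k * spread


def is_outlier(value, values, k=2.0):
    """True if value falls outside the historical range."""
    low, high = historical_range(values, k)
    return not (low <= value <= high)


# Hypothetical historical vehicle-control means (revertants per plate).
history = [100, 110, 120, 130, 140]
print(historical_range(history))
print(is_outlier(125, history), is_outlier(200, history))
```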
5. Dose Administration
On the day of its use, minimal top agar, containing 0.8% agar and 0.5% NaCl, is melted and supplemented with L-histidine, D-biotin, and L-tryptophan. Supplemented top agar is commonly referred to as selective top agar. Bottom agar is Vogel-Bonner minimal medium E [29] containing 1.5% agar and 0.2% glucose. The low glucose content enhances the revertant colony formation of tester strain TA97 without affecting the other strains [30].
Table 2 Positive Controls

Strain                                 S9 activation   Positive control           Concentration (µg/plate)
TA98, TA100, TA1535, TA1537, TA1538    +               2-Aminoanthracene          1.0
TA97                                   +               2-Aminoanthracene          2.0
WP2 uvrA, WP2 uvrA (pKM101)            +               2-Aminoanthracene          10
TA102                                  +               Sterigmatocystin           10
WP2 (pKM101)                           +               Sterigmatocystin           100
TA98, TA1538                           -               2-Nitrofluorene            1.0
TA100, TA1535                          -               Sodium azide               1.0
TA1537, TA97                           -               9-Aminoacridine            75
TA102                                  -               Mitomycin C                1.0
E. coli                                -               Methyl methanesulfonate    1000
The test article can be administered to the test system using the plate incorporation method or the preincubation method. The plate incorporation method is the original method developed by Ames [7]. In this method, S9 mix or sham mix, tester strain, and test article dosing solutions are added to molten selective top agar. The components are mixed and overlaid onto the surface of minimal bottom agar plates. After the overlay has solidified, the plates are inverted and incubated for approximately 48 to 72 hr at 37°C. Although the plate incorporation method is sensitive to many mutagens, it is not sensitive to all mutagens (e.g., nitrosamines, divalent metals, aldehydes, azo dyes, pyrrolizidine alkaloids, allyl compounds, and nitro compounds) [31]. Yahagi et al. [32] developed the preincubation method to overcome this limitation. While the preincubation method is generally more sensitive than the plate incorporation method because the test article is preincubated with the test system, it is also generally more subject to toxic effects. In the preincubation method, 0.5 ml of S9 mix or sham mix is added to preheated (37°C) glass culture tubes. To these tubes are added tester strain and dosing solution. After mixing, it is allowed to incubate for 20 to 60 min at 37°C. Selective top agar is added to each tube and the mixture is overlaid onto the surface of minimal bottom agar plates and incubated as described above. Plates that are not counted immediately following the incubation period may be stored at 4°C.
6. End Point Evaluation and Time of Evaluation
Prior to scoring the revertant colonies, the condition of the bacterial background lawn is evaluated for evidence of test article toxicity by using a dissecting microscope. Toxicity and degree of precipitation are scored relative to the concurrent vehicle control plate using the codes shown in Table 3. It is essential that the background lawn be evaluated before scoring the revertants so that grown-up, nonrevertant background lawn colonies are not included in the revertant count. This occurs as the level of toxicity increases because histidine or tryptophan available for the surviving population permits the auxotrophic bacteria to form isolated microcolonies that can be misinterpreted as mutant colonies (Figure 1). True revertant colonies generally appear as isolated colonies about 1 to 2 mm in diameter. Immediately surrounding the revertant colony is a normal to moderately reduced microscopic background lawn. To confirm the genotype of any questionable colonies, each may be replicate plated onto histidine-free or tryptophan-free medium. Colonies that grow on the amino acid-free medium are true revertant colonies that should be included in the revertant count. All nonreplicating colonies are excluded from the count. For each plate, all true revertants are tallied and reported as revertants per plate. A dose level is considered toxic if it causes a >50% reduction in the mean number of revertants per plate relative to the mean vehicle control value (this
Table 3 Background Lawn Evaluation Criteria

Code  Description
1     Normal. Distinguished by a healthy microcolony lawn.
2     Slightly reduced. Distinguished by a noticeable thinning of the microcolony lawn and possibly a slight increase in the size of the microcolonies compared with the vehicle control plate.
3     Moderately reduced. Distinguished by a marked thinning of the microcolony lawn resulting in a pronounced increase in the size of the microcolonies compared with the vehicle control plate.
4     Severely reduced. Distinguished by an extreme thinning of the microcolony lawn resulting in an increase in the size of the microcolonies compared with the vehicle control plate such that the microcolony lawn is visible to the unaided eye as isolated colonies.
5     Absent. Distinguished by a complete lack of any microcolony lawn over ≥90% of the plate.
6     Obscured by particulate. The background bacterial lawn cannot be accurately evaluated due to microscopic test article particulate.
NP    Noninterfering precipitate. Distinguished by precipitate on the plate that is visible to the naked eye, but any precipitate particles detected by the automated colony counter total less than 10% of the revertant colony count (e.g., <3 particles on a plate with 30 revertants).
IP    Interfering precipitate. Distinguished by precipitate on the plate that is visible to the naked eye, and any precipitate particles detected by the automated colony counter exceed 10% of the revertant colony count (e.g., >3 particles on a plate with 30 revertants).
reduction must be accompanied by an abrupt dose-dependent drop in the revertant count) or a moderate reduction in the background lawn. In the event that fewer than three nontoxic dose levels are achieved, the affected portion of the assay should be repeated with an appropriate change in dose levels. Since mutagenic activity is inherently toxic, it may be necessary to adjust dose levels to investigate suspect responses.
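The toxicity criteria above can be sketched in code. This is only an illustration, not part of the published method: the function names, the boolean `abrupt_drop` input (the analyst's judgment about the dose response), and the reading of "moderate reduction" as Table 3 codes 3 to 5 are assumptions.

```python
from statistics import mean

# Assumption: "a moderate reduction in the background lawn" is read here as
# Table 3 codes 3 (moderately reduced), 4 (severely reduced), or 5 (absent).
REDUCED_LAWN_CODES = (3, 4, 5)

def is_toxic_dose(dose_counts, vehicle_counts, lawn_code, abrupt_drop):
    """Return True if a dose level meets either toxicity criterion.

    dose_counts, vehicle_counts: replicate revertant counts per plate.
    lawn_code: numeric background lawn code from Table 3.
    abrupt_drop: True if the analyst judges the reduction to be accompanied
        by an abrupt dose-dependent drop in the revertant count.
    """
    reduction = 1.0 - mean(dose_counts) / mean(vehicle_counts)
    by_count = reduction > 0.5 and abrupt_drop
    by_lawn = lawn_code in REDUCED_LAWN_CODES
    return by_count or by_lawn

def needs_repeat(toxic_flags):
    """Return True if fewer than three dose levels are nontoxic."""
    nontoxic = sum(1 for flag in toxic_flags if not flag)
    return nontoxic < 3
```

For example, a dose with a mean of 10 revertants against a vehicle mean of 30 is a 67% reduction, which counts as toxic only when the abrupt-drop condition is also met.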
C. Evaluation Criteria
1. Data Presentation
For each plate, the background lawn evaluation code and revertant count should be presented along with the mean and standard deviation of the replicate revertant counts for each dose.
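As a small illustration of the per-dose summary described above, the following sketch computes the mean and standard deviation of replicate counts. The function name is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which form is intended.

```python
from statistics import mean, stdev

def summarize_dose(counts):
    """Mean and sample standard deviation of replicate revertant counts.

    Assumption: the sample (n - 1) standard deviation is used; a single
    plate is reported with a standard deviation of 0.0.
    """
    sd = stdev(counts) if len(counts) > 1 else 0.0
    return mean(counts), sd
```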
2. Statistical Analysis
Although several statistical models have been developed, no single, recommended choice has been agreed upon for these assays. Weinstein and Lewinson [33], Bernstein et al. [34], and Stead et al. [35] assume the experimental variation follows a Poisson distribution. Margolin et al. [36] assume that the random variation follows a negative-binomial distribution. Meyers et al. [37] employ an iterative weighted least squares method. Unfortunately, dose-response relationships can vary widely in practice and all of these methods employ specific models and assumptions that are not always met. Snee and Irr [38] have shown that by using a power transformation, the statistical assumptions of normal distributions and homogeneous variance are more closely met. Therefore, analysis of variance, regression analysis, and Student's t-test can be properly used in evaluation of mutagenicity data. This method allows statistical evaluation of the data with regard to reproducibility and linear, quadratic, and higher-order responses. Regardless of the statistical model used, if any, sound scientific judgment and biological relevance should be used as the primary criteria, with statistical methods serving as an aid in drawing any conclusions.
3. Criteria for Valid Test
To ensure the validity of the test results, certain test system criteria must be met before the data can be properly evaluated. Each tester strain culture must demonstrate the appropriate genotype as listed previously in Table 1. Ideally, these strain markers should be determined daily for each culture; however, at a minimum, they should be demonstrated following preparation of each new master plate or frozen stock culture. Each culture must demonstrate the characteristic number of spontaneous revertants in the negative control. Each laboratory should establish historical ranges for the performance of the strains in their environment. To ensure that appropriate numbers of bacteria are plated, tester strain culture titers must be greater than or equal to 0.3 × 10^9 cells/ml. Positive controls must demonstrate the sensitivity of each strain to a known direct-acting mutagen and the capability of the activation system to convert a promutagen to a mutagen.
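Two of the quantitative validity checks above (culture titer and spontaneous revertant counts) lend themselves to a simple sketch. The function name and the form of the historical range are hypothetical; each laboratory sets its own range, and the genotype and positive control checks are not modeled here.

```python
MIN_TITER = 0.3e9  # cells/ml, the minimum titer stated in the text

def culture_is_valid(titer_cells_per_ml, vehicle_mean, historical_range):
    """Check titer and spontaneous revertants for one tester strain culture.

    historical_range: (low, high) mean revertants per plate for the negative
        control, as established by the laboratory (assumed inclusive bounds).
    """
    low, high = historical_range
    titer_ok = titer_cells_per_ml >= MIN_TITER
    spontaneous_ok = low <= vehicle_mean <= high
    return titer_ok and spontaneous_ok
```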
Figure 1 Bacterial background lawn evaluations. (A) A normal, healthy background lawn is shown surrounding a revertant colony to the left. The healthy lawn results from microcolonies that form as the bacteria divide. When the available histidine or tryptophan is exhausted these nonmutated bacteria stop dividing. (B) A slight reduction in the background lawn results when a small percentage of the population of bacteria is killed. The resulting survivors become slightly larger and less densely packed. As the toxicity continues to increase, a larger percentage of the bacterial population is killed, resulting in more available histidine or tryptophan for the surviving population. At this stage, the background lawn becomes moderately reduced (C), with a revertant colony visible to the left and the background lawn appearing as isolated, enlarged microcolonies. As the toxicity approaches near total lethality, a severely reduced background lawn (D) results when the microcolony lawn becomes visible to the unaided eye as isolated colonies.
4. Positive and Negative Test
The most appropriate criterion for assessment of a positive response is a biologically relevant, reproducible, statistically significant, dose-related increase in revertant colonies per plate. However, in the absence of a clear statistical model and despite the lack of supporting evidence, the following arbitrary guidelines have been commonly used.
For a test article to be evaluated positive, it must cause a dose-responsive increase in the mean revertants per plate of at least one tester strain either with or without S9 activation. Data sets are judged positive if the increase in mean revertants at the peak of the dose response is equal to or greater than two or three times the mean vehicle control value. Strains TA1535, TA1537, and TA1538 are judged positive if the peak response is equal to or greater than three times the vehicle control value. Strains TA98, TA100, TA97, TA102, WP2 uvrA, WP2 uvrA (pKM101), and WP2 (pKM101) are judged positive if the peak response is equal to or greater than two times the vehicle control value. As per the OECD and ICH guidelines, verification of a clear positive response is not required and negative results do not need to be retested when justification can be provided. However, equivocal results should be retested using an appropriate modification of the experimental design (e.g., dose levels, activation system, or treatment method) [2,20].
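The two-fold and three-fold guidelines above can be expressed as a short sketch. The names below are hypothetical, and the code checks only the fold increase at the peak of the dose response; judging whether the increase is dose-responsive and biologically relevant is left to the analyst.

```python
from statistics import mean

# Fold-increase thresholds over the mean vehicle control, as listed in the text.
THREEFOLD_STRAINS = {"TA1535", "TA1537", "TA1538"}
TWOFOLD_STRAINS = {"TA98", "TA100", "TA97", "TA102",
                   "WP2 uvrA", "WP2 uvrA (pKM101)", "WP2 (pKM101)"}

def fold_threshold(strain):
    """Return the fold increase required for the given tester strain."""
    if strain in THREEFOLD_STRAINS:
        return 3.0
    if strain in TWOFOLD_STRAINS:
        return 2.0
    raise ValueError(f"no threshold listed for strain {strain!r}")

def meets_fold_rule(strain, vehicle_counts, dose_counts_by_level):
    """True if the peak mean revertant count reaches the strain's threshold.

    dose_counts_by_level: one list of replicate counts per dose level.
    """
    vehicle_mean = mean(vehicle_counts)
    peak_mean = max(mean(counts) for counts in dose_counts_by_level)
    return peak_mean >= fold_threshold(strain) * vehicle_mean
```

With a TA100 vehicle mean of 100 revertants per plate, a peak mean of 205 satisfies the two-fold guideline, whereas a TA1535 peak of 25 against a vehicle mean of 10 does not reach the three-fold guideline.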
D. Special Considerations
Bacterial mutagenicity assays may be readily modified to accommodate the testing of unusual test articles. These test articles are typically those that are poorly detected by the standard assay or those that cannot be directly delivered to the test system. Several of the more common modifications are briefly discussed below.
1. Medical Devices
These materials present a unique challenge because of the difficulty in delivering the test article to the test system. This problem is overcome by extracting the test article in an aqueous medium (e.g., saline) and in an organic medium (e.g., dimethyl sulfoxide or ethanol) [39]. The materials are typically extracted at 37 or 50°C for 3 to 5 days. The extracts are generally prepared at a ratio of 120 cm2 test article surface area in 20 ml of extraction medium for materials <0.5 mm thick. For test articles ≥0.5 mm thick, the extraction ratio is usually 60 cm2 of test article surface area in 20 ml of extraction medium. For test articles in which the surface area cannot be readily determined, the extraction ratio is usually 4 g of test article per 20 ml of extraction medium. The extracts are subjected to the normal testing scheme.
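The extraction ratios above scale linearly with the amount of test article, which the following sketch illustrates. The function and parameter names are hypothetical, and the ratios are the typical values quoted in the text rather than fixed requirements.

```python
def extraction_volume_ml(thickness_mm=None, surface_area_cm2=None, mass_g=None):
    """Volume of extraction medium (ml) for a sample, using the quoted ratios.

    Surface area is used when available (120 cm2 per 20 ml for materials
    <0.5 mm thick, 60 cm2 per 20 ml otherwise); mass (4 g per 20 ml) is the
    fallback when the surface area cannot be readily determined.
    """
    if surface_area_cm2 is not None:
        if thickness_mm is None:
            raise ValueError("thickness is needed when surface area is used")
        area_per_20_ml = 120.0 if thickness_mm < 0.5 else 60.0
        return 20.0 * surface_area_cm2 / area_per_20_ml
    if mass_g is not None:
        return 20.0 * mass_g / 4.0
    raise ValueError("provide surface area (with thickness) or mass")
```

For instance, 60 cm2 of a 0.3-mm film would call for 10 ml of medium, while the same area of a 1-mm sheet would call for 20 ml.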
2. Petroleum Extracts
Many complex mixtures of petroleum hydrocarbons that demonstrate dermal carcinogenicity in rodents are undetected or only induce marginal responses in the standard bacterial mutagenicity assay. In 1984, Blackburn et al. [40] described a modified S. typhimurium assay with improved sensitivity to complex mixtures of petroleum hydrocarbons with boiling points ≥500°F. The Blackburn method overcame two difficulties in testing petroleum hydrocarbons. First, by extracting the oil sample with dimethyl sulfoxide, aqueous compatible solutions are obtained. Second, the metabolic activation system is enhanced by using an 80% hamster liver S9 mix with a 2× NADP concentration rather than the standard 10% rat liver S9. Blackburn et al. [41] further reported that when using tester strain TA98 in the modified assay, there was excellent correlation between the slope of the initial portion of the mutagenicity curve (termed the mutagenicity index) and the carcinogenic potency of the oil.
3. Reductive Metabolic Activation
Many dyes used in foods, drugs, cosmetics, and industrial products are azo compounds. These compounds can be reduced, by anaerobic intestinal microflora and mammalian azoreductases in the liver, to free aromatic amines. Since many aromatic amines are carcinogenic and mutagenic, a safety assessment of these compounds is necessary. Prival and Mitchell [42] have developed a modification to the standard assay for optimal sensitivity to a wide range of azo dyes. These modifications were adapted from those developed in Sugimura's laboratory and employ 30% uninduced hamster S9 under reductive conditions and a 30-min, 30°C preincubation. Reductive metabolic activation should be used in conjunction with the normal activation system.
4. Vapor- and Gas-Phase Test Articles
Vapor or gaseous test articles require an effective containment system to ensure adequate testing. Several exposure systems have been developed. Distlerath et al. [43] developed the taped plate assay to test volatile materials on a small scale. Hughes et al. [44] have used Tedlar bags for in situ testing of volatiles in small-volume environmental air samples. Wagner et al. [45] reported on the use of the desiccator methodology for the routine testing of vapor-phase and gas-phase test articles. By using 9-liter desiccators to contain the test articles, dose-responsive positive increases were observed with methylene chloride as a vapor and with vinyl bromide as a gas.
5. Biomonitoring
The genotoxicity testing of urine has been used to monitor humans exposed to cigarette smoke, drugs, and to chemicals in the workplace. Chemicals administered to humans can appear in the urine as the free, parent chemical, as one or more metabolites, or as conjugates with glucuronic acid, cysteine, or other substances. Although not all metabolites are excreted in the urine, assays on urine have been used successfully to test the in vivo metabolism and mutagenicity of cigarette smoke, antineoplastic drugs, food mutagens, and other chemicals. To adjust for varying volumes of urine and differences in body weight between individuals, the creatinine content of each urine sample can be used to standardize the quantity and concentration of urine tested in the mutagenicity assay. Due to the presence of histidine in the urine, which can interfere with the assay, and also to concentrate the excreted mutagens, the samples can be extracted with a SEP-PAK C18 column (Millipore Corp.). The C18 column will extract nonpolar components and has been found to be superior to XAD-2 resins in terms of the performance and recovery of nonpolar genotoxic constituents in urine [46]. The loss of polar components in both the XAD-2 and C18 extraction procedures is anticipated. The extracts are generally tested using an enhanced preincubation procedure developed by Kado et al. [47]. In this method, heightened sensitivity (13- to 20-fold) is achieved by using a concentrated preincubation mixture with a 10-fold increase in the number of bacteria per plate and a 90-min preincubation period.
6. Screening Assays
While these assays will not meet regulatory requirements, they can provide a quick, cost-effective means of screening a large number of materials. Each method has advantages and disadvantages.
(a) Gradient Plate Assay. In this assay, developed by Cline and McMahon, each tester strain is tested over a 4-log (0.1 to 1000 µg/ml) concentration range in the presence and absence of S9 activation [48]. The frequency of revertant colonies is qualitatively evaluated along the bacterial streak. If no mutagenic activity occurs, a pale band of bacterial growth is observed along the inoculation streak. If nonlethal, mutagenic activity occurs, revertant colonies appear along the pale bacterial streak. The colony frequency generally increases with the increasing test article gradient until the concentration of maximum mutation is reached. If toxicity occurs, there will be no bacterial growth. While the results provide only a qualitative assessment of the test article's mutagenic activity, this assay allows testing of all strains in the standard battery with about 150 mg of test article.
(b) Spot Test Assay. Each tester strain is tested in duplicate in the presence and absence of S9 activation. Since the test article is applied in the center of a petri dish and a concentration gradient forms as the test article diffuses into the agar, each plate permits evaluation over a limited concentration range [49]. Revertant colonies induced by test articles that diffuse poorly tend to be tightly clustered immediately outside the zone of inhibition of growth caused by cytotoxicity of the test article. In these cases it is difficult to accurately count the revertants. On the other hand, some test articles diffuse so readily that the revertants are spread fairly evenly throughout the top agar and can be counted with reasonable accuracy. If the distribution of revertants is skewed toward the area around the test article, the phenomenon is referred to as clustering. For these reasons, the results provide a semiquantitative assessment of the test article's mutagenic activity with all strains in the standard battery and about 200 mg of test article.
(c) Abbreviated Standard Assay. This assay generally employs tester strains TA98 and TA100, two strains that are responsive to a diverse group of mutagens. Each strain is tested at multiple dose levels, each in the presence and absence of S9 activation. The results provide a quantitative but limited assessment of the test article's mutagenic activity with about 150 mg of test article.
III. MAMMALIAN CELL MUTATION ASSAYS
A. Introduction
The induction of mutations in mammalian cells in vitro was first demonstrated in Chinese hamster V79 cells [50,51]. Since that time, investigators have searched for in vitro systems that are sensitive to chemically induced mutations, e.g., in diploid fibroblasts [52], L5178Y mouse lymphoma cells [53,54], Chinese hamster ovary cells [55,56], and human lymphoblasts [57,58]. Depending on the specific regulatory requirements, any one of the following three assay systems may be used: the thymidine kinase locus of L5178Y mouse lymphoma cells (TK), the hypoxanthine-guanine phosphoribosyl transferase locus of Chinese hamster ovary cells (CHO/HGPRT), and the bacterial xanthine-guanine phosphoribosyl transferase gene which has been inserted into Chinese hamster ovary cells (AS52/XPRT). This discussion is focused on the TK and CHO/HGPRT assay systems, which are the more commonly used mammalian cell gene mutation systems.
1. Purpose of Study
The purpose of the mammalian mutation assay is to evaluate the mutagenic potential of the test article based on quantitation of forward mutations at a selected locus in mammalian cell cultures (specifically, the TK locus of L5178Y mouse lymphoma cells, or the HGPRT locus of CHO cells).
2. Cell Selection and Justification
The L5178Y mouse lymphoma thymidine kinase assay, first described by Clive et al. [59], has successfully identified a number of mutagenic agents. The assay utilizes L5178Y cells, which are heterozygous at the TK locus. Potential mutagenic agents are tested for the ability to cause the TK+/- → TK-/- mutation.
TK-/- mutants lack the salvage enzyme TK and can easily be detected by their resistance to lethal thymidine analogues. The selective agent of choice is trifluorothymidine (TFT) [60].
L5178Y mouse lymphoma cells possess several desirable characteristics that facilitate mutagenicity testing. The cells grow in suspension culture, which allows (1) the cultures to be easily sampled to determine the cell population densities, and (2) the enumeration of mutant colonies in selective soft agar medium. The cell line also exhibits a very high cloning efficiency (>70% in our hands) and has a generation time of 10 to 12 hr. Within 10 to 14 days after the initial seeding, colonies are large enough to count and size with an automatic colony counter. This cell line is easily cryopreserved, recovers quite readily when reestablished in culture, and is routinely grown in horse serum, which is very cost effective.
The maximum mutagenic response at the TK locus is achieved with a relatively short expression time. A 2- to 3-day expression period is necessary for the L5178Y TK+/- assay [61,62], whereas a range of 6 to 16 days is required for selective systems measuring mutations at the HGPRT locus [54,58,63,64]. A short expression period also allows the recovery of slower-growing mutants, which would be overgrown during a longer selection period.
In addition to the detection of point mutation, the use of L5178Y cells, which are heterozygous at the TK locus, also permits the detection of mutants resulting from chromosomal rearrangements. L5178Y TK-/- mutants resulting from point mutations are recoverable because no essential functions are deleted. Mutants resulting from chromosomal deletion or rearrangement may be recovered if the deleted essential function is supplied by the homologous region on the homologous chromosome [65]. After treatment with many mutagenic substances, mutant TK-/- colonies exhibit a characteristic frequency distribution of colony sizes.
The precise distribution of large and small TFT-resistant mutant colonies appears to be the characteristic mutagenic "fingerprint" of carcinogens in the L5178Y TK+/- system [61,65]. Clive et al. [61] and Hozier et al. [66] have presented evidence to substantiate the hypothesis that the small-colony variants carry chromosome aberrations associated with chromosome 11, the chromosome on which the TK locus is located in the mouse [67]. Large-colony TK-/- mutants with normal growth kinetics appeared karyotypically similar within and among clones and with the TK+/- parental cell line. In contrast, most slow-growing, small-colony TK-/- mutants had readily recognizable chromosome rearrangements involving chromosome 11. They suggested that the heritable differences in growth kinetics and resultant colony morphology in large and small mutants were related to the type of chromosome damage sustained. Large-colony mutants received very localized damage, possibly in the form of point mutation or small deletion within the TK locus, while small-colony mutants received damage to collateral loci concordant with the loss of TK activity. This view was substantiated by the work of Glover et al. [68]. They reported the loss of a 6.2 kb fragment that spans the end of the TK locus and a portion of the collateral loci in small-colony mutants. Therefore, it is possible that a range of genetic lesions (point mutations and chromosomal damage) is detected using the TK locus in mouse lymphoma cells.
The CHO/HGPRT assay was designed to select for mutant cells that have become resistant to such purine analogues as 6-thioguanine (TG) and 8-azaguanine as a result of mutation at the X chromosome-linked HGPRT locus [55,69-71]. This system has been demonstrated to be sensitive to the mutagenic action of a variety of chemicals [55]. Unlike the L5178Y cells, which are heterozygous at the TK locus, the CHO cells are functionally hemizygous at the HGPRT locus (the single functional copy of the HGPRT gene being present on only one of the two X chromosomes). Therefore, while mutants resulting from point mutations are detectable in this assay system, mutants resulting from chromosomal mutations are not recoverable because no homologous region is available to supply any deleted essential function and lethality ensues [65].
3. Maintenance of Cells
L5178Y/TK+/- mouse lymphoma cells, from clone 3.7.2C, should be obtained from Patricia Poorman-Allen, GlaxoWellcome Inc., Research Triangle Park, NC. Each freeze lot of cells must be tested and found to be free of mycoplasma contamination. Prior to use in the assay, L5178Y/TK+/- cells are cleansed to reduce the frequency of spontaneously occurring TK-/- cells. Using the procedure described by Clive and Spector [73], L5178Y cells are cultured for 24 hr in the presence of thymidine, hypoxanthine, methotrexate, and glycine to poison the TK-/- cells. L5178Y cells are cultured in Fischer's Media for Leukemic Cells of Mice with 0.1% Pluronics, supplemented with 10% horse serum and 2 mM glutamine (F10P).
The CHO-K1-BH4 cell line is a proline auxotroph with a modal chromosome number of 20, a population doubling time of 12 to 14 hr, and a cloning efficiency of usually greater than 80% [71]. This subclone (D1) was derived by Dr. Abraham Hsie, Oak Ridge National Laboratories, Oak Ridge, TN. CHO cells should be cleansed in medium supplemented with hypoxanthine, aminopterin, and thymidine (HAT), then frozen. Cells used in the mutation assay should not exceed four subpassages from frozen stock. Each freeze lot of cells must be tested and found to be free of mycoplasma contamination. Exponentially growing CHO-K1-BH4 cells are cultured in F12 medium, with or without hypoxanthine, supplemented with 5% dialyzed serum (F12FBS5 or F12FBS5-Hx).
B. Experimental Design
With L5178Y/TK+/- cells, the mammalian mutation assay is performed by exposing single or duplicate cultures (depending on protocol) to concentrations of test article as well as positive and negative (solvent) controls. Exposures are for 4 hr in the presence and absence of an S9 activation system. Following a 2-day expression period, with daily cell population adjustments, cultures demonstrating 0 to 90% growth inhibition are cloned, in triplicate, in restrictive medium containing soft agar to select for the mutant phenotype. After a 10- to 14-day selection period, mutant colonies are enumerated. The mutagenic potential of the test article is measured by its ability to induce TK+/- → TK-/- mutations. For those test articles demonstrating a positive response, mutant colonies are sized as an indication of mechanism of action.
In compliance with regulatory guidelines, verification of a clear positive response will not be required [1,2]. For equivocal and negative results without activation, an independent repeat assay will be performed in which cultures are continuously exposed to the test article for 24 hr. A preliminary toxicity assay without S9 activation and using 24-hr continuous treatment may be performed (where appropriate) to select doses for the independent repeat assay. For equivocal results with S9 activation, an independent repeat assay will be performed using modified dose levels or study design. For negative results with S9 activation, an independent repeat assay will not be required unless the test article is known to have specific requirements of metabolism.
The CHO/HGPRT mutation assay is performed by exposing CHO cells for 5 hr to concentrations of test article as well as positive and solvent controls in the presence and absence of an exogenous source of metabolic activation. After a 7- to 9-day expression period, the treated cells are cultured in the presence of 10 µM TG for selection of mutant colonies. The mutagenic potential of a test article is determined by its ability to induce a dose-related increase in the number of TG-resistant mutant colonies when compared with the solvent control.
1. Dose Selection
The toxicity profile of a test article is determined in a preliminary toxicity test. For the mouse lymphoma assay system, the preliminary toxicity test is conducted by exposing L5178Y/TK+/- cells to solvent alone and to multiple concentrations of test article, the highest concentration being the lowest insoluble dose in treatment medium but not to exceed 5 mg/ml or 10 mM (whichever is the lower) [2,72]. The pH of the treatment medium is adjusted, if necessary, to maintain a neutral pH in the treatment medium. The osmolality of the highest soluble treatment condition is also measured. After a 4-hr treatment in the presence and absence of S9 activation, cells are washed twice with F10P and cultured in suspension for 2 days posttreatment, with cell concentration adjustment on the first day. Selection of dose levels for the mouse lymphoma mutation assay is based on reduction of suspension growth after treatment in the preliminary toxicity test.
Typically, the high dose for the mutation assay is that concentration exhibiting approximately 100% growth inhibition. The low dose is selected to exhibit 0% growth inhibition. For freely soluble, nontoxic test articles, the highest concentration is 5 mg/ml or 10 mM (whichever is the lower). For relatively insoluble, nontoxic test articles, the highest concentration is the lowest insoluble dose in treatment medium but not to exceed 5 mg/ml or 10 mM (whichever is the lower). In all cases, precipitation is evaluated at the beginning and at the end of the treatment period using the naked eye [2,72].
For the CHO/HGPRT assay system, the preliminary toxicity test is based upon colony-forming efficiency. Approximately 5 × 10^5 CHO cells are cultured overnight and then exposed to solvent alone and to multiple concentrations of test article, the highest concentration being the lowest insoluble dose in treatment medium not to exceed 5 mg/ml or 10 mM (whichever is the lower) [72]. The pH of the treatment medium should be adjusted, if necessary, to maintain a neutral pH in the treatment medium. The osmolality of the highest soluble treatment condition also should be measured. Exposure is for 5 hr at 37°C in a humidified atmosphere of 5% carbon dioxide (CO2) in air in the presence and absence of S9 activation. Eighteen to 24 hr after removal of treatment medium, the treated cells are trypsinized and reseeded at a density of 100 cells/60 mm dish. After 7 to 10 days of incubation at 37°C in a humidified atmosphere of 5% CO2 in air, colonies are fixed with 95% methanol, stained with 10% aqueous Giemsa, and counted. The cell survival of the test article-treated groups is expressed relative to the solvent control (relative cloning efficiency). For the mutation assay, whenever possible, the high dose is selected to give a cell survival of 10 to 30%. Four lower doses are selected, at least one of which will be nontoxic.
For nontoxic test articles, the selection of the highest concentration should be based on the same criteria as those described for the mouse lymphoma assay system.
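The relative cloning efficiency calculation and the 10 to 30% survival target for the CHO/HGPRT high dose can be sketched as follows. The function names are hypothetical, and choosing the highest concentration whose survival falls in the target window is an illustrative assumption, not a rule stated in the text.

```python
def relative_cloning_efficiency(treated_colonies, solvent_colonies):
    """Survival of a treated group as a percentage of the solvent control.

    Each argument is a list of colony counts from dishes seeded at the same
    density (100 cells per dish in the preliminary toxicity test).
    """
    treated_mean = sum(treated_colonies) / len(treated_colonies)
    solvent_mean = sum(solvent_colonies) / len(solvent_colonies)
    return 100.0 * treated_mean / solvent_mean

def pick_high_dose(survival_by_dose, low=10.0, high=30.0):
    """Return the highest concentration with survival in [low, high] percent.

    survival_by_dose: dict mapping concentration to relative survival (%).
    Returns None when no tested concentration falls in the window.
    """
    candidates = [dose for dose, survival in survival_by_dose.items()
                  if low <= survival <= high]
    return max(candidates) if candidates else None
```

As an example, mean counts of 20 colonies in a treated group and 80 in the solvent control give a relative cloning efficiency of 25%, which lies within the target range for the high dose.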
2. Number of Cultures
For the preliminary toxicity test in the mouse lymphoma assay system, a single culture is exposed to the solvent and each test article dose level. For the mutation assay, typically, duplicate cultures are exposed to the solvent control, positive control, and each of eight test article dose levels; cloning for viability assessment and selection of the mutant phenotype are performed at a minimum of five test article dose levels. The use of single cultures in the mutation assay is also acceptable provided that more test article doses are used (e.g., 10 dose levels) and cloned for viability assessment and mutant selection (e.g., eight dose levels).
For the preliminary toxicity test in the CHO/HGPRT assay system, a single culture is exposed to the solvent and each test article dose level. For the mutation assay, duplicate cultures are exposed to the solvent control, positive control, and each of five test article dose levels; cloning for viability assessment and selection of the mutant phenotype are performed at a minimum of four test article dose levels.
3. Metabolic Activation System
For the mouse lymphoma assay system, Aroclor 1254-induced rat liver S9 is thawed immediately prior to use and mixed with a cofactor pool to contain 11.25 mg DL-isocitric acid, 6 mg NADP, and 0.25 ml S9 homogenate per ml in F0P. The S9 mix is adjusted to pH 7.
For the CHO/HGPRT assay system, Aroclor 1254-induced rat liver S9 is thawed immediately prior to use and mixed with a cofactor pool to contain 100 µl S9/ml reaction mixture of approximately 4 mM NADP, 5 mM glucose-6-phosphate, 10 mM MgCl2, 30 mM KCl, 10 mM CaCl2, and 50 mM sodium phosphate buffer, pH 8.0 [70]. The S9 reaction mixture must be stored on ice until used.
4. Controls
The solvent (or vehicle) for the test article is used as the negative control. For the mouse lymphoma assay system, methyl methanesulfonate (MMS) at two concentrations of 10 and 20 µg/ml for the 4-hr exposure (or 2.5 and 5 µg/ml for the 24-hr exposure) is used as the positive control for the nonactivated test system. For the S9-activated system, 7,12-dimethylbenz[a]anthracene (DMBA) at two concentrations of 2.5 and 4.0 µg/ml is used as the positive control. For the CHO/HGPRT assay system, ethyl methanesulfonate (EMS) is used at a concentration of 0.2 µl/ml as the positive control for the nonactivated study, and benzo[a]pyrene (BaP) is used at a concentration of 4 µg/ml as the positive control for the S9-activated study.
5. Dose Administration
For the mouse lymphoma assay system, treatment is carried out in conical tubes by combining test or control article in solvent or solvent alone with medium or S9 activation mixture containing 6 × 10^6 L5178Y/TK+/- cells in a total volume of 10 ml. All pH adjustments should be performed prior to adding S9 or target cells to the treatment medium. Treatment tubes are gassed with 5% CO2 in air, capped tightly, and incubated with mechanical mixing for 4 hr at 37°C. At the end of the exposure period, the cells are washed twice with culture medium and collected by centrifugation. The cells are resuspended in 20 ml F10P, gassed with 5% CO2 in air, and cultured in suspension at 37°C for 2 days following treatment. Cell population adjustments to 0.3 × 10^6 cells/ml are made at 24 and 48 hr.
For selection of the TFT-resistant phenotype, cells from the appropriate number of cultures demonstrating from 0 to 90% suspension growth inhibition are plated into three replicate dishes at a density of 1 × 10^6 cells/100 mm plate in cloning medium containing 0.23% agar and 2 to 4 µg TFT/ml. For estimation of cloning efficiency at the time of selection, 200 cells/100 mm plate are plated in triplicate in cloning medium free of TFT (viable cell [VC] plate). Plates are incubated at 37°C in a humidified atmosphere of 5% CO2 for 10 to 14 days.
For the CHO/HGPRT assay system, the time of initiation of chemical treatment is designated as day 0. Cells are exposed, in duplicate cultures, to five concentrations of test article for 5 hr at 37°C in a humidified atmosphere of 5% CO2 in air. After the treatment period, all media are aspirated and the cells are washed to remove treatment medium and are cultured in F12FBS5 or F12FBS5-Hx at 37°C in a humidified atmosphere of 5% CO2 in air. After 18 to 24 hr of incubation, the cells are subcultured to assess cytotoxicity and to continue the phenotypic expression period. For evaluation of cytotoxicity, the replicate cultures from each treatment condition are subcultured independently in F12FBS5 or F12FBS5-Hx, in triplicate, at a density of 100 cells/60 mm dish. After 7 to 10 days of incubation at 37°C in 5% CO2 in air, colonies are fixed with 95% methanol, stained with 10% aqueous Giemsa, and counted. Cytotoxicity is expressed relative to the solvent-treated control cultures. For expression of the mutant phenotype, the replicate cultures from each treatment condition are subcultured independently at a density of no greater than 10^6 cells/100 mm dish. Subculture, as above, at 2- to 3-day intervals is performed for the 7- to 9-day expression period. At this time, selection for the mutant phenotype is performed.
For selection of the TG-resistant phenotype, cells from each treatment condition are plated into a maximum of five dishes at a density of 2 × 10^5 cells/100 mm dish in F12FBS5-Hx containing 10 µM TG. For cloning efficiency at the time of selection, 100 cells/60 mm dish are plated in triplicate in medium free of TG. After 7 to 10 days of incubation, the colonies are fixed, stained, and counted for both cloning efficiency at selection and mutant selection.
6. End Point Evaluation and Time of Evaluation
For the mouse lymphoma assay system, the total number of colonies per plate is determined after 10 to 14 days of incubation for the VC plates and the total relative growth calculated. The total number of colonies per TFT plate is then determined for those cultures with ≥10% total growth. Colonies are enumerated using an automatic counter; if the automatic counter cannot be used, the colonies are counted manually. The diameters of the TFT colonies from the positive control and solvent control cultures should be determined over a range of approximately 0.2 to 1.1 mm. In the event the test article demonstrates a positive response, the diameters of the TFT colonies for at least one dose level of the test article (the highest positive concentration) are determined over a range of approximately 0.2 to 1.1 mm.
For the CHO/HGPRT assay system, the colonies are fixed with 95% methanol after 7 to 10 days of incubation. This includes the plates for cytotoxicity assessment, mutant selection, and cloning efficiency at time of selection. Once stained, the colonies are counted.
C. Evaluation Criteria
1. Data Presentation
For the mouse lymphoma assay system, the cytotoxic effect of each treatment condition is expressed relative to the solvent-treated control for suspension growth over 2 days posttreatment and for total growth (suspension growth corrected for plating efficiency at the time of selection). The mutant frequency for each treatment condition is calculated by dividing the mean number of colonies on the TFT plates by the mean number of colonies on the VC plates and multiplying by the dilution factor (2 × 10^-4), and is expressed as TFT-resistant mutants per 10^6 surviving cells.
For the CHO/HGPRT assay system, the cytotoxic effect of each treatment condition should be expressed relative to the solvent-treated control (relative cloning efficiency). The mutant frequency for each treatment condition is calculated by dividing the total number of mutant colonies by the number of cells selected, corrected for the cloning efficiency of cells prior to mutant selection, and is expressed as TG-resistant mutants per 10^6 clonable cells. For experimental conditions in which no mutant colonies are observed, mutant frequencies should be expressed as less than the frequency obtained with one mutant colony. Mutant frequencies generated from doses giving ≤10% relative survival are not considered as valid data points and are not included in the data analysis.
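The two mutant frequency calculations can be sketched in Python as follows. This is a minimal illustration with hypothetical function names; it assumes the plating densities given earlier in the chapter (1 × 10^6 cells per TFT plate and 200 cells per VC plate, which together give the dilution factor of 2 × 10^-4) and takes the cloning efficiency as a fraction between 0 and 1.

```python
def mla_mutant_frequency(tft_counts, vc_counts,
                         cells_per_tft_plate=1_000_000,
                         cells_per_vc_plate=200):
    """TFT-resistant mutants per 10^6 surviving cells (mouse lymphoma assay)."""
    mean_tft = sum(tft_counts) / len(tft_counts)
    mean_vc = sum(vc_counts) / len(vc_counts)
    dilution_factor = cells_per_vc_plate / cells_per_tft_plate  # 2 x 10^-4
    return mean_tft / mean_vc * dilution_factor * 1e6


def cho_mutant_frequency(mutant_colonies, cells_selected, cloning_efficiency):
    """TG-resistant mutants per 10^6 clonable cells (CHO/HGPRT assay).

    Returns (frequency, is_upper_bound). When no mutant colonies are
    observed, the frequency for one colony is returned with
    is_upper_bound=True, to be reported as "less than" that value.
    """
    clonable_cells = cells_selected * cloning_efficiency
    colonies = max(mutant_colonies, 1)
    frequency = colonies / clonable_cells * 1e6
    return frequency, mutant_colonies == 0


if __name__ == "__main__":
    print(mla_mutant_frequency([48, 50, 52], [95, 100, 105]))
    print(cho_mutant_frequency(8, 5 * 200_000, 0.8))
    print(cho_mutant_frequency(0, 5 * 200_000, 0.8))
```

Returning a flag for the zero-colony case keeps the "less than" qualifier attached to the number rather than silently reporting a frequency of zero.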
2. Criteria for Valid Test
For the mouse lymphoma assay system, the following criteria must be met for a test to be considered valid. The spontaneous mutant frequency of the solvent (or vehicle) control cultures must be within 20 to 100 TFT-resistant mutants per 10^6 surviving cells. The cloning efficiency of the solvent (or vehicle) control group must be greater than 50%. At least one concentration of each positive control must exhibit mutant frequencies of ≥100 mutants per 10^6 clonable cells over the background level. The colony size distribution for the MMS positive control must show an increase in both small and large colonies [74,75]. A minimum of four analyzable concentrations with mutant frequency data is required for assays with duplicate cultures (or a minimum of eight analyzable concentrations with mutant frequency data for assays with single cultures) [72].
For the CHO/HGPRT assay system, the following criteria must be met for a test to be considered valid. The cloning efficiency of the solvent (or vehicle) control must be greater than 50%. The spontaneous mutant frequency in the solvent (or vehicle) control must fall within the range of 0 to 25 mutants per 10^6 clonable cells. The positive control must induce a mutant frequency at least three times that of the solvent control and must exceed 40 mutants per 10^6 clonable cells. A minimum of four analyzable concentrations with mutant frequency data is required [72].
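As an illustration, the CHO/HGPRT validity criteria can be expressed as a checklist function. The Python sketch below uses hypothetical names and covers only the four CHO/HGPRT criteria listed above; mutant frequencies are per 10^6 clonable cells and cloning efficiency is in percent.

```python
def cho_hgprt_validity_failures(solvent_ce_pct, solvent_mf,
                                positive_mf, n_analyzable):
    """Return the list of validity criteria that are NOT met (empty if valid)."""
    failures = []
    if not solvent_ce_pct > 50:
        failures.append("solvent control cloning efficiency must exceed 50%")
    if not 0 <= solvent_mf <= 25:
        failures.append("spontaneous mutant frequency must be 0 to 25")
    if not (positive_mf >= 3 * solvent_mf and positive_mf > 40):
        failures.append("positive control must be >= 3x solvent and > 40")
    if n_analyzable < 4:
        failures.append("at least four analyzable concentrations required")
    return failures


if __name__ == "__main__":
    print(cho_hgprt_validity_failures(85, 8, 150, 5))
    print(cho_hgprt_validity_failures(85, 30, 60, 3))
```

Returning the unmet criteria, rather than a single true/false value, makes it clear in a study record why an assay was rejected.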
3. Positive and Negative Test
In evaluation of the data for the mouse lymphoma assay system, increases in mutant frequencies that occur only at highly toxic concentrations (i.e., less than 10% total growth) are not considered biologically relevant. All conclusions are based on sound scientific judgment; however, the following criteria are presented as a guide to interpretation of the data [76]. The test article is considered to induce a positive response if a concentration-related increase in mutant frequency is observed and one or more dose levels with 10% or greater total growth exhibit mutant frequencies of ≥100 mutants per 10^6 clonable cells over the background level. A result is considered equivocal if the mutant frequency in treated cultures is between 55 and 99 mutants per 10^6 clonable cells over the background level. Test articles producing fewer than 55 mutants per 10^6 clonable cells over the background level are concluded to be negative.
For the CHO/HGPRT assay system, spontaneous mutant frequencies range from 0 to 25 mutants per 10^6 clonable cells. As a result, calculation of mutagenic response in terms of fold increase in mutant frequency above the background rate does not provide a reliable indication of the significance of the observed response. The wide acceptable range in spontaneous mutant frequency also suggests the need to set a minimum mutant frequency for a response to be considered positive. Hsie et al. [56] refer to a level of 50 mutants per 10^6 clonable cells. However, a minimum significant level at >40 mutants per 10^6 clonable cells is commonly used.
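The mouse lymphoma interpretation guide can be sketched as a small decision function. This Python example is hypothetical and is no substitute for scientific judgment: the concentration-related trend is supplied by the evaluator as a yes/no input, values from 55 up to (but not including) 100 are treated as equivocal, and the case of a large increase without a concentration-related trend, which the guide does not address, is flagged for judgment rather than classified.

```python
def classify_mla(background_mf, dose_levels, concentration_related):
    """Classify a mouse lymphoma assay result.

    background_mf: solvent control mutant frequency (per 10^6 cells).
    dose_levels: list of (total_growth_percent, mutant_frequency) tuples.
    concentration_related: evaluator's judgment that the increase is
        concentration related (True or False).
    """
    # Dose levels below 10% total growth are not biologically relevant.
    induced = [mf - background_mf
               for growth, mf in dose_levels if growth >= 10]
    if not induced:
        return "no analyzable dose levels"
    peak = max(induced)
    if peak >= 100:
        if concentration_related:
            return "positive"
        return "requires scientific judgment"  # assumption, not in the guide
    if peak >= 55:
        return "equivocal"
    return "negative"


if __name__ == "__main__":
    doses = [(80, 60), (50, 120), (15, 180), (5, 400)]
    print(classify_mla(40, doses, concentration_related=True))
```

In the usage example, the 400 mutants per 10^6 cells observed at 5% total growth is ignored, and the classification rests on the 15% total growth level.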
D. Supplemental Information
The full power of the mouse lymphoma assay is not realized without very careful attention to dose selection, culture and treatment conditions, cloning conditions, colony counting techniques, and mutant colony sizing requirements. To produce a valid assay, it is essential to assure that a sufficient range in toxicity has been induced by the test chemical so that weak positive responses are not missed.
Culture conditions should provide for rapid, uniform cell proliferation and should be free of exposure to white light or other nonspecific mutagens. Treatment conditions can markedly affect the outcome of genotoxicity tests using cultured mammalian cells. Low pH and high osmolality have been demonstrated to produce false positive responses for mutation, cell transformation, and clastogenicity [77]. Perhaps the most sensitive portion of the assay is cloning. Some media and sera that will support suspension growth fail to sustain clonal growth. This is also the part of the assay most susceptible to contamination. Cloning conditions are best judged by the recovery of small-colony mutants [76]. If the small colonies do not grow to a detectable size, many mutagens will go undetected. Use of a known small-colony inducer such as methyl methanesulfonate provides a control for cloning conditions (small-colony generation), colony counting, and sizing.
Apart from the conventional mouse lymphoma assay that entails the plating of cells in an agar matrix for viability assessment and mutant selection, a microtiter (or microwell) method has been developed [78,79]. In the microtiter method, the procedures for the preparation of cells for assay and the treatment of cells with the test article are identical to those used in the conventional assay. For assessment of viability following treatment and mutant selection, cells are transferred into 96-well microtiter plates. Sizing of mutant colonies in the microwells is feasible and the data are essentially comparable to those generated from the conventional agar method [80-82].
In the AS52/HPRT mammalian cell gene mutation assay system, the CHO cells carry a single copy of the E. coli gpt gene stably integrated and express the bacterial gene for the enzyme xanthine-guanine phosphoribosyltransferase [56,83]. Mutants deficient in this enzyme can be induced in the AS52 cell line and both spontaneous and induced mutants can be selected for resistance to 6-thioguanine.
IV. IN VITRO CHROMOSOME ABERRATION ASSAYS
A. Introduction
For several decades the in vitro cytogenetics assay has proven its worth in assessing chromosomal derangement due to chemicals and radiation and has become an integral element of genetic toxicology testing. The chromosome aberration assay offers the advantage of direct visualization of the damage caused by the test article under investigation and can also be used to screen populations for chromosome anomalies arising as a result of environmental agents. The abnormalities that are detected using this method are structural chromosome aberrations, which involve breaks and rearrangements, and numerical chromosome aberrations, involving variations in the number of chromosomes in the nucleus.
Structural chromosome aberrations, involving one or both chromatids, result in a discontinuity in the chromosomal DNA that may be (1) repaired, restoring the original structure; (2) rejoined inappropriately, forming a rearrangement; or (3) left unrejoined, resulting in a break or deletion. The majority of structural aberrations are typically lethal to the cell or to the daughter cells during the first few cell cycles following their appearance. However, these structural anomalies serve as an indicator of the occurrence of transmissible aberrations, such as balanced translocations, duplications, inversions, and small deletions. These transmissible aberrations can result in fetal and perinatal mortality or defects at birth if they arise in germ cells, and may play a role in tumor initiation and progression in somatic cells [84].
Cells with numerical chromosome aberrations feature a chromosome complement different from the number of chromosomes characteristic for the species. A deviation of the chromosome number involving one or a few chromosomes is termed aneuploidy. This condition can be measured in karyotypically stable primary cell cultures (i.e., peripheral lymphocytes). A variation in the complement of chromosomes involving a whole set of chromosomes is called polyploidy or endoreduplication, depending on the underlying mechanism. This anomaly can be measured in primary cell cultures and in established cell lines. Aneuploidy, polyploidy, and endoreduplication are not due to the direct interaction of an environmental agent with the chromosomal DNA. Aneuploidy and polyploidy are usually the result of disruption of some aspect of the spindle apparatus (microtubules, centrioles, kinetochores, and the associated proteins) or due to cell fusion, failure of cytokinesis, or nuclear fusion in binucleate cells [85,86]. Endoreduplication appears to be an abnormal variation of cell replication, probably resulting from the action of DNA polymerase β rather than DNA polymerase α, that involves two cycles of chromosome replication in the absence of an intervening nuclear division [87-89].
When aneuploidy appears in germ cells, the result can be spontaneous abortions or defects at birth [90]. These numerical aberrations do not seem to play a key role in the initiation of tumors, but are indicative of the evolution of karyotypic instability within a population of tumor cells [85]. The physiological outcome, and hence the genotoxic impact, of polyploidy and endoreduplication is less than clear. Both phenomena are found in vivo in normal cell populations, and both are caused by a variety of agents, some of which do not induce other types of chromosome damage [91-94]. The induction of numerical aberrations may be interpreted as cytotoxicity with a concomitant perturbation of DNA replication, and unless it is observed at concentrations that are within a therapeutic (or expected exposure) range it may not reflect a major genotoxic lesion [85].
1. Purpose of Study
The purpose of an in vitro cytogenetic study is to evaluate the clastogenic or chromosome breakage potential of a test article and its metabolites based upon its ability to induce chromosome aberrations in culture using an established cell line or a primary cell source. The use of cell cultures as a test system has been demonstrated to be an effective method of detection of chemical clastogens [95]. Induction of chromosome breakage in vitro is an indicator that the test article is potentially genotoxic.
2. Cell Selection and Justification
Established cell lines or primary cell cultures can be used as the in vitro test system. The cell types routinely used in these assays are Chinese hamster ovary (CHO) cells, Chinese hamster lung (CHL) cells, and human peripheral blood lymphocytes (HPBL). The Chinese hamster cell lines, CHO and CHL, are established cell lines that are grown as monolayer cultures. Human lymphocytes, HPBL, are a primary cell source and are grown as suspension cultures that must be stimulated with the mitogen phytohemagglutinin (PHA) in order to divide. Primary cell cultures, such as HPBL, are believed to show some variability among donors in their response to test articles and their sensitivity to test article treatment compared with established cell lines [96,97]. However, HPBLs have demonstrated reliability over years of in vitro cytogenetic testing, are sufficiently validated, and their relevancy to human exposure to test articles cannot be overlooked. When obtaining blood cells, routine clinical blood handling precautions should be observed and donor selection should be closely monitored. Blood should only be obtained from healthy volunteers without a recent history of receiving medication, viral infection, or x-ray exposure [98].
Established cell lines from frozen stocks, such as CHO or CHL cells, are genetically more homogeneous than primary cell cultures and tend to show less interexperiment variability within a cell type. Established Chinese hamster cell lines, such as CHO and CHL, are useful in in vitro cytogenetics testing because they are easily cultured in standard medium, have a small number of large chromosomes each with a more or less distinctive morphology, and have a relatively short cell cycle time [99]. Several clones of CHO cells are currently available for cytogenetic studies.
In comparative studies, quantitative differences in responses to test articles among these clones have been demonstrated and this characteristic makes it imperative that the cell source and type of clone used in any study are well described [97]. An inherent property of most established cell lines is that due to extensive chromosome rearrangements the chromosome number varies around a modal value [97,98]. This property necessitates establishing criteria for defining analyzable cells. The current criteria are that cells with the modal chromosome number plus or minus two chromosomes are acceptable for microscopic analysis.
3. Maintenance of Cells
Established cell lines can be obtained from laboratories conducting in vitro cytogenetics studies or a supplier such as the American Type Culture Collection (Manassas, VA). Upon receipt of the cells, frozen stocks should be established and the cells need to be checked for mycoplasma contamination. Several environmental factors must be monitored closely. Extremes of pH and osmolality under test conditions must be avoided. Low pH and excessively high osmolality can induce chromosome aberrations [100,101]. These parameters must be adjusted to physiological levels when necessary. When the test system consists of established cell lines, the monolayers must not reach confluency at any time (i.e., during incubation of seeded cultures prior to test article treatment, or during the assay). The target cells in the assay are mitotically active. As the cells approach confluency, the growth rate slows down, thereby diminishing the number of target cells. In addition, with established cell lines, as the monolayers become confluent and the growth rate slows, the cell cultures have a tendency to become karyotypically unstable and the background levels of chromosome rearrangements (dicentrics, rings, etc.) may increase.
B. Experimental Design
In vitro chromosome aberration studies are generally conducted in two phases. The first phase, the preliminary toxicity assay, serves as a dose range-finding assay for the definitive portion of the study. In the second phase, the chromosome aberration assay, the clastogenic potential of the test article is evaluated. The second phase may include an initial and an independent repeat or confirmatory assay. When conducting an in vitro chromosome aberration assay it is critical to make structural damage assessments in chromosomes that are in the first posttreatment metaphase of the cell cycle (Figure 2). If damaged cells are capable of cycling and allowed to progress through more than one cell cycle, damaged chromosomes (or fragments) may be lost or converted to relatively intricate derivatives [102]. Structural damage that appears as chromatid-type in the first posttreatment metaphase can potentially emerge as chromosome-type damage in the second posttreatment metaphase. The loss of damaged chromosomes (or fragments) or the conversion of one type of damage into another during cell cycle progression can be misleading. Ensuring that structural damage is assessed in the first posttreatment metaphase may be accomplished by assessing the cell cycle delay, if any, that results from treatment with the test article in question, or by using multiple harvest times. The harvest times are designed to enrich the proportion of first posttreatment metaphase cells and are described below.
Figure 2 Examples of structural and numerical aberrations in Chinese hamster ovary (CHO) chromosomes. (A) Cell in metaphase of mitosis demonstrating the standard modal number (20) of structurally normal chromosomes. (B) Gap, g. (C) Chromatid break, ctb; chromatid rearrangement (triradial), tri. (D) Iso-chromatid break, demonstrating chromatid union in the distal fragment, iso-b. (E) Chromatid rearrangement (quadriradial), q. (F) Chromatid rearrangement (complex rearrangement), cr, and quadriradial, q. (G) Numerical aberration, endoreduplication. Examples of in vivo rat (H) and mouse (I) bone marrow metaphases showing chromatid type aberrations.
1. Dose Selection
In the preliminary toxicity assay, the maximum concentration tested is either 5 mg/ml or 10 mM (whichever is lower) for freely soluble test articles, or, for poorly soluble test articles, the maximum concentration resulting in a suspension that can deliver a reliable amount of test article to the test system (workable suspension). The preliminary toxicity assay should be conducted in the presence and absence of metabolic activation (S9). A sufficient number of lower concentrations must also be included to determine the appropriate dose range for testing in the definitive assay. A solvent control needs to be included in the preliminary toxicity assay. Toxicity end points are assessed relative to the solvent control and should be such that they give an appropriate indication of the test article impact on cell growth [103]. For established cell lines, the colony-forming efficiency (cloning/plating efficiency), cell monolayer confluency, or a measure of the number of surviving cells, assessed with an automatic cell counter, are reliable indicators of the test article toxicity. Determination of the average generation time using BrdUrd-labeled cells stained for sister chromatid differentiation [104] or mitotic index determination may be used as supplementary information, but neither is a sufficient indicator of cytotoxicity when used alone. For cells grown in suspension, such as HPBLs, the reduction in mitotic index relative to the solvent control is the most practical method of assessing toxicity. Under test conditions, the pH should be adjusted to physiological values and the osmolality should be monitored and compared with the solvent control.
In the definitive assay, the maximum concentration to be tested must be selected to demonstrate toxicity (assuming there is measurable toxicity) in the test system. For established cell lines, the highest concentration selected must exhibit a level of toxicity of at least a 50% reduction in cloning efficiency or cell monolayer confluency, or a 50% inhibition in cell growth relative to the solvent control. For HPBLs, the highest concentration selected for testing must exhibit a reduction in the mitotic index of at least 50% relative to the solvent control.
The selection of the maximum dose level for the chromosome aberration assay is as follows: for a nontoxic, nonprecipitating test article, the maximum dose level should be 5 mg/ml or 10 mM (whichever is lower); for a toxic test article, the maximum dose level should demonstrate toxicity (at least 50%) regardless of the precipitation profile; and for a precipitating, nontoxic test article, the maximum dose level should be the lowest precipitating dose level [2,105].
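The three-way selection rule can be sketched as a function over range-finding data. In this hypothetical Python example, each record holds a concentration in mg/ml, the percent toxicity relative to the solvent control, and whether precipitate was seen. The choice of the lowest concentration reaching 50% toxicity as the top dose is an assumption made for illustration; the text requires only that the top dose demonstrate at least 50% toxicity.

```python
def select_max_dose(range_finder, limit_mg_per_ml=5.0):
    """Pick the maximum dose level for the chromosome aberration assay.

    range_finder: list of dicts with keys "conc" (mg/ml),
        "toxicity_pct" (0-100), and "precipitate" (bool).
    limit_mg_per_ml: 5 mg/ml or the mg/ml equivalent of 10 mM,
        whichever is lower (supplied by the caller).
    Returns (concentration, reason).
    """
    doses = sorted(range_finder, key=lambda d: d["conc"])
    # Toxicity takes priority, regardless of the precipitation profile.
    toxic = [d["conc"] for d in doses if d["toxicity_pct"] >= 50]
    if toxic:
        return toxic[0], "toxic"  # assumption: lowest dose with >= 50% toxicity
    precipitating = [d["conc"] for d in doses if d["precipitate"]]
    if precipitating:
        return precipitating[0], "precipitating, nontoxic"
    return limit_mg_per_ml, "nontoxic, nonprecipitating"


if __name__ == "__main__":
    data = [
        {"conc": 0.5, "toxicity_pct": 5, "precipitate": False},
        {"conc": 1.5, "toxicity_pct": 20, "precipitate": True},
        {"conc": 5.0, "toxicity_pct": 35, "precipitate": True},
    ]
    print(select_max_dose(data))
```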
2. Number of Cultures
For the preliminary toxicity assay, one culture for each dose level and the solvent control is sufficient although more than one culture can be used. Duplicate cultures for each dose level and the solvent control are recommended for the chromosome aberration assay.
3. Metabolic Activation System
Extended treatment of cells in S9 should be avoided because of the cytotoxic nature of the mixture and the loss of enzymatic activity of the S9 mix over time. Exposures of 3 to 6 hr are sufficient when the concentration of S9 is 10% or less [97]. Aroclor 1254-induced S9 may, under some circumstances, induce chromosome breakage, possibly due to the generation of active oxygen radicals [106]. Other metabolic activation systems using different rodent species and inducing agents offer an alternative to the use of PCBs [107]. The effects of oxygen radicals may therefore be minimized by using S9 induced by agents other than Aroclor 1254, such as β-naphthoflavone plus phenobarbital.
4. Controls
To determine that the test system is functioning properly and that the test is valid, appropriate positive and negative controls must be included in chromosome aberration assays. A solvent control, in which the cell cultures are treated with the solvent only, is required to assess relative increases in the measured end point (structural or numerical chromosome aberrations). The amount of solvent added to the test system should be the same as in the test article-treated cultures. A solvent control needs to be included in both the S9-activated test system and in the nonactivated test system. If an unusual solvent is being used in the assay and there is little or no historical data to determine its effect on the test system, then an untreated control should also be included as one of the negative controls. The untreated control receives no additional components other than the culture medium for the nonactivated test system or the S9 reaction mixture for the metabolically activated test system. Exposure to S9 in the untreated control cultures should be for the same length of time as in the test article-treated cultures. Comparison of the solvent control values with the untreated control values allows assessment of any solvent effects.
A positive control is required to determine that the test system is capable of detecting clastogenic activity. The positive control articles should be appropriate for the system in which the test is conducted. In the nonactivated test system, the positive control article should be a direct-acting clastogenic compound such as mitomycin-C (MMC) or N-methyl-N'-nitro-N-nitrosoguanidine (MNNG); in the S9-activated test system, the positive control should be an agent requiring metabolic activation such as benzo[a]pyrene (BaP) or cyclophosphamide (CP). The positive controls should be used at concentrations that adequately demonstrate that the test system is functioning. However, high concentrations and the resulting extreme responses from the positive control articles are to be avoided whenever possible.
5. Dose Administration, Exposure Time, and Cell Collection Time
Target cells are treated for a defined exposure time that may be continuous up to cell harvest in the absence of S9 but limited (3 to 6 hr) in the presence of S9. Typically, the test article is dissolved in solvent at a concentration that will be diluted to the final target concentration when the dosing aliquot is added to the test system. Incorporation of the test article-solvent mixture into the treatment medium has been demonstrated to be an effective method of dosing the target cells [95]. In the S9-activated test system, the test article-solvent mixture is added to the S9 mixture in the culture flask for the prescribed exposure time (3 to 6 hr); the test article-solvent/S9 mixture is then aspirated, and the cells are rinsed with a physiological buffer and refed with complete medium. The cultures are then incubated until the cell harvest. It is important to include a concurrent toxicity test (colony-forming efficiency, cell monolayer confluency, or cell growth inhibition for CHO cells and mitotic inhibition for HPBLs) along with the chromosome aberration assay.
An attempt to harmonize global regulatory requirements by the ICH and OECD has led to the issuance of guidelines that satisfy global submission [1,105]. In this set of guidelines, the in vitro cytogenetics or chromosomal aberration assay is carried out in the absence and presence of S9 activation. The cells are exposed to at least three concentrations of the test article for 3 to 6 hr as well as positive and solvent controls in duplicate cultures. The cells are then harvested for microscopic analysis at a single time point 1.5 times the normal cell cycle from the initiation of treatment. In the event of a negative response in the nonactivated portion of the assay, an independent repeat or confirmatory assay is required.
The repeat assay is carried out with a continuous exposure up to cell harvest, which is 1.5 times the normal cell cycle. A negative response in the S9-activated portion of the assay may require a confirmatory assay (to be required on a case by case basis).
The end points measured include structural and numerical aberrations in metaphase chromosomes that are representative of the first cell division after treatment. The harvest times are selected in an attempt to balance the requirement for evaluating first-division metaphase chromosomes with the possibility of test article-induced cell cycle delay. To obtain the metaphase chromosomes, the cells are treated with a spindle apparatus disrupting agent, Colcemid®, 2 hr prior to cell harvest. The concentration of Colcemid is typically 0.1 µg/ml treatment medium. This treatment traps the cells in metaphase at the time of cell harvest. The cells are harvested by treatment with trypsin for monolayer cultures or by centrifugation for suspension cultures. The harvested cells are then treated in a hypotonic buffer (0.075 M KCl) for an appropriate time that allows the cells to swell (3 to 20 min, depending on cell type). The swelling of the cells is necessary to ensure well-separated chromosomes when the cells are dropped onto microscope slides. After hypotonic treatment, the cells are fixed (3:1 methanol and glacial acetic acid), dropped onto slides, and stained with Giemsa.
6. Metaphase Analysis
To ensure that a sufficient number of metaphase cells are available for analysis on the slides, the percentage of cells in mitosis per 500 cells scored (mitotic index) is determined for each treatment group. In some cases, when test article precipitation has been carried over by centrifugation of suspension cultures, or cannot be adequately rinsed from monolayers, it must be determined that the precipitate will not obscure analysis of the metaphase chromosomes. Slides from the highest scorable dose level and the next two or three dose levels are selected for analysis. Slides selected for analysis are blind coded. Metaphase cells with the modal number (2n) ± 2 centromeres are examined under oil immersion. Whenever possible, a minimum of 200 metaphase spreads (100 per duplicate flask) are examined and scored for chromatid-type and chromosome-type aberrations [98] (Figure 2). Chromatid-type aberrations include chromatid and isochromatid breaks and exchange figures such as quadriradials (symmetrical and asymmetrical interchanges), triradials, and complex rearrangements. Chromosome-type aberrations include chromosome breaks and exchange figures such as dicentrics and rings. Fragments (chromatid or acentric) observed in the absence of any exchange figure are typically scored as a break (chromatid or chromosome). Fragments observed with an exchange figure are not scored as an aberration but are considered part of the incomplete exchange. Pulverized chromosome(s), pulverized cells, and severely damaged cells (≥10 aberrations) are also recorded. Chromatid and isochromatid gaps are recorded but not included in the analysis. The XY coordinates for each cell with chromosomal aberrations are recorded using a calibrated microscope stage. The percent polyploid and endoreduplicated cells are also evaluated per 100 cells. The mitotic index is recorded as the percentage of cells in mitosis per 500 cells counted.
C. Evaluation Criteria
All conclusions must be based on sound scientific judgment. As a guide to interpretation of the data, the test article is usually considered to induce a positive response if the percent aberrant cells is increased in a dose-responsive manner with one or more concentrations being statistically elevated relative to the solvent control group (P ≤ 0.05). A reproducible and statistically significant increase at a single dose level may also be considered positive. Test articles not demonstrating a statistically significant increase in aberrations are concluded to be negative. Regardless of results of the statistical analysis, however, the biological relevance of the response must be considered (for example, comparing test results with historical control data), especially when evaluating a borderline response [103].
1. Data Presentation
The toxic effects of treatment are based upon colony-forming efficiency, cell monolayer confluency, or cell growth inhibition or mitotic inhibition relative to the solvent control group. These data are presented for both the preliminary toxicity assay and the definitive assay for all dose levels tested (which may include an initial and an independent repeat or confirmatory assay). The number and types of aberrations, the percentage of structurally aberrant cells (percent aberrant cells), numerically aberrant cells in the total population of cells examined, and the frequency of structural aberrations per cell (mean aberrations per cell) are also reported for each treatment group. Chromatid and isochromatid gaps are presented in the data but are not included in the total percentage of cells with one or more aberrations or in the frequency of structural aberrations per cell.
2. Statistical Analysis
Numerous statistical methods are available for use in analyzing chromosome aberration data. At the present time, a reliable and commonly used method is Fisher’s exact test. Fisher’s test is used to make pairwise comparisons between the percentage of aberrant cells in each test concentration group and the solvent control value. The significance level should be adjusted to take into account that multiple comparisons are made against a single concurrent solvent control. The Cochran-Armitage test is recommended as a trend test for dose responsiveness [108].
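The pairwise and trend tests described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of any particular laboratory: the function names and the example counts are hypothetical, the Bonferroni division of the significance level is one possible adjustment (the text does not name a specific method), and the Cochran-Armitage P value uses the usual normal approximation.

```python
from math import comb, erfc, sqrt


def fisher_exact_greater(aberrant_treated, cells_treated,
                         aberrant_control, cells_control):
    """One-sided Fisher's exact P value for an increase in the treated group."""
    total_aberrant = aberrant_treated + aberrant_control
    total_cells = cells_treated + cells_control
    denominator = comb(total_cells, cells_treated)
    upper = min(total_aberrant, cells_treated)
    # Sum the hypergeometric probabilities of the observed table and of
    # every table with more aberrant cells in the treated group.
    numerator = sum(
        comb(total_aberrant, k)
        * comb(total_cells - total_aberrant, cells_treated - k)
        for k in range(aberrant_treated, upper + 1)
    )
    return numerator / denominator


def cochran_armitage_trend(aberrant, cells, scores):
    """Return (z, one-sided P) for an increasing trend across dose scores."""
    total_cells = sum(cells)
    p_bar = sum(aberrant) / total_cells
    mean_score = sum(n * d for n, d in zip(cells, scores)) / total_cells
    statistic = sum(x * (d - mean_score) for x, d in zip(aberrant, scores))
    variance = p_bar * (1 - p_bar) * sum(
        n * (d - mean_score) ** 2 for n, d in zip(cells, scores)
    )
    if variance == 0:
        return 0.0, 0.5  # no variation in the data, so no evidence of trend
    z = statistic / sqrt(variance)
    return z, 0.5 * erfc(z / sqrt(2))


if __name__ == "__main__":
    # Hypothetical data: solvent control plus three concentrations,
    # 200 metaphases scored per group.
    aberrant = [1, 4, 9, 18]
    cells = [200, 200, 200, 200]
    adjusted_alpha = 0.05 / (len(aberrant) - 1)  # Bonferroni (assumption)
    for x, n in zip(aberrant[1:], cells[1:]):
        p = fisher_exact_greater(x, n, aberrant[0], cells[0])
        print(f"{x}/{n} vs control: P = {p:.4f}, "
              f"significant = {p <= adjusted_alpha}")
    print("trend (z, P):", cochran_armitage_trend(aberrant, cells, [0, 1, 2, 3]))
```

Equally spaced dose scores are used here for simplicity; actual or log-transformed concentrations could be substituted for the scores.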
3. Criteria for Valid Test
The frequency of cells with structural chromosome aberrations in the untreated and solvent control groups should remain within the range of the historical controls. For the positive controls, the percentage of cells with aberrations must be
statistically increased (P ≤ 0.05, Fisher’s exact test) relative to the solvent control or to the untreated control.
V. IN VIVO CYTOGENETIC ASSAY SYSTEMS
In vivo cytogenetic assays are used to evaluate the potential of a product to induce structural and numerical chromosome aberrations in somatic and germ cells of mammals. There are several advantages to in vivo testing, the most significant being the consideration of metabolic activation and detoxification processes. In vitro assays using S9 or primary hepatocytes to simulate in vivo metabolism may lack critical detoxification enzymes. Additionally, some chemicals are known to be modified by intestinal bacteria [109]. In vivo testing also allows for the assessment of chromosome alterations in both somatic and germ cells. The standard chromosome damage assays include the micronucleus assay in bone marrow or peripheral blood cells and the chromosome aberration assay in bone marrow or spermatogonial cells. Both the micronucleus test and the metaphase analysis assays may be used interchangeably for the demonstration of in vivo clastogenic activity.
A. Erythrocyte Micronucleus Test
1. Introduction
The micronucleus test is used for the detection of damage to chromosomes as well as the mitotic apparatus in bone marrow or peripheral blood cells of rodents. The assay system has been well standardized [110-113]. The basic features of the test system are (1) the effect of the test chemical is observed in anucleated polychromatic erythrocytes (PCEs); (2) polychromatic erythrocytes have a relatively short life span, so that any micronuclei they contain must have been generated as a result of recently induced chromosome damage; (3) micronuclei are readily identifiable and their distribution is well defined; and (4) the frequency of induced micronuclei in polychromatic erythrocytes is dependent upon sampling times. Erythroblasts in bone marrow undergo a final chromosome replication after which they divide and differentiate into polychromatic erythrocytes. Chromosomal breaks or interference in the mitotic process that result in lagging chromosomes during this division lead to the formation of micronuclei that are similar in appearance to, but much smaller than, the nucleus in immature, nucleated erythrocytes. During differentiation, only the nucleus is expelled from the nucleated erythrocyte, leaving behind any micronuclei formed.
The micronucleus assay may be used not only for the detection of acute but also chronic genetic damage. In mice, chromosomal breakage in bone marrow erythroblasts produces an accumulation of micronuclei in normochromatic erythrocytes in peripheral blood and there is little, if any, selective removal of micronucleated cells from circulation. This is not the case with rats, which limits their usefulness in long-term studies using peripheral blood.
2. Experimental Design
(a) Species. Mice or rats are the most frequently used mammals in micronucleus studies using bone marrow. When peripheral blood is used, mice are recommended. However, any appropriate mammalian species may be used provided sufficient historical control data are available. At initiation of the study, the weight variation between animals should not exceed ±20% of the sample mean for each sex.
(b) Housing Conditions. Animals should be obtained from sources that are free of adventitious agents. Animals should be housed in an American Association for the Accreditation of Laboratory Animal Care (AAALAC)-accredited facility with a controlled environment of 23 ± 3°C, 50 ± 20% relative humidity, and a 12-h light/dark cycle. Animals may be individually or group housed by sex. Animals should have free access to a certified chow that has been analyzed for environmental contaminants and to drinking water. Animals should be identified uniquely and acclimated for no fewer than 5 days prior to study initiation.
(c) Treatment and Sampling Time. The maximum volume of liquid administered by gavage or injection should not exceed 2 ml/100 g body weight. Except for irritating or corrosive substances that normally reveal aggravated effects with higher concentrations, variability in test volume should be minimized by adjusting the concentration to ensure a constant volume at all dose levels. Dosing may be performed using a single administration followed by multiple sampling times or by dose administration on 2 consecutive days separated by 24 hr followed by a single sample time. The choice of treatment protocol is usually made on the basis of any pharmacokinetic data on the test substance. The use of a high dose increases the likelihood that a weak clastogen will be detected. In many cases, a higher total dose can be given in two or more treatments than in a single treatment, and in such cases multiple injections may increase the success rate of the assay. No single sampling time is optimal [114]. However, the most frequently used design involves the administration of three concentrations of test article as well as positive and negative (vehicle) controls to male mice, after which bone marrow cells are collected at 24 and 48 hr and examined for the presence of
micronucleated polychromatic erythrocytes. In the event that peripheral blood is the target, samples are taken at least twice, starting no earlier than 36 hr after treatment, with appropriate intervals following treatment. The clastogenic potential of the test article is measured by its ability to increase micronucleated polychromatic erythrocytes in treated animals as compared with vehicle control animals.
(d) Dose Selection. Dose levels to be employed should be selected on the basis of toxicity data but should not exceed 2 g/kg body weight. During the performance of the toxicity study, in addition to the observation of clinical signs and mortality, bone marrow smears may be prepared from all the surviving animals and scored for PCEs/total erythrocyte ratio. If both sexes are being tested, the toxicity profile may be different between males and females and will justify dose selection based on sex. Normally, the high dose should be the maximum tolerated dose as determined according to mortality, bone marrow cell toxicity, treatment-related clinical signs, or 80% of the lethal dose for 50% of the test animals (LD50) up to a limit of 2 g/kg body weight. If data are available to demonstrate that there are no substantial differences in toxicity, pharmacokinetics, or metabolism between sexes, testing of a single sex is sufficient [ 1 la]. Where human exposure to chemicals may be sex-specific, as for example with some pharmaceutical agents, the test should be performed with animals of the appropriate sex. Each treatment group should include a minimum of 5 analyzable animals.
(e) Controls. Concurrent positive and negative (vehicle) controls should be included for each sex tested. Negative controls, consisting of vehicle alone, and otherwise treated in the same way as the treatment groups, should be included for every sampling time. The volume of vehicle administered should be equivalent to that given to test animals. In addition, an untreated control group should be included in the absence of historical control data demonstrating no adverse effect of the vehicle alone. If peripheral blood is analyzed, a pretreatment sample should be acceptable as a concurrent negative control but only for short-term studies. It is advisable to include a concurrent positive control in every experiment to ensure that the assay is performed according to prescribed standards. Convincing positive data may be accepted in the absence of positive control data, but reports of negative assay data unaccompanied by concurrent positive control data are not acceptable. Positive control doses should be chosen so that the effects are clear but do not immediately reveal the identity of the coded slides to the evaluator. The purpose of a positive control is to show that a response can be detected under the conditions of dosing and sample preparation. It is acceptable that the positive control be administered by a route different from the test substance and sampled at only a single time. In addition, chemical class-related
positive control chemicals should be considered, when available. Examples of positive controls and concentrations that are used in mice are ethyl methanesulfonate, 200 mg/kg; ethyl nitrosourea, 25 mg/kg; mitomycin C, 1 mg/kg; and cyclophosphamide, 40 mg/kg. Positive control doses that significantly alter the proportion of PCEs in the bone marrow should be avoided. Accumulation of historical data for the negative and positive controls is recommended. Comparison of the concurrent negative and positive controls with the historical values provides evidence that the assay is within expected limits.
(f) Route of Administration. The preferred route of administration is the route of human exposure. However, to maximize delivery of the test substance to the target tissue, intraperitoneal injection is the most commonly used method of administration.
(g) Bone Marrow Collection. Bone marrow cells are collected from femurs or tibias into a small amount of fetal calf serum (1 to 2 ml) and smears are prepared and stained using standard methods. The primary consideration in preparation of bone marrow smears is to obtain a single layer of cells sufficiently spread to preserve morphology and to facilitate scoring. A random distribution of the cells is achieved by suspending and mixing the marrow in fetal bovine serum. The cells are stained to differentiate PCEs from normochromatic erythrocytes (NCEs). Methods include the use of conventional stains, e.g., Giemsa or May-Gruenwald-Giemsa [113], or DNA-specific stains, such as acridine orange or Hoechst 33258 plus pyronin Y [115], which may eliminate some of the artifacts associated with using non-DNA-specific stains.
(h) End Point Evaluation. The minimum number of cells scored per sample should be chosen to minimize the proportion of zero class samples. At a spontaneous frequency of 0 to 2 micronucleated cells per thousand, at least 2000 cells should be scored per sample. Since the frequency of micronucleated cells among NCEs does not increase as markedly as that among PCEs, it is not necessary to score micronucleated NCEs. However, it may be useful to measure this parameter for purposes of quality control, since artifacts in any given slide will produce apparent increases in the frequencies of micronuclei in both NCEs and PCEs and the incidence of artifacts will generally fail to follow the time course through the erythrocyte subpopulations as expected for the true micronuclei. In addition to the frequency of micronucleated PCEs (mPCEs), the ratio of PCEs to total erythrocytes should be determined. This ratio may be obtained by counting the number of PCEs out of 1000 erythrocytes. A reduction in PCE/total erythrocyte ratio is used to indicate bone marrow toxicity and may be used to document bioavailability of the test chemical to the target tissue. In the absence
of bone marrow toxicity, plasma levels may be required to demonstrate bioavailability of the test substance to the target organ [116].
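The two end points described above reduce to simple per-animal arithmetic. The following sketch assumes hypothetical function names and example counts; it only illustrates how the mPCE frequency (per 1000 PCEs) and the PCE/total erythrocyte ratio might be derived from raw slide counts.

```python
def mpce_per_thousand(mpce_count, pce_scored):
    """Micronucleated PCEs per 1000 PCEs scored for one animal."""
    if pce_scored <= 0:
        raise ValueError("at least one PCE must be scored")
    return 1000.0 * mpce_count / pce_scored


def pce_ratio(pce_count, erythrocytes_counted=1000):
    """PCE/total erythrocyte ratio (PCEs counted among total erythrocytes)."""
    if erythrocytes_counted <= 0:
        raise ValueError("at least one erythrocyte must be counted")
    return pce_count / erythrocytes_counted


def relative_pce_ratio(treated_ratio, control_ratio):
    """Treated PCE ratio as a percentage of the vehicle control ratio."""
    return 100.0 * treated_ratio / control_ratio


if __name__ == "__main__":
    # Hypothetical animal: 4 mPCEs among 2000 PCEs, 480 PCEs per 1000 cells.
    print(mpce_per_thousand(4, 2000))
    print(pce_ratio(480))
    # A value well below 100 would suggest bone marrow toxicity.
    print(relative_pce_ratio(0.36, 0.48))
```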
3. Evaluation Criteria
The criteria for distinguishing positive and negative results should be established in advance. Statistical methods may be used as an aid in evaluating the test results; however, statistics should not be the only criteria for determination of a positive response. Biological relevance of the results should also be considered. Criteria for determining a positive result should include a dose-related increase in mPCEs or a clear increase in mPCEs at the high dose at a single sampling time. A negative result indicates that, under the test conditions, the test substance does not produce mPCEs in the test species.
(a) Data Presentation. Individual animal data are presented in tabular form. The experimental unit should be the animal. The incidence of mPCEs is determined for each animal and treatment group. The number of PCEs scored, the number of mPCEs, and the ratio of PCEs per 1000 erythrocytes should be reported for each animal and summarized for each treatment group by sex and time point.
(b) Statistics. There is little agreement on the distribution of mPCEs, and a variety of tests have been used for statistical analysis, including those based on the binomial, Poisson, and negative binomial distributions [117]. Our laboratory uses the statistical tables developed by Kastenbaum and Bowman [118], which are based on the Poisson distribution and which may be used to test pairwise the statistical difference in the mPCE frequencies between test article-treated and vehicle control groups.
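As a rough illustration of a Poisson-based pairwise comparison, the sketch below uses the standard conditional argument: if two counts are Poisson with equal underlying frequencies, then, given their total, the treated count is binomial with a success probability equal to the treated share of PCEs scored. This is an assumption-laden sketch and is not claimed to reproduce the published tables; the function name and example counts are hypothetical, and pooling counts over animals ignores animal-to-animal variation.

```python
from math import comb


def poisson_pairwise_p(mpce_treated, mpce_control, pce_treated, pce_control):
    """One-sided P value that the treated mPCE frequency exceeds the control.

    Conditional on the total number of mPCEs, the treated count is treated
    as binomial with probability pce_treated / (pce_treated + pce_control).
    """
    total = mpce_treated + mpce_control
    share = pce_treated / (pce_treated + pce_control)
    # Probability of observing mpce_treated or more of the total in the
    # treated group if the two frequencies were equal.
    return sum(
        comb(total, k) * share ** k * (1 - share) ** (total - k)
        for k in range(mpce_treated, total + 1)
    )


if __name__ == "__main__":
    # Hypothetical group totals: 12 vs 2 mPCEs, 10,000 PCEs scored in each.
    print(poisson_pairwise_p(12, 2, 10000, 10000))
```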
(c) Criteria for Valid Test. The mean incidence of mPCEs must not exceed 0.5% (5 mPCEs/1000 PCEs) in the negative (vehicle) control. The incidence of mPCEs in the positive control group must be significantly increased relative to the negative control.
B. Bone Marrow Chromosome Aberration Assay
1. Introduction
The in vivo chromosome aberration assay is used for the detection of certain structural chromosome changes induced by test compounds in mammals, usually rodents. The gross chromosome damage detected in these assays is frequently lethal to the cell during the first few cell cycles after its induction. The induction of these aberrations indicates a potential to induce more subtle (nonlethal) chromosome damage that may be compatible with cell division and which may lead to heritable cytogenetic abnormalities. Chromosome aberrations are evaluated in mitotically arrested metaphase cells, following treatment with the test compound. In principle, metaphase analysis can be performed in any tissue containing dividing cells. Whereas the bone marrow is the most appropriate tissue containing rapidly dividing cells for screening purposes, other cells may be examined when tissue-specific effects are of interest. Some aspects that were mentioned in the micronucleus study design are common to all in vivo tests, such as species selection, solubility, route of administration, dose levels, and controls.
2. Experimental Design
Following the administration of three concentrations of test article as well as positive and negative (vehicle) controls to animals, bone marrow cells are arrested in metaphase using colchicine treatment and are collected for microscopic evaluation. Normally at least two bone marrow collection times are used. The first sampling time is usually 18 hr, which is approximately 1.5 normal cell cycle lengths following treatment. Since cell cycle kinetics can be influenced by the test article, a later sample collection at 42 hr (24 hr after the initial sample time) is also used.
(a) Treatment and Sampling Time. Treatment by a single administration is normally used unless there is a specific reason for doing otherwise. A single injection will, in the majority of cases, provide for maximum sensitivity of the assay. As in the micronucleus test, the volume administered should not exceed 2 ml/100 g body weight. The efficiency of detection of induced aberrations will depend upon the selected cell sampling time for cytological processing. The aberrations are best observed in contracted metaphase chromosomes at the first mitosis after their induction. It is necessary to sample cells at their first mitosis (M1) after treatment to allow for the most accurate measure of the induced aberration frequency. If the cells are scored at M2 or M3, the types of aberrations are mixed and lose identity with the cell stage. Because of failure to divide, acentric fragments can be lost from daughter cells, and chromatid-type aberrations can segregate to give normal and aberrant cells. All of these factors lead to a reduced aberration frequency if first-division metaphase cells are not scored. The selected sample times should be such that cells in different stages of the cell cycle at the time of treatment will be analyzed. Also, induced chromatid-type aberrations are converted into derived chromosome-type aberrations as a consequence of cell division and subsequent DNA replication. The majority of chemical agents induce chromatid-type aberrations during S phase, irrespective of the cell cycle
stage treated. Thus, at least one of the populations analyzed should constitute cells that were in the S phase at the time of treatment [119]. This is of particular importance for agents that have a short period of activity in vivo. For some chemicals, such as benzo[a]pyrene and 2-acetylaminofluorene, which can cause considerable cell cycle delay, the maximum induction of chromosomal damage occurred between 36 and 44 hr after a single dose administration [120]. A cell kinetics study to estimate the cell cycle delay caused by the test article treatment can be performed by implanting BrdUrd tablets subcutaneously at 1 to 2 hr prior to dose administration. Bone marrow is collected at different time points, such as 12 and 24 hr following dosing. Metaphase cells are differentially stained [121] and the proportions of cells in first-division (M1), second-division (M2), and third-or-greater-division (M3) metaphase are determined. The average generation time is estimated based on the proportion of M1, M2, and M3 cells. As long as cells are scored at the first-division metaphase following treatment, the types of aberrations visualized are reflective of the stage of the cell cycle in which they were induced. The types of chromosome aberrations have been extensively described and standardized [122].
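The text does not give the formula used to estimate the average generation time. One commonly cited approach, shown here as an assumption rather than as the authors' method, divides the time spent in BrdUrd by a replication index computed from the M1, M2, and M3 proportions. The function names and example values below are hypothetical.

```python
def replication_index(m1, m2, m3):
    """Mean number of completed replication rounds per metaphase cell.

    m1, m2, m3 are counts (or percentages) of first-, second-, and
    third-or-greater-division metaphases.
    """
    total = m1 + m2 + m3
    if total <= 0:
        raise ValueError("no metaphase cells were classified")
    return (1 * m1 + 2 * m2 + 3 * m3) / total


def average_generation_time(hours_in_brdurd, m1, m2, m3):
    """Estimated average generation time in hours (assumed formula).

    hours_in_brdurd is the interval from BrdUrd implantation to cell
    collection, which includes the 1 to 2 hr before dosing.
    """
    return hours_in_brdurd / replication_index(m1, m2, m3)


if __name__ == "__main__":
    # Hypothetical example: 24 hr in BrdUrd, 20% M1, 60% M2, 20% M3.
    print(average_generation_time(24, 20, 60, 20))
```

Comparing the estimate for treated animals with that for vehicle controls gives a rough measure of test article-induced cell cycle delay.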
(b) Dose Selection. In general, selection of an upper dose level can be based on end points mentioned in the micronucleus assay. In addition to the observation of clinical signs and mortality, bone marrow mitotic index may be considered in dose selection. The high dose should be set at the maximum tolerated dose, based on mortality, bone marrow toxicity, or treatment-related clinical signs. The maximum dose delivered should not exceed 2 g/kg.
(c) Preparation of Metaphase Cells. At an appropriate time prior to sampling, generally 1 to 3 hr, animals are injected with 2 mg/kg colchicine to arrest cells in metaphase. The bone marrow samples are collected, processed by exposing cells to a hypotonic solution, and then fixed. The fixed cells are spread on slides and stained with Giemsa.
3. Evaluation Criteria
The criteria for distinguishing positive and negative results should be established in advance. Statistical methods may be used as an aid in evaluating the test results; however, statistics should not be the only criteria for determination of a positive response. Biological relevance of the results should also be considered. Criteria for determining a positive result should include a dose-related increase in percent cells with aberrations or a clear increase in percent aberrant cells at the high dose at a single sampling time. A negative result indicates that, under the test conditions, the test substance does not produce chromosomal aberrations in the test species.
(a) Data Presentation. The mitotic index should be determined as a measure of cytotoxicity in at least 1000 cells per animal. Metaphase cells containing 2n ± 2 centromeres should be scored from each animal for chromatid-type and chromosome-type aberrations. The mitotic index and the total number and types of aberrations found in each animal should be presented. Gaps are presented in the data but are not included in the total percentage of cells with one or more aberrations or in the average number of aberrations per cell. The percentage of damaged cells in the total population of cells scored is calculated for each treatment group. The severity of damage within the cell is reported as the average number of aberrations per cell for each treatment group. Male and female animals are analyzed separately. Fisher's exact test may be used for pairwise comparisons of the percentage of aberrant cells between each treatment group and the negative control group. The Cochran-Armitage trend test for the percentage of aberrant cells is performed between test article-treated groups and the negative control to test for evidence of dose response.
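The summary values described above can be computed directly from per-cell scoring records. The sketch below is illustrative only: the data layout (one list of aberration labels per scored metaphase), the label strings, and the function names are assumptions made for the example. Gaps are kept in the records but excluded from both summary statistics, as the text specifies.

```python
# Labels treated as gaps (hypothetical naming); gaps are recorded but
# excluded from the percent aberrant cells and aberrations per cell.
GAP_TYPES = {"chromatid gap", "isochromatid gap"}


def mitotic_index(mitotic_cells, cells_counted):
    """Percentage of cells in mitosis among the cells counted."""
    return 100.0 * mitotic_cells / cells_counted


def summarize_cells(cells):
    """Return (percent aberrant cells, mean aberrations per cell).

    cells is a list with one entry per scored metaphase; each entry is a
    list of aberration labels observed in that cell (empty if normal).
    """
    if not cells:
        raise ValueError("no cells were scored")
    structural = [[a for a in cell if a not in GAP_TYPES] for cell in cells]
    aberrant_cells = sum(1 for cell in structural if cell)
    total_aberrations = sum(len(cell) for cell in structural)
    return (100.0 * aberrant_cells / len(cells),
            total_aberrations / len(cells))


if __name__ == "__main__":
    # Hypothetical scoring record for four metaphases from one animal.
    scored = [
        [],
        ["chromatid gap"],
        ["chromatid break"],
        ["dicentric", "chromatid break"],
    ]
    print(summarize_cells(scored))
    print(mitotic_index(45, 1000))
```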
C. Spermatogonial Chromosome Aberration Assay
1. Introduction
The in vivo mammalian spermatogonial chromosome aberration test is conducted to identify those substances that cause structural chromosome aberrations in mitotically dividing mammalian spermatogonial cells [123-125]. Rodent germ cell assays would not normally be conducted for routine screening purposes but may be part of a package of data used in quantitative hazard assessment. In all probability, the clastogenicity of the agents will have been established either in vitro or in vivo using somatic cells prior to germ cell testing. Requirements for the bone marrow cytogenetic assays (see above) apply equally to germ cell studies, but additional technical considerations should be taken into account.
2. Experimental Design
(a) Species. Routinely, mice and rats are used for the spermatogonial metaphase analysis.
(b) Treatment and Sampling Time. It is advisable to use more than one sampling interval because treatment with the test agent may delay the cell cycle. Differentiating spermatogonia divide at 26 to 38 hr. The majority of mitotic cells in testicular preparations are stage B spermatogonia with an average cell cycle
time of 26 hr. Since most clastogens are S-phase dependent, sampling should be performed at 24 and 48 hr after dose administration.
(c) Dose Selection. The maximum tolerated dose (MTD) is usually selected as the high dose for hazard identification. The MTD is defined as the dose that shows signs of toxicity to the animals or gives an indication of cytotoxicity to differentiating spermatogonia. However, in the spermatogonial metaphase analysis, the number of spermatogonial metaphases from treated animals should not be reduced by more than 50% as compared with the vehicle control. The limit dose for nontoxic chemicals should be 2 g/kg body weight. Two additional doses adequately spaced should be employed to establish a dose response.
(d) Preparation of Spermatogonial Metaphases. Animals are treated with 3 to 4 mg colchicine/kg body weight at 2 to 4 hr prior to sacrifice. The number of spermatogonial cells can be enhanced if the testicular cells are dispersed in 0.1% trypsin prior to hypotonic treatment [125].
(e) Cells to Be Scored. For each animal, at least 100 well-spread mitotic metaphases, with complete number of centromeres, are analyzed for structural aberrations with a minimum of 500 per treatment group. Additionally, the ratio of spermatogonial mitotic cells to I and II meiotic cells may be determined in a total sample of 100 dividing cells per animal to establish possible cytotoxic effects.
(f) Scoring Criteria. Cells with 2n ± 2 centromeres are acceptable for scoring. Aberrations should be recorded as described for the bone marrow assay. Numerical abnormalities or polyploids are not scored in this test, which is designed for the assessment of structural chromosome aberrations.
3. Evaluation Criteria
The statistics, criteria for determination of a valid test, and data presentation for the spermatogonial chromosome aberration assay are the same as for the bone marrow assay (see above).
VI. PRIMARY DNA DAMAGE ASSAY SYSTEMS
A. Introduction
The induction of DNA damage by physical or chemical agents in mammalian cells represents the early event(s) that may lead to mutation and/or neoplastic transformation. Therefore, an assessment of the DNA-damaging capacity of a
substance may provide some information on its potential mutagenic and/or carcinogenic activity. Damage to DNA may be detected directly by the measurement of DNA fragments (e.g., using alkaline elution techniques) or indirectly by the measurement of DNA synthesis that occurs in the process of DNA repair. Several methods are available for the estimation of DNA repair, e.g., (1) by chromatographic detection of the rate of disappearance of altered nucleotides [126], (2) by sedimentation of DNA through alkaline sucrose gradients for detection of DNA fragmentation and rejoining [127], and (3) by monitoring the “resynthesis” of short sections of the DNA molecule that are eliminated by endo- and exonuclease enzymes following exposure to exogenous DNA-damaging agents [128]. Monitoring DNA repair synthesis is the most widely used method for assessing DNA-damaging activity. As opposed to the scheduled DNA synthesis that occurs during the normal phase of semiconservative duplication of DNA in the cell cycle, DNA synthetic activity triggered by DNA damage can occur at any phase of the cell cycle and is commonly referred to as “unscheduled DNA synthesis” or UDS. Measurement of UDS can be achieved by tracking the incorporation of BrdUrd or tritiated thymidine into nuclear DNA of repairing cells, although other purine or pyrimidine precursors can also be used [129]. The incorporation of radioactively labeled purine or pyrimidine can be measured by autoradiographic or scintillation counting methods. Repair of DNA damage in mammalian cells, as evidenced by UDS, was first demonstrated by the autoradiographic detection of the uptake of labeled thymidine into the DNA following UV irradiation [128]. A variety of cell types, including fibroblasts and epithelial cells of rodent as well as human origin, have been used as indicator cells for UDS.
The use of fibroblasts, which typically have limited metabolic capability for the biotransformation of chemical compounds, usually entails the inclusion of an exogenous enzyme system to provide for metabolic activation. The use of freshly isolated hepatocytes, which are enzymatically proficient, does not require the inclusion of an exogenous metabolic activation system. Examination of over 100 chemical compounds representing the major groups of carcinogenic substances revealed a good correlation between the carcinogenic activity and the capacity to elicit DNA repair synthesis in cultured mammalian cells [130-135]. A review of the published literature and suggested protocols and evaluation criteria for evaluating UDS were presented in a Working Group Report prepared for the EPA’s Gene-Tox Program [136]. Recommendations for the performance of UDS assays in vitro and in vivo have been presented in subsequent papers [137-139]. This section focuses on the monitoring of UDS by autoradiography because this method is commonly used in the screening of substances for DNA-damaging activity, although some discussion on the use of alkaline elution and scintillation counting techniques will be included.
1. Purpose of Study
The purpose of the UDS assay is to evaluate the potential of the test article to induce unscheduled DNA synthesis in primary rat hepatocyte cultures following in vitro or in vivo administration of the test article.
2. Species/Cell Selection and Justification
Primary hepatocytes should be obtained from young adult (6- to 12-wk-old) Sprague-Dawley or Fischer rats. This test system has been demonstrated to be sensitive to the DNA-damaging activity of a variety of chemicals. The response of hepatocytes from either strain of rat to DNA-damaging agents is comparable. The use of male animals only is based on the fact that the in vivo UDS assay was validated in male rodents and the preponderance of data in the literature is from male animals. The use of hepatocytes from male rats is sufficient unless there is any evidence of significant male/female differences in toxicokinetics [139]. Monitoring unscheduled DNA synthesis in primary cultures of rat hepatocytes presents several advantages over other cell types used to monitor possible interactions between the test article and DNA. First, the target cells possess the ability to metabolize many promutagens/procarcinogens to their active form. Second, rat hepatocytes in culture are nearly 100% nondividing, so no metabolic blocks are needed to inhibit replicative DNA synthesis. Third, the target cells are epithelial in origin. Since most human cancers are carcinomas, an assay using epithelial cells to monitor genetic damage may be more relevant to the in vivo situation than a similar assay using fibroblasts.
3. Maintenance of Cells or Animals
Animals should be obtained from a source monitored for evidence of adventitious agents and are quarantined for no fewer than 5 days prior to dose administration. The animals are observed each working day for signs of illness, unusual food and water consumption, and other general conditions of poor health. All animals must be judged to be healthy prior to utilization in the study. Animals are housed in an AAALAC-accredited facility with a controlled environment of 50 ± 30% relative humidity and 23 ± 3°C with a 12-h light/dark cycle. Rats may be individually or group housed by sex in plastic autoclavable cages. Heat-treated hardwood chips are used for bedding. Animals are provided free access to a certified laboratory rodent chow that has been analyzed for environmental contaminants, and to tap water.
B. Experimental Design
For the in vitro UDS assay, the assay is performed using modifications of the procedures described by Williams [131,140]. Primary rat hepatocytes are exposed to concentrations of the test article as well as positive and negative controls in triplicate cultures. For in vivo UDS studies, the experimental design follows that described by Butterworth et al. [138] and OECD Guideline 486 [141]. Hepatocytes are isolated from male rats at two time points (2 to 4 and 12 to 16 h) following the administration of three concentrations of test article as well as positive and vehicle controls. The harvests at two time points in the in vivo assay are designed to target the peak UDS response elicited by different test articles [142]. Both the in vitro and in vivo UDS assays are evaluated on the basis of incorporation of tritiated thymidine (³H-TdR) into the hepatocyte DNA, presumably as a consequence of DNA repair.
1. Dose Selection
For in vitro UDS studies, selection of dose levels for the UDS assay is based upon toxicity of the test article. Primary rat hepatocytes, plated 90 to 180 min earlier, are exposed to concentrations of test article, the highest concentration not to exceed 5 mg/ml. Approximately 18 to 20 hr after treatment, toxicity is assessed by measuring the amount of lactate dehydrogenase (LDH) that has leaked from the cells into the culture medium relative to the solvent control. Leakage of this enzyme increases with the loss of cell membrane integrity. The treated cultures also are observed microscopically for toxic effects. Whenever possible, the high dose is selected to yield at least 50% toxicity, to a maximum of 5 mg/ml. For freely soluble, nontoxic test articles, the highest concentration is 5 mg/ml. For relatively insoluble, nontoxic test articles, the highest concentration is the lowest insoluble dose in treatment medium but not to exceed 5 mg/ml. If dose-related cytotoxicity is noted, irrespective of solubility, then the top concentration is based on toxicity as described above. In all cases, precipitation is evaluated at the beginning and at the end of the treatment period using the naked eye.
For in vivo UDS studies, selection of dose levels is based on toxicity of the test article but will not exceed 2 g/kg body weight [141]. The high dose for the UDS assay should be the maximum tolerated dose, or that which produces some indication of toxicity, such as reduction in body weight gain, clinical signs of pharmacotoxic effect, or mortality. The LD50 may be selected for the high dose, provided that a sufficient number of animals are likely to survive to the 16-hr postexposure harvest. Two additional dose levels are tested, approximately one-half and one-fourth of the high dose.
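The dose-selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the input layout are hypothetical, and choosing the lowest concentration that reaches 50% toxicity as the in vitro top dose is one reading of the text rather than a rule the text states.

```python
def in_vivo_dose_levels(high_dose_mg_per_kg, limit_mg_per_kg=2000.0):
    """Return (high, mid, low) doses in mg/kg body weight.

    The high dose is capped at the 2 g/kg limit; the two lower doses are
    one-half and one-fourth of the high dose, as described in the text.
    """
    if high_dose_mg_per_kg <= 0:
        raise ValueError("high dose must be positive")
    high = min(high_dose_mg_per_kg, limit_mg_per_kg)
    return (high, high / 2.0, high / 4.0)


def in_vitro_top_concentration(results, limit_mg_per_ml=5.0):
    """Pick the top in vitro concentration from range-finding results.

    results: iterable of (conc_mg_per_ml, percent_toxicity, precipitate_seen).
    Assumption: toxicity takes priority over solubility, and the lowest
    concentration giving at least 50% toxicity is used as the top dose.
    """
    tested = sorted(r for r in results if r[0] <= limit_mg_per_ml)
    toxic = [conc for conc, tox, _ in tested if tox >= 50.0]
    if toxic:
        return min(toxic)
    insoluble = [conc for conc, _, precipitate in tested if precipitate]
    if insoluble:
        return min(insoluble)  # lowest insoluble dose, nontoxic article
    return limit_mg_per_ml     # freely soluble, nontoxic article


print(in_vivo_dose_levels(3000.0))
print(in_vitro_top_concentration([(0.5, 10, False), (1.5, 55, False), (5.0, 90, True)]))
```

In practice the final choice also rests on the microscopic observations and on scientific judgment, which a helper like this cannot capture.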
2. Number of Animals or Cultures
For in vitro UDS studies, hepatocytes from a single rat are sufficient for testing a number of test articles. For in vivo UDS studies, the animals will be assigned to 10 treatment groups of five males each based on equalization of group mean body weights. Only three surviving animals per group are evaluated microscopically for UDS.
For preparation of primary hepatocyte cultures, the rats are anesthetized with metofane and a midventral incision is made to expose the liver. The liver is perfused with 0.5 mM ethyleneglycol-bis(β-aminoethyl ether)-N,N,N',N'-tetraacetic acid (EGTA) solution followed by collagenase solution (80 to 100 units Type I collagenase/ml culture medium). The liver is removed, transected, and shaken in a dilute collagenase solution to release the hepatocytes. The cells are pelleted by centrifugation, resuspended in complete Williams' Minimum Essential Medium (WME buffered with 0.01 M HEPES, supplemented with 2 mM L-glutamine, 50 µg/ml gentamicin, and 10% fetal bovine serum), and approximately 5 × 10⁵ cells are seeded into 35-mm tissue culture dishes containing complete WME.
For the preliminary cytotoxicity assay, cells are seeded into two replicate dishes per dose level, without coverslips. For the UDS assay, cells are seeded into three replicate dishes per dose level, containing 25-mm coverslips. In addition, four cultures are seeded without coverslips for determination of total LDH release: two for treatment with the highest dose of test article and two for treatment with the solvent control. The hepatocyte cultures are maintained in a humidified atmosphere of 5% CO₂ and 37°C.
3. Controls
For in vitro UDS studies, untreated cells are used as the untreated control. The test article solvent (or vehicle) is used as the negative control. For solvents other than water or culture medium, the final concentration in treatment medium should not exceed 1%. For positive control, 7,12-dimethylbenz[a]anthracene (DMBA) at concentrations of 3 and 10 µg/ml is used.
For in vivo UDS studies, the test article vehicle is used as the negative control. Commonly used positive controls, administered via gavage, include methyl methanesulfonate (MMS) and dimethylnitrosamine (DMN) for the 2- to 4-h time point and 2-acetylaminofluorene (2AAF) for the 12- to 16-h time point [138]. However, the response is greatly influenced by the route of administration and the solubility of the positive control in a vehicle compatible with the route of administration. For example, both MMS and DMN are readily miscible with water and they can be administered by intraperitoneal or intravenous injection or by gavage. On the other hand, 2AAF, being insoluble in water and with limited solubility in carboxymethylcellulose or corn oil, can be administered only via gavage to elicit a detectable response. When 2AAF is administered via intraperitoneal injection, the response is very marginal in spite of excessive toxicity [143]. Dimethylnitrosamine, administered via intravenous injection or gavage, can be used as the positive control for the 2- to 4-hr and 12- to 16-hr sacrifices [143].
4. Dose Administration
For in vitro UDS studies, the cells are washed with plating medium at 90 to 180 min after plating, refed with serum-free WME (for the UDS assay, the medium will contain 10 µCi/ml ³H-thymidine), and exposed to chemicals for approximately 18 to 20 h at 37°C. The test article is dissolved or suspended either directly in serum-free WME at the appropriate concentration, or in an appropriate solvent at a 100× concentration. If WME is the solvent, the test article dilutions are prepared directly in the serum-free WME medium. The plating medium is removed from the hepatocyte cultures and replaced with the treatment medium at a rate of 2 ml per dish. If WME is not used as the solvent, the test article dilutions are prepared in the appropriate solvent, and 20 µl of the 100× dosing solutions are added to the treatment medium.
At approximately 18 to 20 hr after treatment, at least one culture from each treatment group is observed microscopically, and a toxicity evaluation of the cultures is made relative to the solvent controls. An aliquot of the medium from two replicate culture dishes per treatment group is removed for measurement of LDH release. In addition, the extra cultures treated with the solvent control and the highest test article dose are lysed with 1% Triton®, and subsequently sampled for LDH release.
Eighteen to 20 hr after exposure, the coverslips containing cells are washed three times in serum-free WME. The cells are swelled in 1% sodium citrate solution and fixed in three changes of ethanol-glacial acetic acid fixative (3:1, v/v). The coverslips are allowed to dry for at least 1 hr before mounting cell side up on glass slides. The slides are labeled with the study number and a code to identify the dose level. The slides are dipped in NTB-2 emulsion (diluted 1:1 in deionized water [H₂O]) at 43 to 45°C, allowed to drain and dry for at least 1.5 hr at room temperature, and are stored for 5 to 12 days at 2 to 8°C in light-tight boxes with a desiccant. Slides are developed in Kodak® D-19 developer (diluted 1:1 in deionized H₂O), fixed in Kodak fixer, and stained with hematoxylin-eosin stain.
For in vivo UDS studies, the oral route is recommended because the data reported in the published literature are predominantly based on chemicals (including chemicals commonly used as positive controls) administered by gavage [e.g., 144,145]. Other routes of administration (e.g., via intravenous injection) may be used if justified. The intraperitoneal route is not recommended in light of reservations expressed that this route could expose the liver directly to the test article rather than exposure via the circulatory system [139]. The test article-vehicle mixture, the vehicle alone, and the positive control are given as single administrations. The rate of administration for the test article-vehicle mixture and vehicle alone is typically 10 ml/kg unless larger volumes, up to 20 ml/kg body weight, are required to deliver the targeted dose.
The isolation and culturing of hepatocytes are performed as described earlier for in vitro UDS studies. Ninety to 180 min after plating, the cells are washed once with complete WME and refed with serum-free WME containing 10 µCi ³H-TdR/ml. Four hours later, the radioactive medium is removed, the cultures washed three times in serum-free WME containing 0.25 mM thymidine, and then refed with serum-free WME containing 0.25 mM thymidine and incubated for 17 to 20 h. The cultures are then processed for autoradiography as described earlier for in vitro UDS studies.
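As a worked check of the in vitro dosing arithmetic described above (20 µl of a 100× dosing solution per 2-ml dish), the following Python sketch computes the stock concentration needed for a target final concentration and the nominal solvent content of the treatment medium. The function names are hypothetical, and the calculation treats the 20-µl spike as negligible relative to the 2-ml medium volume, which is the usual nominal convention but is an assumption here.

```python
def stock_concentration(final_mg_per_ml, fold=100.0):
    """Concentration of the dosing solution needed for a given final level."""
    return final_mg_per_ml * fold


def nominal_solvent_percent(spike_ul=20.0, medium_ul=2000.0):
    """Nominal solvent content (% v/v) when a small spike is added to medium."""
    return 100.0 * spike_ul / medium_ul


# A 5 mg/ml top concentration requires a 500 mg/ml dosing solution,
# and 20 ul in 2 ml corresponds to a nominal 1% v/v solvent level.
print(stock_concentration(5.0))
print(nominal_solvent_percent())
```

The nominal 1% figure matches the solvent limit given for the controls, so larger spike volumes would exceed that limit.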
5. End Point Evaluation and Time of Evaluation
All coded slides are read without knowledge of treatment group. The slides are viewed microscopically under a 100× oil immersion lens. An automated colony counter is interfaced with the microscope so that silver grains within each nucleus and the surrounding cytoplasm can be counted. First the number of grains in a nucleus is counted. Then the number of grains in three nuclear-sized adjacent cytoplasmic areas is counted. Replicative DNA synthesis is evidenced by nuclei completely blackened with grains, and such nuclei should not be counted. Cells exhibiting toxic effects of treatments, such as irregularly shaped or very darkly stained nuclei, also should not be counted. For in vitro UDS studies, a total of 150 nuclei should be scored per dose level. If possible, 50 nuclei are scored from each of three replicate cultures. For in vivo UDS studies, 50 nuclei should be scored from each of three replicate cultures for a total of 150 nuclei from each rat.
C. Evaluation Criteria
1. Data Presentation
A net nuclear grain count is calculated for each nucleus scored by subtracting the mean of the cytoplasmic area counts from the nuclear area count. For each treatment group, a mean net nuclear grain count and standard deviation (SD), as well as the proportion of cells in repair (percentage of nuclei showing ≥5 net nuclear grain counts), are determined and reported.
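The calculation described above can be illustrated with a short Python sketch. The data layout (one nuclear count paired with three cytoplasmic counts per cell) follows the scoring procedure in the text; the function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which SD formula is used.

```python
from statistics import mean, stdev


def net_nuclear_grains(nuclear_count, cytoplasmic_counts):
    """Nuclear grain count minus the mean of the adjacent cytoplasmic counts."""
    return nuclear_count - mean(cytoplasmic_counts)


def summarize_group(cells, repair_threshold=5):
    """Summarize one treatment group.

    cells: list of (nuclear_count, (cyto1, cyto2, cyto3)) tuples.
    Returns the mean net nuclear grain count, its SD (sample SD, assumed),
    and the percentage of nuclei with net counts at or above the threshold.
    """
    nets = [net_nuclear_grains(n, c) for n, c in cells]
    in_repair = sum(1 for x in nets if x >= repair_threshold)
    return {
        "mean_net": mean(nets),
        "sd": stdev(nets) if len(nets) > 1 else 0.0,
        "percent_in_repair": 100.0 * in_repair / len(nets),
    }


# Illustrative (made-up) counts for three nuclei; a real group has 150.
example = [(12, (3, 4, 5)), (6, (5, 6, 7)), (20, (4, 4, 4))]
print(summarize_group(example))
```

Net counts can be negative when cytoplasmic background exceeds the nuclear count, which is why control means are typically close to or below zero.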
2. Criteria for Valid Test
The following criteria must be met for a test to be considered valid. The proportion of cells in repair in the negative controls must be less than 15% and the net nuclear grain count must be less than 1. The mean net nuclear grain count of the positive control must be at least five counts over that of the solvent control.
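These acceptance criteria translate directly into a simple check. The sketch below is illustrative only; the function and parameter names are hypothetical, and it assumes that the negative control and the solvent control are the same group.

```python
def is_valid_test(solvent_mean_net, solvent_percent_in_repair, positive_mean_net):
    """Apply the three validity criteria stated in the text.

    Assumption: the negative control is the solvent (vehicle) control.
    """
    negative_ok = solvent_percent_in_repair < 15.0 and solvent_mean_net < 1.0
    positive_ok = (positive_mean_net - solvent_mean_net) >= 5.0
    return negative_ok and positive_ok


print(is_valid_test(-2.1, 3.0, 14.6))   # criteria met
print(is_valid_test(0.4, 18.0, 14.6))   # too many control cells in repair
```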
3. Positive and Negative Test
All conclusions should be based on sound scientific judgment; however, the following is offered as a guide to interpretation of the data. Any mean net nuclear count that is increased by at least five counts over the solvent control is considered significant [131,140]. A test article is judged positive if it induces a dose-related increase with no less than one dose significantly elevated above the solvent control. A significant increase in the mean net nuclear grain count in at least two successive doses in the absence of a dose response is also considered positive. A significant increase in the net nuclear grain count at one dose level without a dose response is judged equivocal. The test article is considered negative if no significant increase in the net nuclear grain count is observed. The percentage of cells in repair (cells with ≥5 net nuclear grains) is also reported; this information may also be used in making a final evaluation of the activity of the test article.
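The interpretation guide above can be written as a small decision function. This is a sketch, not a substitute for scientific judgment: the names are hypothetical, whether the increase is dose related is supplied by the evaluator (the text defines no trend test), and treating significant but non-successive doses without a dose response as equivocal is an assumption for a case the text does not address.

```python
def classify_uds_result(solvent_mean_net, dose_mean_nets, dose_related):
    """Classify a UDS result as 'positive', 'equivocal', or 'negative'.

    dose_mean_nets: mean net nuclear grain counts ordered by increasing dose.
    dose_related: evaluator's judgment that the increase is dose related.
    """
    # An increase of at least five counts over the solvent control is significant.
    significant = [(m - solvent_mean_net) >= 5.0 for m in dose_mean_nets]
    if not any(significant):
        return "negative"
    if dose_related:
        return "positive"
    successive = any(a and b for a, b in zip(significant, significant[1:]))
    if successive:
        return "positive"
    return "equivocal"  # assumption for non-successive significant doses


print(classify_uds_result(0.5, [1.0, 6.0, 9.0], dose_related=True))
print(classify_uds_result(0.5, [1.0, 6.0, 1.0], dose_related=False))
```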
D. Supplemental Information
Apart from the use of autoradiography, UDS can also be monitored by the liquid scintillation counting (LSC) technique for the detection of incorporated radioactivity [134,136,146]. The procedures for exposure to test article and labeling with ³H-TdR are identical to those for the autoradiographic technique. Following the labeling period, the DNA is extracted. Aliquots of the extracted DNA are used for determination of the DNA content using standard spectrophotometric methods and for the detection of incorporated radioactivity using liquid scintillation counting. The LSC method provides the advantage that the time between exposure and obtaining the results is less than that for the autoradiography approach. However, the LSC method does not provide direct visualization of the cells undergoing UDS. In addition, the LSC method requires more cells, more test article, and more replicate samples than does the autoradiography procedure. The LSC method is potentially prone to interference by the presence of cells undergoing DNA replicative synthesis because of the substantial uptake of ³H-TdR by such cells. Therefore, the LSC methodology is less commonly used than the autoradiography approach.
Although hepatocytes are most commonly used in UDS studies, the DNA-damaging effects of chemicals on germinal tissue can also be studied using the in vivo UDS procedure [147]. This approach has been used to identify the DNA-damaging effect of chemical mutagens in germ cells [148-152].
Another method of studying the DNA-damaging effect of chemical mutagens is the alkaline elution technique [153-155]. This procedure detects DNA damage prior to the onset of UDS. It can be used on both somatic and germinal tissue. Depending on the tissue, the procedure can be used under in vitro and in vivo conditions. Following exposure to the test article, cells are transferred to a filter and lysed under alkaline conditions. Upon passage of an eluting fluid, small DNA fragments will pass through the filter. Larger DNA fragments, depending on their size, may be eluted, while intact undamaged DNA will be retained by the filter. The DNA-damaging activity of a test article is assessed by the quantity of DNA eluted and the speed at which DNA from exposed cells elutes from the filter. In a survey of selected chemicals that are difficult to detect in conventional in vitro genetic toxicology assays (e.g., bacterial mutagenesis, mammalian cell mutagenesis, UDS), the results of in vitro alkaline elution studies on rat hepatocytes correlated well with in vivo carcinogenicity assay data [156]. Using a slight modification of the analytical procedure, the alkaline elution assay can be adapted for the detection of DNA-DNA and DNA-protein cross-links [153,155,157].
REFERENCES
1. International Conference on Harmonisation (ICH) of Technical Requirements for Registration of Pharmaceuticals for Human Use. Genotoxicity: A Standard Battery for Genotoxicity Testing of Pharmaceuticals. S2B document recommended for adoption at step 4 of the ICH process on July 16, 1997. Federal Register 62:16026-16030, November 21, 1997.
2. International Conference on Harmonisation (ICH) of Technical Requirements for Registration of Pharmaceuticals for Human Use. ICH Guidance on Specific Aspects of Regulatory Genotoxicity Tests for Pharmaceuticals. S2A document recommended for adoption at step 4 of the ICH process on July 19, 1995. Federal Register 61:18198-18202, April 24, 1996.
3. ISO 10993, Part 3: Tests for genotoxicity, carcinogenicity and reproductive toxicity. AAMI Standards and Recommended Practices, Volume 4: Biological Evaluation of Medical Devices, Association for the Advancement of Medical Instrumentation, Washington, DC, 1994, p. 31.
4. A. Auletta, K.L. Dearfield and M.C. Cimino, Mutagenicity test schemes and guidelines: U.S. EPA Office of Pollution Prevention and Toxics and Office of Pesticide Programs, Environ. Mutagen. 21:38 (1993).
5. K.L. Dearfield, A.E. Auletta, M.C. Cimino and M.M. Moore, Considerations in the U.S. Environmental Protection Agency's testing approach for mutagenicity, Mutat. Res. 258:259 (1991).
6. Y. Shirasu, The Japanese mutagenicity studies guidelines for pesticide registration, Mutat. Res. 205:393 (1988).
7. B.N. Ames, J. McCann and E. Yamasaki, Methods for detecting carcinogens and mutagens with the Salmonella/mammalian-microsome mutagenicity test, Mutat. Res. 31:347 (1975).
8. E.E. Slater, M.D. Anderson and H.S. Rosenkranz, Rapid detection of mutagens and carcinogens, Cancer Res. 31:970 (1971).
9. B.A. Bridges, Simple bacterial systems for detecting mutagenic agents, Lab. Pract. 21:413 (1972).
10. J. McCann and B.N. Ames, Detection of carcinogens as mutagens in the Salmonella/microsome test: assay of 300 chemicals: discussion, Proc. Natl. Acad. Sci. USA 73:950 (1976).
11. J. McCann, E. Choi, E. Yamasaki and B.N. Ames, Detection of carcinogens as mutagens in the Salmonella/microsome test: assay of 300 chemicals, Proc. Natl. Acad. Sci. USA 72:5135 (1975).
12. J. Ashby, R.W. Tennant, E. Zeiger and S. Stasiewicz, Classification according to chemical structure, mutagenicity to Salmonella and level of carcinogenicity of a further 42 chemicals tested for carcinogenicity by the U.S. National Toxicology Program, Mutation Res. 223:73 (1989).
13. M.H.L. Green, Mechanisms of bacterial mutagenesis and properties of mutagenesis tester strains, Arch. Toxicol. 39:241 (1978).
14. B.N. Ames, F.D. Lee and W.E. Durston, An improved bacterial test system for the detection and classification of mutagens and carcinogens, Proc. Natl. Acad. Sci. USA 70:782 (1973).
15. J. McCann, N.E. Springarn, J. Kobori and B.N. Ames, Detection of carcinogens as mutagens: bacterial tester strains with R factor plasmids, Proc. Natl. Acad. Sci. USA 72:979 (1975).
16. D.M. Maron and B.N. Ames, Revised methods for the Salmonella mutagenicity test, Mutation Res. 213:173 (1983).
17. D.E. Levin, M. Hollstein, M.F. Christman, E.A. Schwiers and B.N. Ames, A new Salmonella tester strain (TA102) with A-T base pairs at the site of mutation detects oxidative mutagens, Proc. Natl. Acad. Sci. USA 79:7445 (1982).
18. P. Wilcox, A. Naidoo, D.J. Wedd and D.G. Gatehouse, Comparison of Salmonella typhimurium TA102 with Escherichia coli WP2 tester strains, Mutagenesis 5:285 (1990).
19. M.H.L. Green and W.J. Muriel, Mutagen testing using trp+ reversion in Escherichia coli, Mutation Res. 38:3 (1976).
20. OECD Guideline 471 (Genetic Toxicology: Bacterial Reverse Mutation Assay), July (1997).
21. V.O. Wagner, III, J.E. Sly, M.L. Klug, T.L. Staton, M.K. Wyman, S. Xiao and R.H.C. San, Practical tips for conducting the Salmonella and E. coli mutagenicity assays under proposed international guidelines, Environ. Mol. Mutagen. 23:70 (1994).
22. V.O. Wagner, III and R.H.C. San, The effect of titer on the response of two Salmonella strains to positive control mutagens, Environ. Mol. Mutagen. 21:75 (1993).
23. B.N. Ames, W.E. Durston, E. Yamasaki and F.D. Lee, Carcinogens are mutagens: a simple test system combining liver homogenates for activation and bacteria for detection, Proc. Natl. Acad. Sci. USA 70:2281 (1973).
24. T. Matsushima, M. Sawamura, K. Hara and T. Sugimura, A safe substitute for polychlorinated biphenyls as an inducer of metabolic activation systems, In vitro metabolic activation in mutagenesis testing (F.J. De Serres, J.R. Fouts, J.R. Bend and R.M. Philpot, Eds.), Elsevier/North-Holland, Amsterdam, 1976, p. 85.
25. T. Ong, M. Mukhtar, C.R. Wolf and E. Zeiger, Differential effects of cytochrome P450-inducers on promutagen activation capabilities and enzymatic activities of S9 from rat liver, J. Environ. Pathol. Toxicol. 4:55 (1980).
26. F.J. De Serres and M.D. Shelby, Recommendations on data production and analysis using Salmonella/microsome mutagenicity assay, Mutat. Res. 64:159 (1979).
27. J. Ashby, The prospects for a simplified and internationally harmonised approach to the detection of possible human carcinogens and mutagens, Mutagenesis 1:3 (1986).
28. S. Venitt, C. Crofton-Sleigh and R. Forster, Bacterial mutation assays using reverse mutation, Mutagenicity testing: a practical approach (S. Venitt and J.M. Parry, Eds.), IRL Press, Oxford (1984), p. 45.
29. H.J. Vogel and D.M. Bonner, Acetylornithinase of E. coli: partial purification and some properties, J. Biol. Chem. 218:97 (1956).
30. C.E. Piper and C.D. Kuzdas, Incorporation of TA97a into a standard Ames test protocol, Environ. Mutagen. 9:85 (1987).
31. D. Gatehouse, S. Haworth, T. Cebula, E. Gocke, L. Kier, T. Matsushima, C. Melcion, T. Nohmi, T. Ohta, S. Venitt and E. Zeiger, Recommendations for the performance of bacterial mutation assays, Mutat. Res. 312:217 (1995).
32. T. Yahagi, M. Nagao, Y. Seino, T. Matsushima, T. Sugimura and M. Okada, Mutagenicities of N-nitrosamines on Salmonella, Mutat. Res. 38:121 (1977).
33. D. Weinstein and T.M. Lewinson, A statistical treatment of the Ames mutagenicity assay, Mutation Res. 51:433 (1978).
34. L. Bernstein, J. Kaldor, J. McCann and M.C. Pike, An empirical approach to the statistical analysis of mutagenesis data from the Salmonella test, Mutat. Res. 97:267 (1982).
35. A.G. Stead, V. Hasselblad, J.P. Creason and L. Claxton, Modeling the Ames test, Mutat. Res. 85:13 (1981).
36. B.H. Margolin, N. Kaplan and E. Zeiger, Statistical analysis of the Ames Salmonella/microsome test, Proc. Natl. Acad. Sci. USA 78:3779 (1981).
37. L.E. Meyers, N.H. Sexton, L.I. Southerland and T.J. Wolff, Regression analysis of Ames test data, Environ. Mutagen. 3:575 (1981).
38. R.D. Snee and J.D. Irr, A procedure for the statistical evaluation of Ames Salmonella assay results: comparison of results among 4 laboratories, Mutat. Res. 128:115 (1984).
39. US Pharmacopeia, XXIII, 88, Biological Reactivity Test, In Vivo (1995).
40. G.R. Blackburn, R.A. Deitch, C.A. Schreiner, M.A. Mehlman and C.R. Mackerer, Estimation of the dermal carcinogenic activity of petroleum fractions using a modified Ames assay, Cell Biol. Toxicol. 1:67 (1984).
41. G.R. Blackburn, R.A. Deitch, C.A. Schreiner and C.R. Mackerer, Predicting carcinogenicity of petroleum distillation fractions using a modified Salmonella mutagenicity assay, Cell Biol. Toxicol. 2:63 (1986).
42. M.J. Prival and V.D. Mitchell, Analysis of a method for testing azo dyes for mutagenic activity in Salmonella typhimurium in the presence of flavin mononucleotide and hamster liver S9, Mutat. Res. 97:103 (1982).
43. L.M. Distlerath, J.C. Loper and C.R. Dey, Aliphatic halogenated hydrocarbons produce volatile Salmonella mutagens, Mutation Res. 136:55 (1984).
44. T.J. Hughes, D.M. Simmons, L.G. Monteith and L.D. Claxton, Vaporization technique to measure activity of volatile organic chemicals in the Ames/Salmonella assay, Environ. Mutagen. 9:421 (1987).
45. V.O. Wagner, III, R.H.C. San and E. Zeiger, Desiccator methodology for Salmonella mutagenicity assay of vapor-phase and gas-phase test materials, Environ. Mol. Mutagen. 19:68 (1992).
46. B.P. Dunn and J.R. Curtis, Clastogenic agents in the urine of coffee drinkers and cigarette smokers, Mutat. Res. 147:179 (1985).
47. N.Y. Kado, D. Langley and E. Eisenstadt, A single modification of the Salmonella liquid-incubation assay increased sensitivity for detecting mutagens in human urine, Mutat. Res. 21:25 (1983).
48. J.C. Cline and R.E. McMahon, Detection of chemical mutagens: use of concentration gradient plates in a high capacity screen, Res. Comm. Chem. Pathol. Pharmacol. 16:523 (1977).
49. B.N. Ames, The detection of chemical mutagens with enteric bacteria, Chemical mutagens: principles and methods for their detection (A. Hollaender, Ed.), Plenum Press, New York, Vol. 1 (1971), p. 267.
50. E.H.Y. Chu and H.V. Malling, Mammalian cell genetics. II. Chemical induction of specific locus mutations in Chinese hamster cells in vitro, Proc. Natl. Acad. Sci. USA 61:1306 (1968).
51. F.T. Kao and T.T. Puck, Genetics of somatic mammalian cells. VII. Induction and isolation of nutritional mutants in Chinese hamster cells, Proc. Natl. Acad. Sci. USA 60:1275 (1968).
52. R.J. Albertini and R. DeMars, Somatic cell mutation detection and quantification of X-ray-induced mutation in cultured, diploid fibroblasts, Mutation Res. 18:199 (1973).
53. D. Clive, W.G. Flamm, M.R. Machesko and N.J. Bernheim, A mutational assay system using the thymidine kinase locus in mouse lymphoma cells, Mutation Res. 16:77 (1972).
54. A.G.A.C. Knaap and J.W.T.M. Simons, A mutational assay system for L5178Y mouse lymphoma cells, using hypoxanthine-guanine phosphoribosyltransferase (HGPRT) deficiency as marker. The occurrence of a long expression time for mutations induced by X-rays and EMS, Mutat. Res. 30:97 (1975).
55. A.W. Hsie, D.A. Casciano, D.B. Couch, D.F. Krahn, J.P. O'Neill and B.L. Whitfield, The use of Chinese hamster ovary cells to quantify specific locus mutation and to determine mutagenicity of chemicals. A report of the Gene-Tox Program, Mutat. Res. 86:193 (1981).
56. K.R. Tindall, L.F. Stankowski, Jr., R. Machanoff and A.W. Hsie, Detection of deletion mutations in pSV2gpt-transformed cells, Mol. Cell. Biol. 4:1411 (1984).
57. K. Sato, R.S. Slesinski and J.W. Littlefield, Chemical mutagenesis at the phosphoribosyltransferase locus in cultured human lymphoblasts, Proc. Natl. Acad. Sci. USA 69:1244 (1972).
58. W.G. Thilly, Chemical mutation in human lymphoblasts, J. Tox. Environ. Health 2:1343 (1977).
59. D. Clive, W.G. Flamm and M.R. Machesko, Mutagenicity of hycanthone in mammalian cells, Mutat. Res. 14:262 (1972).
60. M.M.M. Brown and D. Clive, The utilization of trifluorothymidine as a selective agent for TK-/- mutants in L5178Y mouse lymphoma cells, Mutat. Res. 53:116 (1978).
61. D. Clive, K.O. Johnson, J.F.S. Spector, A.G. Batson and M.M.M. Brown, Validation and characterization of the L5178Y TK+/- mouse lymphoma mutagen assay system, Mutat. Res. 59:61 (1979).
62. M.M.M. Brown and D. Clive, The effect of expression time on the frequency of carcinogen-induced mutants at the TK and HGPRT loci in L5178Y mouse lymphoma cells, Mutation Res. 3:159 (1978).
63. W.G. Thilly, J.G. DeLuca, H. IV Hoppe and B.W. Penman, Phenotypic lag and mutation to 6-thioguanine resistance in diploid human lymphoblasts, Mutat. Res. 50:137 (1978).
64. A.M. Rogers and K.C. Back, Comparative mutagenicity of hydrazine and three methylated derivatives in L5178Y mouse lymphoma cells, Mutat. Res. 89:321 (1981).
65. D.M. DeMarini, H.E. Brockman, F.J. de Serres, H.E. Evans, J.F. Stankowski Jr. and A.W. Hsie, Specific-locus mutations induced in eukaryotes (especially mammalian cells) by radiation and chemicals: a perspective, Mutat. Res. 220:11 (1989).
66. J. Hozier, J. Sawyer, M. Moore, B. Howard and D. Clive, Cytogenetic analysis of the L5178Y tk+/-, tk-/- mouse lymphoma mutagenesis assay system, Mutat. Res. 84:169 (1981).
67. C.A. Kozak and F.H. Ruddle, Assignment of the genes for thymidine kinase and galactokinase to Mus musculus chromosome 11 and the preferential segregation of this chromosome with Chinese hamster/mouse somatic cell hybrids, Somat. Cell Genet. 3:121 (1977).
68. R. Glover and D. Clive, Molecular spectra of L5178Y/TK-/- mutants induced by diverse mutagens, Environ. Mutagen. 14:71 (1989).
69. J.P. O'Neill, P.A. Brimer, R. Machanoff, J.P. Hirsch and A.W. Hsie, A quantitative assay of mutation induction at the hypoxanthine-guanine phosphoribosyltransferase locus in Chinese hamster ovary cells (CHO/HGPRT system): development and definition of the system, Mutat. Res. 45:91 (1977).
70. R. Machanoff, J.P. O'Neill and A.W. Hsie, Quantitative analysis of cytotoxicity and mutagenicity of benzo(a)pyrene in mammalian cells (CHO/HGPRT), Chem. Biol. Interact. 34:1 (1981).
71. A.P. Li, J.H. Carver, W.N. Choy, A.W. Hsie, R.S. Gupta, K.S. Loveday, J.P. O'Neill, J.C. Riddle, L.F. Stankowski and L.L. Yang, A guide for the performance of Chinese hamster ovary cell/hypoxanthine-guanine phosphoribosyl transferase gene mutation assay, Mutat. Res. 189:135 (1987).
72. OECD Guideline for the Testing of Chemicals, Guideline 476 (In Vitro Mammalian Cell Gene Mutation Test), July (1997).
73. D. Clive and J.F.S. Spector, Laboratory procedure for assessing specific locus mutations at the TK locus in cultured L5178Y mouse lymphoma cells, Mutat. Res. 31:17 (1975).
74. M.M. Moore, D. Clive, B.E. Howard, A.G. Batson and N.T. Turner, In situ analysis of trifluorothymidine-resistant (TFT^r) mutants of L5178Y/TK+/- mouse lymphoma cells, Mutat. Res. 151:147 (1985).
75. C.S. Aaron, G. Bolcsfoldi, H.-R. Glatt, M. Moore, Y. Nishi, L. Stankowski, J. Theiss and E. Thompson, Mammalian cell gene mutation assays working group report, Mutat. Res. 312:235 (1994).
76. D. Clive, G. Bolcsfoldi, J. Clements, J. Cole, M. Honma, J. Majeska, M. Moore, L. Muller, B. Myhr, T. Oberly, M. Oudelhkim, C. Rudd, H. Shimada, T. Sofuni, V. Thybaud and P. Wilcox, Consensus agreement regarding protocol issues discussed during the mouse lymphoma workshop: Portland, Oregon, May 7, 1994, Environ. Molec. Mutagen. 25:165 (1995).
77. D. Scott, S.M. Galloway, R.R. Marshall, M. Ishidate, Jr., D. Brusick, J. Ashby and B.C. Myhr, Genotoxicity under extreme culture conditions. A report from ICPEMC Task Group 9, Mutat. Res. 257:147 (1991).
78. J. Cole, C.F. Arlett, M.H.L. Green, J. Lowe and W. Muriel, A comparison of the agar cloning and microtitration techniques for assaying cell survival and mutation frequency in L5178Y mouse lymphoma cells, Mutat. Res. 111:371 (1983).
79. J. Cole, M.J. Muriel and B.A. Bridges, The mutagenicity of sodium fluoride to L5178Y [wild-type and TK+/- (3.7.2c)] mouse lymphoma cells, Mutagenesis 1:157 (1986).
80. J. Clements, M.D. Fellows, P.A. Oxley, J. Greensitt and D.J. Kirkland, Automation of colony sizing in the microwell TK assay and a comparison of data generated using microwell and agar cloning methods, Environ. Molec. Mutagen. 23:10 (1994).
81. J. Clements, M. Fellows, P. Oxley, J. Wilkinson and H. Armstrong, The mouse lymphoma assay (MLA) in microtitre plates: mutant colony size distribution for a selection of mutagens and a comparison of methods for determining cytotoxicity (RS vs RTG), Environ. Molec. Mutagen. 25:9 (1995).
82. T. Sofuni, P. Wilcox, H. Shimada, J. Clements, M. Honma, D. Clive, M. Green, V. Thybaud, R.H.C. San, B.M. Elliott and L. Muller, Mouse Lymphoma Workshop: Victoria, British Columbia, Canada, March 27, 1996. Protocol issues regarding the use of the microwell method of the mouse lymphoma assay, Environ. Molec. Mutagen. 29:434-438 (1997).
83. K.R. Tindall, L.F. Stankowski, Jr., R. Machanoff and A.W. Hsie, Analyses of mutation in pSV2gpt-transformed CHO cells, Mutat. Res. 160:121 (1986).
84. T.D. Tlsty, A. Briot, A. Gualberto, I. Hall, S. Hess, M. Hixon, D. Kuppuswamy, S. Romanov, M. Sage and A. White, Genomic instability and cancer, Mutat. Res. 337:1 (1995).
85. I. de Mitchell, T.R. Lambert, M. Burden, J. Sunderland, R.L. Porter and J.B. Carlton, Is polyploidy an important genotoxic lesion? Mutagenesis 10:79 (1995).
86. N. Schultz and A. Onfelt, Video time-lapse study of mitosis in binucleate V79 cells: chromosome segregation and cleavage, Mutagenesis 9:117 (1994).
87. R.T. Schimke, S.W. Sherwood, A.B. Hill and R.N. Johnston, Over-replication and recombination of DNA in higher eukaryotes: potential consequences and biological implications, Proc. Natl. Acad. Sci. USA 83:2157 (1986).
88. Y. Huang, C. Chang and J.E. Trosko, Aphidicolin-induced endoreduplication in Chinese hamster cells, Cancer Res. 43:1361 (1983).
89. M.J. Aardema, D.P. Gibson and R.A. LeBoeuf, Sodium fluoride-induced chromosome aberrations in different stages of the cell cycle: a proposed mechanism, Mutat. Res. 223:191 (1989).
90. E.B. Hook, Contributions of chromosome abnormalities to human morbidity and mortality and some comments upon surveillance of chromosome mutation rates, Progress in Mutation Research (K.C. Bora, G.R. Douglas and E.R. Nestmann, Eds.), Elsevier, Amsterdam (1982), p. 9.
91. G. Feldman, Liver ploidy, J. Hepatol. 16:7 (1992).
92. R.L. Siegal and G.F. Kalf, DNA polymerase β involvement in DNA endoreduplication in rat giant trophoblast cells, J. Biol. Chem. 257:1785 (1982).
93. S. Sutou and Y. Arai, Possible mechanisms of endoreduplication induction. Membrane fixation and/or disruption of the cytoskeleton, Exp. Cell. Res. 92:15 (1975).
94. R. Thust and B. Bach, Exogenous glutathione induces sister chromatid exchanges, clastogenicity and endoreduplication in V79 Chinese hamster cells, Cell Biol. Toxicol. 1:123 (1985).
95. R.J. Preston, W. Au, M.A. Bender, J.G. Brewen, A.V. Carrano, J.A. Heddle, A.F. McFee, S. Wolff and J.S. Wassom, Mammalian in vivo and in vitro cytogenetic assays: a report of the U.S. EPA's Gene-Tox program, Mutat. Res. 87:143 (1981).
96. D. Kirkland and R.C. Garner, Testing for genotoxicity-chromosomal aberrations in vitro CHO cells or human lymphocytes, Mutat. Res. 289:186 (1987).
97. D. Kirkland, Chromosomal aberration tests in vitro: problems with protocol design and interpretation of results, Mutagenesis 7:95 (1992).
98. D. Scott, N.D. Danford, B.J. Dean and D.J. Kirkland, Metaphase chromosome aberration assays in vitro, Basic Mutagenicity Tests: UKEMS recommended procedures (D.J. Kirkland, Ed.), Cambridge University Press, Cambridge (1990), p. 62.
99. M.
Ishidate. M.C. Harnois and T. Sofuni, A comparative analysis of data on the clastogenicity of 951 chemical substances tested in mammalian cultures, Mutat. Res. 195:151(1988). T. Morita, T. Nagaki, I. FukudaandK.Okumura,Clastogenicity of lowpHto various cultured mammalian cells. Mutat. Res. 268:297 (1992). S.M. Galloway, D.A. Deasy, C . Bean, A.R. Kraynak, M.J. Armstrong and M.O. Bradley, Effects of high osmotic strength on chromosome aberrations, sister-chromatid exchanges and DNA strand breaks, and the relation to toxicity. Mutat. Res. 289: 15 (1987). S.M. Galloway, A.D. Bloom, M. Resnick, B.H. Margolin, F. Nakamura, P. Archer and E. Zeiger, Development of a standard protocol for in vitro cytogenetic testing with Chinese hamster ovary cells: comparison of results for 22 compounds in two laboratories. Environ. Mutagen. 7 1 (1985). S.M. Galloway. M.J. Aardema, M. Ishidate, J.L. Ivett. D.J. Kirkland. T. Morita, P. Mosesso and T. Sofuni, Report from working group in onvitro tests for chromosomal aberrations, Mutation Res. 312:241 (1994).
Genetic Toxicology
191
104. J.L. Ivett and R.R. Tice, Average generation time: a new method of analysis and quantitation of cellular repairative kinetics, Environ. Mutagen. 4:358 (1982). 105. OECD Guideline for the Testing of Chemicals, 473 (In Vitro Mammalian Chromosome Aberration Test), July (1997). 106. D.J. Kirkland, R.R. Marshall, S. McEnaney, J. Bidgood, A. Rutter and S. Mullineux, Arochlor- 1254-induced rat-liver S9 causes chromosomal aberrations in CHO cells but not human lymphocytes. A role for active oxygen? Mutat. Res. 214:115 ( 1 989). 107. J.A. Hubbard, T.M. Brooks, L.P. Gonzalez and J.W. Bridges, Preparation and characterization of S9 fractions,Comyarutive Genetic Toxiciology (J.M. Parry and C.F. Arlett, Eds.), The Macnlillan Press Ltd., Houndmills, Hants (1985), p. 413. 108. B.H. Margolin, M.A. Resnick, J.Y. Rimpo, P. Archer, S.M. Galloway, A.D. Bloom and E. Zieger, Statistical analyses for in vitro cytogenetics assays using Chinese hamster ovary cells. Emirorz. Mutagen. 8:183 (1 986). 109. R.P. Batzinger, E. Bueding, B.S. Reddy and H.J. Weisburger. Formation of a mutagenic drug metabolite by intestinal microorganisms, Cancer Res. 383608 (1978). Mutat. Res. 18:187 110. J.A. Heddle, A rapid in vivo test for chromosomal damage, (1 973). A simple in vivo 111. B.E. Matter and J. Grauwiler, Micronuclei in bone marrow cells. model for the evaluation of drug-induced chromosomal aberrations, Mutut. Res. 23:239 (1974). 112. J.A. Heddle, M. Hite, B. Kirkhart, K. Mavournin, J.T. Macgregor, G.W. Newel1 and M. Salamone, The induction of micronuclei as a measure of genotoxicity. A report of the U.S. Environmental Protection Agency Gene-Tox Program, Mutut. Res. 123:61(1983). 113. M. Hayashi, T. Morita, Y. Kodama, T. Sofuni and M. Ishidate, Jr, The micronucleus assay with peripheral blood reticulocytes using acridine-coated slides,Mutat. Res. 245245 (1990). 114. M. Salamone, J. Heddle, E. Stuart and M. 
Katz, Toward a an improved micronucleus test: studies on 3 model agents, mitomycin C, cyclophosphamide and dimethylbenzanthracene, Mutat. Res. 74:347 (1980). 115. J.T. MacGregor, C.M. Wehr and R.G. Langlois, A simple fluorescence procedure for micronuclei and RNA in eyrthrocytes using Hoechst 33258 and pyronin Y, Mutat. Res. 120269 (1983). 116. G.S. Probst, Validation of target tissue exposure for in vivo tests, Proceedings of the Second Internatiorml Conferewe on Hurtnonisution, Orlando. I993 (P.F. D'Arcy and D.W.G. Harron, Eds.), The Queen's University, Belfast. 1994, p. 252. 117. L. Elder, Statistical methods for short-term tests in genetic toxicology: the first fifteen years, M~ctut.Res. 277:1 1 (1992). 118. M.A. Kastenbaum and K.O. Bowman, Tables for determining the statistical significance of mutation frequencies, Mutat. Res. 9527 (1970). S. Galloway, H. Holden, A.F. McFee and M. Shelby, Mam119. R.J. Preston, B.J. Dean, malian in vivo cytogenetic assays. Analysis of chromosomes aberrations in bone marrow cells, Mutat. Res. 189: 157(1 987). 120. R.R. Tice. M. Hayashi, J.T. MacGregor,D. Anderson, D.H. Blakey, H.E. Holden, M. Kirsch-Volders, F.B. Olson, Jr., F. Pacchierotti, R.J. Preston, F. Romagna, H.
al. 192
121. 122. 123.
124. 125. 126. 127. 128. 129. 130. 131. 132.
133. 134. 135. 136. 137.
et
Putman
Shimada. S. Sutou and B. Vannier. Report from the working group on the in vivo mammalianbonemarrowchromosomalaberrationtest, Mutat.Res. 312:305 ( 1 994). P. Perry and S. Wolff. New Giemsa method for differential staining of sister chromatids, Nature 251: 156 (1974). H.J. Evans, Cytological Methods for Detecting Chemical Mutagens, Clremzical Mutagens, Principles arld Methods for Their Detection (A. Hollaender, Ed.), Plenum Press, New York (1976), Vol. 4, p. 1. I-D. Adler, Clastogenic potential in spermatogonia of chemical mutagens related to their cell-cycle specificities, Gerzetic Toxicologv of Em+orzmental Chemicals: Genetic Efsects and applied Mutagenesis (C. Ramel, J. Lambert, J. Magnusson, Eds.) Alan R. Liss, New York, 1986, p. 474. M. Richold, A. Chandly. J. Ashby. D.G. Gatehouse, J. Bootman and L. Henderson. In vivo cytogenetic assay: analysis of chromosome aberrations in bone marrow cells, Mutat. Res. 189:157 (1990). K. Yamamoto and Y. Kikuchi, A new method for preparation of mammalian spermatogonial chromosomes, Mutat. Res. 52207 (1978). R.B. Setlow and W.L. Carrier, The disappearance of thymine dimers from DNA: an error-correcting mechanism, Proc. Natl. Acnd. Sci.. USA 51:226 (1964). R. McGrath and R.W. Williams, Reconstruction in vivo of irradiated Escherichia coli deoxyribonucleicacid:therejoiningofbrokenpieces, Nature 212:534 ( 1966). R.E. Rasmussen and R.B. Painter, Radiation-stimulated DNA synthesis in cultured mammalian cells, J. Cell Biol. 29: 11 (1966). J.E. Cleaver, DNA repair with purines and pyrimidines in radiation- and carcinogen-damaged normal and Xeroderina pigmentosum human cells, Cancer Res. 33: 362 (1 972). R.H.C. San and H.F. Stich, DNA repair synthesis of cultured human cells as a rapid bioassay for chemical carcinogens, hzt. J. Cancer 16:284 (1975). G.M. Williams, Carcinogen-induced DNA repair in primary rat liver cell cultures, a possible screen for chemical carcinogens, Carzcer Lett. 1:231 (1976). G.M. 
Williams, Detection of chemical carcinogens by unscheduled DNA synthesis in rat liver primary cell cultures, Cancer Res. 37 1845 (1977). G.M. Williams, Further improvementsin the hepatocyte primary culture DNA repair test for carcinogens: detection of carcinogenic biphenyl derivatives, Cancer Lett. 4:69 (1978). C.N. Martin, A.C. McDermid and R.C. Garner, Testing of known carcinogens and noncarcinogens for their ability to induce unscheduled DNA synthesis in HeLa cells, Cancer Res. 38:2621 (1978). H.F. Stich and R.H.C. San, Reduced DNA repair synthesis in Xeroderma pigmentoszm cells exposed to the oncogenic 4-nitroquinoline1-oxide and 4-hydroxayminoquinoline 1-oxide, Mutat. Res. 13:279 (1971). A.D. Mitchell. D.A. Casciano, M.L. Meltz, D.E. Robinson, R.H.C. San, G.M. Williams and E.S. Von Halle, Unscheduled DNA synthesis tests: a report of the “GeneTOX” program,Mutat. Res. 123:363 (1983). B.E. Butterworth, J. Ashby, E. Bermudez, D. Casciano, J. Mirsalis. G. Probst and
Genetic Toxicology
193
G. Williams, A protocol and guide for the in vitro rat hepatocyte DNA-repair assay, Mlltat.Res. 189:113 (1987).
138. B.E. Butterworth, J. Ashby, E. Bermudez, D. Casciano,J. Mirsalis. G. Probst and G. Williams. A protocol and guide for the in vivo rat hepatocyte DNA-repair assay, Mutat.Res. 189:123 (1987). 139. S. Madle. S.W. Dean, U. Andrae, G . Brambilla, B. Burlinson, D.J. Doolittle, C. Furihata. T. Hertner, C.A. McQueen and H. Mori, Recommendations for the performance of UDS tests in vitro and in vivo, Mutat. Res. 312:263 (1994). 140. G.M. Williams, The detection of chemical mutagens/carcinogens by DNA repair and mutagenesis in liver cultures, Chemical Mutagens (F.J. de Serres and A. E. Hollaender, Eds.), Plenum Press, New York, Vol. VI (1979), p. 71. 141. OECD Guideline 486, Unscheduled DNA Synthesis (UDS) Test with Mammalian Cells in Vivo, July (1997). 142. J.C. Mirsalis, K.C. Tyson and B.E. Butterworth, The detection of genotoxic carcinogens in the in vivo-in vitro hepatocyte DNA repair assay, Envirorz. Mutagen. 4: 553 (1982). 143. R.H.C. San, J.E. Sly and H.A. Raabe, Unscheduled DNA synthesis in rat hepatocytes following in vivo administration of dimethylnitrosamine via different routes. Ensirox Molec. Mutngelz. 2758 (1996). 144. J. Mirsalis and B. Butterworth, Detection of unscheduled DNA synthesis in hepatocytes isolated from rats treated with genotoxic agents: an in vivo-in vitro assay for potential mutagens and carcinogens, Carcinogenesis 1:621 (1980). 145. J. Ashby. P.A. Lefevre, B. Burlinson and M.G. Penman, An assessment of the in vivo rat hepatocyte DNA repair assay, Mutut. Res. 156:1 (1 985). 146. M.W. Lieberman, R.N. Baney, R.E. Lee, S. Sell and E. Farber, Studies on DNA repair in human lymphocytes treated with proximate carcinogens and alkylating agents, Cmcer Res. 31:1297 ( 1971). 147. G.A. Sega, Unscheduled DNA synthesis in the germ cells of male mice exposed in vivo to the chemical mutagen ethyl methanesulfonate, Proc. Natl. Acad. Sci., USA 71:4955 (1974). 148. G.A. Sega, J.G. Ownes and R.B. 
Cumming, Studies on DNA repair in early spermatid stages of male mice after in vivo treatment with methyl-, ethyl-, propyl-. and isopropyl methanesulfonate,Mutat. Res. 36:193 (1 976). 149. G.A. Sega, K.W. Wolfe and J.G. Owens, A comparison of the molecular action of an S,1-type methylating agent, methyl nitrosourea, and an SN2-type methylating agent, methyl methanesulfonate, in the germ cells of male mice, Chem.-Biol. Znteract. 33:253 (1981). 150. R.E. Sotomayor, G.A. Sega and R.B. Cumming, Unscheduled DNA synthesis in spermatogenic cellsof mice treated in vivo with the indirect alkylating agents cyclophosphamide and mitomen, Mutat. Res. 50:229 (1978). 151. R.E. Sotomayor. G.A. Sega and R.B. Cununing, An autoradiographic study of unscheduled DNA synthesis in the germ cells of male mice treated with X-rays and methyl methanesulfonate, Mutat. Res. 62:293 (1 979). 152. R.E. Sotomayor, P.S. Chauhan and U.H. Ehling, Induction of unscheduled DNA synthesis in the germ cells of male mice after treatment with hydrazine or procarbazine, Toxicology 25:201 (1982).
194
et
Putman
al.
153.K.W.Kohn.R.A.G.Ewig, L.C. EricksonandL.A.Zwelling,Measurement of strand breaks and cross-links by alkaline elution,Handbook ofDNA Repair Teclzrliques (E. Friedberg andP.C. Hanawalt, Eds.), Marcel Dekker. New York (1980). p. 379. 154. J.A. Skare and K.R. Schrotel, Alkaline elution of rat testicular DNA: detection of DNA strand breaks after in vivo treatment with chemical mutagens, Mutat. Res. 130:283 (1984). 155. J.A.SkareandK.R.Schrotel,Validation of aninvivoalkalineelutionassayto detect DNA damage in rat testicular cells. Emiron. Mutagen. 7563 (1985). 156.J.F.Sina, C.L. Bean,G.R.Dysart,V.I.Taylorand M.O. Bradley,Evaluationof the alkaline elutioldrat hepatocyte assay as a predictorof carcinogenic/mutagenic potential, Mutat. Res. 113:357 (1 983). 157. J.A. Skare and K.R. Schrotel, Alkaline elution of rat testicular DNA: detection of DNA cross-links after in vivo treatment with chemical mutagens,Mutat. Res. 130: 295 (1984).
Developmental and Reproductive Toxicology Kit A. Keller Toxicology Consultant, Washington, D.C.
I. INTRODUCTION
This chapter provides a basic guide for nontoxicologists or nonexperts in developmental and reproductive toxicology (DART) who are responsible for contracting chemical safety studies to outside laboratories and need help in understanding the subject well enough to use the appropriate terminology, review protocols, interpret study results, and understand the implications of specific findings for risk assessment. It is not the purpose of this review to discuss specific methodology or specific agents and their mechanisms of toxicity. For readers interested in those areas of toxicology, the following books are excellent sources for further inquiry [1-5].

One of the most common problems in this field of study is the multiple use and/or misuse of terminology. A glossary of common terms used in DART is presented in the Appendix at the end of the chapter. However, space limitations make it impossible to include a listing of names for even the most common malformations. Recently, an internationally developed glossary on abnormalities has been published [6].

Although the majority of government agencies and guidelines requiring DART testing are identified in this chapter (Tables 1 to 3), it is still recommended that the contract laboratory help in study design. Any good contract laboratory will have carefully reviewed each guideline and will have prepared protocols to meet the minimum specific requirements. Note that some of the guidelines are currently under revision. Testing guidelines for drugs intended for animal use are available but have not been included in this review [14-16].
It is very important that all studies intended for submission to a government agency be run under “Good Laboratory Practice” (GLP) regulations. Conducting DART studies requires experience and expertise. They should be placed only at contract laboratories with good records for conducting these types of studies. If one is not sure about a particular laboratory, a copy of the laboratory’s historical control data for fetal, neonatal, and reproductive indices may be requested. These data should provide a good indication of how many of these studies are performed each year as well as some insight into the quality and consistency of the laboratory’s data.
II. STUDY DESIGN AND STUDY PARAMETERS

Historically, the entire reproductive cycle has been split into three segments for testing purposes. This aids in data interpretation and helps in identification of specific stages that are targets of a particular toxin. The three segments generally cover the period of premating and mating through implantation (segment I), the period from implantation through major organogenesis (segment II), and, finally, late pregnancy and postnatal development (segment III). The corresponding studies are the reproduction and fertility study, the teratology study (or developmental toxicity study), and the perinatal/postnatal study, respectively. For agents anticipated to provide low-level chronic exposure of populations (e.g., environmental pollutants, food additives), complete multigeneration studies over two to three generations are required.

Various combinations of these studies may be acceptable as long as all required parameters are included and minimum study requirements are met. Examples include studies combining segments I and II or combining segments II and III. In some cases (e.g., when test material is in very short supply), it may also be acceptable to append a segment I protocol to a 1- or 3-mo toxicity study by adding a mating period at the end of the study. Sample DART study designs are presented in Figure 1.

Other test systems (both in vivo and in vitro) have been developed and used in preliminary, prescreening, or priority selection [17-23]. However, these screens are not considered substitutes for definitive DART testing by any government agency. In addition, the standard studies are not designed to study specific targets or possible mechanism(s) of toxicity. This is quite evident when one compares some of the possible targets in the male or female reproductive system, which are much more numerous than the standard parameters actually measured in these studies (Table 4). For this reason, a few possible add-on and/or follow-up methods or studies are included in Section III of this chapter that may be useful if further investigation or characterization of a test material is needed. These suggestions are by no means all-inclusive. Two methodology books on the male and female reproductive systems edited by Chapin and Heindel [2,3], as well as a book on approaches to mechanistic studies in developmental toxicology [24], are also excellent sources.

Figure 1 Sample DART study designs.
A. Study Basics
1. Test Material and Vehicle

The material to be tested should be the “technical grade” material or bulk chemical of the active ingredient. The characteristics of the test material (i.e., purity, composition, etc.) must be documented. If a vehicle is necessary for administration of the test material, then the choice of vehicle should be appropriate for the delivery of the test compound, should not interfere with absorption of the test material, and should not induce maternal or developmental toxicity. A vehicle control group should be part of the study design. In addition, a nontreated or sham control may be necessary when using a vehicle of unknown toxicity. Some of the most commonly used vehicles include water, powdered diet, hydroxymethylcellulose, corn oil, and saline.

Test material-vehicle mixtures should be analyzed periodically during a study to verify concentration. In the relatively short treatment periods of segment I, II, and III studies, analysis at the beginning and the end of treatment is sufficient. For longer multigeneration studies, additional analyses should be added. If the dosing preparation is in diet or is a suspension, homogeneity of the test preparation must also be verified at least once.
2. Test System

(a) Species Selection. There is no “perfect” animal model in DART testing. Indeed, as one can see in Table 5, commonly used laboratory species can vary widely in male and female reproductive function as well as in timing and development of offspring. There are many good reviews in the published literature discussing species selection for both developmental and reproductive toxicity studies [7,25-27]. Each species has its benefits and its problems, depending on which particular end points are of importance or what type of agent is being tested. It is very important that an investigator understand the animal model and use this knowledge in interpretation and extrapolation to human risk assessment.

Many guidelines suggest that one select a species whose pharmacokinetics and metabolism most closely resemble those seen in humans. However, in the real world, data of this sort are rarely available at the time that DART studies are initiated. A common assumption that nonhuman primates should be the closest model to humans with regard to pharmacokinetics is simply not true. More often than not, pharmacokinetics and toxicokinetics in primates can differ from humans as much as in any other species. For all of these reasons, it is highly recommended that studies to be submitted to a government agency use only the most commonly used species, the rat and rabbit, unless an expert has been consulted. Thus, one can expect to use the rat in segment I, II, and III protocols, as well as in multigeneration studies. Rabbits are used only in segment II studies, as the required second, nonrodent species.

(b) Animal Source and Age. Animals should come from suppliers that can guarantee the health and lineage of the animals. This means animals are often “cesarean-derived,” usually from outbred colonies such as Sprague-Dawley or Wistar rats or New Zealand White or Dutch Belted rabbits. It is beneficial to use the same strain of rat as used in other toxicological studies, if possible, since previous data will help in dosage selection.

Animals should be young but mature adults at the time of mating, and the females should be virgin. Age of sexual maturity will depend on the strain of animal selected. Generally, rats should be at least 12 wk old and rabbits at least 5 mo old at the time of mating. Using animals that are not fully mature will increase the variability of reproductive parameters, reduce the sensitivity of the study to identify adverse events, and, in some cases, completely compromise an agency’s acceptance of the study. A good study shows a pregnancy rate greater than 85 to 90% in the controls.

Animals to be used in segment I studies should be acclimated to the study room for at least 2 wk prior to initiation of the study. This will allow time for the estrous cycle, which is often irregular following the stress of shipping, to normalize. Many suppliers will mate or artificially inseminate the animals (“timed pregnant animals”) for use in segment II or III studies. This is especially useful in laboratories that do not want to spend the time or expense of maintaining a colony of breeder males. In these cases, animals will be placed on study as soon as they arrive.
(c) Number of Animals. The number of rats and/or rabbits used in a study varies according to study type and guideline, and is generally around 20 to 30 per group for rats and rabbits (three treated groups and one vehicle control group). Many guidelines give only the required “minimum” number of pregnancies for an acceptable study. It is up to the investigator(s) to decide how many animals they will need to use per group to reach this number at study termination. It is also becoming more common to add additional animals (e.g., 4 or 5/sex/group) to a study to serve as satellite animals for evaluation of drug blood levels (see section on toxicokinetics below). The supplier’s and/or laboratory’s historical control pregnancy rate for the species should be taken into account when deciding group numbers.

The goal is to have sufficient numbers of pregnant animals to make accurate interpretations and maintain consistency among studies. The International Conference on Harmonization (ICH) guidelines for testing of human drugs [7] state that “Below 16 litters per group, between study results become inconsistent, above 20 to 24 litters per group, consistency and precision are not greatly enhanced.” Other, older guidelines list 12 as the minimum number of litters needed in a segment II study. In multigeneration studies, the starting group size for the first generation (F0) may need to be larger than in other studies to allow for natural losses and still attain a sufficient number of litters for evaluation of an F2 or even F3 generation. In studies with larger animals, such as primates or dogs, smaller group sizes are more common and accepted.

Animals used in the study should be individually identified (e.g., tattoo, ear punch, or biochip implant) and should be assigned to dosage groups by some appropriate randomization method based on body weight. The treated and control groups are always run concurrently. However, it is not unusual in segment II studies to stagger the day of pregnancy over a few days so that one has, for example, one-fourth of each group’s animals starting on one day, one-fourth more starting on the next day, and so on. This is acceptable to the agencies as long as the stagger evenly distributes animals among each group and each day. It is important not to stagger the animals over too long a period of time. Staggering allows more time on the day of cesarean section for completion of all of the uterine and fetal examinations.
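The group-size reasoning above can be sketched as a small calculation. This is a minimal illustration only: the function name and the simple expected-value approach are assumptions made for this example, not a method prescribed by any guideline. It sizes a group so that the expected number of pregnancies reaches the target number of litters, and it does not model chance variation or maternal losses during the study.

```python
import math


def animals_per_group(target_litters: int, pregnancy_rate: float) -> int:
    """Smallest group size whose expected number of pregnancies
    (group size x historical pregnancy rate) reaches the target.

    Hypothetical helper for illustration; it uses an expected value only
    and adds no safety margin for deaths, abortions, or total resorptions.
    """
    if target_litters <= 0:
        raise ValueError("target_litters must be positive")
    if not 0.0 < pregnancy_rate <= 1.0:
        raise ValueError("pregnancy_rate must be in (0, 1]")
    return math.ceil(target_litters / pregnancy_rate)


# Example: aiming for 20 litters with an 85% historical control pregnancy rate.
group_size = animals_per_group(20, 0.85)
```

In practice an investigator would likely add a margin on top of this figure, since an expected value of 20 litters still leaves a substantial chance of ending the study with fewer.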
3. Dosage Selection

Selection of dosages is one of the most critical issues in the design of DART studies and one of the most difficult tasks. In addition, expectations with regard to dosage selection and desired outcome can differ among agencies due to their differing approaches to risk assessment [28-32]. Dosage selection generally requires quite a bit of experience and is very much test material-specific. However, there are a few basic rules that should be kept in mind.

One generally wants to see some overt toxicity at the highest dosage level or, in the case of some drugs, a maximally tolerated pharmacological effect. This can be in the form of toxicologically significant (i.e., >10%) reductions in body weight and/or food consumption. Lethality should be kept low (usually stated as below 10%) to ensure adequate numbers for evaluation. If necessary, clinical chemistry parameters, organ weights, and/or histopathology can also be incorporated into the study to document sufficient toxicity. This high dose should also not result in excessive abortion or death of the embryo or fetus, again to ensure adequate numbers for evaluation. However, while excessive toxicity at the high dose can create difficulties in data interpretation, in reality there is more danger of study rejection by the agency if the highest dosage level is too low (i.e., without significant toxicity). If an agent is simply not very toxic, the maximum dosage level stipulated in the guideline, termed the “limit dose” (i.e., 1 g/kg, 1% or 5% of diet, 5 mg/l), can be used. In some instances, pharmacokinetics can be used for dosage justification when increasing dosages administered do not produce increasing blood or tissue levels of a test substance.

The lowest dosage level should not induce toxicity. For drugs, the lowest dosage level is usually selected based on either the clinically intended pharmacological effect or as a multiple of the anticipated human exposure. The intermediate dose is usually logarithmically placed between the high and low dosage and ideally should induce some minimal observable toxic effect.

Dosage selection for segment I or multigeneration studies can be based on data collected from previous subchronic toxicity studies in the rat. Using similar dosage levels also has the added benefit of allowing direct comparison of dosage levels producing reproductive toxicity with those producing systemic toxicity. If the guideline requires that maternal treatment be continued through gestation and that offspring be evaluated, then offspring survival must also be taken into account (see below). In general, dosages used in longer-term DART studies are lower than those selected for segment II and segment III studies.

Dosage selection for the segment II studies in rat and rabbit is usually based on small non-GLP dosage range-finding (RF) studies in pregnant animals. These RF studies are necessary for identifying dosage levels with sufficient maternal toxicity but minimal embryonic/fetal loss. They should also flag possible differences in pharmacokinetics between pregnant and nonpregnant animals. There are no set rules to follow in the design of these studies, but most agencies are now requiring laboratories to report all findings from them. Interpretation of RF study results outside of their use as a guide in dosage selection is problematic and full of pitfalls. This is due to the small number of animals per group and the often excessive maternal toxicity. Thus, one needs to be careful that the study is conducted with some forethought.

Most often these studies are conducted with three to five animals per group (plus satellite animals for blood level determination) in four or five treated groups and one vehicle control group. The highest dosage can be set as high as the LD10 (dose lethal to 10% of the animals), as determined in an acute study. The treatment period should be the same as that to be used in the definitive segment II study. Evaluation of uterine contents can be made at gestation day 13 (determining viable and nonviable implants) or near term (gestation day 20). Some investigators prefer to sacrifice animals at gestation day 20, weigh the gravid uterus, and determine viability of the implants (by fetal movement) without opening the uterus. Estimates of fetal weights are made by dividing the gravid uterine weight by the number of full-term fetuses. This allows one to get a general idea about dosages inducing maternal toxicity, embryo or fetal mortality, as well as possible effects on fetal growth. Alternatively, fetuses can be removed, individually weighed, and evaluated for gross external, visceral, and/or skeletal morphology. However, many investigators do not feel that it is worth the time and expense of doing this in an RF study.

Dosages selected for the segment III study are most often the same as those used in the segment II rat study.
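Two of the calculations described in this section can be illustrated with a short sketch. It assumes that “logarithmically placed” means the midpoint on a log scale (the geometric mean of the low and high dosages), which is one common reading but is not spelled out in the text. The function names are hypothetical and the numbers are invented for illustration.

```python
import math


def intermediate_dose(low: float, high: float) -> float:
    """Midpoint between low and high dose on a logarithmic scale
    (assumed here to be the geometric mean)."""
    if low <= 0 or high <= 0:
        raise ValueError("doses must be positive")
    return math.sqrt(low * high)


def estimated_mean_fetal_weight(gravid_uterus_g: float, n_fetuses: int) -> float:
    """Range-finding estimate: gravid uterine weight divided by the number
    of full-term fetuses. The gravid uterus also contains placentae,
    membranes, and fluid, so this is a rough index, not a true fetal weight."""
    if n_fetuses <= 0:
        raise ValueError("n_fetuses must be positive")
    return gravid_uterus_g / n_fetuses


# Illustrative values only: 10 and 1000 mg/kg/day as low and high dosages.
mid_dose = intermediate_dose(10.0, 1000.0)          # 100 mg/kg/day
est_weight = estimated_mean_fetal_weight(78.0, 13)  # 6 g per fetus
```

Because the uterine estimate carries the same bias in every group, it remains useful for comparing treated groups with the concurrent control even though it overstates individual fetal weight.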
4.
RouteandFrequency
of Administration
The route of administration shouldbe similar to expected human exposure. Alternative routes may be acceptable if it canbe shown that similar systemic exposures are achieved. Oral gavage is the most common methodof administration in segment I, 11, and I11 studies. Addition of test material to the diet or drinking water is the most commonly recommended routein multigeneration tests. Methods have been developed for most routes, from inhalation to sublingual to intravenous infusion. These can vary widely in cost, labor, and accuracy. and not all contract laboratories will have the expertise needed, especially on DART studies. Most DART studies are conducted using single daily dosing. Dosing should be performed at approximately the same time each day. If the test material has a very short half-life in the animal, administration may have to be increased to twice (b.i.d.) or even three times a day (t.i.d.) to ensure reasonable exposures. In some circumstances where atest material is known to have cumulative effects or is associated with rapid development of tolerance, one may have to dose less frequently or for a shorter treatment period. It is also possible to conduct the DART review using more than three segments (i.e., six to eight segments), if necessary. It isnot unusual in drug development to have an agent being developed for single use (e.g., diagnostics or surgical medicines such anesthetics) where it may not be possible to administer repeated dosages for the required length of time in the guidelines. In such cases. reducingthe length of the treatment period but using a higher dosage may be more appropriate than administration over the required treatment period at a much reduced dosage. The required periodof treatment is dependenton the specific guideline (see Tables 1 to 3). In segment I studies, the treatment period begins prior to mating. 
Historically, a longer period of premating treatment (i.e., 56 to 80 days) was thought to be necessary in males to extend treatment over at least one full cycle of spermatogenesis. However, recent prospective and retrospective analyses of reproductive data [33,34] reveal that the longer exposure period is not necessary in most standard studies in order to correctly flag male reproductive toxins. Thus, for males, premating exposure can be anywhere from 80 days in older guidelines to 30 days in the newer ICH guidelines. Administration to the males is generally continued until they are sacrificed. The females in a segment I study are dosed for at least 14 days prior to mating. The end of the treatment period in the female is dependent upon the study design and can range from gestation day 6 (when implantation occurs) to the end of the lactation period (postnatal day 21).
In segment II studies, pregnant females are treated during what is commonly referred to as the major period of organogenesis. This includes the period from implantation to closure of the hard palate (roof of the mouth). This translates into gestation days 6 to 15 in the rat and gestation days 6 to 18 in the rabbit. Although most of the major organ systems are formed during this period, organogenesis actually extends well into the postnatal period, particularly development of the central nervous system.
The treatment period in segment III studies usually extends at least from closure of the hard palate (gestation day 15) until the end of the lactation period (postnatal day 21). Some investigators extend this to start treatment as early as gestation day 6.
5. Toxicokinetics
Toxicokinetics simply refers to pharmacokinetic data collected from or to support a toxicology study. It is used primarily to document systemic exposure levels in a study and aids in extrapolation of the study results to other animal studies and to risk assessment in humans. One may also come across the term “pharmacodynamics,” which generally refers to the relationship between drug concentration and effect. Toxicokinetic data are usually expressed in terms of the AUC, Cmax, and t1/2. The AUC, or area under the curve, is an expression of total exposure (i.e., the blood concentration of an agent over a specified time period, in ng·hr/ml); Cmax refers to the peak concentration measured in the blood. The t1/2 refers to the time it takes for half of the test material to clear from the system and is therefore a measure of duration of exposure.
Most agencies do not require collection of toxicokinetic data at this time, although there is a growing trend to do so. There are no hard-and-fast rules regarding the study day on which samples should be taken. It is suggested that satellite animals be used for this purpose rather than the main study animals because multiple sampling times result in large amounts of blood withdrawal, which can confound study results and interpretation. Timing of the samples on the selected day is very much dependent on compound and route of administration but usually entails at least four samples anywhere from 5 min out to 24 hr postadministration. In the segment I study, the preferred sampling time for blood collection is just prior to mating from both the F0 males and females. Selection of the best sampling day for segment II studies is less clear due to the constantly changing system. Sample collection near the end of the treatment period appears to be the most appropriate time, in most cases. Most often, toxicokinetic data are not collected in segment III studies. Rather, data are collected in a separately run pharmacokinetic study to investigate transfer of test material into milk, often using radiolabeled test material. The data on test material transfer and concentration in milk are very important in the risk assessment of exposure to nursing women and their infants.
Test material concentration may also be measured in fetal blood (or amniotic fluid) to assess levels of placental transfer and possible accumulation. This technique requires quite a bit of expertise and is most easily conducted on term fetuses due to the small blood volumes. If necessary, blood from an entire litter may be pooled.
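As an illustration of the three toxicokinetic parameters described in this section, the following is a minimal sketch that derives AUC, Cmax, and t1/2 from one animal's concentration-time profile. The function names, the sampling times, and the concentrations are hypothetical, chosen only for the example. The linear trapezoidal rule and the log-linear fit over the last three samples are common conventions assumed here, not methods prescribed by any guideline cited in this chapter.

```python
import math


def auc_trapezoid(times_hr, conc_ng_ml):
    """Total exposure (ng*hr/ml) by the linear trapezoidal rule."""
    points = list(zip(times_hr, conc_ng_ml))
    return sum((t1 - t0) * (c0 + c1) / 2.0
               for (t0, c0), (t1, c1) in zip(points, points[1:]))


def cmax(conc_ng_ml):
    """Peak measured concentration (ng/ml)."""
    return max(conc_ng_ml)


def half_life(times_hr, conc_ng_ml, n_terminal=3):
    """Estimate t1/2 (hr) from a least-squares line fitted to
    ln(concentration) versus time over the last n_terminal samples.
    Assumes first-order elimination and nonzero concentrations."""
    ts = times_hr[-n_terminal:]
    ys = [math.log(c) for c in conc_ng_ml[-n_terminal:]]
    t_mean = sum(ts) / len(ts)
    y_mean = sum(ys) / len(ys)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
             / sum((t - t_mean) ** 2 for t in ts))
    return math.log(2) / -slope


# Hypothetical profile: six samples from 0.5 to 8 hr after dosing.
times = [0.5, 1, 2, 4, 6, 8]
conc = [40, 100, 80, 40, 20, 10]

print("AUC (ng*hr/ml):", auc_trapezoid(times, conc))
print("Cmax (ng/ml):", cmax(conc))
print("t1/2 (hr):", round(half_life(times, conc), 2))
```

In this sketch the AUC covers only the sampled interval; extrapolation beyond the last sample is deliberately left out.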
B. Parameters to Be Measured
1. General Observations
There are two basic types of observations. The first is a daily (sometimes twice daily) check for signs of morbidity, mortality, abortion, or early delivery. The second type of observation is a much more detailed assessment in which the animals are removed from their cages and observed for any physical or behavioral abnormalities (clinical signs of toxicity). The detailed observation is usually conducted at a selected time after dosage administration. All findings are recorded, including those that may not be considered treatment-related. Possible observations are numerous and can range from localized hair loss, excessive salivation, alterations in respiration, and abnormal gaits to tremors or convulsions.
2. Body Weights and Food Consumption
Body weights are recorded daily, twice weekly, or weekly depending on the guideline and on the reproductive phase (i.e., premating, gestation, lactation). Body weights during the premating and mating period are usually taken weekly or twice weekly. During the gestation period, body weights are recorded either daily or at intervals (e.g., gestation days 0, 6, 9, 12, 15, 18, and 20). Maternal and pup body weights in the postnatal period are usually taken weekly (e.g., postnatal days 0, 7, 14, and 21). Food and, in some cases, water consumption are generally measured at the same intervals as body weights.
3. Vaginal Smears
Vaginal smears can be used for two purposes: to monitor the estrous cycle and to verify mating. Daily vaginal smears allow determination of the cycle stage (proestrus, estrus, metestrus, and diestrus) based on vaginal cytology [35]. The estrous cycle in rats normally lasts 4 to 5 days. Chemicals capable of disrupting the pituitary-hypothalamus-gonadal axis often affect this cycle. For example, chemicals with estrogenic properties will induce the system into prolonged estrus. The method and experience of the technician taking the vaginal smear are important because it is easy to put the female into pseudopregnancy (i.e., prolonged diestrus) with overt vaginal stimulation in rodents. It also takes an experienced technician to interpret the stages properly. Slides can be fixed and stained if a permanent record is needed. Although not required, it is recommended that the estrous cycle be monitored for at least 10 days pretreatment in a segment I study to ensure that the females placed on study are cycling normally. Females continue to be monitored during the 3-wk treatment period until sperm-positive during the cohabitation phase. It may be necessary to allow a nontreated recovery period for females who are not cycling before initiating mating. Animal mating is verified by the presence of a vaginal plug or sperm in the vaginal smear (see below).
4. Cohabitation
The female should always be placed into the male’s cage and not vice versa. The rats are usually placed together in the late afternoon and left together overnight. The preferred mating ratio is 1:1 because it is more likely to result in good pregnancy rates and for the ease it allows in compiling and interpreting data. Some guidelines do allow mating two females with one male. Each morning the technician looks for signs of a vaginal plug (coagulated mass of semen), either within the vagina or discharged and found on the bottom of the cage. If a plug is not found, a vaginal smear should be taken to examine for vaginal sperm. The mating period allowed varies from 2 to 3 wk. Most rats will mate within the first 5 days of cohabitation (i.e., at the first available estrus) if cycling normally. In some cases, females may become pseudopregnant. The longer mating period allows time for these females to restart estrous cycling and hopefully become pregnant. In instances where it appears that fertility problems may be due to a dysfunction of the male system, the mating period usually allows time for an unmated female to be placed with a second proven male from the same test group. The day that mating is verified is usually considered day 0 of gestation. Records should be made of all cohabitation and matings. The time between initial cohabitation and mating is termed the copulatory interval.
5. Female Sacrifice
(a) Gestation Day 13 Uterine Examination. Many investigators prefer to evaluate pregnancy status in an ICH Segment I study at gestation day 13. However, other guidelines may require evaluation on gestation day 20 or 21 (see below). At gestation day 13, the implants should be identified by position within the uterine horns and recorded as simply viable or nonviable. This may be accomplished without opening the uterus. The uterus may be opened and the implants examined for abnormalities, but this requires an expertise that many laboratories do not have. The number of corpora lutea on each ovary is also recorded. The female should be subjected to a gross necropsy and the reproductive organs, and other tissues as required, should be weighed and saved for possible histopathology.
(b) Term Cesarean Section. In term sacrifices (gestation day 20 or 21 in the rat, or gestation day 29 in the rabbit), the uterus is removed, weighed (termed “gravid uterine weight”), and then opened. The number and uterine positions of any early and late resorptions and viable and dead fetuses are recorded. Early and late resorptions are differentiated by the extent of autolysis (i.e., to what degree fetal features are recognizable). The individual term fetuses are carefully removed from their membranes, the umbilical cord cut, and further examined as described below. The placentas are weighed and examined for any abnormalities and saved if required. The number of corpora lutea on each ovary is recorded. Again, the female should be subjected to a gross necropsy and the reproductive organs, and other tissues as required, should be weighed and saved for possible histopathology. For apparently “nonpregnant” rats (but not rabbits), ammonium sulfide staining of the uterus can be used to identify peri-implantation deaths that are otherwise not visible upon gross examination.
(c) Sacrifice at End of Lactation Period. In Segment III and multigeneration studies, dams are usually sacrificed after weaning of the offspring on postnatal day 21. The female should be subjected to a gross necropsy and the reproductive organs, and other tissues as required, should be saved for possible histopathology.
6. Fetal Evaluations
(a) At Necropsy. Fetuses should be individually identified so that there is a record of both what litter they are from and in what position they were found in the uterus. Most laboratories do this by means of a stringed tag around the neck. Each fetus is sexed (verified internally at visceral examination) and weighed. The crown-rump length can also be recorded; however, this can be time-consuming and is not required. Each fetus is then externally examined for abnormalities. Only dead fetuses that do not show any significant degree of maceration should be included in the external and subsequent examinations because of the frequent distortions and artifacts encountered in such specimens.
Observed abnormalities are commonly classified into malformations and developmental variations. There is no such thing as a “clean” study: all studies will include at least some incidence of variations. There are no standards for these classifications and minor differences among laboratories can be expected. Malformations can be defined as “those structural anomalies that alter general body conformity, disrupt or interfere with body function, or are generally thought to be incompatible with life” [31]. On the other hand, developmental variations are defined as “those alterations in anatomic structure that are considered to have no significant biological effect on animal health or body conformity, representing slight deviations from normal.” Some laboratories include variations in degree of ossification in this category; however, there is a trend to classify variations in ossification separately, as they are generally related to small differences between litters in time of conception and implantation or, in many cases, related to intrauterine growth. Some guidelines require photographic records of all malformations and representative variations on a study. It is recommended that this be done for all studies.
Once the external examination is complete, the fetuses are euthanized. The method of fetal sacrifice can affect the quality of the fetal tissue in subsequent evaluation. For example, intraperitoneal injection of drugs may cause a breakdown in abdominal tissues. The most commonly used method of sacrifice is carbon dioxide (CO2) asphyxiation or immersion. In rat studies, either one-half or one-third of the fetuses are subjected to visceral examination for anomalies. The remaining fetuses are fixed and processed for subsequent skeletal examination. In rabbit studies, all the fetuses undergo both visceral and skeletal examinations.
(b) Visceral Examination. Examination of the viscera can be conducted using a variety of methods. Rabbit fetuses are most often evaluated on the day of necropsy as fresh specimens using a necropsy method known as the Staples’ method [36-38]. This examination includes not only evaluation of abdominal and thoracic organs but also specific cuts into the brain and heart. Because many laboratories do not have the manpower to conduct all of these evaluations in a single day, the Staples’ method has also been used on ethanol and Bouin’s fluid-fixed specimens. However, fixation can leave the tissues less manageable to manipulate for a proper examination. Historically, Bouin’s fluid-fixed fetuses are evaluated using a serial slicing method known as the Wilson technique [39] or a combination of both methods. Various modifications and additions to these methods have been published [40-45].
(c) Skeletal Examination. Fetuses selected for skeletal examination are fixed in ethanol, then “cleared” and stained by a KOH-alizarin red S technique [46]. This method leaves a stained and intact skeleton sheathed within a transparent “body.” Examination includes enumeration of the vertebrae, ribs, and other bone structures, degree of ossification, and any fusions or abnormalities in bone shape or position. Again, various modifications and additions to this method have been published [47-51].
7. Parturition
The pregnant females are placed in litter cages with solid floors and nesting material at least 2 days prior to expected parturition. The start of labor and its progress are noted when possible. In many cases, labor will start or finish during off-hours, making observations on the entire process difficult. Insofar as possible, parturition should be observed for any signs of difficulty or unusual duration. The date when all pups are considered delivered is designated as postpartum, postnatal, or lactation day 0. Usually litters born overnight will be considered to have been delivered on the morning they are found.
8. Postnatal Observations
As soon as possible after delivery is complete, the pups are counted, sexed, weighed, and externally examined for grossly visible abnormalities. Pups and dams are observed and weighed weekly. Survival checks should be conducted daily. Many laboratories consider any pup found dead on postnatal day 0 as stillborn by default. A lung flotation test can be used as a more definitive method for differentiating stillborn pups from those dying after birth. In all cases, dead pups should be necropsied as soon as possible to evaluate for abnormalities. It is also not uncommon for rodents to cannibalize dead or injured pups. Missing pups are classified with the dead pups when evaluating data.
Reduction of litter size by culling (usually to four males and four females per litter) has been conducted historically, based on the premise that this equalizes or reduces differences in maternal care and nutrition due to litter size. Culling is most commonly conducted on postnatal day 4 with pup body weights recorded before and after. All culled pups should be randomly selected. Pups are weaned from the dams on postnatal day 21. A selected number of this F1 generation can be continued on study for additional behavioral testing and/or to evaluate F1 reproductive capacity as required. F1 pups not selected are usually necropsied and tissues saved as outlined for the F0 generation.
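The random culling step described above can be sketched as follows. This is only an illustrative example: the record layout (a dictionary with hypothetical "id" and "sex" keys) and the function name are assumptions. The sketch also does not top up a litter with pups of the other sex when fewer than four of one sex are available, a point on which laboratory practice may differ.

```python
import random


def cull_litter(pups, n_per_sex=4, rng=None):
    """Randomly keep up to n_per_sex males and n_per_sex females.

    pups: list of dicts with hypothetical keys "id" and "sex" ("M" or "F").
    Returns (kept, culled). If a sex has fewer than n_per_sex pups,
    all pups of that sex are kept and no substitution is made.
    """
    rng = rng or random.Random()
    kept = []
    for sex in ("M", "F"):
        same_sex = [p for p in pups if p["sex"] == sex]
        kept.extend(rng.sample(same_sex, min(n_per_sex, len(same_sex))))
    kept_ids = {p["id"] for p in kept}
    culled = [p for p in pups if p["id"] not in kept_ids]
    return kept, culled


# Hypothetical litter of 11 pups on postnatal day 4: 6 males, 5 females.
litter = ([{"id": f"M{i}", "sex": "M"} for i in range(1, 7)]
          + [{"id": f"F{i}", "sex": "F"} for i in range(1, 6)])

kept, culled = cull_litter(litter, rng=random.Random(0))
print("kept:", sorted(p["id"] for p in kept))
print("culled:", sorted(p["id"] for p in culled))
```

Passing a seeded generator, as in the usage lines, makes the selection reproducible so it can be documented in the study records.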
9. Offspring Developmental, Reflex, and Behavioral Indices
Segment III studies include evaluation of not only offspring growth and survival but also their behavioral development. At various times throughout the lactation period and early postweaning period, the pups are evaluated using a variety of developmental and behavioral end points. Guidelines leave some leeway in the choice of methods, but the testing should include at least some measure of sensory development, activity, and learning/memory function. Developmental indices include end points such as time of pinna detachment, eye opening, tooth eruption, and hair growth, and are generally coincident with pup growth. Reflex testing should also be part of any standard pup evaluation and can range from such reflexes as surface righting as early as postnatal day 1 to air-drop righting reflex and startle reflex testing in the late lactation period. It is very important that each parameter be measured at the appropriate time in development. After weaning on postnatal day 21, 1 or 2 pups/sex/litter are selected to continue behavioral testing and to produce the F2 litter. Activity level, exploratory and basal, can be quantified in an “activity chamber” using photobeam technology. The most complex testing involves the learning and memory function, which can be measured by various methods, including active or passive avoidance test, maze test, or swimming test.
10. Offspring Reproductive Capacity
The first end points in the evaluation of offspring reproductive capacity in a segment III study are two additional developmental indices for sexual maturation: vaginal opening in females and testes descent or preputial separation in males. Both of these indices are very sensitive to small changes in hormonal status during gestation and/or lactation. In both segment III studies and multigeneration studies, offspring reproduction is usually conducted as described for the F0 generation. Care must be taken so as not to mate siblings. In ICH segment III studies, it is preferable to euthanize and evaluate F1 pregnancy status using a gestation day 13 uterine examination. Guidelines for multigeneration studies will require the F2 litter to be evaluated in a gestation day 20 cesarean section with fetal evaluations or to carry out the F2 or even F3 generation through weaning. In some instances, animals will be required to produce more than one litter. In this case, generations will be referred to as the F1a or F1b generation.
11. Male Sacrifice
It is always advisable to delay the sacrifice of the males until after the outcome of mating is known (via cesarean section or natural parturition). In the event of observed effects on pregnancy, these treated males can be mated again to untreated females to ascertain their role in the adverse effect on fertility. When the males are sacrificed, they, too, should be subject to a gross necropsy. Evaluations may include organ weights and macroscopic examination of the testes, epididymides, seminal vesicles, and prostate. Tissues are generally saved for possible histopathology. At necropsy, one of the testes and/or epididymides may be used to quantify sperm concentration and/or evaluate sperm morphology and motility.
12. Histopathology of Reproductive Tissues
Some guidelines specifically require that microscopic examination be conducted on male and female reproductive organs. It is very important to have histopathology data, especially if sufficient histopathology of the reproductive organs is not available or the quality of the data is dubious from the 1- and/or 3-mo general toxicity study. Histopathological changes in the reproductive organs can be one of the most sensitive end points in DART studies. There has been much talk in the past few years of adding requirements for labor-intensive serial section and cell staging of the testes and ovary. At this time, it is not necessary in DART screening studies to evaluate numerous serial sections of the ovary and testis or to conduct a full staging evaluation. However, the pathologist should have specialized training so as to identify situations in which more complex qualitative and quantitative analyses should be done [52,53].
To correctly evaluate testicular morphology, the tissue must be preserved properly [53,54]. The most common tissue preservation method, fixation in formalin and paraffin embedding, is not acceptable for testicular tissue and does not permit adequate assessment of testicular cytoarchitecture or spermatogenic stages. The technique resulting in the highest quality sections for testicular examination uses perfusion fixation and plastic (i.e., glycol methacrylate) embedding. However, on a routine basis, fixation in Bouin’s fluid, paraffin embedding, and staining with periodic acid-Schiff (PAS) also produces acceptable tissues for proper examination.
III. DATA COMPILATION, INTERPRETATION, AND RISK ASSESSMENT
The purpose of DART testing is to establish whether a test material has the potential to induce adverse effects on the male and female reproductive systems or development of the conceptus, and to establish the potency or dosage levels that cause the specific adverse effects. Once this information is established, an assessment can be made of the agent’s hazard potential and actual risk it poses to humans by bringing in exposure information as well as data from other toxicology studies, pharmacokinetics, and mechanistic studies [29,31,3235,56]. From a regulatory standpoint, a compound’s reproductive and developmental toxicity is most important in risk assessment when the dosage level inducing DART effects is below that associated with systemic toxicity in adults. If the dosage level inducing adverse DART effects is above those causing systemic toxicity, then safe exposure levels set to protect against systemic toxicity will usually also protect against adverse reproductive effects. Agents producing DART effects at dosages below those inducing systemic toxicity are generally referred to as “selective” reproductive or developmental toxins.
The minimal acceptability criterion for a DART study is demonstrable toxicity to the F0 parents or dosing up to the “limit” dose. Although it is advantageous to demonstrate a no observable effect level (NOEL) as well as a lowest observable effect level (LOEL) for the F0 generation, as long as a NOEL for developmental toxicity is observed on the study, this is not absolutely necessary. The absence of a parental LOEL and/or a developmental NOEL may require repeating the DART study at either higher or lower dosage levels.
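The relationships among NOEL, LOEL, and “selective” toxicity described above can be made concrete with a short sketch. Everything in it is illustrative: the function names, the dosage levels, and the convention of passing a mapping from dosage to a yes/no adverse-effect call are assumptions for the example, and deciding whether an effect is adverse remains a matter of scientific judgment that the code does not capture.

```python
def noel_loel(effect_by_dose):
    """Return (NOEL, LOEL) from a mapping of dosage -> adverse effect seen.

    The LOEL is the lowest tested dosage with an effect; the NOEL is the
    highest tested dosage without an effect that lies below the LOEL.
    Either value is None when it was not demonstrated on the study.
    """
    doses = sorted(effect_by_dose)
    affected = [d for d in doses if effect_by_dose[d]]
    loel = affected[0] if affected else None
    clean_below = [d for d in doses
                   if not effect_by_dose[d] and (loel is None or d < loel)]
    noel = clean_below[-1] if clean_below else None
    return noel, loel


def is_selective(dart_loel, systemic_loel):
    """True when DART effects appear below the systemically toxic dosage."""
    if dart_loel is None:
        return False
    return systemic_loel is None or dart_loel < systemic_loel


def may_need_repeat(parental_loel, developmental_noel, dosed_to_limit=False):
    """Flag a study lacking a parental LOEL (unless the limit dose was
    reached) or lacking a developmental NOEL."""
    no_parental_toxicity = parental_loel is None and not dosed_to_limit
    return no_parental_toxicity or developmental_noel is None


# Hypothetical treated groups at 10, 30, and 100 mg/kg/day.
parental = {10: False, 30: False, 100: True}
developmental = {10: False, 30: True, 100: True}

p_noel, p_loel = noel_loel(parental)
d_noel, d_loel = noel_loel(developmental)
print("parental NOEL/LOEL:", p_noel, p_loel)
print("developmental NOEL/LOEL:", d_noel, d_loel)
print("selective:", is_selective(d_loel, p_loel))
print("may need repeat:", may_need_repeat(p_loel, d_noel))
```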
A. Tools Needed in Interpretation of DART Data
All data need to be included in the final report. This includes not only summary tables, incorporating group means, but also tables presenting all individual animal data. It is very important in DART studies to include all individual data in such a manner that the reader can associate all maternal and fetal findings with specific animal numbers and times. Not only group means but also individual values should be considered carefully when interpreting results.
Statistical analysis should be used to determine whether results differ significantly from those of the controls. There are no specific types of statistical tests required. The statistical methods applied should be appropriate for the type of end point being evaluated [56-59]. There has been a long debate on whether the entire litter or the individual fetus should be treated as the experimental unit in statistical evaluation; most investigators include statistics using both. Some of the tests that have been employed in the statistical analysis of DART data include analysis of variance (ANOVA), analysis of covariance (ANCOVA), Bonferroni inequality test, Bartlett’s test for homogeneity, Student t-test, chi-square test, Fisher’s exact test, jackknife procedures, Mann-Whitney U-test, Student-Newman-Keuls multiple range test, and rank transformation. For many of the parameters measured in DART studies, the power to discriminate between random variation and true treatment effect (i.e., the statistical sensitivity) is poor due to the small number of animals, the wide variability of some end points, and the normally low incidence rates of others. For example, with 20 animals per group, it would require an incidence rate 5- to 12-fold above control levels to detect a statistical increase in malformations compared with a 3- to 6-fold increase for postimplantation deaths and a 0.15- to 0.25-fold change in fetal body weight [31]. Moreover, the way in which statistical results are used in data interpretation has been under debate [60]. The absence of a statistically significant difference from the control value does not necessarily negate the “biologic” significance of an observed change.
In this case, the investigator must rely on historical control values, possible dosage-related trend, and experience when deciding whether the “effect” is real or not. On the other hand, the presence of statistical significance also does not make an “effect” biologically or toxicologically significant, especially if the value of the end point in question is within the historical control range for that parameter. The probability of detecting at least one statistical false-positive in a study with numerous biological parameters is very high [55].
Historical control values for DART parameters are very important for comparative purposes [61]. These are basically data from the control groups of past studies compiled by study date. A laboratory experienced in conducting DART studies should have historical control records from numerous studies. Because of genetic drift over time in the outbred animal strains used in DART studies, an investigator will usually rely on historical control data that are within ±2 yr of the initiation of the study being considered. It is best to use historical data from the same laboratory in which the study was conducted. However, published historical control data [62,63] may also be used, when necessary, as long as they are a compilation of data on the same strain and, if possible, the same source (i.e., colony). The vehicle and route of administration used in the control data studies may also be a possible variable and should be considered when deciding what historical control data are appropriate to use for comparison.
The dose response can be fundamental in determining if there is, in fact, a true reproductive or developmental effect. The most commonly observed dose response is characterized by an effect occurring with greater frequency and/or severity as the dosage level is increased. Thus, one would question a possible treatment-related effect if one were to see a higher incidence of an effect in the low- and high-dosage groups compared with controls but not in the mid-dosage group. One must also keep in mind that developmental and reproductive toxicity are assumed to be threshold phenomena (i.e., they show a threshold dosage level below which adverse effects are not seen). Because of the small number of treated groups and the often steep dose-response curve, it is not uncommon in DART studies to have some apparent effects only at the high-dosage level. This is particularly true under circumstances where developmental toxicity is observed only at levels associated with severe maternal toxicity. There is also the possibility of observing a plateau effect, where no increase in incidence or severity of a particular effect can be induced by increasing the dosage level. This is often due to dose-dependent toxicokinetic limitations. Finally, one needs to consider the impact of “competing” end points on the dose-response patterns (e.g., increase in postimplantation loss covering up any chance of observing a nonlethal effect at high-dosage levels).
Effects should be biologically plausible. For example, one may expect to see a test material effect on corpora lutea count in a segment I study where test material exposure occurred during the time of ovulation and early pregnancy.
On the other hand, the number of corpora lutea among test groups in a segment II or III study should be within the same range as control since compound administration was initiated after implantation. Biological plausibility would also suggest it highly unlikely to have an effect, for example, on fetal bone ossification in a segment I study where exposure to the dam did not occur at the time of fetal bone development. Also inherent in plausibility is the stipulation that effects should concur across studies. For example, the validity of a treatment-related reduction in offspring weight would be questioned if fetal weights were reduced in the segment II study, but day 0 pup weights showed no reduction in the segment III study at the same dosage level.
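Two points from this section lend themselves to a small numerical sketch: the high chance of at least one statistical false-positive when many end points are tested, and the expectation that incidence rises with dosage. The functions below are illustrative only. The false-positive calculation assumes that all end points are tested independently at the same significance level, which DART end points generally are not, so the result is best read as a rough indication. The monotonicity check is a simple screening aid and would, by design, also flag the plateau and competing end point patterns discussed above for closer review.

```python
def familywise_false_positive(n_endpoints, alpha=0.05):
    """Probability of at least one false-positive among n_endpoints
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_endpoints


def is_monotonic_increase(incidence_by_group):
    """True if incidence never falls as dosage rises.

    incidence_by_group: incidences ordered control, low, mid, high.
    """
    return all(later >= earlier for earlier, later
               in zip(incidence_by_group, incidence_by_group[1:]))


# Hypothetical study with 20 statistically tested end points.
print(round(familywise_false_positive(20), 3))

# Hypothetical incidences (%) in control, low, mid, and high groups.
print(is_monotonic_increase([2, 3, 5, 9]))   # rises with dosage
print(is_monotonic_increase([2, 8, 2, 9]))   # low and high only, not mid
```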
B. Interpretation of Specific End Points
1. Clinical Observation Data
Clinical observations are not always related to toxic effects of the test material. Prior identification of specific findings in previous studies as well as a dose-response incidence are important signs of compound-related toxicity. The timing and duration of the observed signs can also help in interpretation. However, confounding factors, including concomitant disease, environmental factors, or technical errors, must also be considered.
The use of quality SPF (specific-pathogen free) animals and examination of animals by a veterinarian prior to placement on study should reduce the incidence of spontaneous disease in the test animals. However, there is no procedure that can completely eliminate the possibility. A certain combination of signs and histopathology as well as a lack of dose response can suggest such a confounding factor to the investigator. For example, in rabbits, clinical signs such as nasal discharge and/or diarrhea along with histopathological evidence of lung congestion and/or pitted kidneys have been associated with pasteurellosis and coccidiosis infections. A high enough incidence and severity of such observations, indicating questionable health status of the test animals, may make a study unacceptable for submission.
Stressful environmental factors, including low humidity, extreme temperatures, or inadequate food and water availability, can also lead to morbidity, which can confound study interpretation. Even such factors as suboptimal caging or mishandling of the animals can lead to clinical signs such as hair loss, abrasions, scabbing, broken teeth, or injured or even broken limbs. When an animal dies on study it is very important to try to determine a possible cause of death at necropsy. Findings in oral gavage studies such as congested or fluid-filled lungs and reddening or puncturing of the trachea are signs of a gavage error, and the death of the animal would not be considered test material-related.
2. Parental Body Weight and Food Consumption Data
Reduction in body weight gain or actual body weight loss in conjunction with food consumption measurements can be sensitive indicators of systemic toxicity in all types of toxicology studies. In DART studies, body weight and food consumption data are presented and interpreted separately for each phase of a study (i.e., F0 premating, F0 gestation, F0 lactation, F1 premating, etc.). Interpretation of premating body weight and food consumption is straightforward. Group mean body weights and/or percentage change in body weights are calculated. Reductions in body weights may indicate a direct effect of the test material and/or may reflect reductions in food consumption (anorexia). The role that anorexia may have played in reducing body weights may be assessed by calculating a “food efficiency index.” This index is a measure of the efficiency of food utilization (i.e., food consumed/body weight gained). Reduced food intake is not considered a main factor in body weight reductions if the food efficiency index is similar between the treated and control groups. In general, mild to moderate depressions in body weight resulting from reduced food consumption have little or no effect on adult reproductive capacity, and it is not considered appropriate to dismiss reductions in fertility as being secondary to such weight reduction [64].
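The food efficiency index defined above (food consumed divided by body weight gained) can be computed per animal and averaged by group, as in the following minimal sketch. The function names and the gram values are hypothetical. Returning no value for animals with zero or negative weight gain is an assumption made here to avoid an undefined ratio; a laboratory may handle such animals differently.

```python
from statistics import mean


def food_efficiency_index(food_consumed_g, weight_gained_g):
    """Food consumed divided by body weight gained over the same interval.
    Returns None when the animal did not gain weight (ratio undefined)."""
    if weight_gained_g <= 0:
        return None
    return food_consumed_g / weight_gained_g


def group_mean_index(records):
    """Mean index for a group; records are (food_g, gain_g) per animal.
    Animals without a usable index are left out of the mean."""
    values = [food_efficiency_index(food, gain) for food, gain in records]
    usable = [v for v in values if v is not None]
    return mean(usable) if usable else None


# Hypothetical premating data, (food consumed, weight gained) in grams.
control = [(350, 50), (330, 55)]
treated = [(280, 40), (300, 30)]

print("control mean index:", group_mean_index(control))
print("treated mean index:", group_mean_index(treated))
```

The sketch only produces the index for side-by-side comparison of groups; how a difference between groups is interpreted is left to the investigator.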
For the gestation and lactation periods, mean body weight and food consumption data should be generated for each day they were measured. Most investigators prefer to calculate body weight and food consumption changes as intervals to aid interpretation. For example, in a rat segment II study, useful intervals to look at would be gestation days 0 to 6, 6 to 9, 9 to 12, 12 to 15, 15 to 20, 6 to 15, and 0 to 20. In this way one can assess the impact of the test material during the early part of the treatment period (days 6 to 9), the middle of the treatment period (days 9 to 12), the last part of the treatment period (days 12 to 15), and during the entire treatment period (days 6 to 15), as well as effects posttreatment (days 15 to 20), and over the entire gestation period (days 0 to 20). Often included is a "corrected" mean maternal body weight gain that is calculated as the terminal body weight minus the initial body weight minus the gravid uterus weight (or minus the litter weight). It can be difficult to differentiate small reductions in weight gain or weight loss in pregnant and lactating animals due to normally erratic weight gain during pregnancy as well as confounding factors, such as differing litter size and implantation loss. Nonpregnant animals as well as does (maternal rabbits) that abort should be excluded from calculation of the means and any statistical evaluations. Also, food consumption data from the lactation period are of questionable use because the pups as well as the dams start to consume the diet during the second week. Food and body weight effects will most often demonstrate a dose-response pattern. Also, it is not uncommon to observe an actual compensatory increase in food consumption compared with controls after treatment is stopped. However, interpretation of these data is not always simple.
For example, when excessive mortality occurs at the high dose there is a possibility of weight reductions appearing larger in the mid-dose group due to the elimination of the most sensitive animals in the higher-dosage group. In these circumstances, the lack of a dose response would not imply that there were no test material-related effects.
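The interval calculations and the "corrected" maternal body weight gain described above lend themselves to a short sketch. This is an illustration under stated assumptions: the function names, the data layout (a dictionary of body weights keyed by gestation day), and the example weights are all hypothetical; only the intervals and the corrected-gain formula come from the text.

```python
# Gestation-day intervals suggested in the text for a rat segment II study.
INTERVALS = [(0, 6), (6, 9), (9, 12), (12, 15), (15, 20), (6, 15), (0, 20)]


def interval_gains(weights_by_day, intervals=INTERVALS):
    """Body weight change (g) over each gestation-day interval for one dam."""
    return {(start, end): weights_by_day[end] - weights_by_day[start]
            for start, end in intervals}


def corrected_gain(initial_bw, terminal_bw, gravid_uterus_wt):
    """Terminal weight minus initial weight minus gravid uterus weight."""
    return terminal_bw - initial_bw - gravid_uterus_wt


# Hypothetical body weights (g) for a single pregnant dam.
dam = {0: 250.0, 6: 275.0, 9: 285.0, 12: 300.0, 15: 318.0, 20: 390.0}
gains = interval_gains(dam)
print(gains[(6, 15)], gains[(0, 20)])
print(corrected_gain(dam[0], dam[20], gravid_uterus_wt=75.0))
```

Group means would then be taken over pregnant dams only, since the text excludes nonpregnant animals and animals that abort from the means and from statistical evaluation.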
3. Male and Female Reproductive Systems
(a) Estrous Cycling. Individual daily staging (proestrus, estrus, metestrus, or diestrus) for each female should be presented for the pretreatment period, the treatment period, and the posttreatment period, if available. Rat estrous cyclicity can be expressed in a variety of ways. Some laboratories summarize these data as the average duration of cycle in days or number of estruses over a given time period. The normal cycle time in laboratory rats, depending on the strain, is usually 4 to 5 days. However, this type of compilation will not give any information on where the changes in cyclicity are occurring or the possible mechanisms involved. In toxicology studies, rodents will usually demonstrate some periodic alterations in cycling (e.g., prolonged estrus or prolonged diestrus) prior to becoming completely acyclic that may aid in interpretation of possible causes [65]. Thus, many investigators prefer to summarize estrous cycle data as
percentage of animals displaying abnormal stages (i.e., prolonged estrus [>2 days], abnormal diestrus [<2 or >3 days], prolonged metestrus [>1 day], or prolonged proestrus [>1 day]). A summary of the percentage of time spent in each stage would also give similar information about specific stage abnormalities. Alterations in cyclicity provide some indication that the major endocrine mechanisms controlling normal female reproduction have been affected [66]. Prior to ovulation, in the beginning of estrus, a marked increase in serum estradiol occurs and the vaginal cytology, consisting of primarily cornified epithelial cells, reflects this influence. Following ovulation, progesterone predominates and becomes the primary controlling factor for vaginal cytology. At diestrus, both estradiol and progesterone are equally prominent, resulting in vaginal cytology consisting of leukocytes and epithelial cells entangled in mucus.
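The stage-duration criteria listed above can be applied to daily staging records with a run-length pass. The sketch below is only one possible way to do this; the single-letter stage codes (P, E, M, D), the function names, and the example records are hypothetical, and the handling of runs cut off by the start or end of the observation window is an assumption not addressed in the text.

```python
from itertools import groupby

# Normal duration (days) per stage as (minimum, maximum), taken from the
# criteria in the text: estrus <= 2, diestrus 2 to 3, metestrus and
# proestrus <= 1.
NORMAL_DAYS = {"P": (1, 1), "E": (1, 2), "M": (1, 1), "D": (2, 3)}


def abnormal_stages(daily_stages):
    """Return (stage, run_length) for each run outside the normal range."""
    runs = [(stage, len(list(days))) for stage, days in groupby(daily_stages)]
    flagged = []
    for i, (stage, n) in enumerate(runs):
        shortest, longest = NORMAL_DAYS[stage]
        # Assumption: the first and last runs are truncated by the
        # observation window, so they are flagged only when too long.
        interior = 0 < i < len(runs) - 1
        if n > longest or (interior and n < shortest):
            flagged.append((stage, n))
    return flagged


def percent_animals_abnormal(staging_by_animal):
    """Percentage of animals with at least one abnormal stage."""
    hits = sum(1 for days in staging_by_animal.values() if abnormal_stages(days))
    return 100.0 * hits / len(staging_by_animal)


# Hypothetical daily staging for two females (one character per day).
animals = {"F01": "PEMDDPEMDDD", "F02": "PEMDDPEEEMDDDDP"}
print(abnormal_stages(animals["F02"]))
print(percent_animals_abnormal(animals))
```

Keeping the flagged stage with its duration, rather than a single cycle length, preserves the information on where in the cycle the disruption occurs.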
(b) Mating and Pregnancy. The evidence of mating is expressed as the copulatory index and/or mating index (Table 6) and as the copulatory interval. A decrease in the copulatory or mating index may be due to physical impairment (e.g., inhibition of penile erection) or to alterations in sexual behavior (e.g., libido, receptivity). Rodents actually mount and mate a number of times during a short period of time; however, standard segment I and multigeneration studies do not directly incorporate end points for assessment of mating behavior, but rather only indirect measurement of the end result (i.e., sperm-positive smear or vaginal plug). A sperm-positive mating does not necessarily indicate that pregnancy will ensue. In rodents, the male must provide a sufficient number of intromissions and ejaculations for the female to respond with sufficient progesterone for the initiation of pregnancy [67,68]. For the female, the degree of sexual preparedness strongly influences the site of semen deposition and subsequent transport of sperm in the female genital tract [69]. Failure to achieve these conditions will adversely influence these processes and reduce the probability of successful conception. The actual number of pregnancies resulting from sperm-positive matings is calculated as the fecundity index (see Table 6). The accurate determination of the fecundity index requires careful evaluation of the uterus at necropsy for the presence of implantation sites. All pregnancies should be included here, even those not resulting in viable offspring. A reduction in the fecundity index could be the result of adverse effects on many targets, including effects on sperm motility/viability, follicular rupture and oocyte release, fertilization, oviduct transport, endocrine status, uterine receptivity, or implantation.
The fertility (or conception) index (see Table 6) is a much broader reflection of the overall reproductive capacity of the treatment group and takes into account all animals (those that did not mate, those that mated but did not conceive, and those that achieved pregnancy) and thus encompasses both the copulatory index
Table 6 Calculated Indices Used in DART Studies^a
Copulatory index: (# males mated) / (# males paired) or (# females mated) / (# females paired)
Mating index: (# females mated) / (# estrus cycles)
Fertility index (conception index): (# males siring a litter) / (# males paired) or (# females pregnant) / (# females paired)
Fecundity index (in females also called pregnancy index; index of conception rate): (# males siring a litter) / (# males mated) or (# females pregnant) / (# females mated) or (# of pregnancies) / (# copulations)
Implantation index: (# implantation sites) / (# corpora lutea)
Preimplantation loss: (# corpora lutea - # implantations) / (# corpora lutea)
Postimplantation loss: (# implantations - # viable fetuses) / (# implantations)
Ponderal index: (fetal body weight) / (fetal crown-rump length)
Parturition index: (# litters delivered) / (# females mated) or (# litters delivered) / (# females pregnant)
Live birth index (also called gestation index): (# live pups born) / (# pups delivered) or (# live litters born) / (# pregnant)
Viability index: (# pups surviving 4 days) / (# live pups at birth)
Reproduction index: (# pups surviving 4 days) / (# females pregnant)
Weaning index (also called lactation index): (# pups weaned [day 21]) / (# pups after culling [day 4])
Preweaning loss: (# pups born per litter - # pups weaned per litter) / (# pups born per litter)
^a Each index is usually calculated and multiplied by 100 and represented as a percentage for each test group.
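Several of the Table 6 indices can be computed directly from group counts. The following is a minimal sketch using the female-based forms of the ratios; the function names, dictionary keys, and counts are hypothetical and serve only to illustrate the arithmetic.

```python
def pct(numerator, denominator):
    """Ratio expressed as a percentage; None when the denominator is zero."""
    return None if denominator == 0 else 100.0 * numerator / denominator


def dart_indices(g):
    """Selected Table 6 indices from a dict of (hypothetical) group counts."""
    return {
        "copulatory": pct(g["females_mated"], g["females_paired"]),
        "fertility": pct(g["females_pregnant"], g["females_paired"]),
        "fecundity": pct(g["females_pregnant"], g["females_mated"]),
        "implantation": pct(g["implantations"], g["corpora_lutea"]),
        "preimplantation_loss": pct(
            g["corpora_lutea"] - g["implantations"], g["corpora_lutea"]),
        "postimplantation_loss": pct(
            g["implantations"] - g["viable_fetuses"], g["implantations"]),
    }


# Hypothetical counts for one dosage group.
group = dict(females_paired=25, females_mated=24, females_pregnant=22,
             corpora_lutea=330, implantations=300, viable_fetuses=285)
idx = dart_indices(group)
for name, value in idx.items():
    print(f"{name}: {value:.1f}%")
```

With these female-based definitions the fertility index equals the product of the copulatory and fecundity indices (as fractions), which is one way to see why the fertility index is the broader measure.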
and fecundity index. Unless the study includes mating the treated males and/or treated females to untreated rats, it is usually not possible to differentiate whether the reduction in fertility is due to dysfunction of the male, the female, or both. Fertility is not considered a very sensitive indicator of reproductive toxicity. A normal male rat produces sperm counts well above that required for normal fertility (see Table 5). In fact, it has been demonstrated that chemical-induced reductions of up to 90% of sperm production can still result in normal fertility rates [70]. In contrast, minor reductions in sperm production in human males can have marked effects on reproductive capacity since human males function nearer to the threshold for the number of normal sperm needed to ensure reproductive competence. Female rats also have a greater reproductive capacity than needed and may require a considerable reduction in oocyte/follicle development before fertility rates are affected. Thus, when "no effects on fertility" is the sole end point available for risk assessment of reproductive competence, not much reliance can be put on this for human risk. When infertility is observed on a study, especially in association with evidence of histopathological changes in the testes and/or ovaries, it is very important to assess whether the effects are reversible. Chemically induced destruction of stem cells (spermatogonia or primordial oocytes) is of great significance because there is no mechanism for repopulation of these germ cells; thus the effect is irreversible. Possible follow-up studies on sexual behavior could include female receptivity and lordosis response and copulatory measures, such as latency of male to mount, number of intromissions prior to ejaculation, number of mounts without intromission, and lordosis/mount ratios [71,72]. However, a more detailed evaluation of sexual behavior is not warranted for all toxicants associated with reductions in fertility.
The most likely candidates include agents associated with neurotoxic effects or those with possible androgenic or estrogenic properties. Measurement of hormone levels can also be added to ongoing studies or further evaluated in follow-up studies [73,74]. The primary hormones that are measured include luteinizing hormone (LH), gonadotropin-releasing hormone (GnRH), follicle-stimulating hormone (FSH), estradiol, testosterone, progesterone, and prolactin. However, obtaining adequate serum hormone data requires multiple sampling times, due to the pulsatile nature of hormone secretion, and must take into consideration the age, reproductive state, and cycle day of the test animal. Also, it is not always possible to differentiate whether endocrine changes are the cause of or are secondary to reproductive effects. Endocrine changes that indicate toxicity include multiple values outside the normal physiological ranges, physiologically plausible changes in direction in hormone levels, or failure of key hormonal events (such as LH surge, preovulatory estradiol rise, maintenance of luteal phase progesterone production, etc.) [75,76]. Other strategies for evaluating endocrine
status include such hormone challenge assays as measurement of LH following injection of GnRH or measurement of progesterone following injection of chorionic gonadotropin [77]. Function of the male gamete can be assessed by various end points and methods, including morphometric analysis of spermatogenic stages, epididymal sperm count, sperm viability and motility, sperm morphology, testicular spermatid head counts, and in vitro Sertoli cell and Leydig cell function [2,33,70]. Follow-up studies on possible roles of the ovary in fertility reduction include investigations of in vitro ovarian steroidogenesis function [78], an in vivo time-response study to assess oocyte maturation and fertilization [79], an in vivo oocyte toxicity assay in juvenile mice [80], and in vitro cultures of growing follicles [81]. Possible causes for preimplantation loss can be investigated by evaluation of uterine decidualization [82], in vitro fertilization assay [83], and evaluation of egg/embryo transport [82].
(c) Gestation Length and Parturition. In rats, the length of the gestation period ranges from 21 to 23 days and averages approximately 21.5 days. Gestation length is most often simply compared using group mean values. However, a graph depicting the percentage of the dams undergoing parturition on each possible day of parturition for each dosage group will often give the investigator a better picture of possible dose-related trends. Gestation length is determined by the triggering of parturition, a process that involves numerous endocrine factors and is not completely understood [84-86]. Prolonged gestation as well as early delivery of pups are considered dysfunctions of the triggering process and threaten the survival of offspring. The parturition process itself involves major changes in the contractile potential of the smooth muscle of the uterus to allow for synchronous contractions during labor. Chemically induced dysfunction of parturition (dystocia) often involves interference with uterine contractility. Prolonged parturition (i.e., >12 hr) is often associated with increased stillbirth. Calculation of the parturition index (Table 6) gives a measure of adult fertility/fecundity through birth. The index itself provides limited information on actual offspring viability, since litters with only one pup are counted the same as those with more than one pup. Follow-up studies on parturition effects have utilized in vitro uterine strips to study contractility [87], uterine smooth muscle cell cultures [88], endocrine assays [89], and prostaglandin challenge [90].
(d) Organ Weights. Organ weight data are usually presented as both individual and group mean absolute weights and as relative weights (i.e., organ weight/body weight ratios). In DART studies, absolute organ weights are more
important for interpretive purposes since many of the reproductive organs are unaffected by changes in body weight. In male rats, the reproductive organs to be weighed are the testes, epididymides, prostate, seminal vesicles, and pituitary. Testes weight varies little within a given test species [91-93]. This relatively low intraspecies variability can make testicular weight a sensitive indicator of gonadal injury. Most testicular toxins will reduce testicular weight in rodent studies. However, increases in testicular weight can also be observed with the induction of edema and inflammation, cellular infiltration, or Leydig cell hyperplasia. There is also a chance of observing no change in testicular weight even with the presence of severe injury upon histopathological evaluation. Epididymal weight is affected by the number of sperm present in the lumen. Both seminal vesicles and the prostate contain large proportions of luminal fluid and these fluid levels are easily influenced by changes in androgenic hormone levels. In female rats, the reproductive organs to be weighed are the ovaries, uterus, and pituitary. The weights of the ovaries and uterus do vary during the course of the estrous cycle and obviously during pregnancy. Ovarian weight can be closely related to the number of corpora lutea present. Uterine weight can be greatly affected by the presence of estrogen. Changes in pituitary weight are usually considered as an adverse toxicological effect, but are not necessarily reflective of reproductive impairment since gonadotropin-producing cells in the pituitary represent only a small portion of the many hormonal cell types. Unlike changes in the weights of some other organs, testicular, epididymal, and ovarian organ weights are not influenced by small or moderate changes in body weight [92,94-96]. Prostate and seminal vesicle weight can vary with body weight changes.
Thus, when body weight reductions occur in a study due to decreased food palatability or consumption, one cannot just assume that changes in the weight of reproductive organs are simply secondary to the presence of general systemic toxicity. In the same vein, reproductive organ weight data should not be used as a lone end point; accompanying histopathology evaluation can be crucial. One cannot assume that a particular agent does not induce adverse effects in the testes or ovaries based on the absence of organ weight changes alone. Finally, difficulties in dissection and removal of extraneous tissues from smaller organs, such as the epididymides, seminal vesicles, prostate, and pituitary, can create unwanted variability. Organ weight data should also document whether the specified weight was taken with or without excess fluids. Weights without fluids generally will show less variability.
(e) Histopathology of Reproductive Organs. Properly conducted histopathology is one of the most sensitive indicators of injury to the reproductive system. Familiarity with the cytoarchitecture of the testes and ovaries as well as
with the kinetics of spermatogenesis and follicle development is crucial to identifying injury, especially in the identification of less prominent lesions in these organs [97-100].
1. Male Reproductive System
A thorough histological evaluation of the testes includes examination of the various cell stages (i.e., spermatogonia, spermatocytes, spermatids), sperm release into the lumen (spermiation), support cells (i.e., Sertoli and Leydig cells), as well as examination of the interstitial area [97,98,100]. The most commonly observed finding in the testis is germ cell degeneration and necrosis. Cytotoxic agents will generally target all dividing cell types in the testis. Other male reproductive toxins target very specific germ cell stages or directly target the Sertoli or Leydig cells. However, the majority of testicular toxins, regardless of their site of action, will produce a similar nonspecific degeneration if administered at a high dosage over a prolonged period. Germ cell depletion is often accompanied by Leydig cell hyperplasia or hypertrophy, depending on the hormonal status. It should be remembered that spermatogenesis is a temporally synchronized process and it is possible to miss an effect by examination of tissues too early or too late posttreatment.
Chemical injury to the epididymis can include necrosis of the epithelium and formation of sperm granuloma and/or spermatocele that can result in obstruction of the lumen. Vacuolization of the epididymal epithelium has been associated with estrogenic stimulation. The pathologist should also take note of any unusual findings in the epididymal spermatozoa.
The accessory sex glands (prostate, seminal vesicles, ampullary glands, bulbourethral glands, and preputial glands) most frequently are targets of chemicals that directly or indirectly alter the balance of reproductive hormones. Histological changes indicative of toxicity may be degenerative, atrophic, proliferative, or inflammatory. The presence of accessory sex organs varies with species. In many routine toxicology studies, only representative accessory sex organs, such as the prostate, are examined histologically.
2. Female Reproductive System
A thorough histological evaluation of the ovaries includes examination of the epithelial capsule, stroma, and follicular cells. Significant findings in the ovary include a reduction or absence of follicle-enclosed oocytes, unruptured follicles, reductions or absence of corpora lutea, or the presence of ovarian cysts [99,101]. Although not required, once an ovarian lesion has been identified, morphometric analysis of follicle subtypes, including primordial oocyte counts, may provide additional information on the extent and possible reversibility of the injury
[52,102,103]. However, this procedure is labor intensive and requires an advanced level of expertise.
The corpus luteum is a transient ovarian endocrine gland formed by a rapid growth and differentiation of the theca and granulosa cells of the follicle following ovulation [104]. The primary function of the corpus luteum in all species is the synthesis of progesterone, which is essential for implantation and subsequent development of the fetoplacental unit. Activation and maintenance of luteal function are species-specific and can involve pituitary, placental, and/or ovarian hormones (see Table 5). Corpora lutea are found on the surface of the ovary and are counted by macroscopic examination at necropsy. A corpus luteum is formed for every oocyte released. Preimplantation loss (see Table 6) can be calculated from the difference between the number of eggs released and the number of implantation sites that develop. The number of implantation sites must be equal to or less than the number of corpora lutea. In some instances, not all corpora lutea will be visible on the surface of the ovary, and for statistical purposes the data need to be adjusted in those cases where the number of implants is greater than the number of corpora lutea.
The appearances of the uterus and vagina vary with the stage of the reproductive cycle and with pregnancy. Agents that impair ovarian cycling, alter normal hormonal balance, mimic reproductive hormones, or interfere with autocrine/paracrine regulation may cause significant changes in uterine and vaginal morphology and the ability of the uterus to maintain a fetus to term. Accessory female sex organs generally are rudimentary and often are not examined unless grossly visible abnormalities lead to histological observations. However, accessory sex glands such as the clitoral glands of rodents may be examined routinely in some studies.
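The corpora lutea adjustment mentioned above, for females in which the implant count exceeds the visible corpora lutea count, could be handled as in the sketch below. The text does not say how the adjustment is made; setting the corpora lutea count equal to the number of implants is an assumption adopted here purely for illustration, and the function names are hypothetical.

```python
def adjusted_corpora_lutea(corpora_lutea, implantations):
    """Assumed convention: raise the corpora lutea count to the implant
    count when more implants than corpora lutea were recorded."""
    return max(corpora_lutea, implantations)


def preimplantation_loss_pct(corpora_lutea, implantations):
    """Preimplantation loss (%) for one female, after the assumed adjustment."""
    cl = adjusted_corpora_lutea(corpora_lutea, implantations)
    if cl == 0:
        return None  # nothing ovulated or counted; the index is undefined
    return 100.0 * (cl - implantations) / cl


print(preimplantation_loss_pct(15, 12))  # ordinary case
print(preimplantation_loss_pct(11, 13))  # implants exceed counted corpora lutea
```

Whatever convention a laboratory adopts, it should be applied uniformly across dosage groups and stated with the data, since it changes the per-female values that enter the group means.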
4. Developmental Data
Adverse developmental effects may be detected at any point in the offspring's life span. In DART studies, these effects are manifested as death of the offspring, induction of structural abnormalities (teratogenicity), reduction in growth, and/or functional alterations. All four end points are of toxicological concern if produced at a particular exposure level and in a dose-related manner [29,31,56,105]. A toxicant usually induces more than one type of effect in a dose-dependent manner. In segment II studies, three general dose-response patterns exist [6,106]. In the first dose-response pattern, one observes a combination of resorbed, malformed, growth-retarded, and "normal" fetuses within the litter. Depending upon the potency of the test material, lower dosages may cause primarily growth retardation or malformation, and higher dosages may induce primarily death of the conceptus. This does not necessarily indicate that one type of effect leads to another type of effect, but rather indicates that the agent is capable of a full
spectrum of effects. This is the type of dose-response pattern that is commonly seen for cytotoxic agents. In the second pattern, the agent is not teratogenic, but will induce growth retardation and lethality at sufficiently high dosages. Growth retardation usually precedes significant lethality. In the last pattern, highly potent and target-oriented teratogens will generally cause malformations of the entire litter at exposure levels that do not cause embryolethality. Increasing the dosage level will usually result in death of the conceptus, but often in conjunction with severe maternal toxicity. In this pattern, growth retardation is often associated with the malformed offspring. Effects on the fetus can be induced by direct effects of the test material on the conceptus or indirectly through toxic effects on the maternal system, placenta, or other tissues. Effects noted after parturition can be the result of previous in utero exposure and exposure to the test material via maternal milk or indirectly through poor maternal care and postnatal nutrition.
(a) Death. Loss of the conceptus anywhere from fertilization to implantation is termed preimplantation loss. This is calculated from the total number of implants and the total number of corpora lutea (see Table 6). Preimplantation loss may be due to a direct effect on the early embryo or indirect interference with such events as movement of the egg/embryo through the fallopian tube, interference with uterine decidualization in preparation for implantation, or adverse effects on corpora lutea function and progesterone synthesis. Death of the conceptus in the uterus is referred to as postimplantation loss (sometimes referred to as fetal wastage), which can include early or late resorption of the implanted embryo/fetus and late fetal death. Postimplantation loss is calculated using the number of implants and the number of viable fetuses in a litter (see Table 6).
Postimplantation loss may be due to a direct lethal effect on the developing conceptus, induction of lethal malformation(s), or indirectly through interference with maternal systems supporting the pregnancy. Stillborn and neonatal deaths are referred to collectively as postnatal loss. Postnatal survival is usually expressed in time intervals. Stillborn incidence rate can be expressed as the live birth index based on the total number of pups delivered by a dam and the number of live pups delivered (see Table 6). An increase in the stillborn rate can be associated with increased late fetal death in the segment II study. Prolonged gestation and/or problems in parturition (dystocia) are also often associated with increases in stillbirths. Pup adaptation to postnatal life may be affected by treatment-induced structural abnormalities or functional alterations. Rodents will often cannibalize stillborn and/or abnormal pups early after delivery, making accurate evaluations difficult unless closely observed. Survival from birth to culling on postnatal day 4, from culling to postnatal week 1, etc., are calculated as viability indices (see Table 6). Total survival over the entire lactation period can be calculated as the weaning index or preweaning
loss. Again, reduced postnatal survival may be due to a direct effect of the test material (via milk and/or diet/water administration), induced functional/structural changes affecting survival, or treatment-related effects on maternal nursing behavior, milk production, or milk ejection reflex. The presence or absence of milk visible in the stomach of the pups can help distinguish whether or not lactation played a role in their morbidity/mortality. In a multigeneration study, an observed reduction in litter size at birth may be associated with a reduced ovulation rate (corpora lutea count), higher rate of preimplantation deaths, higher rate of postimplantation deaths, stillbirths, and/or immediate postnatal deaths. In all types of DART studies, a high incidence of offspring mortality at any point may mask the occurrence of other adverse effects, such as growth retardation and/or teratogenicity [107]. If a study results in no effects on the conceptus at the mid-dose and very high offspring mortality at the high dose, an agency may request an additional study to be conducted using dosage groups between the mid- and high-dose levels of the first study.
(b) Growth Retardation. Reductions in offspring body weights are a very sensitive indicator of developmental toxicity due to their relatively small intraspecies variability. Litter size does influence offspring body weight and should be factored into any statistical analyses (i.e., body weight is inversely proportional to litter size: higher weights in small litters and smaller weights in larger litters). There are circumstances where only one or a few of the fetuses/pups in a litter are affected. Therefore, as with other developmental end points, growth should be assessed on both an individual and a litter basis rather than relying solely on group mean values. Pups that are substantially smaller than their littermates are referred to as runts.
In studies where gestation length has been affected, it is recommended that pup body weights be assessed based on the day of conception (gestation day 0) rather than from the day of birth (postnatal day 0). By doing this, differences between litters in gestation length and age at birth are eliminated. Male and female body weights should be summarized separately. These, along with data on the sex ratio (males/females) of the litters, can indicate whether or not one gender is preferentially affected by a particular test material. Although there are reports of such gender sensitivity, such occurrences are very rare. Altered growth may be induced at any stage of development and may be permanent or reversible. For example, the mechanism responsible for fetal body weight reductions in a segment II study may not be carried into the postnatal period, where pup weight may increase parallel to control weights and, in some cases, catch up. Therefore, it is important to examine the data from each day on which body weights were taken to differentiate when reductions or losses in body weight actually occurred. Failure to recover from growth retardation is
considered permanent stunting. A permanent weight change is usually considered more worrisome than a transitory change, although transitory changes should not be completely dismissed since little is known about the long-term consequences of short-term fetal or neonatal growth inhibition [31]. In utero growth retardation is often accompanied by delays or reductions in skeletal ossification and, in some cases, increases in minor skeletal variations. This is often observed with test agents that exert their effects through maternal toxicity. Agents known to interfere directly with fetal nutrition, growth factors, and other developmental processes may also induce growth retardation by a direct effect on the fetus [108]. Postnatally, growth can be affected by alterations in maternal care, milk production, or milk ejection reflexes, or by direct effects on pup suckling behavior or continued test material exposure through milk.
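The recommendation in this subsection to assess growth on a litter basis, with males and females summarized separately, can be sketched as follows. The record layout (litter identifier, sex, weight), the function names, and the weights are hypothetical; the sketch only illustrates using the litter mean, rather than the individual pup, as the unit for the group summary.

```python
from statistics import mean


def litter_means(pups):
    """pups: iterable of (litter_id, sex, weight_g).

    Returns {(litter_id, sex): mean weight} so each litter contributes one
    value per sex, regardless of how many pups it contains.
    """
    by_key = {}
    for litter_id, sex, weight in pups:
        by_key.setdefault((litter_id, sex), []).append(weight)
    return {key: mean(values) for key, values in by_key.items()}


def group_mean_of_litter_means(pups, sex):
    """Group mean for one sex, computed from litter means."""
    values = [m for (_, s), m in litter_means(pups).items() if s == sex]
    return mean(values)


# Hypothetical pup weights (g) from two litters in one dosage group.
pups = [("L1", "M", 6.0), ("L1", "M", 6.4), ("L1", "F", 5.8),
        ("L2", "M", 5.0), ("L2", "F", 5.2), ("L2", "F", 4.8)]
print(group_mean_of_litter_means(pups, "M"))
print(group_mean_of_litter_means(pups, "F"))
```

Individual values should still be reviewed alongside these summaries, since a single affected pup in a litter can be hidden by the litter mean.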
(c) Structural Abnormalities. Highly potent teratogens that consistently induce a large incidence of malformations in offspring are rare [4,6]. Most agents that are tested have a low potential for causing malformations and/or very specific time and dosage requirements that are not reproduced in a standard study design. Therefore, the chances of observing malformations at a rate high enough to be distinguished against the normal background (i.e., historical control) are small. More often than not, it is adverse effects on growth and viability of the conceptus that flag a compound as a developmental toxin [31,56,109]. When trying to assess whether the presence of malformations in a study represents treatment-induced abnormalities, there are some basic concepts that may help [4]. First, almost all types of malformations have been reported to occur spontaneously in rodents and/or rabbits. Secondly, known chemical teratogens produce more or less specific patterns of multiple defects; isolated abnormalities are not the rule. Thus, it is very important to review not only the summary incidence of anomalies but also individual fetal data for the occurrence of patterns of multiple malformations. A general teratogen, such as a cytotoxic chemotherapeutic agent, usually will produce a whole spectrum of defects, dependent upon the time of administration, since all organs are equally susceptible. With all teratogens, some degree of uniformity in induction of malformations is seen. Finally, chemically induced malformations are usually bilateral in the case of paired organs. A significant change in the incidence may be manifested as an increase in malformed offspring per litter, the number of litters with malformed offspring, or the number of offspring or litters with a specific type or pattern of malformation(s) that appears to increase in a dose-response pattern.
The incidence rates should always be compared with the background rate in the historical control data, and statistical significance should not be a limiting factor in deciding whether or not the presence of a particular malformation is test material-related.
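The different ways of expressing malformation incidence described above (malformed offspring per litter, number of litters affected, overall fetal incidence) can be computed side by side. This is a minimal sketch; the function name, the input layout of (fetuses examined, fetuses malformed) per litter, and the counts are hypothetical.

```python
def malformation_summary(litters):
    """litters: list of (fetuses_examined, fetuses_malformed), one per litter."""
    examined = [(n, m) for n, m in litters if n > 0]
    if not examined:
        return None  # no litters with fetuses to evaluate
    per_litter_pct = [100.0 * m / n for n, m in examined]
    total_fetuses = sum(n for n, _ in examined)
    total_malformed = sum(m for _, m in examined)
    return {
        "fetal_incidence_pct": 100.0 * total_malformed / total_fetuses,
        "litters_affected": sum(1 for _, m in examined if m > 0),
        "litters_examined": len(examined),
        "mean_pct_malformed_per_litter": sum(per_litter_pct) / len(per_litter_pct),
    }


# Hypothetical counts for four litters in one dosage group.
summary = malformation_summary([(10, 0), (12, 3), (8, 0), (10, 1)])
print(summary)
```

The same summary would be prepared for each specific malformation type or pattern and set against the historical control range, as the text recommends.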
Developmental variations, because they are not life-threatening, occur normally at a much higher rate than major malformations. Developmental variations are not considered true terata. Changes in the incidence of variations are commonly induced at dosages below those causing malformations and can be precursors to teratogenic dosages. It has been suggested that this is because some variations are the result of compensatory mechanisms invoked to halt the progress to malformation and death [110]. Variations can also be induced by agents that are not considered teratogenic. However, their presence, with or without increases in malformations, is usually considered indicative of a developmentally toxic effect. Under some circumstances, the induction of skeletal variations, such as supernumerary ribs and wavy ribs, is considered the result of maternal toxicity or stress rather than a direct toxic effect on the conceptus [111-114]. Findings related to skeletal ossification can be summarized separately from other developmental variations. Reductions in skeletal ossification rates are usually merely delays in the ossification process and are usually interpreted as being related to growth retardation.
(d) Offspring Function. Investigation into possible functional changes resulting from test material exposure during gestation and/or lactation is limited in standard DART testing. Currently, the guidelines requiring such end points include measurements only for changes in maturational and behavioral indices and reproductive competence. Other systems (e.g., cardiopulmonary, immune, urinary, etc.) are not required to be tested for functional competence postnatally. Maturational indices (e.g., eye opening, hair growth, pinna detachment) and reflex indices (e.g., righting reflex, startle reflex) are measured as time-dependent/growth-dependent phenomena.
It is recommended that the development of these parameters be presented as postcoital age (from gestation day 0) rather than postpartum age (from postnatal day 0). By doing this, differences among litters in gestation length and pup age are eliminated. Group means, litter means, and individual data should be used when evaluating possible effects. These end points generally display little intraspecies and intralitter variability and are relatively sensitive end points. Delays in the attainment of maturational and reflex parameters are usually associated with reductions in growth of the offspring. The significance of delays in the attainment of these end points is dependent upon how many of the end points are affected, whether both sexes are affected, and how long a delay was observed. Behavioral testing, including basal and exploratory activity monitoring and learning/memory tests, is conducted over a single day or a few days and is generally conducted when the offspring are older. The interpretation of behavioral data is often difficult and limited due to the lack of knowledge about the underlying toxicological mechanisms and their significance [31,115]. These types of tests
also display wide intraspecies variability. However, alterations in behavioral parameters at dosage levels below those causing other developmental injury or maternal toxicity would be of considerable concern, particularly if they were permanent. Significant changes in behavioral testing will often warrant follow-up in more sophisticated studies, including brain chemistry and histopathology, to further characterize the test material. Evaluation of possible chemically induced effects on offspring reproductive competence can include maturational indices such as vaginal opening and testes descent or preputial separation as a measure of puberty, initiation and cyclicity of the female estrous cycle, and mating procedures at maturity such as those outlined for the parental adults [116-118]. All of these data should be evaluated as described above for similar parameters, including both individual and group mean data. Significant alterations in these parameters can be indicative of in utero and/or postnatal test material effects on sexual differentiation of the brain, altered endocrine status, and possible injury to offspring reproductive tissues and germ cells.
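The recommendation above to express maturational landmarks in postcoital rather than postpartum days can be sketched as follows. This is a minimal illustration that assumes birth is recorded as the gestation day on which delivery occurred and that the day of birth is postnatal day 0; the function name and the litter values are hypothetical.

```python
def postcoital_day(gestation_day_of_birth, postnatal_day):
    """Age counted from gestation day 0 (assumes the day of birth is
    postnatal day 0; hypothetical helper for illustration)."""
    return gestation_day_of_birth + postnatal_day


# Two hypothetical rat litters: litter A delivered on gestation day 21,
# litter B on gestation day 22. Eye opening was seen on postnatal day 15
# in litter A and postnatal day 14 in litter B.
litter_a = postcoital_day(21, 15)
litter_b = postcoital_day(22, 14)
print(litter_a, litter_b)  # same postcoital age despite different postnatal days
```

On the postpartum scale the two litters appear to differ by one day, whereas on the postcoital scale the landmark is reached at the same developmental age.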
(e) Possible Follow-Up Studies. The approaches to evaluating possible mechanisms responsible for developmental toxicity, including teratogenicity, are as numerous as the differing types of effects that are possible, ranging from investigations of possible effects on uterine blood flow to receptor-specific targeting in fetal organs [24,119,120]. One of the most useful tools in distinguishing whether a test material is acting in utero via toxicity to the mother or directly acting on the fetus is in vitro embryo/fetal culture [121-123]. In the postnatal period this can be differentiated using cross-fostering with untreated animals [124]. Maternal behavior can also be scored for measures of anxiety and aggression, nest building, pup retrieval, and other markers [125,126].
C. Risk Assessment
Risk assessment and extrapolation of animal data to human risk is a complicated process and can be very different depending on the use, expected human exposure levels, and expected exposed population for the agent in question (e.g., drug, occupational chemical, or environmental pollutant) [28,29,31,32,55,56]. Thus, it is not possible to examine this in any great detail in this review. An accurate evaluation must include consideration of all relevant data for the test material, such as acute and chronic toxicity and target organ data, all possible mechanism(s) of action, metabolism and pharmacokinetic data, as well as all available DART data. The availability of data in humans can greatly enhance the reliability and accuracy of the risk assessment, but most often risk assessment must rely solely on animal data. The first step in any risk assessment is a review of the data, including judgment on the quality of the studies available and their pertinence to the risk assessment. This judgment is based on a number of factors including, but not limited to, consistency of results, reproducibility of results in the same species, number of species studied, concordance of effects among species (including humans), demonstrated dose-response relationships, applicability of the various routes of exposure, power of the study to detect a positive result, and the type of end points measured. Each adverse effect, whether reduced male fertility, embryolethality, or pup behavioral changes, may show a slightly differently shaped response curve. The lowest observed effect level (LOEL) or lowest observed adverse effect level (LOAEL) for the study is the lowest dosage inducing at least one type of adverse effect. The no observed effect level (NOEL) or no observed adverse effect level (NOAEL) is the highest dosage showing no adverse effect. As appropriate, an adult systemic, adult reproductive, and/or a developmental LOEL/NOEL should be identified.
In the absence of specific data to the contrary, any adverse DART effects seen in the animal studies are presumed to indicate a potential risk to human reproduction. Some end points may be given greater weight in the assessment process than others. An effect perceived to be reversible and non-life-threatening would be of less concern than clearly severe, life-threatening effects. The sensitivity of a particular end point is also considered. The presumption is maintained even if an end point has no counterpart in the human, since concordance of effects is usually an exception rather than the general rule. The lack of uniformity of effects between species is not unexpected if one considers the many critical differences that exist between the conditions of human exposure and those used in animal studies (e.g., the exaggerated dosage levels used in experimental animals; controlled environment and exposure to a single agent; species differences in pharmacokinetics, placentation, and pregnancy maintenance).
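The LOAEL/NOAEL definitions above can be illustrated with a short sketch. It assumes that each dose group has already been judged as showing or not showing an adverse effect for a given class of end point, and it takes the NOAEL as the highest dose without an adverse effect that lies below the LOAEL (a common convention, stated here as an assumption). The function name and the study data are hypothetical.

```python
def find_loael_noael(groups):
    """groups: list of (dose, has_adverse_effect) pairs for treated groups.
    Returns (LOAEL, NOAEL); either may be None if it cannot be identified."""
    adverse_doses = [dose for dose, adverse in groups if adverse]
    loael = min(adverse_doses) if adverse_doses else None
    # Assumption: only effect-free doses below the LOAEL qualify as the NOAEL.
    clean_doses = [
        dose for dose, adverse in groups
        if not adverse and (loael is None or dose < loael)
    ]
    noael = max(clean_doses) if clean_doses else None
    return loael, noael


# Hypothetical study with doses in mg/kg/day, judged separately for
# adult systemic and developmental end points.
study = {
    "adult systemic": [(10, False), (30, False), (100, True)],
    "developmental": [(10, False), (30, True), (100, True)],
}
levels = {end_point: find_loael_noael(groups) for end_point, groups in study.items()}
print(levels)
```

Keeping the end point classes separate mirrors the text's point that adult systemic, adult reproductive, and developmental levels should each be identified as appropriate.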
Most toxicity studies in laboratory animals express the response as a function of the dosage administered, without data on actual systemic exposure (i.e., levels of test material in the blood). Yet, it is not uncommon to observe marked species differences in absorption, distribution, and elimination of a chemical. When extrapolating from animals to humans, the absence of such comparative pharmacokinetic data increases the uncertainty of the extrapolation. This is especially problematic when extrapolating not just across species but also across differing routes of administration. Another default assumption, in the absence of pharmacokinetic data, is that treatment of the animal by any route is relevant to human exposure. Also in the absence of adequate pharmacokinetic data, the most sensitive species (i.e., the species with the lowest NOEL) is used in the risk assessment, since humans are generally considered as sensitive as the most sensitive species tested, particularly in developmental toxicity [109]. It can be very beneficial to obtain pharmacokinetic data on a test compound when comparative human data are
available. There are many examples of toxicants that require biotransformation before demonstrating any toxic effects. If this biotransformation pathway were shown to be absent in humans, then the toxicity data from the animal studies in that species would be considered irrelevant to human risk assessment. The same holds true for animal data on mechanism(s) of action, which may or may not be relevant to humans. Actual acceptable levels of exposure for chemicals, other than drugs, vary with the nature of the chemical, the purpose for which it will be used (e.g., food additives, pesticides, and pollutants), and the type of benefits expected. There are no mathematical models that are generally accepted for estimating possible DART response below experimental dosage levels. Defining a NOEL in a DART study does not prove or disprove the existence of a threshold; it only defines the highest level of exposure under the conditions of the test that is not associated with an adverse effect [31]. Calculating acceptable exposure limits in DART risk assessment is not a precise science but rather an approximation, with allowable exposure concentrations set at a specified fraction of the NOEL based on the available data. The fraction is determined by the use of various "safety factors" as applied using the "uncertainty factor" approach or the "margin of safety" approach. There has been some debate in recent years on the use of the NOEL in setting safe exposure levels, since reliance on the NOEL places substantial importance on the power of a study to detect low-dose effects. Methods for replacing the NOEL with a calculated "benchmark dose" are under consideration [127-130]. The margin of safety approach derives a ratio of the NOEL from the most sensitive species to the estimated human exposure level from all potential sources.
The adequacy of the margin of safety is then considered based on the weight of evidence, including the nature and quality of the hazard (toxicology data) and exposure data, the number of species tested, dose-response relationships, and other applicable factors. In the uncertainty factor approach, the size of the applied safety factor varies based on interspecies differences, the nature and extent of human exposure, the slope of the dose-response curve, the types of end points affected, and the relative dose levels for systemic toxicity vs. those producing developmental or reproductive toxicity in the most sensitive test species. The safety factor applied can range from 10 to 10,000 depending on the number of adjustments needed. Possibilities include 10 for situations in which the study LOEL must be used because a NOEL was not established, 10 for interspecies extrapolation, and 10 for intraspecies adjustment for variable sensitivity among individuals. Additional adjustments may be applied for length of exposure (acute to subchronic) or to correct for inadequacy of the NOEL or LOEL or insensitivity of the end point on which it is based. Once the uncertainty factor is selected, the NOEL is divided by it to obtain the acceptable exposure level. Food additives and pesticides
generally are given at least 100-fold safety margins. Safety margins for environmental pollutants, on the other hand, are generally 1000-fold. Acceptable levels of exposure for human drugs are not generally calculated and used in the same way as they are for chemicals. The margin of safety between toxic exposure and therapeutic dosage levels can be as low as 10-fold or even 1-fold for highly beneficial drugs. The U.S. Food and Drug Administration uses pregnancy categories and labeling requirements based on available animal and human DART data [131,132], leaving the risk/benefit decision up to the physicians and their patients. The pregnancy categories are:
A. Controlled human studies show no risk. Adequate, well-controlled studies in pregnant women have failed to demonstrate risk to the fetus.
B. No evidence of risk in humans. Either animal findings show risk, but human findings do not, or, if no adequate human studies have been done, animal findings are negative.
C. Risk cannot be ruled out. Human studies are lacking, and animal studies are either positive for fetal risk or lacking as well. However, potential benefits may justify the potential risk.
D. Positive evidence of risk. Investigation or postmarketing data show risk to the fetus. Nevertheless, potential benefits may outweigh the potential risk.
X. Contraindicated in pregnancy. Studies in animals or humans, or investigational or postmarketing reports, have shown fetal risk that clearly outweighs any possible benefit to the patient.
Recent suggestions on how to improve the information passed on in drug labels and the risk/benefit process for doctors and patients have been published [133].
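The arithmetic behind the uncertainty factor and margin of safety approaches described earlier in this section can be sketched as follows. The function names, the NOEL, the exposure estimate, and the choice of factors are all hypothetical and serve only to show the calculation; they are not regulatory values.

```python
from math import prod


def acceptable_exposure(noel, uncertainty_factors):
    """Divide the NOEL (or LOEL) by the product of the selected
    uncertainty factors. Returns (acceptable level, combined factor)."""
    combined = prod(uncertainty_factors.values())
    return noel / combined, combined


def margin_of_safety(noel, estimated_human_exposure):
    """Ratio of the NOEL in the most sensitive species to the estimated
    human exposure from all sources (same units for both)."""
    return noel / estimated_human_exposure


# Hypothetical example: NOEL of 50 mg/kg/day in the most sensitive species,
# with 10-fold factors for interspecies and intraspecies differences.
factors = {"interspecies": 10, "intraspecies": 10}
limit, combined = acceptable_exposure(50.0, factors)
mos = margin_of_safety(50.0, 0.5)  # assumed human exposure of 0.5 mg/kg/day
print(f"combined factor {combined}, acceptable level {limit} mg/kg/day, margin {mos:.0f}-fold")
```

If only a LOEL were available, a further factor of 10 would be added to the dictionary, giving a combined factor of 1000 under the same assumptions.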
IV. CONCLUSION
Reproduction and development of offspring are the primary goals of genetics and involve a complicated and not fully known interplay of physiology, endocrinology, and biology. Many of these reproductive and developmental processes are measured indirectly in standard DART studies by the ultimate production of viable and reproductively competent offspring. The large number of chemicals needing to be screened each year makes more detailed investigations into each impossible. Government requirements for testing of possible chemically induced effects on reproduction are dependent on the end use, exposure potential, target population, and perceived benefit of the chemical. The laboratory animals used in these studies are not perfect models. However, daily research is increasing our understanding of these animal models, aiding in the extrapolation of potential risk to human populations. Ultimately, a clear understanding of the mechanism(s)
inducing chemically related reproductive and developmental toxicity will provide the most accurate extrapolation of potential risk to humans.
APPENDIX: GLOSSARY
Abortifacient: An agent that causes abortion.
Abortion: Premature expulsion from the uterus of products of conception (embryo or a nonviable fetus). Rodents cannot abort. Rabbits do, beginning at approximately 20 days; by day 28, expulsions are considered an early delivery (see Premature birth) due to the survivability of the fetus. [embryotocia]
Agalactia: Absence or failure of the secretion of milk.
Amenorrhea: Absence or abnormal cessation of the menses or estrous cycles.
Amnion: The inner of the fetal membranes; a thin, transparent sac that holds the fetus suspended in the amniotic fluid.
Androgen: Class of steroid hormones produced in the gonads and adrenal cortex that regulate masculine sexual characteristics; a generic term for agents that encourage the development of or prevent changes in male sex characteristics; a precursor of estrogens.
Anovulation: Suspension or cessation of the release of ova from the follicles.
Blastocyst: Mammalian conceptus in the postmorula stage; spherical structure produced by cleavage of the fertilized ovum, consisting of a single layer of cells (blastoderm) surrounding a fluid-filled cavity (blastocele). [blastula; blastosphere]
Cannibalism: Consumption of one by the other; in the reproductive context, the eating of offspring by the mother.
Cesarean section: Laparotomy to deliver fetus(es); usually performed on day 18 (mouse), day 20 (rat), day 29 (rabbit).
Chorion: Outermost of the fetal membranes, consisting of an outer trophoblastic epithelium lined internally by extraembryonic mesoderm; its villous portion, vascularized by allantoic blood vessels, forms the placenta.
Conceptus: Sum of derivatives of the fertilized ovum at any stage of development from fertilization until birth, including the embryo/fetus and the extraembryonic membranes.
Copulation: Mating. [coitus]
Copulatory interval: Time (mean days) animals cohabit until mating occurs. [precoital time]
Corpora lutea (corpus luteum, sing.): Endocrine bodies formed in the ovary at the site of ruptured Graafian follicles; they appear as small, raised, translucent, creamy or yellowish bumps resembling small grapes on the surface of the ovary (see also Luteinization) and secrete estrogenic and progestagenic hormones.
Culling: In the reproductive context, the selective removal of young in a litter, usually to 8 (in rat), in order to reduce intralitter competition in nursing.
Decidualization: Changes in uterine tissues in preparation for embryo implantation; characterized by extensive proliferation and differentiation of endometrial stromal cells.
Developmental variations: For practical purposes, variations are defined as those alterations in anatomical structure that are considered to have no significant biological effect on animal health or body conformity, representing slight deviations from normal. Most examples placed in this category are minor variations in size and form of normally present ossification centers. While these are evaluated on a precise day of development, some variation is expected due to when conception and implantation actually occurred. Thus differences in the pattern of ossification, manifested either as retardation or as acceleration of apparent osteogenesis, are common findings. Also included in this category are slight misshapening or misalignment of structures, processes involving continued development (bilateral skeletal centers not yet fused, incomplete maturation of renal papillae, presence of vestigial structures, etc.), and development of extra ossification sites. Slight malpositioning and hypoplasia are also considered variations in development. Multiple variations are in many instances observed concomitantly with maternal and/or developmental toxicity.
Dystocia: Prolonged, abnormal, or difficult delivery. [parodynia]
Early delivery: Expulsion of a survivable conceptus prior to normal term parturition. In rodents and rabbits, expulsion 2 days prior to term is considered an early delivery.
Embryo: Early or developing stage of an organism, especially the developing product of fertilization of an egg. Its development is termed embryogenesis. The embryonic period is as follows:
  Rat      9-14 days postconception
  Mouse    6-13
  Rabbit   10-15
  Human    21-55
Epididymis: Secondary sex organ through which spermatozoa pass and in which spermatozoa acquire the ability to become motile and to fertilize; the distal segment (cauda) is also a site for storage of spermatozoa.
Estradiol: Estrogenic hormone produced by follicle cells of the vertebrate ovary; provokes estrus and proliferation of the human endometrium.
Estrogen: Estrogenic hormone; generic term for various natural or synthetic substances that produce estrus.
Estrous cycle: Recurring periods of estrus in the adult female of most mammals and the correlated changes in the reproductive tract from one period to the next. Stages are proestrus, estrus, diestrus, and metestrus.
Fecundity: Physiological ability to reproduce (as opposed to fertility); the biological competence of the reproductive system.
Fertility: Capacity to conceive or induce conception following mating.
Fertilization: Act of rendering gametes fertile or capable of further development. Begins with contact between spermatozoa and ovum, leading to their fusion, which stimulates the completion of ovum maturation with release of a second polar body. Male and female pronuclei then form and perhaps merge; synapsis follows, which restores the diploid number of chromosomes and results in biparental inheritance and the determination of sex. The process leads to the formation of a zygote and ends with the initiation of its cleavage.
Fetus: Unborn offspring; the developing conceptus following embryogenesis. The fetal period is as follows:
  Rat      15-32 postconception days
  Mouse    14-20
  Rabbit   16-32
  Human    56-238
Follicle: Small excretory or secretory sac or gland; one of the vascular bodies in the ovary, containing the oocytes.
  Atretic follicle: A Graafian follicle which has involuted.
  Graafian follicle: Mature mammalian ovum with its surrounding epithelial cells.
  Ovarian follicle: The egg and its encasing cells at any stage of development.
  Primordial follicle: Ovarian follicle consisting of an undeveloped egg enclosed by a single layer of cells.
Follicle-stimulating hormone (FSH): Glycoprotein hormone secreted by the anterior pituitary that promotes spermatogenesis and stimulates growth and secretion of the Graafian follicle.
Gamete: One of two cells produced by a gametocyte, male (spermatozoon) and female (ovum), whose union is necessary in sexual reproduction; a 1N (haploid) cell in sexual fusion.
Gastrula: Early embryonic stage which follows the blastula. The simplest type consists of two layers, the ectoderm and mesectoderm, and two cavities, one lying between the ectoderm and the entoderm, the other (the archenteron) formed by invagination so as to lie within the entoderm and having an opening (the blastopore).
Gastrulation: Process by which a blastula becomes a gastrula or, in forms without a true blastula, the process by which three germ cell layers are acquired.
Gestation: Period of intrauterine development from conception to birth.
Gonadotropin: Substance that acts to stimulate the gonads.
Gravid: Pregnant.
Hydramnios: Altered volume of amniotic fluid.
  Oligohydramnios: Reduced quantity at term.
  Polyhydramnios: Increased quantity at term.
Hydrometra: Excess fluid (clear, colorless) in the uterus. [uterine dropsy]
Implantation: Attachment of the blastocyst to the epithelial lining of the uterus, its penetration through the epithelium, and its embedding in the compact layer of the endometrium. [nidation]
Infertility: Absence of the ability to conceive, or to induce conception.
Lactation: Secretion of milk; the period following birth during which milk is formed.
Luteinization: Process taking place in the ovarian follicle cells which have matured and discharged their ova: the cells become hypertrophied and assume a yellow or cream color, the follicles becoming corpora lutea.
Luteinizing hormone (LH): Glycoprotein hormone secreted by the adenohypophysis that stimulates hormone production by interstitial cells of gonads.
Malformation: Defective or abnormal formation; anatomical or morphological abnormality; deformity [birth defect; congenital anomaly; dysmorphosis; cacomorphosis]; defined by the EPA as "a permanent deviation which generally is incompatible with or severely detrimental to normal postnatal survival or development" (Federal Register 51(185): 34028-34040, 1986). For practical purposes: "Malformations are those structural anomalies that alter general body conformity, disrupt or interfere with body function, or are generally thought to be incompatible with life. Specific examples of processes that result in maldevelopment include marked or severe misshapening, asymmetry, or irregularity of structure brought about by fusion, splitting, disarticulation, malalignment, hiatus, enlargement, lengthening, thickening, thinning, or branching. Absence (agenesis) of parts or whole structures is also considered a malformative process."
Multigravida: A female pregnant for the second (or more) time.
Multipara: A female that has had two or more pregnancies that resulted in birth of viable offspring.
Multiparous: Giving birth to several offspring at one time. [polytocous]
Morula: Solid mass of blastomeres (embryonic cells) formed by cleavage of a fertilized ovum.
Neonate: Newly born; the neonatal period in humans pertains to the first 4 wk after delivery.
Nulliparous: A female that has never borne viable offspring. [nulliparous]
Oligomenorrhea: Prolongation of menstrual cycle beyond average limits.
Oocyte: Female ovarian germ cell.
Organogenesis: Period of major organ development in the embryo/fetus; entire
period of organogenesis actually continues into the postnatal period, including development of the central nervous system.
Osteogenesis: Formation or development of the bone (skeleton).
Ovulation: Discharge of an ovum or ovule from a Graafian follicle in the ovary.
Parity: Condition of a female with respect to the number of pregnancies that resulted in viable offspring.
Parturition: Act or process of giving birth (delivery; labor).
Perinatal: Occurring shortly before, during, or shortly after birth.
Postpartum: After birth or delivery; postnatally (postparturition).
Premature birth: Birth prior to the expected time of an offspring capable of surviving ex utero.
  Rabbit: after gestation day 28 up to expected parturition (day 30).
  Mouse: prior to expected parturition (day 19).
  Rat: prior to expected parturition (day 21).
Primigravida: A female pregnant for the first time. [unigravida]
Primipara: A female who has borne but one offspring. [unipara]
Primiparous: Producing only one ovum or offspring at one time. [uniparous]
Progesterone: Steroid hormone produced in corpus luteum, placenta, testes, and adrenals that plays a physiological role in the luteal phase of menstrual cycles and maintenance of pregnancy; also an intermediate in biosynthesis of androgens and estrogens.
Prolactin: Protein hormone produced by the adenohypophysis that stimulates secretion of milk and promotes functional activity of the corpus luteum.
Pseudopregnancy: False pregnancy; the condition mimics that of pregnancy; in rats, accompanied by prolonged diestrus.
Puberty: Period at which generative organs become capable of reproduction.
Resorption: Conceptus that, having implanted in the uterus, subsequently died and is being or has been resorbed.
  Early resorption: Evidence of implantation without recognizable embryonic characteristics evident.
  Late resorption: Recognizable fetal features but undergoing evident autolysis.
Runt: Normally developed fetus or newborn significantly smaller than the rest of the litter.
Semen: Mixture of sperm and fluids from the excurrent ducts and accessory sex glands.
Seminiferous epithelium: Normal cellular components within the seminiferous tubule consisting of Sertoli cells, spermatogonia, primary spermatocytes, secondary spermatocytes, and spermatids.
Seminiferous tubules: Structures within the testes in which spermatozoa are produced and begin transport toward the excurrent ducts.
Sertoli cells: Cells in the testicular tubules providing support, protection, and nutrition for the spermatids.
Spermatogenesis: Process of formation of spermatozoa, including spermatocytogenesis and spermiogenesis; more specifically, the process by which type A spermatogonia germ cells periodically differentiate at a given point within a seminiferous tubule of the testis and divide to give rise to more differentiated spermatogonia and, ultimately, primary spermatocytes. The duration of spermatogenesis is the interval from this point until release of the resulting spermatozoa at spermiation.
Spermiation: Complex series of events during which sperm are separated from the seminiferous epithelium and released into the seminiferous tubule lumen.
Spermatogenic cycle [cycle of seminiferous epithelium]: Interval required for one complete series of cellular associations (spermatogenic stage) to appear at a fixed point within a tubule.
Spontaneous malformations: Normal background incidence of maldevelopment unrelated to known causes.
Steroidogenesis: Enzymatic steps converting acetate and cholesterol to sex steroids, glucosteroids, or mineralocorticoids.
Stillbirth: Birth of a dead fetus.
Teratogen: Agent or factor that causes the production of physical defects in the developing embryo; the production of defects is termed teratogenesis.
Term: Earliest time at which a fetus can survive outside the mother; fetus at the end of the gestation period (i.e., term fetus).
Testis: Male gonad that contains the seminiferous tubules, wherein spermatozoa are produced, and the interstitial cells, including Leydig cells, which produce androgenic hormones.
Testosterone: Biologically potent androgenic steroid released from the gonads and adrenal glands.
Vaginal plug: A mass of coagulated semen that forms in the vagina of some mammals after coitus. [copulation plug]
Weaning: Day on which an animal is separated from its mother; usually on postnatal day 21 in the rat.
Yolk sac: Extraembryonic membrane composed of endoderm and splanchnic mesoderm; it is the organ in which the first red blood cells are formed ("blood islands"). In rodents, it is the primary absorptive surface prior to formation of the placenta.
REFERENCES
1. P. Taylor, Practical Teratology, Academic Press, New York, 1986.
2. R. E. Chapin and J. J. Heindel, Methods in Toxicology. Male Reproductive Toxicology, Vol. 3A, Academic Press, New York, 1993.
3. R. E. Chapin and J. J. Heindel, Methods in Toxicology. Female Reproductive Toxicology, Vol. 3B, Academic Press, New York, 1993.
4. C. A. Kimmel and J. Buelke-Sam, Developmental Toxicology, Raven Press, New York, 1981.
5. R. J. Witorsch, Reproductive Toxicology, 2d Ed., Raven Press, New York, 1995.
6. L. D. Wise, S. L. Beck, D. Beltrame, B. K. Beyer, I. Chahoud, et al., Terminology of developmental abnormalities in common laboratory mammals (version 1), Teratology 55: 249, 1997.
7. International Conference on Harmonization: Guideline on Detection of Toxicity to Reproduction, Federal Register 59: 48746, issued September 22, 1994; and Draft Guideline on Detection of Toxicity to Reproduction: Addendum on Toxicity to Male Fertility, issued April, 1996.
8. U.S. FDA, Toxicological Principles for the Safety Assessment of Direct Food Additives and Color Additives Used in Food, Food and Drug Administration, Bureau of Foods No. PB83-170696, 1982, pp. 26, 27, 80-122 ("Red Book").
9. U.S. EPA, Toxic Substance Control Act Test Guidelines: Final Rules, Federal Register 50: 39252, 1985; Reproductive and Fertility Effects: Final Rules, Federal Register 50: 39432, 1987; and Toxic Substances Control Act Test Guidelines: Final Rules, Federal Register 62: 43820, 1997.
10. Organization for Economic Cooperation and Development, Guidelines for Testing of Chemicals, 1981. 414: Teratogenicity Testing. One-Generation Reproduction Study 1982, 415: 1-8 with addendum 1983; and Two-Generation Reproduction Study 1982, 416: 1-8 with addendum 1983.
11. U.S. EPA, Health Effects Test Guidelines, EPA Report No. 560/6-82-001, Office of Pesticides and Toxic Substances, issued August, 1982; and Revised Pesticide Assessment Guidelines, Series 83-3, Subdivision F: Hazards Evaluation: Human and Domestic Animals, EPA Report 540/9-82-025, Office of Pesticide Program, issued November, 1984.
12. Japanese Ministry of Agriculture, Forestry and Fisheries, Guidelines on Toxicology Study Data for Application of Agricultural Chemical Registration, 59 Nohsan No. 42000, Tokyo, Japan, 1985.
13. Canadian Ministry of Health and Welfare, Guidelines for Pesticide Toxicology Data Requirements. Reproduction Studies, Health Protection Branch, Health and Welfare, Toronto, Canada, 1981.
14. U.S. FDA, Target Animal Safety Guidelines for New Animal Drugs, Office of Scientific Evaluation, Bureau of Veterinary Medicine, Food and Drug Administration, November, 1983.
15. U.S. FDA, General Principles for Evaluating the Safety of Compounds Used in Food-Producing Animals, Office of Scientific Evaluation, Bureau of Veterinary Medicine, Food and Drug Administration, November, 1986.
16. Japanese Ministry of Agriculture, Forestry and Fisheries, Guidelines for Toxicity Studies of New Animal Drugs, Notification No. 63-44 of the Pharmaceutical Affairs Office, Tokyo, Japan, 1988.
17. M. W. Harris, R. E. Chapin, A. C. Lockhart et al., Assessment of a short-term reproductive and developmental toxicity screen, Fundam. Appl. Toxicol. 19: 186, 1992.
18. R. A. Scala, C. Bevan and B. K. Beyer, An abbreviated repeat dose and reproductive/developmental toxicity test for high production volume chemicals, Regul. Toxicol. Pharmacol. 16: 73, 1992.
19. D. Neubert, Benefits and limits of model systems in developmental biology and toxicology (in vitro techniques). In: Prevention of Physical and Mental Congenital Defects, (M. Marois et al., Eds.), Alan R. Liss, New York, 1985, p. 91.
20. D. Neubert, G. Blankenburg, C. Lewandowski and S. Klug, Misinterpretations of results and creation of "artifacts" in studies on developmental toxicity using systems simpler than in vivo systems. In: Developmental Mechanisms: Normal and Abnormal, Alan R. Liss, New York, 1985, p. 241.
21. A. H. Piersma, A. Verhoef and P. M. Dortant, Evaluation of the OECD 421 reproductive toxicity screening test protocol using butyl benzyl phthalate, Toxicology 99: 191, 1995.
22. L. E. Gray Jr. and R. J. Kavlock, An extended evaluation of an in vivo teratology screen utilizing postnatal growth and viability in the mouse, Teratogenesis Carcinog. Mutagen. 4: 403, 1984.
23. F. Homburger and A. N. Goldberg, In Vitro Embryotoxicity and Teratogenicity Tests, S. Karger, Basel, 1985.
24. F. Welsch, Approaches to Elucidate Mechanisms in Teratogenesis, Hemisphere Publ. Co., Washington, D.C., 1987.
25. H. Tuchmann-Duplessis, Selection of animal species for teratogenic drug testing. In: Methods in Prenatal Toxicology, (D. Neubert, H.-J. Merker and T. E. Kwashigrouch, Eds.), Georg Thieme Publ., Stuttgart, 1977, p. 25.
26. H. C. Stanton, Factors to consider when selecting animal models for postnatal teratology studies, J. Environ. Pathol. Toxicol. 2: 201, 1978.
27. R. P. Amann, Use of animal models for detecting specific alterations in reproduction, Fundam. Appl. Toxicol. 2: 13, 1982.
28. V. H. Frankos, FDA perspective on the use of teratology data for human risk assessment, Fundam. Appl. Toxicol. 5: 615, 1985.
29. T. Tanimura, Japanese perspectives on the reproductive and developmental toxicity evaluation of pharmaceuticals, J. Am. Coll. Toxicol. 9: 37, 1990.
30. European Chemical Industry Ecology and Toxicology Centre, Technical Report No. 21. A Guide to the Classification of Carcinogens, Mutagens and Teratogens Under the Sixth Amendment, Brussels, Belgium, 1986.
31. U.S. EPA, Guidelines for the Health Assessment of Suspected Developmental Toxicants, Federal Register 51: 34028, 1986; proposed amendments Federal Register 54: 9386, 1989.
32. J. A. Moore, G. P. Daston, E. Faustman, M. S. Golub, W. L. Hart, et al., An evaluation process for assessing human reproductive and developmental toxicity of agents, Reprod. Toxicol. 9: 61, 1995.
33. B. Ulbrich and A. K. Palmer, Detection of effects on male reproduction: a literature survey, J. Am. Coll. Toxicol. 14: 293, 1995.
34. S. Takayama, M. Akaike, K. Kawashima, M. Takahashi and Y. Kurokawa, A collaborative study in Japan on optimal treatment period and parameters for detection of male fertility disorders induced by drugs in rats, J. Am. Coll. Toxicol. 14: 266, 1995.
35. R. R. Fox and C. W. Laird, Sexual cycles. In: Reproduction and Breeding Techniques for Laboratory Animals, (E. S. E. Hafez, Ed.), Lea & Febiger, Philadelphia, 1970, p. 107.
Developmental and Reproductive Toxicology
249
36. R. E. Staples, Detection of visceral alterations in mammalian fetuses, Teratology 9: A37.1974. 37. M. V. Barrow and W. J. Taylor, A rapid method for detecting malformations in rat fetuses, J. Moryhol. 1 2 7 291, 1969. 38. J. L. Stuckhardtand S. M. Poppe, Fresh visceral exanlination of rat and rabbit fetuses usedinteratogenicitytesting, Teratogenesis Carcitlog. Mutagen. 4: 181. 1984. 39. J. G. Wilson, Methods for administering agents and detecting malformations in experimentalanimals.In: Teratology:Principles and Techniques, (J. G. Wilson and J. Warkany, Eds.), University of Chicago Press, Chicago. 1965, p. 263. of rat and rabbit fetuses for malformation of internal 40. H. Sterz. Routine examination organ combination of Barrow’s and Wilson’s methods. In: Methods i n Prerzatal To-vicology, (D.Neubert,H.-J.Merkerand T.E.Kwashigrouch,Eds.),Georg Thieme Publ., Stuttgart, 1977, p. 113. 41. H. Sterz andH. Lehmann, A critical comparison of the freehand razor-blade dissection method according to Wilson with an in situ sectioning method for rat fetuses, Tesatogenesis Cnrcitlog.Mzrtagen. 5: 347. 1985. 42. L. Machemer and E. G. Stenger, [Examination of fetuses in teratological experiments. Modification of “Wilson Technique”]. Arz-izeim. Forsch. 21: 144, 1971. 43. J. F. Faherty. B. A. Jackson and M. F. Greene. Surface staining of 1 mm (Wilson) slices of fetuses for internal visceral examination, Stuirz Teclznol. 47: 53, 1972. 44. E. B. van Julsingha and C. G. Bennett, A dissecting procedure for detection of anomalies in the rabbit fetal head. In: Methods in Prerzcrtcd Toxicology, (D. Neubert. H.-J. Merker and T. E. Kwashigrouch. Eds.), Georg Thieme Publ., Stuttgart, 1977, p.126. 45. E. Igarashi, New method for the detection of cardiovascular malformations in rat fetuses: Gelatin-embedding-slice method, Teratology 48: 329, 1983. 46. A. B. Dawson, A note on the staining of the skeleton of cleared specimens with AlizarinRed S, Stain Tedznol. I : 123. 1926. 47. R. P. 
Jensh and R. L. Brent RL, Rapid schedules for KOH-clearing and Alizarin Red S staining of fetal rat bone. Stcdirz Technol. 4Z: 179. 1966. 48. R. E. Staples and V. L. Schnell. Refinements in rapid clearing technique in the KOH-Alizarin Red S method for fetal bone, Stain Technol. 39: 61, 1964. 49. H. Fritz and R. Hess, Ossification of the rat and mouse skeleton in the perinatal period. Teratology 3: 33 1, 1970. 50. M. C. Marr, C. B. Meyers, J. D. George and C. P. Price, Comparison of single and double staining for evaluation of skeletal development: The effects of ethylene glycol (EG) in CD rats, Terntology 33: 476, 1988. 51. S. Kawamura et al, Bone-staining technique for fetal rat specimens without skinning and removing adipose tissue, Cong. Arzonz. 30: 93, 1990. 52. A. N. Hirschfield, Histological assessment of follicular development and its applicability to risk assessment, Reprod. Toxicol. I : 71, 1987. 53. L. D. Russell, R. A.Ettlin,A.P. S. HikimandE. D. Clegg, Histological a i d HistopcrthologicalEvalrtation of the Testis, CacheRiverPress,Clearwater.FL, 1990. TV and R. E. Chapin, Experimental modelsof male reproductive toxicol54. J. C. Lamb
250
55. 56. 57. .
58. 59. 60. 61.
62. 63.
64. 65. 66.
67. 68. 69. 70. 71. 72.
Keller
ogy. In: Erzdocrine Toxicology, (J. A. Thomas,K. S. Korach and J. A. McLachlan, Eds.), Raven Press, New York, 1985, p. 85. K. S. Rao, B. A. Schwetz and C. N. Park. Reproductive toxicity risk assessment of chemicals, Vet. H L ~Toxicol. . 23: 167, 1981. Interagency Regulatory Liaison Group Workshop on Reproductive Toxicity Risk Assessment, Emtroll. Health Pet-spect. 66: 193, 1986. S. C. Gad and C. S. Weil. Statistics for toxicologists. In: Principles a ~ Methods d of Toxicology (A. W. Hayes, Ed.), Raven Press, New York, 1982. p. 273. E. A. C. Shirley and R. Hickling, An evaluation of some statistical methods for analysing numbers of abnormalities found amongst litters in teratology studies. Biometrics 37: 819, 1981. T. R. Ten Have andT. Hartzel. Comparison of two approaches to analyzing correlated binary data in developmental toxicity studies, Teratology 52: 267. 1995. D. A. Savitz. Is statistical significance testing useful in interpreting data? Repl-od. Toxicol. 7: 95,1993. H. Fritz and K. Giese, Evaluation of the teratogenic potential of chemicals in the rat. Pharmacology 4O(Slrppl. 1): 1 . 1990. E. L. Feussner, G. E. Lightkep. R. A. Hennesy, A. M. Hoberman. M. S. Christian. A decade of rabbit fertility data: Study of historical control animals. Terntology 46: 349.1992. H. Morita et. al., Spontaneous malformations in laboratory animals: Frequency of external, internal and skeletal malformations in rats. rabbits and mice. Cong. Anonz. 27: 147,1987. R. Heywood and R. W. James. Current laboratory approaches for assessing male reproductive toxicity: Testicular toxicity in laboratory animals. In: Reproductille Toxicology. (R. L. Dixon. Ed.), Raven Press. New York, 1985, p. 147. P. C. May and C. E. Finch, Aging and responses to toxins in female reproductive functions, Reprod.Toxicol. I : 223.1988. W. Hansel and E. M. Convey, Physiology of the estrous cycle, J. Anim. Sci. 57: 404.1983. N. T. Adler. J. A. Resko and R. W. Goy. 
The effect of copulatory behavior on hormonal change in the female rat prior to implantation, Physiol. Behav. 5: 1003, 1970. R. V. Chester andL. Zucker. Influence of male copulatory behavior on sperm transport,pregnancyandpseudopregnancy in femalerats, Plzysiol.Behml. 5: 35, 1970. N. T. Adler and J. P. Toner, The effect of copulatory behavior on sperm transport and fertility in rats. A m N. Y. Acad. Sci. 474: 21, 1986. M. L.Meistrich,Quantitativecorrelationbetweentesticularsten1cellsurvival. sperm production and fertilityin the mouse after treatment with different cytotoxic agents. J. Andt-01. 3: 58, 1982. A. A. Gerall and R.E. McCrady, Receptivity scores of female rats stinulated either rnanually or by males, J. Etzdocl-inol. 46: 55, 1970. A. M.Etgen,1-(0-Chlorophyeny1)-1 -(p-chlorophenyl)2,2,3,-trichloroethane: A probe for studying estrogen and progestin receptor mediation of female sexual behavior and neuroendocrine responses. Endocrinology l l l : 1498, 1982.
Developmental and Reproductive Toxicology
251
73. D. L. Hess, Neuroendocrinology of female reproduction: Review, models, and potential approaches for risk assessment. Reprod. Toxicol. I : 139, 1988. 74. C. Desjardins, Endocrine regulation of reproductive development and function in the male. J. Anint. Sci. 37(S~rppl.2): 56, 1978. Federal 75. U.S. EPA. Proposed Guidelines for Assessing Male Reproductive Risk, Register 53: 24850, 1988. risk. Reg76. U.S. EPA, Proposed guidelines for assessing female reproductive Federlrl ister 53: 24834, 1988. 77. C. L. Hughes, Effects of phytoestrogens on GnRH-induced luteinizing hormone secretion in ovariectomized rats, Reprod. Toxicol. I : 179, 1988. 78. J. Laskey. E. Berman, H. Carter, J. Ferrell, Identification of toxicant induced alterations in steroid profiles using whole ovary culture, Toxicologist 2 2 : 111 , 1991. 79. S. D. Perreault, S. Jeffay, P. Poss and J. W. Laskey, Use of the fungicide carbendazim as a model compound to determine the impact of acute chemical exposure during oocyte maturation and fertilization on pregnancy outcome in the hamster, Toxicol. Appl. Pharmncol. I 14: 225, 1992. 80. J. S. Felton and R. L. Dobson, The mouse oocyte toxicity assay. In: Application of Short-Term Bioassays in the Analysis of Complex En\irorzmental Mixtures, (M. D. Waters, et. al., Eds.), Plenum Press, New York, 1982. 81. P. L. Nayudu, P. S. Kiesel, M. A. Nowshari and J. K. Hodges, Abnormal in vitro developnlent of ovarianfolliclesexplantedfrommiceexposedtotetrachlorovinphos, Reprod. Toxicol. 8: 261, 1994. 82. A.M. Cunmings,Toxicologicalmechanisms of implantationfailure, Fw~dunt. Appl. Toxicol. 15: 571,1990. 83. Y. Toyoda and M.C . Chang, Fertilization of rat eggs in vitro by epidiymal spermatozoaandthedevelopmentofeggsfollowingtransfer, 1. Reprod. Fertil. 36: 9,
1974. 84. A. R. Fuchs and R. Fuchs. Endocrinology of human parturition: A review. Brit. J. Obstet. Gynecol. 91: 948, 1984.
85. J. Neulen and M. Breckwoldt, Placental progesterone, prostaglandins and mechanisms leading to initiation of parturition in the human, Exp. Clin. Erzdocrinol. 102: 195,1994. 86. W. Y. Chan and D. L. Chen. Myometrial oxytocin receptors and prostaglandin in the parturition process in the rat, Biol. Reprod 46: 58, 1992. 87. D. R. Juberg,R. C. Webb and R. Loch-Caruso. Characterization of o,p’-DDT-stimulated contraction frequency in rat uterus in vitro, Fztnckm. Appl. Toxicol. 17: 543, 1991. 88. K. Criswell. R. Loch-Caruso and E. Stuenkel, Lindane inhibits gap junctional communication (GJC) in rat myometrial cells via a calcium-independent process, Toxicologist 13: 354, 1993. and C. Sutter. Effects of tiocona89. F. Latrille. J. Perrand. J. Stadler, A. M. Monroe B. zole on parturition and serum levels of 17b-oestradiol, progesterone, LH and PRL in the rat, Bioclzem. Phcrrrnncol. 36: 11 19, 1987. 90. J. G. Powell Jr and R. L. Cochrane, The effects of a number of non-steroidal antiinflammatorycompoundsonparturitionintherat, Prostnglandirzs 23: 469. 1983.
252
Keller
91. B. A. Schwetz, K. S. Rao and C. N. Park, Insensitivity of tests for reproductive problems. J. Emiron. Pathol. Toxicol. 3: 8 1, 1980. 92. R. Heywood and R. W. James, Assessment of testicular toxicity in laboratory animals, Etzviron.HealthPerspect. 23: 73, 1978. 93. W. F. Blazak. T. L. Ernst and B. E. Stewart, Potential indicators of reproductive toxicity: Testicular sperm production and epididymal sperm number, transit time. and motility in Fischer 344 rats. Fzcnd. Appl. Tosicol. 5: 1097. 1985. Arch. Patlzol. 94. W. E. Ribelin, Atrophy of rat testis as index of chemical toxicity. 75: 229,1963. 95. R. E. Chapin, D. K. Gulati, L. H. Barnes and J. L. Teague. The effects of feed restriction on reproductive function in Sprague-Dawley rats, Fundam. Appl. TOXicol. 20: 23, 1993. 96. R. K. Parshad, Effect of restricted feeding of prepubertal and adult male rats on fertility and sex ratio, 1rzciiar.1J. Exp. Biol. 31: 991. 1993. 97. T. C. Jones. U. Mohr and R. D. Hunt, Genital System. Springer Verlag, Berlin, 1987. 98. Creasy and P. M. D. Foster, Male Reproductive System. In:Handbook of Toxicologic Pathology, (W. M. Haschek and C. G. Rousseaux, Eds.), Academic Press, hc., New York, 1991, p. 829. Hmzdbook of ToxicologicPathology, 99. Yuan.FemaleReproductiveSystem.In: (W. M. Haschek and C. G. Rousseaux. Eds.). Academic Press, Inc., New York, 1991, p. 891. 100. Russell, R. A. Ettlin, A. P. Sinha Hikim and E.D. Clegg, Histological and HistoCacheRiverPress,Clearwater,FL, pathological E\ducrtion of theTestis. 1990. 101. D. R. Mattison, How xenobiotic compounds can destroy oocytes,Contemp. Obstet. Gynecol.15: 157,1980. 103. T. Pederson and H. Peters, Proposal for a classification of oocytes and follicles in the mouse ovary, J. Reprod. Fertil. 17: 555. 1968. D. R. Mattison, Comparison of 103. B . J. Smith, D. R. Plowchalk. I. G. Snipes and random and serial sections in assessment of ovarian toxicity, Reprod. Toxicol. 5: 379,1991. 104. P. L. Keyes, The corpus luteum, Znt. Rev. 
Physiol. 27: 57, 1983. 105. E. M. Johnson and M. S. Christian, When is a teratology study not an evaluation of teratogenicity? J. Am. Coll. To,uicol. 3: 431. 1984. 106. J. M. Manson, Teratogens. In:Cclsarett arzd Doidl’s Tosicology: TheBasic Science of Poisom, 3rd Ed., (C.D. Klaassen, M.0. Amdur andJ. Doull, Eds.), Macmillian Publ.Co..1986.p.195. 107. F. Beck and J. B. Lloyd, An investigation of the relationship between foetal death and foetal malformation, J. Anat. 97: 555. 1963. 108. R. Holson, P. J. Webb, T. F. Grafton and D. K. Hansen. Prenatal neuroleptic exposure and growth stunting in the rat: An in vivo and in vitro examination of sensitive periods and possible mechanisms. Teratology 50: 125, 1994. 109. J. L. Schardein andK. A. Keller, Potential human developmental toxicants and the role of animal testing in their identification and charaterization, CRC Crit. Rev. Tosicol. IO: 251. 1989.
Reproductive Toxicology Developmental and
253
110. A. K. Palmer, Assessment of current test procedures, Enriron. Health Perspect. 18: 97, 1976. 111. M. Yasuda and H. Maeda. Significance of the lumbar rib as an indicator in teratogenicity tests. Teratology 6: 124, 1972. 112. P. E. Beyer and N. Chernoff, The induction of supernumerary ribs in rodents: Role of maternal stress, Teratogenesis Carcinog. Mutagen. 6: 419, 1986. 113. H. F. P. Joosten, T. D. Yih, D. H. Waarlkens and A. Hoekstra, The relevance of wavy rib in teratological evaluation. Toxicol. Lett. Special Issue (Abstr.) 188: 109. 1980. 114. K. S. Khera, Pathogenesis of undulated ribs: a congenital malformation. Fmclnnz. Appl. Toxicol. 1: 13,1981. 115. C. A. Kimmel, The evaluation of behavioral teratogenic effects, Cong. Anom. 2 7 139,1987. 116. N. J. MacLusky and F. Naftolin, Sexual differentiation of the central nervous system. Scielzce 211: 1294. 1981. 117. H. F. Urbanski andS. R. Ojeda, Neuroendocrine mechanisms controlling the onset of female puberty, Reprod. Toxicol. 1: 129, 1987. 118. J. Rajfer and P. C. Walsh, Hormonal regulation of testicular descent: Experimental and clinical observations. J. Urol. 118: 985, 1977. Annu. Rev. Pharttza119. D. A. Beckman and R. L. Brent, Mechanisms of teratogenesis. col. To.xico1. 24: 485, 1984. 120. I. Wilmut, D. I. Sales and C. J. Ashworth, Maternal and embryonic factors associated with prenatal loss in mammals, J. Reprod. Fertil. 76: 851, 1986. 121. F. HomburgerandA.N.Goldberg. Irt Vitro Ernbryotoxicit?,andTeratogenicity Tests. S. Karger, Basal, 1985. 122. R. L. Brinster, Teratogen testing using preirnplanation mammalian embryos. In: Metlzods
New York, 1975, p. 113. 123. M. Dushnik-Levinson andN. Benvenisty, Embryogenesis in vitro: Study of differentiation of embryonic stem cells. Biol. Neonate 6 7 77. 1995. 124. R. M. McClain and R. M. Hoar, The effect of flunitrazepam on reproduction in the rat. The use of cross-fostering in the evaluation of postnatal parameters in rat reproduction studies, Toxicol. Appl. Phnrmncol. 53: 92, 1980. 125. D. Maestripieri, A. Badiani. S. Puglisi-Allegra, Prepartal chronic stress increases anxiety and decreases aggression in lactating female mice. Belzav. Nenrosci. 105: 663,1991. 126. A. L. Giordano, A. E. Johnson andJ. S. Rosenblatt, Haloperidol-induced disruption of retrieval behavior and reversal with apomorphine in lactating rats, Physiol. Behnv. 38: 21 l, 1990. E. M. Faustman, Dose-response 127. B.C.Allen, R. J.Kavlock.C.A.Kimmeland assessment for developmental toxicity.IT. Comparison of generic benchmark dose estimates with no observed adverse effect levels, Fundarn. Appl. Toxicol. 2-3: 487, 1994. 128. C. A. Kimnlel et. al., The application of benchmark dose methodology to data from prenatal development toxicity studies, Toxicol. Lett. 82/83: 549, 1995.
Keller
129. P. M. D. Foster and T. R. Auton, Application of benchmark dose risk assessment methodology to developmental toxicity: An industrial view, Toxicol. Lett. 82/83: 555, 1995. 130. K. S. Krump, Calculation of benchmark doses from continuous data, Risk. Anal. 15: 79, 1995. 131. U.S. FDA.Pregnancylabelling, FDA Drug Bull. 9: 23.1979. 132.U.S.FDA,PregnancyCategories, Federal Register 44: 37464.1979. 133. Teratology Society Public Affairs Committee, FDA classification of drugs for teratogenic risk. Teratology 49: 446, 1994. .134. U.S. FDA, Toxicologicnl Principles for the Safet?,Assessment of Direct Food Additives m d Color Additives Used in Food ("Redbook 11" Draft), Food and Drug Administration, Center for Food Safety and Applied Nutrition, 1993. 135. U.S. EPA. Revision of Prenatal Developmental Toxicity Study and Reproduction and Fertility Effects Testing Guidelines Under FIFRA and TSCA: Notice of Availability and Request for Comments, Federal Register 61: 8282. 1996.
Neurotoxicology
Walter P. Weisenburger
Pfizer, Inc., Groton, Connecticut
I. INTRODUCTION: WHAT IS NEUROTOXICITY?
Awareness and concern regarding the effects of chemicals and drugs on the nervous system and behavior have been heightened during the last three decades. This increased concern has occurred at the same time that the neurosciences have made great strides in understanding the way our nervous systems operate and how normal function can be altered by chemicals and drugs. Even the Congress of the United States has taken note of the issues. The 101st U.S. Congress designated the 1990s the “Decade of the Brain” after subcommittee hearings were held in 1985 on “Neurotoxins in the Home and in the Workplace” [1]. Scientific advances have shaped policy on the issue of neurotoxicity, and the present situation is that policy, as manifested by environmental law, regulations, and mandated testing for potential effects using published guidelines, is now driving much of the research in these areas.

Although other countries have recognized that nervous system disease might sometimes be the consequence of exposure to man-made chemicals and drugs, the U.S. Environmental Protection Agency (EPA) was the first regulatory agency to systematically propose, develop, and promulgate detailed and specific guidelines for the testing of compounds for neurotoxic potential. The 1982 Pesticide Assessment Guidelines [2] were revised, expanded in scope, and finally published in the current form in March, 1991 [3]. Today those guidelines still stand as the most specific and comprehensive instructions issued to direct the assessment of the neurotoxic potential of chemicals. This approach has been extended to the testing of chemicals regulated under the Toxic Substances Control Act (TSCA) [4]. The EPA neurotoxicity guidelines were developed to test pesticides
and chemicals used in industry when available evidence suggests that there could be a potential for neurotoxicity. Neurotoxicity testing is mandated for the registration of new pesticides, the reregistration of old pesticides, and on a case-by-case basis for chemicals used in industry for which a concern has been raised or when other members of the chemical class have been shown to be neurotoxic.

Neurotoxicity is an obvious concern in the development of new pharmaceutical entities. The risk-benefit analysis is quite different in that case as pharmaceuticals are developed specifically to assist in the amelioration or prevention of disease. No pharmaceutical regulatory agency in any country to date has specific requirements for the routine testing of new drugs for neurotoxicity, although nervous system signs are common side effects and many instances of drug-induced neurotoxicity are known. Nevertheless, the U.S. Food and Drug Administration (FDA) and comparable agencies around the world occasionally do ask for additional studies of candidate drugs to assess neurotoxic potential. Many of these requested studies, at least in the United States, are performed according to the tests specified in the EPA neurotoxicity guidelines and using criteria published in the guidelines to aid in test selection. For these reasons, the EPA guidelines will be used to develop the protocols to be presented in this chapter.

The definition of neurotoxicity accepted here is as it appears in the EPA neurotoxicity guidelines: "Neurotoxicity is any adverse effect on the structure or function of the nervous system related to exposure to a chemical substance" [3]. This broad definition includes effects on function as observed and/or inferred from the behavior of the animal or person, as well as lesions and other effects on the underlying structures of the central and peripheral nervous systems as observed macroscopically and microscopically with the various techniques of histopathology and neuropathology.
One cannot fully substitute one for the other, as behavior can be altered without any observable or detectable structural pathology and some nervous system lesions, particularly during the early stages of intoxication, do not have obvious functional correlates.

There has been a growing awareness since the early 1960s that developing animals and people can be considerably more affected than adults by exposure to harmful chemicals. Disruption or delay of development, either during gestation or during infancy or childhood, can cause irreversible changes in the nervous system. Early damage that is not rescued is often permanent. Probably the best known example of these effects in people is the fetal alcohol syndrome [5]. On the one hand, it is true that developing nervous systems possess a tremendous amount of "plasticity" and the function of a damaged or malformed neural structure can often be assumed by another structure. On the other hand, developing structures are more likely to suffer damage than structures that are fully developed at the time of insult. For precisely the latter reason, in the mid-1970s, regulatory agencies in Japan and the United Kingdom began requiring studies for new pharmaceutical agents that examine effects on the offspring of
treated pregnant rats. Those regulatory agencies identified the types of functions that were of interest in terms of “auditory, visual, and behavioral function” for the United Kingdom [6] and “motor and sensory activity, emotion, learning,...” etc. for Japan [7]. Those guidelines did not dictate specific tests to be performed, allowing for and encouraging greater use of judgment on the part of each investigator and more flexibility in designing studies appropriate for the specific drug under investigation. At least that was the intent. The reality was that a battery was rarely modified for individual compounds once that battery was adopted by a company or contract testing facility. A major problem for the pharmaceutical companies was that this kind of testing was expensive and differences in the two guidelines often resulted in many behavioral and functional measures being incorporated into all three of the reproductive segment studies (see Chapter 7 for descriptions of the segment studies) if the drug was intended for global marketing.

After many years of efforts at “harmonization,” the International Conference on Harmonization (ICH) Guidelines on Detection of Toxicity to Reproduction for Medicinal Products were adopted by the European Union and Japan prior to being published in the United States in 1994 [8]. The FDA never required such functional testing and currently “accepts” such testing without actually requiring it. This kind of testing is often referred to as developmental neurotoxicity testing, although pharmaceutical companies rarely use that term, usually referring to the study as the perinatal study or the segment III reproductive study. Studies conducted by the pharmaceutical industry to satisfy the various regulatory mandates from the mid-1970s until the present provided a very good foundation for the testing of chemicals currently required on a case-by-case basis by the EPA for pesticides and chemicals acknowledged to be toxic to adults. Protocols for studies to be done according to the EPA developmental neurotoxicity guidelines are covered in detail later in this chapter.
II. ADULT NEUROTOXICITY TESTING: STUDY DESIGN
The protocols presented here were summarized from the EPA Pesticide Assessment Guidelines [3] published in 1991 and recently extended to compounds covered by the Toxic Substances Control Act [4]. The purpose of these studies is to evaluate the potential effects on human health of the compound in question, with emphasis on potential toxicity to the nervous system. The neurotoxicity screening battery consists of a functional observational battery (FOB), assessment of spontaneous motor activity, and neuropathology examinations. Table 1 summarizes the essential features of the screening battery for adult rodents. The battery is intended to be used in conjunction with general toxicity study data and the resulting data are to be interpreted in light of other toxicologically relevant information. The battery is a first-tier screen and will not provide a complete evaluation of neurotoxic potential, nor will it provide sufficient information to determine mechanisms.

Table 1 Neurotoxicity Screening Battery for Adult Rats for Compounds Regulated by the U.S. Environmental Protection Agency

Functional observational battery: Autonomic signs (lacrimation, salivation, piloerection, exophthalmus, urination, defecation, pupillary function, palpebral closure); convulsions, tremors, abnormal movements; reactivity to handling, arousal; grip strength, landing foot splay; pain perception; posture and gait, unusual/abnormal behaviors, stereotypies, altered appearance; body temperature
Motor activity: Automated apparatus, animals tested individually, session long enough to approach asymptotic levels for last 20% of session
Timing of testing for FOB and motor activity:
    Acute studies: Before dosing, estimated time of peak effect within 8 hr of dosing, 7 and 14 days after dosing
    Subchronic studies: Before first dose; 4th, 8th, and 13th wk of exposure
    Chronic studies: Before first dose and every 3 mon thereafter
Neuropathology: End of study, at least 5/sex/group, in situ perfusion with aldehyde fixative, paraffin and/or plastic embedding for central nervous system tissues, plastic embedding for peripheral nervous system tissues, special stains as necessary (GFAP encouraged)
    1. Qualitative analysis: Nervous system regions affected, types of neuropathology due to test substance, range of severity of alterations
    2. Subjective analysis: Done if alterations found, determine dose response, evaluations without knowledge of treatment (blind)
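As an illustration only, the testing times listed in Table 1 can be written out as a small lookup. The following Python sketch is not part of the EPA guidelines; the function name `fob_test_occasions` and the `duration_months` parameter are hypothetical, and the occasions are returned as plain text labels rather than calendar dates.

```python
from typing import List


def fob_test_occasions(study_type: str, duration_months: int = 0) -> List[str]:
    """List the FOB and motor activity test occasions for one study type.

    The occasions follow the timing summarized in Table 1; the function
    itself is a hypothetical planning aid, not a regulatory tool.
    """
    if study_type == "acute":
        return [
            "before dosing",
            "estimated time of peak effect (within 8 hr of dosing)",
            "7 days after dosing",
            "14 days after dosing",
        ]
    if study_type == "subchronic":
        return ["before first dose", "week 4", "week 8", "week 13"]
    if study_type == "chronic":
        # Every 3 months after the first dose, up to the planned duration.
        months = range(3, duration_months + 1, 3)
        return ["before first dose"] + [f"month {m}" for m in months]
    raise ValueError(f"unknown study type: {study_type!r}")


if __name__ == "__main__":
    for occasion in fob_test_occasions("chronic", duration_months=12):
        print(occasion)
```

Keeping the schedule in one place like this makes it easier to check a draft protocol against the table before animals are ordered.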
A. Study Basics
The test material should be well characterized and in a stable form. The vehicle should provide for homogeneous dispersion of the test material as either a solution or a suspension and should not in itself be toxic at the levels administered.

In general, a standard strain of laboratory rat should be used. It is usually better to use the same strain and supplier of rats as was, or is intended to be,
used for other types of toxicity studies. This will facilitate the integration of results of the various studies. Occasionally another species may be more appropriate. If, for example, the dog or mouse is known to metabolize the compound of interest in a manner more similar to humans than the rat, that species would be more appropriate for testing. Certain parts of the battery must be modified to accommodate a change in species and this can be difficult. For purposes of the discussion here, use of the rat will be assumed.

A minimum of 10 males and 10 females is required for each dose and control group for behavioral testing. At least five of each sex for each dose and control group are required for terminal neuropathology assessment, although additional animals will be needed if interim evaluations are planned. Animals must be randomly assigned to treatment and control groups. The guidelines specify that young adults at least 42 days of age be used. A concurrent vehicle control group is required and all aspects of housing and handling should be the same as for the treated groups. If the vehicle is known to be toxic at the levels administered, an untreated or saline control group should be added to the design.

The guidelines emphasize the need for positive control data to be generated by the testing laboratory. Positive control data can either be collected concurrently with the test study or in separate studies. There are several reasons to conduct positive control studies. The most important reason is for the testing laboratory to demonstrate that the functional observational battery methods and motor activity measuring devices are capable of detecting differences in the relevant end points that are associated with neurotoxicity. For the FOB, major neurotoxic end points include limb weakness or paralysis, tremor, and autonomic signs. For motor activity assessments, the equipment and procedure must be capable of detecting both increases and decreases in activity.
Pharmacologically induced changes, as opposed to frank neurotoxicity, are usually acceptable to demonstrate testing competence. Positive control data for groups exhibiting central and peripheral nervous system neuropathology are also required.

Another important reason for conducting positive control studies is for the training of technical staff so that they can recognize and competently describe abnormal behavior when it occurs. Observing normal rat behavior is useful, but only observation of abnormal behavior will prepare technicians for the assessment of behavioral toxicity. The specialized techniques used in the histopathology examinations for the neurotoxicity studies also require considerable practice and skill beyond typical histopathology assessments. Untreated control group data from training studies may then be submitted to the agency as part of the laboratory’s historical control data. Historical control data are invaluable in evaluating the significance of effects observed in compound test studies. The guidelines suggest that positive control data be generated approximately once a year, assuming that laboratory conditions and personnel are constant, and more often if they are not.
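The group sizes and random assignment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the animal identifiers, group names, and fixed seeds are all hypothetical, and simple shuffling stands in for whatever randomization procedure a given laboratory's standard operating procedures call for.

```python
import random
from typing import Dict, List


def assign_to_groups(
    animal_ids: List[str], groups: List[str], per_group: int, seed: int
) -> Dict[str, List[str]]:
    """Randomly assign animals of one sex to dose and control groups."""
    needed = per_group * len(groups)
    if len(animal_ids) < needed:
        raise ValueError(f"need {needed} animals, got {len(animal_ids)}")
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    return {
        group: shuffled[i * per_group:(i + 1) * per_group]
        for i, group in enumerate(groups)
    }


# Hypothetical design: vehicle control plus three dose levels,
# 10 animals per sex per group for behavioral testing.
GROUPS = ["vehicle", "low", "mid", "high"]
males = [f"M{n:02d}" for n in range(1, 41)]
females = [f"F{n:02d}" for n in range(1, 41)]
male_groups = assign_to_groups(males, GROUPS, per_group=10, seed=1)
female_groups = assign_to_groups(females, GROUPS, per_group=10, seed=2)
```

Assigning each sex separately keeps the required minimum of 10 males and 10 females in every group.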
In addition to the vehicle control group and any other control groups deemed necessary, at least three dose levels of the test compound should be used. Equally spaced dose levels are recommended and a rationale for selection of the dose levels is required. A primary goal in dose level selection is the ability of the study to determine dose-effect relationships. The establishment of a benchmark dose will aid in the selection of dose levels for acute, single-dose studies. The benchmark dose is empirically estimated as the highest nonlethal dose as determined in a preliminary lethality study. The goal of the preliminary study is to determine a dose level that is clearly toxic. When such a dose level is determined, the other dose levels for the acute study can be successive fractions, e.g., 1/2 and 1/4 of the benchmark dose. If no toxicity is found during this procedure, a limit dose may be used for the high-dose level. Limit doses have been set as 2 g/kg body weight for acute studies and 1 g/kg for subchronic and chronic studies. It is highly unlikely that people would be exposed to, or would accidentally ingest, even a fraction of such high levels. High-dose levels for all of the study lengths should not be so high that the incidence of fatalities would interfere with evaluation of the data from that group. Otherwise, the high-dose level should ideally produce significant neurotoxicity or other toxic effects. The low-dose level should ideally produce minimal or no toxic effects.

Criteria for selecting the route of administration for neurotoxicity studies can include the most likely route of human exposure, bioavailability, practical considerations, the likelihood of observing effects, and the likelihood of producing nonspecific effects such as systemic toxicity. More than one route of exposure may be important. When this is the case, the route that best satisfies the criteria should be selected and a clear rationale should be included with the report.
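The arithmetic of the dose-level selection described above for acute studies can be illustrated with a short Python sketch. The function and constant names are hypothetical, the input values in the usage note are invented for the example, and applying the same 1/2 and 1/4 fractions below a limit dose is an assumption of this sketch rather than something the guidelines state.

```python
from typing import List, Optional

# Limit doses quoted in the text, converted from g/kg to mg/kg body weight.
LIMIT_DOSE_MG_PER_KG = {"acute": 2000.0, "subchronic": 1000.0, "chronic": 1000.0}


def acute_dose_levels(benchmark_mg_per_kg: Optional[float]) -> List[float]:
    """Return high, mid, and low dose levels (mg/kg) for an acute study.

    benchmark_mg_per_kg is the highest nonlethal dose from a preliminary
    lethality study, or None if no toxicity was found, in which case the
    acute limit dose is used as the high dose.
    """
    if benchmark_mg_per_kg is None:
        high = LIMIT_DOSE_MG_PER_KG["acute"]
    elif benchmark_mg_per_kg <= 0:
        raise ValueError("benchmark dose must be positive")
    else:
        high = benchmark_mg_per_kg
    # Successive fractions of the high dose: 1, 1/2, and 1/4.
    return [high, high / 2, high / 4]
```

For example, a benchmark dose of 800 mg/kg would give levels of 800, 400, and 200 mg/kg. Halving at each step yields doses that are equally spaced on a logarithmic scale.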
Exposures are usually daily for repeated-dose studies and administration in the diet is generally acceptable. Other regimens should be discussed with the relevant agency prior to the study start. Administration by inhalation is sometimes appropriate and weekday exposures (5 days/wk) are reasonable for practical logistic reasons. The neurotoxicity screening battery may be combined with any other toxicity study, provided that neither of the goals of the combined studies is compromised. Combining studies can lead to significant savings in terms of time and resources but the resulting combined study can be rather unwieldy to perform. The revised Organisation for Economic Co-operation and Development (OECD) Guidelines for the Testing of Chemicals advocate this approach for the 28-day repeated-dose toxicity study (guideline #407) and the 90-day repeated-dose toxicity study (guideline #408), both in rodents.
B. Parameters to be Measured
Standard measures used in general toxicity studies, such as body weight and food consumption, should be collected in neurotoxicity studies. Although the
guidelines do not specify if and when food consumption should be measured, periodic assessment of food consumption is often useful in interpreting the results of the study and is necessary to verify dosing of the compound in studies where administration is in the diet. The guidelines do specify that the animals should be weighed on each day of testing and at least once per week during the exposure period. In acute studies, FOB and motor activity testing should be conducted before dosing (not necessarily on the same day of dosing), at the estimated time of peak effect within 8 hr after dosing, and at 7 and 14 days after dosing. An estimation of the time of peak effect can be made by dosing small numbers of rats with a range of doses and making regular observations of gait and arousal. In subchronic studies, FOB and motor activity testing should be conducted before the first dose is administered and during weeks 4, 8, and 13 of exposure. In chronic studies, FOB and motor activity testing should be conducted before the first dose is administered and every 3 months. When behavioral testing is scheduled on a day when there is dosing, the behavioral testing should be conducted prior to dosing to minimize short-term pharmacological effects from the dose that might be interpreted as neurotoxicity. It is important to control for time of day when conducting FOB and motor activity testing. This can be accomplished by restricting testing to specific hours during the day, or better, by balancing testing of groups so that all groups are represented throughout the day of testing. The best procedure is to implement both strategies, although the numbers of animals to be tested can place practical constraints on that approach. The guidelines do not state that all animals in a study need to be tested concurrently.
A balanced replicate design, where two or more cohorts (replicates) that have each treatment and control group equally represented begin the study on different days (or weeks), can reduce otherwise long test days and will distribute the workload into a more sustainable schedule.
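One possible way to balance groups across the test day, as discussed above, is to build the testing order from consecutive blocks that each contain one animal from every group. The sketch below is an assumption about how a laboratory might implement this. The function name `balanced_test_order`, the equal-group-size requirement, and the seeded random generator are all hypothetical and are not taken from the guidelines.

```python
import random


def balanced_test_order(animals_by_group, seed=0):
    """Interleave groups so each is represented throughout the test day.

    animals_by_group maps a group label to a list of animal IDs. All
    groups are assumed to be the same size. The result is a list of
    (group, animal_id) tuples made of consecutive blocks, each holding
    one animal from every group in random order.
    """
    rng = random.Random(seed)
    sizes = {len(ids) for ids in animals_by_group.values()}
    if len(sizes) != 1:
        raise ValueError("all groups must have the same number of animals")
    # Randomize the order of animals within each group.
    shuffled = {g: rng.sample(ids, len(ids)) for g, ids in animals_by_group.items()}
    order = []
    for i in range(sizes.pop()):
        # One animal per group, then randomize position within the block.
        block = [(g, shuffled[g][i]) for g in shuffled]
        rng.shuffle(block)
        order.extend(block)
    return order
```

The same function could be applied separately to each replicate cohort, so that every cohort has all treatment and control groups equally represented.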
1. Functional Observational Battery
The functional observational battery is a series of noninvasive observational and interactive measures that assess the neurobehavioral and functional integrity of rodents or other species [9,10]. Testing in the FOB generally proceeds from the least interactive to more interactive measures. The animal is first observed in its home cage for posture, involuntary motor movements, vocalization, and palpebral closure. The animal is then removed from its cage and rated for ease of removal and reactivity to being handled. While holding the animal, the observer carefully examines it for palpebral closure, lacrimation, eye abnormalities, salivation, and piloerection. The animal is then placed into an open field (the top of a cart or a standard arena are both often used) for a defined period of one to several minutes during which time the animal is allowed to move about freely. During this period, observations are made of involuntary movements, such as tremors and convulsions, gait, mobility, arousal, respiration, and stereotypical and bizarre behaviors. The number of times the animal rears, defined as any time both front paws leave contact with the floor, is often counted for the defined period selected. The numbers of fecal boluses and urine pools are also counted, and diarrhea and polyuria are noted if present. Following the defined period during which the animal is allowed to move about freely, several standard stimuli are presented to assess reactivity. The animal is approached from the front with a blunt object such as a pencil, touched lightly on the rump, presented with a click sound of moderate intensity, and finally has its tail pinched with forceps. The animal’s responses to these stimuli are rated and recorded. A variety of reflexes are then elicited and evaluated. Pupil response to light is assessed with a penlight, and the corneal reflex may be elicited by gentle stroking of the eye with a stiff hair while holding the animal. Extensor thrust, a reflex elicited by applying pressure to the hindfeet, may also be assessed. Air righting is scored after the animal is dropped from a height of approximately 30 cm from a supine position to evaluate the functioning of the vestibular system and the motor components of the righting reflex as the animal turns over in mid-air to land on all four feet. The animals may be evaluated for their responses to a “hot plate” (typically about 52°C) or in a tail-flick apparatus. Both use heat stimuli to assess the degree of analgesia induced by the test compound.
Hindfoot splay, a measure of coordination and muscular strength that is sensitive to peripheral nerve damage, may be assessed by dropping the animal one or more times from a height of approximately 30 cm from the prone position and recording the distance between spots made when the animal lands after dabbing the outside digits on the rear limbs with nontoxic ink or paint. Quantitative grip strength is measured two or three times for both fore- and hindlimbs using wire mesh screens or bars that the animal reflexively grabs and holds while being lifted across the apparatus. The maximum force that the animal exerts before letting go is recorded by strain gauges and averages for the trials are calculated later. Body temperature, measured by rectal probe, is often recorded. If body temperature is measured, it should be done near the end of the test session, as the animals often struggle during the 15 to 30 sec of restraint required for the procedure. The entire FOB usually requires approximately 8 to 10 min per animal, unless there are unusual behaviors to assess and record. Observers should be carefully trained and should be unaware (i.e., blind) of the animals’ treatment to avoid potential experimenter bias. There are many variables that can affect the behavior of an animal during FOB testing. Every effort should be made to minimize the effects of such extraneous variables and to maintain a consistent testing environment, including the behavior of the observer(s). Scoring criteria, or explicitly defined scales, should be developed for those measures that involve subjective ranking.
2. Spontaneous Motor Activity
Motor activity evaluations in animals and people have been found to be useful indicators of nervous system function that are obvious and directly quantifiable by many methods. An additional advantage of these types of procedures for animal testing is that they can be easily automated and, therefore, divorced from potential experimenter bias. Motor activity is considered to be “apical” in that it represents the integration of sensory, motor, and higher-level processes of the central nervous system [11]. The EPA neurotoxicity testing guidelines specify that motor activity be monitored by an automated apparatus that is capable of detecting both increases and decreases in activity. If more than one device is used for testing, steps must be taken to ensure reliability across devices, and treatment groups must be balanced across devices to minimize the possibility of nontreatment sources affecting the data. Animals should be tested individually and all sessions should be the same length. The literature contains numerous examples of session lengths from 1 min to continuous recording for 24-hr periods and longer. The guidelines do not specify or suggest any particular session length. They do provide the interesting criterion that the session be long enough for motor activity to approach asymptotic levels by the last 20% of the session for untreated control animals. This criterion allows each laboratory to determine empirically an adequate session length and allows for assessment of habituation by the animal to the test environment. Most laboratories have found the range of times that satisfies this criterion to be 20 to 60 min, depending on the apparatus and test environment. As is the case for the FOB, a number of variables are known to affect motor activity and care must be exercised to control for the effects of such variables.
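The guidelines state the session-length criterion only in words, so any numerical check is a laboratory's own choice. The following sketch shows one hypothetical heuristic, assuming control-group activity counts are available per recording interval: it compares the mean of the last 20% of intervals with the mean of the 20% just before it. The function name `approaches_asymptote`, the comparison window, and the 10% tolerance are illustrative assumptions, not guideline values.

```python
def approaches_asymptote(interval_counts, tolerance=0.10):
    """Heuristic check on control-group activity counts per interval.

    Compares the mean of the last 20% of intervals with the mean of the
    20% immediately before it and returns True when the relative change
    is within `tolerance`. Both the comparison window and the tolerance
    are illustrative assumptions, not guideline values.
    """
    n = len(interval_counts)
    window = n // 5  # number of intervals in 20% of the session
    if window < 1:
        raise ValueError("need at least five intervals")
    last = interval_counts[n - window:]
    previous = interval_counts[n - 2 * window:n - window]
    mean_last = sum(last) / window
    mean_prev = sum(previous) / window
    if mean_prev == 0:
        return mean_last == 0
    return abs(mean_last - mean_prev) / mean_prev <= tolerance
```

A laboratory could run such a check on pilot data from untreated controls at several candidate session lengths and choose the shortest length that passes. A formal curve fit would be a reasonable alternative.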
3. Neuropathology
Neuropathological analysis of tissues from the central and peripheral nervous systems is the third pillar of the neurotoxicity screen. The first step in the preparation of nervous system tissues for histological analysis is in situ perfusion with an aldehyde fixative. This procedure pumps the fixative through the blood vessels and allows diffusion of the fixative into the tissues of the brain from the inside, providing faster and more complete preservation of that organ than is possible with standard immersion fixation. The peripheral nerves are more delicate as they project more distally, and perfusion makes them much easier to dissect and more resistant to damage that can result from handling and slicing. Paraffin embedding is acceptable for central nervous system tissues, although plastic embedding is encouraged. Plastic embedding is required for peripheral nervous system tissues. Histological sections should be stained using hematoxylin and eosin (H&E) or a comparable stain and additional special stains, such as a silver-based method,
are recommended. Glial fibrillary acidic protein (GFAP) immunohistochemistry and radioimmunoassay are recommended for use in conjunction with standard stains to determine the lowest dose at which cellular alterations are detected [12,13]. Increases in GFAP are sensitive to cytopathology at lower dosages of many neurotoxins than are the stains used in routine histopathology. Detailed descriptions of the vascular perfusion and dissection techniques, as well as other details of appropriate fixation, staining, and processing of nervous system tissues for histological examination, may be found in standard histology texts including Spencer and Schaumburg [14] and standardized histological protocols such as those published by the World Health Organization in 1986 [15]. Representative tissue samples should be obtained from all of the major regions of the nervous system. During the qualitative examination phase of the neuropathology examination, the regions known to be sensitive to neurotoxic insult and those regions suspected to be affected based on the results of the behavioral tests should receive particular attention. A "stepwise" examination of tissue samples is recommended. With this approach, the sections from the high-dose group are examined first and compared with those from the control group. If no alterations are found in the samples from the high-dose group, additional examinations are not required. If alterations indicative of neuropathology are found in the high-dose samples, samples of the same tissue(s) from the intermediate-dose group must be examined. If alterations are found in the intermediate-dose group samples, samples from the low-dose group must be examined. When neuropathological alterations are found in the qualitative examination, a subjective diagnosis (i.e., semiquantitative analysis) will be performed to further characterize dose-response relationships.
All regions of the nervous system with any evidence of pathology must be included in this analysis. Sections from all of the dose groups from each region should be coded so that the pathologist examining the slides will not know to which dose or control group a sample belongs. The sections are randomized and in the course of the examination the frequency and severity of each lesion are rated and recorded. Photomicrographs of treatment-related lesions are recommended for inclusion in the report to accompany the textual descriptions and to illustrate the rating scale used to quantify the degree of severity of the lesions from very slight to very extensive.
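The coding and randomization of sections described above can be sketched as follows. This is an illustrative assumption about record keeping, not a procedure taken from the guidelines: the function name `code_slides`, the `S001`-style code format, and the tuple layout are hypothetical. The idea is that the pathologist receives only the code and the region, while the key linking codes to animals and dose groups is held by someone else until scoring is complete.

```python
import random


def code_slides(slides, seed=0):
    """Assign random code numbers to slides and return two structures.

    slides is a list of (animal_id, dose_group, region) tuples.
    Returns (reading_list, key). reading_list holds (code, region) pairs
    in randomized order for the pathologist. key maps each code to
    (animal_id, dose_group) and is withheld until scoring is finished.
    """
    rng = random.Random(seed)
    shuffled = rng.sample(slides, len(slides))  # randomized reading order
    reading_list = []
    key = {}
    for number, (animal_id, dose_group, region) in enumerate(shuffled, start=1):
        code = f"S{number:03d}"
        reading_list.append((code, region))
        key[code] = (animal_id, dose_group)
    return reading_list, key
```

After the frequency and severity ratings have been recorded against the codes, the key is used to decode the results by dose group.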
III. DEVELOPMENTAL NEUROTOXICITY TESTING: STUDY DESIGN
The protocols presented here were summarized from the EPA pesticide guidelines [3] published in 1991. The purpose of these studies is to examine the potential functional and morphological hazards to the nervous system that may arise in
the offspring of mothers exposed to the test compound during pregnancy and lactation. The developmental neurotoxicity screening battery consists of observations to detect gross neurological and behavioral abnormalities, developmental landmarks (preputial separation for males and vaginal opening for females), four assessments of spontaneous motor activity, auditory startle response evaluations in weanling and adult offspring, assessment of learning and memory in weanling and adult offspring, brain weights in preweaning and adult offspring, and neuropathology examinations in preweaning and adult offspring. Table 2 summarizes the essential features of the developmental neurotoxicity screening battery for rats. Table 3 presents the developmental neurobehavioral test battery I use in my laboratory for evaluating the safety of new pharmaceuticals. That battery is not atypical of what is done in the pharmaceutical industry for submission to the FDA and other regulatory agencies and is presented here for comparison with the EPA developmental neurotoxicity battery. This type of testing is encouraged, and sometimes required, by the EPA on a case-by-case basis when the substance under consideration produces malformations in the central nervous system, is structurally similar to known behavioral teratogens, is known to be neurotoxic or neuropathologic in adult animals, or is hormonally active. Other terms used to describe functional effects on offspring resulting from exposures to a parent, besides the term developmental neurotoxicity, are behavioral teratology, neurobehavioral teratology, and, occasionally, second-generation functional effects. As is the case with the adult neurotoxicity screening battery, the developmental neurotoxicity battery is a first-tier screen and will not provide a complete evaluation of the potential of a compound to cause developmental neurotoxicity.
Most of this type of research and regulatory concern focuses on exposure of the mother during gestation and lactation, although an increasing body of findings implicates exposures of the father in affecting the functional development of his offspring. Study designs to detect paternally mediated effects will not be discussed here. For a recent review of paternally mediated effects see Nelson et al. [16].
A. Study Basics
The test material should be well characterized and in a stable form. One or more concurrent control groups are required. If a vehicle is used for delivery of the test compound, then a vehicle control group is necessary. The vehicle should not be developmentally toxic (e.g., teratogenic) nor have effects on reproduction. If a vehicle is not used for delivery, a sham-treated group is required. All details of handling and maintaining mothers (maternal rats are referred to as dams) and offspring should be the same for the control group(s) as for the treated animals. Testing should be performed in rats. Use of the same strain as used in other toxicity studies, and especially reproductive and developmental toxicity studies
Table 2 Developmental Neurotoxicity Testing Protocol for Chemicals Regulated by the U.S. Environmental Protection Agency
Developmental landmarks: Balano-preputial separation and vaginal opening
Observations of dams: Autonomic signs (lacrimation, salivation, piloerection, exophthalmus, urination, defecation, pupillary function, palpebral closure); convulsions, tremors, abnormal movements; posture and gait, unusual/abnormal behaviors, stereotypies, altered appearance; any other signs of toxicity
Observations of offspring: The same end points as for dams, as appropriate for the age at observation
Motor activity: PND 13, 17, 21, 60 (±2); automated apparatus, individual test, session long enough to approach asymptotic levels for last 20% of session
Auditory startle and habituation: PND 22, 60 (±2); reactivity; habituation: simplest form of learning; auditory function
Learning and memory: PND 21-24, 60 (±2); different animals at each age, same or different tests at each age, flexibility in choice of tests
  Two criteria for tests:
    1. Learning assessed either as a change over trials or, in single-trial tests, with controls for nonassociative factors
    2. Some measure of memory (short or long term) in addition to acquisition
  Recommended: Demonstrated sensitivity to class of compound being tested
Neuropathology and brain weights:
  PND 11: 1 male or 1 female from each litter, brain weights for all; from these animals 6/sex/dose group for neuropathology; euthanasia by CO2
  End of study: 1 male or 1 female from each litter, brain weights for all; additional 6/sex/dose group for neuropathology; euthanasia by perfusion
  1. Qualitative analysis: Nervous system regions affected, types of neuropathology due to test substance, range of severity of alterations
  2. Subjective diagnosis: Done if alterations found, determine dose response, evaluations without knowledge of treatment (blind)
  3. Morphometric analysis: Assess structural development of brain, measure thickness of layers of neocortex, hippocampus, and cerebellum
Table 3 A Developmental Neurobehavioral Test Battery for a Perinatal Reproduction Study for Pharmaceutical Safety Evaluation
Measure                                 Postnatal day(s) of testing    Mean day of appearance or pass
All in litter
  Surface righting                      1 until pass                   3.0
4/sex/litter
  Incisor eruption                      7 until pass                   10.7
  Eye opening                           10 until pass                  14.9
  Air righting                          14 until pass                  18.3
2/sex/litter
  FOB for weanlings                     21                             -
  Ophthalmoscope exam                   between 21-28                  -
  Motor activity (20 min.)              23                             -
  Auditory function                     30                             -
  Vaginal opening                       28 until pass                  32.5
  Preputial separation                  35 until pass                  44.7
1/sex/litter
  Auditory startle habituation          once as adults                 -
  Cincinnati water maze (learning)      between 55-75                  -
  Passive avoidance (memory)            between 55-75                  -
if they have been done, is preferable but not required. The only limitation placed on strain selection in the guidelines is the admonition to not use the Fischer 344 strain because of differences in the timing of developmental events compared with other strains. A detailed justification is necessary if the Fischer 344 rat or another mammalian species is used. Communication with the relevant regulatory agency prior to the start of the study in such a case would be advised. At least 20 young adult pregnant females that have not been pregnant previously (i.e., nulliparous females) should be used for each treated and control group. It is important for these studies to keep in mind that it is the dam that is randomized and administered the test compound. Therefore, any measurements made on the offspring should be analyzed with the litter as the experimental unit and not individual offspring. After the litters are born, on postnatal day 4, each litter should be culled by random selection so that each litter has, as nearly as possible, four males and four females remaining. The issue of culling is the topic of an ongoing debate by researchers in the United States and Europe. Some opponents consider the practice of culling to have the potential to mask developmental toxicity, while proponents consider the practice necessary to control for differences in litter size. The guideline has not changed. Elimination of runts (i.e., very small pups) prior
to random culling is not appropriate. If a litter does not have enough pups of either sex to obtain four of each sex, partial adjustment (e.g., five males and three females) is acceptable. Litters having fewer than seven pups are not acceptable and those litters must be removed from the study. After standardization of litters, each pup should be uniquely identified and one male and one female from each litter (20 males and 20 females per group) should be randomly assigned to one of the following tests: motor activity, auditory startle, or learning and memory (learning and memory are tested in both juvenile and adult animals). In 1998 the EPA Office of Prevention, Pesticides and Toxic Substances reduced the minimum number of offspring required for behavioral evaluation to either one male or one female per litter, for a total of 10 males and 10 females per group for compounds regulated by that office [17]. On postnatal day 11, either one male or one female pup from each litter (total of 10 males and 10 females in each dose group) must be sacrificed and brain weights obtained. Of those pups, six males and six females in each dose group should be designated for neuropathological evaluation. At the end of the study, either one male or one female from each litter (total of 10 males and 10 females per dose group) must be sacrificed and brain weights obtained. An additional group of six males and six females in each dose group (one male or one female in each litter) should be sacrificed at the end of the study for neuropathological evaluation. In addition to the vehicle and/or any other control groups deemed necessary, at least three equally spaced dose levels of the test compound should be used. If the test compound has been previously shown to be developmentally toxic, the high-dose level should be the highest dose that did not cause malformations or death to the fetuses or neonates in developmental toxicity studies.
The high-dose level should induce some overt maternal toxicity, but should not result in a reduction in body weight gain exceeding 20% during gestation and lactation. The low dose should, ideally, not induce maternal toxicity or developmental neurotoxicity. Administration of the test compound should be daily by the oral route from gestation day 6 through postpartum day 10 with day 0 of gestation as the day of presumed mating. Other routes of administration must be justified and the reasons for the route selected explained clearly. Test compounds and vehicle should be given at the same time each day. Dosing should not occur on the day of parturition for animals that are in the process of delivering pups. The day of delivery of a litter is considered postnatal day 0 for that litter.
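The litter standardization rules described in this section (random culling on postnatal day 4 to four males and four females where possible, partial adjustment when one sex is short, and removal of litters with fewer than seven pups) can be expressed as a short sketch. The function name `cull_litter` and its parameters are hypothetical, and the seeded random generator is simply one way of making the random selection reproducible. Note that pups are chosen at random, with no prior elimination of runts.

```python
import random


def cull_litter(males, females, seed=0, per_sex=4, minimum=7):
    """Randomly cull one litter on postnatal day 4.

    males and females are lists of pup IDs. Returns a tuple
    (kept_males, kept_females), or None when the litter has fewer than
    `minimum` pups and must be removed from the study. When one sex is
    short, extra pups of the other sex are kept (partial adjustment) up
    to 2 * per_sex pups in total.
    """
    if len(males) + len(females) < minimum:
        return None
    rng = random.Random(seed)
    total = 2 * per_sex
    n_females = min(len(females), per_sex)
    # Partial adjustment: fill any shortfall from the other sex.
    n_males = min(len(males), total - n_females)
    n_females = min(len(females), total - n_males)
    return rng.sample(males, n_males), rng.sample(females, n_females)
```

With six male and three female pups, for instance, this keeps five males and three females, which matches the partial adjustment example given in the text.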
B. Parameters to be Measured
Dams should be observed at least once each day before administration of the test compound or vehicle. The observer(s) should be unaware of the animals’ treatment group and standardized procedures, such as those used in the functional observational battery, should be used. Standard, routine assessment of clinical signs is not sufficient for these observations. If the same observer cannot be used to evaluate all of the animals in a study, some demonstration of interobserver reliability is required. The guidelines are rather specific about the types of observations that should be conducted. Observations of the dams should include assessment and, for some measures, ranking of signs of autonomic function such as lacrimation, salivation, piloerection, exophthalmus, urination, defecation, pupillary function, and palpebral closure. Convulsions, tremors, abnormal movements and behaviors, posture, gait abnormalities, stereotypical behaviors, emaciation, dehydration, hypo- or hypertonia, altered fur appearance, the appearance of the eyes, nose, and mouth, and any other signs of toxicity that might aid in the interpretation of the data should be recorded when they are observed. The time of onset, degree, and duration of all observations should be included. Body weight measurement is specified in the guidelines to be recorded for the dams at least weekly, on the day of delivery, and on postpartum days 11 and 21 (weaning). More frequent measurement of dam body weight is often useful in overall interpretation of the study results. Although food consumption is not required for the dams, periodic assessment is also useful in understanding maternal body weight changes and fetal weight differences in some cases. Offspring should be examined in their cages at least once each day for signs of morbidity and mortality.
All offspring should be observed outside of their cages for gross signs of toxicity whenever they are removed from their cages for weighing or for behavioral testing. The technician(s) trained to conduct these observations should be unaware of the animals’ treatment group, and standardized procedures should be used to maximize interobserver reliability if the same observer cannot evaluate all of the animals in a given study. The observations outlined above for the dams represent the minimum for the offspring and the monitored end points should be appropriate for the developmental stage of the offspring. Any signs of toxicity in the offspring should be recorded when they are observed and should include the time of onset, degree, and duration, as for the dams. Live pups should be weighed individually as soon as practically possible after birth, on postnatal days 4, 11, 17, and 21, and at least every 2 weeks thereafter. Food consumption is not required for offspring and is not generally recognized as being useful in these studies. The only developmental indices that are required are the age of vaginal opening in females (usually occurs between postnatal days 32 and 38) and the age of preputial separation of the penis in males (usually occurs between postnatal days 36 and 45). See Adams et al. [18] and Korenbrot et al. [19] for details of these observations in females and males, respectively. Although the guidelines do not mention it, measuring the body weight
of each offspring on the day of attainment of vaginal opening or preputial separation can be quite useful in interpreting positive findings for delay, or acceleration, of these developmental landmarks. Motor activity should be monitored by an automated device on postnatal days 13, 17, 21, and 60 (±2 days). The same criteria and recommendations listed in the section on adult neurotoxicity assessment pertain to evaluating motor activity in offspring in the developmental neurotoxicity study. Unlike the adult guidelines, the developmental guidelines specify that the recording intervals within a monitoring session should not be more than 10 min in duration. The same criterion for empirical determination of the length of the monitoring session that appears in the adult guidelines applies to the developmental study. An auditory startle habituation test should be performed on the offspring on postnatal days 22 and 60 ± 2 days. The mean response amplitude must be determined for each block of 10 trials with a daily session consisting of five blocks of 10 trials, for a total of 50 trials on each test day. Habituation is assessed by the degree to which the startle amplitude decreases with successive presentations of the startle stimulus. The slope of the recorded data for each treatment group can be compared with the control group to determine whether the slopes differ from one another. Details of the procedure for this test can be found in Adams et al. [18]. The auditory startle test can be made more powerful by the addition of prepulse inhibition. Prepulse inhibition contributes information regarding sensory processing to the auditory startle test and can also be used to detect changes in auditory thresholds after exposure to toxicants. Although the guidelines do not require prepulse inhibition, its addition is highly recommended. Details of the conduct of this test may be found in an article by Ison [20].
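The block means and habituation slope described above for the auditory startle test can be computed as in the sketch below. The function names `block_means` and `habituation_slope` are hypothetical, and the use of an ordinary least-squares slope of block mean against block number is an assumption. The text says only that slopes can be compared between groups, without prescribing how they are estimated.

```python
def block_means(amplitudes, block_size=10):
    """Mean startle amplitude for each consecutive block of trials."""
    if not amplitudes or len(amplitudes) % block_size != 0:
        raise ValueError("trial count must be a positive multiple of block_size")
    return [
        sum(amplitudes[i:i + block_size]) / block_size
        for i in range(0, len(amplitudes), block_size)
    ]


def habituation_slope(means):
    """Least-squares slope of block mean versus block number (1, 2, ...).

    A negative slope indicates that startle amplitude decreases across
    blocks, i.e., habituation.
    """
    n = len(means)
    if n < 2:
        raise ValueError("need at least two blocks")
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = sum(means) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, means))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx
```

For a 50-trial session, `block_means` returns five values. Because the dam is the unit of treatment, per-animal slopes would be summarized by litter before any comparison between treatment groups and controls.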
One or more tests of learning and memory are required to be administered around the time of weaning (postnatal days 21 to 24) and again in adulthood (postnatal day 60 ± 2). The same or different procedures may be used at these two stages of development. Considerable flexibility in test selection is allowed. However, two criteria for test selection must be fulfilled. First, learning must be assessed either as a change in behavior across several learning trials or sessions, or, in tests involving a single trial (e.g., one-trial passive avoidance learning), with reference to a condition that controls for nonassociative factors in the training experience, which can provide assurance that extraneous factors are not influencing the measure of learning. Second, some measure of memory, either short-term or long-term, must be included in the test in addition to the original measure of learning (i.e., acquisition). Whichever measure of memory is selected, the measure of acquisition should also be obtained from the same test. Testing for learning and memory and interpreting the resulting data can be quite complex because it is impossible to demonstrate learning without invoking memory and it is impossible to demonstrate memory without invoking learning to establish that which is to be remembered. When treatment-related effects are
found in the test or tests of learning and memory, additional tests may be needed to discover if other interpretations for the data are correct besides the interpretation that cognition has been affected. Depending on the test and the pattern of results obtained, alterations in sensory functioning and/or processing, motivation to complete the test, motor capabilities, and general activity levels could contribute to the observed effects and should be considered before accepting the conclusion that learning and/or memory capacity has been affected. A recommendation in the guideline that is often overlooked is to select a test or tests for learning and memory that have been shown to be sensitive to compounds in the same structural or functional class as the compound under investigation. There are a great many tests of learning and memory in the literature, but it is beyond the scope of this chapter to review them. A few tests that fulfill the criteria in the guidelines for these types of studies in adult rats are delayed-matching-to-position [21], olfactory conditioning [22], and acquisition and retention of schedule-controlled behavior [23,24]. Additional tests for weanling rats are described by Spear and Campbell [25] and Krasnegor et al. [26], for adult rats by Miller and Eckerman [27], and for both young and adult rats in Riley and Vorhees [28]. Water mazes, especially the Morris water spatial task, often called the Morris maze [29,30], have become popular for assessment of rodent learning and memory in many contexts. A test that is currently gaining acceptance and has a relatively large compound effect base after prenatal exposures is the Cincinnati water maze [31,32], which I use in my own laboratory. Only a few formal comparisons of tests of learning and memory have been published. Examples of such comparisons after prenatal exposures include Akaike et al. [33], Tsutsumi et al. [34], and Weisenburger et al. [35].
The behavioral tests required or recommended in the developmental neurotoxicity test guideline (motor activity, auditory startle, and learning and memory tests) are apical in nature, i.e., they do not test discrete central nervous system (CNS) functions but require integration of various processes. These processes may include sensation, motivation, neuromuscular function, and other aspects of nervous system functioning. The actual behavior measured is the culmination and integration of the function of several underlying processes. According to one pioneering researcher in this area [36], the point of apical testing is to grossly analyze the integrated response of the organism. The value of testing apical performance lies in its potential sensitivity to detect a deficit in any of several subsystems. This sensitivity, due to the involvement of many CNS subsystems, allows many factors to influence behavioral output. An argument against the use of apical tests is that it is often difficult to identify specifically which subsystems are affected. This problem can often be overcome by careful examination of the results and correlation with results of several tests that tap similar subsystems. In a screening approach, such as the one under discussion, it is less important to define a mechanism of action than it is to determine impairment of function
resulting in potential increased risk. Apical tests compromise specificity in favor of sensitivity and are, therefore, useful in detecting impairment. Specificity can be recouped in additional follow-up tests that can be applied on a case-by-case basis to further characterize the nature of the impairment and the site of the lesion, if one can be found. A very thorough neuropathological analysis of the offspring must be conducted to satisfy this guideline. On postnatal day 11, one male and one female pup should be removed from each litter such that equal numbers of male and female offspring are removed from all litters combined. Of these, six male and six female pups should be sacrificed for neuropathological examination. After euthanasia by carbon dioxide inhalation, the brains should be removed, weighed, and fixed by immersion in an aldehyde fixative. The remaining offspring in the subset should be sacrificed in the same manner, and their brains removed and weighed. At the termination of the study, usually after the last learning and memory assessment, one male or one female from each litter will be euthanized with carbon dioxide, and the brain removed and weighed. In addition to the animals above, six animals of each sex in each dose group (one male or female per litter) shall be sacrificed at the end of the study for neuropathological evaluation with the same procedures used for the adult neurotoxicity study. Neuropathological evaluation of the animals sacrificed on postnatal day 11 and at the termination of the study must include a qualitative analysis and, if warranted by the findings during the qualitative analysis, a semiquantitative analysis, as well as simple morphometrics. Samples from the postnatal day 11 pups should be immersion-fixed in an aldehyde fixative and then postfixed and processed according to standardized histological protocols such as the Armed Forces Institute of Pathology (AFIP) [37], Spencer and Schaumburg [14], or Pender [38].
Paraffin embedding is acceptable but plastic embedding is recommended. Histological sections should be stained using hematoxylin and eosin or a similar stain according to standard protocols such as AFIP [37] or Bennet et al. [39]. The brains of the pups sacrificed on postnatal day 11 should be examined for any evidence of neuropathological alterations. Toward that end, samples should be collected from all major brain regions to include the olfactory bulbs, cerebral cortex, hippocampus, basal ganglia, thalamus, hypothalamus, midbrain (i.e., tectum, tegmentum, and cerebral peduncles), brain stem, and cerebellum. Further guidance for examination of the nervous system for indications of developmental insult can be found in Friede [40] and Suzuki [41]. In addition to the typical kinds of cellular alterations that can be assessed in neuropathological studies (e.g., astrocytic proliferation, leukocytic infiltration, and cystic formation), there should be special emphasis placed on structural changes that are indicative of developmental insult. Examples of such changes include gross changes in the size or shape of brain regions such as the pattern of foliation of the cerebellum, death of neuronal precursors, abnormal proliferation or migration, alterations in transient developmental structures such as the external germinal zone of the cerebellum, evidence of hydrocephalus, particularly enlargement of the ventricles, stenosis of the cerebral aqueduct, and thinning of the cerebral hemispheres. There are three purposes for the qualitative histological examination in the developmental neurotoxicity screen. The first is to identify regions within the nervous system with evidence of neuropathological alterations. The second is to identify the types of alterations resulting from exposure to the test substance. The third purpose is to determine the severity of the lesions. As in the neuropathological examination in the screen for adult animals, the developmental neuropathological examination is conducted using a stepwise approach wherein the high-dose-group tissue sections are compared first to the control group samples. If no alterations are found in the high-dose-group animals, no further analysis is required. If alterations indicative of neuropathology are found in the high-dose sections, samples of the same tissue(s) from the intermediate-dose group are examined, and so on, until a dose level is encountered that does not exhibit the alterations or there are no more dose groups to examine. The recommendations for the use of additional stains and methods to determine the lowest dose level at which neuropathology is detected in the adult neurotoxicity screen also apply to the developmental neurotoxicity screen. If any evidence of neuropathology is found in the qualitative examination, then a subjective (i.e., semiquantitative) analysis must be performed to further characterize dose-response relationships. All regions of the brain exhibiting any evidence of neuropathology should be included in this analysis.
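The stepwise examination described above can be summarized as a short sketch. This is only an illustration, not part of the guideline: the function name and the input format (a mapping from dose level to whether the pathologist found alterations relative to controls) are hypothetical.

```python
def lowest_affected_dose(findings_by_dose):
    """Walk dose groups from highest to lowest, stopping at the first
    group whose sections show no alterations relative to controls.

    findings_by_dose: dict mapping dose level (a number) to True if
    alterations were found at that dose (hypothetical input format).
    Returns (doses_examined, lowest_dose_with_alterations_or_None).
    """
    examined = []
    lowest_affected = None
    for dose in sorted(findings_by_dose, reverse=True):
        examined.append(dose)
        if not findings_by_dose[dose]:
            break  # no alterations here: lower dose groups need not be read
        lowest_affected = dose
    return examined, lowest_affected


# Example with three hypothetical dose levels (units are arbitrary).
examined, lowest = lowest_affected_dose({10: False, 30: True, 100: True})
```

In this example all three groups are read, because the first group without alterations is the lowest one; had the high-dose group been clean, only that group would have been examined.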
Sections of each region from all dose groups will be coded as to treatment and randomized prior to examination. After all sections from all dose groups have been rated for severity using a scale such as 1+, 2+, and 3+ to indicate the degree of severity ranging from very slight to very extensive, the code will be broken and statistical analyses performed to evaluate dose-response relationships. Many pathologists objected to this procedure when the guideline was open for comments from the public. But this approach minimizes the potential effects of observer bias and has proven to be a very useful feature of the adult and developmental neurotoxicity screens for objectifying neuropathological lesions and alterations. Simple morphometric analysis is useful in evaluating disruption of developmental processes that are often reflected in changes in the rate or extent of growth of particular brain regions. Such an analysis is required for some of the pups that are sacrificed on postnatal day 11 and at the end of the study. At a minimum, this analysis consists of an estimate of the thickness of the major layers within the neocortex, hippocampus, and cerebellum. Details of the procedures for conducting these measurements can be found in Rodier and Gramann [42].
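The coding, randomization, and later breaking of the code described above for the semiquantitative analysis might be organized along the following lines. This is a minimal sketch under stated assumptions: the code format ("S001", ...), the function names, and the use of 0 for "no alteration" alongside the 1+ to 3+ scale are all hypothetical choices made for illustration.

```python
import random


def blind_sections(section_ids, seed=None):
    """Assign each section a random code so the reader cannot infer treatment.

    Returns (reading_order, key), where key maps code -> original section id
    and is kept sealed until every section has been scored.
    """
    rng = random.Random(seed)
    codes = [f"S{i:03d}" for i in range(1, len(section_ids) + 1)]
    rng.shuffle(codes)  # random pairing of codes with sections
    key = dict(zip(codes, section_ids))
    reading_order = sorted(key)  # sections are read in code order
    return reading_order, key


def break_code(scores_by_code, key, group_of):
    """After scoring is complete, regroup severity scores by dose group."""
    by_group = {}
    for code, score in scores_by_code.items():
        group = group_of[key[code]]
        by_group.setdefault(group, []).append(score)
    return by_group
```

The scores regrouped by `break_code` would then feed whatever dose-response analysis the study plan specifies; the sketch deliberately stops short of choosing one.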
It should be clear after reading the summary of procedures for the assessment of neuropathology in the developmental neurotoxicity screen that the intent of the guideline is a very comprehensive look at the development of the central nervous system from the early days of life through adulthood. Although only a few studies of this type have been published to date, it is clear that these procedures represent the state of the art in neuropathology screening and a sensible approach to protection of the general public: the current generation and those to come.
IV. DATA COLLECTION, REPORTING, EVALUATION, AND INTERPRETATION

Once the data are collected, the last animal has been sacrificed, and all measures added to the various data sets, the second phase of the study commences. This phase is no less important than the in-life and other data generation stages. Indeed, the primary purpose of this phase is to make sense of the data and to place it into a format that renders a factual story with conclusions that can be used to arrive at decisions of import to the lives of people like you and me. The process that leads to the final product, the report, complete with the appropriate perspective and interpretation, is neither simple nor straightforward. The discussion that follows pertains equally to the adult and developmental neurotoxicity studies.
A. Data Organization and Analysis
The final test report must include all of the information and data necessary to properly interpret the results. More specifically, the general design of the experiment, the equipment, and test methods used should be described in detail. Any deviations from the guidelines or decisions involving professional judgment should be explained and justified. Examples of what is expected in the report are, for the functional observational battery, the dimensions of the arena, the scoring criteria, and description of the procedures used to standardize the observations, as well as operational definitions for scoring observations. For motor activity, the procedures for calibrating the devices and the balancing of treatment groups across the relevant parameters, such as time of day, should be included. A good “rule of thumb” is that if one is not sure whether or not to include specific information on the conduct of the experiment, include it! Obviously, the test system (i.e., the animals used) should be well described in terms of species, strain, age, sex, supplier, and any other available information. Many of the end points in neurotoxicity studies are known to vary with strain of rat tested.
Positive control data generated by the laboratory performing the test should be included in the report to demonstrate the sensitivity of the procedures being used and the competence of the personnel performing the tests. Historical, nonconcurrent, positive control data may be used if the essential aspects of the protocol are the same. In developmental neurotoxicity studies, positive control data do not need to be from prenatal exposures. Historical control data are often useful, and sometimes critical, in the interpretation of study findings, as these kinds of data expand the basis of comparison for possible treatment effects beyond the single study. The submission of positive and historical control data along with the test report is encouraged to facilitate and expedite the review of the study results and interpretation. Presentation of results of a study should be arranged by test group and dose level using tabular formats. Data for each individual animal should include its unique identification number, body weights, scores on each sign at each observation time, total session and intrasession (i.e., interval) subtotals for each day of motor activity measurement, and the time and cause of death if the animal died on study or was sacrificed as moribund. For developmental studies, the following measures also need to be presented: the litter from which each offspring came, body weight and score on each developmental landmark (i.e., preputial separation or vaginal opening) at each observation time, auditory startle response amplitude per session and intrasession amplitudes on each day measured, and appropriate data for each repeated trial (or session) showing acquisition and retention scores on the test(s) of learning and memory on each day measured. Summary data for each dose and control group must include the number of animals at the start of the test, the number of animals with each observation score at each observation time,
the mean and standard deviation for each continuous end point at each observation time, and the results of statistical analyses for each measure, where appropriate. For developmental studies, the following should be added: body weight of the dams during gestation and lactation, litter size, and mean weight of the offspring at birth. This is a good place to remind the reader that the unit of analysis for developmental neurotoxicity studies with compound administration to the dam is the litter, as it was the dam that was randomized to treatment group. All neuropathological observations should be arranged by test group in the report. The recommended format for presentation includes description of the lesions for each animal showing the unique identification number, sex, treatment, dose, duration of dosing, a list of structures examined, as well as the location, nature, frequency, and severity of lesion(s). The EPA strongly recommends the inclusion of photomicrographs that illustrate examples of the type and severity of the neuropathological alterations observed. Any diagnoses derived from neurological signs and lesions, including naturally occurring diseases or conditions, should be included. The neuropathology data should be tabulated to show the
number of animals examined in each group, the number of animals in which any lesion was found, the number of animals affected by each different type of lesion, and the locations, frequency, and severity of each type of lesion. Additional data to be reported for developmental studies include whole brain weights (both absolute and relative), regional brain weights, and the values for the morphometric measurements made for each animal listed by treatment group. The findings from the screening battery should be evaluated in the context of other toxicity studies and any other pertinent information that exists for the compound of interest. The evaluation should include the relationship between the doses of the test substance and the presence or absence, incidence, and severity of any neurotoxic effects. Appropriate statistical analyses are crucial to evaluation of the data. Parametric statistical tests are usually appropriate for continuous data like body weights, motor activity counts, auditory startle data, and body temperature. Nonparametric tests are usually appropriate for the remainder of the measures. It is beyond the scope of this chapter to attempt a discussion of the application of various statistical techniques to each of the measures in a neurotoxicity study. Choice of analyses should consider tests appropriate to the experimental design, including repeated measures when appropriate. Data sets for which parametric statistics are appropriate are analyzed routinely by most laboratories with either analysis of variance (ANOVA) in its several forms or one of several trend tests that assess the relationship of the measure to dose [43]. Nonparametric tests (e.g., chi-square, Mann-Whitney U) are often used for data that are not normally distributed. Adjustments for multiple comparisons should be made when appropriate.
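The text above does not name a particular adjustment for multiple comparisons. As one possible illustration only, the sketch below applies Holm's step-down Bonferroni procedure to a list of p-values; the choice of Holm's method and the function name are assumptions made here, not a recommendation taken from the guideline.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep monotone ordering
        adjusted[idx] = running_max
    return adjusted


# Three hypothetical pairwise comparisons against control.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

With these hypothetical inputs only the first comparison remains below 0.05 after adjustment, which shows how an isolated borderline p-value can lose significance once the number of comparisons is taken into account.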
The guidelines state that the report must include dose-effect curves for observations, motor activity expressed as counts, and any gross necropsy findings and lesions observed. The evaluation should attempt to elucidate any relationship between the observed neuropathology and behavioral changes.
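The reminder in this section that the litter, not the individual pup, is the unit of analysis can be made concrete with a small sketch. It collapses pup-level values to one mean per litter before any group comparison is made. The record layout and function name are hypothetical, and averaging is only one way to respect the litter as the unit; it is shown here as an illustration, not as the required method.

```python
from statistics import mean


def litter_means(pup_records):
    """Collapse pup-level data to one value per litter.

    pup_records: iterable of (dose_group, litter_id, value) tuples
    (a hypothetical layout). Returns {dose_group: [litter means]}, so the
    sample size per group is the number of litters, not the number of pups.
    """
    by_litter = {}
    for group, litter, value in pup_records:
        by_litter.setdefault((group, litter), []).append(value)
    by_group = {}
    for (group, _litter), values in by_litter.items():
        by_group.setdefault(group, []).append(mean(values))
    return by_group


records = [
    ("control", "L1", 10.0), ("control", "L1", 12.0),
    ("control", "L2", 14.0),
    ("high", "L3", 8.0), ("high", "L3", 10.0),
]
per_litter = litter_means(records)
```

Here five pups yield only three observations (two control litters, one high-dose litter), which is the sample size that the subsequent statistical test should see.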
B. Interpretation of Specific Measures

The tests that comprise the adult and developmental neurotoxicity screening batteries cover a wide range of functional and pathology measures that increase the likelihood of detecting neurotoxic effects. The behavioral tests generally lack specificity for distinct neural systems but were selected to survey the integrated functional output. Sometimes a single measure can be affected without supporting differences in other measures. Examples of such single-measure effects might be altered auditory startle response without supporting histopathological findings or vacuolization of a particular area in the brain with no detected functional correlate. Such effects are the exception rather than the rule. A more typical outcome is a pattern of effects on the behavioral tests that is supported by the histopathological findings in associated regions of the nervous system. Although it is impossible to describe all of the possible permutations of findings, or even to survey
all of the patterns described in the literature thus far, general comments on interpretation of findings for specific tests follow. The largest issue in interpretation of the data from a neurotoxicity study is differentiating effects on the central and/or peripheral nervous system from “systemic” or general toxicity. When frank lesions are found in any part of the nervous system, a diagnosis of neurotoxicity is easy to assign. When no such lesions are found, but one or more behavioral differences are observed, the task is more difficult. Systemic or general toxicity may lead to the animal being “sick” and manifestations of illness such as malaise might have the appearance of being due to neurotoxicity. An animal that is “sick” may behave abnormally although the nervous system is not a specific target of the compound. The time course of such effects is usually helpful in differentiating systemic toxicity from neurotoxicity. Effects that are due to neurotoxicity usually persist beyond the time that systemic effects are observed. Unfortunately, it is often the case that systemic toxicity accompanies neurotoxicity and a careful analysis of the pattern of findings must be conducted. All of the physiological and organ systems have the potential to affect behavior at least indirectly through the nervous system and sometimes directly.

1. Clinical Observation Data

Unusual clinical signs are frequently the first clue that a compound is neurotoxic. More often than not, standard toxicity studies will have been conducted with the compound of interest before a formal evaluation of neurotoxicity is performed. Clinical signs, as well as other findings from such studies, should be reviewed and the likelihood of nervous system damage considered.
The most important point to be made from the assessment of clinical signs and the more formal functional observational battery is to “look at your animals!” There is no substitute nor more important source of information than this simple directive. Direct observation and careful recording of clinical signs is critical to well-conducted neurotoxicity assessments. Observation of adult animals with direct exposure to the test compound can be complicated by the pharmacological effects of the test compound. In these situations, a knowledge of the pharmacokinetics of the compound and observation beyond the time when the compound is likely to exert a pharmacological effect will usually clarify whether the observed effects are pharmacologically or toxicologically mediated. The situation is different for the clinical signs observed in offspring in developmental studies. There is usually little or no compound in the bodies of the offspring when the observations are made. Therefore, any clinical signs observed are likely due to interference with developmental processes and are likely to be related to neurotoxicity.
Obviously, the nature of the clinical signs observed is crucial as they can range from general findings such as “unkempt appearance” to truly bizarre behaviors such as repetitive circling, Straub tail, and rage-like behaviors. Many of these specific behaviors and syndromes are well documented and a literature search with appropriate key words will often lead to diagnosis, clarification of mechanism, or at least some ideas on how to proceed.

2. Body Weights and Food Consumption
These two measures are truly the mainstays of toxicological assessment. Body weight effects are often the first indication of toxicity. It is obvious that reduced food consumption will lead to decreased body weight and/or decreased body weight gain. Unfortunately, the relationship is not always so neat. When the degree of reduced food consumption does not adequately explain the body weight data, a metabolic disturbance might be responsible and might lead to unusual behaviors as well, such as increased water intake or altered eliminative processes. The main point to keep in mind for these measures is that they both are sensitive to many different kinds of physiological perturbations and can be sensitive to psychological factors as well. A hypothetical example for the latter point is that of an animal that is hallucinating or is experiencing negative emotions like fear or disorientation. Under such conditions, the animal may not eat normal amounts, which would lead to decreased body weight if that mental state persisted for more than a day.
3. Functional Observational Battery

A functional observational battery (FOB) typically consists of 25 to 35 end points that are measured at several time points before, during, and/or after compound administration. Fortunately, these measures can be grouped into functional domains to assist in interpretation [9]. Six functional domains were recently evaluated statistically to investigate the degree to which experimental data correspond to the domain groupings [44] using the following domains refined and proposed by Tilson and Moser [10]: autonomic, neuromuscular, convulsions, activity, excitability, and sensorimotor. The measures within a functional domain were found to be related to each other, as would be expected. In addition, some of the measures in the domains correlated with some of the measures in other domains. This latter feature actually increases the usefulness of the FOB by allowing patterns of results to be identified not only within domains but also among domains. Some toxicologists have expressed concern that with so many measures in the FOB to statistically analyze, one or more statistically significant differences will be found by chance but might not be related to neurotoxicity. The same scientists are quite comfortable with interpreting clinical chemistry and hematology data in animals when it is often the case that the same kind of spurious statistically significant differences are found. They are able to interpret the data adequately and the reviewing regulatory agencies usually do not take issue with the interpretations because of isolated and spurious instances of statistical significance. The situation is directly analogous to what is often found with FOB data. As with clinical chemistry, hematology, and any other sets of data that require many statistical comparisons, statistical significance will be found for approximately 1 in 20 comparisons (at the conventional 0.05 significance level), independent of effect of the treatment. The magnitude, direction, additional findings that do or do not form a logical pattern, and, of course, professional judgment should be used to determine whether a particular finding that is statistically significant is biologically significant or not. In most cases, a single statistically significant difference on any one measure that is not supported by positive findings on related measures, or at least in the other sex, is probably not important. Some measures in the FOB can be viewed as complementary although they do not measure the same function. An example of this complementarity is the relationship of landing foot splay to grip strength. Landing foot splay involves a reflex to the stimulus of falling that coordinates input from the vestibular, visual, and motor response systems. Grip strength is mainly a measure of the strength of the limbs that utilizes a reflex of the toes and claws to allow measurement. Peripheral nerve damage, such as that induced by acrylamide, is clearly manifested by decreased grip strength with accompanying increased landing foot splay distance. Administration of central nervous system depressants, such as codeine and pentobarbital, will also result in decreased grip strength when the drugs are present at pharmacologically active levels, but landing foot splay is not affected.
The effects of acrylamide become more severe with time after dosing has ceased. The effects of codeine and pentobarbital disappear after the drugs have cleared from the animal. The recommended practice of conducting FOB studies with positive effect compounds allows the researcher to explore the relationships of the various measures comprising the FOB and adds experience to the “toolbox” to be used later for interpreting test data.
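Returning to the point made earlier in this section about chance findings among many FOB comparisons, the arithmetic can be sketched briefly. The function below assumes a 0.05 significance level and, as a simplification, treats the comparisons as independent; because FOB measures within and across domains are correlated, the result is only a rough guide. The function name is hypothetical.

```python
def chance_findings(n_comparisons, alpha=0.05):
    """Expected number of spurious 'significant' results, and the chance of
    at least one, when no true treatment effect exists.

    Assumes independent comparisons, which is a simplification for FOB data.
    """
    expected = n_comparisons * alpha
    p_at_least_one = 1 - (1 - alpha) ** n_comparisons
    return expected, p_at_least_one


# A battery of 30 end points tested once each.
expected, p_any = chance_findings(30)
```

For 30 comparisons this gives an expectation of about 1.5 spurious differences and roughly a 79% chance of seeing at least one, which is why an isolated significant result without a supporting pattern deserves little weight.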
4. Motor Activity
Assessment of locomotor activity has been a mainstay of behavioral investigations almost as long as animal studies have been considered as models for human response. In our everyday lives we all realize that activity levels differ among people and increased and decreased levels are often associated with psychological states such as anxiety and depression, respectively, in the same person at different times. A simple mental review of one’s own activity levels that occurred concurrently with changes in psychological state over the past day or week will clarify this point.
The standard example I have used many times to illustrate for students a common pharmacological effect is one that many college-age and older people can relate to, from personal experience and/or direct observation of the behavior of other people. The didactic scenario is as follows: A person goes to an establishment where alcoholic beverages are served, music is played, and dancing is encouraged with the goal of increasing pleasant stimulation and social behavior. Initially the activity level is low to moderate. After one or two measured doses of alcoholic beverage, the activity level increases as dancing commences and social behavior increases. One or two additional doses are self-administered over a period of time (minutes to hours) and the frequency and intensity increase for dancing and social behavior, there is more frequent and louder verbal behavior, and body movements become more exaggerated and less coordinated. If self-administration continues, a point will likely be reached where activities decrease and may eventually be restricted to simple maintenance of necessary physiological functions with little or no apparent activity. This example from everyday life not only illustrates how we all use observable behaviors, including motor activity, to draw conclusions about the psychological and pharmacological states of ourselves and others, but is also a good example of an “inverted U” dose-response function for a commonly used drug, alcohol. The “inverted U” function refers to the observation in our example that increasing dose administration results in increased response (i.e., more dancing, talking, and exaggerated body language), but that at some point increasing the dose results in decreased response (i.e., slower and then no dancing, less and then no talking, and little or no active body language). Such nonlinear dose-response relationships occur often in behavioral pharmacology and occasionally in behavioral toxicology and neurotoxicology.
Since a basic tenet of toxicology is that there should be a dose-response relationship for a toxicant, the assumption is often made that the dose-response curve should reflect either more or less response with increasing dose. A dose-response curve that does not conform to this textbook ideal may be overlooked and considered a spurious finding. In animal studies, motor activity is assessed over a specified period of time, usually for periods of 30 to 90 min up to the assessment of circadian cycles over 24 hr. In versions of this type of assessment where the animal is allowed to move about a standardized test environment that is not the animal’s home cage, such as an open field or a figure 8 apparatus, the rat will initially exhibit relatively high levels of activity as it explores the novel or different environment and will move about less over time as it “gets used to” or habituates to the new setting. The rate of habituation is a sensitive index of the animal’s ability to adapt to its environment and is altered by many pharmacological (e.g., amphetamine) and neurotoxic (e.g., trimethyltin) agents.
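The within-session decline in activity described above can be summarized in many ways. The sketch below shows one simple possibility, the ratio of counts in the last interval to counts in the first, alongside the session total. The index, the function name, and the example counts are all hypothetical and are not prescribed by the guideline.

```python
def habituation_summary(interval_counts):
    """Summarize one motor activity session.

    interval_counts: activity counts for consecutive, equal-length intervals.
    Returns the session total and the last-to-first interval ratio; a ratio
    well below 1 indicates that activity declined over the session.
    """
    if len(interval_counts) < 2:
        raise ValueError("need at least two intervals to assess habituation")
    first, last = interval_counts[0], interval_counts[-1]
    # The ratio is undefined when there is no activity in the first interval.
    ratio = last / first if first > 0 else None
    return {"total": sum(interval_counts), "last_to_first_ratio": ratio}


# Six hypothetical 10-min intervals from a 60-min session.
summary = habituation_summary([120, 80, 50, 30, 20, 12])
```

The function returns `None` for the ratio when the first interval contains no activity, so that a session with no measurable starting activity is flagged rather than reported as showing normal habituation.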
As is true for all of the measures in the neurotoxicity battery, observed differences in motor activity levels must be interpreted in the context provided by other measures in the neurotoxicity and general toxicity studies as well as what is known about the pharmacology of the compound. An animal that has been administered a peripheral neurotoxicant, such as acrylamide that induces a “dying-back” axonopathy resulting in loss of neural control of the limbs, will likely exhibit decreased motor activity, regardless of how one measures it. Habituation will be difficult to demonstrate, as the initial level of activity within a session will probably be very low to begin with. Obviously, this example would not be interpreted as the compound being active only in the central nervous system because we know that the peripheral effects of acrylamide confound the motor activity assessment. But what if we were dealing with a compound with unknown toxic potential? The point here is that all of the data generated on the compound’s effects in our target species must be reviewed before a proper interpretation can be rendered. In the example above, the FOB data and the neuropathology data collected would lead us to a more accurate interpretation than an examination of motor activity alone could.
5. Auditory Startle

The startle response consists of a characteristic sequence of muscular responses elicited by sudden, intense stimuli. Loud sounds and air puffs elicit startle responses in all mammals studied, including humans, whereas visual stimuli are generally ineffective. Only stimuli with very rapid rise times elicit startle. A sound that slowly increases in intensity (i.e., loudness) will not elicit the startle response. Startle represents a short-latency reflex that is mediated by a simple neural circuit. The auditory startle response (also referred to as the acoustic startle response or reflex) can be differentiated from motor behaviors and other movements due to its very short latency and dependence on the onset of the tone. Auditory startle can be measured electromyographically in rats with latencies of approximately 5 msec when measured in the neck muscles and approximately 8 to 10 msec when measured in the hindlimbs. Measurement and testing of auditory startle falls midway between most observational behavioral tests and electrophysiological measurements. The data are quantified as units of force (e.g., newtons) and are quite reproducible for the same animal over different test sessions on different days. Much of the neural circuitry for the auditory startle response resides in the brain stem and includes the ventral cochlear nucleus, ventral nucleus of the lateral lemniscus, nucleus reticularis pontis caudalis, and motor neurons in the spinal cord [45]. Obviously, the cochlea and other structures involved in hearing must
be intact and functioning correctly for the auditory stimulus to be effective in eliciting startle. In toxicity experiments, when the amplitude of the auditory startle response is reduced, the possibility that the treatment has affected the organs of hearing should be investigated. One simple approach is to test the same animals with air puff startle (tactile modality) to assess whether or not the reduced response amplitude is specific to one of the sensory modalities or represents an effect on other components of the neural circuit. Some testing laboratories routinely include air puff startle for this and other reasons. There are three main parameters that are measured when auditory startle is used in neurotoxicity assessments: latency, amplitude, and habituation. Latency refers to the time that elapses measured from the onset of the tone to either the beginning of the whole-body startle or that point in time when the maximum force of the startle response is recorded. This measure is related to the speed of the nerve impulse as it travels through the neural circuit. Changes in latency are often indicative of compound-related changes in nerve conduction speed. Amplitude refers to the force with which the rat reflexively responds to the sound stimulus. Changes in amplitude can be due to alterations in the central nervous system, in the peripheral nervous system, at the neuromuscular junction, or in the muscles involved in the response. Habituation refers to the decrease in the amplitude of response over repeated stimulus presentations and is considered to be a simple form of nonassociative learning. Changes in habituation that are related to exposure to a compound usually indicate a difference in the way the animal adapts to aspects of its environment.
Rats, like people, are programmed to dampen responses to repetitive stimuli that do not have consequences for them. In assessing habituation, one should keep in mind that if the mean amplitude of the auditory startle response is dramatically decreased compared to controls, adequate evaluation of habituation may not be possible. Additional useful information may be obtained through the application of prepulse techniques. The introduction of a low-intensity stimulus shortly before the intense, startle-eliciting stimulus will result in modification of startle amplitude and latency. These modifications of amplitude and latency are robust and predictable and can lead to assessment of the processing capability of the subject [20]. Since prepulse inhibition occurs with tones that are near the threshold of audibility, this technique is used by a few laboratories to assess auditory thresholds in animals [46] and represents an efficient method for the rapid determination of compound effects upon hearing. This is a promising technology for incorporation into toxicity studies in the future to determine the functional status of the auditory system in animals noninvasively. Unfortunately, there are still a few technical problems to be resolved before this technique can be widely adopted. An efficient test of animal hearing would be very helpful as there is currently no efficient and sensitive procedure in general use for the evaluation of hearing
in animals, although many drugs and neurotoxicants have been found to affect hearing in animals and people.
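The prepulse approach described above is commonly quantified as percent prepulse inhibition, that is, the proportional reduction in startle amplitude when a prepulse precedes the startle stimulus. The sketch below assumes that conventional formula, which is not spelled out in the text. The threshold helper and its 20% criterion are purely hypothetical illustrations of how a threshold might be read from such data, not a validated procedure.

```python
def percent_ppi(pulse_alone, prepulse_plus_pulse):
    """Percent prepulse inhibition from two mean startle amplitudes."""
    if pulse_alone <= 0:
        raise ValueError("pulse-alone amplitude must be positive")
    return 100.0 * (pulse_alone - prepulse_plus_pulse) / pulse_alone


def estimate_threshold(ppi_by_intensity, criterion=20.0):
    """Return the lowest prepulse intensity (dB) whose percent inhibition
    reaches the criterion, or None if no intensity does.

    ppi_by_intensity: dict mapping prepulse intensity in dB to percent PPI.
    criterion: hypothetical cutoff chosen only for this example.
    """
    for intensity in sorted(ppi_by_intensity):
        if ppi_by_intensity[intensity] >= criterion:
            return intensity
    return None


# Hypothetical group means: inhibition grows with prepulse intensity.
ppi = {30: 5.0, 40: 12.0, 50: percent_ppi(200.0, 150.0), 60: 40.0}
threshold_db = estimate_threshold(ppi)
```

A shift of this estimated threshold toward higher intensities in a treated group, relative to controls, would be the kind of result that suggests a compound effect on hearing.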
6. Learning and Memory
Tests of learning and memory are often referred to as "cognitive tests" because they are assumed to require considerable processing and integration by the brain, processes that we often refer to in people as "thinking." It is these types of cognitive processes that make us, you and me, what and who we are. I think we could all agree that protection of these processes from harm done by exogenous compounds is highly desirable for ourselves and our children.
There are many tests of learning and memory in use in animal studies. In general, the greater the complexity of the test, the greater the power of that test to detect compound effects on learning and memory. Beware of tests that are too simple! One would not attempt to assess the intelligence of children by asking one or two simple questions (e.g., What color is the sky? How many days are there in a week?). If the questions were too easy and too few, almost every child would pass and the conclusion would be that all, or almost all, children are equally intelligent. Thanks to 100 years of intelligence testing of children and adults, it is well documented that there is a range of intelligence among people, and many studies suggest that the same is true for animals. We also know that mental retardation in children is a real consequence of prenatal exposures to some compounds. A good example is the mental retardation often seen in the fetal alcohol syndrome [47].
Let us assume that the test selected for an animal study has been shown to be sensitive to the types of effects we are studying, that is, effects on learning and memory. What weight should be given to positive findings in relation to other findings in a neurotoxicity experiment? This is not a simple question.
One strategy that is not defensible is to explain decreased performance on the learning and memory task or tasks by invoking body weight differences in treated offspring that are related to compound exposure, unless it can be shown that size or weight actually affected performance in the test. I have seen researchers write off, or attempt to disregard, effects on tests of learning and memory in developmental studies because the effects were mainly observed in high-dose-group offspring whose body weights were lower than those of controls during the early periods of their lives but caught up to controls later in life, prior to testing. One can only speculate about a mechanism that would explain the test results in such a situation. The indefensible position stated above indicates confusion regarding the issues of growth, development, and adult cognitive ability. Growth and development are related, but different, aspects of what occurs during the period prior to adulthood.
Results of tests of learning and memory must be considered in the context of the other pharmacological and toxicological information available for the compound of interest. If the animals were severely hypoactive and lethargic at the time of testing, a valid assessment of learning and memory was probably not possible. In animal experiments, as in human learning and memory, the factor of motivation should be considered. If the animal is not motivated sufficiently by the reward or the adverse consequence, normal performance is not likely. If performance is maintained by electric shock, as in most passive and active avoidance procedures, the equivalence of the effects of the shock should be assessed for the different exposure groups.
Sometimes a clear-cut or borderline decrement cannot be explained by other factors. Perhaps results from a passive avoidance test or simple water maze lead to the conclusion that the compound of interest is a "bad actor." If one has selected and conducted the test using appropriate criteria and procedures, confidence in the findings should be high. That compound under specific exposure conditions probably deserves the negative label it will receive. Public health is the underlying issue. More sophisticated and sensitive tests, such as schedule-controlled operant behavior (SCOB) tests, will likely provide additional information and characterization of the effects on motivation and cognition. However, it is unlikely that such tests will reverse the conclusion of the screening test, and additional results, even if they are negative, will probably not make the initial finding "go away." The regulatory agency will rightfully consider a positive research finding as a real effect unless a confounding factor is identified that invalidates the test. In such a case, the test will probably need to be repeated.
Tests of learning and memory require proper functioning of integrated neural systems that are highly apical in nature. Performance on these tests represents the highest level of functioning of the central nervous system that current methodology can assess. These brain functions in animals are as close as we can get to testing the effects of the compound of interest on human thinking, emotion, and motivation without actually testing in people. The way we think, feel, and are motivated is what makes us human. Animal studies can help protect us from unwanted influences on these critical aspects of our existence.
7. Neuropathology
Integration of neuropathological findings with the other findings in a neurotoxicity study is crucial. Neuropathology can identify specific lesions that will elucidate the other findings or can stand alone to indict a compound as a neurotoxicant. Serious lesions are sometimes found in the central or peripheral nervous systems without any behavioral correlates. There may be several reasons why this can be so. Possibly the behaviors or functions that could be affected were not tested. Perhaps the time course of the study was too short and behavioral effects would
become manifest if the study continued. It might be that the dose response on other measures interfered with detecting functional effects. In the last case, the high-dose group might have had considerable mortality in the same animals that would have, or might already have, developed the lesions. The intermediate-dose group might then have the same lesions because the dose level was below that which caused mortality but high enough to cause the lesions in some animals.
More often than not, a behavioral or functional change is observed and no lesion is found. As has been emphasized throughout this chapter, behavior is the result of the combined functioning and integration of many different regions and systems in the brain and spinal cord. Lesions may occur at levels that standard neuropathological techniques cannot detect. Alterations in regional levels of neurotransmitters can affect behavior dramatically but cannot be detected with histopathology. The same can be said for changes in the manner that synapses function. Naturally, one can currently focus on the molecular level and find effects that ultimately may or may not be detectable with the techniques of neuropathology.
The peripheral nervous system appears to be more amenable to microscopic examination. Several compounds have been found that target peripheral neurons and induce severe neuropathies. A good example that has been cited already is acrylamide, which causes a "dying back" axonopathy. Abnormalities detected in the behavioral screening battery on such measures as gait in the functional observational battery, hypoactivity in the motor activity assessment, or poor performance in a test of learning and memory requiring coordinated movement (most of those tests do) might all be due to peripheral neuropathy for which the lesions can be identified.
Neuropathology in the offspring of exposed animals is usually less dramatic and more elusive than that observed in adults exposed directly to the compound. Crude measures like the brain weight assessments and morphometric analysis of the layers of the neocortex, hippocampus, and cerebellum can, theoretically, show the consequences of interference with developmental processes by the compound of interest. The comprehensive examinations of the nervous system detailed in the EPA neurotoxicity guidelines make the detection of more subtle lesions and errors in development more likely than behavioral tests alone.
V. CONCLUSION
A general approach for the examination and detection of neurotoxicity in adult and developing animals, usually rats, has been reviewed. This area is constantly evolving, and its methods are expected to become more complex and more powerful in detecting neurotoxicants in the future. The field of neuropathology will provide greater sensitivity with the application of refined techniques such as the quantitation of glial fibrillary acidic protein, increased use of peripheral nerve conduction velocity assessments, and more sophisticated staining and tissue preparation, including the increased use of electron microscopy. Behavioral techniques are also being improved. Tests of learning and memory, in particular, will provide greater sensitivity to potential effects on cognitive functioning. Some researchers in this area are currently talking and writing about the potential for animal cognitive tests to be among the most sensitive screening tests for neurotoxicity. Some investigators suggest that testing of learning and memory in adult animals should be added to the first-tier neurotoxicity screen for EPA-regulated compounds. The next several years will see a significant amount of research attention in this area.
A final word about the use of the term neurotoxicity. There are numerous other approaches and measures that are bona fide measures of neurotoxicity. These avenues of research are ongoing in many scientific disciplines, especially the neurosciences, broadly defined. The contribution to the understanding of how the nervous system functions in its healthy, normal state and in various diseased states has been enormous. Most of these basic research methods, as powerful as they are, are not currently amenable to incorporation into screening approaches.
Through all of the complexity of the methods and the sometimes confusing results of neurotoxicity experiments in animals, we need to keep in mind that the ultimate goal is the protection of the health of people and the other organisms with which we share this planet.
REFERENCES
1. U.S. Congress, Office of Technology Assessment, Neurotoxicity: Identifying and Controlling Poisons of the Nervous System, OTA-BA-436, U.S. Government Printing Office, Washington, DC (1990).
2. U.S. Environmental Protection Agency, Office of Pesticide Programs, Pesticide Assessment Guidelines, Subdivision F-Hazard Evaluation: Human and Domestic Animals, EPA Pub. No. 540/9-82-025, Washington, DC (1982).
3. U.S. Environmental Protection Agency, Pesticide Assessment Guidelines, Subdivision F-Hazard Evaluation: Human and Domestic Animals, Addendum 10-Neurotoxicity Series 81, 82, and 83; PB 91-154617, National Technical Information Service, Springfield, VA (1991).
4. U.S. Environmental Protection Agency, 40 CFR Part 799, Toxic Substances Control Act Test Guidelines; Final Rule, 799.9620 TSCA Neurotoxicity Screening Battery, Federal Register, Vol. 62, No. 158, 43857-43860, August 15, 1997.
5. S.K. Clarren and D.W. Smith, The fetal alcohol syndrome, N. Engl. J. Med. 298:1063-1067 (1978).
6. Committee on the Safety of Medicines, Notes for guidance on reproduction studies, Committee on the Safety of Medicines, Department of Health and Social Security, London, Great Britain (1974).
7. Ministry of Health and Welfare, On studies of the effects of drugs on reproduction, Notification No. 529 of the Pharmaceutical Affairs Bureau, Ministry of Health and Welfare, Japan (1975).
8. U.S. Food and Drug Administration, International Conference on Harmonization: Guideline on Detection of Toxicity to Reproduction for Medicinal Products: Availability Notice, Federal Register, Part IX, 48746-48752, September 22, 1994.
9. V.C. Moser, J.P. McCormick, J.P. Creason, and R.C. MacPhail, Comparison of chlordimeform and carbaryl using a functional observational battery, Fund. Appl. Toxicol. 11:139-206 (1988).
10. H.A. Tilson and V.C. Moser, Comparison of screening approaches, Neurotoxicology 13:1-14 (1992).
11. K.M. Crofton, J.L. Howard, V.C. Moser, M.W. Gill, L.W. Reiter, H.A. Tilson, and R.C. MacPhail, Interlaboratory comparison of motor activity experiments: Implications for neurotoxicological assessments, Neurotoxicol. Teratol. 13:599-609 (1991).
12. T.O. Brock and J.P. O'Callaghan, Quantitative changes in the synaptic vesicle proteins, synapsin I and p38 and the astrocyte specific protein, glial fibrillary acidic protein, are associated with chemical-induced injury to the rat central nervous system, J. Neurosci. 7:931-942 (1987).
13. J.P. O'Callaghan, Neurotypic and gliotypic protein as biochemical markers of neurotoxicity, Neurotoxicol. Teratol. 10:445-452 (1988).
14. P.S. Spencer and H.H. Schaumburg (eds.), Experimental and Clinical Neurotoxicology, Williams and Wilkins, Baltimore (1980).
15. World Health Organization (WHO), Principles and Methods for the Assessment of Neurotoxicity Associated with Exposure to Chemicals (Environmental Health Criteria 60), World Health Organization Publications Center USA, Albany, New York (1986).
16. B.K. Nelson, W.J. Moorman, and S.M. Schrader, Review of experimental male-mediated behavioral and neurochemical disorders, Neurotoxicol. Teratol. 18:611-616 (1996).
17. U.S. Environmental Protection Agency, Office of Prevention, Pesticides and Toxic Substances, Developmental Neurotoxicity Guideline OPPTS 870.6300, available through the Environmental Protection Agency website at http://www.epa.gov/epahome/research.htm (1998).
18. J. Adams, J. Buelke-Sam, C.A. Kimmel, C.J. Nelson, L.W. Reiter, T.J. Sobotka, H.A. Tilson, and B.K. Nelson, Collaborative behavioral teratology study: Protocol design and testing procedure, Neurobehav. Toxicol. Teratol. 7:579-586 (1985).
19. C.C. Korenbrot, I.T. Huhtaniemi, and R.W. Weiner, Preputial separation as an external sign of pubertal development in the male rat, Biol. Reprod. 17:298-303 (1977).
20. J.R. Ison, Reflex modification as an objective test for sensory processing following toxicant exposure, Neurobehav. Toxicol. Teratol. 6:437-445 (1984).
21. R.J. Green and M.E. Stanton, Differential ontogeny of working memory and reference memory in the rat, Behav. Neurosci. 103:98-105 (1989).
22. D. Kucharski and N.E. Spear, Conditioning of aversion to an odor paired with peripheral shock in the developing rat, Dev. Psychobiol. 17:465-479 (1984).
23. D.A. Cory-Slechta, B. Weiss, and C. Cox, Delayed behavioral toxicity of lead with increasing exposure concentration, Toxicol. Appl. Pharmacol. 71:342-352 (1983).
24. B.A. Campbell and V. Haroutunian, Effects of age on long-term memory: Retention of fixed interval responding, J. Gerontol. 36:338-341 (1981).
25. N.E. Spear and B.A. Campbell (eds.), Ontogeny of Learning and Memory, Erlbaum, New Jersey (1979).
26. N.A. Krasnegor, E.M. Blass, M.A. Hofer, and W.P. Smotherman, Perinatal Development: A Psychobiological Perspective, Academic Press, Orlando, FL (1986).
27. D.B. Miller and D.A. Eckerman, Learning and memory measures, Neurobehav. Toxicol. (Z. Annau, ed.), Johns Hopkins University Press, Baltimore, MD, 1986, pp. 94-149.
28. E.P. Riley and C.V. Vorhees (eds.), Handbook of Behavioral Teratology, Plenum Press, New York (1986).
29. R.G.M. Morris, Spatial localization does not require the presence of local cues, Learn. Motiv. 12:239-260 (1981).
30. D.J.-F. de Quervain, B. Roozendaal, and J.L. McGaugh, Stress and glucocorticoids impair retrieval of long-term spatial memory, Nature 394:787-790 (1998).
31. C.V. Vorhees, W.P. Weisenburger, K.D. Acuff-Smith, and D.R. Minck, An analysis of factors influencing complex water maze learning in rats: Effects of task complexity, path order and escape assistance on performance following prenatal exposure to phenytoin, Neurotoxicol. Teratol. 13:213-222 (1991).
32. W.P. Weisenburger, D.R. Minck, K.D. Acuff, and C.V. Vorhees, Dose-response effects of prenatal phenytoin exposure in rats: Effects on early locomotion, maze learning, and memory as a function of phenytoin-induced circling behavior, Neurotoxicol. Teratol. 12:145-152 (1990).
33. M. Akaike, H. Ohno, S. Tsutsumi, and M. Omosu, Comparison of four spatial maze learning tests with methylnitrosourea-induced microcephaly rats, Teratology 49:83-89 (1994).
34. S. Tsutsumi, M. Akaike, H. Ohno, and N. Kato, Learning/memory impairments in rat offspring prenatally exposed to phenytoin, Neurotoxicol. Teratol. 20:123-132 (1998).
35. W.P. Weisenburger, C.L. Kozak, A.R. Hagler, and D.S. Chapin, Perinatal phenytoin and methimazole in rats to compare five tests of learning and memory: Factors relevant to the selection of tests for pharmaceutical safety evaluation, Neurotoxicol. Teratol. 19:257-8 (1997).
36. R.E. Butcher, Behavioral testing as a method for assessing risk, Environ. Health Perspect. 18:75-78 (1976).
37. Armed Forces Institute of Pathology (AFIP), Manual of Histologic Staining Methods, McGraw-Hill, New York (1968).
38. M.P. Pender, A simple method for high resolution light microscopy of nervous tissue, J. Neurosci. Methods 15:213-218 (1985).
39. H.S. Bennet, A.D. Wyrick, S.W. Lee, and J.H. McNeil, Science and art in the preparing of tissues embedded in plastic for light microscopy, with special reference to glycol methacrylate, glass knives and simple stains, Stain Technol. 51:71-97 (1976).
40. R.L. Friede, Developmental Neuropathology, Springer-Verlag, New York (1975).
41. K. Suzuki, Special vulnerabilities of the developing nervous system, Experimental and Clinical Neurotoxicology (P.S. Spencer and H.H. Schaumburg, eds.), Williams and Wilkins, Baltimore, 1980, pp. 48-61.
42. P.M. Rodier and W.J. Gramann, Morphologic effects of interference with cell proliferation in the early fetal period, Neurobehav. Toxicol. 1:128-135 (1971).
43. J.W. Tukey, J.L. Ciminera, and J.F. Heyse, Testing the statistical certainty of a response to increasing doses of a drug, Biometrics 41:295-301 (1985).
44. J.S. Baird, P.J. Catalano, L.M. Ryan, and J.S. Evans, Evaluation of effect profiles: Functional observational battery outcomes, Fund. Appl. Pharmacol. 40:37-51 (1997).
45. M. Davis, The mammalian startle response, Neural Mechanisms of Startle Behavior (R. Eaton, ed.), Plenum, New York (1984).
46. K.M. Crofton, R. Janssen, J. Prazma, S. Pulver, and S. Barone, The ototoxicity of 3,3'-iminodipropionitrile: Functional and morphological evidence of cochlear damage, Hear. Res. 80:129-140 (1994).
47. A.P. Streissguth, H.M. Barr, and D.C. Martin, Alcohol exposure in utero and functional deficits in children during the first four years of life, Mechanisms of Alcohol Damage in Utero (R. Porter, M. O'Connor, and J. Whelan, eds.), Pitman (Ciba Foundation Symposium 105), London (1984).
Toxicological Assessment of the Immune System
Gary J. Rosenthal RxKinetix, Inc., Louisville, Colorado
Dori R. Germolec National Institutes of Environmental Health Sciences, Research Triangle Park, North Carolina
I. INTRODUCTION: IMMUNOSUPPRESSION, HYPERSENSITIVITY, AND AUTOIMMUNITY
Toxicological investigation of the immune system is concerned with adverse effects of physical or chemical agents on the integrated organ system referred to as immunity. At the cellular and molecular level the components of this diffuse system are generally considered to be the province of lymphocytes, macrophages, polymorphonuclear leukocytes, complement components, and a wide array of soluble mediators such as cytokines and antibodies. The methodologies employed in immunotoxicology use the central principles of toxicology in combination with advances in basic and applied immunology to better understand the actions of xenobiotics on the immune system. Immunotoxic agents identified to date span all classes of foreign insult and include environmental and industrial chemicals, pharmaceuticals, consumer products, food, and food additives, as well as natural entities, such as mycotoxins, and physical agents, such as radiation. From the perspective of pathophysiology, the biological manifestations of altered immune homeostasis can generally be divided into three diverse forms of disease: immunosuppression, hypersensitivity, and autoimmunity.
The consequences of xenobiotic-induced immunosuppression can be devastating, as evidenced by the high incidence of secondary cancer in transplant patients following therapeutic immunosuppression or the increased susceptibility
to pulmonary infections characteristic of Yusho disease seen in China and Japan following consumption of immunosuppressive polychlorinated biphenyls and furans from contaminated rice oil [1]. In addition to these and many other clinical cases, a large body of experimental data also exists showing that xenobiotic exposure can produce marked changes in immune competence and significantly decrease resistance to infectious or neoplastic challenge (for review, see [2]).
From an industrial toxicology perspective, hypersensitivity diseases are probably the most common manifestation of immunotoxicity. Although hypersensitivity diseases afflict tens of millions of Americans, the incidence associated with environmental or occupational exposure is unclear. It is likely to be significant, however, in light of the numerous chemicals shown to have produced a hypersensitivity response after occupational exposure, including the diisocyanates, trimellitic anhydrides, and platinum dusts, to name a few. The characteristic that distinguishes allergic responses from the immune mechanisms involved in host defense is the excessive nature of the reaction, which often leads to tissue damage. Almost any organ can be targeted by hypersensitivity reactions, including the gastrointestinal tract, blood elements and vessels, joints, kidneys, central nervous system, and thyroid, although the skin and lung, which demonstrate urticaria and asthma, respectively, are the most common targets.
While the immune system can adequately defend against infectious agents and prevent adverse reactions to self because of the exquisitely organized network of interacting, discriminatory components, self-tolerance is not always preserved, and dysregulated recognition of autoantigens may lead to autoimmune disease.
Table 1 Xenobiotics with the Potential to Induce Autoimmune Disease

Autoimmune disorder             Compound
Systemic lupus erythematosus    Procainamide
                                Isoniazid
                                Penicillamine
                                Hydralazine
Hemolytic anemia                Methyldopa
                                Asbestos
                                Penicillin
                                Sulfa drugs
Thrombocytopenia                Gold salts
                                Chlorothiazide
                                Salicylic acid
                                Rifampin
Scleroderma-like disease        Vinyl chloride
The well-documented examples of drug-induced autoimmune syndromes (i.e., procainamide, isoniazid, sulfa drugs, penicillamine) [3] suggest that exposure to other xenobiotics could contribute to the incidence of these diseases through disruption of normal immune processes. Autoimmune disorders are manifestations of immunological dysregulation in which many predisposing factors (e.g., viral or genetic) can play an etiological role. The pathogenesis of autoimmune disease is diverse and includes the production of autoantibodies, damaging inflammatory cell infiltrates in target tissues, and immune complex formation and deposition in numerous tissues. Examples of xenobiotics capable of inducing an autoimmune response are shown in Table 1.
II. CONSIDERATIONS AND APPROACHES FOR DETECTING IMMUNOMODULATION
Chemical modulation of the immune system by exogenous agents is due as much to the general properties of the agent as to the complex nature of immunity. Because of this inherent complexity, the initial strategies devised by immunologists working in toxicology and safety assessment have been to select and apply a broad and often tiered panel of assays to identify immunomodulatory agents in laboratory animals or via focused epidemiological studies. Although the configurations of these tiered testing panels vary depending on the needs of the laboratory conducting the test and the animal species employed, they usually include assessment of one or more of the following: (1) lymphoid organ weights and histopathology, (2) quantitative assessment of lymphoid tissue cellularity, hematology, and bone marrow differential, (3) immune cell function at the effector or regulatory level, and/or (4) host resistance studies involving infectious or neoplastic challenge. Table 2 lists some of the more common functional methods used for experimental testing of immune status.
A number of test panels have been proposed for evaluating the immune system in experimental animals by various government agencies [4-8]. The tier testing approaches employed by these agencies are similar in design in that the first tier is a screen for immunotoxicity, with the second tier consisting of more specific or confirmatory studies, host resistance studies, or in-depth mechanistic studies. At present, most information regarding these models comes from the U.S. National Institute of Environmental Health Sciences, National Toxicology Program (NIEHS-NTP) [4], followed by the model developed at the National Institute of Public Health and Environmental Protection (RIVM) in Bilthoven, The Netherlands [8].
Table 2 Methods Used to Assess Immunotoxicity

Immune parameter evaluated
Nonspecific markers of               Complete blood count and differential
immunotoxicity/immunopathology       Acute phase proteins
                                     Complement (CH50)
                                     Surface marker phenotypic analysis
Cell-mediated immunity               Mixed lymphocyte response
                                     Mitogen-induced proliferation
                                     Delayed-type hypersensitivity/skin testing
                                     Cytotoxic T-lymphocyte mediated cytolysis
Humoral immunity                     Antibody plaque forming cell assay
                                     Basal or antigen-specific serum antibody titers
                                     Mitogen-induced proliferation
Nonspecific immunity                 Natural killer cell cytotoxicity
                                     Macrophage phagocytosis
                                     Macrophage bactericidal/tumoricidal activity
Host resistance assessment models    Listeria monocytogenes
                                     Plasmodium yoelii
                                     Trichinella spiralis
                                     Streptococcus pneumoniae
                                     Influenza virus
                                     PYB6 or B16F10 tumor cell challenge

While the first-tier screening at RIVM consists of tests for general parameters of specific and nonspecific immunity, tier I of the NIEHS-NTP panel includes functional tests in which an immune response is measured following in vivo antigenic challenge. These are generally considered the most sensitive indicators of immune integrity but are not routinely conducted as part of subchronic toxicology studies in light of the potential for immunization and potential immunogenicity to compromise interpretation of toxicity test results. If sufficient reason exists to pursue any tiered panel, the use of satellite groups of animals may avoid this confounding variable.
Histopathology of lymphoid organs is a pivotal component in the RIVM screening battery [8]. Routine histopathology of lymphoid organs has been shown to be useful in assessing the potential immunotoxicity of a chemical, particularly when these results are combined with the effects observed on the weight of the lymphoid organs and sufficiently high doses of the chemical are tested. In the RIVM panel, if the results in tier I suggest immunotoxicity, then tier II functional studies can be performed to confirm and further investigate the nature of the immunotoxic effect. Information on structure-activity relationships of immunotoxic chemicals can also lead to the decision to initiate functional evaluation. The choice for further studies depends on the type of immune abnormality observed. The second tier consists of a panel of in vivo and ex vivo/in vitro assays
including cell-mediated immunity, humoral immunity, macrophage and natural killer (NK) cell function, as well as host resistance assays (see Table 2).
Recently, a database consisting of over 50 compounds, which were evaluated in one or more tiers of the NIEHS-NTP panel, has been analyzed in an attempt to improve the accuracy and efficiency of screening chemicals for immunosuppression and to better identify those tests that predict immune-mediated diseases [9,10]. The compounds in this database covered a broad spectrum of environmental, industrial, and pharmacological agents. While these reports describe limitations existing in the data sets used in the analyses, a number of important conclusions were drawn from these data [9,10]:
Assessment of as few as two or three immune parameters may be used to successfully predict immunotoxicants in mice. In particular, lymphocyte enumeration and quantitation of the T-dependent antibody response appear particularly favorable. Furthermore, commonly employed gross measures such as lymphoid organ weights appear somewhat insensitive.
A good correlation existed between changes in the immune tests and altered host resistance. However, in many instances, immune changes were observed in the absence of detectable changes in host resistance, suggesting that immune tests are, in general, more sensitive than the host resistance assays.
No single immune test was identified that could be considered highly predictive for altered host resistance in mice. However, combining several immune tests increased the ability to predict host resistance deficits, in some cases to about 80%.
Most immune function-host resistance relationships follow a linear rather than threshold model, suggesting that even small changes in immune function may theoretically manifest into some deficit in host resistance.
However, because of the variability in the responses, it was not possible to establish linear or threshold models for most of the chemicals studied when the data sets were combined.
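The idea that combining immune tests improves prediction of host resistance deficits can be illustrated with a simple concordance calculation. This is a minimal sketch on invented data, not a reproduction of the cited analyses: the record layout, the test names (`pfc`, `nk`), and the rule that a combination is "positive" when any one of its tests is altered are all assumptions made for illustration.

```python
def concordance(records, tests):
    """Percent of compounds for which the combined immune test call
    agrees with the host resistance outcome.

    records: list of dicts with one boolean per immune test plus a
             boolean "host_resistance_altered" (hypothetical layout).
    tests:   names of the immune tests to combine; the combination is
             called positive if ANY listed test was altered (assumed rule).
    """
    if not records:
        raise ValueError("no records supplied")
    agree = 0
    for rec in records:
        predicted = any(rec[name] for name in tests)
        if predicted == rec["host_resistance_altered"]:
            agree += 1
    return 100.0 * agree / len(records)


# Invented example: four compounds, two immune tests.
compounds = [
    {"pfc": True,  "nk": False, "host_resistance_altered": True},
    {"pfc": False, "nk": True,  "host_resistance_altered": True},
    {"pfc": False, "nk": False, "host_resistance_altered": False},
    {"pfc": True,  "nk": False, "host_resistance_altered": False},
]

single = concordance(compounds, ["pfc"])
combined = concordance(compounds, ["pfc", "nk"])
```

In this invented data set the combination agrees with the host resistance outcome more often than the `pfc` test alone, while the fourth compound shows the pattern noted above: an immune change with no detectable change in host resistance.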
A. Dosing and Species Considerations
A variety of factors need to be considered when evaluating the potential of an environmental agent or pharmaceutical to adversely influence the immune system of experimental animals. As in most toxicological investigations, the selection of the exposure route should parallel the most probable route of human exposure. Treatment conditions should take into account the biophysical properties of the agent, including pharmacokinetics, metabolism, and mechanism of action, if available. Dose levels should be chosen that will likely establish distinct dose-response curves as well as a no observable effect level (NOEL). Although in
some instances it may be necessary to include a dose level that induces some other manifestation of overt toxicity, immune changes observed at such a dose level should be interpreted cautiously, since severe stress, malnutrition, and the resultant cachexia are known to impair immune responsiveness. Lastly, inclusion of a positive control group with an agent that shares characteristics of the test compound may also be useful to validate the robustness of the assay.
It is not surprising that the selection of the most appropriate animal model for immunotoxicology studies has been the subject of much deliberation. Assuming the species of primary interest is the human, toxicity testing should be performed with a species that will respond to the test chemical in a pharmacological and toxicological manner similar to that anticipated in humans. For example, the test animals and humans should metabolize the chemical similarly and should have identical target organ responses and toxicity. Toxicological studies are often conducted in several animal species, most often employing a rodent and nonrodent species. Although some exceptions exist, many for immunosuppressive therapeutics, rodent data on target organ toxicities and the comparability of immunosuppressive doses have generally been good predictors of subsequent clinical observations. Taking into account the toxicokinetic and pharmacokinetic differences that exist between experimental animals and humans [11], rodents continue to be very useful models for examining the immunotoxicity of non-species-specific compounds based on established similarities of toxicological profiles as well as the relative ease of generating host resistance and immune function data.
As novel xenobiotics continue to be produced, particularly recombinant biologics and gene therapy agents, comparative toxicological assessment should be seriously considered, since their safety assessment will likely present species-specific host interactions and toxicological profiles.
B. Approaches to Evaluating Humoral Immunity

The humoral component of the immune response is primarily involved in the defense against soluble and extracellular pathogens. The soluble mediator of humoral immunity is the antibody molecule, which consists of two identical heavy chains and two identical light chains. A variable region at the N-terminals of both chains constitutes the antigen binding site, and shows considerable heterogeneity in the composition and arrangement of the amino acids that make up the region, allowing for the recognition of a large number of widely diverse antigens. Immunoglobulins on the surface of B lymphocytes serve as signal transducers, and antigen binding to surface immunoglobulins leads to cross-linking of the antibody molecules, initiating intracellular signaling that may ultimately lead to cell activation, proliferation, and maturation into an antibody-secreting plasma cell. The antibody molecule also serves as a "bridge" between innate and adaptive immune functions, via its ability to coat bacterial pathogens thus enhancing
phagocytosis by macrophages and granulocytes, and through complement fixation by antigen-antibody complexes.

In risk assessment studies designed to evaluate the most sensitive and predictive tests for immunotoxicology, assessment of humoral immunity, via examination of antigen-specific antibody responses, has been shown to be the best single indicator to determine the potential for a compound to induce alterations in immune function [9]. This is likely a reflection of the fact that measurement of the antibody response assesses more than B cell function, as both T cell and macrophage activity are generally required in mounting an effective antibody response. Macrophages have a significant role in antigen processing and presentation and may also modulate the antibody response via cytokine release. Antigen-specific T cells may have either a regulatory (T suppressor cells) or an accessory role (T helper cells). In T cell-dependent antibody responses, T helper cells produce a variety of cytokines, in particular interleukin-4, -5, and -6 (IL-4, IL-5, and IL-6), which regulate the proliferation, differentiation, and isotype production in B cells.

In laboratory rodents, primary antibody responses are commonly evaluated after in vivo immunization with a T-dependent antigen, such as sheep red blood cells (SRBC), by quantitating the number of antibody-producing cells that produce plaques in a modification of the method first described by Jerne and Nordin [12]. Splenic lymphocytes are the most common source of immunocompetent cells for rodent studies. Data for spleen cellularity and weight are also included in the evaluation, to correct for chemical-induced alterations in spleen cell number or as an indicator of overt toxicity to the whole animal.
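The correction for spleen cellularity mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not a method taken from the cited references: the function name `pfc_summary`, its parameters, and all numerical values are hypothetical, and it assumes that plaque counts are expressed both per million spleen cells plated and per whole spleen.

```python
def pfc_summary(plaques_counted, cells_plated, spleen_cellularity):
    """Normalize a plaque count two ways (illustrative helper, hypothetical name).

    plaques_counted    -- plaques counted in the assay chamber
    cells_plated       -- number of spleen cells added to that chamber
    spleen_cellularity -- total nucleated cells recovered from the spleen
    """
    if cells_plated <= 0 or spleen_cellularity < 0:
        raise ValueError("cell numbers must be positive")
    frequency = plaques_counted / cells_plated      # plaque-forming cells per cell plated
    per_million = frequency * 1e6                   # PFC per 10^6 spleen cells
    per_spleen = frequency * spleen_cellularity     # PFC per whole spleen
    return per_million, per_spleen


# Hypothetical data: the treated animal has fewer plaques AND a smaller spleen.
control = pfc_summary(plaques_counted=150, cells_plated=1e5, spleen_cellularity=1.2e8)
treated = pfc_summary(plaques_counted=60, cells_plated=1e5, spleen_cellularity=0.6e8)

print("PFC/10^6 cells, % of control: {:.0f}".format(100 * treated[0] / control[0]))
print("PFC/spleen,     % of control: {:.0f}".format(100 * treated[1] / control[1]))
```

With these invented numbers the response falls to 40% of control per million cells but to 20% of control per spleen, which is why the two expressions are usually read together with spleen weight and cellularity.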
While chemical-induced alterations in numbers of antibody-forming cells are suggestive of defects in one or more of the cellular pathways contributing to the antibody response, the in vivo plaque assay is limited in that it does not identify the specific cellular target. Thus, in vitro methods to assess antibody production, which can be useful in determining the specific mechanism of immunosuppression, have been developed. In general, these studies use purified macrophages, B and T lymphocytes, isolated from control animals or animals treated in vivo with the chemical of interest, in elegant separation-reconstitution experiments to identify the target cell type. This type of methodology has been used successfully to elucidate cell-specific toxicity for immunomodulatory compounds such as dideoxyinosine [13], carbon tetrachloride [14], and 2,3,7,8-tetrachlorodibenzodioxin (TCDD) [15]. Additional mechanistic information can also be obtained using T-independent antigens, such as lipopolysaccharide and dinitrophenol-ficoll.

Serum levels of antigen-specific immunoglobulin M (IgM) or immunoglobulin G (IgG) from treated laboratory animals can be quantitated using enzyme-linked immunosorbent assay (ELISA). Primary and anamnestic humoral immune responses in nonhuman primates are frequently evaluated in this manner. In rodents, there is a high degree of correlation between the in vivo plaque assay and
the SRBC ELISA [16]; however, the kinetics may be slightly different across rodent species, and it is critical to ensure that quantitation occurs at the peak of the response. Use of ELISA techniques to evaluate antigen-specific antibody responses has become increasingly popular, as the potential for robotic automation and the availability of inexpensive microplate dilutors, pipettors, and ELISA readers have allowed for the generation of highly reproducible results for large numbers of samples in an ever-decreasing time period.

There is a growing consensus that measurement of total IgM and IgG is of limited predictive value, particularly in the absence of functional tests. However, these end points can be measured relatively inexpensively and do not require additional groups of animals in standard toxicology studies, and as such can provide some information on immune status at relatively little cost.

Studies in humans have generally been limited to quantitating serum Ig levels, in vitro responses to recall antigens (e.g., tetanus), or in vitro secretion to nonspecific stimuli (e.g., pokeweed mitogen). In vitro assays to assess primary antibody responses are difficult to perform with human cells, due to the difficulty in obtaining enough responsive lymphocytes [17]. Recent studies have made use of immunodeficient mouse strains, such as the CB-17 scid/scid (SCID) mouse engrafted with human immune cells, but the utility of these methods to examine chemical-induced alterations in humoral immunity may be limited by the high degree of variability in the reconstituted responses [18].

Vaccination represents the best opportunity for monitoring alterations in humoral immunity in humans. Secondary (recall and booster) responses appear to be less sensitive to chemical-induced perturbations than do primary responses [19]; however, in cases of severe immunosuppression, response to recall antigens can be informative [20].
In adults, postvaccination antibody responses to Epstein-Barr virus and influenza neoantigens may be useful to assess primary reactions [21,22]. There is a growing realization that, as the developing immune system may be particularly vulnerable to immunotoxic agents, studies of primary immune responses in newborns and young children in conjunction with established vaccination programs (such as against measles, diphtheria, tetanus, and poliomyelitis) may offer a significant opportunity to assess chemical-induced alterations in immune status and improve public health.
C. Approaches to Evaluating Cellular Immunity

A number of factors determine whether a specific antigen will induce a cell-mediated immune (CMI) response, a humoral immune response, or both. These factors include the route of exposure, the physicochemical attributes of the antigen, the nature of antigen processing and presentation, as well as the initial and ultimate distribution of the antigen within the host tissue. Despite this inherent complexity, some patterns do exist that may assist in identifying those antigens
that generally elicit CMI. These include chemical agents and drugs that covalently bind to self-proteins, tissue-associated antigens, and certain antigenic determinants on persistent microorganisms, such as Mycobacterium tuberculosis. Generally, the induction of CMI proceeds when small lymphocytes differentiate into large "blast-transformed" cells and ultimately divide or clonally expand, giving rise to cells responsible for immunological memory and effector cell function. Depending on the stimulus, T cells can further differentiate into effector cells possessing cytotoxic potential (i.e., cytotoxic T cell; CTL; CD8), helper potential (Th; CD4), which facilitates antibody production by B cells and assists in certain T-cell functions, or suppressor potential (Ts; CD8) capable of inhibiting certain T and/or B cell responses. At the heart of the initial T cell interaction with antigen is its ability to recognize the foreign agent and become activated along a specific pathway that results in functions dependent on cell contact (i.e., cytotoxicity) or functions resulting in amplification or suppression of the capacity of other cells through the release of soluble mediators, such as interleukin-1 through -6 and interferon-γ (IFN-γ). Work by Bottomly [23] demonstrated that CD4+ Th cells can be further subdivided into two distinct populations, referred to as Th1 and Th2 cells. Th1 cells produce IL-2 and IFN-γ, while Th2 cells produce IL-4, IL-5, and IL-6. Changes in the normal homeostatic control of these cells, and their respective soluble mediators, may be the basis for a number of hypersensitivity diseases including allergy [24,25].

Assessment of T cell integrity is performed using a number of test methods such as quantitation of lymphocyte subsets or lymphoproliferation assays [26] that measure the blastogenic and proliferative capacity of splenic or circulating lymphocytes to selected plant lectins or mitogens (e.g., phytohemagglutinin or concanavalin A).
In light of the immune system's dependence on clonal expansion following antigen exposure, a decrease in lymphoproliferation is clearly an immunotoxic event. Other frequently employed methods used to detect T cell dysfunction are the mixed lymphocyte response (MLR) [36] and the CTL assay [27]. The MLR tests the proliferative response of T lymphocytes to surface major histocompatibility complex (MHC) antigens on allogeneic cells and provides a sensitive indicator for cell-mediated immunity. Clinically, the MLR measures those cellular events involved in graft rejection and graft vs. host reactions and has been shown to be predictive of host response to organ transplants. T lymphocytes with cytotoxic effector function are generated in response to a variety of stimuli, including allogeneic cell surface MHC determinants, certain mitogenic lectins, chemically or virally modified autologous or syngeneic cells, as well as unique tumor-associated antigens. The differentiation of CTLs from their precursors involves a highly complex series of cellular interactions and production of soluble mediators, ultimately resulting in the production of effector cells capable of recognizing and lysing the target. Assessment of CTL function is measured
by the MHC-restricted lysis of sensitive target cells and has been recently reviewed by House and Thomas [37].
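Target-cell lysis in assays of this kind is commonly expressed as percent specific lysis. The sketch below assumes a label-release readout (for example, chromium release measured in counts per minute), which the text above does not specify; the function name `percent_specific_lysis` and all counts are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Percent specific lysis from label-release counts (illustrative helper).

    experimental -- release from targets incubated with effector cells
    spontaneous  -- release from targets incubated in medium alone
    maximum      -- release from targets lysed completely (e.g., with detergent)
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


# Hypothetical counts per minute at three effector:target (E:T) ratios.
spontaneous_cpm, maximum_cpm = 300, 3300
experimental_cpm = {"100:1": 1800, "50:1": 1200, "25:1": 750}

for ratio, cpm in experimental_cpm.items():
    lysis = percent_specific_lysis(cpm, spontaneous_cpm, maximum_cpm)
    print("E:T {:>5}  specific lysis {:.1f}%".format(ratio, lysis))
```

Subtracting spontaneous release from both numerator and denominator keeps background leakage of label from being counted as effector-mediated killing.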
D. Approaches to Evaluating Nonspecific Immunity: Macrophages and Natural Killer Cells

Macrophages originate in the bone marrow as promonocytes and are then released and carried into the circulation as monocytes in a relatively immature state, prior to further differentiation at various organ sites. In addition to lymphoid organs, macrophages are found in almost every other organ/tissue, including the liver (Kupffer's cells), lung (alveolar macrophages), and skin (Langerhans' cells). These cells participate in immune responses at a variety of levels [28], including (1) bidirectional interactions with lymphocytes (i.e., antigen processing and presentation), (2) production of soluble mediators that control other cellular as well as acute and inflammatory responses, (3) scavenger function for the removal of insoluble materials or damaged cells, and (4) host defense against intracellular or extracellular microorganisms or neoplastic cells.

A comprehensive assessment of macrophage function requires multiple tests that take into consideration their heterogeneous function, origin, and activation state. The concept of macrophage activation, developed in the 1960s on the basis of work of Mackaness [29], is central to any analysis of isolated macrophages. While many mechanistic studies have been performed with elicited, peritoneally derived macrophages, an applied approach warrants evaluation of macrophages derived from the organ(s) most closely associated with chemical exposure, since macrophages derived from different sites of the body may be functionally and phenotypically distinct. Such heterogeneity is probably teleologically driven by the unique environment within which each tissue macrophage operates. For instance, while many macrophages operate in a relatively anaerobic state, the alveolar macrophage resides in an environment with comparatively high oxygen levels, which may be related to its relatively robust reactive oxygen production compared with the peritoneal macrophage.
Techniques are available to obtain highly enriched macrophage populations from many organs [30]. Furthermore, the cells are well suited to short-term culture and activation by mediators such as interferon-γ and lipopolysaccharide [31]. The methods employed to evaluate the functional status of macrophages following exposure to suspect immunotoxicants vary considerably, ranging from phagocytic indices [32] and release of a growing list of soluble mediators to complex bactericidal or tumoricidal activities, including the release of reactive oxygen or nitrogen [33,34]. Table 3 lists a number of the biological capacities and functions commonly evaluated in immunotoxicological investigation. Among the various cells that have been shown to present antigen in an MHC-restricted manner, most are of the macrophage cell lineage. These cells
Table 3 Immunobiological Functions and Capacities Associated with Macrophages

Function/capacity                  Examples
Interaction with lymphocytes       Antigen processing in conjunction with MHC complex
Production of soluble mediators    Interleukins
                                   Tumor necrosis factor
                                   Fibronectin
                                   Arachidonic acid metabolites
                                   Platelet activating factor
Scavenger                          Antigen uptake, incorporation, and catabolism within phagolysosomes
                                   Phagocytosis
Host defense                       Tumor cytostasis or cytocidal activity
                                   Bactericidal activity
                                   Reactive oxygen and reactive nitrogen production
share three important functions common to all antigen-presenting cells: (1) expression of class II glycoproteins on their surfaces, (2) ability to process antigen, and (3) ability to synthesize and release immunomodulatory soluble mediators. Immunomodulatory chemicals may alter antigen presentation by affecting one or a combination of these processes.

The natural killer (NK) cell is a unique lymphocyte population, which, unlike cytotoxic T lymphocytes, can target and lyse virally transformed or neoplastic cells independent of the MHC antigens on the target cell surface. Natural killer cells are normally present in measurable levels in healthy individuals and are frequently considered to be the first line of protection against primary tumors. Recent work has shown that NK cells may also be important in the early defense against certain infectious agents, as demonstrated by the ability of NK cells to lyse virus-infected fibroblasts or epithelial cells and target intracellular bacteria residing in monocytes. Mechanistically, the tumoricidal and anti-infective activity of NK cells can be augmented by certain soluble mediators, including interferon-α, -β, and -γ and interleukin-2 [35,36]. In addition to being activated by cytokines, the NK cell itself is a potent source of certain soluble mediators, including IFN-γ and granulocyte-macrophage colony-stimulating factor (GM-CSF). Considering this established role of NK cells in neoplastic immunosurveillance, an adequate understanding of chemical-induced suppression of NK cell activity may provide insight into the mechanism(s) by which certain chemicals exert their carcinogenic effects. Immunotoxicologic investigations frequently include the evaluation of NK cell integrity in a functional manner, using cytotoxicity assays, as well as quantitatively through lymphocyte subset analysis of circulating blood. The NK cell cytotoxicity assay is one of the more easily conducted functional tests performed in an immunotoxicological assessment, and can be performed using human peripheral cells and readily available target cells such as the K562 cell line or using rodent spleen cells with the YAC-1 tumor cell line [37].

E. Approaches to Evaluating Hypersensitivity Responses

Xenobiotics that induce hypersensitivity responses are often low-molecular-weight, highly reactive molecules (haptens) or proteins that produce a unique and antigen-specific immune response. The clinical/diagnostic characteristic that sets these responses apart from immune mechanisms involved in host defense is that the reaction is characteristically excessive and often leads to tissue damage. Clinical differentiation of allergic responses from nonimmune irritant responses is based on their persistence and severity. Chemical-induced hypersensitivities can be considered to fall into two general categories distinguished both mechanistically and temporally: (1) delayed-type hypersensitivity, which is a CMI-based response that occurs within 24 to 48 hr after challenge, and (2) immediate hypersensitivity, which is mediated by immunoglobulin (most commonly IgE) and manifests within minutes following exposure to an allergen. The type of immediate hypersensitivity response elicited (i.e., anaphylactic, cytotoxic, Arthus, or immune complex) depends upon the interaction of the sensitizing antigen or structurally related compound with antibody. In contrast, delayed-type hypersensitivity responses are characterized by T lymphocytes bearing antigen-specific receptors which, on contact with cell-associated antigen, respond by secreting cytokines.

Hypersensitivity responses usually occur at potential xenobiotic portals of entry, which explains why the skin and respiratory tract are very common disease targets. Mononuclear phagocytes have a major role in mediating local responses, initially via antigen processing and later via the release of reactive oxygen species and cytokines that modulate the recruitment and activation of additional cell types, including polymorphonuclear leukocytes (PMNs) and lymphocytes. In addition to leukocytes, local cell types are often involved in the response, including keratinocytes, epithelial cells, and fibroblasts.
Historically, the guinea pig has been used to test for potential sensitizers. In the induction phase (primary exposure period), the guinea pigs are treated with the test agent intradermally and/or topically, followed by reexposure(s) (challenge phase) to the same test compound, normally after a period of 10 to 14 days. Erythema and edema are measured at the site of the challenge exposure with a nonirritant concentration of test compound. Because guinea pigs are large, several graded doses of antigen may be tested and an entire dose-response curve can be generated by comparing skin reactions in individual animals. However, it is
relatively expensive to purchase as well as maintain guinea pigs, there are few inbred strains, and immunological reagents are not widely available. Many variations in procedures for guinea pig hypersensitivity assays have been studied (e.g., Buehler occluded, guinea pig maximization, split adjuvant), details of which can be found elsewhere [38]. These guinea pig models are very sensitive, and it has been suggested that, in light of this, they may produce more false positives than is desirable. This argument may not be completely convincing when one considers the notable heterogeneity of immune responses in the human population.

Many efforts have been made to substitute the guinea pig assays with mouse models. Gad et al. [39] proposed a mouse ear swelling test (MEST). This procedure is similar technically to the guinea pig assay in that both induction and challenge phases are required, but the response is quantitated by measuring an increase in ear thickness when the material for challenge is applied. One of the more intensively studied methods is the local lymph node assay (LLNA) [40]. In this procedure, the test material is applied topically in three successive daily applications to both ears of the test species, usually the mouse. Control mice are treated with the vehicle alone. After 5 days of exposure, mice are injected with radioisotopically labeled DNA precursors (e.g., ³H-thymidine), and single-cell suspensions are prepared from the lymph nodes draining the ears. At least one concentration of the test chemical must produce a threefold or greater increase in lymphocyte proliferation in the draining lymph nodes of test animals compared with vehicle-treated control mice to be considered positive. The primary advantage of this assay is that it minimizes the manipulation of animals.
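The threefold criterion described above amounts to a ratio of treated to vehicle-control proliferation, often called a stimulation index. The following is a minimal sketch under stated assumptions: proliferation is summarized as the group mean of disintegrations per minute (dpm), the function names and all dpm values are hypothetical, and only the threefold threshold is taken from the text.

```python
from statistics import mean

POSITIVE_SI = 3.0  # threefold increase over vehicle control, as stated in the text


def stimulation_indices(vehicle_dpm, treated_dpm_by_conc):
    """Return {concentration: stimulation index} relative to the vehicle group mean."""
    vehicle_mean = mean(vehicle_dpm)
    if vehicle_mean <= 0:
        raise ValueError("vehicle control mean must be positive")
    return {conc: mean(values) / vehicle_mean
            for conc, values in treated_dpm_by_conc.items()}


def is_positive(si_by_conc, threshold=POSITIVE_SI):
    """A chemical is scored positive if any tested concentration reaches the threshold."""
    return any(si >= threshold for si in si_by_conc.values())


# Hypothetical dpm values; keys are test concentrations in percent (w/v).
vehicle = [400, 500, 600]
treated = {5: [600, 700, 800], 10: [1200, 1300, 1400], 25: [1900, 2000, 2100]}

si = stimulation_indices(vehicle, treated)
for conc in sorted(si):
    print("{:>3}%  SI = {:.1f}".format(conc, si[conc]))
print("positive:", is_positive(si))
```

In this invented data set only the highest concentration reaches the threshold, which is sufficient for a positive call under the "at least one concentration" rule.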
III. THE ROLE OF ENZYMATIC TRANSFORMATION IN IMMUNOTOXICITY
In addition to those chemicals that cause direct damage to immune cells and tissues, there are many compounds, including organic solvents such as benzene, cytoreductive drugs, pesticides, mycotoxins, and polycyclic aromatic hydrocarbons (PAHs), that induce immune alterations only after undergoing enzyme-mediated reactions within various tissues (Table 4). The biochemical pathways that have evolved to metabolize exogenous chemicals generally make these compounds less toxic (deactivation reactions) and more water-soluble, thus facilitating their elimination from the body. However, in a number of cases, exogenous compounds may be transformed to active or more toxic metabolites, resulting in activation or bioactivation reactions. Bioactivation reactions that result in toxicity are frequently due to the formation of reactive intermediates, including epoxides, free radicals, or N-hydroxyl derivatives (reviewed in [41,42]).
Table 4 Examples of Immunotoxic Compounds Requiring Metabolic Activation

Class                Compounds
Mycotoxins           Aflatoxin
                     Ochratoxin A
                     Wortmannin
Organic solvents     Benzene
                     Carbon tetrachloride
                     Ethanol
                     n-Hexane
PAHs                 Benzo[a]pyrene
                     Dimethylbenzanthracene
                     3-Methylcholanthrene
Pesticides           Chlordane
                     Malathion
                     Parathion
Miscellaneous        Cyclophosphamide
                     Dimethylnitrosamine

Source: Adapted from Smialowicz and Holsapple, 1996.
Xenobiotic metabolism occurs via enzymatic reactions that can be broadly classified into two categories, phase I and phase II reactions. These reactions generally work in sequence to detoxify and facilitate removal of xenobiotics from the body. Phase I reactions are frequently oxidation reactions; however, reductions, hydrations, ester hydrolysis, alcohol and aldehyde dehydrogenation, and dismutation reactions also occur. The class of reactions termed phase II generally act through conjugation of the xenobiotic molecule with a polar compound, increasing water-solubility and accelerating excretion. Phase II reactions include methylations, acetylations, and conjugation reactions with glutathione, glucuronides, glucose, sulfates, thiols, and thiosulfates. The substrates for phase II reactions can be the unchanged xenobiotic or the metabolic product of a phase I reaction [42]. Table 5 lists the principal phase I and phase II enzymes responsible for xenobiotic metabolism.

Although metabolism by the cytochrome P450s consists predominantly of deactivation reactions, the P450s are the principal phase I enzymes involved in metabolic activation of foreign compounds. These enzymes are widespread in nature, tend to be concentrated in the portals of entry to the body (skin, respiratory system, digestive system), and are thought to function as a first line of defense by detoxifying exogenous compounds before toxic insult can occur. The general reaction catalyzed by the P450s is characterized by the addition of a hydroxyl group to a substrate molecule:
Table 5 Phase I and Phase II Metabolic Enzymes

Class               Reaction type                     Enzyme
Phase I enzymes     Oxidation                         Alcohol dehydrogenase
                                                      Aldehyde dehydrogenase
                                                      Xanthine oxidase
                                                      Monoamine oxidase
                                                      Diamine oxidase
                                                      Prostaglandin H synthase
                                                      Flavin monooxygenase
                                                      Cytochrome P450
                    Hydrolysis                        Carboxylesterase
                                                      Peptidase
                                                      Epoxide hydrolase
                    Reduction                         Azo-, nitro-reductase
                                                      Carbonyl reductase
                                                      Disulfide reductase
                                                      Sulfoxide reductase
                                                      Quinone reductase
                                                      Cytochrome P450 (reductive dehalogenation)
Phase II enzymes    Addition of functional groups     N-acetyltransferase
                                                      Methyltransferase
                    Conjugation                       UDP-glucuronosyltransferase
                                                      Sulfotransferase
                                                      Glutathione S-transferase
                                                      Acyl CoA synthetase/N-acyltransferase

Source: Adapted from Parkinson, 1996.
$$\mathrm{RH} + \mathrm{O_2} + \mathrm{NADPH} + \mathrm{H^+} \longrightarrow \mathrm{ROH} + \mathrm{H_2O} + \mathrm{NADP^+}$$
To date, over 480 P450 genes have been characterized in a variety of species, and more than 20 distinct human isozymes are known [43]. Genetic polymorphisms in P450 isozymes have been associated with drug- and chemical-induced toxicity [44,45]. Three gene families (CYP1, CYP2, and CYP3) appear to be primarily responsible for the oxidation of foreign compounds, including drugs, pesticides, and environmental contaminants [46]. These families may have evolved from steroidogenic P450s, responsible for the oxidation of endogenous compounds, to detoxify xenobiotics such as plant toxins taken in through the diet. Considerable similarities or differences may exist across species in isozyme expression and substrate specificity within a given subfamily [47]. For example, an ortholog of
1A with similar catalytic properties and consistent substrate preferences is found in diverse species such as rodents, chickens, and fish [48]. In contrast, orthologs in the CYP2C subfamily, while expressed in a variety of species, have widely varied substrate specificities and can be expressed in sex-specific or sex-independent fashion [48].

The regulation of P450s has been studied extensively, and a number of factors may affect P450 expression, including gender, age, nutritional status, disease, genetic predeterminants, environmental pollutants, and stress [49-52]. Of particular importance with respect to immunotoxicity is the fact that both very young and very old organisms are deficient in many of the constitutive P450 enzymes, although certain P450 isozymes can be induced in these groups by exposure to xenobiotics. Thus, compounds can be more or less toxic as a result of age or nutritional deficiencies. Tissue concentrations of various P450s can also be influenced by dietary components as well as by any of a large number of lipophilic xenobiotics. TCDD and phenobarbital (PB) have been widely studied as inducers of the CYP1A and the CYP2B families, respectively. Many PAHs have been found to induce the 1A and 1B families, with the strongest inducers found among the most planar PAHs. As the 1A and 1B families are involved in bioactivation reactions, inducers of these P450 families are of special concern with regard to immunotoxicity [53,54]. Polycyclic aromatic hydrocarbons such as benzo[a]pyrene are metabolized primarily in the liver, and reactive metabolites are transported by serum proteins to other tissues [55]. While splenic and alveolar macrophages have demonstrated CYP1A1 activity and the ability to produce reactive metabolites after exposure to PAHs, isolated lymphocytes and thymocytes do not demonstrate significant metabolic activity [56-59].
Another widely investigated inducer is ethanol, which induces CYP2E1, an enzyme involved in the metabolism of low molecular weight organic solvents such as benzene. Benzene and its metabolites have long been associated with hematological and immunological disorders, including leukemia in humans. A number of experimental studies suggest that benzene metabolites, including hydroquinone, catechol, and phenol, are responsible for its hematotoxicity [60]. Induction of CYP2E1 in rats by treatment with ethanol enhances the myelotoxicity of benzene [61], and inhibition of CYP2E1 activity by administration of propylene glycol partially prevents benzene-induced immuno- and myelotoxicity [62].

The elucidation of the metabolic pathways for enzymatic biotransformation of xenobiotics and immunotoxicity studies are areas that do not frequently overlap. However, as can be seen from the examples described above, knowledge of metabolism and identification of the ultimate toxic species are critical in understanding the mechanisms of immunotoxicity and the target cell populations for a wide variety of xenobiotics. Furthermore, an awareness of how these two processes interrelate is essential to conduct risk assessment in immunotoxicology, taking into account likely metabolites and polymorphisms in humans of drug-metabolizing enzyme systems. Such concerns also exist in the design of therapeutics used to treat HIV infections, transplant rejection, and other diseases that affect the immune system.
IV. IMMUNOMODULATION IN HUMAN POPULATIONS

Evaluation of immunotoxicity in humans is notably more complex than in animals, considering the limited number of noninvasive tests and the relatively heterogeneous biological responses in the general population. In addition, with the exception of controlled clinical studies, exposure levels of the agent (i.e., dose) are often difficult to verify. When immune function studies are performed in humans, it is essential that recently exposed populations be studied and sensitive tests for assessing the immune system be performed, since many immune changes in humans following chemical exposure may be occasional and subtle. Because many immune tests performed in humans have a certain degree of overlap (redundancy), it is also important that a positive diagnosis of immune dysfunction be based not on a single change but on a profile of changes, similar to that observed in primary or secondary immunodeficiency diseases (e.g., low CD4:CD8 ratios accompanied by changes in skin tests to recall antigens). The World Health Organization (WHO) has prepared a monograph [63] providing testing schemes and their respective pitfalls for examining immune system changes in humans. It should be noted that the selection of many of these assessments was derived from observations in patients with primary immunodeficiency disease, individuals who suffer from a degree of immunosuppression considerably more severe than that induced by chemicals (immunomodulatory drugs excluded). In light of the difficulties that exist in identifying chemical-induced immunosuppression in humans, establishment of exposure levels (e.g., blood or tissue levels) of the suspected chemical(s) is essential in determining a cause-effect relationship. In human immunological studies, it should not be necessary to observe clinical diseases in order for immunosuppression to be meaningful, for several reasons.
First, uncertainties exist regarding whether the relationship between immune function and clinical disease follows linear or threshold responses. For instance, in a linear relationship, even minor changes in immune function would relate to increased disease, given that the population examined is large enough. While the relationship at the low end of the dose-response curve is unclear, obviously, at the high end of the curve (i.e., severe immunosuppression), clinical disease is readily apparent, best exemplified by increases in opportunistic infections that occur in AIDS patients and the increased incidence of neoplasia in therapeutically immunosuppressed transplant patients. Secondly, clinical disease
may be difficult to establish, considering that neoplastic diseases may involve a 10- to 20-yr latency before tumor detection and that increases in infections are difficult to ascertain in epidemiological surveys.

The Agency for Toxic Substances and Disease Registry with the CDC (ATSDR/CDC) and the National Research Council Subcommittee on Biologic Markers in Immunotoxicology have proposed testing batteries that attempt to address many of the above-described problems and pitfalls by implementing a comprehensive state-of-the-art immunological evaluation in conjunction with more traditional tests [64,65]. Many of these tests are similar to those used to identify chemical-induced immunosuppression in laboratory animals and should help to predict the probability of developing suppressed host resistance or clinical disease in humans. As in animal investigations, these tests are also recommended in a tiered approach.
V. REGULATORY ISSUES IN IMMUNOTOXICOLOGY
Regulatory requirements for examination of the immunomodulatory potential of chemicals and drugs have primarily focused on the ability of these agents to induce hypersensitivity. Recent efforts in this area have centered on harmonization of requirements in the European Community and various U.S. regulatory agencies. In addition, a number of agencies, including the FDA and OECD, are considering incorporation of the murine local lymph node assay (LLNA) as an alternative testing methodology for the assessment of the skin sensitization potential of chemicals and drugs. Interlaboratory validation studies suggest that the LLNA is a sensitive, predictive, and reliable testing method for the detection of sensitizing agents [66-68]. The assay was the first to be evaluated as an alternative method by the Interagency Coordinating Committee on Alternative Methods. There remains considerable debate as to whether histopathology, as currently performed in routine toxicity studies, such as those required in the OECD 407 guideline for repeat-dose toxicity testing in rodents, is a sufficient indicator for potential immunotoxicity. As mentioned earlier, alterations in lymphoid organ weights are relatively poor indicators of immune-targeted toxicity [9], and limited studies suggest that the use of “extended histopathology” (spleen and thymus weights and histopathology, plus histopathology of the lymph nodes, Peyer’s patches, and bone marrow; OECD 407 revisions adopted 7/27/95), while an improvement over the original 407 testing scheme, is not as efficient as a tiered approach at detecting immunotoxicants [69]. A proposal to include functional testing as part of the OECD 407 guideline is now under consideration. Currently the U.S. Environmental Protection Agency is the only agency with specific regulatory requirements for assessing potential immunosuppressive
effects of chemicals; however, there are a number of current and proposed testing guidelines within the FDA that identify approaches to such testing and to the interpretation of test results. The EPA’s guidelines for immunotoxicology testing fall under both its Toxic Substances Control Act (TSCA) and its Federal Insecticide, Fungicide and Rodenticide Act (FIFRA) Series 870 Health Effects Test Guidelines, which are used by the Office of Pesticide Programs and the Office of Pollution Prevention and Toxics. The two guidelines require assessment of immune function in groups of eight animals at three dose levels plus a negative control, after a 28-day exposure in tier I testing, via quantification of antibody production in rodents after immunization with SRBCs (see Section 799.9870 of the TSCA Testing Guidelines for additional details). Both the plaque-forming cell assay and ELISA quantification of antigen-specific antibodies in serum would be acceptable methodologies. The guidelines allow for the use of a single rodent species if ADME data are similar between species; however, if ADME data are lacking, then testing in both rats and mice will be required. The necessity to perform additional tests, such as quantification of lymphocyte subsets or natural killer cell activity, shall be determined on a case-by-case basis, dependent on the outcome of the assessment of the antibody response. As both the EPA and FDA guidelines are subject to update and modification, the reader is referred to the Federal Register for the most current versions specific to the drug, chemical, or device as appropriate.
VI. CONCLUSIONS AND FURTHER DIRECTIONS
Adverse effects on the immune system occurring from exposure to xenobiotics are manifest in a wide range of biological responses that share one common element, an alteration in the normal homeostatic balance of immune system components. The diverse pathogenesis of these diseases necessitates that different testing strategies be employed for their assessment. While sensitive and predictive diagnostic tools for the assessment of some alterations in immune function are well developed (e.g., hypersensitivity), measures to quantitate immunosuppression in humans are still relatively insensitive. The diversity of the human population in terms of environmental exposures, genetics, age, etc. requires special consideration when assessing the risk of adverse immune effects. Health concerns will vary from population to population, as risk of infectious diseases would be more significant in developing countries, while increased prevalence of allergic and neoplastic diseases would be a more likely outcome in developed countries. With increasing public awareness of the potential for increased disease due to alterations in immune function, regulatory agencies have begun to require immune assessment as part of the registration process for chemicals, drugs, and devices. This must be balanced with the increasing pressure to reduce the use of
animals, costs of testing, etc. New methodologies that will provide useful and predictive information out of current tests (e.g., activation markers in phenotyping studies), in vitro models (e.g., use of primary cell cultures for the skin and lung), transgenic and knock-out mice, and the use of molecular techniques (e.g., the use of cytokine profiles to assess potential hypersensitivity) need to be refined and subsequently validated to better assess the effects of low-level exposures to environmental agents.
REFERENCES
1. Y. Seki, S. Kawanishi, and S. Sano, Mechanism of PCB-induced porphyria and Yusho disease, Ann. NY Acad. Sci. 514:222 (1987).
2. L.A. Burns, B.J. Meade, and A.E. Munson, Toxic responses of the immune system, in Casarett and Doull’s Toxicology, The Basic Science of Poisons, 5th ed. (C. Klaassen, ed.), McGraw-Hill, New York, 1996, pp. 355-402.
3. J. Descotes, Drug Induced Immune Diseases, Elsevier, Amsterdam, 1990, pp. 1233.
4. M.I. Luster, A.E. Munson, P.T. Thomas, J.P. Holsapple, J.D. Fenters, K.L. White, L.D. Lauer, D.R. Germolec, G.J. Rosenthal, and J.H. Dean, Development of a testing battery to assess chemical-induced immunotoxicity: National Toxicology Program guidelines for immunotoxicity evaluation in mice, Fund. Appl. Toxicol. 10:2-19 (1988).
5. D.M. Hinton, Testing guidelines for evaluation of the immunotoxic potential of direct food additives, Crit. Rev. Food Sci. Nutr. 32:173-190 (1992).
6. R.D. Sjoblad, Potential future requirements for immunotoxicology testing of pesticides, Toxicol. Appl. Pharmacol. 4:391-395 (1989).
7. J.M. Straight et al., Immune Function Test Batteries for Use in Environmental Health Field Studies, U.S. Department of Health and Human Services, Public Health Service, Washington, DC (1994).
8. H. Van Loveren and J.G. Vos, Evaluation of OECD Guideline #407 for Assessment of Toxicity of Chemicals with Respect to Potential Adverse Effects to the Immune System, National Institute of Public Health and Environmental Protection, Bilthoven, The Netherlands (1992).
9. M.I. Luster, C. Portier, and D.G. Pait, Risk assessment in immunotoxicology: Sensitivity and predictability of immune tests, Fund. Appl. Toxicol. 18:200-210 (1992).
10. M.I. Luster et al., Risk assessment in immunotoxicology: Relationships between immune and host resistance tests, Fund. Appl. Toxicol. 21:71-82 (1993).
11. J. Mordenti and W. Chappell, The use of interspecies scaling in toxicokinetics, in Toxicokinetics and New Drug Development (A. Yacobi, J. Skelly, and V.K. Batra, eds.), Pergamon, New York, 1989, pp. 42-96.
12. N.K. Jerne and A.A. Nordin, Plaque formation in agar by single antibody producing cells, Science 140:405 (1963).
13. K.E. Phillips and A.E. Munson, 2’,3’-Dideoxyinosine inhibits the humoral immune response in female B6C3F1 mice by targeting the B lymphocyte, Toxicol. Appl. Pharmacol. 145:260 (1997).
14. B. Delaney and N.E. Kaminski, Induction of serum-borne immunomodulatory factors in B6C3F1 mice by carbon tetrachloride. I. Carbon-tetrachloride-induced suppression of helper T-lymphocyte function is mediated by a serum-borne factor, Toxicology 85:67 (1993).
15. R.K. Dooley, L.M. Dale, and M.P. Holsapple, Elucidation of cellular targets responsible for tetrachlorodibenzo-p-dioxin (TCDD)-induced suppression of the antibody response. 2. Role of the T-lymphocyte, Immunopharmacology 19:47 (1990).
16. L. Temple, T.T. Kawabata, A.E. Munson, and K.L. White, Comparison of ELISA and plaque-forming cell assays for measuring the humoral immune response to SRBC in rats and mice treated with benzo[a]pyrene or cyclophosphamide, Fund. Appl. Toxicol. 21:412 (1993).
17. S.C. Wood, J.G. Karras, and M.P. Holsapple, Integration of the human lymphocyte into immunotoxicological investigations, Fund. Appl. Toxicol. 18:450 (1992).
18. P.L. Pollock, D.R. Germolec, C.E. Comment, G.J. Rosenthal, and M.I. Luster, Development of human lymphocyte-engrafted SCID mice as a model for immunotoxicity assessment, Fund. Appl. Toxicol. 22:130 (1994).
19. A.A. van der Heyden, E. Bloemena, T.A. Out, J.M. Wilmink, P.T. Schellekens, and M.H. van Oers, The influence of immunosuppressive treatment on immune responsiveness in vivo in kidney transplant recipients, Transplantation 48:44 (1989).
20. W.V. Raszka, R.A. Moriarty, M.G. Ottolini, N.J. Waecker, D.P. Ascher, T.J. Cieslak, G.W. Fischer, and M.L. Robb, Delayed-type hypersensitivity skin testing in human immunodeficiency virus-infected pediatric patients, J. Pediatr. 129:245 (1996).
21. R. Glaser, G.R. Pearson, R.H. Bonneau, B.A. Easterling, C. Atkinson, and J.K. Kiecolt-Glaser, Stress and the memory T-cell response to the Epstein-Barr virus in healthy medical students, Health Psychol. 12:435 (1993).
22. J.K. Kiecolt-Glaser, R. Glaser, S. Gravenstein, W.B. Malarkey, and J. Sheridan, Chronic stress alters the immune response to influenza virus vaccine in older adults, Proc. Natl. Acad. Sci. USA 93:3043 (1996).
23. K. Bottomly, A functional dichotomy of CD4+ T lymphocytes, Immunol. Today 9:268-274 (1988).
24. F.D. Finkelman, I.M. Katona, T.R. Mosmann, and R.L. Coffman, IFN-γ regulates the isotypes of Ig secreted during in vivo humoral immune responses, J. Immunol. 130:1022-1027 (1988).
25. R.J. Dearman, J.M. Hegarty, and I. Kimber, Inhalation exposure of mice to trimellitic anhydride induces both IgG and IgE antihapten antibody, Int. Arch. Allergy Appl. Immunol. 95:70-76 (1991).
26. J.H. Dean, J.B. Cornacoff, G.J. Rosenthal, and M.I. Luster, Immune system, evaluation of injury, in Principles and Methods in Toxicology, 3rd ed. (A.W. Hayes, ed.), Raven, New York, 1994, pp. 1074-1075.
27. R. House and P. Thomas, In vitro induction of cytotoxic T-lymphocytes, in Methods in Immunotoxicology, Vol. 1 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 159-172.
28. E.R. Unanue, Macrophages, antigen presenting cells, and the phenomena of antigen handling and presentation, in Fundamental Immunology, 2d ed. (W.E. Paul, ed.), Raven, New York, 1989, pp. 95.
29. G.B. Mackaness, The monocyte in cellular immunity, Semin. Hematol. 7(2):172-184 (1979).
30. J.G. Lewis, Isolation of alveolar macrophages, peritoneal macrophages, and Kupffer cells, in Methods in Immunotoxicology, Vol. 2 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 15-26.
31. G.J. Rosenthal, B.B. Blaylock, and M.I. Luster, Isolation and in vitro culture of mononuclear phagocytes, in Methods in Toxicology, Vol. 1A (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1993, pp. 455-469.
32. D.L. Nelson, R.W. Lange, G.J. Rosenthal, C.E. Comment, and G.R. Burleson, Macrophage nonspecific phagocytosis assays, in Methods in Immunotoxicology, Vol. 2 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 39-57.
33. K.E. Rodgers, Measurement of the respiratory burst of leukocytes for immunotoxicologic analysis, in Methods in Immunotoxicology, Vol. 2 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 67-78.
34. R.R. Dietert, J.H. Hotchkiss, R.E. Austic, and Y.J. Sung, Production of reactive nitrogen intermediates by macrophages, in Methods in Immunotoxicology, Vol. 2 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 99-118.
35. J.Y. Djeu, H.A. Heinbaugh, H.T. Holden, and R.B. Herberman, Augmentation of mouse natural killer cell activity by interferon and interferon inducers, J. Immunol. 122:175 (1979).
36. E.A. Grimm, A. Mazumder, H. Zhang, and S.A. Rosenberg, The lymphokine activated killer cell phenomenon: Lysis of fresh solid tumor cells by IL2 activated autologous human peripheral blood lymphocytes, J. Exp. Med. 155:1823 (1982).
37. J.Y. Djeu, Natural killer activity, in Methods in Immunotoxicology, Vol. 1 (Burleson, Dean, and Munson, eds.), Wiley-Liss, New York, 1995, pp. 437-450.
38. K.E. Anderson and H.I. Maibach, Guinea pig sensitization assays, Curr. Prob. Dermatol. 14:263-290 (1985).
39. S.C. Gad et al., Development and validation of an alternative dermal sensitization test: Mouse ear swelling test (MEST), Toxicol. Appl. Pharmacol. 83:93-114 (1986).
40. D.A. Basketter et al., Interlaboratory evaluation of the local lymph node assay with 25 chemicals and comparison with guinea pig test data, Toxicol. Methods 1:30-43 (1991).
41. F.P. Guengerich, Enzymatic oxidation of xenobiotic chemicals, Crit. Rev. Biochem. Mol. Biol. 25:97 (1990).
42. A. Parkinson, Biotransformation of xenobiotics, in Casarett & Doull’s Toxicology: The Basic Science of Poisons, 5th ed. (C.D. Klaassen, ed.), McGraw-Hill, New York, 1996, p. 113.
43. J.A. Goldstein and M.B. Faletto, Advances in mechanism of activation and deactivation of environmental chemicals, Environ. Health Perspect. 100:169 (1993).
44. C.J. Chen, M.W. Yu, Y.F. Liaw, L.W. Wang, S. Chiamprasert, F. Matin, A. Hirvonen, D.A. Bell, and R.M. Santella, Chronic hepatitis B carriers with null genotypes of glutathione S-transferase M1 and T1 polymorphisms who are exposed to aflatoxin are at increased risk of hepatocellular carcinoma, Am. J. Hum. Genet. 59:128 (1996).
45. K. Brosen and L.F. Gram, Clinical significance of the sparteine/debrisoquine oxidation polymorphism, Eur. J. Clin. Pharmacol. 36:537 (1989).
46. J.R. Halpert, F.P. Guengerich, J.R. Bend, and M.A. Correia, Selective inhibitors of cytochromes P450, Toxicol. Appl. Pharmacol. 125:163 (1994).
47. F.J. Gonzalez and D.W. Nebert, Evolution of the P450 gene superfamily: Animal-plant warfare, molecular drive and human genetic differences in drug oxidation, Trends Genet. 6:182 (1990).
48. D.W. Nebert, Multiple forms of inducible drug-metabolizing enzymes: A reasonable mechanism by which any organism can cope with adversity, Mol. Cell. Biochem. 27:27 (1979).
49. F.J. Gonzalez and Y.-H. Lee, Constitutive expression of hepatic cytochrome P450 genes, FASEB J. 10:1112 (1996).
50. G. Jacob, K. Byth, and G.C. Farrell, Age but not gender selectively affects expression of individual cytochrome P450 proteins in human liver, Biochem. Pharmacol. 50:727 (1995).
51. A. Lampen, U. Christians, A. Bader, I. Hackbarth, and K.-F. Sewing, Interindividual variability of cyclosporine metabolism in the small intestine, Pharmacology 52:159 (1996).
52. B.K. Park, P. Munir, and N.P. Kitteringham, The role of cytochrome P450 enzymes in hepatic and extrahepatic human drug toxicity, Pharmacol. Ther. 68:385 (1995).
53. C. Ioannides and D.V. Parke, Induction of cytochrome P4501 as an indicator of potential chemical carcinogenesis, Drug Metab. Rev. 25:485 (1993).
54. K.K. Bhattacharyya, P.B. Brake, S.E. Eltom, S.A. Otto, and C.R. Jefcoate, Identification of a rat adrenal cytochrome P450 active in polycyclic hydrocarbon metabolism as a rat CYP1B1. Demonstration of a unique tissue-specific pattern of hormonal and aryl hydrocarbon receptor-linked regulation, J. Biol. Chem. 270:11595 (1995).
55. G.L. Ginsburg and T.B. Atherholt, Transport of DNA-adducting metabolites in mouse serum following benzo[a]pyrene administration, Carcinogenesis 10:673 (1989).
56. D.R. Germolec, N.H. Adams, and M.I. Luster, A comparative assessment of metabolic enzyme levels in macrophage subpopulations in the F344 rat, Biochem. Pharmacol. 50:1495 (1995).
57. D.R. Germolec, E.C. Henry, R. Maronpot, J.F. Foley, N.H. Adams, T.A. Gasiewicz, and M.I. Luster, Induction of CYP1A1 and ALDH-3 in lymphoid tissues from Fischer 344 rats exposed to 2,3,7,8-tetrachlorodibenzodioxin (TCDD), Toxicol. Appl. Pharmacol. 137:57 (1996).
58. G.S. Ladics, T.T. Kawabata, A.E. Munson, and K.L. White, Jr., Generation of 7,8-dihydroxy-9,10-epoxy-7,8,9,10-tetrahydrobenzo[a]pyrene by murine splenic macrophages, Toxicol. Appl. Pharmacol. 115:72 (1992a).
59. G.S. Ladics, T.T. Kawabata, A.E. Munson, and K.L. White, Jr., Metabolism of benzo[a]pyrene by murine splenic cell types, Toxicol. Appl. Pharmacol. 116:248 (1992).
60. R.D. Irons, W.F. Greenlee, D. Wierda, and J.S. Bus, Relationship between benzene metabolism and toxicity: A proposed mechanism for the formation of reactive intermediates, in Biological Reactive Intermediates-II (R. Snyder, D.V. Parke, J.J. Kocsis, D.J. Jollow, C.G. Gibson, and C.M. Witmer, eds.), Plenum, New York, 1982, p. 229.
61. T. Nakajima, S. Okuyama, I. Yonekura, and A. Sato, Effects of ethanol and phenobarbital administration on the metabolism and toxicity of benzene, Chem. Biol. Interact. 55:23 (1985).
62. J. Tuo, S. Loft, M.S. Thomsen, and H.E. Poulsen, Benzene-induced genotoxicity in mice in vivo detected by the alkaline comet assay: Reduction by CYP2E1 inhibition, Mutat. Res. 368:213 (1996).
63. IUIS/WHO Working Group, Laboratory Investigations, Clinical Immunology. Methods, Pitfalls, and Clinical Indications, 49, Geneva, 1988, pp. 478-497.
64. National Research Council, Biologic Markers in Immunotoxicology, National Academy Press, Washington, DC, 1992.
65. U.S. Congress, Office of Technology Assessment, Identifying and Controlling Immunotoxic Substance, OTA-BP-BA-75, Government Printing Office, Washington, DC, 1991.
66. D.A. Basketter, G.F. Gerberick, I. Kimber, and S.E. Loveless, The local lymph node assay: A viable alternative to currently accepted skin sensitization tests, Food Chem. Toxicol. 34:985 (1996).
67. S.E. Loveless, G.S. Ladics, G.F. Gerberick, C.A. Ryan, D.A. Basketter, E.W. Scholes, R.V. House, J. Hilton, R.J. Dearman, and I. Kimber, Further evaluation of the local lymph node assay in the final phase of an international collaborative trial, Toxicology 208:141 (1996).
68. I. Kimber, J. Hilton, R.J. Dearman, G.F. Gerberick, C.A. Ryan, D.A. Basketter, E.W. Scholes, G.S. Ladics, S.E. Loveless, R.V. House, and A. Guy, An international evaluation of the murine local lymph node assay and comparison of modified procedures, Toxicology 103:63 (1995).
69. H. Van Loveren, J.G. Vos, and E.J. De Waal, Testing immunotoxicity of chemicals as a guide for testing approaches for pharmaceuticals, Drug Info. J. 30:275 (1996).
10 Toxicological Pathology Assessment Lynda L. Lanning BioReliance Corporation, Rockville, Maryland
I. INTRODUCTION
Histopathology and clinical pathology evaluations in toxicology studies generate an enormous amount of very important data. Pertinent factors in hematological, hemostasis, and clinical chemistry test selection, methodology, and interpretation include, but are not limited to, animal data (including species, strain, and age) and test material data (known toxic effects, chemical interactions). In addition, appropriate specimen collection and handling are critical to the validity, reliability, and consistency of the test results. Clinical pathology data are generally subjected to statistical analysis; however, care should be taken to establish the difference between statistical and biological significance. The pathologist and clinical pathologist form the link between in-life data and postlife data. Data interpretation must be done as a part of the whole study picture, including clinical observations, various in-life measurements such as food or water consumption and body weights, necropsy and histopathology findings, clinical pathology findings, and organ weight data. The clinical pathologist utilizes the review of concurrent control data, historical control data, and individual animal data vs. group means when evaluating the relative biological significance of changes. The pathologist takes into account both the recognition of injury to tissue and its biological significance based on the nature of the injury in conjunction with other available data. Assessment of altered gross and cellular morphology depends on the ability of the pathologist to discriminate between test material-induced changes, secondary changes, spontaneous disease, postmortem changes, iatrogenic changes, and normal biological variations. Accuracy and clarity of the pathology data and report
are key to the usefulness and validity of the study. The clinical pathologist and the pathologist can and should be an integral part of the toxicology team providing input from study design through study report.
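The contrast between statistical and biological significance described in the introduction can be sketched as a simple screening aid. This is a minimal illustration, not a method taken from the chapter: the function name, the 0.05 significance level, and the example values are all assumptions, and the output is only a prompt for the clinical pathologist's judgment, not a replacement for it.

```python
def screen_group_change(treated_mean, control_mean, p_value,
                        hist_low, hist_high, alpha=0.05):
    """Summarize a group-mean change against concurrent and historical controls.

    alpha and the historical range are supplied by the user; nothing here
    decides biological significance on its own.
    """
    percent_change = 100.0 * (treated_mean - control_mean) / control_mean
    return {
        "statistically_significant": p_value < alpha,
        "within_historical_range": hist_low <= treated_mean <= hist_high,
        "percent_change_vs_control": percent_change,
    }


# Hypothetical RBC counts (10^6/ul) and historical control range.
result = screen_group_change(
    treated_mean=7.9, control_mean=8.4, p_value=0.03,
    hist_low=7.5, hist_high=9.0,
)
```

In this made-up example the difference is statistically significant yet the treated mean stays inside the historical control range, the kind of finding that would call for review alongside individual animal data and other study endpoints.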
II. CLINICAL PATHOLOGY
A. Specimen Collection and Quality
Proper collection and specimen handling are critical for accurate hematological evaluation. Collection sites vary with the species, volume of specimen needed, necessity for anesthetics, and number of collection time points required by the protocol. Table 1 describes suggested sample volumes from different ages of rats and mice. The minimum amounts required for routine hematology and clinical chemistry are dependent on the type of equipment used in the clinical laboratory; however, in general, these amounts are 0.5 ml of whole blood (ethylenediaminetetraacetic acid [EDTA] anticoagulant) for hematology and 0.5 ml of serum for a routine clinical chemistry panel. Standard operating procedures should be in place for all blood sampling techniques. Only those staff members trained and proficient in the blood collection technique required by the protocol should perform the collection.
Table 1 Suggested Sample Volumes from Mice and Rats

                                      Volume (ml)
Species                    Age 6 wk    Age 17 wk    Age 6 mo
Mouse (nonsacrificial)
  Whole Blood              0.5         0.75         1.0
  Plasma                   0.25        0.37         0.5
  Serum                    0.25        0.37         0.5
Mouse (terminal)
  Whole Blood              1.0         1.5          1.75
  Plasma                   0.5         0.75         0.87
  Serum                    0.5         0.75         0.87
Rat (nonsacrificial)
  Whole Blood              1.0         1.5          3.0
  Plasma                   0.5         0.75         1.5
  Serum                    0.5         0.75         1.5
Rat (terminal)
  Whole Blood              3.0         4.0          5.0
  Plasma                   1.5         2.0          2.5
  Serum                    1.5         2.0          2.5
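As a rough illustration of how the suggested volumes in Table 1 relate to the minimum amounts quoted in the text (0.5 ml whole blood for hematology, 0.5 ml serum for a routine chemistry panel), the sketch below encodes the table as a lookup. The dictionary layout, key names, and function are hypothetical, the assignment of values to age columns follows our reading of the table, and each panel is checked on its own; whether a single draw can supply both is not addressed.

```python
# Volumes in ml as (whole blood, plasma, serum), as read from Table 1.
# The structure and key names are illustrative assumptions.
TABLE_1_ML = {
    ("mouse", "nonsacrificial"): {
        "6 wk": (0.5, 0.25, 0.25), "17 wk": (0.75, 0.37, 0.37), "6 mo": (1.0, 0.5, 0.5)},
    ("mouse", "terminal"): {
        "6 wk": (1.0, 0.5, 0.5), "17 wk": (1.5, 0.75, 0.75), "6 mo": (1.75, 0.87, 0.87)},
    ("rat", "nonsacrificial"): {
        "6 wk": (1.0, 0.5, 0.5), "17 wk": (1.5, 0.75, 0.75), "6 mo": (3.0, 1.5, 1.5)},
    ("rat", "terminal"): {
        "6 wk": (3.0, 1.5, 1.5), "17 wk": (4.0, 2.0, 2.0), "6 mo": (5.0, 2.5, 2.5)},
}

MIN_HEMATOLOGY_WHOLE_BLOOD_ML = 0.5  # general minimum quoted in the text
MIN_CHEMISTRY_SERUM_ML = 0.5         # general minimum quoted in the text


def feasible_panels(species, collection, age):
    """Report which routine panels the suggested volume could support."""
    whole_blood, _plasma, serum = TABLE_1_ML[(species, collection)][age]
    return {
        "hematology": whole_blood >= MIN_HEMATOLOGY_WHOLE_BLOOD_ML,
        "clinical_chemistry": serum >= MIN_CHEMISTRY_SERUM_ML,
    }


young_mouse = feasible_panels("mouse", "nonsacrificial", "6 wk")
adult_rat = feasible_panels("rat", "terminal", "6 mo")
```

Actual minimums depend on the analyzers in use, so the two constants would need to be replaced with the laboratory's own requirements.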
In rodents, anesthetics, when required, should be chosen carefully. Clinical pathology parameters may be affected by these agents, many of which are known microsomal protein inducers and/or cause splenic sequestration of peripheral blood cells. Alternately, stress during blood collection without anesthetics is known to result in marked alteration in cellular values [1,2]. The anesthetic of choice for laboratory rodents is a 70% carbon dioxide/30% oxygen mixture. This agent is relatively safe, nontoxic, readily available, does not induce microsomal proteins, does not alter cardiac function/output, and is accepted by the American Veterinary Medical Association [3]. Anesthetics are rarely used in rabbits, dogs, and nonhuman primates because appropriate restraint is achievable without their use.
B. Hematology
The importance of appropriate hematological assessment in toxicology studies has been recognized for many years. In elucidating toxic effects through hematology, the investigator must be aware of the appropriate methods of blood collection and handling, sampling times, testing selection, quality control evaluation, and test result interpretation. One should remain aware that the test substance may affect stem cell health, maturation, and/or release, or peripheral blood cell distribution, function, and/or use. In addition, one must be aware of normal animal species physiology and its impact on hematological parameters when interpreting test results.
1. Selection of Parameters
The investigator should consider several points when selecting hematological parameters for evaluation. These items include sample volume requirements, information desired, time points for evaluation, and route of sample collection. "Routine" hematological evaluation in toxicity testing is generally used in those studies where hematological effects are not expected based on the known structure and/or function of the test material. The parameters measured include white blood cell (WBC) count, red blood cell (RBC) count, hematocrit (HCT), hemoglobin (HGB), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), and platelet count (PLT). Smears should be prepared at sample collection for both the WBC differential/cellular morphology evaluation and the reticulocyte count. If changes are noted in the WBC, the WBC differential should be performed. The reticulocyte count may provide useful information in determining the extent of and/or the response to RBC changes. If the investigator suspects test material-related hematological changes prior to study start, the use of more specialized tests or techniques may be warranted. For example, a bone marrow smear may be prepared
at necropsy and evaluated for potential stem cell damage or maturation/release effects. Determination of methemoglobin concentrations is helpful if the test material is a known oxidant substance.
2. Specimen Collection and Quality
For routine hematological measurements, which require whole, uncoagulated blood, the anticoagulant of choice is the potassium salt of ethylenediaminetetraacetic acid (EDTA). This product is available as either a powder or liquid in a variety of commercial blood collection tubes of various collection volumes; however, the liquid EDTA is preferable for adequate mixing. It is important to include the appropriate amount of blood for the volume of anticoagulant present in the tube to minimize volume effects such as erythrocyte shrinkage, which will result in artifactual changes in the HCT, MCV, and MCHC. Other anticoagulants include heparin, sodium fluoride, sodium citrate (used for blood collection for coagulation measurements), and potassium oxalate. After the specimen is collected from the animal and placed into the collection tube, the blood and anticoagulant should be gently mixed for 30 to 60 sec. The specimen should then be placed on a mechanical rocker prior to analysis. The WBC differential smears should be prepared from EDTA-anticoagulated whole blood within 2 hr of sample collection. The smears should be allowed to air dry and then fixed with absolute methanol. These smears are stained with Wright-Giemsa stain for microscopic evaluation. The WBC differential and platelet and erythrocyte morphological assessment are conducted concurrently from this smear. Reticulocytes should be vitally stained for 10 min within 2 hr of sample collection and a thin-film smear prepared.
3. Common Parameters: Methodology and Interpretation
(a) White Blood Cell Count. The term white blood cell pertains to all types of leukocytes including granulocytes (neutrophils, eosinophils, basophils), lymphocytes, and monocytes. Each type of white blood cell has distinct morphological and functional features. The white blood cell count is usually reported in units of thousands per cubic millimeter. Automated instruments have virtually replaced performance of WBC counts by manual methods; however, the WBC count may be obtained by use of a specialized type of microscope slide called a hemocytometer. The manual method requires mixing blood in a special white cell dilution pipette. After the red blood cells are lysed, the specimen is placed on a hemocytometer and the cells are counted under 100X magnification. Automated white blood cell counts may be measured by either laser light scatter or, more commonly, by impedance particle counting. The laser light scatter method uses a focused light beam. As the cells travel through the light beam, the amount of light scattered at different angles is measured. Impedance particle counting involves suspension of an aliquot of the specimen in isotonic saline that flows through a narrow aperture across which a DC current is maintained [4,5]. The white blood cells are differentiated from red blood cells by diameter (in rats, RBC diameter is 5.9 µm compared with 10 to 12 µm for neutrophils; in mice, the RBC diameter is 5.5 µm compared with 10 to 12 µm for neutrophils) [5] and lysing of the erythrocytes. A significant advantage of impedance cell counters in laboratory animal testing is the ability to adjust the discriminators to the specific cell size(s) of various species. Caution should be used when evaluating the WBC data since several common mistakes in blood collection and evaluation may result in erroneous results. Nucleated RBCs cannot be distinguished from WBCs and will increase the WBC count. After the WBC differential count is completed, the WBC count should be corrected for the number of nucleated RBCs (see discussion below). Platelet clumps may also produce high WBC counts. This is a problem with inadequate mixing of the specimen with the anticoagulant. If manual counts are performed, care must be taken when choosing the lysing solution. It should be strong enough to lyse the red cells preferentially to the white cells. However, WBCs from some species, such as mice, are particularly susceptible to lysis. Leukocytosis is indicated by a WBC count that is higher than the normal value for that species and age animal. In general, increases are due to increases in the neutrophil component and are most likely a response to infectious processes in tissues of the body.
(b) Hematocrit. The hematocrit (packed cell volume) is the quantitative measurement of erythrocyte concentration after optimal packing of erythrocytes in a commercially available microhematocrit capillary tube.
A manual measurement of the hematocrit is performed by centrifugation of anticoagulated whole blood in a microhematocrit centrifuge. The packed red cell column in the microhematocrit capillary tube then is measured using a microhematocrit reader. Visual inspection of the centrifuged specimen may provide additional information, such as evidence of hemolysis, icterus, lipemia, and leukocytosis. The hematocrit can also be calculated on automated hematology instruments by multiplying the red blood cell count and the mean corpuscular volume and is expressed as a percentage. This automated method avoids technical errors such as the presence of trapped plasma and reading errors. An increased hematocrit is indicated by a hematocrit value that is higher than the normal value for that species and age animal. Increased hematocrit can be due to either an increase in the circulating RBC mass or a decrease in plasma volume (dehydration). Anemia is generally defined as a hematocrit <37% [7].
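The calculated hematocrit described above can be sketched in a few lines. The unit convention (RBC in 10^6 cells/µl, MCV in femtoliters, hence division by 10 to obtain a percentage) is an assumption chosen to agree with the MCV definition used in this chapter; the function names and example values are illustrative only.

```python
ANEMIA_HCT_THRESHOLD_PCT = 37.0  # general definition quoted in the text [7]


def calculated_hematocrit(rbc_millions_per_ul, mcv_fl):
    """Hematocrit (%) from RBC count (10^6/ul) and MCV (fl)."""
    return rbc_millions_per_ul * mcv_fl / 10.0


def below_anemia_threshold(hct_percent, threshold=ANEMIA_HCT_THRESHOLD_PCT):
    """True when the hematocrit falls below the quoted general threshold."""
    return hct_percent < threshold


# Illustrative values only, not reference data for any species.
hct = calculated_hematocrit(rbc_millions_per_ul=8.0, mcv_fl=55.0)
```

Because normal values differ by species and age, a fixed threshold is only a starting point; comparison with concurrent and historical controls remains necessary.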
(c) Red Blood Cell Count. The red blood cell, or erythrocyte, is a nonnucleated biconcave disk derived from marrow stem cells under the influence of erythropoietin. Erythropoietin is produced by the kidneys and its production is stimulated by hypoxia and/or stem cell turnover. The maturation time from stem cell to red blood cell in peripheral circulation is approximately five days. The red blood cell transports oxygen, carbon dioxide, and nutrients. The average life span of the red blood cell varies by species (mouse 20 to 45 days, rat 50 to 65 days, rabbit 45 to 70 days, dog 100 to 120 days). Although automated instruments have virtually replaced performance of RBC counts by manual methods, the RBC count may be obtained by use of a hemocytometer. Automated red blood cell count may be measured by either laser light scatter or, more commonly, by impedance particle counting as discussed in the previous section. The RBC is usually reported in units of millions per cubic millimeter. Anemia, decreased RBC count, is indicated by a RBC count that is lower than the normal value for that species and age animal. Anemias are the result of a decrease in RBC mass resulting from various mechanisms.
(d) Hemoglobin. Hemoglobin (HGB) is the oxygen-carrying pigment of the erythrocyte. It is an oligomeric protein containing four separate globin peptide chains, each of which is noncovalently bound to a porphyrinic heme group. Each heme group has a central iron atom that is reversibly bound with molecular oxygen. The International Committee for Standardization in Hematology recommends the cyanomethemoglobin method for hemoglobin measurement. This reaction converts hemoglobin to cyanomethemoglobin following the addition of potassium ferricyanide to the specimen. All hemoglobin derivatives except sulfhemoglobin are converted to cyanomethemoglobin, which is a stable pigment with an absorption at 540 nm in a spectrophotometer.
The hemoglobin concentration is calculated by comparison of the unknown solution to a standard solution of hemoglobin and is usually expressed in units of grams per deciliter. The HGB should be approximately one-third of the hematocrit if the red blood cells are of normal size. The HGB value is used to determine several of the red cell indices which are used in characterizing anemias.
(e) Mean Corpuscular Volume. The mean corpuscular volume (MCV) is the volume of the average red cell, calculated from the number of red blood cells and the hematocrit as described below:
\[ \mathrm{MCV} = \frac{\text{Hematocrit (\%)} \times 10}{\text{RBC count } (10^{6}/\mu\text{L})} \]
The MCV is generally higher in young animals due to the presence of more immature red blood cells, which are larger and have not yet taken the biconcave
shape. Red blood cells with increased MCV values are referred to as macrocytic. Those with decreased MCV values are microcytic. The MCV is usually expressed as femtoliters or cubic micrometers.
(f) Mean Corpuscular Hemoglobin. The mean corpuscular hemoglobin (MCH) is the amount of hemoglobin by weight in the average red blood cell (expressed in units of picograms or micromicrograms) and is calculated as described below:
\[ \mathrm{MCH} = \frac{\text{Hemoglobin concentration (g/dL)} \times 10}{\text{RBC count } (10^{6}/\mu\text{L})} \tag{2} \]
(g) Mean Corpuscular Hemoglobin Concentration. The mean corpuscular hemoglobin concentration (MCHC) is the ratio of the hemoglobin concentration to the hematocrit (expressed as a percentage) and is calculated as described below:
\[ \mathrm{MCHC} = \frac{\text{Hemoglobin concentration (g/dL)} \times 100}{\text{Hematocrit (\%)}} \]
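The three red cell index formulas above lend themselves to a short worked example. The following Python sketch is illustrative only: the function names, argument names, and input values are assumptions made for this example and do not come from any analyzer's software. The second function simply rearranges the MCV formula to give the calculated hematocrit reported by automated instruments.

```python
def red_cell_indices(hct_percent, rbc_millions_per_ul, hgb_g_per_dl):
    """Return MCV (fL), MCH (pg), and MCHC (%) from three measured values."""
    if hct_percent <= 0 or rbc_millions_per_ul <= 0:
        raise ValueError("hematocrit and RBC count must be positive")
    return {
        "MCV_fL": hct_percent * 10 / rbc_millions_per_ul,
        "MCH_pg": hgb_g_per_dl * 10 / rbc_millions_per_ul,
        "MCHC_percent": hgb_g_per_dl * 100 / hct_percent,
    }


def calculated_hematocrit(rbc_millions_per_ul, mcv_fl):
    """Rearranged MCV formula: hematocrit (%) = RBC count x MCV / 10."""
    return rbc_millions_per_ul * mcv_fl / 10


# Illustrative inputs only; they are not reference data for any species.
indices = red_cell_indices(hct_percent=45.0, rbc_millions_per_ul=7.5, hgb_g_per_dl=15.0)
print(indices)
print(calculated_hematocrit(7.5, indices["MCV_fL"]))
```

Because all three indices are derived from the same three measurements, an error in any one measurement propagates into two of the indices, which is one reason to review them together.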
The MCHC is considered to be the most constant erythrocyte index, provided that the HGB and HCT measurements are accurate [8]. Red blood cells with increased MCHC values are referred to as hyperchromatic. Those with decreased MCHC values are hypochromatic.
(h) Platelet Count. The platelet count (PLT) is an automated direct count determined by impedance, as described in the section on white blood cell count, and is usually expressed in units of thousands per cubic millimeter. In addition, pulse editing is used to discriminate between red blood cells and platelets. Because of their small size, wide size range, aggregation, and the difficulty in distinguishing platelets from debris or microcytic red cells, platelet counts are measured less precisely than other components of the blood count (expected variability of approximately 22%) [9]. Automated measurements deal with these issues by mathematical analysis of the platelet volume distribution to ensure that it represents the log-normal distribution expected. An estimate of platelet numbers should be done on the WBC differential slide, particularly if the platelet count and/or histogram are outside the expected range. This visual inspection will allow for exclusion of spurious platelet counts caused by the presence of platelet clumps, debris, microcytic red blood cells, or cellular fragments [10]. The significance of the platelet count is discussed in the later section concerning coagulation/hemostasis.
(i) Mean Platelet Volume. The mean platelet volume (MPV) is an automated direct measurement of platelet size and is usually expressed in units of femtoliters. It has been documented that the MPV is inversely correlated with the platelet count [11,12]. Significant changes in size will alter the platelet biomass, which, in turn, determines hemostatic ability as long as platelet function is not impaired. Therefore, in evaluating platelet hemostatic function, the circulating platelet biomass is a more meaningful indication than mean platelet volume or platelet count alone. The following platelet volumes have been reported: dog, nonhuman primate, pig, and human, 7.6 to 8.3 fl; and rat, guinea pig, and mouse, 3.2 to 5.4 fl [13]. In general, it is considered that larger platelets are metabolically and functionally more active than smaller platelets [14].
(j) Reticulocyte Count. The reticulocyte is an immature erythrocyte that is not nucleated but contains some ribosomal or mitochondrial material. When whole, anticoagulated blood is incubated in a solution of new methylene blue, the ribonucleic acid is precipitated as a dye-ribonucleoprotein complex. This complex appears as a dark blue network or dark blue granules, which allow the reticulocytes in a smear to be readily identified and counted. The reticulocyte count is usually expressed as a percentage. The reticulocyte smear must be prepared within 2 hr of specimen collection. Care must be taken to store the slides in the dark, away from fluorescent light.
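The discussion of mean platelet volume above refers to circulating platelet biomass but gives no formula for it. One plausible approximation, offered here purely as an assumption, is the fraction of blood volume occupied by platelets (platelet count multiplied by mean platelet volume). The Python sketch below uses a hypothetical function name and illustrative inputs.

```python
def platelet_volume_fraction_percent(plt_thousands_per_ul, mpv_fl):
    """Approximate platelet biomass as the percent of blood volume occupied
    by platelets. This definition is an assumption; the text gives no formula."""
    platelets_per_liter = plt_thousands_per_ul * 1e3 * 1e6  # per uL -> per L
    mean_volume_liters = mpv_fl * 1e-15                      # fL -> L
    return platelets_per_liter * mean_volume_liters * 100


# Illustrative inputs only: a high count of small platelets versus
# a lower count of larger platelets.
small_platelets = platelet_volume_fraction_percent(800, 4.0)
large_platelets = platelet_volume_fraction_percent(300, 8.0)
print(small_platelets, large_platelets)
```

The two illustrative samples differ almost threefold in platelet count yet yield volume fractions of the same order, which is consistent with the point that count or MPV alone can be misleading.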
(k) Methemoglobin Concentration. Methemoglobin (MetHb) is formed when the heme irons of hemoglobin are oxidized and is usually expressed as a percentage of total hemoglobin concentration. It cannot combine reversibly with oxygen or carbon monoxide and is a dark, greenish brown color that does not revert to red on exposure to oxygen. Since methemoglobin is readily reduced by normal intraerythrocytic mechanisms (MetHb reductase enzyme system), it should be measured within 30 min of sample collection. The four-wavelength spectrophotometric method based on the work of Evelyn and Malloy [15] is commonly used for methemoglobin measurements. There are species differences in MetHb formation [16] that should be considered when evaluating MetHb data.
(l) Differential (Cellular Elements)-Peripheral Blood. Preparation of a well-made thin-film peripheral blood smear is critical to hematologic assessment. The slides must be made in such a manner as to create the appropriate areas on the slide necessary for evaluation. A monolayer of cells at the feathered edge of the slide is required for cellular morphology assessment. A slightly thicker area is desirable for WBC differential counts and platelet estimates. The slide should look similar to a thumbprint with the edges of the smear approaching but not touching the edges of the slide. The end of the smear should feather out to a rounded edge.
White blood cell differential smears should be prepared from room temperature, EDTA-anticoagulated whole blood within 2 hr of sample collection. The specimen should be mixed on a rocker for 5 min prior to smear preparation in order to ensure homogeneity. Once prepared, the smear should be stained with Wright-Giemsa stain for microscopic evaluation. In addition to the WBC differential, WBC, platelet, and erythrocyte morphologic assessment is conducted concurrently from this smear. The systematic microscopic evaluation of the smear should begin with the high dry objective and then under oil immersion at 50× to evaluate the smear for general impressions of numbers of cells and morphology and to locate the monolayer of the smear for the differential count. The 100× objective is then used to perform the differential and morphology assessments. Upon microscopic examination, typically, a differential contains five types of mature WBCs: neutrophils, lymphocytes, monocytes, eosinophils, and basophils. Small numbers (<3.0%) of immature white blood cells may also be present in the smear. Complete descriptions of various cell type characteristics in various species are found in numerous texts [17]. The differential is a count of 100 WBCs. The constituents are reported as a percentage of the 100 cells counted. For interpretation, it is also useful to report results in absolute numbers of individual cell types. Absolute counts are calculated using the following formula:
\[ \text{Absolute count} = \%\ \text{cell count (expressed as a decimal)} \times \mathrm{cWBC} \tag{4} \]
where cWBC = WBC count corrected for presence of NRBCs (see below). Occasionally, nucleated red blood cells (NRBC) will appear in the smear. When present, these cells must be counted and reported as: number of NRBCs/100 WBCs. If three or more NRBCs are counted per 100 WBCs, the automated WBC must be corrected and reported to account for the inability of the automated impedance cell counter to distinguish between a NRBC and a WBC. This correction is calculated using the following formula:
\[ \text{Corrected WBC} = \frac{\text{Total WBC} \times 100}{100 + \text{Number of NRBC per 100 WBC}} \]
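The NRBC correction and the absolute count formula above can be chained, as in the following Python sketch. The function names, the dictionary layout, and the sample differential are assumptions for illustration. The threshold of three NRBCs per 100 WBCs is taken from the text.

```python
def corrected_wbc(total_wbc, nrbc_per_100_wbc):
    """Corrected WBC = Total WBC x 100 / (100 + NRBC per 100 WBC)."""
    return total_wbc * 100 / (100 + nrbc_per_100_wbc)


def absolute_counts(differential_percent, total_wbc, nrbc_per_100_wbc=0):
    """Convert a 100-cell differential (percentages) to absolute counts.

    The automated WBC is corrected only when three or more NRBCs
    per 100 WBCs were counted, as described in the text.
    """
    if abs(sum(differential_percent.values()) - 100) > 1e-6:
        raise ValueError("differential percentages must sum to 100")
    if nrbc_per_100_wbc >= 3:
        cwbc = corrected_wbc(total_wbc, nrbc_per_100_wbc)
    else:
        cwbc = total_wbc
    return {cell: pct / 100 * cwbc for cell, pct in differential_percent.items()}


# Illustrative differential only; WBC is in thousands per cubic millimeter.
differential = {"neutrophils": 25, "lymphocytes": 70, "monocytes": 3,
                "eosinophils": 2, "basophils": 0}
counts = absolute_counts(differential, total_wbc=12.0, nrbc_per_100_wbc=20)
print(counts)
```

Absolute counts keep the units of the WBC count passed in, so the caller is responsible for reporting them consistently.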
(m) Differential (Cellular Elements)-Bone Marrow. The quality of the smear is absolutely critical to evaluation. For rodents and rabbits, bone marrow smears should be prepared from the femoral bone marrow. Typically the smears are prepared using size "000" paintbrushes moistened with physiologic phosphate-buffered saline. For larger species such as dogs and nonhuman primates, imprints of sternal bone marrow are preferable. To ensure that an adequate smear is prepared, at least two slides should be prepared from each animal. Care must be taken to avoid exposure of the bone marrow to formalin before, during,
and after slide preparation. After the slides are allowed to air dry, they should be fixed in absolute methanol for 5 min. The slides should be stained with a modified Wright-Giemsa stain. The systematic evaluation of the smear should begin with the high dry objective and then under oil immersion at 50× to evaluate the smear for general impressions of total cellularity and morphology and to locate the monolayer of the smear for the differential count. The 100× objective is then used to perform the differential assessment. Typically, bone marrow evaluations include a differential cell count (500 cells) based on cell lineage (i.e., myeloid, erythroid, megakaryocytic series cells, miscellaneous cells) and stage of maturation. Myeloid series cells are classified as myeloblast, promyelocyte, myelocyte, metamyelocyte, band, neutrophil, eosinophil, or basophil. Further characterization can be made within the myelocyte and metamyelocyte cell types based on the appearance of specific granules. Erythroid series cells are classified as rubriblast, prorubricyte, rubricyte, or metarubricyte. Lymphocytes, monocytes, plasma cells, mast cells, and reticulum cells are recorded as miscellaneous cells. A myeloid/erythroid (M/E) ratio is also determined for each animal via an additional independent 500-cell count of myeloid/erythroid cells. The M/E ratio is calculated by dividing the number of myeloid cells by the number of erythroid cells. It is important that interpretation of the bone marrow evaluation be performed in conjunction with the peripheral blood picture and histopathological evaluation of the hematopoietic organs.
(n) Coagulation/Hemostasis. Evaluation of hemostasis in the toxicology study includes both plasma coagulation factors and platelets. The prothrombin time (PT) and activated partial thromboplastin time (PTT) are utilized to assess the function of the plasma coagulation factors, with the exception of factor XIII. Thrombin time (TT) is utilized to assess the presence of functional fibrinogen. Plasma coagulation reactions are divided into the extrinsic, intrinsic, and common pathways. The extrinsic system involves the reactions of tissue factor and factor VII that result in the conversion of factor X to factor Xa. The intrinsic system is composed of factors VIII, IX, XI, and XII, prekallikrein, and kininogen. The common pathway includes factors V, X, and XIII, prothrombin, and fibrinogen. The coagulation factor assays require that whole blood be collected using trisodium citrate as the anticoagulant. The ratio of anticoagulant to whole blood is critical.
The prothrombin time (PT) is a nonspecific test that measures the functional ability of the extrinsic coagulation system. The assay measures plasma clotting time in seconds after the addition of tissue thromboplastin (factor III) and calcium chloride to the specimen. To ensure accuracy, the assay is performed in duplicate. The difference between the duplicate determinations should not exceed 5%. The
PT test may be prolonged due to abnormalities of factors V, VII, and X, prothrombin, or fibrinogen, or due to the presence of an inhibitor.
The activated partial thromboplastin time (PTT or APTT) is also a nonspecific test that measures the functional ability of the intrinsic and common coagulation systems. The assay measures plasma clotting time in seconds after the specimen is incubated with a surface activating agent (factor XII activator), partial thromboplastin, and calcium chloride. To ensure accuracy, the assay is performed in duplicate. The difference between the duplicate determinations should not exceed 5%. The PTT test may be prolonged due to abnormalities of factors V, VIII, IX, X, XI, and XII, or due to the presence of an inhibitor.
The thrombin time (TT) is a specific test that estimates the quantity of functionally active fibrinogen. The assay measures plasma clotting time in seconds after the specimen is incubated with thrombin and calcium chloride. The assay is performed in duplicate. The difference between the duplicate determinations should not exceed 5%. The TT may be prolonged by reduced functional fibrinogen (dysfibrinogenemia), reduced fibrinogen (hypofibrinogenemia), the presence of fibrinogen degradation products, heparin, or antibody to thrombin, or amyloidosis.
Platelets are measured as described previously. Megakaryopoiesis is regulated by the number of circulating platelets under the influence of a hormonal factor, thrombopoietin [18]. Thrombopoietin influences platelet production by stimulating committed stem cells, inducing additional endomitosis in immature megakaryocytes, and shortening the megakaryocyte maturation time [18,19]. Thrombocytopenia is defined as a platelet count below normal limits established for a specific species. It should be noted that there are significant differences in platelet number, function, and induced changes between species [20]. In animals, it is reported that petechial or ecchymotic hemorrhage does not occur until the platelet count is <100 × 10^9/liter [21]. Thrombocytopenia should be confirmed by a review of the peripheral blood smear. Thrombocytopenia may occur due to increased platelet removal/destruction (e.g., autoimmune thrombocytopenia, disseminated intravascular coagulation) or impaired platelet production due to megakaryocyte injury from drugs, radiation, viruses, or neoplasia. Thrombocytosis is defined as a platelet count above normal limits established for a specific species. Thrombocytosis may occur due to secondary causes such as myeloproliferative disease, inflammation, or neoplasia, or primary bone marrow changes characterized by a marked increase in megakaryocytes.
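The duplicate-agreement criterion stated for the PT, PTT, and TT assays (a difference of no more than 5%) can be checked as sketched below in Python. The text does not say what the 5% is measured against; this sketch assumes it is relative to the mean of the two determinations. The function name and the clotting times are likewise illustrative assumptions.

```python
def duplicates_agree(first_sec, second_sec, max_percent_difference=5.0):
    """Return (acceptable, percent_difference) for duplicate clotting times.

    Assumption: the difference is expressed relative to the mean of the
    two determinations; the text does not specify the denominator.
    """
    mean = (first_sec + second_sec) / 2
    if mean <= 0:
        raise ValueError("clotting times must be positive")
    percent_difference = abs(first_sec - second_sec) / mean * 100
    return percent_difference <= max_percent_difference, percent_difference


# Illustrative clotting times in seconds.
ok, diff = duplicates_agree(12.0, 12.4)
rejected, big_diff = duplicates_agree(20.0, 22.0)
print(ok, round(diff, 2), rejected, round(big_diff, 2))
```

A laboratory that defines the criterion against the first or the lower determination instead would only need to change the denominator.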
C. Clinical Chemistry
The importance of appropriate clinical chemistry assessment in toxicology studies has been recognized for many years. Just as with hematological parameter assessment, in elucidating toxic effects through clinical chemistry analysis, the investigator must be aware of the appropriate methods of blood collection and handling, sampling times, testing selection, quality control evaluation, and test result interpretation. In addition, one must be aware of normal animal species physiology and its impact on clinical chemistry parameters when interpreting test results.
1. Selection of Parameters
The investigator should consider several points when selecting clinical chemistry parameters for evaluation. These items include sample volume requirements, information desired, time points for evaluation, and route of sample collection. A “routine” clinical chemistry evaluation in toxicity testing is generally used in those studies where clinical chemistry effects are not expected based on the known structure and/or function of the test material. The parameters measured would include those that would cover changes derived from all major organ systems. For example, a routine panel of assays would include alkaline phosphatase (ALP), alanine aminotransferase (ALT), aspartate aminotransferase (AST), and bilirubin (Tbili) for liver damage, creatine kinase (CK) for muscle damage, creatinine (CRT) and blood urea nitrogen (BUN) to assess kidney damage, and sodium, potassium, and chloride to assess the electrolyte status of the animal. If the investigator suspects test material-related damage to a particular organ prior to study start, the use of more specialized tests may be warranted. For example, 5’-nucleotidase and total bile acids would be important if the test material was thought to be a liver toxin.
2. Specimen Collection and Quality
Specimen collection for clinical chemistry analysis requires the use of serum collection tubes (no anticoagulant) with or without gel for separation. Blood collected into serum separator tubes must be allowed to clot at room temperature for 30 min to minimize residual fibrin. Tubes are then centrifuged pursuant to conditions specified by the manufacturer of the serum separator tubes. The serum is removed with a polypropylene pipette and placed in a prelabeled screw-cap polypropylene test tube. The serum should be held or shipped in a chilled, insulated container. If serum is to be stored or shipped for subsequent analysis, the specimen should be frozen immediately at <-15°C and held for a limited period of time.
3. Common Parameters-Methodology and Interpretation
(a) Enzymes
Alkaline phosphatase. Alkaline phosphatase (ALP) is composed of several isoenzymes that are present in practically all tissues of the body, especially at or in the cell membranes. These enzymes catalyze the hydrolysis of monophosphate esters and have a wide substrate specificity [22]. The actual natural substrates upon which they act in the body are not known. There are specific forms of ALP in liver, bone, intestine, placenta, and kidney; however, the predominant forms present in normal serum are the liver and bone forms. It appears that the enzyme is associated with lipid transport in the intestine and liver and calcification in the bone. The preferred method for analysis of serum ALP is the adenosine monophosphate (AMP)-utilizing 4-nitrophenyl phosphate method that measures the 4-nitrophenoxide ion produced by removal of the phosphate group from 4-nitrophenyl phosphate by ALP. The four major causes of high serum alkaline phosphatase activity are induction of hepatic ALP (e.g., cholestasis), induction of hepatic ALP release (e.g., iatrogenic corticosteroids, hyperadrenocorticism), increased osteoblastic activity in bone (e.g., hyperparathyroidism), and neoplasia (e.g., sarcoma, carcinoma). Hepatic ALP production is induced by increased intracanalicular (bile canaliculi) hydrostatic pressure. It is a microsomal membrane-bound enzyme that does not leak during altered hepatocellular permeability. It is the most sensitive indicator of cholestasis and will be high prior to increases in total bilirubin. The magnitude of the ALP increase seen in hyperparathyroidism, bone neoplasia, rickets, or osteomalacia (from bone ALP) is not as large as that observed with cholestasis. Exposure to numerous chemicals/drugs (e.g., acetaminophen, allopurinol, antifungal agents, halothane, clofibrate) is reported to cause increases in ALP [23].
Aminotransferases. The aminotransferases, including alanine aminotransferase (ALT), formerly serum glutamate pyruvate transaminase (SGPT), and aspartate aminotransferase (AST), formerly serum glutamate oxaloacetate transaminase (SGOT), are indicators of hepatocyte damage [24].
These enzymes are present in hepatocyte cytosol and during episodes of altered plasma membrane permeability, they leak into the extracellular fluid. Alanine aminotransferase is an enzyme that catalyzes the transfer of an amino group from alanine to oxoglutarate to produce glutamate. Aspartate aminotransferase catalyzes the transfer of an amino group from aspartate to oxoglutarate to form L-glutamate. The preferred methods for analysis of serum ALT and AST are the International Federation of Clinical Chemistry (IFCC) reference methods. In laboratory animal species, ALT is specific for liver damage or disease, whereas AST is found in liver and muscle. As described above, these enzymes leak out of the hepatocyte cytosol when the plasma membrane permeability is altered. Leakage occurs due to the high concentration gradient between the intra- and extracellular compartments. In general, the magnitude of the ALT increase in liver damage or disease is greater than that of AST. This is due, in part, to the presence of some AST in hepatocyte mitochondria in addition to that in the cytosol. Mitochondrial contents are less likely to leak. Additionally, elevations of ALT activity persist longer than do those of AST (the plasma half-life of both
is approximately 2 to 4 days). The cause of hepatic enzyme leakage is increased plasma membrane permeability, which may result from a reduced oxygen supply to the liver, direct effects of toxins, drugs, or chemicals, inflammation, and/or fatty change. Although the magnitude of the increase in ALT/AST is directly proportional to the number of hepatocytes affected [22], it is not related to the reversibility/irreversibility of the change. This is also true of increased AST with muscle damage or disease (e.g., myocardial infarction).
Creatine kinase. Creatine kinase (CK) catalyzes the reversible phosphorylation of creatine by adenosine triphosphate (ATP). When muscle contracts, ATP is used (forming adenosine diphosphate [ADP]) and creatine kinase catalyzes the rephosphorylation of ADP (forming ATP) using creatine phosphate, the major phosphorylated compound in muscle, as the phosphorylation reservoir [25]. Three isoenzymes of CK have been described: MM, which is found in skeletal and cardiac muscle; MB, which is also found in skeletal and cardiac muscle; and BB, which is found primarily in the brain but is also present in the prostate, gut, lung, urinary bladder, uterus, placenta, and thyroid gland. All of these are found in the cytosol or associated with myofibrillar structures. The preferred method for analysis of serum CK is the enzymatic NAC method as recommended by the Scandinavian Society for Clinical Chemistry and Clinical Physiology and optimized by Szasz et al. [26]. Creatine kinase is a leakage enzyme and increased serum values occur with reversible and irreversible damage. Creatine kinase values increase within hours of muscle injury and reach maximum values by approximately 12 hr postinjury. Creatine kinase has a relatively short half-life in serum and will return to normal levels at 24 to 48 hr after muscle damage from a single insult occurs. Therefore, high serum CK values are indicative of active or recent muscle damage.
Lactate dehydrogenase. Lactate dehydrogenase (LDH) is a hydrogen transfer enzyme that is found in the cytoplasm of most of the cells of the body. Tissue levels of LDH are approximately 500 times greater than normal serum levels, so leakage from a small number of cells can result in significant increases in the serum values of LDH. The peak of the increase in serum LDH values is observed within 48 to 72 hr after the insult. Causes of increased serum LDH include muscle damage or necrosis, hemolysis, liver disease, renal tubular necrosis, pyelonephritis, and malignant neoplasia.
(b) Bilirubin. Total bilirubin (Tbili) is derived primarily from the heme moiety of the hemoglobin released from senescent erythrocytes destroyed in the reticuloendothelial cells of the liver, spleen, and bone marrow. It is produced in peripheral tissues from protoporphyrin IX by microsomal heme oxygenase, transported to the liver in association with albumin, and transported across the sinusoidal membrane by carrier-mediated active transport. In the hepatocyte cytosol, bilirubin is bound primarily to ligandin and Z protein and rapidly conjugated with glucuronic acid to produce bilirubin mono- and diglucuronide, which are excreted into bile via an energy-dependent, active-transport process. The preferred method for analysis of serum Tbili is the Jendrassik-Grof method, which measures the azobilirubin solution formed by the reaction of total bilirubin with a caffeine reagent followed by the addition of diazotized sulfanilic acid. Increases in serum total bilirubin may be due to increased hemoglobin destruction (hemolytic hyperbilirubinemia) or liver damage (obstructive hyperbilirubinemia). Increased serum total bilirubin is an early indicator of cholestasis. Hyperbilirubinuria (only conjugated bilirubin is found in the urine) is often observed prior to increases in serum total bilirubin. In cases of increased Tbili due to cholestasis, ALP should also be increased. If not, hemolytic hyperbilirubinemia should be suspected.
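The interpretive rule in the last two sentences above (increased Tbili with increased ALP suggests cholestasis; increased Tbili without increased ALP suggests hemolysis) can be written as a small decision function. The Python sketch below is a simplification for illustration: the function name is hypothetical, and the upper reference limits are caller-supplied placeholders that in practice depend on species, age, and laboratory.

```python
def bilirubin_pattern(tbili, tbili_upper_limit, alp, alp_upper_limit):
    """Apply the Tbili/ALP rule from the text to one animal's results.

    The upper limits are hypothetical inputs supplied by the caller;
    no reference values are implied by this sketch.
    """
    if tbili <= tbili_upper_limit:
        return "total bilirubin not increased"
    if alp > alp_upper_limit:
        return "consistent with cholestasis"
    return "suspect hemolytic hyperbilirubinemia"


# Illustrative values and limits only (arbitrary units).
print(bilirubin_pattern(tbili=2.0, tbili_upper_limit=0.5, alp=400, alp_upper_limit=150))
print(bilirubin_pattern(tbili=2.0, tbili_upper_limit=0.5, alp=100, alp_upper_limit=150))
```

A rule of this kind is only a screening aid; it would normally be read alongside the hematology results and histopathology.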
(c) Creatinine. Creatinine (CRT) is derived from the nonenzymatic, spontaneous conversion of free creatine in the muscle. Approximately 1 to 2% of muscle creatine is converted to creatinine daily [27] and the amount of endogenous creatinine produced is proportional to muscle mass. The excretion rate is also constant and parallels production. The preferred method for analysis of serum CRT is the Jaffe reaction, which measures the red-orange adduct formed during the reaction between creatinine and the picrate ion in alkaline media. There are several noncreatinine Jaffe-reacting chromogens (e.g., protein, glucose, guanidine, acetone [28]), however, which may slightly increase the measurements. The definitive method for creatinine in serum utilizes isotope-dilution mass spectrometry [29]. Creatinine can provide similar information to blood urea nitrogen in renal disease or postrenal obstruction or leakage. It is filtered freely through the glomerulus; however, small amounts are reabsorbed by the renal tubules as well as secreted by the proximal tubules. Increased serum creatinine levels occur when glomerular filtration is decreased. Creatinine levels are also increased by reduced renal perfusion. Creatinine clearance may also be measured and is considered to be an accurate index of glomerular filtration rate (GFR).
(d) Electrolytes. The major electrolytes (sodium, potassium, chloride) are primarily free charged ions that have diverse roles in the body, including but not limited to the following: maintenance of osmotic pressure and water distribution, maintenance of pH, regulation of muscle contraction, involvement in oxidation/reduction reactions, and serving as enzyme cofactors. Most are analyzed by use of ion-selective electrodes that are highly selective and accurate. Electrolyte gain or loss can occur in the gastrointestinal tract (dietary intake, loss of saliva, gut stasis, diarrhea, vomiting), kidney (lack of antidiuretic hormone, excess or lack of aldosterone, tubular disease), lung (hyperventilation, febrile episodes), and skin (sweat, febrile episodes).
Sodium. Sodium is the major cation of extracellular fluid and is central in the maintenance of water distribution and osmotic pressure. Serum sodium is an indicator of total body sodium if the animal is appropriately hydrated. Common causes of hyponatremia include prolonged vomiting, persistent diarrhea, salt-losing enteropathies, glycosuria (solute diuresis), diminished tubular reabsorption, aldosterone deficiency, severe polyuria, metabolic acidosis, and severe edema and ascites. Hypernatremia occurs with excessive loss of sodium-poor body fluids, such as with profuse sweating, prolonged hyperpnea, vomiting, diarrhea, polyuria, decreased antidiuretic hormone, hyperadrenocorticism, and brain injury. Hypernatremia in association with hypokalemia and hypercalcemia may be seen in hepatic disease, cardiac failure, burns, and osmotic diuresis [30].
Potassium. Potassium is the major intracellular cation and is the critical ion in maintaining ionic gradients for neural impulse transmission. Because approximately 90% of body potassium is intracellular, serum potassium concentration is not an indicator of total body potassium. Serum potassium abnormalities are commonly a result of acid-base imbalances and have serious consequences such as muscle weakness or paralysis and cardiac conduction abnormalities leading to cardiac arrest. Hypokalemia can result from decreased intake, redistribution of extracellular potassium into intracellular fluid (as seen with alkalosis or acidosis), and increased loss of potassium-rich fluids (e.g., renal tubular acidosis, vomiting, diarrhea). Natural causes of hyperkalemia include redistribution of potassium into extracellular fluid (e.g., massive tissue necrosis, dehydration, acidosis, hemolysis, leukocytosis, thrombocytosis) and decreased excretion (e.g., acute renal failure, renal tubular acidosis, adrenocortical insufficiency).
Chloride. Chloride is the major extracellular anion and is regulated passively by gradients derived from active sodium transport across cell membranes. Like sodium, chloride is involved in water distribution, osmotic pressure, and anion-cation balance in the extracellular fluid compartment. The serum chloride concentration is directly proportional to the sodium concentration. Hypochloremia is primarily seen with chronic pyelonephritis, metabolic acidosis, and persistent gastric secretion and prolonged vomiting. Dehydration, renal tubular acidosis, and metabolic acidosis with prolonged diarrhea result in hyperchloremia.
(e) Glucose. The serum concentration of glucose is regulated by complex interactions of hormones such as glucagon, insulin, cortisol, and epinephrine. Glucose should be measured in the fasting animal. The causes of increased serum glucose concentrations include postprandial blood collection, diabetes mellitus, hyperadrenocorticism, moribundity, exogenous glucocorticoids, and morphine. Decreased serum glucose concentrations can result from ethanol ingestion, liver failure, and deficiency of growth hormone, glucocorticoids, and glucagon. Severe and possibly irreversible central nervous system dysfunction can result from hypoglycemia. Clinical signs of hypoglycemia include confusion, lethargy, ataxia, and seizure, which may progress to loss of consciousness and death.
(f) Proteins. The body contains a multitude of different proteins, of which approximately three hundred can be identified in the plasma alone [31]. With the exception of the protein hormones and immunoglobulins, the majority of the plasma proteins are synthesized in the liver. They are constantly undergoing catabolism, primarily in the liver, and replacement, with each plasma protein having its own specific turnover rate. The different functions of proteins are as numerous as the proteins themselves. These functions include serving as complement factors, coagulation factors, anions in acid-base balance, and carriers for vitamins, hormones, fats, free hemoglobin, and unconjugated bilirubin. Commonly, serum total protein and albumin are the two plasma proteins that are measured. Albumin is the most abundant protein in plasma and is the major determinant of plasma oncotic pressure. Normally, serum total protein concentration is in direct proportion to the serum albumin concentration. While hyperalbuminemia is only seen in dehydration, hypergammaglobulinemia, and hyperfibrinogenemia, hypoalbuminemia (and hypoproteinemia) is common in many disease states and may result from impaired synthesis (liver disease), increased catabolism (tissue damage), reduced absorption (malnutrition), protein loss (glomerulonephritis, protein-losing enteropathy, burned skin), or altered distribution (ascites). The preferred method for analysis of serum total protein is the biuret reaction method, which measures the amount of a colored product that is formed from the reaction of peptide bonds of proteins with copper (II) [Cu(II)] ions in alkaline solution. The preferred method for analysis of serum albumin, in all species except rabbits, is the bromocresol green dye-binding method, which measures the bromocresol green-albumin complex after allowing albumin and bromocresol green to bind at pH 4.2. The preferred method for analysis of serum albumin in rabbits is the bromocresol purple method, which measures the bromocresol purple-albumin complex after allowing albumin and bromocresol purple to bind at pH 5.2 with acetate.
(g) Urea. Urea, the major nitrogen-containing metabolic product of protein catabolism, is synthesized in the liver from amino nitrogen-derived ammonia. A small amount of urea is also absorbed from the large intestine. Urea is found throughout the total body water compartment due to passive diffusion, and renal excretion is the most important route of excretion. It is removed in the glomerulus by simple filtration and is found in the same concentration in the glomerular filtrate as in the blood. Based on the rate of urine flow (directly proportional), urea will passively diffuse with water from the tubular lumen back into the blood. The preferred method for analysis of urea nitrogen in the serum (BUN) is the urease with glutamate dehydrogenase (coupled-enzyme system) method, which
332
Lanning
measures the decreasein absorbance resulting from the glutamate dehydrogenase reaction. Assessment of increases in BUN are categorized as prerenal azotemia, renal azotemia, and postrenal azotemia. Prerenal azotemia isan increase in BUN concentration due to increased protein catabolism (e.g.. tissue damage, febrile episodes)ordecreasedrenalperfusion(e.g.,dehydration,shock,cardiovascular compromise). A high BUN with a normal serum creatinine level is indicativeof prerenal azotemia. Renal azotemia resultswhen approximately 75% of the nephrons are nonfunctional. Both serum creatinine and BUN are increased similarly in renal azotemia. Obstruction to urinary outflow results in postrenal azotemia. Serum creatinine and BUN are increased with postrenal azotemia; however, there is a disproportionately greater increase in the BUN level as compared with the creatinine level.
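The BUN and creatinine patterns described above can be sketched in code. This is an illustration only, not a diagnostic tool: it expresses each analyte as a fold change over an upper reference limit and applies the three patterns from the text. The function name, the 1.2-fold cutoff for "increased," and the 1.5 ratio used for "disproportionately greater" are hypothetical values chosen for the example; a laboratory would substitute its own species-specific reference intervals and criteria.

```python
def classify_azotemia(bun, creatinine, bun_ref, creatinine_ref,
                      increased_cutoff=1.2, disproportion_ratio=1.5):
    """Return a tentative azotemia pattern from BUN and creatinine.

    bun_ref and creatinine_ref are upper reference limits in the same
    units as the measurements. Both cutoffs are illustrative assumptions,
    not values taken from the text.
    """
    bun_fold = bun / bun_ref
    creatinine_fold = creatinine / creatinine_ref
    bun_high = bun_fold > increased_cutoff
    creatinine_high = creatinine_fold > increased_cutoff

    if not bun_high:
        return "no azotemia pattern"
    if not creatinine_high:
        # High BUN with normal creatinine
        return "prerenal pattern"
    if bun_fold / creatinine_fold >= disproportion_ratio:
        # Both increased, BUN disproportionately more than creatinine
        return "postrenal pattern"
    # Both increased to a similar degree
    return "renal pattern"
```

For example, `classify_azotemia(60, 0.6, 25, 0.8)` falls into the prerenal pattern because only BUN exceeds its cutoff. Any such result would still need to be interpreted together with the clinical signs and histopathology.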
D. Urinalysis
Urinalysis should be performed in any study in which the test material is suspected to be a renal toxin. Although urinalysis techniques utilized in toxicity testing are primitive and the reliability of the results corresponds with the methods employed, the results can be effective in determining whether the kidney is functioning properly or is being overwhelmed beyond its capacity. In general, however, five animals per sex per group may not provide the consistency to characterize changes successfully.
1. Selection of Parameters
As a minimum, the specific gravity, acidity, protein, glucose, ketones, bilirubin, urobilinogen, and a microscopic evaluation (epithelial cells and casts) should be determined. In general, the protein, glucose, ketones, bilirubin, and urobilinogen are determined qualitatively using reagent strips. For more specific measurements, clinical chemistry analysis can quantitate the amounts of analytes such as total protein, glucose, urobilinogen, and bilirubin in the urine.
2. Specimen Collection and Quality
The collection techniques employed in most toxicological studies are primitive. Urine is generally collected in metabolism cages that collect the urine in a container below the cage. For the best results, the collection container is surrounded by a wet ice bath. Typically, for rodents, urine is collected over a 12-hr period. The urine is evaluated for total volume, color, cloudiness, pH, and specific gravity immediately after the collection period.
3. Common Parameters-Methodology and Interpretation
(a) Ketones. The ketones measured in the urine qualitatively usually include acetone and acetoacetate. Increased ketones in the urine may be due to diabetes mellitus, prolonged fasting, persistent vomiting, or very low carbohydrate diets.
(b) Glucose. Increased glucose in the urine may be due to diabetes mellitus, major trauma, exogenous steroids, infection, or pheochromocytoma (adrenal gland tumor).
(c) pH. Values of urinary pH will not reflect the normal physiological situation since the dissolved carbon dioxide (CO2) dissipates during the collection period, which results in an elevated pH. Unless the pH is outside the range of 6.0 to 7.0, changes are most likely incidental to the collection technique.
(d) Specific Gravity. Hair, dander, excrement, and dust, which are commonly found in the urine, will falsely increase the specific gravity. This is a crude measurement of the concentrating ability of the kidney (osmolality) and may be altered with some types of nephrotoxicity.
(e) Microscopic Evaluation. Microscopic evaluation of the urine sediment allows for the detection of epithelial cells, bacteria, casts, red blood cells, white blood cells, and crystals. It should be remembered that bacteria will grow in the urine during the collection period. Casts in the sediment are indicative of renal tubular damage. Crystals are often precursors to renal and/or urinary bladder stones. Red blood cells are typically seen in conjunction with urinary bladder or urethral mucosal damage. White blood cells are usually seen in cases of infectious changes in the kidney, urinary bladder, ureter, or urethra.
E. Quality Control
For assurance of the reliability of the data generated, it is important for the clinical laboratory to participate in an interlaboratory comparison program, such as the College of American Pathologists (CAP) Interlaboratory Comparison Program for Proficiency Testing, for all assays performed in the laboratory. In addition, the laboratory may subscribe to the CAP Quality Assurance Service (QAS). Quality control (QC) data generated in the clinical laboratory are transferred monthly, via modem, to the service for evaluation against the previous monthly data and for comparison with other laboratories. This service allows the clinical laboratory to ensure that the QC data generated by the laboratory are accurate and conform to the data generated by other participants. The CAP QAS provides participants with a computer software package that allows them to maintain QC records on computer for quick and accurate evaluation using the Westgard rules and for transmission of data directly to CAP. The clinical laboratory may participate in hematology, clinical microscopy (urinalysis), and clinical chemistry surveys with a minimum of two specimens per interval for “unregulated” analytes and a minimum of 5 specimens per interval for “regulated” analytes. Successful participation in these surveys is required by law under the CLIA guidelines.
To ensure the highest quality results, a veterinary clinical pathologist should review data at the end of each day and prior to discarding or freezing any unused samples. Any questionable results should be dealt with at that time. In all cases, documentation of rejection and repeat sample runs must be made along with the possible source of variation and any corrective action taken.
The clinical laboratory must have written quality control procedures that are followed. A subset of the Westgard rules is typically used as criteria for acceptance or rejection of the test data. The initial rule selected for use is the 1:2S “warning” rule, a violation of which should trigger a close inspection of the control data. The following rules are commonly selected to determine rejection or acceptance of the run: 1:3S, 2:2S, and R:4S. When three levels of control are used (hematology), the following rules are used:
1:2S--One point falls outside 2 standard deviations (SD). (Warning Rule)
1:3S--One point falls outside 3 SD.
2:2S--Two consecutive points for the same control value fall outside 2 SD.
R:4S--In one run, one control value exceeds the mean +2 SD and another control value exceeds the mean -2 SD.
When two levels of control are used, as with many chemistry analyzers, all points must be within 2 SD for acceptance. These parameters have “target” values assigned for the specific analyzer using specific reagents.
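The rule subset listed above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the CAP software or any analyzer's implementation: the function name and the use of z-scores, (result - target mean) / target SD, are choices made for the example. The 2:2S check requires both points to lie on the same side of the mean, which is the usual reading of that rule but is not stated explicitly in the text.

```python
def westgard_check(current, previous=None):
    """Evaluate one run of control results against the rule subset above.

    current  -- dict mapping control level name to its z-score,
                (result - target mean) / target SD, for this run
    previous -- the same mapping for the preceding run, or None
    Returns (decision, violated_rules).
    """
    violated = []
    zs = list(current.values())

    # 1:2S warning rule: any control outside 2 SD
    if any(abs(z) > 2 for z in zs):
        violated.append("1:2S")
    # 1:3S: any control outside 3 SD
    if any(abs(z) > 3 for z in zs):
        violated.append("1:3S")
    # 2:2S: same control outside 2 SD on two consecutive runs
    # (same side of the mean is an assumption of this sketch)
    if previous:
        for level, z in current.items():
            prev = previous.get(level)
            if prev is not None and ((z > 2 and prev > 2) or (z < -2 and prev < -2)):
                violated.append("2:2S")
                break
    # R:4S: within one run, one control above +2 SD and another below -2 SD
    if any(z > 2 for z in zs) and any(z < -2 for z in zs):
        violated.append("R:4S")

    if any(rule in violated for rule in ("1:3S", "2:2S", "R:4S")):
        decision = "reject"
    elif "1:2S" in violated:
        decision = "warning"
    else:
        decision = "accept"
    return decision, violated
```

A run with controls at z = 2.3, 0.1, and -0.4 returns a warning only. For two-level chemistry controls, the stricter criterion in the text amounts to treating any 1:2S violation as a rejection.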
Specific to hematology analyses, assayed control material must be analyzed prior to use and the mean and standard deviation will be calculated by the instrument microprocessor. The QC material should be analyzed prior to, during, and after each automated hematology experimental run. Normal and abnormal controls should be used to determine the precision, accuracy, and reproducibility of the eight parameters measured by the instrument. Hematology QC data may be stored in most instruments and printed monthly. Upon receipt of new lot numbers of control materials, the previous month’s data must be printed for archiving purposes and deleted from the instrument. For clinical chemistry analyses, assayed normal and abnormal QC materials must be included at the beginning and end of each clinical chemistry experimental test run. All control materials should be assayed prior to use and the mean and standard deviation calculated for each level.
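As a rough sketch of the calculation described above, the mean and standard deviation for one control level can be derived from replicate assays and turned into 2 SD and 3 SD limits. The function name and the returned dictionary layout are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption; the text does not say which estimator a given instrument applies.

```python
import statistics


def control_limits(replicates):
    """Return mean, SD, and 2 SD / 3 SD limits for one control level.

    replicates -- results from assaying the control material before use
    """
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # sample SD; an assumption of this sketch
    return {
        "mean": mean,
        "sd": sd,
        "limits_2sd": (mean - 2 * sd, mean + 2 * sd),
        "limits_3sd": (mean - 3 * sd, mean + 3 * sd),
    }
```

The limits computed this way are what subsequent control results would be compared against when the acceptance rules are applied.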
Equipment maintenance schedules (scheduled and unscheduled), calibration records, and cumulative QC data must be maintained for all laboratory equipment.
III. HISTOPATHOLOGY
A. Collection Methodology for Routine and Special Histology
The selection of tissues for histopathological evaluation should be done in accordance with federal agency guidelines such as the FDA Redbook guidelines or the EPA guidelines. A general tissue list is found below:
Tissue List for Microscopic Evaluation
Adrenal glands
Aorta
Bone
Bone marrow
Brain
Cecum
Colon
Cervix
Duodenum
Esophagus
Eyes
Gallbladder (if applicable)
Heart
Ileum
Jejunum
Kidneys
Liver
Lungs
Mammary gland
Ovaries and fallopian tubes
Pancreas
Peripheral nerve
Pituitary
Rectum
Representative lymph nodes
Salivary glands
Seminal vesicle
Skeletal muscle
Smooth muscle
Spinal cord (at least 2 levels)
Sternum
Thymus
Thyroid glands/parathyroid glands
Urinary bladder
and gross lesions
1. Necropsy Techniques
Good necropsy techniques are critical to the histopathological evaluation process. At necropsy, specific attention is placed on the following items: (1) animal identification, (2) tissue accountability, (3) lesion recognition and accountability, (4) accurate recording of gross findings/appropriate descriptions and required entries on the individual animal necropsy form, (5) proper gross trimming of the wet tissues and tissue fixation, and (6) weighing each protocol-specified tissue. The necropsy must be performed by necropsy prosectors trained in the necropsy procedure, lesion recognition, and documentation of findings. The pathologist will systematically examine each tissue prior to placing it in the fixative container. This enables the pathologist to ensure a maximum accountability of tissues, to verify all gross findings noted by the prosector, and to identify any additional changes that the pathologist deems significant.
Scheduled necropsies for routine histopathology should be initiated within 5 min after the animal is euthanized. All tissues and/or organs are examined in situ, dissected from the carcass, reexamined, including cut surfaces, and fixed in 10% neutral buffered formalin. All organs specified in the protocol are saved and fixed in their entirety. Tissues saved for histopathology should be fixed at a thickness not to exceed 0.5 cm. The tail tattoo, ears, and/or feet (or any other means of identification) used in any way for animal identification during the in-life phase of the studies must be saved in formalin with the animal tissues. The trachea and lungs should be perfused by introducing 10% buffered formalin (approximately 4 to 8 ml for rats) into the trachea until the lungs are completely filled to normal inspiratory volume. The kidneys should be bisected, and the cut surfaces examined before fixation. To maintain tissue identity, the left kidney should be bisected longitudinally and the right kidney transversely. In both cases, the bisection will be made off-center to ensure preservation and orientation of cortex, medulla, and pelvis in at least the larger of the two sections for each kidney. The entire gastrointestinal tract, including the stomach, should be perfused with 10% neutral buffered formalin to ensure immediate fixation of the mucosal surface.
2. Fixatives and Fixation
Proper fixation of tissues is another critical aspect of necropsy that is often overlooked. Inadequate amounts of fixative and inappropriate wet tissue trimming techniques are the most common mistakes made during routine rodent necropsy. When using formalin fixation, it is recommended that at least a 10:1 formalin to tissue ratio be used. Sections from solid tissues, such as liver and spleen, should not exceed 0.5 cm in order to assure complete fixation. Special fixatives are recommended for specific tissues, such as Davidson's fixative for the eyes, 4% paraformaldehyde for testes, and Bouin's fixative for ovaries. If fresh frozen sections are needed for immunostaining, they may be prepared at 3-mm thickness and placed face down in a cryomold with frozen tissue embedding medium completely encasing them. Once the mold is removed from the embedding ring, the rings should be carefully wrapped, preferably with parafilm and aluminum foil. Careful wrapping prevents desiccation of the specimens during storage at -80°C, and they can be maintained in this manner for months without degradation of enzyme activity.
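The 10:1 formalin to tissue ratio and the 0.5 cm thickness limit given above lend themselves to a quick bench-side check. The sketch below is only an illustration: the function names are hypothetical, and it assumes the ratio is applied by volume, which the text does not state explicitly.

```python
def formalin_volume_ml(tissue_volume_ml, ratio=10.0):
    """Minimum formalin volume for a given tissue volume (ratio by volume assumed)."""
    return tissue_volume_ml * ratio


def sections_needing_retrim(thicknesses_cm, max_cm=0.5):
    """Return the indexes of sections thicker than the fixation limit."""
    return [i for i, t in enumerate(thicknesses_cm) if t > max_cm]
```

For instance, 25 ml of tissue would call for at least 250 ml of formalin, and a section measured at 0.7 cm would be flagged for further trimming before fixation.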
3. Histology Techniques
Wet tissues are trimmed according to specific protocol guidelines and established histology laboratory standard operating procedures (SOPs). The tissue processing instructions should include a predetermined trimming/embedding scheme specific for the study being trimmed. The scheme is based on protocol information and established SOPs and is commonly affixed to the trimming hood for easy reference during trimming. Gross lesions or abnormal tissue changes recorded on the necropsy section of the individual animal necropsy form are verified during trimming. Any additional lesions or abnormalities found during trimming are recorded in the necropsy observation section, noting that they were found at trimming, and processed accordingly. Tissues should be trimmed to a maximum thickness of 0.4 cm for processing. The trimmed specimens are placed in cassettes prelabeled with the study number, group and animal number, and block number. Any residual tissue should be wrapped in gauze and double bagged in labeled, sealed polyethylene bags containing a sufficient amount of fixative to keep the tissues moist.
For paraffin-embedded tissues, fully automated tissue processing units are commonly used in histology laboratories. The processing schedule should be specific for the species, and/or size and type of tissue. Paraffin temperatures, reagent rotation/changes, study number, number of cassettes, technician processing, and date should be documented for the processor used. Before the processor is activated, the maintenance schedule program, paraffin temperature record, and reagent rotation/changes record should be checked for completion and accuracy. The tissue embedding scheme utilized is determined by the specific protocol requirements and histology laboratory SOPs. The embedding scheme should take into consideration tissue size and consistency and should be arranged to maximize tissue recovery for subsequent microscopic evaluation.
After the tissues are paraffin embedded, tissue sections are routinely cut at 4 to 6 microns. The slides should be labeled with the study number, group and animal number, special stain (if applicable), and slide number using a pencil or an indelible marker. After the slides are dried, they are placed in staining racks and stained routinely with hematoxylin and eosin (H&E) using an automatic slide stainer. Reagents and staining solutions used during staining should be rotated and changed on a regular schedule. Stained slides are coverslipped with an appropriate size coverslip and placed on slide trays for drying and labeling. Slides should then be labeled with computer-generated labels with the study number, group and animal number, special stain (if applicable), and the slide number on the label. After the slides are checked by the quality control technician, they are ready for histopathological evaluation.
In recent years, the popularity of plastic embedding of tissues has grown and excellent results have been obtained using 2-hydroxyethyl methacrylate, known as glycol methacrylate or GMA. Methacrylate offers many advantages over paraffin processing. There is less shrinkage of the tissues in methacrylate because no clearing agent or heat is required, faithfully preserving delicate biological structures. During polymerization, GMA does not react with any tissue group of importance in staining, so GMA sections bind stain similarly to paraffin sections. Cellular morphology and tissue relationships are better preserved in glycol methacrylate sections and are generally superior to paraffin sections, which is important in evaluation of the testes and neural tissues. Sections can also be cut thinner than paraffin blocks, 1 to 2 microns. One of the most significant advantages of GMA over other epoxy methods is the water solubility of its monomer and the hydrophilic property of its polymer. The disadvantages of glycol methacrylate techniques are that reagents are more expensive than routine paraffin sections and not all special stains can be readily accomplished on methacrylate sections.
4. Microscopic Evaluation by Light Microscopy
Despite advances in many areas of molecular science, altered cellular morphology remains the hallmark of toxicity. Microscopic evaluation of tissues for toxicological changes takes into account both the recognition of injury to tissue and its biological significance based on the nature of the injury in conjunction with other available data. Assessment of altered gross and cellular morphology depends on the ability of the pathologist to discriminate between test material-induced changes, secondary changes, spontaneous disease, postmortem changes, iatrogenic changes, and normal biological variations. In short-term toxicity studies, the microscopic evaluation is used to determine target organs and the general toxicity of the material to assist in elucidation of the mechanism of action of the test material. In many cases, this information is used in dose selection for longer studies. In longer studies, the pathologist must be consistent in evaluating large numbers of animals. The pathologist must focus on the tissue response of a treatment animal vs. an individual animal, the tissue response to an induced change vs. a spontaneous one, and quantification of the tissue response of the treated group(s) compared with untreated or vehicle-treated group(s). In addition, the information generated should be described in a standardized way and reported in a condensed, organized report. In general, those lesions with an increased incidence in the treated group(s), particularly when the incidence increases in a dose-related manner, may be considered to be test material-related changes. The distinction between direct effect and indirect effect must be made by integrating all the information available. For further reading on the interpretation of microscopic tissue changes, the reader is referred to Haschek and Rousseaux [32].
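The idea of an incidence that increases in a dose-related manner can be illustrated with a simple tabulation. The sketch below is not a statistical test and does not replace the pathologist's judgment; the function name and the (dose, affected, examined) tuple layout are hypothetical, and "dose related" is taken here to mean only that the incidence never decreases as dose rises and exceeds the control incidence in at least one treated group.

```python
def incidence_summary(groups):
    """Summarize lesion incidence across dose groups.

    groups -- iterable of (dose, n_affected, n_examined); the lowest
              dose (typically 0) is treated as the control group
    Returns (rates, increased, dose_related).
    """
    ordered = sorted(groups, key=lambda g: g[0])
    rates = [affected / examined for _, affected, examined in ordered]
    control_rate = rates[0]
    # Any treated group with a higher incidence than control
    increased = any(rate > control_rate for rate in rates[1:])
    # Incidence never falls as dose rises (a crude monotonicity check)
    dose_related = increased and all(b >= a for a, b in zip(rates, rates[1:]))
    return rates, increased, dose_related
```

With incidences of 1/10, 2/10, 4/10, and 7/10 at doses of 0, 10, 30, and 100, both flags are true; with 2/10, 5/10, and 3/10 the incidence is increased but not dose related by this crude definition.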
5. Special Techniques
Although standard histological and pathology procedures are the backbone of the accurate assessment of tissue change, the development and application of a wide variety of special techniques have greatly advanced the pathologist's ability to ascertain mechanistic data as part of the overall tissue response evaluation.
A special technique utilizes 5-bromo-2'-deoxyuridine (BRDU) to label cells in S phase. Traditionally, chemically induced cell proliferation in a target tissue has been quantitated by histoautoradiographic visualization of [3H]thymidine (3H-TdR) incorporation into the DNA of cells in S phase. However, the thymidine analogue, 5-bromo-2'-deoxyuridine (BRDU), is rapidly becoming the label of choice in cell proliferation studies [33]. Like 3H-TdR, BRDU is incorporated into the DNA of cells undergoing replicative DNA synthesis (i.e., S phase). Advantages of BRDU over 3H-TdR include (1) no radiation hazard, (2) no need for radioactive containment and protective measures, (3) faster turnaround time (histoautoradiographic exposures require weeks to months), (4) less expense, and (5) improved cytological detail due to the absence of the overlying emulsion required in histoautoradiography. In a comparison study with 3H-TdR, BRDU was reported as the label of choice in cell proliferation studies [34]. 5-Bromo-2'-deoxyuridine immunostaining has been successfully performed using combinations of different fixatives, embedding mixtures, and section thickness.
Another technique developed for identification of proliferating cells utilizes antibodies to nuclear antigens in proliferating cells. Several monoclonal antibodies have been developed that preferentially label proliferating cells, such as Ki67. A major drawback to these antibodies is that they can be used only on fresh frozen sections.
Recently, a monoclonal antibody to a 36-kD protein, proliferating cell nuclear antigen (PCNA, also called cyclin), has been shown to be capable of identifying proliferating cells in vitro as well as in alcohol-fixed or formalin-fixed, paraffin-embedded tissue. The PCNA/cyclin is an auxiliary protein of DNA polymerase-δ and plays a key role in the initiation of cell proliferation. It is a highly conserved protein that is expressed during the late G1/S phases of the cell cycle and is, therefore, correlated with the cell proliferative state [35]. The immunohistochemical technique for PCNA can be performed on any tissue, not just from animals previously administered a DNA precursor label (e.g., 3H-TdR or BRDU). Because formalin-fixed tissues can be used, archival formalin-fixed, paraffin-embedded material can now be evaluated for cell proliferation. This permits retrospective examination of tissues from previously conducted studies.
6. Routine Immunohistochemistry
Immunohistochemical methods utilize the specificity of antibodies as diagnostic reagents for the direct visualization of a variety of cell- and tissue-bound antigens including enzymes, oncodevelopmental antigens, tissue-specific proteins, immunoglobulins, polypeptide and steroid hormones, bacterial and viral antigens, and the unique biochemistry of the 10-nm intermediate filaments specific for general cell categories and tumor types. These highly specific methods allow the pathologist to take into consideration the function, secretion, and physiology of a cell in determining the tumor cell type. Of the different immunohistochemical methods, immunoperoxidase has developed into the most commonly used technique. The most sensitive immunoperoxidase procedure for localizing a variety of histologically significant antigens and other markers employs primary antibody, biotinylated secondary antibody, and a preformed avidin-biotinylated horseradish peroxidase complex, and has been termed the ABC technique. The ABC technique has been found superior to other methods; some of its uses include
Measurement of antigen concentration on a single cell
Amplification of antibody titers
Radio-, enzyme, and fluorescent immunoassays
Coupling of antibodies and antigens to agarose
Immunohistochemical staining
Multiple labels in tissues
Purification of cell surface antigens
Localization of hormone binding sites
Examination of membrane vesicle orientation
Cytofluorometric separation of cells
Nitrocellulose and nylon transfer blot detection
In situ hybridization and blot techniques with biotinylated nucleotides
Genetic mapping
Hybridoma screening
7. Electron Microscopy
The electron microscope (EM) is a tool utilized to extend the information gained through light microscopy. It can provide unique and valuable information suggesting actual mechanisms of toxicity in local tissue and intracellular sites. It is generally accepted that EM should not be performed unless there are specific reasons to suspect the presence of significant ultrastructural abnormalities. The electron microscopist should have some knowledge of the light microscopic pathology in general, and specifically, should know the histopathological findings in each sample being evaluated. Light and ultrastructural abnormalities are usually present in a graded continuum. In a toxicology study setting, significant ultrastructural pathology is only rarely seen in a tissue or organ in the absence of some type of corresponding light microscopic pathology. There are really only two basic reasons for performing electron microscopy in association with a toxicology study. The first is to characterize or to confirm the nature of a given lesion as observed by light microscopy. Second, EM frequently provides important assistance in establishing dose levels at which there is no compound effect. Although in most cases light microscopic evaluation is still the more cost-effective method for establishing no-effect levels, in special cases it may be important to document the absence of compound effects by using EM.
REFERENCES
1. B.J. Payne, H.B. Lewis, T.E. Murchison, and E.A. Hart, Hematology of laboratory animals, in Handbook of Laboratory Animal Science (E.C. Melby, Jr. and N.H. Altman, eds.), CRC Press, Boca Raton, FL, 1976, pp. 382-461.
2. K. Bickhardt, D. Buttner, U. Muschen, and H. Plonait, Influence of bleeding procedures and some experimental conditions on stress-dependent blood constituents of laboratory rats, Lab. Anim. 17:161-165 (1983).
3. Report of the AVMA panel on euthanasia, J. Am. Vet. Med. Assoc. 188:252-268 (1986).
4. W.H. Coulter, High speed automatic blood cell counter and cell size analyzer, Proc. Natl. Elect. Conf. 12:1034 (1956).
5. T.S. Sellers and J.C. Bloom, Hematologic evaluation of laboratory animals, Lab. Anim. 11(6):43-51 (1982).
6. J.H. Sanderson and C.E. Phillips, An Atlas of Laboratory Animal Haematology, Clarendon Press, Oxford, 1981.
7. R.M. Hardy, Hypoadrenal gland disease, in Textbook of Veterinary Internal Medicine (S.J. Ettinger and E.C. Feldman, eds.), Saunders, Philadelphia, 1995, p. 1584.
8. Z.Z. Zawidzka, Hematologic evaluation, in Handbook of In Vivo Toxicity Testing (D.L. Arnold et al., eds.), Academic, San Diego, CA, 1990, pp. 463-508.
9. G.G. Klee, Performance goals for internal quality control of multichannel hematology analyzers, Clin. Lab. Haematol. 1:65 (1990).
10. M. Dumoulin-Lagrange and C. Capelle, Evaluation of automated platelet counters for the enumeration and sizing of platelets in the diagnosis and management of hemostatic problems, Semin. Thromb. Hemost. 9:235-244 (1983).
11. J.D. Bessman, L.F. Williams, and P.R. Gilmer, Mean platelet volume: The inverse relation of platelet size and count in normal subjects and an artifact of other particles, Am. J. Clin. Pathol. 76:289 (1981).
12. L. Corash, Platelet sizing: Techniques, biological significance, and clinical applications, Curr. Top. Hematol. 4:99-122 (1983).
13. R.J. Prost-Dvojakovic, Study of platelet volumes and diameter in 11 mammals, in Platelets: Recent Advances in Basic Research and Clinical Aspects (O.N. Ulutin, ed.), American Elsevier, New York, 1975, p. 30.
14. C.B. Thompson, Size dependent platelet subpopulations: Relationship of platelet volume to ultrastructure enzyme activity and function, Brit. J. Haematol. 50:509 (1982).
15. K.A. Evelyn and H.T. Malloy, Micro-determination of oxyhemoglobin, methemoglobin and sulfhemoglobin in a single sample of blood, J. Biol. Chem. 126:655 (1938).
16. D.A. Clark and M. de la Garza, Species differences in methemoglobin levels produced by administration of monomethylhydrazine, Proc. Soc. Exp. Biol. Med. 125:912-916 (1967).
17. N. Jain, Schalm’s Veterinary Hematology, 4th ed., Lea and Febiger, Philadelphia (1986).
18. T.P. McDonald, Assay and site of production of thrombopoietin, Brit. J. Haematol. 49:493 (1981).
19. T.T. Odell, Megakaryopoiesis and its response to stimulated suppression, in Platelets: Production, Function, Transfusion and Storage (M.G. Baldini and S. Ebbe, eds.), Grune and Stratton, New York, 1974, pp. 11-20.
20. W.J. Dodds, Platelet function in animals: Species specificities, in Platelets: A Multidisciplinary Approach (G. DeGaetano and S. Garattini, eds.), Raven, New York, 1978, pp. 45-59.
21. W.J. Dodds, Blood coagulation: Hemostasis and thrombosis, in Handbook of Laboratory Animal Science (E.C. Melby, Jr. and N.H. Altman, eds.), CRC Press, Boca Raton, FL, 1974, Vol. 2, pp. 85-116.
22. J.R. Duncan and K.W. Prasse, Veterinary Clinical Medicine-Clinical Pathology, 2d ed., Iowa State University Press, Ames, IA, 1986, pp. 1-285.
23. D.S. Young, L.C. Pestaner, and V. Gibberman, Effects of drugs on clinical laboratory tests, Clin. Chem. 21(5):1-431 (1975).
24. G.S. Travlos, Frequency and relationships of clinical chemistry and liver and kidney histopathology findings in 13-week toxicity studies in rats, Toxicology 107:17-29 (1996).
25. D.W. Moss and A.R. Henderson, Enzymes, in Tietz Textbook of Clinical Chemistry, 2d ed. (C.A. Burtis and E.R. Ashwood, eds.), Saunders, Philadelphia, 1994.
26. Committee on Enzymes of The Scandinavian Society for Clinical Chemistry and Clinical Physiology, Recommended method for the determination of creatine kinase in blood modified by the inclusion of EDTA, Scand. J. Clin. Lab. Invest. 39:1-5 (1979).
27. A. Whelton, A.J. Watson, and R.C. Rock, Nitrogen metabolites and renal function, in Tietz Textbook of Clinical Chemistry, 2d ed. (C.A. Burtis and E.R. Ashwood, eds.), Saunders, Philadelphia, 1994.
28. S.H. Soldin, L. Henderson, and J.G. Hill, The effect of bilirubin and ketones on reaction rate methods for the measurement of creatinine, Clin. Biochem. 11:82-86 (1978).
29. M.J. Welch, A. Cohen, and H.S. Hertz, Determination of serum creatinine by isotope dilution mass spectrometry as a candidate definitive method, Anal. Chem. 58:1681-1685 (1986).
30. L.I. Kleinman and J.M. Lorenz, Physiology and pathophysiology of body water and electrolytes, in Clinical Chemistry: Theory, Analysis and Correlation (L.A. Kaplan and A.J. Pesce, eds.), Mosby, St. Louis, MO, 1989.
31. N.L. Anderson, R.P. Tracey, and N.G. Anderson, High resolution electrophoretic mapping of human plasma proteins, in The Plasma Proteins, 2d ed., Vol. 4 (C. Putnam, ed.), Academic, New York, 1984.
32. W.M. Haschek and C.G. Rousseaux (eds.), Handbook of Toxicologic Pathology, Academic, San Diego, CA, 1991.
33. S.R. Eldridge, L.F. Tillbury, T.L. Goldsworthy, and B.E. Butterworth, Measurement of chemically-induced cell proliferation in rodent liver and kidney: A comparison of 5-bromo-2’-deoxyuridine and [3H]thymidine administration by injection or osmotic pump, Carcinogenesis 12:2245-2251 (1990).
34. T.L. Goldsworthy, Cell Proliferation and Chemical Carcinogenesis, National Institute of Environmental Health Sciences (NIEHS) Workshop, Research Triangle Park, NC, 1992.
35. P. Galand and C. Degraef, Cyclin/PCNA immunostaining as an alternative to tritiated thymidine pulse labelling for marking S phase cells in paraffin sections from animal and human tissues, Cell Tissue Kinet. 22(5):383-392 (1989).
Assessment of Laboratories for Good Laboratory Practice Compliance
Linda J. Frederick
Abbott Laboratories, Abbott Park, Illinois
I. INTRODUCTION
Nonclinical laboratory studies intended for submission to a government agency must be conducted in compliance with the Good Laboratory Practice (GLP) Regulations and Standards [1-3]. These regulations are a result of U.S. Food and Drug Administration (FDA) and U.S. Environmental Protection Agency (EPA) reviews of the pharmaceutical and chemical industry practices in the mid-1970s. Deficiencies were noted at some firms that seriously affected the integrity of reported data. There were cases of falsified laboratory work, undocumented replacement of animals that died on test, fabricated test results, and excluded results if they were unfavorable to the test article. As a result of these findings and subsequent congressional action, the FDA GLPs were implemented in June of 1979. A facility that is GLP compliant assures the quality and integrity of its data and is held to the strict standards of the regulating agency. Foreign countries have recognized the U.S. GLP regulations as a universal standard. While several countries and European organizations have established their own GLP regulations [4-7], they differ very minimally, if at all, from the intent of the U.S. GLPs.
Preclinical safety studies are required for all regulatory submissions and are essential in laying the groundwork for clinical trials. Selecting a facility that can conduct such pivotal and crucial work requires careful deliberation. At a minimum, the facility should be inspected and documentation should be reviewed. It is best to plan this evaluation process well in advance of signing a contract and placing a study at the contract research laboratory. In this way, deficiencies noted during your inspection can be discussed with the facility management and corrected prior to beginning a preclinical toxicology study. In the event that you decide that the laboratory does not adequately meet your needs, time still exists for you to continue your evaluations of other available laboratories.
II. BASICS OF GOOD LABORATORY PRACTICES
A. Provisions and Compliance
There are 144 provisions in the GLPs that need to be satisfied to assure compliance. Of course, not all of these are of equal importance in terms of accepting or rejecting a facility. An easy way to begin your evaluation of contract research laboratories is to develop a two-tiered approach. First, develop a checklist that covers the most basic requirements of the GLPs (Table 1). This can be used as a screening document to determine if the laboratory has the basic elements of the GLP program in place. If the laboratory answers these core questions satisfactorily, then they would warrant a facility inspection to determine their exact level of compliance. On the other hand, if the answers given demonstrate a lack of knowledge or understanding of the basic GLP requirements, you could eliminate that laboratory from further consideration.

Table 1 Contract Research Laboratory Screening Checklist

1. What types of services do you provide?
2. Are you accredited, if so, by whom?
3. Which regulations do you follow?
4. Personnel records:
   A. Do you have training records for employees?
   B. Do employees have job descriptions?
   C. Are CVs present for all personnel?
   D. Do you have organizational charts?
5. Do you have a quality assurance unit and what functions do they perform?
6. Facility design:
   A. Are there separate areas for animal care, housing, and supplies?
   B. Is there a secured area for test article receipt and storage?
   C. Do you have archives?
7. Facility history:
   A. How many employees at this facility?
   B. How long has the company been in business?
   C. How large is the facility?
8. Have you ever been inspected by the EPA/FDA? 483 issued/Warning letter?
9. Do you have calibration/maintenance/repair records for equipment?
10. Do you have standard operating procedures and a historical file?
11. Do you maintain a master schedule?
12. Are your data acquisition systems/instruments on-line or are data manually recorded?
B. Organization and Personnel

Your first concern would be the organization and personnel. Are there adequate staff to conduct the nonclinical laboratory study? What is the reporting structure? This question is especially critical when reviewing the quality assurance unit (QAU). The QAU must be separate and distinct from the personnel conducting the study to avoid any conflict of interest. It is not permissible for quality assurance personnel to work on the studies that they audit since their objectivity could be compromised. Curricula vitae (CVs) should be available for all staff as well as training records and job descriptions.

The facility needs to be designed so that there is a degree of separation that will prevent any function or activity from having an adverse effect on the study. There should be separate animal care, housing, and supply areas. Test and control article storage should be adequate to prevent contamination or mixup of compounds. Separate laboratory space should be provided for the performance of routine and specialized procedures. An archive should be established for storage of study specimens and data.

The facility history is also important in this initial interview. Determine how long the laboratory has been in business and if they are accredited by any organizations. Establish the size of the facility, number of employees, and what government regulations, if any, they operate under. This is also an appropriate time to ask if they have ever been inspected by a government agency and, if yes, what the outcome of that inspection was. If you determine that an FD483 was issued, which is a form the FDA uses to list objectionable practices, this information can always be obtained by a request through the Freedom of Information Act. Usually the contract laboratory will supply you with that documentation providing they have completed their responses to the 483.
C. Documentation
Facility documentation is another critical area that should be examined. Every testing facility should have written standard operating procedures (SOPs), approved by management, that are adequate to ensure the quality and integrity of the data generated in the course of a study. The SOPs should discuss methods in sufficient detail to assure consistency and reconstruction/reproducibility of the nonclinical laboratory study. Furthermore, a historical file of SOPs should exist, and all revisions and their corresponding dates should also be maintained. The QAU should generate a “master schedule sheet” of all nonclinical laboratory studies conducted at the testing facility indexed by test article. This document, or a copy purged of proprietary information, should be made available to study sponsors. All equipment used for generation, measurement, or assessment of data should have calibration, maintenance, and repair records.
D. Data Acquisition

One final item to discuss during this assessment is the method of data acquisition for instruments and studies. If these operations are computerized, validation packages must be present for each computer system. A validation package should contain documentation confirming that a computer system does what it purports to do. Items that are typically present in a validation package include but are not limited to:

1. Approvals signature page of management and users
2. Systems requirements: describes what the completed system will do
3. System design specification: modular layout, portion of database design, general screen formats
4. Module/functional design specifications: logic flow, interfaces, database elements, algorithms, input and output
5. Source code listings (or references to)
6. Module test results
7. System test plan and system test log: integration of modules into an operable and testable environment
8. Validation plan, test results, and test results verification
9. Retrospective evaluation documentation
10. System maintenance documentation: hardware and software maintenance manuals, system backup/recovery records
11. User documentation: manuals, SOPs, training records
12. Change control procedure: how changes to the system will be documented and approved
13. Security documentation: physical and logical security for software, data, and hardware
14. Record retention: documents to be retained during the operation and maintenance phase of the system, and their storage location and retention schedule
15. Disaster recovery considerations

This preliminary facility screening is accomplished by asking 12 core questions using the telephone, fax, or mail. The purpose of this abbreviated method is to allow you to quickly establish major deficiencies at a contract research laboratory and determine if a visit to the facility for an in-depth inspection is justified.
III. CONTRACT LABORATORIES: FACILITY INSPECTION
To perform a facility inspection you need a working knowledge or at least some familiarity with the GLP regulations. If your facility has a quality assurance unit it would be advisable to involve them in the contract laboratory inspection. Most QA units are routinely responsible for evaluating contract research laboratories and vendors. Their familiarity with the GLPs makes them experts in assessing other companies’ levels of compliance. Your role, if you are the sponsor’s representative for a preclinical study, is to evaluate the scientific competency of the facility. Ideally, the sponsor’s “study monitor” and a representative from the QAU should travel to the contract laboratory together and conduct a joint inspection. Using this team approach, the scientific as well as the regulatory compliance aspects of the study can be thoroughly reviewed in a minimum amount of time. Whether performing this GLP capability inspection as a team or alone, it is best to prepare an agenda that should include a facility tour, documentation review, and a close-out meeting with management to discuss any findings. A preplacement checklist can be developed by reviewing the various subparts of the GLPs (Table 2).
A. Organization and Personnel
Subpart B of the regulations lists several key personnel along with a description of their duties and necessary records. Request organizational charts for the contract laboratory. This is an excellent way to assess the number of employees, including study directors, and determine which CVs, training records, and job descriptions you will review. Speak with the quality assurance unit at the facility. Determine that a master schedule is maintained and ask to see a copy. The QAU should perform regular study inspections of all critical study activities to assure study integrity, audit all final reports, and issue a quality assurance statement for those reports. In addition, the QAU should have a set of standard operating procedures detailing their functions.

Table 2 Contract Research Laboratory Preplacement Checklist

Facility name, address, and phone number:

I. Subpart B, 21 CFR Part 58
   A. Organization and personnel:
      1. Organizational chart(s)
      2. Adequate staff
      3. Personnel documentation review:
         Employee Name | Job Description | CV | Training Records
      4. Quality Assurance Unit:
         a. Master schedule
         b. Regular study inspections
         c. Final report audit
         d. QA statement for each final report
         e. Standard operating procedures

II. Subpart C, 21 CFR Part 58
   A. Facilities:
      1. Building diagram(s), number of buildings, type
      2. Sufficient space to conduct activities
      3. Pest control program (request SOP to review)
      4. Water analysis program (request SOP to review)
      5. Backup power source if electrical failure
      6. Animal facilities:
         a. Monitored for temperature, humidity, and light cycles
         b. Air filtration and changes per room per hour
         c. Feed and bedding storage area
         d. Veterinary care
      7. Cage, rack, and equipment washing area:
         a. Calibration and monitoring of water temperature gauges
         b. Maintenance records
         c. Cleaning method validation
         d. Quality control monitoring program
   B. Test and control article facility:
      1. Separation of receipt, storage, and handling activities
      2. Procedure for receipt, tracking, and use of test and control articles
      3. Formulation methods available
      4. Weighing equipment in calibration/standardization program and maintenance logs
      5. Adequate cleaning of formulating equipment and documentation
      6. Area secured
      7. Area monitored for temperature, humidity, etc.
   C. Laboratories (histology, clinical chemistry, and drug analysis):
      1. Proper labeling of reagents, chemicals, and solutions, i.e., identity, titer or concentration, storage requirements, and expiration date. No outdated bottles.
      2. Equipment in calibration/standardization program, and maintenance logs
      3. Sample log-in, identification, and storage adequate
      4. SOPs available and adequate
      5. Methods validated and quality control program

III. Subpart D, 21 CFR Part 58
   A. Equipment:
      1. Adequate SOPs: discuss methods, materials, and schedules for inspections, cleaning, maintenance, testing, calibration, and/or standardization, and identify person responsible for each above operation.
      2. Records of all inspections, maintenance, testing, calibration, and/or standardization

IV. Subpart E, 21 CFR Part 58
   A. Facility operations:
      1. Twelve mandatory SOPs
         a. Animal room preparation
         b. Animal care
         c. Receipt, identification, storage, handling, mixing, and method of sampling of the test and control articles
         d. Test system observations
         e. Laboratory tests
         f. Handling of animals found moribund or dead during a study
         g. Necropsy of animals or postmortem examination of animals
         h. Collection and identification of specimens
         i. Histopathology
         j. Data handling, storage, and retrieval
         k. Maintenance and calibration of equipment
         l. Transfer, proper placement, and identification of animals
      2. SOP manuals available in lab areas

V. Subpart F, 21 CFR Part 58
   A. Test and control articles:
      1. Reserve samples from each batch of test and control article must be retained if the study is longer than 4 wk in duration
      2. Determine who will perform concentration and/or homogeneity analysis on the test or control article formulations. This will also include stability of formulations
         a. If it’s the sponsor, review the contract laboratory’s procedure for shipping samples
         b. If it’s the contract laboratory, determine that validated analytical methods are present

VI. Subpart J, 21 CFR Part 58
   A. Records and reports:
      1. Archives
         a. Orderly storage of raw data, specimens, documentation, reports, and protocols
         b. Prompt retrieval of above items
         c. Fire protection/alarm
         d. Monitored for temperature and humidity
         e. Secured area
         f. Limited access to archive
         g. Log-in and check-out procedure for personnel and raw data
         h. Data retention time

VII. Automated data collection
   A. Computer systems:
      1. Identifies individual entering data, time, and date
      2. Audit trail for data changes
      3. Purchased system(s) or developed in-house
      4. Validation performed: hardware and software
      5. User training records
      6. System security, e.g., password protected
      7. SOPs
         a. Change control
         b. Disaster recovery
         c. Backup and storage of electronic media
         d. System operation: user’s manual
         e. Validation
B. Facilities

Subpart C discusses facility requirements for GLP studies. Building diagrams are an efficient means of determining the number of structures and their sizes and designs. From the plans, select which area(s) to tour. During the facility walkthrough, concentrate on the separation of activities and the overall cleanliness of the facility. Inquire as to the types of pest control and water analysis programs in effect and request those SOPs and records for review. Examine the animal rooms and determine if they are monitored and alarmed for temperature, humidity, and light cycles. How many air changes occur per hour and is the air filtered and recirculated? If electrical power can be interrupted, determine what is the back-up power source. Inspect the cage, rack, and equipment washing area for calibration and monitoring of water temperature gauges and maintenance records. Are cleaning methods validated and is there a quality control monitoring program? Check on the availability of a veterinarian for animal care and routine husbandry issues.

Test and control article receipt, storage, and handling areas should be next on the tour. Focus on separation of activities, with contamination being the utmost concern. Interview the dispensing room personnel and determine the procedure for receipt, tracking, and use of test articles. All weighing equipment should be in a calibration/standardization program and have documented maintenance and repair records. Examine mixing equipment for cleanliness and corresponding documentation of such cleaning activities. Determine that the area is secured and monitored appropriately for temperature or any other parameters that could compromise the test or control articles.

Be sure to include a visit to the laboratory operation areas, e.g., histology, clinical chemistry, hematology, drug analysis, etc. These areas should be checked for proper labeling of reagents, chemicals, and solutions, in addition to equipment concerns regarding calibration, standardization, maintenance, and repair. Sample log-in, identification, and storage should be reviewed with the various laboratory representatives. SOPs should exist for all of these activities and testing methods.
C. Equipment Documentation
Subpart D discusses equipment. If the equipment is used in the generation, measurement, or assessment of data or used for facility environmental control, then the requirements spelled out in this subpart apply. Standard operating procedures should discuss in detail the methods, materials, and schedules to be used in the routine inspections, cleaning, maintenance, testing, calibration, and/or standardization of equipment. Furthermore, the SOPs should designate the person responsible for the performance of each operation and, when appropriate, remedial action to be taken in the event of failure or malfunction. Written records should be maintained of all inspections, maintenance, testing, calibration, and/or standardization operations. Spot check some equipment logs to determine the level of detail and GLP compliance.
D. Standard Operating Procedures

Subpart E is dedicated to facilities operation. By listing the 12 activities that minimally require operating procedures, you have a guideline to follow when requesting SOPs for the documentation review portion of your inspection. When you request an SOP index, check that the following procedures are present:

1. Animal room preparation
2. Animal care
3. Receipt, identification, storage, handling, mixing, and method of sampling of the test and control articles
4. Test system observations (e.g., physical examination, food consumption, morbidity)
5. Laboratory tests (e.g., clinical chemistry and hematology parameters)
6. Handling of animals found moribund or dead during a study
7. Necropsy of animals or postmortem examination of animals
8. Collection and identification of specimens (e.g., blood, urine, tissues)
9. Histopathology
10. Data handling, storage, and retrieval
11. Maintenance and calibration of equipment
12. Transfer, proper placement, and identification of animals

The laboratory areas should have laboratory manuals and SOPs relative to the procedures being performed immediately available. Another important topic discussed under this subpart is that of labeling for reagents and solutions. As you tour the laboratory areas, spot check the labels to assure they indicate identity, titer or concentration, storage requirements, and expiration date. Deteriorated or outdated reagents and solutions should not be present and should not be used.
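For readers who keep their inspection notes electronically, the SOP index check described in this section can be sketched in a few lines of Python. This is only an illustration, not part of the GLP regulations: the short topic labels and the function name `missing_sop_topics` are hypothetical, and the sketch assumes the inspector has already mapped each SOP title in the laboratory's index onto one of the 12 topic labels.

```python
# Hypothetical short labels for the 12 activities listed in this section.
REQUIRED_SOP_TOPICS = [
    "animal room preparation",
    "animal care",
    "test and control article handling",
    "test system observations",
    "laboratory tests",
    "moribund or dead animals",
    "necropsy",
    "specimen collection and identification",
    "histopathology",
    "data handling, storage, and retrieval",
    "equipment maintenance and calibration",
    "animal transfer, placement, and identification",
]


def missing_sop_topics(index_topics):
    """Return the required topics that do not appear in the SOP index.

    `index_topics` is the list of topic labels the inspector assigned
    to the SOPs found in the laboratory's index (an assumption of this
    sketch; real SOP titles will need manual mapping).
    """
    present = {topic.strip().lower() for topic in index_topics}
    return [topic for topic in REQUIRED_SOP_TOPICS if topic not in present]


if __name__ == "__main__":
    found = [t for t in REQUIRED_SOP_TOPICS if t != "histopathology"]
    print(missing_sop_topics(found))
```

An empty result means every one of the 12 activities has at least one SOP in the index; it says nothing about whether those SOPs are adequate, which still requires reading them.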
E. Test and Control Articles

Subpart F discusses the test and control article characterization, which is typically a concern of the sponsor. Whether this activity is being done at the sponsor’s facility or performed by a contract laboratory, it is ultimately the study director’s responsibility to assure the requirements set forth in subpart F of the GLPs are followed. Identity, strength, purity, and composition that will appropriately identify the test or control article need to be determined and documented for each lot or batch of test material used on a study. Methods of synthesis, fabrication, or derivation of test and control articles must also be documented. The stability of the test and control articles under the conditions of the study must also be determined. For studies longer than 4 wk in duration, reserve samples from each batch of test and control article must be retained. For test or control articles mixed with a carrier (vehicle), appropriate analytical methods must be conducted to determine the uniformity of the mixture and to determine periodically the concentration of the test or control article in the mixture. The stability of the test or control article in the mixture will also need to be determined. If the contract laboratory is to formulate the test or control article in a carrier, establish the sampling interval and when shipments of aliquots to the sponsor or their in-house analytical laboratory are to take place. Test article formulation and analysis is a critical activity to any study. Any time initially spent defining who is taking responsibility for what activities will guarantee proper compliance and adherence to the regulations.
F. Study Protocol

Subpart G details the study protocol requirements and how to conduct a nonclinical laboratory study. Each study must have an approved written protocol that contains the following 12 points, as applicable:
1. A descriptive title and statement of the purpose of the study.
2. Identification of the test and control articles by name, chemical abstract number, or code number.
3. The name of the sponsor and the name and address of the testing facility at which the study is being conducted.
4. The number, body weight range, sex, source of supply, species, strain, substrain, and age of the test system.
5. The procedure for identification of the test system (method for individual animal identification).
6. A description of the experimental design, including the methods for the control of bias (e.g., use of a control group, randomization of test system).
7. A description and/or identification of the diet used in the study as well as solvents, emulsifiers, and/or other materials used to solubilize or suspend the test or control articles before mixing with the carrier. The description shall include specifications for acceptable levels of contaminants that are reasonably expected to be present in the dietary materials and are known to be capable of interfering with the purpose or conduct of the study if present at levels greater than established by the specifications. (Example: If you are feeding a diet containing an antibiotic and your test article is an antimicrobial drug that would be adversely affected by certain levels of the antibiotic in the test system, you would need to identify these concerns in the protocol.)
8. Each dosage level, expressed in milligrams per kilogram of body weight or other appropriate units, of the test or control article to be administered and the method and frequency of administration.
9. The type and frequency of tests, analyses, and measurements to be made.
10. The records to be maintained.
11. The date of approval of the protocol by the sponsor and the dated signature of the study director.
12. A statement of the proposed statistical methods to be used.
Keep in mind that any changes or revisions of an approved protocol and the reasons for those changes must be documented, signed by the study director, dated, and maintained with the protocol. During the document review portion of the contract laboratory inspection, request a protocol template or a protocol purged of proprietary information for your review. The sponsor's study monitor will be involved in preparing and approving the study protocol along with the study director at the facility, but this cursory review will allow you to determine the level of completeness of the facility's standard protocol. The overall conduct of a nonclinical laboratory study is the responsibility of the study director, but you, as a sponsor representative, should be well aware of the requirements and have the ultimate responsibility of assuring compliance.

The study should be conducted in accordance with the protocol. Specimens, such as tissue, blood, and urine, need to be identified by test system, study, nature, and date of collection. Records of gross findings for a specimen from postmortem observations should be available to a pathologist when examining that specimen histopathologically. And finally, all data generated during the conduct of the study, except those that are generated by automated data collection systems, shall be recorded directly, promptly, and legibly in ink. These data entries need to be dated and signed or initialed by the person entering the data. Any changes to entries shall be made so as not to obscure the original entry, shall indicate the reason for such change, and shall be dated and signed or identified at the time of the change. The study director and the contract laboratory quality assurance unit should inspect and monitor each study to adequately assure compliance with this subpart of the GLPs.
G. Final Report and Data Storage

Subpart J is the last subpart of the regulations that should be referenced during this facility inspection and describes records and reports. Of particular interest in this section are the 14 requirements for every final report, the storage and retrieval of records and data, and retention of records. This is the responsibility of the study director and the QAU, but as the sponsor representative, you will review and approve the draft final report and should inspect the storage facilities during this preplacement visit.

An archive should be present to assure orderly storage and expedient retrieval of all raw data, documentation, protocols, specimens, and reports. Check that storage conditions minimize deterioration of the documents and specimens. Is the area monitored for temperature, humidity, and fire? Is it secured and alarmed? Only authorized personnel should be allowed access to the storage area(s). Determine who is in charge of the archives and review the log-in and check-out procedures for data and specimens.

Inquire as to the retention time for data and specimens. The GLPs state a period of at least 2 yr following the date on which an application for a research or marketing permit is approved by the FDA. Studies supporting investigational new drug applications (INDs) should be retained for a period of at least 5 yr following the date on which the results of the nonclinical laboratory study are submitted to the FDA. The EPA regulations are quite similar under the GLPs with respect to retention times. Two important differences are the Federal Insecticide, Fungicide and Rodenticide Act [2], which requires a retention time for the period the sponsor holds the research or marketing permit, and the Toxic Substances Control Act [3], which requires a period of 10 yr following the effective date of the final test rule.

Each study report must address the following requirements:
1. Name and address of the facility performing the study and the dates on which the study was initiated (i.e., the date the protocol was signed by the study director) and completed (i.e., the date the final report was signed by the study director).
2. Objectives and procedures stated in the approved protocol, including any changes in the original protocol.
3. Statistical methods employed for analyzing the data.
4. The test and control articles identified by name, chemical abstracts number or code number, strength, purity, and composition or other appropriate characteristics.
5. Stability of the test and control articles under the conditions of administration.
6. A description of the methods used.
7. A description of the test system used. Where applicable, the final report shall include the number of animals used, sex, body weight range, source of supply, species, strain and substrain, age, and procedure used for identification.
8. A description of the dosage, dosage regimen, route of administration, and duration.
9. A description of all circumstances that may have affected the quality or integrity of the data.
10. The name of the study director, the names of other scientists or professionals, and the names of all supervisory personnel involved in the study.
11. A description of the transformations, calculations, or operations performed on the data, a summary and analysis of the data, and a statement of the conclusions drawn from the analysis.
12. The signed and dated reports of each of the individual scientists or other professionals involved in the study.
13. The locations where all specimens, raw data, and the final report are to be stored.
14. The statement prepared and signed by the quality assurance unit.

The final report must be signed and dated by the study director, and any corrections or additions must be in the form of an amendment by the study director.
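When comparing a laboratory's stated retention times against the periods summarized earlier in this section, it can help to work out the earliest date on which records could be released. The Python sketch below encodes only the periods as this chapter states them (2 yr, 5 yr, 10 yr, and the open-ended FIFRA period); the dictionary keys and function names are hypothetical, the periods are minimums, and the regulations themselves should be consulted before any record is actually discarded.

```python
from datetime import date
from typing import Optional

# Minimum retention periods in years, as summarized in this chapter.
# The keys are hypothetical labels chosen for this sketch.
RETENTION_YEARS = {
    "fda_approved_application": 2,  # counted from the FDA approval date
    "fda_ind_submission": 5,        # counted from the date results were submitted
    "epa_tsca": 10,                 # counted from the effective date of the final test rule
}


def add_years(start: date, years: int) -> date:
    """Add whole years to a date; Feb 29 rolls forward to Mar 1."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is Feb 29 and the target year is not a leap year;
        # round up so the record is never released early.
        return date(start.year + years, 3, 1)


def earliest_disposal_date(basis: str, trigger_date: date) -> Optional[date]:
    """Return the earliest date the minimum retention period ends.

    Returns None for "epa_fifra", where records are kept for as long
    as the sponsor holds the research or marketing permit.
    """
    if basis == "epa_fifra":
        return None
    return add_years(trigger_date, RETENTION_YEARS[basis])


if __name__ == "__main__":
    print(earliest_disposal_date("fda_ind_submission", date(1999, 6, 15)))
```

The trigger date differs for each basis (approval, submission, or rule effective date), so it is passed in explicitly and is not inferred from the study dates.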
H. Automated Data Collection Systems

The last important area to be discussed during this inspection deals with automated data collection systems. There is no specific subpart in the GLPs dedicated to computer systems, but the regulations have always provided for on-line data acquisition systems. Establish what systems exist at the contract laboratory and whether they are purchased from vendors or developed in-house by company personnel. Purchased systems rely heavily upon the vendor to provide software validation and development life cycle documentation [8]. All purchased computer systems should have had a vendor audit performed by the contract laboratory QAU, or other appropriate personnel, to assure the vendor has adequately tested and documented the system's development. In-house developed systems are less of a risk in that the source code, the human-readable program text from which the running software is built, is available and therefore the system can always be supported by contract laboratory employees. Regardless, either type of system is required to identify the individual entering the data and to provide the time and date of the entries. An audit trail must exist for all changes made to electronically captured data that identifies the person making the change, old and new values, reason for the change, and the time and date of the change.

Validation is also required for all computer systems collecting raw data. It is not the purpose of this visit to assess the completeness or adequacy of the validation packages, but only to determine that they are present. System security is a major concern. Is the system protected by password? How many levels of security are in place to access the system program? How often are passwords changed? User training records should be available as well as user manuals to demonstrate formal training for all personnel operating the system. Standard operating procedures should exist for the following activities:
1. Change control: ongoing evaluation of systems operations and changes during the use of a system to determine when/if revalidation is necessary
2. Disaster recovery
3. Backup and storage of electronic media
4. System operation
5. Validation requirements
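To make the audit trail requirement in this section concrete, the following Python sketch shows one possible shape for such a record. It is an illustration only and does not describe any particular vendor's system: the class and field names are hypothetical. What it demonstrates is that the original entry carries the identity of the person and the time of entry, and that every change is appended with the person, old and new values, reason, and time, so that the original value is never obscured.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    """One change to a captured value (hypothetical structure)."""
    user: str
    timestamp: datetime
    old_value: str
    new_value: str
    reason: str


@dataclass
class AuditedValue:
    """A captured data value plus its append-only change history."""
    value: str
    entered_by: str
    entered_at: datetime
    history: List[AuditEntry] = field(default_factory=list)

    def change(self, new_value: str, user: str, reason: str) -> None:
        # A change without a stated reason is rejected outright.
        if not reason.strip():
            raise ValueError("a reason for the change is required")
        # Record who, when, old and new values, and why, then update.
        self.history.append(
            AuditEntry(user, datetime.now(timezone.utc),
                       self.value, new_value, reason)
        )
        self.value = new_value


if __name__ == "__main__":
    weight = AuditedValue("251 g", "jdoe",
                          datetime(2001, 1, 5, 9, 30, tzinfo=timezone.utc))
    weight.change("215 g", "asmith", "transposition error at entry")
    for entry in weight.history:
        print(entry.user, entry.old_value, "->", entry.new_value, entry.reason)
```

During an inspection, asking the laboratory to display the equivalent history for a changed value is a quick way to confirm that all five elements are captured.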
I. Close-Out Meeting
This concludes the preplacement inspection activities based upon the appropriate subparts of the GLP regulations and other relevant agency concerns. Begin compiling your notes and checklists as you prepare for the close-out meeting with laboratory management. Review the screening and preplacement checklists to assure you have covered all areas of interest. Ask any additional questions of the laboratory representatives and request any additional documents for evaluation. Take some time to organize your thoughts and impressions of the facility, personnel, and documentation before conducting your close-out meeting. Group your findings into major and minor categories and discussion items. You should expect that all major concerns will be addressed and corrected by the facility immediately. Minor observations should be taken seriously, but not viewed as critical issues that require instantaneous resolution. Finally, discussion items are observations that you have made during the course of your inspection that you feel obligated to mention, but that may or may not be acted upon at the discretion of the contract laboratory. The following example is an illustration of major, minor, and discussion items:

Major: Training records do not exist for J. Doe.
Minor: J. Doe has incomplete training records. No supervisory signature is present authorizing him to perform cage changing.
Discussion: Training records are punched and placed in a 3-ring binder. Some of the sheets need reinforcing as the holes are torn.
Begin your close-out meeting by mentioning the positive observations, such as very clean animal quarters, very helpful personnel, well written SOPs, etc. Thank the staff for their time and hospitality since successful completion of the inspection was a direct result of their efforts. Then discuss your inspection findings and suggest solutions if you are consulted by the laboratory for opinions. Prior to concluding this meeting, review what was discussed. Be especially certain to restate what will be corrected and the time frame agreed to by contract laboratory management.

A thorough inspection covering all of the critical issues has just been completed. Action plans are in place to resolve any major observations, or major issues were not adequately addressed and the laboratory does not meet GLP standards. It is now an easy task to make recommendations to your management as to which contract laboratories are GLP compliant and can conduct preclinical toxicology studies for regulatory submissions without incident.
REFERENCES
1. Food and Drug Administration (1987). Good Laboratory Practice for Nonclinical Laboratory Studies, 21 CFR Part 58. FDA, Rockville, MD.
2. Environmental Protection Agency (1989). Federal Insecticide, Fungicide and Rodenticide Act, Good Laboratory Practice Standards, 40 CFR Part 160. EPA, Washington, DC.
3. Environmental Protection Agency (1989). Toxic Substances Control Act; Good Laboratory Practice Standards, 40 CFR Part 792. EPA, Washington, DC.
4. Organisation for Economic Cooperation and Development (OECD) (1981). OECD Principles of Good Laboratory Practice, C(81)30, Annex 2.
5. Instruction of 31 May 1983 relative to good laboratory practice (B.P.L.) in the domain of experimental toxicology, Ministry of Social Affairs and National Solidarity, Secretariat of State for Health, France.
6. Pharmaceutical Affairs Bureau (1982). Notification No. 313, GLP Standard. Ministry of Health and Welfare, Japan.
7. United Kingdom Compliance Programme (1986). Good Laboratory Practice. Department of Health and Social Security.
8. Drug Information Association (1988). Computerized Data Systems for Nonclinical Safety Assessment. DIA, Maple Glen, PA.
12 Use of Transgenic Animals for the Assessment of Mutation and Cancer Robert Young and David Jacobson-Kram BioReliance Corporation, Rockville, Maryland
I. INTRODUCTION
The ability to create transgenic animals has given toxicologists powerful new tools with which to assess the safety of new products and materials and with which to better investigate mechanisms of toxicity. Initial transgenic constructs were created by microinjecting exogenous DNA into the female pronucleus of a zygote immediately following fertilization. This incredible feat provided the means to deliver an exogenous gene (transgene) to each and every DNA-bearing cell of an adult animal. This, of course, includes the germinal cells, thereby ensuring transmission of the transgene to subsequent generations. One shortcoming of this technology is that scientists cannot control the site of genomic integration or the number of transgene copies that integrate. Because positional effects can dramatically affect gene expression, and because the site of integration can also affect expression of endogenous genes, every line that is created will be somewhat different, even when the integration vector used is the same.
More recently, technology has been developed to target DNA sequences to specific sites in the genome. This revolutionary technique involves generating DNA vectors with sequences homologous to that of the target site. The vectors are transfected into embryonic stem cell lines, and cells that have integrated the transgene are isolated using a selectable marker. Clones that develop in the presence of the selection media are examined to identify those where the gene has truly integrated at the targeted site. Once a proper clone is identified, cells are mixed with those of a developing blastocyst. Selection of the appropriate chimeric offspring is usually performed using contrasting coat colors in the strains contributing the blastocyst and the embryonic stem cells. This technology has led to the development of murine strains where a specific gene has been knocked out, the primary endogenous DNA sequence having been interrupted by an exogenous DNA sequence. Clearly, this has provided a powerful tool with which to ask the question: What are the phenotypic effects of the loss of a particular gene? Not surprisingly, many genes turn out to be essential for development and some knock-outs fail to produce viable offspring. However, many do survive and have provided novel ways to study embryonic development, to investigate disease states, to develop new pharmaceuticals, to better understand cellular function, and of course to investigate toxicological responses.
This chapter summarizes the state of the science for two types of transgenic mouse systems: those used to evaluate gene mutations and those used to assess potential carcinogens.
II. TRANSGENIC MUTATION MODELS
Detection of mutations in laboratory animals has long been a goal in toxicology. There are a number of assays available for detecting chromosomal damage in vivo. There are also in vitro gene mutation assays that incorporate metabolic activation systems that mimic in vivo mammalian metabolism. Unfortunately, there has been no easy way to detect and measure gene mutations in vivo. Gene mutation assays using laboratory animals have traditionally been large, expensive, and cumbersome to perform or have been limited to a few genes detectable in a limited number of tissues. While assays have been available for detecting both somatic and germ cell mutations in vivo, these studies are seldom performed. Developments in molecular biology and the advent of transgenic animals have created a powerful new tool that permits the study of mutagenesis in vivo.
A. In Vivo Mutagenesis Before Transgenic Animals
An integral part of regulatory-driven safety assessment of new chemicals is evaluation of the mutagenic activity of the product. Regulatory guidelines for the evaluation of new products, including food additives [1], drugs [2], pesticides [3], industrial chemicals [4], and medical devices [5], all include requirements for the evaluation of genotoxicity. The various guidelines use several different assays in a battery designed to maximize the detection of genetic damage. The most common genetic toxicology battery used by regulatory agencies has included assays for gene mutation in bacteria and mammalian cells in culture and for chromosome damage in either mammalian cells in culture or in somatic tissue of rodents. On occasion, some products may require other short-term assays, such as for deoxyribonucleic acid (DNA) repair, DNA damage, or malignant cell transformation. In general, regulatory guidelines have not included in vivo gene mutation assays because of the lack of suitable, cost-effective models.
While chromosomal damage has been measured in rodents for years through the use of the micronucleus or in vivo chromosome aberration assay in laboratory mice and rats, there has been no easy method to measure gene mutations in vivo. Early pioneering work done by Russell and Major [6] and later by Fahrig [7] developed a method for detecting mutations in melanocytes of embryos of mice that are heterozygous for various coat-color genes. The resulting spots in the coats of offspring from mutant melanocytes gave rise to the name spot test. Drawbacks to this assay include the need for the drug to pass the placental barrier and the large number of animals needed with timed pregnancies. There is a trade-off between the age of the embryo and the size of the resulting spot. Using young embryos with few melanocytes gives large, easily detectable spots, but the assay then requires many embryos because of the small number of target cells per embryo. Using older embryos provides more target melanocytes, but the resulting spots become difficult to detect. The spot test, while elegant, has never had widespread usage.
T lymphocytes are a unique cellular tool that may be used to measure in vivo mutagenesis in the hypoxanthine phosphoribosyltransferase (HPRT) gene. These circulating blood cells can easily be recovered and stimulated to grow in vitro. In the presence of the purine analog 6-thioguanine (6-TG), wild-type cells with HPRT activity are killed by conversion of 6-TG to a toxic metabolite, while HPRT mutants survive. This assay has been described in humans [8], mice [9], and rats [10].
Difficulties include clonal expansion of mutated cells, which skews calculated mutant frequencies, and the limitation of the methodology to a single tissue type. Limited work on HPRT-based assays using other tissues, including lung, has been reported [11]. T lymphocyte assays have been used with notable success in human occupational and environmental monitoring situations but have never entered into regulatory guideline-driven testing schemes.
Another genetic marker used for in vivo mutagenesis is the Dlb-1 locus, a gene that controls expression of binding sites for the Dolichos biflorus agglutinin (DBA) lectin. Mutated intestinal stem cells and their progeny can be identified in mice heterozygous for this gene [12].
Detecting heritable germ cell damage has been one of the long-standing goals of toxicology. Somatic mutations leading to cancer remain within the individual and are self-limiting from the perspective of the population. Conversely, germinal cell mutations may be passed on to offspring and increase the genetic disease burden of the larger population. One of the few systems to detect in vivo germinal gene mutations is the biochemical-specific locus test. This assay uses changes in the electrophoretic mobility of endogenous proteins in offspring of treated parents as the end point of the assay [13,14]. Parental strains with isoenzymes of different electrophoretic mobility are needed so that loss or alteration in parental or hybrid protein is detectable on an electrophoresis gel. Parental animals are treated, mated, and the F1 offspring screened for isoenzyme electrophoretic pattern. Due to the expense and effort required to perform this type of assay, very few have ever been performed.
Other assays are available that measure in vivo germ cell damage at the chromosome level and not the gene level. The heritable translocation assay detects chromosome breakage in male germ cells that is transmissible to offspring [15]. Male offspring of treated males may inherit a pair of nonhomologous chromosomes involved in a reciprocal translocation, resulting in a semisterile mouse. While this assay detects heritable genetic effects, it does not detect heritable gene mutations. Another test available to measure genetic damage of germ cells is the dominant lethal assay [16]. This assay mates treated males with virgin females each week for up to 10 weeks. Females are sacrificed after mating and the number of living and dead implantations determined for each female. The dominant lethal assay measures damage to male germ cells through the spermatogenesis cycle but does not identify heritable genetic mutations.
Early in vitro genetic toxicology assays did not have the ability to metabolize test articles due to the lack of high levels of endogenous enzymes in target cells. The search for mammalian metabolic activation led to bringing metabolic enzymes, test article, and target cells together through the use of liver microsome enzyme preparations (S9 mix) or feeder cells or, conversely, introducing target cells into metabolically competent mammals.
One prescient group of researchers developed an assay called the host-mediated assay, which, in many ways, anticipated the use of transgenic laboratory animals to investigate biotransformation, systemic distribution of test articles, and mutagenesis in vivo. In the host-mediated assay, Salmonella typhimurium strains of the Ames bacterial mutation assay are implanted or injected IV into host animals pretreated with test article [17]. Animals are sacrificed and tester organisms aseptically recovered from the liver and other organs, such as the lungs, kidneys, spleen, or testes, and screened for mutations using conventional Ames assay plating methods. This unique assay integrates mammalian metabolic activation with a crude ability to evaluate tissue specificity for metabolic activation. It was not until 20 years later that molecular tools and techniques had developed sufficiently for the host-mediated assay to be taken to the next level of sophistication: mutation models using transgenic mice with intracellular target genes that are recoverable for screening in vitro.
B. The Road to Transgenic Animals for Detecting Mutations
Transgenic animal models for detecting mutations are the fusion of years of work in disparate areas, including microbial genetics, molecular genetics, cell biology, plasmid vector technology, developmental biology, and toxicology. Detection of mutations in endogenous genes of animals has long been fraught with difficulty, which led to the idea of introducing exogenous genes into animals. A number of laboratories successfully integrated shuttle vectors into mammalian cells in culture for studying mutagenesis. One of the first transgenic cell lines used for detecting in vitro mutagenesis was the AS52 cell line [18]. This cell line had the bacterial xanthine phosphoribosyltransferase (XPRT) gene introduced into a hypoxanthine phosphoribosyltransferase (HPRT)-deficient Chinese hamster ovary (CHO) cell. This transgenic construct, using the bacterial salvage pathway gene XPRT, was designed to give the AS52 cell improved survivability to deletion mutations over the CHO-K1-BH4 cell line used in the CHO/HPRT forward gene mutation assay.
Another avenue leading to in vivo mutagenesis and the introduction of transgenes into animals was work done in the 1980s with transient shuttle vectors. For example, work by Seidman [19] with SV40-based shuttle vectors that carried marker genes permitted detection of loss of function through a colorimetric end point. The shuttle vector could be used to transfect mammalian cells in culture, the cells could be treated, and the progeny plasmids harvested and introduced into indicator bacteria. Colonies carrying mutant plasmids could be enumerated, the mutant plasmids isolated, and the marker gene sequenced. Due to aggressive replication of the vector, host cells ultimately die, leading to the term transient vectors. There were a number of questions associated with this approach, including how close the vector chromatin was to host chromatin in terms of sensitivity to mutation and repair. This field of research was an important foundation for future transgenic systems with permanently integrated genes. In fact, Seidman in his article stated “Eventually, of course, the approaches will be supplanted by new technologies. . . . For example, it seems likely that transgenic mice will be constructed with recoverable marker genes.”
C. Evolution of Transgenic Mutation Models
The interest in transgenic animals for mutation research was given support by a cycle of reassessment that swept the field of genetic toxicology in the late 1980s. An early and continuing goal of genetic toxicology is to use in vitro mutagenicity tests to predict human carcinogenicity of a chemical. Tennant et al. [20] analyzed data and found that the correlation between four short-term mutagenicity tests and traditional 2-yr mouse and rat cancer bioassays was too low for an accurate prediction of human carcinogenicity. The same data also pointed out the relative weakness of one rodent bioassay to predict the results of the other system. The debate, heated at times, initiated reassessment of the design of gene toxicology assays, the optimal selection of gene toxicology assays within regulatory guideline-required test batteries, and the use of gene toxicology data in risk assessment. Another outcome of the debate was recognition of the need for a new generation of mutagenicity assays that combined the power of short-term in vitro tests with the power of a whole-animal test system.
Since endogenous genes in mammals proved so refractory to detecting mutations, interest turned to creating animals with exogenous genes added that were designed for easy detection of mutations. The approach was to create transgenic mice carrying a target gene in a shuttle vector integrated into the genomic DNA of every cell. The shuttle vector would carry a target gene that records genetic damage in vivo. Millions of copies of the target gene-carrying shuttle vector could be recovered from genomic DNA, "played back" through an in vitro detection system, and mutant forms of the gene detected and enumerated.
Selection of the vector and target gene was critical to developing a successful system. One initial consideration was whether expression and detection of the mutant phenotype should be in situ in mammalian cells or via a recoverable shuttle vector isolated from host DNA. Detection in situ would require expression of a foreign gene, the gene product of which may have unknown effects on the host. In addition, detection of mutations of an integrated transgene may require either in vivo or in vitro proliferation of the host mammalian cell, thereby lessening the utility of the system. These considerations led researchers toward the use of recoverable shuttle vector systems. There were reports that nonintegrated, transfected DNA in mammalian cells had high mutant frequencies [21], while work by Glazer et al. [22] and others showed that integrated DNA has a spontaneous mutant frequency similar to genomic DNA. This directed work toward the use of shuttle vectors integrated into the genome. The size of the target gene and the number of copies of the target gene affect the ability to create, insert, and recover vectors and to have a sufficient mutant signal to detect.
The target gene ideally should be large enough to provide a sufficient number of bases to record mutagenic attack, yet not so large that the shuttle vector cannot be recovered and processed in vitro. There should be a sufficient number of vector copies available with a reasonable efficiency of recovery to permit detection of low spontaneous mutant frequencies. Work reported by Glazer et al. [22] showed that one way to increase efficiency of vector recovery was to insert multiple vector copies into each cell. They showed that more than 100 copies of a lambda phage vector could be concatenated together, transfected into, and recovered from mouse L-cells.
There have been various approaches for creating transgenic animals for detecting mutations in target transgenes. Malling and Burkhart [23] proposed analysis of φX174 DNA for mutations in transgenic mice carrying a recoverable shuttle vector. They suggested several options, including using gain or loss of restriction sites, detection of reverse mutations (such as in the am3 cs70 sequence in φX174), or the creation of recombinant vectors carrying a target gene such as lacI from the lac operon. They also proposed a transgenic mouse, carrying both a shuttle vector for the detection of mutations and an activated oncogene, that would be uniquely suitable for studying both tissue-specific mutation and carcinogenesis. Burkhart and Malling [24] demonstrated the stable incorporation of φX174 into mouse L cells in culture, recovery of the vector by restriction enzyme digestion, purification by column chromatography, ligation, and transfection into competent spheroplasts. Enumeration of total phage and revertants at am3 and cs70 was done by adsorbing and plating onto appropriate Escherichia coli host strains. They were able to demonstrate a significant increase in reverse mutant frequency in ethyl methanesulfonate-treated cells over control cells. Burkhart et al. [25] carried this concept from cells in culture to animals when they reported on creation of a φX174 am3 transgenic C57BL/6 mouse that is homozygous for approximately 100 copies of the vector. They were able to demonstrate reversion to the wild-type phenotype following treatment with the alkylating agent N-ethyl-N-nitrosourea. Reverse mutation systems are limited to detecting only those mutagens capable of reverting the original forward mutation lesion. In this case, only chemicals capable of inducing base-pair substitutions at A:T base pairs would be detected. The φX system has been eclipsed by other, more robust transgenic systems that use recombinant lambda phage shuttle vectors.
Work was also underway in the late 1980s using the lambda phage genome as a recoverable shuttle vector to create transgenic animals suitable for the study of in vivo mutagenesis. The lambda phage and E. coli bacterial host system was a well-characterized system that had long been used as a research tool by molecular biologists and geneticists. Transgenic technology built upon the molecular tools used to manipulate lambda shuttle vectors, such as restriction enzymes to cut vectors out of genomic DNA and in vitro lambda phage packaging extracts that package lambda DNA into empty phage particles to create infective phage particles. By manipulating host E. coli strains, growth conditions, and use of chromogenic substrates, phage can be easily screened for mutant phenotypes. This work has led to two commercially available transgenic mouse models for the study of mutagenesis in vivo. Gossen et al. [26] at TNO Institute for Experimental Gerontology, Rijswijk, The Netherlands, and Kohler et al. [27] at Stratagene, La Jolla, CA, reported on the use of lambda shuttle vectors containing a lacZ target gene to create transgenic mice for mutagenicity testing. The TNO work led to the development of the lacZ system called Muta™Mouse. The Stratagene work with lacZ led to an improved system using lacI as the target gene in the lambda shuttle vector [28,29]. The Stratagene lacI system became known as Big Blue®.
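The copy-number and recovery considerations raised in this section can be made concrete with a rough calculation. The sketch below is illustrative only: the function names, the assumed spontaneous mutant frequency, and the recovery efficiency are hypothetical values chosen for the example, not figures taken from the cited studies. It estimates how many vector copies might be recovered and how likely it is that a given number of screened plaques contains at least one mutant, using a simple Poisson approximation.

```python
import math


def recoverable_vectors(cells: float, copies_per_cell: float,
                        recovery_efficiency: float) -> float:
    """Rough count of vector copies rescued from a DNA sample.

    All three inputs are illustrative assumptions supplied by the caller.
    """
    return cells * copies_per_cell * recovery_efficiency


def expected_mutant_plaques(plaques_screened: float,
                            mutant_frequency: float) -> float:
    """Expected number of mutant plaques among the plaques screened."""
    return plaques_screened * mutant_frequency


def prob_at_least_one_mutant(plaques_screened: float,
                             mutant_frequency: float) -> float:
    """Poisson approximation of the chance of seeing one or more mutants."""
    lam = expected_mutant_plaques(plaques_screened, mutant_frequency)
    return 1.0 - math.exp(-lam)


if __name__ == "__main__":
    # Hypothetical spontaneous mutant frequency, for illustration only.
    assumed_mf = 5e-5
    for n in (10_000, 100_000, 200_000):
        print(n, expected_mutant_plaques(n, assumed_mf),
              prob_at_least_one_mutant(n, assumed_mf))
```

Under these assumptions, the calculation shows why a low spontaneous mutant frequency demands large numbers of recovered vectors: the expected mutant count scales linearly with the number of plaques screened.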
D. Big Blue® lacI System
The Big Blue® mouse system was developed using an inbred C57BL/6 female mouse to minimize genetic variability in the offspring due to chromosomal segregation. The C57BL/6 mouse was also chosen because C57BL/6 females could be mated to male C3H/HeN nontransgenic mice to create B6C3F1 hybrid mice. The B6C3F1 hybrid is widely used for carcinogenicity testing, so tissue-specific in vivo mutagenicity data can be compared with the extensive tumor database from B6C3F1 carcinogenicity studies. Both the C57BL/6 and B6C3F1 strains carry approximately 40 copies of the same vector concatenated in head-to-tail fashion in the same chromosome position. This identical construction permits, for the first time, comparisons of mutant frequency of exactly the same gene between two different animal strains.
The λLIZ vector used in creating Big Blue was constructed by a multistep process [39] that resulted in an approximately 45-kb fully functional lambda phage vector flanked by cos sites. The cos sites permit concatenation of vectors into chains for insertion into the founder animal and excision of the vectors from genomic DNA. Carried within the vector is a phagemid that contains the entire lacI gene, the lacZ operator (the lac repressor protein binding site within the lacZ gene), the αlacZ gene (the amino terminal 675 nucleotides of the lacZ gene), a colE1 origin of replication, and an ampicillin resistance gene. The phagemid is flanked by f1 origins of replication to help with sequencing. Through the use of fluorescence in situ hybridization (FISH), the site of integration was found to be on chromosome 4 [30].
The target and reporter genes of the system come from the lac operon, the system of genes that regulates and codes for the synthesis of β-galactosidase. The 1080 base pair lacI target gene codes for a monomeric repressor protein. The repressor protein forms a tetramer that binds the lac operator and prevents transcription of the αlacZ reporter gene. Mutation in the lacI gene causes total or partial inactivation of the lac repressor protein, allowing transcription of the αlacZ gene. The αlacZ gene product complements the portion of the lacZ gene product coded by the specific host E. coli cell (SCS-8). The two proteins combine, creating a fully functional β-galactosidase molecule. The β-galactosidase molecule cleaves galactoside linkages but can also cleave the chromogenic substrate 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal), yielding an intense blue product. When lambda phage are adsorbed onto restriction-minus SCS-8 E. coli cells and plated onto agar, plaques form in the bacterial lawn where bacterial cells are lysed by the phage. In the presence of X-gal, plaques from phage with a mutant lacI target gene are blue while wild-type phage are colorless. A continuum of color intensities is produced, with total inactivation of the lacI protein giving the most intense blue color. Mutations can give rise to proteins with a wide range of abilities to bind to the operator region of lacZ, resulting in partial repression of lacZ, some production of β-galactosidase, and intermediate levels of blue color. Many of the mutants identified and sequenced contain mutations in the area that codes for the binding domain, which is the first 59 amino acids of the repressor molecule. The ratio of blue plaques to total plaques is the mutant frequency. The majority of mutants arise while integrated in the animal’s genomic DNA. A small percentage of mutants may arise in vitro in the host bacteria as the result of spontaneous mutations in vitro or fixation of DNA damage that occurred in vivo. These ex vivo mutants are characterized by plaques that are clear with blue dots or sectors; they are counted but not included in the total mutant count.
The construction of the shuttle vector permits alternate scoring methods based on positive selection of mutants. Positive selection of mutants from larger wild-type populations generally makes the process of screening for relatively rare mutants more efficient. Some work has been done to make the lacI protein key to expression of a secondary reporter gene that is essential for cell growth. One approach provides a growth advantage for lysogenic prophage colonies containing mutant lacI genes, making expression of αlacZ metabolically required [31]. The lambda vector also contains plasmid maintenance sequences, the colE1 origin of replication, and the ampicillin resistance gene. This approach, permitting target gene recovery and mutant screening as a plasmid rather than as a lambda phage, has been described in other systems [26]. Some laboratories have employed single-strand conformation polymorphism (SSCP), in combination with restriction digestion of polymerase chain reaction (PCR) products of the coding region and portions of the promoter region, to detect mutations in the lacI gene from Big Blue mice [32].
Typically, transgenic animals are exposed to test or control articles for a period of time followed by a suitable expression period. Mice are sacrificed, and organs and tissues of interest are rapidly removed, flash frozen, and stored at -70°C for later extraction of genomic DNA and analysis. Representative samples of tissue are homogenized, digested with proteinase K, precipitated with ethanol, and dissolved in buffer.
The solubilized genomic DNA is then incubated with lambda in vitro packaging extracts, which cut vectors free at cos sites and initiate packaging of the vector into phage particles, creating infectious phage [33]. The phage are then adsorbed onto restriction-deficient host E. coli for mutant screening. Laboratory procedures have been standardized and optimized for lambda phage screening in the Big Blue system [34,35].
Early work with the Big Blue assay has provided data for statistical analysis of sources of variability, optimal study design, and optimal sample size [36-38]. The early transgenic mutation data were thoroughly reviewed by Gorelick [39]. The greatest source of variability has been found to be animal-to-animal variation, leading to the recommendation of screening fewer phage from more animals. Without providing exact recommendations, many laboratories currently use group sizes of 5 to 10 animals and screen 100,000 to 200,000 phage per tissue. Other study variables that affect the induction and detection of mutations include route of exposure, number of treatments, dose, target tissue, and expression time. Choice of route of exposure would normally be governed by the route of human exposure or the route that will provide maximum bioavailability of the test article to the chosen target tissue, thus requiring some knowledge of the toxicity, metabolism, or pharmacology of the test article. Gorelick [39] recommended the use of an exaggerated dosage to maximize the opportunity to detect an effect. Heddle [40] reviewed how mutant frequency varied with treatment and expression time and how a simple study could be designed to maximize sensitivity while minimizing the number of samples and number of animals required. He assumed that lacI genes in lambda shuttle vectors are genetically neutral to the host and that there is some “expression time” that is characteristic of each tissue. The expression time represents a combination of the rate of cell turnover in the tissue, the time required to convert DNA lesions into mutants, and the rate of DNA repair. He predicted that a single sampling time would be sufficient, provided that it was long enough to permit the slowest responding tissue to express induced mutations. He also predicted additivity of effect from multiple exposures, leading to the prediction that subacute and chronic treatments will make the transgenic mutation assays far more sensitive than single acute treatments.
Various tissues from both C57BL/6 and B6C3F1 strains of the Big Blue mouse have been investigated for both spontaneous and induced mutant frequency. Tissues evaluated include liver, lung, spleen, bladder, kidney, heart, skin, muscle, brain, nasal epithelium, bone marrow, T lymphocytes from the spleen, and male germ cells. The minimum quantity of tissue needed for DNA extraction is 50 to 100 mg, so a wide range of tissues may be studied. The majority of the work to date has been with somatic tissues, looking at correlation with in vitro genetic toxicology tests and carcinogenicity data. Another area of research has been to use transgenic animals for assessing germ cell mutations as a predictor of heritable genetic damage.
Measurement of germ cell mutant frequency has great potential for complementing or replacing the expensive specific locus tests currently used for detecting heritable genetic effects. A multilaboratory comparison of Big Blue and Muta™Mouse assays has been completed in an attempt to validate the use of transgenic assays for detection of germ cell mutations [41,42].
The transgenic mutation assays lend themselves to the full range of treatment routes and regimens used in traditional animal studies. Studies have been performed using various routes, including oral administration, intraperitoneal injection, intravenous injection, inhalation, irradiation, dermal application, and implantation of osmotic pumps for continuous drug delivery. Transgenic mutation models have also been used to investigate other variables, including age [43] and diet [44].
The molecular basis of individual mutations may be determined by direct sequencing of mutant phage. Mutation types detected in the lambda/lacI system include base substitutions, single base frameshift mutations, insertions, duplications, and deletions [30]. While intragenic deletions of up to about 7.8 kb are possible within the λLIZ vector, larger deletions would not be expected to be detected unless the deletion spanned the correct regions of concatenated vectors. The limitation of detecting deletions makes the lambda vector systems somewhat refractory to clastogens.
The lacI transgenic animals have been used for a wide range of other toxicology studies. Transgenic animals have been used in multiple end point studies incorporating other biological indicators of genetic damage into a transgenic study. End points used in transgenic studies have included nontransgenic gene mutation, DNA adducts, cell proliferation, micronuclei, and unscheduled DNA synthesis. Transgenic research has progressed to where one animal can carry multiple transgenes. For example, there are F1 mice available that are both p53 knock-out for tumorigenicity studies and carry the lambda/lacI vector for mutagenicity studies [30]. Also available are transgenic cell lines and Fischer 344 rats that carry the identical lambda/lacI vectors as found in the Big Blue mice [30]. The availability of the same target gene in both mice and rats now permits in vivo mutagenicity studies to be performed using exactly the same genetic sequence in two different species. The lambda/lacI vector also has been used to analyze gene mutations in a selection-based assay using lambda vector genes instead of the original lacI target gene [45].
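The Big Blue scoring rule described in this section, in which mutant frequency is the ratio of blue plaques to total plaques and sectored ex vivo plaques are recorded but kept out of the mutant count, can be sketched as a small calculation. The example below is a minimal illustration, not a validated analysis procedure: the record layout, field names, and plaque counts are hypothetical, and it assumes that sectored plaques remain part of the total plaque count. It also treats the animal as the unit of analysis, reflecting the observation that animal-to-animal variation is the largest source of variability.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass
class PlaqueCounts:
    """Plaque counts for one tissue from one animal (hypothetical layout)."""
    animal_id: str
    total_plaques: int     # every plaque screened, mutant or not
    blue_plaques: int      # uniformly blue plaques scored as lacI mutants
    sectored_plaques: int  # clear plaques with blue dots or sectors (ex vivo)


def mutant_frequency(counts: PlaqueCounts) -> float:
    """Blue plaques divided by total plaques; sectored plaques are not mutants."""
    if counts.total_plaques <= 0:
        raise ValueError("no plaques were screened")
    return counts.blue_plaques / counts.total_plaques


def group_mean_mutant_frequency(group: Iterable[PlaqueCounts]) -> float:
    """Mean of per-animal mutant frequencies (the animal is the unit)."""
    return mean(mutant_frequency(c) for c in group)


if __name__ == "__main__":
    # Made-up counts for a five-animal group, for illustration only.
    control = [
        PlaqueCounts("C1", 200_000, 9, 1),
        PlaqueCounts("C2", 180_000, 7, 0),
        PlaqueCounts("C3", 210_000, 12, 2),
        PlaqueCounts("C4", 195_000, 8, 1),
        PlaqueCounts("C5", 205_000, 10, 0),
    ]
    print(group_mean_mutant_frequency(control))
```

Averaging per-animal frequencies, rather than pooling all plaques into one ratio, keeps an animal with an unusually large plaque yield from dominating the group estimate.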
E. Muta™Mouse lacZ System
The other widely used transgenic mouse mutation model is the lambda/lacZ-based Muta™Mouse. This system, first described by Gossen et al. [26], carries about 150 copies of a lambda gt10-lacZ shuttle vector stably integrated into the mouse genome. Gossen et al. reported that multiple copies of vectors usually integrate in a head-to-tail fashion at a single site in genomic DNA [46]. The founder animal was a CD2.F1 mouse (BALB/c × DBA/2). Because the transgenic line was created in an F1 hybrid animal, it cannot be outbred to create the B6C3F1 hybrid commonly used in cancer bioassays. The lambda vector used in creating MutaMouse was a lambda gt10 vector. The vector carries cohesive ends, permitting the vectors to be joined together by cos sites. A fully functional bacterial lacZ gene was obtained as a 3.9-kb fragment from plasmid pMC151 that was cloned into the single EcoRI site of the lambda vector. In this system, the lacZ gene serves as both the target gene and reporter gene. The lacZ gene, from the lac operon, is the structural gene that codes for the synthesis of β-galactosidase. The same chromogenic substrate as is used in the Big Blue assay, 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal), is used to detect the presence of β-galactosidase. The wild-type phenotype of phage from MutaMouse is maximal production of β-galactosidase. The lacZ mutants produce no, diminished quantities of, or less functional β-galactosidase as compared with the wild-type gene. The result is that wild-type plaques are intense blue and mutants with total loss of lacZ activity have no β-galactosidase formed
and consequently are colorless. Mutations that result in diminished quantities or less functional β-galactosidase produce plaques with a continuum of blue color between the colorless mutant form and intense blue wild-type form. The end point of the assay is the ratio of mutant (colorless and less-intense blue) plaques to nonmutant (intense blue) plaques. There were several technical improvements as the assay was developed. Because of the highly methylated nature of the vector, the use of restriction-minus E. coli C-derived in vitro lambda packaging extract and adsorption onto appropriate E. coli C cells both greatly improved the rescue efficiency of the vectors. Another improvement was to pretreat the solubilized genomic DNA with the restriction enzyme Xba I to increase the efficiency of vector recovery. The genomic DNA is cut by the enzyme into small fragments while the vector remains intact because there are no recognition sites for the enzyme in the vector. Another requirement was to use a lacZ-deficient strain of E. coli C to prevent homologous recombination between the lacZ gene in the bacterial host and the vector [26]. The color screening assay is labor intensive to perform because of the exacting requirements of manually screening millions of plaques for white or light blue mutants against an intense blue plaque background. Even with the limitations of the color screening system, a wide range of compounds were evaluated. Reviews of transgenic systems usually included discussions of both systems with comparisons of data [39]. During the first few years of use, more data were collected in the Big Blue system than in the MutaMouse system because of the ease and lower cost of mutant detection with Big Blue. For this reason, many of the review papers on study performance, study design, and statistical evaluation tended to focus on Big Blue data [36-38].
A modification was developed to the MutaMouse system that permits positive selection of lacZ mutants during screening through the introduction of a proprietary E. coli lac- galE- host cell line [47]. The new bacterial strain does not replicate in the presence of galactose. Phenylgalactose (P-gal), an alternate substrate for β-galactosidase, can be used as a selective agent for bacteria infected with phage containing nonfunctional lacZ genes. A functional lacZ gene will produce β-galactosidase, which converts P-gal to galactose. The galactose inhibits the bacterial host and viral replication and prevents plaque formation. In lacZ mutants, P-gal is not metabolized, so no galactose is released, allowing bacterial growth, viral replication, and plaque formation [48]. A number of laboratories have compared the MutaMouse positive selection system with the original nonselective method and with the Big Blue lacI system [49-51]. Significant and reproducible increases in mutant frequency were observed following treatment with a variety of mutagens in several laboratories. Controllable sources of variability were observed in both assays, reinforcing recommendations for careful standardization of laboratory procedures [51]. The MutaMouse system was identified as more cost effective for rapid screening of
plaques than Big Blue, but both systems gave very similar results. A major advantage of the Big Blue system that was noted is that it enables access to B6C3F1 mice and Fischer 344 rats for both mutagenicity and carcinogenicity studies [49].
F. Other Systems
There are a number of other transgenic mutation systems available for use or in development. The focus of this discussion has been on the two commercially available systems that have been widely used and reported upon. An interesting plasmid-based model is the new lacZ plasmid mouse [52]. This mouse carries about 200 copies of the pUR288 plasmid that contains the lacZ gene. The system has been shown to be able to detect large-scale deletions and the mutagenic effects of clastogenic agents. One initial observation is that spontaneous mutant frequencies of many somatic tissues from both lacI and lacZ lambda vector transgenic animals are about 4 × 10^-5, while spontaneous mutant frequencies in the lacZ plasmid mouse are about 7 × 10^-5, nearly double the other systems. This suggests that the lambda vector systems underestimate the full mutational load in tissues by missing deletion mutations and that the lacZ plasmid mouse shows promise as a sensitive system for the detection of a full range of mutagens, including clastogens. Another transgenic mouse system for detection of mutations has been reported by Gondo et al. [53]. This mouse carries an E. coli pML4 plasmid containing an origin of replication, a kanamycin resistance gene, and an E. coli rpsL+ gene. When the plasmid is recovered from the mouse and introduced into kanamycin-sensitive and streptomycin-resistant E. coli cells, selective pressure can be exerted by manipulating antibiotics in the growth medium. Only cells that contain a mutated rpsL transgene will grow on medium containing both kanamycin and streptomycin, while cells with either wild-type or mutant rpsL-carrying plasmids will grow on medium with only kanamycin. This selective growth permits easy enumeration of both mutant and total cell populations for the determination of mutant frequency. Induced mutant frequencies have been demonstrated in a variety of tissues including spleen, liver, and brain.
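The enumeration described for the rpsL system reduces to a ratio of two plate counts, each corrected for the fraction of the rescued sample that was plated. The following is a minimal sketch of that arithmetic only; the function name, the fraction-plated parameters, and the example counts are illustrative assumptions and are not taken from the cited studies.

```python
def mutant_frequency(mutant_colonies, total_colonies,
                     mutant_fraction_plated=1.0, total_fraction_plated=1.0):
    """Estimate mutant frequency from selective and nonselective plate counts.

    mutant_colonies: colonies on the selective medium (e.g., kanamycin plus
        streptomycin in the rpsL system).
    total_colonies: colonies on the nonselective medium (kanamycin only).
    *_fraction_plated: hypothetical parameters giving the fraction of the
        rescued sample spread on each plate type (1.0 means all of it).
    """
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    for fraction in (mutant_fraction_plated, total_fraction_plated):
        if not 0.0 < fraction <= 1.0:
            raise ValueError("fraction plated must be in (0, 1]")
    # Scale each count back to the whole rescued sample before dividing.
    mutants_in_sample = mutant_colonies / mutant_fraction_plated
    total_in_sample = total_colonies / total_fraction_plated
    return mutants_in_sample / total_in_sample


# Hypothetical counts: 12 mutant colonies from the whole sample and
# 300 colonies from a 1/1000 aliquot on the nonselective plates.
example_mf = mutant_frequency(12, 300, 1.0, 0.001)

# Ratio of the approximate spontaneous frequencies quoted in the text
# (about 7 x 10^-5 for the lacZ plasmid mouse, about 4 x 10^-5 for the
# lambda vector systems).
fold_difference = 7e-5 / 4e-5

if __name__ == "__main__":
    print(f"example mutant frequency: {example_mf:.2e}")
    print(f"plasmid vs lambda spontaneous frequency ratio: {fold_difference:.2f}")
```

The same ratio applies to plaque-based assays if plaque counts replace colony counts; real studies also pool animals and tissues and apply the statistical treatments discussed in the cited reviews, which this sketch does not attempt.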
G. Summary
The field of in vivo mutagenesis is advancing rapidly as new models are developed and validated. The availability of the first-generation models led to rapid increases in the ability to investigate tissue-specific mutagenesis and heritable gene mutations. The model systems have been evolving, even during the process of validating each generation of assay. There is a clear place for in vivo mutagenicity screening assays in both the regulatory toxicology environment and the basic investigational research laboratory. The debate on the role of these systems
has been continuous since they became available. However, as the systems have become more robust and efficient, the nature of the debate has changed from “if” these systems should be used to “how” they should be used [54-56].
III. CARCINOGENESIS
The definitive way to determine if a material is a human carcinogen is to perform prospective epidemiological studies. Such studies have two major limitations. The first is duration: such a study might have to be conducted for up to half a century. The second is that, if the outcome is positive, new cases of cancer have been induced that may have been avoidable. The experimental alternative of choice has been and continues to be the rodent chronic carcinogenicity bioassay. In this type of study, rats and mice are exposed by a relevant route of exposure to a “maximum tolerated dose” of the test material for 2 yr, essentially the entire life span of these experimental animals. With 2 yr of dosing and an extensive postlife analysis, it can take almost 3 yr to obtain the outcome of such a study. Because of their duration and the large numbers of animals involved (approximately 400 of each species), these studies are quite expensive. Efforts to shorten the in-life portion of these studies have recently focused on genetically engineering animal strains that might show a shorter latency period. These efforts have been aided by our rapidly increasing knowledge of the relationship between oncogene activation, tumor suppressor gene inactivation, and multistage carcinogenesis. In order to create a test system with a shorter latency period, it seems logical to begin with an animal in which one step in the carcinogenesis pathway has already been taken. Toward this end, a series of transgenic mice were created by Leder et al. [57-60] containing exogenous oncogenes such as ras, myc, and neu, whose expression is controlled by the mouse mammary tumor virus (MMTV) regulatory elements.
The transgenic strains constructed in this fashion provided interesting data relating to the function of these oncogenes, but the animals turned out to be of little use as a system for carcinogen screening. Most likely because of the use of MMTV long terminal repeats (LTRs), all the transgenic animal strains appeared susceptible to spontaneous mammary tumors, with the females being more susceptible than the males. Depending on the gene inserted and the sex of the mice, tumors begin to appear during the first 30 to 60 days of life, with the majority of animals developing tumors in 6 to 8 mo. Surprisingly, when these strains were exposed to carcinogens, including a carcinogen known to induce mammary tumors in wild-type animals, they did not develop significantly more tumors, nor did the tumors appear at a faster rate. These transgenic constructs were rapidly rejected as potential models for carcinogenicity testing. A second-generation transgenic strain to show promise is the PIM
transgenic mouse. The pim-1 gene locus is a frequent site of proviral insertion in T cell lymphomas and has thus been implicated as an oncogene. Early work showed that infection of mice with murine leukemia virus induced T cell lymphomagenesis in which lymphomas were frequently found with proviral insertion in proximity to the pim-1 gene [61-63]. Such lymphomas consistently overexpress pim-1. There are numerous examples of proviral activation of oncogenes in association with tumorigenesis, including activations of c-myc, c-myb, erb-B, int-1, and int-2. Activation most frequently involves elevated or unregulated expression of the oncogene product as a result of transcription from the viral promoter, enhancement of transcription from the oncogene promoter by the proviral LTR, or integration of the provirus within the oncogene transcription unit resulting in increased stability of the oncogene mRNA or even truncation of the translation product. Work with transgenic animals has shown that pim-1 plays an important role in the transformation process in T cell lymphomagenesis and thus the gene can be classified as an oncogene [64]. The effects of overexpression of pim-1 are subtle, with little detectable phenotype in the absence of external stimulation other than a slight elevation in the frequency of T cell lymphomas. Transgenic mice overexpressing the pim-1 oncogene are predisposed toward developing T cell lymphomas. Approximately 10% of animals develop lymphoma in the first 34 wk of life. Animals treated with a single dose of N-ethyl-N-nitrosourea (ENU) at 60 mg/kg begin to develop tumors after 3 mo; by 5 mo, essentially all treated animals demonstrate lymphomas [65]. Additional validation studies are underway in this model, but some researchers are pessimistic as to whether this system will respond to nonlymphomagenic agents.
The newest and to date most promising transgenic models for carcinogenesis assessment are the p53 heterozygous knock-out, the Tg.AC line carrying an activated ras gene driven by the zeta-globin promotor, and the Hras2 strain carrying an intact ras proto-oncogene. It is clear that the p53 gene plays a central and vital role in the regulation of the cell cycle. Over 50% of human tumors have been found to carry mutations in the p53 gene and it remains the most frequently observed genetic alteration in human cancers. It has been linked to the regulation of a variety of genes known to be involved in regulating the cell cycle, including BAX, BCL2, p21Waf1, MDM2, cyclin G, and GADD45 expression. When exposed to DNA damage, these regulatory proteins serve to arrest cell cycle progression and allow DNA repair or to induce apoptosis to eliminate the cell from the population. Both pathways help to reduce the risk of neoplasia associated with exposures to DNA-damaging agents. Indeed, mutations in the p53 gene are linked to Li-Fraumeni syndrome, an autosomal dominant disease characterized by increased susceptibility to a variety of cancers. The creation of a transgenic mouse line deficient in p53 was reported by Donehower et al. in 1992 [66] (Table 1). Homozygously deficient animals are
Table 1 Characteristics of p53 Transgenic Mice
• Created on a C57BL/6 background by Donehower et al. 1992.
• 50% of p53-/- develop tumors by 20 wk of age and 100% by 40 wk.
• p53+/- are disease-free for 24 wk; 50% of animals develop spontaneous tumors by 80 wk of age.
• Develop primarily lymphomas, osteosarcomas, and hemangiosarcomas.
• Amenable to any conventional route of exposure.
• Proposed utility for detecting mutagenic carcinogens.
• Tumors can be analyzed for condition of p53 allele.
highly susceptible to spontaneous tumors with approximately half the animals showing tumors in 20 wk. The majority of tumors forming in these animals are malignant lymphomas and hemangiosarcomas. Transgenic mice carrying a single functional p53 allele also show a susceptibility to spontaneous tumor formation between that seen for homozygous knock-outs and wild-type animals. Heterozygotes are essentially disease-free up to approximately 40 wk of age; by 80 wk, approximately half the animals have developed spontaneous tumors. Wild-type animals are essentially tumor-free in the same time frame [67]. Because of the importance of the p53 gene in human carcinogenesis and because this animal model mimics Li-Fraumeni syndrome, this model appears to be promising for carcinogenesis testing. A second transgenic mouse line that is currently receiving a good deal of attention was created by Leder et al. [68] and contains an activated H-ras gene (Table 2). The ras sequence contains activating mutations in codons 12 and 59 and expression of this gene is driven by the promotor of the mouse zeta-globin gene. This line has been reported to be “genetically initiated” as defined by the mouse two-stage skin carcinogenesis paradigm. Northern analysis of a wide range
Table 2 Characteristics of Tg.AC Transgenic Mice
• Created on FVB/N background by Leder et al. 1980.
• Expression of the transgene, a v-Ha-ras oncogene, is regulated by the fetal zeta-globin promotor and is normally not expressed in adult tissue; the only exception is in bone marrow (Hansen et al. 1996).
• Transgene contains two activating mutations in codons 12 and 59.
• Four tandem copies are located on chromosome 11.
• Skin behaves as if “genetically initiated”: exposure to promoters results in development of papillomas.
Table 3 Characteristics of TgrasH2 Transgenic Mice
• Strain created by Saitoh et al. 1990.
• Parental strain exists on a C57BL/6 background; animals are crossed with wild-type BALB/c to generate hemizygous CB6F1, which are used in testing.
• Mice carry the human c-Ha-ras proto-oncogene with its own promotor region; no mutations in transgene.
• Transgene expressed in normal and tumor tissues.
• Few tumors develop by 33 wk of age; 50% of mice develop tumors spontaneously by 18 mo of age.
• Spontaneous tumors include hemangiosarcomas, lung adenocarcinomas, skin papillomas, harderian gland adenocarcinomas, and lymphomas.
of tissues shows that under normal circumstances, the ras gene is not expressed. However, stimuli that induce cell replication, such as full-thickness skin wounds and 12-O-tetradecanoylphorbol-13-acetate (TPA) applications, induce gene expression. Amazingly, when these animals are subjected to dermal applications of TPA, papillomas begin to appear within 3 wk. After 10 wk of treatment, tumor burdens average approximately 60 papillomas per animal [69,70]. Other promoting agents found to be positive in this strain include benzoyl peroxide and 2-butanol peroxide. The Hras2 transgenic line was created by Saito et al. in 1990 [71]; the parental strain exists on a C57BL/6 background (Table 3). These animals are crossed with wild-type BALB/c mice to generate CB6F1, the strain commonly used in mouse chronic studies. The animals carry a human c-ras proto-oncogene with its own promoter region. There are no mutations in that gene, i.e., it is a proto-oncogene. The gene is expressed in both normal and tumor tissues. As with the other strains, few tumors develop within the normal course of a 6-mo study. Fifty percent of these mice develop tumors spontaneously by 18 mo of age; these include hemangiosarcomas, lung adenocarcinomas, skin papillomas, harderian gland adenocarcinomas, and lymphomas. The framework for the protocols currently in use with transgenic mice was first developed at the U.S. National Institute of Environmental Health Sciences (NIEHS) [72] (Table 4). Animals are generally 6 to 8 wk of age at the initiation of the study. The treatment schedule is 7 days a week for dosed feed or dosed water and 5 days a week for oral gavage, dermal application, and inhalation. This is different from the way most chronic studies are performed by the pharmaceutical industry, where animals are dosed daily for the entire study. The duration of the study is generally 26 wk. The same criteria have been applied to dosage selection in transgenic studies as are typically used in traditional bioassays.
Table 4 NIEHS Protocol Design with Transgenic Mice
• Animals 6-8 wk of age at initiation.
• Treatment schedule: 7 days/week for dosed feed or water, 5 days/week for oral gavage, dermal application, and inhalation.
• Duration: 26 wk plus 2 wk holding for staggered termination.
• Dose selection: equivalent to those used in 2-yr bioassay. In the absence of bioassay data, range-finding studies (28-day) performed in wild-type animals. Highest dose is MTD.
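The practical difference between the 7 days/week and 5 days/week schedules in the NIEHS design can be made concrete by counting dosing days over the 26-wk treatment period. The sketch below is only that arithmetic; the dictionary and function names are illustrative assumptions, and the 2-wk holding period is left out because no dosing schedule is stated for it.

```python
# Dosing days per week by route, as listed in the NIEHS protocol design.
DOSING_DAYS_PER_WEEK = {
    "dosed feed": 7,
    "dosed water": 7,
    "oral gavage": 5,
    "dermal application": 5,
    "inhalation": 5,
}

TREATMENT_WEEKS = 26  # treatment period only; holding period not counted


def total_dosing_days(route, weeks=TREATMENT_WEEKS):
    """Return the number of dosing days for a route over the treatment period."""
    if route not in DOSING_DAYS_PER_WEEK:
        raise KeyError(f"unknown route: {route}")
    if weeks < 0:
        raise ValueError("weeks must be non-negative")
    return DOSING_DAYS_PER_WEEK[route] * weeks


if __name__ == "__main__":
    for route in DOSING_DAYS_PER_WEEK:
        print(f"{route}: {total_dosing_days(route)} dosing days in {TREATMENT_WEEKS} wk")
```

A 5 days/week route therefore delivers 130 dosing days against 182 for dosed feed or water, which is the contrast with daily dosing in pharmaceutical chronic studies.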
The drive for validation and general scientific acceptance of transgenic animal models for carcinogenesis was given a giant boost by the International Conference on Harmonization (ICH) guideline on testing for carcinogenicity of pharmaceuticals [73]. In this publication, the U.S. Food and Drug Administration (FDA) makes clear its intention to adopt the ICH Steering Committee’s guideline. The guideline explores the question of whether “carcinogenicity studies in two species could be reduced without compromising human safety.” Further, the guideline recommends that in the absence of other information, an oncogenicity study should be performed in the rat. In addition to a traditional chronic study in rat, complementary systems are proposed including “models of initiation-promotion in rodents, or transgenic rodents, or newborn rodents.” Several of the early studies carried out by the NIEHS examined trans-species carcinogens [74]. These are compounds that induce tumors both in rats and mice, often at multiple sites, and are frequently mutagenic. These types of materials (benzene, p-cresidine, phenolphthalein, and 4-vinyl-1-cyclohexene diepoxide) were positive in the p53 strain (Table 5). Three compounds that were negative
Table 5 Results of Carcinogenicity Studies in Wild-Type and p53-Deficient Mice
Chemical                            Outcome in 2-yr bioassay in B6C3F1 mice    Outcome in 6-mo bioassay in p53def (+/-) mice
p-Anisidine                         Negative                                   Negative
Benzene                             Positive at multiple sites                 Positive
p-Cresidine                         Positive in bladder and liver              Positive
4-Vinyl-1-cyclohexene diepoxide     Positive in ovary and skin                 Positive
N-methylolacrylamide                Positive in liver and kidney               Negative
Reserpine                           Positive in mammary tissue                 Negative
Source: Tennant et al. 1996.
Table 6 Results of Carcinogenicity Studies in Wild-Type and Tg.AC Mice
Chemical                                   Outcome in 2-yr bioassay in B6C3F1 mice    Outcome in 20-wk bioassay in Tg.AC mice
Benzene                                    Positive, multiple sites                   Positive
Dimethylvinyl chloride                     Positive, forestomach                      Positive
p-Cresidine                                Positive, bladder and liver                Positive
7,12-Dimethylbenzanthracene                Positive, skin                             Positive
o-Benzyl-p-chlorophenol, gavage            Positive, kidney
o-Benzyl-p-chlorophenol, skin painting     Positive, skin                             Positive
Ethyl acrylate                             Positive, forestomach                      Negative
Mirex                                      Positive, liver                            Positive
Triethanolamine                            Positive, liver                            Negative
2-Chloroethanol                            Negative                                   Negative
Benzethonium chloride                      Negative                                   Negative
Phenol                                     Negative                                   Negative
Lauric acid diethanolamine                 Not available                              Positive
Diethanolamine                             Not available                              Negative
Source: Tennant et al. 1996.
in traditional National Toxicology Program (NTP) bioassays were also negative in p53 (p-anisidine, 1-chloro-2-propanol, oleic acid diethanolamine). Three chemicals that were positive only in mice in an NTP bioassay were negative in the p53 strain (N-methylolacrylamide, methylphenidate, reserpine). Five NTP bioassay-positive compounds were negative in p53 (glycidol, lauric acid diethanolamine, pentachlorophenol, pyridine, coconut oil diethanolamine). Validation data from the Tg.AC strain suggest that this model detects mutagenic carcinogens as well as tumor promoters [74,75]. Seven NTP-positive carcinogens were also positive in the Tg.AC strain (Table 6). Four bioassay-negative chemicals were negative in Tg.AC. Three bioassay-positive chemicals were negative in Tg.AC. The NTP sponsored a large validation experiment in which 11 compounds were tested either in p53 or Tg.AC by various routes [75]. Three known human carcinogens were tested. Diethylstilbestrol, a nongenotoxin, gives positive results in traditional chronic studies. It proved positive in Tg.AC and negative in p53. Melphalan, a well-characterized antineoplastic drug that is clearly genotoxic and positive in chronic studies, gave a positive response in both strains. Cyclosporin A, an immunosuppressive drug known to be carcinogenic to humans, has given negative results in 2-yr chronic studies. It was positive in Tg.AC mice
Table 7 Validation: ILSI
• International Life Sciences Institute
• Collaborative validation program
  - Input from industry, CROs, government
  - 54 laboratories
  - 21 chemicals
• Evaluate TG mice, including p53 and Tg.AC
• ILSI standard protocol
by oral gavage, but negative in p53. Tetrachlorodibenzodioxin (TCDD), a nongenotoxic receptor-mediated carcinogen, gives positive results in traditional 2-yr chronic studies. It was positive in the Tg.AC strain but negative in p53. In comparing routes of exposure, N-methylolacrylamide was given by oral gavage and by dermal application to Tg.AC mice. This compound is positive in the Ames test and positive in a chronic study, but it was negative by both routes of exposure in Tg.AC mice. In an isomer comparison, 2,4-diaminotoluene and 2,6-diaminotoluene were compared in both strains. 2,4-Diaminotoluene, positive in the Ames test and positive in a 2-yr study, gave equivocal results in both Tg.AC and p53. Its isomer 2,6-diaminotoluene, which is also positive in the Ames test but is negative in a traditional chronic study, was negative in both transgenic mouse strains. The noncarcinogen 8-hydroxyquinoline, which gives a “false positive” in the Ames test (negative in a traditional chronic study), was negative in both strains. Rotenone, which is negative in the Ames test and negative in a chronic study, was positive in Tg.AC but negative in p53. The International Life Sciences Institute (ILSI) has organized a validation effort among the pharmaceutical industry with 54 participating laboratories [76] (Table 7). These companies are studying 21 pharmaceuticals for which data are available on carcinogenicity. Validation studies are examining the p53 and the Tg.AC strains. Before initiating these studies, ILSI participants designed and agreed to follow a specific protocol for all the validation studies. The validation results with the transgenic Hras2 strain also appear promising [77,78] (Table 8). These studies were performed in Japan. A total of 29 chemicals were examined. Seventeen were Ames-positive carcinogens, six were Ames-negative carcinogens, three were Ames-positive noncarcinogens, and three were Ames-negative noncarcinogens.
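Validation tallies of this kind are often condensed into concordance, sensitivity, and specificity. The sketch below applies the standard definitions to the Tg.AC counts quoted earlier in this section (seven bioassay-positive chemicals positive in Tg.AC, four bioassay-negative chemicals negative, three bioassay-positive chemicals negative). Two assumptions are made here and are not statements from the source: the 2-yr bioassay is treated as the reference result, and the number of bioassay-negative chemicals that were positive in Tg.AC is taken as zero because none is mentioned. The function name is illustrative.

```python
def concordance_summary(both_positive, both_negative,
                        reference_pos_test_neg, reference_neg_test_pos=0):
    """Summarize agreement between a test assay and a reference assay.

    The reference assay is assumed (for illustration only) to be the
    2-yr rodent bioassay; the test assay is the transgenic study.
    """
    reference_positive = both_positive + reference_pos_test_neg
    reference_negative = both_negative + reference_neg_test_pos
    total = reference_positive + reference_negative
    if reference_positive == 0 or reference_negative == 0:
        raise ValueError("need at least one reference positive and one negative")
    return {
        # Fraction of reference positives that the test assay also calls positive.
        "sensitivity": both_positive / reference_positive,
        # Fraction of reference negatives that the test assay also calls negative.
        "specificity": both_negative / reference_negative,
        # Fraction of all chemicals on which the two assays agree.
        "concordance": (both_positive + both_negative) / total,
    }


# Tg.AC counts as quoted in the text; the zero is an assumption.
tgac = concordance_summary(both_positive=7, both_negative=4,
                           reference_pos_test_neg=3, reference_neg_test_pos=0)

if __name__ == "__main__":
    for name, value in tgac.items():
        print(f"{name}: {value:.2f}")
```

Any such summary depends on which result is taken as the reference, and the relevance of either assay to human risk is a separate question.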
Of the 17 Ames-positive carcinogens, all produced tumors in the transgenic mice within 26 wk. Fourteen of those compounds produced a higher level of tumor induction in the transgenic than in the wild-type control. These validation studies all included wild-type control groups. Four
Table 8 Ministry of Health and Welfare (Japan) Collaborative Study Results with TgHras2 Mice
29 chemicals tested
  - 17 Salmonella-positive carcinogens
  - 6 Salmonella-negative carcinogens
  - 3 Salmonella-positive noncarcinogens
  - 3 Salmonella-negative noncarcinogens
Ames-negative carcinogens induced tumors in Hras2 mice within 26 wk. One of these, however, was benzene, which should be considered a mutagenic carcinogen. One nonmutagenic carcinogen, thiourea, produced comparable results in transgenic and wild-type animals. Two mutagenic noncarcinogens (p-anisidine and 8-hydroxyquinoline) did not increase tumor frequencies in the Hras2 mice. Three nonmutagenic carcinogens (resorcinol, rotenone, and xylenes) did not increase tumor frequencies. However, there was not always good tumor site concordance between wild-type and transgenic animals. Forestomach squamous cell carcinomas, lung alveolar epithelial tumors, and hemangiosarcomas of the spleen were more common in Hras2 mice. In examining potential advantages of transgenic vs. wild-type animals, certainly the biggest is the shortened in-life phase of the study (Table 9). In a conventional assay, the in-life phase is 24 mo vs. 6 mo in transgenics. Cost saving is also an advantage; the price of performing a transgenic study is considerably less. Animal numbers are much lower because group sizes tend to be much smaller. The spontaneous tumor frequency toward the end of a conventional chronic study is quite high, which sometimes makes interpretations difficult. With transgenic animals, the background frequencies of tumors are very low. While studies in transgenic mice still do not provide a great deal of mechanistic information, it is at least
Table 9 Advantages of Transgenic Mice
                                Conventional    Transgenic
In-life                         24 mo           6 mo
Cost                            $750,000        $150,000
Animal number                   400             120
Spontaneous tumor frequency     High            Low
Mechanistic information         Little          Some
Relevance to human risk         Uncertain       Uncertain
Table 10 Outstanding Issues Around the Use of Transgenic Mice
• Study duration. Is six months long enough?
• Group size. 10 animals/sex/dose? 15? 20?
• Dose selection. Is there justification for choosing anything other than the MTD?
• Route of administration. Can Tg.AC animals be dosed by routes other than dermal application?
• Positive controls. Should they be included with every assay?
• Wild-type controls. Should they be included with every assay?
• Validation. What is the gold standard? When is the test valid? When do we abandon traditional chronic studies? When do we replace currently available models with better ones?
somewhat more than is obtained from a conventional study. For example, one can assay for loss of heterozygosity in tumors that develop in p53 mice. The relevance to human risk remains uncertain for both wild-type and transgenic animals. There are still a number of outstanding issues around the use of transgenic animals (Table 10). Is 6 mo the correct in-life duration? Animal group size is another important issue. Most studies currently use 15 to 20 animals per sex per dose. It is not clear if this specification was based on achieving a particular statistical power or on the high cost of the animals. Appropriate dose selection is another outstanding issue. Is there justification for choosing anything other than the maximum tolerated dose (MTD) or one of the other criteria specified in the ICH guidelines [79]? Route of administration is an important question, especially in the Tg.AC strain. Since the dermal route is not relevant to many materials, especially to most pharmaceuticals, it will be important to determine whether this strain is amenable to other routes of exposure, in particular, oral administration. Currently, positive controls are included in every assay. At what point can we dispense with the use of positive controls? Some validation studies have included the use of wild-type controls. It is not clear how important it will be to include these routinely. With what database will validation results be compared? When will we feel that these new systems have been sufficiently validated? When can we abandon the use of traditional chronic studies and use only transgenic mice? And, perhaps even more provocative, when can we abandon the current transgenic models for newer and better ones? This brings us to the final question: What is the future of carcinogenicity testing with transgenic animals?
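The group-size question can be made a little more concrete with a simple binomial calculation: if each animal independently develops a tumor with some true incidence, the chance of seeing at least one tumor-bearing animal in a group of n is 1 - (1 - incidence)^n. The sketch below is only this calculation and is not a formal power analysis of any study design described here; the function name and the incidence values are hypothetical.

```python
def prob_at_least_one_tumor(group_size, incidence):
    """Probability that a group contains at least one tumor-bearing animal.

    Assumes each animal develops a tumor independently with the same
    probability `incidence` (a simplifying assumption).
    """
    if group_size < 0:
        raise ValueError("group_size must be non-negative")
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must be between 0 and 1")
    return 1.0 - (1.0 - incidence) ** group_size


if __name__ == "__main__":
    # Group sizes of 10, 15, and 20 animals per sex per dose are the ones
    # raised in the text; the incidences are hypothetical.
    for incidence in (0.05, 0.10, 0.20):
        for group_size in (10, 15, 20):
            p = prob_at_least_one_tumor(group_size, incidence)
            print(f"incidence={incidence:.2f} n={group_size}: P(>=1 tumor)={p:.3f}")
```

A real design would compare treated and control groups with an appropriate test and account for background incidence, so this is at most a first intuition for how group size limits the detection of weak effects.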
If ras gene expression is necessary, although perhaps not sufficient, for tumor formation in Tg.AC mice, is it necessary to wait for the animal to develop a papilloma? Perhaps it is sufficient to determine
whether exposing the animal to a test material causes expression of the oncogene. If this were the case, the test need only take 24 hr instead of 6 mo. Thompson et al. [80] have reported a high concordance between the in vitro expression of a reporter gene linked to the zeta-globin promoter and results that are seen in the Tg.AC assays. This may serve as a useful prescreen for a Tg.AC assay. Finally, it is clear that transgenic mice will have enormous utility in the field of toxicology. However, it is the authors' view that as we learn more about how oncogenes are activated and how tumor suppressor genes are inactivated, and as the tools of molecular biology become more sophisticated, particularly gene expression systems, the need for transgenic mice in carcinogenicity testing will diminish. It is our opinion that in the future, carcinogenicity studies will be performed using wild-type animals and will be approximately 1 mo in duration. Methods will be available to detect rare mutations that would ultimately produce a tumor.
REFERENCES
1. U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition (CFSAN). 1993. Draft, Toxicological Principles for the Safety Assessment of Direct Food Additives and Color Additives Used in Food, Redbook II. Federal Register 58:16536, March 29, 1993.
2. U.S. Food and Drug Administration, Center for Drug Evaluation and Research (CDER). International Conference on Harmonisation; Guidance on Specific Aspects of Regulatory Genotoxicity Tests for Pharmaceuticals. Federal Register 60:18198-18202, April 24, 1996; and Federal Register 62:16026-16030, November 21, 1997.
3. U.S. Environmental Protection Agency, Office of Pesticides and Toxic Substances. Pesticide Registration Data Requirements. Federal Register 40:Part 158.
4. Organisation for Economic Cooperation and Development. 1996. OECD Guidelines for Testing of Chemicals. Section 4: Health Effects, Guidelines 471 through 485. Ninth addendum February 1998. Paris, France.
5. International Organization for Standardization. 1995. International Standard ISO 10993 Biological evaluation of medical devices, Part 1: Evaluation and Testing. Geneva, Switzerland.
6. Russell, L.B. and Major, M.H. 1957. Genetics 42:161-175.
7. Fahrig, R. 1977. In Chemical Mutagens, Principles and Methods for Their Detection, vol. 5 (A. Hollaender, ed.). New York: Plenum Press.
8. Albertini, R.J., Nicklas, J.A., and O'Neill, J.P. 1986. The HPRT mutant T-cell assays for human mutagenicity monitoring. Prog. Clin. Biol. Res. 209B:185-194.
9. Jones, I.M., Burkhart-Schultz, K., Carrano, A.V. 1987. A method to quantify spontaneous and in vivo induced thioguanine-resistant mouse lymphocytes. Mutat. Res. 147:97-105.
10. Aidoo, A., Lyn-Cook, L.E., Mittelstaedt, R.A., Heflich, R.H., and Casciano, D.A. 1991. Induction of 6-thioguanine-resistant lymphocytes in Fischer 344 rats following
384 Jacobson-Kram
and
Young
invivoexposuretoN-ethyl-N-nitrosoureaandcyclophosphamide.
EnviroIz. Mol.
Mutngert. 17141-151.
11. Driscoll, K.E., Deyo, L.C., Howard, B.W., Poynter, J., Carter, J.M. 1995. Characterizing mutagenesis in the hprt gene of rat alveolar epithelial cells. Exp. Lung Res. 211941 -956. 12. Winton, D.J., Blount, M.A.. and Ponder, B.A. 1988. A clonal marker induced by mutation in mouse intestinal epithelium. Nature 333:463-466. 13. Valcovic. L.R. and Malling, H.V. 1973.An approach to measuring germinal mutations in the mouse. Emiron. Health Pet-spect. 6:201-205. 14. Ehling, U. 1974. Dose-response relationship of specific locus mutations in mice. Arch. Toxicol. 32: 19-25. 15. Generoso, W.M., Russel, W.L., and Gosslee, D.G. 1973. A sequential procedure for the detection of translocation heterozygotes in male mice. Mutat. Res. 21:220-221. 16. Epstein, S.S., Arnold, E., Andrea, J., Bass, W., and Bishop, Y. 1972. Detection of chemical mutagens by the dominant lethal assayin the mouse. Toxicol. Appl. Plznrmncol. 23:288-325. 17. Gabridge. M.G. and Legator, M.S. 1969. A host-mediated microbial assay for the detection of mutagenic compounds. Proc. SOC.Exp. Biol. Med. 130:831-834. 18. Tindall, K.R., Stankowski Jr., L.F.. Machanoff, R., and Hsie, A.W. 1984. Detection of deletion mutations in pSV2gpt-transformed cells. Mol. Cell. Biol. 4:141 1-1415. 19. Seidman, M. 1989. The development of transient SV40 based shuttle vectors for mutagenesis studies. Mutat. Res. 220:55-60. 30. Tennant, R.W., Margolin,B.H., Shelby, M.D., Zeiger, E., Haseman, J.K., Spalding, J.. Caspary, W.. Resnick, M.A., Stasiewicz, S., Anderson, B., and Minor, R. 1987. Prediction of chemical carcinogenicity in rodents from in vitro genetic toxicology assays. Science 236:933-941. 21. Calos, M.P., Lebkowski, J.S., and Botchan, M.R. 1983. High mutation frequency in DNA transfected into mammalian cells. Proc. Not. Acad.Sci. USA 80:3015-3019. 22. Glazer, P.M., Sakar, S.N., and Summers, W.C. 1986. Detection and analysis of UVinduced mutations in mammalian cell DNA using a phage shuttle vector. Proc. Nat. A c u ~ Sci. 
. USA 83~1041-1044. 23. Malling, H.V. and Burkhart, J.G. 1989. Use of @X174 as a shuttle vector for the study of in vivo mammalian mutagenesis. Mutat. Res. 212:ll-21. 24. Burkhart. J.G. and Malling. H.V. 1989. Mutagenesis of0x174 and cs70 incorporated into the genome of mouse L-cells. Mutnt. Res. 213: 125-134. 25. J.G. Burkhart, Burkhart, B.A., Sampson, K.S., andMalling,H.V.1993.ENU-induced mutagenesis asa single A:T base pair in transgenic mice containing 0x174. Mutnt.Res. 292:69-81. 26. Gossen, J.A., De Leeuw, W.L.T., Zworthoff, E.C., Berends, F., Lohman, P.H.M., Knook, D.L., and Vijg, J. 1989. Efficient rescue of integrated shuttle vectors from transgenic mice: A model for studying mutationin vivo. Proc. Nnt. Acad. Sci. USA 861797 1-7975. 27. Kohler, S.W., Provost,G.S., Kretz, P.L., Dycaico, M.J., Sorge, J.A.. and Short, J.M. 1990. Development of a short-term, in vivo mutagenesis assay: The effects of methylation on the recovery of a lambda phage shuttle vector from transgenic mice. Nucleic Acids Res. 18:3007-3013.
28. Kohler, S.W., Provost, G.S., Fieck, A., Kretz, P.L., Bullock, W.O., Sorge, J.A., Putman, D.L., and Short, J.M. 1991. Spectra of spontaneous and mutagen-induced mutations in the lacI gene in transgenic mice. Proc. Natl. Acad. Sci. USA 88:7958-7962.
29. Kohler, S.W., Provost, G.S., Fieck, A., Kretz, P.L., Bullock, W.O., Putman, D.L., Sorge, J.A., and Short, J.M. 1991. Analysis of spontaneous and induced mutations in transgenic mice using a lambda ZAP/lacI shuttle vector. Environ. Mol. Mutagen. 18:316-321.
30. Dycaico, M.J., Provost, G.S., Kretz, P.L., Ransom, S.L., Moores, J.C., and Short, J.M. 1994. The use of shuttle vectors for mutation analysis in transgenic mice and rats. Mutat. Res. 307:461-478.
31. Lundberg, K.S., Kretz, P.L., Provost, G.S., and Short, J.M. 1993. The use of selection in recovery of transgenic targets for mutational analysis. Mutat. Res. 301:99-105.
32. Ushijima, T., Hosoya, Y., Suzuki, T., Sofuni, T., Sugimura, T., and Nagao, M. 1995. A rapid method for detection of mutations in the lacI gene using PCR-single strand conformation polymorphism analysis: Demonstration of its high sensitivity. Mutat. Res. 334:283-292.
33. Kretz, P.L., Reid, C.H., Greener, A., and Short, J.M. 1989. Effect of lambda packaging extract Mcr restriction activity on DNA cloning. Nucleic Acids Res. 17:5409.
34. Rogers, B.J., Provost, G.S., Young, R.R., Putman, D.L., and Short, J.M. 1995. Intralaboratory optimization and standardization of mutant screening conditions used for a lambda/lacI transgenic mouse mutagenesis assay (I). Mutat. Res. 327:57-66.
35. Young, R.R., Rogers, B.J., Provost, G.S., Short, J.M., and Putman, D.L. 1995. Interlaboratory comparison: Liver spontaneous mutant frequency from lambda/lacI transgenic mice (Big Blue) (II). Mutat. Res. 327:67-73.
36. Piegorsch, W.W., Lockhart, A.M., Margolin, B.H., Tindall, K.R., Gorelick, N.J., Short, J.M., Carr, G.J., Thompson, E.D., and Shelby, M.D. 1994. Sources of variability in data from a lacI transgenic mouse mutation assay. Environ. Mol. Mutagen. 23:17-31.
37. Piegorsch, W.W., Margolin, B.H., Shelby, M.D., Johnson, A., French, J.E., Tennant, R.W., and Tindall, K.R. 1995. Study design and sample sizes for a lacI transgenic mouse mutation assay. Environ. Mol. Mutagen. 25:231-245.
38. Carr, G.J. and Gorelick, N.J. 1995. Statistical design and analysis of mutation studies in transgenic mice. Environ. Mol. Mutagen. 25:246-255.
39. Gorelick, N.J. 1995. Overview of mutation assays in transgenic mice for routine testing. Environ. Mol. Mutagen. 25:218-230.
40. Heddle, J.A., Shaver-Walker, P., Tao, K.S., and Zhang, X.B. 1995. Treatment protocols for transgenic mutation assays in vivo. Mutagenesis 10:467-470.
41. Ashby, J. 1995. Transgenic germ cell mutation assays: A small collaborative study. Environ. Mol. Mutagen. 25:1-3.
42. Putman, D.L., Ritter, A.P., Carr, G.J., and Young, R.R. 1997. Evaluation of spontaneous and chemical-induced lacI mutations in germ cells from lambda/lacI transgenic mice. Mutat. Res. 338:137-143.
43. Tinwell, H., Lefevre, P.A., and Ashby, J. 1994. Mutation studies with dimethyl nitrosamine in young and old lacI transgenic mice. Mutat. Res. 307:501-508.
44. Zhang, X.B., Tao, K., Urlando, C., Shavers-Walker, P., and Heddle, J.A. 1996. Mutagenicity of high fat diets in the colon and small intestine of transgenic mice. Mutagenesis 11:43-48.
45. Jakubczak, J.L., Garges, S., French, J.E., Tennant, R.W., Muller, W., Adhya, S., and Merlino, G. 1996. Analysis of genomic stability during mammary tumor progression using a novel selection-based assay for in vivo mutations in a lambda mouse transgene target. Proc. Natl. Acad. Sci. USA 93:9073-9078.
46. Gossen, J.A., de Leeuw, W.J.F., and Vijg, J. 1994. LacZ transgenic mouse models: Their application in genetic toxicology. Mutat. Res. 307:451-459.
47. Gossen, J.A., de Leeuw, W.J.F., Molijn, A.C., and Vijg, J. 1993. A selective system for LacZ- phage using a galactose-sensitive E. coli host. BioTechniques 14:326-330.
48. Dean, S.W. and Myhr, B.C. 1994. Measurement of gene mutation in vivo using Muta™Mouse and positive selection for LacZ- phage. Mutagenesis 9:183-185.
49. Tinwell, H., Lefevre, P.A., and Ashby, J. 1994. Response of the Muta™Mouse lacZ/galE- transgenic mutation assay to DMN: Comparisons with the corresponding Big Blue® (lacI) responses. Mutat. Res. 307:169-173.
50. Morrison, V. and Ashby, J. 1994. A preliminary evaluation of the performance of the Muta™Mouse (lacZ) and Big Blue® (lacI) transgenic mouse mutation assays. Mutagenesis 9:367-375.
51. Tinwell, H., Liegibel, U., Krebs, O., Schmezer, P., Favor, J., and Ashby, J. 1995. Comparison of lacI and lacZ transgenic mouse mutation assays: An EU-sponsored interlaboratory study. Mutat. Res. 335:185-190.
52. Gossen, J.A., Martus, H.-J., Wei, J.Y., and Vijg, J. 1995. Spontaneous and X-ray-induced deletion mutations in a LacZ plasmid-based transgenic mouse model. Mutat. Res. 331:89-97.
53. Gondo, Y., Shioyama, Y., Nakao, K., and Katsuki, M. 1996. A novel positive detection system of in vivo mutations in rpsL (strA) transgenic mice. Mutat. Res. 360:1-14.
54. Ashby, J. and Tinwell, H. 1994. Use of transgenic mouse lacI/Z mutation assays in genetic toxicology. Mutagenesis 9:179-181.
55. Susuki, T., Hayashi, M., and Sofuni, T. 1994. Initial experiences and future directions for transgenic mouse mutation assays. Mutat. Res. 307:489-494.
56. Zeiger, E. and Tennant, R.W. 1996. New animals, new uses, and old issues. Environ. Mol. Mutagen. 28:3-4.
57. Leder, A., Pattengale, P.K., Kuo, A., Stewart, T.A., and Leder, P. 1986. Consequences of widespread deregulation of the c-myc gene in transgenic mice: Multiple neoplasms and normal development. Cell 45:485-495.
58. Muller, W.J., Sinn, E., Pattengale, P.K., Wallace, R., and Leder, P. 1988. Single-step induction of mammary adenocarcinoma in transgenic mice bearing the activated c-neu oncogene. Cell 54:105-115.
59. Stewart, T.A., Pattengale, P.K., and Leder, P. 1984. Spontaneous mammary adenocarcinomas in transgenic mice that carry and express MTV/myc fusion genes. Cell 38:627-637.
60. Sinn, E., Muller, W., Pattengale, P., Templer, I., Wallace, R., and Leder, P. Coexpression of MMTV/v-Ha-ras and MMTV/c-myc genes in transgenic mice: Synergistic action of oncogenes in vivo. Cell 49:465-475.
61. Cuypers, H.T., Selten, G., Quint, W., Zilstra, M., Robanus-Maandag, E., Boelens, W., van Wezenbeek, P., Melief, C., and Berns, A. 1984. Murine leukemia virus-induced T-cell lymphomagenesis: Integration of provirus in a distinct chromosomal region. Cell 37:141-150.
62. Stelten, G., Cuypers, H.T., and Berns, A. 1985. Proviral activation of the putative oncogene pim-1 in MuLV-induced T-cell lymphomas. EMBO J. 4:1793-1798.
63. Mucenski, M.L., Gilbert, D.J., Taylor, B.A., Jenkins, N.A., and Copeland, N.G. 1987. Common sites of viral integration in lymphocytes arising in AKXD recombinant inbred mouse strains. Oncogene Res. 2:33-48.
64. Van Lohuizen, M., Verbeek, S., Krimpenport, P., Radaszkiewicz, T., and Berns, A. 1989. Predisposition to lymphomagenesis in pim-1 transgenic mice: Cooperation with c-myc and N-myc in MuLV-induced tumors. Cell 56:673-692.
65. Breuer, M., Wientjens, E., Verbeek, S., Slebos, R., and Berns, A. 1991. Carcinogen-induced lymphomagenesis in pim-1 transgenic mice: Dose dependence and involvement of myc and ras. Cancer Res. 51:958-963.
66. Donehower, L.A., Harvey, M., Slagle, B.L., McArthur, M.J., Montgomery, C.A., Butel, J.S., and Bradley, A. 1992. Mice deficient for p53 are developmentally normal but susceptible to spontaneous tumors. Nature 356:215-221.
67. Harvey, M., McArthur, M.J., Montgomery, C.A., Butel, J.S., Bradley, A., and Donehower, L.A. 1993. Spontaneous and carcinogen-induced tumorigenesis in p53-deficient mice. Nat. Genet. 5:225-229.
68. Leder, A., Kuo, A., Cardiff, R.D., Sinn, E., and Leder, P. 1990. v-Ha-ras transgene abrogates the initiation step in mouse skin tumorigenesis: Effects of phorbol esters and retinoic acid. Proc. Natl. Acad. Sci. USA 87:9178-9182.
69. Spalding, J.W., Momma, J., Elwell, M.R., and Tennant, R.W. 1993. Chemically induced skin carcinogenesis in a transgenic mouse line (TG.AC) carrying a v-Ha-ras gene. Carcinogenesis 14:1335-1341.
70. Hasen, L. and Tennant, R.W. 1994. Focal transgene expression associated with papilloma development in v-Ha-ras transgenic TG.AC mice. Mol. Carcinogen. 9:143-156.
71. Saitoh, A., Kimura, M., Takahashi, R., Yokoyama, M., Nomua, T., Izawa, M., Sekiya, T., Nishimura, S., and Katsuki, M. 1990. Most tumors in transgenic mice with human c-Ha-ras gene contained somatically activated transgenes. Oncogene 5:1195-1200.
72. Tennant, R.W., French, J.E., and Spalding, J.W. 1995. Identifying chemical carcinogens and assessing potential risk in short term bioassays using transgenic mouse models. Environ. Health Perspect. 10:942-950.
73. ICH Steering Committee. 1997. ICH Harmonised Tripartite Guideline. Testing for Carcinogenicity of Pharmaceuticals. Recommended for adoption at Step 4 of the ICH process on 16 July 1997.
74. Tennant, R.W., Spalding, J., and French, J.E. 1996. Evaluation of transgenic mouse bioassays for identifying carcinogens and noncarcinogens. Mutat. Res. 365:119-127.
75. Eastin, W.C., Haseman, J.K., Mahler, J.F., and Bucher, J.R. 1998. The National Toxicology Program evaluation of genetically altered mice as predictive models for identifying carcinogens. Toxicol. Pathol. 26:461-473.
76. Robinson, D. 1998. The International Life Sciences Institute's role in the evaluation of alternative methodologies for the assessment of carcinogenic risk. Toxicol. Pathol. 26:474-475.
77. Yamamoto, S., Mitsumori, K., Kodama, Y., Matsunuma, N., Manabe, S., Okamiya, H., Suzuki, H., Fukuda, T., Sakamaki, Y., Sunaga, M., Nomura, G., Hioki, K., Wakana, S., Nomura, T., and Hayashi, Y. 1996. Rapid induction of more malignant tumors by various genotoxic carcinogens in transgenic mice harboring a human prototype c-Ha-ras gene than in control non-transgenic mice. Carcinogenesis 17:2455-2461.
78. Yamamoto, S., Urano, K., Koizumi, H., Wakana, S., Hoiki, K., Mitsumori, K., Kurokawa, Y., and Hayashi, Y. 1998. Validation of transgenic mice carrying the human prototype c-Ha-ras gene as a bioassay model for rapid carcinogenicity testing. Environ. Health Perspect. 106:57-69.
79. International Conference on Harmonization. 1995. Guideline on dose selection for carcinogenicity studies of pharmaceuticals. Federal Register 60:11278-11281.
80. Thompson, K.L., Rosenzweig, B.A., and Sistare, F.D. 1998. Use of an in vitro transgene-induction assay as a prescreen for testing the specificity of pharmaceuticals in the TG.AC mouse short-term carcinogenicity assay. Toxicol. Sci. 42:72.
13
Health Risk Assessment of Environmental Agents: Incorporation of Emerging Scientific Information
Vicki L. Dellarco, William H. Farland, and Jeanette A. Wiltse
U.S. Environmental Protection Agency, Washington, D.C.
I. INTRODUCTION
The objective of the U.S. Environmental Protection Agency's (EPA's) risk assessments is to support environmental decision making. Assessments of risks from environmental agents serve not only the regulatory programs of the EPA but also state and local agencies, as well as international communities that are addressing environmental issues. The ingredients of health risk assessment include information on whether a chemical produces adverse health effects, how the frequency of adverse effects changes with dose, and to what degree and under what conditions people may be exposed as pollutants travel in the environment. The primary sources of information for judging human risk are human epidemiological and animal toxicological studies, and other empirical information such as genotoxicity, structure-activity relationship, and exposure data. Risk assessments rely on studies in animals because human data are not usually available. The health-related information available on agents is typically incomplete. Moreover,
Disclaimer: Although this chapter has been subjected to review and approved for publication, it does not necessarily reflect the views and policies of the U.S. Environmental Protection Agency. The opinions expressed within this article reflect the views of the authors. The U.S. Government has the right to retain a nonexclusive royalty free license in and to any copyright covering this chapter.
health risk assessments on environmental agents must usually address the potential for harm from exposure levels found in the environment that are usually lower than concentrations at which toxicity is found in laboratory animal or epidemiological studies. Thus, the extrapolations that are required to project human risk (i.e., from high to low doses, from nonhuman species to human beings, from one route to another route of exposure) inherently introduce uncertainty. Given that extrapolations must be performed, risk assessment is complex and often controversial.
The EPA develops risk assessment guidelines to provide staff and decision makers with guidance and perspectives necessary to develop and use effective health risk assessments. Guidelines also encourage consistency in procedures to support decision making across the many EPA programs. The following lists the risk assessment guidelines that the EPA has published:
Carcinogenicity [1,2]*
Mutagenicity [3]
Developmental toxicity [4]
Reproductive toxicity [5]
Neurotoxicity [6]
Exposure [7]
Complex mixtures [8]
The EPA recently proposed new cancer risk assessment guidelines to bring current and relevant science into future assessments and to promote research that applies new knowledge to specific pollutants. There have been significant gains in our understanding of the cellular and subcellular processes that result in cancer, and these advances have enabled research on the ways environmental contaminants act on cells to cause cancer. These new guidelines will be discussed throughout this article as an illustration of how new science is impacting and improving the characterization of potential human risk.
Health risk assessment practices are evolving on a number of fronts (Table 1). Risk analyses have historically relied to a large degree on observations of frank toxic effects (e.g., tumors, malformations).
Risk assessments are moving from this phenomenological approach by identifying the ways environmental agents are changed through metabolic processes, the dose at the affected organ system, and how an agent produces its adverse effects at high doses and at low ones. This understanding of how an agent produces its toxic effect is beginning to break down the dichotomy that has existed between assessments of cancer and noncancer risks. Of equal importance, the “one-size-fits-all” approach is being replaced by emphasizing the ascertainment of risk to susceptible subpopulations.
* Since the preparation of this manuscript another draft revision of the EPA's cancer guidelines has been developed (see http://www.epa.gov/ncea/raf/car2sab/preamble).
Table 1 Current Trends in Health Risk Assessment
Historical approach | Emerging emphasis
Phenomenological studies | Mechanism studies
Separate assessments and approaches for cancer and noncancer risks | Integrative health assessments and harmonization of approaches for cancer and noncancer risks
Risk to general population | Risk to sensitive subpopulations
Single chemical exposure and single pathway | Multiple chemical exposure via multiple pathways
Risk characterization | More expanded characterizations of human risk
The EPA recently put forth a new national agenda to protect children from toxic agents in the environment [9]. In addition, to make risk assessments more understandable and useful, there is an increased emphasis on risk characterization. Risk characterization is the final output of the risk assessment process, in which all preceding analyses (i.e., from the hazard, dose-response, and exposure assessments) are tied together to convey in nontechnical terms the overall conclusions about potential human risk, as well as the rationale, strengths, and limitations of the conclusions. This chapter discusses several trends occurring in risk assessment in the context of the risk paradigm: hazard, dose-response, and exposure assessments and subsequent risk characterization (Figure 1). Chemical examples are provided to illustrate these emerging directions in health risk assessment.
II. EVOLUTION OF HAZARD ASSESSMENT
In its 1994 report about the use of science and judgment in risk assessment, the National Research Council of the National Academy of Sciences recommended that the EPA incorporate technical characterizations of risk that are both qualitative and quantitative in its assessments [10]. Thus, hazard identification as well as dose-response and exposure analyses are changing by the increased emphasis on providing characterization discussions. These technical characterizations essentially reveal the thought process that leads to the scientific judgments of potential human risk. The technical hazard characterization explains the extent and weight of evidence, major points of interpretation and rationale, and strengths and weaknesses of the evidence, and discusses alternative conclusions and uncertainties that deserve serious consideration. The technical hazard characterization along with those for the dose-response and exposure assessments are the starting
[Figure 1 diagram; legible panel labels: Hazard Assessment, Risk Characterization.]
Figure 1 The elements of the risk paradigm: health risk assessment is organized by the paradigm put forward by the National Academy of Sciences [10,79], which defines four types of analysis: hazard assessment, dose-response assessment, exposure, and risk characterization.
materials for the risk characterization process (see section V of this chapter) that completes the risk assessment. As shown in Figure 2, this concept of technical hazard characterizations has been incorporated into the EPA's revised Guidelines for Carcinogen Risk Assessment [2].
A. Expanding Role of Mechanistic Data
Hazard assessment is moving beyond relying on traditional toxicology by using a weight-of-evidence (WOE) approach that considers all relevant data and the mode of action of the given agent. It is the sum of the biology of the organism and the chemical properties of an agent that leads to an adverse effect. Thus, it is an evaluation of the entire range of data (e.g., physical, chemical, biological, toxicological, clinical, and epidemiological information) that allows one to arrive at a reasoned judgment of an agent's potential to cause human harm. For example, the EPA has proposed a major change in the way hazard evidence is weighed in reaching conclusions about the human carcinogenic potential of environmental agents [2,11]. Rather than relying heavily on tumor findings, the full use of all
[Figure 2 diagram; legible labels: Dose-Response Characterization, WOE Narrative and Descriptors, Integrative Analysis, Risk Characterization.]
Figure 2 The risk characterization process: the framework of the EPA 1996 Proposed Guidelines for Carcinogen Risk Assessment [2] is based on the paradigm put forth by the National Academy of Sciences [10]. This framework puts an emphasis on characterizations of hazard, dose-response, and exposure assessments. These technical characterizations integrate the analyses of hazard, dose-response, and exposure, explain the weight of evidence and strengths and weaknesses of the data, and discuss the issues and uncertainties surrounding the conclusions. The technical characterizations themselves are integrated into the overall conclusions of risk that are presented in a risk characterization summary. (From EPA 1996 and Wiltse and Dellarco 1996.)
relevant information is promoted and an understanding of how the agent induces tumors is emphasized. Under the proposed revisions to EPA's 1986 Guidelines for Carcinogen Risk Assessment [1], a short WOE narrative is derived from the longer technical hazard characterization. The WOE narrative is intended for risk managers and other users, and it replaces the current six alphanumeric classification categories: A, human carcinogen; B1/B2, probable human carcinogen; C, possible human carcinogen; D, not classifiable; and E, evidence of noncarcinogenicity. This narrative explains in nontechnical language the key data and conclusions, as well as the conditions for hazard expression. Conclusions about potential human carcinogenicity are presented by route of exposure. Contained within this narrative are simple likelihood descriptors that essentially distinguish whether there is enough evidence to make a projection about human hazard (i.e., known human carcinogen, should be treated as if known, likely to be a human carcinogen, or not likely to be a human carcinogen) or whether there is insufficient
evidence to make a projection (i.e., the cancer potential cannot be determined because evidence is lacking, conflicting, inadequate, or because there is some evidence but it is not sufficient to make a projection to humans). Because one encounters a variety of data sets on agents, these descriptors are not meant to stand alone; rather, the context of the WOE narrative is intended to provide a transparent explanation of the biological evidence and how the conclusions were derived. Moreover, these descriptors should not be viewed as classification categories (like the alphanumeric system), which often obscure key scientific differences among chemicals. The new WOE narrative also presents conclusions about how the agent induces tumors and the relevance of the mode of action to humans, and recommends a dose-response approach based on the mode-of-action understanding (see later section III.B). Some examples of how mechanistic information on chemicals has informed risk assessments or provided a better basis for interpreting the meaning of effects from animal data and their relevance to humans are given in the following subsections.
1. α2u Nephropathy and Kidney Cancer
The development of male rat kidney tumors mediated by α2u-globulin is one of the more thoroughly studied processes in cancer toxicology. Exposure to several agents, such as 2,2,4-trimethylpentane, unleaded gasoline, and d-limonene, has been reported to result in an accumulation of protein droplets containing α2u-globulin in the epithelial cells of the proximal convoluted tubules of male rat kidneys [12-15]. This protein accumulation is thought to result in renal cell injury and proliferation, and eventually renal tubule tumors. Female rats and other laboratory animals do not accumulate this protein in the kidney and, when exposed to α2u-globulin inducers, do not develop an increased incidence of renal tubule tumors. The manner in which the human male responds to such agents is uncertain. This mechanism appears to be specific to the rat given the results from studies of other laboratory species, and given the high doses that are needed to produce an effect in the male rat. In 1991, the EPA concluded that the sequence of events proposed to link α2u-globulin accumulation to nephropathy and renal tubule tumors in the male rat was plausible, although not totally proven; that the α2u-globulin response following chemical administration appears to be unique to the male rat; and that the male rat kidney response to chemicals that induce α2u-globulin is probably not relevant to humans for purposes of risk assessment [15]. However, when chemically induced α2u-globulin kidney tumors are present, other tumors in the male rat and any tumor in other exposed laboratory animals may be important in evaluating the carcinogenic potential of a given chemical. Some investigators think that the issue of α2u nephropathy and kidney cancer is not resolved and have proposed alternative hypotheses [16]. Should significant new information
on α2u-globulin kidney tumors become available, the EPA will update its policy position accordingly.
2. Perturbation of Pituitary-Thyroid Homeostasis and Thyroid Cancer
The ways in which antithyroid compounds induce thyroid tumors are also reasonably well understood, even though the precise molecular events leading to thyroid follicular cell tumors are not totally described. Experimental findings in rodents have shown that perturbation of hypothalamus-pituitary-thyroid homeostasis leads to elevated thyroid-stimulating hormone (TSH) levels, which in turn results in increased DNA synthesis and cell proliferation, and eventually in thyroid gland tumors [17-20]. Thus, thyroid tumors are secondary to a hormone imbalance. Agents with antithyroid activity include sulfamethazine and other thionamides. There is uncertainty whether prolonged stimulation of the human thyroid by TSH may lead to cancer. Because this possibility cannot be dismissed, it is presumed that chemicals that produce thyroid tumors in rodents may pose a carcinogenic risk to humans. Humans (including other primates) are thought to be substantially less sensitive than rats to this mechanism. One factor that may account for the interspecific difference in sensitivity concerns the influence of protein carriers of thyroid hormones in the blood. Rodent thyroid hormones are more susceptible to removal from the body because of the lack of a high-affinity binding protein, which humans possess [21]. In the rat, there is chronic stimulation of the thyroid gland by TSH to compensate for the increased turnover of thyroid hormones. This may render the rat more sensitive to disturbances in TSH levels. The EPA has recently proposed science policy guidance on the consideration of thyroid carcinogenesis in risk assessment [20].
Briefly, it is proposed that chemicals that produce rodent thyroid tumors should be presumed to pose a hazard to humans; evaluations of human thyroid cancer risk from long-term perturbations of pituitary-thyroid function in rodents should incorporate considerations about potential interspecific differences in sensitivity and evaluate the applicability of potential human exposure patterns in relation to the findings in animal models. Dose-response approaches should be based on mode-of-action information; application of nonlinear approaches is appropriate for those nonmutagenic chemicals shown to cause a hormonal imbalance. However, those antithyroid compounds with mutagenic activity need to be carefully evaluated on a case-by-case basis.
3. Bladder Calculi and Tumors
Another situation for which the rat appears to be quantitatively more sensitive than humans is the induction of bladder tumors secondary to bladder calculi-induced hyperplasia. Cohen and Ellwein [22] reported that if the administered
dose of a chemical (e.g., melamine, uracil, calcium oxalate, orotic acid, glycine) is below the level that causes calculus formation, there is no increase in cell proliferation; consequently, there is no increase in bladder tumors. Thus, calculus-forming compounds would have a threshold of response. The EPA has considered this in its assessment of melamine [23].
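The threshold idea described above can be illustrated with a minimal sketch. It contrasts a simple "hockey-stick" curve, in which no added response occurs at or below a threshold dose, with a linear no-threshold curve. The function names, the straight-line shape above the threshold, and all numerical values are assumptions made for illustration only; they are not taken from Cohen and Ellwein [22] or from any EPA assessment.

```python
def threshold_response(dose, threshold, slope):
    """Hypothetical hockey-stick curve: zero added response at or below
    the threshold dose, then a straight-line increase above it."""
    if dose <= threshold:
        return 0.0
    return slope * (dose - threshold)


def linear_no_threshold_response(dose, slope):
    """Hypothetical linear no-threshold curve: any dose above zero
    carries some added response."""
    return slope * dose


if __name__ == "__main__":
    # Illustrative values only (arbitrary dose units).
    threshold, slope = 2.0, 0.1
    for dose in (0.5, 1.0, 2.0, 4.0):
        print(dose,
              threshold_response(dose, threshold, slope),
              linear_no_threshold_response(dose, slope))
```

Under these assumed values, doses below the threshold give no added response in the first curve, whereas the linear curve assigns some response to every nonzero dose; this difference is why the choice of dose-response approach matters at environmental exposure levels.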
4. Formaldehyde and Nasal Tumors
The understanding of formaldehyde carcinogenicity has developed over a number of years since Kerns et al. [24] demonstrated that inhalation exposure to formaldehyde caused nasal squamous cell carcinomas in mice and rats. In 1991, the carcinogenicity of formaldehyde was reassessed using data from rats and monkeys: levels of DNA-protein cross-links (DPX) were evaluated with a linearized multistage (LMS) model [25]. Using DPX as a more precise measure of dose resulted in risk estimates that were significantly lower than those derived by using external exposure only. Although the mechanisms of formaldehyde carcinogenesis are not completely understood, data have continued to provide additional insight into the cancer risk associated with low-dose exposure to inhaled formaldehyde by defining more precisely the location of the nasal tumors in the rat, determining rates of cell proliferation in the nose, and establishing the delivered dose (i.e., levels of DPX) to the target tissue as well as rates of repair of DPXs after repeated exposures [26-29]. Precursor response data also may have implications in the estimation of risk to humans. In the rat, the dose-response relationships of induction of nasal tumors and of cell proliferation correspond and are both highly nonlinear [28]. The DPXs do not accumulate; and although the dose-response relationship is linear in the range of tumor induction and increased cell replication, the slope is greater than at lower dose ranges due to saturation of detoxification [26]. Although formaldehyde is a mutagenic carcinogen, the data on tumors, cellular kinetics, and molecular dosimetry indicate that the dose-response relationship is not linear throughout the entire range, but is subject to an upward curvature due to increased cell proliferation.
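For readers unfamiliar with the multistage model mentioned above, the following is a minimal sketch of its general form, P(d) = 1 - exp(-(q0 + q1 d + q2 d^2 + ...)) with all coefficients non-negative, together with the extra risk over background and the low-dose linear approximation that gives the "linearized" procedure its name. The coefficient values, the function names, and the upper-bound slope are hypothetical; they are not the values used in the formaldehyde assessment [25], and the statistical fitting that would produce the upper confidence limit on q1 is not shown.

```python
import math


def multistage_probability(dose, q):
    """Multistage model: P(d) = 1 - exp(-(q0 + q1*d + q2*d**2 + ...)).
    q is a sequence of non-negative coefficients [q0, q1, q2, ...]."""
    if any(c < 0 for c in q):
        raise ValueError("multistage coefficients must be non-negative")
    exponent = sum(c * dose ** i for i, c in enumerate(q))
    return 1.0 - math.exp(-exponent)


def extra_risk(dose, q):
    """Extra risk over background: (P(d) - P(0)) / (1 - P(0))."""
    p0 = multistage_probability(0.0, q)
    pd = multistage_probability(dose, q)
    return (pd - p0) / (1.0 - p0)


def low_dose_linear_risk(dose, q1_upper):
    """Low-dose approximation: risk ~ q1* x dose, where q1* is an
    upper confidence limit on the linear coefficient (assumed given)."""
    return q1_upper * dose


if __name__ == "__main__":
    # Hypothetical coefficients in arbitrary dose units.
    q = [0.1, 0.01, 0.002]
    for dose in (0.1, 1.0, 2.0):
        print(dose, extra_risk(dose, q), low_dose_linear_risk(dose, 0.02))
```

The dose argument can stand for either external exposure or a delivered-dose measure such as DPX; the text above notes that the choice of dose measure changed the resulting formaldehyde risk estimates, which this sketch does not attempt to reproduce.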
B. Conditions of Hazard Expression
As mentioned earlier, hazard assessment has expanded from simply identifying adverse effects to fuller technical characterizations of a particular hazard. One dimension critical to characterizing hazard potential is the concept of hazard expression (i.e., what are the circumstances under which a particular hazard is expressed?). For example, an agent may not carry the same hazard potential for different routes of exposure. Inhalation exposure to vinyl acetate (600 parts per million) produces statistically significant increases in nasal tumors in rats,
whereas no statistically significant increases in tumors are observed when the compound is ingested orally via drinking water [30,31]. Likewise, a compound's carcinogenicity may be dose limited. Although methylmercury has been shown to produce tumors in mice at high doses [32], it is unlikely to pose a hazard to humans at low doses. Conditions of hazard expression may not only involve exposure conditions (e.g., route, magnitude, or duration) but also may depend on biological and physiological processes. Studies on metabolism may provide pertinent data about the circumstances that affect hazard expression. The biotransformation of many chemicals to reactive compounds is dependent on the presence of certain metabolic pathways (e.g., oxidative pathways involving P450 cytochromes or conjugation pathways involving glutathione S-transferases). For example, 1,3-butadiene is carcinogenic in rats and mice, with mice being more sensitive to tumor induction than rats [33,34]. It is thought that the carcinogenic potential of 1,3-butadiene is dependent on metabolic activation to reactive metabolites, which interact with DNA. Consistent with this, metabolism of 1,3-butadiene to reactive epoxides is substantially greater in mice than in rats [35-37]. Although it has been reported that humans exposed to 1,3-butadiene show a higher incidence of chronic leukemia [38], the available metabolic studies suggest that humans may not be as highly susceptible as mice. Thus, metabolizing enzymes can account for different susceptibilities among species. Other biological factors that can result in differences in sensitivity include age, sex, or preexisting diseases. These factors, which may contribute to special sensitivity to a given agent, are discussed further in the following section.
C. Variation in Human Susceptibility
Certain individuals may be at an increased risk because their activity patterns increase their exposure or because their proximity to a source means higher exposures to environmental contaminants. Humans also may vary in their susceptibility to toxicity because of preexisting disease conditions or differences in age, gender, metabolism, or genetic makeup. For example, a number of studies have shown the role of carcinogen-metabolizing enzyme polymorphisms in cancer susceptibility (reviewed in [39]), of which the most convincing is the association of the GSTM1 homozygous genotype and the CYP1A1 rare alleles with lung cancer in Japanese [40,41]. Gene-environmental interactions have also been shown to be important to an elevated risk for developmental defects. For example, genetic variation of transforming growth factor-alpha and maternal smoking have been associated with increased risk for delivering infants with cleft lip or palate [42,43]. Human responses may vary due to environmental exposures during different periods of the life cycle. Exposures of the fetus or neonate may disrupt developing systems, thereby resulting in increased sensitivity. The EPA has considered in its risk assessments subgroups with a high sensitivity to environmental pollutants, as evinced by the National Ambient Air Quality Standards for air pollutants and lead. Two examples are discussed in the following subsections.
1. Methylmercury and Neurobehavioral Effects in Children
Mercury is ubiquitous and persistent in the environment. It is released from both natural (e.g., volcanoes, soils, wildfires) and industrial (e.g., coal combustion, mining, waste incineration) sources. A form of mercury that is particularly hazardous to humans is methylmercury. A primary pathway of human exposure is by consuming fish that have accumulated methylmercury. Microorganisms in the sediment of the earth's waters can convert mercury into methylmercury. It is well established that methylmercury is a neurotoxin [44]. The developing nervous system of the fetus is especially sensitive to the effects of methylmercury. Animal and human studies indicate that in utero exposure to methylmercury can potentially result in adverse neurobehavioral effects on children. To protect sensitive subpopulations (e.g., infants exposed pre- and postnatally), in 1995 EPA established a reference dose (i.e., a quantitative estimate of levels expected to be without effects) of 1 × 10⁻⁴ mg/kg/day based on available human studies in Iraq [45]. This study was based on 81 infant-mother pairs that had consumed seed grain that had been fumigated with methylmercury. The results of two recent epidemiological studies of fish-eating populations (one in the Seychelles Islands and the other in the Faeroe Islands) are anticipated to shed further light on the dose-response issues associated with the oral intake of methylmercury via contaminated food. It should be noted that, as with exposure to lead, the neurological effects associated with low exposures to methylmercury may be subtle and delayed, making them difficult to identify in young children. Lead is one of the best studied examples of prenatal exposure and its subsequent effects on cognitive and behavioral development of young children [46]. The EPA as well as other Federal agencies have published strategy documents in an effort to reduce children's exposure to lead [47-49].
2. Air Pollution and Respiratory Effects in the Elderly and Children
The elderly (65 years and older) make up another population susceptible to environmental pollution. For example, several morbidity (e.g., hospital admissions) and mortality studies provide evidence that the elderly (especially those with underlying respiratory or cardiac diseases) are more susceptible to the short- and long-term effects of particulate air pollution than are young healthy adults [50-53]. Particulate air pollution might aggravate the severity of preexisting chronic
respiratory or cardiac diseases. Approximately 40% of people over 75 years old have some form of heart disease, 35% have hypertension, and 10% have chronic obstructive pulmonary disease (e.g., asthma) [53]. Also, the elderly have had more cumulative exposure over their life span and hence more opportunity to accumulate particles or damage in their lungs. Although there is an association of short-term, low-level ambient exposure to particulate matter and excess mortality or morbidity among the elderly, the biological plausibility of these findings remains unclear. The few studies available also suggest that children, particularly those with preexisting respiratory diseases, may be more susceptible than the general population to the pulmonary effects of air pollution [53,54].
D. Integrative Analysis of Cancer and Noncancer Health Effects
In evaluating health risks posed by environmental agents, the EPA considers both cancer and noncancer effects. Some of the noncancer effects specifically considered are developmental and reproductive toxicity, neurotoxicity, immunotoxicity, and respiratory toxicity, as well as systemic organ toxicities. Historically, assessments have been done separately and very differently for cancer and noncancer health effects. An important direction in assessments of environmental agents is to provide more integrated characterizations of cancer and noncancer health effects. The dichotomy between cancer and noncancer is beginning to break down with a better understanding of the mechanisms of toxicity. Also, the quantitative approaches are merging, as discussed in section III below. The underlying basis for certain noncancer toxicities and cancer may have several commonalities. For example, chemically induced toxicity can cause cell death. Surviving cells may then compensate for that injury by increasing cell proliferation (hyperplasia), which may underlie many types of toxic responses. If this proliferative activity continues unchecked, it may result in tumors. Chemicals may modulate or alter gene expression via receptor interactions. Thus, receptor-mediated pathways may play a role in both carcinogenesis and other organ system toxicities. For example, 2,3,7,8-tetrachlorodibenzo-p-dioxin and dioxin-like compounds bind to the Ah receptor, which may represent the first step in a series of events leading to cellular and tissue changes in normal biological processes. Thus, dioxin (and dioxin-like compounds) may exert its carcinogenic, immunological, and reproductive effects via Ah receptor-dependent events [55-57]. The EPA is attempting to integrate human health and ecological risk assessments to ensure that decision makers at all levels have an integrated view of risk, which is essential to making sound decisions.
Human health and ecological assessments make use of similar data. For example, studies of piscivorous birds that have consumed methylmercury-contaminated fish show neurobehavioral effects similar to those of exposed human beings [58,59]. Concern has been raised in the news (e.g., Esquire and The New Yorker, January 1996) and among scientists about the accumulation in the environment of chemicals (e.g., pesticides like DDT/DDE and kepone, certain polychlorinated biphenyls) that may mimic natural sex hormones. There have been several reports suggesting a decline in sperm number in human males over the last 50 years [60], as well as effects on male reproduction in wildlife species (e.g., male alligators exposed to pesticides in Florida's Lake Apopka with reduced genitalia). For example, DDE (1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene), which was shown to cause reproductive failure (due to eggshell thinning) in birds over two decades ago, has been shown to inhibit androgen binding to the androgen receptor, which may account for its ability to alter male reproductive development [61]. Because wildlife species and domestic animals share the same environment with humans and are in the human food web, these nonhuman species serve as sentinels for potential human health risks posed by environmental contaminants (for review see [63]).
III. TRENDS IN DOSE-RESPONSE ASSESSMENT
Historically, dose-response assessment has been done very differently for cancer and noncancer health effects. For nearly two decades, the EPA has modeled tumor risk by a default approach based on the assumption of low-dose linearity. To estimate human cancer risk, the LMS model was applied, which extrapolates risk as the 95% upper-bound confidence interval [1,63,64]. The standard practice for noncancer health assessment has assumed the existence of a threshold for adverse effects. Acceptable exposures for chemicals causing noncancer effects have been estimated by applying uncertainty factors (UFs) to a determined no observed adverse effect level (NOAEL), which is the highest dose at which no adverse effects have been detected. If a NOAEL cannot be established, then a lowest observed adverse effect level (LOAEL) is determined for the critical effect. The UFs may be as much as 10 each and are intended to account for limitations in the available data, such as human variation, interspecific differences, lack of chronic data, or lack of certain other critical data. In the reference concentration (RfC) method, the composite UF for interspecific differences is 3 because of dosimetric adjustments [65,66]. The NOAEL (or LOAEL) is divided by UFs to establish a reference dose (RfD) for oral exposures or an RfC for inhalation exposures, which is an estimate (with uncertainty spanning perhaps an order of magnitude) of daily exposure (RfD) or continuous exposure (RfC) that is likely to be without an appreciable risk of deleterious effects during a lifetime [65-69]. The RfDs and RfCs are not derived using composite UFs greater than 10,000 and
3000, respectively. The NOAEL can be compared with the human exposure estimate to derive a margin of exposure.
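The arithmetic behind an RfD or RfC can be sketched as follows. This is a minimal illustration, assuming that the composite UF is simply the product of the individual factors; the function name, the dictionary of limits, and the numerical inputs are hypothetical and are not taken from any EPA assessment.

```python
from math import prod

# Upper limits on the composite uncertainty factor, as stated in the text.
MAX_COMPOSITE_UF = {"RfD": 10_000, "RfC": 3_000}


def reference_value(noael_or_loael, uncertainty_factors, kind="RfD"):
    """Divide a NOAEL (or LOAEL) by the product of the uncertainty factors.

    Assumption: the composite UF is the plain product of the individual UFs.
    """
    composite_uf = prod(uncertainty_factors)
    if composite_uf > MAX_COMPOSITE_UF[kind]:
        raise ValueError(f"composite UF {composite_uf} exceeds the {kind} limit")
    return noael_or_loael / composite_uf


# Hypothetical inputs: a NOAEL of 50 mg/kg/day, with a UF of 10 for human
# variation and a UF of 10 for interspecific differences.
rfd = reference_value(50.0, [10, 10], kind="RfD")
print(f"RfD = {rfd} mg/kg/day")
```

A composite UF above the stated limit raises an error rather than returning a value, mirroring the statement that RfDs and RfCs are not derived in that situation.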
A. Modeling in the Range of Observation for Both Cancer and Noncancer Risks
With recent proposals to model response data in the observable range to derive points of departure* both for cancer and noncancer end points [2,44], EPA health risk assessment practices are beginning to come together. The modeling of observed response data to identify points of departure in a standard way will help to harmonize cancer and noncancer dose-response approaches and permit comparisons of cancer and noncancer risk estimates.
1. Benchmark Dose Approach: Noncancer Assessment
The traditional NOAEL approach for noncancer risk assessment has often been a source of controversy and has been criticized in several ways. For example, experiments involving fewer animals tend to produce larger NOAELs and, as a consequence, may produce larger RfDs or RfCs. The reverse would seem more appropriate in a regulatory context because larger experiments should provide greater evidence of safety. The focus of the NOAEL approach is only on the one dose that is the NOAEL, and the NOAEL must be one of the experimental doses. Moreover, it also ignores the shape of the dose-response curve. Thus, the slope of the dose-response curve plays little role in determining acceptable exposures for human beings. These and other limitations prompted development of the alternative approach of applying uncertainty factors to a benchmark dose (BMD) rather than to a NOAEL [70]. Essentially, the BMD approach uses all of the experimental data to fit one or more dose-response curves for critical effects that are, in turn, used to estimate a BMD that is typically not far below the range of the observed data. The BMD approach provides a more objective basis for developing allowable human exposures across the different study designs encountered in noncancer risk assessment. The BMD is defined as a statistical lower confidence limit (CL) on the dose producing a predetermined level of change in adverse response (BMR) compared with the response in untreated animals [70]. The choice of the BMR is critical. For quantal end points, a particular level of response is chosen (1%, 5%, or 10%). For continuous end points, the BMR is the degree of change from controls and is based on what is considered a biologically significant change. The
* Point of departure is conceptually similar to benchmark dose, which has been used for noncancer assessment.
methods of CL calculation and choice of CL (90%, 95%) are also critical. The choice of extra risk vs. additional risk is based to some extent on assumptions about whether an agent is adding to the background risk. Extra risk is viewed as the default because it is more conservative. Several RfCs and an RfD based on the BMD approach are included in the EPA's Integrated Risk Information System (IRIS) Database.† These include methylmercury based on delayed postnatal development in humans, carbon disulfide based on neurotoxicity, 1,1,1,2-tetrafluoroethane based on testicular effects in rats, and antimony trioxide based on chronic pulmonary interstitial inflammation in female rats. It should be noted that the BMD approach is still under discussion and development (see [70-73] for further discussion).
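The distinction between extra risk and additional risk can be made concrete with a short sketch. The two definitions used here are the conventional ones and are not spelled out in the text above; the one-hit model, its parameter values, and all function names are assumptions chosen only for illustration. The dose computed is a central estimate for a 10% BMR; the BMD itself, being a lower confidence limit, would require fitting the model to data with a statistical procedure that is not shown.

```python
import math


def additional_risk(p_dose, p_background):
    """Absolute increase in response probability over background."""
    return p_dose - p_background


def extra_risk(p_dose, p_background):
    """Increase over background, scaled to the fraction not already responding."""
    return (p_dose - p_background) / (1.0 - p_background)


def one_hit_response(dose, background, slope):
    """Assumed quantal model: P(d) = p0 + (1 - p0) * (1 - exp(-slope * d))."""
    return background + (1.0 - background) * (1.0 - math.exp(-slope * dose))


def dose_at_extra_risk(bmr, slope):
    """Dose at which the one-hit model gives an extra risk equal to the BMR."""
    return -math.log(1.0 - bmr) / slope


# Illustrative parameters only; they are not fitted to any real data set.
p0, b = 0.05, 0.02
ed10 = dose_at_extra_risk(0.10, b)
p_at_ed10 = one_hit_response(ed10, p0, b)

print("extra risk at ED10:     ", extra_risk(p_at_ed10, p0))
print("additional risk at ED10:", additional_risk(p_at_ed10, p0))
```

With a nonzero background response, extra risk at a given dose is never smaller than additional risk, so a fixed BMR is reached at a lower dose. That is the sense in which extra risk is the more conservative choice.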
2. Two-step Process for Cancer Dose-Response Assessment
The EPA recently proposed to replace its method of extrapolating low-dose cancer risk by applying the LMS procedure. Instead, it would apply a two-step process that distinguishes between what is known (i.e., the observed range of data) and what is not known (i.e., the range of extrapolation) [2,11]. Thus, the first step involves modeling response data in the empirical range of observation (Figure 3). The proposed guidelines indicate a preference for modeling with a biologically based [74] or case-specific model. Because the parameters of these models require extensive data, it is anticipated that the necessary data to support these models will not be available for most chemicals and that modeling in the observed range will probably be done most often with an empirical curve-fitting approach. A point of departure is determined from this modeling. A standard point of departure was proposed (subject to public comment) as the lower 95% CL on a dose associated with 10% extra risk (LED10). Other points of departure may be appropriate (e.g., if a response is observed below an increase in response of 10%). The objective is to determine the lowest reliable part of the dose-response curve for the beginning of the second step of the process, the extrapolation range (discussed in the next section). For some data sets (e.g., certain continuous data), estimating a LOAEL or NOAEL may be more suitable than determining a point of departure.
† IRIS can be accessed via the Internet at http://www.epa.gov/ngispgm3/IRIS/index.html, or call (513) 569-7254 for more information.
B. The Range of Extrapolation for Cancer Risk
The second step involves extrapolation below the range of observation. As mentioned earlier, a biologically based or case-specific model is preferred for extrapolating low-dose risk. If the available data do not permit such approaches, the
[Figure 3: dose-response plot. Recoverable labels: Observed Range, Range of Extrapolation, Human Exposure of Interest, 0 extra risk, LED10, ED10, MoE, Dose.]
Figure 3 Dose-response assessment: the current trend for dose-response assessment of cancer and noncancer end points is to begin with modeling response data in the observable range [2,70]. In the EPA 1996 Proposed Guidelines for Carcinogen Risk Assessment [2], dose-response assessment is proposed as a two-step process; in the first step, response data are modeled in the range of observation, and in the second step, the point of departure below the range of observation is determined. The LED10 (effective dose corresponding to the lower 95% limit on a dose associated with 10% increase in response) is proposed as a point of departure for extrapolation to the origin as the linear default or for a margin of exposure analysis as the nonlinear default. (From EPA 1996 and Wiltse and Dellarco 1996.)
proposed guidelines provide for several default extrapolation approaches (linear, nonlinear, or both), which begin with the point of departure. The default extrapolation approach that is taken should be based on the mode-of-action understanding about the agent. As discussed earlier, the understanding of the underlying biological mechanisms as they vary from species to species, from high dose to low dose, and from one route of exposure to another drives the choice of the most appropriate extrapolation approach. Thus, in the new guidelines, the dose-response extrapolation procedure follows conclusions about mode of action in the hazard assessment. The term mode of action is deliberately chosen in these new guidelines in lieu of mechanism to indicate using knowledge that is sufficient
to draw a reasonable working conclusion without having to know the processes in detail, as the term mechanism might imply. Although an induced adverse effect may result from a complex and diverse process, a risk assessment must operationally dissect the presumed critical events, at least those that can be measured experimentally, to derive a reasonable approximation of human risk.
1. Default Extrapolation Approaches
The LMS procedure of the 1986 guidelines [1] for extrapolating risk from upper-bound confidence intervals is no longer recommended as the linear default in the 1996 proposed guidelines [2]. The linear default in the new guidelines is a straight-line extrapolation to the origin (i.e., zero dose, zero extra risk) from the point of departure (i.e., the LED10) identified in the range of observed data (see Figure 3). The new linear default approach does not imply unfounded sophistication as extrapolation with the LMS procedure does. The linear default approach would be considered for agents that directly affect growth control at the DNA level (e.g., carcinogens that directly interact with DNA and produce mutations). There might be modes of action other than DNA reactivity that are better supported by the assumption of linearity. When inadequate or no information exists to explain the carcinogenic mode of action of an agent, the linear default approach would be used as a science policy choice in the interest of public health. Likewise, a linear default would be used if evidence demonstrates the lack of support for linearity (e.g., lack of direct DNA reactivity and mutagenicity) and there is also an absence of sufficient information on another mode of action to explain the induced tumor response. The latter is also a public health protective policy choice. Although the understanding of the mechanisms of induced carcinogenesis likely will never be complete for most agents, there are situations for which evidence is sufficient to support an assumption of nonlinearity. Because it is experimentally difficult to distinguish modes of action with true "thresholds" from others with a nonlinear dose-response relationship, the proposed nonlinear default procedure is considered a practical approach to use without the necessity of distinguishing sources of nonlinearity.
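The linear default described above reduces to simple arithmetic: the slope of the straight line is the extra risk at the point of departure divided by the dose at that point. The following sketch assumes a point of departure at 10% extra risk; the function names and the numerical values are hypothetical.

```python
def linear_default_slope(led10, extra_risk_at_pod=0.10):
    """Slope of the straight line from the origin to the point of departure."""
    if led10 <= 0:
        raise ValueError("the point of departure must be a positive dose")
    return extra_risk_at_pod / led10


def linear_default_risk(dose, led10, extra_risk_at_pod=0.10):
    """Extra risk at a dose below the point of departure, by straight-line extrapolation."""
    return linear_default_slope(led10, extra_risk_at_pod) * dose


# Hypothetical values: an LED10 of 2.0 mg/kg/day and a human exposure
# of 0.001 mg/kg/day.
slope = linear_default_slope(2.0)
risk = linear_default_risk(0.001, led10=2.0)
print(f"slope = {slope:.3g} per mg/kg/day, extra risk = {risk:.3g}")
```

Because the line passes through the origin, zero dose gives zero extra risk, and risk scales in direct proportion to exposure everywhere below the point of departure.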
In the 1996 proposed cancer guidelines [2], the nonlinear default approach begins at the identified point of departure and provides a margin-of-exposure (MoE) analysis rather than estimating the probability of effects at low doses (see Figure 3). The MoE analysis is used to compare the point of departure with the human exposure levels of interest. The MoE is the point of departure divided by the environmental exposure of interest. The key objective of the MoE analysis is to describe for the risk manager how rapidly responses may decline with dose. A shallow slope suggests less reduction than a steep one. The steepness of the slope of the dose-response curve is also an important consideration in noncancer risk assessments applying the BMD approach. Information on factors such as the nature of response being used for point of departure (i.e., tumor data or a more sensitive precursor response) and
biopersistence of the agent are important to consider in the MoE analysis. As a default assumption for two of these points, a numerical factor of no less than 10 each may be used to account for human variability and for interspecific differences in sensitivity when humans may be more sensitive than animals. When humans are found to be less sensitive than animals, a default factor of no smaller than 0.1 may be used to account for this. A nonlinear default position must be consistent with the understanding of the agent's mode of action in causing tumors. For example, a nonlinear default approach would be taken for an agent causing tumors as a secondary consequence of organ toxicity or induced physiological disturbances (e.g., antithyroid agents that perturb pituitary-thyroid homeostasis, as discussed earlier). Because there must be a sufficient understanding of the agent's mode of action to take the nonlinear default position, it is anticipated that the modeling of precursor responses to tumor development will play an important role in providing support for nonlinearity, or modeling may actually be used instead of tumor data for determining the point of departure for the MoE analysis (see section III.C below). There may be situations for which it is appropriate to consider both linear and nonlinear default procedures. For example, an agent may produce tumors at multiple sites by different mechanisms. In another case, for example, when it is apparent that an agent is both DNA reactive and highly active as a promoter at higher doses, both linear and nonlinear default procedures may be used to distinguish between the events operative at different portions of the dose-response curve and to consider the contribution of both phenomena.
For example, formaldehyde, which was discussed earlier, is DNA reactive at low doses and active as a promoter at higher doses (i.e., concentrations of formaldehyde that cause cytotoxicity and increased cell proliferation are also carcinogenic in the nose). There may be situations for which there are insufficient data to provide high confidence in a conclusion about any single mode of action of a given agent and for which different mechanisms may be operating at the different sites of tumor induction. Although the available data generally support nonlinearity, a linear mechanism (e.g., a mutagenic metabolite for one of the tumor sites) cannot be entirely dismissed. Both defaults are conducted and a discussion of the degree of confidence in each is provided to the risk manager. The linear default may be viewed as conservative (i.e., likely to overestimate the risk at low exposures), and it might be more appropriate for screening analyses. The nonlinear default may be viewed as more representative of the risk given the growth-promoting potential and toxicity of the given agent.
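The MoE calculation and the default factors mentioned above can be laid side by side in a short sketch. Setting the MoE against the product of the default factors is an illustrative assumption made here, not a decision rule stated in the proposed guidelines; the function names and numerical inputs are likewise hypothetical.

```python
def margin_of_exposure(point_of_departure, exposure):
    """Point of departure divided by the exposure of interest (same units)."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return point_of_departure / exposure


def composite_default_factor(human_variability=10.0, interspecies=10.0):
    """Product of the two default factors described in the text."""
    return human_variability * interspecies


# Hypothetical inputs: an LED10 of 2.0 mg/kg/day and an environmental
# exposure of 0.004 mg/kg/day.
moe = margin_of_exposure(2.0, 0.004)

# Humans assumed at least as sensitive as animals: 10 x 10.
factor_default = composite_default_factor()
# Humans found to be less sensitive than animals: 10 x 0.1.
factor_less_sensitive = composite_default_factor(interspecies=0.1)

for label, factor in [("default", factor_default),
                      ("humans less sensitive", factor_less_sensitive)]:
    print(f"{label}: MoE = {moe:.0f}, composite factor = {factor:g}, "
          f"MoE / factor = {moe / factor:g}")
```

The sketch reports the numbers only. Judging whether a given MoE is adequate also depends on the slope of the dose-response curve, the nature of the response, and the biopersistence of the agent, as the text notes.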
C. Modeling of Precursor Response Data
The proposed EPA cancer guidelines [2] call for modeling of not only tumor data in the observable range but also other responses thought to be important precursor events in the carcinogenic process (e.g., DNA adducts, gene or chromosomal
mutation, cellular proliferation, hyperplasia, hormonal or physiological disturbances, receptor binding). The modeling of important precursor response data makes extrapolation based on default procedures, discussed earlier, more meaningful by providing insights into the relationships of exposure and tumor response below the observable range. In addition, modeling of nontumor data may provide support for selecting a certain extrapolation procedure (linear vs. nonlinear). If the nontumor end point is believed to be part of a continuum that leads to tumors, such data could then be used to extend the dose-response curve below the observed tumor response to provide insight into the low-dose response range. For example, studies using DNA adducts can be conducted with doses overlapping with the observed tumors down to environmental exposure levels. Several studies have demonstrated the merit of examining the relationship between DNA adduct concentration and tumor incidence for more accurate low-dose extrapolations (reviewed in [75]). However, when using DNA adducts (as a dosimeter) to extend the observable range, it is important to have a reasonable understanding of the target cell and the adduct involved in the carcinogenic process. In addition, changes in cell proliferation rates can cause a steep upward curvature of the dose-response curve, and thus need to be factored into the evaluation of risk. The role of cell proliferation in changing the cancer dose-response curve has been shown for 2-acetylaminofluorene for bladder tumors [76] and for formaldehyde for nasal tumors [28]. Precursor response data may be modeled and used for extrapolation instead of the available tumor data. Currently, it is not anticipated that precursor response data will be used in lieu of tumor data for many compounds because of the more stringent conditions that must be demonstrated.
To be acceptable for extrapolation, the mode of action and the role the precursor event plays in the carcinogenic process must be understood. Furthermore, the precursor response should be considered to be more informative of the agent's carcinogenic risk. Precursor data should be from in vivo experiments and from repeat dosing experiments over an extended period of time; precursor data are most valuable if they are built into the design of the cancer bioassay. It is anticipated that the modeling of precursor response data will come into play predominantly for the nonlinear default approach, which must be based on a reasonable understanding of the agent's mode of action in causing tumors. The most likely situations for which precursor response data are used to estimate risk involve those mechanisms for which tumor development is secondary to toxicity or disruption of a physiological process. For example, hyperplasia might be used in lieu of tumor data to extrapolate risk for a bladder carcinogen that causes calculi to form in the urine, or TSH levels might be used for a thyroid carcinogen that perturbs hypothalamus-pituitary-thyroid homeostasis. Alterations in TSH or thyroid hormone levels may result in other disease consequences. Early responses in the continuum of events that lead to organ pathology or resultant diseases, such as liver enzyme changes and
liver histopathology, respiratory irritation, and respiratory tract damage, have been a consideration in noncancer risk assessment [66]. Thus, the consideration of precursor response data in health risk assessment is not a new concept.
IV. EMERGING DIRECTIONS IN EXPOSURE ASSESSMENT
Exposure is defined as the contact of a chemical, physical, or biological agent with the outer boundary of an organism [7]. Application of exposure data to the field of risk assessment has grown in importance since the early 1970s because of greater public, academic, industrial, and government awareness of chemical pollution problems in the environment. In environmental health assessment, one attempts to address the question of how many people are exposed to a pollutant and to how much. Information about the distribution of exposure to determine the causes of exposures for high-risk groups is a key element in the development of cost-effective mitigation strategies. In addition, information is needed on body burden and related factors in the general population to provide a baseline for interpreting the public health significance of measured exposures from site- or source-specific investigations. For example, body burden levels of environmental pollutants can put people near the linear part of the dose-response curve, even for a dose-response curve that is nonlinear. A current trend in health risk assessment is to assess cumulative total exposures and risks to multiple environmental agents, through multiple pathways and routes. People are exposed to many chemicals via different pathways during their lives. Multichemical exposures are ubiquitous (e.g., air and soil pollution from municipal incinerators, leakage from hazardous waste facilities and uncontrolled waste sites, drinking water containing chemical substances formed during disinfection). Because of the difficulties in assessing multiple exposures, assessments have tended to focus on a single chemical and often on a single pathway of exposure. Little is known about whether exposure to one chemical or class of chemicals is correlated with exposure to other chemicals; and even less is known about the combined risks associated with multiple exposures.
Thus, risk assessments of mixtures usually involve substantial uncertainties. A common risk assessment practice is to evaluate toxicological properties of the components of a mixture and assume that similar effects are additive. However, some research indicates that toxicological interactions among chemicals can be antagonistic or synergistic. Pharmacokinetic studies or newer technologies using transgenic animals (fish or rodents) may make studies of mixtures (e.g., binary, tertiary, or quaternary combinations of chemicals) more practical than traditional toxicology animal bioassays. Moreover, research using in vitro or in vivo eukaryotic models of the combined effects of mixtures of environmental contaminants on elements of cell cycle control (including growth, death, and differentiation) may provide
insight into combined riskof chemicals representativeof mixtures that are found in environmental media.
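The additivity assumption described above can be sketched as a simple screening calculation. The example below is a minimal illustration, assuming a hazard-index style approach in which each component's exposure is divided by a reference level and the resulting quotients are summed. The component names, exposure values, and reference doses are hypothetical placeholders, not data from this chapter, and the sketch deliberately ignores the antagonistic or synergistic interactions noted in the text.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One chemical in a mixture (all values are hypothetical)."""
    name: str
    exposure: float        # estimated daily intake, mg/kg-day
    reference_dose: float  # reference level in the same units


def hazard_quotient(component: Component) -> float:
    """Ratio of exposure to reference level for a single chemical."""
    if component.reference_dose <= 0:
        raise ValueError("reference_dose must be positive")
    return component.exposure / component.reference_dose


def hazard_index(components: list[Component]) -> float:
    """Sum of hazard quotients, i.e. the simple additivity assumption.

    Only meaningful when the components are judged to act on a similar
    effect; interactions between chemicals are not modeled.
    """
    return sum(hazard_quotient(c) for c in components)


# Hypothetical three-chemical mixture used only to exercise the functions.
mixture = [
    Component("chemical A", exposure=0.002, reference_dose=0.01),
    Component("chemical B", exposure=0.03, reference_dose=0.1),
    Component("chemical C", exposure=0.0005, reference_dose=0.005),
]

for c in mixture:
    print(f"{c.name}: hazard quotient = {hazard_quotient(c):.2f}")
print(f"hazard index (additive) = {hazard_index(mixture):.2f}")
```

In this sketch no single component exceeds its reference level, yet the summed index is larger than any individual quotient, which is the practical reason single-chemical assessments can understate combined exposure.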
V. EMPHASIS ON RISK CHARACTERIZATION
Risk assessment is an integrative process that culminates in a risk characterization summary. Risk characterization is the final step of the risk assessment process, in which all preceding analyses (from hazard assessments to dose-response assessments to exposure assessments) are tied together to convey the overall conclusions about potential human risk. This component of the risk assessment process characterizes the data in nontechnical terms, explaining the key issues and conclusions of each component of the risk assessment and the strengths and weaknesses of the data. Risk characterization is the product of risk assessment that is used in risk management decisions. The current emphasis on risk characterization is illustrated by recent publications by the EPA and the National Academy of Sciences/National Research Council [77,78].
VI. SUMMARY
Compared with traditional approaches to health risk assessment, ongoing activities to assess the risk of environmental agents include a more complete discussion of the issues and an evaluation of all relevant information. They also promote the use of mode-of-action information to reduce the uncertainties associated with using experimental data to characterize and project how human beings will respond to certain exposure conditions. This emphasis on mechanisms is intended to promote research and testing that improve the scientific basis of health risk assessment and to stimulate thinking on how such information can be applied. As the science continues to evolve, the practice and policies of risk assessment will reflect these advances.
REFERENCES
1. U.S. Environmental Protection Agency, Guidelines for carcinogen risk assessment, Fed. Reg. 51(185):33992-34003 (1986).
2. U.S. Environmental Protection Agency, Proposed guidelines for carcinogen risk assessment, Fed. Reg. 61(79):17960-18011 (1996).
3. U.S. Environmental Protection Agency, Guidelines for mutagenicity risk assessment, Fed. Reg. 51(185):34006-34012 (1986).
4. U.S. Environmental Protection Agency, Guidelines for developmental toxicity risk assessment (notice), Fed. Reg. 56(234):63798-63826 (1992).
5. U.S. Environmental Protection Agency, Reproductive toxicity risk assessment guidelines, Fed. Reg. 61(212):56274-56322 (1996).
6. U.S. Environmental Protection Agency, Proposed guidelines for neurotoxicity risk assessment, Fed. Reg. 60(192):52032-52056 (1995).
7. U.S. Environmental Protection Agency, Guidelines for exposure assessment (notice), Fed. Reg. 57(104):22888-22896 (1991).
8. U.S. Environmental Protection Agency, Guidelines for the health risk assessment of chemical mixtures, Fed. Reg. 51(185):34014-34025 (1986).
9. U.S. Environmental Protection Agency, Environmental Health Threats to Children, EPA 175-F-96-001, Office of the Administrator, U.S. Environmental Protection Agency, Washington, DC (1996).
10. National Research Council, Science and Judgment in Risk Assessment, Committee on Risk Assessment of Hazardous Air Pollutants, Commission on Life Sciences, National Research Council, National Academy Press, Washington, DC (1994).
11. J. Wiltse and V.L. Dellarco, The U.S. Environmental Protection Agency guidelines for carcinogen risk assessment: Past and future, Mutat. Res. 365:3-16 (1996).
12. J.A. Swenberg, B. Short, S. Borghoff, J. Strasser, and M. Charbonneau, The comparative pathobiology of α2u-globulin nephropathy, Toxicol. Appl. Pharmacol. 97:35-46 (1989).
13. L.D. Lehman-McKeeman, M.I. Rivera-Torres, and D. Caudill, Lysosomal degradation of α2u-globulin and α2u-globulin-xenobiotic conjugates, Toxicol. Appl. Pharmacol. 103:539-548 (1990).
14. D.R. Dietrich and J.A. Swenberg, The presence of α2u-globulin is necessary for d-limonene promotion of male rat kidney tumors, Cancer Res. 51:3512-3521 (1991).
15. U.S. Environmental Protection Agency, Alpha 2u-Globulin: Association with Chemically Induced Renal Toxicity and Neoplasia in the Male Rat, Risk Assessment Forum, Washington, DC, EPA/625/3-91/019F (1991).
16. R.L. Melnick, M.C. Kohn, and C.J. Portier, Implications for risk assessment of suggested nongenotoxic mechanisms of chemical carcinogenesis, Environ. Health Perspect. 104:123-134 (1996).
17. R.N. Hill, L.S. Erdreich, O.E. Paynter, et al., Review: Thyroid follicular cell carcinogenesis, Fundam. Appl. Toxicol. 12:629-697 (1989).
18. R.M. McClain, The significance of hepatic microsomal enzyme induction and altered thyroid function in rats: Implications for thyroid gland neoplasia, Toxicol. Pathol. 17:294-306 (1989).
19. R.M. McClain, Thyroid gland neoplasia: Non-genotoxic mechanisms, Toxicol. Lett. 64/65:397-408 (1992).
20. U.S. Environmental Protection Agency, Draft Risk Assessment Forum Thyroid Cancer Policy Report, U.S. Environmental Protection Agency, Washington, DC (1996).
21. R.M. McClain, The use of mechanistic data in cancer risk assessment: Case example-sulfonamides, in: Low-Dose Extrapolation of Cancer Risks: Issues and Perspectives, S. Olin, W. Farland, C. Park, L. Rhomberg, R. Scheuplein, T. Starr, and J. Wilson (eds.), ILSI Press, Washington, DC, pp. 163-173 (1995).
22. S.M. Cohen and L.B. Ellwein, Genetic errors, cell proliferation and carcinogenesis, Cancer Res. 51:6493-6505 (1991).
23. U.S. Environmental Protection Agency, Melamine: toxic chemical release reporting, Fed. Reg. 53:23128-23132 (1988).
24. W.D. Kerns, K.L. Pavkov, D.J. Donofrio, E.J. Gralla, and J.A. Swenberg, Carcinogenicity of formaldehyde in rats and mice after long-term inhalation exposure, Cancer Res. 43:4382-4392 (1983).
25. O. Hernandez, L. Rhomberg, K. Hogan, C. Siegel-Scott, D. Lai, G. Grindstaff, M. Henry, and J.A. Cotruvo, Risk assessment of formaldehyde, J. Hazard. Mat. 39:161-172 (1994).
26. M. Casanova, K.T. Morgan, W.H. Steinhagen, J.I. Everitt, J.A. Popp, and H.d’A. Heck, Covalent binding of inhaled formaldehyde to DNA in the respiratory tract of rhesus monkeys: Pharmacokinetics, rat-to-monkey interspecies scaling, and extrapolation to man, Fundam. Appl. Toxicol. 17:409-428 (1991).
27. M. Casanova, K.T. Morgan, E.A. Gross, O.R. Moss, and H.d’A. Heck, DNA-protein cross-links and cell replication at specific sites in the nose of F344 rats exposed subchronically to formaldehyde, Fundam. Appl. Toxicol. 23:525-536 (1994).
28. H.d’A. Heck, M. Casanova, and T.B. Starr, Formaldehyde toxicity-new understanding, Crit. Rev. Toxicol. 20:397-426 (1990).
29. T.M. Monticello, J.A. Swenberg, E.A. Gross, J.R. Leininger, J.S. Kimbell, S. Seilkop, T.B. Starr, J.E. Gibson, and K.T. Morgan, Correlation of regional and nonlinear formaldehyde-induced nasal cancer with proliferating populations of cells, Cancer Res. 56:1012-1022 (1996).
30. M.S. Bogdanffy, H.C. Dreef-van der Meulen, R.B. Beems, V.J. Feron, R.W. Rickard, T.R. Tyler, and T.C. Cascieri, Chronic toxicity and oncogenicity inhalation study with vinyl acetate in the rat and mouse, Fundam. Appl. Toxicol. 23:215-229 (1994).
31. M.S. Bogdanffy, T.R. Tyler, M.B. Vinegar, R.W. Capanini, and T.C. Cascieri, Chronic toxicity and oncogenicity study with vinyl acetate in the rat: In utero exposure in drinking water, Fundam. Appl. Toxicol. 23:206-214 (1994).
32. B.R. Blakley, Enhancement of urethane-induced adenoma formation in Swiss mice exposed to methyl mercury, Can. J. Comp. Med. 4299-302 (1984).
33. International Agency for Research on Cancer, Occupational Exposure to Mists and Vapors from Strong Inorganic Acids and Other Industrial Chemicals, Lyon, France, Vol. 54:237-287 (1992).
34. National Toxicology Program, Toxicology and carcinogenesis studies of 1,3-butadiene in B6C3F1 mice (inhalation), U.S. Department of Health and Human Services, Public Health Service, National Institutes of Health, NTP TR 434, NIH publication number 92-3165 (1991).
35. A.R. Dahl, W.E. Bechtold, and J.A. Bond, Species difference in the metabolism and disposition of inhaled 1,3-butadiene and isoprene, Environ. Health Perspect. 86:65-69 (1990).
36. A.R. Dahl, D.J. Sun, L.S. Birnbaum, et al., Toxicokinetics of inhaled 1,3-butadiene in monkeys: Comparison to toxicokinetics in rats and mice, Toxicol. Appl. Pharmacol. 110:9-19 (1991).
37. G.A. Csanady, F.P. Guengerich, and J.A. Bond, Comparison of the biotransformation of 1,3-butadiene and its metabolite, butadiene monoepoxide, by hepatic and pulmonary tissue from humans, rats, and mice, Carcinogenesis 13:1143-1153 (1992).
38. E. Delzell, N. Sathiakumar, A. Macaluso, et al., A follow up study of synthetic rubber workers, submitted to the International Institute of Synthetic Rubber Producers, Department of Epidemiology, University of Alabama at Birmingham, Oct. 2, 1995.
39. F.J. Gonzalez, The role of carcinogen-metabolizing enzymes polymorphisms in cancer susceptibility, Reprod. Toxicol. 11:397-412 (1997).
40. M. Kihara and K. Noda, Risk of smoking for squamous and small cell carcinomas of the lung modulated by combinations of CYP1A1 and GSTM1 gene polymorphisms in a Japanese population, Carcinogenesis 16:2331-2336 (1995).
41. K. Nakachi, K. Imai, S. Hayashi, and K. Kawajiri, Polymorphisms of the CYP1A1 and glutathione S-transferase genes associated with susceptibility to lung cancer in relation to cigarette dose in a Japanese population, Cancer Res. 53:2994-2999 (1993).
42. M.J. Khoury, M. Gomez-Farias, and J. Mulinare, Does maternal cigarette smoking during pregnancy cause cleft lip and palate in offspring? Am. J. Dis. Child. 143:333-337 (1989).
43. H.H. Ardinger, K.H. Buetow, and G.I. Bell, Association of genetic variation of the transforming growth factor-alpha gene with cleft lip and palate, Am. J. Hum. Genet. 45:348-353 (1989).
44. World Health Organization, Environmental Health Criteria 101: Methylmercury, World Health Organization, Geneva, Switzerland (1990).
45. U.S. Environmental Protection Agency, Integrated Risk Information System (IRIS): Online, RfD/RfC Document for Methylmercury, National Center for Environmental Assessment, Washington, DC (1995).
46. R.A. Goyer, Results of lead research: Prenatal exposure and neurological consequences, Environ. Health Perspect. 104:1050-1054 (1996).
47. U.S. Environmental Protection Agency, Proposed regulation, Fed. Reg. 59(170):45871 (1994).
48. U.S. Department of Housing and Urban Development, Guidelines for the Evaluation and Control of Lead-Based Paint Hazards in Housing, HUD-1539-LBP, U.S. Department of Housing and Urban Development, Washington, DC (1995).
49. Center for Disease Control, Strategic Plan for the Elimination of Childhood Lead Poisoning, U.S. Department of Health and Human Services, Washington, DC (1991).
50. J. Schwartz, D. Slater, T.V. Larson, W.E. Pierson, and J.Q. Koenig, Particulate air pollution and hospital emergency room visits for asthma in Seattle, Am. Rev. Respir. Dis. 147:826-831 (1993).
51. P.H.N. Saldiva, C.A. Pope III, J. Schwartz, et al., Air pollution and mortality in elderly people: A time-series study in Sao Paulo, Brazil, Arch. Environ. Health 50:159-163 (1995).
52. B. Ostro, J.M. Sanchez, C. Aranda, and G.S. Eskeland, Air pollution and mortality: Results from a study of Santiago, Chile (papers from the ISEA-ISEE annual meeting,
September 1994, Research Triangle Park, NC, M. Lippman, ed.), J. Exposure Anal. Environ. Epidemiol. (in press).
53. U.S. Environmental Protection Agency, Air Quality Criteria for Particulate Matter, EPA/600/P-95/001bF, Vol. 2, Office of Research and Development, Washington, DC (1996).
54. U.S. Environmental Protection Agency, Air Quality Criteria for Ozone and Related Photochemical Oxidants, EPA/600/P-93/004aF, Vol. 3, Office of Research and Development, Washington, DC (1996).
55. N.I. Kerkvliet, Immunotoxicology of dioxin and related chemicals, in: Dioxins and Health (A. Schecter, ed.), Plenum Press, New York, pp. 199-218 (1994).
56. H.M. Theobald and R.E. Peterson, Developmental and reproductive toxicity of dioxin and other Ah receptor agonists, in: Dioxins and Health (A. Schecter, ed.), Plenum Press, New York, pp. 309-335 (1994).
57. U.S. Environmental Protection Agency, Health Assessment Document for 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) and Related Compounds: External Review Draft, EPA/600/BP-92/001c, Vol. 3, Office of Research and Development, Washington, DC (1994).
58. A.M. Scheuhammer, Effects of acidification on the availability of toxic metals and calcium to wild birds and mammals, Environ. Pollut. 71:329-375 (1991).
59. A.M. Scheuhammer and P.J. Blancher, Potential risk to common loons (Gavia immer) from methylmercury exposure in acidified lakes, Hydrobiologia 279-289:445-455 (1994).
60. E. Carlsen, A. Giwercman, N. Keiding, and N.E. Skakkebaek, Evidence for decreasing quality of semen during past 50 years, Br. Med. J. 305:609-613 (1992).
61. W.R. Kelce, C.R. Stone, S.C. Laws, L.E. Gray, et al., Persistent DDT metabolite p,p'-DDE is a potent androgen receptor antagonist, Nature 375:581-585 (1995).
62. National Research Council, Animals as Sentinels of Environmental Health Hazards, Committee on Animals as Monitors of Environmental Hazards, Board on Environmental Studies and Toxicology, Commission on Life Sciences, National Research Council, National Academy Press, Washington, DC (1991).
63. K.S. Crump, An improved procedure for low-dose carcinogenic risk assessment from animal data, J. Environ. Pathol. Toxicol. 5:675 (1981).
64. D. Krewski, D.W. Gaylor, and W.K. Lutz, Additivity to background and linear extrapolation, in: Low-Dose Extrapolation of Cancer Risks: Issues and Perspectives, S. Olin, W. Farland, C. Park, L. Rhomberg, R. Scheuplein, T. Starr, and J. Wilson (eds.), ILSI Press, Washington, DC, pp. 105-121 (1995).
65. A.M. Jarabek, Interspecies extrapolation based on mechanistic determinants of chemical disposition, J. Hum. Ecol. Risk Assess. 1(5):641-662 (1995).
66. U.S. Environmental Protection Agency, Methods for Derivation of Inhalation Reference Concentrations and Application of Inhalation Dosimetry, EPA/600/8-90/066F, Office of Research and Development, Washington, DC (1994).
67. D.G. Barnes and M.L. Dourson, Reference dose (RfD): Description and use in health risk assessments, Reg. Toxicol. Pharmacol. 8:471-488 (1988).
68. A.M. Jarabek, M.G. Menache, J.H. Overton, M.L. Dourson, and F.J. Miller, The U.S. Environmental Protection Agency's inhalation RfD methodology: Risk assessment for air toxics, Toxicol. Ind. Health 6:279-301 (1990).
69. A.M. Jarabek, The application of dosimetry models to identify key processes and parameters for default dose-response assessment approaches, Toxicol. Lett. 79:171-184 (1995).
70. U.S. Environmental Protection Agency, The Use of the Benchmark Dose Approach in Health Risk Assessment, EPA/630/R-94/007, Office of Research and Development, Washington, DC (1995).
71. B.C. Allen, P.L. Strong, C.J. Price, S.A. Hubbard, and G.P. Daston, Benchmark dose analysis of developmental toxicity in rats exposed to boric acid, Fundam. Appl. Toxicol. 32:194-204 (1996).
72. D.G. Barnes, G.P. Daston, J.S. Evans, A.M. Jarabek, R.J. Kavlock, C.A. Kimmel, C. Park, and H.L. Spitzer, Benchmark dose workshop: Criteria for use of a benchmark dose to estimate a reference dose, Reg. Toxicol. Pharmacol. 21:296-306 (1995).
73. R.J. Kavlock, J.E. Schmid, and R.W. Setzer Jr., A simulation study of the influences of study design on the estimation of benchmark doses for developmental toxicity, Risk Anal. 16:391-403 (1996).
74. C. Chen and W. Farland, Incorporating cell proliferation in quantitative cancer risk assessment: approaches, issues, and uncertainties, in: Chemically Induced Cell Proliferation: Implications for Risk Assessment, B. Butterworth, T. Slaga, W. Farland, and M. McClain (eds.), Wiley-Liss, New York, pp. 481-499 (1991).
75. D.K. La and J.A. Swenberg, DNA adducts: Biological markers of exposure and potential applications to risk assessment, Mutat. Res. 365:129-146 (1996).
76. S.M. Cohen and L.B. Ellwein, Proliferative and genotoxic cellular effects in 2-acetylaminofluorene bladder and liver carcinogenesis: Biological modeling of the ED01 study, Toxicol. Appl. Pharmacol. 104:79-93 (1990).
77. U.S. Environmental Protection Agency, Policy for Risk Characterization, memorandum of Carol M. Browner, Administrator, March 21, 1995, Washington, DC (1995).
78. National Research Council, Understanding Risk: Informing Decisions in a Democratic Society, Committee on Risk Characterization, Commission on Behavioral and Social Sciences and Education, National Academy Press, Washington, DC (1996).
79. National Research Council, Risk Assessment in the Federal Government: Managing the Process, Committee on the Institutional Means for Assessment of Risks to Public Health, Commission on Life Sciences, NRC, National Academy Press, Washington, DC (1983).
Index
Abortion, 241
Acclimation period, 3, 10, 20, 110, 122, 210
Acetaminophen, 327
2-Acetylaminofluorene, 180, 181
Acetylglucosamidase (see Enzymes)
Acidosis, 330
Acrylamide, 281
Acute toxicity studies
  dermal irritation studies, 26-29
  eye irritation studies, 22-25
  ratings used in, 22
  species used in, 2
  timing of, 30
Adrenal gland
  adrenocortical insufficiency, 330
  dysfunction, secondary effects of, 59
  hyperadrenocorticism, 327, 330
  as a target organ, 59
  weight change in, 67
Aflatoxin, 304
Agrochemicals
  inhalation studies for, 107
  genetic toxicology studies for, 129-131, 362
  reproductive and developmental toxicity studies for, 200-203
  subchronic/chronic studies for, 34
  toxicokinetics for, 74
Alanine aminotransferase (see Enzymes)
Alkaline elution studies, 184
Alkaline phosphatase (see Enzymes)
Alkalosis, 330
Allopurinol, 327
Ames test (see Bacterial mutation assays)
9-Aminoacridine, 138
2-Aminoanthracene, 138
Ammonium sulfide, 217
Amnion, 241
Amphetamine, 280
Ampullary glands, 231
Animal numbers
  in acute studies, 21, 23, 27
  in inhalation studies, 119
  in neurotoxicity studies, 259, 267-268
  in reproductive and developmental toxicity studies, 210
  in subchronic/chronic studies, 40-41
  in toxicokinetics, 76, 78-79
  in transgenic studies, 382
p-Anisidine, 378, 379, 381
Anorexia (see Food consumption)
Aroclor 1254, 137, 152, 164
Asbestos, 292
Aspartate aminotransferase (see Enzymes)
Bacterial mutation assays, 131-147
  cell type and cell strain selection for, 132
  controls used in, 137-138
  for detection of base-pair substitution mutations, 131
  for detection of DNA cross-linking, 131
  for detection of forward mutations, 131
  for detection of frame-shift mutations, 131
  for detection of reverse mutations, 131
  dosage selection for, 135-136
  use of E. coli in, 131, 134, 137
  end points evaluated in, 139-140
  interpretation of data from, 141-144
  metabolic activation systems for, 136-137, 144, 145, 364
  screening assays for, 146-147
  study design of, 135-140, 146
  use of S. typhimurium, 131, 134, 137, 144
Behavior (see Neurotoxicity studies)
Benchmark dose, 401-402
Benzene, 303, 304, 378, 379
Benzethonium chloride, 379
Benzo[a]pyrene, 152, 165, 304
o-Benzyl-p-chlorophenol, 379
Bile acids (see Clinical chemistry)
Bilirubin (see Clinical chemistry)
Bioavailability (see Toxicokinetics)
Blood sampling, 14-15, 44, 47, 316-318
  anesthetics use during, 317
  anticoagulant use in, 318
  preparation of plasma samples from, 326
  recommended sample volumes, 316
  sampling sites for
    cardiac puncture, 14
    jugular, 14
    lateral tail vein, 14
    marginal ear vein, 14
    posterior vena cava, 14
    retro-orbital plexus, 14
Blood urea nitrogen (see Clinical chemistry)
Body weights
  in acute studies, 21
  in carcinogenicity studies, 37
  as used in dosage selection, 43
  in inhalation studies, 119
  interpretation of changes in, 61, 224-225
  in neurotoxicity studies, 260-261, 278
  in reproductive and developmental toxicity studies, 215, 217, 219, 224-225, 269
  in subchronic/chronic studies, 36, 45
Bone
  osteomalacia, 327
  rickets, 327
Bulbourethral glands, 231
1,3-Butadiene, 397
Caging, 3, 8-12
  gang housing, 8, 45
  size of, 9-11
Calcium (see Electrolytes)
Cannibalism, 8, 45, 219, 233, 241
Carbon tetrachloride, 297, 304
Carcinogenesis
  differences in species susceptibility, 397
  examples in risk assessment
    bladder cancers associated with calculi, 395-396
    nasal tumors and formaldehyde, 396
    renal cancers associated with nephropathy, 394-395
    thyroid cancers associated with pituitary-thyroid homeostasis, 395
  role of route of exposure, 397
Carcinogenicity studies (see also Transgenic animal testing)
  definition of, 33
  interpretation of data and risk assessment from, 68, 392-396
  study design of, 37
Cardiovascular assessment
  electrocardiography, 46-47, 55, 61
  interpretation of findings, 61-62
  in subchronic/chronic studies, 46-47
Cardiovascular system
  atherosclerosis, 127
  clinical signs associated with dysfunction, 60, 64, 65
  congestive heart failure, 57, 63
  embolism, 59
  heart weight, 67
  hypertension, 59
  hypotension, 59
  myocardial necrosis, 57
  serum chemistry changes associated with dysfunction, 63, 328, 330
  as a target organ, 59
  thrombosis, 59
  toxin induced functional changes, 59
Cesarean section (see Reproductive and developmental toxicity studies)
Chlordane, 304
2-Chloroethanol, 379
1-Chloro-2-propanol, 379
Chlorothiazide, 292
Chromosome Aberration Assays (in vitro), 156-168, 363
  cell selection for, 158
  for detection of aneuploidy, 157
  for detection of chromosome breakage, 157, 164
  for detection of chromosome rearrangements, 157, 160
  for detection of polyploidy, 157
  endpoints measured in, 166
  experimental design of, 159-166
  interpretation of data from, 167-168
  metabolic activation systems in, 163, 164
  use of Chinese hamster cells in, 158
  use of lymphocytes in, 157
Chromosome Aberration Assay (in vivo), 172-176
  experimental design of, 173-174, 175-176
  interpretation of data from, 174-175, 176
  use of bone marrow cells for, 173-174
  use of spermatogonia for, 175-176
Chronic studies, 33, 36, 43-44
Clinical (plasma) chemistry assessment (see also Electrolytes and Enzymes)
  as associated with target organ toxicity, 57
  endpoints in
    albumin, 63, 66, 331
    bile acids, 57, 58
    bilirubin, 57, 58, 327, 328-329
    blood urea nitrogen, 57, 58, 62, 66, 331
    creatinine, 58, 62, 66, 329
    glucose, 58, 62, 66, 330-331
    lipids, 65
    serum proteins, 331
  in inhalation studies, 119
  interpretation of findings from, 62-67
  quality control in data collection and analyses for, 333-335
  selection of parameters for, 326
  in subchronic/chronic studies, 36, 46
Clinical signs/observation
  in acute studies, 21
  in inhalation studies, 110
  interpretation of findings, 60, 223-224
  in neurotoxicity studies, 277-278
  in reproductive and developmental toxicity studies, 215, 223-224
  in subchronic/chronic studies, 44-45
Clitoral gland, 232
Clofibrate, 327
Coagulation (see Hematology assessment)
Coccidiosis, 6, 224
Conception index (see Fertility index)
Congenital birth defects
  dose-response patterns for, 232-233
  fetal examination methods for, 217-218, 219
  interpretation of data, 235-236
  malformations, 209, 217, 234, 235, 244
  variations, 217, 235, 236, 242
Contract laboratories, 104-107, 346-347, 349
Copulation
  calculation of copulatory index, 226, 227
  interval of, 216, 241
Corpora lutea, 209, 216, 217, 223, 232, 234, 241
Creatinine (see Clinical chemistry)
p-Cresidine, 378, 379
Culling, litter, 219, 233, 242, 267-268
Cyclophosphamide, 164, 171, 304
Cyclosporin A, 379
Cytochrome P450, 137
Data collection systems, 357-358
Data/specimen storage, 356
Dermal irritation studies
  alternatives to, 28
  examination methods in, 27-28
  scoring systems for, 28
Diaminotoluene, 380
Dideoxyinosine, 297
Diethanolamine, 379
Diethylstilbestrol, 379
Differentials (see Hematology assessment)
7,12-Dimethylbenz[a]anthracene, 152, 180, 304
Dimethylnitrosamine, 180, 181, 304
Dimethylvinyl chloride, 379
Dosage selection
  for acute studies, 21, 23, 27
  for bacterial mutation assays, 135
  for carcinogenicity/transgenic studies, 43, 378, 382
  for immunotoxicity studies, 295-296
  for in vitro chromosome aberration assays, 159-164
  for in vivo chromosome aberration assays, 174
  limit dose, 20, 42, 108, 211, 260
  for mammalian cell mutation assays, 150-151
  for micronucleus assay, 170
  for neurotoxicity studies, 260, 268
  for reproductive and developmental toxicity studies, 211-213
  for subchronic/chronic studies, 41-43
  for toxicokinetics, 80, 91
  for unscheduled DNA synthesis assay, 179-180
Dose response, 54, 223, 232-233, 400-407
Edema, 21, 60, 302
EEC Directives
  genetic toxicology guidelines, 130
  neurotoxicity study guidelines, 257
  subchronic/chronic study guidelines, 34
  toxicokinetics guidelines, 74, 81
Ejaculation, dysfunction of
  urinary spermatozoa, 67
Electrolytes
  bicarbonate, 63
  calcium, 63
  chloride, 63, 329, 330
  etiology of changes in, 329-330
  magnesium, 63
  phosphate, 63
  potassium, 329, 330
  secondary effect of changes in electrolytes, 63
  sodium, 63, 329
Electron microscopy, 52, 340-341
Embryo-fetal death
  dose-response patterns of, 232-233
  postimplantation loss, 227, 233, 234
  preimplantation loss, 227, 233, 234
  resorption, 217, 233, 245
  stillbirth, 219, 234
Environmental Protection Agency (TOSCA, FIFRA)
  good laboratory practice regulations, 345
  guidelines for
    acute studies, 29, 30
    genetic toxicology study guidelines for, 129-130
    inhalation studies, 108, 110, 120, 123
    neurotoxicity studies, 255-256, 257, 258, 263, 264, 265, 268, 275, 309
    reproductive and developmental toxicity studies, 199-200, 203, 265
    subchronic/chronic studies, 34, 35, 41
    toxicokinetics, 74, 76, 81
  risk assessment of environmental agents
    development of approaches to, 390-392
    extrapolation of low dose, 402-405
    guidelines for, 390
    modeling of precursor response data, 405-407
    role of mechanistic data, 392-396
    variation in population susceptibility, 397-399
Enzymes
  alanine aminotransferase, 56, 57, 63, 64, 327-328
  alkaline phosphatase, 57, 58, 63, 64, 120, 326-327
  amylase, 65
  aspartate aminotransferase, 56, 57, 63, 64, 327-328
  creatinine kinase, 63, 328
  gamma-glutamyltransferase, 57, 58, 65
  glutamate dehydrogenase, 63
  lactate dehydrogenase, 63, 64, 120, 328
  N-acetylglucosamidase, 58, 63, 120
  sorbitol dehydrogenase, 56, 57, 63
Epididymides
  histopathology of, 231
  weight of, 220, 230
Erythema, 28, 302
Estrogen (estradiol), 209, 226, 228, 242
Estrous cycle, 209, 210, 215-216, 242-243
  amenorrhea/acyclic, 225, 241
  hormonal control of, 226
  interpretation of data from, 225-226
  oligomenorrhea, 244
  prolonged diestrus, 215, 225-226
  prolonged estrus, 215, 225-226
Ethanol, 304, 330
Ethyl acrylate, 379
Ethyl methanesulfonate, 152, 171
Ethyl nitrosourea, 171
Euthanasia, 15, 49, 218
Eye (see also Ophthalmoscopic examinations)
  cataracts, 60
  lacrymation, 60
  as a target system, 60
Eye irritation studies
  alternative to, 24-25
  examination methods for, 23
  scoring systems for, 24
Fecundity
  calculation of fecundity index, 226, 227
  definition of, 243
Federal Animal Welfare Act, 1
Fertility
  calculation of fertility index, 226, 227
  definition of, 243
  rodent vs human in risk assessment, 228
FIFRA (see Environmental Protection Agency)
Follicle stimulating hormone (FSH), 209, 228, 243
Food additives
  mutagenicity study guidelines for, 362
  reproductive and developmental toxicity study guidelines for, 198
  safety factors used for, 98
  subchronic/chronic study guidelines for, 34
  toxicokinetics guidelines for, 74
Food and Drug Administration
  genetic toxicology study guidelines for, 128-129
  good laboratory practice regulations, 345
  inhalation study guidelines for, 108
  neurotoxicity study guidelines for, 256, 257, 265, 308
  pregnancy categories, 240
  reproductive and developmental toxicity study guidelines for, 196-203, 265
  subchronic/chronic study guidelines for, 34, 41
  toxicokinetics guidelines for, 74
Food consumption
  anorexia, 62, 63, 66, 67, 224
  in carcinogenicity studies, 37, 40
  fasting, 63, 66, 333
  food efficiency index for, 224
  in inhalation studies, 119
  interpretation of changes in, 61, 224-225
  in neurotoxicity studies, 260-261, 278
  in reproductive and developmental toxicity studies, 215, 224-225, 269
  restricted diet, 8, 40
  in subchronic/chronic studies, 36, 45
Functional birth defects, 232, 236-237
Gamma-Glutamyltransferase (see Enzymes)
Gastrointestinal system
  adverse vehicle effects on, 38
  diarrhea, 6, 60, 66, 330
  emesis (vomiting), 66, 330, 333
  stool changes in, 60
  as a target organ, 60
  ulceration, 60
Genetic toxicology testing
  bacterial mutation assays in, 131-147, 364
  biochemical-specific locus test in, 363
  chromosome aberration assays in, 156-168
  cytogenetic assay systems in, 168-176
  dominant lethal assay in, 364
  guidelines for test batteries, 127-131
  heritable translocation assay in, 364
  history of assay development, 362-364
  HPRT-based assays in, 363
  mammalian cell mutation assays in, 147-156
  spot assay in, 363
  unscheduled DNA synthesis assay in, 177-184
Gestation (see Pregnancy)
Glucose (see Clinical chemistry and Urinalysis)
Glycidol, 379
Gonadotropin-releasing hormones (GnRH), 228, 229
Good Laboratory Practice Regulations (GLPs), 37, 106, 114, 204, 345-359
Gradient Plate Assay, 146
Growth retardation
  dose-response patterns of, 232-233
  influence of litter size on, 234
  interpretation of data for, 234-235
  runts, 234, 245
  skeletal ossification, relationship to, 217, 235, 236
Halothane, 327
Hematology assessment
  in carcinogenicity studies, 37
  cellular parameters
    bone marrow differential, 323-324
    hematocrit, 319
    hemoglobin, 320-321
    mean corpuscular hemoglobin, 321
    mean corpuscular volume, 320-321
    methemoglobin concentration, 322
    peripheral blood differential, 322-323
    red blood cell count, 320
    reticulocyte count, 322
    white blood cell count, 318-319, 323
  coagulation parameters, 324-325
    factors, 324-325
    mean platelet volume, 322
    partial thromboplastin time, 324
    platelet count, 321, 325
    prothrombin time, 324
    thrombin time, 335
  in inhalation studies, 119
  interpretation of findings from, 62, 317-319
  selection of parameters to be measured in, 317-318
  in subchronic/chronic studies, 36, 46
Hematopoietic system
  anemia, 59, 62, 319, 320
  erythroblastemia, 62
  hemolytic hyperbilirubinemia, 329
  leukocytosis, 330
  polycythemia, 62
  reticulocytosis, 62
  as a target organ, 59-60
  thrombocytopenia, 325
  thrombocytosis, 325, 330
Hemolysis, 13, 60, 64, 330
Histopathology evaluation
  BRDU labelling in, 339
  fixation of tissues for, 51-52, 336-337
  in inhalation studies, 119-120
  interpretation of findings from, 67-68, 338-339
  methodology of slide preparation for, 337-338
  PCNA assay in, 339
  selection of tissues for, 335
  stains used in, 52
  in subchronic/chronic studies, 51-52
Historical control, 6, 7, 8, 13, 55, 67, 68, 138, 204, 222, 235, 259
Hormones, as a measure in reproduction studies, 228-229
Husbandry
  of dogs, 10-11
  facility requirements for, 352
  of guinea pigs, 9
  of primates, 11
  of rabbits, 10
  of rodents, 8-9
Hydralazine, 292
Hydramnios, 244
8-Hydroxyquinoline, 380, 381
Identification, animal, 12, 211
Immune system
  antibodies in, 291, 296, 302
  antigens in, 296, 297, 298, 302
  autoimmune disease, 292-293
    hemolytic anemia, 292
    systemic lupus erythematosus, 292
    thrombocytopenia, 292
  complement components in, 291
  cytokine action on, 291, 297, 299
  hypersensitivity of, 292
  lymphocyte function in immunotoxicity, 291, 296, 297, 298, 299, 302
  macrophage function in, 291, 297, 300-301
  natural killer cells, 300-301
  polymorphonuclear leukocytes in, 291, 302
  suppression of, 291-292, 307
Immunohistochemistry, 340
Immunotoxicity studies
  antigen-specific antibody response assays, 297-298
  cell-mediated immunity, 294, 298-300
  humoral immunity tests, 294, 296-298
  hypersensitivity responses in, 302-303
  immunoglobulins, 297-298, 302
  mixed lymphocyte response, 299
  tier testing strategies in, 293-295
Implantation, 204, 209, 213, 217
Industrial chemicals
  genetic toxicology study guidelines for, 129-131, 362
  inhalation study guidelines for, 107-108
  neurotoxicity study guidelines for, 255
  subchronic/chronic study guidelines for, 34
  toxicokinetic guidelines for, 74
Infertility, 228
Inhalation toxicology studies
  confounding factors in, 105, 121, 122
  exposure period for, 108-109, 110-111
  interpretation of data from, 123-124
  monitoring and characterization of exposure in, 105, 109, 114-117
  particle deposition in, 103, 105
  study designs for, 107-108, 119
  test atmospheres and generation of, 103, 104-105, 111-114, 122-123
International Conference on Harmonization (ICH)
  genetic toxicology study guidelines for, 128, 129, 134, 136, 144, 165
  neurotoxicity study guidelines for, 257
  reproductive and developmental study guidelines for, 196-197, 210, 213, 216, 220
  subchronic/chronic study guidelines for, 33, 34, 35
  toxicokinetics guidelines for, 74, 81
Isoniazid, 292
Japanese guidelines
  for the Ministry of Agriculture, Forestry and Fisheries (MAFF), 131, 201, 203
[Japanese guidelines] for the Ministry of Health and Welfare (MHW), 34, 130, 256-257 for the Ministry of International Trade and Industry (MITI), 34, 130 for the Ministry of Labor (MOL), 34, 131 Kepone, 400 Kidney changes in clinical chemistry in renal injury, 57, 63, 66, 328, 329, 331, 332 changes in urine in renal injury, 66 glomerular filtration rate, 329 pitted, 6 pyelonephritis, 328, 330 secondary effects associated with dysfunction, 63, 330 as a target organ, 57, 58-59 tubular hyper- and hypoplasia, 58 tubular necrosis, 58, 328 tumors of, 59 weight changes of, 67 Lactation, 213, 214, 233, 234 Lauric acid diethanolamine, 379 LC50, 108, 109-110 LD50, 21-22 Limit dose (see Dosage selection) Liver system cholestasis, 56, 65, 327, 329 cirrhosis, 56 clinical chemistry changes associated with injury, 57, 65, 66, 327, 330, 331 fibrosis, 56 hepatic porphyria, 67 hepatobiliary injury, 56, 57, 65, 329 hepatocellular injury, 56, 57 necrosis, 56 as a target organ, 56-58 tumors, 58 weight change, 67
Local lymph node assay, 303, 308 Lowest effect level (LOEL), 42, 53, 80, 221, 238, 239, 400 Lungs and respiratory tract bronchoalveolar lavage, 120 clinical chemistry changes associated with injury, 57, 64, 120 congestion of, 6 edema of, 60 embolism of, 57 histopathology of, 120 hyperpnea, 330 inflammation of, 120 irritation of, 121 sensitization of, 120-121 Luteinizing hormone (LH), 209, 228, 229, 244 Luteotropic hormones, 209, 228 Lymph nodes proliferation as a measure of immunotoxicity of, 294 weight of, 294 Magnesium (see Clinical chemistry) Malathion, 304 Malformation (see Congenital birth defects) Mammalian Cell Mutation Assays cell selection for, 147-149 for detection of forward mutations, 147 endpoints evaluated in, 153-154 experimental design of, 149-154 interpretation of data from, 154-156 metabolic activation system in, 152 use of Chinese hamster ovary cells (CHO/HGPRT), 147-148, 150, 151, 153, 154 use of L5178Y mouse lymphoma cells (TK), 147-148, 150-151, 152, 153 Material Safety Data Sheet, 30, 38 Mating, 213, 216, 226-229 Mating index, 226, 227 Maximum tolerated dose (MTD), 42, 43, 53, 382
Medical devices, genetic toxicology testing of, 129, 144 Melphalan, 379 Metabolic activation systems use in bacterial mutation assays, 136-137, 138, 144, 145, 364 use in chromosome aberration assays, 163, 164 use in mammalian cell mutation assays, 152, 364 3-Methylcholanthrene, 304 Methyldopa, 292 Methylmercury, 397, 398, 399 Methyl methanesulfonate, 138, 152, 180 Methylphenidate, 379 Micronucleus Assay (in vivo) for detection of chromosome damage, 168-169, 363 endpoints evaluated in, 171-172 experimental design of, 169-172 interpretation of data from, 172 Milk agalactia, 241 ejection reflex, 234, 235 production of, 234, 235 Mirex, 379 Mitomycin C, 138, 165, 171 Mouse ear swelling test, 303 Muscle clinical chemistry changes associated with injury, 57, 64, 328 necrosis, 57 weakness, 330 Muscle irritation test, 13 Mutagenicity assays (see Genetic toxicology) β-Naphthoflavone, 137, 164 Nasal discharge, 6, 60 National Toxicology Program (NTP) immunotoxicity testing guidelines for, 293 sensitivity of bacterial mutation assays, 131 toxicokinetics for, 74, 76, 81
Necropsy autolysis, 67, 217 interpretation of data from, 67 methodology for, 335-336 in reproductive and developmental toxicity studies, 216-217, 220 in subchronic/chronic studies, 49-50 Neonatal developmental (maturational) indices eye opening, 219, 236, 267 hair growth, 219, 236 pinna detachment, 219, 236 preputial separation, 220, 237, 265, 266, 267, 269 testes descent, 220, 237 tooth eruption, 219, 267 vaginal opening, 220, 237, 265, 266, 267, 269 Neonatal reflex indices airdrop righting reflex, 219, 267 auditory startle, 265, 266, 267, 270 righting reflex, 236 startle reflex, 219, 236 surface righting reflex, 219, 267 visual cliff, 267 Neonates body weight of, 219, 234 death of, 219, 233, 234 suckling behavior in, 235 Nervous system brain weight changes, 67 clinical signs associated with dysfunction, 60, 331 histopathology evaluation of, 257, 258, 263-264, 266, 272-274, 284-285 immunohistochemistry for evaluation, 264 radioimmunoassay for evaluation, 264 serum chemistry changes associated with injury, 330 Neurotoxicology studies in adult animals, 257-264 endpoints in auditory startle habituation, 266, 281-283
[Neurotoxicology studies] functional observational battery, 257, 258, 261-262, 267, 269, 278-279 learning and memory testing, 219, 236, 265, 266, 267, 270-271, 283-284 motor activity test, 219, 236, 257, 258, 261, 263, 265-267, 270, 279-281 interpretation of data from, 236, 276-285 in neonates, 264-274 positive controls used in, 259 n-Hexane, 304 2-Nitrofluorene, 138 N-Methylolacrylamide, 378, 379, 380 N-Methyl-N’-nitro-N-nitrosoguanidine (MNNG), 165 No observed effect level (NOEL), 35, 53, 55, 80, 221, 238, 239, 400 Nursing behavior, 234 Ochratoxin A, 304 OECD
genetic toxicology study guidelines for, 128, 134, 136, 144, 165 inhalation study guidelines for, 108, 110 neurotoxicity study guidelines for, 260, 308 reproductive and developmental toxicity study guidelines for, 199 subchronic/chronic study guidelines for, 34, 35 toxicokinetics guidelines for, 74 Oleic acid diethanolamine, 379 Ophthalmoscopic examination in neurotoxicity studies, 267 in subchronic/chronic studies, 46, 55, 61 Organogenesis, 204, 209, 213, 244 Organ weights interpretation of changes, 67, 230 in reproductive and developmental toxicity studies, 229-230 in subchronic/chronic studies, 50-51, 55
OSHA, 30 Osmotic diuresis, 330 Ovaries anovulation, 241 histopathology of, 220-221, 228, 231-232 oocyte/follicles, 229, 231-232, 243 weight changes of, 67, 230 Pancreatitis, 57 Parathion, 304 Parturition, 218-219, 229, 233 Pasteurellosis, 6, 224 Paw lick test, 13 Penicillamine, 292 Penicillin, 292 Penile erection, 226 Pentachlorophenol, 379 Pesticides (see Agrochemicals) Pharmaceuticals, animal reproductive and developmental toxicity studies guidelines, 195 subchronic/chronic study guidelines, 34 Pharmaceuticals, human (see also International Conference on Harmonization ICH) genetic toxicology studies, 128, 129, 362 inhalation studies, 108, 110 neurotoxicity studies, 256 reproductive and developmental toxicity guidelines, 196-203 subchronic/chronic study guidelines, 34 Phenobarbital, 137, 164 Phenol, 379 Phenolphthalein, 378 Piloerection, 60 Pituitary role in pregnancy support, 209 weight changes, 67, 230 Placenta, 217 Polyarteritis, 7 Polychlorinated biphenyls (PCBs), 137, 164, 400
Pregnancy gestation length, 209, 229, 234, 245 prolonged gestation, 229, 233 rodent behavior required for pregnancy, 226 Preputial glands, 231 Procainamide, 292 Progesterone, 226, 228, 245 Prolactin, 209, 228, 245 Prostate histopathology of, 220, 231 weight changes, 67, 230 Protocols, study amendments for, 53, 355 contents of, 354-356 deviations of, 53 Pseudopregnancy, 215, 216, 245 Pyridine, 379 Quality assurance units, 347 Randomization, 3, 20, 211 Range-finding studies for acute studies, 21 for reproductive and developmental toxicity studies, 212 for subacute, 40 for toxicokinetics studies, 78-79 Recovery period, 41 Reports, study content for inhalation studies, 123-124
content for neurotoxicity studies, 274-276 content for subchronic/chronic studies, 52-53 content for toxicokinetic studies, 96-97 contents required by GLPs, 356-357 Reproductive and Developmental toxicity studies compilation of data for, 221-223 endpoints measured in, 206-207, 215-221 interpretation of data from, 221-237 regulatory guidelines for, 196-203
[Reproductive and Developmental toxicity studies] risk assessment of, 237-240 study designs for, 204-205 toxicokinetics in, 214-215 treatment periods in, 196-203, 213 uterine examinations in, 216-217 Reserpine, 378, 379 Resorcinol, 381 Resorptions (see Embryo-fetal death) Respiratory tract (see Lungs) Rifampin, 292 Rotenone, 380, 381 S9 (see Metabolic activation systems) Safety margin/factors, 54, 98, 239-240 Salicylic acid, 292 Satellite animals, 41, 44 Sebaceous glands, 60 Segment I, II and III studies (see Reproductive and developmental toxicity studies) Seminal vesicles, 220, 230, 231 Sensitization, pulmonary, 120-121 Sentinel population, 3 Sexual behavior, 226-229 intromissions, 228 libido receptivity, 226, 228 lordosis response, 228 Sexual maturity, 209, 210 Skin burns, 330, 331 inflammation, 60 ulceration, 60 Sodium (see Electrolytes) Sodium azide, 138 Sorbitol dehydrogenase (see Enzymes) Species, selection of, 2-8, 20, 39, 208-210, 265-267, 296 dog, 7, 40, 118, 209, 259 guinea pig, 6, 209, 302 hamster, 209 primates, 7-8, 40, 118, 209 rabbit, 6-7, 20, 23, 27, 118, 209-210 rodents, 3-6, 20, 39-40, 117-118, 209-210, 258-259, 265-267, 296
Sperm analyses, 220, 229 Spleen enlargement of, 62 lymphoproliferation as a measure of immunotoxicity, 297, 299 weight change of, 67 Spot Test Assay, 146-147 Standard operating procedures (SOPs), 37, 347, 353 Statistical analyses in carcinogenicity studies, 54-55 in genetic toxicology studies, 141, 167, 172 in neurotoxicity studies, 276 in reproductive and developmental toxicity studies, 222 in subchronic/chronic studies, 54-55 Sterigmatocystin, 138 Stillbirth (see Embryo-fetal death) Stress, confounding effects of, 56, 62-63, 64, 66, 110 Structure-activity relationships, 29 Subacute studies, 33, 36, 43-44 Subchronic studies, 33, 36, 43-44 Target organs, 55-60 Teratogenicity (see Congenital birth defects) Teratology study (see Developmental toxicity study) Testes histopathology of, 220-221, 228, 229, 231 Leydig cells in, 229 seminiferous cycle of, 209, 246 Sertoli cells in, 229, 245 spermiation in, 209, 228, 246 weight changes of, 209, 230 Test materials analyses of, 38-39, 114-117, 208 calculation of quantity needed, 5, 113-114 formulation of, 38-39 maximum dose volumes of, 13 radiolabeling of, 75 required documentation for, 354
[Test materials] routes of administration of, 12-13, 19-20, 79-80, 103, 260, 268, 295 vehicles used in formulation of, 13, 38, 56, 138, 208, 265, 259, 268 Testosterone, 228, 246 2,3,7,8-Tetrachlorodibenzodioxin (TCDD), 297, 380, 399 Thiourea, 381 Thymus, 308 Thyroid dysfunction, secondary effects of, 59, 327 hyperparathyroidism, 327 weight change in, 67 Tier testing, 34, 74 TOSCA (see Environmental Protection Agency) Toxicokinetics analytical methods, 83-85 autoradiography, whole-body, 82, 83, 84 definition of, 73 endpoints in, absorption, 73, 74, 81, 77, 83, 86, 87, 88, 90, 93, 95, 103, 104 accumulation, 74, 81, 82 area under the curve (AUC), 44, 83, 87-88, 90, 92-93, 98, 94, 95, 214 bioavailability, 43, 86, 104 clearance, 88, 91, 92, 98 concentration, maximum (Cmax), 44, 83, 86, 90, 92, 95, 214 distribution, 73, 74, 83, 77, 86, 89, 94, 95 elimination, 86, 87, 90, 91, 95 excretion, 74, 77, 82 half-life (t1/2), 74, 81, 83, 86, 91, 95, 214 metabolism/biotransformation, 73, 82, 77, 83, 88, 89, 91, 94, 95, 97, 303-305 time to maximum concentration (Tmax), 86
[Toxicokinetics] enzymes induction effects on, 81, 304 first-pass effects in, 83, 95 interpretation of data from, 90-98 metabolites, 73, 74, 81, 82, 83, 84-85, 86, 89, 94 models for, 97 protein binding, 84, 89, 91, 97 range-finding/probe studies, 78-79 in reproductive and developmental toxicity studies, 214-215 sample methods for, 79, 81-83 saturation kinetics of, 74 statistical/data analysis of, 85-90 study design of, 75-76 Transgenic animal testing development of animal models Big Blue® lacI, 367-371 Hrm2 mouse, 375-376, 381 lacZ plasmid mouse, 373 MMTV transgenic mouse, 374 Muta™Mouse lacZ, 371-373 P53 transgenic mouse, 375-376, 378 PIM transgenic mouse, 374-375 PML4 plasmid mouse, 373 Tg.AC mouse, 375-376, 379, 380 outstanding issues in, 382-383 use in carcinogenicity testing, 374-383 protocols for, 377-378 validation studies, 378-381 use in mutagenicity testing, 367-374 Triethanolamine, 379 Trimethyltin, 280 Uncertainty factor, 239, 400 Unscheduled DNA Synthesis Assay (UDS) endpoints measured in, 182 experimental design of, 179- 183-184 interpretation of data from, 182-183 species/cell selection for, 178
Urine analyses collection methods for, 15, 332 crystalluria, 67 endpoints in glucose, 333 ketones, 66, 333 osmolality, 60, 66 pH, 333 proteins, 66 sediments, 67, 333 specific gravity, 333 volume, 66 hematuria, 67 hemoglobinuria, 67 interpretation of data from, 66-67 polyuria, 330 selection of parameters for, 332 in subchronic/chronic studies, 36, 47 U.S. Department of Transportation (DOT), 109, 110
Uterus decidualization, 229, 242 gravid uterine weight, 217 histopathological changes in, 232 hydrometra, 244 weight changes in, 67, 230 Vaginal plug, 216, 246 Vaginal smears/cytology, 215-216, 232 Variations (see Congenital birth defects) Vein irritation test, 13 Vinyl acetate, 396 Vinyl chloride, 292 4-Vinyl-1-cyclohexene diepoxide, 378 Water consumption hydration state, as a confounding factor, 63, 66, 68, 330, 331 in subchronic/chronic studies, 45-46, 61 Weaning index, 227, 233 Xylene, 381