When Handbook of Normative Data for Neuropsychological Assessment was originally published in 1999, it was the first book to provide neuropsychologists with summaries and critiques of normative data for neuropsychological tests. The second edition, which has been revised and updated throughout, presents data for 26 commonly used neuropsychological tests, including: Trailmaking, Color Trails, Stroop Color Word Interference, Auditory Consonant Trigrams, Paced Auditory Serial Addition, Ruff 2&7, Digit Vigilance, Boston Naming, Verbal Fluency, Rey-Osterrieth Complex Figure, Hooper Visual Organization, Visual Form Discrimination, Judgment of Line Orientation, Ruff Figural Fluency, Design Fluency, Tactual Performance, Wechsler Memory Scale-Revised, Rey Auditory-Verbal Learning, Hopkins Verbal Learning, WHO/UCLA Auditory Verbal Learning, Benton Visual Retention, Finger Tapping, Grip Strength (Dynamometer), Grooved Pegboard, Category, and Wisconsin Card Sorting tests. In addition, California Verbal Learning (CVLT and CVLT-II), CERAD List-Learning, and Selective Reminding Tests, as well as the newest versions of the Wechsler Memory Scale (WMS-III and WMS-IIIA), are reviewed. Locator tables guide the reader to the sets of normative data that are best suited to each individual case, depending on the demographic characteristics of the patient, and highlight the advantages associated with using data for comparative purposes. Those using the book have the option of reading the authors' critical review of the normative data for a particular test, or simply turning to the appropriate data locator table for a quick reference to the relevant data tables in the Appendices. The second edition includes reviews of 15 new tests. The way the data are presented has been changed to make the book easier to use. Meta-analysis tables of predicted values for different ages (and education, where relevant) are included for nine tests that have a sufficient number of homogeneous datasets.
No other reference offers such an effective framework for the critical evaluation of normative data for neuropsychological tests. Like the first, the second edition will be welcomed by practitioners, researchers, teachers, and graduate students as a unique and valuable contribution to the practice of neuropsychology.
Maura Mitrushina, Ph.D., is Professor of Psychology at California State University, Northridge, and Associate Clinical Professor of Psychiatry at UCLA School of Medicine. She is an ABPP/ABCN diplomate and maintains a clinical and forensic practice in Encino, California. Her research interests include cognitive correlates of normal aging and differential diagnosis of dementia, as well as factors influencing rates of recovery after traumatic brain injury. Kyle B. Boone, Ph.D., is Professor-in-Residence of Psychiatry at UCLA School of Medicine, and Director of Neuropsychological Services and Training at Harbor-UCLA Medical Center. She is an ABPP/ABCN diplomate and maintains a clinical and forensic practice in Torrance, California. She has conducted research on the development and validation of techniques to identify noncredible cognitive performance, and on the effects of demographic factors and medical and psychological illnesses on neuropsychological test performance.
Jill Razani, Ph.D., is an Assistant Professor of Psychology at California State University, Northridge, and a licensed clinical psychologist in the state of California. In the past, she has conducted research on cognitive aspects of aging and neurodegenerative disorders. Presently, she has an active program of research examining issues related to multicultural and cross-cultural neuropsychology, as well as the relationship between cognitive functioning and activities of daily living in patients with dementia. Louis F. D'Elia, Ph.D., is Assistant Clinical Professor of Psychiatry, and former Co-Director of the Neuropsychology Assessment Laboratory at the University of California, Los Angeles, School of Medicine. He remains active in the training, supervision, and mentoring of UCLA Postdoctoral Neuropsychology Fellows in his work with them in his private practice in Pasadena, California.
JACKET DESIGN: EVE SIEGEL
OXFORD UNIVERSITY PRESS www.oup.com
PRAISE FOR THE FIRST EDITION
"Should neuropsychologists purchase this volume? The answer is an unqualified yes. The book is a very valuable asset to any neuropsychology collection. This reviewer wholeheartedly recommends it for purchase; the tables alone justify the price.... The authors are due a great deal of credit for gathering together material that most of us would understand as a multi-year project. In examining this book in even a cursory way, the prospective buyer will see that the effort needed to bring it to fruition is humbling." -Kenneth M. Adams, PhD, in Journal of Clinical and Experimental Neuropsychology
"Overall, Mitrushina et al. have made a substantial contribution with their text, and it nicely complements other thorough overviews of neuropsychology authored by Lezak or Spreen and Strauss. It is concise, timely, comprehensive, and cogent, and it holds great utility for the practice of clinical neuropsychology.... Let us hope they continue this good work as additional data emerge." -Michael R. Basso, PhD, in Neuropsychiatry, Neuropsychology, and Behavioral Neurology
"... a valuable and well-written addition to the literature that should find its way onto the reference shelves of practicing neuropsychologists. The book will be a useful educational tool.... There is a lot to be gained from consulting this book. In readability, utility, and practicality, it goes way beyond the norms." -Russell M. Bauer, PhD, in Journal of the International Neuropsychological Society
ISBN 0-19-516930-1
Handbook of Normative Data for Neuropsychological Assessment
OXFORD UNIVERSITY PRESS
Oxford University Press, Inc., publishes works that further Oxford University's objective of excellence in research, scholarship, and education. Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam
Copyright © 2005 by Maura Mitrushina, Kyle B. Boone, Jill Razani, and Louis F. D'Elia
Published by Oxford University Press, Inc. 198 Madison Avenue, New York, New York 10016 www.oup.com Oxford is a registered trademark of Oxford University Press. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of Oxford University Press. Library of Congress Cataloging-in-Publication Data Handbook of normative data for neuropsychological assessment / Maura Mitrushina ... [et al.].- 2nd ed. p. ; cm. Includes bibliographical references and indexes. ISBN-13 978-0-19-516930-0 ISBN 0-19-516930-1 1. Neuropsychological tests-Handbooks, manuals, etc. 2. Reference values (Medicine)-Handbooks, manuals, etc. [DNLM: 1. Neuropsychological Tests. 2. Reference Values. WL 141 H23654 2005] RC386.6.N48M58 2005 616.8'0475-dc22 2004054724
9 8 7 6 5 4 3 2 1 Printed in the United States of America on acid-free paper
With admiration and gratitude, we dedicate this book to those professionals whose normative research efforts made this volume possible.
Preface
The Handbook of Normative Data for Neuropsychological Assessment is our attempt to provide ready access to neuropsychological normative data and to evaluate their strengths and weaknesses. Because the interpretation of test scores profoundly affects the quality and utility of neuropsychological reports and research, we felt that a critical compendium containing most of the available normative data for commonly used tests was essential. Before this book's publication, only those lucky individuals with the time or staff to conduct exhaustive library searches or with extensive professional subscription lists could hope to be aware of more than a few normative reports for any specific test. Although several books cover the intricacies of administration and scoring procedures for neuropsychological tests and a few contain some normative data, no previous volume has been exclusively devoted to the presentation and discussion of existing normative data for specific neuropsychological tests or provided a framework for judging studies that report normative data. This handbook was written to help guide the busy clinician, researcher, and graduate student to the utility of commonly used neuropsychological tests and to the normative data, accompanied by critical reviews, that are available for comparison purposes for most of the tests described in this book. The following tests have been described: Trailmaking, Color Trails, Stroop Color Word Interference, Auditory Consonant Trigrams, Paced Auditory Serial Addition, Ruff 2&7, Digit Vigilance, Boston Naming, Verbal Fluency, Rey-Osterrieth Complex Figure, Hooper Visual Organization, Visual Form Discrimination, Judgment of Line Orientation, Ruff Figural Fluency, Design Fluency, Tactual Performance, Wechsler Memory Scale (WMS-R, WMS-III, WMS-IIIA), Rey Auditory-Verbal Learning, California Verbal Learning, Hopkins Verbal Learning, WHO-UCLA Auditory Verbal Learning, CERAD List-Learning, Selective Reminding, Benton Visual Retention, Finger Tapping, Grip Strength (Dynamometer), Grooved Pegboard, Category, and Wisconsin Card Sorting tests.
ORGANIZATION OF THE BOOK
The book contains 25 chapters. The basic concepts of normative neuropsychology are addressed in the first three chapters. The first chapter provides an introduction to the practice and philosophy of neuropsychology as a clinical discipline. The second chapter explores the interface of neuropsychology with other professional/clinical disciplines and revisits critical issues in neuropsychology. The third chapter provides an overview of statistical methods and the use of statistical and methodological concepts in neuropsychology, history and applications of meta-analysis in clinical practice, and description of procedures for the use of meta-analysis in this book. The remaining 22 test chapters review and present the normative data for specific neuropsychological tests, which are derived from articles and other communications reporting results of normative and clinical comparison
studies. These chapters begin with a brief overview of the history, utility, and psychometric properties of the test under discussion, which indicates whether there are different versions of the test and/or varying administration procedures. If more than one version of a test exists, the differences in content, administration, and scoring are described. We purposely avoided an exhaustive review of the history and psychometric properties of the tests because this information is readily available in other Oxford publications, specifically Lezak et al. (2004) and Spreen and Strauss (1998). The next part of the test chapters is a summary of the findings from research that has examined the influence of demographic variables (e.g., age, education, intellectual level, gender, ethnicity/culture, handedness) and administration procedures on test performance. The findings from this review highlight the critical variables needed to evaluate the normative reports for the test. These critical variables are broken down into two categories: (1) subject variables and (2) procedural variables.
Subject variables address such issues as: "How broad are the utilized age group ranges in data reporting?" Optimally, studies report data across rather discrete age groups (e.g., 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54 years) rather than across one all-inclusive range (e.g., 20-54 years).
"What is the education and/or IQ of the study participants?" Because education and IQ may have a dramatic impact on test performance, it is important to include this information so that data that closely match the education and/or IQ of the patient under study can be used.
"What was the sample size in each of the reported age or age/education categories?" "Is the sample from which data were collected well described?" For instance, the age of the subjects and the country where the study was conducted always must be reported. Depending on the test administered, other important variables may include gender, ethnicity/culture, and hand preference.
Procedural variables address such issues as: "What version of the test was administered?" "How was the test administered?" "How was the test scored?" "Did the data reported include mean and standard deviation scores?" The next section of each of these chapters summarizes the status of the normative data for the test and answers the questions: "How many studies are out there?" "Which versions of the test have been the most frequently administered?" "What demographic characteristics have been the most frequently studied?" The next section presents critiques of the studies, with the strengths and considerations regarding the use of each normative report discussed in some depth. Data tables are presented in the appendix corresponding to each chapter. Each appendix starts with the data locator table for that chapter, which summarizes the subject and procedural variables for each study reviewed in the text, organized in ascending chronological order. The table quickly highlights the most appropriate normative data, given the demographic characteristics of the patient under study, as well as the test administration and scoring criteria employed. The locator table also indicates the page number on which an extensive critical review of the study can be found in the text of the chapter and directs the reader to the corresponding data tables in the appendix. Therefore, readers have the option of reading the critical review of the normative data for a particular test or simply using the data locator table to rapidly identify the appropriate data set for quick test interpretation. Several test chapters also include summaries of results of the meta-analyses which were used to derive the predicted scores for different age groups. The tables of predicted
scores with education or gender correction (where appropriate) are presented in the corresponding appendices, along with descriptive statistics for the aggregate sample, significance tests, and scatterplots depicting dispersion of the data points around the regression line. The test chapters conclude with a summary and suggestions for future research to improve the database for the test.
HOW TO BEST USE THE BOOK
The process of selecting the most appropriate normative report for interpretive purposes involves determining the "best fit" between a patient's demographic characteristics (e.g., age, years of education, IQ, handedness) and the demographic characteristics of the study sample. It is also critical to ensure that the version of the test administered is the same as that used to collect the normative data. Likewise, it is critical that the scoring procedures are identical. As a general policy, before seeing a patient, we typically determine which normative data we are going to use to interpret his or her performance. This way we do not discover after a patient has gone home that the only reference data available utilized a different administration and/or scoring protocol from the version we used. Such "discoveries" undermine confidence in test score interpretation. Fortunately, however, the vast majority of normative reports use standard administration and scoring procedures. If the data have already been collected, an important variable to screen for initially is country of origin. If the patient was born and/or educated in the United States, then the most appropriate comparison data should have been collected from individuals born and/or educated in the United States. Another critical variable is age. A patient's test scores must be compared to those of age peers because performance on most neuropsychological tests changes as a function of age. Educational level and/or IQ are also important variables. Because they can have a tremendous impact on performance on most neuropsychological
tests, a patient's IQ and/or educational characteristics should closely match the demographics of the normative comparison sample. Optimally, normative data are reported by age/education or age/IQ categories (i.e., performance of those aged 20-25 years with 12 years of education, performance of those aged 20-25 years with 13-15 years of education, performance of those aged 20-25 years with 16 years of education, etc.). Sample size is also critical because small sample size within any of the comparison categories (i.e., age, age/education) can undermine the stability of the normative data and reduce confidence in score interpretation. For some tests, gender and handedness must be considered. Ideally, the administration and scoring procedures used to assess the patient should be identical to those used to collect the normative comparison data. If the data locator table suggests that more than one study could be appropriately used, then the reader is especially advised to read the critical reviews of the studies closely to help determine whether one data set is more appropriate than others. Close inspection of the details of the studies often leads to clear-cut conclusions. If the data from different studies yield contradictory values, the reader is advised to consult the table of meta-analytically predicted values (when available) to aid in the selection of the appropriate normative data set. If normative data for a certain demographic group cannot be found in the studies reviewed, with proper caution (see Chapter 3), the expected value for that group can be extrapolated based on the table of predicted values or can be computed based on the regression equation provided with the table. However, we strongly discourage the use of predicted values when the actual data sets are available.
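The screening steps described above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the record fields, the function name `candidate_studies`, and the minimum cell size of 10 are hypothetical and are not taken from the data locator tables in this book. It keeps the normative reports whose country, test version, age band, and education band fit the patient, and then lists those with narrower age bands and larger samples first.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NormativeStudy:
    """One row of a hypothetical locator table (field names are assumptions)."""
    label: str
    country: str
    age_min: int
    age_max: int
    edu_min: int
    edu_max: int
    test_version: str
    n: int  # sample size in this age/education cell


@dataclass
class Patient:
    age: int
    education_years: int
    country: str
    test_version: str  # version of the test actually administered


def candidate_studies(patient: Patient,
                      studies: List[NormativeStudy],
                      min_n: int = 10) -> List[NormativeStudy]:
    """Return studies that fit the patient, best candidates first.

    min_n is an illustrative threshold, not a recommendation from the text.
    """
    matches = [
        s for s in studies
        if s.country == patient.country
        and s.test_version == patient.test_version
        and s.age_min <= patient.age <= s.age_max
        and s.edu_min <= patient.education_years <= s.edu_max
        and s.n >= min_n
    ]
    # Prefer discrete age bands over all-inclusive ranges, then larger samples.
    return sorted(matches, key=lambda s: (s.age_max - s.age_min, -s.n))
```

When such a filter returns more than one candidate, that is a cue to read the critical reviews of those studies closely; the ranking is a convenience and does not replace that judgment.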
HISTORY OF THIS PROJECT
The Beginnings
The idea for this book originally grew out of the frustration that was experienced by Lou D'Elia in his attempts to locate appropriate normative data during the early years of his postdoctoral training. This frustration is
familiar to anyone who has used normative data and was practicing before 1990. Back in the "old days," it was fairly typical for practicing professionals to have access to, at most, one or two sets of normative data for any particular neuropsychological test. More often than not, graduate students and postdoctoral fellows and trainees were handed a manual of norms to be used in the clinic or laboratory. These "lab manuals" containing tables of normative data were passed from mentor to trainee (and vice versa) as if they were the Holy Grail. Early in his training, Lou began to ask "Where did these data come from?" Sometimes a graduate student, postdoctoral fellow, or faculty member would "discover" a new set of norms for a particular test and a new table would magically appear in the lab manual. Applying the new reference data to patient scores often yielded wildly different percentile performance interpretations from those based on the "standard" norms. This sent Lou to the UCLA Biomedical Library to search for the source of the data and to unearth the original research articles. Often, as he read the article, he discovered to his horror that the data had been collected from individuals not educated in the United States, that the sample size was extremely small (i.e., n < 10), or worse yet, that the data were generated from a different version of the test. If the same version of the test was being used, often the normative data had been collected by a nonstandard administration and/or scoring procedure. It was only after a thorough examination of how the studies were carried out, in terms of test administration, scoring, and demographic characteristics of the study participants, that one could begin to unravel the reasons why the use of one set of normative data yielded a different interpretation than use of another.
Those trips to the library resulted in the first article to summarize the available normative data for any neuropsychological test: "Wechsler Memory Scale: A Critical Appraisal of the Normative Studies" (D'Elia et al., 1989). It was during the preparation of this article that our basic template for analyzing normative reports was developed. Lou's next question was "Why has no one gathered all this information together into a
reference book?" Fortunately, Lou found two student colleagues in the same training program who shared his concern: Kyle Boone and Allen D. Brandon. Lou, Kyle, and Allen eagerly returned to the library to collect the data necessary to produce a reference book. Soon, however, they discovered why no such volume existed. It is hard to imagine now, but as recently as the late 1980s and early 1990s, the majority of neuropsychology-related professional journals still had not been referenced in databases. No subject category for "Norms" or "Normative Data" was listed in the key reference indices such as Index Medicus or Psychological Abstracts. As a result, most of the research papers were located by going through the various journals article by article. Gathering the necessary information proved to be a very large task, not one that we would recommend to a postdoctoral fellow at the beginning of his or her career. Yet, that is exactly what they did. Hindsight is 20/20! Allen Brandon withdrew from the project upon completing his postdoctoral fellowship. Private practice called. Only Lou and Kyle remained. However, for Lou and Kyle, free time seemed to evaporate as they pursued developing professional careers and attended to their ever-increasing family activities and obligations. The project slowly moved forward. Finding and cataloguing the articles, then analyzing them using the templates required much more work than they had imagined. Then, about 1994, Maura Mitrushina joined the project, and thanks to her considerable enthusiasm and efforts the first edition of the book was finally completed.
The Second Edition: Changes and Updates
Now, 6 years later, we are glad to have on board a new member of the team, the young and vibrant Jill Razani. We invited her to participate in the preparation of the second edition in order to share responsibilities for writing new chapters with reviews of additional commonly used tests in response to the wishes of our audience. This was the only way to keep our sanity, attend to our families and jobs, and have a semblance of "normal life" while working on the second edition.
The new tests reviewed in the second edition include Paced Auditory Serial Addition, Ruff 2&7, Digit Vigilance, Visual Form Discrimination, Judgment of Line Orientation, Ruff Figural Fluency, Design Fluency, WMS-IIIA, California Verbal Learning, Hopkins Verbal Learning, WHO-UCLA Auditory Verbal Learning, CERAD List-Learning, Selective Reminding, Benton Visual Retention, and Wisconsin Card Sorting tests. The chapters in the first edition have been updated and revised. Information on methodological issues, new versions and new approaches to the tests, and their clinical utility has been added. Studies published after 1998 that are based on well-defined, intact samples were reviewed. Outdated information, data on diagnosed clinical groups, and chapters describing tests that are not in wide use were removed. The format of data presentation has been changed. Learning from our mistakes with the first edition (data tables are not exactly placed in the text of their description, as we originally envisioned!), we removed all data tables from the text and placed them in the appendices. We hope that this change will make it easier to locate the needed tables. In response to the wishes of the readers of our first edition, we synthesized the data in meta-analytic tables of predicted values with supporting statistics for those chapters that have sufficient number and homogeneity of
studies for such analyses. The limitations of such predicted norms were highlighted.
FUTURE DIRECTIONS
The handbook is as up-to-date as we could make it. We intend to update the handbook every few years; and with subsequent editions, it will be expanded to include additional tests frequently used by neuropsychologists. We have already made a step in this direction with the second edition. Almost all of the tests in this book continue to appear on lists of the most popular tests in neuropsychology. We also managed to sneak in some information regarding a couple of published tests that were developed in our laboratory that seem to be gaining popularity elsewhere (i.e., Color Trails Test, WHO-UCLA Auditory Verbal Learning Test). We hope this book finds its place on the desks of professionals performing or reviewing neuropsychological assessments. We also hope it will be welcomed by teachers of assessment and psychological statistics and helpful to graduate students learning to interpret test scores. Our goal is to help bolster confidence in the basis for clinical judgments and to strengthen the credibility of research and clinical findings.
Los Angeles, California
M.M., K.B.B., J.R., L.F.D.
Acknowledgments
We extend our deepest gratitude to all the authors whose normative and clinical comparison research is reviewed in this book. Without their work, this book would not have been possible. This volume is not intended to disparage the work of any author as we strongly believe that each author has made an important contribution to our overall knowledge through their research efforts. Over the years, several people have helped us with the preparation of the first and second editions of this book. Their help took many forms, including everything from typing tables and checking the accuracy of references to providing us with materials to be included in the book and simple moral support. We offer each one our heartfelt thanks for every kindness and courtesy extended to us along the way: Lidia Artiola i Fortuny, Jean Avezac, Eyzzz Baccarrdi, Julian Bach, Robert Bornstein, Virdette Brumm, Debora Burnison, Robert Butler, Flo Comes, Lou Costa, Michele Croisier, Jeffrey Cummings, Janine Czarnetzki, Doug Danaher, Dean Dellis, Jack Demick, Lois Desmond, Carl Dodrill, Linda Dukmajian, Katharine Earhart, Robert Elliot, Kadimah Elson, Gwenn Evans, Bee Fletcher, Travis Fogel, David Forney, Jennifer Forrest, Paula Fuld, Stephen Ganzell, Ismelda Gonzalez, Patricia Gross, Adrienne Gundry, Tiffany Harris, Larry Herrera, Charles Hinkin, Stacey Horowitz, Robert Ivnik, Lissy Jarvik, Irene Kassorla, Ellen Kester, Glen Larrabee, Asenath LaRue, Stanislav Levin, James Loong, Enrique Lopez, Christine LoPresti, Anahit Magzanyan, Mario Maj, Lawrence Majovski,
Alfred Marohl, Gayle Marsh, James Marsh, Joan McConnell, Susan McPherson, Fernando Melendez, John Meyers, Eric Miller, Robin Morris, Hector Myers, Narine Nazari, Linda Nelson, Tina Noriega, Lara Orchanian, Elizabeth Pacheo, Daniel Parks, Nikki Passanante, Helen Paull, Eileen Pearlman, Marcel Ponton, Stephen Rebello, Matt Reinhard, Mark Richardson, Linda Ringer, Marcela Rivera, Eddie Rozenblat, Michael Salmone, Manuela Saul, Robert Sbordone, Jeffrey Schaeffer, Karen Schiltz, David Schretlen, Amanda Schrey, Ola Selnes, Glenn Smith, Fabrizio Starace, Norton Stein, Tony Strickland, Donald Stuss, Donald Trahan, Craig Uchiyama, Doug Umetsu, Harry Van der Vlugt, Wilfred Van Gorp, Valdis Volkovskis, Travis White, Jane Williams, Bennett Williamson, Lorne Yeudall, Betty Young, and Miguel Zavala. We express endless gratitude to Courtney Sheen, who organized and coordinated the preparation of tables for the second edition. We thank Linda Fidell and Ingram Olkin for their advice on the design and statistical treatment of the meta-analyses. We are indebted to Xiao Chen and the UCLA ATS Statistical Consulting Group for their advice and support, ranging from providing ample literature resources on applications of Stata in meta-analyses to invaluable help with the set-up of command files and interpretation of results of the analyses. Special thanks go to Muriel Lezak and Edith Kaplan, who have been a constant source of encouragement and support from the very beginning of the project.
We extend our gratitude to Paul Satz, who fostered in three of the authors appreciation for the complexity and excitement of the field of neuropsychology. The contribution of Dale Sherman to the methodological accuracy of the first edition qualifies him for a spot in heaven. We also extend special thanks to Allen Brandon, who was an early collaborator on the first edition. Allen, your early efforts and great enthusiasm were deeply appreciated. Dr. D'Elia offers his admiration and appreciation to his three coauthors, whose efforts brought this project to completion.
Sincere thanks to our editors Jeff House, Fiona Stevens, and Nancy Wolitzer, whose support throughout has been continuous and enthusiastic. Finally, we thank our families: M.M. thanks Masha, Sasha, and Kaley for their endless patience and understanding; K.B.B. thanks Rodney, Galen, and Fletcher; J.R. thanks her parents and family, especially Bill, Rhonda, and Mike; L.F.D. thanks his parents and family, especially Michael D. Salazar, for their constant encouragement and support. M.M., K.B.B., J.R., L.F.D.
Contents
I. BACKGROUND
1. Introduction
    Test-Taking Environment
    Test Norms
    Tests
    Standard and Experimental
    When Is a Test Considered Experimental?
    What Determines Whether a Test Is Considered "Standard"?
2. Use of Methodological Concepts in Neuropsychology Practice
    Interface of Neuropsychology with Other Clinical Disciplines
    Applications of Neuropsychological Evaluation
    Different Levels of Data Integration in Neuropsychology Practice
    Judgment and Decision Making in Clinical Neuropsychology
    Strategies in Test Selection
    Normative References and Interpretation of Clinical Data
    Alternative Methods for Interpretation of Clinical Data
    Factors Influencing Performance on Neuropsychological Tests
    Effort and Motivation
    Issues in Cross-Cultural and Multicultural Neuropsychological Assessment
    Final Caveats
    Data Inclusion in Neuropsychological Reports
3. Statistical and Psychometric Issues
    Measurement and Interpretation of Numerical Values
    Standardization of Raw Scores
    Standard Scores and Normal Distribution
    Interpretation of Infrequent (Outlying) Scores
    Interpretation of Scores That Are Not Normally Distributed
    Psychometric Properties of Tests
    Reliability
    Methods of Estimating Test Reliability
    Standard Error of Measurement
    Validity
    Decision Theory
    Base Rates
    Selection Ratio
    Incremental Validity
    Cutoffs and Diagnostic Accuracy of a Test or Interpretive Strategy
    Synthesis of Results of Different Studies in a Meta-Analysis
    Historical Overview and the Rationale for Using Meta-Analysis in This Book
    Application of Meta-Analysis in Clinical Practice
    Advantages
    Sources of Bias
    Selection of Studies and Procedures for Meta-Analyses Presented in This Book
    Literature Search and Selection of Studies
    Procedures Used in the Analyses
    Data Editing
    Regression
    Prediction
    Standard Deviations
    Testing Model Fit and Parameter Specifications
    Effect of Demographic Variables
    Comments on the Applicability of the Meta-Analyses Presented in This Book
II. TESTS OF ATTENTION AND CONCENTRATION: VISUAL AND AUDITORY
4. Trailmaking Test
    Brief History of the Test
    Contributions of Cognitive Mechanisms and Physical Layout Differences to Performance on Parts A and B
    Utility of the Derived Measures, Which Are Based on Differences in Performance Times for Parts A and B
    Utility of the Error Analysis
    Utility of the Cutoffs for Impairment
    Effect of the Order of Presentation and Practice Time, Practice Effect, and Alternate Versions of the TMT
    Culture-Specific Sets of Normative Data and Cultural Adaptations for the TMT
    Modified Versions of the TMT
    Relationship Between TMT Performance and Demographic Factors
    Method for Evaluating the Normative Reports
    Summary of the Status of the Norms
    Summaries of the Studies
    Results of the Meta-Analyses of the Trailmaking Test Data
    Conclusions
5. Color Trails Test, 99 Brief History of the Test, 99 Relationship Between CTT Performance and Demographic Factors, 101 Method for Evaluating the Normative Reports, 102 Summary of the Status of the Norms, 103 Summaries of the Studies, 103 Conclusions, 106
6. Stroop Test, 108 Brief History of the Test, 108 Current Administration Procedures, 110 Relationship Between Stroop Test Performance and Demographic Factors, 112 Method for Evaluating the Normative Reports, 114 Summary of the Status of the Norms, 115 Summaries of the Studies, 116 Results of the Meta-Analyses of the Stroop Test Data, 132 Conclusions, 133
7. Auditory Consonant Trigrams, 134 Brief History of the Test, 134 Administration Procedures, 134 Psychometric Properties, 135 Relationship Between ACT Performance, Demographic Factors, and Vascular Status, 135 Method for Evaluating the Normative Reports, 135 Summary of the Status of the Norms, 136 Summaries of the Studies, 137 Conclusions, 140
8. Paced Auditory Serial Addition Test, 141 Brief History of the Test, 141 Modifications and Alternate Formats of the PASAT, 142 Psychometric Properties of the Test, 143 Relationship Between PASAT Performance and Demographic Factors, 143 Method for Evaluating the Normative Reports, 145 Summary of the Status of the Norms, 145 Summaries of the Studies, 146 Conclusions, 158
9. Cancellation Tests, 160 Brief History of the Tests, 160 Ruff 2&7 Selective Attention Test, 160 Brief Overview of the Ruff 2&7, 160 Psychometric Properties of the Ruff 2&7, 161 Relationship Between Ruff 2&7 Performance and Demographic Factors, 162 Digit Vigilance Test, 162 Brief Overview of the DVT, 162 Psychometric Properties of the DVT, 163 Relationship Between DVT Performance and Demographic Factors, 163 Method for Evaluating the Normative Reports, 163 Summary of the Status of the Norms, 164 Summaries of the Studies, 164 Conclusions, 170
III. LANGUAGE 10. Boston Naming Test, 173 Brief History of the Test, 173 Studies Using BNT Error Quality Analyses, 174 Current Views on the Mechanisms Underlying Confrontation Naming Deficits, 176 Modifications and Short Versions of the BNT, 177 Cultural Adaptations and Culture-Specific Normative Data for the BNT, 178 Psychometric Properties of the Test, 179 Relationship Between BNT Performance and Demographic Factors, 180 Method for Evaluating the Normative Reports, 182 Summary of the Status of the Norms, 182 Summaries of the Studies, 183 Results of the Meta-Analyses of the Boston Naming Test Data, 197 Conclusions, 199
11. Verbal Fluency Test, 200 Brief History of the Test, 200 Psychometric Properties of the Test, 202 Cognitive Mechanisms Underlying Word Generation, 202 Biochemical and Anatomical Correlates and Effect of Brain Pathology on Verbal Fluency, 203 Assessment of Verbal Fluency in Different Languages, 205 Relationship Between VFT Performance and Demographic Factors, 206 Method for Evaluating the Normative Reports, 208 Summary of the Status of the Norms, 209 Summaries of the Studies, 209 Results of the Meta-Analyses of the Verbal Fluency Data, 235 Conclusions, 237
IV. PERCEPTUAL ORGANIZATION: VISUOSPATIAL AND TACTILE 12. Rey-Osterrieth Complex Figure, 241 Brief History of the Test, 241 Administration Procedures, 241 Alternate Versions, 242 Scoring Systems, 243 Reliability, 248 Clinical Utility, 249 Culture-Specific Studies and Normative Data for the ROCF, 251 Relationship Between ROCF Performance and Demographic Factors, 251 Method for Evaluating the Normative Reports, 253 Summary of the Status of the Norms, 254 Summaries of the Studies, 255 Results of the Meta-Analyses of the ROCF Data, 269 Conclusions, 270
13. Hooper Visual Organization Test, 272 Brief History of the Test, 272 Construct Validity, 273 Psychometric Properties of the Test, 274 Relationship Between HVOT Performance and Demographic Factors, 274 Method for Evaluating the Normative Reports, 274 Summary of the Status of the Norms, 275 Summaries of the Studies, 275 Conclusions, 277
14. Visual Form Discrimination Test, 278 Brief History of the Test, 278 Relationship Between VFDT Performance and Demographic Factors, 280 Method for Evaluating the Normative Reports, 280 Summary of the Status of the Norms, 281 Summaries of the Studies, 281 Conclusions, 282
15. Judgment of Line Orientation, 284 Brief History of the Test, 284 Psychometric Properties of the Test, 286 Alternate Brief Forms of the JLO, 286 Relationship Between JLO Performance and Demographic Factors, 286 Method for Evaluating the Normative Reports, 287 Summary of the Status of the Norms, 288 Summaries of the Studies, 288 Conclusions, 296
16. Design Fluency Tests, 298 Brief History of the Tests, 298 Psychometric Properties of the Design Fluency Tests, 300 Ruff Figural Fluency Test, 300 Design Fluency Test (Jones-Gotman/Milner Version), 300 Relationship Between Design Fluency Performance and Demographic Factors, 301 Method for Evaluating the Normative Reports, 301 Summary of the Status of the Norms, 302 Summaries of the Studies, 303 Conclusions, 310
17. Tactual Performance Test, 312 Brief History of the Test, 312 Psychometric Properties of the TPT, 314 Relationship Between TPT Performance and Demographic Factors, 314 Method for Evaluating the Normative Reports, 315 Summary of the Status of the Norms, 316 Summaries of the Studies, 318 Conclusions, 333
V. VERBAL AND VISUAL LEARNING AND MEMORY 18. Wechsler Memory Scale (WMS-R, WMS-III, and WMS-IIIA), 337 Brief History of the Test, 337 Relationship Between Test Performance and Demographic Factors, 344 Method for Evaluating the Normative Reports, 345 Summary of the Status of the Norms, 345 Summaries of the Studies, 346 Conclusions, 355
19. List-Learning Tests, 357 Rey Auditory-Verbal Learning Test, 357 Variability in Administration of the Rey AVLT, 357 Functioning of Different Memory Mechanisms, as Assessed by the Rey AVLT, 359 Practice Effect and Alternate Forms of the Rey AVLT, 361 Assessment of Auditory Verbal Learning with the Rey AVLT in Different Languages and Cultures, 362 California Verbal Learning Test-Second Edition, 362 Structure of the CVLT-II and Description of the Normative Data Provided in the Test Manual, 362 Alternate and Short Forms of the CVLT-II, 363 Review of the Recent Literature on the CVLT and CVLT-II, 363 Effect of Semantic Organization on Recall, 363 Anatomical Correlates, 364 Assessment of Learning and Memory in Traumatic Brain Injury, 365 Assessment of Serial Position Effect in Dementias, 366 Repeated Administration and Practice Effects, 366 Assessment of Effort with the CVLT, 367 Use of the CVLT in Other Languages and Cultures, 367 Adaptations and Alternate Versions of the CVLT, 367 Hopkins Verbal Learning Test, 368 WHO-UCLA Auditory Verbal Learning Test, 369 CERAD List-Learning Test, 370 Selective Reminding Test, 370 Other Verbal and Nonverbal List-Learning Tests, 371 Relationship Between List-Learning Test Performance and Demographic Factors, 372 Method for Evaluating the Normative Reports, 374 Summary of the Status of the Norms, 375 Summaries of the Studies, 375 Results of the Meta-Analyses of the Rey AVLT Data, 391 Conclusions, 392
20. Benton Visual Retention Test, 394 Brief History of the Test, 394 Psychometric Properties of the Test, 397 Relationship Between BVRT Performance and Demographic Factors, 398 Method for Evaluating the Normative Reports, 400 Summary of the Status of the Norms, 400 Summaries of the Studies, 402 Conclusions, 416
VI. MOTOR FUNCTIONS 21. Finger Tapping Test, 419 Brief History of the Test, 419 Relationship Between FTT Performance and Demographic Factors, 421 Method for Evaluating the Normative Reports, 422 Summary of the Status of the Norms, 422 Summaries of the Studies, 423 Results of the Meta-Analyses of the Finger Tapping Test Data, 441 Conclusions, 442
22. Grip Strength Test (Hand Dynamometer), 444 Brief History of the Test, 444 Relationship Between Hand Dynamometer Performance and Demographic Factors, 445 Method for Evaluating the Normative Reports, 445 Summary of the Status of the Norms, 446 Summaries of the Studies, 447 Results of the Meta-Analyses of the Hand Dynamometer Test Data, 457 Conclusions, 458
23. Grooved Pegboard Test, 459 Brief History of the Test, 459 Relationship Between GPT Performance and Demographic Factors, 460 Method for Evaluating the Normative Reports, 460 Summary of the Status of the Norms, 461 Summaries of the Studies, 462 Results of the Meta-Analyses of the GPT Data, 470 Conclusions, 471
VII. CONCEPT FORMATION AND REASONING 24. Category Test, 475 Brief History of the Test, 475 Alternate Formats, 477
Relationship Between Category Test Performance and Demographic Factors, 480 Method for Evaluating the Normative Reports, 481 Summary of the Status of the Norms, 482 Summaries of the Studies, 483 Results of the Meta-Analyses of the Category Test Data, 494 Conclusions, 495
25. Wisconsin Card Sorting Test, 496 Brief History of the Test, 496 Anatomical Correlates and Effect of Brain Pathology on the WCST, 498 Brief Overview of Clinical Findings Using the WCST, 499 Modifications and Alternate Formats of the WCST, 503 Psychometric Properties of the Test, 505 Relationship Between WCST Performance and Demographic Factors, 508 Method for Evaluating the Normative Reports, 511 Summary of the Status of the Norms, 512 Summaries of the Studies, 513 Conclusions, 531
References, 533
Appendices
1. Where to Buy the Tests, 611
2a. Subject Instructions for ACT According to Boone et al. (1990) and Boone (1999), 613
2b. Auditory Consonant Trigrams (Boone et al., 1990; Boone, 1999), 614
2c. Subject Instructions for ACT According to Stuss et al. (1987, 1988), 615
2d. Auditory Consonant Trigrams (Stuss et al., 1987, 1988), 616
3. WHO-UCLA Auditory Verbal Learning Test: Instructions and Test Forms, 618
4. Locator and Data Tables for the Trailmaking Test (TMT), 623
4m. Meta-Analysis Tables for Trailmaking Test (TMT), 648
5. Locator and Data Tables for the Color Trails Test, 657
6. Locator and Data Tables for the Stroop Test, 661
6m. Meta-Analysis Tables for Stroop Test (Golden Version, Interference Version), 680
7. Locator and Data Tables for Auditory Consonant Trigrams, 684
8. Locator and Data Tables for the Paced Auditory Serial Addition Test, 689
9. Locator and Data Tables for the Cancellation Tests, 705
10. Locator and Data Tables for the Boston Naming Test (BNT), 709
10m. Meta-Analysis Tables for the Boston Naming Test (BNT), 724
11. Locator and Data Tables
11m. 12. 12m. 13. 14. 15. 16. 17. 18. 19. 19m. 20. 21. 21m. 22. 22m. 23. 23m. 24. 24m. 25.
Copyright Acknowledgments, 1013
Index, 1015
BACKGROUND
1 Introduction
Clinical neuropsychology is an applied science concerned with the behavioral expression of brain dysfunction (Lezak et al., 2004). Neuropsychologists are neurobehavior specialists who administer tests and test batteries that are typically tailored to answer specific referral questions. Ideally, a neuropsychological test battery consists of well-validated, reliable, standardized, and normed measures that help to elucidate and quantify behavioral changes that may have resulted from brain injury or other central nervous system disturbances. A neuropsychological examination provides a comprehensive evaluation of cognitive domains putatively associated with various brain substrates. The cognitive domains typically assessed include language, attention/concentration, visuospatial perception and constructional abilities, frontal systems/executive functions, and verbal and nonverbal learning and memory. Sensory and motor functions as well as general intellectual functioning are also routinely assessed. Findings from a neuropsychological examination can help to highlight areas of functional strengths and weaknesses that may have focal or lateralizing significance. A neuropsychological evaluation is therefore considered an essential component in the diagnosis, treatment planning, and care of patients with suspected congenital or acquired brain dysfunction.
After administration of the test battery, the neuropsychologist is faced with making sense of a plethora of numerical and qualitative data. To make optimal use of the test data, the neuropsychologist must have an understanding of what constitutes "normal" performance on the tests before an opinion regarding the strengths and/or weaknesses of various neurobehavioral capacities can be offered. To be meaningful, test scores must have an empirical frame of reference. Normative data provide this empirical context and represent the range of performance on a particular test of a group of medically/neurologically healthy individuals with relatively homogeneous demographic characteristics. These normative reference groups are considered the "gold standard" against which an individual's test performance is compared and contrasted. However, while normative data are a critical starting point for the interpretive process, they do not provide the sole basis for the interpretation of test scores. Test data alone do not provide an adequate basis for making sound clinical judgments regarding cognitive functioning. As Lezak et al. (2004) stress, interpretation of the data obtained from a neuropsychological examination must take into account qualitative observations and the patient's history, background, present circumstances, motivation, attitudes, and expectations regarding self and the examination. A formal evaluation of the patient's emotional functioning and personality characteristics is also an intrinsic part of neuropsychological evaluation. All this information taken together provides a framework for an accurate understanding of a patient's cognitive strengths and limitations.
To illustrate the relationship between different sources of information, consider Figure 1.1. As can be seen, no single section of the pyramid in Figure 1.1 should be used alone to form an opinion about neuropsychological functioning. All interpretive elements (raw data/norms, observations of test-taking behaviors, and medical history/presenting symptoms) must play a role in building the evidence that forms the basis for a professional opinion. Further shaping this interpretive process, of course, is the neuropsychologist's clinical judgment, which is influenced by his or her education, professional experience, and research knowledge base.
[Figure 1.1. Graphic representation of the relationship between different sources of information contributing to the decision-making process in neuropsychology. Surviving labels: Raw Data/Norms, Observations, Interpretation, Report.]
The history (including medical, psychiatric, educational, vocational, and avocational) and presenting symptoms are important in understanding test data. A detailed medical/psychiatric history is an especially important source of information given that neuropsychological test performance can be greatly influenced by medical or psychiatric conditions. Documenting risk factors known to affect neuropsychological test performance is essential since an important task of neuropsychologists is to properly attribute the contribution of peripheral nervous system, central nervous system, and medical and/or emotional dysfunction to the clinical picture.
It is also important to observe the qualitative process of performance leading to a specific score on a test. Reporting a score without revealing how it was obtained can sometimes be misleading. For illustrative purposes, assume that the criterion for passing a particular component of a driving test is that the car is parked in the garage. We look and, yes, the car is in the garage: criterion met. The driver passed the test. Or did he? We interview an observer of the event and sadly learn that the person drove through the garage door to get there! Obviously not a stellar performance. Similarly, with neuropsychological testing, noting how an individual obtained a score can be quite illuminating. Consider two 75-year-old architects (patient A and patient B) who each obtain a score of 36/36 on the Rey-Osterrieth test. On a normative basis alone, performance scores for both patients would be considered within normal limits, but how did these two architects obtain the score? Patient A quickly recognized the overall gestalt of the drawing, drew a box, and filled in the details. Patient B failed to appreciate the drawing's overall shape and built up the figure by accretion, taking 8 minutes to do so, with numerous erasures. Although patient B produced the same score as patient A, the dramatically different approach that patient B followed to complete the design suggests significant loss in spatial/constructional ability.
In observing a patient's test-taking style, it is important to assess attitude, effort, and motivation. In all evaluations, one must ascertain whether patients are offering their best performance. However, low test scores can be a product of lack of effort, incorrectly following test instructions, or deliberate failure. Fortunately, several free-standing effort tests as well as cut-offs for standard cognitive tests sensitive to the presence of suspect effort are now available to detect noncredible performance. When suspect scores are obtained on such measures, it may be important to confirm that the patient correctly followed the instructional demands of the test. In our clinical experience, we have had cases (although rare) of patients with well-documented significant brain injuries, their diagnosis supported with structural and functional neuroimaging data (CAT, MRI, fMRI, SPECT, PET), who nonetheless scored below chance level on some of the most popular forced-choice and probability theory-based tests specifically designed to assess for symptom validity and motivation. Therefore, when questionable scores are obtained on measures assessing for symptom validity, motivation, and effort, we recommend that the examiner, if possible, obtain from patients a description of what they thought they were asked to do and how they solved the test. The examiner may be very surprised by the answers. As an example, because of the forced-choice nature of its design, we sometimes use the Warrington Facial Recognition Test as one of several measures to assess for symptom validity, motivation, and effort. However, it is possible to score below chance level if the patient decides that he or she is going to select all the "unpleasant" faces, determined by whether the face is looking away from the viewer, not smiling, or otherwise not staring the viewer in the eyes.
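What "below chance level" means on a forced-choice measure can be made concrete with the binomial distribution: if every item offers two alternatives, pure guessing gets each item right with probability 0.5. The sketch below is only an illustration under that assumption. The function name `prob_at_most`, the 50-item test length, and the score of 17 are hypothetical and are not taken from any test manual.

```python
from math import comb


def prob_at_most(correct, items, p_guess=0.5):
    """Probability of `correct` or fewer right answers if every item is guessed.

    Sums the binomial probabilities for 0..correct successes out of `items`
    trials, each with success probability `p_guess`.
    """
    return sum(
        comb(items, k) * p_guess**k * (1 - p_guess) ** (items - k)
        for k in range(correct + 1)
    )


# Hypothetical case: a 50-item, two-alternative test with 17 items correct.
items, correct = 50, 17
p = prob_at_most(correct, items)
print(f"P(score <= {correct} of {items} by guessing alone) = {p:.4f}")
```

A small probability shows only that the score is unlikely to have arisen from guessing. As the "unpleasant faces" example illustrates, it does not by itself show why the patient responded that way.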
Generally, the overall profile of scores and the consistency of successes and failures across different measures addressing similar cognitive domains greatly aid in determining the validity of test results. This issue is especially important in medical-legal evaluations due to the possibility of identifiable secondary gain.
In neuropsychological practice, one examines the pattern of test performance within a functional domain to assure greater probability of accurate interpretation. No single test score is sufficient to render a judgment regarding brain dysfunction. For instance, a single score on a verbal memory test has little meaning unless interpreted in the context of the individual's pattern of performance on other tests which also assess verbal memory functioning. The reason for this is quite simple. In most neuropsychological examinations, one can expect to find an instance of unexplained performance deviation. This deviation may be due to an attentional lapse or to measurement error. Obviously, an occasional poor test finding, more or less in isolation, has less meaning than a pattern of poor test findings occurring within a specific cognitive domain. Also, it is important to keep in mind that no test is a perfect measure of what it purports to assess. Therefore, it is often important to administer more than one test to assess a particular functional domain, especially if a lower than expected score is obtained on an initial examination. For instance, consider a neuropsychologist who administers the Grooved Pegboard Test to a right hand-dominant patient as the only measure of motor functioning. The medical history establishes that the patient has never previously sustained any significant bone or soft tissue injuries to his fingers, hands, arms, shoulders, or neck. Yet, when administered the Grooved Pegboard Test, the patient obtains a lower score for the dominant hand than for the nondominant hand, and the neuropsychologist regards this as evidence of left hemisphere dysfunction. The problem with this interpretation, of course, is that poor performance on the Grooved Pegboard Test might be due to factors other than a motor deficit, such as a "practice effect," which benefits the second hand to be tested.
Poor dominant hand performance might also be explained by a patient's idiosyncratic approach and slow start on the task. Therefore, in this situation, it would be important to administer other motor tests, such as the Finger Tapping Test or Grip Strength (aka, Hand Dynamometer) Test, to determine whether the original findings can be corroborated. Thus, more than one test should be administered when assessing performance within a specific functional domain so that the internal consistency of performance findings can be judged before offering an opinion about function. In summary, failure to consider all aspects of the interpretive process will increase the probability of faulty inferences being drawn from the neuropsychological data obtained.
TEST-TAKING ENVIRONMENT
Neuropsychologists evaluate behavior with neuropsychological test procedures. The task of the neuropsychologist is to obtain from the patient the best possible performance. To achieve this, the neuropsychologist must develop a rapport with the patient, gain cooperation, and conduct the evaluation in an environment that is as free as possible from distracting influences. This is referred to as the "ideal" test environment. As Lezak et al. (2004) commented,
It is not difficult to get a brain damaged patient to do poorly on a psychological examination, for the quality of the performance can be exceedingly vulnerable to external influences or changes in internal states. All an examiner need do is make these patients tired or anxious, or subject them to any one of a number of distractions most people ordinarily do not even notice, and their test scores will plummet. In neuropsychological assessment, the difficult task is enabling [emphasis ours] the patient to perform as well as possible. (p. 130)
Indeed, discovering and documenting what a patient is capable of doing, or not doing, under the best of circumstances is of tremendous clinical value from diagnostic and prognostic viewpoints. The neuropsychologist is ethically bound to ensure that the testing session is conducted in an atmosphere that promotes cooperation, accuracy, and honesty and that minimizes any chance of collecting less than optimal data from the patient because of negative influences of the examiner and/or other influences, demands, or distractions. Neuropsychological test norms have been standardized under ideal test environment conditions. Therefore, any deviation from a standardized test environment should be well documented in the report because such situations almost always adversely affect the reliability and validity of the test data and, therefore, the reliability and validity of the professional opinions derived from the data. Observers, court reporters, and video/DVD and/or audio recording equipment should never be allowed to document a neuropsychological testing session. Unfortunately, requests for observers and recording are not uncommon from attorneys seeking to ensure and document that the neuropsychological tests administered to their client were fairly and correctly conducted. However, patients may behave differently when observed or recorded; Constantinou et al. (2002) observed that memory scores declined under audiotaping conditions, while Kehrer and colleagues (2000) found that the presence of an observer was associated with lower scores on measures of attention, information-processing speed, and verbal fluency. Further, the presence of observers and/or recording equipment alters the standardized examination environment. Because a nonstandardized test environment is created under these circumstances, the data obtained from the evaluation could be considered invalid. Further, the normative data used to interpret test scores were collected under standardized testing conditions and not in the presence of third-party observers or electronic recording devices. An added problem is that the use of recording equipment in a testing session may place neuropsychologists in potential conflict with state laws regulating the practice of psychology, as well as federal copyright law. In addition to legal consequences, ethical concerns are raised since Ethical Standard 9.11 of the American Psychological Association Code of Conduct requires that "psychologists make reasonable efforts to maintain the integrity and security of test materials and other assessment techniques . . . ."
The key problem is that neuropsychologists usually have no control of the recordings once they leave the office. The CD/DVD or tapes can be disseminated without regard to maintaining test security and assessment techniques. As was noted in the official statement regarding test security by the National Academy of Neuropsychology,
The potential . . . likely and foreseeable consequence of uncontrolled test release is widespread circulation, leading to the opportunity to determine answers in advance, and manipulation of test performance. This is analogous to the situation in which a student gains access to test items and the answer key for a final examination prior to taking the test. (Axelrod et al., 2000b)
Should a test become invalidated due to exposure to the public domain, redevelopment and replacement would be very costly in terms of time and money. In the interim, the professional community would be deprived of an effective assessment instrument.
TEST NORMS
Despite the critical importance of having access to normative data to facilitate clinical interpretation of test findings, there still are relatively few large-scale normative reports in the literature. This is especially evident when one considers the large number of validation studies on the utility of neuropsychological tests in discriminating groups of patients with lateralized or focal lesions. To some extent, the relatively small body of literature regarding normative research is related to the formidable logistic problems and expense associated with the execution of such studies. However, the larger problem is that researchers generally have not been supported either financially or professionally in conducting normative research. Although test norms are essential to proper interpretation in clinical and research settings, major funding agencies have not favored such studies. The key problem is that these studies are inevitably descriptive, and descriptive studies are generally not considered "scientific" since no hypotheses are being tested. Until very recently, journal editors have generally been loath to publish normative reports, opting instead for more scientific works. As a result, a substantial number of normative data sets are embedded in publications of clinical studies, making them difficult to locate. Since most researchers are employed in "publish-or-perish" environments, such research takes a low priority. These obstacles have remained in place despite the awareness that normative research is sorely needed and, indeed, essential. So, although neuropsychological assessment procedures are widely available, there remains a relative scarcity of normative data for most tests. For a few of the most popular tests (e.g., Trailmaking Test A and B, Rey Auditory-Verbal Learning Test [AVLT]), however, numerous reports are available which provide normative comparison data for performance across the greater part of the life span. The problem for clinicians and researchers, then, is which set of normative data to use. Large differences in reported scores among studies examining performance on the same test in groups of individuals that have nearly identical demographic characteristics, as well as vague descriptions of how the data were collected, compound the problem of identifying appropriate norms for comparison purposes. A frequent difficulty is that use of one set of norms may suggest that the patient is performing in the impaired range while use of another set may suggest that performance is within normal limits. Unfortunately, use of the wrong set of comparison data may pave the way for faulty inferences being drawn, perhaps also resulting in either unnecessary treatment or therapeutic neglect. To demonstrate this point, consider the hypothetical medical-legal case of a physically healthy 80-year-old male, retired factory worker, born and educated in the United States (1 year of college), who presents with complaints of memory difficulty. The patient would like to continue to handle his financial affairs, but his family is concerned about his competence to do so. In the course of evaluating the patient, you administer several memory tests including the Rey AVLT.
In the unlikely event that you were only interested in his performance score for trial 5, there would be eight normative reports listed in this volume from which to choose, which are presented in Table 1.1.
Table 1.1. Published Rey Auditory-Verbal Learning Test Normative Reports
| Investigator | Age Group (years) | n | Education (years) | Trial 5 (# of words) | Delayed Recall (# of words) | Test Location |
| Rey, 1964 | 70-90 | 15 | No info. | 9.5 (2.2) | | Switzerland |
| Query & Megran, 1983 | 70-81 | 23 | 11.4 | 5.86 (2.04) | 3.45 (2.92) | N. Dakota (all Veterans Administration inpatients with physical complaints) |
| Cohen et al., pers. comm. | 75-89 | 4 | 13.8 | 9.25 (1.89) | 8.25 (2.63) | Peoria, IL |
| Bleecker et al., 1988 | 80-89 | 11 | 13-18 | 9.2 (2.1) | | Maryland |
| Geffen et al., 1990 | 70-86 | 10 | 11.2 | 8.2 (2.5) | 5.6 (2.6) | S. Australia (all male subjects) |
| Ivnik et al., 1990 | 80-84 | 49 | ≥12 | 9.0 (2.5) | 5.5 (3.3) | Minnesota |
| Mitrushina et al., 1991 | 76-85 | 26 | 13.3 (3.6) | 9.7 (2.8) | | S. California |
| Mitrushina & Satz, 1991a | 76-85 | 16 | 14.0 (3.6) | 10.3 (2.4) | | S. California |
Let us assume that the patient's trial 5 score was 6/15 correct. Reliance on the Query and Megran (1983) normative data would suggest that test performance is in the average range (53rd percentile). However, close reading of that normative report would reveal that the data were collected on a sample of Veterans Administration patients hospitalized for a variety of physical complaints. Thus, overall performance scores of the comparison sample were probably artificially lowered because of hospitalization effects, chronic pain effects, and dysphoria. Therefore, applying the Query and Megran data would lead the examiner to conclude that the patient's performance was better than it probably was. Depending on which other remaining normative reports were used, the patient's score would fall in the low average (Geffen et al., 1990, 19th percentile; Ivnik et al., 1990, 12th percentile) or borderline (Rey, 1964, 6th percentile; Cohen et al., personal communication, 4th percentile; Bleecker et al., 1988, 6th percentile; Mitrushina et al., 1991, 9th percentile; Mitrushina & Satz, 1991a, 4th percentile) range. Unfortunately, all the studies reporting trial 5 data for this age group suffer from small sample size (n < 50). In terms of selecting the "best" study for comparison purposes, those with the smallest sample size should probably be first rejected. This would eliminate the studies of Cohen et al., Bleecker et al., Geffen et al., Rey, and Mitrushina and Satz. As noted in Spreen and Strauss (1998), use of Rey's norms (reported in 1964 but collected in 1944) should also be avoided because of test content and administration differences. These data were collected over 50 years ago in Switzerland, raising serious concerns about cohort and cultural effects. Similarly, data from the Geffen et al. (1990) report should be avoided due to cultural differences in comparing North American vs. Australian samples and the fact that the educational level of the samples was low. We had previously eliminated the Query and Megran (1983) study for the reasons mentioned earlier. Of the two normative reports remaining (Ivnik et al., 1990; Mitrushina et al., 1991), that by Ivnik et al. would be selected because of the larger sample size. The subject and procedural characteristics of these two studies are otherwise nearly identical. Finally, and most importantly, the demographic characteristics of the patient being evaluated matched well with the demographic characteristics of the participants in this normative study. If one were interested in examining the delayed recall performance of this patient, four studies would be available for normative
INTRODUCTION
comparison purposes (Query & Megran, 1983; Cohen et al., personal communication; Geffen et al., 1990; Ivnik et al., 1990). Query and Megran's report would be avoided because of the sampling problems noted earlier. The Cohen et al. report would not be considered primarily due to the small sample size. In narrowing the choices down to the two remaining studies, the Geffen et al. report would be rejected in favor of the Ivnik et al. report for reasons of sample size and educational and cultural issues and the fact that the length of delay for the delayed recall condition was 30 minutes. However, if a better normative data match is not available, it should be considered that the greatest rate of forgetting occurs during the first 20 minutes. Consequently, the effect of variability in the length of delay between different norms from 20 to 30 minutes (even 45 minutes to 1 hour) on rates of forgetting is minimal. Because no normative study ordinarily allows a perfect fit to the demographic characteristics of the patient, the examiner should be aware of the specific limitations of any data set used for inferential purposes. Because patients are often evaluated more than once on the same test (often by different examiners), clinicians are urged to document the source of comparison data used to arrive at conclusions within the body of any report. This recommendation is especially relevant for tests with multiple and/or overlapping sets of normative data (D'Elia et al., 1989). In addition to normative data, there is another set of data that can be used for comparison purposes when it is available. Clinical comparison data (aka, abnorms) represent the range of test performances of distinct groups of medically, psychiatrically, and/or neurologically compromised individuals with relatively homogeneous demographic characteristics. 
Because of the general lack of clinical comparison data, however, neuropsychologists have largely relied on normative comparison data for interpretive purposes. As a result, interpretive comments have been limited to reporting how the patient under study differs from a healthy sample. Having the ability to discuss how a patient under study is similar to other patient groups would be a welcome addition to the field. The availability of clinical comparison data sets would permit such analysis.
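The trial 5 comparison worked through earlier in this section can be sketched in a few lines of code. The sketch below is illustrative only and rests on several assumptions that are not stated in the passage: that the paired values in Table 1.1 are means and standard deviations, that scores in each normative sample are approximately normally distributed, and that the patient recalled 6 words on trial 5 (a hypothetical raw score chosen for illustration). The names TRIAL5_NORMS, REJECTED, and ASSUMED_RAW_SCORE are likewise hypothetical. The reasons attached to each rejected report paraphrase those given in the text.

```python
from statistics import NormalDist

# Trial 5 norms as read from Table 1.1: (investigator, n, mean, SD).
# Treating the paired values as mean and SD is an assumption.
TRIAL5_NORMS = [
    ("Rey, 1964", 15, 9.5, 2.2),
    ("Query & Megran, 1983", 23, 5.86, 2.04),
    ("Cohen et al., pers. comm.", 4, 9.25, 1.89),
    ("Bleecker et al., 1988", 11, 9.2, 2.1),
    ("Geffen et al., 1990", 10, 8.2, 2.5),
    ("Ivnik et al., 1990", 49, 9.0, 2.5),
    ("Mitrushina et al., 1991", 26, 9.7, 2.8),
    ("Mitrushina & Satz, 1991a", 16, 10.3, 2.4),
]

# Reports set aside in the text, with the main reason given there.
REJECTED = {
    "Query & Megran, 1983": "hospitalized inpatient sample",
    "Cohen et al., pers. comm.": "small sample",
    "Bleecker et al., 1988": "small sample",
    "Geffen et al., 1990": "small sample; cultural and educational differences",
    "Rey, 1964": "small sample; cohort and cultural effects",
    "Mitrushina & Satz, 1991a": "small sample",
}

ASSUMED_RAW_SCORE = 6  # hypothetical trial 5 score, not stated in the passage


def percentile(score, mean, sd):
    """Percent of a normal reference distribution scoring at or below `score`."""
    return 100 * NormalDist(mean, sd).cdf(score)


def percentiles_by_report(score):
    """Rounded percentile of `score` against each normative report."""
    return {name: round(percentile(score, mean, sd))
            for name, _n, mean, sd in TRIAL5_NORMS}


def preferred_report():
    """Largest-sample report among those not set aside."""
    remaining = [row for row in TRIAL5_NORMS if row[0] not in REJECTED]
    return max(remaining, key=lambda row: row[1])[0]


for name, pct in percentiles_by_report(ASSUMED_RAW_SCORE).items():
    print(f"{name:27s} percentile {pct:2d}")
print("Preferred report:", preferred_report())
```

Under these assumptions the same raw score lands anywhere from roughly the 4th to the 53rd percentile depending on the report chosen, which is the point of the example: the interpretive label attached to a score depends heavily on the comparison sample, so the source of the norms should always be documented.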
STANDARD AND EXPERIMENTAL TESTS

Neuropsychologists use published, standardized measures that are well normed and generally accepted as standard tools of assessment in the field. There are several reasons why neuropsychologists must utilize standardized measures. Patients presenting for neuropsychological evaluation are frequently reevaluated over time, often by different examiners. As Figure 1.1 illustrates, the test data collected during the examination are an essential ingredient in forming a professional opinion regarding neuropsychological functioning. Since the written report of findings is a professional communication from one clinician to another, for the report to be meaningful and therefore useful to other professionals, the data must have been obtained from tests that are familiar to, or can be easily referenced by, any clinician. The use of standard tests and administration/scoring procedures is especially critical in initial examinations of patients, to establish a meaningful baseline. Baseline data are essential for subsequent comparison with retest data, to document whether there has been improvement or decline in functioning. If tests are employed in the initial exam which are unique to the neuropsychologist administering them, then subsequent examiners will be unable to make comparisons between baseline and subsequent performances.

In a forensic context, the use of nonstandard tests and procedures impedes the fact-finding process. Unfortunately, it is not uncommon to review reports of medical-legal examinations that have relied heavily or exclusively upon findings from experimental measures as the basis for an opinion of neuropsychological impairment. Not surprisingly, the experimental measures employed almost always are purported to be tests of memory, attention/concentration, and/or frontal system functioning, three cognitive domains that are especially sensitive to any brain insult, regardless of etiology. The use of experimental tests and the claims made to courts that the measures are at the "cutting edge" and provide objective evidence of impaired brain function often result in, at least for a while, increased business for the provider of this service. Rather than providing neuropsychological service, however, neuropsychologists conducting themselves in this way are practicing a form of modern-day alchemy: they have found a way to turn their time into gold. Such practice not only harms the patient through mislabeling but also undermines the legitimacy of neuropsychological practice in the forensic arena.

Increasingly, neuropsychologists are being called upon to provide expert-witness courtroom testimony as to whether brain-behavior functioning is "intact," "impaired," or otherwise "compromised" after a brain insult (Hom, 2003; Leckliter & Matarazzo, 1989; Matarazzo, 1990; Satz, 1988). This development seems reasonable since neuropsychology is an applied science concerned with the behavioral expression of brain function and dysfunction. However, the days of being able to state in court "It is my professional experience that . . ." without also providing the objective basis to support an opinion (i.e., normative data, clinical comparison data, history, medical records, symptoms, observations) are over. Neuropsychologists are now regularly asked by knowledgeable attorneys to produce the objective basis, particularly the normative data, for their opinions. This development stems, in part, from the Faust et al. (1991) two-volume tome entitled Brain Damage Claims: Coping with Neuropsychological Evidence, which provided scathing reviews of the field of neuropsychology for attorneys. Neuropsychologists must be prepared to state why they used a particular set of normative data for comparison purposes.
Although this line of inquiry by attorneys is a relatively new development in the courtroom, the practice is sound and should be welcomed. Why? Let us look at what our professional rules of practice tell us. Ethical Standard 9.06 of our code of ethics (American Psychological Association, 2002) states "When interpreting assessment
BACKGROUND
results . . . psychologists take into account the . . . characteristics of the person being assessed . . . that might affect . . . judgments or reduce accuracy of their interpretations." Ethical Standard 9.09(c) states "Psychologists retain responsibility for the appropriate application, interpretation, and use of assessment instruments, whether they score and interpret such tests themselves or use automated or other services."

Of course, experimental measures are occasionally used during the course of an evaluation. This is tolerated by our field because experimental test development is a natural evolution in a discipline that advocates and uses research to advance knowledge about clinical assessment and diagnosis. However, findings from experimental tests should be used only to supplement and support findings from standardized procedures that were also administered. Findings from experimental tests should never be used as the primary basis for forming opinions about impaired neuropsychological functioning. Serious professional and ethical concerns are raised when neuropsychologists substantially deviate from standard test administration and scoring procedures and/or rely heavily on experimental tests for clinical judgments.
When Is a Test Considered Experimental?

There are at least four levels of experimental tests (presented in order from most experimental to least):

Level 1 tests have never been peer-reviewed or published and are typically uniquely utilized by the neuropsychologist who developed them. Often, normative data are sparse or nonexistent.

Level 2 tests, although never published, may have been reviewed by peers and used by a specific group of neuropsychologists conducting a multisite research study. Again, normative data may be sparse or lacking.

Level 3 tests often have been widely distributed to interested parties for experimental use; however, there have been no published studies using these tests. Preliminary normative, reliability, and validity data are usually available.

Level 4 tests have been carefully described in peer-reviewed journals as having been included in a study where several other standardized tests were administered. Preliminary normative data are available, and there is some information about the test's reliability and validity. These tests generally have not yet been formally published by a recognized test publisher/distributor but are made available by the author(s) to interested parties for research/experimental use.

Whereas findings based on level 1, 2, and 3 tests should be given little clinical weight and viewed with extreme caution, a level 4 test has at least undergone formal peer review before publication of the findings in a recognized professional journal. Therefore, although caution is warranted when discussing findings from a level 4 experimental test, the results may be used if they are buttressed by similar findings from more formal, standardized tests.

What Determines Whether a Test Is Considered "Standard"?

At a minimum, three of the following four criteria should be met before a test is regarded as being in standard use:

1. The test must be readily available to the professional community and adequately normed. Using this definition, however, it is possible for a test considered "standard" to slide into the "experimental" range when not adequately normed for the age or sociocultural group under study. For instance, the majority of our "standard" tests must be considered "quasi-experimental" when assessing adults over age 80, members of minority groups, or any individual who does not speak English or uses it as a second language. In each such case, adequate normative data are often sparse or lacking. Fortunately, several research groups are continuing to address the need for norms within the upper age ranges and for various cultural subsets.

2. The test stimulus and materials should be standardized. A manual describing test administration and scoring procedures and providing information on reliability and validity should be available. These requirements usually imply that the test has been formally published by a respected test publisher/distribution company. However, several tests that are considered standard have never been formally published. For instance, the Rey AVLT has never been formally published nor have the administration procedures been standardized; but most of the necessary information on norms, reliability, and validity for this test can be found in the neuropsychological literature.

3. Research using the test must have been peer-reviewed and published in recognized professional journals.

4. The test has been reviewed in the Mental Measurements Yearbook and/or in more than one neuropsychological text by authors not connected with its development.

In conclusion, neuropsychologists are responsible for choosing the particular tests to answer referral or research questions and the best possible clinical comparison data (both norms and "abnorms"). Neuropsychologists are also responsible for ensuring that the test measures are administered properly and in a comfortable, nonthreatening, and distraction-free environment that will enable the patient to perform to the best of his or her ability. Being more accountable for what is done and being able to elucidate why it was done will not only raise the credibility of neuropsychology as a distinct specialty but also enhance the use of neuropsychological services within the medical and legal communities.
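The "three of four criteria" rule for standard use described above can be expressed as a simple checklist. The following is a minimal sketch, not an established scoring scheme: the class InstrumentEvidence, its field names, and the function in_standard_use are hypothetical, and deciding whether each criterion is actually met remains a matter of professional judgment.

```python
from dataclasses import dataclass, astuple


@dataclass
class InstrumentEvidence:
    """Hypothetical record of which of the four criteria a test meets."""
    available_and_adequately_normed: bool    # criterion 1, for the group being assessed
    standardized_materials_and_manual: bool  # criterion 2
    peer_reviewed_publications: bool         # criterion 3
    independent_reviews: bool                # criterion 4


def criteria_met(evidence):
    """Count how many of the four criteria are satisfied."""
    return sum(astuple(evidence))


def in_standard_use(evidence, minimum=3):
    """Apply the 'at least three of four' rule from the text."""
    return criteria_met(evidence) >= minimum


# A hypothetical test lacking a formal manual but meeting the other criteria.
general_adult_use = InstrumentEvidence(True, False, True, True)
# The same hypothetical test used with a group for which norms are lacking.
unnormed_group_use = InstrumentEvidence(False, False, True, True)

print(in_standard_use(general_adult_use))
print(in_standard_use(unnormed_group_use))
```

The second case illustrates the point made under criterion 1: the same instrument can fall short of standard use when it is applied to an age or sociocultural group for which adequate norms are not available.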
2 Use of Methodological Concepts in Neuropsychology Practice
INTERFACE OF NEUROPSYCHOLOGY WITH OTHER CLINICAL DISCIPLINES

The notion of mental status has three components:

1. Mood and affect
2. Perception and content/process of thought
3. Cognitive status

Factors such as appearance, motor activity, insight, and motivation can be incorporated into these three components of mental status. Clinicians specializing in different disciplines approach the mental status evaluation from different perspectives. For psychiatrists, the presence of the following symptomatology is of primary concern: affective symptoms (e.g., depression, mania, rapid cycling), perceptual disturbances (e.g., hallucinations), disturbance in the content of thought (e.g., delusions), and disturbed process of thought (e.g., tangentiality, loosening of associations, flight of ideas). Assessment of cognitive status represents one of the aspects of psychiatric evaluation and includes brief appraisal of the level of consciousness, orientation, attention, memory, language, ability to follow verbal commands, calculations, abstract reasoning, fund of general information, and judgment.

Assessment of cognitive and affective components of mental status also constitutes part of a neurological evaluation that addresses higher cortical and limbic system functions. In addition, the neurological evaluation focuses on the integrity of the lower levels of the nervous system through assessing functions of the cranial nerves, motor systems, sensory systems, reflexes, coordination, station, and gait.

Psychiatric and neurological approaches are redefined in the context of neuropsychiatry, a recently rediscovered medical discipline that has its roots in the notion of "psychosomatics." Neuropsychiatry is concerned with both neurological and psychiatric symptoms of brain-related disorders and is supported by advances in neuroimaging, psychopharmacology, genetics, and molecular biology.

In the context of psychiatric, neurological, and neuropsychiatric evaluations, cognitive status is assessed by unstructured questioning or through administration of structured screening instruments, allowing quantification of a patient's cognitive status (e.g., Mini-Mental State Examination: Folstein et al., 1975). This assessment is brief (limited to 10-20 minutes) and, therefore, yields only a gross estimate of
METHODOLOGICAL CONCEPTS IN NEUROPSYCHOLOGY
cognitive abilities. If indications of cognitive decline are evident on the cognitive screening, the patient is referred for a neuropsychological evaluation. Assessment of the cognitive component of mental status is the primary focus of neuropsychological evaluation, though affective state and content/process of thought are also attended to. The standardized neuropsychological instruments that are used in a comprehensive assessment of different cognitive domains are more sensitive to subtle functional deficits than the gross screening tools used in psychiatric, neurological, and neuropsychiatric examinations.
APPLICATIONS OF NEUROPSYCHOLOGICAL EVALUATION

The utility of psychological testing is in the midst of a heated controversy. Over the years, the psychometric approach has been criticized for being too mechanistic and biased against underrepresented groups. Neuropsychological evaluation is not limited to administration and interpretation of the tests; instead, it provides a comprehensive picture of the patient, placing test performance in the context of expectations specific for a particular individual. This approach overcomes the flaws inherent in testing.

Clinical neuropsychology is a relatively young discipline that continues to shape itself in response to the pressing needs of clinical practice. With changing environmental demands, clinical neuropsychology redefines its objectives, its applications, and its relationship with other disciplines and revises its armamentarium. Whereas in the 1940s its main role was in the localization of lesioned brain areas, this need has faded with the rapid advancement of neuroimaging. The focus of clinical neuropsychology has shifted toward developing methods for diagnosing developmental and acquired disorders of brain function, which constitute a part of the multidisciplinary neurodiagnostic work-up, and toward understanding the cognitive status of an individual and his or her functional efficiency in dealing with daily tasks, as well as planning and implementation of rehabilitative efforts to optimize the functional level of an individual.

Recent surveys and consensus statements document the utility of neuropsychological evaluation in different clinical settings: for example, in the differential diagnosis of dementia (McKhann et al., 1984; Roman et al., 1993), in screening for cognitive impairment in the psychiatric emergency service (Copersino et al., 2003), in assessment of neurobehavioral outcomes after cardiac surgery (Murkin et al., 1995), etc. The role of neuropsychological assessment in evaluating patients undergoing epilepsy surgery and in management planning for patients with suspected dementia, multiple sclerosis, Parkinson's disease, traumatic brain injury (TBI), stroke, or HIV encephalopathy is addressed in the report issued by the American Academy of Neurology (1996). The report describes the contribution of neuropsychological assessment to our understanding of neurological disorders in adults and documents its strengths and limitations. Although the report offers endorsement of neuropsychological assessment as "appropriate" and rates it as "established," it stirred up a controversy in the field of clinical neuropsychology (Bieliauskas et al., 1997a; Bigler & Dodrill, 1997; Hartlage, 2001; Reitan & Wolfson, 2001).

Clinical neuropsychology, equipped with recent theoretical and clinical advancements, alliance with neuroimaging, and interdisciplinary affiliations, has a wide range of applications:

1. Neuropsychological evaluation is used in the differential diagnosis of conditions involving cognitive dysfunction (e.g., dementia of Alzheimer's type vs. vascular dementia vs. depression). It is especially useful in identifying patterns of impairment in patients with subtle deficits (as confirmed in the report by the American Academy of Neurology, 1996) and sensitive to abnormalities in brain function that are not detectable by neuroimaging.
One of the applications of neuropsychological evaluation is to determine the effect of medical disorders, specifically involving the central nervous system, TBIs, infections, metabolic abnormalities, oxygen deprivation, exposure to neurotoxins, or developmental and psychiatric disorders on cognition. In the past, differential diagnostic questions addressed to a neuropsychologist were framed in terms of organic vs. functional etiologies of disturbance. With the growing evidence of neuropathological and chemical correlates of functional disorders, the organic vs. functional dichotomy becomes obsolete.

2. Neuropsychological evaluation is useful in defining baseline levels of cognitive functioning for longitudinal comparisons with follow-up data. For example, even a subtle age-related decline in attention, decision-making, judgment, and visuospatial abilities in pilots might lead to failure to respond adequately in a critical situation. In view of the tremendous responsibility for human lives imposed on pilots, early signs of age-related decline need to be identified through longitudinal comparisons of performance. In another example, a baseline profile obtained on a child with a suspected learning disability before entering primary school can be compared with the results obtained 1 year later. This defines the rate of acquisition of a particular skill and facilitates decision making regarding the necessity of remedial intervention and special education/resource classroom placement. Longitudinal follow-up is also used to identify the rate of improvement or deterioration in cognitive status and response to treatment. For example, follow-up evaluations of a head injury patient help to identify the rate of improvement with or without treatment in comparison to an initial evaluation taken a few weeks after the accident. This allows prediction of the extent of future recovery, the nature and severity of residual deficits, and the highest functional level to be achieved upon recovery.
Availability of this information affords realistic expectations as to the limits of recovery, facilitates adjustment of patients and their families to possible changes in lifestyle, and prompts reassignment of responsibilities within the family. Similarly, the rate and pattern of cognitive deterioration associated with dementia, as identified by follow-up probes, provide useful diagnostic and prognostic information. Repeated assessment is also used to identify changes in cognitive status due to intervention such as cognitive remediation, to determine the efficacy/toxicity of a medication, and to identify the effect of radiation, chemotherapy, or other treatment modalities on cognitive status. Pre- and postsurgery assessments allow identification of cognitive deficits resulting from neurosurgery and follow-up on the progress in cognitive remediation.

3. Neuropsychological data are used in evaluating patients' employability and constitute a basis for determining whether they meet the criteria for disability. Furthermore, these data are relied upon in determining patients' competence in handling their legal and financial affairs, capacity to participate in medical and legal decision making, and ability to function independently in an everyday environment. For example, a severe deficit in executive functions related to amnesia associated with Korsakoff's syndrome renders a patient unable to maintain basic self-care in spite of retaining a high level of psychometric intelligence. In such cases, the results of an evaluation identify the need for close supervision of the patient's everyday activities to protect the patient from inadvertent self-neglect and to help him or her structure and organize activities and, thus, improve the quality of life. Information about the pattern of cognitive abilities is also used in decision making regarding enrollment/return to school to obtain an advanced degree, need for special accommodations at work and in educational and exam-taking settings, choice of psychosocial interventions, determination of factors primarily responsible for the patient's complaints (e.g., postconcussion syndrome vs. somatization tendency, factitious disorder, symptom magnification, or malingering), and prediction of efficiency in vocational and daily functioning. In the context of forensic criminal evaluations, this information is used to assess the integrity of cognitive processing in claims of insanity defense and in determining competence to stand trial. The neuropsychological performance pattern can also shed light on the differential contribution of more recent, transient factors vs. longstanding factors in a patient's cognitive profile. This distinction is especially relevant in the forensic context.

4. Neuropsychological evaluation identifies cognitive strengths and weaknesses in a patient's functional status. This knowledge facilitates the selection of remedial techniques and rehabilitative strategies. It guides clinicians in their choice of rehabilitation strategy, focusing on remediation of weaknesses or focusing on compensation for cognitive losses using intact abilities.
DIFFERENT LEVELS OF DATA INTEGRATION IN NEUROPSYCHOLOGY PRACTICE

As suggested in Chapter 1, sole reliance on test scores and their normative references would frequently result in misinterpretation of a patient's cognitive profile. The high vulnerability of such interpretation to an erroneous outcome hinges on multiple sources of error in different components of the testing process. These sources can be identified as follows:

1. Test Construction: Error is inherent in test construction due to the fact that the psychometric properties of the test are specific to the ability level of the subjects on which they were obtained; the values of item difficulty and item discrimination are sample-specific and sensitive to cultural factors, and the variance of errors of measurement is unequal across subjects, with the consistency in performance of high-ability subjects being higher than corresponding values in average- and low-ability subjects.

2. Test Administration: In spite of a meticulous description of procedures aimed at uniformity in test administration, an examiner's individual style and personal tempo as well as idiosyncratic features of a subject's test-taking style introduce variability in administration.

3. Scoring: Test authors define scoring criteria for each test item. At times, however, patients' responses are ambiguous and present a scoring dilemma. As a result, interrater reliability for the majority of tests is less than perfect.

4. Norms: In using norms, error can be introduced by comparing individual performance to an inappropriate target population, outdated norms, or norms based on a small sample size.

5. Interpretation: Sources of error in test interpretation are outlined in the next section. In addition, caution should be taken when performance on an individual test is interpreted as being representative of a circumscribed cognitive ability. Cognitive abilities are highly interrelated, and performance on any test is dependent on the integrity of many different abilities and on the overall level of alertness.

6. Test-Taker Characteristics: Test performance is influenced by motivational factors, emotional status, familiarity with test strategy, and the cultural and linguistic background of the test-taker.
Incorporating behavioral observations and qualitative performance indices in the interpretation and decision-making process considerably improves the accuracy of attributions relating low performance scores to faulty cognitive mechanisms. Luria's approach to test interpretation heavily emphasizes the qualitative aspect of performance. This direction in neuropsychological practice was further promoted by the efforts of Edith Kaplan. Her introduction of the Wechsler Adult Intelligence Scale-Revised (WAIS-R) as a Neuropsychological Instrument (WAIS-R-NI)
(Kaplan et al., 1991) attests to the importance of the qualitative performance indices even in the context of structured battery assessment. This movement toward attending to performance quality without compromising standardization of test administration procedures (or with minimal modification of the procedures) is also reflected in Lezak et al.'s (2004) distinction between "optimal" and "standard" testing conditions. Based on a comprehensive review of recent developments in this area, Caplan and Shechter (1995) formulated the distinction between testing and evaluation as follows:

We view the former as a largely mechanical enterprise that, because of its rigidity, lends itself well to group or computer-based applications. Evaluation is, by contrast, an art applied on an individual basis that involves not only testing skills, but also professional creativity, observational expertise, flexibility, and ingenuity in the service of developing a multidimensional understanding of patients: their abilities and deficits, their emotional state, self-regulatory functions, the impact of environmental variables on test performance, and so forth. (pp. 359-360)

The authors passionately advocate flexibility in testing procedures, specifically in the rehabilitation setting, to allow patients to maximally express their potential in test performance. A similar appeal to "see beyond the test data" in offering an opinion on psychological functioning in the litigation setting was voiced by Matarazzo (1990). In forensic evaluations, it is especially important to address numerous sources of bias in the test data (see van Gorp & McMullen, 1997), which affect the accuracy of interpretations. Matarazzo proposed a distinction between psychological testing and psychological assessment, where the latter incorporates historical information, medical history, and other relevant information in clinical decision making. Meyer et al. (2001) refined the distinction with an emphasis on usage of multiple test methods in the latter, incorporated in the context of historical information and behavioral observations, and addressed applications of the obtained information.
According to this model, the following two levels of data integration should be considered in neuropsychological practice:

1. Testing refers to the psychometric aspects and addresses the quantitative appraisal of a patient's performance on different measures. It yields a score or a set of scores that allow comparison with normative data or with a patient's own scores across different tests and over time.

2. Assessment incorporates qualitative aspects of test interpretation in addition to psychometric determination of a patient's relative standing in reference to the normative data. It is reliant on behavioral observations to allow better understanding of the nature of difficulties in test performance and of dysfunctional cognitive mechanisms contributing to low test scores. The clinician integrates various sources of information to place the interpretation of a patient's psychometric profile in the context of his or her history and current condition. Information is based on behavioral observations and an interview with the patient, in addition to the patient's test performance. Additional information can be obtained from medical and school records, interviews with significant others, schoolteachers, nursing staff, etc.

The following issues constitute the essence of a neuropsychological assessment: the psychometric aspect of a patient's performance across cognitive domains; qualitative interpretation of dysfunctional mechanisms; the patient's behavior and interaction with the clinician; effort/motivation to perform on the tasks; other aspects of mental status, including affective state; personality characteristics impacting information processing; demographic information, including educational and occupational history; medical and psychiatric history; family history; current symptomatology, progression of symptoms, and treatment; sources of social/financial support and living conditions; motivation to improve and future plans.

Neuropsychological evaluations based on sound assessment techniques, with proper consideration of a patient's background and sources of bias, increase the accuracy of the clinician's judgment and decision making.
JUDGMENT AND DECISION MAKING IN CLINICAL NEUROPSYCHOLOGY In various applications of neuropsychological evaluation, the clinician makes judgments as to differential diagnosis, effect of remedial intervention, employability, capacity to function independently in daily life, etc. How valid are these judgments? Different opinions have been expressed in the literature as to the validity and basis for judgments in neuropsychology. Limitations of judgments based solely on clinical data are addressed in several publications (Dawes et al., 1993, 2002; Faust, 1991; Wedding & Faust, 1989). The authors present a rationale for the superiority of actuarial methods over clinical methods (superiority of "formula over head;" Dawes, 1993) and argue for increased utilization of actuarial methods and decision aids. Among factors impeding the accuracy of judgments and decision making in neuropsychology, they discuss imperfect reliability and validity of tests, uncertainty inherent in normative information and in effects of modifying variables on test scores, failure to recognize statistical artifacts, reliance on small samples for normative information, overreliance on scatter analysis, and failure to account for probability estimates for a given diagnosis in making diagnostic decisions. However, there is an abundance of literature to support the validity of judgments based on neuropsychological evaluations in both clinical and forensic contexts. Based on data from more than 125 meta-analyses on test validity, Meyer et al. (2001) concluded that psychological assessment is as valid as medical tests in detecting neurological conditions such as dementia. Similarly, a review and meta-analysis of 11 studies addressing reliability and validity of judgments (Garb & Schramke, 1996) indicates that judgments made by neuropsychologists are reliable and moderately valid. The authors questioned the appropriateness of the external criterion measures in the studies reviewed,
which were neurological exams alone or with the addition of neuroimaging results. As neuropsychological evaluations are more sensitive to mild brain dysfunction than the criterion measures used in these studies (e.g., mild abnormalities identified by neuropsychological evaluation would not be detectable by CT or MRI scans), the moderate rating of the validity of neuropsychologists' judgments is likely to be an underestimation. To remediate this situation in future research, the authors propose construct validation strategies that would be more appropriate in the assessment of judgment accuracy, such as a multitrait-multimethod matrix approach. They also advocate stronger emphasis on behavioral prediction to improve clinical judgment. Overall, the utility of clinical neuropsychology in the clinical and forensic settings has been widely recognized. However, researchers and clinicians have been developing stronger methodological bases in both the behavioral and actuarial domains, to improve the accuracy of judgments and decision making. Among the methodological advances, we will focus on those addressing test selection, norming, and interpretation of test scores as they are most relevant to the topic of this book.
STRATEGIES IN TEST SELECTION A variety of instruments have been developed over the years to measure cognitive abilities. Tests vary in their reliance on theoretical constructs, complexity of cognitive functions assessed, length and ease of administration, psychometric properties, relevant populations, and availability of norms. Decisions regarding tests to use depend on the preferred approach. The fixed battery approach advocates administration of a comprehensive battery of tests to all patients in invariant order. Any additional information is gathered after the battery is administered and results are analyzed, which prevents any bias in interpretation of the results. Clinical interpretation of the obtained data is compared to information available from other sources for consistency. An example of such a data-driven approach is the Halstead-Reitan Battery. The
Luria-Nebraska and Benton batteries also exemplify the fixed battery approach. In contrast, the flexible approach is based on a patient-centered model. The choice of tests is guided by the hypotheses formulated by the clinician after reviewing all available information about the patient. The battery is individually tailored for each patient to include measures used to test a priori hypotheses regarding possible patterns of cognitive dysfunction. Luria's (1980) view of neuropsychological evaluation is most descriptive of the hypothesis-driven flexible approach. It has been further expanded in the process-oriented assessment strategy exemplified by Christensen's (1974) standardization of Luria's techniques and by Kaplan's (1988) Boston Process Approach. The advantage of the fixed battery approach is in systematic acquisition of data on a wide range of measures, allowing comparisons across patients and across diagnostic groups and building extensive databases for research purposes. Use of this approach overcomes the effect of base rates on test selection, thus minimizing the probability of error in the early stages of clinical decision making. However, administration of an extensive battery to all patients, irrespective of their individual needs, leads to excessive testing and uneconomical expenditures of resources. In addition, the accuracy of the assessment is compromised if the fixed battery does not include tests sensitive to the deficits in specific functional domains suspected in a given patient. The flexible approach overcomes the shortcomings of the fixed approach. However, it is vulnerable to the effect of base rates and does not lend itself to across-patient or across-group comparisons, given extensive missing data due to differences in tests administered.
In a compromise dictated by the realities of clinical practice and the economic environment, many clinicians use a "flexible battery" approach, where a screening battery is specifically tailored for the respective diagnostic group or differential diagnosis in question (e.g., differential diagnosis of dementia in an elderly patient, assessment of cognitive deficit pattern and rate of recovery in head injury, or determination of learning disabilities in
BACKGROUND
a child). Based on the pattern of weaknesses identified by this battery and a priori-generated hypotheses, additional tests might be administered to specifically address the extent and nature of the deficits. The flexible battery approach is enjoying growing popularity. According to Sweet et al.'s (2000a) survey of the practices and beliefs of clinical neuropsychologists, endorsement of the flexible battery approach increased from 54% of a surveyed sample in 1989 to 70% in 1999. In spite of the preferential use of flexible batteries in clinical practice (Groth-Marnat, 2000; Retzlaff et al., 1992; Sweet et al., 1996), the relative merits of flexible vs. fixed batteries remain controversial (Williams, 2001). Goldstein (1997) suggested that the choice of approach should depend on the setting, nature of the disorder, theoretical approach, and specific questions to be addressed by the evaluation. For further discussion of this issue, the reader is referred to Bornstein (1990), Goldstein (1997), and Tupper (1999).
NORMATIVE REFERENCES AND INTERPRETATION OF CLINICAL DATA
Selection of the most appropriate tests alone does not assure accuracy in understanding the patient's cognitive profile. After the tests are administered and scored, the test scores need to be interpreted in reference to an appropriate set of norms. When you happen to stumble across a reference to a normative report in the discussion section of someone else's paper, you often encounter words to the effect that the data are based on biased samples (because the average FSIQ of the group is 120), with the conclusion that the data are "therefore of limited use." There is no such thing as the best normative data for any test since only the clinician can determine what report is best applied given a specific patient and situation. All normative data are of limited use. The data are limited to use with patients whose demographic characteristics are similar to those of the normative data sample (e.g., Heaton et al., 1986; Kalechstein et al., 1998; Ross & Lichtenberg, 1998; Van Gorp & McMullen, 1997) and match the
administration/scoring procedures of the test utilized. The influence of the demographic factors is even more apparent for neurologically normal individuals than for those who have cerebral dysfunction (Heaton et al., 1986). Another consideration in choosing an appropriate normative data set is its sample size. The sampling distribution of the variance in a small sample is positively skewed, which undermines the accuracy of estimation of standard scores (see the next section). Although there is a controversy in the field as to what constitutes a sufficient sample size to ensure precision of estimates of test psychometric properties (see Charter, 1999; Cicchetti, 1999), a sample size of 50 is typically viewed as adequate (Crawford & Howell, 1998a). A normative sample, by definition, implies that participants are not affected by illnesses that might lead to cognitive compromise. Assessment of health status in many normative studies is based on self-report, whereas few studies use medical evaluation or neuroimaging results to rule out neuropathology or other conditions. Stanczak et al. (2000) showed that self-report of a negative history of neuropathology and psychopathology is sufficient for the purposes of inclusion into a normative sample. In addition, more recently collected data should be preferred to older sets of data, given that the samples are comparable in other respects. A notable increase in normative expectations over time has been documented in the literature. Flynn (1984, 1998) showed an increase of about 0.3 IQ points per year, which necessitates periodic renorming of tests measuring intelligence. The reasons for this rise in IQ scores are unclear. One possibility is a greater exposure and availability of information over time, which results in an increase in the fund of knowledge possessed by an average individual. According to Kanaya et al. (2003), "the norms are most valid at the time the norms are released."
This phenomenon, termed the Flynn effect, has not been systematically examined in application to neuropsychological tests. However, evidence confirming this phenomenon is provided by Iverson et al. (1999), who showed a substantial increase in the number of TBI patients scoring within
the impaired range on the Controlled Oral Word Association Test, when their performance was scored in reference to the updated normative data compared to the original normative sample. To choose the best set of norms for comparison purposes, it is therefore essential to know the subject characteristics and test administration procedures for the normative sample. Subject characteristics are specific identifiers regarding the subjects under study, such as their age, education, IQ, and gender. Procedural variables represent details of test administration such as whether a 30-minute vs. a 1-hour delay was followed. In general, it is advised that selection of the normative data be based on careful review of the subject and procedural variables employed by the normative study since the population to which the reported findings apply may be either restricted or ambiguous. Important subject and procedural variables to consider can be found in Table 2.1. In neuropsychological practice, once the appropriate normative data set has been identified, each raw test score is compared to the distribution of performance scores on the same test obtained by a normative sample with similar demographic characteristics. With this comparison, one can determine whether the test score is below average, above
Table 2.1. Subject and Procedural Variables to Consider When Trying to Locate an Appropriate Normative Comparison Data Set

Subject Variables: Sample Composition Description
1. Age
2. Sample size
3. Education/IQ
4. Gender
5. Handedness (if appropriate)

Procedural Variables:
1. Method of administration and scoring of the test (e.g., if a memory test, it should be reported whether delayed recall and recognition conditions were administered)
2. Mean, standard deviation, range (base rate information is preferred as well)
3. Testing history (including order of testing)
average, or just average relative to the normative group data (cf. Anastasi, 1988). For instance, knowing that a medically healthy 76-year-old male obtained 18 out of 36 possible points on 3-minute delayed recall of the Rey-Osterrieth Complex Figure has little meaning by itself because the raw score conveys no information about the level of performance. We have no idea whether this is a good, bad, or average score. Even knowing that 50% of the figure was recalled has little meaning because there is no way to discern what percent recall would be expected. In this example, the subject (i.e., medically healthy 76-year-old male) and procedural (i.e., direct copy of drawing followed by 3-minute delayed recall without warning, with scoring following Taylor's method) variables are known and used to locate an appropriate normative sample. When the raw performance score is contrasted with the range of scores obtained by the normative sample, one can determine that a recall score of 18/36 is in the high average range (80th percentile; in reference to the norms reported by Boone et al., 1992). In other words, we now know the subject's relative standing compared to the normative group (namely, that performance is better than 80% of all normals who took the test). To more precisely judge the nature of performance on a test relative to the reference normative (e.g., standard) group, the raw score is converted to a standard score (typically a z or T score, which is expressed in terms of standard deviation units from the mean, see Chapter 3). Such conversion permits not only determination of the subject's relative standing compared with the normative group but also direct comparison of scores across different tests. The development of "standard" measurement scales is especially important to neuropsychologists since test scores collected while assessing the same functional domain are often expressed in different units of measurement.
For instance, when assessing motor functioning, the Grooved Pegboard Test score is based on the number of seconds to complete placing metal pegs in all the grooved slots on the pegboard, the PIN Test score is based on the number of holes punched, the Dynamometer score is expressed in kilograms,
and the Finger Tapping Test score is expressed in terms of the number of taps made in 10 seconds. The ability to convert each of these various scores to a standard score equivalent, regardless of the previously expressed units of measurement (seconds, number of holes punched, kilograms, etc.) allows determination of a subject's relative standing in one distribution and permits its comparison with relative standing in another. The underlying assumption when using z or T scores is that the distribution of scores obtained by the normative sample follows what is known as the "standard normal distribution," which approximates the bell-shaped normal curve (see Chapter 3). Therefore, there is a fixed relationship between the standardized test scores, z scores, and percentile ranks. Table 2.2 illustrates the interrelationship between z scores, percentile ranks, and corresponding WAIS-III IQ equivalents. A positive z score will translate to a percentile rank of 50 or greater (refer to left side of the percentile rank column) and to a WAIS-III IQ of 100 or greater (left side of the WAIS-III IQ column). A negative z score will translate to a percentile rank below 50 (use right side of column) and a WAIS-III IQ below 100 (right side of column). Consider the following example. You have just assessed Mr. Smith's right (dominant) hand performance on the Grooved Pegboard Test, and you note that it took him 68 seconds to complete. Mr. Smith is 35 years old and has finished 11 years of formal schooling. He has lived almost his entire life in a large western Canadian city and only recently moved to the city where you evaluated his performance. After surveying the available normative data for possible comparison purposes (see Chapter 23), you decide that use of Bornstein's (1985) normative data for the Grooved Pegboard performance would be optimal.
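Examining the normative table, you note that males in his age and education group performed the test with their dominant hand in 65.3 (8.5) seconds, that is, with a mean of 65.3 seconds and a standard deviation of 8.5 seconds. The distance of Mr. Smith's score from the mean, in standard deviation units, is

$$\frac{68 - 65.3}{8.5} \approx 0.32$$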
Considering that higher scores on this test reflect poorer performance (since it took
Table 2.2. Percentile Ranks and WAIS-III IQ Equivalent Scores for Corresponding z Scores

SD or        Percentile Rank     WAIS-III IQ Equiv.
z Score      +SD      -SD        +SD        -SD
2.17-3.00    99       1          ≥133       ≤67
1.96-2.16    98       2          130-132    68-70
1.82-1.95    97       3          127-129    71-72
1.70-1.81    96       4          126        73-74
1.60-1.69    95       5          124-125    75-76
1.52-1.59    94       6          123        77
1.44-1.51    93       7          122        78
1.38-1.43    92       8          121        79
1.32-1.37    91       9          120        80
1.26-1.31    90       10         119        81
1.21-1.25    89       11
1.16-1.20    88       12         118        82
1.11-1.15    87       13         117        83
1.06-1.10    86       14         116        84
1.02-1.05    85       15
0.98-1.01    84       16         115        85
0.94-0.97    83       17
0.90-0.93    82       18         114        86
0.86-0.89    81       19         113        87
0.83-0.85    80       20
0.79-0.82    79       21         112        88
0.76-0.78    78       22
0.73-0.75    77       23         111        89
0.70-0.72    76       24
0.66-0.69    75       25         110        90
0.63-0.65    74       26
0.60-0.62    73       27         109        91
0.57-0.59    72       28
0.54-0.56    71       29
0.51-0.53    70       30         108        92
0.49-0.50    69       31
0.46-0.48    68       32         107        93
0.43-0.45    67       33
0.40-0.42    66       34         106        94
0.38-0.39    65       35
0.35-0.37    64       36
0.32-0.34    63       37         105        95
0.30-0.31    62       38
0.27-0.29    61       39         104        96
0.25-0.26    60       40
0.22-0.24    59       41
0.19-0.21    58       42         103        97
0.17-0.18    57       43
0.14-0.16    56       44
0.12-0.13    55       45         102        98
0.09-0.11    54       46
0.07-0.08    53       47         101        99
0.04-0.06    52       48
0.02-0.03    51       49
0.00-0.01    50       50         100        100

SD, standard deviation.
longer to complete placing all those pegs in the slots), you know that Mr. Smith has performed below the mean for the group (z = -0.32). Obviously, this z score will be converted to something less than the 50th percentile. Indeed, when you locate a z score of -0.32 in Table 2.2, the corresponding value (using the right side of the percentile rank column) is the 37th percentile. Percentile ranks permit one to indicate whether the performance is very superior, superior, high average, average, low average, borderline, or impaired. By convention, neuropsychologists use the percentile cutoffs presented in Table 2.3 in describing performance levels. However, it should be noted that these cutoffs may vary depending on psychometric properties of a given test. Therefore, whenever possible, the clinician is advised to use the cutoffs for performance levels that are provided by test authors.
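The conversion worked through above for Mr. Smith can be sketched in a few lines of Python. This is an illustrative sketch rather than a procedure taken from any test manual: the function names are hypothetical, the percentile is computed from the standard normal curve instead of being read from Table 2.2, the T score uses the usual convention of a mean of 50 and SD of 10, and the performance labels follow the conventional cutoffs of Table 2.3. Like the table lookup, it assumes that normative scores are normally distributed.

```python
from statistics import NormalDist

def z_score(raw, mean, sd, higher_is_worse=False):
    """Distance from the normative mean in SD units.

    For timed tests, where a higher raw score means poorer performance,
    the sign is flipped so that a negative z always means below the mean.
    """
    z = (raw - mean) / sd
    return -z if higher_is_worse else z

def t_score(z):
    """T score under the usual convention (mean 50, SD 10)."""
    return 50 + 10 * z

def percentile_rank(z):
    """Percentile rank under the standard normal distribution."""
    return 100 * NormalDist().cdf(z)

# Lower bounds of the percentile bands listed in Table 2.3.
LEVELS = [
    (98, "Very superior"),
    (91, "Superior"),
    (75, "High average"),
    (25, "Average"),
    (9, "Low average"),
    (2, "Borderline"),
]

def performance_level(percentile):
    """Map a whole-number percentile to a descriptive label."""
    for lower_bound, label in LEVELS:
        if percentile >= lower_bound:
            return label
    return "Impaired"

# Mr. Smith: 68 seconds against a normative mean of 65.3 (SD 8.5).
z = round(z_score(68, 65.3, 8.5, higher_is_worse=True), 2)
pct = round(percentile_rank(z))
print(z, pct, performance_level(pct))  # -0.32 37 Average
```

Rounding z to two decimals before the lookup mirrors the way Table 2.2 is entered; with unrounded values a score near a band boundary can land one percentile away from the tabled figure.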
Performance across different tests, expressed in standard scores, is frequently compared in neuropsychology practice, to identify strengths and weaknesses across cognitive domains. This approach, traditionally viewed as the "pattern analysis," should be used with caution as the probability of obtaining abnormal scores in an intact individual increases
Table 2.3. Converting Percentiles to Performance Levels

Percentile    Level
≥98           Very superior
91-97         Superior
75-90         High average
25-74         Average
9-24          Low average
2-8           Borderline
<2            Impaired
with the number of tests administered. Christensen et al. (1999) and Schretlen et al. (2003) reported large intraindividual variability in neurologically intact elderly samples, with 27% of the latter sample producing discrepancy values exceeding 3 standard deviations, which might explain the increase in probability of abnormally large deviations from the baseline performance level for an individual patient.
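How quickly the chance of at least one abnormal score grows with battery length can be illustrated under a simplifying assumption. The sketch below treats test scores as statistically independent and uses an arbitrary 5% chance of an abnormal score on any single test; both values are assumptions made for illustration and are not taken from the studies cited. Scores within a real battery are correlated, so actual rates will differ from these figures.

```python
def prob_at_least_one_abnormal(n_tests, p_single=0.05):
    """Chance that an intact person produces one or more abnormal scores.

    Assumes independent tests, each with the same probability p_single
    of yielding an abnormal score (an illustrative simplification).
    """
    return 1.0 - (1.0 - p_single) ** n_tests

for n in (1, 5, 10, 20):
    print(n, round(prob_at_least_one_abnormal(n), 3))
# 1 0.05
# 5 0.226
# 10 0.401
# 20 0.642
```

Even under these idealized assumptions, a 20-test battery would be expected to yield at least one "abnormal" score in well over half of intact examinees, which is why isolated low scores call for cautious interpretation.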
ALTERNATIVE METHODS FOR INTERPRETATION OF CLINICAL DATA 1. Sliwinski et al. (1997, 2003) refer to the conventional approach, described above, as "comparative" and point out that correction for age has an undesirable consequence: the same proportion of the population falls below any impairment cutoff score for all ages. Using probable Alzheimer's disease (pAD) as an example, they argue that this approach takes into account only test performance of intact persons and the relationship between test performance and age, disregarding the strong associations between age and the prevalence of pAD and between age and the probability of having pAD. In contrast, the authors advocate norms that rely on statistical methods that model the probability of having pAD as a function of test performance and age. They propose a "diagnostic norms" approach, which takes into consideration the performance of the clinical group and prevalence rates (base rates) of the disease for the relevant setting/demographics of the patient in addition to the information used by the comparative norms. Diagnostic norms are derived using values of sensitivity and specificity for all possible cutoff scores (which can be displayed on the receiver operating characteristic [ROC] plot). Using this information, the probability of impairment for a certain test score can be computed for different prevalence rates. Since an increase in the prevalence of the disease is taken into consideration
in the diagnostic norms approach, the relative probability of impairment for a given test score increases as age increases, whereas in the comparative norms approach it decreases with increasing age. An important contribution of base rate information to the accuracy of differential diagnosis and detection of malingering in test performance is further underscored by Elwood (1993), Gouvier (1999, 2001), Gouvier et al. (1998, 2002), King et al. (1998), Labarge et al. (2003), Rosenfeld et al. (2000), Slick et al. (2001), and Woods et al. (2003). 2. Another limitation of the commonly used age-corrected norms is that data are grouped into age intervals. No change in test performance is assumed within the interval, with an abrupt shift in scores between intervals. For example, in a normative data set partitioned by age decade, performance of 61- and 69-year-old individuals would be referenced to the same normative expectations, whereas 69- and 70-year-olds would be referenced to different normative expectations. As an individual passes from one age band to another, a notable shift in the interpretations of the same test scores occurs. To remediate this problem, several regression-based techniques have been developed. Gorsuch (1983) developed a continuous norming method for generating continuously adjusted age norms, using an analytic smoothing procedure. Calculation of such continuous norms is based on a few simple equations. Russell used this technique in renorming the Wechsler Memory Scale and in developing norms for the Halstead-Russell Neuropsychological Evaluation System (HRNES and HRNES-R) (Russell, 1987, 1988; Russell and Starkey, 1993, 2001). Zachary and Gorsuch (1985) and Taylor (1998b) applied the continuous norming method to calculate WAIS-R indices. A different regression-based norming system was used by Heaton et al. (1991, 2004) in their Comprehensive Norms for
an Expanded Halstead-Reitan Neuropsychological Battery and its revised edition. Crawford and Howell (1998b) advocate use of the regression approach over conventional normative data with a single predictor (e.g., age) as well as with multiple predictors and describe two methods for determining differences between the test score predicted by a regression equation and an individual's obtained score. A large discrepancy between predicted and obtained scores suggests a cognitive deficit. They also address application of the regression method to the assessment of change over time, where a regression equation can be generated to predict level of performance at retest based on the individual's score at initial testing. The authors point out that such an approach simultaneously factors in the effects of practice and regression to the mean. Conversely, Fastenau (1998) and Fastenau and Adams (1996) criticize regression-based norms, specifically in reference to Heaton et al.'s (1991) comprehensive norms. They question the validity of the regression-based norms when sample sizes in each demographic cell are low, especially when scores are not normally distributed, variances are not uniform across the range of each demographic variable, and demographic relationships are not linear across the sample used to derive the norms. 3. Cutoff scores have long been used in testing practice. In neuropsychology, this method has been used most widely in defining cutoffs for impairment in the context of the Halstead-Reitan Battery. Cutoffs for classification into normal and clinical groups, as well as for differentiation between normal and impaired functional range, have been used in a number of tests and interpretive approaches. Some tests use multiple cutoffs to further refine the differentiation between performance levels (e.g., normal, mildly impaired, moderately impaired, severely impaired).
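The logic of a single cutoff, and its dependence on base rates noted in the discussion of diagnostic norms under item 1, can be made concrete with a small Python sketch. Everything in it is hypothetical: the scores are invented, the cutoff of 40 is arbitrary, and the function names are introduced only for illustration. It assumes that scores below the cutoff are classified as impaired, and it applies Bayes' rule to show how the probability of impairment attached to a below-cutoff score changes with the prevalence of the condition.

```python
def sensitivity_specificity(clinical, normal, cutoff):
    """Classify scores below the cutoff as impaired.

    Sensitivity: proportion of the clinical group falling below the cutoff.
    Specificity: proportion of the normal group at or above the cutoff.
    """
    true_pos = sum(score < cutoff for score in clinical)
    true_neg = sum(score >= cutoff for score in normal)
    return true_pos / len(clinical), true_neg / len(normal)

def positive_predictive_value(sens, spec, base_rate):
    """Probability of the condition given a below-cutoff score (Bayes' rule)."""
    true_pos = sens * base_rate
    false_pos = (1.0 - spec) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)

# Invented scores for two groups of ten examinees each.
clinical_scores = [28, 31, 33, 35, 36, 38, 40, 41, 44, 47]
normal_scores = [36, 39, 41, 43, 45, 46, 48, 50, 52, 55]

sens, spec = sensitivity_specificity(clinical_scores, normal_scores, cutoff=40)
print(sens, spec)  # 0.6 0.8

for base_rate in (0.10, 0.50):
    ppv = positive_predictive_value(sens, spec, base_rate)
    print(base_rate, round(ppv, 2))
# 0.1 0.25
# 0.5 0.75
```

With identical sensitivity and specificity, the same below-cutoff score carries a 25% probability of impairment when one examinee in ten has the condition and a 75% probability when half do, which is the reason a cutoff cannot be interpreted apart from the setting in which it is applied.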
Ivnik et al. (2000) demonstrated the usefulness of cutoff scores in interpreting data from the Mayo Cognitive Factor Scales. The utility of cutoff scores was further emphasized by Soukup et al. (1998), who recommended reporting cutoff scores that represent borderline (15th percentile) and defective (<5th percentile) performances in addition to the means and SDs, to offset problems associated with the skew in test scores. Recent literature suggests growing interest among neuropsychologists in examining the outcomes of neuropsychological services using alternative methods to traditional hypothesis testing, such as odds ratios and relative risk analysis, as well as measures of diagnostic usefulness such as predictive values and likelihood ratios (see Chapter 3), which are based on a diagnostic dichotomy that places the performance of an individual patient above or below the cutoff scores (Bieliauskas et al., 1997; Cerhan et al., 2002; Cicerone & Azulay, 2002; Haddock et al., 1998; Ivnik et al., 2000, 2001). Several studies address the utility of cutoffs in determining criteria for abnormality in test performance across multiple measures (Cicerone & Azulay, 2002; Drebing et al., 1994; Ingraham & Aiken, 1996). For example, Drebing et al. (1994) compared classification accuracy of the following criteria: one, two, or three scores in a battery falling below 1, 1.5, and 2 SDs. In the context of signal detection theory, selection of cutoff scores is based on the tradeoff between sensitivity and specificity (more precisely, between true-positive and false-positive rates of classification), as reflected in the ROC plot. This method has been used for determining the optimal cutoff scores for diagnostic classification (i.e., contrasting normal vs. clinical range) in neuropsychology. Typically, the score with the highest combination of sensitivity and specificity values is selected as a cutoff. A more elaborate method of determining
a cutoff involves computation of the area under the ROC curve, which represents the most useful index of diagnostic accuracy (Swets, 1996). The score associated with the largest area under the curve is the most sensitive cutting score. In spite of its popularity, the use of cutoff scores has been criticized for imposing an artificial dichotomy on a continuous distribution and for subjective judgment involved in the selection of cutoff points (Dwyer, 1996). It does not allow consideration of a spectrum of related disorders among diagnostic possibilities. According to a number of studies, use of a single cutoff for a specific test, without considering other clinical information and demographic factors, results in a large number of false-positive misclassifications. A similar effect on classification accuracy is rendered by the failure to account for the base rates (see Chapter 3) of the criterion condition in the normative sample. In addition, an ROC curve yields reliable cutoff scores for diagnostic classifications only when an external criterion measure provides a reliable basis for diagnosis in the clinical group. Conditions resulting in subtle cognitive dysfunction frequently do not have a reliable external diagnostic criterion, which undermines the accuracy of classification. Some investigators suggest that many of the current cutoffs are too conservative (Fromm-Auch & Yeudall, 1983), thereby generating too many false negatives. However, their work has been done primarily with highly educated, high-IQ samples; and of course, cutoffs based on average performers would generate a high false-negative rate. Unfortunately, a large number of studies document unacceptably high false-positive misclassification rates, placing normal subjects into impaired ranges across different tests which are interpreted using a cutoff criterion (see chapters on Halstead-Reitan Battery tests in this book). 4. The authors of the Mayo Older American Normative Studies (MOANS) used the
overlapping interval strategies described by Pauker (1988) to maximize the sample size of the normative distribution at each midpoint age interval (Ivnik et al., 1996; Smith & Ivnik, 2003). Ten-year age bands were staggered at successive 3-year midpoint intervals. For example, all participants between 62 and 71 years of age contributed to the 67-year midpoint interval, whereas part of this sample also contributed to the 65-74 year age band with the midpoint of 70 years and to the 68-77 year age band with the midpoint of 73 years. The data were co-normed on the same normative cohort for a large battery of tests. The raw score distribution for each test at each midpoint age was normalized by assigning standard scores with a mean of 10 and SD of 3, based on actual percentile ranks. Formulas based on linear regressions were generated for each test in the battery to be applied to the normalized, age-corrected MOANS Scaled Scores to adjust for education. Duff et al. (2003) used the overlapping midpoint age interval technique with 5-year midpoint intervals to report normative data for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998). Age-corrected scaled scores were further converted into education-corrected scaled scores using the same method across four education levels. In spite of the complexity of the procedures used in these studies for derivation of the normative data, these techniques hold great promise as they allow maximization of the sample size for each age interval, "smoothing" of the transition in normative expectations as the patient passes from one age group to another, and direct comparisons between various tests as they are co-normed on the same sample. 5. Crawford and colleagues proposed a single-case approach, where an individual is treated as a sample of n = 1. Crawford and Howell (1998a) described a modified t-test method for interpreting
an individual's test score when the normative sample is small (n < 50). According to this method, the t score represents an estimate of the rarity or abnormality of the individual's test score. The authors argue that the t-test procedure is more appropriate than the standard z scores for any comparison of an individual against a normative sample because the norms are derived from samples rather than from populations. Therefore, normative data are treated by the authors as sample statistics, rather than population parameters. Whereas with large or modest sample sizes, this distinction has minimal effect on sampling distribution and the difference between the values of t and z is trivial, in small samples t represents a more accurate estimate of the rarity of an observation as the sampling distribution of the variance in a small sample is positively skewed (leading to underestimation of variance) and, therefore, z scores are likely to be overestimated. This approach was expanded by the authors to provide point estimates of the abnormality of differences between scores achieved by an individual on two tests (Crawford et al., 1998a) or between a patient's mean score on several tests and the score on a specific test contributing to the mean (Crawford & Garthwaite, 2002). Introduction of an intraindividual measure of association (IIMA) as an estimate of abnormality (Crawford et al., 2003) represents a further advancement in the single-case approach. The IIMA is expressed as a parametric or nonparametric correlation coefficient or the slope of a regression line. The estimate of abnormality of an individual score is based on comparison of the magnitude of the index of association for a specific patient with corresponding statistics for normative or control samples. The authors underscore that only summary statistics (SDs or measures of association between tests) are needed to derive estimates of abnormality and
respective confidence limits using the single-case approach.
6. As discussed earlier in this chapter, both fixed and flexible battery approaches have their merits and shortcomings. One of the weaknesses of the flexible battery approach is an increase in the number of scores falling below the expected range as the number of tests included in the battery increases. The fixed battery approach allows one to minimize the impact of this problem by introducing global indices, such as the Halstead-Reitan Average Impairment Rating (Russell et al., 1970), Halstead Impairment Index (Reitan & Wolfson, 1985), Global Neuropsychological Deficit Scale (Reitan & Wolfson, 1988), as well as summary indices derived through factor-analytic or correlational studies. Rohling's Interpretive Method (RIM) (Miller & Rohling, 2001; Rohling et al., 2003a,b) is designed to generate global indices (summary statistics) in the context of the flexible battery approach. This method produces summary results analogous to those generated in fixed battery approaches, yet it offers the advantage of allowing the clinician to choose varied test batteries depending on the needs of a particular case. The authors describe 24 steps outlining the sequence of the procedures underlying RIM, including steps commonly used in neuropsychology practice (e.g., test administration and conversion of test scores to a common metric), as well as additional statistical procedures leading to generation and graphical representation of 14 summary statistics. Additional steps used to integrate and interpret the obtained statistics are based on psychometric procedures yet require the subjective judgment of the clinician. The results can be viewed at the global level, the domain level, or the test measure level. The authors point out that clinical judgment involved in decision making in the context of RIM is facilitated by statistical techniques (e.g., those assessing
impact of heterogeneity, sample size, or unequal effect sizes). They view the RIM as a statistical interpretive model and point to several weaknesses of this method (e.g., different tests are normed on different populations). Nevertheless, they assert that the method is "a solid step toward increasing the statistical and methodological strength of interpretive arguments" (Miller & Rohling, 2001, p. 168).
7. Another approach which allows one to tailor the psychometric properties of measures to specific measurement goals is represented by the item response theory (IRT). It challenges measurement principles underlying classical test theory, which is most commonly used in neuropsychology. In classical test theory, interpretation of test performance is based on a norm-referenced standard, where the raw score obtained on a test is linearly transformed into a standard score to determine a relative position of that score in comparison to the normative distribution. Embretson (1996) pointed out an objection to the norm-referenced approach in that scores have no meaning for what the person actually can do. In IRT, the meaning of a score is referenced to item difficulty. A relationship between ability level, which is reflected in a test score, and probability of a correct response to a given item is summarized in the item characteristic curve (ICC). The ICC is determined by item difficulty and discrimination. Item difficulty is the ability level at which the probability of failing that item is equal to the probability of passing. (If the individual's ability level exceeds the item's difficulty level, the individual would be likely to pass that item.) Item discrimination is manifested in the slope of the ICC at the level of item difficulty and reflects the precision of measurement. If the items are structured by content in order of increasing complexity, passing a certain item by an individual can be interpreted from the perspective of the
BACKGROUND
content of knowledge available to that individual. This property of the IRT overcomes the above-mentioned weakness of the norm-referenced approach by relating a test score to what the person can do. As IRT represents model-based measurement, two or more scales measuring different abilities can be psychometrically matched to assure equivalent levels of reliability throughout the ability continuum. This is achieved by selecting items which fit a specified model for scale construction. Another important distinction between the two theories is related to the values of the standard error of measurement (SEM). According to classical test theory, the SEM is constant across scores in a particular population but differs between populations. Therefore, the SEM value reported in the manual for a particular test is applied across the entire range of test scores. However, in the context of a test battery, different tests are associated with different SEMs as estimates of reliability and variability (which underlie calculations of the SEM) differ across tests/populations. Since interpretations of test performance profiles are based on an assumption of equal SEMs across different tests, inequality of SEMs undermines the accuracy of interpretation. In the context of the IRT, the SEM increases with a departure from the mean of the score distribution but generalizes across populations. This increase in the SEM is a function of the distribution of item difficulty as the number of items that appropriately measure a given ability level decreases toward the extremes of the score distribution. Consequently, confidence intervals increase and precision of measurement decreases toward extremes of distributions, whereas the pattern of these changes is invariant across populations. In application to neuropsychological test performance, the magnitude of the SEM can be represented as a composite value, which is
the mean of individual SEM values across ability levels. The IRT methods are applicable at the scale, test, or item level. Some of the IRT models are applied by neuropsychology researchers and clinicians, such as the Rasch model, which is used in the analysis of dichotomous items, rating scales, partial credit scoring, and multidimensional models (see Embretson, 1996, for review). Mungas et al. (2003) used IRT methods to derive psychometrically matched measures of global cognition, memory, and executive function from commonly used neuropsychological tests based on a sample of 400 elderly participants, who ranged in cognitive functioning from normal to demented. The authors demonstrated utility of this method for "creating psychometrically matched measures of clinically relevant domains for use across a broad range of ability" (p. 386).
The majority of studies reviewed in this book report common descriptive statistics, such as means, SDs, or percentiles. Cutoff scores or other criteria for abnormality were reported in a few studies. We used the regression approach to derive predicted scores for selected tests based on meta-analyses of available data sets (see Chapter 3). Although the emphasis in our literature reviews is placed on conventional descriptive statistics, we urge clinicians and researchers to consider the merits of more advanced techniques. Despite the psychometric sophistication of many of the above techniques, their use in clinical practice can be facilitated by computer software available from their authors.
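The modified t-test of Crawford and Howell (1998a), described under the single-case approach above, can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' software: the function names and the example values (a normative sample of 10 with a mean of 50 and SD of 10, and a patient score of 30) are hypothetical, and the statistic is computed in its commonly published form, t = (x - M) / (s * sqrt((n + 1) / n)) with n - 1 degrees of freedom.

```python
import math


def z_score(x, mean, sd):
    """Conventional z score: treats the normative mean and SD as population values."""
    return (x - mean) / sd


def crawford_howell_t(x, mean, sd, n):
    """Modified t statistic treating the norms as sample statistics.

    Returns the t value and its degrees of freedom (n - 1).
    """
    if n < 2:
        raise ValueError("normative sample must contain at least 2 cases")
    t = (x - mean) / (sd * math.sqrt((n + 1) / n))
    return t, n - 1


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """One-tailed probability P(T >= |t|), integrating the density with Simpson's rule.

    `steps` must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 0.5
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Hypothetical example values, chosen only for illustration.
patient_score, norm_mean, norm_sd, norm_n = 30, 50, 10, 10

z = z_score(patient_score, norm_mean, norm_sd)
t, df = crawford_howell_t(patient_score, norm_mean, norm_sd, norm_n)
p = one_tailed_p(t, df)

print(f"z = {z:.3f}")
print(f"t = {t:.3f} on {df} df, one-tailed p = {p:.3f}")
```

With a normative sample this small, the t value is smaller in magnitude than the corresponding z score, so the patient's score is judged less extreme than the z score alone would suggest. As n grows, the correction term sqrt((n + 1) / n) approaches 1 and the two statistics converge.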
FACTORS INFLUENCING PERFORMANCE ON NEUROPSYCHOLOGICAL TESTS
When interpreting performance on neuropsychological tests, sole reliance on the actuarial data might lead to misinterpretation of the patient's level of functioning and to a
distorted profile of strengths and weaknesses. Several factors influencing test performance need to be considered beyond applying statistical methods to data interpretation. Some are discussed below.
Effort and Motivation
A unique, rapidly evolving role for neuropsychological assessment is the identification and documentation of noncredible test performance in the context of malingering, somatoform disorder, or other conditions in which the patient is motivated, either consciously or nonconsciously, to present him- or herself as more cognitively impaired than is actually the case. If we cannot verify that patients are performing with adequate effort, our evaluations, no matter how lengthy or elegantly written, are worthless. Clinical neuropsychology has the objective methods to determine whether cognitive complaints secondary to claimed brain damage and/or psychiatric dysfunction are accurate or fabrications. Numerous well-validated cognitive effort tests are now available, as are indices derived from standard neuropsychological tests found to be highly sensitive to feigned performance, which possess sensitivity and specificity values in most cases superior to those of standard cognitive tests used to discriminate clinical disorders from normal function. It is beyond the scope and mission of this book to include these tests and indices, but we will stress that, as with measurement of traditional cognitive domains, assessment of effort requires administration of more than one test or index, preferably interspersed throughout the test battery, tapping the veracity of performance on differing types of cognitive skills, such as memory, motor function, processing speed, math, problem-solving/abstraction, etc. While most neuropsychologists realize that effort must be assessed in forensic settings, it has become apparent that effort should be regularly assessed in clinical evaluations; we have found that when effort measures are routinely administered, evidence of poor effort can emerge in unexpected situations (e.g., an "Alzheimer's patient" in a clinical drug trial who was later determined to have a factitious disorder rather than dementia).
Assessment of effort performs the important gatekeeper function of ensuring that patients receive compensation only for actual disabilities and that neuropsychological assessments are not misused by individuals perpetrating fraud. In the case of somatoform/conversion disorder patients, identification of noncredible cognitive complaints can steer treatment strategies away from reinforcing the medical patient role to addressing the underlying issues/concerns driving the symptom creation.
Issues in Cross-Cultural and Multicultural Neuropsychological Assessment
Issues of ethnic diversity pose difficult and serious challenges for the field of neuropsychology, particularly as clinicians are more frequently being requested to assess functioning in patients from varied ethnic, cultural, socioeconomic, and linguistic backgrounds (see Ferraro et al., 2002; Fletcher-Janzen et al., 2000; Nell, 2000; for review). Practitioners are faced with a number of problems when conducting such assessments. One problem may be finding standardized neuropsychological instruments in languages other than English. Alternatively, if the patient speaks English, the problem may be finding ethnicity-specific normative data that take into account issues such as culture and bilingualism. Some of the critical issues in using these approaches in cross-cultural and multicultural neuropsychological assessment are briefly discussed below. Due to the lack of availability of neuropsychological instruments in other languages, clinicians often have to translate tests or find translated versions of standardized tests in the literature. However, Puente and Ardila (2000) describe a number of methodological problems with such an approach. These authors point out that "translation and adaptation require much time and expertise." For example, translation of neuropsychological tests from English to Spanish is quite complex given that there are a number of Hispanic subgroups that use varied idioms and expressions of the language. The issue of test translation is just as
difficult in other ethnic groups, such as Asians (Wong, 2000) and Middle Easterners (Escandell, 2002), where there is large diversity in the languages or dialects and cultures. Another methodological problem in adapting English-standardized tests into other languages involves the validity of such measures when used with other cultural groups (Ponton & Ardila, 1999; Puente & Ardila, 2000; Nell, 2000). Simple language translations may fail to take into account the impact of familiarity and relevance of the test items in different ethnic groups (Escandell, 2002; Puente & Ardila, 2000; Wong, 2000), possibly compromising the validity of the results. Similarly, once a neuropsychological test has been translated into another language, it may no longer measure the same cognitive functions it was once thought to measure in its standard form. For example, Escandell (2002) points out that according to a study by Loewenstein et al. (1995), a pattern of correlations for a measure of daily functioning, the Direct Assessment of Functional Status (DAFS) test, was different when it was translated into Spanish relative to its original English version. Similarly, Puente and Ardila suggest that tests such as Digit Span in the WAIS or WISC may require other cognitive processes when administered in Spanish than in English since naming the digits requires a different number of syllables. While developing cross-cultural tests can be challenging, when specific procedures and guidelines are used, adequate outcomes can be achieved. Along with other approaches, Puente and Ardila (2000) recommend using Brislin's (1983) three-step procedure when translating tests, particularly for the Hispanic population. These steps include the initial translation, back translation, and resolving differences between the original version and the resulting translated version (for a more detailed discussion of this approach, see Puente & Ardila, 2000).
Additionally, translation and test development for specific ethnic groups require a thorough understanding of the group's culture and a familiarity with the language. Another issue is that normative data need to take into account various cultural factors in addition to the usual demographic factors.
Even when standardized tests can be administered in English, we lack understanding of the effects of biculturalism and bilingualism on test performance in various ethnic groups (Ardila et al., 2002). Manly and colleagues found that in a sample of African Americans the level of acculturation and the use of "black English" accounted for significant proportions of variability in various neuropsychological test scores, particularly those that relied on verbal skills (Manly et al., 1998), and that reading level, not educational attainment, attenuated the difference in most neuropsychological test scores of older African Americans and Caucasians (Manly et al., 2002). Several large-scale normative projects for African-American individuals have been conducted in response to the pressing need to generate normative data specifically for this ethnic group: the Consortium to Establish a Registry for Alzheimer's Disease (Fillenbaum et al., 1997; Unverzagt et al., 1996; Welsh et al., 1995), the Washington Heights-Inwood Columbia Aging Project (Manly et al., 1998), and the San Diego African American Norms Project (Diehr et al., 1998; Gladsjo et al., 1999). In their most recently updated normative manual, Heaton et al. (2004) present T-score conversions separately for groups of African Americans and Caucasians adjusted for age and education level. These are among the most comprehensive norms presented for African Americans to date. For Hispanics, Gasquoine (2001) suggests that bilingualism and acculturation factors should be used to determine whether English- or Spanish-language tests are to be administered. Harris et al. (1995) examined the cognitive performance of three groups: those proficient in both English and Spanish, those more proficient in Spanish, and non-Hispanic Caucasian monolingual English-speaking adults.
They found a small difference between Spanish-dominant and monolingual English speakers on a nonverbal reasoning task but noted that this difference was probably not clinically significant. However, they did find that degree of bilingualism represented a significant variable in the learning and retention of verbal information on a commonly
administered word-list learning test (i.e., California Verbal Learning Test). Use of verbal learning and memory strategies, such as semantic clustering, was found to be related to level of language proficiency. Culture-specific and cultural/language adaptations of specific neuropsychological tests have been developed and, when available, have been discussed in the individual chapters of this book. Additionally, Artiola i Fortuny et al. (1999) developed the Batería Neuropsicológica en Español, a standardized and validated battery of neuropsychological tests culturally adapted for Spanish-speaking individuals. Normative data based on 390 participants, which were collected in Spain, Mexico, and the United States, are stratified by geographical area × age × education. The battery includes adaptations of commonly used neuropsychological tests, with a visual memory task, 16-word list-learning task, story learning task, letter fluency (P, M, R), attentional tasks (e.g., digit repetition), Stroop test, and Wisconsin Card Sorting Test. Ostrosky-Solis et al. (1997) developed the NEUROPSI, a short neuropsychological test battery for Spanish-speaking adults. The battery includes tests of orientation (time, place, person), attention (e.g., digits backwards), verbal and nonverbal recall and memory, language (e.g., naming, repetition, comprehension, verbal fluency), concept formation (e.g., similarities), and motor skills (e.g., alternating movements). Expanded normative data, stratified by four age groups and three education levels, for this battery are presented in Ostrosky-Solis et al. (1999). It is clear that culturally sensitive tests that are valid in use with individuals from various cultural and linguistic backgrounds need to continue to be developed. Additionally, normative data for existing standardized tests need to take factors of ethnicity, acculturation, and bilingualism into account if valid interpretations are to be made for different ethnic groups.
It is also clear that adequate training in cross-cultural and multicultural neuropsychological issues is needed. Fastenau et al. (2002) propose one such model of multicultural training that may be used at the
graduate, internship, and postdoctoral training levels.
FINAL CAVEATS
A major focus of this book is to assist the clinician to identify and select the most demographically appropriate normative data for specific patients, but the issue of demographic adjustment is not as simple as it might appear and some caveats need to be raised. While neuropsychologists have come to appreciate the impact of demographic variables on test performance, there has been little or no dialog as to how best to use and communicate this information for the best interest of specific patients. By way of example, let us examine the case of a 58-year-old male physician who obtains a score of 80 seconds on Trails B. This score would be in the average range for his age (43rd percentile for men aged 40-59 with 12 or more years of education; Bornstein, 1985), but to report this performance as average would overlook the probable very real decline in function for this individual with 20 years of education. To illustrate the decline, the clinician would need to use norms which adjust for the high level of education, with the result that the Trails B score drops to the borderline impaired range (4th percentile for men aged 45-60 with more than college education; Selnes et al., 1991). However, a treating professional who receives the neuropsychological report and sees a description of "borderline impaired" performance might be tempted to limit the patient's control over his finances, his access to driving, etc., an unwarranted intervention that would do the patient a grave disservice, given that he is in fact average relative to the "typical" 58-year-old. Conversely, to omit the demographic correction and report the patient's performance as average would be of concern if a decision needed to be made regarding whether the patient was capable of returning to work as a surgeon; if the patient is borderline impaired relative to other 58-year-olds with 20 years of education, continued surgical practice might in fact place future patients in danger.
Admittedly, this is a simplistic scenario because decisions of this magnitude should be based on multiple data points. However, the example does illustrate a clinical problem which, while raised by Axelrod and Goldman in 1996, has apparently received no further attention or discussion by the field of clinical neuropsychology, and as a result, no consensus has emerged as to how to address the dilemma. One partial remedy is that all neuropsychological reports should indicate for a given score which demographic corrections were applied, as in the following example:
Trails B performance was borderline impaired (4th percentile for age and education; Selnes et al., 1991).
It might also be preferable at times to compare patient test scores against both the general population and demographic peers:
Dr. X scored within the average range on Trails B, when compared to the general population of 58-year-olds (43rd percentile for age; Bornstein, 1985), but within the borderline impaired range as compared to individuals of the same educational level (4th percentile for age and education; Selnes et al., 1991).
Assuming that other executive skills showed a similar pattern of performance, the report summary might conclude: While executive/problem-solving skills were intact for the patient's age, they were significantly lowered (i.e., borderline impaired) as compared to individuals of the same educational level.
The consumer of the report would then be able to use the appropriate interpretation when making decisions regarding patient management. Specifically, if the issue is financial decision making, in which the general population of 58-year-olds can successfully engage, then the health-care professional would probably be correct in allowing the patient to make financial decisions. However, if the issue is return to work as a surgeon, which requires a finely developed skill executed only by a highly educated subset of 58-year-olds, then the treating professional would be wise to be concerned and should take steps to ensure
that, at the least, the patient's professional activities are closely monitored. A second issue in the clinical application of demographic corrections has to do with whether the relationship between demographic variables and test scores is unidirectional or bidirectional. In the case of age and gender, both can impact test scores and, by extrapolation, cognitive function, but test scores/cognitive function do not affect age and gender; thus, these variables have a straightforward unidirectional relationship with cognitive scores, and it is appropriate to correct test scores for these factors when they account for test score variance. However, education leads to a more complicated situation. Educational level can impact test scores and, by inference, cognitive function, but cognitive function can also impact educational level; i.e., individuals of low cognitive ability do not typically complete much schooling. There are two groups of individuals with low education: those who did not have the opportunity to complete much schooling but who would have benefited from it and those who discontinued schooling because they could not comprehend and learn the material and would not have profited from additional education. It is appropriate to correct for educational level in the former group but not the latter group. This issue has very important real-world ramifications in that individuals of mentally retarded intelligence and low education might be "corrected" out of their lowered cognitive scores when lowered educational level is factored into test interpretation, thereby potentially losing benefits to which they are entitled (Regional Center resources, disability compensation) or, even more seriously, losing life if the restrictions against capital punishment for mentally retarded individuals no longer apply to them.
On the other hand, to fail to correct for education in individuals with normal premorbid cognition who did not have the opportunity to complete at least high school would portray them as more cognitively impaired than is accurate, and as a result, they might lose privileges to which they are entitled and for which they are competent (e.g., managing financial affairs, driving, living independently). As a field, it will be critical for clinical neuropsychology to develop algorithms, perhaps based on quantification of school records and psychosocial data, which allow for determination as to whether a patient's low educational level is primarily due to low cognitive capability vs. environmental factors, with these data then used to determine whether education corrections to test data should be applied. Corrections for ethnicity may be similarly distorting. Ethnic differences in test performance appear to be primarily driven by cultural influences, such as level of acculturation to the predominant Anglo-American culture, English fluency, quality of education, and socioeconomic factors that affect cognitive development, and not race or ethnicity per se. Unfortunately, our lack of understanding of such factors combined with the focus on ethnic/racial differences in test performance has often led to faulty assertions regarding the cognitive capabilities of some ethnic minority groups. Ultimately (and hopefully in future editions of this book), adjustments to test scores will be made for factors such as acculturation, not ethnicity.
Data Inclusion in Neuropsychological Reports
The writing of psychological reports, including neuropsychological reports, has gradually evolved to include more actual scores and data. Those of us who attended graduate school in the 1970s and 1980s were frequently instructed to summarize psychological results in a narrative format and specifically admonished not to include actual test scores because these would be misused/misinterpreted by the reader and patients would be harmed. However, one suspects that a secondary motive of at least some report writers for not including scores was that many interpretations were not in fact empirically based; if no raw scores were reported, the interpretations could not be questioned. More recent pressures for research-validated treatment approaches and accountability have led to greater rigor within the subspecialty of psychological testing.
Practitioners who are confident in their neuropsychological skills and knowledge are not reluctant to provide raw test data and to reveal how interpretations were derived. Matarazzo (1995) asserts that, in fact, the inclusion of test scores serves to minimize and clarify any interpretation biases or idiosyncrasies on the part of the writer. The inclusion of raw test scores in reports is also critical for comparing the results of initial and subsequent neuropsychological evaluations. Further, access to some services (e.g., Regional Center resources for individuals with extremely low intelligence) and some criminal sentencing decisions (e.g., ineligibility for the death penalty in mentally retarded individuals) require the reporting of actual scores. Pieniadz and Kelland (2001), summarizing the results of a survey completed in the 1990s by 81 directors of neuropsychology training programs, indicated that only 35% of respondents routinely appended test scores to reports. The reasons most commonly given for including numerical data were "thoroughness" (100%) and "facilitation of comparison" of test records (96%). Of those who did not append actual test results, 80% indicated that their decision was based on a desire to avoid misinterpretation by unqualified persons. In contrast, Donders (2001), summarizing the results of a survey completed by 414 U.S. members of American Psychological Association Division 40, revealed that 88% of respondents included numeric data in their reports, although raw scores were provided less frequently.
It is of interest that although the most commonly stated reason for omitting actual test scores from reports is protection of the patient from misinterpretation of results by nonpsychologists, no empirical data have emerged in the decades of psychological testing showing any harmful effects caused by inclusion of scores in test reports. In fact, Freides (1995) and Matarazzo (1995) assert that there is a greater potential for harm from interpretations of scores without the scores than from scores without interpretations. The interested reader is directed to more thorough discussions of this topic by Freides (1993, 1995), Pieniadz and Kelland (2001), Donders (2001), Matarazzo (1995), and Naugle and McSweeny (1995, 1996). We recommend that neuropsychological reports contain (1) all raw scores and percentiles, standard scores, and/or T scores; (2) the normative studies used to derive percentiles if other than the published test manual; and (3) which demographic factors were adjusted for in each test. Reporting of raw scores is important for various reasons. For example, a superior normative data set may emerge after an initial neuropsychological assessment; on retesting, the examiner might want to score both sets of test scores according to the more recent norms, which would not be possible if the initial raw scores were not available. Further, inclusion of the raw scores enables the reader to check that the scores were in fact converted and interpreted properly.
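The three recommendations above can be made concrete with a small record structure. The following Python sketch is only one hypothetical way to organize such information: the class and field names (`NormReference`, `ScoreEntry`, `describe`) are assumptions introduced for illustration, and the values are taken from the Trails B example discussed earlier in this chapter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NormReference:
    """One normative comparison for a score (hypothetical structure)."""
    source: str              # normative study used to derive the percentile
    adjusted_for: List[str]  # demographic factors the norms adjust for
    percentile: float        # percentile obtained from those norms


@dataclass
class ScoreEntry:
    """A raw score kept together with every normative comparison made for it."""
    test: str
    raw_score: float
    unit: str
    norms: List[NormReference] = field(default_factory=list)

    def describe(self) -> str:
        """Render the raw score, each percentile, its adjustments, and its source."""
        parts = [f"{self.test}: raw score {self.raw_score:g} {self.unit}"]
        for ref in self.norms:
            factors = " and ".join(ref.adjusted_for)
            parts.append(f"percentile {ref.percentile:g} for {factors} ({ref.source})")
        return "; ".join(parts)


# Values from the Trails B example in the text.
trails_b = ScoreEntry(
    test="Trails B",
    raw_score=80,
    unit="seconds",
    norms=[
        NormReference("Bornstein, 1985", ["age"], 43),
        NormReference("Selnes et al., 1991", ["age", "education"], 4),
    ],
)

print(trails_b.describe())
```

Keeping the raw score alongside each normative comparison means that the same entry can be rescored later if a superior normative data set emerges, and the reader can see at a glance which demographic corrections underlie each percentile.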
3 Statistical and Psychometric Issues
The administration, scoring, and interpretation of neuropsychological tests are major sources of information used in the clinical practice of a neuropsychologist to make decisions about patients' cognitive status, diagnosis, prognosis, and treatment. However, accurate decisions based on test results cannot be made without a clear understanding of the issues related to the measurement of psychological phenomena and the statistical properties of the tests. This chapter reviews basic statistical concepts of importance to neuropsychologists. No attempt was made to provide a comprehensive review of statistics. The goal of this chapter is to help a novice understand and interpret psychometric data.
MEASUREMENT AND INTERPRETATION OF NUMERICAL VALUES
Measuring abilities and traits is an inherent part of clinical work. It facilitates decision making by relating the performance of a given individual to an appropriate reference group or by uncovering a change in an individual's behavior over time. The nature of decision making is specific to a given situation and includes a wide range of decisions, such as identifying cognitive strengths and weaknesses, choosing an appropriate course of cognitive
rehabilitation, evaluating a patient's ability to function independently in an everyday environment, choosing an academic field and professional career, making diagnostic differentiation between disorders that affect cognitive functioning, assessing the rate of improvement or deterioration in functional capacities, and making prognostic predictions. The concept of measurement implies numerical representation of certain properties. Such physical properties as dimensions, intensity, speed, and gravity, which represent a core of scientific exploration, lend themselves to accurate and reliable measurement. In contrast to direct measurement of physical phenomena, psychological attributes such as cognitive abilities, personality traits, and emotional status cannot be measured directly. To assess these psychological constructs, we need to obtain a sample of behavior that can be quantified and represented in numerical scores. Well-validated psychological tests are designed to elicit behaviors that are representative of the underlying psychological constructs. Numerical values derived from an individual's performance on a test are identified as raw scores and may represent the number of correct responses, time required for completion of the test, number of errors, rating of the quality of a drawing, or different combinations of the above criteria.
34
In contrast to physical measuring scales that have an absolute 0 point, scaling of psychological measures does not start at the point of "no ability at all"; i.e., a raw score of 0 on an arithmetic test does not indicate that the patient has no ability to solve arithmetic problems. If the test included basic operations of addition or subtraction of single digits, the patient would be likely to succeed on these items. As a result, we cannot infer that a patient who received a raw score of 50 on a test is twice as good at arithmetic as someone with a score of 25. Due to the lack of the absolute 0 point in psychological measurements, ratios of scores are meaningless and most psychological tests are scored on an interval scale. Despite some disadvantages of the interval scale in comparison to the ratio scale, both of these scales provide measurements that lend themselves to advanced statistical analyses. The raw score obtained on a test says little about an individual's level of ability or mastery of the subject. To interpret the raw score, it should be related to the content of the test or compared to the performance of a group of individuals on the same test. Various interpretive strategies are outlined in Chapter 2; however, the statistical and psychometric issues related to the most frequently used interpretation strategies are discussed below. Test performance interpretation can be based on different reference criteria:
1. Domain- or content-referenced interpretation indicates how proficient an individual is in the domain tapped by the task presented on the test. Content mastery is usually reported as a percentage of correct responses on the test.
2. Criterion-referenced interpretation relates an individual's performance on a test to an external criterion measure, such as a practical situation which requires skills assessed by the test. For example, expectancy tables tie different levels of test performance to expected practical outcomes.
3. Norm-referenced interpretation compares scores achieved by one individual to the performance of a respective group of individuals who have similar characteristics. This
BACKGROUND
normative or standardization sample is assumed to be representative of the population from which it is drawn and is used as an external standard of performance for interpretation of individual scores. There are several methods of relating individual performance to the norms:
a. Raw scores obtained on a test can be converted to age or grade equivalents, which allow interpretation of a particular score in the context of expected performance for a specific age or grade level. This method is highly useful in assessing the developmental standing of children in comparison to their peers, provided that a considerable and well-identified increment in ability with age and grade advancement is expected. However, this method loses its effectiveness when the rate of development becomes uneven and the relationship between levels of ability and developmental markers weakens.
b. Measures of relative standing of individual scores within a distribution provide an alternative method of evaluating individual performance. Percentile rank (PR) reflects the percentage of the standardization sample that scored lower than the individual score (plus one-half of that portion of the standardization sample who achieved the same score as the individual being assessed). The PR is useful for providing the relative standing of a score; however, it indicates only the ordinal position of the individual score within the distribution. It does not show dispersion of the remainder of the distribution below that score and does not indicate the absolute amount of difference between scores. For example, percentile transformations magnify differences between individuals close to the center of the distribution and compress the differences at the extremes. Let us consider a distribution of scores (A) representing the number of words recalled on the fifth trial of the
STATISTICAL AND PSYCHOMETRIC ISSUES Rey Auditory-Verbal Learning Test (RAVLT): A= (5, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 13, 15)
There are 21 observations in this distribution. An individual who obtained a score of 9 performed better than four individuals who received scores lower than 9, plus two out of four individuals (0.5) who achieved a score of 9. Relating this proportion of six individuals to the total number of 21 observations yields a PR of 29, as follows from the formula:
$$\mathrm{PR}(9) = \frac{4 + (0.5)(4)}{21} \times 100 = 29$$
Similar calculations of PR for scores 8 and 7 in the above distribution yield 14 and 7, respectively. Two individuals scored below 8 and two scored exactly 8; one individual scored below 7 and one scored exactly 7:
$$\mathrm{PR}(8) = \frac{2 + (0.5)(2)}{21} \times 100 = 14$$
and
$$\mathrm{PR}(7) = \frac{1 + (0.5)(1)}{21} \times 100 = 7$$
As follows from the above calculations, the difference in PR between scores 9 and 8 (29 - 14 = 15) is greater than the difference between scores 8 and 7 (14 - 7 = 7). This example illustrates the main disadvantage of PRs: whereas they reflect the position of an individual score relative to the standardization sample, they do not indicate the absolute differences between scores.
c. To accommodate the absolute differences between scores, interpretation of a raw score should be based on the relative standing of the score with respect to the mean for the distribution and the variability of the scores within the distribution. This can be accomplished through converting a raw score into a standard score. The most frequently used standard scores are z and T scores.
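The percentile rank definition given above (percentage of scores below a given score, plus one-half of the scores equal to it) can be sketched in a few lines of Python. This is only an illustration under that definition; the function name `percentile_rank` is introduced here and does not come from any test manual.

```python
def percentile_rank(scores, x):
    """Percent of scores below x, plus half of the scores tied with x."""
    below = sum(1 for s in scores if s < x)
    tied = sum(1 for s in scores if s == x)
    return 100 * (below + 0.5 * tied) / len(scores)


# Distribution A from the text: words recalled on the fifth RAVLT trial.
A = [5, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10,
     11, 11, 11, 11, 12, 12, 13, 15]

for score in (8, 9, 10):
    # Equal one-word steps in raw score give unequal steps in PR.
    print(score, round(percentile_rank(A, score)))
```

Because the function works on ranks only, the one-word steps from 8 to 9 and from 9 to 10 translate into PR steps of different sizes, which is the ordinal limitation described in the text.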
STANDARDIZATION OF RAW SCORES
Compare the distribution of RAVLT scores (A) used in the above example with another distribution of scores on this test (B), both of which range from 5 to 15 with a mean of 10:
A = (5, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 13, 15)
B = (5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 10, 12, 13, 13, 14, 14, 14, 15, 15, 15, 15)
Visual examination of these distributions suggests that the variability around the mean is much greater for distribution B. Therefore, a score of 7 would indicate very poor performance relative to distribution A and a much better performance relative to distribution B. To account for the degree of variability in the normative distribution, individual measurements are converted into z scores. In another example, assessing recall of an individual on the fifth trial of the RAVLT, we use a reference sample with a mean of 10 (see graphs below). If the individual recalled seven words, comparison of the raw score with the mean for the reference sample (X - M) suggests that this individual's recall was three words below the expected score of 10 for his or her age. However, this does not tell us how low his or her performance was relative to the distribution of the normative sample. If a high degree of variability is expected and the standard deviation (SD) for the reference sample is 6 (graph A), then a score of 7 falls halfway between a mean of 10 for the reference sample and a score of 4 representing 1 SD below the mean, which results in a z score of -0.5.
[Graph A: number line of the number of words recalled (1 to 15), marking X = 7, M = 10, and -1 SD = 4.]
In reference to another sample with the same mean but a much lower degree of variability reflected in an SD of 1 (graph B), a score of 7 lies 3 SDs below the mean (z = -3). Therefore, the recall of seven words indicates much poorer performance in relation to the distribution in graph B than in relation to the distribution in graph A.
[Graph B: number line of the number of words recalled (1 to 15), marking X = 7, -1 SD = 9, and M = 10.]
Thus, to account for the variability within the normative distribution, raw scores are standardized, i.e., converted into z scores that relate the difference between an individual score and the group mean (X - M) to the SD for the reference group:
$$z = \frac{X - M}{\mathrm{SD}}$$
A negative z score indicates that the raw score lies below the mean for the reference group, a positive z score represents higher performance than the mean for the group, and a z score of 0 indicates that the raw score is equal to the mean of the reference group.¹ The z score (SD units) shows not only how much an individual performance deviates from the mean of the sample but also how likely it is that other individuals in the sample would achieve scores as high or as low as the person being tested. Standardization of raw scores, e.g., their conversion into z scores, allows comparison of the relative standing of individuals across different tests in spite of the differences in the measurement scales or the means and SDs for these tests. A standardized distribution of z scores has a mean of 0 and an SD of 1 because the mean is subtracted from each score and the result is divided by the SD. It preserves the same shape as the distribution of the raw scores from which it was derived. Therefore, differences in standard scores are proportional to the differences in the corresponding raw scores.
¹ For those tests that measure performance in terms of time or number of errors, where higher scores reflect lower performance, the sign of the z score is reversed. Mathematically, in these cases the numerator should be multiplied by -1, i.e., -(X - M).
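The z-score conversion can be illustrated with distributions A and B introduced earlier. The sketch below is a minimal example, not a prescribed procedure: the function name `z_score` and the `higher_is_better` flag are introduced here for illustration, and the use of the sample SD (n - 1 denominator) from Python's `statistics` module is an assumption, since the text does not state which SD formula a given normative study used.

```python
from statistics import mean, stdev


def z_score(x, reference, higher_is_better=True):
    """Standardize x against a reference sample: (X - M) / SD."""
    m = mean(reference)
    sd = stdev(reference)  # sample SD (n - 1); an assumption, see lead-in
    z = (x - m) / sd
    # For time or error scores, reverse the sign so that negative z
    # still means poorer performance than the reference mean.
    return z if higher_is_better else -z


A = [5, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10,
     11, 11, 11, 11, 12, 12, 13, 15]
B = [5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 10, 12, 13,
     13, 14, 14, 14, 15, 15, 15, 15]

# Same raw score, same mean, different variability.
print(round(z_score(7, A), 2), round(z_score(7, B), 2))
```

Both samples have a mean of 10, so the raw difference of three words is identical; only the SD differs, and the score of 7 lies further from the mean in SD units for the less variable distribution A.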
In spite of the obvious advantages of using z scores over raw scores, some of the properties of z scores are viewed as undesirable: (1) z scores have fractional values, which are carried to at least one decimal place; (2) half of the z scores in the standardized distribution are negative and half are positive, which leads to the zero-sum problem (i.e., corresponding values on both sides of the distribution cancel each other when totaled). Parameter values of the standard distribution are arbitrarily designated. Therefore, they can be easily changed through simple arithmetic transformations of z scores. T-score transformations overcome these disadvantages through multiplying z scores by 10 (thus eliminating fractional values) and adding a constant of 50 (which eliminates negative values and places all the scores on a scale of 0-100 with a mean of 50 and SD of 10):
$$T = 10z + 50$$
For example, a z score of -1.6 can be expressed in T scores as follows:
$$T = 10(-1.6) + 50 = -16 + 50 = 34$$
An example of a test which uses T-score conversion is the Minnesota Multiphasic Personality Inventory (MMPI) and its recent revision. Clinically significant elevations on the scales are judged relative to a mean of 50 and an SD of 10, which equates the scale of measurement across all validity and clinical scales on this test.
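The T-score transformation is a simple linear rescaling and can be sketched as follows. The helper names `t_score` and `z_from_t` are introduced here only for illustration.

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, SD 10)."""
    return 10 * z + 50


def z_from_t(t):
    """Invert the transformation to recover the z score."""
    return (t - 50) / 10


# A z score of -1.6 corresponds to a T score of 34.
print(round(t_score(-1.6)))
```

Because the transformation is linear, it changes only the mean and SD of the scale; the shape of the distribution and the relative standing of each score are unchanged.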
STANDARD SCORES AND NORMAL DISTRIBUTION
Many biological measures and human characteristics are distributed so that the highest frequency of scores is observed around the distribution mean, with a gradual decrease in the frequency further away from the mean, which eventually tails off on both sides. Score distributions of many psychological tests approximate this model, which in its ideal hypothetical form represents a normal distribution. It is convenient to treat test score distributions as if they were normally distributed because the properties of this model are known:
1. The distribution of hypothetical score frequencies arranged from the lowest to the highest values is bell-shaped and symmetrical; i.e., the left and right sides are mirror images of each other.
2. The frequency is highest in the middle of the distribution; therefore, the mean, median, and mode have the same values and divide the distribution into two equal parts.
3. The normal distribution stretches from minus to plus infinity; thus, the "tails" of the distribution get closer and closer to the x axis as they get farther away from the mean but never touch the x axis.
4. The normal distribution is described by a specific mathematical formula.
Although the test score distributions do not perfectly match this model, if the number of cases were increased and smaller class intervals were used, the shape of the sample distribution would become relatively smooth and symmetrical, thus approximating the distribution of scores in the population from which the sample was drawn. This assumption of normality of the test score distribution allows it to be converted into a distribution of z scores with a mean of 0 and an SD of 1, which represents a standard normal distribution. Use of this conversion facilitates interpretation of the test scores because it allows comparison of a variety of otherwise not comparable distributions through equating their means and SDs. The proportion of cases comprising a certain area under the curve between two points along the z axis is known, which permits conversion of z-score units into percentiles. For example, it is known that 34.13% of all scores lie between z = 0 and z = +1. Since the mean of the distribution (z = 0) divides the distribution in half, we know that 50% of all scores lie below the mean. Thus, adding 34.13% of scores above the mean to 50% of scores below the mean suggests that the 84.13th percentile corresponds to z = +1. Figure 3.1 illustrates the corresponding conversion values for selected z scores. The proportion of scores (i.e., the area under the standard normal curve) for each value along the z axis can be easily determined using tables provided in any basic statistics textbook.
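As an alternative to printed tables, the area under the standard normal curve can be computed directly. The following sketch uses `NormalDist` from Python's standard `statistics` module; the function names `z_to_percentile` and `percent_between` are introduced here for illustration, and the conversion is valid only under the normality assumption discussed above.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_to_percentile(z):
    """Percent of a normal distribution falling below the given z score."""
    return 100 * STANDARD_NORMAL.cdf(z)


def percent_between(z_low, z_high):
    """Percent of a normal distribution lying between two z scores."""
    return 100 * (STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low))


for z in (-2, -1, 0, 1, 2):
    print(z, round(z_to_percentile(z), 2))
```

For z = +1 the function returns a value that rounds to the 84.13th percentile, which is the sum of the 50% below the mean and the roughly 34.13% between z = 0 and z = +1.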
Figure 3.1. Illustration of the relationship between proportion of scores (represented by the area under the normal curve), percentile ranks, z scores, and T scores. [Figure not reproduced; its scales align standard deviations and z scores from -3 to +3 with the corresponding percentile ranks and with T scores from 20 to 80.]
INTERPRETATION OF INFREQUENT (OUTLYING) SCORES
As follows from Figure 3.1, 68.26% of all scores fall within 1 SD from the mean in both directions, 95.44% fall within 2 SDs, and almost all scores except for 0.26% are included between -3 and +3 SDs from the mean. This correspondence between the proportion of cases and z-score values is important for interpretation of individual test performance since such interpretation is based on the relative frequency of the score obtained by the individual being assessed with respect to the distribution of scores. For example, a test score falling outside the range of 2 SDs above or below the distribution mean is highly infrequent; only 4.56% of the scores deviate that far from the mean in both directions. Therefore, individuals obtaining these infrequent scores may be viewed as outliers. The decision criteria for defining scores as outlying might vary from more conservative to more liberal in different clinical situations, depending on the cost-benefit ratio of making false-positive vs. false-negative errors. Outlying scores can have different origins:
1. They might be due to the inherent variability in the population. Indeed, in any population, innate levels of a trait or ability range from very low to very high, which is modeled by a normal distribution. Therefore, a certain proportion of extreme scores is a natural feature of the population.
2. There might be purely deterministic reasons accounting for too low or too high scores, such as the following:
a. Inadequate reliability of the measuring instrument
b. Variations in test administration and scoring strategy
c. Errors in data recording or in calculating appropriate statistics
d. Demographic factors and physical handicaps affecting performance
e. Situational factors (e.g., external noise)
f. Sensitization and anxiety associated with the testing situation
g. Emotional factors and level of endurance (fatigue)
h. Response style and response bias
i. Motivational factors
j. Previous exposure to similar tests (practice effect)
3. Outlying scores may result from an execution error. In this case, an individual score is foreign to the distribution used for comparison. For example, an elderly individual of low average ability might appear impaired when compared to a normative sample of highly functioning, independently living, relatively healthy elderly individuals. This bias in the normative sample, which is not representative of all functional and economic levels of the population for the respective age group, would result in an inflated mean and in upward "slippage" of the entire distribution. To avoid execution errors, a clinician should be highly sensitive to the appropriateness of the norms used for each individual being evaluated.
INTERPRETATION OF SCORES THAT ARE NOT NORMALLY DISTRIBUTED
Interpretation of individual test scores with respect to the normative distribution is based on an assumption of normality of this distribution. To avoid interpretive errors, the basis for test score interpretation should be different if the distribution is asymmetrical. A standardized distribution has the same shape as the original distribution of test scores, which is highly dependent on the characteristics of the individual items comprising the test. If a test is designed in such a way that the majority of individuals can succeed on most of the items, test scores are compressed into a few discrete values at the upper extreme of the score range, with only a few observations at the lower part of the score range. In this case, the distribution is negatively skewed and the variability of scores falling within or above the normal range is highly limited. The test has its highest discriminative power at the lower ability levels; i.e., it is most useful in identifying impaired individuals. For example, the distribution of scores on the Boston Naming Test and the Rey-Osterrieth Complex Figure (copy condition) in a sample of highly functioning individuals would acquire this shape. When test items present difficulty for the majority of subjects, the score distribution is positively skewed. Variability within the lower range of scores is highly limited, whereas the highest sensitivity is obtained in the upper part of the distribution. Such a test would be most appropriate for the selection of a few outstanding individuals from a large group. The distribution of scores for Raven's Advanced Progressive Matrices can be used as an example. In both of the above cases resulting in skewed score distributions, use of z-score conversions is inappropriate since such conversions are based on the assumption of normality (particularly symmetry) of the distribution.
PSYCHOMETRIC PROPERTIES OF TESTS
In view of the proliferation of psychological tests with a high overlap in terms of abilities assessed, we are frequently faced with a dilemma: which test to select in a particular situation. All published tests have to meet the requirements outlined by the Standards for Educational and Psychological Testing (American Psychological Association, 1999). Yet, the choice of a test should be made in the context of a certain clinical situation. In choosing an appropriate test, one has to keep in mind several criteria for evaluating its psychometric properties.
Reliability
When measuring a certain aspect of functioning, our main assumption is that the scores on a particular test would be consistent over repeated administrations. If an individual sometimes receives high scores and sometimes low scores on the same test, no inferences can be made regarding the level of ability being measured. In other words, we have to be assured that the test is a reliable measure of a stable construct such as a specific ability. However, a certain degree of variability is inherent in test performance. It is due to transient factors associated with the testing situation and the patient's state at the time of testing. Thus, the score on a test reflects the contribution of the following two factors:
$$X = T + e$$
where X is the score on a test, T is a "true" score representing the actual level of ability measured by a test, and e is an error of measurement reflecting random variability. With an increase in test reliability, a greater proportion of the variability in test scores is due to differences between subjects in the "true" scores. In other words, reliability can be expressed as the proportion of variance in test scores which is accounted for by the "true" differences between subjects on the ability being measured. Therefore, a reliability coefficient provides a measure of test reliability representing the ratio of "true" score variance to the total variance of the test scores:
$$r_{xx} = \frac{\sigma_T^2}{\sigma_x^2}$$
where $\sigma_T^2$ is the variance of the "true" scores and $\sigma_x^2$ is the total variance of the observed test scores.
Methods of Estimating Test Reliability
Several procedures have been developed to determine the reliability of a test by measuring the proportion of "true" variance vs. the proportion of "error" variance. Different methods define measurement error with respect to different sources of error. The four most common methods are described below:
1. The test-retest method assesses the consistency of test scores from one test administration to the next. It is measured as the correlation between the scores on the first test and the retest and reflects the stability of scores over time.
2. The alternate forms method assesses the correlation between scores obtained by the same subject on alternate forms of a test. This method is the closest approximation to the parallel tests model.
3. The split-half method involves splitting the test into two equivalent halves after a single administration. There are different ways of splitting a test. The highest comparability of the two halves is achieved by an odd-even split in which one form contains all odd-numbered items and the other form, all even-numbered items.
4. The internal consistency method estimates the reliability of a test based on the number of items and the averaged intercorrelations among them. This method is mathematically related to the split-half method. Coefficient α is the most general form of this method and represents the mean reliability coefficient obtained from all possible split-half comparisons. In essence, internal consistency estimates compare each item on a test to every other item.
There is no universally agreed best method to evaluate test reliability. Each method has its advantages and disadvantages. The split-half reliability method overcomes theoretical and practical problems associated with the test-retest and alternate forms methods, such as difficulty in developing two equivalent forms of a test, carry-over effects, reactivity effects, and the effect of random variability on two test probes. However, the reliability estimate obtained by the split-half method varies depending on the arbitrarily chosen method of splitting. In addition, the split-half reliability coefficient underestimates the reliability of the full test and requires the use of a correction formula. The level of reliability varies for different tests. Ideally, a highly reliable test would be preferred to a test with low reliability. However, many practical considerations might influence a clinician's test selection. The cost of error in a decision-making situation is another factor which needs to be considered in selecting an appropriate test for the given situation. Test reliability should be high when a patient's test performance is considered as one of the factors in making a final diagnostic determination. Tests with lower reliability might be acceptable in preliminary screening situations.
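The split-half and internal consistency methods can be sketched in Python as follows. This is a minimal illustration rather than a recommended procedure. It assumes that the correction applied to the half-test correlation is the Spearman-Brown formula, 2r / (1 + r), and that coefficient α is computed from item and total-score variances; the text names neither formula explicitly. The item responses in `DATA` are hypothetical and serve only to make the example runnable.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability with Spearman-Brown correction."""
    odd = [sum(person[0::2]) for person in item_scores]   # items 1, 3, ...
    even = [sum(person[1::2]) for person in item_scores]  # items 2, 4, ...
    return spearman_brown(pearson_r(odd, even))


def cronbach_alpha(item_scores):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(item_scores[0])
    item_vars = [variance([p[i] for p in item_scores]) for i in range(k)]
    total_var = variance([sum(p) for p in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 6 examinees (rows) x 4 items scored 0 or 1.
DATA = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(round(split_half_reliability(DATA), 2), round(cronbach_alpha(DATA), 2))
```

A different way of splitting the items would change the split-half estimate, which is the dependence on the arbitrarily chosen split noted above; coefficient α does not depend on any particular split.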
Typical levels of reliability attained by neuropsychological tests range from 0.95 to 0.80, which represents a high to moderate range of reliability. For a test with a reliability estimate of 0.80, 20% of the variability in scores is due to measurement error. Thus, tests with reliability below 0.80 introduce a considerable proportion of "noise" in scores, which compromises their interpretability. For screening tests, reliability between 0.80 and 0.60 would be acceptable, whereas reliability estimates below 0.60 are usually judged as unacceptably low.
Standard Error of Measurement
The reliability estimate provides a relative measure of the accuracy of test scores. Like any correlation, it is influenced by the variability of scores. In a sample with a heterogeneous score distribution, reliability will be higher than in a more homogeneous sample. The reliability estimate does not indicate how much variability should be expected due to measurement error and how accurate the individual test scores are. Therefore, in addition to reporting reliability coefficients, test developers report the size of the standard error of measurement (SEM), which is useful in interpreting the observed scores of each individual patient. The SEM is determined by the reliability of the test ($r_{xx}$) and the variability of test scores ($\sigma_x$):
$$\mathrm{SEM} = \sigma_x \sqrt{1 - r_{xx}}$$
Since no test provides a perfect measure of ability, a certain degree of variability in the scores obtained by the same subject is expected. The SEM indicates how much an individual's score might vary if he or she is retested repeatedly with the same test (assuming that there is no practice effect or fatigue effect). According to measurement theory, the scores obtained by one subject across an infinite number of retests with the same test would result in a normal distribution, with the mean equal to this subject's "true" score and the SD equal to the SEM. Since in most clinical situations we obtain only one score on a test, we may treat it as an estimate of the theoretical "true" score. Using the SEM, we can form a confidence interval (CI) around this score, which provides a range in which a subject's "true" score is likely to fall. For example, a 70-year-old patient obtained a WAIS-III Full Scale IQ (FSIQ) of 110. According to the test manual, the size of the SEM for FSIQ in this age group is 2.19 (Wechsler, 1997). Since we know that 95% of all scores in a normal distribution fall within 1.96 SD of the mean, the 95% CI for this score will fall between 1.96 SEM below and 1.96 SEM above the obtained IQ score (110 ± 1.96 SEM). Calculation of a CI by multiplying the SEM by 1.96 (2.19 x 1.96 = 4.29) suggests that we would expect 95% of the IQ scores obtained by this patient to fall in the range of 110 ± 4.29, or between 105.71 and 114.29. If we want to increase the level of certainty in constructing a range in which a patient's true score is likely to fall, we can use the 99% CI. In other cases, we might use lax CIs that provide accuracy below the 95% level. The drawback in using SEMs to determine the accuracy of test scores is the fact that they do not have the same size for all scores: they are smaller for extreme scores and larger for moderate scores. Another limitation in using SEMs is that test scores are generally further away from the mean than "true" scores because of the tendency for regression toward the mean. To overcome this distortion, the CI can be formed around the estimated "true" score, which is obtained from a regression equation.
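The SEM and the confidence interval built from it can be computed as in the sketch below. The function names are introduced here for illustration. The FSIQ of 110 and the SEM of 2.19 are the values quoted in the text; the SD of 15 and reliability of 0.96 used in the last line are hypothetical inputs chosen only to show the SEM formula at work.

```python
from math import sqrt
from statistics import NormalDist


def standard_error_of_measurement(sd, reliability):
    """SEM = SD of test scores * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def confidence_interval(observed, sem, level=0.95):
    """Symmetric CI around an observed score, assuming normal errors."""
    # Two-tailed critical value, e.g. about 1.96 for the 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sem
    return observed - half_width, observed + half_width


low, high = confidence_interval(110, 2.19)            # example from the text
print(round(low, 2), round(high, 2))

wide_low, wide_high = confidence_interval(110, 2.19, level=0.99)
print(round(wide_low, 2), round(wide_high, 2))        # wider 99% interval

print(round(standard_error_of_measurement(15, 0.96), 2))  # hypothetical inputs
```

Raising the confidence level widens the interval, which is the trade-off between certainty and precision described above. This sketch centers the interval on the observed score; it does not implement the regression-based estimate of the "true" score.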
Validity
When a test is used to assess a certain aspect of functioning, it is assumed that the test measures what it is supposed to measure and that it is useful in making accurate decisions. Different validation strategies are used to understand the meaning and implications of the scores achieved on the test. Content validity and construct validity indicate whether a test is a valid measure of a specific ability. Criterion-related validity refers to the accuracy of decisions that are based on the test scores.
1. Content validity reflects the extent to which the behaviors sampled by the test
are a representative sample of the ability being measured. It is not measured statistically but determined by agreement among expert judges with respect to a detailed description of the content domain that is measured by each test item.
2. Construct validity determines how well observable behaviors measured by the test represent the underlying theoretical construct. This relationship can be established through high correlation of the test with other tests measuring the same construct (convergent validity) or through low correlation with tests measuring different constructs (discriminant validity). If more than one method is used to measure several constructs, the correlations among them can be represented in a multitrait-multimethod matrix, which establishes whether the results of a certain test are determined by the construct being measured or by the method of measurement. Construct validity can be further assessed by correlating one test with many other tests using factor analysis. In this case, construct validity is established through the high loading of a particular test on the factors that represent those constructs presumed to be measured by the test. Thus, content validity and construct validity represent two different strategies in determining that the test measures what it is supposed to measure: "Content validity is established if a test looks like a valid measure; construct validity is established if a test acts like a valid measure" (Murphy & Davidshofer, 1991).
3. The usefulness of a test in decision-making situations represents another aspect of test validity. Criterion-related validity reflects the relationship between test scores and measures of decision outcome, i.e., criteria. Any measurable behavior can be used as a criterion. For example, the choice of a rehabilitation strategy can be evaluated using a measure of symptom reduction as a criterion, or the accuracy of a screening test can be
assessed using a patient's psychiatric diagnosis as a criterion. The correlation between test scores and the criterion measure, which is derived without using the test, reflects the accuracy of predictions or decisions made on the basis of the test scores. Criterion measures can be obtained after decisions are made based on the test scores in a random sample of a population about which decisions are made (predictive validity) or at the same time that decisions are made in a preselected sample (concurrent validity). Whereas predictive validity is superior to concurrent validity in that it is a direct measure of the relationship between test scores and a criterion measure for the general population, it has a number of practical and ethical drawbacks. For this reason, the most practical and commonly used measure of criterion-related validity is concurrent validity, despite the fact that its coefficient underestimates the predictive validity. Theoretically, an estimate of the correlation between test scores and a criterion measure obtained in a criterion-related validity study can range between -1 and +1. Validity coefficients for most of the tests are relatively low, ranging between 0.2 and 0.5. This is due to the imperfect reliability of the test and the criterion measure: whereas a criterion is assumed to represent the "true state" of a patient, it is frequently based on subjective clinical judgment, which is inherently unreliable. If the correlation coefficient between a test and a criterion is 0.3, the proportion of the variance in the criterion that is accounted for by the test ($r^2$, or coefficient of determination) is 0.09. This means that only 9% of the variability in the criterion can be accounted for by the test scores. Although these numbers look discouraging, they should be interpreted in the context of other measures that contribute to the accuracy of decisions.
Decision Theory
In clinical practice, a clinician has to make decisions which range from assigning a certain diagnosis to applying a specific course of treatment. Since predictions based on the information available to the clinician are never perfect, each decision may have several possible outcomes. In the context of decision theory, the predictor and criterion values are reduced to only two categories, in spite of the continuous nature of these values. Comparison of predictions with criterion values suggests that there are four possible outcomes of decisions: correct decisions include true-positive (TP) and true-negative (TN) outcomes, whereas incorrect decisions include false-positive (FP) and false-negative (FN) outcomes. Tests are used to maximize the number of correct decisions and to minimize the number of errors. The contribution of the criterion-related validity of a test to improvement in the accuracy of decisions depends on the base rate and selection ratio.
Base rates The base rate reflects the proportion of an unselected population who meet the criterion standard. Clinically, this term is used interchangeably with incidence or prevalence of a disorder. In a hypothetical example, assume that among 500 normal elderly, 9% would have scores below the cutoff for dementia of 24 on the Mini-Mental State Exam (MMSE). Assume that 80% of patients with the diagnosis of dementia of Alzheimer's type (DAT) score below a cutoff of 24. If 100 DAT patients were added to 500 intact elderly (total number of subjects 100 + 500 = 600), the base rate would be 100/600, or 17%. In this situation, the outcomes would be as follows:
100 × 80% = 80 TP
500 × 9% = 45 FP
500 - 45 = 455 TN
100 - 80 = 20 FN
Thus, follow-up of subjects who score below the cutoff (80 TP + 45 FP = 125) will yield
a hit rate of 80/125 = 64%. In other words, the diagnosis of DAT will be confirmed in 64% of those subjects who scored below the cutoff of 24 on this test. In contrast to the above hypothetical example, in the general population, the base rate for DAT is considerably lower than 17%. For example, if the base rate for DAT is 5%, then out of 600 subjects, 30 would be suffering from DAT and 570 would be intact with respect to this diagnosis. Assuming that 80% of DAT patients and 9% of the intact elderly score below the cutoff on the MMSE, as was the case in the above example, the table of outcomes would look different because of the lower base rate for DAT:
30 × 80% = 24 TP
570 × 9% = 51 FP
570 - 51 = 519 TN
30 - 24 = 6 FN
The ratio of TP outcomes to the total number of subjects scoring below the cutoff (24 TP + 51 FP = 75) yields a hit rate of 24/75 = 32%, which is considerably lower than in the example with a higher base rate. With a decrease in base rates, most of the population are negatives; positives become more rare, and therefore, an attempt to identify this group will lead to an increase in the number of FP decisions. Low base rates also lead to a large number of TN decisions since a majority of the population do not suffer from DAT. Following the same logic, in the case of high base rates, as the number of TP decisions increases, the frequency of FN errors also increases. An optimal base rate of about 50% minimizes decision errors and maximizes accurate decisions, provided that the test used to assist in decision making has sufficient validity. In the general population, the base rates for certain disorders are usually low and most of the "red flags" represent false alarms. Base rates among individuals referred for evaluation due to progressive symptomatology are higher, and therefore, the expected number of false alarms would be lower.
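The arithmetic in the two hypothetical MMSE examples can be reproduced with a short sketch. The function and argument names below are illustrative assumptions, not terms used in the text; counts are rounded to whole subjects, as in the examples.

```python
def screening_outcomes(n_total, base_rate, sensitivity, false_positive_rate):
    """Expected decision outcomes for a screening cutoff.

    All names are hypothetical; the inputs mirror the quantities in the
    worked examples (sample size, base rate, proportion of impaired and
    of intact subjects scoring below the cutoff).
    """
    n_impaired = round(n_total * base_rate)
    n_intact = n_total - n_impaired
    tp = round(n_impaired * sensitivity)        # impaired, below cutoff
    fn = n_impaired - tp                        # impaired, above cutoff
    fp = round(n_intact * false_positive_rate)  # intact, below cutoff
    tn = n_intact - fp                          # intact, above cutoff
    hit_rate = tp / (tp + fp)                   # confirmed cases among those flagged
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn, "hit_rate": hit_rate}


# Base rate of 100/600 (about 17%) versus 5%, same cutoff performance.
high_base_rate = screening_outcomes(600, 100 / 600, 0.80, 0.09)
low_base_rate = screening_outcomes(600, 0.05, 0.80, 0.09)
print(high_base_rate)
print(low_base_rate)
```

Holding the cutoff performance fixed and changing only the base rate shows how the hit rate falls when the condition becomes rarer.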
Selection ratio Another factor affecting the accuracy of decisions is the selection ratio, which is defined as the ratio of TP + FP outcomes to the total number of subjects. Assume that a psychiatric ward has 30 beds for severely depressed patients. If only 32 patients are referred for hospitalization at any one time, the selection ratio would be high (30/32 = 0.94). The hospital cannot be very selective in this situation, and most of the referred patients could be hospitalized. In another scenario, 100 patients could be referred for hospitalization at any one time due to severe depression. Since the selection ratio is low (30/100 = 0.30), a certain strategy needs to be used to identify those who are acutely suicidal for immediate hospitalization. When a selection ratio is low, a test with even modest validity can make a considerable contribution to the accuracy of decisions.
Incremental validity The utility of a test has to be assessed in terms of an increase in the accuracy of decisions obtained using the test, which extends beyond the base rate or beyond information obtained from other sources. In other words, incremental validity reflects the unique contribution of a test to understanding the patient. Incremental validity is affected by the base rate, selection ratio, and criterion-related validity of the test. When decisions are made at random, the frequency of different outcomes can be computed directly from the base rate and the selection ratio. The incremental validity of a test indicates the degree of improvement in the accuracy of decisions made using the test, i.e., in the frequency of TP and TN outcomes, beyond the random level. The incremental validity is highest when the base rate is moderate, selection ratio is low, and criterion-related validity is high. Values of incremental validity for different combinations of base rates, selection ratios, and criterion-related validity coefficients are provided in Taylor-Russell (1939) tables.
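The random-decision baseline mentioned above can be sketched as follows, assuming that selection is statistically independent of true status when no valid test is used. The function name and the 50% base rate are hypothetical; only the 30/100 selection ratio comes from the ward example.

```python
def chance_outcomes(base_rate, selection_ratio):
    """Expected proportions of each outcome when selection is random.

    Assumption: with random decisions, being selected is independent of
    actually meeting the criterion, so joint proportions are products.
    """
    p_tp = base_rate * selection_ratio
    p_fp = (1 - base_rate) * selection_ratio
    p_fn = base_rate * (1 - selection_ratio)
    p_tn = (1 - base_rate) * (1 - selection_ratio)
    return {"TP": p_tp, "FP": p_fp, "FN": p_fn, "TN": p_tn,
            "accuracy": p_tp + p_tn}


# Selection ratio from the 30-bed, 100-referral scenario; the base rate of
# acutely suicidal patients (0.5) is an illustrative guess, not from the text.
ward = chance_outcomes(base_rate=0.5, selection_ratio=30 / 100)
print(ward)
```

Any accuracy a test achieves above the `accuracy` value returned here is the kind of improvement that incremental validity describes.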
Thus, the validity coefficient alone does not determine the usefulness of a test in each clinical situation. Test usefulness depends largely on the context in which the test is used.
Cutoffs and diagnostic accuracy of a test or interpretive strategy As pointed out above, in the framework of a decision theory approach, both the predictor (test) and criterion values are reduced to only two outcomes. Thus, the continuous nature of test scores is reduced to categories of pass/fail, impaired/unimpaired, etc. Selection of a cutoff point dividing a sequence of test scores into these two categories is another factor that affects the accuracy of decisions. Through manipulating the cutoff, the frequency of a certain type of correct decision can be maximized at the expense of increasing the frequency of another type of error. For example, test sensitivity, or the ability to correctly identify impaired individuals (expressed as the ratio of TP to all impaired individuals [TP + FN]), can be increased by fixing the cutoff at a small number of incorrect responses. This will reduce the frequency of FN errors but increase the proportion of FP errors. In other words, this will assure correct identification of the majority of individuals with even mild impairment and very few misidentifications of impaired individuals as being intact. At the same time, this will yield a large number of intact individuals who will be misidentified as impaired. The costs of such misidentification include inappropriate treatment, psychological distress, and adverse social/economic consequences. On the other hand, test specificity, or the ability to correctly identify the absence of impairment (expressed as the ratio of TN to all intact individuals [TN + FP]), can be increased by setting the cutoff at a large number of incorrect responses. This will reduce the proportion of FP errors but result in a large number of FN errors. In other words, only those patients who have pronounced impairment will be identified as impaired, and very few intact individuals will be misidentified.
However, many individuals with mild symptomatology will be missed. This will preclude timely therapeutic intervention which otherwise would allow stabilization or reversal of these patients' symptomatology.
Thus, manipulation of the cutoff affects the balance between sensitivity and specificity and results in different cost-benefit ratios. Based on the empirical evidence, the cutoff is usually set at a value that ensures a reasonable balance between sensitivity and specificity so that only "borderline" patients will likely be misidentified. Setting the optimal cutoff yields the highest Hit Rate, i.e., ability of the test to correctly identify the presence and absence of impairment (expressed as the ratio of [TP + TN] to all individuals in the sample [TP + FP + FN + TN]). In making a diagnostic decision, the clinician is concerned with the utility of a test in correctly identifying impairment in an individual patient, i.e., with the test's predictive value, rather than with its accuracy in discriminating between groups. Positive Predictive Value represents the probability that the patient is indeed impaired, given an impaired test score (expressed as the ratio of TP to all individuals identified by the test as impaired [TP + FP]). Negative Predictive Value represents the probability that the patient is intact given a nonimpaired test score (expressed as the ratio of TN to all individuals identified by the test as nonimpaired [FN + TN]). The probability of the condition based on the test result (predictive value) is referred to as the posttest probability. However, the usefulness of a test in aiding diagnostic decisions is also determined by the base rates (prevalence) of the condition in a given setting (see above), which represents the pretest probability. These probabilities can be converted into odds of having the condition, which are expressed as the ratio of the probability of having the condition to (1 - probability of having the condition). Posttest odds (which represent the likelihood that the individual who obtained a score X on the test has the condition) take into account the pretest odds and likelihood ratio:
Pretest odds × Likelihood ratio = Posttest odds
where the likelihood ratio represents the odds of a specific test result occurring in an individual who has a condition over the odds of that test result occurring in an individual who does not have the condition. In other words, it
represents the likelihood that, given a score X on the test, an individual is impaired vs. unimpaired. Likelihood ratios relate the specificity and sensitivity of a test to a given setting. They can be defined over the range of possible values, identifying a degree of abnormality rather than representing the presence/ absence dichotomy, and therefore allow consideration of the greater predictive power of scores at the extremes of a distribution. Likelihood ratios are particularly useful in determining the probability of having a condition based on the results of a test battery, rather than an individual test score. An alternative to tests of statistical significance between groups is the Odds Ratio statistic, which reflects the association between the incidence of a condition given specific situational factors versus incidence of that condition in the absence of those factors and approximates relative risk estimates when the incidence of the condition is low. Conversely, it measures the strength of dissociation between individuals with different test results and reflects the probability that the condition is present in an individual with an abnormal test result. In other words, it shows that an individual who obtained an abnormal score (falling below the cutoff for impairment) on a specific test is X times more likely to have the condition than an individual who scores in the nonimpaired range (above the cutoff). For further discussion of the utility of diagnostic tests, see Fletcher et al. (1996) and Sackett et al. (2000).
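The indices defined in this section can be tied together with a brief sketch that reuses the counts from the first hypothetical MMSE example (80 TP, 45 FP, 20 FN, 455 TN). The function name is hypothetical, and the likelihood ratio is computed in its common dichotomous form for a positive result, sensitivity / (1 - specificity), which is an assumption about how the definition above would be applied to a single cutoff.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Sensitivity, specificity, predictive values, odds, and odds ratio
    from a 2 x 2 table of decision outcomes (illustrative helper)."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                 # positive predictive value
    npv = tn / (tn + fn)                 # negative predictive value
    overall_hit_rate = (tp + tn) / n     # all correct decisions
    pretest_prob = (tp + fn) / n         # base rate in this sample
    pretest_odds = pretest_prob / (1 - pretest_prob)
    # Assumed form: likelihood ratio for a score below the cutoff.
    lr_positive = sensitivity / (1 - specificity)
    posttest_odds = pretest_odds * lr_positive
    posttest_prob = posttest_odds / (1 + posttest_odds)
    odds_ratio = (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity, "specificity": specificity,
        "ppv": ppv, "npv": npv, "overall_hit_rate": overall_hit_rate,
        "pretest_odds": pretest_odds, "lr_positive": lr_positive,
        "posttest_odds": posttest_odds, "posttest_prob": posttest_prob,
        "odds_ratio": odds_ratio,
    }


indices = diagnostic_indices(tp=80, fp=45, fn=20, tn=455)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

In this sketch the posttest probability obtained through the odds calculation equals the positive predictive value, which illustrates that the two routes describe the same quantity for a dichotomized score.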
SYNTHESIS OF RESULTS OF DIFFERENT STUDIES IN A META-ANALYSIS
Historical Overview and the Rationale for Using Meta-Analysis in This Book
Historically, researchers and clinicians struggled with the amount and diversity of information available in the literature on any topic. Efforts to summarize results of studies in medicine can be traced back to the 18th century, marked with the conception of two journals published in Germany that provided critical appraisals of new publications. William Withering's summary review on the use of digitalis for treatment of heart disease, published in 1785, is one of the first examples of a systematic review. Statistical methods to combine data from different studies in medicine were first introduced in 1904 by British mathematician Karl Pearson. These early efforts spearheaded the development of statistical methods to synthesize research findings in social sciences, particularly in psychology and educational research. In 1976, the American psychologist Gene Glass coined the term "meta-analysis" to describe research synthesis based on statistical techniques. In the 1980s, meta-analysis became popular in medicine, particularly for summarizing results of clinical trials addressing the effectiveness of treatment techniques and of observational (epidemiological) studies examining the accuracy of diagnostic methods. In response to a pressing concern regarding the lack of summary reviews for those who need to use evidence from unmanageable amounts of information to make informed decisions in medicine, the Cochrane Collaboration, named after British physician and epidemiologist Archie Cochrane, was founded in the 1990s. This international network of health-care professionals promotes accessibility of systematic reviews through maintaining registers of controlled trials and preparing/updating systematic reviews, which are published in the Cochrane Library available on the internet (Antes & Oxman, 2001; Egger et al., 2002). To achieve a consensus across disciplines on how to report the results of systematic reviews, the conference on the Quality of Reporting of Meta-Analyses (QUOROM) was held in 1999, bringing together clinical epidemiologists, clinicians, statisticians, and researchers from the United Kingdom and North America.
As a result of this conference, the QUOROM statement was published, which includes a checklist and a flow diagram that identify the type and format of information to be included in systematic reviews (Moher et al., 1999). The goal of this "gold standard" is to help readers to evaluate the quality of reports and to appraise the likelihood of systematic error, i.e., bias in data reporting (Shea et al., 2001).
In the past decade, neuropsychology researchers have turned to comparative (case-control) observational meta-analytic techniques in an effort to examine the diagnostic accuracy of test batteries by comparing performance profiles of clinical and matched control groups. The outcome measures in such studies are sensitivity, specificity, likelihood ratio, effect size, and/or a summary receiver operating characteristic (ROC) curve. A few studies, primarily in medicine, have used noncomparative (descriptive) observational meta-analyses to identify mean values and the expected variability of a certain parameter (e.g., blood pressure) in non-diseased individuals. The studies described in this book were subjected to such noncomparative meta-analyses, the results of which are reported in the relevant test chapters. The outcome measure in our analyses is an expected distribution of scores in a nonclinical population, mediated by demographic variables, when appropriate.
Application of Meta-Analysis in Clinical Practice
The advantages and limitations of meta-analytic techniques are addressed in several publications (Cooper & Hedges, 1994; Egger et al., 2001, 2002; Green & Hall, 1984; Harris & Rosenthal, 1985; Hasselblad & Hedges, 1995; Hedges, 1982; Hedges & Olkin, 1985; Hunter et al., 1982; Kulik, 1983; Light & Pillemer, 1984; Rosenthal, 1983, 1984; Sterne et al., 2001; Sutton et al., 2000; Wolf, 1986) and are briefly summarized below.
Advantages
1. Karl Pearson was the first mathematician to point out that individual studies are too small to allow definitive conclusions, in view of the size of the probable error. To solve this problem, he proposed combining individual studies. Such a synthesis provides a solid basis for evidence-based clinical decisions in modern clinical research and practice, whereas conclusions drawn from several individual studies might be contradictory or the sample size of an individual study might be insufficient to detect or rule out a modest but important property of a specific parameter.
2. Meta-analysis identifies areas where consistent evidence is absent. It highlights the need for further research if conclusions drawn from individual studies are contradictory or high-quality studies in the area of interest are not available.
3. Analysis of heterogeneity in study results helps to identify subgroups that differ in an estimated parameter and draws attention to factors mediating the outcome.
Sources of Bias
1. Publication bias has been of considerable concern and was described by Rosenthal (1979) as a "file drawer problem." It points to the fact that studies yielding statistically significant findings are more likely to be published, published without delay, and published in English than studies with "negative" findings, which tend to be filed away. This causes a Type I publication bias error and results in a spurious effect of the parameter under investigation. To remediate this bias, the influence of unpublished studies and those published in languages other than English on the outcome should be taken into consideration.
2. Methodological and design quality differences between studies in terms of degree of experimenter blindness, randomization, sample size, controls for recording errors, and type of dependent variable (e.g., self-report vs. objective) represent another source of bias. Some researchers suggest that studies of higher quality (with larger sample sizes and well-controlled sampling) tend to have lower variance and effect sizes.
3. The issue of combinability of studies in a single meta-analysis is rooted in homogeneity of parameter estimates. Homogeneity across studies is assumed, based on the expectations that all studies are testing the same hypothesis and estimating the same population parameter and that variations in study estimates are random. However, heterogeneity of parameter estimates is a common problem. There are several distinct points of view on how to deal with heterogeneity:
a. Heterogeneity is analogous to individual differences among subjects within single studies and represents variations within the same parameter.
b. The studies should be grouped into homogeneous subsets and combined in separate meta-analytic syntheses.
c. Outliers contributing to heterogeneity should be subjected to close examination, to test for mediating effects that may contribute to the heterogeneity and to better understand the properties of the parameter of interest and suggest new hypotheses.
In addition to statistical methods directed at reduction of heterogeneity, a practical approach to this problem rests on treating meta-analyses differently from systematic reviews. It is expected that all available data will be systematically reviewed. However, it might be inappropriate to pool data from all heterogeneous studies in a meta-analysis. This is especially true for case-control epidemiological studies, where combining a set of confounded studies may result in spuriously precise but biased estimates of association. Thus, careful examination of the data to determine sources of heterogeneity is advocated in the literature.
4. Another source of bias stems from including multiple tests of a hypothesis from a single study in a single meta-analysis, which inflates sample size beyond the number of independent studies. A practical solution to this "apples and oranges" criticism is to perform separate meta-analyses for each type of outcome variable.
Careful attention to the sources of bias in meta-analytic synthesis enhances its reliability and validity. Reliability of the
meta-analysis refers to reproducibility of results (i.e., the likelihood that independent meta-analysts replicating the analysis will locate and include the same studies and measures) and agreement among raters in the coding of study characteristics (Rosenthal, 1984; Wolf, 1986; Zakzanis, 1998). External validity and internal validity are directly related to the choice of studies to be aggregated (coding strategies, examining mediating effects, testing homogeneity of results) and methodological quality of the studies, respectively. Guidelines for conducting meta-analyses to evaluate diagnostic tests and writing meta-analytic reviews have been published by Hasselblad and Hedges (1995), Irwig et al. (1994), and Rosenthal (1995).
SELECTION OF STUDIES AND PROCEDURES FOR META-ANALYSES PRESENTED IN THIS BOOK
Literature Search and Selection of Studies
The initial pool of studies considered for inclusion in the meta-analysis was generated through a computer-based search of the PSYCHINFO and MEDLINE databases. Names of the tests of interest and the neuropsychological functions measured by these tests were entered as key words in separate runs, with the search limited to English-language publications dating from 1998 until the present. The intent of this search was to add the most recent articles to a presumably comprehensive set of articles containing normative data included in the first edition of this book. References in the articles generated as a result of this search were reviewed to identify earlier relevant publications that might have been missed in prior searches. In addition, unpublished sets of normative data that have been sent to the authors after publication of the first edition of the book were evaluated for inclusion. The meta-analytic tables are presented in this book only for those neuropsychological tests that have a sufficient number of
homogeneous studies that are based on the same version of the test or the same administration format. For those chapters that contain the meta-analytic tables, not all studies available in the literature were necessarily included in the database for the analyses. Those data sets that are based on clinical groups not well identified in terms of methodology or on administration of the tests by medical staff, rather than by trained examiners, were not reviewed. Among studies that were reviewed, those that do not contain test means and SDs (or data that can be converted into these statistics), do not have demographic descriptions of the sample, or are based on idiosyncratic samples (e.g., data collected in China) or nonstandard administration procedures were not included in the meta-analyses. An effort was made to identify multiple publications based on the same study and to include a data set from each study only once, to avoid overlapping data sets. Similarly, when data are presented in overlapping age groups, only nonoverlapping data points were used. Data sets based on medical patients and on patients referred for neuropsychological evaluation which yielded no neurological findings were not included. The resulting data sets include data collected primarily (but not exclusively) in the United States and Canada, and the vast majority of the participants across the studies are Caucasian.
Procedures Used in the Analyses
Data were analyzed using Stata, which is a general-purpose, command line-driven statistical package for data management and analysis. It reads data into storage memory and is programmable, allowing the user to add new commands. This package was used for our purposes because it contains a comprehensive set of user-written commands for meta-analysis, in addition to commonly used ordinary regression analysis tools. Data in all analyses were inversely weighted on standard errors for the means since such weighting allows one to account for both sample size and the dispersion around the mean for each data point (calculated as a square root of the ratio of squared SD to the sample size). Data points that have a larger sample size and a smaller variance contribute more to the analysis. This helps to control for study quality since higher-quality studies tend to have larger weights. Stata's analytic weights were used, which represent the number of elements that gave rise to the statistic representing the data point. Fixed effect with a cluster option was used for all regressions. A cluster option was used to identify data points that were derived from the same study, to account for a lack of independence of data points within each study. Ordinary least squares regressions ("regress" command) were used, as opposed to the meta-analysis regression ("metareg" command), because "metareg" does not allow for the cluster option. We opted for the fixed effect based on an assumption that all data came from the same population. Preliminary tests with the "meta" command for all data sets revealed that pooled estimates of the fixed and random effects were comparable (e.g., 42.42 and 42.22, respectively, for the FAS). Tables of predicted values are based on the parameters identified in the above regressions and include 95% CIs (expected to include 95 out of 100 estimated values if the trials were replicated 100 times), calculated according to the following formula:
$95\%\ \mathrm{CI} = \text{value} \pm 1.96\sqrt{\operatorname{var}(\text{value})}$
Data Editing
After the relevant literature was selected, mean test scores with their respective SDs, demographic variables, and study characteristics were recorded in the Stata database partitioned by age and/or education group or for the entire sample as reported by study authors. When data allowed gender comparisons in addition to the overall scores, they were also recorded in a separate file to avoid double sampling. Every entry in the database is viewed as a data point. For example, a study that provides test performance data stratified into 4 age × 2 education groups would generate eight data points. Data were examined for consistency and for outlying scores. To aid us in this examination, we used the "meta" test, which tests data for
heterogeneity and assesses the influence of a single study in meta-analysis. It has an option of generating a graph depicting weighted means for all data points with their respective standard errors, centered around a vertical line demarcating the combined estimate of the mean (e.g., see, Fig. 3.2). Inspection of the resulting forest plot allows one to visualize the overall distribution of scores and to identify outlying data points. The degree of deviation from the estimated distribution parameters was further examined with a box plot and with an "iqr" test, which classifies outliers into mild vs. severe categories based on the analysis of an interquartile range. The presence of outliers typically resulted in a high Q value and a high estimate of between-study variance, representing heterogeneity among studies included in the data. Outlying data points identified by visual inspection and through the above analyses were
reviewed. Although variability across studies and age/education groups is expected, we strove to identify data included in error. Outlying data points that can be explained by clerical errors in published sources, deviations from standard administration procedures, or idiosyncratic samples were excluded from further analyses. The "meta" test was rerun on the remaining data to assure a decrease in the Q value and in an estimate of between-study variance in comparison to the initial analysis. Information related to the analyses for outlying scores is not included in the tables in the appendices, to avoid "information overload." Data reported in the tables describe heterogeneity only in the final data set, after data editing. It should be noted that the final Q values for all data sets are significant at p < 0.001 (reported as 0.000), indicating heterogeneity for all data. This outcome likely stems from the fact that the data come from different studies and
Figure 3.2. Example of a forest plot, which was used to assess the influence of a single study in the meta-analysis (data for the Verbal Fluency-FAS test were used). Vertical line demarcates the combined estimate of the mean. Data points are depicted as weighted means with their respective standard errors. [Plot rows: individual study data points and the combined estimate; horizontal axis: fasmean, 15 to 60.]
is dealt with using the cluster option (which identifies the study from which the data were derived) in the regression analyses. Furthermore, for those tests that are sensitive to an education effect, data were examined for consistency of represented education ranges. If a large gap was detected, data falling beyond the empirically supported distribution of education ranges were excluded from the analyses, to avoid extrapolating a prediction rule over ranges that are not supported by existing data. This process is described in the respective chapters.
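The weighting and heterogeneity checks described in this section can be sketched in a few lines. This is not a reproduction of Stata's "meta" command: the function name and the three data points are hypothetical, and the sketch assumes conventional inverse-variance weights (1/SE²) with a fixed-effect pooled mean and a Q statistic computed as the weighted sum of squared deviations from that mean.

```python
import math


def pooled_fixed_effect(data_points):
    """Fixed-effect pooled mean, 95% CI, and heterogeneity Q.

    data_points: list of (mean, sd, n) tuples, one per data point.
    """
    means, weights = [], []
    for mean, sd, n in data_points:
        se = math.sqrt(sd ** 2 / n)   # SE of the mean, as defined in the text
        weights.append(1 / se ** 2)   # assumed inverse-variance weight
        means.append(mean)
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    # Q is referred to a chi-square distribution with k - 1 degrees of freedom.
    q = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))
    return pooled, ci, q


# Hypothetical data points (mean, SD, sample size); not taken from any study.
example = [(40.0, 10.0, 100), (44.0, 12.0, 36), (42.0, 11.0, 121)]
pooled_mean, ci_95, q_value = pooled_fixed_effect(example)
print(pooled_mean, ci_95, q_value)
```

Dropping an outlying data point and recomputing should lower Q, which mirrors the rerun of the "meta" test after data editing described above.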
Regression
It has been widely documented in the literature that expected test performance varies as a function of demographic characteristics of an individual. Age has been shown to contribute most to this variability. To identify the rule describing a relationship between age and test performance, as reflected in the corresponding study means, data were subjected to regression analyses. Ordinary least squares regressions with fixed effect and a cluster option were used. The shape of the distribution of means was visually inspected to assist in the decision on whether linear or quadratic regression was appropriate for a specific data set. In addition, both linear and quadratic solutions were subjected to the test for model fit. An increase in the R² for the … between age as a predictor variable and the test means as an outcome variable. R² indicates the proportion of variance in the test scores accounted for by the model. It should be noted that we used R² rather than adjusted R² (which corrects for chance variation) since we had only one predictor for a relatively large number of observations in each model and, therefore, both values are very close. The F value and a corresponding probability level indicate how reliably predictor variables predict test scores. The strength of the relationship between age and the test means is reflected in the dispersion of data points around the regression line. A scatterplot illustrating the dispersion is included in each relevant chapter, with the size of the bubbles reflecting the weight of the data point (larger bubbles indicate larger standard error [SE] and smaller weight). Information for each term of the regression model is provided in the tables. The coefficient for a predictor variable indicates the extent of gain or loss in the test performance given a one-unit change in the value of that predictor variable (given that all other variables in the model are held constant). For example, for the FAS version of the Verbal Fluency test, the coefficient for the Education term is 0.498 (see Table 11m.1 in Appendix 11m, under "Effect of demographic variables"). This means that for each 1-year increment in education we expect a 0.498-unit increment in word production. In other words, with every additional year of schooling, an individual is expected to increase verbal fluency by almost 0.5 of a word, irrespective of age. The 95% CI for the coefficient shows how high and how low the actual population value of the coefficient might be. Dividing the coefficient by the SE for that parameter yields the t value, which is used in testing the null hypothesis that the coefficient for a given term is 0. In our example, a t value of 2.47 with a two-tailed p = 0.025 indicates that the coefficient for Education of 0.498 in a model based on 29 observations is significantly different from 0; thus, we can infer that education significantly contributes to test performance. This information was used in the tables to provide a correction factor for predicted test scores for different levels of education.
It should be noted that significance tests for the term age in the quadratic equations do not accurately reflect the linear effect of age on test performance due to collinearity with the quadratic effect, i.e., with the age² term in the equation. To address the linear effect of age, avoiding the collinearity, we present significance tests for age centered (by subtracting the mean age for the aggregate sample from the mean age for individual samples) in the footnotes to the relevant summaries of the regression models.
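As a rough illustration of regressing study means on age with data-point weights, the sketch below fits the linear case only, in plain Python. It is not the Stata "regress" procedure with the cluster option: cluster-adjusted standard errors are omitted, and the function name, ages, means, and weights are all hypothetical.

```python
def weighted_linear_fit(ages, means, weights):
    """Weighted least-squares line: mean = constant + slope * age.

    Returns (constant, slope, r_squared). Illustrative only; standard
    errors and the cluster adjustment are not computed.
    """
    w_sum = sum(weights)
    age_bar = sum(w * a for w, a in zip(weights, ages)) / w_sum
    mean_bar = sum(w * m for w, m in zip(weights, means)) / w_sum
    sxy = sum(w * (a - age_bar) * (m - mean_bar)
              for w, a, m in zip(weights, ages, means))
    sxx = sum(w * (a - age_bar) ** 2 for w, a in zip(weights, ages))
    slope = sxy / sxx
    constant = mean_bar - slope * age_bar
    ss_res = sum(w * (m - (constant + slope * a)) ** 2
                 for w, a, m in zip(weights, ages, means))
    ss_tot = sum(w * (m - mean_bar) ** 2 for w, m in zip(weights, means))
    r_squared = 1 - ss_res / ss_tot   # proportion of weighted variance explained
    return constant, slope, r_squared


# Hypothetical data points: age-group midpoints, study means, and weights.
ages = [22.5, 42.5, 62.5, 72.5]
means = [44.0, 43.0, 39.0, 36.0]
weights = [1.0, 2.0, 1.5, 0.5]
constant, slope, r_squared = weighted_linear_fit(ages, means, weights)
print(constant, slope, r_squared)
```

The slope plays the role of the coefficient described above: the expected change in the test mean for a one-year change in age.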
Prediction
The model that was estimated using the regression command was used to make out-of-sample predictions on another data set, which included values for age distributed in 5-year increments (with smaller intervals at the extremes of the age distribution in some cases), representing mathematical centers of respective age categories. For example, for the FAS, the data set includes values 19.0, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5, 52.5, 57.5, 62.5, 67.5, and 72.5. These numbers represent the age categories 18-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, and 70-74. Care was taken to avoid out-of-range estimates. For example, when the available data extended only to the age of 82, the 80-84 category was not used in the prediction table because an assumption that the same rule applies to ages 83 and 84 would not have an empirical basis. In some cases, when a partial age group was well represented, a predicted value for this age group is listed. It should be noted that distributions of the data used for model estimation were examined for continuity, to avoid gaps within the distribution. When such gaps were detected, the extreme data points were excluded from analyses. Tables of predicted values with corresponding CIs for the relevant neuropsychological tests are presented in the meta-analytic tables in the appendices along with supporting statistics. Critical reviews of strengths and limitations of predicted values are included in the text of the respective chapters. In clinical practice, situations might arise where an estimated score is needed for an age that falls beyond the range of age categories included in a prediction table. We strongly recommend that the clinician seek the needed data in individual studies included in this book (using locator tables to facilitate the search) or turn to the data accumulated in that specific clinic.
However, if everything else fails, the needed score can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. The equations are based on the coefficients for all predictor variables used by the program. The equation for a linear model is as follows:

$$\text{Predicted test score} = \text{constant} + \beta_{\text{age}} \times \text{age}$$

That for a quadratic model is as follows:

$$\text{Predicted test score} = \text{constant} + \beta_{\text{age}} \times \text{age} + \beta_{\text{age}^2} \times \text{age}^2$$

where $\beta_{\text{age}}$ is the coefficient for age and $\beta_{\text{age}^2}$ is the coefficient for age$^2$, respectively. For example, a quadratic equation derived from the model estimation for the FAS is 34.298 + 0.554 × age − 0.007 × age$^2$ (see Table 11m.1 in Appendix 11m). The equations are provided below the prediction tables; coefficients used in these equations are listed among the results of the analyses provided in the tables, specifically under the subtitle "Ordinary least square regression." As reflected in the shape of the regression line in the scatterplot in Appendix 11m, FAS performance is expected to increase somewhat up to approximately age 40, with a subsequent decline. The value for age when performance reaches its maximum can be derived from the regression equation using the following formula:

$$\text{age}_{\max} = -\frac{\beta_{\text{age}}}{2 \times \beta_{\text{age}^2}}$$

Using coefficients from the regression equation for the FAS, this value is −[0.554 / (2 × (−0.007))] = 39.57. The obtained value represents the age at which the curve turns over to the declining direction.
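The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the published tables: the function names are hypothetical, the coefficients are the rounded FAS values quoted in the text (so results may differ slightly from the tabled predictions), and the rule for dropping age categories assumes that a category is used only when the observed ages reach its upper bound.

```python
"""Sketch: evaluating the quadratic FAS regression quoted in the text."""

# Rounded coefficients as quoted in the text for the FAS model.
CONSTANT = 34.298
B_AGE = 0.554
B_AGE_SQ = -0.007


def category_midpoint(low, high):
    """Mathematical center of an age category, e.g. 20-24 -> 22.5."""
    return (low + high + 1) / 2


def fas_age_categories():
    """Age categories listed in the text: 18-19, then 20-24 through 70-74."""
    return [(18, 19)] + [(low, low + 4) for low in range(20, 75, 5)]


def usable_categories(categories, max_observed_age):
    """Keep categories whose upper bound is covered by the data (assumed rule).

    The text notes that partially covered groups were sometimes retained
    when well represented; that judgment call is not modeled here.
    """
    return [(low, high) for low, high in categories if high <= max_observed_age]


def predicted_score(age):
    """Quadratic model: constant + b_age * age + b_age2 * age**2."""
    return CONSTANT + B_AGE * age + B_AGE_SQ * age ** 2


def peak_age():
    """Age at which the quadratic curve reaches its maximum."""
    return -B_AGE / (2 * B_AGE_SQ)


if __name__ == "__main__":
    for low, high in fas_age_categories():
        mid = category_midpoint(low, high)
        print(f"{low}-{high}: midpoint {mid:.1f}, predicted {predicted_score(mid):.2f}")
    print(f"peak age: {peak_age():.2f}")
```

As the text cautions, such an equation should be applied outside the tabled age range only as a last resort, after individual studies and local clinic data have been considered.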
Standard Deviations
To test for a possible relationship between age and the variability in scores, regressions of SDs for test scores on age were run. When age accounts for a significant amount of variability in the SD, the predicted values for SDs and CIs (calculated using the same approach as above) are reported along with the predicted test scores. The results of
BACKGROUND
significance tests for regressions on SDs are reported. Tests for model fit for the solutions on SDs were performed using the same approach as for the performance scores. The results of these tests were used for decision-making purposes, but they are not presented in the meta-analytic tables in the appendices, to avoid information overload. When the results suggested that age does not account for any notable amount of variability in SD, as reflected in a very low R², mean SDs derived from the original data are listed in the tables as they are applicable for all age groups.
Testing Model Fit and Parameter Specifications
Postestimation tests of parameter specifications were performed to ensure accuracy of the prediction. Though violation of the normality of the residuals would not affect estimates of regression coefficients and predicted values, it would affect the validity of hypothesis testing; in other words, significant deviation from normality would affect the validity of p values for the t-test and F-test. The Shapiro-Wilk W test was used to assess the normality of residuals for the variables used in the regressions. The p value for the W statistic
is based on the assumption that the distribution is normal. Thus, high values of p indicate that we cannot reject the hypothesis that the variable is normally distributed. The normality of residuals was also assessed using the "kdensity" plot (Kernel Density Estimate), which approximates the probability density of a variable, and through visual inspection of residuals regressed on age. Close approximation of the estimated curve to the normal density overlaid on the plot and no pattern in the dispersion of residuals support the results of the Shapiro-Wilk test. The Kernel Density Estimate and the plot of the residuals regressed on age for the FAS are reproduced in Figures 3.3 and 3.4 for illustration purposes (the size of the bubbles in Fig. 3.4 reflects the size of the SEs of the data points, reciprocal to their weights). However, they are not included in the meta-analytic tables in the appendices.
Homoscedasticity, or homogeneity of variances of the residuals, is one of the main assumptions of the regression analysis. We used White's general test for heteroscedasticity, which regresses the squared residuals on all distinct regressors, cross-products, and squares of regressors. It tests the null hypothesis that the variance of the residuals is homogeneous. Low values of the derived Lagrange multiplier statistic and high values of p indicate that we
Figure 3.3. Kernel Density Estimate, which compares the estimated curve to the normal density (data for the Verbal Fluency-FAS test were used).
Figure 3.4. Plot of residuals regressed on age (data for the Verbal Fluency-FAS test were used). The size of the bubbles reflects the size of the standard errors of the data points, reciprocal to their weights. [Plot omitted; x-axis: mean age.]
cannot reject the hypothesis of homogeneity of variance in the residuals. The dispersion of residuals plotted vs. fitted values on the residual-vs.-fitted plot (rvf plot) was visually inspected for each regression. In a model with a good fit and homogeneous residual variances, this distribution should have no pattern. An rvf plot for the FAS is included (see Fig. 3.5) to illustrate this technique. However, these plots are
not reproduced in the meta-analytic tables in the appendices. The independence assumption refers to the expectation that errors associated with one observation are not correlated with errors associated with any other observation. Our data clearly do not meet this assumption because the data points derived from the same study (e.g., when the scores are stratified by
Figure 3.5. Residual-vs.-fitted plot (data for the Verbal Fluency-FAS test were used). [Plot omitted; x-axis: fitted values.]
age group) are likely to be related and subjected to the same source of error. To account for the lack of independence, we used the cluster option for model estimation, which specifies that observations are independent across studies (clusters) but not within studies.
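For reference, the White test described earlier in this section can be written out for the quadratic age model. This is a textbook sketch under the assumption that the only regressors are age and age$^2$, so that the distinct regressors, squares, and cross-products reduce to powers of age up to the fourth. The symbols ($\hat{e}_i$ for the residuals, $\alpha_k$ for the auxiliary coefficients, $u_i$ for the auxiliary error, $n$ for the number of data points, and $R^2_{\text{aux}}$ for the fit of the auxiliary regression) are introduced here for illustration, and the form below does not reflect the weighting or clustering applied in the analyses.

```latex
\begin{align*}
\hat{e}_i^{\,2} &= \alpha_0 + \alpha_1\,\text{age}_i + \alpha_2\,\text{age}_i^{2}
                 + \alpha_3\,\text{age}_i^{3} + \alpha_4\,\text{age}_i^{4} + u_i \\
LM &= n \times R^2_{\text{aux}} \;\sim\; \chi^2_{4}
   \quad \text{(asymptotically, under } H_0:\ \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0\text{)}
\end{align*}
```

A small $LM$ value, and hence a high $p$ value, means that the squared residuals are not systematically related to age, which is the pattern the text interprets as support for homogeneity of variance.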
Effect of Demographic Variables
The effect of education was explored with the "metareg" command, which yields an estimated between-study variance tau², measuring residual heterogeneity adjusted for covariates. The value of the tau² estimate was compared for regressions of test means on additive components of variance with and without education. If the tau² value for the regression with education was much lower, indicating that education explains a considerable amount of heterogeneity in test performance, education was entered as a predictor variable into the equation used for the model estimation. If R² considerably improved as a result of addition of the education term and the t value for education was high with a low p value, the coefficient for education, derived from the latter regression, was used as a correction factor in the tables for relevant tests. Where education accounts for a large proportion of variance in test performance, the predicted scores listed in the age-stratified tables are accurate for individuals with education at the mean level for the original data set. With every year of education above or below the mean, expected gains or losses in test performance are equal to the coefficient for the education term. For example, values listed in the prediction table for the FAS are accurate for individuals with approximately 14 years of education since the mean education across all samples for this test is 14.31 (see Table 11m.1 in Appendix 11m, under "Description of the aggregate sample"). Thus, an expected score for a 37-year-old individual with 12 years of education (2 years below the mean of 14 cited above) is 45.17 − 2(0.50) = 44.17. Correction tables provided for the FAS and the Trailmaking Test (TMT) parts A and
B allow such adjustments by adding or subtracting the appropriate correction factor to/from the predicted scores provided in the prediction tables. The SD to be used with the education-corrected score is that for the person's actual age group (which is relevant to the TMT tables but not to the FAS as the same SD is used for all age ranges for the latter). It should be noted that the range of years of education for the correction tables is limited. This limitation is due to a lack of empirical data for individuals with lower levels of education in the studies reviewed. We do not know whether the pattern of education/test performance relationship linearly continues into the lower educational levels. Therefore, extrapolation of the suggested correction pattern onto educational levels falling below the empirically supported range might undermine the accuracy of the prediction. The effect of gender on test performance was assessed by adding a variable accounting for the percentage of males in the sample as a predictor variable into a regression of test means on age. In addition, a t-test was run on the data that are reported for males and females separately. Male/female differences in mean test scores are reported in the tables, where appropriate. If a sufficient number of studies for a specific neuropsychological test report the data stratified by gender and a significant relationship between gender and test scores is highlighted in the literature, age-predicted scores are presented for males and females separately (e.g., GPT). For a number of neuropsychological tests, the differences between genders were not large enough to warrant separate predicted tables or addition of a correction factor for gender. Although it is widely known that intelligence level makes a considerable contribution to performance on certain tests, we could not provide corrections for IQ level because of the paucity of reported data on IQ in the samples aggregated for the analyses in this
book. Similarly, the volume of information on ethnicity or other demographic variables gathered from the studies reviewed was not sufficient to conduct statistical analyses.
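The education correction described earlier in this section amounts to a one-line adjustment, sketched below in Python. The function name and the optional `supported_range` argument are hypothetical, introduced only for illustration; the numbers in the usage example are those of the FAS example in the text (predicted score 45.17, mean education rounded to 14 years, coefficient 0.50). The supported range of education is not stated in this chapter, so no default bounds are assumed.

```python
def education_corrected_score(predicted, years_education, mean_education,
                              coefficient, supported_range=None):
    """Shift an age-predicted score by the education coefficient.

    predicted        -- score from the age-stratified prediction table
    years_education  -- the individual's years of education
    mean_education   -- mean education of the aggregate sample
    coefficient      -- regression coefficient for the education term
    supported_range  -- optional (low, high) bounds of the correction table;
                        hypothetical argument, values must come from the table
    """
    if supported_range is not None:
        low, high = supported_range
        if not low <= years_education <= high:
            # The text warns against extrapolating beyond the supported range.
            raise ValueError("education level outside the empirically supported range")
    return predicted + coefficient * (years_education - mean_education)


if __name__ == "__main__":
    # FAS example from the text: 37-year-old with 12 years of education.
    score = education_corrected_score(45.17, 12, 14, 0.50)
    print(f"{score:.2f}")
```

Passing the bounds of the relevant correction table as `supported_range` makes the function refuse the kind of extrapolation the text cautions against, rather than silently returning a value.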
Comments on the Applicability of the Meta-Analyses Presented in This Book
As discussed earlier, an advantage of any meta-analysis is in increased power to direct informed clinical decisions based on synthesis of empirical data. Data derived from individual studies underlying our meta-analyses might be biased by imperfect sampling procedures, random individual differences due to small sample sizes in each demographic cell, and deviations from standard administration procedures. In addition, they are setting-specific and contain data for limited age ranges or demographic categories. Thus, choosing a normative data set as a reference for a specific patient might become a time-consuming undertaking. Meta-analytically derived regression estimates are based on large aggregate samples and represent the mathematical center of all studies across demographic groups. As such, regression-based tables of normative data are relatively free of chance factors affecting individual studies. However, regression-based norms should not be used as a substitute for empirically derived tables presented in the context of study reviews. Any averaging results in a loss of specific qualities. We intended to present corrections for variables that are in theory expected to affect test performance. However, we were limited by the data available in the literature, which in many cases seem to be at odds with the theory. Individual data sets based on a sample of participants who are similar in terms of setting, demographic characteristics, and/or functional level to the patient for whom normative comparisons are sought would provide more accurate estimates of expected performance than regression-based tables. It has been emphasized by a number of investigators (e.g., Heaton et al., 1986; Kalechstein et al., 1998; Ross & Lichtenberg, 1998; Van Gorp & McMullen, 1997) that the selection of the normative data set should be guided by the comparability of the patient's demographic characteristics to those of the normative data and, more specifically, by the moderating variable that is most likely to affect performance (e.g., age for tests tapping psychomotor
speed, education for tests emphasizing verbal achievement). Regression-based predictions are best considered an aid in selecting an appropriate table when results from different studies yield contradictory values. Regression-based norms have been criticized (Fastenau, 1998; Fastenau & Adams, 1996; Heaton et al., 1996; Morgan & Caccappolo-van Vliet, 2001; Moses et al., 1999). Major criticisms refer to the concerns of violation of assumptions undermining the accuracy of prediction and extrapolation of the rule summarizing the relationship between predictor and outcome variables to the ranges of the predictor variable that are not supported by available data. As follows from the above description of the procedures for heterogeneity and parameter specification testing applied in our analyses, the issue of violation of assumptions was closely attended to. In addition, all predicted values fall strictly within the empirically supported ranges of the predictor variables. In spite of these efforts, the scope and quality of our analyses are limited by the scope of the data available in the literature. The accuracy of regression solutions presented in this book is undermined by several factors:
1. Age groupings provided in the literature
vary greatly between studies. Whereas for one study the mean age of 48 years might represent a range of 45-50, in another study the mean age of 48 represents a range of 20-86. The performance score reported in the latter study is much less meaningful in terms of age-referenced prediction than in the former. This situation was mitigated by weighting data points on SEs for the means as this weighting takes into account the dispersion around the mean.
2. Evaluation of the effects of demographic variables on estimates of test performance was limited by scarcity of demographic data provided in the literature. For example, an important variable such as IQ, which is expected to contribute significantly to variability in several neuropsychological tests, had very
limited variance across the data sets. Only a few studies reported IQ.
3. Levels of education and IQ for the majority of data sets are high. Therefore, the predicted values overestimate expected performance for individuals with a high school education or below and with average or lower than average range of intelligence.
4. We cannot describe our aggregate sample in terms of ethnic distribution because of scarcity of information on participants' ethnicity in the individual articles. We believe that the underlying samples are not representative of the mixture of ethnic groups according to U.S. Census figures since many samples were dominated by Caucasian participants. Those data that were collected exclusively on representatives of specific ethnic groups (e.g., Chinese, African American, or Hispanic) were not included in the meta-analyses as they increase the heterogeneity of the data. Ideally, separate analyses on data for different ethnic groups should be conducted in the future, provided that a sufficient number of studies reporting normative data specifically for different ethnic groups will be generated.
5. Increments in the values of predictor or moderator variables extracted from the literature are uneven. As reflected in scatterplots depicting the distribution of data points around the regression line
for each relevant neuropsychological test, available data seem to cluster at the young and advanced ages, with more scarce data points in between. Further investigations are needed to assure consistency in the relationship between predictor and outcome variables across all ages. However, large gaps in the ranges of predictor or moderator variables were avoided by eliminating extreme scores from the analyses. As a consequence of such adherence to empirically supported data, ranges of demographic categories covered in prediction tables are restricted; e.g., age groups are limited from both ends, and lower levels of education are not represented.
6. The suggested predictions for age (and education in a few cases) are based on the data for largely intact samples. It is unknown if the same relationship between demographic variables and test performance holds for individuals with brain pathology. Ultimately, normative databases should be expanded to include meta-analyses based on various clinical samples across test batteries, to acquire information on expected performance profiles for different diagnostic categories.
In spite of the weaknesses addressed above, we hope that the predictions presented in this book will facilitate the process of clinical decision making, which encompasses historical, clinical, and psychometric information.
II TESTS OF ATTENTION AND CONCENTRATION: VISUAL AND AUDITORY
4 Trailmaking Test
BRIEF HISTORY OF THE TEST
The Trailmaking Test (TMT) is included in the Halstead-Reitan Battery (HRB) and was originally part of the Army Individual Test Battery (1944). Part A is an 8" x 11" page on which the numbers 1-25 are scattered within circles. The patient is instructed to draw lines connecting the numbers in order as quickly as possible. Part B is a page with the numbers 1-13 and letters A-L within circles. The patient is instructed to draw lines connecting the numbers and letters in order, alternating between numbers and letters (e.g., 1-A-2-B, etc.). Specific administration procedures are provided in Reitan's (1979) Manual for Administration of Neuropsychological Test Batteries for Adults and Children. Two scores are obtained, reflecting the total time in seconds to complete each task. In the Reitan (1979) administration format, errors are not scored, but when they occur, the patient is alerted to the mistake and instructed to correct it, thus slowing overall performance time. The patient is presented with short sample items prior to the administration of each task. Detailed administration instructions are provided in Lezak et al. (2004) and Spreen and Strauss (1998). Charter et al. (1987) reported reliability coefficients expressed as correlations with the
alternate forms of the test in which the order of the progression was reversed but the locations of the circles were not altered. The resulting coefficients were 0.89 and 0.92 in the normal sample with over 300 participants and 0.95 and 0.94 in the mixed sample for Trails A and B, respectively. The standard errors of measurement were 8.05 and 21.7 for Trails A and B. Dikmen et al. (1999) reported test-retest reliabilities of 0.79 for Trails A and 0.89 for Trails B over a 9-month interval in a mixed sample. Data on repeated administration are also presented by McCaffrey et al. (2000). Reliability and validity of the TMT are further addressed in Franzen (2000), Lezak et al. (2004), and Spreen & Strauss (1998). A children's version of the tasks (age 9-14) is available, which incorporates fewer items. The TMT enjoys considerable popularity due to its high sensitivity to the presence of cognitive impairment. In addition, a number of studies document the usefulness of the TMT as a predictor of instrumental activities of daily living in the elderly (Cahn-Weiner et al., 2002) and of functional outcome following acquired brain injury (Acker & Davis, 1989; Millis et al., 1994; Ross et al., 1997; Schmidt et al., 1996). According to surveys of test usage in neuropsychology practice (Butler et al., 1991; Camara et al., 2000; Lees-Haley et al., 1996; Sellers and Nadler, 1992; Sullivan & Bowden, 1997), the
TMT is one of the most frequently used tests. The TMT is a standard component of screening batteries designed to detect cognitive impairment in different neuropsychological conditions. For example, in 1990, the TMT was adopted as a measure of cognitive impairment by the Drug Abuse Treatment Outcome Study (DATOS), sponsored by the National Institute on Drug Abuse of the National Institutes of Health. The DATOS was a naturalistic, prospective cohort study of adults enrolled in drug abuse treatment programs, which collected data on 10,010 adults in 96 programs across 11 cities in the United States between 1991 and 1993 (Horton & Roberts, 2003). The TMT data for a subsample of 8,521 adults were analyzed and presented by Horton and Roberts in a series of 19 articles published by the International Journal of Neuroscience between 2001 and 2003. The findings reflected in these publications point to significant effects of age, education, and ethnicity on many indices of TMT performance across various groups of drug users. However, the authors emphasized that these demographic effects are weak.
Contributions of Cognitive Mechanisms and Physical Layout Differences to Performance on Parts A and B
The TMT is described as a measure of visual conceptual and visuomotor tracking (Lezak et al., 2004); complex visual scanning with a motor component (Shum et al., 1990) with a contribution of motor speed and agility (Schear & Sato, 1989); simple motor-spatial skills and basic sequencing abilities (Lamberty et al., 1994); visual tracking, mental flexibility, and attention (Crowe, 1998b); visual perceptual abilities (Groff & Hubble, 1981); motor speed and visual attention (Gaudino et al., 1995); attention, simple motor and spatial skills, and sequencing abilities (Martin et al., 2003); and executive function (Burgess, 2003). Based on the results of a neuroimaging study exploring cognitive correlates of brain aging, Coffey et al. (2001) concluded that the neural substrates for the functions measured with the TMT part B involve multiple systems distributed throughout the brain. They attributed age-related slowing on part B to reduced
motor speed, impaired working memory, poor visual scanning, or a combination of several cognitive deficits. Factor analytic studies indicated that both parts A and B load on a visual perceptual factor (Groff & Hubble, 1981), a spatial factor (Moehle et al., 1990), a visuomotor scanning factor (Shum et al., 1990), a visuomotor speed and coordination factor (Swiercinsky, 1979), a motor problem-solving factor (Goldstein & Shelly, 1975), and a sustained attention and mental tracking factor (Lamar et al., 2002). Because of the complexity of mechanisms contributing to TMT performance, poor performance on this test is a nonspecific finding, which can be attributable to visual perceptual, motor, executive, motivational, or other factors (Anderson et al., 1995; Crowe, 1998b; Heilbronner et al., 1991; Iverson et al., 2002; Lezak et al., 2004; Lorig et al., 1986; Reitan & Wolfson, 1995b). To tease out a contribution of executive functioning to TMT performance, investigators turned to part B as a more complex measure requiring sequence alternation. According to the literature, several factors contribute to greater difficulty of part B in comparison to part A, which include cognitive demands and physical layout. Part B was found to place additional demands on the ability to alternate (Crowe, 1998b; Gaudino et al., 1995; Salthouse et al., 2000) and to flexibly modify a course of action (Arbuthnott & Frank, 2000; Kortte et al., 2002; Lamar et al., 2002; Lamberty et al., 1994; Pontius & Yudowitz, 1980) with a task-set inhibition component (Arbuthnott & Frank, 2000). Conversely, several investigators have identified additional demands on the ability to maintain two response sets simultaneously as the cognitive mechanism contributing to the greater difficulty of part B (Eson et al., 1978; Lezak et al., 2004; Reitan, 1971). Recent studies suggest that differences in physical layout further contribute to the greater difficulty of part B. 
Rossini and Karl (1994) reported that part B is 32% longer than part A. According to Gaudino et al. (1995), mean distances for parts A and B are 7.8 (3.2) and 10.2 (4.5) cm, respectively, which increases trail length for part B by 56 cm in
comparison to part A. In addition, analysis of visual interference indicated that part A has on average less than one visually interfering stimulus between each target, whereas part B averages more than one. Vickers et al. (1996) stated that the two parts of the TMT differ with respect to length and angular variability. However, there is no difference in structural complexity. Fossum et al. (1992) alternated the configural arrangement of test stimuli by placing stimuli of part A in the spatial configuration of part B and vice versa. The authors concluded that differences in symbolic complexity and spatial arrangement, as well as interactions between these factors, contribute to the greater difficulty of part B. Arnett and Labowitz (1995) developed a modified version, which used a standard layout of part B with numbers substituted for letters, thus eliminating the alternation component. The authors found that it takes about 1.4 times as long to complete part B relative to part A because of the more complex layout of part B. Based on analysis of differences in time to completion between parts A, B, and the new version, the authors related longer completion time for part B to three factors: a cognitive processing factor that is unique to part B, the more complex layout of part B, and a psychomotor-attentional factor that is common to both parts A and B.
Utility of the Derived Measures, which Are Based on Differences in Performance Times for Parts A and B The above review suggests that TMT part B differs from part A in cognitive demands, length of trail, and perceptual complexity. However, studies that removed confounds of physical layout or visual complexity still documented a significant increase in time to completion with addition of an alternating component to the trailmaking condition (Crowe, 1998a; Gaudino et al., 1995). This suggests that sequence alternation places additional demands on executive function beyond the confounds of physical layout, which accounts for the increase in time to completion on part B. This assumption is further supported by the report that part B loads on an attention factor
(O'Donnell et al., 1994) and by findings of clinical studies indicating that samples of patients with frontal lobe damage or traumatic brain injury demonstrate lower performance on part B than normal samples or clinical samples with intact frontal lobes (Cicerone & Azulay, 2002; Corrigan & Hinkeldey, 1987; Pontius & Yudowitz, 1980; Reitan, 1971; Stuss et al., 2001). In contrast, Cicerone (1997) demonstrated low sensitivity of Part B to mild traumatic brain injury. Similarly, Anderson et al. (1995) and Reitan and Wolfson (1995a) did not find the TMT useful in detecting frontal lobe damage. These contradictory findings might be explained by differences in the severity of pathology in the study samples or by differences in the anatomy of the affected frontal regions (dorsolateral convexities vs. medial or basal-orbital regions). Several investigators have examined the relationship between performance on the two parts of the TMT, using the B-A difference and B:A ratio, in an attempt to identify an increment in time associated with the additional processing demands imposed by part B. Golden (1981) examined the properties of the B:A ratio and found that it has a curvilinear relationship with impairment. Both high and low ratios may indicate neuropsychological impairment, with a ratio score lower than 2 being indicative of deficient performance on part A and a ratio score greater than 3 reflecting deficient performance on part B. Heaton et al. (1985) recommended use of the B-A difference as a measure of cognitive efficiency. Corrigan and Hinkeldey (1987) found the difference and the ratio measures to be sensitive to the increased cognitive demands of part B. They recommended use of the ratio as an intrasubject comparative index, allowing one to control for individual variability. Lamberty et al. 
(1994) demonstrated the usefulness of the ratio measure as an index of cognitive flexibility controlling for intrasubject variability as it is relatively free of age and education confounds. The authors concluded that the ratio measure is most useful in a screening evaluation when strong diagnostic information is not available, in the context of age and education confounds. They also described its usefulness in the forensic context as
"fakers" are expected to have a smaller ratio than brain-damaged individuals. The authors suggested that ratios of 2.0-2.5 represent normative performance, with 3.0 being identified as a cutoff for the presence of neuropsychological impairment. The usefulness of this cutoff is supported by Arbuthnott and Frank (2000), who found an especially large cost for alternating switches in participants with a B:A ratio greater than 3.0. However, Drane et al. (2002) concluded that rates of false-positive misclassifications for the 3.0 cutoff were unacceptably high in their sample of normal adults, especially for older age groups. In addition, Martin et al. (2003) did not find the B:A ratio to be sensitive to the severity of traumatic brain injury. It also failed to identify examinees who were dissimulating, according to independent psychometric indicators. The authors concluded that the ratio measure does not enhance the clinical utility of the TMT in individuals with traumatic brain injury. Axelrod et al. (2000a) examined the sensitivity and specificity of different cutoffs for the ratio measure based on the Hebrew version of the TMT. They recommended use of a more conservative cutoff when performance is considered to be pathological.
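The two derived measures discussed above are simple to compute from the completion times. The sketch below is illustrative only: the function names are hypothetical, the times in the usage example are invented for demonstration, and the default cutoff of 3.0 is the value attributed to Lamberty et al. (1994) in the text, which Drane et al. (2002) found to produce unacceptably high false-positive rates in normal adults. It is not a diagnostic rule.

```python
def derived_measures(time_a, time_b):
    """Return the B-A difference and B:A ratio for completion times in seconds."""
    if time_a <= 0 or time_b <= 0:
        raise ValueError("completion times must be positive")
    return {"difference": time_b - time_a, "ratio": time_b / time_a}


def exceeds_ratio_cutoff(ratio, cutoff=3.0):
    """Flag a B:A ratio above the cutoff (3.0 as cited in the text; debated)."""
    return ratio > cutoff


if __name__ == "__main__":
    # Hypothetical completion times, chosen only to illustrate the arithmetic.
    measures = derived_measures(time_a=30.0, time_b=75.0)
    print(measures["difference"], measures["ratio"])
    print(exceeds_ratio_cutoff(measures["ratio"]))
```

Keeping the cutoff as a parameter reflects the disagreement in the literature summarized above: a more conservative value can be supplied without changing the computation.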
Utility of the Error Analysis The diagnostic utility of the rate of performance errors on the TMT was examined in several studies. Rasmusson et al. (1998) reported an increase in error rate on part B, but not on part A, with each decade of life in their sample of nondemented participants over the age of 60. However, there was no longitudinal change in the error rate on a 2-year follow-up. In the demented sample, dementia status was significantly associated with the proportion of participants making errors on both parts A and B, independent of age. Stuss et al. (2001) found error analysis to be more useful than time to completion in distinguishing between patients with frontal lobe injuries and those with damage to nonfrontal areas or normal controls. All patients who made more than one error on part B had frontal lesions. Further division of the frontal
patients into subgroups, based on the number of errors, indicated that damage in dorsolateral frontal areas was associated with the greatest degree of impairment, whereas damage to the inferior medial aspects of the frontal lobes did not significantly affect performance. Steffens et al. (2001) reported greater frequency of subsequent errors after controlling for the overall initial error rate in a sample of geriatric depressed patients in comparison to a control elderly sample. The authors interpreted this finding as a performance feedback deficit in geriatric depression which is linked to a dysfunction of the orbital frontal cortex. Ruffolo et al. (2000) investigated the diagnostic utility of TMT errors by comparing error rates in two head-injured groups (varying in degree of injury severity), patients with suspect effort, controls, and experimental malingerers. No differences in error rates between both head-injured and control groups were found. However, error rates on part B were significantly higher in both malingering groups in comparison to the head-injured and control groups. The authors concluded that performance errors lack diagnostic utility for persons with head injuries but that they may be helpful in the assessment of malingering if used in conjunction with time to completion. To explore the specific cognitive mechanisms contributing to poor performance on the TMT in clinical samples, several investigators have examined the frequency of different categories of errors. Klusman et al. (1989) categorized errors on part B into shifting (from number to letter and from letter to number) and sequencing (number and letter) errors. The authors found that neither error type nor frequency nor percentage of individuals making errors differed significantly between head-injured and control groups. McCaffrey et al. 
(1989) identified two types of errors: sequential errors, which involve omission of the consecutive element in the series, and perseverative errors, indicating a failure to alternate between categories, with the latter type being applicable only to part B. Stability of both types of error between two administrations over a 7-10 day period was evaluated by the authors on a sample of
polysubstance users. The error analyses revealed a significant improvement in performance from the test to retest. The authors interpreted this improvement as a practice effect and pointed to the questionable utility of the error rates as indicators of stable central nervous system (CNS) dysfunction in polysubstance users. Amieva et al. (1998) used similar error categories in an investigation of cognitive failures contributing to low TMT performance in Dementia of Alzheimer's Type (DAT). Every transition between two items in their demented and elderly control samples was examined, and errors were further broken down into subcategories. The sequential errors (SE) category, identified by the authors as a failure to efficiently pursue the letter or the number series by omitting an item, was further subdivided into proximity SE (characterized by spatial proximity), SE to rectify (an attempt to move back and rectify an initial error), displacement SE (displacement of the subsequent sequence), and unexplained SE (not falling under any of the above descriptions). The perseverative errors (PE) category was conceptualized as a failure to alternate between the series of numbers and letters. The results yielded a significantly greater frequency of proximity SEs and PEs in the patient group, which was interpreted as an inhibitory deficit largely accounting for poor TMT performance in demented patients. Thus, review of the literature suggests that error analysis is not sensitive to cognitive deficits associated with head injury or polysubstance use. However, analysis of frequency and type of errors might be diagnostically useful in dementia and might contribute to the detection of localized frontal lobe dysfunction, suboptimal effort, and age-related decline in performance accuracy when used in conjunction with time to completion. Further research is needed to replicate these findings and to investigate the usefulness of these indices in other clinical settings.
Utility of the Cutoffs for Impairment The manual for the Army Individual Test (1944) provides a 10-point scale for converting raw scores, with 10 being the best score and 1
the worst. Reitan (1958) initially recommended use of this scaling method and suggested cutoffs for impaired performance based on the scaled scores. Given the significant association between TMT performance and age, IQ, education, and possibly gender, use of single cutoff scores does not appear to be appropriate, as has been confirmed by several studies. Bornstein et al. (1987b) found that, using a cutoff of ≥40 seconds on part A and ≥92 seconds on part B, 33% and 39% of a healthy elderly sample were misclassified as brain-damaged. Ernst (1987) found that use of cutoffs ≥39 and ≥92 seconds resulted in a misclassification rate of 48% for both parts A and B. Bak and Greene (1980) reported that at least 40% of their elderly normal participants were misclassified on part B. Dodrill (1987) documented misclassification rates of 11.7% and 13.3% when using cutoffs of ≥39 seconds on part A and ≥89 seconds on part B, respectively, in a young control sample. Bornstein and colleagues (1987b) noted that 96% and 98% correct classification rates for parts A and B, respectively, were obtained when cutoffs of ≥55 seconds and ≥137 seconds were used (but 46% and 40% of brain-damaged participants were then misclassified with these cutoffs). The authors emphasized that cutoff scores may be useful but only if considered in the context of other neuropsychological information obtained in a test battery and if age, education, and other appropriate adjustments are made. Cahn et al. (1995) used cutoffs of ≥66 and ≥172 seconds for parts A and B, respectively, in a study comparing DAT patients with a large sample of neurologically normal individuals. They report sensitivity and specificity indices of 69% and 90% for part A and 87% and 88% for part B. The authors underscored the diagnostic effectiveness of part B, which was one of the few measures contributing to optimal differentiation between DAT and control participants. The part B cutoff of 172 seconds was used by Rasmusson et al. 
(1998) in distinguishing between nondemented elderly and those participants who met criteria for DAT. Obtained sensitivity and specificity indices of 77% and 89.4% support the usefulness of this cutoff.
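The sensitivity, specificity, and misclassification figures cited above all come from the same arithmetic: apply a time cutoff to a clinical group and to a control group and count who falls on each side. The following is a minimal sketch of that calculation, not the procedure of any cited study. The function names and the sample times are hypothetical, and it assumes that a completion time at or above the cutoff is classified as impaired.

```python
def classify_impaired(seconds, cutoff):
    """Flag a completion time as impaired when it meets or exceeds the cutoff.

    Assumption: longer times are worse, so "at or above" means impaired.
    """
    return seconds >= cutoff


def sensitivity_specificity(patient_times, control_times, cutoff):
    """Return (sensitivity, specificity) for one cutoff.

    Sensitivity: proportion of patients classified as impaired.
    Specificity: proportion of controls classified as not impaired.
    """
    true_pos = sum(classify_impaired(t, cutoff) for t in patient_times)
    true_neg = sum(not classify_impaired(t, cutoff) for t in control_times)
    return true_pos / len(patient_times), true_neg / len(control_times)


# Hypothetical part B times in seconds (illustration only, not study data).
patients = [95, 180, 210, 172, 300, 150, 240, 199]
controls = [60, 75, 88, 110, 130, 95, 175, 140, 101, 82]

sens, spec = sensitivity_specificity(patients, controls, cutoff=172)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# The misclassification rate in the control group is 1 - specificity.
print(f"controls misclassified={1 - spec:.2f}")
```

Raising the cutoff trades sensitivity for specificity, which is the pattern described above for the more lenient cutoffs examined by Bornstein and colleagues.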
The utility of the cutoff scores was further emphasized by Soukup et al. (1998), who recommended reporting cutoff scores that represent borderline (15th percentile) and defective (< 5th percentile) performance in addition to the means and standard deviations in future studies, to offset problems associated with the positive skew in the distribution of TMT scores.
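To illustrate why percentile-based cutoffs help with a positively skewed distribution, the sketch below derives them directly from a set of completion times. It is an illustration under stated assumptions, not a method taken from Soukup et al. (1998): the function name and data are hypothetical, and it assumes that, because slower is worse, the 15th and 5th percentiles of performance correspond to the 85th and 95th percentiles of time in seconds.

```python
from statistics import mean, median, quantiles


def time_cutoffs(times):
    """Return completion-time cutoffs for borderline and defective performance.

    Assumption: slower is worse, so the 15th and 5th percentiles of
    performance map to the 85th and 95th percentiles of time in seconds.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95% of the sorted data.
    cuts = quantiles(times, n=20, method="inclusive")
    return {"borderline": cuts[16], "defective": cuts[18]}


# Hypothetical, positively skewed part B times in seconds (not study data).
times = [40, 45, 50, 52, 55, 58, 60, 62, 65, 68, 70,
         72, 75, 80, 85, 90, 100, 120, 150, 210, 300]

cutoffs = time_cutoffs(times)
print(cutoffs)
# With positive skew the mean is pulled above the median by a few slow times.
print(f"mean={mean(times):.1f}, median={median(times)}")
```

Because the percentiles are read from the ordered data, a handful of very slow times shifts them far less than it shifts the mean and standard deviation.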
Effect of the Order of Presentation and Practice Time, Practice Effect, and Alternate Versions of the TMT The effect of the order of presentation on performance on parts A and B was examined by Taylor (1998a) in a sample of patients with neurological disorders and by Miner and Ferraro (1998) in a sample of undergraduate students. Both studies revealed a significant time × order interaction, with times to completion being lower for part A and higher for part B, for the reverse order of presentation. Taylor (1998a) explained this trend in terms of a slight effect of practice in visual scanning and noted that part B can be used in isolation as omission of part A will not lead to serious distortion of part B performance. Thompson et al. (1999) examined the utility of practice times in predicting success or failure on the full version of the test. The authors presented tables of classification ac
a mean age of 59.1 years (standard deviation [SD] = 9.3). At the fourth testing probe 6 months after the initial assessment, practice effect gains were partially lost. Practice effects, specifically on part B, were reported by DesRosiers and Kavanagh (1987). Frank et al. (1996) also reported a significant practice effect for part B on the 2-year retest for a sample of 380 elderly over the age of 65. To minimize the practice effect over repeated administrations, several alternate versions of the TMT were developed. Lewis and Rennick (1979) developed alternate forms for part B, which were included in the Repeatable Cognitive-Perceptual-Motor Battery. Further discussion of the comparability of these forms to part B can be found in Kelland and Lewis (1994) and Lezak et al. (2004). DesRosiers and Kavanagh (1987) developed Trail C (TMC) and Trail D (TMD) versions, which retained the same relative position of each circle but inverted the labeled sequences respective to their equivalents, TMA and TMB. Administration of both sets to 16 normal adults in the pilot study yielded high correlations between alternate forms (r = 0.73 and 0.80 for TMA/TMC and TMB/TMD comparisons, respectively). The equivalence of the alternate forms was further investigated in an orthopedic control group and in a sample of closed head injury patients. Alternate forms for both conditions were stable and consistent in both groups. The equivalence of these alternate versions was further tested by McCracken and Franzen (1992) and Franzen et al. (1996). Based on the data from clinical samples and patients referred for neuropsychological evaluation, the correlation analyses as well as a comparison of solutions for two runs of the principal component analysis provide support for the equivalence of the standard and alternate test versions. On a sample of healthy adults, LoSasso et al. (1998) found that the TMT-D version is somewhat more difficult than TMT-B. 
Therefore, the authors concluded that TMT-D can serve as an excellent alternate form to the TMT-B, if it is administered on the retest after the TMT-B. In the same study, the authors reported that there is no clinically meaningful difference in scores with respect to whether the test is
performed with the preferred or nonpreferred hand. Charter et al. (1987) developed alternate versions for parts A and B using similar methodology. The authors report high correlations (r > 0.9) between the new and standard test versions in a large sample of patients and normal controls. Kilander et al. (2000) used two new versions, letters A-Z and following arrows indicating the order, in addition to the standard A and B parts. A different approach to generating TMT alternate forms was proposed by Vickers and Lee (1998). The authors stated that there is no established procedure for generating equivalent but stochastically different test forms and proposed a neural network method as a practical solution.
Culture-Specific Sets of Normative Data and Cultural Adaptations for the TMT Stewart et al. (2001) administered the TMT among other tests to 285 African-Caribbean participants between 55 and 75 years of age, residents of south London. Normative data are presented for part A, stratified by two age groups, gender, three education groups, and three occupational classes. Preliminary normative data collected on 190 Greek adults between 18 and 89 years of age are reported by Vlahou and Kosmidis (2002). Elwan et al. (1996, 1997) reported data for 211 normal Egyptians between 20 and 72 years of age. It is unclear, however, how the performance was measured. Giovagnoli (1996) provided data for 287 healthy Italian adults between 15 and 79 years of age. Time to completion for parts A and B as well as the B-A difference were reported. Nielsen et al. (1989, 1995) administered the TMT among other neuropsychological tests to a sample of Danish adults. In their 1989 article, the authors reported data for 101 volunteers and patients who had undergone minor surgery between ages 20 and 54 years, stratified into three age groups. In their 1995 article, they presented data for elderly participants between ages 64 and 83. Flemish normative data for the TMT, collected on 200 healthy adults between 18 and
74 years of age, are reported by Lannoo and Vingerhoets (1997). The data are stratified into 3 age × 2 education groups. Siegert and Cavana (1997) presented normative data for a sample of 127 New Zealanders over 60 years of age, divided into five age groups. The Color Trails Test (CTT) and TMT were administered in Hong Kong to 84 Chinese-English bilingual and English monolingual participants in a study designed to examine the effect of different language backgrounds on trailmaking performance (Lee et al., 2000). Data for participants between 20 and 50 years of age with at least a university education are reported for both groups. No significant differences in test performance between groups were found. In a follow-up article, Lee and Chan (2000a) provided comparative data for the CTT and TMT on a sample of 108 Chinese adults in Hong Kong. The data are reported in 2 age × 2 education groups. In another study, Lee et al. (2002) provided data for the TMT, among other neuropsychological tests, collected in Hong Kong on a sample of 475 Cantonese-speaking Chinese between 13 and 46 years of age. Time to completion and the number of errors are reported in 3 education × 2 achievement × 2 gender groups for adolescents and in 3 education × 2 gender groups for adults. Lu and Bigler (2000) administered the TMT to 60 American and Chinese students (age 21-32) from an American university. The Chinese group was also given an equivalent of the TMT part B for native Chinese speakers, where English letters were replaced by numbers in Chinese characters. The data for the two groups are presented as T scores using the Heaton et al. (1991) norms. The results indicate significantly longer performance time for part B than for the modified Chinese version of part B for the Chinese group. 
In a follow-up article, Lu and Bigler (2002) presented normative data for part A and the Chinese version of part B (C-Trails B), collected on a sample of 110 adults between 21 and 75 years of age, who were born in mainland China, Taiwan, or Hong Kong but lived
in the United States and spoke Chinese as the first and primary language. The psychometric properties of the Arabic version of the Expanded TMT were examined by Stanczak et al. (2001). In this version, English symbols are replaced with Arabic numbers and letters. The words begin and end are also translated into Arabic. The authors compared test performance by Sudanese normal and brain-damaged participants and equivalent groups from the United States, who completed the standard English version of the test. The data are presented as mean logarithmic scores for the four groups. The Hebrew version of the TMT was used by Axelrod et al. (2000a) in a study comparing performance of normal control participants and post-head injury outpatients. In the Hebrew version of the test, English letters are replaced with Hebrew letters on part B. Time to completion and the B:A ratio are reported for both groups. The sensitivity and specificity of different cutoffs for the B:A ratio are presented.
Modified Versions of the TMT Given the multidimensional nature of cognitive mechanisms contributing to TMT performance, several versions of the trailmaking tests were developed in an effort to tease out the effects of different component mechanisms on test performance. A variant of the TMT is included in the Delis-Kaplan Executive Function System (D-KEFS) published by the Psychological Corporation (Delis et al., 2001). This version includes five sections measuring component mechanisms contributing to trailmaking performance: the number sequencing task is a variation of part A; the letter sequencing task is a letter counterpart of the number sequencing task; the visual scanning task taps the ability to identify targets of a specific shape; the motor speed task taps the speed of tracing over a dotted line; and the number-letter switching task taps the flexibility of thinking on a visual-motor sequencing task and is similar to part B of the standard TMT. Performance is evaluated based on time to completion and error analysis. The psychometric properties of
the test are described by the authors in the D-KEFS manual. Normative data for ages 8-89 years are based on a large, nationally representative sample. Wecker et al. (2000) used a preliminary version of this test to examine the effect of age on component mechanisms contributing to test performance. Another version of the TMT, the Comprehensive Trail-Making Test (CTMT), which includes five conditions, was developed by Reynolds (2002) and published by Psychological Assessment Resources. Normative data for ages 11-74 are based on a large nationwide sample matching the U.S. Census data. In an effort to remedy the shortcomings of the standard TMT and investigate effects of different component mechanisms on age-related slowing, Salthouse et al. (2000) developed the Connections Test, which is based on the Zahlen-Verbindungs Test introduced by Oswald and Roth (1978) and described by Vernon (1993). It consists of four conditions: numbers, letters, alternating numbers and letters, and alternating letters and numbers. The instructions are to connect successive targets presented in a 7 × 7 matrix of circles. The placement of successive targets is fixed to one of eight positions relative to the prior target. The authors examined the equivalence of several alternate forms of the test. In the context of the elementary process model, the authors pointed to target identification, search-comparison, response, and sequence update processes as being involved in the nonalternating conditions, with the addition of a sequence switching process in the alternating conditions. Based on the performance of a sample of 207 adults between 18 and 80 years of age on this test, the authors concluded that all age-related effects are mediated by perceptual speed. Stanczak et al. (1998, 2001) developed an Expanded TMT (ETMT), which is aimed at systematically varying the stimuli within the TMT format in order to isolate the cognitive components. 
The authors designed experimental forms X, Y, and Z, which are administered in addition to parts A and B. Form X consists of a series of 25 clock faces with the hour and minute hands pointing to the time arranged in 15-minute increments. Form Y consists of
a sequence of 24 solid black dots of increasing diameter. The two forms are arranged in the vertically inverted mirror-image format of parts A and B, respectively. The examinee is to connect the clock faces in order of advancing time and the dots in order of increasing diameter. Form Z requires alternation between clock faces and dots (apparently, form Z was used only in the Arabic version of the test). The psychometric properties of the adult version of the test were examined by Stanczak et al. (1998) on a sample of 164 brain-damaged individuals and 252 normal controls. The authors found adequate concurrent and criterion validity of the new forms and highlighted their utility in explaining between-group variance and in cross-cultural assessment. The midrange adaptation of the ETMT was initially validated by Davis et al. (1989), who found that forms B, X, and Y discriminated well between normal and learning-disabled children. A modification of this version (Mid-Range Expanded TMT, METMT), which included five new forms (X, Y, Z, G, and B4) in addition to the standard parts A and B, was cross-validated by Stanczak and Triplett (2003). The authors report adequate psychometric properties of this version and provide normative data. Barncord and Wanlass (2001) pointed out that the TMT versions designed to reduce the cultural confound associated with use of the Latin alphabet have not addressed the confound associated with the Arabic counting system. To fill this void, the authors developed an alternative to the TMT, the Symbol TMT (STMT), which uses symbols that are not language- or number-based. The test consists of three trials. Two of them require matching of the symbols to the sequential key presented at the top of the page, with an added alternating background coloration requirement in the second trial. The third trial measures incidental memory for symbols presented in trials 1 and 2. 
The authors report modestly similar psychometric properties for the STMT compared to the TMT, established on a sample of 210 normal participants. Another culture-fair version of the TMT, the CTT, which was developed for the World Health Organization, is described in Chapter 5.
The oral version of the TMT (OTMT) was designed to eliminate visual and graphomotor components in trail-making performance (Abraham et al., 1996; Ricker and Axelrod, 1994; Ricker et al., 1996). Two conditions, performed orally, are presumed to be comparable to parts A and B of the standard version of the test. The instructions are to count from 1 to 25 as quickly as possible in the first condition and to orally switch between numbers and letters in the second condition. The second condition concludes with number 13. Although the clinical validity of the test was not established in the initial study, the authors reported that more recent studies, based on clinical samples, supported the validity of the OTMT as a measure of executive functioning. Ruchinskas (2003) cautioned against use of unadjusted performance indices for the OTMT with older medical patients or individuals with low educational levels, due to moderate correlations of the test scores with the Mini-Mental State Exam (MMSE) and level of education. Two other versions of the OTMT have been described in the literature: the Alphanumeric Sequencing procedure is incorporated in the Behavioral Dyscontrol Scale (Grigsby et al., 1994; Grigsby & Kaye, 1995); the Mental Alternation Test (Jones et al., 1993) was developed as a bedside screening test of cognition for patients with HIV infection. The psychometric properties of these tests need to be addressed in future studies.
RELATIONSHIP BETWEEN TMT PERFORMANCE AND DEMOGRAPHIC FACTORS The literature to date indicates that TMT performance is associated with age: increased age is related to poorer test scores. The association between age and TMT scores is present in both normals and patients but appears to be of smaller magnitude in brain-damaged samples (Corrigan & Hinkeldey, 1987). A number of studies have reported either significant correlations with age or significant differences across various age groups (Alekoumbides et al., 1987; Anthony et al., 1980; Bak & Greene, 1980; Bornstein, 1985; Davies, 1968; Elias et al., 1993;
Ganguli et al., 1991, 1996; Giovagnoli, 1996; Gordon, 1972; Goul & Brown, 1970; Heaton et al., 1986, 1991, 1999, 2004; Horton & Roberts, 2001, 2002, 2003, see Brief History of the Test for the description of the study; Ivnik et al., 1996; Kennedy, 1981; Lamberty et al., 1994; Lannoo & Vingerhoets, 1997; Lee and Chan, 2000a; Libon et al., 1994; Lu and Bigler, 2002; Lyness et al., 1994; Matthews et al., 1999; Parsons et al., 1964; Rasmusson et al., 1998; Reed & Reitan, 1962; Salthouse & Fristoe, 1995; Saxton et al., 2000; Siegert & Cavana, 1997; Small et al., 2000; Soukup et al., 1998; Stanton et al., 1984; Stuss et al., 1987; Vlahou & Kosmidis, 2002; Wahlin et al., 1996; Wiederholt et al., 1993). In two other studies, which do not formally acknowledge the association between age and TMT performance, TMT data are presented by age groupings, and the mean scores of the groups obviously increase with age (Fromm-Auch & Yeudall, 1983; Harley et al., 1980). Coffey et al. (2001) report a significant relationship between TMT part B performance and age-related brain changes documented on quantitative MRI in a sample of 320 elderly nonclinical volunteers, with poorer performance being related to cerebral atrophy (reflected in decreased cerebral hemisphere volume and increased peripheral cerebrospinal fluid volume) and ventricular enlargement. The authors comment on negative findings in similar neuroimaging studies and suggest that differences in subject characteristics and sample sizes, brain imaging methods, measurement techniques, and approach to statistical analysis might account for this discrepancy. Yeudall et al. (1987) and Boll and Reitan (1973) detected no association between age and part A or B. Yeudall et al.'s (1987) negative findings may be due to the restricted age range of their sample (15-40); examination of other data sets suggests that declines with age appear to occur after age 40 (Goul & Brown, 1970; Stuss et al., 1987) or age 50 (Kennedy, 1981) for part A. 
The reason for Boll and Reitan's (1973) failure to document age effects is less obvious, but it could involve small sample sizes in the very young and very old groups and problems in the linearity of the data (e.g., very young and very old participants
performing poorly). In spite of the reported nonsignificant correlations, the percent of participants correctly classified as normal for the oldest age group (60-64) did fall precipitously compared to other age groups, suggesting that there was a decline with age in test performance at least in this age group. Gonzalez et al. (2001) did not find a relationship between age and TMT part B performance in a sample of homeless individuals who were receiving medical care. The variance for the test scores was very large (probably due to varied degree of impairment in mental status), which possibly obscured the relationship between age and performance time. Many studies have documented a significant relationship between education and TMT scores in normal individuals, with higher education levels being tied to better test performance (Alekoumbides et al., 1987; Anthony et al., 1980; Bornstein, 1985; Bornstein & Suga, 1988; Drane et al., 2002; Ernst, 1987; Finlayson et al., 1977; Ganguli et al., 1991; Giovagnoli, 1996; Gonzalez et al., 2001; Gordon, 1972; Heaton et al., 1986, 1991, 1999, 2004; Horton & Roberts, 2001, 2002, 2003a, 2003b; Kennedy, 1981; Lamberty et al., 1994; Lannoo & Vingerhoets, 1997; Lee & Chan, 2000a; Lu & Bigler, 2002; Matthews et al., 1999; Parsons et al., 1964; Portin et al., 1995; Saxton et al., 2000; Stanton et al., 1984; Stuss et al., 1987; Vlahou & Kosmidis, 2002; Wiederholt et al., 1993); however, a few studies did not find a significant correlation between education and TMT scores (Fastenau, 1998; Ivnik et al., 1996; Wahlin et al., 1996; Yeudall et al., 1987). 
Heaton and colleagues (1986), assessing the combined effect of age and education on TMT part B in normal participants, documented a significant interaction, suggesting that for individuals less than 60 years old lower levels of education are associated with greater amounts of age-associated impairment and for those more than 60 years old level of education has less of an effect than for younger individuals. Another aspect of the age/education interaction in reference to part B performance was presented by Richardson and Marottoli (1996). The mean performance for community-residing elderly participants with less than
12 years of education was stable across younger-old (76-80) and older-old (81-91) age groups; it was considerably lower than for the better-educated participants in this study and well below expectation in comparison to the Heaton et al. (1991) norms. However, for participants with 12 or more years of education, performance for the younger-old age group was superior to that of the older-old group and comparable to the Heaton et al. (1991) norms. The relationship between TMT scores and education in brain-damaged samples is more equivocal, with some investigators documenting an association (Anthony et al., 1980) and others failing to detect a relationship (Finlayson et al., 1977) or reporting a statistically significant but clinically negligible association (Corrigan & Hinkeldey, 1987). General intelligence (in the majority of studies expressed as Full Scale IQ) has consistently been found to be related to TMT outcome in both patients and normal participants, with higher intellectual levels being associated with superior test scores (Anthony et al., 1980; Boll & Reitan, 1973; Corrigan & Hinkeldey, 1987; Dodrill, 1987; Giovagnoli, 1996; Goul & Brown, 1970; Kennedy, 1981; Lamberty et al., 1994; Parsons et al., 1964; Siegert & Cavana, 1997; Tremont et al., 1998; Waldmann et al., 1992; Warner et al., 1987; Wiens & Matarazzo, 1977). However, information regarding whether TMT data are more related to Performance IQ (PIQ) vs. Verbal IQ (VIQ) has been contradictory. For example, Yeudall et al. (1987) found a significant correlation between PIQ and both parts A and B but no relationship between TMT scores and VIQ. Conversely, Wiens and Matarazzo (1977) found a significant correlation between part B and VIQ but not PIQ. 
However, their data should be viewed with caution due to unexpected and unexplainable findings; specifically, a significant positive correlation was found between PIQ and part A (the higher the PIQ, the longer it took participants to complete the task), and significant correlations were found for one control group but not the second one. The literature has generally indicated that there are no gender differences in TMT scores in normal participants (Dodrill, 1979; Fromm-Auch & Yeudall, 1983; Heaton et al., 1986, 1991; Ivnik et al., 1996; Stuss et al., 1987; Wahlin et al., 1996; Waldmann et al., 1992; Yeudall et al., 1987). The studies that found gender effects differed as to which gender performed better, and the differences in performance between the genders were small: Davies (1968) and Wiederholt et al. (1993) found that men scored higher than women on part B; Portin et al. (1995) reported superiority of men on part A, whereas Arbuthnott and Frank (2000) reported superiority of women on part A; Bornstein (1985) and Saxton et al. (2000) reported that women outperformed men on both parts A and B; and Gaudino et al. (1995) found significant superiority of women on experimental versions of the test. Multiple regression analyses of the effects of age, IQ, gender, and education on TMT scores in normal individuals by Greene and Farr (1985) suggested that age accounts for most of the variance, followed by FSIQ, for both parts A and B. Gender was a contributor to part B performance only, while education had a negligible association with both parts. In contrast, Heaton et al. (1991) found that while age accounted for the most test score variance on parts A and B, 16%-27% of the unique test score variance was attributable to educational level; gender did not account for any appreciable test score variance. The relationship between FSIQ and TMT scores was not assessed in this study. The Greene and Farr (1985) and Heaton et al. (1991) data appear to be contradictory regarding the effect of education on TMT performance. However, the findings can be reconciled if the contribution of education to TMT scores is viewed as occurring through its association with IQ. Hays (1995) reported a considerable effect of intelligence level and age on TMT performance, demonstrated by multiple regression analysis of data collected on a sample of 661 psychiatric inpatients. 
The authors provided normal standard score conversions from raw scores and information derived from the regression analysis, allowing correction of TMT scores for IQ and age. The effect of culture and acculturation on TMT performance was examined in several
studies. Arnold et al. (1994) reported a significant effect of acculturation on TMT, and Manly et al. (1998) found that black English use was associated with poor performance on Trails B. Soukup et al. (1998) examined the effects of demographic and other variables on TMT performance and compiled normative data for adolescents, adults aged 20-54 years, and older adults aged 55-85 years, based on a review of relevant studies. The authors advocated the use of sample-specific normative comparisons and emphasized the importance of considering the influence of demographic and other variables on test performance. Medical condition and physical status have been shown to affect TMT performance. The data for a large sample used in a multicenter prospective study of risk factors for osteoporotic fractures suggest that TMT part B performance is associated with bone mineral density (Yaffe et al., 1999b), diabetes (Gregg et al., 2000), depression (Yaffe et al., 1999a), exposure to toxic chemicals (Mo~ et al., 2001), and apolipoprotein E phenotype (Yaffe et al., 1997) in females, as well as with sex hormone levels in both males and females (Yaffe et al., 1998, 2002). Binder et al. (1999) found that TMT part B performance was significantly associated with objective evaluation of physical function in a sample of 125 elderly participants. In addition, Kilander et al. (2000) reported a strong relationship between diastolic blood pressure at age 50 and performance on the TMT part B 20 years later in a population-based study conducted in Sweden on a sample of 502 men.
METHOD FOR EVALUATING THE NORMATIVE REPORTS To adequately evaluate the TMT normative reports, six key criterion variables were deemed critical. The first five of these relate to subject variables, and the remaining dimension refers to a procedural issue. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if gender, geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since a strong effect of age on TMT performance has been demonstrated in the literature.
Reporting of Education Levels
Given the association between education and TMT scores, information regarding educational level should be reported for each subgroup, and preferably normative data should be presented by educational levels.
Reporting of Intellectual Levels
Given the relationship between TMT performance and IQ, information regarding intellectual level should be reported for each subgroup, and preferably normative data should be presented by IQ levels.
Procedural Variables
Data Reporting
Means and standard deviations, and preferably ranges, for total time in seconds for each part of the TMT should be reported. Given the demonstrated utility of the B-A difference, B:A ratio, and error analysis with some
TRAILMAKING TEST clinical groups, reporting of means and SDs for these indices would facilitate interpretation of the results.
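As an illustration of the derived indices named under Data Reporting, the following minimal Python sketch computes the B-A difference and B:A ratio for individual examinees and then summarizes them as a mean and SD. The function names and the three pairs of completion times are hypothetical and are not taken from any study reviewed here; the SD uses the sample (n - 1) denominator, which is an assumption about reporting practice.

```python
from statistics import mean, stdev


def derived_indices(part_a_sec, part_b_sec):
    """Return the B-A difference and B:A ratio for one examinee (times in seconds)."""
    if part_a_sec <= 0 or part_b_sec <= 0:
        raise ValueError("completion times must be positive")
    return {
        "b_minus_a": part_b_sec - part_a_sec,
        "b_to_a_ratio": part_b_sec / part_a_sec,
    }


def summarize(values):
    """Return (mean, sample SD) for a list of scores."""
    return mean(values), stdev(values)


# Hypothetical completion times in seconds: (part A, part B) per examinee.
sample = [(30.0, 75.0), (25.0, 75.0), (40.0, 100.0)]

differences = [derived_indices(a, b)["b_minus_a"] for a, b in sample]
ratios = [derived_indices(a, b)["b_to_a_ratio"] for a, b in sample]

diff_mean, diff_sd = summarize(differences)
ratio_mean, ratio_sd = summarize(ratios)
```

Reporting these summaries alongside the part A and part B means would let a reader compare an individual's derived scores with the sample, in the spirit of the criterion above.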
SUMMARY OF THE STATUS OF THE NORMS
Our review of the literature located TMT normative reports for adults, as well as three interpretive guides for the HRB (Gilandas et al., 1984; Golden et al., 1981b; Reitan & Wolfson, 1985). Hundreds of other studies have also reported control subject data, and we have included a discussion of those investigations based on well-defined samples that involved some unique features, such as large sample size, retest data, elderly population, cutoff score analysis, reporting of derived measures, error analysis, etc.
It should be noted that Russell and Starkey (1993) developed the Halstead-Russell Neuropsychological Evaluation System (HRNES), which includes the TMT among 22 tests. In the context of this system, individual performance is compared to that of 576 brain-damaged participants and 200 participants who were initially suspected of having brain damage but had negative neurological findings. Data were partitioned into seven age groups and three education/IQ levels. The authors published an appendix to the manual (HRNES-R; Russell and Starkey, 2001), which contains tables of scale scores based on the original HRNES norms, demographic corrections, and regression-based predicted scores. These data will not be reviewed in this chapter because the "normal" group consisted of the Veterans Administration patients who presented with symptoms requiring neuropsychological evaluation. For further discussion of the HRNES system, see Lezak et al. (2004, pp. 676-677).
There is a great deal of variability in the methodological aspects of studies summarized in this chapter. Sample sizes vary from 19 to over 700. Age represented in the studies varies from 15 to over 90 years. Sample compositions have been diverse and have included neurologically normal individuals (according to stringent exclusion criteria), job applicants,
medical/psychiatric patients, V.A. inpatients and outpatients, and homosexual/bisexual males. Similar concerns regarding variability between studies were raised by Soukup et al. (1998). In addition, some investigators set the maximum time for both parts A and B, which varies from 180 to 300 seconds. The majority of studies report mean age, education, and gender distribution for the sample and/or for the age groups. Some studies report WAIS-R IQs or estimated intelligence level, handedness, occupational level, and ethnic composition. Many studies present data divided into age groups. Few studies classify participants into education or IQ groups or present data for males and females separately; few studies report data for males only or present data in age by education by gender cells. Geographical origin of the data also varies widely: British, Australian, and Canadian data sets are presented in this chapter. Data for other cultural groups are also available in the literature (see above). The data are most commonly reported as time to completion for parts A and B. Some studies present raw data converted to T scores, error rate, percentile ranks, median time, total time for parts A and B, B-A difference, and B:A ratio. One study provides regression equations to correct raw data for age and education. Few studies present classification rates for different cutoff criteria. Test-retest data are reported in some studies with intertrial intervals ranging from 1 week to 24 weeks (in some studies, which are not reviewed for the purposes of data collection, up to 2 years). Issues of reliability and/or practice effect are discussed in these studies. Given that use of the TMT has typically been within the context of the HRB, the Reitan data and interpretation recommendations will be reported first, followed by a summary of the other interpretation formats and then by the normative publications and control data from clinical studies, presented in ascending chronological order. 
The text of study descriptions contains references to the corresponding tables identified by number in Appendix 4. Table A4.1, the locator table, summarizes information
provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES
Reitan and Wolfson, 1985
The authors provided general guidelines for TMT score interpretation in the form of test completion times (in seconds), which correspond to "severity ranges" for part B only:
0-60 sec: perfectly normal (or better than average)
61-72 sec: normal
73-105 sec: mildly impaired
≥106 sec: seriously impaired
No other information was provided, such as score means, SDs, or any data regarding the normative sample on which these guidelines were developed. These cutoffs represent a substantial departure from cutoffs published earlier; the definition of normal performance here is approximately 20 seconds less than in the 1958 and 1979 guidelines.
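The part B severity ranges quoted from Reitan and Wolfson (1985) amount to a simple lookup, sketched below in Python purely to make the band boundaries explicit. The function name is hypothetical, the final band is assumed to begin at 106 seconds, and fractional times (for example, 60.5 seconds) are assigned to the next slower band, which is an assumption the source does not address.

```python
def part_b_severity(seconds):
    """Map a TMT part B completion time (seconds) to the quoted severity range."""
    if seconds < 0:
        raise ValueError("completion time cannot be negative")
    if seconds <= 60:
        return "perfectly normal (or better than average)"
    if seconds <= 72:
        return "normal"
    if seconds <= 105:
        return "mildly impaired"
    # Assumed lower bound of the final band: 106 seconds.
    return "seriously impaired"


# Hypothetical examinee times in seconds.
labels = [part_b_severity(t) for t in (58, 70, 90, 120)]
```

Because the lookup ignores age, education, and IQ, it reproduces the limitation of these guidelines rather than resolving it.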
Considerations regarding use of the study
The authors argued that these norms were meant as "general guidelines" and that "exact percentile ranks corresponding with each possible score are hardly necessary because the other methods of inference are used to supplement normative data in clinical interpretation of results of individual participants" (p. 97). However, we maintain that more precise scores as well as separate normative data for different age, IQ, and educational levels are necessary to avoid false-positive errors in diagnosis.
Gilandas, Touyz, Beumont, and Greenberg, 1984 (p. 102)
The authors provided the percentile ranks associated with Davies' (1968) TMT normative data and concluded that a percentile rank of 25 is "mildly suggestive of brain damage" and scores at the 10th percentile and lower are "moderately suggestive of brain damage."
¹Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
Golden, Osmon, Moses, and Berg, 1981b (pp. 22-23)
The authors provided recommendations regarding the detection of laterality of brain damage: part A is generally considered more a measure of right hemisphere integrity (i.e., visual scanning skills), where part B is more indicative of left hemisphere intactness (i.e., language symbol manipulation and direction of behavior according to a complex plan). Therefore when one part indicates impairment relative to the other part, a lateralized injury may be present. ... Part A is considered to indicate greater impairment if the score on part B is less than twice the score on part A. Part B indicates greater impairment if its score is more than three times the score on part A. Tests in which the part B score lies between two times and three times the part A score suggest that performances on the two parts are essentially equal. However, lateralizing properties of performance time ratios for two conditions have been repeatedly refuted in the literature (Hom & Reitan, 1990; Salthouse et al., 1996).
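The arithmetic of the ratio rule described by Golden et al. (1981b) can be written out as a short Python sketch. It is given only to make the thresholds concrete and is not an endorsement, since the lateralizing interpretation has been refuted as noted above. The function name and return labels are hypothetical.

```python
def part_comparison(part_a_sec, part_b_sec):
    """Compare parts A and B using the 2x and 3x thresholds described in the text."""
    if part_a_sec <= 0 or part_b_sec <= 0:
        raise ValueError("completion times must be positive")
    if part_b_sec < 2 * part_a_sec:
        return "part A relatively more impaired"
    if part_b_sec > 3 * part_a_sec:
        return "part B relatively more impaired"
    # Part B lies between two and three times part A (inclusive).
    return "parts essentially equal"
```

For a hypothetical examinee with 30 seconds on part A, a part B time of 50 seconds falls below the 2x threshold, 75 seconds falls between the thresholds, and 100 seconds exceeds the 3x threshold.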
[TMT.1] Davies, 1968 (Table A4.2)
The author published TMT data on 540 British participants as a part of her investigation of the influence of age on TMT performance. Test scores were obtained on 50 men and 40 women in each of six decade age groups. The reference Davies cited as containing a further description of her subject sample could not be located. Mean times in seconds corresponding to 10th, 25th, 50th, 75th, and 90th percentile ranks for parts A and B are provided for each age decade, with the exception that the data on the participants in their 20s and 30s were collapsed. Davies also reports optimal cutoff points for young vs. middle-aged individuals. No significant gender differences were observed within any specific decade, although in the group as a whole men performed slightly but significantly more quickly on part B.
Study strengths
1. Presentation of the data in 10- or 20-year age intervals.
2. Very large sample size, large Ns within each age subgroup, and fairly equal representation of males and females.
Considerations regarding use of the study
1. Lack of IQ and education data or description of exclusion criteria.
2. Lack of test score SDs.
3. Tested in England, which may limit generalizability for clinical interpretive purposes in the United States.
[TMT.2] Goul and Brown, 1970 (Table A4.3)
The authors tested 103 (or 106) Canadian workers' compensation board non-brain-injured patients who had been hospitalized for at least 3 months. These data were collected as a part of the authors' analysis of the effects of age and intelligence on TMT performance. Participants had negative neurological histories and included amputees, burn victims, and patients with lumbosacral fusions. Educational levels ranged from 6 to 13 years of formal schooling; no means are reported. Participants were classified into five age groups: 20-29, 30-39, 40-49, 50-59, and 60-72. Individual group sizes ranged from 15 to 26. Mean (SD) WAIS FSIQs for the five groups were 103.8 (12.1), 110.1 (8.9), 105.3 (7.9), 112.7 (8.6), and 104.2 (12.2), respectively. TMT parts A and B data are presented in terms of mean time in seconds, SDs, ranges, medians, and recommended cutoff scores for the five age groups. Performance declined significantly with age. Contrary to expectations, IQ was significantly positively correlated with TMT scores.
Study strengths
1. Presentation of the data by age groups.
2. Information on mean IQ and SD for each age group.
3. Information on educational level and geographic area provided.
4. Means and SDs are reported.
Considerations regarding use of the study
1. Participants were medical patients with extensive hospitalizations.
2. Lack of data regarding education means.
3. Some variability in IQs across age groups.
4. Small sample sizes in the upper age ranges.
5. Data were collected in Canada, raising questions regarding their usefulness for clinical interpretation in the United States.
[TMT.3] Wiens and Matarazzo, 1977 (Table A4.4)
The authors collected TMT data on 48 male applicants to a patrolman program in Portland, Oregon, as a part of an investigation of the WAIS and Minnesota Multiphasic Personality Inventory (MMPI) correlates of the Halstead-Reitan battery. All participants passed a medical exam and were judged to be neurologically normal. Participants were divided into two equal groups, which were comparable in age (23.6 vs. 24.8 years), education (13.7 vs. 14.0 years), and WAIS FSIQ (117.5 vs. 118.3). TMT mean time in seconds and SDs are provided for each group. A random subsample of 29 of the applicants was readministered the TMT 14-24 weeks following the original administration. Means and SDs for TMT times in seconds for both the original testing and retest are reported. None of the 29 original participants obtained scores lower than Reitan's suggested cutoff for part B; however, one subject fell below the recommended cutoff for part A on the second administration. Correlations between test performance and IQ scores were not meaningful. In the first group, significant negative correlations were obtained between part B performance and FSIQ and VIQ, but no significant correlations were obtained between the second control group and IQ measures. In group 1, a significant negative correlation between part A and FSIQ and a significant positive correlation between part A and PIQ were documented; again, no significant correlations were obtained between part A scores and IQ measures in the second control group.
Study strengths
1. Demographic characteristics of the sample are presented in terms of gender, age, education, IQ, recruitment procedures, and geographic area.
2. Adequate medical exclusion criteria.
3. Means and SDs are reported.
4. The data are provided in a restricted age range.
5. Information on test-retest performance is provided.
Considerations regarding use of the study
1. High IQ level.
2. Relatively small sample size.
3. All-male sample.
[TMT.4] Eson, Yen, and Bourke, Personal Communication (Table A4.5)
The authors collected normative data for the TMT on a sample of 63 older participants. Mean time in seconds and SDs are provided for four age groups, with mean ages of 63.2, 67.0, 72.0, and 78.3 years. Sample sizes for each age group range between 15 and 16. No other information is provided, such as exclusion criteria or demographic data.
Study strengths
1. Data on an elderly sample are provided and stratified by age group.
2. Means and SDs are reported.
Considerations regarding use of the study
1. No reported exclusion criteria or other demographic, IQ, or geographic data.
2. Age range and SDs for each group are not reported.
3. Relatively low sample sizes.
[TMT.5] Harley, Leuthold, Matthews, and Bergs, 1980 (Table A4.6)
The authors collected TMT data on 193 V.A.-hospitalized patients in Wisconsin, ranging in age from 55 to 79. Exclusion criteria included FSIQ less than 80, active psychosis, unequivocal neurological disease or brain damage, and serious visual or auditory acuity problems. Patients with chronic brain syndrome were included. Patient diagnoses were as follows: chronic brain syndrome unrelated to alcoholism (28%), psychosis (55%), alcoholism (37%), neurosis (9%), and personality disorder (4%). Mean educational level was 8.8 years. The sample was divided into five age groups: 55-59 (n = 56), 60-64 (n = 45), 65-69 (n = 35), 70-74 (n = 37), and 75-79 years (n = 20). Mean educational level and percent included in each of the diagnostic classifications are reported for each age group. The authors also provide test data on a subgroup of 160 participants equated for percent diagnosed with alcoholism across the five age groups. The "alcohol-equated sample" was developed "to minimize the influence that cognitive or motor/sensory differences uniquely attributable to alcohol abuse might have upon group test performance levels" (p. 2). This subsample remained heterogeneous regarding representation of the other diagnostic categories. Mean time in seconds, SD, and ranges are reported for parts A and B for each age interval for the whole sample and for the alcohol-equated sample.
Study strengths
1. Large sample size, with some individual cells of approximately n = 50.
2. Reporting of IQ data, geographic area, age, and education.
3. Data presented in age groupings.
4. Means and SDs are reported.
Considerations regarding use of the study
1. The presence of substantial neurological (chronic brain syndrome), substance abuse, and major psychiatric disorders in the sample.
2. Low educational level, though IQ levels are average.
3. No information regarding gender, but given the V.A. setting, it is likely that most or all of the sample was male.
Other comments
The scores for the two oldest age groups are identical in the whole sample and the alcohol-equated group because these two groups did not have overrepresentation of alcoholics and, thus, did not need to be adjusted.
[TMT.6] Anthony, Heaton, and Lehman, 1980 (Table A4.7)
The purpose of the study was to cross-validate two computer programs designed to determine the presence, location, and process of brain lesions using scores from the HRB and the WAIS. Patients with structural brain lesions and normal controls were compared. The control group consisted of 100 volunteers with no medical or psychiatric problems and no history of head trauma, brain disease, or substance abuse. The study was conducted in Colorado. TMT data are presented in terms of mean times in seconds and SDs for part B only.
Study strengths
1. Information regarding education, IQ, age, and geographic area is provided.
2. Large sample size.
3. Adequate exclusion criteria.
4. Means and SDs are reported.
Considerations regarding use of the study
1. Undifferentiated age grouping.
2. The IQ range is high average.
3. No information is available regarding the gender ratio.
4. Data are provided for part B only.
[TMT.7] Bak and Greene, 1980 (Table A4.8)
The authors gathered TMT data on 30 right-handed Texan participants as a part of an investigation of the effect of age on performance on the HRB and the Wechsler Memory Scale. Participants were equally divided into two age groupings: 50-62 and 67-86. Participants were fluent in English and denied a history of CNS disorders, uncorrected sensory deficits, or illnesses or "incapacities" which might affect test results; participants in poor health were excluded. The mean (SD) ages of the two groups were 55.6 (4.44) and 74.9 (6.04), respectively. Participants in the first group were born between 1916 and 1929, and participants in the second group were born between 1892 and 1912. Nine individuals in the first group were female, and 10 participants in the second group were female. Four WAIS subtests were administered (Information, Arithmetic, Block Design, Digit Symbol); the mean scores on these measures suggested that IQ levels were within the high average range or higher. Mean times in seconds and SDs for parts A and B are presented for the two age groups. Significant differences in performance were
documented between the two groups on both parts of the test.
Study strengths
1. The study provides data on a very elderly cohort not found in other published normative data.
2. Adequate exclusion criteria.
3. Sample composition is well described in terms of age, gender, education, fluency in English, handedness, and geographic area.
4. Means and SDs are reported.
Considerations regarding use of the study
1. Sample sizes are small.
2. High IQ and educational level for the older age grouping.
3. The older age grouping spans nearly two decades and may be too broad for optimal clinical interpretive use.
[TMT.8] Kennedy, 1981 (Table A4.9)
The author collected TMT data on 150 Canadian participants as a part of his analysis of the effects of age on TMT performance. Participants were employees of a mental health center "who represented diverse work roles" randomly selected from five age groups: 20-29, 30-39, 40-49, 50-59, 60-69. Participants were excluded who reported histories of "central nervous system disorders, illnesses, or incapacities which would bias test results;" exclusion criteria were not further specified. Mean education was 13.73, 13.53, 13.11, 11.59, and 12.50 years, respectively; those 50-59 years old were significantly less educated than those 20-29 or 30-39 years old. The Ammons Quick Test was used as an estimate of intelligence level; average estimates for the five groups were 123.43, 127.10, 127.40, 123.30, and 128.54, respectively. Males and females were equally represented in each group. The mean time in seconds and SDs for parts A, B, and A + B for each group are provided. Performance decreased significantly with age, and significant negative correlations between TMT test scores and education and IQ suggest that lower education and IQ are adversely related to test performance.
Study strengths
1. Large sample size, although the individual cells had only 30 participants per cell.
2. Presentation of the data in terms of age groupings.
3. Reporting of education, IQ estimates, gender, and geographic area.
4. Means and SDs are reported.
Considerations regarding use of the study
1. Very high mean intelligence scores.
2. Some variability in educational level across groups, which may have led to some unusual findings; inexplicably, those 60-69 years old performed either as well as or slightly better than those 50-59 years old.
3. Vague exclusion criteria.
4. Lack of reference to ethnicity/language issues and the fact that data were obtained on Canadians, possibly reducing its generalizability for clinical interpretation in the United States.
[TMT.9] Fromm-Auch and Yeudall, 1983 (Table A4.10)
The authors obtained TMT data on 193 Canadian participants (111 male, 82 female) recruited through posted advertisements and personal contacts. Participants are described as "nonpsychiatric" and "nonneurological." Eighty-three percent of the sample were right-handed. Mean (SD) age was 25.4 (8.2) years (range = 15-64). Mean (SD) education was 14.8 (3.0) years (range = 8-26) and included technical and university training. Mean (SD) WAIS FSIQ, VIQ, and PIQ were 119.1 (8.8, range = 98-142), 119.8 (9.9, range = 95-143), and 115.6 (9.8, range = 89-146), respectively. Of note, no subject obtained an FSIQ which was lower than the average range. Mean time in seconds, SDs, and ranges for parts A and B are reported for five age groupings: 15-17, 18-23, 24-32, 33-40, and 41-64 years. Sample sizes range from 10 to 75. The two oldest age groupings had sample sizes less than 20. No gender differences were documented, and male and female data were collapsed.
Study strengths
1. The large overall sample size.
2. Data are partitioned into five age groups.
3. Sample composition is described in terms of IQ, educational level, age, gender, handedness, recruitment procedures, and geographic area.
4. Some psychiatric and neurological exclusion criteria are used.
5. Means and SDs are reported.
Considerations regarding use of the study
1. High intellectual and educational levels of the sample.
2. Sample size for some age groups is very small.
3. Data were collected in Canada, which may limit their usefulness for clinical interpretation in the United States.
4. Essentially no differences in performance were noted between those 18-23 years old and those 24-32 years old, suggesting that use of a single age grouping for 18-32 would have been appropriate.
[TMT.10] Bornstein, 1985 (Table A4.11)
The author collected data on 365 Canadian individuals (178 males and 187 females) recruited through posted notices on college campuses and unemployment offices, newspaper ads, and senior-citizen groups. Participants were paid for their participation. Participants ranged in age from 18 to 69 years, with a mean of 43.3 (17.1) years, and had completed 5-20 years of education, with a mean of 12.3 (2.7) years. Ninety-one and a half percent of the sample were right-handed. No other demographic data or exclusion criteria are reported. Mean time in seconds and SDs for parts A and B are reported for three age groupings (20-39, 40-59, and 60-69 years), two educational levels (less than high school, greater than or equal to high school), and gender, resulting in a total of 12 separate groups. Individual group sample sizes ranged from 13 to 86. Significant correlations were obtained between TMT scores and age and education, suggesting that better performance was associated with younger age and more years of
education. Females generally outperformed males on both parts A and B.
Study strengths
1. Very large overall sample size.
2. Data are stratified by age, gender, and educational level.
3. This data set is unique in that it reports data for participants with less than a high school education.
4. Information on handedness, recruitment procedures, and geographic area is provided.
5. Means and SDs are reported.
Considerations regarding use of the study
1. Individual sample sizes of some cells are small.
2. Lack of any reported exclusion criteria.
3. Data were collected on Canadian citizens, which may limit generalizability for their use in the United States.
4. Lack of IQ data. The concern over the lack of IQ data is somewhat mitigated by the fact that the mean education level was not unduly elevated (12.3 years), which might suggest that mean intellectual levels were within the average range.
[TMT.11] Heaton, Grant, and Matthews, 1986 (Table A4.12)
The authors obtained TMT data on 553 normal controls in Colorado, California, and Wisconsin as a part of an investigation into the effects of age, education, and gender on HRB performance. Nearly two-thirds of the sample were male (males = 356, females = 197). Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants ranged in age from 15 to 81 years, with a mean of 39.3 (17.5) years, and mean education was 13.3 (3.4) years, with a range of 0-20 years. The sample was divided into three age categories (less than 40, 40-59, and greater than or equal to 60 years) with 319, 134, and 100 participants, respectively, and into three education categories (less than 12 years, 12-15 years, and greater than or equal to 16 years) with 132, 249, and 172 participants, respectively.
Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. The TMT mean time in seconds for part B is reported for the six subgroups, as well as percent classified as normal using Russell et al.'s (1970) criteria. Approximately 30% of the test score variance was accounted for by age and approximately 20% was associated with education level. Significant group differences in TMT scores were found across the three age groups and across the three education groups, and a significant age-by-education interaction was documented. No significant differences in performance were found between males and females.
Study strengths
1. Large size of overall sample and individual cells.
2. Information regarding age, education, gender, handedness, and geographic area is provided.
3. Adequate exclusion criteria.
4. Data are grouped by age and educational level.
Considerations regarding use of the study
1. No reporting of data for part A.
2. SDs are not provided.
3. Mean scores are reported for individual WAIS subtest scaled scores but not for overall IQ scores.
4. Age groupings are quite large in terms of ranges.
[TMT.12] Alekoumbides, Charter, Adkins, and Seacat, 1987 (Table A4.13)
The authors report data on 118 medical and psychiatric inpatients and outpatients without cerebral lesions or histories of alcoholism or cerebral contusion from V.A. hospitals in southern California as a part of their development of standardized scores corrected for age and education for the HRB. Among the 41 psychiatric patients, nine were diagnosed as psychotic and 32 as neurotic. In addition to psychiatry services, patients were drawn from medicine (n = 57), neurology (n = 22), spinal cord injury (n = 9), and surgery (n = 6) units. Mean age was 46.85 (17.17) years, ranging
from 19 to 82 years, and mean education was 11.43 (3.20) years, ranging from 1 to 20 years. Frequency distributions for age and years of education are provided. Mean WAIS FSIQ, VIQ, and PIQ were within the average range: 105.89 (13.47), 107.03 (14.38), and 103.31 (13.02), respectively. Means and SDs for individual age-corrected subtest scores are also reported. All participants except one were male; the majority were Caucasian (93%), with 7% African-American. The mean score on a measure of occupational attainment was 11.29. No differences were found in test performance between the two psychiatric groups and the nonpsychiatric group, and the data were collapsed. Mean times to complete parts A and B in seconds and SDs are reported. In addition, regression equation information to allow correction of raw scores for age and education is included.
Study strengths
1. Large sample size.
2. Information regarding IQ, age, education, ethnicity, gender, occupational attainment, and geographic area is provided.
3. Regression equation for computation of age- and education-corrected scores is provided.
4. Means and SDs are reported.
Considerations regarding use of the study
1. The sample was heterogeneous in terms of medical diagnoses; psychiatric patients were included in this sample, which was supposedly representative of "normal" participants.
2. Undifferentiated age range (mitigated by the regression equation information).
3. Nearly all-male sample.
[TMT.13] Bornstein, Baker, and Douglass, 1987a (Table A4.14)
The authors collected TMT test-retest data on 23 volunteers (14 women, nine men) who ranged in age from 17 to 52, with a mean age of 32.3 (10.3), as part of an examination of the short-term retest reliability of the HRB. Exclusion criteria consisted of a positive
history of neurological or psychiatric illness. Mean Verbal IQ was 105.8 (10.8), ranging from 88 to 128, and mean Performance IQ was 105.0 (10.5), ranging from 85 to 121. Participants were administered the HRB in standard order both on initial testing and again 3 weeks later. Means, SDs, and ranges for time in seconds to complete parts A and B for both testing sessions are provided, as well as raw score change and SD, median raw score change, and mean percent of change. For part A, no significant correlations between mean change and age or education or between mean percent of change and age or education were documented. For part B, no significant correlations between mean change and age or education or between mean percent change and education were found; however, a significant correlation did emerge between mean percent of change and age.
Study strengths
1. Information on short-term (3-week) retest data is provided.
2. Sample composition is described in terms of age, VIQ, PIQ, and gender.
3. Minimally adequate exclusion criteria.
4. Means, SDs, and ranges are reported.
Considerations regarding use of the study
1. Undifferentiated age range.
2. Small sample size.
3. No data on educational level.
[TMT.14] Dodrill, 1987 (Table A4.15)
The author collected TMT data on 120 participants in Washington during the years 1975-1976 (n = 81) and 1986-1987 (n = 39). Half of the sample was female, and 10% were minorities (six black, three Native American, two Asian American, one unknown). Eighteen were left-handed, and occupational status included 45 students, 37 employed, 26 unemployed, 11 homemakers, and one retiree. Participants were recruited from various sources, including schools, churches, employment agencies, and community service agencies, and either paid for their participation or offered an interpretation of their abilities. Exclusion criteria were history of "neurologically relevant disease (such as meningitis or
encephalitis);" alcoholism; birth complications "of likely neurological significance;" oxygen deprivation; peripheral nervous system injury; psychotic or psychosis-like disorders; or head injury associated with unconsciousness, skull fracture, persisting neurological signs, or diagnosis of concussion or contusion. Of note, one-third of potential participants failed to meet the above medical and psychiatric criteria, resulting in a final sample of 120. Mean age was 27.73 (11.04) years, and mean education was 12.28 (2.18) years. The participants tested in the 1970s were administered the WAIS, whereas the participants assessed in the 1980s were administered the WAIS-R; WAIS scores were converted to WAIS-R equivalents by subtracting 7 points from the VIQ, PIQ, and FSIQ. Mean FSIQ, VIQ, and PIQ scores were 100.00 (14.35), 100.92 (14.73), and 98.25 (13.39), respectively. The IQ scores ranged from 60 to 138 and reflected a normal distribution. Mean time in seconds and SDs for parts A and B are reported as well as IQ-equivalent scores for various levels of intelligence. Between 10% and 15% of the sample were misclassified as brain-damaged using cutoffs of 39 seconds for part A and 89 seconds for part B.
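The WAIS-to-WAIS-R conversion described for this sample is a constant 7-point subtraction, which the following minimal Python sketch makes explicit. The function name and the example scores are hypothetical; only the 7-point offset comes from the study description, and the sketch is not offered as a general conversion method.

```python
WAIS_TO_WAIS_R_OFFSET = 7  # points subtracted from each IQ score, per the study description


def wais_to_wais_r(viq, piq, fsiq):
    """Return (VIQ, PIQ, FSIQ) as WAIS-R equivalents by subtracting the fixed offset."""
    return tuple(score - WAIS_TO_WAIS_R_OFFSET for score in (viq, piq, fsiq))


# Hypothetical WAIS scores for one participant tested in the 1970s.
converted = wais_to_wais_r(viq=108, piq=105, fsiq=107)
```

Pooling the two cohorts after such a conversion assumes that the offset applies equally across the IQ range, an assumption the study description does not evaluate.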
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria. 3. Sample composition is described in terms of education, IQ, occupation, gender ratio, age, handedness, ethnicity, recruitment procedures, and geographic area. 4. IQ-equivalent scores are provided. 5. Data for different IQ levels are provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. [TMT.15] Ernst, 1987 (Table A4.16) The author obtained TMT data on 110 primarily Caucasian (99%) residents of Brisbane, Australia, aged 65-75. Fifty-nine were female and 51 were male, with a mean educational level of 10.3 years; men and women did not
differ in years of education. Participants were recruited primarily through random selection from the Queensland State electoral roll (n = 97), with the remainder (n = 13) solicited through senior-citizen centers. Exclusion criteria were history of significant head trauma or neurological disease. Nearly one-half of the sample was diagnosed with at least one chronic disease (hypertension = 33, heart disease = 9, thyroid dysfunction = 7, asthma = 5, emphysema = 2, diabetes = 1) for which they were receiving treatment described as "well-controlled." Sixty-six of the participants were receiving medications, primarily for the diseases listed above. The test was administered according to Reitan's instructions. All participants were administered the TMT first, followed by either the Tactual Performance Test or Booklet Category Test. Using the standard cutoffs of 39 seconds and 92 seconds, 48% and 48% of all participants were misclassified as impaired for parts A and B, respectively. Gender differences were not significant; however, education was significantly related to performance on part B. There were no significant effects of chronic disease or medication intake.
Study strengths 1. Large sample size in a restricted age range. 2. Presentation of the data by gender. 3. Sample composition is described in terms of age, education, geographic recruitment area, recruitment procedures, and ethnicity. 4. Information regarding test administration order effects is provided. 5. Means, SDs, and error rates are reported.
Considerations regarding use of the study 1. Approximately half of the participants had at least one chronic illness, and over half were taking prescribed medications. 2. No information regarding IQ. 3. Low mean educational level. 4. Data were collected in Australia and may be unsuitable for clinical use in the United States.
[TMT.16] Stuss, Stethem, and Poirier, 1987 (Tables A4.17 and A4.18)
The authors collected normative data on 60 Canadian English- or French-speaking participants, who were recruited through personal contacts or employment agencies and paid for their participation. Tests were administered in each subject's native language. Participants were tested twice at 1-week intervals. Exclusion criteria were abnormal vision (even after correction); history of substance abuse; presence of medical, neurological, and/or psychiatric disorders; and current use of psychotropic medication (Stuss, personal communication). Ten participants were assigned to each of six age ranges: 16-19, 20-29, 30-39, 40-49, 50-59, and 60-69. Fifty-five percent of the sample were male, and 18% were left-handed. Mean education was 14.3 (2.62) years. Data are provided regarding handedness, gender distribution, and education. Mean time in seconds and SDs for the two parts of the TMT for the first, second, and combined testing sessions are reported for each age interval. Mean time and SDs are also provided for males, females, those with less than or equal to 12 years of education, and those with greater than 12 years of education, collapsed across age groupings. Older participants and those with a high school education or less performed significantly more poorly than younger participants or those with some college or university education. Educational level was somewhat irregularly distributed across age groups, and the authors suggest that the normative data be used with caution. A practice effect was present, but the authors question the clinical relevance of the improvement. No significant gender differences in performance were present. Study strengths 1. Presentation of the data by age groupings, education groupings, and gender. 2. Extensive information on educational level. 3. Sample composition is described in terms of age, gender, handedness, geographic location, and recruitment procedures.
4. Adequate exclusion criteria. 5. Information regarding practice effect. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Small sample sizes within each age group. 2. Variability in mean educational levels across age groups; of importance, those 50-59 years old had the lowest mean educational level, the lowest mean test scores, and the largest SDs relative to the other age groups. 3. Lack of IQ data. 4. Unknown influence of language differences. 5. Data were obtained in Canada and may be of limited usefulness for clinical interpretation in the United States. [TMT.17] Yeudall, Reddon, Gill, and Stefanyk, 1987 (Table A4.19)
The authors obtained TMT data on 225 Canadian participants recruited from posted advertisements in workplaces and personal solicitations. The participants included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified average students in grades 10-12 for participation. The participants (127 males and 98 females) did not report any history of forensic involvement, head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Data were gathered by experienced testing technicians who "motivated the participants to achieve maximum performance" partially through the promise of detailed explanations of their test performance. Means and SDs for time in seconds to complete parts A and B are presented for four age groupings (15-20, 21-25, 26-30, and 31-40) for males and females combined and separately. Information regarding percent right-handers, mean years of education, and mean WAIS/WAIS-R FSIQ, VIQ, and PIQ is reported for each age grouping and age-by-gender grouping. For the sample as
a whole, 88% were right-handed and had completed an average (SD) of 14.55 (2.78) years of schooling. The mean FSIQ, VIQ, and PIQ were 112.25 (9.83), 114.77 (10.34), and 108.50 (10.34), respectively. Study strengths 1. Large sample size. 2. Grouping of data by age. 3. Data availability for a 15-20 year age group. 4. Adequate medical and psychiatric exclusion criteria. 5. Information regarding age, handedness, education, IQ, gender, occupation, recruitment procedures, and geographic area is provided. 6. Means and SDs are reported. Considerations regarding use of the study 1. High educational level of the sample. 2. Data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States due to possible subtle cultural differences. Other comments No significant correlations were found between age or education and part A; a significant correlation emerged for age and part B (r = 0.27), but no significant relationship was documented between education and part B. Significant correlations emerged between parts A and B and PIQ but not VIQ. No significant gender differences were observed for part A or B. The authors recommend use of the combined age group norms for part A and the separate age-grouped norms for part B. [TMT.18] Bornstein and Suga, 1988 (Table A4.20)
As part of their evaluation of the effect of educational level on neuropsychological test performance in the elderly, the authors report TMT data on 134 healthy elderly Canadian paid volunteers aged 55-70 according to three educational levels: ≤10 (n = 46), 11-12 (n = 44), and greater than 12 (n = 44) years. Nearly two-thirds of the sample were female (n = 85). The average (SD) age for the sample was 62.7 (4.3), and the mean ages of the three educational groups were comparable: 62.3, 62.9, and 63.0 years, respectively. Exclusion criteria were history of neurological or psychiatric disorder. Significant group differences in performance on both TMT A and B were obtained across the three education groups, which were due to the group with ≤10 years of education performing significantly worse than both of the other education groups (which did not differ from each other). Mean time in seconds and SDs for parts A and B are reported for the three education groups.
Study strengths 1. Large overall sample size and individual cell sizes are adequate. 2. Data are partitioned into three education groups; the study is unique in terms of representation of participants with less than 12 years of education. 3. Information regarding gender, age, and geographic area is provided. 4. Means and SDs are reported. 5. Minimally adequate exclusion criteria. 6. Reasonably restricted age grouping. Considerations regarding use of the study 1. No information regarding IQ. 2. Greater than 12 years of education is too large a category. 3. Data collected in Canada, which may limit generalizability for use in the United States. [TMT.19] Stuss, Stethem, and Pelchat, 1988 (Table A4.21)
In this publication, Stuss and colleagues expanded the data presented in Stuss et al. (1987). The size of the sample is increased, and the participants are collapsed into three age groupings of 30 participants each: 16-29, 30-49, and 50-69. Gender distribution was essentially equal across groups. Mean years of education for the youngest to oldest groups were 14.1 (1.34), with a range of 11-18; 14.9 (3.95), with a range of ~20; and 13.2 (2.38), with a range of 8-18, respectively (compare to TMT.16).
Mean time in seconds and SDs for the two parts on the initial test and retest 1 week later are reported for each age interval. The authors call attention to the skewness and lack of normal distribution of the test data, which they suggest have implications for test score interpretation.
Study strengths 1. Increased sample size per age interval. 2. Adequate exclusion criteria. 3. Information regarding age, education, gender, and handedness is provided. 4. Data regarding retest at 1-week intervals are provided. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Considerations remain the same as for the initial report except for the improvement in sample size.
[TMT.20] Van Gorp, Satz, and Mitrushina, 1990 (Table A4.22)
The authors present TMT data for 156 healthy elderly participants ranging in age from 57 to 85, recruited from an independent-living retirement community in California. The data were collected as a part of their investigation of cognitive changes in normal aging. Information regarding general medical status was collected. Participants with a history of neurological or psychiatric disorder or substance abuse were excluded. Sixty-one percent of the sample were females. Mean education was 14.14 (2.86) years, and mean FSIQ (WAIS-R Satz-Mogel format) was 117.21 (12.59). Mean time in seconds and SDs to complete parts A and B were listed for the sample as a whole and for four age groups: 57-65, 66-70, 71-75, and 76-85. Sample sizes for each age group ranged from 26 to 57. Mean VIQ and PIQ and SDs for each age range are listed. Mean VIQs were consistently within the high average range, except for those 71-75 years old, who fell within the superior range. Mean PIQs were within the high average range for those 57-65 years old and 76-85 years old. Older participants (70 or older) did not differ significantly from younger participants (less than 70) in VIQ, PIQ, or years of education.
Study strengths 1. Large overall sample size. 2. Small age range within each grouping. 3. Adequate exclusion criteria. 4. Information on age, IQ, education, gender, and geographic recruitment area is provided. 5. Means and SDs are reported.
Considerations regarding use of the study 1. High intellectual level of the sample. 2. Relatively high educational level. 3. Inexplicably, part B performance was lower in those 66-70 years old relative to those 71-75 years old, and there appeared to be considerably more variation in performance in the 66-70 age grouping. Given that increasing age is associated with a worsening of performance, the data for the 66-70 year group are problematic. [TMT.21] Heaton, Grant, and Matthews, 1991 The authors provided normative data on the TMT from 486 (378 in the base sample and 108 in the validation sample) urban and rural participants recruited in several U.S. states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data were collected over a 15-year period through multicenter collaborative efforts. Sixty-five percent of the sample were males. Mean age for the total sample was 42.06 (16.8) years, and mean educational level was 13.6 (3.5) years. The majority of participants were administered the WAIS; mean FSIQ, VIQ, and PIQ were 113.8 (12.3), 113.9 (13.8), and 111.9 (11.6), respectively. Exclusion criteria were history of learning disability, neurological disease, illness affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or other substance abuse. The TMT was administered according to procedures outlined by Reitan and Wolfson (1985), with the exception that attempts to complete part B were limited to 10 minutes. In those situations when part B was
discontinued at 10 minutes, the time score was prorated by dividing 300 seconds by the number of items completed and then multiplying the resulting figure by 25. Participants were generally paid for their participation and judged to have provided their best efforts on the tasks. The normative data, which are not reproduced here, are presented in comprehensive tables in T-score equivalents for test scaled scores for males and females separately in 10 age groupings (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80 years) by six educational groupings (6-8, 9-11, 12, 13-15, 16-17, ≥18 years). For part A, 30% of the score variance was accounted for by age, while 16% was attributable to educational level; gender accounted for a negligible amount of unique variance in performance (1%). A total of 35% of the test score variance was accounted for by demographic variables. For part B, 34% of the score variance was accounted for by age, while 27% was attributable to educational level; again, gender accounted for a negligible amount of unique variance (1%). A total of 45% of the test score variance was accounted for by demographic variables. For the sample as a whole, mean time in seconds for part A was 29.0 (12.5) and that for part B was 75.2 (42.8). The interested reader is referred to the Fastenau and Adams (1996) critique of Heaton et al. (1991) norms and Heaton et al.'s (1996a) response to this critique. In 2004, the authors published the revised norms, which are based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African-American and Caucasian).
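The proration rule described above for a discontinued part B can be sketched as follows. This is an illustration of the arithmetic as stated in the text (300 seconds divided by the number of items completed, multiplied by 25), not a reproduction of the scoring procedure in the Heaton et al. manual. The function name, parameter names, and the example value are hypothetical.

```python
def prorate_part_b(items_completed, time_base_sec=300.0, total_items=25):
    """Prorated part B time for a discontinued administration.

    Follows the arithmetic given in the text:
        (time_base_sec / items_completed) * total_items
    The 300-second base and the 25-item total are taken from the text as written.
    """
    if not 0 < items_completed <= total_items:
        raise ValueError("items_completed must be between 1 and total_items")
    seconds_per_item = time_base_sec / items_completed
    return seconds_per_item * total_items


# Hypothetical case: the participant had connected 20 items when testing stopped.
prorated = prorate_part_b(20)
print(f"Prorated part B time: {prorated:.1f} seconds")
```

The prorated value extrapolates the observed per-item pace to all 25 items, so fewer completed items yield a larger (worse) time score.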
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria. 3. Detailed description of the demographic characteristics of the sample in terms of age, education, IQ, geographic area, and gender. 4. Administration procedures are outlined.
5. The normative data are presented in comprehensive tables in T-score equivalents for males and females separately in 10 age groupings by six educational groupings.
Considerations regarding use of the study 1. Above average mean intellectual level (which is probably less of an issue given that these are WAIS rather than WAIS-R IQ data).
[TMT.22] Selnes, Jacobson, Machado, Becker, Wesch, Miller, Visscher, and McArthur, 1991 (Table A4.23)
The investigation used participants from the Multi-Center AIDS Cohort Study (MACS). The article presents data for seronegative homosexual and bisexual males collected in Los Angeles for the purpose of establishing normative data for neuropsychological test performance based on a large sample. Participants with a history of head injury with loss of consciousness greater than 1 hour and those who reported drinking 21 or more drinks per week in the previous 6 months were excluded. The majority of the sample consisted of Caucasian participants. African-American participants ranged from 3.4% to 4.1% for different age groups. Left-handers ranged from 11.3% to 14.9%.
Study strengths 1. The overall sample size and individual cell sizes are large. 2. Normative data are stratified by age and education. 3. The demographic composition of the sample is described in terms of age, gender, sexual orientation, handedness, ethnicity, and geographic area; demographic composition is described for each age and education cell separately. 4. Means, SDs, as well as scores for the 5th and 10th percentiles are presented. 5. Minimally adequate exclusion criteria.
Considerations regarding use of the study 1. All-male sample. 2. No information on IQ is reported. 3. Very high educational level of the sample.
[TMT.23] Elias, Robbins, Walter, and Schultz, 1993 (Table A4.24)
The authors explored the influence of gender and age on performance on tests included in the HRB. The sample consisted of 427 community-dwelling volunteers. As per medical interview and self-report on the Cornell Medical Index, none of the participants had a history of treatment for neurological disorder, senility, alcoholism, brain trauma, mental illness, cerebral vascular or catastrophic disease, or a diagnosis of senile dementia. To achieve equivalence between age groups in terms of education, the lower and upper limits for education were set at 12 and 19 years, respectively. All participants had normal or corrected-to-normal vision. Occupations ranged from blue-collar to professional. Non-age-corrected WAIS Vocabulary scaled scores ranged from 13.9 to 14.7, and Information scores ranged from 13.2 to 13.7. Mean time in seconds and SDs to complete parts A and B were reported for six age groups (15-24, 25-34, 35-44, 45-54, 55-64, and ≥65) for males and females separately. The authors found significant linear trends across age cohorts for parts A and B. Study strengths 1. Large overall sample and adequate sample size for individual cells. 2. The sample composition is well described in terms of age, education, gender, and WAIS Vocabulary and Information scaled scores. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Education and estimated intelligence level for the sample are high. 2. Age range for the oldest group is not reported. [TMT.24] Cahn, Salmon, Butters, Wiederholt, Corey-Bloom, Edelstein, and Barrett-Connor, 1995 (Table A4.25)
The study examined the accuracy of neuropsychological measures at detecting Dementia
of the Alzheimer's Type (DAT) in a community-dwelling elderly sample. The participants are stable, upper middle-class, retired older adults who entered the Rancho Bernardo Study, surveying for heart disease risk factors, between 1972 and 1974. The initial sample included 5,052 adults between 30 and 79 years of age, who have been followed until the present. Participants over the age of 65 who returned for a reexamination in 1988 and later and screened positive for cognitive impairment were seen in clinic for diagnostic purposes (n = 199). A matched control sample of 203 normal elderly participants who screened negative for cognitive impairment was randomly selected for the comprehensive evaluation, which included neurological examination, neuropsychological assessment, standard medical history and examination, and, in some cases, CT scans of the brain. On the basis of the diagnostic evaluation, the group composition was re-assessed. The final sample of normal elderly included 238 participants (97 males, 141 females), with a mean age of 78.4 (6.8), education of 13.8 (2.6), and Dementia Rating Scale (DRS) score of 136.8 (5.4). The TMT was administered as part of a larger battery by a trained psychometrist who was blind to the participants' group assignment. Time to completion was reported for the entire sample. In addition, the authors provided optimal cutoff scores and sensitivity/specificity of the TMT for the diagnosis of DAT: 69%/90% for part A at the cutoff of 66 seconds and 87%/88% for part B at the cutoff of 172 seconds. Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, DRS score, geographic area, history of the project, and recruitment procedures. 3. Rigorous exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Sensitivity and specificity for optimal cutoff scores for the two parts of the test are reported.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. No information on IQ is reported. [TMT.25] Ivnik, Malec, Smith, Tangalos, and Petersen, 1996 (Table A4.26)
The study provides age-specific norms for the TMT obtained in Mayo's Older Americans Normative Studies (MOANS) projects, which aim at obtaining normative data for elderly individuals on different neuropsychological tests. The total sample consisted of 746 cognitively normal volunteers over age 55; however, only 359 volunteers participated in TMT testing. Mean MAYO FSIQ (which differs somewhat from standard WAIS-R FSIQ) for the whole sample was 106.2 (14.0) and mean Mayo General Memory Index on the WMS-R was 106.2 (14.2). For a description of their samples, the authors refer to their earlier publications. Participants were independently functioning, community-dwelling persons who were recently examined by a physician and had no active neurological or psychiatric disorder with the potential to impact cognition. Age categorization used the midpoint interval technique. The raw score distribution for each test at each midpoint age was "normalized" by assigning standard scores with a mean of 10 and SD of 3, based on actual percentile ranks. The authors provided tables of age-corrected norms for each age group. The procedure for clinical application of these data is described in the original article (Ivnik et al., 1996) as follows: first select the table that corresponds to that person's age. Enter the table with the test's raw score; do not use corrected or final scores for tests that might present their own age- or education-adjustments. Select the appropriate column in the table for that test. The corresponding row in the left-most column in each table provides the MOANS Age-Corrected Scaled Score . . . for your subject's raw score; the corresponding row in the right-most column indicates the percentile range for that same score.
Further, linear regressions should be applied to the normalized, age-corrected MOANS scaled scores (A-MSS) derived from the tables, to adjust the patient's score for education. Age- and education-corrected scores for the TMT (A&E-MSS) can be calculated as follows:

A&E-MSS_TMT = K + (W1 * A-MSS_TMT) - (W2 * Education)

where the following indices are specified for the two parts of the TMT:

          K      W1     W2
Part A    1.99   1.10   0.21
Part B    3.38   1.06   0.29
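The education adjustment can be sketched in a few lines. The sketch assumes the formula reads A&E-MSS = K + (W1 * A-MSS) - (W2 * Education) and that the three values listed for each part are K, W1, and W2 in that order; both readings should be confirmed against Ivnik et al. (1996) before any clinical use. The function and dictionary names are hypothetical, and the example patient is invented.

```python
# Indices as listed in the text for each part, read here as (K, W1, W2).
# Assumption: the column order K, W1, W2 is inferred, not confirmed by the source.
TMT_INDICES = {
    "A": (1.99, 1.10, 0.21),
    "B": (3.38, 1.06, 0.29),
}


def age_education_corrected_mss(part, a_mss, education_years):
    """Age- and education-corrected MOANS scaled score (A&E-MSS).

    part            -- "A" or "B"
    a_mss           -- age-corrected MOANS scaled score taken from the tables
    education_years -- years of formal schooling
    """
    k, w1, w2 = TMT_INDICES[part]
    return k + (w1 * a_mss) - (w2 * education_years)


# Hypothetical patient: A-MSS of 10 on both parts, 12 years of schooling.
for part in ("A", "B"):
    score = age_education_corrected_mss(part, a_mss=10, education_years=12)
    print(f"Part {part}: A&E-MSS = {score:.2f}")
```

Under this reading, more years of schooling lower the corrected score for a given A-MSS, reflecting the higher performance expected of better-educated examinees.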
Education should enter the formula in years of formal schooling. The tables of scaled scores per age group provided by the authors should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1996). Therefore, they are not reproduced in this book. Interested readers are referred to the original article. Table A4.26 in Appendix 4 summarizes sample sizes for different demographic groups. Study strengths 1. Information regarding age, education, gender, ethnicity, occupation, recruitment procedures, and geographic area is reported. 2. The data were stratified by age group based on midpoint interval technique. 3. The innovative scoring system was well described. The authors developed new indices of performance. 4. The sample sizes for most groups are large. 5. Restricted age range in each cell. Considerations regarding use of the study 1. The measures proposed by the authors are quite complicated and might be difficult to use in clinical practice. 2. Participants with prior history of neurological, psychiatric, or chronic medical illnesses were included.
Other comments 1. The theoretical assumptions underlying this normative project have been presented in Ivnik et al. (1992a,b). 2. The authors cautioned that the validity of the MAYO indices depends heavily on the match of demographic features of the individual to the normative sample presented in this article. 3. Correlations of parts A and B with age were 0.30 and 0.53, respectively, whereas correlations with education and gender were negligible.
[TMT.26] Richardson and Marottoli, 1996 (Tables A4.27 and A4.28)
The authors report data for 101 autonomously living elderly participants who comprise a subsample of a cohort of participants in Project Safety, a study on driving performance conducted in New Haven, Connecticut. Individuals with a history of neurological disease or excessive use of alcohol or those who were at risk for dementia based on MMSE scores were excluded. The sample includes 53 males and 48 females, with a mean age of 81.47 (3.30) years and mean education of 11.02 (3.68) years. Part B was administered and scored according to the standard instructions provided in the test manual. The data were divided into two age groups of younger-old (76-80 years) and older-old (81-91 years) and two education groups. The results indicated that the mean performance for participants with less than 12 years of education was stable across the younger-old and older-old age groups; however, it was considerably lower than for participants with ≥12 years of education and well below expectation in comparison to the Heaton et al. (1991) norms. For the participants with ≥12 years of education, performance for the younger-old age group was superior to that of the older-old and comparable to the norms published by Heaton et al. (1991). Study strengths 1. Data for a relatively large sample of very elderly participants are presented.
2. Information on age, education, gender, and geographic location is reported. 3. Exclusion criteria are described. 4. The data are classified into two age groups by two education groups. 5. Means and SDs are reported. Considerations regarding use of the study 1. Only part B of the test was administered. 2. No information on IQ is reported. 3. Sample sizes for each age-by-education cell are relatively small.
[TMT.27] Hoff, Riordan, Morris, Cestaro, Wieneke, Alpert, Wang, and Volkow, 1996 (Table A4.29) The authors used the TMT in a study exploring the relationship of cocaine use to performance on neuropsychological tests tapping functions of frontal and temporal brain regions. The performance of crack cocaine users was compared to that of a control group consisting of 54 paid male volunteers with a mean age of 32.1 (9.7) years and mean education of 15.4 (2.4) years. The sample included 48 white, four black, and two Hispanic participants. Exclusion criteria were a history of medical, neurological, or psychiatric problems; more than moderate use of alcohol (12 oz./week); history of intravenous drug use; and self-reported history of learning disability (with enrollment in special education classes). Study strengths 1. Relatively large sample size. 2. The sample composition is described in terms of age, education, and ethnicity. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Wide age and education range. No information on IQ or gender distribution. 2. Recruitment procedures were not reported. 3. Education level for the sample is high.
[TMT.28] Salthouse, Toth, Hancock, and Woodard, 1997 (Table A4.30)
The authors examined controlled and automatic processes underlying memory and attention using the process-dissociation procedure, as well as the uniqueness of age-related influences on these processes. Participants were 115 healthy adults (47% male, 53% female) between the ages of 18 and 78 years, who were recruited from appeals to groups and acquaintances. They were included in the study if they were in "reasonably good health," were not currently students, and had at least 11 years of education. No other exclusion criteria are reported. Participants were administered a battery of neuropsychological tests in their homes. The data were stratified into three age groupings: 18-39 (mean age = 29.0, SD = 4.8; mean education = 15.5, SD = 1.7), 40-59 (mean age = 49.1, SD = 5.1; mean education = 15.2, SD = 2.5), and 60-78 (mean age = 69.2, SD = 5.1; mean education = 15.3, SD = 2.6) years. The TMT was administered according to the standard instructions.
Study strengths 1. Sample size is large. 2. Sample composition is well described in terms of age, education, gender, and various health indices. 3. Recruitment procedures are specified. 4. Data are partitioned into three age groups. 5. Test administration procedures are specified. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Exclusion criteria are not well identified. 2. High educational level for each age group. [TMT.29] Rasmusson, Zonderman, Kawas, and Resnick, 1998 (Table A4.31)
The authors explored the effect of age and dementia status on TMT performance within the scope of the Baltimore Longitudinal Study
of Aging (BLSA). The sample has been recruited continuously since 1958, and participants were asked to return for testing every other year. The majority of the sample are white (37% female); working or retired from scientific, professional, or managerial positions; graduated from college (71%); and married. All participants aged 70 years and older and some younger participants who met specific criteria were seen by a neurologist for a clinical evaluation, who classified participants into three categories: cognitively normal, suspect for early dementia, or dementia. The 667 nondemented participants who were included in the TMT portion of the study were 60 years of age or older at the last visit at which the TMT was administered. The mean age of the sample was 74.4 (8.2) years, mean education 16.0 (2.9) years, mean MMSE score 28.6 (1.6), and mean number of errors on the Blessed Mental Status Exam 1.3 (1.8). The TMT was administered according to the standard procedures. A maximum of 300 seconds was allowed for each part. The authors provided a detailed description of the administration procedures. The authors found a significant effect of age on completion times for both parts A and B. Incidence of errors increased with age only for part B. Dementia status was significantly associated with the proportion of participants making errors on both parts A and B, independent of age. The error rates did not increase over a 2-year longitudinal comparison made on a subset of the nondemented sample. The authors described the sensitivity and specificity of various cutoff scores in distinguishing between nondemented participants and those with cognitive dysfunction, based on receiver operating characteristic (ROC) analyses. They also evaluated, in their sample, the sensitivity and specificity of the optimal dementia cutoff score of 172 seconds on part B previously reported by Cahn et al. (1995). All significant effects were replicated.
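How sensitivity and specificity follow from a single cutoff score can be shown with a minimal sketch. It assumes that a completion time strictly greater than the cutoff is scored as a positive (impaired) result, a convention the text does not specify. The function name and all completion times are hypothetical; they are not data from Rasmusson et al. (1998) or Cahn et al. (1995).

```python
def sensitivity_specificity(impaired_times, normal_times, cutoff_sec):
    """Return (sensitivity, specificity) for a completion-time cutoff.

    Assumption: a time strictly greater than cutoff_sec is a positive result.
    sensitivity = impaired cases correctly flagged / all impaired cases
    specificity = normal cases correctly passed / all normal cases
    """
    if not impaired_times or not normal_times:
        raise ValueError("both groups must contain at least one score")
    true_positives = sum(1 for t in impaired_times if t > cutoff_sec)
    true_negatives = sum(1 for t in normal_times if t <= cutoff_sec)
    return (true_positives / len(impaired_times),
            true_negatives / len(normal_times))


# Hypothetical part B completion times in seconds.
impaired = [180, 250, 160, 300, 200]
normal = [90, 120, 175, 110, 140, 100, 130, 95]

# 172 seconds is the part B cutoff quoted in the text.
sens, spec = sensitivity_specificity(impaired, normal, cutoff_sec=172)
print(f"Sensitivity: {sens:.1%}, specificity: {spec:.1%}")
```

Repeating the calculation over a range of cutoffs and plotting sensitivity against (1 - specificity) gives the points of an ROC curve.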
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures. 3. Mental status was assessed with MMSE and Blessed Mental Status Exam. 4. Adequate exclusion criteria. 5. Performance for very old group (90-96 years) is reported. 6. Test administration procedures are thoroughly described. 7. Means and SDs for the test scores and the percentage of participants who made errors on parts A and B are reported. 8. Data are partitioned by four age groups.
Considerations regarding use of the study 1. Education level for the sample is very high. 2. No information on IQ is reported.
[TMT.30] Miner and Ferraro, 1998 (Table A4.32)
The study examined the role of different information-processing factors and presentation order in TMT performance. The sample consisted of 110 undergraduate students (88 females and 22 males) from the University of North Dakota, with a mean age of 21.7 (5.24) years, who received a course credit for their participation. Their health was assessed with a background information questionnaire and with the Geriatric Depression Scale. The TMT was administered in a counterbalanced order as part of a larger battery. Those participants who received the test in the part B-part A order demonstrated considerably slower performance on part B in comparison to the group tested in the standard order.
Study strengths 1. Relatively large sample. 2. The sample composition is described in terms of age, education, gender, and incentive for participation. 3. Minimally adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Exclusion criteria are not described. 2. No information on IQ is reported.
[TMT.31] Crowe, 1998b (Table A4.33)
The TMT and a series of measures derived from it were administered to 98 undergraduate students from La Trobe University in Melbourne, Australia, in order to examine cognitive mechanisms contributing to performance on both parts. Participants were screened for a history of loss of consciousness or other neuropathology. The mean age for the sample was 23.4 (3.1) years, mean education 14.0 (2.3) years, and mean Wide Range Achievement Test (WRAT) Reading score 101.0 (9.0). The author developed modified procedures in an effort to separate cognitive mechanisms contributing to TMT performance and concluded that visual search and motor speed contributed to performance on part A, whereas visual search and cognitive alternation contributed to performance on part B. The latter was further influenced by reading level, ability to mentally maintain two simultaneous sequences, attention, and working memory. Time to completion for both TMT parts is provided.
Study strengths 1. Large sample size. 2. The sample composition is described in terms of age, education, gender, WRAT Reading score, and geographic area. 3. Minimally adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No information on IQ is reported. 2. High educational level of the sample. 3. The data were obtained on Australian participants, which may limit their usefulness for clinical interpretation in the United States.
[TMT.32] Tremont, Hoffman, Scott, and Adams, 1998 (Table A4.34)
The authors challenged Dodrill's (1997) findings of no relationship between level of intelligence and neuropsychological test performance by presenting data collected from archival files of the University of Oklahoma Neuropsychological Laboratory, stratified by intelligence level. The data included files for 157 patients (71 males and 86 females) between 16 and 74 years of age, with a mean age of 39.38 (15.80) and a mean education of 13.12 (3.26); 143 were Caucasian, nine African-American, and the rest of other races and/or unknown. All patients were evaluated for suspected neurological disease, which yielded no biomedical evidence for brain impairment. The TMT was administered as part of the HRB. The results are stratified by three intelligence levels, based on patients' WAIS-R FSIQ. The authors concluded that performance on both parts of the test was affected by intelligence level, with the greatest impact on part B.
Study strengths 1. Relatively large sample. 2. The sample composition is well described in terms of age, education, gender, VIQ, PIQ, FSIQ, geographic area, and clinical setting. 3. It is presumed that standard administration procedures were used since the TMT was administered as part of the HRB. 4. Means and SDs for the test scores are reported. 5. Data are stratified by intelligence level.
Considerations regarding use of the study 1. Wide age range. 2. Data were collected from patients' files. Though biomedical evidence for brain impairment was negative, this is not a normal sample.
[TMT.33] Basso, Bornstein, and Lang, 1999 (Table A4.35)
The study examined the practice effect on repeated administration of several tests over a 12-month interval. The baseline sample consisted of 82 men recruited through newspaper advertisements, who were not paid for their participation. Fifty men out of this sample returned for the repeated testing 12 months later. The composition of the latter sample was 48 Caucasian, one African-American, and one Hispanic, with a mean age of 32.5 (9.27) years, a mean education of 14.98 (1.93) years, and a mean FSIQ of 109.30 (12.29) at baseline. At each probe, participants were screened for neurological disease, head injury, learning disabilities, or other medical illnesses based on an informal interview. They were also screened for psychiatric disorders through a structured clinical interview. None was excluded based on these screens. The TMT was administered according to standard procedures by thoroughly trained and supervised technicians. The authors compared TMT performance at baseline and on the retest using reliable change indices and concluded that TMT scores did not change on the retest. Performance on the TMT for the two probes is reported for the entire sample.
Study strengths 1. Adequate sample size. 2. The sample composition is described in terms of age, education, gender, ethnicity, FSIQ, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are thoroughly described. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. Education level for the sample is high. [TMT.34] Crews, Harrison, & Rhodes, 1999 (Table A4.36)
A control sample of 30 nondepressed women was used in a study on the effect of depression on executive functions in young women. Control participants were recruited via flyers/sign-up sheets from town and university settings. They did not meet diagnostic criteria according to the ADIS-R and scored within the nondepressed range on the Beck Depression Inventory (BDI). The exclusion criteria were past or present history of neurological problems or psychiatric disorders, alcoholism or drug abuse, learning disabilities, concurrent medication/drug usage, eating disorders, or current medical illness. The TMT was administered according to the standard procedures. Test performance is reported for the entire sample.
Study strengths 1. The sample composition is described in terms of age, education, gender, scores on selected WAIS-R tests, and recruitment procedures. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is small. 2. Education level for the sample is high.
[TMT.35] Dikmen, Heaton, Grant, and Temkin, 1999 (Table A4.37) The TMT was used in a study on the psychometric properties of a broad range of neuropsychological measures, based on a sample of 384 normal or neurologically stable adults who were tested twice as part of several longitudinal studies. A group of "friend controls" consisted of 138 individuals who had no history of recent trauma and were friends of head-injured patients. Their mean age was 28.5 (12.2) years and mean education was 12.2 (1.9) years; 60% of the sample were males, and the test-retest interval was 11.1 (0.6) months. A group of "trauma controls" consisted of 121 individuals who had a recent traumatic injury that did not involve the head. They were tested at baseline 1 month after trauma and then 11 months later. Their mean age was 31.2 (13.6) years and mean education was 12.0 (2.6) years; 70% of the sample were males, and the test-retest interval was 10.7 (0.6) months. Both of these groups were tested at the University of Washington under the direction of one of the authors. Twenty percent of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse or a significant traumatic
brain injury. The rest of the participants in these samples denied any history of conditions that might be expected to affect brain function. The third group, mixed normal controls, consisted of 125 participants who had no history of trauma or disease involving the brain. They were enrolled in longitudinal research projects at multiple sites under the supervision of the neuropsychology laboratories at the University of Colorado and the University of California at San Diego. Their mean age was 43.6 (19.6) years and mean education was 12.0 (3.3) years; 68% of the sample were males, and the test-retest interval was 5.4 (2.5) months. The data are reported for all groups combined. Demographic information for all groups combined is also provided. The mean WAIS FSIQ (Wechsler, 1955) on the initial testing for the three groups combined was 108.8 (12.3). Trails A and B were administered according to the procedures specified by Reitan and Wolfson (1993). Time limits of 100 seconds on Trails A and 300 seconds on Trails B were imposed. The authors provide raw scores for performance at two time probes, as well as various measures of test-retest reliability and magnitude of practice effect. The test-retest reliability over an 11-month interval for Trails A was r = 0.79 and that for Trails B was r = 0.89.
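As a minimal sketch of the two quantities discussed above, the following Python code computes a test-retest reliability coefficient (Pearson r between baseline and retest times) and the mean practice effect from paired scores. The scores are hypothetical, the function names are ours, and the authors' own reliability and practice-effect measures may have been computed differently.

```python
import math
from statistics import mean
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between paired baseline and retest scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ssx * ssy)


def practice_effect(baseline: Sequence[float], retest: Sequence[float]) -> float:
    """Mean retest-minus-baseline change in seconds (negative = faster on retest)."""
    return mean(r - b for b, r in zip(baseline, retest))


# Hypothetical completion times in seconds for five participants.
baseline = [60.0, 75.0, 90.0, 110.0, 140.0]
retest = [55.0, 72.0, 80.0, 100.0, 138.0]

print("test-retest r:", round(pearson_r(baseline, retest), 3))
print("mean practice effect (s):", practice_effect(baseline, retest))
```

Because the TMT is scored as time to completion, a practice effect appears as a negative mean change, that is, faster performance on the second administration.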
Study strengths 1. Large sample sizes for the three groups. 2. The sample composition is well described in terms of age, education, gender, IQ, geographic area, and setting. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Information on test-retest reliability is provided. Considerations regarding use of the study 1. Exclusion criteria are not clearly described. As the authors pointed out, 20% of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. 2. The data are not partitioned by age group. 3. Time limits that deviated from test administration procedures were imposed on test performance. However, these limits should not have had a noticeable effect on the results.
[TMT.36] Binder, Storandt, and Birge, 1999 (Table A4.38) The authors examined the relationship between performance on psychometric tests and a modified Physical Performance Test (modified PPT) in a sample of 125 adults aged 75 years and older, who participated in trials of exercise or hormone replacement therapy. The study was approved by the Washington University School of Medicine, St. Louis. The mean age for the sample was 82.3 (4.4) years, mean education was 13.5 (3.0) years, 25% were male, and 87% were Caucasian. Indices of physical health, Blessed score, and Geriatric Depression Scale score are reported. Preliminary screening included a medical history; physical examination; the Short Blessed Test of memory, concentration, and orientation; blood and urine chemistries; a chest X-ray; and a cross-validated self-report regarding health problems in the previous 12 months. Exclusion criteria were inability to walk 50 feet independently, active medical problems that would contraindicate performance of a graded exercise stress test, inability to complete the graded exercise stress test or the modified PPT, a score greater than 8 on the Short Blessed Test, inability to provide informed consent due to cognitive impairment, and inability to follow the directions for the psychometric tests due to visual or auditory impairments. The standard administration procedure was used except that the maximal allowed time for both parts A and B was 180 seconds. Time to completion and the number of lines correctly drawn within the allotted time were recorded. The authors found that part B performance was significantly associated with total modified PPT score.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, ethnicity, indices of physical health, Blessed score, Geriatric Depression Scale score, geographic area, and research setting. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. No information on IQ is reported.
[TMT.37] Ruffolo, Guilmette, and Willis, 2000 (Table A4.39) Time to completion and number of errors in TMT performance are compared for four clinical/experimental groups and a control group. The latter sample included 49 introductory psychology students, graduate students, and employees of a local social services agency, who were screened for any prior head injuries. The TMT was administered according to standard instructions.
Study strengths 1. Adequate sample size. 2. The sample composition is described in terms of age, education, and setting. 3. Minimally adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores and error rates are reported.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. Education level for the sample is high. 3. No information on gender or IQ is reported.
[TMT.38] Saxton, Ratcliff, Newman, Belle, Fried, Yee, and Kuller, 2000 (Table A4.40)
The TMT was administered as part of the Memory and Aging Study (MAS) conducted as an ancillary project to the CHS, a multicenter observational study of heart disease and stroke in Washington County, Maryland, and Pittsburgh, Pennsylvania. No selection criteria were used. Data were analyzed for a sample of 989 participants (444 males and 545 females), who completed all of the cognitive tests included in the battery. The mean age for the sample was 73.63 (4.45) years, and mean education was 13.23 (2.85) years; 93.9% of the sample were white. This sample was divided into two clinical groups and a "no disease" group, based on cardiovascular status. Times to completion for the TMT for the "no disease" sample of 357 participants are reproduced in Table A4.38. Demographic characteristics for this sample are not reported by the authors. However, we assume that they are similar to the demographics for the entire sample described above.
Study strengths 1. Large sample size. 2. The sample composition is described in terms of age, education, gender, setting, geographic area, and recruitment procedures. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No exclusion criteria. 2. The data are not partitioned by age group. 3. No information on IQ is reported. 4. Demographic characteristics for the "no disease" group are not reported.
[TMT.39] Chen, Ratcliff, Belle, Cauley, DeKosky, and Ganguli, 2000 (Table A4.41)
A control sample of 483 elderly nondemented individuals was derived from a community-based multiwave prospective study, the Monongahela Valley Independent Elders Survey (MoVIES), in southwestern Pennsylvania. The purpose of the study was to identify cognitive measures that are most accurate in discriminating between individuals with presymptomatic DAT and nondemented individuals. The control participants remained nondemented over a 10-year follow-up period. The study protocol included a standardized general medical history and physical examination; a detailed neurological and mental status examination; hematological, metabolic, and serological tests; and neuroimaging when appropriate. Relevant medical records were abstracted. The sample included 302 females and 181 males, with a mean age of 74.9 (4.4) years; 31.9% of participants had less than a high school education. Times to completion for the two parts of the TMT were reported for the entire sample. Results of the ROC analysis suggested that TMT part B was one of the tests that had the highest accuracy in discriminating between nondemented participants and those who were in the preclinical stages of DAT (area under the curve = 0.773).
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, history of the project, and geographic area. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported. 5. Information on the diagnostic accuracy of part B is provided.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. No information on IQ is reported. 3. The number of participants with less than a high school education is reported. However, mean education and SD are not reported.
[TMT.40] Small, Graves, McEvoy, Crawford, Mullan, and Mortimer, 2000 (Table A4.42)
The authors examined the relationship between APOE genotype and cognitive functioning in normal aging based on a sample of 413 adults between 60 and 85 years of age, with a mean age of 72.90 years, who were randomly selected from a larger sample of participants in the community-based, cross-sectional Charlotte County Healthy Aging Study conducted in south Florida. The sample was stratified into two age groups, young-old (60-73 years, n = 202) and old-old (74-85 years, n = 211), and further divided into two groups according to the presence of the APOE-e4 allele. The sample was almost exclusively white. Education, gender distribution, and self-rated indices of health status are reported for each group. Intelligence levels were estimated using the Spot the Word Test. The TMT was administered according to standard procedures.
Study strengths 1. Large sample sizes per group. 2. The sample composition is well described in terms of age, education, gender, geographic area, and research setting. 3. Test administration procedures are specified. 4. Data are stratified by two age groups. 5. Estimated intelligence levels are reported. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Exclusion criteria are not clearly described. 2. Education level for one of the groups is high.
[TMT.41] Stuss, Bisschop, Alexander, Levine, Katz, and Izukawa, 2001 (Table A4.43)
The study examined the relationship of the TMT to focal frontal lobe lesions. Time to completion and number of errors made by the clinical groups with different lesion localizations and the control group were compared. The sample of 19 control participants with a mean age of 53.4 (13.6) years and a mean education of 13.7 (2.5) years was drawn from a general population pool of volunteers. The participants were fluent English speakers with adequate ability to read and had no prior history of any neurological or psychiatric disorders. The TMT was administered according to standard procedures. All participants continued working on the test until they completed the task. Time to completion for both TMT parts, the B-A difference, and the proportional score (B-A)/A are reported in both raw scores and their logarithmic transformations. Only four control participants made one error on part B. The results suggest that error analysis is a more useful method of categorizing performance than time to completion. All patients who made more than one error on part B had frontal lesions.
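The derived measures named above can be written out as a short Python sketch. The function and field names are ours, the example times are hypothetical, and the natural logarithm is an assumption, since the summary does not state which logarithmic base was used.

```python
import math
from dataclasses import dataclass


@dataclass
class TrailsIndices:
    difference: float    # B - A, in seconds
    proportional: float  # (B - A) / A, unitless
    log_a: float         # log-transformed part A time (natural log assumed)
    log_b: float         # log-transformed part B time (natural log assumed)


def derive_indices(time_a: float, time_b: float) -> TrailsIndices:
    """Compute derived TMT indices from completion times in seconds."""
    if time_a <= 0 or time_b <= 0:
        raise ValueError("completion times must be positive")
    return TrailsIndices(
        difference=time_b - time_a,
        proportional=(time_b - time_a) / time_a,
        log_a=math.log(time_a),
        log_b=math.log(time_b),
    )


# Hypothetical completion times for one participant.
example = derive_indices(time_a=30.0, time_b=75.0)
print(example.difference, example.proportional)
```

The proportional score expresses the extra time required by part B relative to part A, so it is less dependent on overall speed than the raw B-A difference.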
Study strengths 1. The sample composition is described in terms of age, education, gender, estimated IQ, and clinical setting. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores and derived measures are reported.
Considerations regarding use of the study 1. The sample size is small. 2. Age range is wide. 3. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
[TMT.42] Bell, Hermann, Woodard, Jones, Rutecki, Sheth, Dow, and Seidenberg, 2001 (Table A4.44)
The TMT was administered as part of a larger battery in a study examining the neurobehavioral status of patients with early-onset temporal lobe epilepsy. The control group included 29 friends, relatives, and spouses of patients (72% female), who were between ages 16 and 60 years, with a mean age of 34.4 (12.5) years; FSIQ (as measured with the WAIS-III 7-subtest short form) between 69 and 110, with a mean FSIQ of 97.7 (6.4); and mean education of 13.0 (1.7) years. Exclusion criteria were current substance abuse, psychotropic medication use, medical or psychiatric condition that could affect cognitive functioning, an episode of loss of consciousness longer than 5 minutes, developmental learning disorder, and repetition of a grade in school. Time to completion for both TMT parts is provided.
Study strengths 1. The sample composition is well described in terms of age, education, gender, FSIQ, and recruitment criteria. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is small and includes a wide age range. 2. The data are not partitioned by age group.
[TMT.43] Stein, Kennedy, and Twamley, 2002 (Table A4.45)
The authors compared cognitive functioning in female victims of domestic violence and nonvictimized women. The control sample included 22 participants who were recruited through posted advertisements and ongoing personal contacts. The study was conducted in San Diego, California. Control participants had no lifetime exposure to a posttraumatic stress disorder Diagnostic and Statistical Manual-IV criterion A stressor, spoke English fluently, and had at least an 8th-grade reading ability. Exclusion criteria were use of any psychotropic medications within 6 weeks before participation, use of oral or intramuscular steroids within 4 months before participation, history of learning disability or attention-deficit disorder, head injury with loss of consciousness greater than 10 minutes, seizure disorder, drug or alcohol use, and history of psychotic illness or neurological disorder. Mean age for the sample was 29.4 (10.7) years and mean education was 13.9 (1.5) years. Time to completion for both TMT parts as well as the B-A difference are provided.
Study strengths 1. The sample composition is described in terms of age, education, gender, geographic area, setting, and recruitment procedures. 2. Rigorous exclusion criteria. 3. Means and SDs for the test scores and the B-A difference are reported.
Considerations regarding use of the study 1. The sample is small. 2. Wide age span. 3. No information on IQ is reported. 4. All-female sample.
[TMT.44] Drane, Yuspeh, Huthwaite, and Klingler, 2002 (Table A4.46)
The purpose of the study was to examine the relationship of TMT time to completion as well as derived indices, such as difference scores and ratio scores, with demographic variables. The sample consisted of 285 adults (205 males and 80 females) between 18 and 90 years of age, who participated in a comprehensive neuropsychological normative project. They were recruited through a variety of civic organizations. Participants did not have any history of known psychiatric or neurological disorder, were living independently, had no history of substance abuse, and were not treated with psychotropic medications at the time of the examination, per clinical interview. All participants performed within the normal range on the MMSE. Mean age for the sample was 48.30 (19.68) years, mean education was 12.98 (2.65) years, and mean MMSE score was 28.63 (1.61). The TMT was administered according to standard procedures. Time to completion, B-A difference, and B:A ratio are reported for eight age groups. The authors evaluated the sensitivity of the B:A impairment cutoff score of 3.0 that was suggested by Lamberty et al. (1994) and concluded that rates of false-positive misclassification are unacceptably high, especially for older age groups.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, MMSE scores, setting, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. The data are partitioned by eight age groups. 6. Means and SDs for the test scores as well as derived indices are reported.
Considerations regarding use of the study 1. Demographic characteristics for age groups are not reported. 2. Overall sample is large, but some individual cells are small. 3. No information on IQ is reported.
[TMT.45] Grady, Yaffe, Kristof, Lin, Richards, and Barrett-Connor, 2002 (Table A4.47)
Data on TMT part B were collected for a subsample of 1,063 older women in a multicenter study examining the effect of hormone replacement therapy on cognitive functioning in postmenopausal women. This is a follow-up on the articles reporting normative data for different subgroups from the same study (Barrett-Connor & Goodman-Gruen, 1999; Kritz-Silverstein & Barrett-Connor, 2002). The participants were younger than 80 years old and had established coronary disease and an intact uterus. They were randomly assigned to treatment vs. placebo groups in a double-blind experiment. They were followed for 4.2 (.04) years. At the end of the trial, cognitive functioning was measured in both groups. The data are reported for 517 participants in the treatment group and 546 in the placebo group, separately. The mean age for the two groups at the time of testing was 66.3 (6.4) and 67.3 (6.3) years, respectively, and mean education was 12.7 (2.7) years for both groups; approximately 90% of the sample were white. There are no notable differences between the groups on any demographic variables or physical indices. Trails B was administered according to standard procedures. The authors concluded that there were no differences between the treatment and placebo groups on any cognitive measures.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, physical findings, clinical setting, and selection criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Participants had established coronary disease. It is unclear if any neurological exclusion criteria were used. 2. All-female sample. 3. The data are not partitioned by age group. 4. No information on IQ is reported.
[TMT.46] Miller, 2003, Personal Communication (Table A4.48)
The investigation used participants from the MACS study. The data were collected from 949 seronegative homosexual and bisexual males for the purpose of establishing normative data for neuropsychological test performance based on a large sample. These data represent an update on the data provided by Selnes et al. (1991). Mean age for the sample was 38.0 (7.5) years and mean education was 16.3 (2.4) years; 91.5% were Caucasian, 3.0% Hispanic, 4.5% black, and 1% other. All participants were native English speakers. The TMT was administered according to standard instructions. The data are partitioned by three age groups (25-34, 35-44, 45-59) times three education levels (<16, 16, >16 years).
Study strengths 1. The overall sample size is large, and most of the individual cells have more than 50 participants. 2. Normative data are stratified by age x education. 3. Information on age, education, ethnicity, and native language is reported. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. All-male sample. 2. No information on IQ is reported. 3. No information on exclusion criteria.
[TMT.47] Tombaugh, 2004 (Table A4.49)
The author provided normative data for 911 community-dwelling adults between 18 and 89 years of age. The data for volunteers who participated in earlier studies were analyzed. Out of this sample, 823 participants were recruited through booths at shopping centers, social organizations, places of employment, psychology classes, and word of mouth. Exclusion criteria were history of neurological disease, psychiatric illness, head injury, or stroke, per self-report; the remaining 88 participants represent a subset of individuals who had received a consensus diagnosis of "no cognitive impairment" made by physicians and clinical neuropsychologists, based on history, clinical and neurological examination, and an extensive battery of neuropsychological tests, over two successive evaluations separated by approximately 5 years. The author pointed out that all participants 18-24 years old were university students. Mean age for the sample was 58.5 (21.7) years, mean education was 12.6 (2.6) years, and the male/female ratio was 408/503. All participants scored above 23 on the MMSE, with a mean of 28.6 (1.5), and below 14 on the Geriatric Depression Scale, with a mean of 4.1 (3.4). Elderly participants were also excluded on the basis of a clinical evaluation of depression. Trails A and B were administered as part of a larger battery according to the Spreen and Strauss (1998) guidelines. The results indicated that test performance for both Trails A and B was affected by age. Performance on Trails B was also related to education, particularly in individuals over 54 years of age. Therefore, tables of raw data and percentiles are stratified into 11 age groups. For ages 55 and above, they are further partitioned into two education levels (≤12 and 12+ years).
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, setting, and recruitment procedures. 3. Rigorous exclusion criteria.
4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data are stratified by age x education.
Considerations regarding use of the study 1. As the author pointed out, the sample size of the oldest group is small. 2. No information on the intellectual level of the sample is reported. 3. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
RESULTS OF THE META-ANALYSES OF THE TRAILMAKING TEST DATA (See Appendix 4m)
Data collected from the studies reviewed in this chapter were combined in regression analyses in order to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analyses and predicted test scores across adult age groups for parts A and B are provided in Appendix 4m. Educational range was unevenly represented, with a large gap between 8.5 and 11.59 years at the lower extreme. Based on the preliminary analyses, the data point with 8.5 years of education was retained in the main analyses but dropped in the analyses generating an education-correction factor (see below). After data editing for consistency and for outlying scores, 28 studies for Trails A and 29 for Trails B, which generated 89 data points for each part based on totals of 6,317 and 6,360 participants, respectively, were included into the analyses. Quadratic regressions of the test scores on age yielded R2 values of 0.905 for Trails A and 0.876 for Trails B, indicating that 91% and 88% of the variance in test scores for the two parts, respectively, is accounted for by the
model. Based on these models, we estimated scores for both parts for age intervals between 16 and 89 years. If predicted scores are needed for age ranges outside the reported boundaries, they can be calculated, with proper caution (see Chapter 3), using the regression equations included in the tables, which underlie calculations of the predicted scores. It should be noted in the context of across-condition comparisons that mean age for Trails B is somewhat higher than for Trails A because data for one study based on the older sample were reported for Trails B only.
Quadratic regressions of SDs on age yielded R² of 0.602 for Trails A and 0.676 for Trails B, indicating an increase in variability with advancing age, consistent with the literature. Predicted SDs, based on these models, are reported.
Examination of the effects of demographic variables on the test scores revealed that education is a significant predictor of test performance for both parts A and B. Values of estimated between-study variance (tau²) for regression of test means with education were considerably lower than the corresponding values for regression without education. This suggests that education explains a considerable amount of the heterogeneity in the outcome variable. Inclusion of education in the regression of test means on age considerably improved the R² (see Appendix 4m). In this analysis, regression with and without education was rerun on a subset of studies that reported education for each data point. In addition, the group with 8.5 years of education was dropped because of a large gap at the lower extreme of the educational range, with the next lowest level available for analyses being 11.59 years of education. Therefore, the data set for Trails A was based on 25 studies and that for Trails B, on 26 studies. The t-value for education is -3.00 (p = 0.006) for Trails A and -2.56 (p = 0.017) for Trails B.
The coefficient for education of -1.308, rounded to -1.31, for Trails A indicates that with a 1-year decrement in education we expect a 1.31-second slowing in test performance. This suggests that the table of predicted values is accurate for individuals with
13.87 years of education, rounded to 14 years (which is the mean education for the original data set) in the education-correction tables. With every year of education above or below this level, we suggest correcting the obtained score by adding or subtracting 1.31 to or from the predicted score given in the table for the relevant age group (see Chapter 3 for an example). The coefficient for education for Trails B is -6.446, rounded to -6.45 in the education-correction table. Thus, we suggest correcting the obtained score by adding or subtracting 6.45 to or from the predicted score given in the table for the relevant age group. The SDs for the person's actual age group should be used with the education-corrected scores. Correction factors for different education levels for both Trails A and B are included in Appendix 4m. These corrections should be applied within the education range of 12-17 years since this is the range available in the original data set. Unfortunately, data for lower educational levels were not available in the literature. Any extrapolation of scores outside the reported range should be made with caution.
IQ did not have a significant effect on the test scores in our data set. Given consistent evidence of the effect of intellectual level on test performance described in the literature, the lack of association in our data is likely due to insufficient data regarding IQ levels reported in the studies reviewed (only seven studies reported IQ levels, which generated 21 data points for each TMT part). The difference in mean scores for the two genders across 17 studies reporting scores for males and 15 studies reporting scores for females separately was negligible: 0.704 in favor of males for Trails A and 0.379 in favor of females for Trails B.
Strengths of the analyses 1. Total sample sizes of 6,317 for Trails A and 6,360 for Trails B. 2. R² of 0.905 for Trails A and 0.876 for Trails B, indicating a good model fit. 3.
Postestimation tests for parameter specifications did not indicate problems with normality.
TESTS OF ATTENTION AND CONCENTRATION
4. Effect of education was evident, which is consistent with the literature. Significant effect of education on both parts A and B called for corrections for education. Limitations of the analyses 1. Postestimation tests for parameter specifications indicated lack of homoscedasticity for both Trails A and B. Variability in scores across age groups is greater than expected by chance, with a considerable increase in variability in the older age groups, as reflected in the size of the confidence intervals. Therefore, the predicted scores are less accurate for the older age ranges than for the younger ranges. 2. Levels of education and IQ for the samples included in the review are high. Although corrections for education are provided, mean IQ levels are 116.69 (7.48) for Trails A and 116.88 (7.80) for Trails B. According to the literature, there is a strong relationship between test performance and IQ level. Therefore, the predicted values are likely to underestimate expected time to completion for individuals with average and lower than average intellectual levels.
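The education correction described in the results above can be illustrated with a short sketch. It assumes one reading of the text: the negative coefficients (-1.31 seconds per year of education for Trails A, -6.45 for Trails B) shift the tabled predicted time relative to the reference level of 14 years, so that fewer years of education yield a longer expected completion time. The function names, the constants' names, and the example predicted value of 30.0 seconds are hypothetical and are not taken from Appendix 4m.

```python
# Education coefficients (seconds per year of education) reported in the text.
EDUCATION_COEFFICIENT = {"A": -1.31, "B": -6.45}
REFERENCE_EDUCATION = 14          # years; rounded mean education of the data set
VALID_EDUCATION_RANGE = (12, 17)  # range within which corrections are suggested


def education_adjusted_prediction(predicted_seconds, education_years, part):
    """Shift a tabled predicted time (seconds) for the examinee's education."""
    low, high = VALID_EDUCATION_RANGE
    if not low <= education_years <= high:
        raise ValueError("correction is suggested only for 12-17 years of education")
    coefficient = EDUCATION_COEFFICIENT[part]
    return predicted_seconds + coefficient * (education_years - REFERENCE_EDUCATION)


def z_score(obtained_seconds, adjusted_prediction, age_group_sd):
    """Standardize an obtained time; positive values mean slower than expected."""
    return (obtained_seconds - adjusted_prediction) / age_group_sd


# Hypothetical case: tabled Trails A prediction of 30.0 s, 12 years of education.
# Two years below the reference level adds 2 * 1.31 s to the expected time.
adjusted = education_adjusted_prediction(30.0, 12, "A")
print(round(adjusted, 2))
```

As the text recommends, the SD passed to `z_score` would be the one tabled for the person's actual age group, and the sketch refuses education values outside the 12-17 year range rather than extrapolating.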
CONCLUSIONS
The TMT has achieved high popularity as a screening tool for cognitive impairment. There is ample evidence supporting the sensitivity of performance times for parts A and B to cerebral dysfunction in mild traumatic brain injury, in the differential diagnosis of dementia, in detecting attentional/concentrational dysfunction in children and adults, and in other conditions. Although poor performance on the TMT is viewed as a nonspecific finding due to the complexity of the mechanisms contributing to test performance, the TMT is most sensitive to attentional/concentrational and executive problems, as well as to psychomotor slowing. It would be misleading to view the TMT as the test for organic brain pathology. For example, patients with memory deficits associated with temporal lobe pathology may perform normally on this test, and if the TMT is administered in isolation, the serious processing difficulties of this population might be overlooked.
Most commonly, clinical use of the TMT is based on a norm-referenced interpretation of completion times for each condition. Use of the TMT cutoff criteria for brain impairment is now quite infrequent (Spreen & Strauss, 1998). According to the literature, performance on the TMT (especially part B) is highly affected by age and education. This finding is supported by the results of the meta-analyses discussed above. Thus, it is of utmost importance to interpret individual scores with reference to the relevant normative data.
Among the derived measures, the B:A ratio was found to be diagnostically useful in several studies, with modest support for use of the B-A difference. There is little consensus on the utility of error analysis. Further research is needed to gain better insight into the diagnostic utility of the derived measures and error analysis with different clinical populations.
The optimal format for data reporting in future investigations is in age-by-education and/or age-by-intelligence-level cells. Given the demonstrated utility of the B-A difference, B:A ratio, and error analysis with some clinical groups, reporting of statistics for these indices would further facilitate interpretation of the results and contribute to diagnostic decision making. As to the use of the cutoffs, Soukup et al.'s (1998) recommendation to report cutoff scores for borderline (15th percentile) and defective (< 5th percentile) ranges in addition to the descriptive statistics should be given careful consideration due to the positive skew in the distribution of TMT scores.
5 Color Trails Test
BRIEF HISTORY OF THE TEST
Because of their ease of administration and sensitivity to brain damage, trail-making tasks have long been among the most widely used measures in neuropsychological practice (Lezak et al., 2004). The original Trail Making Test (TMT) was developed in 1944 (see Chapter 4); however, it relied upon the English alphabet as part of the test stimuli, thereby limiting its use in non-English-speaking countries. Further, its use in English-speaking countries was problematic when assessing adults with language and reading disorders, limited education, or English as a second language.
The Color Trails Test (CTT) was created in response to a request made in 1989 by the World Health Organization (WHO) for a test that would be similar to the TMT (1944) in terms of its sensitivity and specificity yet allow broader application in cross-cultural contexts. The WHO wanted a test with standardized, equivalent, multiple forms for test-retest purposes. Additionally, although the TMT had been translated into other languages, its basic linguistic and phonological properties continued to limit its application in special-needs contexts (e.g., language disorders, specific reading disorders, or illiteracy). The WHO also wanted standardized test stimuli to ensure the new test's reliability (Maj et al., 1993).
Because of the TMT's popularity and availability in the public domain, it became perhaps the most frequently photocopied neuropsychological test of the 20th century. Poor photocopy quality often blurred the target stimuli, and it was not uncommon to discover TMT protocols in which the stimuli closest to the edge of the page had been cut off due to improper placement of the original on the photocopy machine. Successive generations of photocopies yielded slightly smaller or slightly larger versions of the test, thereby changing the distance between stimuli. Because the "time to complete" score obtained for the test reflects not only visual scanning and psychomotor speed but also the distance traveled between stimuli, the problem of not having a standard version of the TMT would necessarily hamper the comparability of research and clinical findings. Therefore, it was important to develop a format of the test that would also discourage photocopying (D'Elia et al., 1996).
The CTT is similar to the TMT in that it requires cognitive flexibility and visuomotor skills to complete the task. Additionally, the CTT is similar to the TMT in that it is administered under timed conditions. However, the CTT relies on the use of numbered, colored circles and universal sign language symbols to solve the task, rather than relying
on English (or any other) alphabet letters as part of the test stimuli. Instructions for the CTT may be administered verbally or nonverbally, using only visual cues. Both the TMT and CTT are paper-and-pencil tests that are administered in two parts on an 8½ x 11" page. However, for the CTT1, the numbers 1-25 are printed within colored circles. All even-numbered circles are printed with a bright yellow background and all odd-numbered circles, with a vivid pink background. These background color differences are perceptible even to color-blind individuals. The individual is instructed to quickly draw a continuous line that connects the numbers in consecutive/sequential order. The incidental fact that color alternates with each succeeding number is not highlighted or discussed with the subject since attention to color sequence is not necessary for completion of the CTT1. The CTT2 introduces a divided attentional component, requiring attention to the alternating and sequencing of the stimuli. For the CTT2, the number 1 circle is printed against a vivid pink background; however, the numbers 2-25 are presented twice: once with a vivid pink background and once with a bright yellow background. The subject has to again quickly connect the numbers in sequence; however, the task requires alternation of colors as the sequence of numbers advances, so the subject must ignore distracter circles that contain the correct number but are printed in the wrong color background (e.g., start with pink 1 and avoid pink 2, select yellow 2, avoid yellow 3, select pink 3, avoid pink 4, select yellow 4, etc.). Therefore, there is always a distracter number that must be avoided because it is printed against a color background that is not appropriate to the sequence. Before the CTT1 and CTT2 are administered, nontimed practice trials are administered to ensure that the subject understands the task.
When the CTT1 and CTT2 forms are administered, however, the time required to complete each form is noted. Subjects must complete each form of the test in ≤240 seconds, or that part of the test is discontinued. The CTT1 is a less cognitively demanding task because it requires the subject to perceptually track only a single specified sequence (number), whereas the CTT2 requires the subject to simultaneously track both a specified number sequence and a separate color sequence. Therefore, an interference index was developed to quantify the difference between the visual attention and perceptual tracking required on the CTT1 and the more demanding sustained, divided attention and more complex perceptual tracking required by the CTT2.
$$\text{Interference Index} = \frac{\text{CTT2 time raw score} - \text{CTT1 time raw score}}{\text{CTT1 time raw score}}$$
The interference index reflects the comparison of the subject's performance on the CTT1 relative to the CTT2. This index is expressed as a function of the level of performance on the CTT1. Therefore, the index score is a relatively "pure" measure of the extent of interference (if any) attributable to the more complex divided attention and the alternating sequencing tasks required by the CTT2. For example, an interference index score of 0 indicates that the subject's time to complete the CTT1 was the same as that to complete the CTT2 (i.e., no interference). An interference index score of 1.0 indicates that the subject required twice as long to complete the CTT2 as the CTT1, whereas a score of 3.0 indicates that it took the subject four times as long to complete the CTT2 relative to the CTT1 (i.e., significant interference). As the interference index score increases, the increasing score suggests the presence of greater susceptibility to cognitive interference from alternating and sequencing demands (i.e., decreased cognitive flexibility).
The WHO's request for a test that would allow broader application in cross-cultural contexts seems quite reasonable. Ideally, neuropsychological procedures that assess the effects of conditions affecting neurological functioning, including brain injury, infectious diseases (e.g., HIV) and other pathologies, should be as culture-free as possible; but is it possible to develop a totally culture-free neuropsychological test? Perhaps not. If this is the case, then procedures should be developed that allow, at minimum, enhanced assessment
in cross-cultural contexts. Although color perception may not be a totally culture-free phenomenon (Bornstein, 1973), color was used as the test stimulus for the categorical shifting in the CTT because it typically transcends most cultural distinctions. Also, the decision to use numbers and colors was based on the fact that both are universal symbols that place limited demands on language production or knowledge (D'Elia et al., 1996). In cross-cultural pilot tests of the CTT, it was found that individuals in poor Third World countries in Africa, Asia, and South America, with little or no formal education, know and recognize the Arabic numbers 1-25, perhaps because they have to barter for goods and services (Maj et al., 1991).
In developing the CTT, it was hypothesized that the alternating shift between number and color sequences would require more effortful executive processing than the shift between numbers and letters of the alphabet. Specifically, in the United States, the English alphabet is learned at a very early age. Students are taught not only to recite the alphabet but to sing it as well. As such, the alphabet sequence is strongly encoded. Indeed, it is not unusual to observe a premorbidly high-functioning individual presenting with a history of moderate brain injury who is able to call upon sufficient brain reserve capacity (Satz, 1993) to complete the TMT part B within a nominally "normal" time limit. Interestingly, these individuals have been occasionally observed to hum or sing the alphabet (although almost inaudibly) while solving part B. Removal of reliance on the English alphabet to solve the CTT2 was hypothesized to effectively eliminate this potential performance confound.
Use of colors also permitted the development of identical, equivalent forms of the test for repeat administration in longitudinal research. Currently, there are four versions of the CTT (i.e., forms A, B, C, and D).
Form A is the standard test form, on which normative data were collected. Therefore, form A is the only one that should be used for clinical evaluation. The subsequent forms were created by printing a mirror-image version, a 90-degree rotated version, and a 90-degree mirror-image version of form A. This method of creating alternate forms ensured that the distance traveled between stimuli was standard
for all forms. The alternate forms (i.e., forms B, C, and D) are considered experimental and should be used only in research settings.
The scoring of the CTT differs from that of the TMT, to allow quantification of the cognitive slippage that often occurs following mild brain injury. For instance, following mild cerebral insults, patients commonly report subtle changes in sequencing, planning, and ability to inhibit specific responses. They frequently complain that it takes extra effort to perform most tasks they formerly completed without much thought or effort. Unfortunately, current approaches to characterizing performance on most neuropsychological tests allow empirical quantification of only gross errors but not the more subtle forms of cognitive slippage frequently described by these patients. The near-miss score was developed to allow empirical quantification of this type of cognitive slippage. This response occurs when a subject initiates an incorrect response but self-corrects before actual connection to a distracter circle. Reporting near-miss scores allows the examiner to comment on the degree to which a patient is susceptible to distracters. Other scoring criteria include quantification of prompts, number-sequence errors, and color-sequence errors.
In the course of preparatory work for the WHO cross-cultural study on the neuropsychiatric aspects of HIV-1 infection, Maj et al. (1993) evaluated the CTT in comparison to translated versions of the TMT at four world sites: Munich, Germany; Bangkok, Thailand; Naples, Italy; and Kinshasa, Zaire. Those preliminary results suggested that the CTT was not only sensitive to HIV-1-associated cognitive impairment but also more culturally fair than the TMT. The sensitivity of the test was found to hold across the different cultures examined. However, whether it would hold in other cultures was unknown at that time, and more work still needs to be done.
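The interference index defined earlier in this section can be computed directly from the two raw completion times. The following is a minimal sketch; the function name and the example times are hypothetical and are chosen only to reproduce the index values of 0, 1.0, and 3.0 discussed in the text.

```python
def interference_index(ctt1_seconds, ctt2_seconds):
    """Return (CTT2 time - CTT1 time) / CTT1 time for raw times in seconds."""
    if ctt1_seconds <= 0:
        raise ValueError("CTT1 completion time must be positive")
    return (ctt2_seconds - ctt1_seconds) / ctt1_seconds


# Hypothetical completion times (CTT1, CTT2) in seconds.
for ctt1, ctt2 in [(60, 60), (60, 120), (50, 200)]:
    print(ctt1, ctt2, interference_index(ctt1, ctt2))
```

Equal times give an index of 0, a CTT2 time twice the CTT1 time gives 1.0, and a CTT2 time four times the CTT1 time gives 3.0.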
RELATIONSHIP BETWEEN CTT PERFORMANCE AND DEMOGRAPHIC FACTORS
There are currently four normative reports regarding the CTT. Analyses conducted on
the CTT data obtained from the U.S. standardization manual revealed that increasing age adversely affects performance on both CTT1 and CTT2. Increasing education was found to enhance performance on CTT2 but not on CTT1. Gender and the interactions between gender and age were not significantly related to CTT performance scores after the effects of age were removed (D'Elia et al., 1996). Ponton et al. (1996) and LaRue et al. (1999), in examining their respective normative data from their Hispanic samples, also found a negative performance association between increasing age and CTT1 and CTT2 test scores. In addition, they found a positive relationship between education and CTT1 and CTT2 scores. No gender effects were found. Similarly, Hsieh and Riley (1997), in examining the normative data from their Chinese sample, found a negative performance association between increasing age and CTT1 and CTT2 test scores and a positive performance association between increasing education and CTT1 and CTT2 scores. No gender effects were found in the Chinese sample. In summary, research suggests that performance on the CTT is enhanced by education and negatively affected by increasing age. No gender effects have been reported. The CTT is available in adult and child formats from Psychological Assessment Resources (see Appendix 1 for ordering information). Normative data for the Children's CTT can be found in Llorente et al. (2003).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located four normative reports: one for primarily Mexican-American, Central and South American, Spanish-speaking adults (Ponton et al., 1996); one for Mandarin-speaking mainland Chinese (Hsieh & Riley, 1997); one for senior adult bilingual Spanish/Mexican Americans (LaRue et al., 1999); and the U.S. standardization manual (D'Elia et al., 1996). To adequately evaluate the CTT normative reports, five key criterion variables were deemed critical. The first four of these relate to
subject variables and the last to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean. Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided. Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since a strong effect of age on CTT performance has been demonstrated in the literature. Reporting of IQ and/or Education level
Given the association between educational level and CTT scores, information regarding highest educational level completed should be reported. Optimally, normative data should be categorically reported by age and education level. It is unclear if IQ is relevant, so until this is determined, it is best that information on IQ be provided.
Procedural Variables Data Reporting
Means, standard deviations, and preferably ranges for total time in seconds for each part of the CTT should be reported. Additional information regarding prompts, near-misses, errors, and interference index would facilitate interpretation of test performance.
SUMMARY OF THE STATUS OF THE NORMS
In terms of subject variables, the standardization manual as well as the Ponton et al. (1996) and LaRue et al. (1999) studies provide performance data grouped by age and education categories. Hsieh and Riley (1997) present data separately for age and for education. Although the total sample for the U.S. standardization study is 1,528, unfortunately the manual does not indicate the sample size within each of the 30 age/education categories. Whereas in the LaRue et al. (1999) study sample sizes for most of the age and education categories are generally adequate, the sample size for each of the age and education categories reported by Ponton et al. (1996) is small. Similarly, the sample size for each of the age categories reported by Hsieh and Riley (1997) is small. For the standardization study, Ponton et al. (1996), and LaRue et al. (1999), the younger age group categories are generally narrowly defined and therefore adequate; however, the older age group categories tend to be very broad. Hsieh and Riley (1997) report age data in 10-year increments as well as data according to the age groupings found in the U.S. standardization manual.
Regarding procedural variables, all studies report means and SDs for time to completion for CTT1 and CTT2. Only the U.S. standardization manual reports data regarding errors, near-miss responses, and prompts. The standardization manual and the Hsieh and Riley (1997) study provide data on the interference index. The LaRue et al. (1999), Ponton et al. (1996), and D'Elia et al. (1996) normative data were collected from participants residing in the United States. The Hsieh and Riley (1997) data were collected from participants residing in the mainland People's Republic of China.
In this chapter, normative publications are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 5. Table A5.1, the locator table, summarizes information provided in the studies described in this chapter.
SUMMARIES OF THE STUDIES
This section presents critiques of the normative studies for the CTT.
[CTT.1] D'Elia, Satz, Uchiyama, and White, 1996
This is the original standardization of the CTT. The manual reports normative data from a sample of 1,528 healthy, normal individuals residing in a variety of settings in diverse regions of the United States. Participants were excluded if there was a history of head trauma, neurological disorder, or substance abuse. The data were collected during the course of several norming studies with distinct samples, including medically and psychiatrically normal participants from a longitudinal cardiovascular epidemiological study that has been ongoing since 1960; medically and psychiatrically normal pilots from four major U.S. commercial airline manufacturing corporations undergoing a yearly medical examination as part of a nationally mandated Federal Aviation Administration/Equal Employment Opportunity Commission study to obtain normative data on neuropsychological functioning of pilots across the age span; medically and psychiatrically normal residents living in an independent retirement community in southern California; medically and psychiatrically healthy, HIV-negative, bisexual and homosexual men participating in a multi-center epidemiological study; and medically and psychiatrically healthy African American men living in Los Angeles with no history of drug/alcohol abuse, participating in a larger study of the neuropsychological, medical, and psychosocial consequences of polydrug abuse and HIV. The data are stratified by age and education. There are five age categories: 18-29, 30-44, 45-59, 60-74, and 75-89. For each age category, data are reported for performance of those with education of ≤8, 9-11, 12, 13-15, 16, and ≥17 years. The sample is primarily male; women comprise only 12% of the sample. The manual states: Gender and the interactions between gender and age were not significantly related to CTT raw scores
after the effects of age were removed, explaining between 0.4% to 2.4% of the variance. Therefore, the relatively small proportion of women in the normative sample does not constitute a threat to either the validity or the utility of the CTT. (D'Elia et al., 1996)
Spanish-language administration instructions and preliminary normative data for Hispanics are provided in the manual. The preliminary normative data are from a sample of healthy, normal Hispanics living in southern California, participating in a large, ongoing normative study. The Hispanic data are reported separately since all participants in this subsample were educated outside the United States and were primarily Spanish-speaking or had Spanish as their first language. Data for Hispanics are presented by four age categories: 17-29, 30-39, 40-49, and 50-75 years. The normative data contained in the standardization manual are not reproduced here, and the interested reader is referred directly to the publication for further information.
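To show how the age-by-education stratification of the main U.S. standardization sample defines 30 cells, the following sketch maps a person to the corresponding cell. It is an illustration only: the function names are hypothetical, ages and education are assumed to be whole years, the lower bound of 0 for the "≤8 years" band is an assumption, and no normative values from the manual are included.

```python
# Age and education strata as listed for the U.S. standardization sample.
AGE_BANDS = [(18, 29), (30, 44), (45, 59), (60, 74), (75, 89)]
# None marks an open-ended upper bound (17 or more years of education).
EDUCATION_BANDS = [(0, 8), (9, 11), (12, 12), (13, 15), (16, 16), (17, None)]


def find_band(value, bands):
    """Return the (low, high) band containing value, or raise ValueError."""
    for low, high in bands:
        if value >= low and (high is None or value <= high):
            return (low, high)
    raise ValueError(f"{value} falls outside the reported strata")


def normative_cell(age_years, education_years):
    """Return the (age band, education band) cell for one examinee."""
    return find_band(age_years, AGE_BANDS), find_band(education_years, EDUCATION_BANDS)


# Five age bands by six education bands give the 30 cells noted in the text.
print(len(AGE_BANDS) * len(EDUCATION_BANDS))
print(normative_cell(52, 14))
```

Raising an error for ages outside 18-89 years, rather than assigning the nearest band, mirrors the caution about extrapolating beyond the range of the normative sample.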
Study strengths 1. Sample composition is well described in
terms of exclusion criteria. 2. Performance is reported by age and education intervals. 3. Data reporting includes means and SD scores for each age/education interval. 4. Age group intervals are generally adequate.
Considerations regarding use of the study 1. Sample size within each of the 30 age/education categories is not indicated. 2. No information on the IQ of participants is reported, although the data are presented by age/education intervals.
[CTT.2] Ponton, Satz, Herrera, Ortiz, Urrutia, Young, D'Elia, Furst and Namerow, 1996 (Tables A5.2 and A5.3)
This study presents normative data stratified by age and education for Spanish-speaking adults' performance on the Neuropsychological Screening Battery for Hispanics (NeSBHIS), which contains the CTT. This is the initial report from an ongoing project. The sample consists of 300 volunteers (180 female, 120 male) recruited from fliers and advertisements posted at community centers and churches in Los Angeles County, California (Santa Ana, Pasadena, Pacoima, Montebello, and Van Nuys). The sample was primarily right-handed (95%). Regarding language, 210 were monolingual Spanish and 90 were rated by the examiner to be bilingual. The average (SD) duration of residence in the United States was 16.4 (14.4) years; however, 55% of the total sample had lived in the United States less than 15 years, and half of those participants had less than 6 years of residence in this country. Sixty-two percent of the sample were born in Mexico, 15% in Central America, and 23% in other Latin countries. Exclusion criteria included a history of neurological disease, psychiatric disorder, alcohol or drug abuse, or head trauma. Participants ranged in age from 16 to 75 years (mean = 38.4 [13.5] years). Whereas the 30-39 and 40-49 age groupings are adequately narrow, the 16-29 and 50-75 age groupings are somewhat broad. The data are reported by age and education groupings. The tables separately present data for males and females.
Study strengths 1. Sample composition is well described in terms of exclusion criteria. 2. Educational levels are reported. 3. Mean and SD scores are reported. 4. Age group intervals are generally adequate for younger samples (<50 years).
Considerations regarding use of the study 1. Sample size is generally small per age/education interval. 2. The age group interval is too broad for the older sample (50-75 years).
Other comments 1. IQ scores are not reported; however, scores are reported for Raven's Standard Progressive Matrices Test at each age/education level. Raven's test is used to provide an estimate of nonverbal intelligence.
2. Although the data are reported by gender, there does not appear to be a significant gender effect. [CTT.3] Hsieh and Riley, 1997 (Tables A5.4-A5.6)
This report provides normative data on tests of attention and concentration collected in the People's Republic of China. The normative sample consisted of 177 (93 male, 84 female) urban, Mandarin-speaking participants recruited across a broad range of educational and occupational categories. The data are stratified by age categories and by education categories. The age categories are 30-39, 40-49, 50-59, 60-69, and 70-83 years. The education categories follow the Chinese system of primary school (1-6 years), middle school (7-9 years), and high school (10-12 years). The vast majority of adults over the age of 60 had fewer than 6 years of education.
Study strengths 1. Data include means and SDs for test scores. 2. Age group intervals are adequate.
Considerations regarding use of the study 1. The sample composition description does not sufficiently address study inclusion/exclusion criteria. 2. Sample sizes are generally small.
Other comments 1. Years of education per age group are presented in a table. 2. Performance is reported separately for age and for education categories. CTT data would generally be more useful if they were partitioned by an age/education category.
[CTT.4] LaRue, Romero, Ortiz, Chi Liang, and Lindeman, 1999 (Tables A5.7-A5.10)
This report provides normative data for 797 community-dwelling, senior adult (age 65-97), Spanish/Mexican American Hispanic and non-Hispanic Caucasian men and women, reported by age and educational level. CTT data were collected as a result of its inclusion
in a brief battery of cognitive measures administered during the course of a community-based epidemiological survey in Bernalillo County, New Mexico. The authors report that the Hispanic population in that area reflects a well-established ethnic and cultural group since the majority of the residents evidently trace their ancestry to the colonization by the Spanish in the 1600s. The majority of Hispanics in the sample identified themselves as Spanish Americans (83%). Most Hispanic participants were bilingual, with 94% reporting that they spoke Spanish well and 83% reporting that they spoke English well. Four percent reported that they did not speak English. The authors reported that "Seventy-nine percent of the Hispanic participants completed the cognitive tests exclusively or primarily in English, 14% exclusively or primarily in Spanish, and 7% in a combination of English and Spanish." For the present report, the authors excluded from analyses persons whose age- and education-adjusted Mini-Mental State Examination score (Folstein et al., 1975) was 23 or lower. Adjustments for age and education were made following a regression equation developed by Mungas et al. (1996). Because of time constraints, the authors deviated from the standard administration of the CTT in one important way: the dependent measure for CTT1 and CTT2 is the number of digits correctly traced in 60 seconds. The data are presented by ethnic group (non-Hispanic white vs. Hispanic) and by age and education categories. The age categories are 65-74 and 75-97. The educational categories for Hispanics are 0-6, 7-9, 10-12, and >12 years. The educational categories for non-Hispanics are 0-12 and >12 years. The mean age of the non-Hispanic Caucasian men (n = 230) was 74.1 (5.8) years, and that of the women (n = 208) was 73.8 (5.9) years. For the Hispanic group, the mean age for men (n = 194) was 73.6 (6.5) years, and that for women (n = 165) was 72.9 (5.6) years.
For the non-Hispanic group, the mean educational level for men was 14.1 (3.5) years, and that for women was 13.5 (2.5) years. For the Hispanic group, the mean educational level for men was 9.7 (4.4) years, and that for women was 9.3 (3.9) years.
Study strengths 1. Sample size of the various age/education groups is generally ≥50 for both the Hispanic and non-Hispanic groups (for exception, see "Considerations regarding use of the study," below). 2. Sample composition is well described in terms of inclusion/exclusion criteria. 3. Performance is reported by age and education intervals. 4. Data reporting includes means and SD scores for each age/education interval.
Considerations regarding use of the study 1. The age group interval for the upper age category is quite broad (75-97). 2. No information on IQ is reported, although the data are presented by age/education intervals. 3. The standard test administration format was altered (for both forms of the test, the data reflect the number of circles containing digits that were correctly traced in 60 seconds). 4. Sample size for each of the education categories was <50 participants for the 75-97 year group of Hispanics.
CONCLUSIONS
The CTT was developed to allow speeded/timed assessment and quantification of cognitive flexibility and sequencing skills as well as assessment of mental processing speed and attention/concentration abilities. These skills have long been associated with executive/frontal lobe functioning (Lezak et al., 2004). The CTT was not designed to produce equivalent or even similar time-to-completion scores when compared to the TMT. Rather, since the TMT has long been viewed as a test of frontal lobe/executive functioning, the CTT was designed to tap similar frontal/executive cognitive abilities and to allow broader assessment applications in cross-cultural contexts. Although the physical designs of the CTT and TMT are similar, they are not, nor were they intended to be, identical. The CTT1 and TMT part A generally take similar
amounts of time to complete; however, the CTT2 generally takes slightly longer to complete than the TMT part B. The fact that the CTT2 takes longer to complete than the TMT part B has been demonstrated in English-speaking (D'Elia et al., 1996), Turkish-speaking (Dugbartey et al., 2000), and Chinese (Mandarin)-speaking samples (Lee & Chan, 2000a,b; Lee et al., 2000). In an intact individual, the reason for the added time to complete the CTT2 vs. the TMT part B most probably can be attributed to the design differences in the two test forms and, therefore, the difference in test demands. The CTT2 has almost twice as many stimuli to scan and to consider as the TMT part B (e.g., 25 circle stimuli for TMT part B vs. 49 circle stimuli for CTT2), even though the distance the pencil needs to travel to correctly complete the task is shorter (TMT part B = 243.6 cm, CTT2 = 184.6 cm). Because the TMT part B uses the English alphabet as one of the alternating stimuli, one must keep in mind not only the increasing sequence of numbers but also the increasing sequence of alphabet letters. In other words, to correctly complete the TMT part B, one must keep in mind that G comes after F, then H, etc. The CTT2 similarly requires that the test-taker keep in mind the correct numerical sequence; however, even though there are more distracter stimuli on the page, the alternating choice is between only two colors. As such, for English-speaking individuals, it may be that the working-memory demands of the TMT part B are greater than for the CTT2. However, as discussed earlier, the sequence of the English alphabet seems to be an overlearned and multiply encoded string for individuals educated in the United States. Therefore, the actual difference in working-memory demands may be quite minimal for intact individuals. This, of course, is an area for further research.
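The design contrast between the two forms can be made concrete with a little arithmetic. The short Python sketch below uses only the stimulus counts and pencil-path lengths quoted above; the variable names are illustrative and nothing here is part of either test's scoring.

```python
# Figures quoted in the text: number of circles on the page and the
# pencil-path length (in cm) needed to complete each form correctly.
tmt_b = {"stimuli": 25, "path_cm": 243.6}
ctt2 = {"stimuli": 49, "path_cm": 184.6}

# How many times more circles must be scanned on the CTT2.
stimulus_ratio = ctt2["stimuli"] / tmt_b["stimuli"]

# Relative pencil travel (CTT2 path as a fraction of the TMT part B path).
path_ratio = ctt2["path_cm"] / tmt_b["path_cm"]

print(f"CTT2 has {stimulus_ratio:.2f} times as many circles as TMT part B")
print(f"CTT2 pencil path is {100 * (1 - path_ratio):.1f}% shorter")
```

The CTT2 thus presents nearly double the scanning load over roughly three-quarters of the drawing distance, which is consistent with attributing its longer completion time to visual search rather than to motor demands.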
Research to date suggests that the CTT holds promise for cross-cultural and longitudinal research as well as clinical assessment of sequencing, visual scanning, and speed of mental processing abilities in non-English-speaking adults and adults with limited education, English as a second language, and
language and reading disorders. Further research is needed to compare CTT performance in cross-cultural settings. Four equivalent forms have been developed for the CTT (A, B, C, and D). Currently, only form A has been normed for clinical use. Even though all four are physically equivalent, future research is needed to establish the psychometric and normative equivalence of the alternate forms. Future research should also focus on establishing the reliability and equivalence of the alternate forms in samples of both normal participants and patients with specific neurobehavioral dysfunctions (e.g., clinical comparison data; aka, abnorms). Future normative studies with any form of the test should also report base rate data regarding error and near-miss responses, data for prompts, as well as information regarding the interference index. For instance, no age- and education-corrected normative data are available for Hispanic samples regarding the occurrence of near-miss and error responses, nor is there information regarding prompts and the interference index. The one Chinese normative study reports information regarding time to complete the CTT1, CTT2, and the interference index but no information regarding prompts, near-misses, or errors. Normative data are needed for English-speaking and non-English-speaking individuals with low or no education. Further normative
studies of different ethnic/cultural groups are needed. Reporting the data by age/education categories would allow performance comparison across cultures. In general, the age categories need to be narrowed for reporting data on older adults. We recommend that future studies follow the WAIS-III age category groupings as an example. Although some excellent work has been done, further normative work still needs to be done regarding the performance of Hispanic individuals of Mexican descent on the CTT above age 75 years. Fortunately, Ponton and colleagues continue to collect normative information on the NeSBHIS; therefore, a larger normative database will accumulate, allowing a sample size more appropriate for inferential purposes with Hispanics. In their initial report of Spanish-speaking individuals, the sample size for the age- and education-corrected groups was quite small. Yet, comparison of these preliminary performance data with those found in the U.S. standardization manual for the CTT at the same age and education levels does not suggest a significant difference. This finding coupled with the findings of Maj et al. (1991, 1993) further supports the notion that the CTT may allow enhanced application in cross-cultural contexts. How many cultures this effect transcends remains to be discovered.¹
¹Meta-analyses were not performed on the CTT due to a lack of sufficient data.
6 Stroop Test
BRIEF HISTORY OF THE TEST
The Stroop Test measures the relative speed of reading names of colors, naming colors, and naming colors used to print an incongruous color name (e.g., the color red used to print the word blue). The last task requires one to override a reading response. This conflict interference situation has come to be called the Stroop Effect. The interference section of the Stroop Test has traditionally been viewed as a measure of executive functioning involving cognitive inhibition (Boone et al., 1990) and, specifically, the ability to inhibit an overlearned response in favor of an unusual one (Spreen & Strauss, 1998) and "to maintain a course of action in the face of intrusion by other stimuli" (Comalli et al., 1962, p. 47). Factor analyses of sets of executive measures suggest that the Stroop interference trial has more in common with timed executive measures, such as verbal fluency (FAS), and measures of information-processing speed, such as Digit Symbol, than executive tests involving set shifting (Wisconsin Card Sorting Test) or divided attention/working memory (Auditory Consonant Trigrams) (Boone et al., 1998). Initial lesion studies indicated that poor performance on the interference section of the Stroop Test was associated with left frontal lobe
pathology (Perret, 1974), while subsequent functional imaging studies have found the Stroop interference effect to be associated with activation of anterior cingulate and/or frontal cortex (Bench et al., 1993; Brown et al., 1999; Carter et al., 1995, 1997; George et al., 1994; Pantelis et al., 1996; Pardo et al., 1990; Peterson et al., 1996; Taylor et al., 1997). Poor performance on the Stroop Test has been associated with frontal system dysfunction secondary to closed head injury (Trenerry et al., 1989), discrete frontal lobe lesions (especially left frontal lobe; Perret, 1974; see Regard, 1981, cited in Spreen & Strauss, 1998), frontotemporal dementia (Pachana et al., 1996), frontal lobe seizures (Boone et al., 1988), white-matter hyperintensities (Fukui et al., 1994; Ylikoski et al., 1993), Klinefelter's syndrome (Boone et al., 2001), age-associated memory impairment (Hanninen et al., 1997), transient global amnesia (Stillhard et al., 1990), depression (Boone et al., 1995; Trichard et al., 1995), schizophrenia (Brebion et al., 1996; Buchanan et al., 1994; Schreiber et al., 1995), late-life psychosis (Miller et al., 1991), attention-deficit hyperactivity disorder (ADHD; Seidman et al., 1997; Rapport et al., 2001), and exposure to alcohol in utero (Connor et al., 2000); and Stroop scores have been observed to predict aggression (Foster et al., 1993).
In addition, Stroop scores are lowered in cases of brain dysfunction not necessarily confined to anterior brain areas, such as left and right cerebrovascular accident (Trenerry et al., 1989), Alzheimer's disease (Binetti et al., 1996; Koss et al., 1984; Pachana et al., 1996), and myotonic dystrophy (Palmer et al., 1994). Stroop performance is impaired in both left and right cerebral damage but may be particularly pronounced with left-sided damage (Perret, 1974; Trenerry et al., 1989), although this may be an artifact of coexistent aphasia. Specifically, Nehemkis and Lewinsohn (1972) found that patients with left cerebral damage with aphasia performed particularly poorly on the Stroop, while patients with left cerebral damage without aphasia actually performed better than patients with right hemisphere dysfunction. Finally, there is evidence that Stroop performance is unaffected by chronic caffeine use (Hameleers et al., 2000) but is influenced by endogenous cholesterol synthesis (Teunissens et al., 2003).
The Stroop Test paradigm is among the oldest in experimental psychology. Interest in the relative speed of color naming and reading color-words has been active for over a century. In 1883, as a result of a suggestion by Wilhelm Wundt (who founded the first psychological laboratory in Leipzig, Germany), America's first psychologist, James Cattell (then a student of Wundt), began conducting what would later become the earliest published study (1886) examining the relative speeds of color naming and color-word reading. Over 40 years later, the first published report of the conflict/interference situation (e.g., where one must name the color of the ink used to print the word when the color and color name are incongruous) originated in the Marburg, Germany, laboratory of Erich Rudolf Jaensch (Jensen & Rohwer, 1966).
Some years later, John Ridley Stroop, then a graduate student working in the Jesup Psychological Laboratory at George Peabody College for Teachers, began his doctoral research, examining interference in serial verbal reactions in which he developed and
used the color-word interference test that now bears his name (Stroop, 1935). Stroop's original studies employed three cards, all with white backgrounds:
1. An achromatic color-word reading card, consisting of a series of 100 words for colors printed in black ink.
2. A chromatic color-word naming card, consisting of a series of 100 color names printed in a color of ink incongruent with the word.
3. A pure color card, consisting of a series of 100 squares printed in different solid colors.
For all cards, five colors and/or color-words were used (red, blue, green, purple, and brown). The words and the colors were generally arranged in a 10 x 10 grid of evenly spaced rows and columns. As Stroop notes: "The colors were arranged so as to avoid any regularity of occurrence and so that each color would appear twice in each column and in each row, and that no color would immediately succeed itself in either column or row. The words were also arranged so that the name of each color would appear twice in each line." For the chromatic color-word naming cards, "no word was printed in the color it names but an equal number of times in each of the other four colors: i.e., the word 'red' was presented in blue, green, brown, and purple inks; the word 'blue' was printed in red, green, brown, and purple inks, etc. No word immediately succeeded itself in either column or row" (p. 648). An alternate form was also created by printing all the cards in the reverse order.
In three experiments, Stroop examined four different tasks using the above-mentioned three cards. Using cards 1 and 2, experiment 1 examined the differences in rates of reading color-word names (task 1) when the word was printed in black ink vs. an incongruous ink color (task 2). Using only cards 2 and 3, experiment 2 examined the differences in rates of verbally identifying squares of color (task 3) vs. naming ink colors against the distraction of incongruous color-words (task 4). For experiment 3, Stroop modified his test, shortening
the cards to 10 columns and five rows (so that there were only 50 responses required per card instead of 100) and using colored swastikas on the pure color card instead of solid square color patches. For experiment 3, Stroop administered each of the four tasks separately on different days. Stroop never administered all three cards in the same testing period; this procedure did not become standard until Thurstone's (1944) investigations of perception using the Stroop paradigm.
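The layout rules Stroop describes for his original 10 x 10 cards (each of the five colors twice in every row and column, and no color immediately succeeding itself in either direction) are easy to check mechanically. The Python sketch below is illustrative only: the function names are hypothetical, and the cyclic construction is an assumption chosen because it provably meets the counting and adjacency rules. It does not reproduce Stroop's actual arrangement and, being regular, it ignores his further requirement to "avoid any regularity of occurrence."

```python
COLORS = ["red", "blue", "green", "purple", "brown"]  # the five colors Stroop used


def build_grid(n_rows=10, n_cols=10, colors=COLORS):
    """Cyclic layout (an assumption, not Stroop's own): the color index
    advances by 1 along each row and by 2 down each column."""
    k = len(colors)
    return [[colors[(2 * r + c) % k] for c in range(n_cols)] for r in range(n_rows)]


def satisfies_constraints(grid, colors=COLORS, per_line=2):
    """Return True if every row and column contains each color exactly
    `per_line` times and no color directly follows itself."""
    columns = [list(col) for col in zip(*grid)]
    for line in grid + columns:
        if any(line.count(color) != per_line for color in colors):
            return False
        if any(a == b for a, b in zip(line, line[1:])):
            return False
    return True


grid = build_grid()
print(satisfies_constraints(grid))
```

A randomized search that shuffles candidate grids and keeps only those passing `satisfies_constraints` would come closer to the irregular arrangement Stroop describes; the validator is the reusable part of the sketch.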
As testimony to its popularity, the Stroop has been translated into several languages, including Spanish (Rosselli et al., 2002b; Armengol, 2002), Chinese (e.g.,
with a white background, others have used cards with a black background or a color different from both the color ink of the word and the color name ("SuperStroop"; Dyer, 1973).
5. Number of stimuli cards: Various versions of the test require the use of two, three, or four cards.
Administration Procedures
1. Scanning orientation: Some versions require the examinee to scan across rows from left to right, whereas others require the examinee to read down columns.
2. Stimuli sequence: Some versions present word reading followed by color naming, and others present the reverse order.
Method of Scoring
Determination of the total score has ranged from the number of correct responses made in 45 or 120 seconds to the total time to complete each card to a difference score (color interference minus color naming or reading) to the total number of errors made in 45 seconds.
Current Administration Procedures
At present, there is no one recognized standard version of the Stroop Test. There are, however, three versions that are commercially published: Charles Golden's (1978), Max Trenerry et al.'s (1989), and that contained in the Delis-Kaplan Executive Function System (Delis et al., 2001). The first two are available from Psychological Assessment Resources, and the Delis-Kaplan Executive Function System can be purchased from Psychological Corporation. Edith Kaplan's Stroop version can be used as a Comalli et al. (1962) version (reading words, naming colors, naming colors with incongruous color names) or as the Comalli and Kaplan version (naming colors, reading words, naming colors with incongruous color names). Carl Dodrill (1978a) and Otfried Spreen and Esther Strauss (1998) have also developed versions of the Stroop, which can be obtained by writing to their respective laboratories (see Appendix 1 for ordering information).
The Stroop versions reviewed in this chapter will be limited to five formats that are commercially published or readily available from the authors. (The Delis-Kaplan Executive Function System version will not be reviewed because the published normative data set is large [n for adults = 875], and few studies have appeared using this version due to the recency of its publication.) These five versions differ in format and will be briefly described below.

Comalli/Kaplan Stroop, 1962; Personal Communication
The Comalli and Kaplan Stroops use the same three cards originally developed by Comalli et al. (1962), although Comalli and Kaplan differ in the order of presentation of cards 1 and 2. All three cards are 9½" x 9½" with 100 stimuli per card arranged in a 10 x 10 grid against a white background. At the top of each card is an additional row of 10 practice items. The color-name reading card consists of color words (red, blue, green) printed in black ink and presented in random order. The color naming card consists of rectangles (5/16" x 2/16") of colors (red, blue, green) arranged in random order. The third card presents color-words printed in a color of ink different from the color designated by the word. For each card, participants are instructed to proceed line by line down the page either reading words or naming colors as quickly as they can. Each line is scanned from left to right, mirroring English reading format. The time to complete the 100 items on each card is recorded, along with the number of errors made. Near-miss responses (i.e., self-corrected errors) are also recorded. The Comalli/Kaplan Stroop scoring protocol also allows for independently tracking the response times for the first half of each card separately from the
last half. In the Comalli et al. (1962) administration format, the Word-Reading card is presented first, followed by the Color-Naming card and then the Interference card. The Kaplan alteration in the test administration format occurred by happenstance when a research assistant mistakenly reversed the order of cards 1 and 2. In thinking about the
mistake, Edith Kaplan decided the error might have been fortuitous. As a result, she decided to permanently make the modification in card order presentation in her laboratory for the following reasons: (1) administering the Color-Naming card first allows an immediate check on whether the subject is color-blind, a condition which would invalidate the use of the test, and (2), more importantly, administration of the Word-Reading task immediately before the Interference task (in which the subject must now inhibit a reading response) may exert a priming effect on the degree of interference, whereas presenting the Color-Naming card before the Interference task (in which the subject is again expected to identify color) may serve to minimize the "Stroop effect" (personal communication). Demick, Kaplan, and Wapner (personal communication) have more recently proposed a process-oriented scoring system for the Stroop that utilizes the identification of specific verbal errors (reflecting both deviant responses to items, e.g., inappropriate color responses, and deviant responses to sequence, e.g., inserted linguistic words or phrases); nonverbal behaviors serving as cognitive devices (e.g., nodding, body rocking); and expanded temporal measures (e.g., time per line, time between utterances).
Based on exploratory studies (Demick & Wapner, 1985; Demick et al., 1986), they have documented that while various developmental groups may not differ with respect to achievement measures (e.g., total time), they are distinguishable on the basis of more process-oriented measures (e.g., relative to young and middle-aged adults, older adults used significantly more inefficient strategies, such as gazing across the cards to identify specific patterns and/or using fingers as if counting in succession to modulate responses, to meet task demands; on a downward extension of the Stroop for preschoolers, 4- and 5-year-olds employed a range of nonverbal strategies to maintain serial organization, while 3-year-olds failed to do so).

Golden Stroop, 1978
This version of the Stroop uses three 8½" x 11" pages. Each page has 100 items presented in five columns of 20 items. Page 1 consists of
the words red, green, and blue presented randomly and printed in black ink. Page 2 contains blocks of Xs printed in either red, green, or blue ink. Page 3 is the Stroop effect card and contains color-words printed in a noncongruent color (i.e., the word blue printed in red ink, etc.). For each page, the examinee is required to scan the columns vertically, starting on the left side and moving to the right. The score is the number of correctly identified items per page within 45 seconds. Errors are not counted.

Dodrill Stroop, 1978a
The Dodrill version of the Stroop consists of two alternate administrations of one stimulus card containing 176 color-words (red, green, blue, and orange) randomly printed in 11 columns of 16 color-words. Each color-word is printed in an incongruous color (e.g., the color-word blue is printed in green ink, etc.). In the first administration, participants read the color-words as they scan down the columns. In the second administration, participants name the color of ink in which the words are printed. Time to complete each card is noted. Two scores are generated: the total time to complete part I (which is essentially an estimate of the examinee's reading speed) and the total time for part II minus that for part I, "which reflects an estimate of the degree of interference induced by the test" (Dodrill, 1987, p. 6).

Victoria Stroop, 1991 (Reported by Spreen & Strauss, 1991, 1998)
The Spreen and Strauss version of the Stroop (also known as the Victoria version) uses three 21.5 x 14 cm cards presented in the following order: part D, part W, and part C. Each card has six rows of four items. Part D contains colored dots (red, green, blue, and yellow), and on this card the task is to name the colors as quickly as possible. Part W has the words when, and, over, and hand printed in red, green, blue, or yellow ink; and the examinee must name the color of ink in which each word is printed as quickly as possible. On part C, the color-words red, green, blue, and yellow are printed in incongruous-colored ink (e.g., the word red is printed in green ink, etc.); and the examinee must name the color ink in
which the color-word is printed as quickly as possible. Rows are scanned from left to right as the subject works down the page. Time to completion and the number of errors are recorded for each card. Jensen and Rohwer (1966) provide a detailed and fascinating review of the Stroop Test and its many reincarnations; and Dyer (1973), Golden (1978), and MacLeod (1991) review applications and research findings subsequent to Jensen and Rohwer's report. Lezak et al. (2004) also provide an overview of the test for the interested reader.

Trenerry et al. Stroop, 1989
This version of the Stroop consists of two cards: form C and form C-W. Form C contains 112 color-words (red, blue, green, and tan) randomly arranged in four columns of 28 color-words. Each color-word is printed in an incongruous ink color (e.g., the word tan is printed in red, etc.). Form C-W follows the same format as form C; however, there is a different random order of color-words. For form C, the examinee is requested to read the words as quickly as possible while scanning down the columns. For form C-W, the examinee is instructed to name the color ink in which the color-word is printed as quickly as possible, again while scanning down the columns. A maximum of 120 seconds is allowed to complete each task. The score for each task is the number of correct responses (or number of items completed) minus any incorrect responses. Although the Dodrill version relies upon a difference score between the reading and interference cards, a discriminant analysis conducted by Trenerry and colleagues demonstrated that the data from form C-W alone provided the sharpest classification accuracy; thus, the score from form C-W is the only one used for interpretation purposes.
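The scoring rules described for the Dodrill and Trenerry versions reduce to simple arithmetic, which the following Python sketch makes explicit. The function and argument names are hypothetical, the example values are made up for illustration (they are not normative data), and the Trenerry rule is implemented under one reading of the description above, namely items completed minus incorrect responses.

```python
def dodrill_scores(part1_seconds: float, part2_seconds: float) -> dict:
    """Dodrill version: part I time estimates reading speed; part II minus
    part I estimates the degree of interference."""
    return {
        "part_I_time": part1_seconds,
        "interference": part2_seconds - part1_seconds,
    }


def trenerry_score(items_completed: int, incorrect: int, max_items: int = 112) -> int:
    """Trenerry et al. version (assumed reading): items completed within the
    120-second limit minus incorrect responses, for one form."""
    if not 0 <= incorrect <= items_completed <= max_items:
        raise ValueError("counts must satisfy 0 <= incorrect <= completed <= max_items")
    return items_completed - incorrect


# Illustrative, made-up values only.
print(dodrill_scores(part1_seconds=90.0, part2_seconds=210.0))
print(trenerry_score(items_completed=100, incorrect=3))
```

The contrast in design is visible in the signatures: the Dodrill interference score needs both administrations, whereas the Trenerry score is computed from a single form, which is why form C-W alone can be used for interpretation.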
RELATIONSHIP BETWEEN STROOP TEST PERFORMANCE AND DEMOGRAPHIC FACTORS
While some studies have found no significant age effect on the Stroop Test (Graf et al.,
1995; Lopez et al., 2003) or an age effect that was equaled by the effect of health status
(Houx et al., 1993), others have found significant or nearly significant age-related decrements in Stroop performance (Anstey et al., 2000; Barbarotto et al., 1998; Boone et al., 1990; Cohn et al., 1984; Comalli et al., 1962; Daigneault et al., 1992; Feinstein et al., 1994; Graf et al., 1995; Ivnik et al., 1996; Jensen & Rohwer, 1966; Klein et al., 1997; Libon et al., 1994; Moering et al., 2004; Panek et al., 1984; Rosselli et al., 2002b; Spreen & Strauss, 1998; Sullivan et al., 2002; Swerdlow et al., 1995; Trenerry et al., 1989; Uttl & Graf, 1997; Whelihan & Lesher, 1985), although Doan and Swerdlow (1999) observed an age effect for English speakers but not Vietnamese speakers. Of interest, younger and older groups may show a different pattern of performance on color interference, with younger participants performing the first half of the task faster than the second half and older participants demonstrating the opposite pattern (Klein et al., 1997). At least part of the slowing observed in Stroop performance with age may be due to declines in visual function. Specifically, reduced contrast sensitivity is associated with more time needed in word reading, red/green color weakness is correlated with increased time needed in color naming, and reduced distance acuity negatively impacts speed on the interference trial (van Boxtel et al., 2001). van Boxtel and colleagues (2001) indicate that visual function accounts for half of the age-related variance in Stroop performance. Several studies have reported that more highly educated participants perform better on the Stroop Test (Anstey et al., 2000; Barbarotto et al., 1998; Houx et al., 1993; Lopez et al., 2003; Moering et al., 2004); however, other investigations have not been able to document a significant relationship between education and Stroop performance (Trenerry et al., 1989).
The literature on gender differences in Stroop performance is equivocal, with some studies observing differences (Barbarotto et al., 1998; Martin & Franzen, 1989; Moering et al., 2004; Small et al., 2000), others finding none (Anstey et al., 2000; Armengol, 2002;
Boone, 1999; Connor et al., 1988; Houx et al., 1993; Ingraham et al., 1988; Jensen & Rohwer, 1966; Swerdlow et al., 1995; Stroop, 1935; Trenerry et al., 1989), and others reporting a female advantage confined to color naming (Golden, 1978; Stroop, 1935; Jensen & Rohwer, 1966; Strickland et al., 1997; Swerdlow et al., 1995) or word reading (Strickland et al., 1997). In addition, there is some evidence that a female advantage may be limited to samples with ~ 12 years of education (Moering et al., 2004). Spreen and Strauss (1998) suggest that there is a relationship between Stroop performance and intellectual level, although Trenerry and colleagues (1989) indicate that Stroop scores are not strongly related to IQ in brain-damaged participants. A recent examination of the relative contribution of IQ and demographic factors to Comalli Stroop color-interference performance in a large sample of healthy, older participants found that age and FSIQ accounted for 15% and 13%, respectively, of unique test score variance; education and gender did not contribute to test performance (Boone, 1999). Similarly, Ivnik et al. (1996), using Golden's (1978) version of the Stroop, found that age strongly influenced Stroop performance, accounting for 13% of the test's raw score variance on the Word-Reading card, 29% on the Color-Naming card, and 27% on the Interference card. Education accounted for 8% of the performance variance on the Word-Reading card, 3% on the Color-Naming card, and only 2% on the Interference card. Gender accounted for <4% of the performance variance on any card, "suggesting that sex-corrections were not needed" (p. 263). The literature taken as a whole suggests that age and IQ are substantial contributors to Stroop performance. Gender appears to have a more minor relationship to Stroop scores. It is doubtful that educational level has an impact on test performance over and above that accounted for by IQ.
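To give a feel for the size of the effects behind these percentages, the Python sketch below converts the Ivnik et al. (1996) figures quoted above into the correlation magnitudes they would imply. This rests on an assumption not stated in the text: that each percentage is the squared simple correlation between the predictor and the raw score. Under that assumption |r| is the square root of the proportion of variance; the helper name is hypothetical.

```python
import math

# Percent of raw-score variance attributed to each predictor, as quoted above
# from Ivnik et al. (1996) for the Golden version.
variance_pct = {
    "age": {"Word-Reading": 13, "Color-Naming": 29, "Interference": 27},
    "education": {"Word-Reading": 8, "Color-Naming": 3, "Interference": 2},
}


def implied_abs_r(pct: float) -> float:
    """Correlation magnitude implied by a percent of variance, assuming the
    percentage is a squared simple correlation."""
    return math.sqrt(pct / 100)


for predictor, cards in variance_pct.items():
    for card, pct in cards.items():
        print(f"{predictor:9s} {card:13s} {pct:2d}%  |r| ~ {implied_abs_r(pct):.2f}")
```

Read this way, age effects on the Color-Naming and Interference cards correspond to moderate correlations, while the education figures correspond to weak ones, in line with the conclusion that education adds little once age and IQ are considered.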
Regarding the effect of language on Stroop performance, Rosselli et al. (2002b), in studying monolingual Spanish speakers, monolingual English speakers, and bilingual Spanish-English speakers, found that groups
did not significantly differ in Stroop scores with the exception of slower performance in bilinguals relative to English speakers on color naming in English. Testing bilinguals in their second language was associated with a 10%-15% increase in time for color naming and a 5%-10% increase in time for color interference. Bilinguals who were more facile in Spanish were significantly slower on the English Stroop trials, while bilinguals more fluent in English were slower on the Spanish Stroop.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located two Stroop manuals (Golden, 1978; Trenerry et al., 1989), as well as 10 normative reports and 18 clinical studies reporting control data on the Stroop Test. The manuals are for the Golden and Trenerry Stroop versions, while the normative studies are for the Comalli (Demick & Harkins, 1997), Kaplan (Schiltz, personal communication; Strickland et al., 1997), Golden (Ingraham et al., 1988; Ivnik et al., 1996; Moering et al., 2004), Trenerry (Anstey et al., 2000), and Victoria (Regard, 1981, cited in Spreen & Strauss, 1991; Spreen & Strauss, 1998) versions. Among the control data studies, seven provide data for the Comalli version (Boone et al., 1990, 1991, 2001; Boone, 1999; Comalli et al., 1962; Eson, personal communication; Stuss et al., 1985), one presents data for the Kaplan version (D'Elia et al., unpublished data), seven report data for the Golden version (Cohen et al., 2003; Connor et al., 1988; Daigneault et al., 1992; Doan & Swerdlow, 1999; Fisher et al., 1990; Rapport et al., 2001; Rosselli et al., 2002b), and two present data for the Dodrill version (Dodrill, 1978a; Sacks et al., 1991). To adequately evaluate the Stroop Test normative reports, eight key criterion variables were deemed critical. The first six of these relate to subject variables, and the remaining two relate to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
As discussed in other chapters, information regarding medical and psychiatric exclusion criteria is important; it is unclear if geographic region, ethnicity, occupation, handedness, or recruitment procedures are relevant, so until this is determined, it is best that this information be provided. Fluency in English appears to moderate Stroop performance; thus, it is preferable that information on this variable also be reported.
Age Group Interval
Given the consistent evidence of a significant decline in performance with advancing age, Stroop data should be presented in age groupings.
Reporting of Educational Levels
Data on the relationship between education and Stroop performance are equivocal; thus, until this issue is resolved, information regarding the educational level of the normative samples should be included.
Reporting of IQ
Some data suggest that IQ may be nearly as predictive of performance as age, although this finding needs to be replicated in additional studies. For the time being, Stroop normative data probably do not need to be presented by IQ group intervals, but information on the IQs of normative samples should be provided.
Reporting Gender Composition
Data on the relationship between gender and Stroop performance are also equivocal; thus, until this issue is resolved, information regarding the gender composition of normative samples should be provided.
Procedural Variables
Description of Test Stimuli and Administration Format Used
There are several different versions of the Stroop Test, and each version differs regarding administration and scoring procedures. Therefore, normative data sets must indicate the precise test stimuli and administration format. (For the purposes of this chapter, only those studies which met this criterion were reviewed.)
Data Reporting
Means and standard deviations, and preferably ranges, for time in seconds for each card are important. In addition, it is advantageous for data to be provided on number of errors (corrected and uncorrected).
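One reason the mean and SD of each cell matter is that together they allow an individual's time to be placed relative to the normative group. The following is a minimal sketch, assuming the common practice of expressing a score as a z score; the chapter does not prescribe this computation. The `NormCell` class and `time_z_score` function are hypothetical names introduced for illustration, the cell values are the control-group color-naming figures quoted in this chapter for Stuss et al. (1985), and the examinee's time is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormCell:
    """One normative cell: mean and SD of completion time in seconds."""
    mean_seconds: float
    sd_seconds: float


def time_z_score(observed_seconds: float, cell: NormCell) -> float:
    """Standardize an observed time against a normative cell.

    Positive values mean the examinee was slower than the normative mean.
    """
    if cell.sd_seconds <= 0:
        raise ValueError("a positive SD is required to compute a z score")
    return (observed_seconds - cell.mean_seconds) / cell.sd_seconds


# Color-naming control values quoted in this chapter for Stuss et al. (1985).
stuss_color_naming = NormCell(mean_seconds=64.0, sd_seconds=12.9)

# Hypothetical examinee time (an assumption made up for this example).
z = time_z_score(76.9, stuss_color_naming)
print(round(z, 2))
```

Because the scores are times, a positive z indicates slower (poorer) performance. A report that omits SDs cannot support this kind of comparison, which is why their absence is listed as a limitation in the study summaries.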
SUMMARY OF THE STATUS OF THE NORMS
In terms of subject variables, all studies except one (Swerdlow et al., 1995) provide information on age, but 17 do not present data by age groupings (Boone et al., 1991, 2001; Connor et al., 1988; Cohen et al., 2003; Doan & Swerdlow, 1999; Dodrill, 1978a; Fisher et al., 1990; Ingraham et al., 1988; Lopez-Carlos et al., 2003; Rapport et al., 2001; Regard, 1981, cited in Spreen & Strauss, 1998; Rosselli et al., 2002b; Sacks et al., 1991; Schiltz et al., personal communication; Strickland et al., 1997; Stuss et al., 1985; Swerdlow et al., 1995). Data on IQ are reported much less frequently, appearing in only 10 publications (Boone et al., 1990, 1991, 2001; Boone, 1999; Cohen et al., 2003; Lopez et al., 2003; Rapport et al., 2001; Regard, 1981, cited in Spreen & Strauss, 1991; Sacks et al., 1991; Stuss et al., 1985); data are stratified by IQ and age in one report (Boone, 1999). Educational level is indicated in all but five reports (Comalli et al., 1962; Eson, personal communication; Golden, 1978; Regard, 1981, cited in Spreen &
Strauss, 1991; Swerdlow et al., 1995), four publications present data by educational levels (Anstey et al., 2000; Lopez et al., 2003; Miller, 2003; Moering et al., 2004), and two provide corrections for education (Ivnik et al., 1996; Moering et al., 2004). Gender is indicated in all reports but five (Comalli et al., 1962; Eson, personal communication; Golden, 1978; Regard, 1981, cited in Spreen & Strauss, 1991; Spreen & Strauss, 1998), four reports present data for an all-male sample (Cohen et al., 2003; D'Elia, Satz, & Uchiyama, unpublished data; Lopez et al., 2003; Miller, 2003), two reports provide data stratified by gender (Moering et al., 2004; Swerdlow et al., 1995), and one developed gender corrections for raw scores (Moering et al., 2004). Cell sample sizes are generally adequate in 11 studies (Daigneault et al., 1992; D'Elia, Satz, & Uchiyama, unpublished data; Dodrill, 1978a; Ivnik et al., 1996; Lopez et al., 2003; Moering et al., 2004; Rosselli et al., 2002b; Schiltz, personal communication; Miller, 2003; Swerdlow et al., 1995; Treneny et al., 1989). Medical and psychiatric exclusion criteria are judged to be adequate in a majority of studies (Boone et al., 1990, 1991, 2001; Boone, 1999; Daigneault et al., 1992; D'Elia, Satz, & Uchiyama, unpublished data; Dodrill, 1978a; Lopez et al., 2003; Miller, 2003; Rapport et al., 2001; Rosselli et al., 2002b; Schiltz, personal communication; Strickland et al., 1997; Swerdlow et al., 1995; Treneny et al., 1989). 
Geographic recruitment area is indicated in all but seven reports (Eson, personal communication; Golden, 1978; Rosselli et al., 2002b; Spreen & Strauss, 1991; Swerdlow et al., 1995; Trenerry et al., 1989); most of the data were collected in the United States, although some data were collected in Canada (Stuss et al., 1985; Daigneault et al., 1992; and it is assumed that the Regard, 1981, cited in Spreen & Strauss, 1991, samples were also obtained in Canada), Australia (Anstey et al., 2000; Sacks et al., 1991), Israel (Ingraham et al., 1988), and Mexico (Lopez et al., 2003). Ethnicity is reported in 11 studies (Boone et al., 1990, 1991; D'Elia, Satz, & Uchiyama, unpublished data; Doan & Swerdlow, 1999; Dodrill, 1978a; Ivnik et al., 1996; Lopez et al., 2003; Moering et al., 2004; Rosselli et al.,
2002b; Miller, 2003; Strickland et al., 1997), occupational status is indicated in only five publications (Anstey et al., 2000; Daigneault et al., 1992; D'Elia, Satz, & Uchiyama, unpublished data; Dodrill, 1978a; Ivnik et al., 1996), and handedness is described in only four reports (Boone et al., 1991; Doan & Swerdlow, 1999; Ivnik et al., 1996; Regard, 1981, cited in Spreen & Strauss, 1991). Language and/or fluency in English is reported in 10 studies (Boone et al., 1990, 1991, 2001; Boone, 1999; Daigneault et al., 1992; Doan & Swerdlow, 1999; Ingraham et al., 1988; Rosselli et al., 2000; Miller, 2003; Stuss et al., 1985), and recruitment procedures are specified in 16 data sets (Anstey et al., 2000; Boone et al., 1990, 1991; Boone, 1999; Connor et al., 1988; Daigneault et al., 1992; D'Elia, Satz, & Uchiyama, unpublished data; Dodrill, 1978a; Fisher et al., 1990; Ingraham et al., 1988; Ivnik et al., 1996; Lopez et al., 2003; Moering et al., 2004; Rosselli et al., 2000; Schiltz, personal communication; Swerdlow et al., 1995). Regarding procedural variables, test stimuli and procedures are described in all reports. Means are presented in all data sets, although SDs are not reported in three studies (Anstey et al., 2000; Comalli et al., 1962; Golden, 1978). Percentiles corresponding to raw scores, stratified by age and education, are provided by Anstey et al. (2000) and Moering et al. (2004). One study provides data only for the color-naming trial (Stuss et al., 1985), and six studies report data only for the color-interference trial (Boone et al., 1991; Boone, 1999; Cohen et al., 2003; Daigneault et al., 1992; Sacks et al., 1991; Schiltz, personal communication). Two studies provide cutoff scores (Dodrill, 1978a; Trenerry et al., 1989), and one study presents data for the first half and the second half of the color-interference trial separately (Schiltz, personal communication).
Several studies report error scores as well as time scores (Boone et al., 1990; D'Elia, Satz, & Uchiyama, unpublished data; Regard, 1981, cited in Spreen & Strauss, 1991; Spreen & Strauss, 1998), and one presents information on alternate forms (Sacks et al., 1991). The text of study descriptions contains references to the corresponding tables identified
by number in Appendix 6. Table A6.1, the locator table, summarizes information provided in the studies described in this chapter.
SUMMARIES OF THE STUDIES
Published manuals for the Stroop Test are reviewed first, followed by normative studies and control groups from clinical comparison studies presented in ascending chronological order for each version of the test separately. Studies using the Comalli version are presented first, followed by those using the Kaplan, Golden, Dodrill, Victoria, and Trenerry versions.
Manuals
[STROOP.1] Golden, 1978 (Golden Version)
The test stimuli and administration procedures developed by Golden are well specified. Primarily utilizing previously published normative reports, the norms presented in this manual have largely been empirically derived by calculating how many items the participants in other studies would have obtained if the test were discontinued after 45 seconds. In addition to including data from his own studies (sample sizes unknown), Golden utilized normative data provided by Stroop (1935), Jensen (1965), and Comalli et al. (1962) to generate the norms. No information is provided regarding demographic, IQ, or other characteristics of Golden's own normative samples. Using the tables in the manual, all raw scores can be converted to T scores. For participants younger than 17 and older than 45, age corrections need to be applied before the T-score conversion can be made. The manual cautions that the age corrections for adults over age 65 and children under age 17 are considered to be "experimental." The normative data contained in Golden's manual are not reproduced here, and the interested reader is referred directly to this publication for further information.
Study strengths
1. The Stroop cards developed by Golden and test administration procedures are well described in the manual.
2. Mean scores are reported for number of items completed.
3. Some of the data are presented by age group intervals, although the age ranges within each grouping are broad (15-45, 46-64, 65-80) and of questionable clinical utility.
Considerations regarding use of the study
1. A major problem with Golden's use of other published normative data to generate his norms is that the Stroop formats and procedures used by the other investigators differed from Golden's version (e.g., in other formats, participants scanned from left to right, while in his version participants scanned the stimuli vertically from top to bottom; other versions required participants to complete the entire stimulus cards rather than stopping at 45 seconds, used colored rectangles rather than colored Xs, used a wall chart presentation rather than a standard page presented up close to the subject, etc.).
2. No information is provided regarding the number of participants in Golden's sample.
3. No information on exclusion criteria is available for Golden's sample.
4. The gender, educational, and IQ levels of Golden's sample are not reported.
5. No SDs are provided.
[STROOP.2] Trenerry, Crosson, DeBoe, and Leber, 1989 (Trenerry Version)
This manual presents the standardization data for this version of the Stroop Test. The sample consisted of 156 adults ranging in age from 18 to 79, divided into two age groups: 18-49 (n = 106) and 50-79 (n = 50). Exclusion criteria included history of neurological disorder, major psychiatric illness, or physical handicaps which might affect performance. The younger group averaged 30.34 (8.57) years, with 14.68 (2.44) years of education, and contained 43 males and 63 females. The older group averaged 62.68 (7.93) years, with 14.70 (3.24) years of education, and contained 26 males and 24 females.
Test administration procedures are carefully specified. Means and SDs for number of items completed, number of incorrect responses, and color scores are provided for the Color task. Similarly, means and SDs for number of items completed, number of incorrect responses, and color-word scores are reported for the Color-Word task. In addition, optimal cutoff scores are provided for each score for each age grouping, and percentile ranks for raw scores are also reported for each age grouping. This is a proprietary test, and the normative data are available in the test manual. A significant age effect on performance was detected, but no relationship between education or gender and test performance was observed.
Study strengths
1. Cell sample sizes are adequate.
2. Minimally adequate exclusion criteria.
3. Information is provided for each age grouping on mean age, education, and gender distribution.
4. Test stimuli and administration procedures are specified.
5. Means and SDs are reported for each age grouping on a wide range of scores, as well as optimal cutoff scores and percentile ranks.
Considerations regarding use of the study
1. The age group intervals are quite broad.
2. No information regarding IQ.
3. High educational level of the sample.
Normative Studies and Control Groups from Clinical Comparison Studies
Comalli Version
[STROOP.3] Comalli, Wapner, and Werner, 1962 (Comalli Version) (Table A6.2)
These authors provide data (on the Comalli Stroop version) for 235 participants aged 7-80, apparently in Massachusetts, as a part of their study of the effects of aging on the Stroop task. For the purposes of this review, the data on the 63 adult participants will be reported. The adult participants were divided into four age groupings: 17-19 (n = 18), 25-34 (n = 14), 35-44 (n = 16), 65-80 (n = 15). Those 17-19 years old were undergraduate students, those 25-34 or 35-44 years old were drawn from an evening college, and those 65-80 years old were from a community old-age club. Mean time in seconds for each card is charted in a graph for each age group; no SDs are provided.
Study strengths
1. Data are presented in narrow age groupings.
2. Good description of test stimuli and administration procedures.
3. Information on geographic recruitment area is provided.
4. Mean time in seconds is reported.
Considerations regarding use of the study
1. No exclusion criteria.
2. No information regarding gender or IQ, and cursory data on educational level are provided only for those 17-19 years old.
3. Small individual cell sizes.
4. No SDs reported.
[STROOP.4] Eson, Personal Communication (Comalli Version) (Table A6.3)
Eson provides Stroop data on 63 older participants in four age groupings that reflect the following mean ages: 63.2 (n = 15), 67.0 (n = 16), 72.0 (n = 16), and 78.3 (n = 16). The Comalli test stimuli and administration procedures were utilized. Means and SDs are reported.
Study strengths
1. Large overall sample size, and while individual cell sizes are small, they are for very restricted age ranges.
2. Test stimuli and administration procedures are specified.
3. Mean time in seconds and SDs are reported.
Considerations regarding use of the study
1. No information on exclusion criteria, education, gender, IQ, or other characteristics.
[STROOP.5] Stuss, Ely, Hugenholtz, Richard, LaRochelle, Poirier, and Bell, 1985 (Comalli Version) (Table A6.4)
These authors collected Stroop data on 20 control participants (13 male, seven female) in Canada as a part of their investigation of the neuropsychological effects of closed head injury. Participants spoke either English or French. Mean age was 29.2 (12.0), mean years of education was 12.5 (2.0), and mean WAIS IQ was 106.6 (13.4). Participants were paid $15 for their participation. The Comalli test stimuli and administration procedures were employed. The mean and SD for time in seconds to name colors were 64.0 (12.9) for the control group. (Data from the two other trials are not provided.) Performance on Color Naming was significantly depressed in the head injury group relative to controls; groups did not differ in word reading or color interference.
Study strengths
1. Data provided on gender, age, education, IQ, language, and geographic area.
2. Test stimuli and administration procedures are specified.
3. Mean and SD for time to name colors are reported.
Considerations regarding use of the study
1. No information on exclusion criteria.
2. Data were collected in Canada with at least some participants French-speaking; cultural and linguistic factors may limit usefulness for clinical interpretation in the United States.
3. Data from the word-reading and color-interference trials are not reported.
4. Small sample size.
5. Undifferentiated age range.
[STROOP.6] Boone, Miller, Lesser, Hill-Gutierrez, and D'Elia, 1990 (Comalli Version) (Table A6.5)
Data were collected on 61 middle-aged and older individuals ranging in age from 50 to 79 recruited as controls in southern California through newspaper ads, flyers, and personal contacts as part of a study of the effect of aging on executive abilities. Participants had no history of psychotic, major affective, or alcohol and other drug-dependence disorders and spoke English fluently (a handful of participants spoke English as a second language). Participants were excluded if there was a history of neurologic disease, such as stroke, Parkinson's disease, or seizure disorder. Also excluded were individuals with laboratory findings showing serious metabolic abnormalities (e.g., low sodium level, elevated glucose level, or thyroid or liver function abnormalities). Eighteen percent of the original sample of 74 were eventually excluded due to the presence of previously unidentified strokes or other significant lesions documented on MRI (n = 9), metabolic abnormalities or undiagnosed medical illness (n = 2), or evidence from laboratory studies and EEG findings of alcohol abuse and substance intoxication (n = 2). The final sample (n = 61) included 25 men and 36 women grouped by three age decades: 50-59 (n = 25), 60-69 (n = 21), and 70-79 (n = 15). All but 10 participants were white: four were African-American, three were Asian, and three were Hispanic. Mean educational level was 14.34 (2.63) and mean WAIS-R FSIQ was 113.79 (13.51). The Comalli version of the Stroop was administered (i.e., word reading, color naming, and color interference). Mean time in seconds as well as number of errors, with SDs, are presented by age grouping for each card. A significant decline with age was observed for word reading and color naming, with a trend toward a decline with age on color interference.
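The attrition figures in this description can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the variable names and dictionary keys are ours, chosen for illustration.

```python
# Counts taken from the study description above.
original_sample = 74
excluded_by_reason = {
    "strokes or other lesions on MRI": 9,
    "metabolic abnormality or undiagnosed illness": 2,
    "alcohol abuse or substance intoxication": 2,
}

total_excluded = sum(excluded_by_reason.values())
final_n = original_sample - total_excluded
percent_excluded = 100 * total_excluded / original_sample

print(total_excluded, final_n, round(percent_excluded))
```

The counts reproduce the final sample of 61 and the reported exclusion rate of roughly 18%, and the three age-decade cells (25, 21, and 15) also sum to 61.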
Study strengths
1. Overall sample size is large, although individual cell sizes are small.
2. Data are presented by age groupings.
3. Good exclusion criteria.
4. Information regarding gender, educational level, IQ, geographic area, ethnicity, fluency in English, and recruitment procedures is provided.
5. Test stimuli and procedures are specified.
6. Means and SDs are reported for both time and errors.
Considerations regarding use of the study
1. The sample size within each age group interval is small (see 1 above).
2. High educational and IQ levels of the sample.
[STROOP.7] Boone, Ananth, Philpott, Kaur, and Djenderedjian, 1991 (Comalli Version) (Table A6.6)
Stroop data were collected on 16 controls as part of a study on the neuropsychological characteristics of obsessive-compulsive disorder (OCD). Participants were recruited in southern California from newspaper advertisements and from siblings (n = 9) of OCD patients. Medical exclusion criteria included history of alcohol or drug abuse, head injury, seizure disorder, cerebral vascular disease or stroke, current or past psychiatric disorder, or any renal, hepatic, or pulmonary disease. Participants included nine women and seven men, and 19% were left-handed (n = 3). Fourteen were Caucasian, and two were Asian; all were fluent in English. Two participants had a history of learning disability. Mean age was 35.8 (13.7), mean educational level was 15.2 (2.8), and mean WAIS-R FSIQ was 109.1 (10.9). The Comalli version of the Stroop (i.e., word reading, color naming, color interference) was administered. Mean (SD) for time in seconds to complete the color-interference portion of the test was 112.9 (22.5). Controls and patients did not differ in test performance.
Study strengths
1. Good exclusion criteria, with the exception that two participants had histories of learning disability.
2. Information regarding age, education, FSIQ, gender, handedness, fluency in English, ethnicity, recruitment procedures, and geographic area is reported.
3. Information on test stimuli and administration procedures.
4. Mean time in seconds and SD are provided but only for the color-interference card.
Considerations regarding use of the study
1. Small sample size.
2. Undifferentiated age range.
3. High educational level; two participants had a history of learning disability.
4. No data reported for word-reading and color-naming trials.
[STROOP.8] Demick and Harkins, 1997 (Comalli Version) (Tables A6.7-A6.10)
The sample consists of 231 individuals recruited in Massachusetts who participated in a study assessing the relationship between field dependence-independence (FDI) cognitive style and driving behavior. Participants were community-dwelling individuals who in telephone screening denied any history of major impairment in perception, cognition, or motor execution and described themselves as having good overall health; corrected visual problems were allowed. The average educational level of the sample was high school plus some college courses completed. The Comalli cards and Kaplan administration procedures (i.e., color naming, word reading, color interference) were employed. Means, SDs, and ranges for time in seconds, errors, color difficulty factor (total time on B/total time on A), and interference factor (total time on C - total time on B) are provided for four age groupings (20-39, 40-59, 60-74, 75+ years).
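The two derived scores named above are simple functions of the card times. The sketch below implements the formulas exactly as they are stated in the text (B/A and C - B); the function names are hypothetical, the times are invented for illustration, and the text does not spell out here which task corresponds to cards A, B, and C, so the letters are carried over unchanged.

```python
def color_difficulty_factor(time_a: float, time_b: float) -> float:
    """Ratio of total time on card B to total time on card A."""
    if time_a <= 0:
        raise ValueError("time on card A must be positive")
    return time_b / time_a


def interference_factor(time_b: float, time_c: float) -> float:
    """Total time on card C minus total time on card B."""
    return time_c - time_b


# Hypothetical card times in seconds (assumptions, not data from the study).
times = {"A": 40.0, "B": 60.0, "C": 110.0}

print(color_difficulty_factor(times["A"], times["B"]))  # 1.5
print(interference_factor(times["B"], times["C"]))      # 50.0
```

Note that one factor is a ratio and the other a difference, so they are on different scales and should be compared only with the corresponding normative values.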
Study strengths
1. Overall sample size is large, with individual cell sizes exceeding 50.
2. Data are presented by age groupings.
3. Probably adequate exclusion criteria.
4. Information regarding gender, overall educational level, and geographic region is provided.
5. Test stimuli and procedures are indicated.
6. Means, SDs, and ranges are provided.
Consideration regarding use of the study
1. No information regarding intellectual level.
Other comments
1. Theoretical issues concerning the Stroop (e.g., process vs. achievement measures, identification of a cognitive style) are discussed.
[STROOP.9] Boone, 1999 (Comalli Version) (Table A6.11)
The author obtained Stroop data on 155 middle-aged and older individuals (age range 45-84) recruited as described by Boone et al. (1990); data from the 1990 study were included in the 1999 publication. Mean age of the sample was 63.07 (9.29), mean years of education was 14.57 (2.55), and mean WAIS-R FSIQ was 115.41 (14.11). Fifty-three were male and 102 were female. Medical and psychiatric exclusion criteria are the same as in the 1990 publication, with the exception that participants with significant white-matter hyperintensities documented on MRI were retained in the sample. All participants considered themselves healthy, although 51 had some evidence of vascular illness (defined as cardiovascular disease and/or significant white-matter hyperintensities on MRI) based on self-report or evidence on examination of at least one of the following: current or past history of hypertension (n = 39), arrhythmia (n = 8), large area of white-matter hyperintensities on MRI (e.g., 10 cm²; n = 7), coronary artery bypass graft (n = 3), angina (n = 2), and old myocardial infarction (n = 1). Twenty-four participants were currently on cardiac and/or antihypertensive medications. The Comalli version of the Stroop was administered. Means and SDs for time in seconds to complete the color-interference portion of the test are provided. A stepwise regression analysis revealed that age and FSIQ were significant contributors to Stroop color-interference performance, accounting for 15% and 13%, respectively, of test score variance; educational level, gender, and vascular status did not account for a significant amount of unique test score variance. Stroop normative data are presented for color-interference time in seconds stratified by IQ and age (<65 and ≥65; average, high average, and superior IQ).
Study strengths
1. Large overall sample size.
2. Presentation of the data by IQ and age groupings.
3. Comprehensive medical and psychiatric exclusion criteria including MRI brain scans on all participants.
4. Information regarding educational level, gender, geographic area, recruitment procedures, and fluency in English is provided.
5. Mean times in seconds and SDs for color interference are reported.
6. Test administration format is specified.
Considerations regarding use of the study
1. Individual IQ by age groupings range in size from 16 to 37.
2. High IQ and educational level of the sample.
3. No data reported for word-reading and color-naming trials.
[STROOP.10] Boone, Swerdloff, Miller, Geschwind, Razani, Lee, Gaw Gonzalo, Haddal, Rankin, Lu, & Paul, 2001 (Comalli Version) (Table A6.12)
Stroop performance was assessed in 22 male controls as part of a study on neuropsychological function in adult Klinefelter's syndrome. Participants were recruited from newspaper and radio ads and flyers in the southern California area and paid for their participation. Exclusion criteria included history of learning disability, major psychiatric disorder, substance abuse, or neurologic disorder. All participants were fluent in English. Mean age was 34.32 (14.81), mean years of education was 13.36 (2.15), mean WAIS-R (Satz-Mogel) VIQ was 106.46 (17.01), and mean PIQ was 107.46 (16.58). Means and SDs for the color-interference trial are reported.
Study strengths
1. Information regarding age, education, gender, IQ, geographic area, language, and recruitment procedures is provided.
2. Adequate exclusion criteria.
3. Means and SDs are reported.
Considerations regarding use of the study
1. Data are not stratified by age.
2. Small sample size.
3. All-male sample.
Kaplan Version
[STROOP.11] D'Elia, Satz, and Uchiyama, Unpublished Data (Kaplan Version) (Table A6.13)
These data were collected in 1993 and 1994 during the course of a Federal Aviation Administration/Equal Employment Opportunity Commission (FAA/EEOC)-mandated study to examine neuropsychological functioning of airline pilots. The sample consisted of 197 male, Caucasian airline pilots aged 40-59 employed by major airplane manufacturers in the United States. All pilots had recently passed their yearly comprehensive FAA physical examination. Data are presented in two age groupings: 40-49 (n = 118) and 50-59 (n = 79). Those 40-49 years old had an average of 16.1 (1.9) years of education, and those 50-59 years old had 15.6 (2.0) years of education. The Comalli cards and Kaplan administration procedures (i.e., color naming, word reading, color interference) were used to obtain Stroop data on the pilots as part of a 55-minute neuropsychological screening battery. Means and SDs for time in seconds, errors, and near-miss (i.e., self-corrected) errors are provided.
Study strengths
1. Overall sample size is large, with individual cell sizes exceeding 50.
2. Data are presented by age groupings.
3. Adequate exclusion criteria.
4. Information regarding gender, educational level, recruitment procedures, ethnicity, and occupation and some information regarding geographic region is reported.
5. Test stimuli and procedures are indicated.
6. Means and SDs for time and near-miss (i.e., self-corrected) errors are provided.
Considerations regarding use of the study
1. All-male sample.
2. High educational level of the sample.
3. No information regarding IQ.
[STROOP.12] Schiltz, Personal Communication (Kaplan Version) (Table A6.14)
The sample consists of 50 (28 male, 22 female) native English-speaking participants recruited from the University of California at Los Angeles undergraduate introductory psychology courses during 1988-1989. These were healthy normal adults aged 18-20 without a history of head trauma or loss of consciousness. Average years of education for the group was 13.36 (0.63, range 13-15). All students were required to participate in ongoing research as a part of their coursework, and students self-selected to the various studies based on the written descriptions of the studies. The Comalli stimulus cards and Kaplan administration procedures (i.e., color naming, word reading, color interference) were used as part of a larger neuropsychological battery assembled for the purpose of collecting norms. All participants were tested individually. Total battery length was 55 minutes, and the Stroop was administered about 30 minutes into the protocol. Means, SDs, and ranges in seconds are reported for the first half of each stimulus card as well as for each card in total. Performance time on the second 50 items can be calculated by subtracting the time to complete the first half of the card from the total time to complete the whole card.
Study strengths
1. The sample size is adequate for the restricted age interval.
2. Exclusion criteria were minimally adequate.
3. Data on gender composition, educational level, geographic area, and recruitment procedures are reported.
4. Test stimuli and administration procedures are specified.
5. Means for time in seconds and SDs are reported.
6. Data provided for the first half of each card as well as total.
Consideration regarding use of the study
1. No data on IQ.
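The second-half calculation described for the Schiltz data is a single subtraction. A minimal sketch follows; the function name is hypothetical and the times are invented for illustration rather than taken from Table A6.14.

```python
def second_half_seconds(total_seconds: float, first_half_seconds: float) -> float:
    """Time on the second 50 items: whole-card time minus first-half time."""
    if first_half_seconds > total_seconds:
        raise ValueError("first-half time cannot exceed total card time")
    return total_seconds - first_half_seconds


# Hypothetical example: 58.0 s for the whole card, 27.5 s for the first half.
print(second_half_seconds(58.0, 27.5))  # 30.5
```

The same subtraction applies to the reported group means, because the mean of a difference equals the difference of the means. It does not apply to the SDs: the SD of second-half times cannot be recovered by subtracting the first-half SD from the total SD.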
[STROOP.13] Strickland, D'Elia, James, and Stein, 1997 (Kaplan Version) (Table A6.15)
Stroop data were collected in southern California on 42 African-American participants (15 males, 27 females) aged 19-41 with no remarkable history of neurologic, psychiatric, or cardiovascular disease or substance abuse. Mean age for the whole sample was 30.17 (6.34) years; mean age of males was 31.93 (5.26) years, and that of females was 29.19 (6.75) years. Mean educational level for the sample was 14.76 (2.24) years. The Comalli stimulus cards and Kaplan administration procedures (i.e., color naming, word reading, color interference) were employed to obtain Stroop data. Mean times in seconds and SDs are provided for each of the three cards. Errors and near-miss (i.e., self-corrected) errors were tabulated. Women demonstrated significantly better performance than men on cards 1 and 2. There was a similar trend noted on card 3.
Study strengths
1. Adequate exclusion criteria.
2. Information regarding gender, ethnicity, age, educational level, and geographic area is reported.
3. Information regarding test stimuli and test administration procedures is provided.
4. Means and SDs for time in seconds, errors, and near-miss (self-corrected) errors are provided for each card.
Considerations regarding use of the study
1. Small sample size.
2. Undifferentiated age range.
3. High educational level of the sample.
4. No information regarding IQ.
[STROOP.14] Miller, 2003; Personal Communication (Kaplan Version) (Table A6.16)
The investigation used participants from the Multi-Center AIDS Cohort Study (MACS). The data were collected from a sample of seronegative homosexual and bisexual males for the purpose of establishing normative data for neuropsychological test performance based on a large sample. There were 522 participants in the Color Naming, 521 in the Word Reading, and 692 in the Interference conditions. Mean age for the full sample used in the Interference condition was 40.57 (7.5) years, and mean education was 16.31 (2.3) years; 91.2% were Caucasian, 2.9% Hispanic, 5.5% black, 0.4% other. All participants were native English speakers. The three conditions of the Kaplan version of the Stroop were administered according to standard instructions. The data are partitioned by three age groups (25-34, 35-44, 45-59) x three educational levels (<16, 16, >16 years).
Study strengths
1. The overall sample size is large, and most individual cells have more than 50 participants.
2. Normative data are stratified by age x education.
3. Information on age, education, ethnicity, and native language is reported.
4. Means and SDs are reported.
Considerations regarding use of the study
1. All-male sample.
2. No information on IQ is reported.
3. No information on exclusion criteria.
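To make the age x education partition of this dataset concrete, the following minimal Python sketch maps a patient's age and years of education to the corresponding cell label. The function name and the label strings are hypothetical; the band boundaries are those stated in the summary above, and the normative values themselves (Table A6.16) are deliberately not reproduced here.

```python
def miller_norm_cell(age: int, education_years: int) -> tuple:
    """Return (age band, education band) for the 3 x 3 partition described above."""
    if 25 <= age <= 34:
        age_band = "25-34"
    elif 35 <= age <= 44:
        age_band = "35-44"
    elif 45 <= age <= 59:
        age_band = "45-59"
    else:
        # Ages outside 25-59 are not covered by this dataset.
        raise ValueError("age outside the 25-59 range covered by the norms")

    if education_years < 16:
        education_band = "<16"
    elif education_years == 16:
        education_band = "16"
    else:
        education_band = ">16"
    return age_band, education_band


print(miller_norm_cell(40, 16))
```

Raising an error for out-of-range ages, rather than falling back to the nearest band, is a design choice of this sketch; it keeps a clinician from silently applying norms outside the sampled age range.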
Golden Version
[STROOP.15] Ingraham, Chard, Wood, and Mirsky, 1988 (Golden Version) (Table A6.17)
Data were gathered on 46 college students and college-educated adults in Tel Aviv using "the general format of Golden's 1978 version with new randomization, a bold typeface, and Hebrew lettering," which is carefully detailed. The sample consisted of 28 men and 18 women, with an average age of 28.4 (3.2) years and a range of 24-36 years. Exclusion criteria included prior psychiatric disorder, primary language which was not Hebrew, and prior familiarity with the Stroop Test. Means and SDs for number of items completed within 45 seconds for the three stimulus cards, as well as Golden's interference score, are provided. Performance on rapid word reading and color naming was significantly slower (by eight and three words, respectively) than the norms reported by Golden for English-speaking individuals, which the authors suggest may be due to longer speaking times because the Hebrew words were two-syllable. However, scores for the color-interference trial were not significantly different from English-language norms, although the interference score was significantly larger. The authors hypothesize that Hebrew speakers may show less of an interference effect because Hebrew readers are accustomed to reading words without vowels and determining meaning from context, allowing them on the color-interference trial to "read" the words as other than color names, thereby reducing any interference effect. No significant differences between men and women were detected.
Study strengths
1. Relatively large sample size for reasonably restricted age range.
2. Information regarding native language, educational level, gender distribution, and geographic area is reported.
3. Test stimuli and procedures are described.
4. Means and SDs for number of items completed and interference score are reported.
5. Data provided for Hebrew version.
Considerations regarding use of the study
1. Psychiatric exclusion criteria reported but no medical exclusion criteria.
2. No information regarding recruitment procedures or IQ level.
[STROOP.16] Connor, Franzen, and Sharp, 1988 (Golden Version) (Table A6.18)
Stroop data were obtained on 40 college student volunteers in West Virginia (17 male, 23 female) who ranged in age from 18 to 25, with the exception of one 32-year-old. The Golden version of the Stroop was administered with either standard instructions as detailed in the test manual or standard instructions plus six suggestions ("looking at no more than three words at a time; focusing on only one letter in the word; remembering that the same color never occurs twice consecutively; going at an even, steady pace; trying not to become distracted or lose one's place; and not repeating an already-correct answer when correcting a mistake"). Participants were administered the Stroop at baseline (pretest), following five practice sessions (post-test), and at a 1-week follow-up. No effect of gender or instruction format was documented. A significant effect of practice was found between the pre- and post-test but not between the post-test and follow-up. Data are presented in means and SDs for number of items completed for the pretest, post-test, and follow-up sessions.
Study strengths
1. Information on the effects of practice, gender, and alternative instructions on Stroop performance is provided.
2. Information on age, gender, and geographic area, with some information on education and recruitment procedures, is reported.
3. Test stimuli and procedures are specified.
4. Data are presented in means and SDs for number of items completed.
Considerations regarding use of the study
1. Relatively small sample size.
2. Undifferentiated age range, although it is somewhat restricted.
3. No information on exclusion criteria or IQ.
4. Data are not broken down by gender or education.
[STROOP.17] Fisher, Freed, and Corkin, 1990 (Golden Version) (Table A6.19)
The authors collected Stroop data on 36 older controls (typically spouses of patients) from southern California as part of an investigation of Stroop performance in Alzheimer's disease. Mean age was 72.9 (8.3) years, mean educational level was 14.6 (2.7) years, and mean Blessed Dementia Scale score was (6.1). The sample included 13 males and 23 females. Participants had no history (as judged through medical records) of color blindness, cataracts, or glaucoma. The Golden Stroop Test stimuli and administration procedures were employed. Means and SDs are reported for number of items completed on each trial. Some participants (five female, three male) had difficulty discriminating between the colors blue and green on the color trial.
Study strengths
1. Data presented in a homogeneous age grouping.
2. Information is given regarding mean age, mean educational level, gender, mean Blessed Dementia Scale score, geographic area, and recruitment procedures.
3. Information is given on test stimuli and administration procedures.
4. Means and SDs reported for number of items completed.
5. Test administration format was described.
Considerations regarding use of the study
1. Relatively small sample size.
2. No information regarding IQ.
3. High educational level of the sample.
4. Unclear exclusion criteria.
[STROOP.18] Daigneault, Braun, and Whitaker, 1992 (Golden Version) (Table A6.20)
Stroop data were obtained on 128 French-speaking participants in Canada as part of a study investigating the effects of aging on prefrontal lobe skills. Participants were recruited through ads, trade union collaboration, and the help of a large sports center. Exclusion criteria included consumption of more than 24 beers, five bottles of wine, or 15 ounces of spirits per week; consumption of cocaine, LSD, or psychostimulants; any neurological or psychiatric consultation, psychoactive medication, head trauma with hospitalization, or major surgery (e.g., cardiac). Participants were divided into two age groupings: 20-35, with a mean of 27.71 (4.05) years (n = 70), and 45-65, with a mean of 56.62 (5.29) years (n = 58). The younger group contained 38 men and 32 women; they were primarily specialized blue-collar workers, although some specialized white-collar and unskilled blue-collar professions were represented. The older group contained 30 men and 28 women, and slightly more than half were specialized blue-collar workers, with some unskilled blue-collar professions, specialized white-collar occupations, and professional occupations represented. The mean educational level of the younger group was 12.36 (2.09) years, and that of the older group was 12.11 (3.63) years. Mean number of items completed and SDs for the color-interference portion of the test are reported. The two age groups differed significantly in test performance, with the younger group outperforming the older group.
Study strengths
1. Good exclusion criteria.
2. Large overall sample size, and each of the two age groupings has more than 50 participants.
3. Information regarding educational level, gender, occupations, geographic area, and recruitment procedures is provided.
4. Information on test stimuli and administration procedures is reported.
5. Means and SDs for number of items completed on part C are provided.
Considerations regarding use of the study
1. Data were obtained on French-speaking participants in Canada; thus, it is unclear whether these data are appropriate for clinical interpretation on English-speaking individuals in the United States.
2. No information regarding IQ (although mean scores on the vocabulary subtest of the French-language WAIS analog are reported).
3. No data provided for the first two sections of the Stroop Test.
4. Test administration format may have been altered (i.e., participants appeared to scan the stimulus cards across rows from left to right).
[STROOP.19] Swerdlow, Filion, Geyer, and Braff, 1995 (Golden Version) (Table A6.21)
The authors collected Stroop data on 72 "normal" controls (34 males, 38 females) recruited through newspaper ads and posted advertisements. No subject had a history of psychiatric illness, substance abuse or dependence, recreational drug use in the month prior to testing, schizophrenia in a first-degree relative, sustained loss of consciousness, severe neurologic or medical illness, or psychotropic medication use. Three participants were excluded based on a urinalysis positive for illicit drug use. Participants were divided into "psychosis-prone" (n = 26) and "non-psychosis-prone" (n = 46) groups based on MMPI criteria (Goldberg index ~60 and F ~70 or Wiggins Psychoticism index ~60). Psychosis-prone individuals scored significantly below non-psychosis-prone participants on the interference trial and "interference ratio." Women performed significantly better than men on color naming. Means and SDs are reported for word reading, color naming, interference, and the interference ratio for the psychosis-prone and non-psychosis-prone groups (and for subgroups based on MMPI scales that determine psychosis-proneness) and for men and women separately.
Study strengths
1. Excellent exclusion criteria.
2. Information provided regarding gender and recruitment strategies.
3. Large overall sample size.
Considerations regarding use of the study
1. No information regarding age, education, IQ, ethnicity, or geographic area.
[STROOP.20] Ivnik, Malec, Smith, Tangalos, and Petersen, 1996 (Golden Version) (Table A6.22)
This study presents normative data for performance on the Golden (1978) version of the Stroop Test obtained on 356 individuals between the ages of 56 and 94, who participated in the ongoing Mayo Older Americans Normative Studies (MOANS), a project to develop normative data for elderly individuals on various neuropsychological tests. The data are derived from a population of "almost exclusively Caucasian older adults who live in
an economically stable region of the United States" (the area surrounding Rochester, MN). All participants were community-dwelling, had no active neurologic or psychiatric disorder, and had undergone recent physical exams. Data are reported in discrete age ranges. Age categorization used the midpoint interval technique. The raw score distribution at each midpoint age was "normalized" by assigning standard scores with a mean of 10 and SD of 3, based on actual percentile ranks. The authors provided tables of age-corrected norms for each age group. The procedure for clinical application of these data is described in the original article (Ivnik et al., 1996) as follows:
first select the table that corresponds to that person's age. Enter the table with the test's raw score; do not use "corrected" or "final" scores for tests that might present their own age- or education-adjustments. Select the appropriate column in the table for that test. The corresponding row in the leftmost column in each table provides the MOANS Age-Corrected Scaled Score . . . for your subject's raw score; the corresponding row in the rightmost column indicates the percentile range for that same score.
Mean and SD scores for performance by age are not reported; however, the raw performance score can be easily translated to percentile performance scores (and standard scores) using the data tables. MOANS scaled scores by age and education level (A&E-MSS) have to be empirically derived using the following equation:
$$\text{A\&E-MSS}_{\text{Stroop}} = K + (W_1 \times \text{A-MSS}_{\text{Stroop}}) - (W_2 \times \text{Education})$$
where $K$ is a constant for each test, $W_1$ is a weight to be applied to the age-corrected MOANS scaled score, and $W_2$ is a weight to be applied to the person's education. For the Stroop Test, the values are as follows:
Condition       K      W1     W2
Word            3.47   1.10   0.34
Color           1.88   1.10   0.23
Interference    1.38   1.09   0.19
Analyses did not suggest that a performance correction was necessary for gender.
Study strengths
1. Minimally adequate exclusion criteria.
2. Information regarding age, education, gender, handedness, ethnicity, recruitment procedures, and geographic area is reported.
3. The data are stratified by age group based on the midpoint interval technique.
4. Sample sizes for the five age groupings spanning ~79 exceed 50.
5. The test version and scoring procedures are specified.
Considerations regarding use of the study
1. The measures proposed by the authors are quite complicated and might be difficult to use in clinical practice.
2. No information on IQ.
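As a worked illustration of the age- and education-corrected score described for this study, the sketch below applies the equation with the K, W1, and W2 values listed for the Word, Color, and Interference conditions. The function and dictionary names are hypothetical, the example inputs are invented, and the sketch assumes that no rounding or truncation is applied to the result; the original article should be consulted for the exact procedure.

```python
# (K, W1, W2) per Stroop condition, as read from the values listed above.
STROOP_WEIGHTS = {
    "word": (3.47, 1.10, 0.34),
    "color": (1.88, 1.10, 0.23),
    "interference": (1.38, 1.09, 0.19),
}


def age_education_mss(condition: str, age_mss: float, education_years: float) -> float:
    """A&E-MSS = K + (W1 * A-MSS) - (W2 * Education); no rounding (assumption)."""
    k, w1, w2 = STROOP_WEIGHTS[condition]
    return k + w1 * age_mss - w2 * education_years


# Invented example: age-corrected scaled score of 10 with 12 years of education.
for name in STROOP_WEIGHTS:
    print(name, round(age_education_mss(name, 10, 12), 2))
```

For the invented Interference example, 1.38 + (1.09 x 10) - (0.19 x 12) works out to 10.00, that is, an average age-corrected score stays average for a person with 12 years of education.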
[STROOP.21] Doan and Swerdlow, 1999 (Golden Version) (Table A6.23)
The Golden Stroop version translated into Vietnamese was administered to 30 native Vietnamese speakers, while the standard Golden version was given to 30 native English speakers. All participants resided in the San Diego area. The average age of the 13 males and 17 females in the Vietnamese group was 34.4 (13.1) years, with a range of 19-68, and the average educational level was 14.3 (3.5) years; 47% were students, and the sample averaged 16 (7) years residence in the United States. The 12 male and 18 female native English speakers averaged 31.2 (11.9) years of age, with a range of 19-57, and average educational level was 15.4 (1.6); 30% were students. All but one subject were right-handed (one Vietnamese speaker was ambidextrous). The colors employed for the Vietnamese translation were blue, brown, and red because the Vietnamese word for green is used for both blue and green. Means and SDs for number of responses for the word reading, color naming, and interference sections and the Golden interference score are provided. No significant differences in performance were found between the Vietnamese speakers and English speakers or between monolingual Vietnamese speakers and bilingual speakers.
Study strengths
1. Data provided for Vietnamese test version.
2. Information regarding geographic area, language, ethnicity (although incomplete), age, education, handedness, and gender is provided.
3. Test stimuli and procedures are described.
4. Means and SDs for number of items completed and interference score are reported.
Considerations regarding use of the study
1. No exclusion criteria are reported.
2. Data are not stratified by age.
3. Small sample sizes.
4. No information regarding IQ or recruitment procedures.
[STROOP.22] Rapport, Van Voorhis, Tzelepis, and Friedman, 2001 (Golden Version) (Table A6.24)
Stroop data were collected on 32 controls (19 males, 13 females) who were either undergraduates at a large midwestern university or residents in the neighboring metropolitan area as part of a study of executive function in adults with ADHD. Exclusion criteria included history of significant neurologic disorder (head injury, stroke, seizure disorder), current substance abuse, or scores greater than 1 SD higher than mean values on ADHD behavior rating scales. Mean age was 33.2 (13.2), mean years of education was 14.8 (2.5), and mean WAIS-R FSIQ was 108.0 (7.7). Means and SDs for the word trial and color trial are reported.
Study strengths
1. Generally adequate exclusion criteria.
2. Information regarding age, education, gender, IQ, and geographic location is reported.
3. Test stimuli and procedures are described.
Limitations regarding use of the study
1. Small sample size.
2. Data are not stratified by age.
3. No specific information regarding recruitment strategies.
4. High educational level.
[STROOP.23] Rosselli, Ardila, Santisi, Arecco, Salvatierra, Conde, and Lenis, 2002 (Golden Version) (Table A6.25)
Stroop Test data were obtained on 40 English monolinguals, 71 Spanish-English bilinguals (90% with Spanish as the first language), and 11 Spanish monolinguals in south Florida who were primarily college students, their family members, and friends. All were right-handed. The average age of the 32 male and 39 female bilinguals was 31.98 (13.14), and average years of education was 14.92 (2.35). The 13 male and 27 female English monolinguals averaged 35.90 (13.08) years of age and 15.35 (2.45) years of education. The three male and eight female Spanish monolinguals averaged 40.91 (15.17) years of age and 14.25 (3.49) years of education. None had any psychiatric or neurological conditions, and all had normal MMSE scores. All bilinguals had had at least some formal education in English and averaged 19 years speaking the second language. A Spanish version of the Stroop Test (Rey, unpublished) was administered to the monolingual Spanish speakers and to the bilinguals, who also completed the English version (order of administration of the two versions was randomized). Number of errors and time in seconds to complete all stimuli (instead of the number of items completed in 45 seconds) were collected. Means and SDs for the three trials are reported for the three groups as well as for three subgroups of bilinguals (unbalanced-English-dominant, unbalanced-Spanish-dominant, and balanced). Groups did not significantly differ in Stroop scores, with the exception of slower performance in the bilinguals relative to English speakers on color naming in English. Testing bilinguals in their second language was associated with a 10%-15% increase in time for color naming and a 5%-10% increase in time for color interference. Bilinguals who were more facile in Spanish were significantly slower on the English Stroop trials, while bilinguals more fluent in English were slower on the Spanish Stroop.
Study strengths
1. Data on Spanish-language and English-language Stroop performance in a large sample of bilingual participants (n = 71) as well as a smaller group of monolingual Spanish speakers are provided.
2. Adequate exclusion criteria.
3. Information provided regarding age, education, gender, and handedness, as well as comprehensive information on language characteristics.
Considerations regarding use of the study
1. The administration format was altered (time to finish the stimuli rather than number of responses at 45 seconds).
2. No information regarding IQ.
3. Data are not stratified by age.
4. High educational level.
[STROOP.24] Lopez-Carlos, Salazar, Villasenor Saucedo, and Peña, 2003 (Golden Version) (Tables A6.26-A6.29)
The Golden version of the Stroop was used in a study investigating the effects of demographic variables on cognitive abilities in Spanish-speaking individuals with low education. The total sample included 115 volunteer monolingual Latino men with ≤10 years of formal education, who worked at manual labor in the Los Angeles area (n = 65) and Jalisco, Mexico (n = 50). Volunteers were recruited from posted advertisements in workplaces and personal solicitations. The mean age for the sample was 28.23 (8.74) years and mean education was 6.66 (2.54) years. Exclusion criteria consisted of any self-report of head injury, neurological insults, prenatal or birth complications, learning disabilities, psychiatric problems, or substance abuse. Scores on the Beck Depression Inventory-II-Spanish Version (Mean = 12.92, SD = 8.94) and the Beck Anxiety Inventory-Spanish Version (Mean = 6.60, SD = 6.03) are also reported. Standard administration procedures were used. Participants were tested in Spanish. Selected subtests from the WAIS-III (Mexican version) were included in the battery. WAIS-III Block Design raw scores are included in Tables A6.26-A6.29. Mean performance on the Marin Marin Acculturation Scale for the Los Angeles sample was 17.61 (6.19). For the Los Angeles group, Picture Vocabulary subscale scores from the Woodcock-Johnson III Tests of Achievement (Mean = 5.36, SD = 6.01) and the Bateria Woodcock-Muñoz-R, Pruebas de habilidad cognitiva-R (Mean = 29.77, SD = 5.37) were used to assess level of English and Spanish word expressive abilities. The results are presented by years of education (0-6, 7-10), age (18-29, 30-49 years), and education and age (18-29 years old, 0-6 and 7-10 years of education; 30-49 years old, 0-6 and 7-10 years of education). The authors found a significant difference (p < 0.05) in performance on the Stroop (color and color/word interference) between the two education groups. However, the two age groups did not differ significantly on any of the sections of the Stroop. No significant differences in scores between individuals from Los Angeles and Mexico were noted.
Study strengths
1. Large sample for age and education groups.
2. Data availability for a healthy, employable, monolingual Spanish-speaking group with low education level.
3. The sample is stratified into two education groups, two age groups, and four age x education groups. Additionally, data are available for United States and Mexico.
4. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
5. Adequate exclusion criteria.
6. Means and SDs are reported.
7. WAIS-III Block Design subtest scores are presented.
Considerations regarding use of the study
1. All-male sample.
2. Small sample sizes for the combined age and education groups.
[STROOP.25] Cohen, Brumm, Zawacki, Paul, Sweet, and Rosenbaum, 2003 (Golden Version) (Table A6.30)
Twenty males who averaged 30.5 (10.7) years of age and 11.8 (3.3) years of education were used as controls in a study of cognitive function in domestic violence perpetrators at the University of Massachusetts Medical Center; most of the controls were hospital workers. Mean WAIS-R FSIQ was 100.7 (11.0), mean VIQ was 101.8 (10.6), and mean PIQ was 100.2 (11.2); 16.5% admitted to at least some past drug use, and 9.3% admitted to prior alcohol-related problems. Three reported previous head injury (two mild, one moderate), 10% had experienced learning difficulties, and 12.5% admitted to childhood behavioral problems. Means and SDs for the interference trial are provided.
Study strengths
1. Information provided regarding gender, education, age, IQ, previous learning difficulties, childhood behavioral problems, head injury, substance use/abuse, and geographic area.
2. Means and SDs reported for interference trial.
3. Data for a sample of average IQ and education.
Considerations regarding use of the study
1. Small sample size.
2. No apparent exclusion criteria; individuals with a history of substance use/abuse, learning and/or childhood behavioral problems, and head injury included.
3. Only males included.
4. Data provided only for interference trial.
5. Data are not stratified by demographic factors.
[STROOP.26] Moering, Schinka, Mortimer, and Graves, 2004 (Golden Version) (Table A6.31)
Stroop data were collected on 236 older African Americans, aged 60-84, living in private residences in the Tampa, Florida, area.
Participants were recruited through the use of epidemiological sampling procedures. Exclusion criteria included endarterectomy, transient ischemic attack, cerebrovascular accident, Parkinson's disease, or traumatic head injury with loss of consciousness and retrograde amnesia; no information on psychiatric disorders was collected. Fifteen individuals were excluded on the basis of the above factors, as well as one outlier whose Stroop data were 3 SDs from the mean of the sample and clearly separated from the remainder of the scores. The sample was divided into two age groupings: 60-71 (n = 111) and 72-84 (n = 125). Younger participants scored significantly better than older participants on all trials, with an effect size of >0.25. In addition, education and gender were significant predictors of performance, with higher levels of education and female gender associated with better performance. The majority of the sample (72.5%) had <12 years of education. Data are stratified by age, gender, and education (<12, 12, >12 years), for a total of 12 separate subgroups with sizes ranging from 2 to 56. Means and SDs are reported. In addition, adjustments for education and gender to be applied to raw scores are provided, as well as data on percentile scores for raw scores for each age group.
Study strengths
1. Large sample sizes for the two age groupings.
2. Data are stratified by age, education, and gender, although individual cell sizes ranged from 2 to 56.
3. Good exclusion criteria for neurological conditions, although psychiatric conditions or chronic medical illnesses (e.g., hypertension) were not used as exclusion criteria and could at least partially explain the poorer performance observed in this sample relative to Caucasian individuals.
4. Data provided for an African-American population; however, most had a low level of education (although this was apparently representative of the communities in which they lived).
5. Information on geographic area and recruitment strategies is provided.
6. Means and SDs are reported, as well as percentile equivalents of raw scores and score adjustments for education and gender.
Considerations regarding use of the study
1. Issues regarding exclusion criteria, lowered educational level, and small individual cell sizes.
2. No data available on IQ level.
Dodrill Version
[STROOP.27] Dodrill, 1978a (Dodrill Version) (Table A6.32)
Dodrill collected control data on 50 participants in the state of Washington as a part of his investigation of the cognitive correlates of epilepsy. Thirty were male and 20 were female, and mean age and education level were 27.34 (8.41) years and 11.96 (2.01) years, respectively. Forty-nine were Caucasian, with one listed as non-Caucasian. Nine were students, six were housewives, 20 were unemployed, and 15 were employed. Participants were recruited through employment facilities, churches, a community college, a public high school, a volunteer service agency, and a semisheltered workshop. Participants underwent a detailed neurological history, and those with diseases or other conditions affecting the nervous system were excluded. The Dodrill version of the Stroop was administered. Means and SDs are reported for time in seconds to complete parts I and II. In addition, means and SDs are provided for part I + part II and part II - part I. Using a cutoff of 93/94 seconds on part I, 7p% of controls were correctly classified. A cutoff of 150/151 seconds for part II - part I resulted in a 74% correct classification rate.
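The two cutoffs reported for this dataset can be applied mechanically, as in the minimal Python sketch below. It rests on a stated assumption about the "93/94" notation: times are taken to be recorded in whole seconds, with the lower value (e.g., 93) the slowest time still in the normal range and the higher value (e.g., 94) the fastest time in the impaired range. The function names and range labels are hypothetical and are not Dodrill's own terms.

```python
# Assumption: "93/94" means <= 93 s is within the normal range, >= 94 s is not.
PART_I_IMPAIRED_FROM = 94          # seconds to complete part I
DIFFERENCE_IMPAIRED_FROM = 151     # seconds, part II minus part I


def classify_part_i(part_i_seconds: float) -> str:
    """Classify the part I completion time against the 93/94 cutoff."""
    return "impaired range" if part_i_seconds >= PART_I_IMPAIRED_FROM else "normal range"


def classify_difference(part_i_seconds: float, part_ii_seconds: float) -> str:
    """Classify the part II - part I difference against the 150/151 cutoff."""
    difference = part_ii_seconds - part_i_seconds
    return "impaired range" if difference >= DIFFERENCE_IMPAIRED_FROM else "normal range"


# Invented example times, for illustration only.
print(classify_part_i(88))
print(classify_difference(88, 230))
```

Because correct classification rates for controls were well below 100% at both cutoffs, such a label is at most a screening flag and not a diagnostic conclusion.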
Study strengths
1. Adequate sample size (n = 50).
2. Information on age, education, gender, occupation, geographic area, ethnicity, and recruitment procedures is provided.
3. Test stimuli and procedures are specified.
4. Mean time in seconds and SDs are reported.
Considerations regarding use of the study
1. No information on IQ.
2. Apparently adequate exclusion criteria, although some controls were recruited from sheltered workshops.
3. Undifferentiated age range.
[STROOP.28] Sacks, Clark, Pols, and Geffen, 1991 (Dodrill Version) (Table A6.33)
Stroop data were obtained on 12 male university student volunteers in Australia, ranging in age from 18 to 32 with a mean of 22.4 (5) years, as a part of the development of five alternate forms of the Dodrill Stroop. All participants had normal vision (20:20, as tested with a standard Snellen wall chart) and no evidence of color blindness (assessed through Ishihara charts). Participants averaged 13.7 (2.3) years of education. Mean abbreviated WAIS-R FSIQ, VIQ, and PIQ were 109.1 (9.5), with a range of 100-124; 108.4 (8.7), with a range of 100-124; and 106.6 (7.1), with a range of 97-120, respectively. The exact procedures used to develop the alternate forms are specified. All participants were administered all six forms of the test in 1 day with a 50-minute rest period between trials on each form. Order of completion of the six forms was randomized. Participants were halted at each error and instructed to correct the mistake before proceeding. Means and SDs for time in seconds are reported for each form. The forms were judged to be equivalent, although a significant practice effect was still present between the first and second test administrations. Sets of the six alternate forms are available from the test authors.
Study strengths
1. Data provided on six alternate forms and practice effects.
2. Information reported on education, gender, IQ, vision, age, and geographic area.
3. Test stimuli development and administration procedures are carefully described.
4. Means and SDs for time in seconds are reported for each form.
Considerations regarding use of the study
1. Small sample size (n = 12).
2. All-male sample.
3. Data were collected in Australia; cultural differences may render the data questionable for clinical interpretation in the United States.
4. No exclusion criteria.
Victoria Version
[STROOP.29] Regard, 1981, cited in Spreen and Strauss, 1991, 1998 (Victoria Version) (Table A6.34)
Data were obtained on 40 right-handed young adults of average intelligence. Average age was 26.7 (range 20-35). The Victoria Stroop Test stimuli and procedures were employed. Means and SDs are reported for time and errors.
Study strengths
1. Homogeneous age grouping.
2. Information regarding age, IQ, and handedness is provided.
3. Test stimuli and procedures are described.
4. Means and SDs for time and errors are reported.
Considerations regarding use of the study
1. Fairly small sample size.
2. No information regarding educational level, gender, fluency in English, geographic recruitment area (assumed to be Canada), or exclusion criteria.
[STROOP.30] Spreen and Strauss, 1991 (Victoria Version) (Table A6.35)
These authors collected Stroop normative data on 86 healthy older participants aged 50-94; average age was 68.5 (10.78) years. Mean years of education was 13.2 (3.1) years. The Victoria Stroop Test stimuli and administration procedures were used. Means and SDs are reported for time and errors for four age groupings: 50-59 (n = 19), 60-69 (n = 28), 70-79 (n = 24), and 80-94 (n = 15).
Study strengths
1. Data are presented by narrow age groupings.
2. Information is provided regarding mean age and mean educational level.
3. Test stimuli and procedures are well described.
4. Means and SDs are reported for time and errors.
Considerations regarding use of the study
1. Unclear exclusion criteria (participants are described as "healthy").
2. No information regarding IQ, gender, fluency in English, and geographic recruitment area (assumed to be Canada).
3. Small cell sample sizes.
Trenerry Version
[STROOP.31] Anstey, Matters, Brown, and Lord, 2000 (Trenerry Version) (Table A6.36)
Stroop data were obtained on 369 retired individuals residing in Anglican retirement villages in Australia and involved in a randomized controlled trial of exercise on falls risk and psychological well-being. There were 52 males and 317 females, ranging in age from 62 to 95, with a mean of 79.04 (6.59) years; average years of education was 11.25 (2.79). Exclusion criteria included Parkinson's disease, stroke, or heart attack. Sixty-six percent rated their health as good or very good, 18% rated their health as excellent, and 16% rated their health as fair or poor. The most common health problems were arthritis (65%), cataract (53%), hypertension (50%), glaucoma or poor vision (38%), lung problems (19%), and diabetes (7%). Seventeen percent of the sample had MMSE scores <24 (lowest score 17). Percentile ranks for raw scores for the two Stroop trials, stratified by four age groupings (62-69, 70-79, 80-89, and 90-95) and three education groupings (0-9, 10-12, 13+), are provided, as well as the means and SDs for the sample as a whole. Age and education, but not gender, were significantly related to Stroop scores. Performance on the Stroop Test in individuals older than 90 was particularly poor, and many could not complete the task.
TESTS OF ATTENTION AND CONCENTRATION
Study strengths 1. Large overall sample size, although the sizes of individual cells are not reported. 2. Stratification of data by age and education. 3. Information regarding age, education, gender, health status, residential setting, and occupational status is provided. 4. Test stimuli and procedures are reported. 5. Percentiles for raw scores are provided for 12 subgroupings, as well as overall means and SDs for the sample as a whole.
Considerations regarding use of the study 1. Questionable adequacy of exclusion criteria (17% had MMSE<24). 2. Participants age 62 or older are included. 3. No information regarding IQ. 4. Data obtained in Australia, which may limit applicability in the United States.
RESULTS OF THE META-ANALYSES OF THE STROOP TEST DATA (GOLDEN VERSION, INTERFERENCE CONDITION) (See Appendix 6m)
Data collected from the studies reviewed in this chapter were examined. Only the data for the Golden version had a sufficient aggregate sample to be included in the analyses. Data were combined in regression analyses in order to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult age groups are provided in Table A6m.1. After initial data editing for consistency and for outlying scores, six studies, which generated 10 data points based on a total of 490 participants, were included in the analyses for the Interference condition. Data for the Word Reading and Color Naming conditions included only eight data points collected from five studies. Due to scarcity of the data for these conditions, they were not analyzed. A linear regression of the Stroop scores on age yielded an R² of 0.791, indicating that 79% of the variance in scores is accounted for by the model. Based on this model, we estimated test scores for age intervals between 25 and 74 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. Linear regression of SDs for the Stroop scores on age suggests that age does not account for a significant amount of variability in SDs (R² = 0.015). Though some increase in variability with advancing age is expected, this trend was not present in the collected data. Therefore, we suggest that the mean standard deviations for the aggregate sample be used across all age groups. Means and SDs for the Word Reading, Color Naming, and Interference conditions for four studies (seven data points) that report data for all three conditions are summarized in Table A6m.2. Examination of the effect of education on Stroop scores indicated that education did not contribute to the test scores beyond its association with age in the data available for analyses. Effects of IQ and gender on the test scores were not examined as data were not available for these analyses.
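The logic of the regression approach described above can be sketched in a few lines of Python. This is an illustration only: the function names (fit_line, predicted_score, z_score) are hypothetical, the fit is a simple unweighted least-squares line (the actual procedures are those described in Chapter 3), and the intercept, slope, and SD shown are placeholder values, not the coefficients reported in Table A6m.1.

```python
def fit_line(ages, scores):
    """Unweighted least-squares fit of scores on age.

    Returns (intercept, slope, r_squared). Assumption: one data point
    per study subgroup, with no weighting by sample size.
    """
    n = len(ages)
    mean_age = sum(ages) / n
    mean_score = sum(scores) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = mean_score - slope * mean_age
    ss_res = sum((s - (intercept + slope * a)) ** 2 for a, s in zip(ages, scores))
    ss_tot = sum((s - mean_score) ** 2 for s in scores)
    return intercept, slope, 1.0 - ss_res / ss_tot


def predicted_score(age, intercept, slope):
    """Predicted mean score at a given age from the linear model."""
    return intercept + slope * age


def z_score(observed, age, intercept, slope, aggregate_sd):
    """Distance of an observed score from the age-predicted mean,
    in units of the aggregate SD used across all age groups."""
    return (observed - predicted_score(age, intercept, slope)) / aggregate_sd


# Placeholder coefficients for illustration only (NOT from Table A6m.1).
HYPOTHETICAL_INTERCEPT = 50.0
HYPOTHETICAL_SLOPE = -0.25
HYPOTHETICAL_SD = 10.0

if __name__ == "__main__":
    print(predicted_score(40, HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_SLOPE))
    print(z_score(35.0, 40, HYPOTHETICAL_INTERCEPT, HYPOTHETICAL_SLOPE, HYPOTHETICAL_SD))
```

A single aggregate SD is passed to z_score because the analysis found no reliable relationship between age and the SDs; the same function can be applied outside the 25-74 age range only with the caution noted in the text.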
Strengths of the analyses 1. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity.
Limitations of the analyses 1. Although an R² of 0.791 is acceptable, it indicates that only 79% of the variance in Stroop scores is accounted for by the model. 2. The number of studies available for the analyses is small. 3. It should be pointed out that the data points available for the analyses are distributed unevenly throughout the age continuum. The data points aggregate at the younger and older extremes, with a notable lack of data in the middle part of the age continuum (see scatterplot). 4. Review of the literature indicates that effects of education, intellectual level, and gender on Stroop performance are equivocal. Though we did not find an effect of education on test performance in the data available for review, we were unable to examine the effects of other demographic variables due to a lack of data.
CONCLUSIONS The Stroop has a lengthy history as an experimental measure in psychological studies and more recently has been adapted for clinical neuropsychological use. However, the plethora of test stimuli and administration formats has been confusing. Compilation of the data sets suggests that the Golden version has the largest sample (n = 1,263), followed by the Kaplan (n = 981), Comalli (n = 627), Trenerry (n = 525), Victoria (n = 126), and Dodrill (n = 62) versions. However, within the data sets there are problems regarding representation of age groups: 1. There does not appear to be any data on the Victoria version for participants aged 36-49.
2. There does not appear to be any data on the Dodrill version for participants older than age 40.
3. There is a paucity of data on the Comalli version for participants less than 45 years of age. Examination of the means across the Kaplan and Comalli studies suggests that there is no difference in performance of controls on the two administration versions, raising the possibility that these two normative data sets can be used interchangeably. This would increase the total Kaplan/Comalli sample size to 1,608 and remedy the lack of data on younger participants in the Comalli version. Future research is needed to determine whether the Kaplan administration format elicits a more pronounced color-interference effect in clinical groups; it does not appear to have this effect in normals, as observed here. One difficulty in interpreting Stroop scores has been how to parcel out the effect of slowed information processing, as reflected in lowered scores on the first sections of the Stroop, from color-interference performance to obtain a more "pure" measure of executive dysfunction. Some authors have used difference scores (Demick & Harkins, 1997; Dodrill, 1978a; Jensen, 1965), although this approach has been questioned (Koss et al., 1984; Trenerry et al., 1989). Koss and colleagues (1984) recommend an analysis of covariance model, and Jensen and Rohwer (1966) report 16 different methods for relating individual Stroop scores. Further research is needed to determine if there is a more effective approach to Stroop interpretation than the typical independent analysis of individual Stroop scores.
7 Auditory Consonant Trigrams
BRIEF HISTORY OF THE TEST
The Auditory Consonant Trigram Test (ACT), also referred to as the Brown-Peterson Consonant Trigram Memory Task or CCC (Stuss, 1987), which is a variant of the Peterson and Peterson technique (Milner, 1970, 1972; Samuels et al., 1972), was originally developed as an experimental verbal memory procedure (Brown, 1958; Peterson & Peterson, 1959). ACT is sensitive to both deficits in short-term auditory verbal memory (the ability to recall rote verbal information over a distractor; Milner, 1970, 1972; Samuels et al., 1972) and divided attention/working memory (amount of information which can be processed simultaneously; Fleming et al., 1995; Marie et al., 1995; Stuss et al., 1985, 1987). Factor analysis has suggested that ACT loads primarily with attentional (e.g., Digit Span) and verbal IQ measures, and not with other executive tasks, such as the Wisconsin Card Sorting Test, Stroop, and FAS (Boone et al., 1998). Several studies have indicated that individuals with memory impairment associated with left temporal lobe damage perform poorly on this task (Giovagnoli & Avanzini, 1996; Milner, 1970, 1972; Samuels et al., 1972), although there is some evidence that right temporal lobectomy patients may also show lowered scores (Samuels et al., 1980). In the absence of a significant verbal memory deficit, poor performance on this measure may also be found in frontal system dysfunction secondary to closed head injury (Stuss et al., 1985), discrete frontal lobe lesions (Stuss et al., 1982), white-matter hyperintensities (Boone et al., 1992), Korsakoff's syndrome (Cermak & Butters, 1972; Leng & Parkin, 1989), presence of or risk for schizophrenia (Fleming et al., 1995; Rutschmann et al., 1980), Parkinson's disease (Marie et al., 1995), inattentive-type attention-deficit hyperactivity disorder (Gansler et al., 1998), myotonic dystrophy (Palmer et al., 1994), Klinefelter's syndrome (Boone et al., 2001), and adults exposed to alcohol in utero (Connor et al., 2001). The impaired test performance of both memory-dysfunctional patients and patients with frontal system defects may lie in a susceptibility to proactive interference; in the former group, the impairment may stem from verbal encoding deficits, while in the latter group it may be due to vulnerability to interfering stimuli that disrupt sustained attention (Stuss et al., 1982). For further information regarding ACT and its variants, please refer to Lezak et al. (2004, pp. 416-418), and Spreen and Strauss (1998).
Administration Procedures
ACT involves the auditory presentation of three consonant trigrams followed by a number. The patient is instructed to subtract 3s from the number for several seconds, after which he or she is asked to recall the letters. The exact administration instructions and test stimuli used by Dr. Stuss and colleagues at Ottawa General Hospital are contained in Appendix 2c,d, and those used by Boone and colleagues at Harbor-UCLA Medical Center are contained in Appendix 2a,b. Some examiners use counting intervals of 3, 9, and 18 seconds (Boone et al., 1990, 1992; Boone, 1999; Cermak & Butters, 1972; Corsi, 1969, as reported in Milner, 1972; Samuels et al., 1972), while other investigators have lengthened the subtraction times to 9, 18, and 36 seconds, to increase task difficulty and reduce ceiling effects (Stuss et al., 1987).
Psychometric Properties
Assessment of internal consistency of a Turkish translation of ACT using 3-, 9-, and 18-second delays revealed Cronbach's α at a reliable level (α = 0.8535).
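For readers unfamiliar with the statistic, Cronbach's α for k items is k/(k - 1) multiplied by (1 minus the ratio of the summed item variances to the variance of the total score). The following Python sketch shows the calculation; the function name cronbach_alpha and the example data are hypothetical, and the source does not state how "items" were defined in the Turkish study, so nothing here reproduces that analysis.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a participants-by-items score table.

    rows: list of per-participant lists, one score per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Made-up scores for three participants on two items.
    example = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(example), 4))
```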
RELATIONSHIP BETWEEN ACT PERFORMANCE, DEMOGRAPHIC FACTORS, AND VASCULAR STATUS
Relatively few studies are available on the impact of demographic factors and IQ on ACT performance. A recent examination of the unique contribution of IQ and demographic factors to ACT performance in a large sample of healthy, older participants found that FSIQ accounted for 17% of unique test score variance while age accounted for a significant but very modest amount (6%) of unique test score variance; gender and educational level did not contribute to test performance (Boone, 1999). The remaining literature on the relationship between ACT performance and age has been equivocal, with some investigators failing to detect an association (Bherer et al., 2001; Boone et al., 1990; Puckett & Lawson, 1989; Stuss et al., 1987) and others documenting some deterioration with increasing age (Anil et al., 2003; Inman & Parkinson, 1983; Parkin & Walter, 1991; Parkinson et al., 1985; Schonfield et al., 1983). No other literature aside from the Boone (1999) publication could be located on the impact of IQ on ACT scores. Bherer et al. (2001) and Anil et al. (2003) observed a significant contribution of educational level to ACT performance, in contrast to the findings of Stuss et al. (1987). However, the findings of Boone (1999) suggest that when the effects of education and IQ are assessed simultaneously, only IQ is a significant predictor of ACT performance. Stuss et al. (1987), like Boone (1999), failed to detect a significant effect of gender on ACT performance. Of interest, there is evidence that some medical conditions previously thought to be cognitively benign may lead to decrements in ACT performance. Specifically, the presence of cerebrovascular risk factors (e.g., hypertension, post-myocardial infarction, cardiovascular disease, arrhythmias, coronary artery bypass graft, angina, old myocardial infarction, white-matter hyperintensities) accounts for 3% of ACT test score variance (Boone, 1999), and once a threshold amount of total white-matter hyperintensity volume is detected on MRI (i.e., > 10 cm²), significant declines are detected in ACT performance (Boone et al., 1992). In fact, ACT may be one of the most sensitive cognitive tests to the presence of white-matter damage sustained through hyperintensities of probable vascular origin (Boone et al., 1992) or white-matter and/or frontal-limbic-reticular activating system disruption secondary to acceleration-deceleration closed head injury (Stuss et al., 1985).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located three ACT normative reports published since 1987 (Anil et al., 2003; Stuss et al., 1987, 1988) as well as data from two studies examining the impact of age, education, IQ, gender, and medical illness on ACT performance (Boone et al., 1990; Boone, 1999), which included unique features such as large sample size (Boone, 1999) and reporting of a wide range of ACT scores (Boone et al., 1990).
To adequately evaluate the ACT normative reports, six key criterion variables were deemed critical. The first four of these relate to subject variables, and the remaining two relate to procedural variables. Minimal criteria for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Given the evidence that ACT performance may be significantly impacted by medical status (e.g., vascular illness), information regarding medical exclusion criteria is critical. In addition, as discussed previously, information should probably also be provided regarding educational level, gender, psychiatric exclusion criteria, geographic region, ethnicity, occupation, handedness, and recruitment procedures, even though there are as yet no data indicating that these factors influence test performance.
Reporting of Age
Given the equivocal and modest relationship between age and ACT performance, ACT normative data probably do not need to be presented by age group intervals, but information on the ages of the normative samples should be provided.
IQ Group Intervals
Given the evidence that IQ may account for more unique test score variance than do demographic factors, information regarding IQ level should be reported for each subgroup, and preferably normative data should be presented by IQ intervals.
Procedural Variables
Description of the Administration Format Used
Given that different test administration formats involve differing lengths of distraction intervals, specific information regarding the delays should be provided.
Data Reporting
Means and standard deviations, and preferably ranges, for total score out of 60 are important. In addition, it is advantageous for data to be provided for each of the distraction intervals separately.
SUMMARY OF THE STATUS OF THE NORMS In terms of subject variables, only one study provides data by IQ level (Boone, 1999), although IQ data are reported in a second study (Boone et al., 1990). Information on age, gender, education level, geographic area, and recruitment procedures is reported for all studies. In addition, medical, psychiatric, neurologic, and substance abuse exclusion criteria are described and judged to be adequate for all studies. Ethnic composition was indicated in two studies (Anil et al., 2003; Boone et al., 1990). Handedness data were provided only in the investigations conducted by Stuss and colleagues (Stuss et al., 1987, 1988). While all studies exceeded a total sample size of 50, only one study reached the criterion of 50 participants per individual grouping cell (Anil et al., 2003). In terms of procedural variables, information is available regarding the precise administration formats for all studies. Means and SDs are reported for total score in all but one study (Anil et al., 2003), and means and SDs for individual distractor delays are provided in all but one study (Boone, 1999). Practice effects are investigated in the reports by Stuss and colleagues (Stuss et al., 1987, 1988), and data on qualitative performance variables (perseverations, errors in letter sequence) are provided in Boone et al. (1990).
Data are presented in ascending chronological order for two ACT versions separately: first for the 9-, 18-, and 36-second and then for the 3-, 9-, and 18-second delay version. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 7. Table A7.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES
Data for 9-, 18-, and 36-Second Delay Version
[ACT.1] Stuss, Stethem, and Poirier, 1987 (Tables A7.2 and A7.3)
ACT baseline and 1-week retest data were collected on 60 participants in Canada, who were recruited through employment agencies and paid $10 for the two testing sessions. Participants ranged in age from 16 to 69, with a mean of 39.6 (2.62) years. Years of education ranged from 8 to 20, with a mean of 14.5 (2.63). Thirty-three participants were male and 27 were female. Forty-nine were right-handed. None had a history of significant medical, neurological, or psychiatric disorder; substance abuse; or current psychotropic medication use. Participants were tested in their native language (English or French). Each three-consonant combination was presented at a rate of one consonant per second followed by a three-digit number. Participants were instructed to count backward by 3s from the number for random delays of 9, 18, and 36 seconds and then to recall the trigram. Practice trials were employed until participants demonstrated understanding of the procedures. Five trials were conducted for each delay interval, with intertrial delays of 2-5 seconds. The counting delays were extended from those employed by Cermak and Butters (1972), to minimize ceiling effects. A total score of 15 was possible for each of the three delays. An alternate form was employed on retesting.
¹ Norms for children and adolescents are available in Baron (2004) and Spreen and Strauss (1998).
Test performance was not impacted by age, educational level, or gender. A practice effect for the 9- and 18-second delays was observed despite the alternate form. ACT data are provided by six age groupings (16-19, 20-29, 30-39, 40-49, 50-59, and 60-69) for baseline testing, retesting, and the two testing sessions combined for the three delay intervals separately. Data on gender distribution, handedness, mean age and SD, and mean years of education, SD, range, and frequency of "≤high school" and ">high school" for each age grouping are provided. In addition, data are presented by gender and educational level (≤high school, >high school) separately.
Study strengths 1. Information regarding age, education, gender, and handedness for the total sample and for individual age groupings is provided. 2. Adequate exclusion criteria. 3. Information on practice effects is reported. 4. Precise description of test administration procedures. 5. Data presented by age groupings, gender groupings, and education groupings. 6. Data presented separately for each distraction interval. 7. Information on geographic area and recruitment procedures is provided. 8. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small individual cell sizes (n = 10). 2. Data collected in Canada, with some test administrations conducted in French; cultural and linguistic factors may limit usefulness of data for clinical interpretation in the United States. 3. No information regarding IQ level.
[ACT.2] Stuss, Stethem, and Pelchat, 1988 (Tables A7.4 and A7.5)
In this publication, the authors supplement the data reported in 1987 by expanding the number of participants, increasing cell sizes by collapsing the data from six to three age groupings (16-29, 30-49, and 50-69), and presenting the combined data from the two testing sessions in box plots, which has the advantage of visual display of data variability. Each of the three age groupings contained baseline and 1-week retest data on 30 participants, none of whom had a positive psychiatric or neurologic history. Data are presented on gender distribution, handedness, mean age and SD, and mean years of education, SD, and range for each age group separately.
Study strengths 1. Large overall sample size (n = 90). 2. Information on practice effects. 3. Adequate exclusion criteria. 4. Information on gender, educational level, and handedness for each age grouping is presented. 5. Presentation of test score variability via box plots. 6. Test administration format is the same as in Stuss et al. (1987). 7. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Same as above: although the sample has been increased by 50%, the three age groupings still have only 30 participants each.
Data for 3-, 9-, and 18-Second Delay Version
[ACT.3] Boone, Miller, Lesser, Hill, and D'Elia, 1990 (Table A7.6)
Data were collected on 61 middle-aged and older individuals ranging in age from 50 to 79, recruited as controls in southern California through newspaper ads, flyers, and personal contacts as a part of ongoing research on late-life depression and psychosis. Participants had no history of psychotic, major affective, or alcohol or other drug dependence disorder and spoke English fluently. (A handful of participants spoke English as a second language.) Participants were excluded if there was a history of physical findings of neurological disease, such as stroke, Parkinson's disease, or seizure disorder. Also excluded were individuals with laboratory findings showing serious metabolic abnormalities (e.g., low sodium level, elevated glucose level, or thyroid or liver function abnormalities). Eighteen percent of the original sample of 74 were eventually excluded due to the presence of previously unidentified strokes or other significant lesions documented on MRI (n = 9), metabolic abnormalities or undiagnosed medical illness (n = 2), or evidence from laboratory studies and EEG findings of alcohol abuse and substance intoxication (n = 2). The final sample (n = 61) included 25 men and 36 women grouped by three age decades: 50-59 (n = 25), 60-69 (n = 21), and 70-79 (n = 15). All but 10 participants were white; four were African American, three were Asian, and three were Hispanic. Mean educational level was 14.34 (2.63) years, and mean WAIS-R FSIQ was 113.79 (13.51). No significant effect of age on ACT performance was documented in comparisons of the three age groups. Means and SDs are presented for ACT total score as well as for 3-, 9-, and 18-second delay for each age group separately. Total possible was 60 (15 points for each delay interval as well as 15 points for a five-trial, 0-delay condition). Means and SDs are also reported for number of perseverations and altered sequences. Perseveration was defined as the reporting of an incorrect letter which was used as an answer on the preceding trial; a total of 57 perseverations were possible. Altered sequence referred to reporting of correct letters but in the wrong position within the trigram; a total of 20 altered sequences were possible.
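The scoring scheme just described (letters correct per delay, perseverations, and altered sequences) can be illustrated with a short Python sketch. It is not the authors' scoring procedure: the function names are hypothetical, and two details that the text leaves open are assumed here, namely that a letter is credited whenever it belongs to the presented trigram regardless of position, and that a trial counts as one altered sequence when any correct letter is reported in the wrong position.

```python
from collections import defaultdict


def score_trial(presented, recalled, previous_recalled=""):
    """Score one trial; returns (letters_correct, perseverations, altered)."""
    presented = presented.upper()
    recalled = recalled.upper()[:len(presented)]
    previous_recalled = previous_recalled.upper()
    reported = set(recalled)
    # Assumption: credit any reported letter that was in the trigram.
    correct = sum(1 for letter in reported if letter in presented)
    # Incorrect letters that were given as an answer on the preceding trial.
    perseverations = sum(
        1 for letter in reported
        if letter not in presented and letter in previous_recalled
    )
    # Assumption: at most one altered sequence per trial.
    altered = any(
        letter in presented and presented[i] != letter
        for i, letter in enumerate(recalled)
    )
    return correct, perseverations, altered


def score_protocol(trials):
    """trials: (delay_seconds, presented, recalled) in administration order."""
    by_delay = defaultdict(int)
    perseverations = 0
    altered_sequences = 0
    previous = ""
    for delay, presented, recalled in trials:
        correct, persev, altered = score_trial(presented, recalled, previous)
        by_delay[delay] += correct
        perseverations += persev
        altered_sequences += int(altered)
        previous = recalled
    return {
        "by_delay": dict(by_delay),
        "total": sum(by_delay.values()),
        "perseverations": perseverations,
        "altered_sequences": altered_sequences,
    }


if __name__ == "__main__":
    # Made-up trigrams and responses, for illustration only.
    print(score_protocol([(0, "QLX", "QLX"), (3, "BDF", "BQD"), (9, "KMT", "KQ")]))
```

Under these assumptions, a protocol of five trials at each of the 0-, 3-, 9-, and 18-second conditions yields the maxima given in the text: 60 letters correct, 57 perseverations (three letters on each of the 19 trials that have a preceding trial), and 20 altered sequences.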
Study strengths 1. Information on IQ level, years of education, gender distribution, geographic area, recruitment procedures, ethnicity, and fluency in English is presented. 2. Data are reported in terms of total score but also by individual delay intervals; information is also provided on perseverations and altered sequences.
3. Comprehensive medical and psychiatric exclusion criteria, including MRI brain scans, on all participants. 4. Test administration format is described. 5. Means and SDs for the test scores are reported. 6. Data stratified by age.
Considerations regarding use of the study 1. Fairly small individual cell sizes (n = 15-25). 2. High average IQ level.
[ACT.4] Boone, Ananth, Philpott, Kaur, and Djenderedjian, 1991
ACT data were obtained on 16 controls (nine women, seven men) as part of an investigation of the neuropsychological characteristics of obsessive-compulsive disorder (OCD). Nine of the participants were siblings of OCD patients, while the remaining participants were recruited through newspaper ads and friends of OCD patients in the southern California area. Mean age was 35.8 (13.7) years, and mean educational level was 15.2 (2.8) years. Mean FSIQ, VIQ, and PIQ were 109.1 (10.9), 106.3 (13.0), and 111.8 (10.8), respectively. Medical exclusion criteria were history of alcohol or drug abuse, head injury, seizure disorder, cerebral vascular disease or stroke, psychosurgery, current or past psychiatric condition, or any renal, hepatic, or pulmonary disease. Mean ACT total score was 44.3, with an SD of 7.5. Data are not reproduced in this book.
Study strengths 1. Information regarding age, education, gender distribution, IQ, geographic area, and recruitment procedures is reported. 2. Comprehensive psychiatric and medical exclusion criteria. 3. Though not stated, test administration procedures are the same as those in Boone et al. (1990). 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Small sample size. 2. Data are not stratified by age or IQ level.
3. Data are presented in terms of total score only, with no information regarding distraction intervals. 4. Nearly high average IQ level.
[ACT.5] Boone, 1999 (Table A7.7)
The author obtained ACT data on 155 middle-aged or older individuals (ranging in age from 45 to 84 and recruited as described above for Boone et al., 1990; data from the 1990 study are included in the 1999 publication). The mean age of the sample was 63.07 (9.29) years, mean educational level was 14.57 (2.55) years, and mean FSIQ was 115.41 (14.11); 53 were male and 102 were female. Medical and psychiatric exclusion criteria are listed above, with the exception that participants with significant white-matter hyperintensities documented on MRI were retained in the sample. All participants considered themselves healthy, although 51 had some evidence of vascular illness (defined as cardiovascular disease and/or significant white-matter hyperintensities on MRI) based on self-report or evidence on examination of at least one of the following: current or past history of hypertension (n = 39), arrhythmia (n = 8), large area of white-matter hyperintensity on MRI (e.g., > 10 cm²; n = 7), coronary artery bypass graft (n = 3), angina (n = 2), and old myocardial infarction (n = 1). Twenty-four participants were currently on cardiac and/or antihypertensive medications. A stepwise regression analysis revealed that FSIQ, age, and vascular status were significant contributors to total ACT score, accounting for 17%, 6%, and 3% of test score variance, respectively; educational level and gender did not account for a significant amount of unique test score variance. ACT normative data are presented for total ACT score stratified by IQ and age (<65 and ≥65; average IQ, high average IQ, and superior IQ).
Study strengths 1. Large overall sample size. 2. Presentation of data by IQ and age groupings. 3. Comprehensive medical and psychiatric exclusion criteria, including MRI brain scans, on all participants.
4. Information regarding educational level, gender, geographic area, recruitment procedures, and fluency in English. 5. Though not stated, test administration procedures are the same as those in Boone et al. (1990). 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Individual IQ-by-age groupings have sample sizes ranging 16-37. 2. Data are presented in terms of total score rather than separately for each distraction interval.
[ACT.6] Anil, Kivircik, Batur, Kabakci, Kitis, Güven, Basar, Turgut, and Arkar, 2003 (Table A7.8)
ACT data were collected on 236 individuals in Turkey, who were recruited from hospital staff or through personal contacts. Exclusion criteria included neurological or psychiatric conditions. The sample was stratified into three age groups (16-25, 26-45, and 46-65) and three education groups (8-10, 11-14, and >14 years). The youngest age group consisted of 40 males and 22 females, who averaged 22 (2.7) years of age. The middle age group was composed of 70 males and 55 females, who averaged 34.1 (5.9) years. The oldest group included 28 males and 21 females, who averaged 53.8 (4.7) years. The ACT was translated with the consultation of a linguist, and consonants from the Turkish alphabet showing similar phonetic characteristics to the original ACT were employed. Participants were instructed to count backward by 1s rather than the standard 3s. Means and SDs are reported for each delay interval. Analyses revealed no gender effects, although better performance was associated with younger age and more years of education.
Study strengths 1. Large overall sample size, although the sizes of the nine subcells are not reported. 2. Information provided for age, gender, education, geographic area, language, and recruitment procedures.
3. Data stratified by both age and education. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Test was translated into Turkish and data were collected in Turkey, rendering use problematic for English-speaking patients. 2. Test administration was not standard (subjects counted backward by 1s rather than 3s). 3. No information regarding IQ.
CONCLUSIONS
ACT has been underutilized as a clinical measure of executive dysfunction despite evidence that it may be particularly sensitive to white-matter disturbance. Given emerging interest in working-memory paradigms, the consonant trigrams task may experience an increase in popularity. Most working-memory paradigms have been used in experimental studies, and normative data are typically not available. The fact that a normative data pool of upward of 500 participants has been collected for ACT may make it an attractive working-memory procedure for clinical practice. In addition, the fact that the ACT task does not involve a timed response makes it a desirable executive measure in that test performance is not confounded by declines in mental speed. For tasks such as Trails B, Stroop Color Interference, and word and design generation, poor scores may reflect slowing in information-processing speed rather than executive dysfunction per se. Future research is needed to determine which delay intervals (i.e., 3, 9, and 18 seconds vs. 9, 18, and 36 seconds) are most sensitive and appropriate for clinical use. Also, normative data need to be obtained on populations with less than average IQs.²
² Meta-analyses were not performed on ACT due to a lack of sufficient data.
8 Paced Auditory Serial Addition Test
BRIEF HISTORY OF THE TEST
Sampson (1956) originally developed an auditory and a visual version of the Paced Serial Addition Test. In a landmark study, Gronwall and Sampson (1974) used the auditory version, the Paced Auditory Serial Addition Test (PASAT), to assess information-processing speed and working memory in post-concussive patients. This version was further researched and made popular by Gronwall and colleagues (Gronwall & Wrightson, 1974; Gronwall, 1977a). It is now believed that the PASAT also measures other cognitive abilities, such as sustained and divided attention (Lezak, 1995; Lezak et al., 2004). In the original version of the PASAT, a random series of 61 digits (1-9) is presented on audiotape and the participant is to add the last digit presented to the preceding digit and verbalize the answer. For example, if the digits 1 and 2 are presented, the participant's correct response would be 3 (i.e., 1 + 2), and if the digit 4 is presented next, the participant's correct response would be 6 (i.e., 2 + 4) and so on. Each trial has the same random presentation of the 61 digits; however, the pace at which the digits are presented differs for the four trials. In trial 1, the digits are presented at a rate of one digit every 2.4 seconds, in trial 2 every 2.0 seconds, in trial 3 every 1.6 seconds, and in trial 4 every 1.2 seconds. The task of the participant is identical for each trial, and thus, a total of 60 correct responses per trial is possible. A practice trial of 10 digits presented at a rate of one digit every 2.4 seconds precedes the four test trials. Functional brain imaging studies have suggested that PASAT performance is associated with activation of right anterior and left posterior cingulate, consistent with an emerging body of literature relating cingulate function to attentional mechanisms (Deary et al., 1994). The PASAT was initially devised as a measure to detect cognitive deficits in postconcussive individuals. Several studies have shown that patients with head trauma perform significantly worse than their normal control counterparts on the PASAT (Bate et al., 2001; Brooks et al., 1999; Cicerone, 1997; Gronwall and Sampson, 1974; Maddocks & Saling, 1996; Ponsford & Kinsella, 1992; Stuss et al., 1989; Tiersky et al., 1998). Cicerone and Azulay (2002) documented the PASAT to be among the most sensitive neuropsychological tests for detecting impairment in patients with post-concussion syndrome, but Maddocks and Saling (1996), using only the 2.4-second presentation of the PASAT, did not find the same results. The original studies with the PASAT also found it to be a sensitive measure of recovery rate and capability to return to work (Gronwall & Wrightson, 1974; Gronwall, 1977a).
TESTS OF ATTENTION AND CONCENTRATION
However, of concern, several subsequent studies have been unsuccessful in finding a relationship between PASAT scores and severity of head injury (Levin et al., 1982; Sherman et al., 1997; Stuss et al., 1989). Fos et al. (2000) found that both the auditory and visual versions of the Paced Serial Addition Test significantly correlated with other tests of attention but that neither version of this test differentiated patients with mild traumatic brain injury from normal controls in a college population. In an interesting study by Chan (2001), there were no differences in PASAT scores of those postconcussive patients who were considered "low symptom reporters" and those who were considered "high symptom reporters."
More recently, the PASAT has been used to study cognitive functioning in patients with multiple sclerosis (MS). In fact, modified versions of the PASAT, using 3- and 2-second pacing in digit presentation, are included in the Brief Repeatable Battery of Neuropsychological Tests developed by the National Multiple Sclerosis Society to be used as a screening tool for MS (Rao et al., 1990). In a study by Shawaryn et al. (2002), the PASAT predicted mental and emotional responses of MS patients on a quality-of-life questionnaire. Johnson et al. (1996) found that patients with MS performed poorly on both the PASAT and the Paced Visual Serial Addition Test (a visual analog to the PASAT), while patients with Chronic Fatigue Syndrome displayed difficulty only with the PASAT. The authors postulate that deficits on both of these tasks by MS patients may suggest impairment of the central executive system, a view that is shared by D'Esposito et al. (1996). Kujala et al. (1995) reported impaired PASAT scores for a group of mildly deteriorated MS patients but intact scores for a nondeteriorated MS group. Solari et al. (1995) found that the PASAT was one of two neuropsychological tests that best discriminated between MS and controls.
Fisk and Archibald (2001) used a different scoring technique (the "dyad" method of counting two consecutive right responses as one correct point) and observed that controls outperformed MS patients on only the first two out of four presentation trials. Using the dyad scoring method of the
PASAT and magnetic resonance imaging, Snyder and Cappelleri (2001) found that PASAT scores correlated with the total area of sclerotic brain lesions in MS patients. This correlation was not observed when the original PASAT scoring method was used. It should be noted that studies have found the PASAT to be increasingly difficult for MS patients, and as a result a number of them refuse to perform the task (Aupperle et al., 2002). Additional factors have also been shown to affect PASAT performance. Studies have demonstrated a reduction in PASAT scores during pain (Sjogren et al., 2000), with sleep disruption (Martin et al., 1996), in solvent exposure (Rasmussen et al., 1993), during hypoglycemia in patients with diabetes (Gold et al., 1995), in HIV-positive individuals (Honn et al., 1999), in individuals with schizotypal personality disorder (Mitropoulou et al., 2002), in individuals with Attention-Deficit Disorder (Katz et al., 1998), and in cannabis addicts (Elwan et al., 1997). In addition, a negative effect of smoking on PASAT performance has been reported but only in poorly educated males (Elwan et al., 1997). Further details about Gronwall's (1977b) version of the PASAT and verbatim instructions can be obtained from the PASAT administration manual and test kit (see Appendix 1 for ordering information) or Spreen and Strauss (1998, pp. 243-251; see also Lezak et al., 2004).
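The two scoring approaches described in this section can be illustrated with a short sketch. It is not taken from any of the cited studies: the function names are hypothetical, an omitted response is represented as None, and the dyad rule is implemented under the assumption that a point is earned each time a correct response immediately follows another correct response (overlapping pairs). The cited authors may have operationalized dyads differently.

```python
from typing import List, Optional


def expected_answers(digits: List[int]) -> List[int]:
    """Correct answer for each digit after the first: that digit plus the one before it."""
    return [prev + cur for prev, cur in zip(digits, digits[1:])]


def score_standard(digits: List[int], responses: List[Optional[int]]) -> int:
    """Original scoring: one point for every correct response."""
    answers = expected_answers(digits)
    return sum(1 for ans, resp in zip(answers, responses) if resp == ans)


def score_dyads(digits: List[int], responses: List[Optional[int]]) -> int:
    """Dyad scoring (assumed reading): one point whenever a correct response
    immediately follows another correct response."""
    answers = expected_answers(digits)
    correct = [resp == ans for ans, resp in zip(answers, responses)]
    return sum(1 for first, second in zip(correct, correct[1:]) if first and second)


# Digits 1, 2, 4 reproduce the example in the text (answers 3, then 6).
digits = [1, 2, 4, 3, 5]
responses = [3, 6, None, 8]  # third response omitted
standard = score_standard(digits, responses)
dyads = score_dyads(digits, responses)
```

With a 61-digit series, the standard score has a maximum of 60 per trial, as stated above; under the overlapping-pair assumption the dyad maximum would be 59.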
Modifications and Alternate Formats of the PASAT
While the original version is the most commonly administered format, modifications to the PASAT have been made. Several of these modified versions and alternate formats are presented below.
Levin Version
Levin et al. (1987) developed a version in which only 50 digits (rather than 61 digits) are presented in different random order (as opposed to the same random order) for each trial using the same 2.4-, 2.0-, 1.6-, and 1.2-second interval presentation. This version minimizes the practice effects observed with Gronwall's
original version, as demonstrated by Stuss et al. (1987).
PASAT-200, PASAT-100, and PASAT-50
Shortened versions of the PASAT were also developed by Diehr et al. (1998) and further modified by Diehr et al. (2003). The Diehr et al. (1998) PASAT, also referred to as the PASAT-200, is very similar to Levin et al.'s (1987) version in that it consists of the presentation of 50 single digits (except for the number 7) in random order at four different pacing intervals. However, the pacing intervals are 3.0, 2.4, 2.0, and 1.6 seconds per digit instead of 2.4, 2.0, 1.6, and 1.2 seconds. Different random presentation of the digits is used for each trial. Diehr et al. (2003) shortened the PASAT-200 by providing normative data on trial 1 (3-second pacing trial) only, referred to as the PASAT-50, and trials 1 and 2 combined (3- and 2.4-second pacing trials), referred to as the PASAT-100.
Computerized Versions of the PASAT
Holdwick and Wingenfeld (1999) and Wingenfeld et al. (1999) created a computerized version of Gronwall's (1977a) original PASAT, in which the auditory stimuli are presented via external speakers and responses are recorded by an external microphone. However, the computerized administration does not record a response as correct if it occurs after presentation of the subsequent stimulus, while the traditional administration format of the PASAT has typically given credit for correct "late" responses. Tombaugh (1999) and Royan et al. (2004) recently developed an interesting computer version of the PASAT, referred to as the Adjusting-PSAT. This version measures speed of information processing and working memory by assessing temporal thresholds versus the traditional method of counting number of correct responses. In this version, stimuli are presented through either auditory or visual modalities, and the duration of the interval between number presentation depends on the correctness of the response. In other words, a correct response shortens the interval before the next digit, and an incorrect response lengthens it.
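The adjusting principle can be sketched as a simple staircase. This is only an illustration of the general idea, not the published Adjusting-PSAT algorithm: the function names, the step size, and the minimum and maximum intervals are all hypothetical values chosen for the example.

```python
from typing import List


def next_interval(current_s: float, was_correct: bool,
                  step_s: float = 0.1, min_s: float = 0.5, max_s: float = 3.0) -> float:
    """Shorten the inter-digit interval after a correct response and lengthen it
    after an error. step_s, min_s, and max_s are hypothetical, not published values."""
    proposed = current_s - step_s if was_correct else current_s + step_s
    # Keep the interval inside the allowed range and tidy floating-point noise.
    return round(min(max_s, max(min_s, proposed)), 3)


def run_staircase(start_s: float, outcomes: List[bool]) -> List[float]:
    """Return the sequence of intervals produced by a series of correct/incorrect outcomes."""
    intervals = [start_s]
    for was_correct in outcomes:
        intervals.append(next_interval(intervals[-1], was_correct))
    return intervals


# Two correct responses followed by one error, starting from a 2.4-second interval.
history = run_staircase(2.4, [True, True, False])
```

In a procedure of this kind, the interval at which performance stabilizes serves as the temporal threshold, in place of a count of correct responses.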
An additional PASAT administration format includes giving only one or two trials of the original PASAT at select pacing rates (e.g., 2.4 or 2.0 seconds).
Psychometric Properties of the Test
Adequate reliability and validity have been reported for the original PASAT. Studies have cited split-half reliability of greater than 0.90 (Egan, 1988), suggesting high internal consistency, and test-retest reliability values of 0.93-0.97 (McCaffrey et al., 1995). O'Donnell et al. (1994) reported adequate construct validity of the PASAT, demonstrating relatively strong correlations with other tests of attention, such as Visual Search and Attention Task (r = 0.55) and Trail Making Test Part B (r = 0.58). Moderate correlations between PASAT and other tests of concentration, information processing, and working memory have also been noted by Crawford et al. (1998b), Deary et al. (1991), Gronwall and Wrightson (1981), and Larrabee and Curtiss (1995). However, some authors caution that the PASAT correlates not only with tests of attention but also with tests that measure mathematical skills and overall intellectual ability (Chronicle & McGregor, 1998; Sherman et al., 1997). Sherman and colleagues (1997) voiced concern that "PASAT performance depends on mathematical ability, at least as much as on attentional skills" and recommend that "the PASAT should not be interpreted as a measure of attention when mathematical skills are poor" (p. 43). For further information on the effect of repeated administration and psychometric properties of the PASAT, see Franzen (2000) and McCaffrey et al. (2000).
RELATIONSHIP BETWEEN PASAT PERFORMANCE AND DEMOGRAPHIC FACTORS
Age effects have been frequently reported for the PASAT. Stuss et al. (1988) found declining PASAT scores as a function of age grouping. Using the Levin version of the PASAT, Brittain et al. (1991) also demonstrated an age-related decline in performance. Roman et al. (1991)
found that individuals in the 6th and 7th decades of life performed significantly worse than two younger groups on all PASAT trials, and Wiens et al. (1997) reported a steady age-associated decline in PASAT performance for individuals in their twenties to late forties. Further, Diehr et al. (1998) documented a decline with age in a modified version of the PASAT for individuals in three age groups (20-34, 35-49, 50-68). A few studies, however, have shown weak or no age effects. Boringa et al. (2001) reported declining scores as a function of age on the 2-second trial of the PASAT but not the 3-second trial. This would suggest that age differences emerge only during the more difficult portion of the task. Epperson and Cripe (1985) found no significant age effects for a sample of individuals aged 18-49, and Elwan et al. (1997) found no significant correlation between PASAT scores and age in an Egyptian sample ranging from 20 to >60 years of age. Finally, in one study (using the 2-second delay in number presentation), PASAT scores for older individuals (mean age = 52) were actually higher than for young college students (mean age = 25) (Ward, 1997).
A relatively consistent relationship between PASAT performance and education has been reported. Stuss et al. (1987) found that individuals with less than a high school education performed more poorly on the PASAT than those with a college education or higher. Wiens et al. (1997) found education effects for trial 1 of the PASAT but not the other trials. Diehr et al. (1998) reported a steady increase in PASAT scores as a function of higher education attainment. In contrast, Brittain et al. (1991) and Elwan et al. (1996, 1997) could detect no significant relationship between education and PASAT performance.
The results are mixed in terms of the relationship between general intelligence and the PASAT.
Gronwall and colleagues (Gronwall & Sampson, 1974; Gronwall & Wrightson, 1981) and others (Johnson et al., 1988; Roman et al., 1991) report weak or no correlation between intelligence and PASAT, while others have shown a moderate relationship between these two factors (Crawford et al., 1998b; Deary et al., 1991; Egan, 1988; Kanter, 1984; Wiens
et al., 1997). Kanter (1984) observed a strong correlation between PASAT responses and speeded nonverbal intelligence tasks, and significant relationships between PASAT scores and the Shipley tests of intelligence have also been reported (Brittain et al., 1991; Egan, 1988). Deary et al. (1991) found a significant correlation between PASAT scores and WAIS-R IQ in a group of diabetic patients, but on closer examination, the relationship was only significant between the PASAT and the freedom from distractibility index of the WAIS-R. In terms of basic math skills and the PASAT, Gronwall and Sampson (1974) found a weak correlation, but others have shown a stronger relationship (Sherman et al., 1997).
Gender differences have not been found in most studies using the PASAT (Boringa et al., 2001; Diehr et al., 1998, 2003; Roman et al., 1991; Stuss et al., 1987). Some studies have found statistically significant differences in performance in favor of males, but the differences were of little clinical or practical importance (Brittain et al., 1991; Wiens et al., 1997). Elwan and colleagues (1996, 1997), administering the PASAT to a sample of Egyptians, found better performance in males, particularly among subjects aged 60 or above. Interestingly, Wiens et al. (1997) noted that Hispanic, Asian, and Native American males in their sample appeared to perform "slightly" better than their female counterparts, while the opposite was true for African-American and Caucasian participants. However, cell sample sizes were too small to confirm these observations with statistical analyses.
Only a few studies have examined the relationship between race/ethnicity and PASAT performance. Brittain et al. (1991) reported a complex interaction effect between age, IQ, and race. They found that in older "minority" women, PASAT scores across all trials were associated with IQ scores.
The specific racial breakdown of their minority subjects was not provided, and this interaction effect was not reported for their Caucasian group. Wiens et al. (1997) found no statistically significant differences between African-American, Hispanic, Native-American, Asian, and Caucasian participants. Diehr et al. (1998), however, reported significantly better PASAT performance by
Caucasians relative to African Americans across three age groups (20-34, 35-49, 50-68). Additionally, using T-score conversions, Diehr et al.'s distribution of the PASAT scores of a small sample of Hispanic individuals more closely resembled that of the African Americans than the Caucasians.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the PASAT normative reports, seven key criterion variables were deemed critical. The first five of these relate to subject variables and the two remaining dimensions refer to procedural issues. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if gender, geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Interval
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since a strong effect of age on PASAT performance has been demonstrated in the literature.
Reporting of Education Levels
Given the strong association between education and PASAT performance, information regarding educational level should be reported for each subgroup, and preferably normative data should be presented by educational levels.
Reporting of Intellectual Levels
Given the probable association between PASAT performance and IQ, information regarding intellectual level should be reported for each subgroup, and preferably normative data should be presented by IQ levels.
Procedural Variables
Description of Administration Procedures
Due to variability in administration procedures, a detailed description of the procedures, including identification of the version of the test administered and number of trials (with reported pacing of digit presentation), is desirable. This would allow one to select the most appropriate norms or to make corrections in interpretation of the data.
Data Reporting
Group means and standard deviations for the number of correct responses for each pacing condition should be presented at minimum.
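For readers who keep their own notes on normative studies, the seven criteria above can be recorded as a simple checklist. The sketch below is only one possible way of doing so and is not part of the method used in this chapter: the class, its field names, and the function are hypothetical, and the only value taken from the text is the desirable sample size of 50.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NormativeReport:
    """Hypothetical record of how one study fares on the seven criteria."""
    sample_size: int
    sample_composition_described: bool
    age_intervals_used: bool
    education_reported: bool
    iq_reported: bool
    administration_described: bool
    means_and_sds_reported: bool


def unmet_criteria(report: NormativeReport, desirable_n: int = 50) -> List[str]:
    """List the criteria a report does not meet, in the order given in the text."""
    checks = [
        ("sample size", report.sample_size >= desirable_n),
        ("sample composition description", report.sample_composition_described),
        ("age group interval", report.age_intervals_used),
        ("reporting of education levels", report.education_reported),
        ("reporting of intellectual levels", report.iq_reported),
        ("description of administration procedures", report.administration_described),
        ("data reporting", report.means_and_sds_reported),
    ]
    return [name for name, met in checks if not met]


# Illustrative entry with made-up values, not a rating of any study in this chapter.
example = NormativeReport(
    sample_size=26,
    sample_composition_described=True,
    age_intervals_used=False,
    education_reported=True,
    iq_reported=True,
    administration_described=True,
    means_and_sds_reported=True,
)
gaps = unmet_criteria(example)
```

A list of this kind mirrors the "Considerations regarding use of the study" entries that accompany each study summary.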
SUMMARY OF THE STATUS OF THE NORMS
Information presented in the studies reporting data for the PASAT differs across studies. Some of these differences will be summarized below. Of the studies reviewed below, nine were essentially designed to provide normative information (Boringa et al., 2001; Brittain et al., 1991; Diehr et al., 1998, 2003; Roman et al., 1991; Stuss et al., 1987, 1988; Wiens et al., 1997; Wingenfeld et al., 1999). Data for "normal" control groups from clinical comparison studies are also included in this chapter. Various test formats of the PASAT are used, with several studies devoted to modifying test versions or scoring methods. The variations in testing procedure and format include the number of digits used (e.g., 61 or 50), the same vs.
different random order of the digit presentation across trials, the number of trials administered, and the pace at which the digits are presented (e.g., 3.0-, 2.4-, 2.0-, 1.6-, and/or 1.2-second pacing). Among all of the clinical studies available in the literature, we selected for review those that used well-defined samples; presented means and SDs for more than one presentation condition (e.g., 2.4-second pace per digit); provided adequate description of the test version, procedures, and format; and provided descriptive statistics for sample demographics, such as age and education. In the studies reviewed below, the test scores represent the number of correct responses for each pacing rate or the total scores across all trials, unless indicated otherwise. Summaries of the studies are presented in ascending chronological order for each version of the test separately. Studies using Gronwall's administration procedure are presented first, followed by those using Levin's version, concluding with the PASAT-50, PASAT-100, and PASAT-200 versions. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 8. Table A8.1, the locator table, summarizes information provided in the studies described in this chapter.1
SUMMARIES OF THE STUDIES
Gronwall's Administration Version
[PASAT.1] Gronwall, 1977a (Gronwall Version) (Table A8.2)
This is one of the first studies to use the PASAT in order to assess cognitive functioning in brain-damaged patients. A sample of 60 "normal" participants in New Zealand aged 14-55 years (with the majority aged 17-25), consisting of 10 non-head-injured accident cases, 10 naval "ratings," and 40 first-year university students, served as controls. All subjects were initially tested and then retested with the PASAT. The retesting was approximately 1 week later for head-injured patients; it can be assumed that it was the same time delay for the controls, but there is no specific mention of this. There is no additional information regarding age, gender, or education for this sample. No other exclusion criteria are reported. The 61-digit version of the PASAT was presented at four different pacing rates (2.4, 2.0, 1.6, and 1.2 seconds).
Study strengths
1. Adequate sample size.
2. Test administration procedures are well specified.
3. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. The sample composition is not well described in terms of age, education, gender, IQ, and recruitment procedures.
2. The age range of the group is quite large, and the majority of the participants are between the ages 17-25 years.
3. No exclusion criteria are provided, and the non-head-injured "accident" cases are not well described.
4. The test-retest time frame for the normal controls is not provided (but the head-injured patients were tested 1 week apart).
5. The data were obtained on New Zealanders, which may limit their usefulness for clinical interpretation in the United States.
1 Norms for children and adolescents are available in Baron (2004) and Spreen and Strauss (1998).
[PASAT.2] Stuss, Stethem, and Poirier, 1987 (Gronwall Version) (Table A8.3)
The authors examined age-related differences in performance on three neuropsychological tests, one of which was the PASAT. The authors recruited 60 participants from Ottawa, Canada, through personal contacts or various agencies (e.g., Seniors Employment Bureau, Youth Employment Agency). Participants were grouped by six decades of life (16-19, 20-29, 30-39, 40-49, 50-59, 60-69). Information regarding handedness, years of education, and ratio of males to females is provided for each
age group. None had a history of neurological or psychiatric illness. Educational levels of males (14.36) and females (14.55) were approximately the same, but significant differences were found between educational levels of participants in the different age groups, with the 50-59 group having the lowest educational level. The original Gronwall four-trial version of the PASAT was used. It should be noted that the authors report using 60 digits but also state that 60 correct responses are possible. Thus, it is believed that the original 61-digit version was used. Participants were tested at two different intervals, separated by 1 week. The test was administered in the participants' native language of French or English.
Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. The data are stratified by six age groupings.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Overall sample is adequate, but individual cells are very small.
2. Educational levels are not equal across the different age groups, and some of the groups are highly educated.
3. The data were obtained on Canadian subjects, sometimes in French, which may limit their usefulness for clinical interpretation in the United States.
Other comments
1. Individuals in the 50-59 age group had the lowest educational level and the lowest PASAT scores relative to the other age groups. Their PASAT scores were significantly lower than even the oldest age group (60-69).
2. The authors present another table that collapses PASAT scores across age groups, stratifying the data by gender and educational level (≤high school vs. >high school). Given the significant age effect, this table has not been reproduced in this chapter but can be found in the original source.
[PASAT.3] Stuss, Stethem, and Pelchat, 1988 (Gronwall Version) (Table A8.4)
This study builds on the previous normative study by Stuss et al. (1987) by collapsing the age groups (i.e., creating larger age ranges per group), thus increasing the number of participants per cell. In the current study, there were three age groups. For the 16-29 age group, there were 16 males and 14 females, with an average age of 22.43 (2.67) and education range of 11-18 years (mean = 14.1, SD = 1.34); for the 30-49 group, there were 14 males and 16 females, with an average age of 40.63 (2.97) and education range of 5-20 years (mean = 14.9, SD = 3.95); and for the 50-69 group, there were 14 males and 16 females, with an average age of 61.77 (3.0) and education range of ~18 years (mean = 13.2, SD = 2.38). See the above study (PASAT.2) for additional participant characteristics and recruitment procedures.
Study strengths
1. The sample composition is well described in a previous study (Stuss et al., 1987) in terms of age, education, gender, geographic area, and recruitment procedures.
2. The data are stratified by three age groupings.
3. Adequate exclusion criteria are described in a previous study (Stuss et al., 1987).
4. Test administration procedures are described in Stuss et al. (1987).
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. The Stuss et al. (1987) study must be consulted in order to learn about the sample recruitment and testing procedures.
2. Mean educational levels for some of the age groups are relatively high; the 16-19 and 50-59 groups have substantially less education than the other age groups.
3. Overall sample size is adequate, but individual cells are small.
4. The data were obtained on Canadian subjects, sometimes in French, which may limit their usefulness for clinical interpretation in the United States.
[PASAT.4] Rao, Mittenberg, Bernardin, Haughton, and Leo, 1989 (Gronwall Version) (Table A8.5)
This study examined the effects of periventricular white-matter changes on cognitive functioning in healthy adults. The authors selected 40 participants (10 males, 30 females) who had normal brain imaging to serve as controls. Participants ranged in age from 25 to 60 years, with an average age of (8.1), average educational level of 14.0 (2. ), and average Verbal IQ of 106.5 (5.8). All participants were recruited from newspaper advertisements in the Milwaukee, Wisconsin, area. Additional exclusion criteria were a history of hypertension, cardiac or cerebrovascular disease, neurological illness, head injury, substance abuse, or psychiatric illness. Participants underwent physical and neurological exams.
Gronwall's 61-digit test administration version of the PASAT was employed, but only two trials, at 3- and 2-second pacing rates, were used. Total correct responses for both trials are reported.
42!
Study strengths
1. The sample composition is well described in terms of age, education, gender, and recruitment procedures.
2. Exclusion criteria are provided.
3. Test administration procedures are described.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Relatively small sample size.
2. The data are not stratified by age, gender, or education.
3. Data for only two pacing rates for the PASAT are provided.
[PASAT.5] Stuss, Stethem, Hugenholtz, and Richard, 1989 (Gronwall Version) (Table A8.6)
The authors compared the performance of two groups of head-injured patients to controls on three neuropsychological tests. Twenty-six control participants (20 males, 6 females) with no history of neurological or psychiatric disorder were recruited. Participants were matched with head-injured patients on age (±2 years), education (±2 years), and gender. Thus, control subjects ranged in age from 17 to 57, with an average of 29.7 (12.4), and ranged in educational level from 7 to 20 years, with an average of 13.2 (3.0). The standard 61-digit version using four trials (2.4, 2.0, 1.6, and 1.2 seconds) was administered at two different points in study 1 and at five different points in study 2. Testing and retesting sessions were separated by approximately 1 week. Data for study 1 are reported in this review.
Study strengths
1. The sample composition is well described in terms of age, education, gender, and recruitment procedures.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are provided.
Considerations regarding use of the study
1. The geographic location where participants were recruited is not provided; however, it may be assumed that they were from the Ottawa, Canada, region, which may limit their usefulness for clinical interpretation in the United States. While not mentioned in this study, in previous studies the authors have administered the test in French or English, depending on the participant's language preference.
2. Small sample size.
Other comments
1. Test data for two testing sessions (from study 1) have been reproduced in this chapter. In addition, the authors provide data for five testing probes (study 2), which can be found in the original study.
[PASAT.6] Rao, Leo, Bernardin, and Unverzagt, 1991a (Gronwall Version) (Table A8.7)
The study examined the pattern of cognitive deficits in patients with MS using a brief neuropsychological battery. The authors recruited 100 (25 males, 75 females) normal, healthy adults through newspaper advertisements in the Milwaukee, Wisconsin, area. Controls were matched to MS subjects based on age (±3 years), education (±1 year), and gender. Thus, control participants had an average age of 46.0 (11.6) years, an average education of 13.3 (2.0) years, and an average Verbal IQ of 107.2 (11.2). Exclusion criteria were history of substance abuse, psychiatric illness, head injury, or other neurological disorders. All controls were given neurological evaluations and MRI scans. Only one participant was non-Caucasian. All subjects were paid for their participation. Gronwall's 61-digit administration version of the PASAT was employed, but only two trials, at 3- and 2-second pacing rates, were used. Total correct responses for both trials are reported.
Study strengths
1. The sample composition is well described in terms of age, education, gender, and recruitment procedures.
2. Relatively large sample size.
3. Adequate exclusion criteria.
4. Test administration procedures are specified in a previous study (Rao et al., 1989).
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. The data are not stratified by age, gender, or education.
2. Data for only two pacing rates are provided.
[PASAT.7] Strauss, Spellacy, Hunter, and Berry, 1994 (Gronwall Version) (Table A8.8)
The authors examined the utility of the PASAT as a tool for detecting malingering. They selected 10 (four males, six females) undergraduate students from the University of Victoria to serve as controls. Participants ranged in age from 20 to 35, with an average age of 23.7 (2.58) and an average education of 15.21 (0.79) years. No exclusion criteria are provided. Two trials of Gronwall's 61-digit version of the PASAT were administered at 2.0- and 1.6-second pacing rates.
Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic location, and recruitment procedures.
2. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Sample size is small.
2. No exclusion criteria are described.
3. The data were obtained on Canadian subjects, which may limit their usefulness for clinical interpretation in the United States.
4. Only two trials of the PASAT were used.
5. Education level is high.
[PASAT.8] Zalewski, Thompson, and Gottesman, 1994 (Gronwall Version) (Table A8.9)
The authors compared the cognitive performance of patients with Post-traumatic Stress Disorder and Generalized Anxiety Disorder to controls. The data were selected from a large database of scores collected in the Vietnam Experience Study (VES) during 1985-1986 (for more description, see Decoufle et al., 1991). The control group consisted of 241 nonpsychiatric veterans randomly drawn from a larger sample of 1,579 veterans who had never met criteria for various psychiatric disorders (e.g., depression, bipolar disorder, substance abuse, personality disorders). No other exclusion criteria are provided. These participants were initially recruited for the VES in order to study the long-term health effects of military service in Vietnam. Participants were Vietnam and non-Vietnam veterans who entered the U.S. Army between 1965 and 1971. All participants underwent comprehensive medical and psychological evaluations. This sample is most likely almost entirely male, but there is no mention of the gender composition. They were an average of 38.0 years old and had an average of 13.6 years of education (no SDs were reported). There were 189 Caucasians, 35 African Americans, 11 Hispanics, and 6 "others" in the sample. Two trials (2.4 and 1.2 seconds) of Gronwall's version of the PASAT were administered, and total correct responses for both trials are reported.
Study strengths 1. Large sample size. 2. Sample composition is well described in terms of age, education, and ethnicity. 3. Test procedures are relatively well described. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. It is unclear whether the control group was recruited for research participation only or if any of the participants were referred for clinical assessment. 2. Sample composition is not well described in terms of gender or recruitment procedures, but reference is made to another study. 3. Exclusion criteria only included psychiatric disorder. 4. Only two trials of the PASAT were administered, and total scores were reported.
[PASAT.9] Crawford, Obonsawin, and Allan, 1998b (Gronwall Version) (Table A8.10)
The authors examined the relationship between age and PASAT performance in order to obtain validity data on the PASAT and to provide additional normative data. A sample of 152 participants (77 males, 75 females) was screened for neurological, psychiatric, and systemic disorders. Participants ranged in age from 16 to 74, with an average age of 40.21 (13.89), an average education of 12.97 (2.86) years, and an average IQ of 105.0 (14.08). Participants were recruited from various communities and organizations within the United Kingdom, including recreational clubs, community centers, and public service, and were paid for their participation. The original 61-digit version of Gronwall's PASAT was administered in its entirety, and total scores for the four trials are reported for the total sample and for three age groups.
Study strengths 1. Large sample is used. 2. The composition is well described in terms of age, education, gender, IQ, geographic area, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. The sample is stratified into three age groups.
Considerations regarding use of the study 1. The data are not stratified by education or IQ. 2. Total scores are reported instead of individual scores for each of the four trials. 3. The data were obtained on subjects from the United Kingdom, which may limit their usefulness for clinical interpretation in the United States.
[PASAT.10] Prevey, Delaney, Cramer, Mattson, and VA Epilepsy Cooperative Study 264 Group, 1998 (Gronwall Version) (Table A8.11)
As part of a large multicenter study of epilepsy, the cognitive functioning of patients with complex partial and generalized seizure disorders was examined. Control participants consisted of 45 neurologically normal individuals. Additional exclusion criteria were a history of serious medical disorders, psychiatric disorders, or substance abuse. There is no mention of the gender of the participants or their IQ; however, average age was 44.4 (11.4) years and average education was 12.8 (1.9) years. Participants were primarily recruited from nonmedical hospital staff at 13 different study centers across the United States. Only two trials (2.4 and 2.0 seconds) of Gronwall's 61-digit version of the PASAT were administered.
Study strengths 1. Sample composition is relatively well described in terms of age, education, and recruitment procedures but not gender or IQ. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The data are not partitioned by age or education group. 2. Only two trials of the PASAT were used.
[PASAT.11] Holdwick and Wingenfeld, 1999 (Gronwall Version) (Table A8.12)
The relationship between mood, anxiety, and attention was assessed in college students. Undergraduate participants were randomly assigned to different conditions in which various mood states were induced (e.g., sad or anxious). Twenty controls were assigned to a neutral condition. There is no specific information regarding the age, education, IQ, or gender of the controls. All were native English speakers, had adequate hearing, and had no history of repeating grades in elementary or high school. Additional exclusion criteria were history of psychological problems, neurological illness affecting attention, head trauma, medication use, substance abuse, attention problems, or learning disability. Age, gender, and ethnicity are described for the sample as a whole but not specifically for the control group. The 61-digit Gronwall version of the PASAT was administered using a computer. The four trials (2.4-, 2.0-, 1.6-, and 1.2-second pacing) were delivered via synthesized computer voice, and responses were recorded by a microphone. All responses were scored manually.
Study strengths 1. Adequate description of participant recruitment procedures. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small. 2. The age, education, and gender composition of participants in all conditions of the study are provided but not specifically for the control group.
[PASAT.12] Honn, Para, Whitacre, and Bornstein, 1999 (Gronwall Version) (Table A8.13)
The authors examined the role of exercise in HIV-positive and -negative males and found that exercise only minimally improved cognitive functioning in both groups. Seventy-six HIV-negative homosexual or bisexual males, with a mean age of 32.5 (6.3) and mean educational level of 14.6 (2.4) years, served as controls. Exclusion criteria were history of intravenous drug use, head injuries resulting in greater than 1 hour of unconsciousness, learning disability, or other neurological disease. In this control sample, 32.4% (n = 13) of nonexercisers and 13.2% (n = 5) of exercisers reported past history of marijuana abuse or dependence. Participants were also administered an intelligence test (WAIS-R), the SCID, and various anxiety and depression rating measures. Only three trials (2.4, 2.0, 1.6 seconds) of Gronwall's 61-digit version of the PASAT were administered.
Study strengths 1. Relatively large sample size. 2. The sample composition is well described in terms of age, education, gender, and IQ. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. An all-male sample is used. 2. Education levels are relatively high. 3. Recruitment procedures are not specified. 4. A portion of the sample reports a history of marijuana abuse or dependence.
Other comments 1. The exercisers scored significantly higher than the nonexercisers on the 1.6-second trial of the PASAT.
[PASAT.13] Wingenfeld, Holdwick, Davis, and Hunter, 1999 (Gronwall Version) (Table A8.14)
This study was designed to develop normative data for a computerized version of Gronwall's PASAT. The authors recruited 168 (80 males, 88 females) college students between the ages of 17 and 48, with an average age of 21 (5.1) years, at the University of Arkansas, Fayetteville. The sample was 88% Caucasian, 4% African American, 4% Asian American, and 4% other ethnic group. The data were first stratified by gender and then by two age groups (17-29, 30-48 years). Exclusion criteria were any history of neurological illness, emotional problems, learning disability, attentional problems, or uncorrected hearing difficulty. Only native English speakers were included. Subjects were given course credit for participation. The testing procedures are similar to those of Gronwall, except that the digits are presented by the computer via speaker and responses are recorded through an external speaker. Additionally, while all four trials are delivered (2.4-, 2.0-, 1.6-, and 1.2-second pacing), a new random series of the 61 digits is presented during each trial.
Study strengths 1. Adequate sample sizes, except for the 30-48 age group. 2. The data are stratified first by gender and then by two age groups (17-29, 30-48 years). 3. The sample composition is well described in terms of age, gender, ethnicity, and recruitment procedures. 4. Adequate exclusion criteria. 5. Test administration procedures are specified. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Cell size for the 30-48 age group is relatively small (n = 12).
Other comments 1. Additional outcome measures, such as number of errors committed and number of "no" responses, are reported in the original article but have not been reproduced in this chapter.
[PASAT.14] Bate, Mathias, and Crawford, 2001 (Gronwall Version) (Table A8.15)
This study examined the relationship between the Test of Everyday Attention and various neuropsychological measures in patients with severe head injury. The study was conducted in Australia, where 35 controls (20 males, 15 females) who were native English speakers with no history of psychiatric illness, neurological disorders, intellectual disability, substance abuse, or hemiplegia of the dominant hand were recruited. Participants were an average of 30.2 (10.3) years of age, obtained an average of 12.6 (2.0) years of education, and had an average premorbid IQ of 101.1 (9.1) based on the National Adult Reading Test-Revised (NART-R). The exact location and procedures for participant recruitment are not specified. Also, it is unclear whether the participants were patients with non-brain injury-related illness or healthy individuals from the community. The Gronwall 61-digit version of the PASAT was presented with all four trials (2.4-, 2.0-, 1.6-, 1.2-second pacing).
Study strengths 1. The sample composition is well described in terms of age, education, gender, and IQ. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample size is small. 2. Recruitment procedures are not well described. Controls may be non-head-injured medical patients. 3. The data were obtained on Australian subjects, which may limit their usefulness for clinical interpretation in the United States.
[PASAT.15] Boringa, Lazeron, Reuling, Ader, Hennings, Lindeboom, de Sonneville, Kalken, and Polman, 2001 (Gronwall Version) (Table A8.16)
The sensitivity of the Brief Repeatable Battery of Neuropsychological Tests, used to assess cognitive functioning in patients with MS, was evaluated in Amsterdam. This battery includes a modified, two-trial version of Gronwall's PASAT. A total of 140 healthy participants (62 males, 78 females) between the ages of 22 and 73, with an average age of 45.8 years, were recruited from the community. None had central nervous system disease, psychiatric illness, learning disability, history of substance abuse, serious head injury, or other major medical illness. In terms of education, 31 participants had <9 years, 55 had 9 or 10 years, and 53 had >10 years (one participant did not state his education). Gronwall's 61-digit version of the PASAT was administered using only two trials (3- and 2-second pacing).
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, and gender. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Over half of the sample has <10 years of education. 2. The data were obtained on individuals from Amsterdam, which may limit their usefulness for clinical interpretation in the United States.
[PASAT.16] Fluck, Fernandes, and File, 2001 (Gronwall Version) (Table A8.17)
The study had two goals: (1) to examine the effects of two dosages of lorazepam on attention in healthy individuals and (2) to investigate the effects of age and gender on selected tests of attention. More comprehensive norms are presented for the part of the study that examined age and gender; thus, those data have been reviewed in this chapter. Participants were 60 (30 males, 30 females) young and middle-aged adults recruited from the Guy's College campus in the vicinity of London, England, via newspaper advertisements and notices. The "young" men were an average of 21.1 (0.4) years of age and had an average IQ of 113.0 (1.5), the "young" women were an average of 20.9 (0.2) years of age and had an average IQ of 112.4 (1.7), the "middle-aged" men were an average of 57.5 (1.3) years of age and had an average IQ of 117.7 (1.8), and the "middle-aged" women were an average of 60.3 (0.7) years of age and had an average IQ of 113.3 (2.2). All participants were screened for physical illness in the past week, use of any medication, history of psychiatric disorders, and high scores on a depression or anxiety scale. All four trials (2.4, 2.0, 1.6, and 1.2 seconds) of Gronwall's version of the PASAT were used, and scores for each trial are presented.
Study strengths 1. The sample composition is well described in terms of age, gender, IQ, geographic area, and recruitment procedures. 2. The data are stratified by two age groups (young and middle-aged) × gender. 3. Adequate exclusion criteria are used. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells are relatively small. 2. Intelligence level for the sample is relatively high. 3. Educational levels are not reported. 4. The data were obtained on individuals from London, England, which may limit their usefulness for clinical interpretation in the United States.
[PASAT.17] Snyder, Cappelleri, Archibald, and Fisk, 2001 (Gronwall Version) (Table A8.18)
Using two different scoring methods for the PASAT, the authors examined the classification rates of patients with secondary progressive and relapsing-remitting types of MS. The authors reanalyzed data from MS patients and 35 (9 males, 26 females) healthy controls collected in an earlier study (Fisk & Archibald, 2001). Staff, volunteer workers, and students from the Queen Elizabeth II Health Science Centre, Dalhousie University, and the MS Society in Nova Scotia, Canada, served as controls. The average age of the participants was 37.97 (12.94) years, average education was 14.06 (2.27) years, and average raw WAIS-R Vocabulary subtest score was 54.5 (7.0). Exclusion criteria were history of drug or alcohol abuse, major psychiatric illness, learning disability, seizures, head trauma, or other neurological disorder. Additional exclusion criteria were use of specific medications, such as neuroleptics, benzodiazepines, antiepileptic drugs, or sedatives. All four trials (2.4, 2.0, 1.6, and 1.2 seconds) of Gronwall's version of the PASAT were administered. Two mean outcome measures are reported: (1) the mean number of correct responses across the four trials (i.e., the sum of the correct responses for all trials divided by 4) and (2) the dyad score, in which pairs of correct responses were counted as one correct point.
Study strengths 1. The sample composition is well described (in an earlier study by Fisk & Archibald, 2001) in terms of age, education, Vocabulary subtest performance, geographic area, and recruitment procedures. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample size is relatively small. 2. The data were obtained on Canadian subjects, which may limit their usefulness for clinical interpretation in the United States. 3. The educational level is relatively high (14.1 years).
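As an illustration of the two outcome measures described for PASAT.17, the following Python sketch scores a single trial and derives both summaries. It is a minimal sketch under stated assumptions, not the authors' scoring program: the function names and the short digit series are hypothetical, and the dyad rule coded here (a correct response that immediately follows another correct response earns one point) is only one reading of "pairs of correct responses," since the description above does not say whether overlapping pairs are counted.

```python
from typing import List, Optional, Sequence


def expected_sums(digits: Sequence[int]) -> List[int]:
    """PASAT targets: each digit is added to the digit presented just before it."""
    return [a + b for a, b in zip(digits, digits[1:])]


def score_trial(digits: Sequence[int],
                responses: Sequence[Optional[int]]) -> List[bool]:
    """Mark each response correct or incorrect (None means no response)."""
    targets = expected_sums(digits)
    if len(responses) != len(targets):
        raise ValueError("need exactly one response slot per target")
    return [r == t for r, t in zip(responses, targets)]


def total_correct(marks: Sequence[bool]) -> int:
    """Number of correct responses on one trial."""
    return sum(marks)


def dyad_score(marks: Sequence[bool]) -> int:
    """ASSUMED rule: one point per correct response that directly follows
    another correct response (overlapping pairs are counted)."""
    return sum(1 for prev, cur in zip(marks, marks[1:]) if prev and cur)


def mean_correct(trial_totals: Sequence[int]) -> float:
    """Sum of correct responses over all trials divided by the number of trials."""
    return sum(trial_totals) / len(trial_totals)


# Hypothetical six-digit series (a real trial uses 61 digits).
marks = score_trial([3, 5, 2, 8, 1, 4], [8, 7, None, 9, 5])
print(total_correct(marks), dyad_score(marks))   # 4 correct, 2 dyads
print(mean_correct([50, 44, 38, 28]))            # hypothetical totals for four trials
```

In this toy series the missed third response breaks the run of correct answers, so four correct responses yield only two dyads. That gap is what makes a dyad-type score stricter than a simple count of correct responses.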
Levin's Administration Version
[PASAT.18] Brittain, La Marche, Reeder, Roth, and Boll, 1991 (Levin Version) (Tables A8.19 and A8.20)
In this normative study using the Levin et al. (1987) version of the PASAT, the authors present data for 526 healthy participants (aged 17-88 years). The data were stratified by four age groups (<25, 25-39, 40-54, and >55 years). In the <25 age group, there were 145 (55 male, 90 female) participants, 79 Caucasians and 66 "other" race, with an average of 13.0 (1.3) years of education and an average Shipley IQ of 105.0 (9.1). In the 25-39 age group, there were 164 (67 male, 97 female) participants, 114 Caucasians and 50 "other" race, with an average of 14.0 (2.2) years of education and an average Shipley IQ of 103.0 (10.4). In the 40-54 age group, there were 95 (50 male, 45 female) participants, 79 Caucasians and 16 "other" race, with an average of 13.0 (3.1) years of education and an average Shipley IQ of 101.0 (12.6). In the >55 age group, there were 122 participants, 119 Caucasians and 3 "other" race, with an average of 12.0 (2.5) years of education and an average Shipley IQ of 106.0 (15.1). For the >55 age group, the authors report 82 males and 82 females, but this appears to be a misprint since there were only 122 participants in total for this age group. Exclusion criteria were a history of psychiatric or neurological problems, as well as concussions or loss of consciousness. A detailed description of this modified version of the PASAT is presented. Error rates (rather than correct responses) and seconds taken for each response are used as the outcome measures.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and Shipley IQ. 2. The data are stratified by age and IQ level. 3. Adequate exclusion criteria. 4. Test administration procedures are well specified. 5. Means and SDs for the error scores are reported.
Considerations regarding use of the study 1. The data are not stratified by educational level. 2. Overall sample is adequate, but some of the individual cells are small.
Other comments 1. The number of errors, rather than the number of correct responses, is reported. 2. Data for number of seconds taken to respond are reported in the original article, but since these data are rarely used in clinical evaluations, they have not been reproduced in this chapter.
[PASAT.19] Roman, Edwall, Buchanan, and Patton, 1991 (Levin Version) (Table A8.21)
The authors conducted this study in order to provide additional normative data for the Levin et al. (1987) version of the PASAT. They recruited 143 white adults in three different age groups (18-27, 33-50, and 60-75). IQ was prorated with the Block Design and Vocabulary subtests from the WAIS-R. In the 18-27 age group, there were 62 (58% female) participants, with an average education of 12.0 (0.77) years and an average IQ of 110 (12.3). In the 33-50 age group, there were 40 (50% female) participants, with an average education of 15.0 (2.6) years and an average IQ of 110 (12.3). In the 60-75 age group, there were 41 (51% female) participants, with an average education of 15.0 (3.2) years and an average IQ of 107.0 (11.0). Participants were undergraduate students and employees of Baylor University, students from a local business college, members of service clubs and retired professional groups, employees of local businesses, individuals from senior citizen organizations, and individuals in retirement communities. Only one-fourth of the participants were paid ($5 each). Exclusion criteria were a history of head injury with loss of consciousness, other neurological disorders, substance abuse, psychiatric disorders, or current use of psychoactive medication.
Study strengths 1. Relatively large sample. 2. The sample composition is well described in terms of age, education, gender, ethnicity, IQ, geographic location, and recruitment procedures. 3. The data are presented for three age groups. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Educational levels are high in the middle-aged and older adult age groups.
Other comments 1. IQ was estimated using only the Vocabulary and Block Design subtests of the WAIS-R.
[PASAT.20] Cicerone, 1997 (Levin Version) (Table A8.22)
The author compared the attentional abilities of mildly head-injured patients and normal controls on four neuropsychological tests. Forty control participants between the ages of 18 and 59, with an average age of 33.3 (12.4) years and average educational level of 14.9 (2.2), were enrolled. Participants had no history of head injury, neurological disease, or psychiatric illness and were recruited from the Edison, New Jersey, community. They were administered the Levin et al. (1987) version of the PASAT.
Study strengths 1. Adequate sample size. 2. The sample composition is well described in terms of age, education, geographic area, and recruitment procedures but not gender. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Wide age range among participants. 2. Educational level is relatively high. 3. Total PASAT scores, rather than individual scores for each of the four trials, are reported.
[PASAT.21] Wiens, Fuller, and Crossen, 1997 (Levin Version) (Tables A8.23 and A8.24)
This is a normative study for Levin et al.'s (1987) version of the PASAT. The authors selected 821 (672 male, 149 female) participants aged 20-49 years who were administered neuropsychological and psychological tests as part of a civil service job selection process. There were 699 Caucasians, 46 African Americans, 31 Hispanics, 32 Asians, and 13 Native Americans in the sample. The data were stratified by gender. Male participants were an average of 29.2 (6.1) years of age, with an average education of 14.6 (1.5) years and an average WAIS-R full-scale IQ (FSIQ) of 106.6 (11.0). Female participants were an average of 29.2 (5.6) years of age, with an average education of 14.5 (1.6) years and an average WAIS-R FSIQ of 105.4 (11.1). They were all from the Pacific Northwest of the United States. All participants had passed physical and medical health screening prior to test administration. All had passed a test of basic academic skills, and none had alcohol or substance abuse. All four trials of Levin's version of the PASAT were administered.
Study strengths 1. The sample composition is well described in terms of age, education, gender, IQ, ethnicity, geographic location, and recruitment procedures. 2. The data are stratified by gender and by age × IQ. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Overall sample size is adequate, but some of the individual cells are relatively small.
Other comments 1. The authors found differences between the ethnic groups, but the sample sizes were too small to make any definitive conclusions.
[PASAT.22] Tiersky, Cicerone, Natelson, and DeLuca, 1998 (Levin Version) (Table A8.25)
Information-processing speed was compared among patients with chronic fatigue syndrome, patients with mild head injury, and normal controls. All 20 normal control participants were females, who were recruited from advertisements in the local community of New Jersey and paid for their participation. Participants were an average of 37.1 (2.4) years of age, with an average education of 15.0 (0.55) years. Exclusion criteria were current medical illnesses, a history of loss of consciousness >5 minutes, psychiatric illness, use of medication, or participation in a regular exercise program. The Levin et al. (1987) version of the PASAT was used, and the total number of correct responses for all four trials was reported.
Study strengths 1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures. 2. Adequate exclusion criteria. 3. Reference is provided for test administration procedures. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small sample size. 2. Female participants only. 3. Education level is high. 4. Total scores are reported instead of individual scores for each of the four trials.
[PASAT.23] Stein, Kennedy, and Twamley, 2002 (Levin Version) (Table A8.26)
The authors examined the difference in neuropsychological test performance of female victims of partner violence with posttraumatic stress disorder (PTSD) compared to victims without PTSD and nonvictimized controls. Twenty-two female control participants were recruited through posted advertisements and personal contacts in the San Diego, California, community. They were an average of 29.4 (10.7) years of age, had an average of 13.9 (1.5) years of education, and had an average raw WAIS-III Verbal subtest score of 45.9 (7.4). All participants were fluent English speakers and had at least an 8th grade reading ability. Further exclusion criteria were meeting DSM-IV criteria for PTSD; use of psychotropic medication within the last 6 weeks of the study; use of oral or intramuscular steroids within the last 4 months of the study; learning disability; history of attention-deficit disorder, substance abuse, seizure disorder, schizophrenia, or other psychotic disorders; or neurological illness. The Levin et al. (1987) version of the PASAT was used, and the total number of correct responses for all four trials was recorded.
Study strengths 1. The sample composition is well described in terms of age, education, Vocabulary subtest performance, geographic area, and recruitment procedures. 2. Adequate exclusion criteria. 3. While test administration is not described, appropriate reference is made to the version of the PASAT used. 4. Means and SDs for the test scores are reported.
Considerations regarding use of this study 1. The sample is small. 2. An all-female sample is used. 3. Summary scores across all trials are reported, rather than correct responses for each individual trial.
[PASAT.24] Diamond, DeLuca, Kim, and Kelley, 1997 (Levin Version) (Table A8.27)
This study compared performance on the PASAT and the visual analog version of the PASAT (the PVSAT) of patients with MS and controls. The authors recruited 22 participants to serve as controls on the PASAT task. There is no information about the gender of the participants. They ranged in age from 31 to 56, with an average age of 40.9 (8.9), average educational level of 15.4 (2.2), and average North American Adult Reading Test (NAART) premorbid IQ of 113.6 (13.0). None of the participants had a history of psychiatric or neurological disorders, drug or alcohol abuse, or loss of consciousness. All participants had normal Mini-Mental Status Exam scores. Participants were recruited from either the Kessler Institute in West Orange, New Jersey, or the local community. The authors report using a 50-digit version of the PASAT at four pacing intervals (2.4, 2.0, 1.6, and 1.2 seconds). However, it is unclear whether the standard Levin et al. (1987) procedures were used.
Study strengths 1. The sample composition is relatively well described in terms of age, education, IQ, geographic area, and recruitment procedures but not gender. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample size is small. 2. It is unclear whether the digits were presented in a different random order or in a fixed random order across trials. 3. The educational level is relatively high.
PASAT-50, PASAT-100, and PASAT-200 Administration Versions
[PASAT.25] Diehr, Heaton, Miller, and Grant, and the HIV Neurobehavioral Center, 1998 (PASAT-200 Version) (Table A8.28)
The authors present normative data for a large sample of Caucasian and African-American males and females, using a modified version of the PASAT (i.e., PASAT-200; see section on Modifications and Alternate Formats of the PASAT). A total of 566 participants were used from four separate studies. One hundred fifty of the participants were HIV-1-seronegative controls recruited from a research center in San Diego, California; 277 participants were African-American volunteers recruited for a normative study from the San Diego, California, community; 78 served as controls for a study examining the effects of alcohol on cognitive performance; and 60 were controls for a study examining the effects of eosinophilia myalgia syndrome. Exclusion criteria for all studies were history of neuropsychiatric conditions such as substance abuse or dependence, head injury, and developmental disability. Participants ranged in age from 20 to 68, with an average age of 39.7 (12.1) years, and ranged in education from 9 to 20, with an average education of 14.2 (2.6) years; 39% were female and 55% were African American. Briefly, the PASAT-200 is very similar to Levin et al.'s (1987) version in that it consists of the presentation of 50 single digits (except for 7) in random order at four different pacing intervals. However, the pacing intervals are 3.0, 2.4, 2.0, and 1.6 seconds per digit, instead of 2.4, 2.0, 1.6, and 1.2 seconds.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, ethnicity, gender, geographic area, and recruitment criteria. 3. Test administration procedures are specified. 4. Adequate exclusion criteria are used. 5. Means and SDs for the test scores are reported. 6. Data are stratified by ethnicity and by educational level.
Considerations regarding use of the study 1. Total PASAT-200 scores, rather than individual scores for each of the four trials, are reported.
[PASAT.26] Diehr, Cherner, Wolfson, Miller, Grant, Heaton, and the HIV Neurobehavioral Research Center Group, 2003 (PASAT-50, -100, -200 Versions) (Table A8.29)
The authors present demographically corrected normative data for two shortened versions of the PASAT-200, namely, the PASAT-50 and the PASAT-100. The authors used 560 (61% male) participants from a pool of archival data on which the PASAT-200 normative information was based (Diehr et al., 1998). Participants ranged in age from 20 to 68, with an average age of 39.7 (12.1), and 24% of the sample was over 50 years. Their education level ranged from 9 to 20, with an average education of 14.2 (2.6) years. Most (33%) had between 13 and 15 years of education, 21% had a high school education, and 12% had lower than a high school education. Forty-five percent of the sample were Caucasian, while the remaining 55% were African American. All participants were screened for psychiatric illness, developmental disabilities, substance abuse, and head injuries. A more detailed description of the sample is provided above (PASAT.25) and in Diehr et al. (1998). Briefly, the PASAT-50 consists of one trial of 50 digits (excluding 7) presented in random order at a pace of 3 seconds. The PASAT-100 consists of the same 50 digits presented over two trials, at a 3-second pace and a 2.4-second pace.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, ethnicity, gender, geographic area, and recruitment criteria. 3. Test administration procedures are specified. 4. Adequate exclusion criteria are used. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The average education level of the sample is relatively high. 2. Total scores, rather than individual scores for each trial, are reported.
CONCLUSIONS
Studies have documented the utility of the PASAT as a measure of attention/concentration, working memory, and information processing. In fact, the National Multiple Sclerosis Society included a version of this test in their Brief Repeatable Battery of Neuropsychological Tests. However, the major drawback of the original version of the PASAT is that it can be a lengthy, difficult, and stressful test, and several studies have noted participant frustration and attrition. Fortunately, there are alternatives to the original version of the test. Clinicians can administer only one or two trials rather than all four or use alternative, shortened versions of the PASAT. A review of the literature reveals that there are no significant gender effects for the PASAT but that scores are strongly affected by age, education, and intellectual functioning. As would be expected for most tests involving speed, PASAT performance significantly declines with age, particularly as the pacing time for the digits is reduced, requiring more cognitive resources. Likewise, inspection of the data clearly reveals an improvement in performance with higher educational levels. While not all studies have found strong correlations between the PASAT and intellectual functioning, the data reviewed in this chapter indicate that it is an important factor to consider when administering this test. It is clear that further normative studies partitioning the effects of age, education, and IQ are needed. Significant practice effects have been reported for the Gronwall (1977a,b) version of the PASAT, presumably because the digits are presented in the same random order during each pacing trial. This problem has been addressed to some degree with Levin et al.'s (1987) version, in which digits are presented in a different random order during each trial. The effects of culture, ethnicity, and linguistic background on the PASAT have received very little attention. Only one study explicitly examined the role of ethnicity in PASAT performance (Diehr et al., 1998). It is clear that future PASAT normative studies need to examine factors such as culture, ethnicity, and bilingualism.
•Meta-analyses for the PASAT were conducted using data reported in this chapter for each of the four presentation rates separately. Although the R2 and significance level for the resulting regression were minimally acceptable, we felt that the solution was greatly inHuenced by only few data points which had a considerable weight. Therefore, the results of meta-analyses are not presented in this chapter.
J
9 Cancellation Tests
BRIEF HISTORY OF THE TESTS
A number of cancellation tests have been developed over the years. Such tests are primarily designed to assess aspects of attention, such as sustained and selective attention. Sustained attention "refers to the ability to maintain a consistent level of performance over an extended period of time," while selective attention entails selection of relevant target stimuli while avoiding distractors (Ruff & Allen, 1996). Some cancellation tests are also referred to as "vigilance tests" (Lezak, 1995; Lezak et al., 2004) and typically involve measures of both speed and accuracy of performance. A number of cancellation tests using letters, numbers, or symbols as target stimuli are available to clinicians. The Ruff 2&7 (Ruff et al., 1986a), Digit Vigilance (Lewis & Rennick, 1979), Digit Cancellation Test (Della Salla et al., 1992, 1998), Visual Search and Attention Test (Trenerry et al., 1990), Verbal and Nonverbal Cancellation Tasks (Mesulam, 1985), Letter and Symbol Cancellation Task (Caplan, 1985), and Star Cancellation (Halligan et al., 1991; Wilson et al., 1987) are among the many cancellation tests available to clinicians and researchers (see Lezak, 1995, and Lezak et al., 2004, for more details on these tests). The Ruff 2&7 Selective Attention Test and Digit Vigilance Test are the two most commonly used cancellation tests with the most available literature and have been selected for review in this chapter.
RUFF 2&7 SELECTIVE ATTENTION TEST
Brief Overview of the Ruff 2&7
The Ruff 2&7 Selective Attention Test was developed by Ruff and colleagues and is included in the San Diego Neuropsychological Test Battery (Baser & Ruff, 1987; Ruff & Crouch, 1991). The test is designed to examine both sustained and selective attention using two distractor conditions. The test consists of 20 blocks, each containing three lines of 50 characters. Within each line, 10 target digits (2s and 7s) are intermixed with either other number distractors or capital letter distractors. Ruff distinguished two test conditions: (1) blocks in which the target numbers are embedded among letters, referred to as the "Automatic Detection" condition, and (2) blocks in which the target stimuli are embedded among other numbers, referred to as the "Controlled Search" condition. The presentation of the conditions (blocks of all digits or blocks of digits and letters) is alternated. Following brief practice trials, the examinee is given 15 seconds to complete each of the 20 blocks. He or she is
prompted to move to the succeeding block when the examiner says "next." Ruff and Allen (1996) state that in the Automatic Detection condition, because the numbers belong to a different stimulus category from the letters, the selection process is automatic (i.e., "single-step retrieval of categorical information"). However, in the Controlled Search condition, since the targets and distractors belong to the same category, a more effortful search involving aspects of working memory is required. Three outcome measures can be obtained for each of the two conditions: (1) speed is measured with the total number of target digits crossed out, (2) errors consist of the total number of commissions and omissions, and (3) detection accuracy is calculated by dividing the speed value by the sum of the speed plus error values (Ruff & Allen, 1996). A number of clinical studies have been conducted with the Ruff 2&7 test. Ruff et al. (1992) found that patients with right hemisphere cerebral lesions performed at far slower rates than those with left-sided lesions and normal controls. Interestingly, those with right anterior lesions were also far less accurate in their performance, while patients with left anterior lesions performed similarly to controls. Ruff et al. (1989a) examined the effects of cognitive rehabilitation on Ruff 2&7 performance in patients with head injury. They found that teaching cognitive strategies, such as focused, sustained attention, as well as teaching spatial relationships and memory strategies, improved test performance over time. Specifically, on the Ruff 2&7, patients in the cognitive strategy condition made fewer errors relative to those in the control condition. Bate et al. (2001) found that patients with severe traumatic brain injury (TBI) crossed out fewer target stimuli (i.e., were slower) than normal controls.
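The three Ruff 2&7 outcome measures described above can be illustrated with a short sketch. It follows only the verbal definition given in this chapter (accuracy equals speed divided by speed plus errors); the function name, argument names, and the example counts are hypothetical and are not taken from the test manual, which should be consulted for the official scoring rules.

```python
def ruff27_scores(hits: int, omissions: int, commissions: int) -> dict:
    """Compute speed, errors, and detection accuracy for one condition.

    Assumption: 'speed' is the count of targets crossed out (hits), and
    'errors' is omissions plus commissions, as described in the text.
    """
    if min(hits, omissions, commissions) < 0:
        raise ValueError("counts must be non-negative")
    speed = hits
    errors = omissions + commissions
    denominator = speed + errors
    if denominator == 0:
        raise ValueError("no responses recorded; accuracy is undefined")
    accuracy = speed / denominator
    return {"speed": speed, "errors": errors, "accuracy": accuracy}


# Hypothetical example: 120 targets crossed out, 6 omissions, 2 commissions.
example = ruff27_scores(hits=120, omissions=6, commissions=2)
print(example)
```

With these made-up counts, accuracy is 120 / 128, so an examinee can be fast (high speed) yet still lose accuracy through either kind of error.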
Additionally, while significance values are not reported, the TBI patients who were within 1 year postinjury were slower than those who were at least 2 years postinjury. Cicerone and Azullay (2002), in their examination of the sensitivity and specificity of various neuropsychological tests in patients with mild TBI (but whose symptoms persisted for at least 3 months), found the Ruff 2&7 test to be among the most sensitive and specific measures. They concluded that this test "can be used with confidence" since those without concussions were unlikely to display impairments on the Ruff 2&7. Finally, Ruff et al. (1993) found that the Ruff 2&7 was among the neuropsychological tests that most strongly predicted head-injured patients' ability to return to work after 1-6 months postinjury. Ruff (1994) observed relatively mild impairment in depressed patients on the Ruff 2&7. The percentile ranking of the majority of patients fell within the average range for speed and accuracy. In fact, none of the depressed patients was impaired on the accuracy measures, and only three patients exhibited slowed speed. Weiss (1996) reported that schizophrenic patients had more difficulty with speed (only 23% of patients scored in the normal range) than with accuracy (67% scored in the normal range) on the Ruff 2&7. Additionally, patients were better able to detect a target stimulus when it was embedded in letters (Automatic Detection condition) rather than within other digits (Controlled Search condition). Finally, Schmitt et al. (1988) discovered that AIDS patients and patients with AIDS-related complex who were on medication displayed improved performance on the Ruff 2&7 relative to those who were receiving a placebo. Further details about the Ruff 2&7 testing materials, administration procedures, and scoring can be obtained from the test manual and kit (see Appendix 1 for ordering information; also Lezak et al., 2004).
Psychometric Properties of the Ruff 2&7
Ruff et al. (1986a) performed a test-retest reliability study of the Ruff 2&7 for four age groups, ranging between 16 and 70 years of age. Testing probes were separated by 6 months. The correlation coefficients for the four age groups by the two conditions (i.e., automatic or controlled) ranged 0.84-0.97. The r values were in approximately the same ranges for the four age groups; however,
slightly better performance was noted for the automatic condition (letter distractors) relative to the controlled condition (digit distractors). While an improvement of approximately 10 points on the retest was reported, the two conditions showed similar rates of practice effects (Ruff et al., 1986a). Baser and Ruff (1987) conducted factor analysis on the Ruff 2&7 along with a host of other neuropsychological tests and found that in normal controls the Ruff 2&7 best loaded on a factor they termed "complex intelligence." This factor also contained such measures as Controlled Oral Word Association, Full Scale IQ, Vocabulary, Block Design, Digit Span, and Digit Symbol. However, in the same study, using a mixed clinical sample (e.g., psychiatric and head-injured patients), the Ruff 2&7 outcome measures loaded on an "arousal" factor (which also included Finger Tapping, mean designs on the Ruff Figural Fluency Test, Digit Symbol) and a "planning and flexibility" factor (which also included outcome measures from the Wisconsin Card Sorting Test, perseverative score from the Ruff Figural Fluency Test, and Ruff-Light Trail Learning Test).
Relationship Between Ruff 2&7 Performance and Demographic Factors
Ruff et al. (1986a) examined differences between genders, four age groups, and three educational levels on the two Ruff 2&7 conditions. They found no gender effects, with males and females performing similarly across the two conditions. Clear age effects were found across the two conditions, with a linear decline in performance as age increased. Similarly, they found that performance improved as educational level increased up to 15 years; Ruff 2&7 performance plateaued at >15 years of education. They also found that on average individuals performed approximately 15 points better on the Automatic Detection (letter distractors) relative to the Controlled Search (digit distractors) condition. Clearly, more normative studies are needed to better understand the relationship between key demographic factors and Ruff 2&7 performance. Additional studies should also examine
the effects of intellectual functioning, ethnicity, and motor functioning on the Ruff 2&7. For further normative information regarding the Ruff 2&7, see the professional manual produced by Ruff and Allen (1996).
DIGIT VIGILANCE TEST
Brief Overview of the DVT
The Digit Vigilance Test (DVT) was developed by Lewis and Rennick (1979) as part of a larger test battery, the Repeatable Cognitive-Perceptual-Motor Battery. The DVT is a test of vigilance and sustained attention, which also measures aspects of rapid visual tracking ability and psychomotor speed. This test consists of two pages, with 35 single digits appearing within 59 rows. The digits on the first page are printed in red ink, and the digits on the second page are printed in blue ink. For the standard administration, the task is to cross out the number 6, which is randomly dispersed throughout the page of digits. The alternate administration procedure requires that the participant cross out the number 9, which also randomly appears throughout the page of digits. The time in seconds taken to complete the task, the number of omissions (target numbers not crossed out), and the number of commissions (numbers other than the target crossed out) are recorded. There are relatively few clinical or normative studies on this test. In a study of mildly hypoxemic patients with chronic obstructive pulmonary disease (COPD), Prigatano et al. (1983) observed that patients required a significantly greater amount of time to complete the DVT relative to normal controls. In a study by Bardwell et al. (2001), the DVT was the only neuropsychological test score to significantly improve in obstructive sleep apnea patients who were given continuous positive airway pressure relative to those who were given placebo treatment (Grant et al., 1987). These studies suggest that the DVT, and perhaps similar cancellation tests, is sensitive to neuropsychological deficits in patients with even mild forms of hypoxemia. Smith et al. (2001) reported better performance on the DVT in postmenopausal women
who were on hormone replacement therapy (HRT) relative to their age-matched counterparts who were not taking HRT. Shean et al. (2002) found that coaching or providing test-taking instructions significantly improved DVT performance in a group of patients with schizophrenia. Additionally, these authors detected that negative symptoms and degree of disorganized thought significantly correlated with lack of ability to benefit from coaching on the DVT. These findings essentially replicated an earlier study by Eckman and Shean (2000).
Psychometric Properties of the DVT
Kelland and Lewis (1994) reported a test-retest (probes separated by 1 week) coefficient of 0.87, with a 95% confidence interval of 0.71-0.95, for the standard form test administration of the DVT and a coefficient of 0.89, with a 95% confidence interval of 0.75-0.96, for the alternate form administration. Unfortunately, these data are based on a sample of only 20 individuals. In a subsequent study, Kelland and Lewis (1996) reported practice effects on the DVT, with test speed improving on the second week of test administration relative to the first (initial) testing session. However, no improvements were noted between the third week of testing relative to the second. Kelland and Lewis (1996) also assessed the convergent validity of the Repeatable Cognitive-Perceptual-Motor Battery, which contains the DVT, by evaluating its sensitivity to diazepam. While the overall score for the battery discriminated between individuals on diazepam and placebo, no differences were found between the two groups for the DVT. However, this was also a small sample, with each group containing only 20 individuals. Grant et al. (1987) conducted a factor analysis on tests from the Halstead-Reitan Neuropsychological Test Battery and several other neuropsychological tests, including the DVT, in COPD patients and healthy controls. They observed the DVT to cluster with tests of "alertness-psychomotor speed," such as Trails B and Digit Symbol. In the same study, they noted that the DVT was one of only three neuropsychological tests that did not discriminate between mild, moderate, and severe hypoxemic COPD patients but did discriminate between the COPD group as a whole and normal controls. Overall, these authors conclude that the DVT clusters with tests of attention and psychomotor speed and that it is a sensitive test for discriminating COPD patients from controls but not for discriminating patients at various stages of COPD.
Relationship Between DVT Performance and Demographic Factors
As noted earlier, there are very few normative studies available for the DVT. Heaton et al. (1991) included the DVT in their comprehensive normative book on various neuropsychological tests, making this the largest normative study to date on the DVT. Heaton et al. (1991) detected that in a group of 210 participants, age and years of education accounted for 24% and 13% of variability in the time to complete the test, respectively, and for 15% and 16% of variability in the number of errors committed. However, gender alone accounted for only 2% of the variability in DVT outcome measures. Kelland and Lewis (1996) also found no gender effect for total time required to complete the task or for total number of errors in a group of college students.
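Because so many of the DVT figures rest on small samples, it can help to see how wide a confidence interval around a reliability coefficient becomes when n is small. The sketch below uses the Fisher z transformation, a common textbook approach; it is an assumption that this resembles the method used by Kelland and Lewis (1994), whose procedure is not described in this chapter. The function name and the default critical value are illustrative choices.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple:
    """Approximate confidence interval for a correlation via Fisher's z.

    Assumption: bivariate normality and independent cases; z_crit=1.96
    corresponds to a two-sided 95% interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must exceed 3 for the standard error to exist")
    z = math.atanh(r)                 # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# Test-retest coefficient of 0.87 with 20 participants, as cited earlier.
low, high = fisher_ci(0.87, 20)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

For r = 0.87 and n = 20 this gives an interval of roughly 0.70 to 0.95, close to but not identical with the published 0.71-0.95, which is consistent with the authors having used a similar but not necessarily identical procedure.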
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the Ruff 2&7 and DVT normative reports, five criterion variables were deemed critical. The first four of these are related to subject variables, and the last one refers to procedural issues.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if gender, intellectual level, handedness, geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Interval
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since a strong effect of age on cancellation test performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the possible association between education and cancellation test scores, information regarding educational level should be reported for each subgroup.
Procedural Variable
Data Reporting
For the Ruff 2&7, group means and standard deviations for the number of items correctly cancelled should be reported for the Automatic Detection and Controlled Search conditions separately. For the DVT, the mean and SD for time in seconds taken to complete the task should be reported. Additional useful information for the cancellation tests includes the number of omissions (target numbers not cancelled) and the number of commissions (numbers other than the target digits cancelled).
SUMMARY OF THE STATUS OF THE NORMS
Information presented in the studies reporting data for the cancellation tests differs across studies. Some of these differences will be summarized below.
Only one study was designed to provide normative information on the Ruff 2&7 (Ruff et al., 1986a). Other data on the Ruff 2&7 come from control groups in clinical comparison studies. Ruff et al. (1986a) partition normative data for the two conditions by four age groups and three educational levels; the other studies report demographic information. Another study by Ruff et al. (1992) provides normative data for speed and accuracy for normal controls. Finally, Bate et al. (2001) provide Ruff 2&7 data on a small sample of healthy controls. Most of these studies report either speed or speed and accuracy data summed across the two Ruff 2&7 conditions. Additional normative information, particularly tables for converting raw scores into T scores and percentiles, based on age and educational level, is provided in the Ruff 2&7 professional manual (Ruff & Allen, 1996). There are very few normative studies on the DVT. Most of the studies have small sample sizes (10-40), with the exception of Heaton et al.'s (1991, 2004) normative manuals, which include data for 210 participants with standardized scores adjusting for age, education, and gender presented for African-American and Caucasian participants separately in the 2004 edition. In this chapter, we review studies which use the Ruff 2&7, followed by DVT studies. Published manuals are reviewed first, followed by normative studies and control groups from clinical comparison studies presented in ascending chronological order for each test separately. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 9. Table A9.1, the locator table, summarizes information provided in the studies described in this chapter.¹
¹Children's norms for various cancellation tests are available in Baron (2004) and Spreen and Strauss (1998).
SUMMARIES OF THE STUDIES
Ruff 2&7 Manual
[Ruff 2&7.1] Ruff and Allen, 1996
The normative information in this manual is primarily based on previous studies by Ruff and colleagues (Ruff et al., 1986a; Baser & Ruff, 1987; Ruff & Crouch, 1991). A total of 360 (180 male, 180 female) healthy volunteers between the ages of 16 and 70 years and with 7-22 years of education participated in the study. The sample was initially stratified by four age groups (16-24, 25-39, 40-54, and 55-70 years) and three education groups (≤12, 13-15, ≥16 years) but not gender since this was not a significant factor in test performance. The authors mention that the sample "roughly approximated the 1980 U.S. census proportions with regard to race," but no specific ethnicity data are provided. Data are available for speed and accuracy for each condition individually, as well as total scores for speed and accuracy for the two conditions combined. Thus, a total of six outcome variables are available. Raw score to T score conversion and percentiles are available by age and educational level. Sixty-five percent of the sample was recruited from California, 30% from Michigan, and the rest from the eastern seaboard. The normative data contained in Ruff and Allen's manual are not reproduced here, and the interested reader is referred directly to this publication for further information.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and geographic area. 2. The testing procedures and scoring are well described in the manual. 3. Means and SDs are reported for some of the Ruff 2&7 outcome measures. 4. Raw scores can easily be converted to T scores and percentiles for four age groups and three educational levels.
Considerations regarding use of the study 1. Overall sample is adequate, but some individual cells are relatively small (e.g., fewer than 20 participants in the 55-70 year age group who have 13-15 years of education). 2. No exclusion criteria and recruitment procedures are provided.
Normative Studies and Control Groups in Clinical Comparison Studies for the Ruff 2&7
[RUFF 2&7.2] Ruff, Evans, and Light, 1986a (Table A9.2)
The authors recruited 259 healthy participants (107 male, 152 female) as part of this normative study. Nearly half of the sample was recruited from California and the rest from Michigan. The investigators selected individuals with a wide range of ages and educational attainment in order to examine the effects of these demographic factors on test performance. Participants were aged 16-70. The authors report that their sample had 7-72 years of education, but it is unclear whether the upper limit reported is a misprint. The sample was stratified by four age groups (16-24, 25-39, 40-54, and 55-70 years) and three educational levels (≤12, 13-15, ≥16 years). Standard administration procedures were used.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and geographic area. 2. Means and SDs for the test scores are reported. 3. Data are stratified by four age x three education groups.
Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small (e.g., some cells contain only 10 participants). 2. No exclusion criteria and recruitment procedures are reported.
[RUFF 2&7.3] Ruff, Niemann, Allen, Farrow, and Wylie, 1992 (Table A9.3)
This study examined the effects of cerebral lesions on Ruff 2&7 performance. The authors selected 60 normal controls from a larger standardization sample of 259 reported by Baser and Ruff (1987). The larger sample was recruited from California, Michigan, and New York. Participants were screened for chronic medical illness, "extensive" substance abuse,
or loss of consciousness due to a head injury. The ethnic breakdown is reported by Baser and Ruff (1987) for the larger subject pool but not for the subsample that served in this study. The 60 participants in the current study were an average of 31.2 (4.1) years of age and had an average of 12.9 (1.5) years of education. There is no information on the gender distribution for this sample. Standard administration procedures were used.
Study strengths 1. Sample size is adequate. 2. The sample composition is well described in terms of age and education. 3. Adequate exclusion criteria. 4. Means and SDs for the total scores for both conditions are reported.
Consideration regarding use of the study 1. The data are not partitioned by age.
[RUFF 2&7.4] Bate, Mathias, and Crawford, 2001 (Table A9.4)
This study examined the relationship between the Test of Everyday Attention and various neuropsychological measures in patients with severe head injury. The study was conducted in Australia, where 35 controls (20 male, 15 female), who were native English speakers, with no history of psychiatric illness, neurological disorders, intellectual disability, substance abuse, or hemiplegia of the dominant hand, were recruited. The exact location and procedures for participant recruitment are not specified. Also, it is unclear whether the participants were patients with non-brain injury-related illness or healthy individuals from the community. Participants were an average of 30.2 (10.3) years of age, had an average of 12.6 (2.0) years of education, and had an average premorbid IQ of 101.1 (9.1), as estimated by the National Adult Reading Test-Revised (NART-R) (Crawford, 1992). Standard administration procedures were used.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and premorbid IQ. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample size is relatively small. 2. Recruitment procedures are not well described. Controls may be non-head-injured medical patients. 3. The data were obtained on Australian participants, which may limit their usefulness for clinical interpretation in the United States.
DVT Manual
[DVT.1] Heaton, Grant, and Matthews, 1991; Heaton, Miller, Taylor, and Grant, 2004
The DVT manual (Lewis, 1995) refers the reader to the comprehensive normative book published by Heaton et al. (1991). Heaton et al. (1991) gathered a large sample of data on various neuropsychological tests over a 15-year period using several studies. The DVT is among the tests for which normative data are presented. The total sample used in this normative book was recruited from various areas across the United States, including California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, and Virginia, as well as Canada. It is unclear which specific regions were used for DVT data collection. All participants reportedly completed structured interviews, and those with a history of learning disabilities, neurological illness, "significant" head injury, "serious" psychiatric illness (e.g., schizophrenia), or substance abuse were excluded from the normative data set. The DVT normative data were gathered on a total of 280 participants, who were an average of 44.9 (20.0) years of age and obtained an average of 14.0 (3.2) years of education. The manual provides regression-based raw to T score and percentile conversion for the DVT (and other neuropsychological tests) based on gender, 10 age groups (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, and 75-80 years) and six education groups (6-8, 9-11, 12, 13-15, 16-17, and 18+ years). The average DVT raw score reported for the entire sample of 280 participants for time taken to complete the task is 388.5 (86.5), and that for errors committed is 7.1 (8.7). Other data from the manual are not reproduced here. Interested readers are referred to the original publication. In their recently updated normative manual, Heaton et al. (2004) have gathered additional normative data for the DVT (and other neuropsychological tests). Their sample consists of 860 normal participants, of whom 466 are Caucasian and 394 are African American. The average age of the Caucasian sample was 47.0 (20.2) years, and average educational level was 14.0 (2.9) years; approximately 57.3% of the sample were male. The average age of the African-American sample was 38.7 (12.2) years, and average educational level was 13.5 (2.5) years; approximately 49.7% of the sample were male. The authors report that the data were gathered from various individual and multicenter collaborative research projects over a 25-year period. Participants were from various U.S. states and Canada, including California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and the province of Manitoba, Canada. All participants reportedly completed structured interviews, and those with a history of learning disabilities, neurological illness, "significant" head injury, "serious" psychiatric illness (e.g., schizophrenia), or substance abuse were excluded from the normative data set. The manual provides regression-based raw to T score and percentile conversion for the DVT (and other neuropsychological tests) based on gender, 11 age groups (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-79, and 80-85 years), six education groups (7-8, 9-11, 12, 13-15, 16-17, and 18-20 years), and two ethnic groups. The average DVT raw score reported for the entire sample of 860 participants for time taken to complete the task is 390.87 (57.59). The average time taken to complete the DVT for the Caucasian sample is 394.59 (88.92), and that for the African-American sample is 380.41 (86.63). Other data from the manual are not reproduced here. Interested readers are referred to the original publication. Standard administration procedures were used in both manuals.
Study strengths 1. The sample composition is well described in terms of age, gender, ethnicity, and education. 2. Adequate exclusion criteria. 3. Means and SDs are reported for Caucasian and African-American participants separately and for the entire sample. Additionally, T scores and percentiles corrected for age and education are reported for different demographic groups.
Considerations regarding use of the study 1. Specific sample sizes used per cell are not reported. 2. Recruitment procedures are not well described.
Other comments 1. The interested reader is referred to the Fastenau and Adams (1996) critique of the Heaton et al. (1991) norms, and Heaton et al.'s (1996a) response to this critique.
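The regression-based, demographically corrected T-score conversions in the Heaton manuals are not reproduced in this chapter. Purely to illustrate what a T score expresses, the sketch below applies a simple linear conversion (mean 50, SD 10) to the unadjusted 1991 sample mean and SD for DVT completion time (388.5 and 86.5 seconds). The function name, the sign reversal for timed scores, and the example raw score are assumptions made for this illustration; this is not the procedure used in the manuals.

```python
def linear_t_score(raw: float, mean: float, sd: float,
                   higher_is_better: bool = True) -> float:
    """Convert a raw score to a linear T score (mean 50, SD 10).

    Assumption: no demographic correction is applied. For timed measures,
    where a larger raw score means poorer performance, the z score is
    reversed so that higher T scores still indicate better performance.
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (raw - mean) / sd
    if not higher_is_better:
        z = -z
    return 50.0 + 10.0 * z


# Hypothetical examinee who needs 475 seconds to finish the DVT,
# compared with the unadjusted 1991 sample values cited in the text.
t = linear_t_score(475.0, mean=388.5, sd=86.5, higher_is_better=False)
print(t)
```

A completion time one SD slower than the sample mean maps to a T score of 40 under this simplified scheme; the manual's corrected scores for the same raw time would differ by age, education, gender, and (in the 2004 edition) ethnicity.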
Normative Studies and Control Groups in Clinical Comparison Studies for the DVT
[DVT.2] Prigatano, Parsons, Levin, Wright, and Hawryluk, 1983 (Table A9.5)
The authors examined the neuropsychological test performance of mildly hypoxemic patients with COPD. Twenty-five healthy controls were matched to the COPD patients based on age, education, handedness, and gender. Control participants were an average of 59.6 (9.0) years of age and obtained an average of 10.5 (3.3) years of education. Participants were excluded if they had an "illness that might interfere with their neuropsychological testing (e.g., physical handicap, emotional problems, alcoholism or psychosis)," had COPD, were taking medications for heart or lung disease, or had diabetes. Fifteen of the participants were selected from Winnipeg, Manitoba, Canada, and 10 were selected from
TESTS OF ATTENTION AND CONCENTRATION
Oklahoma City, Oklahoma. Standard administration procedures were used.
Study strengths
1. The sample composition is well described in terms of age, education, geographic location, and recruitment procedures. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small sample size. 2. Wide age range for the sample. Data are not presented by age group. 3. The data for over half of the sample were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States. 4. Low educational level.
[DVT.3] Grant, Prigatano, Heaton, McSweeny, Wright, and Adams, 1987 (Table A9.6)
The authors examined neuropsychological functioning in COPD patients with mild, moderate, and severe hypoxemia. They selected 99 "nonpatient" participants (75 male, 24 female) who did not have COPD, a history of "significant" head injury, a history of substance abuse, heart disease that required treatment, or neurological or metabolic illnesses. Participants were an average of 63.1 years of age and had obtained an average of 10.2 (3.6) years of education. The authors do not specify testing procedures but do mention the larger battery from which the DVT is drawn (i.e., the Rennick-Lafayette Repeatable Battery).
Study strengths 1. Relatively large sample size. 2. The sample composition is well described in terms of age, education, and gender. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Test administration procedures are not specifically described. 2. Data are not partitioned by age. 3. Low educational level.
[DVT.4] Kelland and Lewis, 1994 (Table A9.7)
This study was designed to assess the test-retest reliability and validity of the DVT, as well as to measure the single-dose effects of diazepam in groups of college students. The authors selected 20 college students (10 male, 10 female) from a "large urban university" to serve as controls (who were administered a placebo rather than diazepam). Participants ranged in age from 18 to 30, with an average age of 20.0 (2.8) and an average educational level of 13.1 (1.3) years. Participants were excluded from the study if they reported taking medications; had a history of substance abuse; had a medical history that required central nervous system depressant medication use; had a history of neurological, cardiac, renal, or hepatic disease; or drank more than two cups of coffee a day. The DVT, along with other neuropsychological tests, was administered two times to each participant, with each session separated by 1 week. Standard administration procedures were used. Data are reported for both the standard (crossing out 9s) and the alternate (crossing out 6s) administrations. These data were later reanalyzed by Kelland and Lewis (1996), who found a practice effect from week 1 to week 2 of test administration but no differences between week 2 and week 3. The Kelland and Lewis (1996) data for weeks 1 and 2 are the same as those reported in this study and, thus, will not be reproduced in this chapter.
Study strengths 1. The sample composition is well described in terms of age, gender, education, and recruitment procedures. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported. 4. Test-retest data are reported.
Consideration regarding use of the study 1. Small sample size.
CANCELLATION TESTS
[DVT.5] Bamcord and Wanlass, 1999 (Table A9.8)
The authors compared the performance of college students on six neuropsychological tests administered in the standard, paper-and-pencil format vs. a more ecological format of using plastic sheet protectors so as to not create paper waste. For the purposes of this chapter, the participants in the standard testing format were considered the "normal" controls. Ten college students (five male, five female) were recruited. Participants were an average of 19.8 (3.95) years of age, with an average of 12.8 (0.63) years of education.
Study strengths 1. The sample composition is well described in terms of age, education, and gender. 2. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small. 2. No exclusion criteria are provided. 3. Test administration procedures are not specified.
[DVT.6] Smith, Giordani, Lajiness-O'Neill, and Zubieta, 2001 (Table A9.9)
The neuropsychological effects of HRT were examined in 29 healthy postmenopausal women. Participants were recruited through advertisements and selected if they were 60 years or older, had received HRT without interruption after menopause, or had never been treated with HRT. Exclusion criteria included participants who had stopped and restarted HRT for more than 1 month at a time; had a significant general medical, neurological, or psychiatric illness; had a history of head trauma leading to loss of consciousness; had substance dependence; or were taking medications affecting the central nervous system. Standard administration procedures were used. Participants taking HRT were an average of 65.0 (4.0) years of age, with an average of 15.0 (2.0) years of education; and those not on HRT were an average of 67.0 (6.0) years of age, with an average of 16.0 (3.0) years of education. Average WAIS-R standard scores for the Vocabulary, Block Design, and Wide Range Achievement Test (WRAT) Reading for the HRT group were 14.2 (3.3), 12.7 (2.4), and 108.6 (5.5), respectively; values for the non-HRT group were 13.9 (3.7), 11.8 (3.5), and 108.8 (12.7), respectively. The women on HRT made significantly fewer errors on the DVT than those who were not on HRT.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and recruitment procedures, with limited IQ data available. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported. 4. Data are reported for postmenopausal women on HRT and those not on HRT.
Considerations regarding use of this study 1. The sample is small. 2. Educational level is relatively high. 3. An all-female sample is used.
[DVT.7] Stein, Kennedy, and Twamley, 2002 (Table A9.10)
The authors compared neuropsychological test performance of female victims of partner violence with PTSD to victims without PTSD and nonvictimized controls. Twenty-two female control participants were recruited through posted advertisements and personal contacts in the San Diego, California, community. They were an average of 29.4 (10.7) years of age, had an average of 13.9 (1.5) years of education, and had an average raw WAIS-III Vocabulary subtest score of 45.9 (7.4). All participants were fluent English speakers and had at least an 8th-grade reading ability. Further exclusion criteria were presence of PTSD (DSM-IV criteria), use of psychotropic medication within the last 6 weeks of the study, use of oral or intramuscular steroids within the last 4 months of the study, learning disability, history of attention-deficit disorder, history of substance abuse, seizure disorder, a history of schizophrenia or other psychotic disorders, or neurological illness. Standard administration procedures were used.
Study strengths 1. The sample composition is well described in terms of age, education, geographic area, and recruitment procedures, with limited Verbal IQ data (i.e., Vocabulary raw scores were available). 2. Rigorous exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of this study 1. The sample is small. 2. An all-female sample is used.
CONCLUSIONS
Clinicians and researchers use cancellation tests to assess various aspects of attention, including vigilance and sustained and selective attention. There are numerous such tests from which to choose, and most involve paper-and-pencil administration. Such tests also require aspects of psychomotor responding, as well as visual tracking ability. Two tests were selected for discussion in this chapter, the Ruff 2&7 Selective Attention Test and the DVT. A review of the literature indicates that there are no gender differences on either of these tests but that performance clearly declines with age. Performance on such tests appears to improve with higher levels of education. Additionally, there appear to be some critical gaps in the existing normative data for the cancellation tests reviewed in this chapter. For example, for the Ruff 2&7, when the data are partitioned by age, sample sizes are very small (fewer than 20), particularly for individuals older than 40 years. For the DVT, most participants over 50 years of age tend to have lower educational levels (<12 years). Unfortunately, very few large-scale normative studies have been conducted on either of these cancellation tests. Aside from age and education, future studies should examine factors such as ethnicity and intellectual functioning when developing normative data.
Meta-analyses were not performed due to a lack of sufficient data for the two tests discussed in this chapter.
III LANGUAGE
10 Boston Naming Test
BRIEF HISTORY OF THE TEST
The Boston Naming Test (BNT) is a test of confrontation naming consisting of simple line-drawn pictures. Its experimental version includes 85 drawings (Kaplan et al., 1978). The modified version of the BNT, published in 1983, is limited to 60 of the original 85 drawings, arranged in order of ascending difficulty (Kaplan et al., 1983). Participants are allowed 20 seconds to name each item. Stimulus cues are offered to correct for misperception errors. They are followed by phonemic cues, which provide the first phonemes of the word, facilitating lexical retrieval. The total score on the test is the number of correct responses produced spontaneously (SR) and with the aid of stimulus cues (SC). The basal rule is eight consecutive pictures correctly named without any assistance, and the discontinuation rule is six consecutive failures. (For detailed administration and scoring instructions, see Lezak et al., 2004; Spreen & Strauss, 1998; and instructions in the test stimulus booklet.) The authors provide normative data on the 60-item version for children 5.5-10.5 years of age, broken down into six age groups based on five participants in each group; for normal adults aged 18-59 years, broken down into two educational groups and five age groups based on a total of 84 participants; and for 82 aphasic patients partitioned by aphasia severity level. In 2000, Kaplan et al. published the second edition of the BNT, which includes a 15-item short form, four multiple-choice options for each of the same 60 items that were used in the previous edition of the test, and error codes to categorize incorrect responses. The ceiling was changed to eight items. The normative data included in the record booklet are partitioned by 15 age groups for children, spanning in age between 5-0 and 12-5 years, and five age groups for adults between 18 and 79 years. In addition to being used as a stand-alone test, the BNT, second edition, is included in the Boston Diagnostic Aphasia Examination (BDAE) published by Psychological Assessment Resources (Goodglass et al., 2001). All studies listed in this chapter are based on the original 60-item version of the BNT since no studies based on the second edition of the test were published by the time this book went into production. Thompson and Heaton (1989) and Heaton et al. (1991) reported high correlation between the 85-item and the 60-item versions (r=0.96). However, the mean percent of correct responses was somewhat lower for the original version (85.1% vs. 87.8%) in their sample of clinical referrals for neuropsychological evaluation.
Studies Using BNT Error Quality Analyses
Several authors have studied the errors made on the BNT by different clinical vs. normal groups (Albert et al., 1988; LaBarge et al., 1992; Nicholas et al., 1985; Reiter, 2000; Smith et al., 1989; Tombaugh & Hubley, 1997). Approaches to the classification of naming errors are typically based on the presumed underlying mechanisms: perceptual (analysis of the visual features of the picture), semantic (access of the underlying conceptual representation), and lexical (retrieval of the appropriate name for the stimulus) (Snodgrass, 1984). Following Borod et al.'s (1980) study, which indicated an impaired lexical retrieval mechanism underlying naming difficulties in the normal elderly, Nicholas et al. (1985) explored the integrity of lexical retrieval in normal aging through qualitative analyses of naming errors in the 85-item version of the BNT. The authors identified several error types, which are outlined in Table 10.1. Using this system in the analysis of BNT errors for a group of 162 healthy participants aged 30-79 years, the authors concluded that confrontation naming requires several stages of information processing: (1) perception of the object, (2) semantic identification, (3) retrieval of the label that corresponds to that semantic "concept," (4) encoding the articulatory program, and (5) correct articulation of that label or name. The authors reported a decline in naming ability with age, especially after age 70. Based
Table 10.1. Types of Naming Error Identified by Nicholas et al. (1985)*

Coding Category                          Example from BNT
No response (comment)                    "I have one of those on my porch"
Augmented-correct                        "Propeller on an airplane"
Semantically related                     "Harness" for yoke
Phonologically related                   "Prong" for tongs
Perceptually related                     "Flower" for pinwheel
Whole-part, part-whole                   "Clock" for pendulum
Off-target utterance (circumlocution)    "Artistic thing for flower" for trellis

*For a description of each type of error, see Nicholas et al. (1985).
on the higher frequency of circumlocutory descriptions and semantically related responses in the elderly, the authors cited difficulty with perception and semantic identification as unlikely contributors to the naming difficulty. They concluded that the major age-related difficulty lies in the label (lexical) retrieval stage. Using the same taxonomy of naming mechanisms, LaBarge et al. (1992) identified 17 types of error, which were classified into three categories: no content, linguistically related, and perceptually related (Table 10.2). The authors hypothesized that linguistic errors reflect a loss in lexical (and potentially semantic) information; no-content errors are representative of a loss in semantic content; and perceptual errors are indicative of a breakdown in the perceptual mechanism. Based on the analysis of errors produced by 49 elderly with very mild or mild senile dementia of Alzheimer's type (SDAT), the authors identified loss of lexical information as well as some disruption in specific semantic attributes as processes underlying confrontational naming difficulty in early SDAT. With progression of the disease, increasing involvement of core semantic structures is implicated. Hodges et al. (1991) developed a different error classification system, which, in reference to the item "beaver," can be illustrated as follows: category names ("animal"), within-category semantic errors ("skunk"), semantic associates ("dam"), and semantic circumlocutions ("an animal that builds dams"). To further refine the process of error classification along a semantic dimension, Nicholas et al.
(1996) proposed a system of rating errors on a 5-point scale of semantic relatedness to the target name, with 1 being not at all similar in meaning (for single-word responses) and poor, incomplete definition or description (for multiword descriptions) and 5 being very similar in meaning (for single-word responses) and good, complete definition or description (for multiword responses). Tombaugh and Hubley (1997) examined the frequency of errors in responses to individual items and distribution of errors across seven categories, such as no response, circumlocution, semantic, phonemic, visual, perseveration, and miscellaneous, in a sample of
Table 10.2. Types of Error Described by LaBarge et al. (1992)

No Content
  Types of error: Empty phrase; No interpretation possible
  Examples: I don't know; Can't think of it; No response or jargon

Linguistically Related
  Phonological
    Types of error: Phonologically related
    Examples: Pelican = pentagon; Unicorn = hornicorn; Rhinoceros = nostros; Sphinx = phoenix
  Semantic
    Types of error: Same category; Super- or subordinate; Function; Attribute; Context; Description
    Examples: Latch = hasp; Camel = animal; Asparagus = vegetable; Funnel = used for pouring; Compass = makes circles; Beaver = eats wood or builds dams; Volcano = fire; Stethoscope = doctors use it; Sphinx = found in Egypt; Noose = a rope with a slip knot
  Acoustic
    Types of error: Meaningful sound
    Examples: Whistle = make a whistling sound or blow noiselessly; Volcano = make a whooshing sound
  Pantomime or Gesture
    Types of error: Gesture
    Examples: Comb = gesture to head like combing; Accordion = swing arms and hands like playing

Perceptually Related (Visually)
  Types of error: Whole; Part; Perspective; Function; Attribute; Context
  Examples: Whistle = trailer hitch or pacemaker; Knocker = chandelier; Igloo = turtle or spider's web; Dart = feather; Rhinoceros = two big horns; Harmonica = windows or apartment building or file drawers; Dart = nurse to give shot; Broom = wash my clothes; Beaver = fella who goes underground; Mask = a bad picture; Wreath = see 'em at a wedding; Whistle = hanging on a tree limb

Copyright © 1992 by the Educational Publishing Foundation. Adapted with permission.
219 cognitively intact adults 25-88 years of age. They also addressed errors resulting from the ambiguity of items, cultural or regional terms, commonly confused words, and synonyms and suggested specific probes to be used to clarify ambiguous responses. Kirk (1992a) compared quality of errors made by 212 boys 5-13 years of age to those made by adult aphasic patients. They found that semantic and circumlocution errors accounted for 82% of the errors made by boys, whereas adult aphasics' performance was characterized by phonemic errors in addition to the two error types demonstrated by the boys. The authors proposed 'A Revised Children's BNT' based on item analyses of the boys' responses and provided normative data for the BNT collected on a sample of 382 schoolchildren.
Current Views on the Mechanisms Underlying Confrontation Naming Deficits
Naming ability in normal aging has received close attention in the literature. According to LaBarge et al. (1986), uncomplicated aging is not associated with naming deficit. However, a large body of evidence points to an age-related decline in naming ability. Several investigators suggest that deficient access to the lexical network is the leading mechanism of naming difficulty (Bowles & Poon, 1985; Nicholas et al., 1985). However, other investigators argue that age-related naming difficulties cannot be fully attributed to disruption in lexical access. Barresi et al. (2000) suggest that impaired lexical access is the leading mechanism of naming failures in individuals under the age of 70. However, in those older than 70, naming difficulties are in part related to semantic degradation. Similarly, Au et al. (1995) viewed perceptual and semantic processing deficits as partially responsible for naming difficulties in the elderly. Moberg et al. (2000) showed that lexical access was not affected by the aging process in their study; however, they pointed to methodological limitations as a possible cause of this finding. Ferraro et al.'s (1998) results indicate that decline in speed of processing might contribute to age-related decline in BNT performance. Understanding of faulty processes in Alzheimer's disease (AD) also remains controversial. Whereas the majority of the more recent investigations rule out disruption in the perceptual stage as a primary cause of this breakdown (Bayles & Tomoeda, 1983; Frank et al., 1996; Huff et al., 1986b; LaBarge et al., 1992; Martin & Fedio, 1983; Smith et al., 1989), the relative contributions of lexical vs. semantic dysfunction are highly debated in the literature.
Regarding the semantic deficit hypothesis, disruption in the content and organization of semantic information is implicated as the primary source of naming difficulties in AD (Bayles & Tomoeda, 1983; Flicker et al., 1987;
Frank et al., 1996; Henderson et al., 1990; Hodges et al., 1991; Huff et al., 1986b; Margolin et al., 1990; Martin & Fedio, 1983). A number of studies suggest that this contribution increases as a function of dementia severity (Hodges et al., 1991; Huff et al., 1986b; LaBarge et al., 1992; Shuttleworth & Huber, 1988). As a result, in the early stages of AD, a naming deficit might manifest itself through lexical access difficulties (LaBarge et al., 1992; Neils et al., 1988), which resembles the pattern characteristic of normal aging. Data challenging the common view of semantic disruption as the cause of naming difficulties in AD were presented by Nebes and colleagues (Nebes et al., 1984; Nebes, 1989; Nebes & Brady, 1990), who viewed lexical retrieval as the source of naming difficulties. Similarly, Nicholas et al. (1996) pointed to the breakdown in lexical access in AD and referred to the previous findings of semantic breakdown as an artifact of the methodologies of the previous studies. This view is further supported by results using the General Processing Tree approach, a type of multinomial model that estimates the probabilities of the cognitive processes presumed to underlie performance on a cognitive task, as measured by categorical data (Reiter, 2000). The author found that lexical access, perceptual analysis, and phonological realization abilities decline with increase in dementia severity, with the most consistent decline being in lexical access.
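To make the idea of a multinomial processing tree concrete, the following is a toy sketch and not the model used by Reiter (2000). It assumes a strictly serial three-stage tree (perceptual analysis, lexical access, phonological realization) in which the first failing stage yields a distinct observable error category. The function names, parameter names, and example counts are hypothetical.

```python
def category_probabilities(p_perceptual, p_lexical, p_phonological):
    """Map stage success probabilities to response-category probabilities.

    Toy serial tree (an assumption for illustration): a response is
    correct only if every stage succeeds; the first failing stage
    determines the error category.
    """
    for p in (p_perceptual, p_lexical, p_phonological):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return {
        "perceptual_error": 1.0 - p_perceptual,
        "lexical_error": p_perceptual * (1.0 - p_lexical),
        "phonological_error": p_perceptual * p_lexical * (1.0 - p_phonological),
        "correct": p_perceptual * p_lexical * p_phonological,
    }


def estimate_parameters(counts):
    """Closed-form maximum-likelihood estimates for the toy tree.

    Each parameter is the proportion of responses that passed a stage
    among those that reached it.
    """
    n = sum(counts.values())
    reached_lexical = n - counts["perceptual_error"]
    reached_phonological = reached_lexical - counts["lexical_error"]
    if n == 0 or reached_lexical == 0 or reached_phonological == 0:
        raise ValueError("each stage must be reached by at least one response")
    return {
        "p_perceptual": reached_lexical / n,
        "p_lexical": reached_phonological / reached_lexical,
        "p_phonological": counts["correct"] / reached_phonological,
    }


# Hypothetical response counts for one examinee on a 60-item test.
example_counts = {
    "perceptual_error": 2,
    "lexical_error": 10,
    "phonological_error": 3,
    "correct": 45,
}
example_estimates = estimate_parameters(example_counts)
```

In this toy tree the number of free parameters equals the number of independent category proportions, so the estimates reproduce the observed proportions exactly. Published models of this kind typically have more branches and are fitted and tested against the data rather than solved in closed form.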
Studies supporting the lexical deficit hypothesis suggest a breakdown in the retrieval stage, which is based on the following findings: (1) the incidence of errors in low-frequency words is higher than that in high-frequency words, a mechanism that is modulated by lexical processing (Kirshner et al., 1984; Skelton-Robinson & Jones, 1984); (2) facilitation of lexical access by phonemic cues (Martin & Fedio, 1983); (3) semantic relatedness of the words produced by the participant to the target word (Bayles & Tomoeda, 1983; Smith et al., 1989). In spite of the controversy regarding the mechanisms accounting for naming difficulties in AD vs. normal aging, the majority of studies have demonstrated utility of the BNT in distinguishing between AD and age-related decline in naming ability (Beatty et al., 2002; Huff et al., 1986b; Margolin et al., 1990; Storandt & Hill, 1989). Goldman et al. (1998) reported decline in naming ability as a function of severity of Parkinson's disease. Several studies have explored the mechanisms of naming deficits in different types of aphasia. According to Nicholas et al. (1985), aphasic participants (across all major aphasic groups, except for anomics) have difficulty in the phonological encoding of words. Kohn and Goodglass (1985) support this finding by demonstrating considerable similarity in error types across different aphasic groups. In addition, they provide a more specific analysis of anomic errors associated with different types of aphasia: "Negated responses were associated with Broca's aphasia, whole-part errors (hose for nozzle) were associated with frontal anomia, and poor phonemic cuing was associated with Wernicke's aphasia" (p. 266). The authors also reported that anomic aphasics produced the highest frequency of multiword circumlocutions and the lowest number of phonemic errors, which they relate to minimal word production difficulty in anomic aphasia relative to other aphasia syndromes. On the other hand, Lewis and Soares (2000) showed that prelanguage conceptual organization deficit might underlie naming difficulties in aphasic patients who present with semantic paraphasias as a leading feature of their language disturbance. Investigation of the neuroanatomical substrates of naming may shed light on its cognitive mechanisms. The hippocampus of the dominant hemisphere has been widely implicated in naming function (Davies et al., 1998; Martin et al., 1999; Sawrie et al., 2000; Seidenberg et al., 1998). Other aspects of the dominant temporal lobe are also involved in naming function, according to Ojemann et al. (1993) and Wiggs et al. (1999). Bell et al. (2000) showed that a decline in naming ability is a frequent sequela of left anterior temporal lobectomy.
These findings consistently point to the involvement of the dominant temporal area in naming function. In addition, areas of the superior parietal and frontal cortices are implicated by Wiggs et al. (1999). In addition to the obvious use of the BNT in assessing word retrieval, Kaplan (1988)
observed that analysis of misperception errors allows identification of perceptual fragmentation and inattention to a part of the visual field, which are associated with nondominant hemisphere dysfunction.
Modifications and Short Versions of the BNT
An attempt to create two shorter equivalent forms of the BNT for repeated testing was undertaken by Huff et al. (1986a). Based on the experimental 85-item version, these authors developed two 42-item versions that proved to be reliable (r=0.71-0.81 for controls and r=0.97 for AD patients) and equivalent in difficulty. Both versions were standardized on normal and brain-damaged participants. The different forms of the test were compared by Thompson and Heaton (1989). They administered an 85-item version of the BNT to a clinical group of participants; data were then rescored according to the criteria for the 60- and 42-item forms. Although certain differences between forms were found, there were high correlations among different versions of the test (ranging 0.82-0.96) and between BNT scores and other language measures. Heaton et al. (1991, 2004) published normative data for the 85-item version. The revised set of norms (2004) is based on a sample of over 1,000 normal adults, stratified by age, education, gender, and race/ethnicity (African American and Caucasian). Farmer (1990) proposed modifications of the administration, response coding, and scoring procedures for the full version of the BNT, which were used by the author to assess non-brain-damaged adults. Eight short versions of the test were compared by Mack et al. (1992) and Williams et al. (1989) in patients suffering from AD and neurologically intact elderly. The short forms were four 15-item versions developed by these authors, one 15-item version used by the Consortium to Establish a Registry for Alzheimer's Disease (CERAD), and three 30-item versions. Scores on each version could be extrapolated to a complete 60-item BNT score. Franzen et al. (1995) compared different short forms on a sample of 320 individuals
with various neuropsychiatric diagnoses. The authors report adequate internal consistency for all forms and reasonable correlations between forms. Based on their analysis of item difficulty, the authors identified the CERAD version as least desirable. Similarly, Larrain and Cimino (1998) reported poor criterion validity of the CERAD 15-item version as its agreement with the full version of the BNT was only 70% in classifying patients with probable AD as impaired. Fastenau et al. (1998) administered four 15-item versions (Mack et al., 1992) and two 30-item versions in counterbalanced order to 108 normal adults between 57 and 85 years of age. Fifteen-item versions 1 and 2 were combined by Fastenau et al. in the first 30-item version, and 15-item versions 3 and 4 were combined in the second 30-item version. Alternate-form reliability coefficients for the six versions ranged 0.53-0.76, and Cronbach's α coefficients ranged 0.37-0.75. Validity coefficients of 0.93 and above support the hypothesis that the short forms sample the same domain as the long form. The authors pointed out that short forms 3 and 4 have slightly better psychometric properties than forms 1 and 2. Tombaugh and Hubley (1997) derived eight short versions from the performance of 219 healthy volunteers on the full version. The authors concluded that the 30-item versions are preferable to the 15-item forms and correlate highly with the full test. Ferraro and Barth (2003) compared four 15-item versions and the CERAD version with the full 60-item test and found the short versions to be as reliable as the full version in regard to lexicality issues. Saxton et al. (2000) developed two empirically derived (based on item difficulty) equivalent 30-item short forms, which were strongly related to each other and to the total BNT score. Lansing et al.
(1999) compared scores of 719 normal elderly and 325 AD patients on eight previously reported short forms derived from participants' performance on the 60-item version. The short forms differed in their ability to discriminate between patients and controls. The authors derived a new, 15-item, gender-neutral short form with discriminability comparable to the full 60-item version, using stepwise discriminant analysis. Two 30-item versions, composed of the odd and even items from the 60-item version, were administered by Fisher et al. (1999) to 30 normal elderly and 32 patients with probable AD. The forms were found to be equivalent and discriminated well between the control and patient groups. The combined mean was consistent with that derived by retrospective extraction in the original odd/even test construction study. One of the 15-item versions (version 2) developed by Mack et al. (1992) was used by Calero et al. (2002) to assess normal and demented elderly with low educational level. The authors found a high degree of equivalence between the full and the short versions. The second edition of the BNT (Kaplan et al., 2000) includes one of the 15-item short versions (version 4) developed by Mack et al. (1992).
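Internal-consistency coefficients of the kind cited above for the short forms (e.g., the Cronbach's alpha values reported by Fastenau et al., 1998) can be computed from item-level scores. The sketch below is a generic illustration of the standard formula; the function name `cronbach_alpha` is hypothetical, and no data from the reviewed studies are used.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-participant item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    where k is the number of items; sample variances are used throughout.
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two participants")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every participant needs a score for every item")
    # Variance of each item across participants (columns of the matrix).
    item_variances = [variance(column) for column in zip(*item_scores)]
    # Variance of the participants' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)
```

For pass/fail naming items, each row would hold 1 for a correct response and 0 for an error. Alpha tends to be low when nearly all participants pass nearly all items, which is one reason restricted score variability in high-functioning normative samples can depress such coefficients.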
Cultural Adaptations and Culture-Specific Normative Data for the BNT
The literature supports effects of demographics, including language and culture, on the BNT. Cruice et al. (2000) and LeDorze and Durocher (1992) identified the cultural relevance of items, such as word length, frequency, and familiarity, as important in selecting items for a naming task. Several adaptations of the BNT for different cultural and ethnic groups, which use culturally appropriate subsets of items or modified sets of items, are available. Ponton et al. (1996, 2000) described the Ponton-Satz BNT, which is an adaptation of the standard version for assessment of Hispanic patients. It consists of 30 items derived from the original test, which are presented in different order from the original. Some items have several possible correct responses listed on the answer sheet, depending on the country of origin of the examinee. Normative data for a sample of 300 Hispanic participants stratified by gender, age, and education are provided (Ponton et al., 1996). Allegri et al. (1997) reported normative data for the Spanish BNT published in Madrid (see Appendix 1), which were collected on
200 residents of Buenos Aires across the adult age span. The standard administration protocol was used. The authors proposed an alternative order of items, based on analysis of the frequency of correct responses in their sample. The Spanish version is an adaptation of the original test, with some items substituted with culture-fair items. Several studies have also addressed the impact of culture on the English version of the BNT. A modified Australian version of the BNT was used by Cruice et al. (2000) and Worrall et al. (1995). Items "beaver" and "pretzel" from the standard version were replaced with "platypus" and "pizza." Barker-Collo (2001) addressed cultural bias in the performance of 58 New Zealand university students on the standard BNT. Item analysis indicated that overall performance was hindered by unfamiliarity with several items on the test. The authors suggested strategies for adaptation of the test to New Zealand culture. Stewart et al. (2001) administered the CERAD short version of the BNT to 285 African Caribbean participants 55-75 years of age, who were residents of south London. Normative data are stratified by two age groups, gender, three education groups, and three occupational classes. Performance on a shortened version of the BNT has been reported for Native Americans (Ferraro & Bercier, 1996; Ferraro et al., 2002). Fillenbaum et al. (1997, 2002) report performance of elderly white and African-American community residents on the CERAD version of the test. Marien et al. (1998) reported normative data for the standard BNT on a sample of 200 native Dutch-speaking Flemish elderly and recommended cutoff scores according to age, education, and gender. Error analysis was performed. The authors compared American English-, Australian English-, and Dutch-speaking elderly performance based on a literature review and inferred that linguistics do not affect the overall score but do impact the error distribution in different languages.
A French version of the BNT was introduced by Colombo and Assal (1992). This report provides data for 420 normal French-speaking Swiss adults 20-89 years old.
A Korean version, K-BNT, was developed following the procedures for development of the American version of the test. Based on item difficulty and concreteness of concepts, 60 items were selected from a pool of 175 items (Kim & Na, 1999). Normative data for a sample of 600 normal participants, stratified by eight age groups (15-75+ years) and five educational levels (0-13+ years), are reported.
Psychometric Properties of the Test
Review of the literature suggests that validation studies for the BNT have focused on its diagnostic and predictive properties in discriminating between normal and clinical groups (Cahn et al., 1995; Jacobs et al., 1995; Knesevich et al., 1986). Concerns regarding asymmetry (negative skew) and "peakedness" (extreme kurtosis) in the distribution of BNT scores and resulting limited score variability and small SDs in normative samples restricted to high-functioning participants have been widely discussed in the literature (Fastenau, 1998; Hamby et al., 1997; Hawkins & Bender, 2002; Killgore & Adams, 1999). In essence, the test does well what it is designed to do. It is designed to measure deficit in naming ability or severity of aphasia, rather than level of skill or proficiency within the range of normality. It is sensitive to the severity of the naming deficit, with the nonanomic end of the distribution falling within the "normal" category. As such, the test is not expected to discriminate between average, above average, and superior naming ability. Therefore, intact performance on this test would be most accurately described as "within normal limits." Use of z scores and percentiles in describing impaired performance should be done with full understanding that the distribution of test scores is not normal. High test-retest reliability over a 1-2 week interval in a group of elderly participants was documented by Flanagan and Jackson (1997). Dikmen et al. (1999) reported test-retest reliability of 0.92 over an 11-month interval in a mixed sample. Mitrushina and Satz (1995) reported test-retest reliability ranging 0.62-0.89 over three annual probes in a sample of neurologically intact elderly. Data on repeated
administration are also presented by McCaffrey et al. (2000). High correlation of BNT performance with verbal fluency was reported by Locascio et al. (1995), with r=0.5 for AD patients and r = 0.52 for normal control participants. In contrast, a comparison of BNT performance with measures of different aspects of memory suggested that BNT scores are unrelated to learning and memory scores (Albert et al., 1988). Two reports showed that a lack of agreement between clinicians in the administration and scoring of the BNT affects the usefulness of published norms. Ferman et al. (1998) pointed out that the total score achieved on the test is affected by variability in the interpretation of the discontinuation rule of six consecutive failures. According to the lenient criteria, correct responses aided by phonemic cues are not counted toward the discontinuation rule. The authors compared the rigorous and lenient interpretations of the discontinuation rule and found that the final scores varied in 3% of the sample of 655 normal elderly and in 31% of 140 patients with AD, depending on the interpretation. Lopez et al. (2003) examined three scoring methods that might be viewed as correct interpretations of the test instructions and found discrepancies in the resulting scores and impairment levels. It is unknown if comparable rates of misinterpretation would be found upon replication of this inquiry in different settings with different groups of clinicians as follow-up studies are not available to date. For further information on the psychometric properties of the BNT, see Franzen (2000), Lezak et al. (2004), and Spreen and Strauss (1998).
RELATIONSHIP BETWEEN BNT PERFORMANCE AND DEMOGRAPHIC FACTORS There is abundant evidence of the effect of age on BNT performance, particularly declining scores and increasing performance variability with advancing age (Au et al., 1995; Farmer, 1990; Fastenau et al., 1998; Feyereisen, 1997;
Goulet et al., 1994; Kimbarrow et al., 1996; LaBarge et al., 1986; Lansing et al., 1999; Mackay et al., 2002; Marien et al., 1998; Neils et al., 1995; Nicholas et al., 1989; Randolph et al., 1999; Saxton et al., 2000; Van Gorp et al., 1986; Worrall et al., 1995). Several investigators suggest that the most pronounced decline occurs only after age 70 (Albert et al., 1988; Mitrushina & Satz, 1995; Nicholas et al., 1985). Furthermore, Welch et al. (1996) suggest that individuals with >12 years of education retain intact naming ability into their 80s. On the other hand, Goodglass (1980) as well as LeDorze and Durocher (1992) suggest that a significant drop in naming ability may occur in the sixth decade of life. Schmitter-Edgecombe et al. (2000) point to the cohort effect or generational familiarity with individual items that may account for some of the age-related differences in performance identified in cross-sectional studies. Goulet et al. (1994), in their review of 25 studies, point to inconsistent reports on the effect of age on BNT performance, with some studies suggesting no age-related decline in naming. They attribute these differences to methodological issues and subject characteristics. Moberg et al. (2000) examined the effect of age on lexical properties of the word representations for the BNT stimuli, such as familiarity, number of letters, frequency of occurrence, and number of syllables, and found that the process of lexical access was similar in young and older adults. Coffey et al. (2001) did not find a relationship between BNT performance and age-related changes evident on the MRI in 320 nonclinical volunteers aged 66-90. This conclusion is consistent with their review of the relevant neuroimaging literature.
Education was found to be related to BNT scores in several studies (Allegri et al., 1997; Borod et al., 1980; Deloche et al., 1996; Hawkins et al., 1993; Hawkins & Bender, 2002; Heaton et al., 1999; Henderson et al., 1998; Kimbarrow et al., 1996; Lansing et al., 1999; Le Dorze & Durocher, 1992; Marien et al., 1998; Neils et al., 1995; Nicholas et al., 1985; Ponton et al., 1996; Randolph et al., 1999; Ross et al., 1995; Saxton et al., 2000; Thompson & Heaton, 1989; Welch et al., 1996; Worrall et al., 1995). Higher variability
in BNT performance was observed in groups with lower educational levels. In contrast, Cruice et al. (2000), Farmer (1990), Fastenau et al. (1998), Ivnik et al. (1996), and LaBarge et al. (1986) did not find any association between BNT performance and educational level, which might be related in part to restricted ranges of educational levels in some samples. A combined effect of age and education should be taken into consideration, according to Heaton et al. (1999), as older individuals with lower educational levels are more likely to be misidentified as dysnomic. Similarly, an interaction of age and education was reported by Borod et al. (1980), Farmer (1990), and Welch et al. (1996). Manly et al. (1999) showed that illiterates scored significantly lower than literate participants with up to 3 years of education in their sample of Spanish-speaking, non-demented elders. Albert et al. (1988), Killgore and Adams (1999), Thompson and Heaton (1989), and Tombaugh and Hubley (1997) found that verbal intelligence, as measured by WAIS-R Vocabulary score, in their samples of neurologically normal participants strongly affected BNT performance. Similarly, Hawkins et al. (1993) found reading vocabulary score to be strongly correlated with BNT performance in a sample of psychiatric and normal participants. The authors presented BNT performance expectation guidelines based on the Gates-MacGinite Reading Vocabulary Test for use as a complement to the published norms. Based on a review of the relevant literature, Hawkins and Bender (2002) emphasized the contribution of premorbid vocabulary to BNT performance and made recommendations on moderator variables to be considered in further research. Gender was shown in several studies to be unrelated to naming efficiency in normal samples (Cruice et al., 2000; Fastenau et al., 1998; Henderson et al., 1998; Ivnik et al., 1996; LaBarge et al., 1986). However, based on an analysis of BNT performance, Ripich et al. 
(1995) suggest that naming skills are poorer for women than for men with similar clinical dementia rating (CDR) scores and demographic characteristics in their sample of 60 early AD participants. Similarly, Lansing
et al. (1999), Marien et al. (1998), Randolph et al. (1999), Saxton et al. (2000), and Welch et al. (1996) reported males outperforming females in normal samples. Randolph et al. (1999) suggested that the gender effect is due to performance on specific items that are more familiar to men. Reports of the effect of ethnicity and culture on BNT performance yield contradictory findings. In the study by Henderson et al. (1998), comparing healthy African-American and Caucasian participants, ethnicity was unrelated to BNT performance. Similarly, Manly et al. (2002) did not find a notable difference in naming ability between African-American and Caucasian elders. In contrast, Lichtenberg et al. (1994) and Ross et al. (1995) report higher scores for Caucasian compared to African-American participants in a group of medical inpatients. Similar findings are reported by Kimbarrow et al. (1996) on a sample of geriatric rehabilitation patients and by Whitfield et al. (2000) on a sample of healthy elderly. These findings are discussed by Henderson et al. (1998) in the context of lower educational level of African-American participants. Kimbarrow et al. (1996) emphasize socioeconomic status, ethnicity, and cultural factors and Whitfield et al. (2000) point to the cultural appropriateness of the material as explanatory variables. Furthermore, Manly et al. (1998) found that participants who reported less acculturation obtained lower scores on the BNT in their sample of medically healthy African Americans. Similarly, Touradji et al. (2001) reported lower naming performance in foreign-born Caucasian elders compared to those born in the United States. Further moderating variables were acculturation level and language use. Qualitative differences in BNT performance as a function of ethnic background and geographical region were reported by Goldstein et al. (2000).
Participants in their study tended to use alternative responses to several BNT items that are specific to their region, with further differences between black and white participants in which alternative responses were used. Utility of the standard version of the BNT in assessment of monolingual Spanish-speaking
and Spanish/English bilingual individuals and/or item analysis in assessing the difficulty level and cultural appropriateness of each item were reported by Kohnert et al. (1998), Roberts et al. (2002), and Rosselli et al. (2000).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
The normative reports reviewed were limited to those employing the standard English 60-item version. To adequately evaluate the BNT normative reports, six key criterion variables were deemed critical. The first five of these relate to subject variables, and the remaining one refers to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
As discussed previously, information regarding medical and psychiatric exclusion criteria is important. In addition, emerging literature indicates that ethnicity, acculturation, and native language impact test performance; therefore, it is preferable that this information be reported for the normative samples. It is unclear if gender, geographic recruitment region, socioeconomic status, occupation, and recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since a strong effect of age on BNT performance has been demonstrated in the literature.
Reporting of Educational Levels
Given a strong association between education and BNT scores, information regarding educational level should be reported for each subgroup, and preferably normative data should be presented by educational levels.
Reporting of Intellectual Levels
Given the strong evidence of a relationship between BNT performance and IQ, especially verbal IQ, information regarding intellectual level/Vocabulary score should be reported for each subgroup, and preferably normative data should be presented by IQ levels.
Procedural Variables Data Reporting To facilitate interpretation of the data, group means, standard deviations, and ranges for the total score achieved on the BNT should be presented at a minimum. However, SDs should be used with caution in evaluating the relative standing of an individual score because BNT
scores are not normally distributed. More detailed reporting of statistics for the number of correct responses produced (1) spontaneously, (2) with stimulus cues, (3) with phonemic cues, as well as (4) the number of missed items, is recommended.
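The caution about SDs can be illustrated with a minimal sketch. The sample below is made up for illustration only (it is not taken from any study reviewed here), and the function names are hypothetical. It contrasts the percentile implied by a z score under an assumed normal distribution with the percentile rank actually observed in a negatively skewed sample.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical, made-up sample of BNT total scores (maximum 60).
# It is negatively skewed with a ceiling, as the text describes; not real data.
norm_sample = [60, 60, 59, 59, 59, 58, 58, 58, 57, 57,
               57, 56, 56, 55, 55, 54, 53, 51, 48, 42]


def z_score(score, sample):
    """Standardize a score against the sample mean and SD."""
    return (score - mean(sample)) / stdev(sample)


def normal_theory_percentile(score, sample):
    """Percentile implied by the z score IF scores were normally distributed."""
    return 100 * NormalDist().cdf(z_score(score, sample))


def empirical_percentile(score, sample):
    """Percentage of the sample scoring at or below the given score."""
    return 100 * sum(s <= score for s in sample) / len(sample)


for raw in (60, 48, 42):
    print(raw,
          round(z_score(raw, norm_sample), 2),
          round(normal_theory_percentile(raw, norm_sample), 1),
          round(empirical_percentile(raw, norm_sample), 1))
```

In a skewed sample such as this one, the two percentile estimates can disagree noticeably in the lower tail, which is the reason the text recommends reporting ranges in addition to means and SDs.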
SUMMARY OF THE STATUS OF THE NORMS According to a survey of the participants of the 1988 and 1989 Clinical Aphasiology Conference, the BNT was identified as one of the two tests most frequently used to supplement comprehensive aphasia batteries (Jackson & Tompkins, 1991). Similarly, the BNT is used in many studies to explore the efficiency of confrontational naming in normal and clinical samples across various demographic groups and diagnostic categories. These studies vary from several perspectives: (1) versions of the BNT utilized (experimental 85-item version,
standard 60-item version, as well as a variety of short versions; see above), with the second edition of the test being published more recently, which will undoubtedly attract the attention of clinicians and researchers; (2) administration procedures, particularly in respect to provisions for the stimulus cues; and (3) aspects of performance reported (total score and/or error analysis [error classification systems also differ between studies], percent of correct responses per item, and recommended cutoff criteria for impaired performance rating). Russell and Starkey (1993) developed the Halstead-Russell Neuropsychological Evaluation System (HRNES), which includes the BNT among 22 tests. In this system, individual performance is compared to that of 576 brain-damaged participants and 200 participants who were initially suspected of having brain damage but had negative neurological findings. Data were partitioned into seven age groups and three educational/IQ levels. The authors published an appendix to the manual (HRNES-R; Russell & Starkey, 2001), which contains tables of scale scores based on the original HRNES norms, demographic corrections, and regression-based predicted scores. These data will not be reviewed in this chapter because the "normal" group consisted of Veterans Administration patients who presented with symptoms requiring neuropsychological evaluation. (For further discussion of the HRNES system, see Lezak et al., 2004, pp. 676-677). Demographically adjusted BNT norms based on a sample of over 1,000 normal adults between 20 and 85 years of age, stratified by age, education, gender, and race/ethnicity (African American and Caucasian), are presented by Heaton et al. (2004). Among all the studies available in the literature, we selected for review those based on well-defined samples. Only studies using the 60-item version were reviewed.
In all articles reviewed below, the score represents the total number of correctly named drawings (spontaneously or with a stimulus cue) out of 60. In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in
Appendix 10. Table A10.1, the locator table, summarizes information provided in the studies described in this chapter. 1
SUMMARIES OF THE STUDIES
[BNT.1] Van Gorp, Satz, Kiersch, and Henry, 1986 (Table A10.2)
The article provides normative data on BNT for 78 normal, independently living elderly residing in southern California (29 males, 49 females) aged 59-95 with a mean FSIQ of 122 (range 87-150). Participants were screened for neurological disorders based on their self-report. The data are presented in five age groupings. The correlation of BNT scores with age was r = -0.33, with more variability demonstrated by older groups. The authors provide suggested cutoff criteria for impaired performance, which are based on a score falling over 2 SDs below the mean for the respective age group.
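The cutoff rule described above (a score falling more than 2 SDs below the mean of the respective age group) can be sketched as follows. The function names and the numeric values in the usage lines are hypothetical placeholders, not figures from Table A10.2.

```python
def impairment_cutoff(group_mean, group_sd, n_sd=2.0):
    """Score that lies n_sd standard deviations below the age-group mean."""
    return group_mean - n_sd * group_sd


def is_below_cutoff(score, group_mean, group_sd, n_sd=2.0):
    """True when the score falls more than n_sd SDs below the group mean."""
    return score < impairment_cutoff(group_mean, group_sd, n_sd)


# Hypothetical age-group statistics, for illustration only.
example_mean, example_sd = 55.0, 4.0
print(impairment_cutoff(example_mean, example_sd))
print(is_below_cutoff(46, example_mean, example_sd))
```

Because BNT scores are not normally distributed, a cutoff built this way should be read as a screening convention rather than as a fixed percentile.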
Study strengths 1. Demographic characteristics of the sample are well described in terms of age, education, IQ, gender, and geographic area. 2. Adequate exclusion criteria were used. 3. The data are partitioned into five age groups. 4. The authors provide suggested cutoff criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. SDs for the IQ indices describing the whole sample and the individual age groups are not provided. 2. Sample sizes for age groups are small. 3. Mean education and intelligence levels are high.
1 Norms for children for the BNT and BNT-Second Edition are available in Baron (2004), Kaplan et al. (2000), and Spreen and Strauss (1998).
[BNT.2] Farmer, 1990 (Table A10.3)
The author provides data on the BNT for 125 normal male participants aged 20-69 (M = 43.9, SD = 14.3) recruited in California. Education ranged 8-22 years (M = 14.6, SD = 2.2). All participants were native English speakers and had vision and hearing within normal limits. None of the participants reported a history of brain injury or disease. The BNT was administered according to standard instructions. Data are presented for five age decades, with 25 participants in each age group. Analysis of errors is discussed. According to the results, age was significantly correlated with BNT but educational level was not.
Study strengths 1. Demographic characteristics of the sample are described in terms of age, education, gender, geographic area, and fluency in English. 2. The data are partitioned into five age groups. 3. Minimally adequate exclusion criteria. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Mean educational level is high. 2. Individual cell sizes are relatively small. 3. The sample is all male. 4. No information regarding IQ level is provided.
[BNT.3] Boone, Lesser, Miller, Wohl, Berman, Lee, Palmer, and Back, 1995 (Table A10.4)
The authors compared 73 outpatient, depressed elderly and 110 controls on a battery of neuropsychological tests. All participants were fluent English speakers over age 45 residing in southern California, who were recruited through newspaper ads. Both clinical and control groups underwent physical and neurological examinations and psychiatric interviews. Strict exclusion criteria were used. Data for the control group are reproduced in Table A10.4.
Study strengths 1. The overall sample sizes are large. 2. A comparison of normal controls with depressed elderly is provided. 3. Information regarding VIQ, age, education, gender, geographic area, recruitment
procedures, and fluency in English is reported. 4. Means and SDs for the test scores are reported. 5. Good exclusion criteria.
Considerations regarding use of the study 1. Education and intelligence levels of the samples are high. 2. Undifferentiated age range. [BNT.4] Mitrushina and Satz, 1995 (Tables A10.5, A10.6)
The article provides BNT data based on a sample of neurologically intact, highly functioning, independently living participants residing in southern California, who were tested over three longitudinal annual probes. The sample of 156 participants who participated in the first probe included most of the sample of 78 participants described by Van Gorp et al. (1986) [BNT.1]. Due to attrition over a period of 3 years, only 122 participants participated in all three probes. Participants in this sample (49 males, 73 females) had Mini-Mental State Exam (MMSE) scores >24 and ranged in age from 57 to 85 years, with a mean age of 70.4 (5.0) years, at the first testing probe. Mean education was 14.1 (2.7) years, and mean FSIQ was 118.2 (13.0). The sample was partitioned into four age groups, which did not differ in level of education. Participants were screened for a history of neurological or psychiatric disorder. All participants were native English speakers. The BNT was administered according to standard instructions as part of a large neuropsychological battery. Some decline in scores after age 70 was apparent from cross-sectional age group comparisons. The pattern of correlations with various neuropsychological measures suggests a predominantly verbal mode of information processing in BNT performance on the first probe, as opposed to a visuospatial mode by the third probe. A comparison of BNT scores across the three probes revealed adequate stability of scores over time, with test-retest correlations ranging r = 0.62-0.89.
Study strengths 1. Information regarding age, education, gender, geographic area, IQ, and fluency in English is reported. 2. Adequate exclusion criteria were used. 3. The data are partitioned into four age groups. 4. Test-retest data are provided. 5. Overall sample size is large; some cells approach 50, whereas others are rather small. 6. Means and SDs for the test scores are reported.
Consideration regarding use of the study 1. Mean education and intelligence levels are high.
[BNT.5] Neils, Baris, Carter, Dell'aira, Nordloh, Weiler, and Weisiger, 1995 (Table A10.7)
The study addresses the effects of demographic factors on BNT performance. Participants were 323 normal elderly (244 females, 79 males) aged 65-97 residing in northern Kentucky and the greater Cincinnati, Ohio, area; 167 participants were living independently and 156 were institutionalized in extended-care facilities for at least 1 month. All participants were carefully screened for neurological disorders and had adequate vision, language comprehension, and attention. The administration procedure differed from standard in that the stimulus cues were offered after any error was made, irrespective of whether it was a visual-perceptual error. The data are presented in an age-by-education-by-living environment matrix. The combination of age, education, and living environment accounted for 32% of the performance variance. The results suggest that scores for low-education and high-education groups are less affected by age and living environment than scores for participants with 10-12 years of education. Correlation between BNT score and education was r = 0.38, whereas the correlation of BNT with age was r = -0.33.
Study strengths 1. Information regarding age, education, gender, and geographic area is provided. 2. Data across wide ranges of different demographic characteristics are presented. 3. Strict selection criteria were used for neurological disorders and cognitive dysfunction. 4. Overall very large sample size. 5. The data are presented in an age-by-education-by-living environment matrix. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No information regarding intellectual level. 2. Sample sizes in individual cells are small. 3. The administration procedure somewhat differed from standard instructions.
[BNT.6] Ross, Lichtenberg, and Christensen, 1995 (Table A10.8)
This article represents an expansion on the previously reported data in Lichtenberg et al. (1994). In study 1, the authors provide data for 123 geriatric medical inpatients at an urban rehabilitation hospital in Michigan (60% African American, 40% Caucasian, 62% female, 38% male). Mean age was 75.87 (7.42), with mean education of 11.05 (3.38). Rigorous exclusion criteria for neurological disorders and depression were used. Mean Mattis Dementia Rating Scale (DRS) score for the sample was 132.76 (4.93). Patients treated for hypertension, diabetes, and hypothyroidism were included if their conditions were well controlled with medications and without neurological complication. Some participants were tested 2-3 weeks after orthopedic surgery and were not on narcotic medications at the time of assessment. In study 2, participants from study 1 were compared as a "normative" group to a "cognitively impaired" group of 151 participants with Mattis DRS scores below 123 (61% African American, 39% Caucasian, 30% male, 70% female). Mean age for this group was 79.7, with mean education of 8.9 years. Participants from this group presented with a wide variety of physical disorders which are likely to affect cognitive status. Twenty-four
percent of these participants had scores above 10 on the Geriatric Depression Scale (GDS). The results of study 1 indicated significant correlations of BNT scores with age, education, and ethnicity (-0.308, 0.375, and 0.326, respectively). The combined effects of demographic variables accounted for 21% of the BNT variance. In study 2, a discriminant function analysis based on the BNT and demographic data discriminated between cognitively intact and impaired participants with an accuracy of 72.75% (sensitivity 63%, specificity 80%). The authors underscore the importance of using a demographically appropriate set of normative data and suggest use of their data in urban medical settings.
Study strengths 1. Means and SDs for the test scores are reported. 2. Data are presented by age group. 3. A comparison of BNT performance for clinical and medical control groups is presented. 4. Information regarding age, education, ethnicity, gender, and geographic area is reported. 5. Individual cell sizes approach 50.
Considerations regarding use of the study 1. "Normal" participants were geriatric inpatients, many of whom had physical illnesses potentially affecting cognitive status. 2. The age range for the oldest group is not reported. 3. No information on intellectual level.
[BNT.7] Worrall, Yiu, Hickson, and Barnett, 1995 (Table A10.9)
The authors assessed the validity of the BNT as part of a large educational project on 136 independently living older Australians. Participants were recruited through advertisements. Participants with a reported history of neurological disease and non-native English speakers were excluded. The mean age for the sample was 70.43 (SD = 7.8) years, and 74.3% were female.
The BNT was administered according to standard instructions, followed by a trial of seven alternative items as potential substitutes for low-frequency original items. In addition to standard scoring, an analysis of errors was conducted according to current systems (e.g., Nicholas et al., 1989). The results revealed that the mean BNT score was 2-5 points below that reported for North American samples. Interrater reliabilities for the total score and for error scoring were high (94.89% and 98.17% agreement, respectively). Age, education, visual acuity, and backward digit span were significantly related to BNT scores (r = 0.23-0.33). The analysis of errors indicated that semantically related errors and "don't know" responses were most frequent. The authors emphasized an effect of culture-related word frequency on BNT performance. The proposed alternate items for "beaver" and "pretzel" were "platypus" and "pizza." The longitudinal follow-up data for 91 participants from this sample are reported in Cruice et al. (2000).
Study strengths 1. Minimally adequate exclusion criteria are reported. 2. Data are presented by age group. 3. Authors recommend cutoff scores. 4. Analysis of errors was performed. 5. Information regarding age, gender, geographic area, and recruitment procedures is reported.
Considerations regarding use of the study 1. Education and intellectual level are not reported. 2. Sample sizes for most of the age groups are small. 3. Participants were recruited in Australia, and it is unclear if these norms are suitable for clinical interpretation in the United States given that this sample scored 2-5 points below North American samples.
[BNT.8] Lafleche and Albert, 1995 (Table A10.10)
The BNT was administered to 20 volunteers who comprised a control group in a study on
executive function deficits in mild AD. The control group included nine men and 11 women, with a mean age of 76.2 years, mean education of 14.7 years, and mean MMSE score of 29.4 (0.8). Participants were screened for severe head injury, alcoholism, major psychiatric illness, epilepsy, and learning disabilities. They did not show evidence of a dementing process, either on testing or by history.
Study strengths 1. Adequate exclusion criteria. 2. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small. 2. SDs and ranges for age and education are not provided. 3. Recruitment procedures are not reported. 4. Education level for the sample is high. 5. No information on IQ is reported. [BNT.9] lvnik, Malec, Smith, Tangalos, and
percenttle ranks. The authors provided tables of age-corrected norms for each age group. The procedure for clinical application of these data is described in the original article (Ivnik et al., 1996) as follows: first select the table that corresponds to that person's age. Enter the table with the test's raw score; do not use "corrected" or "final" scores for tests that might present their own age- or educationadjustments. Select the appropriate column in the table for that test. The corresponding row in the left-most column in each table provides the MOANS Age-Corrected scaled score . . . for your subject's raw score; the corresponding row in the right-most column indicates the percentile range for that same score.
Further, linear regressions should be applied to the normalized, age-corrected MOANS scaled scores (A-MSS) derived from the tables, to adjust patient scores for education. Age- and education-corrected scores for the BNT (A&E-MSS) can be calculated as follows:
Petersen, 1996 (Table A10.11)
The study provides age-specific norms for the BNT obtained in Mayo's Older Americans Normative Studies (MOANS), which produce normative data for elderly individuals on different neuropsychological tests. The total sample consisted of 746 cognitively normal volunteers residing in Minnesota, over age 55, 663 of whom took the BNT. Mean MAYO FSIQ (which differs somewhat from standard WAIS-R FSIQ) for the whole sample was 106.2 (14.0), and mean Mayo General Memory Index on the Wechsler Memory ScaleRevised (WMS-R) was 106.2 (14.2). For a description of their samples, the authors refer to their earlier publications. Participants were independently functioning, communitydwelling persons who were recently examined by a physician and had no active neurological or psychiatric disorder with the potential to impact cognition. Age categorization utilized the midpoint inteiVal technique. The raw score distribution for each test at each midpoint age was "normalized" by assigning standard scores with a mean of 10 and SD of 3, based on actual
$$\text{A\&E-MSS}_{\text{BNT}} = K + (W_1 \times \text{A-MSS}_{\text{BNT}}) - (W_2 \times \text{Education})$$
where the following indices are specified for the BNT: $K = 3.32$, $W_1 = 1.07$, $W_2 = 0.34$. Education should enter the formula as years of formal schooling. The tables of scaled scores per age group provided by the authors should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1996). Therefore, they are not reproduced in this book. Interested readers are referred to the original article. Table A10.11 summarizes sample sizes for different demographic groups.
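As a worked illustration of the education adjustment given by the formula above, the following minimal Python sketch applies the BNT indices quoted in the text (K = 3.32, W1 = 1.07, W2 = 0.34). The function and constant names are assumptions made for this example and are not part of the MOANS materials; the A-MSS value must still be obtained from the age-specific tables in Ivnik et al. (1996), and the patient values shown are hypothetical.

```python
# BNT indices as quoted in the text for the education adjustment.
K_BNT = 3.32
W1_BNT = 1.07
W2_BNT = 0.34


def age_education_corrected_mss(a_mss: float, education_years: float) -> float:
    """Return A&E-MSS = K + (W1 * A-MSS) - (W2 * Education).

    a_mss: age-corrected MOANS scaled score taken from the published tables.
    education_years: years of formal schooling.
    """
    return K_BNT + (W1_BNT * a_mss) - (W2_BNT * education_years)


# Hypothetical patient: A-MSS of 10 and 12 years of formal schooling.
example = age_education_corrected_mss(10, 12)
print(round(example, 2))  # 9.94
```

With these hypothetical inputs the adjustment is 3.32 + 10.70 - 4.08 = 9.94, so a patient with average age-corrected performance and 12 years of schooling stays close to the scaled-score mean of 10.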
Study strengths 1. Information regarding age, education, IQ, gender, ethnicity, handedness, and geographic area is reported.
2. The data are stratified by age group based on the midpoint interval technique. 3. The innovative scoring system is well described. The authors developed new indices of performance. 4. The sample sizes for each group are large. 5. Restricted age range in each cell.
Considerations regarding use of the study 1. The measures proposed by the authors are quite complicated and might be difficult to use in clinical practice. 2. Participants with prior history of neurological, psychiatric, or chronic medical illnesses were included.
Other comments 1. The theoretical assumptions underlying this normative project have been presented in Ivnik et al. (1992a,b). 2. The authors cautioned that the validity of the MAYO indices depends heavily on the match of demographic features of the individual to the normative sample presented in this article. 3. Correlation of the BNT with age was -0.46, whereas correlations with education and gender were 0.26 and -0.19, respectively.
[BNT.10] Welch, Doineau, Johnson, and King, 1996 (Tables A10.12-A10.14)
The study provides data on BNT performance for 176 normal older adults from middle Tennessee (74 males, 102 females), ranging in age from 60 to 93, with a mean age of 74 years. Education ranged from third grade to lf years, with a mean of 12.28 years. The sample consisted of 61% urban and 39% rural participants; 29% professional, 28% skilled, and 43% labor workers; 71% white, 28% African American, and 1% other. Participants were recruited mostly from senior-citizen organizations and retirement centers, to ensure sample representation approximating the general population for the following parameters: various occupational levels (skilled, professional, or manual labor), race, and living characteristics (urban vs. rural). Strict medical and psychiatric exclusion criteria were employed. Participants with well-controlled hypertension or who had adequate corrected vision were included. The data were presented for five age groups and then further stratified into five age groups by two educational levels and into five age groups for males and females separately. The table for five age groups includes suggested cutoff scores. The results indicated that the interaction of age and education is a better predictor of BNT performance than age alone. Performance variability was higher in the older age and lower education groups. In the ≥12th grade education group, BNT performance remained stable until 80 years, while in the <12 years of education group the decrement was evident at 70 years. Gender differences were also reported, with males outperforming females.
Study strengths 1. Information regarding age, education, gender, ethnicity, occupation, recruitment procedures, and geographic area is reported. The sample is representative of the regional population along most demographic parameters. 2. Strict exclusion criteria were used. 3. Data are presented by age group and by different combinations of demographic variables (education, gender). 4. Authors recommend cutoff scores. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample sizes for some of the age groups are small. 2. No information regarding IQ is provided.
[BNT.11] Hoff, Riordan, Morris, Cestaro, Wieneke, Alpert, Wang, and Volkow, 1996 (Table A10.15)
The authors used the BNT in a study exploring the relationship of cocaine use to performance on neuropsychological tests tapping functions of frontal and temporal brain regions. Performance of crack cocaine users was compared to that of the control group, which consisted of 54 paid male volunteers with
a mean age of 32.1 (9.7) years and mean education of 15.4 (2.4) years. The sample included 48 white, 4 black, and 2 Hispanic participants. Exclusion criteria were a history of medical, neurological, or psychiatric problems; more than moderate use of alcohol (12 oz/week), history of intravenous drug use, and self-reported history of learning disability (with enrollment in special education classes).
Study strengths 1. Relatively large sample size. 2. The sample composition is described in terms of age, education, and ethnicity. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Wide age and education range. No information on IQ or gender distribution. 2. Recruitment procedures are not reported. 3. Educational level for the sample is high.
[BNT.12] Ponton, Satz, Herrera, Ortiz, Urrutia, Young, D'Elia, Furst, and Namerow, 1996 (Table A10.16)
The Ponton-Satz BNT was administered to Spanish-speaking volunteers as part of a larger battery in a project designed to provide a standardization of the Neuropsychological Screening Battery for Hispanics (NeSBHIS). Volunteers were recruited through fliers and advertisements in community centers of the greater Los Angeles area over a period of 2 years. Exclusion criteria were a history of neurological or psychiatric disorder, drug or alcohol abuse, and head trauma. Data for a sample of 300 participants, with a median educational level of 10 years, were analyzed. Participants ranged in age from 16 to 75 years, with a mean of 38.4 (13.5) years. Education ranged 1-20 years, with a mean of 10.7 (5.1) years. Male to female ratio was 40/60%. The average duration of residence in the United States was 16.4 (14.4) years. Seventy percent of the sample were monolingual Spanish-speaking, and 30% were bilingual. The proportion of the sample respective to their country of origin closely approximates the 1992 U.S. Census distribution. Correlations between Marin and Marin (1991) acculturation scale scores and neuropsychological variables are provided. The Ponton-Satz BNT is an adaptation of Kaplan's BNT, consisting of 30 items that are based on the original test but presented in different order. The selection of items was based on the ratings of expert judges in terms of cultural appropriateness and difficulty. In the follow-up study on the factor structure of the NeSBHIS (Ponton et al., 2000), which extracted five factors, the BNT primarily loaded on the Language factor, with a varimax-rotated factor loading of 0.84.
Study strengths 1. Large overall sample with acceptable size for most of the cells. 2. The sample composition is well described in terms of age, education, gender, acculturation information, geographic area, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data are partitioned by gender x age x education.
Considerations regarding use of the study 1. Thirty-item version of the test adapted for use with Spanish-speaking population was administered. 2. No information on IQ is reported. 3. It is unclear which of the two educational groups included participants with 10 years of education.
[BNT.13] Tombaugh and Hubley, 1997 (Tables A10.17, A10.18)
The study provides age- and education-stratified norms for 219 community-dwelling, cognitively intact volunteers, who participated in a large study on the effect of aging on acquisition and retention of information. They were recruited through booths at shopping centers, social organizations, places of employment, psychology classes, and word of mouth. The sample included participants aged
25-88 years (M = 59.0, SD = 16.9). Average educational level was 12.9 (2.3) years; 46% were male. Mean WAIS-R Vocabulary scaled score was 11.6 (2.4). Participants were screened based on a self-reported history of medical and psychiatric problems, including a list of currently prescribed medications. Persons with a known history of neurological disease, psychiatric illness, head injury, or stroke were excluded. Participants with MMSE score <25 and GDS score >13 were also excluded. Participants were administered all items for 20 seconds, starting with item 1. Specific administration procedures are described by the authors in detail. The standard scoring procedure was used. The authors recorded rates of correct spontaneous responses (SR), number correct after a stimulus cue (SC), and number correct after a phonemic cue (PC). Rates of SR + SC (according to the original scoring procedure) and the sum of all correct responses (SR + SC + PC) are provided.
Study strengths 1. Administration procedure is well outlined. 2. Sample composition is well described in terms of age, education, gender, WAIS-R Vocabulary score, and recruitment procedures. 3. Strict selection criteria were used. 4. The table of means is stratified by age group, education, and gender; the table of percentiles is presented in age x education cells. 5. Means, SDs, and percentiles for the test scores are reported.
Considerations regarding use of the study 1. Administration procedure deviated from the standard procedure described in the test manual. 2. Sample sizes for age groupings are relatively small, although overall sample size is quite large. 3. Data were collected in Canada and, therefore, might be of limited use for clinical interpretation in the United States.
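The two summary scores reported by Tombaugh and Hubley (SR + SC and SR + SC + PC) can be illustrated with a short Python sketch. The class name, field names, and the response counts below are hypothetical and serve only to make the arithmetic explicit; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BNTResponseCounts:
    """Counts of correct responses by cue condition (hypothetical container)."""

    spontaneous: int    # SR: correct without any cue
    stimulus_cued: int  # SC: correct after a stimulus cue
    phonemic_cued: int  # PC: correct after a phonemic cue

    def standard_total(self) -> int:
        """SR + SC, the total under the original scoring procedure."""
        return self.spontaneous + self.stimulus_cued

    def total_with_phonemic(self) -> int:
        """SR + SC + PC, the sum of all correct responses."""
        return self.standard_total() + self.phonemic_cued


# Hypothetical protocol: 50 spontaneous, 2 stimulus-cued, 5 phonemic-cued.
counts = BNTResponseCounts(spontaneous=50, stimulus_cued=2, phonemic_cued=5)
print(counts.standard_total(), counts.total_with_phonemic())  # 52 57
```

Keeping the three counts separate, rather than storing only a total, makes it possible to compare a protocol against either of the two score types reported in the tables.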
[BNT.14] Henderson, Frank, Pigatt, Abramson, and Houston, 1998 (Table A10.19)
The authors examined effects of race, gender, and educational level on BNT scores. The sample included 50 African-American and 50 Caucasian participants, with 25 males and 25 females in each group, who ranged in age from 17 to 87 years and in education from 10 to >17 years. Volunteers from local churches in Richland and Florence counties in South Carolina; students, faculty, and staff from the University of South Carolina (USC); and participants in a lexical function study at the USC were included in the sample. Exclusion criteria were a history of mental retardation, dementia or developmental language disorders, traumatic brain injury, cerebrovascular accident, treatment for alcoholism, or current psychiatric illness including depression. Participants with scores above 3 on the Hachinski Ischemia Rating Scale, above 0.5 on the Zung Depression Scale, and <130 on the DRS were excluded from the study. The test was administered by trained staff who followed the standard administration procedure. However, all 60 items were presented. Credit was given for spontaneously correct responses and for correct responses with a stimulus cue. Participants and examiners administering the test were matched by race. The results revealed a significant effect of education on BNT scores but no significant effect of race or gender. An item analysis was used to examine the vocabulary base underlying performance on BNT items.
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, and geographic area. 3. Rigorous exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small.
2. Demographic characteristics of the sample are provided in different combinations of subgroups. However, they do not match the breakdown of the sample used for normative data reporting. 3. Recruitment procedures are not reported. 4. No information on IQ is reported.
[BNT.15] Stuss, Alexander, Hamer, Palumbo, Dempster, Binns, Levine, and Izukawa, 1998 (Table A10.20)
The study addresses the effect of brain lesion location and etiology on verbal fluency. The BNT, among other tests, was administered to assess basic language competence. The control group included 37 participants (19 males, 18 females) without neurological or psychiatric disorder, with a mean age of 54.4 (14.4) years and mean education of 13.9 (2.3) years. Mean National Adult Reading Test (NART)-estimated IQ was 113.8 (6.1).
Study strengths 1. The sample composition is well described in terms of age, education, gender, and estimated IQ. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Recruitment procedures are not reported. 2. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
[BNT.16] Fastenau, Denburg, and Mauer, 1998 (Table A10.21)
The authors provided data for the full BNT in the context of a study addressing the validity of short forms of the BNT. Participants were recruited through newspaper advertisements and at local churches and older adult organizations in urban areas. A stratified sampling procedure was used to produce a sample of 108 participants (47% female) with a mean age of 72.2 (7.0) years. Education ranged from 8 years to post-baccalaureate training, with a mean of 14.1 (2.4) years. MMSE scores ranged 24-30, with a mean of 28.1 (1.5). The sample was predominantly Caucasian. English was the primary language for all participants. They were financially compensated for participation. Individuals with a history of cerebrovascular insult, head injury with loss of consciousness exceeding 5 minutes, and chronic substance abuse, according to the structured interview, were excluded. All participants lived independently in the community and denied uncorrected vision, hearing, or motor impairment. The BNT items were reorganized according to the order of items in the short forms which were validated in this study. The test was administered by trained graduate students. The data are stratified into three age groups, with equal numbers of males and females in each group. The authors reported an age effect on BNT performance but no education or gender effects.
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, ethnicity, MMSE scores, primary language, incentive for participation, sampling procedure, geographic area, and recruitment procedures. 3. Rigorous exclusion criteria. 4. Detailed description of test administration procedure and personnel is provided. 5. Means and SDs for the test scores are reported. 6. Data are partitioned into three age groups.
Considerations regarding use of the study 1. Administration was not standard: test items were reorganized. 2. Education and intelligence level for the sample are high. 3. No information on IQ is reported.
[BNT.17] Ross and Lichtenberg, 1998 (Table A10.22)
Normative data for 233 neurologically intact medical patients are provided for use in
urban medical settings. This study represents an extension of the earlier study by Ross et al. (1995) and is part of a continuing geriatric research protocol, the Normative Studies Research Project (NSRP) for older urban adults. Participants were tested during inpatient postacute rehabilitative services at an urban university-affiliated hospital. All consecutively admitted patients were tested approximately 1 week after their admission and 2-3 weeks prior to their hospital discharge. The medical problems included arthritic conditions, fractures, limb amputations, back injuries, spinal cord injuries, and admissions for physical reconditioning following illness or surgery. Patients were on a variety of medications. The sample included 56% African-American and 44% Caucasian participants, 73% of whom were female, ranging in age from 65 to 95. Their mean age was 76.1 (7.1) years, and mean education was 11.1 (3.2) years. Selection criteria were normal neurological examination upon hospital admission; absence of medical history suggesting past neurological dysfunction, psychiatric illness, or alcohol abuse; a score above 123 on the DRS; and a GDS score <10. Participants with current hypertension, diabetes, and hypothyroidism were included only when these conditions were well controlled with medications and did not cause neurological complications. Normative data are stratified by age and education. Results of regression analysis suggest that education accounts for approximately 14% of the variance. Age, ethnicity, and gender each account for an additional 2% of the variance in test performance. The authors argue that the selection of the normative data for clinical use should reflect the degree to which the demographic characteristics of the normative samples match those of their patients.
Study strengths 1. Large overall sample. 2. The sample composition is well described in terms of age, education, gender, ethnicity, medical condition, and recruitment procedures. 3. Rigorous exclusion criteria.
4. Means and SDs for the test scores are reported. 5. Data are stratified by age and education.
Considerations regarding use of the study 1. Overall sample is large, but individual cells are relatively small. 2. No information on IQ is reported.
[BNT.18] Kohnert, Hernandez, and Bates, 1998 (Table A10.23)
The BNT was administered to 100 bilingual adults of Mexican-American descent in both Spanish and English. Participants had > 12 years of education and were recruited from University of California San Diego, University of California Santa Barbara, and the San Diego community. Spanish was the first language and the primary home language for all participants, with English acquisition prior to the age of 8. Proficiency in both languages was assessed with a self-rating questionnaire. Participants were given course credit or paid $5 for their participation. Exclusion criteria were left-handedness; a history of speech, language, or hearing impairment; uncorrected visual deficits; proficiency or prolonged exposure to languages other than Spanish and English; or a medical history potentially resulting in compromised neurological status (based on responses on the health questionnaire). The sample included 41 males and 59 females, with a mean age of 20.82 (2.6) years, mean education of 14.4 (1.7) years, and mean age of English acquisition of 4.6 (3.0) years. The BNT was administered two times to each participant in both languages in the counterbalanced order. Instructions were given in the language of test administration. All 60 pictures were shown to all participants, with no basal or ceiling values set. Spanish protocol was developed based on the original version of the test. The test was administered by trained personnel. Total correct scores were defined as spontaneously correct responses plus those aided with semantic cues. Three group performance scores were derived: English only, Spanish only, and a composite score indicating the total number of items correctly named, independent of language.
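The composite score described above, the number of items named correctly independent of language, can be sketched in Python as the size of the union of the items named correctly in each language. This reading of the composite is an assumption made for illustration; the function name and the item numbers are hypothetical and are not taken from Kohnert et al.

```python
def composite_score(english_correct: set[int], spanish_correct: set[int]) -> int:
    """Count items named correctly in at least one language.

    Each argument is the set of item numbers credited as correct
    (spontaneous or semantically cued) in that language.
    """
    return len(english_correct | spanish_correct)


# Hypothetical responses to a handful of items.
english = {1, 2, 3, 5}
spanish = {2, 3, 4}

print(len(english), len(spanish), composite_score(english, spanish))  # 4 3 5
```

Under this reading the composite can never be lower than either single-language score, which matches its intended use as a language-independent index of naming.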
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, factors influencing language proficiency, incentives for participation, and setting. 3. Adequate exclusion criteria. 4. Test administration procedures are specified in detail. 5. Means and SDs for the test scores are reported. 6. Comparison of performance in Spanish and English is provided.
Consideration regarding use of the study 1. No information on IQ is reported.
[BNT.19] Randolph, Lansing, Ivnik, Cullum, and Hermann, 1999 (Table A10.24)
The effects of age, education, gender, and diagnostic group with respect to overall BNT performance; the influence of phonemic cuing; and performance on individual items were examined on samples of neurologically normal elderly, AD patients, and temporal lobe epilepsy patients. The control group included 719 paid and unpaid volunteers for studies on neuropsychological function in normal aging. Procedures for recruitment and sample description are provided in earlier articles based on this study. The follow-up articles provide additional information (Lansing et al., 1999). The sample was almost exclusively white and 60% female, with a mean age of 73.6 (10.3) years and mean education of 13.4 (2.9) years. Spontaneously correct responses or responses correct with stimulus cue were scored as correct. The test was discontinued after six consecutive failures. The authors present means and SDs for the data broken down by age groups using the overlapping midpoints technique. They also present data for three education groups: <12, 12, and >12 years. The authors found that age and education systematically influenced BNT scores. Gender had a significant effect across diagnostic groups, with males outperforming females, which was interpreted by the authors as an item-related effect.
Study strengths 1. Information regarding age, education, IQ, gender, ethnicity, handedness, and geographic area is reported in the original articles based on this study (Ivnik et al., 1992a,b). 2. The data were stratified by age group based on the overlapping midpoints technique. 3. The sample sizes for each group are large.
Considerations regarding use of the study 1. To interpret the data presented in age groups broken down by the overlapping midpoints technique, the reader is referred to the original articles by Ivnik et al. 2. Participants with a prior history of psychiatric or chronic medical illnesses were included.
[BNT.20] Killgore and Adams, 1999 (Table A10.25)
The study investigates the relationship between BNT performance and WAIS-R Vocabulary score and derives regression-based expected BNT scores from Vocabulary scaled scores. The sample consisted of patients consecutively referred for neuropsychological evaluation at a large midwestern medical center over a 26-month period who were found to be without demonstrable neurological impairment. All patients had negative neurological evaluations and negative neuroimaging and laboratory studies. Patients were excluded if there was evidence of mild dementia or a history of alcohol abuse, learning disability, or seizure disorder. The sample consisted of 28 males and 34 females, ranging in age from 17 to 85 years, with a mean age of 45.7 (15.1) years, mean education of 13.1 (2.7) years, and mean WAIS-R FSIQ of 95.1 (12.0); 34% had a psychiatric diagnosis such as Major Depressive Episode or Adjustment Disorder. The BNT was administered by trained personnel, according to the standardized instructions. The results did not suggest a relationship between BNT performance and age. However, the regression of BNT scores on Vocabulary scaled scores accounted for 42% of the variance in BNT scores and was used to derive predicted BNT scores for different Vocabulary levels.
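The general idea of deriving expected BNT scores from Vocabulary scaled scores by simple linear regression can be sketched as follows. This is only an illustration of the technique under stated assumptions: the data points are invented, the helper names are hypothetical, and the resulting coefficients are not those of Killgore and Adams (1999), whose published values should be taken from Table A10.25 or the original article.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Proportion of variance in y accounted for by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total


# Invented illustration data (NOT from the study):
# Vocabulary scaled scores and BNT raw scores for seven hypothetical patients.
vocabulary = [6, 8, 9, 10, 11, 12, 14]
bnt = [48, 50, 55, 53, 56, 57, 58]

intercept, slope = fit_line(vocabulary, bnt)
variance_explained = r_squared(vocabulary, bnt, intercept, slope)

# Expected BNT score for a hypothetical patient with a Vocabulary scaled score of 10.
expected_bnt_at_10 = intercept + slope * 10
```

An observed BNT score can then be compared with the expected score for the patient's Vocabulary level, which is how regression-based expected scores are typically used.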
Study strengths 1. The sample composition is well described in terms of age, education, gender, IQ, geographic area, and recruitment procedures. 2. Exclusion criteria are well defined. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Though the overall sample size is adequate, it represents a wide age range. 2. The data are not partitioned by age group. 3. The data are based on a clinical sample of patients referred for neuropsychological evaluation, although neurological damage was ruled out by multiple modalities. 4. Patients with psychiatric diagnoses were included in the sample.
[BNT.21] Heaton, Avitable, Grant, and Matthews, 1999 (Table A10.26)
The authors respond to the criticism of their regression-based norms by providing updated norms for the BNT, based on a sample of healthy community residents between 60 and 80 years of age, who volunteered to participate as normal controls in studies of late-life psychosis at the San Diego Veterans Administration Medical Center. The exclusion criteria were history of neurological illness, significant head trauma, recent or current substance use disorder, major psychiatric illness, or systemic illness likely to affect central nervous system function. The test was administered by trained technicians. Raw scores and demographically corrected T scores for the BNT are presented.
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, ethnic composition, and geographic area.
3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. T scores are stratified by demographic groups; however, the raw score is presented for the entire sample. 2. No information on IQ is reported.
[BNT.22] Rosselli, Ardila, Araujo, Weekes, Caracciolo, Padilla, and Ostrosky-Solis, 2000 (Table A10.27)
The authors examined the impact of bilingualism on verbal fluency and repetition tests in older Hispanic bilinguals. The BNT was used to assess naming proficiency. The sample included 82 right-handed south Florida residents (28 male, 54 female) who volunteered to participate. Participants ranged in age from 50 to 84, with a mean age of 61.76 (9.3) years and mean education of 14.8 (3.6) years. They independently had MMSE scores >27, had Beck Depression Inventory-II scores <5, and were screened for a history of neurological or psychiatric problems using a structured interview. Forty-five participants were monolingual in English (born in the United States and only spoke English), 18 were Spanish monolinguals (Latin American immigrants living in the city of Hialeah, Florida, who migrated to the United States after age 50 and had been living in the United States an average of 5 years, had no formal education in English or previous employment in which English was required, and used Spanish in all daily activities), and 19 were bilingual (as determined by their responses on a bilingual questionnaire, self-rated language proficiency in both languages, and an acceptable score on the Spanish and English versions of the BNT norms, with a correction for age).
Study strengths 1. The sample composition is well described in terms of age, education, gender, geographic area, and incentive for participation. 2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small. 2. Recruitment procedures are not reported. 3. Level of education for the monolingual English group is high. 4. No information on IQ is reported.
[BNT.23] Schmitter-Edgecombe, Vesneski, and Jones, 2000 (Table A10.28)
The study compared word-finding abilities across three age groups. The sample included 26 participants in each of three age groups: 18-22 (M = 18.93), 58-74 (M = 66.29), and 75-93 (M = 79.19) years. Each group included 7 males and 19 females. The FSIQ of participants was estimated based on their performance on the WAIS-R Vocabulary, Block Design, Similarities, and Arithmetic subtests. The young group consisted of undergraduate students, who received course credit for their participation. The older groups included healthy, community-dwelling volunteers. All participants were native English speakers. Exclusion criteria were a history of substance abuse, brain surgery, cerebrovascular or cardiovascular accident, non-normative levels of cognitive decline, brain damage sustained earlier from a known cause, psychiatric disorder, or serious health problems, per self-report. Individuals taking more than two drugs rated to have more than minimal effects on attention were excluded. Color-blind individuals or those who could not see the cards were also excluded. The BNT was administered according to standard instructions. An item analysis revealed a cohort effect (generational familiarity) in performance on four BNT items.
Study strengths 1. The sample composition is well described in terms of age, education, gender, estimated FSIQ, and incentive for participation. 2. Adequate exclusion criteria. 3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small. 2. Recruitment procedures are not reported. 3. SDs for age are not reported. 4. Educational levels for the two older groups are high.
[BNT.24] Saxton, Ratcliff, Newman, Belle, Fried, Yee, and Kuller, 2000 (Table A10.29)
The BNT was administered as part of the Memory and Aging Study (MAS), conducted as an ancillary project to the Cardiovascular Health Study (CHS), a multicenter observational study of heart disease and stroke in Washington County, Maryland, and Pittsburgh, Pennsylvania. No selection criteria were used. Data were analyzed for a sample of 989 participants (444 males, 545 females) who completed all of the cognitive tests included in the battery. The mean age for the sample was 73.63 (4.45) years, with mean education of 13.23 (2.85) years; 93.9% of the sample were white. This sample was divided into two clinical groups and a "no disease" group, based on cardiovascular status. Scores on the BNT for the "no disease" sample of 357 participants are reproduced in Table A10.29. Demographic characteristics for this sample are not reported by the authors. However, we assume that they are similar to the demographics for the entire sample described above.
Study strengths 1. Large sample size. 2. The sample composition is described in terms of age, education, gender, setting, geographic area, and recruitment procedures. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No exclusion criteria. 2. The data are not partitioned by age group.
3. No information on IQ is reported.
[BNT.25] Bell, Hermann, Woodard, Jones, Rutecki, Sheth, Dow, and Seidenberg, 2001 (Table A10.30)
The study examined object-naming ability and depth of semantic knowledge in healthy controls and patients with early-onset temporal lobe epilepsy. The control group included 29 friends, relatives, and spouses of temporal lobe epilepsy patients (72% female), aged 1~ 60 years, with a mean age of 34.4 (12.5) years; FSIQ (as measured with the WAIS-III sevensubtest short form) of 69-110, with a mean FSIQ of 97.7 (6.4); and mean education of 13.0 (1.7) years. Exclusion criteria were current substance abuse, psychotropic medication use, medical or psychiatric condition that could affect cognitive functioning, episode of loss of consciousness longer than 5 minutes, developmental learning disorder, and repetition of a grade in school. All 60 BNT items were administered to all participants. Integrity of conceptual knowledge was assessed by asking participants to define a subset of items from the BNT.
Study strengths 1. The sample composition is well described in terms of age, education, gender, FSIQ, and recruitment criteria. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small and includes a wide age range. 2. The data are not partitioned by age group.
[BNT.26] Roberts, Garcia, Desrochers, and Hernandez, 2002 (Table A10.31)
The authors compared BNT performance and order of difficulty for the individual items for three language groups: 42 monolingual English speakers, 32 Spanish/English bilinguals, and 49 French/English bilinguals. All bilingual
participants were proficient in English and learned it as children. French participants had lived in Canada. Participants had no history of communication problems, head injury, or drug or alcohol abuse. All 60 items of the BNT were administered. Participants were not instructed to use a single word for each picture. Two types of score were used: strict (responses listed in the BNT booklet) and lenient (allowing synonyms and variants for 18 items). Performance of bilingual groups was significantly lower than that of monolingual English speakers. Item difficulty differed across groups.
Study strengths 1. The sample composition is described in terms of age and education. 2. Factors influencing language proficiency are well described. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No information on gender and IQ is reported. 2. Recruitment procedures are not reported. 3. Educational level for the sample is high. 4. The data for French-speaking participants were obtained on Canadian residents, which may limit their usefulness for clinical interpretation in the United States. 5. Administration procedures were altered.
[BNT.27] Coffey, Ratcliff, Saxton, Bryan, Fried, and Lucke, 2001 (Table A10.32)
The authors examined cognitive correlates of age-related physiological changes in brain structure, as evident on quantitative MRI. Participants were 320 elderly (38% male, 62% female) between 66 and 90 years of age, who were selected from Pittsburgh, Pennsylvania, and Hagerstown, Maryland, centers of the
multicenter, population-based Cardiovascular Health Study. Data for a slightly larger subsample from this project are presented in Saxton, 2000 (see review above). Volunteers were excluded from the study if they were not right-handed, had a lifetime history of psychiatric illness or any illness or injury referrable to the brain, MR images revealing structural abnormalities, or MMSE scores of <24. Mean age for the sample was 74.85 (4.95) years, mean education 12.98 (2.87) years, mean MMSE score 28.29 (1.5), and mean WAIS-R Vocabulary score 47.52 (13.26). Of this sample, 71% were taking medications for one or more medical conditions. None was taking medication known to affect brain size (e.g., steroids). The BNT was administered according to standard instructions. Data are presented for the whole sample and for males and females separately. Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, MMSE score, WAIS-R Vocabulary score, geographic area, and research setting. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. Consideration regarding use of the study 1. The data are not partitioned by age group.
[BNT.28] Giovannetti, Goldstein, Schullery, Barr, and Bilder, 2003 (Table A10.33)
The BNT was administered to 31 control participants in order to assess basic language skills associated with temporal lobe functions in a study on the mechanisms of verbal fluency deficits in first-episode schizophrenia. Participants were recruited from the hospital community through announcements in local newspapers and within the medical center. They had no history of substance abuse or neurological/psychiatric/medical illness, per
self-report and per Schedule for Affective Disorders and Schizophrenia Interview, physical examination, and urinalysis. Mean age for the group was 25.2 (6.07) years, mean education 15.0 (1.48) years, mean WAIS-R IQ 109.3 (11.51), and male/female ratio 21/10. The sample is further described in the articles by Bilder et al. and Lieberman et al., published between 1991 and 2000. Study strengths 1. The sample composition is well described in terms of age, education, gender, FSIQ, geographic area, and recruitment procedures. 2. Stringent exclusion criteria. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is small. 2. Educational level for the sample is high.
RESULTS OF THE META-ANALYSES OF THE BOSTON NAMING TEST DATA (See Appendix 10m)
Data collected from the studies reviewed in this chapter were combined in regression analyses in order to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult age groups are provided in Appendix 10m. After initial data editing for consistency and for outlying scores, 14 studies, which generated 42 data points based on a total of 1,684 participants, were included in the analyses. Normative data from Kaplan's standardization sample were not included in the database. A quadratic regression of the BNT scores on age yielded an R² of 0.850, indicating that 85% of the variance in BNT scores is accounted for by the model. Based on this
model, we estimated BNT scores for age intervals between 25 and 84 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. According to the test design, BNT scores for healthy young to middle-aged samples are not expected to be normally distributed. It should be noted that the mean age for the aggregate sample is 67.91 (15.26) years. As reflected in the scatterplot of the BNT data around the regression line, the majority of the studies available for review contained data for older age groups. The mean score for the aggregate sample is 52.25 (3.26), reflecting an age-related decline from the optimal performance expected in younger samples. Thus, the distribution of scores in our sample is more normal than expected in younger samples due to variability in both directions from the mean, avoiding scores being skewed due to a ceiling effect. A quadratic regression of SDs on age yielded an R² of 0.583, indicating an increase in variability with advancing age, consistent with the literature. Predicted SDs, based on this model, are reported. Examination of the effects of demographic variables on BNT scores indicated that education and IQ did not contribute to test scores in the data available for analyses. The difference in mean scores for two genders across six studies reporting scores for males and females separately was 3.450 in favor of males, which was not statistically significant.
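The type of model described above can be illustrated with a minimal sketch of an unweighted quadratic least-squares fit of mean test scores on age. The function names (`fit_quadratic`, `predict`, `r_squared`) and the centering of age are assumptions made for this illustration only; the procedures actually used for the meta-analyses are described in Chapter 3 and may differ (for example, in how data points are weighted). No values from the appendix tables are reproduced here.

```python
from typing import NamedTuple, Sequence


class QuadraticModel(NamedTuple):
    """score = b0 + b1*(age - center) + b2*(age - center)**2 (hypothetical container)."""
    center: float
    b0: float
    b1: float
    b2: float


def fit_quadratic(ages: Sequence[float], scores: Sequence[float]) -> QuadraticModel:
    """Unweighted least-squares quadratic fit; needs at least three distinct ages."""
    center = sum(ages) / len(ages)  # centering age keeps the equations well conditioned
    rows = [[1.0, a - center, (a - center) ** 2] for a in ages]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, scores))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(aug[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (aug[i][3] - tail) / aug[i][i]
    return QuadraticModel(center, coef[0], coef[1], coef[2])


def predict(model: QuadraticModel, age: float) -> float:
    """Predicted mean score at a given age."""
    z = age - model.center
    return model.b0 + model.b1 * z + model.b2 * z * z


def r_squared(model: QuadraticModel, ages: Sequence[float], scores: Sequence[float]) -> float:
    """Proportion of variance in the scores accounted for by the model."""
    mean_score = sum(scores) / len(scores)
    ss_res = sum((y - predict(model, a)) ** 2 for a, y in zip(ages, scores))
    ss_tot = sum((y - mean_score) ** 2 for y in scores)
    return 1.0 - ss_res / ss_tot
```

Applied to a set of study-level data points (mean age and mean score per group), `predict` would give an expected score for any age, and `r_squared` corresponds to the R² statistic reported for the model. Predictions outside the age range of the fitted data carry the same caution noted in the text.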
Strengths of the analyses 1. Total sample size of 1,684 participants. 2. R² of 0.850, indicating a good model fit. 3. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity. 4. Predicted scores and SDs presented in Appendix 10m parallel the normative data provided by Kaplan et al. (2001) in the BNT-II test booklet. Furthermore, predicted values based on the aggregate
sample support the curvilinear relationship between age and the test scores apparent in the normative data, indicating improvement in test performance up to the fourth decade of life, with a subsequent decline, which is considerably accelerated in the seventh and eighth decades. The advantage of the predictions presented in this book is that they are provided for 12 narrow age intervals spanning ages 25-84 in 5-year increments, whereas the normative data provided by the test authors are grouped in five wide age intervals spanning ages 18-79.
Limitations of the analyses 1. The number of studies that report data for older age ranges considerably exceeds that of studies concerned with younger age ranges. Thus, the mean age for the aggregate sample is 67.91 (15.26) years, and scores are more normally distributed than would be expected in case of more even representation of age groups. 2. A strong effect of education, intellectual level, and fund of vocabulary on BNT performance has been reported in the literature. However, our data did not support this association, which is likely due to a restricted range of education (11-16.6 years) and IQ (113.8-119.9) in the data available for analyses. We did not have sufficient data to explore the effect of verbal vocabulary. Mean educational level of the aggregate sample is 13.79 (1.50) years, and mean IQ is 116.10 (2.60). (IQ is available for only six data points.) Higher scores on the BNT are associated with increased educational and IQ levels. Therefore, the predicted values are likely to overestimate expected performance for individuals with lower educational levels and/or average and lower than average intellectual levels. 3. Superiority of males in BNT performance has been reported in several studies. Our data are consistent with this finding in that a difference in
mean scores of 3.450 in favor of males was found, which, however, was not statistically significant. Only six data points for each gender were available for analyses.
CONCLUSIONS A review of the literature suggests that confrontation naming ability is affected by many factors that need to be considered in interpreting BNT performance. Inspection of the data sets suggests that educational level is as important as, if not more important than, age in BNT performance. This indicates that future normative studies need to present data by education groupings. Although the results of our meta-analysis did not reveal an association of BNT performance with educational or intelligence level, this is likely due to a restricted range of education and IQ in the aggregate sample used for the analyses. In addition, a majority of the studies were limited to older samples. Additional normative studies are needed on younger populations, especially to ascertain if the same relationships between BNT scores and educational level are found in younger groups. BNT performance is also directly affected by one's culturally determined linguistic background and accumulated vocabulary. Therefore, uncritical use of cutoff criteria might result in false-positive misclassification errors. To avoid unsubstantiated determination of a naming impairment, the BNT score needs to be interpreted within the context of a patient's linguistic background and cultural/educational exposure. In addition, qualitative analysis of performance (frequency of "don't know" responses, misperceptions, tip-of-the-tongue errors, readiness to give up, response latencies, etc.) could contribute to interpretation accuracy.
It should be noted that the distribution of performance scores for the BNT is far from normal. The performance of the majority of intact individuals falls at the upper range of the score distribution. As a result, this test does not discriminate well at the high-performance levels (e.g., the distinction between "well above average" and "superior" performance levels cannot be well defined). On the other hand, this test discriminates well at the lower range of performance. It is sensitive to identifying outliers whose performance falls below the expected range. However, it is important that the effect of low education and vocabulary be taken into consideration in clinical interpretation of low scores, as small SDs resulting from extreme kurtosis of the score distribution cause lower test scores consistent with limited vocabulary to fall within the impaired range. Because of the negative skew in the score distribution, investigators examining an effect of demographic variables on test performance should ensure adequate representation of participants with lower levels of education and intellectual level/vocabulary fund. It is the lower tail of the distribution that contributes most to the relationship of the test scores with demographic variables. Unrepresentative samples with scores clustered around the mean will wash out this relationship. The involvement of different information-processing mechanisms in confrontation naming has been extensively researched. Due to conflicting findings, however, further research in this area would enhance our understanding of the processes underlying naming ability and their relation to visual-perceptual/recognition ability. Consequently, the differential mechanisms determining age-related decline in naming ability vs. anomia associated with degenerative brain conditions would be further illuminated.
11 Verbal Fluency Test
BRIEF HISTORY OF THE TEST There are several types of task that measure verbal fluency (VF). Their historical roots stem from the Thurstone Word Fluency Test (Thurstone & Thurstone, 1962), whi
S, with 60 seconds for each letter (Bechtoldt et al., 1962; Borkowski et al., 1967). This version became part of the Neurosensory Center Examination for Aphasia (Benton, 1967; Spreen & Benton, 1969). Later, this test was included in the Multilingual Aphasia Examination Battery (Benton & Hamsher, 1978; Benton et al., 1994a) under a new name, Controlled Oral Word Association Test (COWA), to eliminate a potentially misleading reference to a fluent/nonfluent aphasia (see Ruff et al., 1996). The COWA is based on two sets of letters, CFL and PRW. Whereas the former stimuli (FAS) were chosen at random, the selection of letters for the COWA was based on analysis of word difficulty as determined by number of words in the English language that begin with that particular letter. As a result, the CFL and PRW versions of the COWA are of equal difficulty and can be used interchangeably. According to the analysis of letter difficulty (Borkowski et al., 1967), which was based on frequency of associations for 24 different letters, the letters F, A, S, C, P, and W were classified as "easy," whereas L and R fell in the category of "moderately difficult" letters. In spite of unequal difficulty levels for letters included in the FAS vs. CFL and PRW sets, a study of equivalency between the three letter sets conducted on 106 patients with various
neuropsychological dysfunctions yielded correlation coefficients of 0.87-0.94 for different samples (Lacy et al., 1996). The authors concluded that these intercorrelations even surpass correlations between CFL and PRW. In spite of such an optimistic view, the norms for the FAS test should be used with caution in application to the COWA sets (CFL and PRW) due to different levels of letter difficulty (Ruff et al., 1996). Other combinations of letters have been used in several studies. Cavalli et al. (1981) used P, F, and L in a study on lateralized deficits in linguistic processing. Nielsen et al. (1989) used S, N, and F on a large neurologically intact Danish sample. S and P were used by Barr and Brandt (1996) in a study on fluency deficits in dementia, Ganguli et al. (1991, 1993, 1996) in a study on cognitive impairment in an elderly rural population, Goldman et al. (1998) in a study examining cognitive deficits associated with Parkinson's disease, and Coffey et al. (2001) in a study exploring cognitive correlates of human brain aging. Lannoo and Vingerhoets (1997) used N, A, and K in a project collecting normative data for a Dutch version of the phonemic fluency test. Lopez-Carlos et al. (2003) used P, M, and R in their project, collecting normative data for monolingual Spanish-speaking individuals with a low level of education. The authors noted that illiterate individuals tend to make errors in the words that start with A and S, given that many words that start with a silent H begin with the A sound and words that start with a C sound like S. 
Other versions of the fluency tests involve generation of words from certain semantic categories (Category Naming), such as animal naming (Acevedo et al., 2000; Barr & Brandt, 1996; Beatty et al., 1997; Brady et al., 2001; Crossley et al., 1997; Epker et al., 1999; Fama et al., 2000; Ganguli et al., 1991, 1993; Giovannetti et al., 2003; Kempler et al., 1998; Kozora & Cullum, 1995; Lopez-Carlos et al., 2003; Monsch et al., 1992; Morris et al., 1989; Rosen, 1980; Rosselli et al., 2002a; Selnes et al., 1991; Simkins-Bullock et al., 1994; Tombaugh et al., 1999; Troyer, 2000; Ylikoski et al., 1993); types of transportation and parts of a car (Weingartner et al., 1984); items found in a
supermarket (Barr & Brandt, 1996; Kozora & Cullum, 1995; Monsch et al., 1992; Troyer, 2000); fruits and vegetables, foods, and things people drink (Acevedo et al., 2000; Miller, 2003; Monsch et al., 1992; Randolph et al., 1993; Simkins-Bullock et al., 1994); first names (Kozora & Cullum, 1995; Monsch et al., 1992); tools and clothing (Huff et al., 1986b); U.S. states (Kozora & Cullum, 1995); and inanimate objects (Fama et al., 2000). Fuld (1981) used category naming tasks, such as proper names of people (same gender as the examinee), foods, vegetables, things that make people happy, and things that make people sad, as distractor trials for delayed recall of the originally presented stimuli in her Fuld Object-Memory Evaluation (see also Marcopulos et al., 1997). Food, clothing, animals, and things to ride categories are included in the McCarthy Scales for Children's Abilities (McCarthy, 1972). A version of the VF task tapping semantic switching is included in the Delis-Kaplan Executive Functions System (Delis et al., 2001). This test assesses letter fluency (F, A, S), category fluency (animals and boys' names), and category switching (fruits and furniture). The test has standard and alternate forms. Two parallel forms of a semantic fluency test are included in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998). One version of a category naming test is the Set Test (Isaacs & Kennie, 1973), which involves generating items from four successive categories: colors, animals, fruits, and towns. According to this version, examinees are to recall up to 10 items from each category, after which they are instructed to shift to the next category. The score is the total number of items recalled for all categories.
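The Set Test scoring rule described above (up to 10 items in each of four categories, summed into a total) can be sketched as follows. The function name `set_test_score`, the dictionary input format, and the decision to count a repeated item only once are assumptions made for this illustration; they are not taken from the published test instructions.

```python
from typing import Dict, List

# Categories and per-category ceiling as described in the text.
SET_TEST_CATEGORIES = ("colors", "animals", "fruits", "towns")
MAX_ITEMS_PER_CATEGORY = 10


def set_test_score(responses: Dict[str, List[str]]) -> int:
    """Total number of items credited across the four categories.

    Assumption for this sketch: repeated items (ignoring case and
    surrounding spaces) are credited once, and blank entries are skipped.
    """
    total = 0
    for category in SET_TEST_CATEGORIES:
        unique_items = set()
        for item in responses.get(category, []):
            cleaned = item.strip().lower()
            if cleaned:
                unique_items.add(cleaned)
        # No more than 10 items are credited per category.
        total += min(len(unique_items), MAX_ITEMS_PER_CATEGORY)
    return total
```

Under these assumptions the maximum possible score is 40, that is, 10 items for each of the four categories.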
The versions proposed by Newcombe (1969), used in assessing patients with lateralized missile wounds, involved naming objects and animals and alternating between naming birds and colors over 1 minute for each of the three trials. The number of correctly generated items for the first two conditions and correct alternations for the third condition are recorded. Villardita et al. (1985) used a modification of the Set Test to assess a group of normal elderly, employing the categories proper names of
persons, foods, and animals over 1-minute trials. The score was the total number of items for all categories. For test administration instructions and further discussion of the VF tasks, see Lezak et al. (2004) and Spreen and Strauss (1998).
Psychometric Properties of the Test Analysis of internal consistency for the COWA version (CFL/PRW), reported by Ruff et al. (1996), revealed a high coefficient α (r = 0.83) for the three letters, indicating high test homogeneity. Interrater reliability is reported to be near perfect in several studies (Spreen & Strauss, 1998). Norris et al. (1995) reported an interrater reliability of r = 0.98. Test-retest reliability for different versions of this test is quite high (see Spreen & Strauss, 1998). Test-retest reliability reported by Ruff et al. (1996) for a 6-month retest with an alternate set of letters yielded a coefficient of r = 0.74. A gain of about three words on the retest was interpreted by the authors as a practice effect. Similarly, test-retest reliability reported by Dikmen et al. (1999) for an 11-month retest with an alternate set of letters was r = 0.72. The magnitude of practice effect in their study was 1.22 (7.85) words. Data on repeated administration are also presented by McCaffrey et al. (2000). For further information on the psychometric properties of the COWA and other versions of VF tests, see Franzen (2000), Lezak et al. (2004), and Spreen and Strauss (1998).
Cognitive Mechanisms Underlying Word Generation VF has been commonly viewed as a component of executive function, which is subserved by the prefrontal cortex. However, a number of studies suggest that cognitive mechanisms underlying efficient organization of verbal retrieval and recall in word generation are multidimensional and involve auditory attention, short-term memory (in keeping track of words already said), ability to initiate and maintain word production set, cognitive flexibility (in rapidly shifting from one word to the next within the selected category), response
inhibition capacity, speeded mental processing, and long-term vocabulary storage (Boone et al., 1998; Cauthen, 1978b; Crowe, 1998a; Estes, 1974; Lafleche & Albert, 1995; Martin & Fedio, 1983; Parks et al., 1992; Perlmuter et al., 1987; Ruff et al., 1997). Parkin and Lawrence (1994) further refined the notion of cognitive flexibility and suggested that word generation relies on spontaneous flexibility, rather than on reactive flexibility, as defined by Eslinger and Grattan (1993). Such multidimensionality of cognitive processes involved in word generation is reflected in the results of factor-analytic studies on large batteries of neuropsychological tests. High loadings of phonemic and semantic fluency tests on factors comprised of verbally mediated tasks, such as Vocabulary, Boston Naming Test, Similarities, and Digit Span, were reported by Lamar et al. (2002) and Ponton et al. (2000). On the other hand, Boone et al. (1998) demonstrated high loading of a phonemic fluency test on the factor representing speeded mental processing, along with Stroop and Digit Symbol. These findings confirm the premise that word generation depends on vocabulary storage and speeded mental processing in addition to executive functions. Use of vocabulary storage in word generation was explored by Crowe (1998a), who demonstrated a decrease in the rate of word generation with each 15-second increment of time, indicating storage of high-frequency words accessed during the early phases of the task performance, with a gradual exhaustion of this storage. A qualitative analysis of the behavioral aspect of word generation offers further insight into the strategy used in accessing lexical or semantic storage.
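The analysis of word production by 15-second increments mentioned above can be illustrated with a short sketch that tallies responses per interval of a 60-second trial. The function name `words_per_interval`, its parameters, and the use of response onset times in seconds are assumptions made for this illustration; they do not reproduce the procedure of any particular study.

```python
from typing import List, Sequence


def words_per_interval(
    times: Sequence[float],
    trial_seconds: float = 60.0,
    interval_seconds: float = 15.0,
) -> List[int]:
    """Count responses in each successive interval of the trial.

    `times` holds the onset time (in seconds from trial start) of each
    correct word. Times outside the trial window are ignored.
    """
    n_bins = int(trial_seconds // interval_seconds)
    counts = [0] * n_bins
    for t in times:
        if 0 <= t < n_bins * interval_seconds:
            counts[int(t // interval_seconds)] += 1
    return counts
```

A declining sequence of counts across the four intervals would correspond to the decrease in the rate of word generation over time described in the text.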
Several systems of qualitative scoring of VF output have been proposed (Raskin et al., 1992; Stuss et al., 1998; Troyer et al., 1997), which are based on the premise that word generation involves phonemic analysis or semantic categorization (clustering) and shifting from one subcategory to another (switching) (Gruenewald & Lockhead, 1980; Troyer et al., 1997). Whereas clustering is an automatic process that relies on memory storage for words, switching is an effortful process that requires speed and cognitive
flexibility, which determine the effectiveness of the search process (Troyer, 2000; Troyer et al., 1997). According to Abwender et al. (2001), switching can be divided into subtypes that reflect different cognitive processes in phonemic vs. semantic fluency. The criteria for clusters and switching are defined differently in different scoring systems. Qualitative scoring across different systems includes the following components: cluster size, number of switches, a pattern of word production in 15-second increments, and errors (perseverations, use of proper names, nonwords, and alternate forms of the same word). Psychometric properties of the Troyer et al. (1997) scoring system are evaluated by Ross (2003). Qualitative scoring was employed in an investigation of the effect of age on strategies used in word generation (Hughes & Bryan, 2002) as well as in studies exploring the cognitive mechanisms of VF deficits in different clinical groups (Beatty et al., 1997, 2002; Elvevag et al., 2002; Epker et al., 1999; Ho et al., 2002; Mayr, 2002; Rich et al., 1999; Troester et al., 1998; Troyer et al., 1998a,b; York et al., 2003). Of note, several studies challenge the specificity of clustering and switching variables and prompt further investigation in this area (Abwender et al., 2001; Demakis et al., 2003; Ho et al., 2002). Interactions between different cognitive mechanisms contributing to word generation are best understood in the context of Baddeley's (1986) model of the central executive component of working memory, which controls and regulates cognitive processing by activating appropriate information from long-term memory, selectively attending to relevant stimuli and filtering out irrelevant ones, switching between mental sets, and monitoring incoming stimuli (Baddeley, 1996). Rende et al. (2002) suggest that in addition to the central executive control that underlies both phonemic and category word generation, separate mechanisms are deployed in each of these tasks.
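The cluster size and number of switches components of qualitative scoring discussed earlier in this section can be sketched in simplified form. The sketch assumes that a scorer has already assigned each generated word to a subcategory label; a cluster is then taken to be a run of consecutive words sharing a label, and a switch is a transition between runs. These definitions, and the function name `cluster_switch_summary`, are assumptions made for this illustration. As noted in the text, published scoring systems define clusters and switches differently, so this is not the rule of any specific system.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def cluster_switch_summary(labels: Sequence[str]) -> Tuple[List[int], int, float]:
    """Summarize a response sequence given one subcategory label per word.

    Returns (run lengths, number of switches, mean run length).
    A run is a maximal stretch of consecutive words with the same label.
    """
    run_lengths = [len(list(group)) for _, group in groupby(labels)]
    # Each transition from one run to the next counts as one switch.
    switches = max(len(run_lengths) - 1, 0)
    mean_length = sum(run_lengths) / len(run_lengths) if run_lengths else 0.0
    return run_lengths, switches, mean_length
```

For example, a hypothetical animal-naming protocol labeled pets, pets, farm, farm, farm, pets, zoo would be summarized as four runs separated by three switches.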
Phonemic fluency relies on a phonological loop, which, according to Baddeley's model, is a temporary storage system for linguistic information. According to Abrahams (2000), short-term memory of phonological
information is involved in cueing word retrieval and keeping track of recent responses. In contrast, category fluency utilizes a visuospatial sketchpad, allowing manipulation of visuospatial information. Both the phonological loop and visuospatial sketchpad storage systems are passive slave systems to the central executive (Baddeley, 1986). Contributions of different cognitive mechanisms to word generation have also been demonstrated in the neural network model of phonemic VF proposed by Parks et al. (1992). This approach uses parallel distributed processing (PDP) models, which stem from earlier associationist (Hebbian) models and capture mental representations depicted as patterns of activity distributed across networks of simple processing units. The PDP model of VF is dependent on attentional biases based on reinforcement. Attention in this case is directed at stored memories of previous stimuli. Two reinforcement paths are identified: the selective positive reinforcement of words beginning with a given letter and the selective negative reinforcement of words that are excluded according to the rules of the task and previously produced words. An examination of a hierarchical relationship between features (individual letters) and categories in the case of phonemic word generation suggests that the categories are mutually exclusive and based on a single relevant feature (letter).
Biochemical and Anatomical Correlates and Effect of Brain Pathology on Verbal Fluency Recent advancements in neuroimaging and lesion studies provide validation to the empirically derived theories that point to an interplay of different cognitive mechanisms contributing to word generation. Functional neuroimaging techniques that were used in the studies summarized below to image brain activation in response to silent or overt word generation include proton magnetic resonance spectroscopy (1H-MRS), functional magnetic resonance imaging (fMRI), regional cerebral blood flow (rCBF), positron emission tomography (PET), and single-photon emission computed tomography (SPECT).
Jung et al. (1999) found word generation, in comparison to other neuropsychological tests, to be most closely related to concentrations of the neurometabolite N-acetylaspartate (NAA) in their study examining biochemical markers of cognition measured by 1H-MRS in normal human brain. The authors related the association between variability in NAA levels and neuropsychological functioning to mitochondrial function of the neuron. A review of recent studies of brain areas involved in verbal fluency indicates that the left dorsolateral prefrontal cortex, inferior frontal cortex (Broca's area), and anterior cingulate gyrus are primarily responsible for both phonemic and semantic word generation (Audenaert et al., 2000; Baldo et al., 2001; Elfgren & Risberg, 1998; Elfgren et al., 1996; Fama et al., 2000; Frith et al., 1995; Gaillard et al., 2000; Leggio et al., 2000; Levin et al., 2001; Parks et al., 1988; Phelps et al., 1997; Pihlajamaeki et al., 2000; Pujol et al., 1996; Ravnkilde et al., 2002; Stuss et al., 1998; Tucha et al., 1999; Warkentin & Passant, 1997; Schloesser et al., 1998). Ruff et al. (1997) proposed a distinction between two conditions that cause a reduction in verbal fluency: (1) low word fluency secondary to deficient verbal attention, word knowledge, and/or verbal long-term memory, which are due to dysfunction of diffuse multifocal or nonfrontal brain areas; (2) low word fluency without the deficits listed above, which is likely to be associated with prefrontal lobe impairment. Further refinement of the role of the left frontal cortex in word generation comes from Elfgren and Risberg (1998), who, based on the results of an rCBF study, suggested that the left frontal cortex is engaged in the generation of internally driven responses, which is a major cognitive component of word generation. Piatt et al. (1999a,b) found that action fluency was sensitive to frontostriatal pathophysiology. 
More extensive involvement of the left hemisphere in word generation, to include temporoparietal regions, has been proposed in several studies (N'Kaoua et al., 2001; Pihlajamaeki et al., 2000; Schloesser et al., 1998; Stuss et al., 1998). Ravnkilde et al. (2002) hypothesized that the left prefrontal cortex may be involved in the initiating component of
the verbal fluency performance and, possibly, in the retrieval of words that are freely associated in the temporal cortex, whereas the anterior cingulate gyrus may be responsible for directing the temporal cortex as to which associations should be attended to or suppressed. Several studies have suggested involvement of both left and right frontal lobes in word generation (Fama et al., 2000; Parks et al., 1988; Philpot et al., 1993). Elfgren and Risberg (1998) and Stuss et al. (1998) hypothesized that areas activated in the course of word generation may reflect differences in cognitive strategy. Furthermore, Crosson et al. (2003) proposed bilateral involvement of basal ganglia in word generation, where the left basal ganglia-ventral anterior thalamic loop is involved in retrieval of words from preexisting lexical stores, whereas the right basal ganglia activity serves to suppress activity of right frontal structures, preventing them from interfering with language production. Martin et al. (2000) found that verbal fluency impairment might be secondary to an epileptogenic lesion in the left or right anterior temporal lobes due to disruption of distal extratemporal regions, particularly the dorsal lateral prefrontal cortex. This finding is consistent with the nociferous cortex hypothesis and is supported by improvement in verbal fluency after anterior temporal lobectomy. Levin et al. (2001) investigated the effect of closed head injury severity, frontal brain lesions, and age at injury in a longitudinal study of children. Although verbal fluency recovery after severe head injury was slower in younger compared to older children, left frontal lesions had a more adverse effect on verbal fluency in older children. This dissociation was interpreted by the authors as a reflection of the more established functional commitment of the left frontal region to expressive language in older children.
This finding is consistent with the results of Gaillard et al.'s (2000) fMRI investigation comparing brain activation in children and adults in response to a phonemic task. Cortical activation in younger children was wider than in adults and involved the right inferior frontal gyrus in addition to the left inferior frontal cortex. However,
predominantly left frontal localization of activation appeared to be established by middle childhood. A number of studies suggest that in addition to the left frontal structures commonly viewed as neural bases for word generation, phonemic vs. semantic fluency might involve different neural systems. Right (or bilateral) cerebellum has been found to participate in phonemic, but not in semantic, processing (Leggio et al., 2000; Ravnkilde et al., 2002; Schloesser et al., 1998), whereas hippocampi were found to play a role in semantic, but not in phonemic, fluency performance (Gleissner & Elger, 2001). Similarly, Pihlajamaeki et al. (2000) suggested that the left medial temporal lobe (hippocampal formation or posterior parahippocampal gyrus) is activated in retrieval by category. Stuss et al. (1998) suggested that in addition to the left hemisphere centers participating in phonemic processing, semantic processing involves right dorsolateral and inferior medial regions. This view is supported by N'Kaoua et al.'s (2001) finding that phonemic processing involves the left temporal lobe, whereas semantic processing involves the left and right temporal lobes. Differential rates of deterioration in phonemic vs. semantic fluency in dementia support the notion that these two types of fluency are subserved by different neural mechanisms. Animal naming has been shown in some studies to be performed at higher levels than word-generation tasks (Ober et al., 1986; Rosen, 1980). Similar findings are reported with reference to other semantic category naming tests (e.g., fruits: Ober et al., 1986; Randolph et al., 1993). In contrast, Bayles et al. (1989), Monsch et al. (1994), and Sherman and Massman (1999) did not find differences in efficiency of word generation for semantic vs. phonemic tasks. Furthermore, greater impairment of semantic fluency in comparison to phonemic fluency tasks in clinical samples was documented by Barr and Brandt (1996), Butters et al. (1987), Cahn et al.
(1995), Cerhan et al. (2002), Crossley et al. (1997), Mickanin et al. (1994), Monsch et al. (1992), Rosser and Hodges (1994), and other authors. This pattern of greater deterioration in semantic fluency, which, according to Butters
et al. (1987), is based on the ability to access and retrieve semantic knowledge, than in phonemic fluency, which is based on phonological/lexical retrieval mechanisms, is viewed as being mostly due to disruption in the structure of semantic memory early in the course of dementia. Coen et al. (1996) related the distinction in the efficiency of phonemic vs. semantic fluency to the rate of cognitive decline. The authors showed that shorter duration of illness in their sample of patients suffering from dementia of Alzheimer's type (DAT) is associated with greater impairment in letter fluency, whereas longer duration resulted in predominance of the category fluency impairment. Considering that there was no difference between the groups in dementia severity, these findings were viewed as a function of the rate of disease progression, with greater impairment in phonemic fluency being associated with more rapid cognitive decline. Effects of different types of brain pathology, including brain injuries, aphasias, some amnesiac conditions, and degenerative dementing conditions, on verbal production are addressed in a number of studies, many of which provide information allowing comparison of word generation in clinical and control groups (Barr & Brandt, 1996; Carew et al., 1997; Cerhan et al., 2002; Clark et al., 1997; Coen et al., 1996; Dalrymple-Alford et al., 1994; Elvevag et al., 2001; Eslinger et al., 1984; Geffen et al., 1993; Goethe et al., 1989; Goldman et al., 1998; Gurd, 2000; Huff et al., 1986b; Joyce et al., 1996; Klimczak et al., 1997; Lafleche and Albert, 1995; Locascio et al., 1995; Margolin et al., 1990; Miller, 1985; Piatt et al., 1999a; Poreh et al., 1995; Robert et al., 1998; Shoqeirat et al., 1990; Zec et al., 1999).
Assessment of Verbal Fluency in Different Languages
Spanish versions of the verbal fluency test are available, which are based on different sets of letters (Artiola i Fortuny et al., 1999; Lopez-Carlos et al., 2003; Rey & Benton, 1991). Artiola i Fortuny et al. (1999) used letters P, M, and R to assess phonemic fluency as part
of a standardized and validated battery of neuropsychological tests culturally adapted for Spanish-speaking individuals. Lopez-Carlos et al. (2003) provided data for monolingual Spanish-speaking participants with ~10 years of education for the PMR version (see study VF.43, below). Ponton et al. (1996) provided normative data for Spanish-speaking participants for an FAS version of the test (see study VF.17, below). Acevedo et al. (2000) provided comparative data for English- and Spanish-speaking elderly for category fluency (see study VF.36, below). Rosselli et al. (2000, 2002a) compared oral fluency strategies in FAS and animal word generation of Spanish and English monolingual participants with bilingual participants in both languages (see study VF.40, below). La Rue et al. (1999) provided normative data for verbal fluency, measured as generation of same-sex first names, for a sample of Hispanic older adults (65-97 years of age), stratified by two age levels and four educational levels. Gollan et al. (2002) provided comparative data for Spanish/English bilingual and English monolingual participants on 12 semantic, 10 letter, and two proper name fluency categories. Word fluency for all categories was lower for bilingual participants, with especially pronounced differences in semantic categories. Bilinguals' fluency performance did not improve when words in both languages were used. The authors discussed mechanisms of the bilingual lexical system. Benito-Cuadrado et al. (2002) provided age- and education-adjusted normative data for the Animal Naming test collected on a sample of 445 participants between 18 and 92 years of age in Barcelona, Spain. Dellatolas et al. (2003) explored the effect of illiteracy on cognition in 97 normal Brazilian illiterate adults and 41 schoolchildren 7-8 years old. The authors provided normative data for semantic fluency (animals, clothes) and letter fluency (P, F, M). 
Psychometric properties and clinical utility of Hebrew versions of phonemic and semantic fluency tests are discussed by Axelrod et al. (2001). Performance of Chinese native speakers residing in Hong Kong, ranging in age 7-95 years,
LANGUAGE
on the animal and transportation versions of the category fluency test is reported by Chan and Poon (1999). Normative data for the Animal Naming and a combined category of Fruits and Vegetables for Cantonese-speaking Hong Kong Chinese adolescents and adults are provided by Lee et al. (2002). Chiu et al. (1997) assessed the validity of the modified Fuld Verbal Fluency Test as a screening instrument for differential diagnosis of dementia on a sample of 53 normal and 56 demented Chinese (Hong Kong) participants, aged 59-97 years. Scores for all semantic categories (animals, fruits, vegetables) differentiated well between normal and demented groups. To improve test-retest reliability, the authors recommended use of a composite measure. Flemish normative data for two category fluency trials, animals and professions, and for a Dutch version of the phonemic fluency test, which used letters N, A, and K, are reported by Lannoo and Vingerhoets (1997). Rodriguez-Aranda (2003) reported normative data for a Norwegian version of the oral fluency (letters F, A, S; categories Animals, Fruits, and Professions) and written fluency (letters S, K; categories Vegetables, Sports, and Farm Animals) tests. The data are stratified by age group.
RELATIONSHIP BETWEEN VFT PERFORMANCE AND DEMOGRAPHIC FACTORS
The original normative data for the CVFT (based on the F, A, S letter set) presented in the Neurosensory Center Examination for Aphasia manual (Spreen & Benton, 1969) were based on a rural sample with low educational background and, therefore, have limited clinical relevance (Spreen & Strauss, 1991). Norms compiled more recently on demographically advanced samples yield consistently higher scores across different studies. Benton and Hamsher's (1978) manual for the Multilingual Aphasia Examination (based on C, F, L and P, R, W letter sets) provides corrections for age, gender, and education, implicating effects of these variables on test
performance. (See Lezak, 1995, and Lezak et al., 2004, for the correction table and the percentile rank table.) A decline in verbal fluency with age was documented by Acevedo et al. (2000), Brady et al. (2001), Furry and Baltes (1973), Kempler et al. (1998), Norris et al. (1995), Rodriguez-Aranda (2003), Stuss et al. (1998), and Wiederholt et al. (1993). Schaie and Parham (1977) and Schaie and Strother (1968b) reported decline associated with advancing age in cross-sequential comparisons. Benton et al. (1981) reported a decline in verbal fluency only after age 80 based on a sample of participants 65-84 years old. Parkin and Lawrence (1994) identified significant decline with age only in older cohorts with low educational level. Parkin and Java (1999) reported age-related decline in semantic, but not in phonemic, fluency. Similarly, Troyer (2000) found age to be more strongly related to semantic fluency than to phonemic fluency. Chan and Poon (1999) reported an increase in category fluency from childhood to adulthood, with peak performance in adults 19-30 years of age and a subsequent decline with advancing age. Loonstra et al. (2001) presented aggregate statistics for phonemic fluency suggestive of a progressive age-related decline. In contrast, Anstey et al. (2000), Axelrod and Henry (1992), Bolla et al. (1990), Boone (1999), Boone et al. (1990), Cauthen (1978b), Daigneault et al. (1992), Mittenberg et al. (1989), Ponton et al. (1996), Ruff et al. (1996), Selnes et al. (1991), and Tomer and Levin (1993) did not find age-related differences in VF performance. Relative stability in VF performance with advancing age was also reported by Miller (1984), Perlmutter et al. (1987), and Yeudall et al. (1987). Hultsch et al. (1993) demonstrated that, after controlling for self-reported health status and activity levels, age no longer significantly contributed to the variance in word generation. Similarly, Coffey et al. 
(2001) did not find a relationship between verbal fluency and age-related changes evident on MRI in a sample of 320 nonclinical volunteers aged 66-90. This conclusion is consistent with their review of the relevant neuroimaging literature. The discrepancy between different studies can be explained by numerous confounding
factors. Schaie (1983) suggested that a requirement for motor response is a factor contributing to age-related decline. Norris et al. (1995) hypothesized that the discrepancy in the findings across different studies regarding the effect of age on VF performance might be due to several factors: (1) sampling differences, with selection of high IQ participants reducing a correlation that might be observed in a broader sample; (2) use of clinical samples, which may mask the unique contribution of age because it is attenuated by the effect of brain damage; (3) cohort effects, with smallest VF performance differences among middle-aged adults to young-old adults and greater differences across older samples; (4) diversity in the elderly populations from which participants are drawn (community vs. institutionalized elderly). Acevedo et al. (2000), Anstey et al. (2000), Axelrod et al. (2001), Chan and Poon (1999), Ivnik et al. (1996), Kempler et al. (1998), Lannoo and Vingerhoets (1997), Loonstra et al. (2001), Norris et al. (1995), Ponton et al. (1996), and Troyer (2000) found a significant effect of education on VF performance; however, in the studies by Axelrod and Henry (1992), Bolla et al. (1990), and Stuss et al. (1998), education was not significantly related to efficiency of word generation. Ruff et al. (1996) suggested that gender moderated the effect of education. Benito-Cuadrado et al. (2002) noted a significant effect of age and education on semantic fluency in a Spanish sample. Based on regression analysis, they developed a predictive function for the semantic fluency score: F(x) = 23.89 + age(-0.144) + education(0.39). Tombaugh et al. (1999) found, in a large sample of cognitively intact individuals aged 16-95 years, that phonemic fluency was more sensitive to the effects of education than age (18.6% and 11.0% of variance, respectively), whereas semantic fluency was more affected by age than education (23.4% and 13.6% of variance, respectively). 
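To make the predictive function reported by Benito-Cuadrado et al. (2002) concrete, the following minimal Python sketch evaluates it for a single examinee. The function names, the assumption that age and education are both entered in years, and the example values are illustrative only and are not taken from the original study.

```python
def predicted_semantic_fluency(age: float, education: float) -> float:
    """Predicted semantic fluency (Animal Naming) score.

    Evaluates F(x) = 23.89 + age(-0.144) + education(0.39) as quoted in the text.
    Assumption: both predictors are expressed in years.
    """
    intercept = 23.89
    age_weight = -0.144
    education_weight = 0.39
    return intercept + age_weight * age + education_weight * education


def deviation_from_prediction(observed: float, age: float, education: float) -> float:
    """Observed minus predicted score; a negative value is below expectation."""
    return observed - predicted_semantic_fluency(age, education)


if __name__ == "__main__":
    # Hypothetical examinee, not drawn from any table in this chapter.
    print(round(predicted_semantic_fluency(age=70, education=12), 2))
    print(round(deviation_from_prediction(observed=15, age=70, education=12), 2))
```

The sketch only reproduces the arithmetic of the quoted equation; how a deviation from the predicted value should be interpreted clinically depends on the error term reported in the source study, which is not given here.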
A positive relationship between verbal intelligence and verbal fluency was documented by Bolla et al. (1990), Boone (1999), Cauthen (1978b, for ≥60-year-old subgroup only),
Borkowski et al. (1967), and Miller (1984). In contrast, Axelrod and Henry (1992) did not find an effect of verbal intelligence on VF performance. Bolla et al. (1990) and Boone (1999) suggest that the effect of education is mediated by verbal intelligence. Bolla et al. (1990) concluded that data based on verbal intelligence rather than on education would be more accurate in differentiating between normal and abnormal performance. An effect of gender, with superiority of female performance, was documented by Acevedo et al. (2000), Bolla et al. (1990), Gaddes and Crockett (1975), Loonstra et al. (2001), Ruff et al. (1996), and Veroff (1980). However, no gender differences in verbal fluency in normal and clinical samples were found by Boone (1999), Cauthen (1978b), Chan and Poon (1999), Ponton et al. (1996), Ripich et al. (1995), Saxton et al. (2000), Tombaugh et al. (1999), Troyer (2000), and Yeudall et al. (1987). Johnson-Selfridge et al. (1998) demonstrated differences in FAS and Animal Naming word generation among groups of white, black, and Hispanic male veterans based on a sample of 200 participants in each group (see VF.24, below). It should be pointed out that the income, level of education, and Wide Range Achievement Test-Revised (WRAT-R) Reading subtest scores were significantly related to verbal fluency in this study and, therefore, might be mitigating factors in this comparison. An effect of native language was noted by Kempler et al. (1998), who compared samples of Chinese, Hispanic, and Vietnamese participants, with Vietnamese-speaking participants demonstrating the highest rate of animal generation and Spanish-speaking, the lowest. The authors attributed this difference to the fact that in the Vietnamese language animal names are short (predominantly one syllable), while in Spanish they are longer than in any other language used in this study (two or three syllables). 
Rodriguez-Aranda (2003) reported a strong relationship between reading/handwriting speed and verbal fluency in a sample of 101 Norwegian participants 20-88 years of age.
In addition to demographic variables, functional status was found to contribute to the efficiency of VF performance, especially in geriatric samples. An effect of depression on VF performance is well documented (Boone et al., 1995; Caine, 1986; Norris et al., 1995). Effects of levels of physical and mental activity are documented by Craik et al. (1987). Neuropathological changes in the brain associated with cardiovascular disease and cerebrovascular risk factors also contribute to decline in VF efficiency (Boone et al., 1993b; Breteler et al., 1994). Brady et al. (2001) showed that the relationship between stroke risk and decline in semantic fluency was 80% as large as the relationship between age and fluency decline. Similarly, Kilander et al. (2000) reported a strong relationship between diastolic blood pressure at age 50 and performance on the FAS 20 years later in a population-based study conducted in Sweden on a sample of 502 men.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the VF normative reports, eight key criterion variables were deemed critical. The first six of these relate to subject variables, and the two remaining refer to procedural issues. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
As discussed previously, information regarding medical and psychiatric exclusion criteria is important. It is unclear if socioeconomic
status, occupation, ethnicity, native language, geographic recruitment region, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. In spite of the controversy in the literature regarding the effect of age on VF performance, accuracy of data interpretation is facilitated by using a narrow-range age group as a reference sample.
Reporting of Educational Levels
Given consistent evidence of effects of education on VF performance, normative data should be grouped by educational level.
Reporting of Intellectual Levels
Given consistent evidence of effects of intellectual level on VF performance, normative data should be grouped by IQ level.
Reporting of Gender Composition
Given the probable association between gender and VF performance in favor of females, information regarding gender composition should be reported for each subgroup, and preferably normative data should be presented by gender.
Procedural Variables
Description of Administration Procedures
Due to variability in administration procedures (see below), a detailed description of the procedures, including identification of the version of the test administered and restrictions in the types of word to be used, is desirable. This would allow one to select the most appropriate norms or to make corrections in interpretation of the data.
Data Reporting
Group means and standard deviations for the number of words generated for each condition and the total score for all letters or categories should be presented at minimum.
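As a hedged illustration of why group means and standard deviations are the minimum needed, the Python sketch below sums per-trial word counts into a total score and converts that total to a standard (z) score against a reference group. The z-score is the conventional statistic for this purpose rather than a procedure prescribed in this chapter, and the function names and all numeric values are hypothetical.

```python
from typing import Sequence


def total_score(words_per_trial: Sequence[int]) -> int:
    """Total number of words generated across all trials (letters or categories)."""
    return sum(words_per_trial)


def z_score(raw_score: float, norm_mean: float, norm_sd: float) -> float:
    """Standard score of raw_score relative to a normative group mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw_score - norm_mean) / norm_sd


if __name__ == "__main__":
    # Hypothetical examinee and hypothetical normative values,
    # not taken from any table in the Appendices.
    raw = total_score([12, 9, 14])  # three 60-second letter trials
    print(raw, z_score(raw, norm_mean=40.0, norm_sd=10.0))
```

The reference mean and SD would be taken from the normative cell that best matches the examinee's demographic characteristics.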
SUMMARY OF THE STATUS OF THE NORMS
There are a number of studies exploring the efficiency of word generation in normal and clinical samples across various demographic groups and diagnostic categories. Considerable variability between studies obscures their comparability and includes the following aspects: time allotted for each category, type of semantic or phonemic category, relative difficulty of letters within a phonemic category, presence/absence of feedback on intrusion or repetition errors, instructions for item exclusion (inconsistency regarding exclusion of numbers), and inconsistent administration of an example with practice trial prior to the first test trial. Some authors do not specify which letters were used in their phonemic categories. Among all the studies available in the literature, we selected for review those based on well-defined samples and test versions. In the majority of studies reviewed below, the test scores represent a total number of words generated over three 60-second trials in the phonemic category or a number of words generated in one 60-second trial in the semantic category. Deviations from this format are identified in each table. In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 11. Table A11.1, the locator table, summarizes information provided in the studies described in this chapter.1
SUMMARIES OF THE STUDIES
[VF.1] Cauthen, 1978b (Table A11.2)
The author administered the Wechsler Adult Intelligence Scale (WAIS, Satz-Mogel short form) and the VF test as part of a large study in Canada aimed at normative data collection for age and IQ levels on neuropsychological
1 Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
tests. The VF test consisted of eight letters administered in the same order to all participants (S, G, U, N, F, T, J, P), following the standard procedure (i.e., 1-minute oral production). The sample was divided into two age groups: 20-59 and 60-94 years. The younger group consisted of 12 males and 39 females gathered from a variety of sources, with full-scale IQ (FSIQ) ranging 100-140 (M = 115.6, SD = 8.7). The older group included 28 males and 36 females, living primarily in institutional settings, with FSIQ ranging 80-140 (M = 111.5, SD = 13.1). Analysis of the results for the younger group did not indicate any relation between VF performance and FSIQ. Therefore, the VF data were presented for the total sample. In contrast, VF performance differed significantly for the older group across three FSIQ ranges: 80-106, 107-118, 119-140. The author hypothesized that speed of performance is a determining factor in the relationship between VF scores and IQ (specifically performance IQ [PIQ] scale). Further analysis suggested that IQ level did not interact with letter difficulty. No relationship between VF performance and age was evident. The author concluded that the use of these norms in the 20-59 age group is inappropriate for those with FSIQ <100.
Study strengths
1. Data are presented in age groupings.
2. Efficiency of word generation for eight letters was compared.
3. Data for the older group were stratified by FSIQ level.
4. Information provided regarding gender and geographic recruitment area.
5. Sample sizes for each age group are adequate, but individual cell sizes are small.
6. Test administration procedures were specified.
7. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Participants' level of education is not reported.
2. Younger participants' FSIQ was high.
3. Exclusion criteria were not specified.
4. Participants from the older group lived in institutional settings; however, the mean FSIQ was quite high. It is unclear from the description of the sample composition why they were institutionalized.
5. Data collected in Canada, which may limit usefulness for clinical interpretation in the United States.
6. Wide age ranges within each age grouping.
[VF.2] Yeudall, Fromm, Reddon, and Stefanyk, 1986 (Tables A11.3-A11.5)
The authors obtained VF data on 225 Canadian volunteers (127 males, 98 females) recruited from posted advertisements in workplaces and personal solicitations. Participants included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified average students in grades 10-12 for participation. Exclusion criteria were evidence of "forensic involvement," head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Experienced testing technicians gathered VF data and "motivated the participants to achieve maximum performance," partially through the promise of detailed explanations of their test performance. Data are presented for four age groups for males and females combined and separately. The data are reported for the FAS and for Written Word Fluency. Norms for the oral version only are reproduced in this book. No significant effects of gender or age were evident.
Study strengths
1. The sample size is large, with individual cells approximating 50.
2. The sample is stratified into four age groups.
3. Data are presented for males and females separately.
4. Data availability for a 15-20 year age group.
5. Adequate medical and psychiatric exclusion criteria.
6. Information regarding handedness, education, IQ, gender, occupation, recruitment procedures, and geographic area is provided.
7. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Both the written and oral versions of this test were administered, but the order of administration is not specified. Issues of practice effect are not addressed.
2. Education and intelligence level for the sample are high.
3. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
[VF.3] Gordon and Lee, 1986 (Table A11.6)
The authors studied the relationship between gonadotropins and visuospatial function. A word-generation test was given to experimental participants of a blood-monitoring study and to 250 control participants who were university students, drawn from the same population as the experimental participants. The standard administration procedure was used. The data for 90 males and 160 females from the control group are given in Table A11.6. Total scores for three trials are reported. The results suggest considerable gender differences in the rate of word generation, with superiority of females in both experimental and control groups.
Study strengths
1. Sample size is large.
2. Test administration procedures are specified but not test version.
3. Means and SDs for the test scores are reported.
4. Data are presented for males and females separately.
Considerations regarding use of the study
1. Participants are identified as university students. Age range for the experimental group is provided. Control participants are assumed to have similar demographic characteristics since they are drawn from the same population.
2. Subject selection criteria are not reported.
3. It is unclear which version of the test was administered.
4. No information on IQ is reported.
[VF.4] Bolla, Lindgren, Bonaccorsy, and Bleecker, 1990 (Table A11.7)
The authors examined the effect of demographic factors and influence of different cognitive processes on VF (FAS) performance in healthy elderly. Participants were 199 Caucasian volunteers, 80 men and 119 women, enrolled in the Johns Hopkins Teaching Nursing Home Study of Normal Aging, who were recruited through newspaper advertisements. Participants' ages ranged 39-89 (M = 64.3, SD = 13.5); education ranged 8-22 years, with a mean of 14.7 years (SD = 3). Rigorous exclusion criteria were used. The FAS version of the VF test was administered as part of a comprehensive neuropsychological battery. Standard instructions were used. Participants were instructed to exclude proper nouns. Series of numbers and proper nouns were not scored. To examine the effect of verbal intelligence on FAS performance, WAIS-R Vocabulary test scores were used in a regression analysis along with demographic variables. Verbal intelligence and gender accounted for a significant proportion of the variance in FAS performance. Age and education were not related significantly to performance. Therefore, the authors grouped their data by verbal intelligence for males and females separately. Based on their raw WAIS-R Vocabulary scores, participants were divided into three verbal intelligence groups: average (30-53), high (54-60), and superior (61-68).
Study strengths
1. The sample composition is well described in terms of age, gender, education, verbal IQ, geographic area, and recruitment procedures.
2. The data are presented for three verbal intelligence groups for males and females separately.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Education and intelligence level for the sample are high.
2. It is unclear whether participants were instructed to avoid numbers in the process of word generation.
3. Overall sample is adequate, but individual cells are relatively small.
4. Data were collected in Canada, and it is unclear if they are appropriate for use in the United States.
[VF.5] Selnes, Jacobson, Machado, Becker, Wesch, Miller, Visscher, and McArthur, 1991 (Table A11.8)
The investigation used participants from the Multi-Center AIDS Cohort Study (MACS). The article presents results of 733 seronegative homosexual and bisexual males for the purpose of establishing normative data for neuropsychological test performance based on a large sample. The majority of the sample consisted of Caucasian participants. Participants with a history of head injury with loss of consciousness >1 hour and who reported drinking ~21 drinks per week in the previous 6 months were excluded. Percent of African-American participants ranged 3.4%-4.1% for different age groups. Percent of left-handers ranged 11.3%-14.9%. The FAS version was administered according to standard instructions. The investigators also utilized an animal category task with a 1-minute interval.
Study strengths
1. The overall sample size is large, and individual cell sizes are large.
2. Normative data are stratified by age and education.
3. The demographic composition of the sample is described in terms of age, gender, sexual orientation, handedness, ethnicity, and geographic area; demographic composition is described for each age and education cell separately.
4. Means, SDs, as well as scores for percentiles 5 and 10 are presented.
5. Minimally adequate exclusion criteria.
Considerations regarding use of the study
1. All-male sample.
2. No information on IQ is reported.
3. High educational level of the sample.
[VF.6] Axelrod and Henry, 1992 (Table A11.9)
The authors compared the performance of 80 healthy, independently living individuals aged 50-89 on tests tapping executive functioning and WAIS-R (Satz-Mogel abbreviation). Participants were recruited from a university-related project and the community. Rigorous exclusion criteria were used. The FAS version of the VF test was administered according to standard criteria. Items excluded were proper names and variations of the same word. In addition, participants self-rated their health status on a 1-5 scale and reported the number of physician appointments in the past 12 months as an objective measure of health status. Verbal intelligence was measured with WAIS-R Vocabulary scores. No relationship was found between VF performance and intellectual competence, educational experience, or general health status.
Study strengths
1. Administration procedures are well outlined.
2. Sample composition is well described in terms of IQ, age, education, gender, and ethnicity.
3. Strict subject selection criteria were used.
4. Data are stratified by age group.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. The sample sizes for each age group are small.
2. High educational level of the sample.
[VF.7] Monsch, Bondi, Butters, Salmon, Katzman, and Thal, 1992 (Table A11.10)
The authors compared performance of 89 DAT patients and 53 demographically matched control participants on four measures of VF: category, letter, first names, and supermarket fluency. Control participants, who were 52-86 years of age and had 7-19 years of education, were recruited through newspaper advertisements or were spouses of patients. Participants with a history of alcoholism, other drug abuse, learning disability, and/or a serious neurological or psychiatric illness were excluded. The standard administration procedure was used for the FAS version of the test. The category fluency task included three trials, 60 seconds each, for animals, fruits, and vegetables. First names and supermarket fluency consisted of one 60-second trial for each condition. The reported scores were the total numbers of correct responses across all trials for each test and corresponding SDs. A comparison of performance for two groups across four measures of VF using receiver operating characteristic curves demonstrated superiority of category fluency in discriminating between DAT and normal participants, whereas letter fluency was the least accurate. The authors proposed that the superiority of category fluency is due to its dependence on the structure of semantic knowledge, which deteriorates early in the course of DAT. Analysis of gender differences for the control group indicated that females outperformed males on all measures, with significance levels ranging from p < 0.001 to p = 0.03.
Study strengths
1. The sample composition is described in terms of age, gender, and education.
2. Rigorous exclusion criteria.
3. Test administration procedures are well specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Overall sample size is adequate; however, the data are presented across a wide range of age and education.
2. The authors reported statistically significant gender differences; however, the data are not grouped by gender.
3. No information on IQ is reported.
[VF.8] Simkins-Bullock, Brown, Greiffenstein, Malik, and McGillicuddy, 1994 (Table A11.11)
The study addresses the relative utility of various tests in investigating cognitive aspects of memory and executive functioning in patients with anterior communicating artery aneurysm and normal control participants. The control group consisted of 10 males and 9 females with no history of neurological or major psychiatric illness, mean age of 52.6 (15.6), education of 8-19 years, and WAIS-R FSIQ of 85-124. VF measures included semantic fluency (animals and fruits or vegetables) and FAS.
Study strengths
1. The sample composition is well described in terms of age, gender, education, and FSIQ.
2. Adequate exclusion criteria.
3. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Test administration procedures are not well identified. It is not clear whether the semantic fluency condition included two or three trials.
2. Recruitment procedures are not reported.
3. The sample is small, which does not allow partitioning of data into age groups.
[VF.9] Parkin and Lawrence, 1994 (Table A11.12)
The FAS word fluency test was administered to 22 elderly participants in a study investigating the relationship between frontal lobe function and age-related memory decline. The sample included 18 females and 4 males, with mean age of 71.9 (4.8), mean education of 9.4 (1.3), and National Adult Reading Test (NART) FSIQ of 106.1 (12.6), who were living independently and in good health, per self-report. Volunteers with diabetes or a history of neurological illness were excluded. All participants passed screening for dementia using the Blessed Scale. A standard administration procedure was used, with the exception that participants were not instructed to avoid numbers.
LANGUAGE
Study strengths
1. The sample is small. 2. The sample composition is well described in terms of age, gender, education, and estimated IQ. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Relatively low educational level of participants. Scarce data are available for this educational group.
Considerations regarding use of the study 1. Small sample size. 2. Recruitment procedures were not reported. 3. The data were obtained in the United Kingdom, which may limit their usefulness for clinical interpretation in the United States.
[VF.10] Friedman, Kenny, Jesberger, Choy, and Meltzer, 1995 (Table A11.13)
The FAS word fluency test was administered to 24 adult volunteers recruited by advertisement primarily from hospital staff to serve as part of a control group in a study of the relationship between smooth pursuit eye-tracking and cognitive performance in schizophrenia. The control group used in this arm of the study represents a subset of a larger control group described in an earlier publication by Friedman et al. (1991), which included 45 participants screened for health problems using a health questionnaire and a structureup. The score for the FAS is the total number of words generated over three 60-second conditions.
Study strengths
1. The sample composition is described in terms of age and recruitment procedures. 2. Rigorous exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small. 2. The sample for this arm of the study represents a subset of a larger sample, whose demographics were only cursorily described. 3. It is unclear whether the administration procedure required restrictions in the types of word to be used in the process of word generation. [VF.11] Kozora and Cullum, 1995 (Tables A11.14, A11.15)
The authors compared category and letter fluency in normal aging individuals. Participants were volunteers (n = 174) 50-89 years of age, who were recruited through local media announcements as part of an ongoing aging study. Participants were screened using a semistructured neuromedical interview. None of the participants selected for the study had a known history of substance use, major psychopathology, uncontrolled hypertension, other major medical illnesses, learning disability, or neurological disorder or was taking medications known to affect central nervous system (CNS) functioning. Participants with Mini-Mental State Exam (MMSE) score <24 were excluded. The sample was divided into four age groups by decade, which were equated for educational level, gender distribution, and verbal intellectual level (as measured by WAIS-R Vocabulary subtest). Five different VF tasks were administered as part of a larger study. Letter fluency was measured using the FAS task, with instructions not to use proper names, numbers, or the same word with different suffixes. Category fluency was measured with four tasks: (1) the supermarket item list from the Dementia Rating Scale (DRS; Mattis, 1988), which was administered according to standard instructions except that the total number of items
VERBAL FLUENCY TEST
generated in 1 minute was used as the total score; (2) animal naming; (3) state naming, listing U.S. states; and (4) first name generation, listing both male and female names. Performance within 1 minute was recorded for each task. Qualitative aspects of VF were assessed by calculating the hierarchical structure of words generated on the supermarket fluency task and by examining the frequency of perseverative responses and intrusion errors made on each task. The authors concluded that category fluency appears to be disproportionately reduced compared with letter fluency in normal aging, which would be consistent with some degradation of semantic memory systems.
Study strengths 1. Administration procedures are well outlined. 2. Sample composition is well described in terms of age, education, gender, and verbal intelligence (based on Vocabulary score). 3. Strict subject selection criteria were used. 4. Data are stratified by age group. 5. Means and SDs for the test scores are reported. 6. Sample sizes for each age grouping approximate 50.
Consideration regarding use of the study 1. High educational level of the sample. [VF.12] Norris, Blankenship-Reuter, Snow-Turek, and Finch, 1995 (Table A11.16)
The study addressed the effect of depression on cognitive deficits in the elderly. Participants were 54 community-living elderly, 35 institutionalized elderly, and 40 young adults who were paid or received course credit for their participation and whose first language was English. The first group included independently living individuals aged 60-86 (M = 73.1, SD = 6.1), who were solicited through ads and personal references. Participants comprising the second group were aged 62-89 years (M = 75.3, SD = 7.5), were living in institutional settings (skilled-care and intermediate-care facilities), and had MMSE scores ≥20. The third group included undergraduate students from a large southwestern university aged 18-28 years (M = 19.4, SD = 1.8). Participants were assessed with the FAS version of the VF test. The standard procedure was used. Participants were instructed to exclude proper names, numbers, and different extensions of the same word. The letter T was used as an example. Two scorers rated all protocols, with interrater reliability of r = 0.98. Depression was assessed with the Geriatric Depression Scale (GDS; Yesavage et al., 1983). Functional status was measured with the Functional Assessment Scale, which is a minor modification of functional scales developed by Lawton and Brody (1969). A hierarchical regression was used to examine the incremental effects of age, education, depression, and functional status on VF performance. Age alone explained 15.8% of the variance in VF scores, while age and education together accounted for 25.2% of the variance. Depression was associated with decreased scores on VF only in functionally independent adults. The role of this finding in differential diagnosis of depression in older adults is underscored by the authors.
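The incremental contribution of each predictor block in a hierarchical regression is the difference between successive cumulative variance figures. The following is a minimal sketch of that arithmetic using the two cumulative values reported for age and for age plus education; the function name and data layout are illustrative assumptions, not part of the original study.

```python
def incremental_r2(cumulative_r2):
    """Return the variance added at each step of a hierarchical regression.

    cumulative_r2: list of (label, cumulative R^2 in percent), in entry order.
    """
    steps = []
    previous = 0.0
    for label, r2 in cumulative_r2:
        # Variance newly explained at this step, rounded to one decimal place.
        steps.append((label, round(r2 - previous, 1)))
        previous = r2
    return steps


# Cumulative percentages of VF variance reported in the study summary.
reported = [("age", 15.8), ("age + education", 25.2)]
increments = incremental_r2(reported)
print(increments)
```

Under these figures, education adds roughly 9.4 percentage points of explained variance beyond age alone.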
Study strengths 1. Data for two older groups and a young group are presented. 2. Sample composition is well described in terms of age, native language, recruitment procedures, and education. 3. Sample sizes for each group approach 50. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. There is a considerable difference in educational level between the two elderly groups, which might partially account for the differences in VF performance. 2. Data are of limited clinical use due to overinclusive age ranges for each group. 3. Subject exclusion criteria were not specified.
4. No data on intellectual level or gender are provided. 5. High educational level of the two control (noninstitutionalized) groups.
[VF.13] Cahn, Salmon, Butters, Wiederholt, Corey-Bloom, Edelstein, and Barrett-Connor, 1995 (Table A11.17)
The study examines the accuracy of neuropsychological measures in detecting Dementia of the Alzheimer's Type (DAT) in a community-dwelling elderly sample. Participants are stable, upper middle-class, retired older adults who entered the Rancho Bernardo Study, surveying for heart disease risk factors, between 1972 and 1974. The initial sample included 5,052 adults 30-79 years old, who have been followed until the present. Participants over the age of 65 who returned for reexamination in 1988 and later and screened positive for cognitive impairment were seen in clinic for diagnostic purposes (n = 199). A matched control sample of 203 normal elderly participants who screened negative for cognitive impairment was randomly selected for comprehensive evaluation, which included neurological examination, neuropsychological assessment, standard medical history and examination, and, in some cases, CT scans of the brain. On the basis of the diagnostic evaluation, the group composition was re-assessed. The final sample of normal elderly included 238 participants (97 males, 141 females) with a mean age of 78.4 (6.8), mean education of 13.8 (2.6), and mean DRS score of 136.8 (5.4). The Letter Fluency Test was administered according to the procedures described by Borkowski et al. (1967). We infer that the FAS version of this test was administered. The Category Fluency Test was not described by the authors. In their earlier publication (Wiederholt et al., 1993), the authors used data for Animal Naming. However, the values for Category Fluency provided by the authors in Cahn et al. (1995) seem to be too high for one 60-second Animal Naming trial. Therefore, data for the Category Fluency Test will not be reproduced in this book. The tests were administered as part of a larger battery by a trained psychometrist who was blind to the participants' group assignment. Time to completion was reported for the entire sample. In addition, the authors provided optimal cutoff scores and sensitivity/specificity for the diagnosis of DAT: 90/83% for Category Fluency at a cutoff of 32 words and 76/69% for Letter Fluency at a cutoff of 31 words.
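Sensitivity and specificity at a cutoff score, as reported for the fluency measures in this study, can be computed as sketched below. This is an illustration only: the scores are hypothetical and are not data from the study, and treating a score equal to the cutoff as impaired is an assumption, since the summary does not state the convention.

```python
def sensitivity_specificity(patient_scores, control_scores, cutoff):
    """Compute sensitivity and specificity for a 'low score = impaired' test.

    Assumption: a score at or below the cutoff is classified as impaired.
    """
    true_positives = sum(1 for s in patient_scores if s <= cutoff)
    true_negatives = sum(1 for s in control_scores if s > cutoff)
    sensitivity = true_positives / len(patient_scores)
    specificity = true_negatives / len(control_scores)
    return sensitivity, specificity


# Hypothetical fluency totals for illustration only.
dat_scores = [18, 22, 25, 30, 35]
control_scores = [28, 33, 36, 40, 45, 47]
sens, spec = sensitivity_specificity(dat_scores, control_scores, cutoff=32)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff increases sensitivity at the expense of specificity, which is why an optimal cutoff is reported alongside both values.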
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, DRS score, geographic area, history of the project, and recruitment procedures. 3. Rigorous exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Sensitivity and specificity for optimal cutoff scores for the two parts of the test are reported.
Considerations regarding use of the study 1. The data are not partitioned by age group. 2. No information on IQ is reported.
[VF.14] Ivnik, Malec, Smith, Tangalos, and Petersen, 1996 (Table A11.18)
The study provides age-specific norms for the COWA obtained in Mayo's Older Americans Normative Studies (MOANS), which aim at obtaining normative data for elderly individuals on different neuropsychological tests. The total sample consisted of 746 cognitively normal volunteers over age 55, 743 of whom took the COWA. Mean MAYO FSIQ (which differs somewhat from standard WAIS-R FSIQ) for the whole sample was 106.2 (±14.0), and mean Mayo General Memory Index on the Wechsler Memory Scale-Revised (WMS-R) was 106.2 (±14.2). For a description of their samples, the authors refer to their earlier publications. Participants were independently functioning, community-dwelling persons who were recently examined by a physician and had no active neurological or psychiatric disorder with the potential to impact cognition. Age categorization used the midpoint interval technique. The raw score distribution
for each test at each midpoint age was "normalized" by assigning standard scores with a mean of 10 and SD of 3, based on actual percentile ranks. The authors provide tables of age-corrected norms for each age group (see below). The procedure for clinical application of these data is described in the original article (Ivnik et al., 1996) as follows: first select the table that corresponds to that person's age. Enter the table with the test's raw score; do not use corrected or final scores for tests that might present their own age- or education-adjustments. Select the appropriate column in the table for that test. The corresponding row in the leftmost column in each table provides the MOANS Age-Corrected Scaled Score . . . for your subject's raw score; the corresponding row in the rightmost column indicates the percentile range for that same score.
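The idea of "normalizing" a raw score distribution onto a scale with a mean of 10 and SD of 3 can be illustrated with a short sketch. It assumes a percentile rank is converted to a z score through the inverse normal distribution and then rescaled; the published MOANS tables assign scaled scores to percentile ranges, so the function below (a hypothetical name) shows the principle only and is not a substitute for those tables.

```python
from statistics import NormalDist


def scaled_score_from_percentile(percentile, mean=10, sd=3):
    """Map a percentile rank (0-100, exclusive) to a normalized scaled score.

    Assumption: the percentile is converted to a z score via the inverse
    standard normal CDF, then rescaled and rounded to the nearest integer.
    """
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100)
    return round(mean + sd * z)


# A score at the 50th percentile maps to the scale mean.
print(scaled_score_from_percentile(50))
```

Scores near the 16th percentile fall about one SD below the mean (a scaled score of about 7), and scores near the 98th percentile fall about two SDs above it (about 16).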
Further, linear regressions should be applied to the normalized, age-corrected MOANS scaled scores (A-MSS) derived from the tables to adjust the patient's score for education. Age- and education-corrected scores for the COWA (A&E-MSS) can be calculated as follows:

$$\text{A\&E-MSS}_{\text{COWA}} = K + (W_1 \times \text{A-MSS}_{\text{COWA}}) - (W_2 \times \text{Education})$$

where the following indices are specified for the COWA: $K = 3.50$, $W_1 = 1.16$, $W_2 = 0.40$. Education should enter the formula as years of formal schooling. The tables of scaled scores per age group provided by the authors should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1996). Therefore, they are not reproduced in this book. Interested readers are referred to the original article. Table A11.18 summarizes sample sizes for different demographic groups. A follow-up article by this group (Lucas et al., 1998) provides MOANS normative data for category fluency.
Study strengths 1. Information regarding age, education, gender, ethnicity, occupation, recruitment procedures, and geographic area is reported. 2. The data were stratified by age group based on the midpoint interval technique. 3. The innovative scoring system was well described. The authors developed new indices of performance. 4. The sample sizes for each group are large. 5. Restricted age range in each cell.
Considerations regarding use of the study 1. The measures proposed by the authors are quite complicated and might be difficult to use in clinical practice. 2. Participants with prior history of neurological, psychiatric, or chronic medical illnesses were included. 3. It is assumed that the authors used the CFL set of letters based on specification that the MAE COWA was used. However, due to frequent reporting of FAS as "COWA," this assumption is only tentative.
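The education adjustment for the MOANS COWA score can be worked through in a few lines. The sketch below uses the indices given in the text (K = 3.50, W1 = 1.16, W2 = 0.40); the function name and the example patient are hypothetical, and whether the result should be rounded is not stated in this summary, so it is left unrounded.

```python
# Indices specified for the COWA in the text.
K = 3.50
W1 = 1.16
W2 = 0.40


def age_education_corrected_mss(a_mss, education_years):
    """Return the age- and education-corrected MOANS scaled score (A&E-MSS).

    a_mss: age-corrected MOANS scaled score taken from the published tables.
    education_years: years of formal schooling.
    """
    return K + (W1 * a_mss) - (W2 * education_years)


# Hypothetical patient: A-MSS of 10 with 12 years of education.
example = age_education_corrected_mss(a_mss=10, education_years=12)
print(round(example, 2))
```

For this hypothetical patient the corrected score is about 10.3, since 3.50 + 11.6 - 4.8 = 10.3. With the same A-MSS, fewer years of education yield a higher corrected score.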
Other comments 1. The theoretical assumptions underlying this normative project have been presented in Ivnik et al. (1992a,b). 2. The authors cautioned that the validity of the MAYO indices depends heavily on the match of demographic features of the individual to the normative sample presented in this article. 3. Correlations of COWA with age and gender were -0.15 and 0.12, whereas correlation with education was 0.38. The authors underscore the effect of education on test scores.
[VF.15] Ruff, Light, Parker, and Levin, 1996 (Tables A11.19, A11.20)
The authors summarized the history of VF tasks. Their normative study was based on 360 native English-speaking normal volunteers 16-70 years of age and ranging in education 7-22 years, who resided in mostly
urban/suburban areas of California, Michigan, and the eastern seaboard. Participants with a positive history of psychiatric hospitalization, chronic polydrug abuse, or neurological disorders were excluded from the sample. The COWA (letters CFL and PRW) was administered as part of a comprehensive neuropsychological battery. The standard administration procedure was used. The investigators instructed participants to exclude proper names and same words with different endings. Numbers of correctly generated words and perseverative errors were recorded. Total numbers of words for the three letters are reported for three educational groups for males and females separately. Analyses revealed that age did not have a significant effect on word generation. Gender moderated the effect of education, and education alone accounted for 8% of total variance. The authors proposed correction factors computed for each cell in a gender-by-education group matrix. A table of percentile ranks and normalized T scores from the 360 participants is provided (Table A11.20). According to the design of the test, the three letters differ in terms of difficulty. The authors confirmed that the mean production for letters C (14.1, SD=4.15), F (13.3, SD=4.10), and L (12.7, SD=4.0) was significantly different. Analysis of errors (repetitions or perseverations) revealed an effect of age on the error rate, with those 16-24 years old perseverating at a much lower rate than those 25-79 years old. Based on their analysis of error rate, the authors proposed the following cutoff scores:
0 perseverations: intact performance (56% of the total sample)
1 perseveration: low average (26%)
2 perseverations: borderline (11%)
3 perseverations: deficient (5%)
≥4 perseverations: seriously deficient (2%)
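The perseveration cutoffs proposed by the authors amount to a simple lookup from error count to performance category. A minimal sketch follows; the function name is hypothetical, and the labels are those listed in the text.

```python
def classify_perseverations(count):
    """Map a number of perseverative errors to the proposed category."""
    if count < 0:
        raise ValueError("count must be non-negative")
    labels = {
        0: "intact",
        1: "low average",
        2: "borderline",
        3: "deficient",
    }
    # Four or more perseverations fall into the most severe category.
    return labels.get(count, "seriously deficient")


print(classify_perseverations(2))
```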
Analysis of internal consistency revealed a high coefficient α (r=0.83) for the three letters, indicating high test homogeneity. Test-retest reliability assessment was based on five or more randomly selected participants from each cell, resulting in a total of 120 participants, who were retested after a 6-month delay with the alternate version. The order of versions administered was the same for all participants. The results yielded an acceptably high test-retest reliability coefficient (r=0.74). However, a gain of about three words on the retest was interpreted by the authors as a practice effect. The authors pointed out that the raw scores for the FAS and COWA versions of the test are not comparable. However, percentiles or standard scores are comparable, based on a comparison with other studies. In a follow-up publication based on the same sample (Ruff et al., 1997), the authors provided rates of perseveration for different age groups and proposed that a reduction in word fluency is not only linked to left prefrontal damage but also can be determined by diffuse, multifocal, and nonfrontal damage.
Study strengths
1. The sample composition is well described in terms of geographic area, age, education, gender, and IQ. 2. Test instructions are given in detail. 3. The data are presented in an education-by-gender matrix. 4. Correction factor and T-score/percentile equivalents for the raw scores are provided. 5. Exclusion criteria are adequate. 6. The sample size is sufficiently large for the elaborate analyses conducted by the authors. The data cover a wide age span. 7. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The raw data for separate age groups are not presented. 2. Data for the retest are not provided. 3. High intellectual (WAIS-R FSIQ 11~ 111) and educational (14 years) levels. Other comments A raw score for a given individual must be education-adjusted according to Table A11.19. Then the percentile and T-score ranking can be obtained by comparing an education-adjusted score to Table A11.20.
[VF.16] Hoff, Riordan, Morris, Cestaro, Wieneke, Alpert, Wang, and Volkow, 1996 (Table A11.21)
The authors used the COWA in a study of the relationship of cocaine use to performance on neuropsychological tests tapping functions of frontal and temporal brain regions. Performance of crack cocaine users was compared to that of a control group, which consisted of 54 paid male volunteers with a mean age of 32.1 (9.7) years and mean education of 15.4 (2.4) years. The sample included 48 white, 4 black, and 2 Hispanic participants. Exclusion criteria were a history of medical, neurological, or psychiatric problems; more than moderate use of alcohol (12 oz/week); history of intravenous drug use; and self-reported history of learning disability (with enrollment in special education classes).
Study strengths 1. Sample size is >50. 2. The sample composition is described in terms of age, education, and ethnicity. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. It is unclear which version of the test was administered. 2. Wide age and education range. No information on IQ is reported. 3. Recruitment procedures were not reported. 4. Educational level for the sample is high.
[VF.17] Ponton, Satz, Herrera, Ortiz, Urrutia, Young, D'Elia, Furst, and Namerow, 1996 (Table A11.22)
The FAS version was administered to Spanish-speaking volunteers as part of a larger battery in a project designed to provide standardization of the Neuropsychological Screening Battery for Hispanics (NeSBHIS). Volunteers were recruited through fliers and advertisements in community centers of the greater Los Angeles area over a period of 2 years. Exclusion criteria were a history of neurological or psychiatric disorder, drug or alcohol abuse, and head trauma. Data for a sample of 300 participants with a median educational level of 10 years were analyzed. Participants ranged in age 16-75 years, with a mean of 38.4 (13.5) years. Education ranged 1-20 years, with a mean of 10.7 (5.1) years. Male to female ratio was 40%/60%. The average duration of residence in the United States was 16.4 (14.4) years. Seventy percent of the sample were monolingual Spanish-speaking, and 30% were bilingual. The proportion of the sample respective to their country of origin closely approximates the 1992 U.S. Census distribution. Correlations between Marin and Marin (1991) acculturation scale scores and neuropsychological variables are provided. The FAS test was administered in the participants' native language, Spanish. In the follow-up study on the factor structure of the NeSBHIS (Ponton et al., 2000), which extracted five factors, the FAS primarily loaded on the Language factor, with a varimax-rotated factor loading of 0.71.
Study strengths 1. Large overall sample, with acceptable sample size for most of the cells. 2. The sample composition is well described in terms of age, education, gender, acculturation information, geographic area, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data are partitioned by gender × age × education.
Considerations regarding use of the study 1. It is unclear whether the administration procedure required restrictions in the types of word to be used in the process of word generation. 2. No information on IQ is reported. 3. It is unclear which of the two educational groups included participants with 10 years of education.
[VF.18] Crossley, D'Arcy, and Rawson, 1997 (Table A11.23)
The authors compared performance on letter and category fluency in a sample of cognitively normal seniors (n=635) and in samples of DAT and vascular dementia patients participating in the Canadian Study of Health and Aging. The control sample included community-dwelling individuals who were screened for cognitive impairment using the Modified Mini-Mental State Examination (3MS). All participants were fluent in either English or French. A detailed overview of the study participants, methods, and findings is provided by the Canadian Study of Health and Aging Working Group (1994). Letter fluency was assessed with the FAS task, administered in three 60-second trials. Participants were instructed to avoid proper nouns and the same word with a different suffix. Category fluency was assessed with the animal name generation task, within a 60-second interval. The data are reported by age group, gender, and educational level. Study strengths 1. Administration procedures are well outlined. 2. Sample composition is well described in the previous reports. 3. Subject selection criteria are outlined. 4. Data are stratified by age group, gender, and education. 5. Means and SDs for the test scores are reported. 6. Sample sizes for each demographic grouping are very large. Considerations regarding use of the study 1. Data were collected in Canada and, therefore, might be of limited use in the United States. 2. It is unknown to what extent having some data collected in French impacted the overall results.
[VF.19] Beatty, Testa, English, and Winn, 1997 (Table A11.24)
The authors used FAS and Animal Naming to investigate clustering and switching strategies
as determinants of hierarchical organization of semantic memory. Performance of an Alzheimer's group was compared to that of an elderly control group, which consisted of 38 volunteers: 18 males and 20 females. None of the participants had a history of major psychiatric or medical illness, drug or alcohol abuse, head injury, learning disability, or other neurological disease. Standard procedures for administration of the FAS and Animal Naming versions were used. Responses were recorded on audiotape and later analyzed. In the follow-up studies on VF mechanisms in Alzheimer's and Parkinson's diseases (Tröster et al., 1998; Piatt et al., 1999a), the authors apparently used the same control sample (at least in part). Therefore, the data from these articles will not be reproduced in this book. Study strengths 1. The sample composition is described in terms of age, gender, and education. 2. Rigorous exclusion criteria. 3. Administration procedure is well described. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is relatively small. 2. Recruitment procedures were not reported. 3. No information on IQ is reported.
[VF.20] Nyberg, Winocur, and Moscovitch, 1997 (Table A11.25)
The FAS word fluency test was administered as part of a test battery sensitive to medial-temporal and frontal lobe function in a study investigating age-related differences in the effect of lexical priming on memory. The sample included 39 healthy elderly participants who ranged in age 66-87 years, with a mean age of 77.3 years. Education ranged 8-22 years, with a mean of 13.6. Performance on the WAIS Vocabulary test was used as a screening measure. Study strengths 1. The sample composition is described in terms of age and education.
2. Test administration procedures are specified. 3. Means and SDs are reported for the FAS.
Considerations regarding use of the study 1. The sample is relatively small. 2. Exclusion criteria are not described. It is unclear which version of the WAIS was administered and what performance on Vocabulary served as a cutoff for inclusion into the study. 3. Recruitment procedures are not reported, and gender distribution is not specified. 4. It is unclear whether the administration procedure required restrictions in the types of word to be used in the process of word generation. 5. SDs for age and education are not reported. 6. The data were obtained on Canadian and/or Swedish participants, which may limit their usefulness for clinical interpretation in the United States.
[VF.21] Salthouse, Toth, Hancock, and Woodard, 1997 (Table A11.26)
The authors examined controlled and automatic processes underlying memory and attention using the process-dissociation procedure, as well as age-related influences on these processes. Participants were 115 healthy adults (47% male, 53% female) aged 18-78 years, who were recruited from appeals to groups and acquaintances. They were included in the study if they reported being in "reasonably good health," were not current students, and had at least 11 years of education. No other exclusion criteria are reported. Participants were administered a battery of neuropsychological tests in their homes. The data were stratified into three age groupings: 18-39 years [mean age = 29.0 (4.8); mean education = 15.5 (1.7)], 40-59 years [mean age = 49.1 (5.1); mean education = 15.2 (2.5)], and 60-78 years [mean age = 69.2 (5.1); mean education = 15.3 (2.6)]. Letters C, F, and L were used in the letter fluency test, with the constraint that none of the words should be proper nouns. Animal, furniture, and vegetable categories were used in the category fluency test. The data are reported for each trial separately.
Study strengths
1. Sample size is large. 2. The sample composition is well described in terms of age, education, gender, and various health indices. 3. Recruitment procedures are specified. 4. Data are partitioned into three age groups. 5. Test administration procedures are specified. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Exclusion criteria are not well identified. 2. High educational level for each age group.
[VF.22] Kempler, Teng, Dick, Taussig, and Davis, 1998 (Table A11.27)
The Animal Naming test was administered to 317 Chinese, Hispanic, and Vietnamese immigrants, speaking primarily their native language, and to white and African-American English speakers 54-99 years old. Participants generated animal names in their native language. The test was administered as part of a normative study for the Cross-Cultural Neuropsychological Battery. Volunteers who had a history of stroke, head injury, or psychiatric, speech, language, or memory problems, as reported on a self-rated health history questionnaire, were not included in the study. The standard administration procedure was used. The results indicated an inverse relationship of word fluency with age and a positive relationship with education. A pronounced effect of native language was also noted (see above).
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, ethnicity, and information on acculturation level for the immigrant groups. 3. Adequate exclusion criteria.
4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported, grouped by age, education, gender, and ethnicity.
Consideration regarding use of the study 1. No information on IQ is reported.
[VF.23] Stuss, Alexander, Hamer, Palumbo, Dempster, Binns, Levine, and Izukawa, 1998 (Table A11.28)
The study addresses the effect of brain lesion location and etiology on VF. The control group included 37 participants (19 males, 18 females) without neurological or psychiatric disorder, with mean age of 54.4 (14.4) years and mean education of 13.9 (2.3) years. Mean NART-estimated IQ was 113.8 (6.1). The letter fluency task (FAS) was administered according to Benton and Hamsher's (1978) instructions (numbers were not excluded according to the instructions). Semantic fluency was measured with the animal name generation task. Number of target words generated, different error types, and measures of clustering were recorded. Measures of VF correlated with age but not with education or NART IQ. Normative data for letter and semantic fluency tasks for three age groups (21-39, 40-64, 65-81 years) stratified by gender are provided. The authors reviewed the results in light of the relationship between different cognitive processes and brain regions.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and estimated IQ. 2. Adequate exclusion criteria. 3. Test administration procedures are described. 4. The data are stratified by three age groups and by gender. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small, and the data are inconsistent across age groups, with older females scoring considerably higher than younger females on the letter fluency task. 2. Recruitment procedures are not reported. 3. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
[VF.24] Johnson-Selfridge, Zalewski, and Aboudarham, 1998 (Table A11.29)
The authors examined the effect of ethnicity on word fluency, measured with the FAS and Animal Naming versions. The sample included white, black, and Hispanic male veterans, with 200 participants in each group, who were randomly drawn from a larger sample of 4,462 veterans participating in the Vietnam Experience Study. Hispanic participants were not differentiated by country of origin or primary language. However, the authors stated that <2% of the test results were considered questionable due to language or other problems. The recruitment procedures are not described, though a reference is made to earlier articles describing this study. It is unclear if any exclusion criteria were used. The sample includes participants with a history of motor vehicle accidents, loss of consciousness, seizures, medical conditions, and lifetime psychiatric disorders. However, the statistics provided by the authors suggest that the three ethnic groups did not differ significantly in the number of participants meeting criteria for various conditions. Participants ranged in age 31-46 years, with a mean of 37.9 (2.61) years, and in education 2-18 years, with a mean of 13.2 (2.25) years. The authors also provided information on handedness, place of service, income, and WRAT-R reading scores. The FAS and Animal Naming tests were administered according to standard instructions. Data are reported for each ethnic group separately in raw scores, scores adjusted for demographics, and T scores. Only raw scores and demographically adjusted scores are reproduced in this chapter.
The authors found that income, educational level, and WRAT-R Reading subtest scores were significantly related to test performance.
Differences between ethnic groups in VF were small and influenced by other variables, but they may have important clinical implications, according to the authors.
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, gender, handedness, ethnicity, income, WRAT-R Reading subtest scores, setting, place of military service, and number of participants with various medical and psychiatric conditions in each group.
3. Test administration procedures are specified.
4. Data for three ethnic groups are provided.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. It is unclear if any exclusion criteria were used.
2. Recruitment procedures are not reported, though a reference is made to earlier publications.
3. All-male sample of veterans.
4. No information on IQ is reported, though WRAT-R Reading subtest scores are provided.
5. The sample includes participants with a history of various medical (including neurological) and psychiatric conditions. However, the authors specified that the ethnic groups did not differ significantly in the number of participants meeting criteria for the various conditions. This lends support to the validity of the comparative analyses of test performance across the three groups in this study but raises concerns about the use of these data with nonveteran, medically and psychiatrically healthy individuals.

[VF.25] Dikmen, Heaton, Grant, and Temkin, 1999 (Table A11.30)
The FAS was used in a study on the psychometric properties of a broad range of neuropsychological measures based on a sample of 138 normal or neurologically stable adults, 81 of whom were tested twice. The participants in this group had no history of recent trauma and were friends of head-injured patients. Their mean age was 28.5 (12.2) years and mean education was 12.2 (1.9) years; 60% of the sample were males, and the test-retest interval was 11.1 (0.6) months. Participants were tested at the University of Washington under the direction of one of the authors. Twenty percent of the sample had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. The rest of the participants denied any history of conditions that might be expected to affect brain function. The other two groups used in this article do not have data for the FAS and therefore will not be described in this chapter. The mean WAIS FSIQ (Wechsler, 1955) on initial testing for the three groups combined was 108.8 (12.3).
The FAS was given according to standard instructions (Lezak, 1995). An alternate combination of letters (B, D, T) was used on the retest. The authors provide raw scores for performance at two time probes, as well as various measures of test-retest reliability and magnitude of practice effect. The test-retest reliability for the FAS over 11.1 months was r = 0.72.
Study strengths
1. Relatively large sample.
2. The sample composition is well described in terms of age, education, gender, IQ, geographic area, and setting.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
5. Information on test-retest reliability is provided.
Considerations regarding use of the study
1. Exclusion criteria are not clearly described. As the authors pointed out, 20% of the sample had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury.
2. The data are not partitioned by age group.

[VF.26] Manly, Jacobs, Sano, Bell, Merchant, Small, and Stern, 1999 (Table A11.31)
Category fluency tests were used in a study comparing neuropsychological test performance among nondemented literate and illiterate elders. The sample was selected from participants in a community-based epidemiological study of normal aging and dementia in the ethnically diverse neighborhoods of northern Manhattan, New York. The population from which the sample was drawn is comprised of individuals from several different countries, representing three ethnic categories: Hispanic, African American, and white. The sample was restricted to elders aged 65 and above with 0-3 years of formal education. Exclusion criteria were Parkinson's disease, stroke, or head injury. All participants were found to be nondemented by a neurologist. The final sample included 123 literate and 64 illiterate elders. Separate analyses were performed for the following groups:
1. Stratified random sample: Groups of education-matched literate and illiterate participants (n = 43 for each group) were created using a stratified random sampling method. The literate group had an average age of 76.2 (6.1) years and included 81% Hispanic, 9% African American, and 10% non-Hispanic white participants. The illiterate group had a mean age of 74.8 (5.7) years and included 91% Hispanic and 9% African American participants. Both groups were 74% female.
2. Uneducated sample: Data for literates (n = 26) and illiterates (n = 47) with no formal education were analyzed separately.
3. Stratified random Spanish-speaking sample: Stratified random sample of education-matched literate and illiterate elders (n = 32 for each group).
4. Uneducated Spanish-speaking sample: Uneducated literate (n = 17) and illiterate (n = 43) elders.
Three category fluency conditions were used (animals, food, and clothing), with standard administration procedures for the Boston Diagnostic Aphasia Examination (BDAE). The score represents the number of words averaged over the three conditions.
The authors concluded that category fluency is not affected by literacy status.

Study strengths
1. Large overall sample size.
2. The overall sample is described in terms of age, education, gender, ethnicity, geographic area, setting, recruitment procedures, and sampling methods.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
6. Data for illiterate and low-education samples are provided.

Considerations regarding use of the study
1. Demographic characteristics for three out of four groups are not provided.
2. The data are not partitioned by age group.
3. No information on IQ is reported. However, WAIS-R Similarities performance is reported.

[VF.27] Boone, 1999 (Table A11.32)
The chapter summarizes the results of a study on the effect of aging, demographic factors, and medical conditions on executive functions, which were presented in earlier publications (Boone et al., 1990, 1995). Participants are 155 healthy elderly volunteers (53 males, 102 females) aged 45-84 years, with a mean age of 63.07 (9.29), mean education of 14.57 (2.55), and mean FSIQ of 115.41 (14.11). All participants were fluent English speakers and were recruited through newspaper ads. Participants underwent physical and neurological examinations and psychiatric interviews. Rigorous exclusion criteria were used, including history of psychosis, major affective disorders, alcohol dependence, neurological disorders, and serious metabolic abnormalities. Frequency of vascular illnesses and intake of
cardiac and/or antihypertensive medications was recorded. The FAS version of the test was used. Normative data are stratified by IQ level (average, high average, superior) based on performance on the Satz-Mogel abbreviation of the WAIS-R. The results identified the FSIQ as the only significant predictor of FAS performance, responsible for 15% of test score variance, based on stepwise regression analysis.

Study strengths
1. The sample size is large.
2. Composition of the sample is well described in terms of IQ, age, fluency in English, education, gender, and recruitment procedures.
3. Rigorous exclusion criteria.
4. Normative data are stratified by IQ.
5. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Age and education for each of the three IQ groups are not provided.
2. Education and intelligence levels of the sample are high.
3. Data are not presented by age groupings.

[VF.28] Demakis, 1999 (Table A11.33)
The author used the COWA as part of a battery in a study of response consistency across a 3-week interval in an analog malingering design. Data are presented for control and dissimulation groups. All participants were students from undergraduate psychology courses at a small midwestern liberal arts college. The control group consisted of 21 participants with a mean age of 22.5 (7.99) years and mean education of 13.6 (1.46) years; 67% were female. Control participants were told that they were in a car accident but that they had not suffered any injuries and were instructed to perform to the best of their ability. Participants were retested 3 weeks after the initial testing. Control participants demonstrated a practice effect on the retest. Only data for the initial testing probe for the control group are reproduced in this book.
The standard administration procedure was used.

Study strengths
1. The sample composition is described in terms of age, education, gender, and geographic area.
2. Test administration procedure is specified.
3. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. The sample is small.
2. Exclusion criteria are not clearly described.
3. It is unclear which version of the test was administered.
4. Recruitment procedures were not reported.
5. No information on IQ is reported.

[VF.29] Epker, Lacritz, and Cullum, 1999 (Table A11.34)
The authors used FAS and Animal Naming in a study of the diagnostic utility of a qualitative scoring technique for fluency tasks in Alzheimer's and Parkinson's diseases. The control group included 65 elderly participants with a mean age of 70.6 (4.7) years, mean education of 14.3 (2.9) years, and a male/female ratio of 22/43, who participated in an investigation of cognitive function in aging. They were screened for health problems using a semistructured neuromedical interview. Participants did not have a known history of substance abuse, major mental illness, learning disability, neurological disease, or major psychopathology.
Standard administration procedures were used.

Study strengths
1. Relatively large sample.
2. The sample composition is well described in terms of age, education, gender, and MMSE score.
3. Adequate exclusion criteria.
4. Test administration procedures are well specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Recruitment procedures were not reported.
2. Educational level for the sample is high.
3. No information on IQ is reported.

[VF.30] Tombaugh, Kozak, and Rees, 1999 (Tables A11.35-A11.37)
The article provides normative data for FAS and Animal Naming stratified by three levels of age (16-59, 60-79, 80-95) and three levels of education (0-8, 9-12, 13-21), as well as for nine age groups, four education groups, and the two genders separately. The total sample included participants from two different studies. Participants were recruited through booths at shopping centers, social organizations, places of employment, psychology classes, and word of mouth. Volunteers with a known history of neurological disease, psychiatric illness, head injury, or stroke were excluded from the study. A subsample of participants were judged to be cognitively intact on the basis of history, clinical and neurological examination, and an extensive battery of neuropsychological tests. All participants stated that English was their first language.
The subset of the sample for the FAS test included 895 participants aged 16-95 years, with a mean age of 60.7 (19.9) years, and education ranging 0-21 years, with a mean education of 12.1 (3.2). The male-to-female ratio was 559/741.
The subset of the sample for the Animal Naming test included 735 participants aged 16-95 years, with a mean age of 67.0 (19.8) years, and education ranging 0-21 years, with a mean education of 11.4 (3.4). The male-to-female ratio was 310/425.
The standard administration procedures were used, with the exception that numbers were allowed on the FAS test. Mean numbers of words are presented for four education groups, nine age groups, and the two genders separately. Percentile scores and mean number of words are also presented in three age (16-59, 60-79, and 80-95) by three education (0-8, 9-12, and 13-21) cells.
FAS was found to be more sensitive to the effects of education than age. For Animal Naming, the relationship was opposite. Gender was not found to affect performance on either test.
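As a rough illustration of how a reader might locate the right age × education cell in tables stratified this way, the following Python sketch maps a patient's age and years of education to one of the nine cells described above. The function name `norm_cell` and the label format are hypothetical conveniences and are not part of the original article; only the band boundaries (ages 16-59, 60-79, 80-95; education 0-8, 9-12, 13-21) come from the study description. The sketch assumes age and education are given in whole years.

```python
# Hypothetical helper: find the age x education cell for the
# stratification bands described for [VF.30]. Only the band limits
# are taken from the text; everything else is illustrative.

AGE_BANDS = [(16, 59), (60, 79), (80, 95)]
EDUCATION_BANDS = [(0, 8), (9, 12), (13, 21)]


def _find_band(value, bands, label):
    """Return the (low, high) band containing value, or raise ValueError."""
    for low, high in bands:
        if low <= value <= high:
            return low, high
    raise ValueError(f"{label} {value} is outside the normative range")


def norm_cell(age, education):
    """Return a label such as 'age 60-79, education 9-12'."""
    age_low, age_high = _find_band(age, AGE_BANDS, "age")
    edu_low, edu_high = _find_band(education, EDUCATION_BANDS, "education")
    return f"age {age_low}-{age_high}, education {edu_low}-{edu_high}"


# Example: a 72-year-old with 10 years of education.
print(norm_cell(72, 10))  # age 60-79, education 9-12
```

Raising an error for values outside the published bands, rather than clamping them to the nearest cell, keeps the sketch from silently applying norms to patients the sample did not cover.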
Study strengths
1. Large sample.
2. The sample composition is well described in terms of age, education, gender, and recruitment procedures.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
6. Data are stratified by age, education, gender, and age × education.

Considerations regarding use of the study
1. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.
2. No information on IQ is reported.

[VF.31] Basso, Bornstein, and Lang, 1999 (Table A11.38)
The study examined the practice effect on repeated administration of several tests over a 12-month interval. The baseline sample consisted of 82 men recruited through newspaper advertisements, who were not paid for their participation. Fifty men out of this sample returned for the repeated testing 12 months later. The composition of the latter sample was 48 Caucasian, 1 African American, and 1 Hispanic, with a mean age of 32.5 (9.27) years, mean education of 14.98 (1.93) years, and mean FSIQ of 109.30 (12.29) at baseline. At each probe, participants were screened for neurological disease, head injury, learning disabilities, or other medical illnesses based on an informal interview. They were also screened for psychiatric disorders through a structured clinical interview. None was excluded based on these screens.
The FAS was administered according to standard procedures by thoroughly trained and supervised technicians. The authors compared FAS performance at baseline and on the retest using reliable change indices and concluded that FAS scores did not change on the retest.
The number of words generated on the FAS for the two probes, with age, gender, and education corrections applied, is reported for the entire sample.
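The exact reliable change computation used by the authors is not given in this summary. As a minimal sketch, assuming the commonly used Jacobson and Truax formulation (retest minus baseline, divided by the standard error of the difference), a reliable change index could be computed as follows. The function names are hypothetical, and the input values in the example are invented for illustration; they are not taken from Basso et al.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, test_retest_r):
    """Jacobson-Truax style RCI (an assumed formulation, not the authors').

    SEM    = SD * sqrt(1 - r)
    SEdiff = sqrt(2) * SEM
    RCI    = (retest - baseline) / SEdiff
    """
    if sd_baseline <= 0:
        raise ValueError("sd_baseline must be positive")
    if not 0.0 <= test_retest_r < 1.0:
        raise ValueError("test_retest_r must be in [0, 1)")
    sem = sd_baseline * math.sqrt(1.0 - test_retest_r)
    se_diff = math.sqrt(2.0) * sem
    return (retest - baseline) / se_diff


def is_reliable_change(rci, critical_z=1.96):
    """True when |RCI| exceeds the two-tailed critical value."""
    return abs(rci) > critical_z


# Illustrative values only (hypothetical, not from the study).
rci = reliable_change_index(baseline=40, retest=44,
                            sd_baseline=10, test_retest_r=0.75)
print(round(rci, 2), is_reliable_change(rci))  # 0.57 False
```

This simple form does not correct for practice effects; variants that subtract the mean practice gain from the observed difference are also in use.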
Study strengths
1. Adequate sample size.
2. The sample composition is described in terms of age, education, gender, ethnicity, FSIQ, and recruitment procedures.
3. Adequate exclusion criteria.
4. Test administration procedures are thoroughly described.
5. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. The data are not partitioned by age group.
2. Educational level for the sample is high.

[VF.32] Gladsjo, Schuman, Evans, Peavy, Miller, and Heaton, 1999 (Tables A11.39, A11.40)
The authors provided normative data and demographic corrections for age, education, and ethnicity, for letter and category fluency tasks, based on a sample of 768 normal adults aged 20-101 years, with education of 0-20 years; 55% are Caucasian and 45% African American; 52% are male. Mean age is 50.4 (19.4) years; mean education is 13.6 (3.1) years. The sample consists of volunteers who were enrolled as normal comparison participants in various clinical studies at the University of California San Diego. Caucasian participants were recruited through local media announcements and personal contacts. African-American participants were part of a federally funded study (African American Norms Project) and were recruited to match the census representation of African Americans within the larger San Diego area. Participants were screened with the Structured Clinical Interview for DSM-III-R or based on self-report of no past history of diagnosis or treatment for an Axis I disorder. Exclusion criteria were history of significant head trauma with loss of consciousness for >20 minutes or persisting neurological sequelae, neurological illness, conditions expected to affect neuropsychological test performance, psychotic disorder, other major psychiatric illness, current substance dependence or abuse within the last 6 months, or primary language other than English.
FAS and Animal Naming were administered. According to the FAS instructions, proper names and plurals were excluded. Total number of words generated for three FAS trials and for Animal Naming are reported for the sample stratified by three age groups (20-34, 35-49, 50-101 years) and three education groups (0-11, 12-15, 16-20 years). Data stratified by age are also presented for African Americans and Caucasians separately. In addition, multiple regression analyses were used to develop equations for demographic corrections. Tables for conversion of raw scores to demographically corrected T scores were provided by the authors. Raw scores for FAS and Animal Naming are reproduced in this chapter.
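The authors' regression-based conversion tables are not reproduced here. The general relationship between a raw score, a normative cell's mean and SD, and the resulting z and T scores can still be sketched as below. This uses the standard linear T-score definition (mean 50, SD 10) applied to a cell mean and SD; it is not the authors' regression equations, the function names are hypothetical, and the numbers in the example are invented.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a normative mean and SD."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def t_score(raw, norm_mean, norm_sd):
    """Linear T score: mean 50, SD 10 in the normative group."""
    return 50.0 + 10.0 * z_score(raw, norm_mean, norm_sd)


# Hypothetical example: raw score 30 against a cell with mean 40, SD 10.
print(z_score(30, 40, 10))  # -1.0
print(t_score(30, 40, 10))  # 40.0
```

A linear conversion of this kind assumes the scores in the normative cell are roughly normally distributed; published T-score tables may instead be based on regression residuals or normalized distributions.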
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, gender, ethnicity, geographic area, setting, and recruitment procedures.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Normative data are stratified by age × education for the whole sample and by age for African Americans and Caucasians separately.
6. Means and SDs for the test scores are reported.

Consideration regarding use of the study
1. No information on IQ is reported.

[VF.33] Binder, Storandt, and Birge, 1999 (Table A11.41)
The authors examined the relationship between performance on psychometric tests and a modified Physical Performance Test (modified PPT) in a sample of 125 adults aged 75 years and older, who participated in trials of exercise or hormone replacement therapy. The study was approved by the Washington University School of Medicine, St. Louis. The
mean age for the sample was 82.3 (4.4), mean education was 13.5 (3.0), 25% were male, and 87% were Caucasian. Indices of physical health, Blessed score, and Geriatric Depression Scale score are reported. Preliminary screening included a medical history; physical examination; the Short Blessed Test of memory, concentration, and orientation; blood and urine chemistries; a chest X-ray; and a cross-validated self-report regarding health problems in the previous 12 months. Exclusion criteria were inability to walk 50 feet independently, active medical problems that would contraindicate performance of a graded exercise stress test, inability to complete the graded exercise stress test or the modified PPT, a score >8 on the Short Blessed Test, inability to provide informed consent due to cognitive impairment, and inability to follow the directions for the psychometric tests due to visual or auditory impairments.
The test was administered according to standard instructions.
The authors found that VF was not significantly associated with total modified PPT score.

Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, gender, ethnicity, indices of physical health, Blessed score, Geriatric Depression Scale score, geographic area, and research setting.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. The data are not partitioned by age group.
2. No information on IQ is reported.

[VF.34] Fama, Sullivan, Shear, Cahn-Weiner, Marsh, Lim, Yesavage, Tinklenberg, and Pfefferbaum, 2000 (Table A11.42)
Fluency tests were administered to Alzheimer's patients and normal controls in a study
on the relationship between regional brain volume and semantic, phonological, and nonverbal fluency. The control group included 51 participants with a mean age of 66.7 (7.4) years and mean education of 16.4 (2.3) years. Exclusion criteria were significant history of psychiatric or neurological disorder, past or present alcohol or drug abuse or dependence, or other serious medical condition, as identified on a psychiatric interview and medical examination.
The standard administration procedure was used for the FAS, with the exception that participants were not instructed to avoid numbers. Semantic fluency was measured with two 1-minute trials, in which participants were instructed to generate names of animals and names of inanimate objects, respectively. These data were used in a previous article by Fama et al. (1998) in calculations of standardized z scores for Alzheimer's participants that corrected the raw scores for age.

Study strengths
1. Relatively large sample.
2. The sample composition is described in terms of age and education.
3. Rigorous exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Recruitment procedures are not reported.
2. Gender distribution is not reported.
3. The data are not partitioned by age group.
4. Educational level for the sample is high.
5. No information on IQ is reported.

[VF.35] Troyer, 2000 (Table A11.43)
The study addressed clustering and switching on phonemic and semantic VF tasks in a total sample of 411 healthy adults aged 18-91. This is a follow-up on previous publications by these authors (Troyer et al., 1997, 1998a,b). The mean age for the sample was 59.8 (20.7) years, and education ranged 5-21 years, with a mean of 13.9 (2.9). The male/female ratio was 30%/70%. All participants were fluent in English. Participants were screened for neurological or psychiatric disorders. Participants aged ≥60 were screened for cognitive decline. Only those participants who obtained an MMSE score ≥25 or scores within the normal range on an episodic memory test were included.
The FAS version of the phonemic fluency test was administered to 257 participants and the CFL version to 154 participants. Standard administration procedures were used, with the exception that participants were not instructed to avoid numbers. Two 60-second semantic fluency trials were administered: the animal fluency version was administered to 407 participants; 156 participants from this sample were also administered supermarket fluency.
Based on the results of regression analyses, the author inferred that age had a greater effect on semantic than on phonemic fluency. Education affected both semantic and phonemic fluency. Gender was not related to VF performance.
Study strengths
1. Large sample.
2. The sample composition is well described in terms of age, education, gender, and native language.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Recruitment procedures are not reported.
2. Participants were screened for neurological or psychiatric disorders; however, medical exclusion criteria are not reported.
3. Demographic characteristics for subsets of participants in each condition are not provided.
4. Phonemic fluency norms are provided as the mean for FAS/CFL.
5. The data are not partitioned by age group.
6. No information on IQ is reported.
7. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States.

[VF.36] Acevedo, Loewenstein, Barker, Harwood, Luis, Bravo, Hurwitz, Aguero, Greenfield, and Duara, 2000 (Tables A11.44-A11.47)
The authors provided normative data for three conditions of the Category Fluency test (Animals, Vegetables, and Fruits) for 424 English-speaking and 278 Spanish-speaking participants over the age of 50. The sample was drawn from a larger pool of community-dwelling individuals who presented for free memory screening sessions offered by the Wien Center for Alzheimer's Disease and Memory Disorders between 1994 and 1999. Participants in the English-speaking group spoke English as their primary language and were born in the United States. Participants in the Spanish-speaking group spoke Spanish as the primary language and were born in a country where Spanish is the primary language. All participants were screened in their primary language using the MMSE, Hamilton Depression Rating Scale (Hamilton, 1960), and questionnaires related to demographic information, medical and psychiatric history, and cognitive status. Only participants who had an MMSE score ≥27 and a score ≥10 on four delayed recall trials of the three words used in the MMSE (based on the cutoff identified in Loewenstein et al., 2000) were included in the study.
For English speakers, the mean age was 69.1 (6.9) years, mean education was 14.4 (2.5) years, male/female ratio was 26%/74%, and mean MMSE score was 28.9 (1.0). For Spanish speakers, the mean age was 64.9 (7.7) years, mean education was 13.4 (3.2) years, male/female ratio was 30.8%/69.2%, and mean MMSE score was 28.7 (1.0). Among English speakers, 99% were classified by the examiner as white, <1% as African American, and <1% as Asian American. Among Spanish speakers, 97% were classified as white, <1% as black, and 3% as "other."
Three 60-second trials, on the Animals, Vegetables, and Fruits categories, were administered to all participants. Normative data are stratified by age, education, and gender. Cells with low sample sizes are not included in the tables.
The authors concluded that age, education, and gender affect Category Fluency performance, with gender being the best predictor after adjusting for age and education, regardless of the primary language. Performances of English and Spanish speakers were similar for the Animals and Fruits categories. However, English speakers generated more words for the Vegetables category.
Study strengths
1. Large samples for most of the cells.
2. The sample composition is well described in terms of age, education, gender, ethnicity, language, geographic area, and recruitment procedures.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
6. Normative data are stratified by age, education, and gender.
Consideration regarding use of the study
1. No information on IQ is reported.

[VF.37] Chen, Ratcliff, Belle, Cauley, Dekosky, and Ganguli, 2000 (Table A11.48)
A control sample of 483 elderly nondemented individuals was derived from a community-based multiwave prospective study, the Monongahela Valley Independent Elders Survey (MoVIES), in southwestern Pennsylvania. The purpose of the study was to identify cognitive measures that are most accurate in discriminating between individuals with presymptomatic Dementia of Alzheimer's Type (DAT) and nondemented individuals. Control participants remained nondemented over a 10-year follow-up period. The study protocol included a standardized general medical history and physical examination; a detailed neurological and mental status examination; hematological, metabolic, and serological tests; and neuroimaging when appropriate. Relevant medical records were abstracted.
The sample included 302 females and 181 males, with a mean age of 74.9 (4.4) years; 31.9% of participants had less than a high school education.
The authors identified the assessed domains as category fluency and initial letter fluency, with references to Borkowski et al. (1967) and Benton and Hamsher (1976). However, we gathered from the earlier publications based on this study that category fluency was assessed using the Animals and Fruits categories and initial letter fluency was measured with generation of words starting with the letters P and S. Evidently, the scores reported by the authors represent the total number of words for two conditions within each domain.
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, gender, history of the project, and geographic area.
3. Rigorous exclusion criteria.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. The data are not partitioned by age group.
2. No information on IQ is reported.
3. Number of participants with less than a high school education is reported. However, mean education and SD are not reported.
4. The tests and administration procedures used are not specified. This information was gathered from earlier articles describing this study.

[VF.38] Anstey, Matters, Brown, and Lord, 2000 (Tables A11.49, A11.50)
The authors report normative data for a sample of old and very old Australian adults living in retirement villages and hostels. A sample of 369 participants was drawn from 13 residential sites located throughout the Sydney and Illawarra regions. The criterion for inclusion into the subsample of 280 participants described in this article was a minimum of four neuropsychological tests completed. The exclusion criteria were a history of Parkinson's disease, stroke, or heart attack. Participants ranged in age 62-95 years, with a mean age of 79.04 (6.59) years, and had mean education of 11.25 (2.79) years; there were 52 males and 317 females; the majority of the sample rated their health as good, very good, or excellent. Participants with MMSE scores <24 (the lowest recorded score was 17) were included in the sample.
The FAS was administered according to standard instructions. Repetitions and nonwords were excluded. The data are reported in raw scores for a subsample of 280 participants and in percentile distribution for a subsample of 260 participants stratified by age × education.
The authors found that performance on FAS correlated with education but not with age (which is probably due to the restricted age range).

Study strengths
1. Large overall sample.
2. The sample composition is well described in terms of age, education, gender, health status, range of MMSE scores, geographic area, setting, and recruitment procedures.
3. Exclusion criteria are reported.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported for the entire sample.
6. The percentile distributions are reported for the sample stratified by age × education.

Considerations regarding use of the study
1. Overall sample is large, but sample sizes for half of the age × education cells are less than 10.
2. Participants with MMSE scores <24 (the lowest recorded score was 17) were included in the sample.
3. The data were obtained on Australian participants, which may limit their usefulness for clinical interpretation in the United States.
4. No information on IQ is reported.

[VF.39] Brady, Spiro, McGlinchey-Berroth, Milberg, and Gaziano, 2001 (Table A11.51)
The Animal Fluency test was used in a study on the effect of stroke risk factors on cognitive functioning, which is part of the longitudinal Normative Aging Study at the Department of Veterans Affairs Medical Center in Boston. All participants were free of cardiac disease, hypertension, cataracts, loss of hearing, or abnormal laboratory tests on entering the study. At each visit, participants received a thorough medical exam.
A 60-second trial of the Animal Fluency test was administered at each longitudinal session to 235 stroke- and dementia-free men with a mean age of 66.41 (6.73) years and mean education of 14.03 (2.62) years. The results for the initial testing and for the retest 3 years later are reported in Table A11.51. The test-retest difference in word generation was significant at the 0.05 level.

Study strengths
1. Large sample.
2. The sample composition is described in terms of age, education, gender, and geographic area. Recruitment procedures are reported in previous publications.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. All-male sample. 2. The data are not partitioned by age group. 3. No information on IQ is reported.
[VF.40] Rosselli, Ardila, Salvatierra, Marquez, Matos, and Weekes, 2002a (Table A11.52)
The authors compared oral fluency strategies of Spanish-English bilinguals in Spanish and English with Spanish and English monolingual participants tested in their native language, in their performance on the FAS and animal name generation tests. This article expands on the data analyses presented in an earlier publication by these authors (Rosselli et al., 2000). The sample included 28 males and 54 females 50-84 years old, with a mean age of 61.76 (9.30) and education ranging 2-23 years, with a mean of 14.8 (3.6) years. Based on the questionnaire assessing participants' bilingualism,
45 participants were English monolinguals, 18 were Spanish monolinguals, and 19 were Spanish-English bilinguals. All English monolingual participants were born in the United States. All Spanish monolingual participants were Latin American immigrants living in the city of Hialeah, Florida, had been living in the United States for an average of 5 years, and had migrated after age 50. The latter participants claimed Spanish as their first language, with a mean age of exposure to English of 18.85 (14.24) years and a mean of 35.95 (13.37) years of exposure. Participants were screened for a history of neurological or psychiatric problems, as well as for current dementia and depression, using a structured interview, the MMSE, and the Beck Depression Inventory-II (BDI-II). All participants performed above 27 on the MMSE and below 5 on the BDI-II. Groups were similar in their naming ability, as measured by the English and Spanish versions of the Boston Naming Test. Three 60-second trials for the FAS and one trial for Animal Naming were used. The order of presentation for bilingual participants was counterbalanced. To avoid errors in scoring, the examiners read the participants' responses back to them after they had finished. The authors concluded that bilinguals produced significantly fewer words than English monolinguals in the Animal Naming, but not in the FAS, condition. The use of grammatical vs. content words and crosslinguistic differences in the recall of alphabetical words were discussed.
Study strengths 1. Large sample. 2. The sample composition is well described in terms of age, education, gender, acculturation level, and geographic area. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data for the Spanish-speaking sample are provided.
Considerations regarding use of the study 1. Recruitment procedures are not reported. 2. It is unclear whether the administration procedure required restrictions in the types of word to be used in the process of word generation. 3. Demographic characteristics for each linguistic group are not provided. 4. The data are not partitioned by age group. 5. Education for the sample is high. 6. No information on IQ is reported.
[VF.41] Grady, Yaffe, Kristof, Lin, Richards, and Barrett-Connor, 2002 (Table A11.53)
Data on Animal Naming were collected for a subsample of 1,063 older women in a multicenter study examining the effect of hormone replacement therapy on cognitive functioning in postmenopausal women. Participants were younger than 80 years, with established coronary disease and an intact uterus. They were randomly assigned to a treatment or placebo group in a double-blind experiment. They were followed for 4.2 (0.4) years. At the end of the trial, cognitive functioning was measured in both groups. The data are reported for 517 participants in the treatment group and 546 in the placebo group, separately. Mean ages for the two groups at the time of testing were 66.3 (6.4) and 67.3 (6.3) years, mean education was 12.7 (2.7) years for both groups; approximately 90% of the sample were white. There are no notable differences between the groups on any demographic variables or physical indices. The Animal Naming test was administered according to standard procedures. The authors concluded that there were no differences between the treatment and placebo groups on any cognitive measures. Moreover, on the VF test, performance for the placebo group was higher than for the treatment group.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, physical findings, clinical setting, and selection criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Participants had established coronary disease. It is unclear if any neurological exclusion criteria were used. 2. All-female sample. 3. The data are not partitioned by age group. 4. No information on IQ is reported.
[VF.42] Giovannetti, Goldstein, Schullery, and Barr, 2003 (Table A11.54)
Animal Naming was administered to 31 control participants in a study on the mechanisms of VF deficits in first-episode schizophrenia. Participants were recruited from the hospital community through announcements in local newspapers and within the medical center. They had no history of substance abuse or neurological/psychiatric/medical illness, per self-report and per Schedule for Affective Disorders and Schizophrenia Interview, physical examination, and urinalysis. Mean age for the group was 25.2 (6.07) years, mean education was 15.0 (1.48) years, mean WAIS-R IQ was 109.3 (11.51), and male/female ratio was 21/10. The sample is further described in the articles by Bilder et al. and Lieberman et al., published between 1991 and 2000. The standard administration procedure was used.
Study strengths 1. The sample composition is well described in terms of age, education, gender, FSIQ, geographic area, and recruitment procedures. 2. Stringent exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is small. 2. Educational level for the sample is high.
[VF.43] Lopez-Carlos, Salazar, Villasenor Saucedo, and Peña, 2003 (Tables A11.55-A11.58)
The PMR version of the phonemic VF and Animal Naming tests were used in a study investigating the effects of demographic variables on cognitive abilities in Spanish-speaking individuals with low educational level. The total sample included 115 volunteer monolingual Latino men with ≤10 years of formal education, who work in manual labor in the Los Angeles area (n = 65) or Jalisco, Mexico (n = 50). Volunteers were recruited from posted advertisements in workplaces and personal solicitations. The mean age for the sample was 28.23 (8.74) years, and mean education was 6.66 (2.54) years. Exclusion criteria were self-report of head injury, neurological insults, prenatal or birth complications, learning disabilities, psychiatric problems, or substance abuse. Scores on the BDI-II-Spanish version (M = 12.92, SD = 8.94) and the Beck Anxiety Inventory (M = 6.60, SD = 6.03) are also reported. Standard administration procedures were used. Participants were tested in Spanish. The PMR version of the VF test was selected over the FAS, to minimize the effects of education. Two common errors among individuals with limited education in the latter measure are naming words that begin with an "A" sound but are preceded by a silent "H" and naming words that begin with an "S" sound but their spelling begins with the letters C or Z (Artiola i Fortuny et al., 1999). Selected subtests from the WAIS-III (Mexican version) and the EIWA were included in the battery. WAIS-III Vocabulary raw scores are included in Tables A11.55-A11.58. Mean performance on the Marin and Marin Acculturation Scale for the Los Angeles sample was 17.61 (6.19).
For the Los Angeles group, Picture Vocabulary subscale scores from the Woodcock-Johnson-III Tests of Achievement (M = 5.36, SD = 6.01) and the Batería Woodcock-Muñoz-R, Pruebas de Habilidad Cognitiva-R (M = 29.77, SD = 5.37) were used to assess level of English and Spanish expressive ability. The results are partitioned by education group (0-6, 7-10 years), by age group (18-29, 30-49 years), and by education x age groups. The authors reported that the difference between the two education groups in performance on the PMR was significant at the p < 0.05 level. No difference between the two education groups was evident on Animal Naming. The two age groups did not differ significantly on either test. The mean PMR score for the Los Angeles group was 30.00 (11.09) and that for the Mexico group was 35.62 (10.30). The mean Animal Naming score for the Los Angeles group was 17.16 (4.46), and that for the Mexico group was 18.52 (5.53). Differences in scores between individuals from Los Angeles and Mexico were noted on PMR, with the Mexican sample scoring slightly higher. The authors attributed this difference to using Spanish more frequently in Mexico and/or a slightly higher educational level of 7.76 (2.18) years for this sample, as opposed to 5.82 (2.49) years for the Los Angeles sample. No statistically significant differences were noted on Animal Naming. This study provided data for a healthy working monolingual Spanish-speaking sample with low educational level, which is a valuable addition to the available normative data primarily based on English-speaking, highly educated participants.
Study strengths 1. Large sample for age and education groups. 2. Data provided for a healthy employable monolingual Spanish-speaking group with low educational level. 3. The sample is stratified into two education groups, two age groups, and age x education groups. 4. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures. 5. Adequate exclusion criteria. 6. Means and SDs for the test scores are reported. 7. WAIS-III Vocabulary subtest scores are available.
Considerations regarding use of the study 1. All-male sample. 2. The sample sizes for the combined age and education groups are small.
[VF.44] Miller, 2003; Personal Communication (Table A11.59)
The investigation used participants from MACS. The data were collected from 728 seronegative homosexual and bisexual males for the purpose of establishing normative data for neuropsychological test performance based on a large sample. These data represent an update on the data provided by Selnes et al. (1991). Mean age for the sample was 37.5 (6.9) years, and mean education was 16.3 (2.3) years; 93.7% were Caucasian, 2.6% Hispanic, 2.9% black, 0.8% other. All participants were native English speakers. The FAS and Foods VF tests were administered according to standard instructions. The data are partitioned by three age groups (25-34, 35-44, 45-59) times three educational levels (<16, 16, >16 years).
Study strengths 1. The overall sample size is large, and most of the individual cells have more than 50 participants. 2. Normative data are stratified by age x education. 3. Information on age, education, ethnicity, and native language is reported. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. All-male sample. 2. No information on IQ is reported. 3. No information on exclusion criteria.
[VF.45] Ravdin, Katzen, Agrawal, and Relkin, 2003 (Table A11.60)
The authors examined the effects of mild depressive symptoms on letter and semantic fluency in a sample of 188 community-dwelling adults aged 60-92 years, who were recruited from community-based lectures and health fairs. All participants lived independently and were in self-reported good health. Exclusion criteria were history of neurological disease, head injury with loss of consciousness >5 minutes, substance abuse, or psychiatric treatment. Participants were also excluded if a more extensive neuropsychological evaluation
revealed evidence of cognitive decline, defined as any score falling more than 2 SD below the mean for their age. Data for 149 participants who were not depressed, based on GDS scores of <10, were divided into three groups: young-old (60-69 years), middle-old (70-79 years), and old-old (80-92 years). The mean age for the nondepressed sample was 74.82 (6.63) years, mean education was 15.57 (2.67) years, AMNART-estimated Verbal IQ was 120.44 (5.74), and male/female ratio was 32/117. Letter fluency was measured with the COWA-CFL version, and semantic fluency with the Animals, Fruits, and Vegetables categories. Scores for each letter and category, as well as total scores for three trials for letters and semantic categories, respectively, were reported.
Study strengths 1. Adequate sample size. 2. The sample composition is well described in terms of age, education, gender, IQ, setting, and recruitment procedures. 3. Adequate exclusion criteria. 4. Data are stratified by age groups. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. It is unclear whether participants were instructed to avoid numbers or proper names in the process of word generation. 2. Educational and intelligence levels for the sample are high.
RESULTS OF THE META-ANALYSES OF THE VERBAL FLUENCY DATA (See Appendix 11m)
Data collected from the studies reviewed in this chapter were combined in regression analyses in order to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed
results of the meta-analyses and predicted test scores across adult age groups for the FAS and Animal Naming versions of the VF tests are provided in Appendix 11m. The age range represented in the original data set is 17.76-87.5 years. However, there are no data available for FAS between ages 74.3 and 87.5 years. Therefore, the 87.5 year data point was dropped from the analyses. Examination of the distribution of education ranges suggests relatively smooth increments from 9.4 to 17 years, whereas data are lacking between 4 and 9.4 years. Therefore, three data points representing participants with 4 years of education were excluded for both tests, to avoid extrapolation of a prediction rule over ranges that are not supported by existing data. After further data editing for consistency and for outlying scores, 18 studies for FAS and 11 for Animal Naming, which generated 30 and 25 data points for the two tests, based on totals of 3,469 and 2,823 participants, respectively, were included in the analyses. Quadratic regression of the FAS scores on age yielded an R² of 0.711, indicating that 71% of the variance in FAS scores is accounted for by the model. Based on this model, we estimated FAS scores for age intervals between 18 and 74 years. Linear regression of Animal Naming scores on age yielded an R² of 0.764. Based on this model, we estimated Animal Naming scores for age intervals between 25 and 87 years. If predicted scores are needed for age ranges outside the reported boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. It appears that age-related changes in rates of phonemic fluency have a curvilinear pattern, with increase in fluency up to the third decade of life, followed by a gradual decline. In contrast, the relationship between Animal Naming and age is linear.
This difference might be attributable to gains in vocabulary fund facilitating phonemic fluency up to the late 30s, which is counteracted by age-related anatomical changes thereafter. Evidently, the increase in vocabulary fund does not contribute to the generation of animal names.
Regressions of SDs for the FAS and Animal Naming scores on age suggest that age does not account for a significant amount of variability in SDs (R² = 0.018 and 0.319, respectively). Though some increase in variability with advancing age is expected, this trend was not present in the collected data. Therefore, we suggest that the mean SDs for the aggregate sample be used across all age groups. Examination of the effects of demographic variables on FAS scores revealed that education is a significant predictor of test performance. Values of estimated between-study variance (tau²) for regression of test means with education were considerably lower than the corresponding values for regression without education. This suggests that education explains a considerable amount of the heterogeneity in the outcome variable. Inclusion of education into the regression of FAS means on age considerably improved the R² (see Appendix 11m). It should be noted that in this analysis, regression with and without education was rerun on a subset of studies that had education reported for each data point (n = 17). The t value for education was 2.47 (p = 0.025). The coefficient for education of 0.498, rounded to 0.50, indicates that with a 1-year increment in education we expect a 0.50-unit increment in word production. This suggests that the table of predicted FAS scores is accurate for individuals with 14.31 years of education, rounded to 14 years (which is the mean education for the original data set) in the education correction table. With every year of education above or below this level, we suggest correcting the obtained score by adding or subtracting 0.50 to or from the predicted score given in the table for the relevant age group (see Chapter 3 for an example). Correction factors for different educational levels are included in Table A11m.1 in Appendix 11m.
This correction should be applied within the education range of 10-17 years since this is the range available in the original data set. Unfortunately, data for lower educational levels were not available in the literature. Any extrapolation of scores outside the reported boundaries should be made with caution.
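The education correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slope (0.50 words per year), the reference level (14 years), and the valid range (10-17 years) come from the text, whereas the function name and the example prediction of 40.0 words are hypothetical and are not taken from Table A11m.1.

```python
# Values stated in the text: 0.50 words per year of education, a reference
# level of 14 years, and a valid correction range of 10-17 years.
EDUCATION_SLOPE = 0.50
REFERENCE_EDUCATION_YEARS = 14
VALID_EDUCATION_RANGE = (10, 17)


def education_corrected_fas(predicted_score, education_years):
    """Adjust an age-based predicted FAS score for years of education.

    The function name is hypothetical; the arithmetic follows the rule
    in the text (add or subtract 0.50 per year above or below 14 years).
    """
    low, high = VALID_EDUCATION_RANGE
    if not low <= education_years <= high:
        # Outside 10-17 years the source data do not support the correction.
        raise ValueError("education correction is defined only for 10-17 years")
    years_from_reference = education_years - REFERENCE_EDUCATION_YEARS
    return predicted_score + EDUCATION_SLOPE * years_from_reference


# Hypothetical age-based prediction of 40.0 words (not a value from the tables).
print(education_corrected_fas(40.0, 16))  # 41.0
print(education_corrected_fas(40.0, 11))  # 38.5
```

Raising an error outside the 10-17 year range, rather than extrapolating silently, mirrors the caution in the text about applying the correction beyond the available data.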
Inclusion of education into the analysis of Animal Naming also indicated a considerable reduction in tau² and improvement in R². However, education did not significantly account for the variability in test scores (t = 1.93, p = 0.083). Thus, a correction table for education was not provided for Animal Naming. IQ had a significant effect on FAS scores. However, the limited number of studies that reported IQ does not allow close examination of this relationship. The effect of IQ on Animal Naming was not examined due to a lack of data. With regard to the effect of gender, analysis was conducted only for the FAS. The difference in mean scores for the two genders across six studies reporting scores for males and females separately was negligible, 0.088 in favor of females.
Strengths of the analyses 1. Total sample size of 3,469 for the FAS and 2,823 for the Animal Naming test. 2. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity. 3. Effects of education and IQ on FAS performance were evident, which is consistent with the literature. A significant effect of education on FAS called for corrections for education.
Limitations of the analyses 1. R² of 0.711 and 0.764 for the two tests are acceptable. However, these values indicate that only 71% and 76% of variance in the FAS and Animal Naming scores, respectively, is accounted for by the models. 2. The number of studies on the Animal Naming test that report data for older age ranges considerably exceeds the number concerned with younger age ranges. Thus, the mean age for the aggregate sample is 62.60 (18.13) years. 3. Educational levels of the aggregate samples are over 14 years for both tests, and the IQ level for the FAS sample is 113.91 (7.72). (IQ is available for only seven data points.) Higher rates of word generation are associated with higher educational
and IQ levels. Therefore, the predicted values are likely to overestimate expected performance for individuals with lower educational levels and/or average and lower than average intellectual levels. A correction table for education for the FAS is provided in the appendix (only for the educational ranges represented in our data). Although a significant effect of IQ on the FAS was found in our data, which is consistent with the literature, due to a scarcity of data on IQ reported in the reviewed studies, close examination of this effect was not possible. 4. Possible gender differences in word generation in favor of females have been reported in the literature. Our comparison of six data points for the FAS that were broken down by gender indicated no relationship between gender and word-generation fluency. Clearly, the scarce data available for review were not sufficient for a close examination of the effect of gender on VF.
CONCLUSIONS Different versions of the VF task are widely used in clinical practice as measures sensitive to executive dysfunction. Use of the normative data to ensure accurate interpretation of the test results is complicated by variability in procedural and reporting aspects of the studies. There is confusion across studies in identifying the version of the test administered: the CVFT test (FAS) is commonly presented as the COWA (see Brief History of the Test, above). This confusion undermines the accuracy of interpretation. Due to different levels of letter difficulty, the norms for the FAS should be used with caution in application to the COWA sets (CFL and PRW) and vice versa. Other issues of concern are time allotted for each category, type of semantic or phonemic category administered, presence/absence of feedback on intrusion or repetition errors, instructions for item exclusion (whether or not numbers were excluded), and administration of an example with a practice trial prior to the first recorded trial. These procedural aspects are inconsistent across studies. Attention to these aspects of administration and data reporting is recommended for future studies involving VF tasks. Other issues of interest to future investigators of this test might include the following: (1) comparative efficiency of word generation for semantic vs. phonemic categories in normal and clinical samples; (2) comparison of psychometric properties for different types of VF task, aiming at the possibility of interchangeable use of the normative data across tasks; (3) effect of demographic variables on test performance across different age ranges, educational/intelligence levels, genders, ethnic, cultural, and linguistic backgrounds, and clinical diagnoses. Of concern, the majority of normative studies were confined to participants of high educational and/or IQ levels. Because a number of studies, as well as the results of the meta-analyses performed on the aggregate sample described in this chapter, indicate that education (and/or IQ) is highly predictive of VF performance, the scarcity of normative data for lower IQ and educational groups is a problem for clinical practice.
IV PERCEPTUAL ORGANIZATION: VISUOSPATIAL AND TACTILE
12 Rey-Osterrieth Complex Figure
BRIEF HISTORY OF THE TEST The Rey-Osterrieth Complex Figure (ROCF), also known as the Rey Figure and the Complex Figure Test (CFT), consists of a complex two-dimensional line drawing containing 18 details, including crosses, squares, triangles, and a circle, arranged around a central rectangle (Fig. 12.1). The patient is instructed to carefully copy the design with pencil on paper. Rey (1941) asserts that the figure does not require a high level of graphic aptitude; each of the details is simple to reproduce separately, and the difficulty of the task is due to the arrangement of the elements. Organizational strategy is documented by having the patient use different-colored pencils when executing the task (Rey & Osterrieth, 1993). Some studies report alternate methods for recording the strategy of reproduction, such as numbering lines on a copy of the ROCF or replicating the subject's drawing with indication of the directionality of lines, as the subject proceeds (Binder, 1982; Kirk & Kelly, 1986; Waber & Holmes, 1986). The popularity of the ROCF in assessing visuospatial constructional and visual memory deficits with different clinical groups has been steadily increasing (see Knight et al., 2003, for survey results on current use). A number of different administration procedures and scoring
systems, as well as alternate versions of the complex figure, have emerged to meet the needs of specific clinical groups.
Administration Procedures Rey's (1941) original instructions are as follows. The subject is presented the stimulus figure with the isosceles triangle and circle oriented to the right and is instructed to "make a copy of this design as best as possible" on a plain sheet of paper with a colored pencil. The subject is told that "the copy can be an approximation as far as the proportions are concerned but that care should be taken not to forget any detail." The subject is handed a different-colored pencil with which to draw "each time an element is determined;" typically, five to six pencils are utilized. The order in which the colored pencils are used is noted on the side of the paper. Time to complete the copy is recorded. The subject is not allowed to change the orientation of the stimulus figure but may reposition the drawing sheet. When the subject stops drawing, he or she is asked if he or she is finished, and the copy sheet and the model are removed from view. After 3 minutes, the subject is handed a new sheet of paper and a regular pencil and asked to draw the design from memory. L. B. Taylor's (1969) instructions are a commonly used variation: "Copy this drawing
as well as you can. Make sure you do not leave out anything." No time limit is imposed, but the length of time required to complete the copy is recorded. Forty minutes later, a reproduction of the drawing from memory is requested (p. 278). In a second publication, L. B. Taylor (1979) provided additional information regarding instructions: "Exposure of the figure for copying is limited to [...] minutes. Then 45 minutes later the patient is asked to reproduce as much of the figure as he can remember" (p. 167). According to Osterrieth (1944), participants are not allowed to make erasures. Copy and recall administration procedures have varied across investigations (see Ruffolo et al., 2001). According to the standard procedure, participants are asked to recall the figure without being forewarned. In many studies, the copy condition is followed by the immediate recall. The reported interval for the delayed recall varies between 3 minutes and 24 hours. According to Knight et al.'s (2003) survey of practicing neuropsychologists (International Neuropsychological Society membership), the most frequently used [...] of recall (Berry & Carpenter, 1992). Similarly, the pilot data collected by Wood et al. (1982) indicated that there are no differences in Taylor figure scores obtained on 30- vs. 60-minute delayed recall in patients or normals. Delayed recall is, however, affected by administration of the Immediate Recall trial or repeated Delayed Recall trials, which have a facilitating effect on delayed performance (Chiulli et al., 1989; Loring et al., 1990). Loring et al. (1990) reported an increase in accuracy of 30-minute delayed recall of about 6 points due to interposing a 30-second recall trial. The interested reader is referred to Knight (2003), Lezak et al. (2004), and Spreen and Strauss (1998) for additional information on ROCF administration procedures.
Figure 12.1. Rey-Osterrieth Complex Figure (Osterrieth, 1944).
Alternate Versions Repeated administration of the ROCF results in inflation of the retest scores due to the practice effect. According to Spreen and Strauss (1991), the inflation reaches about 10% of the original score on the 1-month retest. The Taylor figure (L. B. Taylor, 1969; Hubley & Tombaugh, 2003; Lezak, 1995; Lezak et al., 2004) was produced as an alternate version of the complex figure to avoid the practice effect in repeated testing situations. The assumption of comparability of the two figures stems from the fact that both have an equal number of details of assumed equal complexity. To validate this assumption, several studies have compared scores obtained by participants on the ROCF and Taylor figure in normal and clinical populations (Berry et al., 1991; Delaney et al., 1992; Duley et al., 1993; Hamby et al., 1993; Hubley, unpublished honor thesis; Hubley & Tombaugh, 2003; Kuehn & Snow, 1992; Peirson & Jansen, 1997; Strauss & Spreen, 1990; Tombaugh & Hubley, 1991; Tombaugh et al., 1990, 1992a; Vingerhoets et al., 1998). The results indicated that the figures yield equivalent copy scores. However, recall scores or percentage of the copy score retained on the recall condition was much higher for the Taylor figure, irrespective of the delay interval, learning paradigm (incidental or intentional), or scoring system. According
to Casey et al. (1991), this might be attributed to the fact that the Taylor figure is more likely than the ROCF to be encoded verbally. This finding limits interchangeable use of these figures in test-retest situations. Loring and Meador (2003) describe the psychometric properties, scoring criteria, and normative data for four complex figures developed earlier (Meador et al., 1991, 1993), which are similar to the ROCF and Taylor figure. Some of the same elements used in the previous figures were incorporated but placed in different locations. New components were added. A 36-point scoring system is used. Hubley and Tremblay (2002) developed a modified version of the Taylor Complex Figure (MTCF) by removing or downplaying distinctive elements which are likely to be verbally encoded, adding more lines to increase the complexity of the visual array, and modifying the placement of some elements. The resulting figure was compared by the authors to the ROCF on 62 university students using three experimental paradigms. The results suggest comparability of the two figures on learning, memory, and copy scores. Hubley et al. (2003) further described the MTCF as well as the development of new simplified figures for assessment of older adults. Information on psychometric properties and normative data for these modified figures is provided. A recent attempt to design a complex figure that is equivalent to the ROCF is described by Frazier et al. (2001). The authors introduced the Mack Complex Figure Test (Mack CFT), and presented a comparative study of the copy and 45-minute delayed recall accuracy scores for the two figures on a sample of 245 adults. The authors report a high degree of equivalence between the figures.
Scoring Systems Osterrieth's (1944) scoring system, adapted by E. M. Taylor (1959), is most commonly used in scoring the copy and recall reproductions of the ROCF. It is based on the accuracy of a subject's reproduction and assigns 0-2 points based on placement and presence of distortion for each of 18 structural elements of the
figure. This system replaced that of Rey (1941) for scoring copy and 3-minute delayed recall, which was based on a 47-point scale. Translations of Rey's (1941) and Osterrieth's (1944) original articles describing scoring criteria, along with critical commentaries and recommendations for administration, are offered by Corwin and Bylsma (1993). Scoring criteria published by E. M. Taylor (1959, adapted from Osterrieth, 1944) are reproduced in Figure 12.2 and Table 12.1. Ratings of the accuracy of reproduction for each unit are presented in Table 12.2. To improve the quantitative objectivity of scoring and to address organizational aspects of performance, several other scoring systems have been proposed. 1. Visser (1973) developed a system which quantifies figure accuracy (based on the presence or omission of certain details) and organization (based on interruption and sequence scores). The method of test administration has been altered to record the sequence of segments reproduced by the patient. Three aspects of performance are scored: (1) omission of a detail or its portion, (2) interruption of a line before being completed, and (3) sequence of reproduction. This system was developed for research with brain-damaged patients. 2. Binder (1982) used the original Osterrieth scoring criteria to quantify the accuracy of reproduction but developed his own system to quantify organization. Five structural elements of the figure were identified, which were drawn as a single unit by non-brain-damaged participants in the pilot study: the horizontal midline, vertical midline, two diagonals, and the vertices of the pentagon. The organizational score is based on the number of structural units drawn as a single unit vs. fragmented units and on the number of missing units. This system was used to evaluate reproductions of patients with unilateral cerebrovascular brain pathology. Savage et al. (1999) developed a system for quantifying the organizational
approach based on the structural units identified by Binder plus the base rectangle, which they used to evaluate organizational strategies in patients with obsessive-compulsive disorder (OCD).
Figure 12.2. Scoring units for the Rey-Osterrieth Complex Figure. (Reproduced by permission from O. Spreen and E. Strauss, 1998.)
Table 12.1. Rey-Osterrieth Complex Figure Scoring Criteria
Unit:
1. Cross upper left corner, outside of rectangle
2. Large rectangle
3. Diagonal cross
4. Horizontal midline of 2
5. Vertical midline
6. Small rectangle, within 2, to the left
7. Small segment above 6
8. Four parallel lines within 2, upper left
9. Triangle above 2, upper right
10. Small vertical line within 2, below 9
11. Circle with three dots within 2
12. Five parallel lines within 2 crossing 3, lower right
13. Sides of triangle attached to 2, on the right
14. Diamond attached to 13
15. Vertical line within triangle 13, parallel to the right vertical of 2
16. Horizontal line within 13, continuing 4 to the right
17. Cross attached to 5, below 2
18. Square attached to 2, lower left
Scoring: Consider each of the 18 units separately. Appraise accuracy of each unit and relative position within the whole of the design. Rating for each unit is presented in Table 12.2.
Table 12.2. Ratings for Each Rey-Osterrieth Complex Figure Unit
Correct, placed properly: 2 points
Correct, placed poorly: 1 point
Distorted or incomplete but recognizable, placed properly: 1 point
Distorted or incomplete but recognizable, placed poorly: 1/2 point
Absent or not recognizable: 0 points
Maximum: 36 points
3. Klicpera (1983) used the original Osterrieth scoring to assess the accuracy of reproduction. In addition, he introduced organizational criteria, such as presence of parts of the configuration (main rectangle, internal structural components, external and internal details, intersections, and segments forming the large rectangle, diagonals, and perpendiculars), organization (intersections, alignment and arrangement of details), and approach to drawing (sequence of construction, continuity of lines, segmentation of key parts). Klicpera used this system to explore planning abilities in dyslexic children. 4. Denman (1984) developed an itemized scoring system for the ROCF as part of the Denman Neuropsychology Memory Scale, which assigns 0-3 accuracy points
to each of the 24 "designs" of the figure, which are divided into eight sectors. Rapport et al. (1995) developed several measures of hemispatial deficit that may be incorporated in this scoring system. Tombaugh (unpublished research) developed a similar system for scoring the Taylor complex figure. Both systems consist of a total possible of 72 (69) points and are more detailed than the original 36-point scoring system to provide more objective interpretation. 5. Bennett-Levy (1984) proposed scoring criteria to evaluate strategy in addition to Osterrieth's original accuracy criteria. "Strategy total" included scores for "good continuation" (requiring a line to be drawn as one piece and continued until intersection with another line) and for "symmetry" (reflecting construction of symmetrical units and their components). 6. Waber and Holmes (1985, 1986) introduced the Developmental Scoring System (DSS-ROCF) to quantify goodness of organization and style in the context of developmental changes in children, which was published by Psychological Assessment Resources (Bernstein & Waber, 1996). Administration of the DSS-ROCF requires five colored pencils, which are switched after specific time periods. The figure is separated into the smallest segments, which are categorized as part of the four major components: base rectangle, main substructure, outer configuration, and internal detail. The presence of intersections and alignment of key components (base rectangle, main substructure, and outer configuration) are scored. In addition, ratings of organization and style of reproduction are obtained. Quality of reproduction of 24 organizational features results in placement of the protocol in one of the five organizational levels ranging from poor (1) to excellent (5). The style rating is based on reproduction of 18 "criterial juncture features," which place the protocol into part-oriented, intermediate, or configurational categories.
Additional normative data and clinical examples based on this scoring system are provided in Bernstein (2003). 7. Kirk and Kelly (1986) adapted the scoring method for accuracy and error from the system developed by Waber and Holmes (1985). In addition, they developed a system designed to provide an objective basis for recording, classifying, and evaluating starting strategy (configurational vs. piecemeal) and the level of organization (structured vs. nonstructured) in ROCF reproductions by children. Starting strategies were also evaluated from a part/whole perspective. In addition, progression strategies were assessed. This system allows evaluation of the relationship between strategy, accuracy, and errors in ROCF reproductions. 8. A modified scoring system was used by Becker (Becker, 1988; Becker et al., 1988) to evaluate visuoconstructional skills and visual memory in Alzheimer's patients. According to this system, 12 "bits" of the drawing were scored for accuracy and placement (2 points) and for accuracy only (1 point). 9. Loring et al. (1988, 1990) developed an 11-point method for scoring qualitative errors, which reflects the accuracy of reproduction of each of 18 original details in recall of the ROCF. Scoring allowed a certain degree of "reproduction tolerance" due to the focus on memory functioning, rather than on constructional ability. These criteria were applied to reproductions produced by patients with temporal lobe epilepsy and found to be effective at discriminating between right and left temporal lobe epilepsies. However, application of this method to scoring reproductions produced by a sample of college students resulted in over 95% of the sample scoring 36 out of 36 points. 10. Simplified versions of the ROCF and Taylor figure were developed by Hubley and colleagues (Hubley, unpublished honors thesis; Hubley et al., 2003; Tombaugh et al., 1990) to assess elderly and neurologically impaired individuals. Fifty-point
itemized scoring systems were developed to score the figures. These systems are similar to the itemized scoring systems developed for the full versions of the figures, with the exception of scoring only 19, rather than 24 (23), "designs" of the figures. 11. Berry et al. (1991) modified the original scoring criteria to improve the sensitivity of the score to distortion and displacement of details. Scores for each of the 18 details ranged 0-2 in half-point increments. A score of one-quarter point was also allowed, to denote gross distortion or severe displacement of the detail. These criteria were applied to scoring of the ROCF for a sample of elderly participants. The modified system scores are on the average 2 points higher than those obtained using the original scoring system; however, the correlations between the two systems are high (ranging 0.82-0.96 for different conditions). 12. Chervinsky (see Chervinsky et al., 1992) developed the "organizational scoring system," which is designed to assess the organization of reproductions and is minimally affected by reproduction accuracy. According to this system, the subject is offered a different-colored pencil when it is judged by the examiner that he or she has completed a "chunk" of the figure. The figure is separated into six sections, consisting of different combinations of conceptual "chunks." A total score reflecting the organizational quality of reproduction is based on the scores assigned for completeness of the conceptual "chunks" (reproduced with the same-colored pencil) minus penalty scores. (One conceptual detail is reproduced with different-colored pencils.) 13. Meyers and Meyers' (1992) modified administration procedure includes four conditions: Copy (recording the time to completion), 3-Minute Delayed Recall (from the time of completing the copy), 30-Minute Delayed Recall (from the time of completing the copy), and a Recognition subtest. Scoring of the Copy and Recall conditions is based on the
original system and similar to the system presented by Loring et al. (1990). However, the authors introduced a 1/4" rule for "misplacement" and a 1/8" rule for drawing errors, which reduce the effect of subjective judgment on scoring. The Recognition subtest is composed of 12 components of the ROCF presented in their proper size, shape, and orientation mixed randomly with 12 distracters. The subject is to circle all the parts recognized from the original design. Scoring of the Recognition subtest is based on the number of correct, false-positive, and false-negative responses. Meyers and Lange (1994) compared scores on the Recognition subtest for different clinical groups and a normal group. The results suggested that this subtest discriminates best between brain-injured and normal participants or participants with minor brain injuries. Dawson and Grant (2000) used remarkable improvement on the Recognition trial in comparison to poor free recall to demonstrate impaired retrieval in the face of intact acquisition in recently detoxified alcoholics. The professional manual for the Rey Complex Figure Test and Recognition Trial (RCFT), published by Psychological Assessment Resources (Meyers & Meyers, 1995b), contains test description, administration, scoring and profiling procedures, interpretation guidelines, information on psychometric properties, normative data expressed in raw scores, and demographically corrected normative data compiled on a sample of 394 individuals 18-90 years of age. Supplemental norms for children and adolescents ~18 years of age (Meyers & Meyers, 1996) are also available. 14. Duley et al. (1993) developed explicit scoring criteria for the ROCF and Taylor figure, which substantially increase interrater reliability. These criteria define scorable distortions and misplacements and outline clear rules for scoring deviations in drawings for each of 18 elements of the original 36-point scoring
system for both figures. This system was used by the authors in a study with participants infected with HIV. 15. Hamby et al. (1993) developed an organizational quality scoring system for the ROCF and Taylor figure. Administration of the figures requires colored pens, which are switched at equal points in the construction of the figure. The scoring is based on 18 standard elements for each figure. It focuses, however, on the reproduction strategy through determining the order of placement of configural elements (base rectangle or square, horizontal and vertical midlines), the appropriate continuation of lines, and the order of placement of details. Ratings of organizational quality are made on a 5-point scale and reflect the presence of three types of mistake (configural, diagonal, and detail), with higher scores indicating better organization. This system was used by the authors to discriminate between asymptomatic and symptomatic HIV-positive patients. 16. R. A. Stern et al. (1994, 1999) proposed the Boston Qualitative Scoring System (BQSS), which divides the ROCF into three hierarchical sets of elements: configural elements, clusters, and details. These sets are scored for different combinations of the following features: presence, accuracy, fragmentation, and placement. Reproductions are also evaluated with respect to planning, organization, size distortion, perseveration, confabulation, rotation, neatness, symmetry, and immediate and delayed retention. These qualities yield 17 initial scores for each of the three reproductions (Copy, Immediate Recall, and 20-30 Minute Delayed Recall). In addition, six summary scores are calculated. According to the professional manual for the BQSS (Stern et al., 1999), the authors were pursuing dual goals: to provide quantitative summary scores and to formulate a system of qualitative ratings. A description of the BQSS and information on its psychometric properties are provided in the manual, based on a standardization
sample of 433 individuals aged 18-94. In addition, Folbrecht et al. (1999) reported good to excellent indices of interrater reliability and internal consistency. Somerville et al. (2000) demonstrated modest correlations between BQSS indices of executive functioning and performance on other tests measuring this domain. The main disadvantage of the BQSS is the complexity and considerable time commitment required to learn it (Boone, 2000). However, the authors report a rapid reduction in total scoring time to 10-15 minutes for all three trials. In addition, the Quick Scoring Guide (QSG) was developed by the authors, which allows rating BQSS scores without reference to the elaborate criteria. A majority of the ratings correlate with comprehensive scores above 0.7. Further review of the advantages and disadvantages of the BQSS compared to the 36-point system is provided by Hartman and Potter (1998). Akshoomoff and Stiles (1995a,b, 2003) applied the BQSS (with some modifications) to explore strategies used by children in ROCF copy and recall. Cahn et al. (1996) and Schreiber et al. (1999) used the BQSS to examine qualitative features of ROCF performance in attention-deficit hyperactivity disorder (ADHD). Javorsky and Stern (1999) demonstrated superiority of the BQSS to the 36-point scoring system in discriminating between dementia of Alzheimer's type and ischemic vascular dementia. Usefulness of the BQSS in discriminating between clinical groups and controls is described in the manual. 17. Fastenau (1996, 2002b, 2003) developed Recognition and Matching trials for the original version of the complex figure, which are to follow the Copy, Immediate Recall, and Delayed Recall trials. This elaborated version was named the Extended Complex Figure Test (ECFT). The Recognition trial contains 30 multiple-choice items, which are classified into Global and Detail scales. The Left-Detail and Right-Detail subscales are incorporated into the
Detail scale. Ten matching items represent a subset of the items used in the Recognition trial. Each item is presented with five multiple-choice options. Psychometric properties of the scales and normative data for adults 30-85 years of age are presented by Fastenau (1996, 2003) and Fastenau et al. (1999). Development of child normative data is presented by Sasher and Fastenau (2001). Applicability of 30-50 year norms for younger adults is discussed by Fastenau (2002a). The ECFT has been published by Western Psychological Services (Fastenau, 2002b). The test manual provides further information on psychometric properties and normative data for this test. A review and critical analysis of different scoring systems are provided in several sources (Chervinsky et al., 1992; Hamby et al., 1993; Knight, 2003; Lezak et al., 2004; Troyer & Wishart, 1997). In summary, a number of different scoring systems for the ROCF and Taylor figure were proposed, which focus on different aspects of reproduction: accuracy, organization, strategy, and style. Due to the variations in scoring systems and differences in operationalization of conceptual aspects of reproduction, results of different studies and normative data reported in these studies should be interpreted with caution.
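The 36-point accuracy ratings summarized in Table 12.2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the names `Accuracy`, `unit_points`, and `total_score` are hypothetical and do not come from any published scoring manual, and the judgment of whether a unit is "distorted" or "placed poorly" is assumed to have been made already by the examiner.

```python
from enum import Enum


class Accuracy(Enum):
    """Examiner's accuracy judgment for one unit (hypothetical labels)."""
    CORRECT = "correct"
    DISTORTED = "distorted"  # distorted or incomplete but recognizable
    ABSENT = "absent"        # absent or not recognizable


def unit_points(accuracy, placed_properly):
    """Points for a single unit, following the ratings in Table 12.2."""
    if accuracy is Accuracy.ABSENT:
        return 0.0
    if accuracy is Accuracy.CORRECT:
        return 2.0 if placed_properly else 1.0
    # Distorted or incomplete but recognizable
    return 1.0 if placed_properly else 0.5


def total_score(ratings):
    """Sum the points over all 18 units; the maximum is 36."""
    if len(ratings) != 18:
        raise ValueError("expected one rating for each of the 18 units")
    return sum(unit_points(accuracy, placed) for accuracy, placed in ratings)


# Example protocol: 10 units correct and well placed, 4 correct but poorly
# placed, 2 distorted but well placed, 1 distorted and poorly placed, 1 absent.
example = (
    [(Accuracy.CORRECT, True)] * 10
    + [(Accuracy.CORRECT, False)] * 4
    + [(Accuracy.DISTORTED, True)] * 2
    + [(Accuracy.DISTORTED, False)]
    + [(Accuracy.ABSENT, False)]
)
print(total_score(example))  # 26.5
```

The sketch covers only the standard accuracy score; the organizational and qualitative systems described above would need their own rules.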
Reliability Of some concern in the use of the ROCF is the issue of interrater reliability. Most clinicians use the E. M. Taylor (1959) scoring criteria, and those familiar with the system are aware of the subjective judgment involved in determination of a "distortion." According to Bennett-Levy (1984), pilot study data have indicated that only by use of very strict or very lenient scoring criteria can adequate interrater reliability be attained. He states that Taylor conveyed to him in a personal communication that he, in fact, employs very strict criteria involving quality of draughtsmanship as well as presence, distortion, and misplacement of figures. However, Bennett-Levy (1984), Berry
and Carpenter (1992), Berry et al. (1991), Boone et al. (1993a), Carr and Lincoln (1988), Delaney et al. (1992), Rapport et al. (1997), and Stern et al. (1994) report respectable interrater reliabilities of 0.80-0.99. Liberman et al. (1994) assessed interrater and intrarater reliabilities of ROCF scoring in a large sample of male boxers. The authors report very high intrarater reliability and good interrater reliability for Copy, Immediate Recall, and Delayed Recall. Tupler et al. (1995) established excellent inter- and intrarater reliabilities for total scores (ranging 0.85-0.97) in a sample of 95 memory-impaired elderly participants. However, corresponding reliabilities for the 18 individual items were highly variable, ranging 0.14-0.96. The authors recommended amplified delineation of relevant decision criteria for scoring of individual items. Interrater reliability for the Savage et al. (1999) scoring method, established on a subsample of 15 drawings obtained by Deckersbach et al. (2000a) in a study on organizational approach to the complex figure in OCD individuals, was moderate to high, with Cohen's κ coefficients ranging 0.69-0.92 for different organizational elements of the figure. Berry et al. (1991) and Rapport et al. (1997) reported adequate internal consistency reliabilities for the standard scoring system. The latter study also reported higher internal consistency for the Denman system in comparison to the standard system. Data on test-retest reliability of the ROCF over a 1-year period were provided by Berry et al. (1991). Based on performance of a sample of 41 elderly "normal" participants, they reported low reliability for the Copy condition but moderate reliability for Immediate Recall and Delayed Recall. Similarly, test-retest reliability coefficients for ROCF performance of elderly participants, reported by Mitrushina and Satz (1991a), were quite low.
Assessed over three annual probes, the coefficients ranged 0.56-0.68 for Copy and 0.57-0.77 for 3-Minute Delayed Recall. Data on repeated administration are also presented by McCaffrey et al. (2000). For further information on the psychometric properties of the ROCF, see Franzen
(2000), Knight (2003), Lezak et al. (2004), and Spreen and Strauss (1998).
Clinical Utility The ROCF assesses visuoperceptual/constructional skills, spatial organizational skills, and visual memory. It has frequently been found to be sensitive to nondominant hemisphere functioning and right temporoparietal area integrity in particular (Binder, 1982; Milner, 1975; Pimental & Ross, 2003; Taylor, 1969; Wood et al., 1982), although some investigators have failed to document this relationship (King, 1981). Some authors have suggested that right hemisphere- and left hemisphere-damaged patients may show different types of ROCF copy and recall failure. On copy, right hemisphere-damaged patients have exhibited distortions, while left hemisphere-damaged patients have produced the design in a piecemeal and fragmented fashion but still frequently produced an adequate copy (Binder, 1982). On recall, right hemisphere-damaged patients have shown distortions and loss of the general organization of the figure (based on Taylor's version), while left hemisphere-damaged patients have shown preservation of the four-square quadrant but loss of details (Wood et al., 1982). Poreh and Shye (1998) showed that right-sided local elements of the ROCF are most useful for discriminating between right and left hemisphere-damaged patients. Kaplan (1988) emphasized the importance of qualitative interpretation of the drawing strategy in the context of the process analysis of a patient's performance. According to Kaplan, both frontal damage and posterior damage might result in poor reproduction of the ROCF. In case of left frontal pathology, patients retain the outer contour and the major structural lines. In case of a right parietal lesion, the breakdown in contour and organization is expected on the side contralateral to the lesioned hemisphere (i.e., the left side of the design). Using quantitative MRI morphometric analysis in patients with localized lesions due to traumatic brain injuries, Bigler (2003) showed that ROCF reproduction requires involvement
of the integrated, multisystem neural mechanisms, which are not limited to the right hemisphere and involve all four lobes of the brain. Specifically, the left hippocampus was shown to make a major contribution to the efficiency of ROCF recall. The author hypothesized that this finding might reflect use of verbal mediation in recall strategies. Regard and Landis (1994) analyzed clinical findings in 37 patients who produced a happy face ("smiley") instead of the detail involving a circle with three dots. The authors concluded that the "smiley" is rare but that its presence is associated with dysfunction of the anterior part of the right hemisphere. Seidman et al. (1995, 1997) found that developmental analysis of the ROCF identifies organizational difficulties related to ADHD, which is associated with developmentally lower levels of copy organization and recall style. However, Reader et al. (1994) did not find lower ROCF performance to be associated with ADHD when using a traditional scoring system. Fujii et al. (2000) reported visuospatial skills to be more salient in predicting copy performance in ADHD adults with high IQ, whereas organizational skills were more predictive in those with low IQ. The authors discussed their findings in light of the overall efficiency of information processing. ROCF productions in ADHD children are further discussed in Conners et al. (2003) and Teknos et al. (2003). Kixmiller et al. (2000) investigated the role of perceptual and organizational factors in ROCF performance in three amnesic groups. They found that the Korsakoff patients' impairment on the copy task is due to the superimposing of visual-organizational difficulties upon visuoperceptual processing deficiencies, whereas diminished copy accuracy in patients with anterior communicating artery aneurysm is related to organizational and behavioral control inefficiencies. 
As for Korsakoff patients' poor recall, visuoperceptual deficits play a role in their immediate recall deficits. A combination of visuoperceptual deficits and severe amnesia may explain their poor delayed recall. Diamond and DeLuca (1996) reported a profound loss of information between the copy and immediate recall conditions in anterior communicating
artery amnesiacs, despite normal limits for copy scores. The authors attributed this loss of information to inadequate encoding, impaired consolidation, or accelerated rates of forgetting. Deckersbach et al. (2000a,b) and Savage et al. (1999, 2000) found problems with organization of the drawing in spite of accurate reproduction of the geometric figure in their sample of individuals with OCD, which implicates frontostriatal dysfunction, based on neuroimaging findings. Organization during the copy condition was a strong predictor of subsequent memory performance in their sample. A role of executive dysfunction in OCD, identified through ROCF performance, is further described by Savage and Otto (2003). In a study by Waber et al. (1994), long-term survivors of childhood acute leukemia recalled fewer organizing-scheme components on the ROCF but more incidental features in comparison to normative expectations. The authors suggest a metacognitive basis for this weakness, rather than a visuoperceptual deficit. Shorr et al. (1992) computed measures of copy accuracy, perceptual clustering, encoding, and savings for 50 neuropsychiatric patients based on their ROCF performance
normal elderly controls, using a scoring system developed by the authors, that divides the ROCF into six perceptual categories: right, left, upper, lower, basic gestalt, and inner detail. They found that demented patients had deficits in all six categories compared to normal controls. In addition, left hemispatial inattention contributed to impaired performance in DAT. Use of the ROCF in dementia is further described by Libon et al. (2003) and Saxton et al. (2003). The ROCF has been extensively used with other clinical groups: schizophrenia (Kalinowski et al., 2003), toxic exposure (Diamond et al., 2003), unilateral neglect (Rapport & Webster, 2003), temporal lobe epilepsy (Barr, 2003), frontal lobe damage (Ruff & Jurica, 2003), brain injury and organic memory impairment (Wilson & Watson, 2003), vascular cerebellar lesions (Greve et al., 2003b), stroke rehabilitation (Greve et al., 2003a), and pediatric closed-head injury (Yeates et al., 2003). Lu et al. (2003) suggested that with recent addition of a recognition memory trial (Meyers & Meyers, 1995b), the ROCF becomes a potentially useful instrument for capturing suspect effort. Waber et al. (1989) studied ROCF recall in children in the context of a dual-code cognitive neuropsychological model. Recall was compared after copying the figure vs. after visual inspection for the two experimental groups, respectively. Fifth-graders who did not copy the design prior to recall remembered its organization better and produced it more configurationally than the other group. However, there was no treatment-group difference for eighth-graders. The authors explain these results from the perspective of complementary functioning of the visual system, which favors configurational processing, and the motor system, which relies on sequential or part-oriented processing.
Therefore, performance style can be indicative of the relative strength of the visual and motor codes, which can be interpreted in terms of neuropsychological referents, sequential motor programming being associated with the left cerebral hemisphere and gestalt pattern perception with the right. In the context of
this theory, the results of this experiment suggested that among preadolescent children the motor aspect interfered with efficient encoding of visuospatial information. An explanation based on differential involvement of cerebral hemispheres in information processing was also offered. Further advancements in dual-code theory are described by Waber (2003). Further review of the clinical utility of the ROCF is provided in Knight (2003), Lezak et al. (2004), and Spreen and Strauss (1998).
Culture-Specific Studies and Normative Data for the ROCF Normative data for the Copy and 10-Minute Delayed Recall conditions, collected on a sample of 300 Hispanic participants stratified by gender, age, and education, are provided by Ponton et al. (1996). Psychometric properties of the ROCF with children in Spanish-speaking populations are examined by Galindo and Cortes (2003). Normative data based on a sample of 624 Spanish-speaking children and adults living in Bogota, Colombia, stratified by age, education, and gender, are reported by Ardila and Rosselli (2003). Guo et al. (2000) investigated the applicability and psychometric properties of the ROCF on a sample of 111 healthy retired Chinese elderly. A subset of 40 participants also completed the Taylor figure. Effects of demographic variables on the Copy and Delayed Recall conditions were explored. No difference between ROCF and Taylor figure performance was found. Good test-retest reliability and construct validity were reported. Normative data for the Copy and 10-Minute Delayed Recall of the ROCF, collected on a sample of 280 normal Italian participants, were reported by Caffarra et al. (2002). Significant effects of age and education on the copying task were reported. Gender affected only the delayed recall performance. The authors reported inferential cutoffs and equivalent scores for the ROCF. Normative data for Canadian children and adults aged ~70+ are provided by Spreen and Strauss (1998).
RELATIONSHIP BETWEEN ROCF PERFORMANCE AND DEMOGRAPHIC FACTORS A robust relationship has been found for ROCF copy and recall performance with age in patients and control participants (Ardila & Rosselli, 1989, 2003; Ardila et al., 1989; Bennett-Levy, 1984; Boone et al., 1993a; Chiulli et al., 1995; Fastenau, 2003; Fastenau et al., 1999; Janowsky & Thomas-Thrapp, 1993; King, 1981; Lannoo & Vingerhoets, 1997; Meyers & Meyers, 1995a; Mitrushina & Satz, 1991a; Mitrushina et al., 1995a; Ostrosky-Solis et al., 1998; Powell, 1979; Rapport et al., 1997; Speers & Ribbler, unpublished manuscript; Visser, 1973), although some negative findings have been reported (Brooks, 1972; Delaney et al., 1992). King (1981) concluded: the strong effect of age . . . indicates a need for caution in the interpretation of copy and recall data from elderly participants. It may be that the complexity of the Rey Figure renders it an unsuitable task for the discrimination of brain-damaged from non-brain-damaged elderly participants. Further data are needed to answer this question. (pp. 642-643)
Tombaugh et al. (1992a) used a multiple-trial, intentional learning procedure to explore recall of the ROCF and Taylor figure in healthy participants 20-80 years old. In their study, participants 60-80 years old scored lower than those 20-59 years old on recall of both figures, which was attributed by the authors to less efficient encoding and retrieval strategies used by older people. In another study, Tombaugh et al. (1992b) used a similar procedure to explore intentional learning of the Taylor figure in neurologically intact participants 20-79 years old. The Taylor figure was presented for observation for 30 seconds, after which participants were to reproduce it from memory within 2 minutes. The procedure was repeated over four acquisition trials, which were followed by a retention trial 15 minutes later, and concluded with copying the figure from the model within 4 minutes. An itemized scoring system was used (Tombaugh, unpublished research).
The results indicated that performance on the Taylor figure is not affected by gender, depression, or education and is not related to performance on the Vocabulary and Block Design subtests of the WAIS-R. An effect of age, however, was evident, with a greater rate of age-related decline on recall than on copy trials. The authors concluded that age has an effect on constructional as well as on memory processes. Hartman and Potter (1998) and Vingerhoets et al. (1998) did not find remarkable age differences in drawing ability or organizational quality of complex figure reproductions. However, robust age differences in recall were reported. Similarly, Mitrushina et al. (1990) did not find differences on the copy condition between 73 young-old (57-70 years) and 80 old-old (71-85 years) participants. However, pronounced differences between these two age cohorts were reported on the 3-minute delayed recall condition. Changes in the accuracy and organizational quality of the copy and recall of ROCF as a function of advancing age were also reported by Chervinsky et al. (1992). Waber and Holmes (1985, 1986) reported maturational changes in children which affect the accuracy and organization of ROCF reproduction in the copy and recall conditions. According to these authors, improvement in accuracy of reproductions is evident up to the age of 9 years, after which accuracy remains relatively stable. In contrast, developmental changes in planning and organization of the reproduction continue into adolescence. Berry et al. (1991) found a significant relationship of ROCF immediate and 30-minute delayed recall with age and education. Scores on the copy condition were not related to age, education, or gender.
The relationship between education and ROCF performance has been equivocal, with some reporting a significant relationship (Ardila & Rosselli, 1989, 2003; Ardila et al., 1989; Berry et al., 1991; Vingerhoets et al., 1998) and others failing to document an association (Boone et al., 1993a; Delaney et al., 1992; Hartman & Potter, 1998; Schreiber et al., 1999; Speers & Ribbler, unpublished manuscript). Fastenau et al. (1999) reported that education explained 2%-3% of variance on copy and
memory trials in their sample of healthy adults. Similarly, Meyers and Meyers (1995a) reported a negligible contribution of education (ranging 0%-5%, with 2% average) to any of the RCFT scores in their standardization sample. Of importance to the clinician is the frequently reported relationship between ROCF performance and intellectual level in both patients and control participants. Hemsley (1974) documented a significant correlation between ROCF percent loss on 40-minute delay and WAIS full-scale IQ (r = - 0.405). He suggested that "it may be necessary to take IQ into consideration when interpreting scores on the Rey-Osterrieth test" (p. 1134). Wood et al. (1982) report significant correlations between ROCF 40-minute delayed recall and WAIS Block Design scaled scores (r=0.628), with minimal associations with WAIS Vocabulary (r= 0.258), Digit Span forward (r=0.079), and Digit Span backward (r = 0.221). They argue that ROCF performance is "dominated by variation in intellectual skills unrelated to memory" (p. 181). A majority of other investigators have found significant correlations between IQ variables and ROCF copy and recall performance (Bennett-Levy, 1984; Boone et al., 1993a; King, 1981; Mitrushina et al., 1989; Powell, 1979; Visser, 1973), although some negative findings have been documented (Speers & Ribbler, unpublished manuscript). In general, no differences in ROCF performance between men and women have been found (Berry et al., 1991; Boone et al., 1993a; Browers et al., 1984; Fastenau et al., 1999; Meyers & Meyers, 1995b). In some studies, men outperformed women, but the amount of variance accounted for by gender has been minimal (Ardila & Rosselli, 1989; Ardila et al., 1989; Bennett-Levy, 1984; King, 1981). Some studies have addressed the relative importance of, or interaction between, ROCF scores and IQ/demographic factors. Boone et al. 
(1993a) documented with stepwise regression analyses a significant association between ROCF scores (copy and delay) and age as well as IQ in a healthy middle-aged and elderly population, whereas gender and education were not predictive of ROCF scores.
REY-OSTERRIETH COMPLEX FIGURE
Similarly, Bennett-Levy (1984) reported that age was strongly related to ROCF copy performance in his young to middle-aged sample, while copy score and age were significant predictors of recall scores; gender and IQ were not found to contribute to test scores once the other variables had been considered. While Boone et al. (1993a) did not find any interaction between age and IQ in their age-restricted sample, their findings considered in conjunction with Bennett-Levy's results may suggest that IQ is a less important factor in ROCF performance during young to middle adulthood but emerges as an important variable only in advanced age. While many studies did not find a relationship between ROCF reproduction and handedness, Weinstein et al. (1990) report an interaction between handedness and academic major in college women. The highest quality of reproduction (in both copy and recall conditions) was seen in left-handed math/science majors, while the poorest quality was demonstrated by familial right-handed non-math/science majors. The authors attributed this difference to an increased ability to coordinate the use of left and right hemisphere processing in high-performing groups. Ardila et al. (1989) found interactions between education and age and between education and gender on ROCF copy scores in their Spanish-speaking sample, with age and gender effects observed primarily in participants with no formal education. However, in a larger sample of participants limited to age 55 and older and classified into 0-5, 6-12, and >12 years of education, while main effects for age, education, and gender on ROCF scores were also documented, no interaction between age, education, and gender was found (Ardila & Rosselli, 1989). Rosselli and Ardila (1991) reported effects of age, educational level, and gender on the scores for both copy and immediate reproduction of the figure. The discrepancy in the Ardila et al. (1989) and Boone et al.
(1993a) reports regarding the importance of education for ROCF performance is probably related to differences in the educational levels sampled. The Boone et al. (1993a) population included only four participants who had completed <12 years of education. It may be that education is a predictor of ROCF scores only at low levels but once a "threshold" is reached, such as completion of high school, it no longer has an influence.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the ROCF normative reports, eight key criterion variables were deemed critical. The first six of these relate to subject variables, and the remaining two relate to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, and recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping the data into limited age intervals. This requirement is relevant since a strong effect of age on ROCF performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the probable relationship between education, especially low levels, and ROCF performance, information regarding education should be provided for each subgroup.
Reporting of Intellectual Levels
Given the relationship between ROCF performance and IQ, information regarding intellectual level should be reported for each subgroup, and preferably normative data should be presented by IQ levels.
Reporting of Gender Composition
Given the possible relationship between gender and ROCF performance in favor of males, information regarding gender composition should be reported for each subgroup.
Procedural Variables
Description of the Scoring System Used
A clear statement of the method used for scoring the ROCF is important because numerous scoring systems are available. In addition, given the rather subjective nature of some of the scoring procedures, information regarding interrater reliability is desirable.
Data Reporting
Means and standard deviations, and preferably ranges, for copy and recall scores are important. Percent forgetting or percent retention could substitute for a recall score mean. However, SDs should be used with caution in evaluating the relative standing of an individual score because ROCF scores are not normally distributed.
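The data-reporting quantities named above can be illustrated with a minimal Python sketch. It assumes that percent retention is the recall score expressed as a percentage of the copy score and that percent forgetting is its complement. It also shows a rank-based percentile as one distribution-free way to express an individual's standing when scores are not normally distributed. The function names and the reference scores are hypothetical and are not taken from any of the studies reviewed here.

```python
from bisect import bisect_left, bisect_right


def percent_retention(copy_score: float, recall_score: float) -> float:
    """Recall expressed as a percentage of the copy score (assumed definition)."""
    if copy_score <= 0:
        raise ValueError("copy score must be positive")
    return 100.0 * recall_score / copy_score


def percent_forgetting(copy_score: float, recall_score: float) -> float:
    """Complement of percent retention (assumed definition)."""
    return 100.0 - percent_retention(copy_score, recall_score)


def percentile_rank(score: float, reference_scores: list[float]) -> float:
    """Percent of reference scores below `score`, counting ties as one half.

    This makes no assumption of normality, unlike a z-score based on the SD.
    """
    if not reference_scores:
        raise ValueError("reference sample is empty")
    ordered = sorted(reference_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical reference sample, for illustration only.
hypothetical_reference = [10, 20, 20, 30]
retention = percent_retention(copy_score=36, recall_score=18)
standing = percentile_rank(20, hypothetical_reference)
```

A rank-based percentile requires access to the raw reference scores; when only means and SDs are published, it cannot be computed, which is one reason the reporting of ranges is described above as preferable.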
SUMMARY OF THE STATUS OF THE NORMS In terms of subject variables, not all studies have provided data according to age intervals, although all data sets do indicate the mean age of the sample. Only a few studies have provided data for elderly participants. Several studies have reported IQs or IQ estimates. However, only one publication provided data by IQ intervals. Mean educational level was indicated in many studies, but only a few publications reported data by educational level or in age-by-education cells. Information on gender composition of the samples was documented in many studies, and several
publications provided data for males and females separately. Several data sets had total sample sizes >50; however, a few studies did not have adequate medical and psychiatric exclusion criteria. Information on geographic recruitment area was present in the majority of data sets. Ethnicity, handedness, and occupation were less commonly reported. In regard to procedural variables, all studies reported either mean raw scores or mean percent retention and almost all reports provided SDs. Several studies presented copy scores only or recall performance only. Although documentation of copy strategy through use of colored pencils is commonly employed, only a few investigators have reported qualitative information on the strategy features of ROCF copy or delay. Similarly, only a few reports have indicated that ROCF performance was timed, and only a few of these reported the data on performance time. Number and types of error for copy and recall were described in a few studies. The length of time elapsed before delayed recall varied widely: immediate recall, 3-minute delay, 10-minute delay, 20-minute delay, 30-minute delay, 40-minute delay, 45-minute delay, and 24-hour delay. In only some studies were the precise scoring systems described. Only a few studies reported interrater reliability data for ROCF scores and internal consistency indices. Among all the studies available in the literature, we selected for review those based on well-defined samples or that offered some information not routinely reported. A majority of authors provide data based on the Osterrieth-Taylor scoring system. Any deviations from this standard reporting are specified in the context of each table. Scores based on the BQSS are not reproduced in the tables for this chapter. The clinician is urged to pay close attention to the sequence of trials (e.g., administration of the immediate recall prior to the delayed recall trial) and the length of the delay interval (e.g., 3- vs. 
30-minute delay) as these factors have a notable effect on rate of recall (see above). In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of
study descriptions contains references to the corresponding tables identified by number in Appendix 12. Table A12.1, the locator table, summarizes information provided in the studies described in this chapter.¹
¹Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
SUMMARIES OF THE STUDIES
[ROCF.1] Powell, 1979 (Table A12.2)
The authors report data on 64 Londoners as part of an examination of the relationship between ROCF performance and IQ and verbal memory performance. The 64 "normal" participants were part of a sample of 150 right-handed patients referred for neurological screening but "confirmed as having no brain damage on the usual physical tests such as the EMI scan" (p. 336). Twenty-one of the 64 participants were female, and mean age was 41.0 (14.05). Mean Verbal and Performance IQs (VIQ, PIQ) prorated from scores on the Comprehension, Similarities, Vocabulary, Block Design, and Object Assembly subtests were 107.70 (16.80) and 83.70 (21.55), respectively. Means and SDs for percent retention of the figure following a 40-minute delay compared to original copy scores are reported. Significant correlations were noted in the sample as a whole between ROCF percent retention score and Block Design (r=0.51), Object Assembly (r=0.50), Digit Span (r=0.28), Comprehension (r=0.39), Similarities (r=0.32), Vocabulary (r=0.27), PIQ (r=0.38), VIQ (r=0.30), and age (r=-0.33). Powell concludes that ROCF performance is more associated with PIQ than VIQ. He cautions that the ROCF "must be viewed in the light of the patient's intelligence . . . [and] in relation to the patient's age, since older people score lower" (p. 339). Powell describes how the clinician can derive an expected memory score based on IQ from a regression equation, which can be used to determine if the patient's actual memory score is worse than expected.
Study strengths 1. Information on IQ, age, gender distribution, geographic recruitment area, and handedness is provided. 2. Relatively large sample size. 3. Means and SDs are reported.
Considerations regarding use of the study 1. While no objective evidence of brain dysfunction was found on lab tests, patients were apparently suspected of having dysfunction; hence the referral for neuropsychological testing. Probably a sizable percent had some type of subtle brain dysfunction not detected by the standard diagnostic laboratory measures. The significantly lower PIQ corroborates that this group is probably not "normal." 2. Undifferentiated age range. 3. Lack of data regarding education. 4. Lack of data regarding copy scores or raw recall scores. 5. The data were collected in England and may have limited value for use in this country. 6. No information on scoring system or interrater reliability.
[ROCF.2] King, 1981 (Table A12.3)
The authors obtained ROCF data on 71 Canadian controls as part of a study on the effects of lateralized nonfocal brain dysfunction and age on the ROCF and the relationship between ROCF and Wechsler Memory Scale Visual Reproduction performance in a sample of 185 participants. Controls consisted of healthy volunteers or patients with "nonneurological or psychiatric conditions." Mean age, years of education, and WAIS FSIQ were 39.6 (21.4), 11.4 (2.9), and 104.5 (18.1), respectively. Participants copied the ROCF and drew it from memory following a 40-minute delay, during which time verbal tasks were administered. Copy and recall scores were obtained using the E. M. Taylor (1959) guidelines. A percent recall score was also calculated by multiplying the ratio of recall to copy score by 100. Means and SDs for copy, recall, and percent recall are reported for three age groups:
<30 (n=36), 30-60 (n=17), and >60 (n=18). Significant correlations were obtained between ROCF scores (copy, recall, and percent recall) and age and IQ in the patients and controls as a whole. In addition, for the whole sample, males were found to perform significantly better than females on recall and percent recall; however, the amount of variance accounted for was negligible.
Study strengths 1. Data are presented by age groupings. 2. Information regarding IQ, education, and geographic recruitment area is provided. 3. Specification of scoring system. 4. Means and SDs are reported.
Considerations regarding use of the study 1. An unspecified number of participants were medical or psychiatric patients. 2. The data were collected in Canada, and their usefulness for an American sample is unclear. 3. No information on interrater reliability. 4. Below average educational level. 5. No information regarding gender distribution. 6. There is an apparent error in the data presentation table: the recall score for controls <30 years reads 0.0. (Our calculations suggest it should read .0.) 7. The sample sizes for the individual age groupings are small.
[ROCF.3] Huhtaniemi, Haier, Fedio, and Buchsbaum, 1983 (Table A12.4)
Four hundred male college students were screened on a measure of vigilance, the Continuous Performance Test (CPT). Two groups were identified for further study: (1) a good attention group (upper 5% of CPT se
deficits similar to those seen in patients with bilateral and diffuse cortical damage.
Study strengths 1. A nonclinical sample of participants with good or poor attentional ability was studied. 2. Specification of scoring system. 3. Information on gender, age, and education is provided. 4. Restricted age range. 5. Means and SDs are reported.
Considerations regarding use of the study 1. All participants were male. 2. Demographic variables for each group are not provided. 3. High intellectual level of the participants. 4. No exclusion criteria. 5. No information on interrater reliability. 6. Small sample size.
[ROCF.4] Bennett-Levy, 1984 (Table A12.5)
The authors collected ROCF data on 107 English volunteers as part of a project to develop a quantified technique of scoring copy strategy and assessing its relationship with recall. Forty-five participants were hospitalized patients tested prior to nonemergency surgery; 62 were auto-assembly line workers. Seventy-six were male and 31 were female. Exclusion criteria were history of head injury or epilepsy. Mean age was 29.3 (9.3), range 17-49, and mean estimated IQ based on the Schonell Graded Word Reading Test and the New Adult Reading Test was 104.9 (7.6). Participants were instructed as follows: "Copy the figure as accurately as [possible] ... While the subject copied the figure, the experimenter made a note of every line drawn in sequence . . . When the subject indicated that she had finished the copy, the figure and the drawing were removed from sight; 40 minutes later, the experimenter said to the subject: 'You remember that drawing you copied for me. I would now like you to recall as much of it as you possibly can.' When the subject first indicated that s/he could recall no more, the experimenter said: 'Give yourself a little more time. I always say to people to give themselves some more time.' In a number of cases, this procedure resulted in one further detail being recalled. When the subject indicated that s/he had finished, this was accepted." (p. 110)
Bennett-Levy developed precise scoring criteria for the copy and recall of the figure based on Osterrieth's (1944) and E. M. Taylor's (1959) outline. He noted that his scoring system is "almost certainly more stringent than those customarily used by clinical psychologists." He opted for strict scoring criteria to reduce the presence of ceiling effects on the copy trial; a majority of the protocols would have achieved the maximum possible score if more lenient criteria had been used. Bennett-Levy developed a more lenient scoring system in addition to his strict criteria for the recall trials, reasoning that memory scores should not be penalized due to sloppiness or imprecision of the reproduction when it was clear that the details were in fact accurately remembered. Interrater reliabilities of 0.96 and 0.98, respectively, for the "strict" and "lax" scoring systems were obtained on 25 randomly selected recall protocols. Bennett-Levy also developed scores describing "good continuation" (when a straight line is drawn as one piece until its final intersect with another line), "symmetry" (successive construction of symmetrical units), and "strategy total" (sum of good continuation and symmetry scores). Means and SDs for copy score, strict and lax recall, copy time, symmetry, good continuation, and strategy total are reported for males and females separately and for the sample as a whole. Multiple regression analyses revealed that strategy, copy time, and age were the primary determinants of copy scores, while strategy, copy score, and age were the principal predictors of recall. IQ and gender were significantly associated with copy and recall performance but did not provide a unique contribution to prediction once the other variables were considered. Bennett-Levy provides a regression equation to allow prediction of lax recall from strategy total and age. 
He suggests that a large discrepancy (i.e., >2 standard errors) between predicted and observed recall scores in favor of predicted score would argue for the presence of a significant memory impairment. He further cautions that failure
to consider strategy and age and reliance solely on normative recall scores might lead to an inaccurate interpretation of memory impairment.
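The comparison of predicted and observed recall described above can be sketched in Python as follows. This is only an illustration of the logic: the linear form, the coefficient values, and the standard error used in the example are hypothetical placeholders, not Bennett-Levy's published equation, which should be taken from the original report.

```python
def predicted_recall(strategy_total: float, age: float,
                     intercept: float, b_strategy: float, b_age: float) -> float:
    """Linear prediction of lax recall from strategy total and age.

    All coefficients are supplied by the caller; none are published values.
    """
    return intercept + b_strategy * strategy_total + b_age * age


def recall_below_expectation(observed: float, predicted: float,
                             standard_error: float,
                             threshold: float = 2.0) -> bool:
    """True when observed recall is more than `threshold` standard errors
    below the predicted score (the direction "in favor of predicted score")."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return (predicted - observed) / standard_error > threshold


# Hypothetical placeholder coefficients and standard error, illustration only.
expected = predicted_recall(strategy_total=10, age=30,
                            intercept=20.0, b_strategy=1.0, b_age=-0.5)
flagged = recall_below_expectation(observed=8.0, predicted=expected,
                                   standard_error=3.0)
```

The check is deliberately one-sided: an observed score above the predicted score is never flagged, which matches the text's emphasis on discrepancies in favor of the predicted score.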
Study strengths 1. Careful specification of a scoring system, including quantification of drawing strategy, and information on interrater reliability. 2. Specification of a regression equation which can provide a predicted recall score to be used for comparison with actual test scores. 3. Relatively large overall sample size. 4. Information on IQ, gender, age, occupation, and geographic recruitment area is provided. 5. Presentation of data by gender. 6. Information on time to complete the drawing. 7. Means and SDs are reported.
Considerations regarding use of the study 1. A unique scoring system, which is much more complicated than the Taylor system and not in wide usage. 2. Use of hospitalized orthopedic patients for nearly half of the subject sample. 3. Undifferentiated age range. 4. Minimal exclusion criteria. 5. No information on educational level. 6. The data were collected in England, and their usefulness for an American sample is unclear.
[ROCF.5] Speers and Ribbler, Unpublished Manuscript (Table A12.6)
The authors report data on 40 (20 male, 20 female) normal participants tested as part of a study on rates of loss for newly learned information over a 24-hour delay. Participants were recruited from staff and visitors at a rehabilitation hospital in California. Those with abnormal neurological histories were excluded, as were those with signs of language, motor, or visual disability. Mean age was 35.00 (10.79), range 23-70, and mean years of education was 16.15 (2.77), range 10-22. IQ estimates obtained through the Quick Test revealed a mean IQ of 107.93 (8.73), range 87-123.
The Rey figure was modified to include an additional detail in the left lower quadrant. Participants copied the design and then drew it from memory immediately and at delays of 30 minutes and 24 hours. They were not informed that they would be requested to recall the figure again after the immediate recall attempt. Participants were kept occupied during the 30-minute delay interval with a questionnaire and IQ test items; the 24-hour delay period was unstructured. Time to copy the figure averaged 3 minutes 26 seconds. The authors developed a scoring system involving a 0-2 rating scale for 50 separate scoring units, which resulted in a total possible score of 100. Approximately 12% of the information could not be retrieved on immediate recall, with additional!% losses noted on 30-minute and 24-hour delayed recall. No mean scores or SDs are reported; however, rates of forgetting over the three delay periods are plotted in a graph format, and the modal reproductions of the figure at copy, immediate, and 30-minute and 24-hour delays are pictured. The investigators cite a significant negative correlation between amount of information loss and age (r=-0.39) but found no significant relationship between IQ (r=-0.12) or education (r=0.09) and retention of the figure.
Study strengths 1. Presentation of data regarding three different recall intervals (although earlier recall trials no doubt influenced later recall trials to some unknown extent and, therefore, the data on 30-minute and 24-hour delays should not be used as a comparison reference for participants who were not administered the earlier recall trials). 2. Information on IQ estimates, age, gender distribution, educational level, and geographic recruitment area is provided. 3. Adequate exclusion criteria.
Considerations regarding use of the study 1. Modification of the stimulus figure. 2. A unique scoring system, not in common usage, was employed. 3. Lack of mean scores and SDs for recall conditions. 4. Undifferentiated age range. 5. High mean educational level. 6. No data on interrater reliability. 7. Sample size is relatively small.
[ROCF.6] Ardila, Rosselli, and Rosas, 1989 (Table A12.7)
The authors obtained ROCF data on 200 "normal," Spanish-speaking, right-handed Colombians as part of their assessment of neuropsychological functioning in illiterates. Inclusion criteria for illiterate participants were as follows: (1) total illiteracy, (2) illiteracy due to lack of opportunity to attend school, (3) no current or past neurological or psychiatric history or sensory or motor impairment, (4) adequate performance in activities of daily living. All were of low income, and occupations included factory or construction workers, maids, cooks, or homemakers. High-education participants were recruited to match the illiterates on age and gender. Participants aged 16-25 were students and had at least 10 years of education; participants over age 25 had at least 17 years of formal education. No exclusion criteria are reported. The E. M. Taylor (1959, as reported in Lezak, 1983) scoring criteria were employed. Copy and immediate recall data were obtained, although only the means for ROCF copy in gender-by-age-by-education cells are reported; age intervals were 16-25, 26-35, 36-45, 46-55, and 56-65, and each cell had 10 participants. Significant education, gender, and age main effects were documented for copy and immediate recall performance, with better performance associated with males, younger age, and higher education. In addition, significant interactions on copy score were noted for education and age, with lower scores primarily observed in the illiterate, rather than the educated, participants over age 55. Similarly, a significant education and gender interaction was documented, with men outperforming women only in the illiterate group. No significant interactions were observed for recall performance.
Study strengths 1. Provided data on a Spanish-speaking population. 2. Provided data on ROCF performance in illiterates. 3. Large overall sample size. 4. Data are partitioned in age-by-education-by-gender cells. 5. Scoring system is specified. 6. Information regarding gender, occupation, handedness, and geographic recruitment area is provided.
Considerations regarding use of the study 1. Small sample size in individual cells. 2. ROCF scores for immediate recall are not presented. 3. No SDs are reported. 4. No information on interrater reliability. 5. No information on exclusion criteria in educated participants. 6. No IQ data available. 7. Data were collected in Colombia, and their usefulness for an American sample is unclear.
[ROCF.7] Van Gorp, Satz, and Mitrushina, 1990 (Table A12.8)
The authors collected data on 156 healthy elderly adults, aged 57-85, living independently in a retirement community in southern California as part of their investigation of neuropsychological processes in normal aging. Exclusion criteria were history of neurological or psychiatric disorder or substance abuse. Thirty-nine percent (n=62) of the sample were male and 61% (n=94) were female. Mean years of education were 14.1 (2.9), and mean WAIS-R (Satz-Mogel format) FSIQ, VIQ, and PIQ were 117.21 (12.59), 118.77 (13.27), and 110.74 (13.07), respectively. Mean scores and SDs are provided based on the Taylor (1959) scoring system for the copy and 3-minute delayed recall of the ROCF for four age intervals: 57-65 (n=28), 66-70 (n=45), 71-75 (n=57), and 76-85 (n=26). Mean VIQ for each age interval ranged 114.8-122.9, and mean PIQ ranged 101.~115.1. No differences in VIQ, PIQ, or education were found between older and younger participants (<70 vs. >70).
Study strengths 1. Relatively large sample size, with some individual age groupings approximating 50. 2. Organization of the data into relatively small age intervals. 3. Adequate exclusion criteria. 4. Specification of scoring system. 5. Information on IQ, education, gender, and geographic recruitment area is provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Very high intelligence and educational level of the sample. 2. No information regarding interrater reliability.
[ROCF.8] Berry, Allen, and Schmitt, 1991 (Table A12.9)
The authors collected ROCF data on 107 (55 male, 52 female) elderly Caucasian participants from Kentucky, aged 50-79, as part of their evaluation of the psychometric properties of the ROCF and Taylor figure. Mean age was 65 (8.6), and participants were recruited from newspaper ads, flyers at senior-citizen centers, and a volunteer subject pool. Exclusion criteria were history of cardiac, neurological, or psychiatric disease or use of psychoactive medication. All participants underwent a physical exam by a physician and electroencephalographic (EEG) evaluation by a neurologist. Ten recruits were excluded due to the discovery of previously undetected diseases. Participants were provided with blank sheets of 8½" x 11" paper and told to draw the complex figure as best they could. Without being forewarned, participants were instructed to draw the design from memory immediately after and 30 minutes later. The period between immediate and delayed recall was occupied with verbal testing. The scoring system was based on the E. M. Taylor (1959) protocol, with the following exceptions: (1) distorted but properly placed details or correctly reproduced but misplaced details received 1.5 points (rather than 1 point, as in the Taylor system) and (2) severely misplaced but correctly drawn details or severely distorted but correctly placed details received 1.25 points (rather than 1 point, as in the Taylor system). Significant correlations were noted between the revised and Taylor systems (copy,
r=0.82; immediate, r=0.93; delay, r=0.96); scores for the revised system averaged 2 points higher. Interrater reliability quotients for the revised scoring system on 87 of the protocols were 0.80, 0.93, and 0.96 for the copy, immediate, and delayed recall trials, respectively; and the most experienced rater's scores were used for data analysis. In a subgroup of 54 participants, alternate form equivalence was assessed by administering either the ROCF or Taylor figure in a morning testing session; in the afternoon, the version not given in the morning was administered. Interrater reliability for 37 Taylor protocols was comparable to that documented with the ROCF (copy, r=0.84; immediate, r=0.97; delay, r=0.93), and the most experienced rater's scorings were used for data analysis. Test-retest reliability was evaluated in a subset of 41 participants who were retested 1 year after original testing. The authors conclude that ROCF copy scores were not reliable, with moderate reliability noted on immediate and delayed recall. However, examination of mean scores suggests that there was not a clinically significant change in scores over 1 year (e.g., means differed by 0.3-1.0 point). Means and SDs are reported for copy, immediate recall, and 30-minute delayed recall for the ROCF and Taylor figure for participants as well as for baseline and 1-year follow-up ROCF scores in the subset of 41 participants. Data on internal consistency, construct validity, and criterion-related validity are also reported. Significant correlations between ROCF scores and age and education were documented for immediate and delayed recall but not for copy scores; gender was not significantly related to any ROCF scores.
Study strengths 1. Relatively large sample size. 2. Information regarding age, education, gender, ethnicity, recruitment procedures, and geographic area is provided. 3. Information regarding interrater reliability is provided. 4. Well-specified exclusion criteria. 5. Specification of scoring system. 6. Data on the Taylor figure as well as comprehensive psychometric information for the ROCF. 7. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range spanning three decades. 2. Even though the total sample size is large (n=107), it appeared that the reported means and SDs were based on sample sizes of either 54 or 41 but that the information on mean age and education was derived from the whole sample. 3. Idiosyncratic scoring system. 4. High mean educational level. 5. No information regarding IQ. 6. The baseline data on the subset of 41 participants on whom the 1-year retest data were obtained have mean recall scores 5 points below the sample of 54 participants, suggesting that the subset was not randomly chosen and was not representative of the larger sample.
[ROCF.9] Tombaugh and Hubley, 1991 (Tables A12.10, A12.11)
The authors assessed the comparability of scores obtained on the ROCF and Taylor figure in copy and recall conditions. Four studies were reported. Study 1 used an incidental learning paradigm, in which participants were not informed that recall would follow the copy condition. Participants were 64 undergraduate students enrolled in a third-year psychology course who were randomly assigned to copy the ROCF or Taylor figure. Participants reported no history of head injury producing loss of consciousness, neurological dysfunction, or current use of psychoactive medication. Participants were allowed 5.5 minutes to copy the complex figure, after which they were instructed to reproduce it from memory. After a 4-minute delay filled with a nonpictorial task, the second recall condition was offered. Twenty minutes later, following completion of verbal learning tasks, participants were administered the third recall condition. A 2-minute time limit was imposed on all recall trials.
REY-OSTERRIETH COMPLEX FIGURE
Itemized scoring systems allowing a total of 72 points were used for scoring of both figures (Denman, 1984, system for ROCF; Tombaugh, unpublished research, system for Taylor figure). The authors concluded that the degree of accuracy on the copy condition was comparable for the two figures. However, all recall trials yielded significantly higher scores for the Taylor figure (Table A12.10).
Study 2 used another sample to replicate the above results and to explore the effect of scoring system on the comparability of scores for the two figures. Participants were 67 undergraduate students enrolled in a third-year psychology course who were randomly assigned to copy the ROCF or Taylor figure. The procedure was similar to that used in study 1, with the exception of omission of the 4-minute delayed trial and inclusion of a 1-month delayed trial (which was based on the data for 52 participants). In addition, the time required for copy and reproduction of each figure was recorded. Reproductions were scored according to the itemized system and the original Osterrieth-Taylor system. Results were consistent with study 1 in that both figures were comparable on the copy but not on the recall condition, irrespective of scoring system. No forgetting was demonstrated between the immediate and 20-minute delayed recall trials; however, a substantial decline in scores, equivalent for both figures, was seen over the 1-month interval.
Study 3 used a modified size of the base rectangle of the ROCF to equate it to the base square of the Taylor figure (total area of both figures = 64 cm²). Itemized scoring systems were used. Results indicated that differences in recall scores could not be attributed to differences in size of the structural components of the two figures.
Study 4 modified the administration of the figures to explore the effect of intentional learning and difference in time of exposure on recall efficiency.
Seventy-two students enrolled in a first-year psychology course were instructed to study figures for a specified interval of time (15, 30, and 60 seconds) in order to reproduce them from memory after exposure. Six learning trials were used, followed by a 20-minute delayed recall trial and a copy trial. A maximum of 2 minutes was allowed for all memory trials and 4 minutes for the copy trial. Itemized scoring systems were used. The results suggested that the difference in performance was most pronounced with the 60-second presentation interval. The authors inferred that reported differences in recall efficiency of the two figures reflect a lower degree of learning with the ROCF in comparison to the Taylor figure.
Study strengths 1. The authors systematically explore the effect of different variables on performance on two figures. 2. Sample sizes are sufficient for these homogeneous samples. 3. Minimally adequate exclusion criteria. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Age range and demographic characteristics of the samples are not reported; age range is probably sufficiently narrow. 2. Itemized scoring systems are not sufficiently described, and no information is reported on interrater reliability. 3. High educational level (third year of college).
[ROCF.10] Berry and Carpenter, 1992
(Table A12.12)
The authors report the rate of recall of the ROCF over four different delay periods (15, 30, 45, and 60 minutes) in older persons in Kentucky. A sample of 60 participants was divided into four groups of 15, which were equivalent in age, gender, and education. All participants were volunteers who were in good health with no active illness, no history of neurological or psychiatric illness, and no current use of psychoactive medication. Mini-Mental State Exam (MMSE) scores were >24, with a mean of 28.5 (1.7). The sample consisted of 31 males and 29 females. All participants were white, and 10% of the sample were left-handed. All participants were administered copy and immediate recall trials with no time limits, which were followed by one of the four delay durations. Timing of the delay started from completion of the copy trial. During the delay periods, participants were administered other neuropsychological tests of a verbal nature. Each protocol was scored for accuracy by two independent raters using the system described by Berry et al. (1991). Interrater reliability for the three trials was as follows: copy, r = 0.95; immediate recall, r = 0.98; delayed recall, r = 0.99. Scores for data analysis represent the average of final scores assigned by two raters for each protocol. The results revealed no significant effect of delay period on recall. Scores on immediate and delayed-recall trials were similar. The authors inferred that most forgetting occurs very quickly, as a result of "overloading" working memory.
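The two scoring steps described for this study (correlating two raters' scores, then averaging them for analysis) can be sketched in a few lines of Python. This is only an illustration: it assumes the reported r values are Pearson correlations, the function names are hypothetical, and the scores are invented rather than taken from Berry and Carpenter (1992).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a rater's scores do not vary")
    return sxy / sqrt(sxx * syy)

def averaged_scores(rater_a, rater_b):
    """Final score per protocol: the mean of the two raters' scores."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]

# Hypothetical copy-trial scores (36-point scale) from two independent raters
rater_a = [33.0, 30.5, 35.0, 28.0, 31.5]
rater_b = [32.5, 31.0, 35.0, 27.0, 32.0]

r = pearson_r(rater_a, rater_b)             # interrater reliability
final = averaged_scores(rater_a, rater_b)   # scores carried into analysis
print(f"interrater r = {r:.2f}")
```

Averaging the two raters, rather than keeping only one rater's scores, reduces the influence of any single rater's idiosyncrasies on the reported means.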
Study strengths 1. Sample composition is well described in terms of age, ethnicity, gender, educational level, handedness, and geographic location. 2. Adequate exclusion criteria were used. 3. Interrater reliability and scoring system are reported. 4. Means and SDs are reported. 5. Age range is probably sufficiently narrow.
Considerations regarding use of the study 1. While overall sample is adequate, individual sample sizes are small. 2. High educational level. 3. No IQ information is reported.
[ROCF.11] Delaney, Prevey, Cramer, and Mattson, 1992 (Table A12.13)
This study addressed the comparability of the ROCF and Taylor figure in a nonpatient sample and is based on the control sample data collected as part of a large study carried out in various locations of the United States on the effect of anticonvulsant medications on memory functioning. Participants were free of neurological and psychiatric disorders or current drug history. Ages ranged 22-61 years and education, 6-16 years.
The ROCF and Taylor figure were administered in the same order to all participants, which is consistent with the order used in clinical practice. The time interval between administration of the two figures was approximately 1 month. Three conditions were administered for both figures: copy, immediate recall, and 20-minute delayed recall (delay filled with nonvisuospatial tasks). Reproductions were scored according to the standard criteria. Interrater reliability based on scoring of 10 samples by two experienced neuropsychologists was 0.91. Correlations with age (-0.11 to -0.26) and education (-0.01 to 0.20) were relatively low. The authors concluded that performance on the copy condition for both figures was nearly identical; however, participants performed significantly better on the Taylor figure on both recall conditions.
Study strengths 1. Information on interrater reliability is provided. 2. Information regarding age, education, and geographic area is provided. 3. Information on alternate form is provided. 4. Sample size approximates 50. 5. Minimally adequate exclusion criteria. 6. Means and SDs are reported.
Considerations regarding use of the study 1. The data are not broken down by age. 2. SDs for age and education are not reported. 3. No information regarding IQ or gender.
[ROCF.12] Kuehn and Snow, 1992 (Table A12.14)
The study explored the comparability of the ROCF and Taylor figure in a clinical sample. Participants were 38 Canadian patients referred for neuropsychological assessment for various forms of brain damage. Patients who were unable to draw a Greek cross or who had been administered either figure previously were excluded from the study. Mean age was 46.7 years. The procedure consisted of copying each figure with a lead pencil, followed by 40-minute delayed recall (without forewarning). Approximately 3 hours elapsed between administration of the two figures, during which time tests involving drawings or visual memory were not administered. The two figures were presented in a counterbalanced order. The standard scoring systems were used for both figures. Percent recall was calculated. The authors concluded that performance on both figures was equivalent for copy and recall scores. Percent recall scores, however, were higher for the Taylor figure when it was administered first.
Study strengths 1. Scoring system is specified. 2. Information on gender, age, education, IQ, and geographic area is provided. 3. Information on alternate form is provided. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Data are not broken down by age group. Age range is not specified. 2. The two groups, used for counterbalancing, are not comparable in education but are comparable in IQ. 3. Clinical sample; no exclusion criteria. 4. No information on interrater reliability. 5. Small sample size. 6. Data were collected in Canada and may be problematic for use in the United States.
[ROCF.13] Boone, Lesser, Hill-Gutierrez, Berman, and D'Elia, 1993b (Table A12.15)
The investigators collected data on 91 fluent English-speaking healthy older adults recruited in southern California through newspaper ads, flyers, and personal contacts as part of their investigation of the effects of age, IQ, education, and gender on ROCF performance. Exclusion criteria were current or past history of major psychiatric disorder or alcohol or other substance abuse, neurological illness, and significant medical illness which could affect central nervous system function (e.g., uncontrolled hypertension or diabetes). In addition, potential participants were rejected if they had abnormal findings on neurological examination, metabolic disturbances detected with laboratory tests, or abnormal findings on EEG or MRI. The final sample included 34 males and 57 females. Seventy-one participants were Caucasian, 10 were African American, five were Asian, and five were Hispanic. Mean educational level was 14.5 (2.5) years, and mean WAIS-R FSIQ (Satz-Mogel format) was 115.9 (13.0). Participants were instructed to copy the figure onto a blank paper "as carefully as you can without tracing." Performance was not timed, and participants were allowed to make erasures. Following a 3-minute verbal fluency task and without forewarning, participants were instructed to draw what they could remember of the figure on a second sheet of blank paper. The E. M. Taylor (1959) scoring system was employed. Means and SDs are reported for copy scores and percent retention for three age groupings (45-59, 60-69, and 70-83) and four FSIQ levels (90-109, 110-119, 120-129, and 130-139). Interrater reliability between two experienced neuropsychologists was 0.82 for copy and 0.93 for delay. In regression analyses, a relatively small but significant percent of the variance in ROCF performance was associated with age and FSIQ; gender and education were not predictive of ROCF scores. In addition, ROCF copy score was not associated with delay score or percent retention. Significantly poorer ROCF scores did not emerge until age 70 and older, and individuals of average IQ showed a trend toward poorer performance on ROCF delay relative to participants falling in the very superior intelligence range. No interaction effects between age and FSIQ were observed. The number and type of errors committed on copy and recall are summarized.
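Percent retention is not defined in this summary. A common convention, assumed here purely for illustration, expresses the recall score as a percentage of the copy score; the function name and the scores below are hypothetical.

```python
def percent_retention(copy_score, recall_score):
    """Recall score expressed as a percentage of the copy score (assumed definition)."""
    if copy_score <= 0:
        raise ValueError("copy score must be positive")
    return 100.0 * recall_score / copy_score

# Hypothetical scores on the 36-point scale: copy = 32, recall = 16
print(percent_retention(32.0, 16.0))  # 50.0
```

Because the recall score is scaled by the individual's own copy score, the measure separates memory loss from poor initial construction of the figure.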
Study strengths 1. Information regarding education, gender, geographic recruitment area, ethnicity, and recruitment procedures is provided. 2. Rigorous exclusion criteria. 3. Data are presented by age and IQ groupings. 4. Scoring system is specified, and information on interrater reliability is provided. 5. Information regarding error number and type for copy and delay is provided. 6. Large overall sample size, although individual cells all fall short of 50. 7. Means and SDs are reported.
Consideration regarding use of the study 1. High intellectual and educational level. Other comments 1. For participants older than 74, age-corrected FSIQs were based on Ryan et al. (1990) tables. [ROCF.14] Chiulli, Haaland, LaRue, and Garry, 1995 (Table A12.16)
The study explored rates of decline in ROCF performance after age 70. Participants were 153 healthy elderly individuals aged 70-93, living independently, who participated in the New Mexico Aging Process Study, which explores nutrition and aging. Persons with serious medical illnesses or taking prescription medications were excluded. The sample was partitioned into three age groups. The ROCF was administered as part of a brief battery of psychological tests. Standard administration and scoring procedures were used. A copy condition was followed by immediate and 30-minute delayed recall. If the reproduction started with the drawing of the large rectangle, the approach was categorized as "configural." All other approaches were determined to be "nonconfigural." All protocols were checked by a second, blind evaluator. The results revealed a significant main effect for age group. Accuracy was greatest in the copy condition but did not differ between the immediate and delayed recall conditions. The most pronounced decline in performance was demonstrated between the first and second groups; performance of the second group did not differ considerably from that of the third. No gender effects were evident. The number of participants using the configural approach did not differ significantly for the three age groups.
Study strengths 1. Data for an elderly sample are partitioned into three age groups. 2. Relatively large sample size, and individual cells approximate 50. 3. Administration system is specified. 4. Exclusion criteria are specified. 5. Information on education, gender, and geographic recruitment area is reported. 6. The study assessed strategy used in approach to drawings. 7. Means and SDs are reported.
Considerations regarding use of the study 1. High educational level. 2. Data were checked by a blind evaluator, but no information on interrater reliability is provided. 3. No information on IQ.
[ROCF.15] Meyers and Meyers, 1995a (Table A12.17)
The study explored the effect of different administration procedures on the rate of recall of the ROCF. Participants were undergraduate students from a college in Iowa and had no prior history of head injury, drug abuse, learning disability, or psychiatric illness. Participants were randomly assigned to one of four groups, each of which received a different combination of trials (30 participants in each group). There was no significant difference between the groups on age, gender, or education. Reproductions were scored according to the system developed by Meyers and Meyers (1992), which is based on the standard scoring system with the addition of a 1/4" rule for misplacement and a 1/8" rule for drawing errors. In addition, the authors used a recognition trial (Meyers & Lange, 1994). The authors suggest use of a 3-minute recall instead of immediate recall due to its higher correlation with the 30-minute recall.
Study strengths 1. Scoring system is described. 2. Sample composition and demographic characteristics are described, as well as geographic area. 3. Overall sample size is large (n = 120), although individual groupings are relatively small. 4. Adequate exclusion criteria. 5. Means and SDs are reported. 6. Age grouping is suitably restricted.
Consideration regarding use of the study 1. No information regarding interrater reliability or IQ.
[ROCF.16] Ponton, Satz, Herrera, Ortiz, Urrutia, Young, D'Elia, Furst, and Namerow, 1996 (Table A12.18)
The ROCF was administered to Spanish-speaking volunteers as part of a larger battery in a project designed to provide standardization of the Neuropsychological Screening Battery for Hispanics (NeSBHIS). Volunteers were recruited through fliers and advertisements in community centers of the greater Los Angeles area over a period of 2 years. Exclusion criteria were a history of neurological or psychiatric disorder, drug or alcohol abuse, and head trauma. Data for a sample of 300 participants with a median educational level of 10 years were analyzed. Participants ranged in age 16-75 years, with a mean of 38.4 (13.5) years. Education ranged 1-20 years, with a mean of 10.7 (5.1) years. The male-to-female ratio was 40%/60%. The average duration of residence in the United States was 16.4 (14.4) years. Seventy percent of the sample were monolingual Spanish-speaking, and 30% were bilingual. The proportion of the sample respective to their country of origin closely approximates the 1992 U.S. Census distribution. Correlations between Marin and Marin (1991) acculturation scale scores and neuropsychological variables are provided. Participants were instructed to copy the complex figure with no time limit. Reproductions were scored according to Taylor's (1959) criteria. The authors provided normative data for the copy and 10-minute delayed recall conditions.
Study strengths 1. Large overall sample, with acceptable sample size for most of the cells. 2. The sample composition is well described in terms of age, education, gender, acculturation information, geographic area, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration and scoring procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data are partitioned by gender × age × education.
Considerations regarding use of the study 1. No information regarding interrater reliability or IQ. 2. It is unclear which of the two educational groups included participants with 10 years of education.
[ROCF.17] Rapport, Charter, Dutra, Farchione, and Kingsley, 1997 (Table A12.19)
The study addressed interrater and internal consistency reliabilities of the standard (as described in Lezak, 1995) and Denman scoring systems for the ROCF. Participants were 318 veterans (312 males, 6 females), aged 18-84 years, who were referred to a Veterans Administration hospital assessment service. The majority of participants were inpatients. Mean age was 55.01 (4.31) years and mean education, 12.62 (2.77) years. Three independent raters scored copy and immediate recall reproductions using standard and Denman criteria. Interrater reliabilities are presented for the entire sample and for three referral sources separately: neurology, psychiatry, and rehabilitation medicine. The authors concluded that internal consistency and interrater reliabilities for both scoring systems were high. Coefficient α reliabilities were also high, indicating psychometrically sound inter-item congruity for both scoring systems. Age was modestly related to performance on the copy condition and strongly related to recall. Education was modestly associated with copy and weakly associated with recall performance.
Study strengths 1. Information on gender, age, education, and recruitment procedures is provided. 2. A large sample size. 3. Data on psychometric properties of the ROCF are provided. 4. Two scoring systems are compared. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Participants were V.A. inpatients from different wards, including neurology. Selection criteria and participants' diagnoses are not specified. The data on test scores are of limited use with the general population due to likely health confounds of the sample. 2. The sample was not partitioned into age groups. 3. No information on IQ. 4. Mostly male population.
[ROCF.18] Hartman and Potter, 1998 (Table A12.20)
The authors explored the contributions of visuospatial ability, organization, and memory to age differences on the ROCF in adulthood. Participants were 30 undergraduate and graduate students aged 18-32, with a mean of 22.3 years, and older adults recruited through fliers and advertisements in local newspapers and senior-citizen newsletters. Participants were screened for history of neurological illness, head trauma or loss of consciousness, significant psychiatric illness, untreated hypertension, current use of psychoactive medication, excessive current use of alcohol, and dementia. All participants lived independently in the community and reported themselves in good or excellent health. All older adults scored >24 on the MMSE. The two age groups were selected from a larger sample in order to match them on Shipley Hartford Vocabulary Test scores (36.2 vs. 35.5). The ROCF was administered according to Rey's (1941) original instructions, using different-colored pens handed to participants at equal intervals. Copy and immediate recall without forewarning were used. Scoring was done by two investigators using BQSS and the extended 36-point system. BQSS intraclass correlations for a subsample of 22 protocols ranged 0.79-1.00, with the exception of qualitative items (perseveration, confabulation, and neatness), which were low (up to 0.65). Intraclass reliability coefficients for the latter system ranged 0.79-0.99. Mean scores for the two age groups according to the extended 36-point scoring system are presented in Table A12.20. The authors found that lower performance for the older group, on the Copy condition, was the result of minor inaccuracies in drawing and, on the Recall condition, the result of omission of elements. No decline in organizational quality with age was evident. Small age differences were seen on the copy condition, with robust differences evident in recall. The authors discussed the advantages and disadvantages of the BQSS and the extended 36-point scoring system. Table A12.20 provides data according to the latter scoring system.
Study strengths 1. The sample composition is well described in terms of age, gender, vocabulary test scores, and recruitment procedures. 2. Rigorous exclusion criteria. 3. Two scoring systems are compared. 4. Means and SDs for the test scores are reported. 5. Information on scoring system and interrater reliability is provided.
Considerations regarding use of the study 1. The samples are relatively small. 2. Educational levels for the samples are high. 3. SDs or ranges for education are not provided.
[ROCF.19] Ostrosky-Solis, Jaime, and Ardila, 1998 (Table A12.21)
The authors investigated the effect of normal aging on memory abilities. The sample included 105 participants (44 male, 61 female) aged 20-89 years, with a minimum of 6 years of formal education. The sample was partitioned into seven age groups, with 15 participants in each group. All volunteers were of average socioeconomic status, lived in Mexico City, and were native Spanish speakers. Exclusion criteria were presence of dementia according to the DSM-IV criteria, a score <24 on the MMSE, and a history of neurological or psychiatric conditions, per self-report questionnaire.
The ROCF was administered according to Taylor's (1959) instructions. Copy, Immediate Recall, and 20-minute Delayed Recall conditions were administered. The standard scoring procedure was used. Study strengths 1. The sample composition is well described in terms of age, gender, incentive for participation, and geographic area. 2. Minimally adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Information on scoring system is provided. Considerations regarding use of the study 1. Overall sample is large, but individual cells are small. 2. Recruitment procedures are not reported. 3. Specific information on education is not provided, other than "the participants had a minimum of six years of formal education." 4. The data were obtained on Mexican participants, which may limit their usefulness for clinical interpretation in the United States. 5. No information on IQ is reported. 6. No information on interrater reliability is provided. [ROCF.20] Fastenau, Denburg, and Hufford, 1999
This normative study included 211 healthy adults aged 30-85 years, with a mean of 62.9 (14.2) years. Education ranged 12-25 years, with a mean of 14.9 (2.6) years; 55% were women, and over 95% were Caucasian. Participants were recruited using a stratified sampling procedure at three different sites as part of other studies and financially compensated. Exclusion criteria were history of cerebrovascular insult, head injury with loss of consciousness exceeding 5 minutes, and chronic substance abuse, per structured interview. The Extended Complex Figure Test was administered, which supplements the original Copy, Immediate Recall, and Delayed Recall with Recognition and Matching trials. Testing and scoring were performed by trained personnel. Scores were generated using Osterrieth's (1944) criteria. The data for conversion of the raw scores into scaled scores are presented in overlapping age groups using the midpoint interval technique introduced by Ivnik et al. (1992a). These tables should be used in the context of the detailed procedures for their application, which are explained by the authors. Therefore, they are not reproduced in this book. Interested readers are referred to the original article. The authors concluded that age and education effects were evident on all trials but education explained minimal variance on the copy and memory trials. Gender had a minimal effect on performance.
[ROCF.21] Schreiber, Javorsky, Robinson, and Stern, 1999 (Table A12.22)
The BQSS and the 36-point scoring system were compared on samples of adults with ADHD and matched controls. The control group included 18 participants (9 male, 9 female) aged 18-51, with a mean age of 29.5 (11.5) years and mean education of 15.1 (1.7) years. Exclusion criteria were history of neurological disorder, major medical illness, psychiatric illness, developmental disorder, learning disability, ADHD, or significant visual or auditory impairments. The ROCF was administered according to the procedures described in the BQSS manual (R. A. Stern et al., 1999), switching different-colored pens. The Copy, Immediate, and 20-30 minute Delayed Recall conditions were used. The test was administered and scored by trained personnel using the BQSS and the 36-point scoring system. The interrater reliability of these scorers was reported in the BQSS manual. Table A12.22 provides a score for the copy condition obtained using the 36-point scoring system. The authors discussed the superiority of the BQSS in discriminating between the two groups.
Study strengths 1. The sample composition is well described in terms of age, education, and gender. 2. Rigorous exclusion criteria. 3. Test administration and scoring procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. The sample is small, with a wide age range. 2. Data for the recall conditions are not reported. 3. Educational level for the sample is high. 4. No information on IQ is reported.
[ROCF.22] Deckersbach, Savage, Henin, Mataix-Cols, Otto, Wilhelm, Rauch, Baer, and Jenike, 2000 (Table A12.23)
The psychometric properties of two scoring systems measuring organizational approach to the ROCF and influences of copy organization and accuracy on immediate recall were studied on individuals diagnosed with OCD and normal controls. Control participants were recruited through bulletin board notices at the Massachusetts General Hospital. The control group consisted of 55 healthy adults (38% male) 19-64 years of age, with a mean age of 35.13 (12.6) years, and education ranging 12-20 years, with a mean of 16.7 (2.3) years. Beck Depression Inventory scores ranged 0-15, with a mean of 2.3 (3.2). All participants were Caucasian and right-handed. Estimated intelligence level was above average. Their health status was determined based on a structured clinical interview. Exclusion criteria were history of Axis I psychiatric disorder, significant head injury, seizure, neurological condition, or current medical condition. Copy and Immediate Recall conditions of the ROCF were administered. The administration procedure used switching colored pencils every 15 seconds. The protocols were scored according to Meyers and Meyers' (1995b) system. In addition, the organizational approach used during the Copy condition was assessed according to the Shorr et al. (1992) and Savage et al. (1999) scoring methods. The interrater reliability for the Savage et al. method, established on a subsample of 15 randomly selected drawings, was moderate to high, with Cohen's κ coefficients ranging 0.69-0.92 for different organizational elements of the figure. Table A12.23 provides scores for the Copy and Immediate Recall conditions based on the Meyers and Meyers scoring system. The authors concluded that organization during the Copy condition was a strong predictor of subsequent recall.
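Cohen's κ, the agreement statistic cited for this study, corrects the observed agreement between two raters for the agreement expected by chance. A minimal sketch follows; the ratings are invented (1 = an organizational element judged to be drawn as a unit, 0 = not) and are not data from the study, and the binary coding itself is an assumption made for illustration.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters assigning categorical ratings to the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category frequencies
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of one organizational element on 15 drawings
rater_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1]
rater_b = [1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")  # kappa = 0.72
```

Here the raters agree on 13 of 15 drawings (87%), but because chance alone would produce 52% agreement with these marginals, κ is lower than the raw agreement rate.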
Study strengths 1. Relatively large sample. 2. The sample composition is well described in terms of age, education, gender, estimated intelligence level, geographic area, and recruitment procedures. 3. Rigorous exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. 6. Information on scoring systems and interrater reliability is provided.
Considerations regarding use of the study 1. The data are provided for a wide age range. 2. Educational level for the sample is high. [ROCF.23] Miller, 2003; Personal Communication (Table A12.24)
The investigation used participants from the Multi-Center AIDS Cohort Study (MACS). The data were collected from 729 seronegative homosexual and bisexual males for the purpose of establishing normative data for neuropsychological test performance based on a large sample. Mean age for the sample was 40.4 (7.4) years, and mean education was 16.2 (2.4) years; 91.2% were Caucasian, 2.5% Hispanic, 5.6% black, 0.7% other. All participants were native English speakers. The Copy, Immediate Recall, and 20-minute Delayed Recall conditions were administered according to standard instructions. The data are partitioned by three age groups (25-34, 35-44, 45-59) × three educational levels (<16, 16, >16 years).
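Locating the normative cell for an individual in an age × education table of this kind is a simple lookup. The sketch below uses only the cell boundaries stated above; the function names are hypothetical, ages and education are assumed to be recorded in whole years, and no normative values are included.

```python
# Age bands (inclusive, whole years) as given for this dataset
AGE_GROUPS = [(25, 34), (35, 44), (45, 59)]

def age_group(age):
    """Return the age-band label, or None if the age falls outside the norms."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None

def education_level(years):
    """Return the education-band label for years of schooling."""
    if years < 16:
        return "<16"
    if years == 16:
        return "16"
    return ">16"

def normative_cell(age, education_years):
    """Return (age band, education band), or None when no cell applies."""
    group = age_group(age)
    if group is None:
        return None
    return (group, education_level(education_years))

print(normative_cell(40, 16))  # ('35-44', '16')
print(normative_cell(70, 12))  # None
```

Returning None for ages outside 25-59, rather than snapping to the nearest band, makes it explicit when a dataset does not cover the individual being assessed.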
Study strengths 1. The overall sample size is large, and most individual cells have more than 50 participants. 2. Normative data are stratified by age × education. 3. Information on age, education, ethnicity, and native language is reported. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. All-male sample. 2. No information on IQ is reported. 3. No information on exclusion criteria.
RESULTS OF THE META-ANALYSES OF THE ROCF DATA (See Appendix 12m)
Data collected from the studies reviewed in this chapter were combined in regression analyses in order to describe the relationship between age and test performance and to predict test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analyses and predicted test scores across adult age groups are provided in Appendix 12m. Only those data based on the standard 36-point scoring system, including Meyers and Meyers' (1995b) approach, were used in the analyses. Data generated using other methods were not included. Data provided in Meyers and Meyers' (1995b) manual are not included. Separate analyses were performed on the Copy, Immediate Recall, and Long-Delayed Recall conditions. Data for 3-minute delayed recall were not analyzed as only a few studies reported data for this condition. The long-delay interval varies widely in the data reviewed. According to the literature, varying the delay interval between 15 and 60 minutes has minimal effect on the rate of recall (Berry & Carpenter, 1992). Therefore, 20-, 30-, 40-, and 45-minute delayed recall trials were combined in one run of analysis. In all cases, the long-delayed recall was preceded by an immediate or a 3-minute delayed recall but not both.
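The general idea of regressing study-level mean scores on age, with the quadratic age term used in the models reported in this section, can be sketched as follows. This is not the authors' procedure (which is described in Chapter 3): weighting each data point by its sample size is an assumption made here, the function names are hypothetical, and the data points are invented for illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least three distinct ages")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * solution[c] for c in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution

def fit_quadratic(ages, means, weights=None):
    """Fit mean = b0 + b1*age + b2*age**2 by (optionally weighted) least squares."""
    if weights is None:
        weights = [1.0] * len(ages)
    # Normal equations (X'WX) b = X'Wy, where X has columns 1, age, age**2
    power_sums = [sum(w * a ** k for a, w in zip(ages, weights)) for k in range(5)]
    xtx = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    xty = [sum(w * (a ** i) * m for a, m, w in zip(ages, means, weights))
           for i in range(3)]
    return solve_linear_system(xtx, xty)

def predict(coefs, age):
    """Predicted mean score at a given age."""
    b0, b1, b2 = coefs
    return b0 + b1 * age + b2 * age ** 2

def r_squared(coefs, ages, means, weights=None):
    """Proportion of (weighted) variance in the means explained by the model."""
    if weights is None:
        weights = [1.0] * len(ages)
    grand_mean = sum(w * m for m, w in zip(means, weights)) / sum(weights)
    ss_res = sum(w * (m - predict(coefs, a)) ** 2
                 for a, m, w in zip(ages, means, weights))
    ss_tot = sum(w * (m - grand_mean) ** 2 for m, w in zip(means, weights))
    return 1 - ss_res / ss_tot

# Hypothetical data points: (mean age of cell, mean score, sample size)
points = [(25, 33.1, 20), (35, 33.3, 35), (45, 32.9, 50),
          (55, 31.9, 40), (65, 30.3, 30), (75, 28.1, 25)]
ages = [p[0] for p in points]
means = [p[1] for p in points]
sizes = [p[2] for p in points]

coefs = fit_quadratic(ages, means, sizes)
print("predicted score at age 60:", round(predict(coefs, 60), 1))
print("R squared:", round(r_squared(coefs, ages, means, sizes), 3))
```

A quadratic term lets the fitted curve stay nearly flat in early adulthood and fall off more steeply at older ages, which a straight line cannot capture.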
After data editing for consistency and for outlying scores, the following data were included in the analyses: nine studies, which generated 19 data points based on a total of 1,340 participants for the Copy condition; seven studies, which generated 12 data points based on a total of 1,086 participants for Immediate Recall; seven studies, which generated 11 data points based on a total of 1,056 participants for the Long-Delayed Recall. Quadratic regressions of the test scores on age yielded R² of 0.899 for the Copy condition, 0.822 for Immediate Recall, and 0.862 for Long-Delayed Recall, indicating that 82%-90% of the variance in test scores for the three conditions is accounted for by the models. Based on these models, we estimated scores for the three conditions for age intervals between 22 and 79 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3), they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. It should be noted in the context of across-condition comparisons that mean age for the Copy condition is considerably higher than mean ages for Immediate Recall and Long-Delayed Recall because data for two large studies based on the older samples were reported only for the Copy condition. The scores for the Copy condition of the ROCF for healthy young to middle-aged samples are not expected to be normally distributed. It should be noted that the majority of the studies contributing to the aggregate sample for the Copy condition in our analyses reported data for older age groups, with the mean age of 62.73 (19.27). The mean Copy score for the aggregate sample is 32.20 (1.79), reflecting age-related decline from the optimal performance expected in younger samples.
Thus, the distribution of scores in our sample is more normal than expected in younger samples due to variability in both directions from the mean, avoiding scores being skewed due to ceiling effect. The pattern of SDs differs across the three conditions. For the Copy condition, linear regression of SDs on age yielded R² of 0.685;
and for Immediate Recall, quadratic regression of SDs on age yielded R² of 0.694, indicating an increase in variability with advancing age, consistent with the literature. Predicted SDs, based on these models, are reported. Regressions of SDs for Long-Delayed Recall on age suggest that age does not account for a significant amount of variability in SDs (R² = 0.482). Though some increase in variability with advancing age is expected, this trend was not significantly evident in the collected data. Therefore, we suggest that the mean SD for the aggregate sample be used across all age groups. Predicted scores and SDs for 12 age ranges across three conditions are summarized in Table A12m.4. Examination of the effects of demographic variables on the ROCF scores indicated that education did not contribute to the test scores in the data available for analyses. The effects of intelligence level, gender, and handedness on ROCF performance were not explored due to a scarcity of data available for review.
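The regression approach described above can be illustrated with a minimal sketch. It assumes an unweighted ordinary least squares fit of mean scores on age with a quadratic term; the procedures actually used are described in Chapter 3 and may differ (for example, in weighting by sample size). All ages and scores below are made-up values for illustration only and are not taken from the studies reviewed, and the names `ages`, `mean_scores`, and `predict` are hypothetical.

```python
import numpy as np

# Hypothetical study-level data points: mean age of each sample and its
# mean test score. These are illustrative values, NOT data from the chapter.
ages = np.array([25.0, 32.0, 41.0, 48.0, 55.0, 63.0, 70.0, 77.0])
mean_scores = np.array([35.1, 34.9, 34.6, 34.0, 33.2, 32.5, 31.0, 29.8])

# Fit score = b2 * age^2 + b1 * age + b0 by ordinary least squares
# (unweighted; an assumption of this sketch).
b2, b1, b0 = np.polyfit(ages, mean_scores, deg=2)


def predict(age):
    """Predicted mean score at a given age from the fitted quadratic."""
    return b2 * age ** 2 + b1 * age + b0


# R^2: proportion of variance in the mean scores accounted for by the model.
fitted = predict(ages)
ss_res = np.sum((mean_scores - fitted) ** 2)
ss_tot = np.sum((mean_scores - mean_scores.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Predicted scores at a few ages within the modeled range.
query_ages = np.array([22.0, 50.0, 79.0])
predicted = predict(query_ages)

print(f"R^2 = {r_squared:.3f}")
for age, score in zip(query_ages, predicted):
    print(f"age {age:.0f}: predicted score {score:.2f}")
```

The same fitted equation can be evaluated at ages outside the observed range, which is why the caution about extrapolation applies: the quadratic term can dominate quickly beyond the data.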
Strengths of the analyses
1. Total sample size of 1,340 for the Copy condition, 1,086 for Immediate Recall, and 1,056 for Long-Delayed Recall. 2. R² of 0.899 for the Copy condition, 0.822 for Immediate Recall, and 0.862 for Long-Delayed Recall, indicating a good model fit. 3. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity, with the exception of the marginally significant test for normality for Long-Delayed Recall. 4. It should be noted that the predicted values match closely the normative data provided in the Meyers and Meyers' (1995b) manual for all three conditions, with respect to both the extreme values and the direction/rate of age-related changes.
Limitations of the analyses
1. Postestimation test for normality for the Long-Delayed Recall was marginally significant. The K-density plot demonstrated a positive skew in the distribution of residuals, which does not affect the estimates of regression coefficients and accuracy of prediction but does influence the results of significance tests. 2. Data for only a narrow range of higher levels of education are available for the analyses (12.2-16.2 years). Mean education of 14.33 (0.98) for the Copy condition is high. We were unable to fully explore the effect of education on the test scores because lower educational levels are not represented in the data. Though reports on the relationship between education and test scores are equivocal, a number of studies suggest that higher levels of education are associated with better test performance. Therefore, the predicted values might overestimate expected scores for individuals with lower educational levels. 3. Although the effect of intellectual level on ROCF performance has been reported in several studies, we could not include measures of intellectual level in our analyses due to great variability in the type of measures used to assess functional level among the different studies.
CONCLUSIONS
A great number of studies exploring the psychometric properties of the ROCF and its clinical utility attest to its popularity among clinicians and investigators alike. However, tremendous variability in administration and scoring of the ROCF obscures comparability of the results of these studies. To improve consistency across different studies, the procedures for administration and scoring need to be highlighted in detail by clinicians and investigators. It should be noted that the distribution of scores for the ROCF Copy condition deviates considerably from the normal distribution. A majority of participants are capable of copying the figure without major distortions. Therefore, a label of "superior" performance given to a subject achieving a high ROCF score is meaningless. On the other hand, the test is highly sensitive to deficits in visuospatial
information processing, and achieving a low performance score falling in the outlying range has clinical significance. In addition to the numerical expression of a subject's performance, the value of qualitative interpretation and the delineation of subject's strategy/type of errors was emphasized in several studies reviewed above. In this context, the two avenues of research on the ROCF, namely, studies on clinical utility and on the cognitive processes involved in figure drawing, are mutually enriching. Recommendations for future research on the ROCF include careful analysis of the effects of demographic factors on performance. The well-documented effects of age and intelligence (and possibly education) need to be considered in subject selection and data presentation format. Although education did not have an effect on ROCF performance in the meta-analyses described in this chapter, this is due to a narrow range of education in the
aggregate sample. The scope of the research literature should be expanded to include lower levels of education and intellectual functioning. A large number of studies on the learning/processing strategies in children and on the clinical sensitivity of the test to different neurological conditions in adults are available in the literature, but only a few studies are dedicated to the cognitive/processing strategies issues related to older age groups. The psychometric properties of different scoring systems need to be further assessed. Data on interrater reliability, internal consistency, and test-retest reliability are scarce. From the review of existing studies, it appears that different scoring systems are differentially applicable to specific clinical and research situations. Additional information on the current use of the ROCF and suggestions for future investigations, submitted by clinicians, are summarized by Knight et al. (2003).
13 Hooper Visual Organization Test
BRIEF HISTORY OF THE TEST
The Hooper Visual Organization Test (HVOT) consists of 30 line drawings of familiar objects which have been fragmented into pieces. The task requires the examinee to mentally reintegrate and name the objects, which are arranged in order of increasing difficulty. The response format can be oral or written, depending on whether the individual administration or the booklet format is used. The score is the number of correctly identified items, with half-points available for some of the items. Wetzel and Murphy (1991) suggest a discontinuation rule of five consecutive errors, based on a rating change of only 1% using this strategy. The test was first published in 1958 and revised in 1983. The test manual for the revised edition provides conversion tables to correct raw scores for age and educational level. Corrected or uncorrected raw scores can be converted to T scores according to the tables provided in the manual, with higher T scores representing a greater likelihood of neurological dysfunction. The standardization data reported in the manual are based on Mason and Ganzler's (1964) all-male sample of 231 patients, personnel, and volunteer workers from a Veterans Administration hospital. The sample was stratified into nine age cohorts: 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, and 65-69 years.
In addition to using T-score tables, determination of impaired vs. normal performance can be made using the cutoff criteria. The cutoff scores recommended by the authors vary depending on test administration setting. In a clinical diagnostic setting, a cutoff score of ~24 is suggested in determining whether further assessment is needed. On the other hand, if the test is used as part of a screening battery administered to all patients admitted to a facility with a low incidence of organic brain pathology, a cutoff of 20 is recommended to minimize the rate of false-positive errors. Boyd (1981) argued: no single cutoff score can be recommended for use in all clinical situations. Factors such as the subject's age, educational level, intelligence, and whether the situation requires minimization of false positives or false negatives, must all be weighed in interpreting test results. (p. 19)
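The trade-off between cutoff choice and classification errors discussed above can be sketched as follows. This is an illustration only: it assumes that a score at or below the cutoff is flagged as impaired, and the scores, group labels, and the function name `sensitivity_specificity` are hypothetical and not drawn from any study cited here.

```python
def sensitivity_specificity(scores, impaired, cutoff):
    """Return (sensitivity, specificity) for a given cutoff.

    Assumption of this sketch: a score at or below `cutoff` is flagged
    as impaired. `impaired` holds the true status of each examinee.
    """
    tp = fn = tn = fp = 0
    for score, is_impaired in zip(scores, impaired):
        flagged = score <= cutoff
        if is_impaired and flagged:
            tp += 1  # impaired examinee correctly flagged
        elif is_impaired:
            fn += 1  # impaired examinee missed (false negative)
        elif flagged:
            fp += 1  # intact examinee flagged (false positive)
        else:
            tn += 1  # intact examinee correctly passed
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical total scores (0-30 scale with half-points), illustration only.
scores = [12, 15.5, 18, 21, 23, 26, 19, 22, 24, 25.5, 27, 28, 29, 30]
impaired = [True] * 6 + [False] * 8

sens_24, spec_24 = sensitivity_specificity(scores, impaired, cutoff=24)
sens_20, spec_20 = sensitivity_specificity(scores, impaired, cutoff=20)

print(f"cutoff 24: sensitivity {sens_24:.3f}, specificity {spec_24:.3f}")
print(f"cutoff 20: sensitivity {sens_20:.3f}, specificity {spec_20:.3f}")
```

In this made-up sample, lowering the cutoff reduces false positives (higher specificity) at the cost of missing more impaired examinees (lower sensitivity), which is the consideration behind recommending different cutoffs for different settings.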
While the cutoff score suggested by Hooper was judged by Boyd (1981) to be optimal for evaluating chronically ill institutionalized patients, it appeared to be too low for less incapacitated patient populations. Furthermore, Nabors et al. (1997) suggested a cutoff score of ~ 15 for determination of cognitive impairment in medically ill elderly as this score provided the best correct classification in their sample of urban medical inpatients at
a post-acute geriatric rehabilitation unit (81% sensitivity, 79% specificity). Hooper also developed a qualitative system of response analysis involving four categories: isolate, perseverative, bizarre, and neologistic responses. Lezak et al. (2004) underscore the benefits of qualitative analysis of errors, pointing to the localizing significance of fragmentation tendencies. Nadler et al. (1996) concur that qualitative analysis of errors improves the differentiation between the effects of right vs. left hemisphere dysfunction on HVOT performance. Merten and Beal (2000) found item ranking for the HVOT to deviate from empirically based item difficulty in their sample of German-speaking neurological patients and rules for a number of items to be arbitrary. The authors proposed a revised version based on empirical item analysis, which retains the original items but has a modified set of instructions, order of items, and scoring and administration rules. Merten (2002) developed a short form consisting of 15 items, which was validated on another sample of German-speaking neurological patients.
Construct Validity
The HVOT was developed as a screening instrument for organic brain dysfunction. However, the issue of the test's sensitivity to general vs. lateralized dysfunction remains controversial. The test authors suggest that the HVOT "is sensitive to general impairments, not specific visuopractic functions" (Hooper, 1983, p. 6). This view is supported by Boyd (1981, 1982a), Wang (1977), and Wetzel and Murphy (1991). However, the HVOT's sensitivity to lateralized dysfunction has been demonstrated in several studies. Lewis et al. (1997) report that HVOT performance is vulnerable to acute lesions in the right anterior quadrant of the brain. In contrast, Fitz et al. (1992), Rathbun and Smith (1982), and Woodward (1982) demonstrate HVOT sensitivity to localized dysfunction of the nondominant parietal lobe. In fact, a heated debate over general vs. specific sensitivity of the HVOT is reflected in a series of articles published in response to Boyd's (1981) article (Boyd, 1982a,b; Rathbun & Smith, 1982; Woodward, 1982). The above issue is directly related to assumptions as to which cognitive functions are measured by the HVOT. Two components of information processing involved in HVOT responses are mental reintegration and naming of the objects for each test item. If visual perception and synthesis are the primary mechanisms involved in item analysis, then nondominant hemisphere contribution prevails. If test performance also imposes considerable naming demands, then both dominant and nondominant hemispheres contribute substantially to test performance. Studies exploring the relative contribution of these cognitive processes to HVOT performance are largely equivocal. Lezak (1995), Lezak et al. (2004), and Spreen and Strauss (1991, 1998) suggest caution in interpreting HVOT failures as a manifestation of visuospatial deficit due to the contribution of the naming component. Schultheis et al. (2000) developed the Multiple-Choice Hooper Visual Organization Test (MC-HVOT), which consists of the 30 original stimuli presented with four response choices, in order to remove the naming demands on test performance. The authors found that performance of anomic patients was significantly facilitated by the multiple-choice format. Furthermore, patients with both right and left hemisphere involvement benefited from diminished naming demands. In contrast, Ricker and Axelrod (1995) found that perceptual organization accounted for 44% of HVOT performance variance, whereas confrontation naming ability was not significantly related to test performance. Similarly, in a study designed to replicate and extend the above research, Paolo et al. (1996c) observed the HVOT to be a measure of perceptual organization, whereas performance on the test was not significantly impacted by poor naming ability. Paul et al.'s (2001) results are consistent with these findings. Greve et al. (2000) found a small but significant effect of naming on HVOT performance, which, however, was interpreted by the authors as having little or no practical impact. Such discrepant findings are likely to be related to composition of study samples, with
samples comprised of aphasic patients demonstrating the largest effect of naming difficulty on HVOT performance. Merten and Beal (2000) indicated that the HVOT measures visuoperceptual and visuospatial-organizational dysfunction, Seidel (1994) found it to be a measure of general visual-perceptual-constructional abilities in a pediatric population, and Johnstone and Wilhelm (1997) concluded that the HVOT measures global visuospatial intelligence and shares 12%-23% of variance with WAIS-R PIQ subtests.
Psychometric Properties of the Test
Lopez et al. (2003) examined the psychometric properties of the test on a sample of 281 cognitively impaired and intact patients and reported acceptable estimates of internal consistency (α = 0.882) and interrater reliability (0.977-0.992). Similarly, an internal consistency estimate of >0.88 was reported by Merten and Beal (2000) on a sample of 320 German-speaking neurological patients. Additional data on the reliability and validity of the HVOT are provided by Gerson (1974), Franzen (2000), Franzen et al. (1989), Lezak et al. (2004), and Spreen and Strauss (1998). Item analysis for use of the HVOT with Indian participants was performed by Verma et al. (1993).
RELATIONSHIP BETWEEN HVOT PERFORMANCE AND DEMOGRAPHIC FACTORS Age and intelligence level are consistently related to HVOT performance. Tamkin and Jacobsen (1984) report an effect of age and IQ on HVOT performance in their sample of 211 male, veteran, psychiatric inpatients. Similarly, Wentworth-Rohr et al. (1974) found a positive relationship between HVOT scores and intelligence level as well as a negative age/ HVOT relationship beginning in the late 30s. Age-related changes in HVOT performance are also documented by Farver and Farver (1982) and by Tamkin and Hyer (1984). Hilgert and Treloar (1985) documented an effect of age and IQ level but no gender differences
in elementary-school children. An effect of IQ is also reported by Gerson (1974). Education and gender were unrelated to HVOT scores in a study by Wentworth-Rohr et al. (1974). In contrast, Verma et al. (1993) found a significant effect of education on HVOT scores. Based on the analysis of HVOT performance of 434 normal children aged 5-13, Kirk (1992b) reported that boys attained adult performance by age 12, whereas girls participating in this study did not reach the adult level. Based on these data, Kirk documented an effect of age and gender on HVOT performance. An interaction between age and education in a sample of cognitively intact elderly was reported by Richardson and Marottoli (1996). Nabors et al. (1997) found HVOT scores to be significantly related to age and education in a total sample, which combined cognitively intact and impaired elderly urban medical patients, whereas performance was not significantly related to these demographic variables for the cognitively intact group considered separately. For further information regarding the HVOT, see Lezak et al. (2004) and Spreen and Strauss (1998).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the HVOT normative reports, seven key criterion variables were deemed critical. The first six of these relate to subject variables, and the remaining one refers to a procedural issue. Minimal criteria for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since a strong effect of age on HVOT performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the possible association between education and HVOT performance, information regarding education should be provided for each subgroup.
Reporting of Intellectual Levels
Given the relationship between HVOT performance and IQ, information regarding intellectual level should be provided for each subgroup, and preferably normative data should be presented by IQ levels.
Reporting of Gender Composition
Given the possible association between gender and HVOT performance, information regarding gender composition should be reported for each subgroup.
Procedural Variables
Data Reporting
Means and standard deviations for the total number of correct responses should be reported.
SUMMARY OF THE STATUS OF THE NORMS
There are only a few studies available in the literature that provide performance levels for the HVOT. Several studies have reported data for psychiatric or neurological samples. Among the studies providing data for normal samples,
several used only selected HVOT items. Only studies that report data for the full HVOT for normal samples are reviewed in this chapter. In all articles reviewed below, the score represents the total number of correct responses (out of 30). In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 13. Table A13.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES [HVOT.1] Rao, Leo, Bernardin, and Unverzagt, 1991a (Table A13.2)
The authors described the performance of a control group in their study on cognitive dysfunction in multiple sclerosis. The control group included 100 participants (75 females, 25 males), who were paid for their participation. The mean age of the sample was 46.0 (11.6), mean education was 13.3 (2.0), and estimated premorbid intelligence (based on demographic variables) was 106.5 (6.9). All except for one participant were Caucasian. Participants were recruited from newspaper advertisements. Exclusion criteria were history of substance abuse, psychiatric disturbance, head injury or any other nervous system disorder, or use of prescription medications. In addition to detailed medical and psychosocial history, participants underwent a neurological examination, MRI, and neuropsychological testing. The HVOT was administered as part of a larger battery. For a description of the administration procedure, the authors referred readers to an earlier article. Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, ethnicity, IQ estimate, geographic area, clinical setting, and recruitment procedures. 3. Rigorous exclusion criteria. 4. Means and SDs for the test scores are reported. Consideration regarding use of the study 1. The data are not partitioned by age group.
¹ Normative data for children 5-11 years old are provided by Seidel (1994) and for those 5-13 years old by Kirk (1992b). See also Baron (2004) and Spreen and Strauss (1998).
[HVOT.2] Libon, Glosser, Malamut, Kaplan, Goldberg, Swenson, and Sands, 1994 (Table A13.3)
The HVOT was administered to a sample of 37 right-handed participants aged 64-94 years as part of a study examining the relationship between age and cognitive functions in normal aging. Participants were recruited from a local community center and from the Active Life Program, an exercise and fitness program at the Philadelphia Geriatric Center. All participants scored ≥27 on the Mini-Mental State Exam (MMSE) and ≤10 on the Geriatric Depression Scale (GDS). All participants passed a physical examination and a graded exercise cardiac function test. Exclusion criteria were history of stroke, head injury, seizure disorder, or major psychiatric problems including substance abuse or psychoactive medications, per clinical interviews. The sample was divided into the young-old (64-74 years) and old-old (75-94 years) groups. There were no between-group differences in education or MMSE or GDS score. The HVOT was administered as part of a larger battery. The number of correct responses was recorded. Study strengths 1. The sample composition is well described in terms of age, education, gender, handedness, MMSE and GDS scores, geographic area, setting, and recruitment procedures. 2. Rigorous exclusion criteria. 3. Means and SDs for the test scores are reported. 4. The sample is divided into two age groups.
Consideration regarding use of the study 1. Small sample size. [HVOT.3] Richardson and Marottoli, 1996 (Table A13.4)
The authors report data for 101 autonomously living, mostly Caucasian, elderly participants who comprise a subsample of a cohort of participants in Project Safety, a study on driving performance conducted in New Haven, Connecticut. Individuals with a history of neurological disease, excessive use of alcohol, or risk for dementia (based on MMSE score) were excluded. The sample consisted of 53 males and 48 females, with a mean age of 81.47 (3.30), mean education of 11.02 (3.68) years, and mean MMSE score of 26.97 (2.55). Ethnic composition was 90.1% white and 9.9% black. The HVOT was administered and scored according to the standard instructions provided in the test manual. The data were divided into two age groups of younger-old (76-80) and older-old (81-91) by two education groups. The results indicated that the mean performance for participants with <12 years of education was stable across younger-old and older-old age groups and considerably lower than for their more educated counterparts; however, performance for the younger-old age group with >12 years of education was superior to that of the older-old group with comparable education. Study strengths 1. Data for a relatively large sample of elderly participants are presented. 2. Sample composition is well described in terms of gender, education, geographic area, and ethnicity. 3. Adequate exclusion criteria. 4. The data are classified into age-by-education groupings. 5. Means and SDs are reported. Considerations regarding use of the study 1. No information on intelligence level is provided. 2. Sample sizes for each age-by-education cell are relatively small.
[HVOT.4] Walsh, Lichtenberg, and Rowe, 1997 (Table A13.5)
The authors compared HVOT performance for three groups of geriatric rehabilitation inpatients: cognitively intact, mildly impaired, and severely impaired. Patients were referred for routine cognitive evaluations from two sites: a geriatric rehabilitation service of an urban university rehabilitation hospital and the physical medicine and rehabilitation unit at a suburban rehabilitation hospital. The cognitively intact group consisted of 32 participants (10 male, 22 female) who scored ≥123 on the Dementia Rating Scale or in the unimpaired range on all subtests of the Neurobehavioral Cognitive Status Examination. Participants had no evidence of closed head injury, stroke, or other neurological conditions which could affect cognition, as determined by medical chart review, patient interview, and/or negative radiological findings. The HVOT was administered according to standard instructions.
Study strengths 1. The sample composition is well described in terms of age, education, gender, and clinical setting. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The sample is relatively small. 2. No information on IQ is reported.
[HVOT.5] Lichtenberg, Ross, Youngblade, and Vangel, 1998 (Table A13.6)
The authors compared two groups of geriatric urban medical inpatients: cognitively intact and impaired. All patients were recruited from consecutive admissions to a geriatric medical
rehabilitation program in a midwestern urban university hospital. Seventy-four patients were identified as cognitively intact. This sample had a mean age of 76.9 (5.9) and mean education of 10.8 (3.0); 74% were women, 51% were African American, and 49% were European American. All participants were functionally independent across all cognitive domains and activities of daily living; had no history of neurological disease, psychiatric illness, or substance abuse; and had normal results of neurological examination. The HVOT was administered as part of a larger battery.
Study strengths 1. Adequate sample size. 2. The sample composition is well described in terms of age, education, gender, ethnicity, clinical setting, and recruitment procedures. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. The data are not partitioned by age group. 2. No information on IQ is reported.
CONCLUSIONS
The HVOT has been used clinically as a measure of visual perception and organization. However, the effect of naming impairment on HVOT performance remains unclear. The clinical utility of this test would be enhanced with the availability of normative data for a large sample of neurologically intact participants of both genders across a wide age span, partitioned by age group and intelligence level.²
² Meta-analyses were not performed on the HVOT due to lack of sufficient data.
14 Visual Form Discrimination Test
BRIEF HISTORY OF THE TEST
Many of the most commonly administered tests in neuropsychological practice require intact visual perception, and accurate interpretation of visually mediated tests often rests upon the assumption that visual perceptual skills are intact (Lezak et al., 2004). For example, in the absence of careful assessment of visual perceptual abilities, low performance scores on visual memory tests may be mistakenly attributed to memory impairment when in fact the deficits may be primarily related to visual perceptual ability rather than memory. The Visual Form Discrimination Test (VFDT) was developed by Arthur L. Benton and colleagues (Benton et al., 1983b) as a screening test for visual perceptual deficits. (Please see Appendix 1 for ordering information.) The VFDT is a multiple-choice, matching-to-sample task. The test is presented using a spiral-bound booklet (Benton et al., 1983b). The subject views an 8½" x 11" page in the booklet displaying a sample design containing three geometric elements. Directly below the stimulus page, the adjoining 8½" x 11" page presents four smaller three-element designs (numbered 1, 2, 3, or 4). The subject, therefore, can concurrently view the main stimuli and the four smaller design groupings below. The designs on both pages are similar
in that each contains two large geometric shapes and a small peripheral figure. However, only one of the smaller designs shown on the adjoining page below is an exact match for the larger stimulus design above. The other three designs are considered "distracters" and are variants of the larger stimulus design. One of the three distracter designs is created by moving or rotating the peripheral figure, the second by distorting one of the major figures, and the third by rotating one of the major figures. The subject is requested to point to or "say the number" of the design below that exactly matches the larger stimulus design. The VFDT consists of two practice items and 16 test items. There is no time limit, and the scoring system awards 2 points for each correct answer and 1 point for an error that involves only the peripheral figure. Errors involving the major figures receive no points. Scores range 0-32. Unimpaired individuals usually can complete the test in less than 5 minutes, and the test rarely takes longer than 10 minutes to complete regardless of the level of impairment. Because the VFDT is a nonmotoric task, it is especially useful when assessing senior adults, patients with severe arthritis or hemiparesis, and/or the medically ill. The validity of the VFDT to assess visual perceptual impairments with various neurological conditions has been well established.
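The scoring rule described above (2 points for a correct match, 1 point for an error involving only the peripheral figure, 0 points for an error involving a major figure, across 16 test items) can be written as a minimal sketch. The response labels and the function name `score_vfdt` are hypothetical conventions chosen for this illustration and are not part of the test materials.

```python
# Points per response type, following the scoring rule described in the text.
# The label strings are hypothetical names used only in this sketch.
POINTS = {
    "correct": 2,           # exact match selected
    "peripheral_error": 1,  # error involving only the peripheral figure
    "major_error": 0,       # error involving one of the major figures
}
N_TEST_ITEMS = 16  # the two practice items are not scored


def score_vfdt(responses):
    """Return the total score (0-32) for a list of 16 response labels."""
    if len(responses) != N_TEST_ITEMS:
        raise ValueError(f"expected {N_TEST_ITEMS} responses, got {len(responses)}")
    return sum(POINTS[label] for label in responses)


# Example: 14 correct answers, one peripheral error, one major-figure error.
example = ["correct"] * 14 + ["peripheral_error", "major_error"]
print(score_vfdt(example))  # 14*2 + 1 + 0 = 29
```

Because partial credit is given only for peripheral errors, two examinees with the same number of correct answers can obtain different totals depending on the type of errors made.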
For instance, the VFDT has been used to examine visual perceptual impairments in post-head injury patients (Iverson et al., 1997b, 2000; Malina et al., 2001; Millis et al., 2001; Wilde et al., 2000), aphasic patients (Varney, 1981), and patients with vascular dementia (Mast et al., 2000), Alzheimer's disease (Iverson et al., 1997a; Kaskie & Storandt, 1995), or Parkinson's disease (Tang & Liu, 1993). Patients with right hemisphere lesions show the highest rates of test failure (Benton, 1983a), although aphasic alexics have been observed to show a 36% failure rate (Varney, 1981) and recovery in letter recognition is accompanied by improvement in visual form discrimination. Test-retest reliability has been examined by Campo and Morales (2003) and found to be quite stable over brief intervals (e.g., ~1 days). However, they did find a practice effect for peripheral errors and cautioned that further test-retest research is necessary to investigate the stability of the VFDT across longer periods of time. Internal consistency has ranged from 0.66 (Malina et al., 2001) to 0.75 (Iverson et al., 2000). In a preliminary study, Caplan and Caffery (1996) examined the test as a motor-free measure of short-term visual recognition memory. For the short-term recognition memory format, each stimulus page from the VFDT was exposed for 10 seconds and after each exposure the subject was shown the response plate and asked to select the design that he or she recognized. This process was repeated for each of the remaining stimulus and response plates in the booklet. After the recognition memory part of the test was completed, subjects were reexamined on failed items, using the standard VFDT "matching-to-sample" administration procedures.
Caplan and Caffery (1996) found that both the standard VFDT administration and the 10-second exposure short-term memory administration effectively discriminated between brain-diseased patients and controls; however, use of the combined administration allowed parceling of the effect of perceptual deficits on multiple-choice recognition memory performance. Because there are 16 VFDT response cards and a one-in-four chance of guessing the correct answer on each card, one would expect, based on probability theory, that a
person would obtain about four correct answers on average simply by guessing. As such, the VFDT can additionally be used as a measure of symptom validity and motivation when performance scores are especially low (i.e., significantly below chance). However, because the task tends to be very easy for most clinical populations, performances above chance can still be noncredible. Larrabee (2003) found that a raw score of <26 on the VFDT correctly identified 48% of his definite malingering neurocognitive dysfunction group (i.e., subjects who performed significantly worse than chance on the Portland Digit Recognition Test) and 93% of moderate to severe closed head injury patients. He reported that no closed head injury patients scored <24 on the VFDT, whereas 32% of his definite malingering neurocognitive dysfunction subjects scored <24. He concluded that the VFDT demonstrates good sensitivity and specificity for discriminating malingerers from head-injured patients.
Since the standard administration of the VFDT usually takes less than 5 minutes to complete with unimpaired individuals and rarely takes more than 10 minutes regardless of level of impairment, the necessity for a shortened VFDT version may be questioned. However, very brief and accurate screening measures are welcomed when individuals can tolerate only minimal testing, due to such factors as low frustration tolerance, fatigue, attentional problems, or severely impaired cognition. Two short form versions of the test have been created by splitting the test in half. The "First-half" short form consists of the first eight items of the standard test, and the "Frontback" version consists of the first four (1-4) and last four (13-16) items, chosen because the correct answer was presented an equal number of times in the four quadrants. Scores from the short forms are multiplied by 2. Iverson et al.
(1997b) evaluated the concurrent validity and clinical utility of the Front-back short form in a sample of patients with closed head injury and observed that the mean difference between the short and full forms was < 1 point and the correlation between the two versions was 0.86. In a subsequent publication, Iverson and colleagues (2000) obtained full and short
form data on patients with heterogeneous neurological and psychiatric diagnoses who were referred for neuropsychological evaluation. Internal consistencies for the two short forms were 0.62 (First-half) and 0.63 (Frontback). Correlations between the full and short forms were 0.85 (First-half) and 0.86 (Frontback). For both shortened versions, the majority of scores were within 2 points of the full version (70.7% for First-half, 71.6% for Frontback). The authors also documented that the mean difference score between both the short forms and the full form was less than 1 point and concluded that use of the short form results in a very minor loss of accuracy. Clinical decision rules were provided for improving the overall accuracy of the short forms. The authors concluded that both short forms had adequate concurrent validity for both clinical and research applications, but they expressed preference for the First-half version because no research had been conducted on the Frontback version without administration of the middle eight items. Unfortunately, no normative data for either of the short forms have been published, limiting the clinical utility of these forms.
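The short-form arithmetic described above (eight items, raw score multiplied by 2) can be sketched as follows. The function and constant names are hypothetical, and the item-to-points mapping is simply one convenient representation; only the item sets, the doubling rule, and the 2-point agreement band come from the text.

```python
# Item numbers (1-based) for the two eight-item short forms described in the text.
FIRST_HALF_ITEMS = tuple(range(1, 9))             # items 1-8
FRONT_BACK_ITEMS = (1, 2, 3, 4, 13, 14, 15, 16)   # items 1-4 and 13-16


def short_form_estimate(item_points, items):
    """Estimate the full-form score from a short form.

    item_points maps item number -> points earned (0, 1, or 2).
    The raw short-form score is doubled, as described in the text.
    """
    raw = sum(item_points[i] for i in items)
    return 2 * raw


def within_points(short_estimate, full_score, tolerance=2):
    """True if the short-form estimate is within `tolerance` points of the full score."""
    return abs(short_estimate - full_score) <= tolerance


if __name__ == "__main__":
    # Hypothetical protocol: a major-figure error on item 3, a peripheral error on item 10.
    points = {i: 2 for i in range(1, 17)}
    points[3] = 0
    points[10] = 1
    full = sum(points.values())                                  # 29
    first_half = short_form_estimate(points, FIRST_HALF_ITEMS)   # 28
    front_back = short_form_estimate(points, FRONT_BACK_ITEMS)   # 28
    print(full, first_half, front_back)
    print(within_points(first_half, full), within_points(front_back, full))
```

Because the estimate is always an even number, a single 1-point error on an item outside the short form is enough to produce a 1-point discrepancy with the full form.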
RELATIONSHIP BETWEEN VFDT PERFORMANCE AND DEMOGRAPHIC FACTORS
Evidence on the influence of demographic factors on VFDT performance is sketchy, largely because studies have been few and sample sizes often have been small. Benton et al.'s (1983b) original normative sample consisted of 85 healthy subjects aged 19-74 years. They found no effect of age, education, or gender on VFDT performance. Similarly, Larrabee (2003) studied VFDT performance in samples of severe closed head injury patients, moderate closed head injury patients, and mixed neurological and psychiatric patients and observed no significant relationship between age or education and VFDT performance. The lack of an age effect has also been observed by Axelrod and Ridder (as reported by Benton et al., 1983b, 1994b); however, Valdois et al. (1989) found that older subjects did show a
significant decline in mean scores compared to younger age groups. In addition, Campo and Morales (2003) noted a significant influence of age and education on VFDT performance in their large (n = 397) normative sample, although they also did not observe a gender effect. However, the extent to which their findings, on data collected in Spain, generalize to U.S. populations is unclear.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the VFDT normative reports, five key criterion variables were deemed critical. The first four of these relate to subject variables and the last to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases have generally been recommended (Hayes, 1963; Guilford, 1965) as providing a reliable estimate of the population mean. For the purpose of review, a minimum of 50 subjects per age group interval was deemed adequate.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since an effect of age on VFDT performance has been demonstrated in some studies.
Reporting of Education Level
Given that there has been some association between educational level and VFDT scores, information regarding educational level should be reported. It is unknown whether IQ is relevant, so until this is determined, it is best that information on IQ be provided.
Procedural Variables
Data Reporting
Means and standard deviations should be reported.
With these requirements for reporting in mind, the normative studies for the VFDT were examined.
SUMMARY OF THE STATUS OF THE NORMS
Only two studies report normative data for the VFDT. Benton et al. (1983a) report data for a group of individuals aged 19-74 years who are living in the United States, and Campo and Morales (2003) report data for a group of adults aged 18-59 years who are living in the south and southwest of Spain. The Benton et al. (1983b) study suffers from small sample size, and no performance SDs were reported (although they were subsequently provided by Caplan and Schultheis, 1998). As such, data from this report should be considered "provisional," and further normative research with U.S. populations across the age span is encouraged. The Campo and Morales (2003) normative data are adequate for patients aged 18-59 years who were born and educated in Spain.
In this chapter, normative publications are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 14. Table A14.1, the locator table, summarizes information provided in the studies described in this chapter.
SUMMARIES OF THE STUDIES
This section presents critiques of the two normative studies for the VFDT.
[VFDT.1] Benton, Hamsher, Varney and Spreen, 1983b (Table A14.2)
This is the report of the original normative study. The report provides the mean, median, and range of performance scores for two age groups of men and women on the VFDT (males 19-54 years old, n = 28; females 19-54 years old, n = 30; males 55-74 years old, n = 15; females 55-74 years old, n = 12). The normative sample consisted of 85 "patients without history or evidence of brain disease or healthy subjects" (p. 58). No further information regarding the population from which this sample was drawn was provided.
Study strengths
1. Data are stratified by age and gender.
Considerations regarding use of the study
1. Sample size per age/gender category is small.
2. Inadequate exclusion criteria; sample included patients with unspecified medical conditions excluding history or evidence of brain disease.
3. Age group intervals are too broad.
4. No information is reported regarding the educational level of the various gender/age groups. However, for the total sample, 72 subjects had >12 years of education and 13 subjects had <12 years of schooling.
5. No SDs are reported.
Other comments
1. In examining the distribution of scores of their normative sample, Benton et al. (1983b) found that the majority of subjects had near perfect scores (31-32) and that scores of ≥26 were obtained by 95% of the group. They concluded that a score of 24 or 25 should be considered "borderline or mildly defective performance," a score of 23 reflects "moderately defective performance," and scores of <23 indicate "severely defective performance."
2. More recently, Caplan and Schultheis (1998) provided the SDs for Benton et al.'s (1983b) original normative sample. They also generated an interpretive table with T scores, percentile equivalents, and clinical performance descriptors based on Heaton et al.'s (1991) interpretive schemes. They noted that because of ceiling effects on the VFDT, only a perfect score would produce an "above-average" rating according to the Heaton et al. criteria, while two "full-credit" errors plus a single 1-point error (5 points) would be sufficient to yield a "mildly impaired" score (i.e., score = 27). Using the Benton et al. description, a perfect score produces an "above-normal" performance rating, while three "full-credit" errors plus a single 1-point error (7 points) produces a "mildly defective" score (i.e., score = 25). A score of 26 would fall within the Benton et al. normal range, whereas using the Heaton et al. scheme, a score of 26 would be considered "mildly-moderately impaired." Further, a "moderately defective" score of 23, as suggested by Benton et al., would produce "moderate impairment" on the Heaton et al. scheme. Caplan and Schultheis (1998) concluded that the Heaton et al. scale differs from that of Benton et al. by increasing the cutoff value for impairment.
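As an illustration only, the Benton et al. (1983b) descriptive bands quoted above can be written as a simple lookup. The function name `classify_benton` is hypothetical, and the sketch assumes that scores of 26 and above are treated as falling within the normal range, as the text indicates for a score of 26. The Heaton et al. scheme is not reproduced because its cutoffs are not listed here.

```python
def classify_benton(score):
    """Map a VFDT total score (0-32) to the Benton et al. (1983b) descriptor.

    Bands follow the text: 24-25 borderline or mildly defective,
    23 moderately defective, below 23 severely defective.
    Scores of 26 or higher are assumed to fall within the normal range.
    """
    if not 0 <= score <= 32:
        raise ValueError("VFDT scores range from 0 to 32")
    if score >= 26:
        return "within normal range"
    if score >= 24:
        return "borderline or mildly defective"
    if score == 23:
        return "moderately defective"
    return "severely defective"


if __name__ == "__main__":
    for s in (32, 26, 25, 23, 20):
        print(s, classify_benton(s))
```

A lookup of this kind only restates the published labels; it is not a substitute for the interpretive tables in the original sources.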
[VFDT.2] Campo and Morales, 2003 (Table A14.3)
This study presents normative findings by age and education for a group of 397 healthy, unpaid volunteers, recruited by word of mouth, and living in the south and southwest of Spain. The sample was composed of 191 males and 206 females. The data are reported by two age groups (18-39, n = 222; 40-59, n = 175) and two educational levels (6-8 years, n = 115; ≥9 years, n = 282). To be a participant in this normative study, subjects had to speak Spanish as their primary language, to be independently functioning, and to have no history of neuropathological conditions, psychiatric hospitalization, abnormal psychomotor development, drug or alcohol abuse, or psychotropic drug use that could affect attention/concentration or produce drowsiness. Individuals with chronic medical conditions such as diabetes, hypertension, or mild hearing loss, however, were not excluded. The VFDT was administered and scored according to Benton et al. (1983b, 1994b). The three error types were categorized according to Kaskie and Storandt (1995) (i.e., major distortion, major rotation, peripheral error), and normative data for error types were reported.
Study strengths
1. Sample size per age/education category is adequate (n ≥ 50).
2. The sample composition is well described in terms of gender and geographic area.
3. Information is reported regarding the educational level of the various age groups.
4. Means and SDs for correct responses and errors are reported.
Considerations regarding use of the study
1. Age group intervals are broad.
2. Data were collected in Spain, which raises questions regarding their appropriateness for clinical use in the United States.
Other comments
1. Campo and Morales (2003) found that total scores of ≥26 were obtained by 94% of their sample. This result is practically identical to that reported by Benton et al. (1983b), who found that 95% of their sample obtained scores of ≥26.
CONCLUSIONS
The normative database for the VFDT has remained meager. The first normative study was reported by Benton et al. (1983b). Mean and median performance data for 85 adults, aged 16-75 years, were reported; however, no information regarding SDs was provided, which significantly limited the clinical utility of the test for several years. Campo and Morales (2003) provided the only other normative report, on a group of 397 healthy adults aged 18-59 living in the south and southwest of Spain. Two additional studies found in the literature (Lichtenberg et al., 1998; Nabors et al., 1996) are often cited as providing normative data for the VFDT. However, close inspection of the procedural details for these studies quickly reveals that the data were not collected from a group of normal healthy individuals but from a group of elderly medical rehabilitation inpatients. Although there has been limited research using short forms of the VFDT, no normative data have been reported for these forms.
Because of its design, the VFDT seems especially suited for use with senior adults. Most of the normative data have been collected on senior adults in the age range of 60-74 years. Additional studies with senior adults more advanced in age would be important since dementia is most common in old age.
15 Judgment of Line Orientation
BRIEF HISTORY OF THE TEST
The Judgment of Line Orientation Test (JLO) is designed to evaluate visual-spatial skills by assessing the ability to judge the
partial
with the test. Following the practice trials, all 30 test items are administered; there is no discontinuation rule. Each correct response is given one credit, and the total number of correct answers out of 30 is recorded.
The test was developed and first published by Benton et al. (1978). This study assessed the ability of individuals with left or right hemisphere lesions to discriminate line angles. Prior to publication of the JLO, these authors and others used a tachistoscope to examine line orientation discrimination in healthy and clinical populations (Benton et al., 1975; Fontenot & Benton, 1972; Warrington & Rabin, 1970). The JLO was created out of the desire to have a more convenient, brief clinical test that could be used with patients at bedside, if necessary. Further details regarding test development, administration, scoring, and norms on the initial sample were later published in a clinical manual, which also contains several other orientation and visual-spatial tests (Benton et al., 1983a).
Studies have consistently shown far greater right than left hemisphere association with performance on the JLO. In the original sample of patients with unilateral right or left hemisphere lesions, Benton et al. (1978) found that right hemisphere brain-damaged patients performed strikingly worse than patients with left hemisphere damage. While
left hemisphere-damaged patients performed more poorly than controls, only a minority (10%) were classified as moderately to severely impaired. Similar findings were reported by York and Cermak (1995) and Trahan (1998), who found poorer performance on the JLO for patients with right hemisphere cerebrovascular accidents (CVAs) relative to those with left CVAs and normal controls, with the poorest performance occurring in right CVA patients with left-sided visual neglect. Mehta and Newcombe (1996) suggested that in JLO performance the right hemisphere may be responsible for "metric measurement" but that the left hemisphere keeps track of and updates decisions in line orientation judgment. Consistent with the other studies, Hannay et al. (1987) reported increased blood flow to temporo-occipital regions bilaterally for participants performing a line orientation task similar to the JLO, but the greatest blood flow increase occurred in the right hemisphere. However, Gur and colleagues (2000) reported that right parietal temporal activation may be present only in males.
Impaired performance on the JLO has been reported in patients with various forms of dementia. A number of studies have shown that patients with Alzheimer's disease perform worse than normal controls on this task (Eslinger & Benton, 1983; Eslinger et al., 1985; Ska et al., 1990). Similarly, studies of Parkinson's disease patients have shown lower performance relative to normal controls (Flinton et al., 1998; Montgomery et al., 1993; Montse et al., 2001; Ska et al., 1990). Error types, such as vertical and horizontal errors or inter- and intraquadrant errors, on the JLO have also been examined in dementia patients and found by some authors to discriminate between controls and Alzheimer's patients (Ska et al., 1990) and Parkinson's patients (Montse et al., 2001), although Flinton et al. (1998) were unable to replicate these findings in Alzheimer's patients. Specifically, Flinton et al.
(1998) found that only two out of 10 JLO error types used by Ska et al. (1990) occurred with more frequency in Alzheimer's patients relative to controls. The error types included misjudging one oblique line with another that is separated by only one spacing
and incorrectly identifying a vertical line as an oblique or horizontal line. Flinton et al. (1998), however, recognized that their sample of Alzheimer's patients was slightly older, was more educated, and obtained better overall JLO scores than the sample used by Ska et al. (1990). Patients with Lewy body dementia with psychosis have been reported to commit more visual perceptual errors than Alzheimer's patients and patients with Lewy body dementia with predominantly parkinsonian features (Simard et al., 2003).
As a model for studying diseases such as Alzheimer's, the relationship between cholinergic system dysfunction and visual-spatial skills has been examined. Studies employing the JLO have reported mixed findings. Meador et al. (1993) administered scopolamine (an anticholinergic) vs. placebo to healthy middle-aged individuals and found that the scopolamine group performed significantly worse on the JLO relative to the placebo group. However, in a similar study, Obonsawin et al. (1998), using scopolamine vs. saline in normal middle-aged individuals, found no significant differences in JLO performance for the two groups. Relatively small sample sizes were used in both studies (n = 12 per group), and even when differences were statistically significant, mean score differences were rather small.
The few studies conducted on patients with mental illness have shown that JLO performance does not appear to be altered in patients with schizophrenia. Fleming and colleagues (1997) found no differences between schizophrenics and controls on the JLO, suggesting that basic visual-spatial skills are intact in this patient group. Similarly, Sweeney et al. (1991) reported no change in performance on the JLO from an acute baseline episode to a 1-year follow-up period in patients with schizophrenia; mean scores at both points were within the normal range.
In contrast, JLO scores have been observed to be lower in depression (Coello et al., 1990), and JLO performance has been reported to be predictive of aggression in forensic patients (Foster et al., 1993). Studies assessing the influence of various other factors on JLO performance have yielded interesting results. Rahman and Wilson (2003) found that heterosexual men outperformed
homosexual men on the JLO, but no differences were reported between hetero- and homosexual women. The authors suggest that underlying differences in brain structures, particularly those involving the parietal cortex, may explain the differences in performance for the two male groups. Interestingly, factors such as near-sightedness do not appear to affect JLO performance (Kempen et al., 1994). Finally, studies have shown that the JLO has some limited utility for the detection of malingering. That is, while scores may be lower for malingerers, classification rates for malingerers vs. nonmalingerers are quite low (Iverson, 2001; Meyers et al., 1999).
Psychometric Properties of the Test
In their manual, Benton et al. (1983a) report the split-half reliability of both Forms H and V to be high. The reliability for Form H, based on 40 participants, was 0.94, and that for Form V, based on a sample of 124 participants, was 0.89. Additionally, in a sample of 37 participants who were administered both versions of the test (with test probes separated by 6 hours to 21 days), test-retest reliability was 0.91 and the increase in mean scores from one test administration to the next was negligible (23.1-23.5), suggesting the absence of a practice effect. Validity studies have shown that the JLO discriminates patients with right hemisphere lesions from those with left hemisphere lesions (Benton et al., 1983a, 1994b; see above discussion). Adequate test reliability has also been reported for the shortened (odd-even) forms of the JLO (Woodard et al., 1998). For further information on the psychometric properties of the JLO, see Franzen (2000).
Alternate Brief Forms of the JLO
More recently, researchers have created shortened versions of the JLO (Qualls et al., 2000; Woodward et al., 1996). One method for shortening the JLO is to split the test into even- and odd-numbered items. This method has created two valid and reliable forms (Woodward et al., 1996, 1998). The authors also found that these two shorter forms produce similar mean scores, that the distribution of the scores across both forms is virtually identical, and that when the original JLO cutoff scores are used, the shortened versions have high accuracy in classification of patient groups. Similarly, Mount et al. (2002) observed that odd-even short forms are highly correlated with the total test (r=0.000.93) and that all patient scores derived from short forms were within 2 points of full scores. Winegarden et al. (1998) evaluated five shortened versions of the JLO, including a 10-item test (items 1-10), two 20-item versions (items 1-20 and 11-30), and odd-even versions, and found that using items 11-30 of Form V had the highest internal consistency and best overall correlation with the full version. Another shortened version of the JLO was created by Qualls et al. (2000), in which the Latin square randomization strategy was used to place each of the JLO items into one of two shortened forms. The shortened versions in Qualls et al.'s (2000) study showed adequate internal consistency and validity relative to the long form, but classification rates (normal vs. impaired) were lower when using the original JLO cutoff scores.
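The item subsets behind these shortened versions are easy to express directly. The sketch below is illustrative only: the dictionary keys and function name are hypothetical, scores are plain counts of correct responses (one credit per item, as in the full test), and no rescaling to a 30-point metric is applied because the text does not specify one. The Latin square forms of Qualls et al. (2000) are omitted because their item assignments are not given here.

```python
N_ITEMS = 30  # full JLO

# Item numbers (1-based) for the shortened versions named in the text.
SHORT_FORMS = {
    "odd": tuple(range(1, N_ITEMS + 1, 2)),        # 15 odd-numbered items
    "even": tuple(range(2, N_ITEMS + 1, 2)),       # 15 even-numbered items
    "items_1_10": tuple(range(1, 11)),
    "items_1_20": tuple(range(1, 21)),
    "items_11_30": tuple(range(11, N_ITEMS + 1)),
}


def count_correct(correct_by_item, items):
    """Number of correct responses among the given items (one credit each)."""
    return sum(1 for i in items if correct_by_item[i])


if __name__ == "__main__":
    # Hypothetical protocol: every item correct except items 5, 12, and 27.
    correct = {i: True for i in range(1, N_ITEMS + 1)}
    for missed in (5, 12, 27):
        correct[missed] = False

    print("full", count_correct(correct, range(1, N_ITEMS + 1)))
    for name, items in SHORT_FORMS.items():
        print(name, count_correct(correct, items), "of", len(items))
```

Scoring every candidate subset from the same full administration is also how agreement between short and full forms is typically examined in a dataset.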
RELATIONSHIP BETWEEN JLO PERFORMANCE AND DEMOGRAPHIC FACTORS
Effects of age and gender have consistently been reported in the JLO literature, while the impact of education on JLO performance has been more equivocal. Benton et al. (1978, 1983a) reported a moderate decline in JLO scores with age. Similar age effects were reported by Eslinger and Benton (1983) in older individuals (55+ years) but not by Ska et al. (1990). Subsequent studies have shown age effects in normal healthy individuals, with poorer performance with advanced age (Basso et al., 2000; Rahman and Wilson, 2003; Woodward et al., 1996). Only one study, using a shortened version (only odd items were presented), found no relationship between age and JLO performance (Salthouse et al., 1997). This may be due to the restricted number of test items presented to participants.
Benton et al.'s (1978, 1983a) original study revealed significant gender differences, with males outperforming females. In a study with relatively small sample sizes, Desmond et al. (1994) found gender differences in JLO performance in healthy controls and stroke patients, with males outperforming females in both groups. Montse et al. (2001) also found that in both Parkinson's and healthy control groups, females scored lower than their male counterparts on the JLO. Finally, Woodward et al. (1996), using a short form of the JLO, also observed better performance in male vs. female patients.
Woodward and colleagues (1996) observed a decline in JLO performance as a function of lowered education. However, Benton et al. (1983a) found a more complex relationship between JLO performance and education. Specifically, males under the age of 65 years displayed no differences in JLO scores as a result of education level (12+ vs. <12 years); however, females consistently showed a decline in scores as a result of the interaction of age and education status.
There is virtually no information regarding JLO performance in different ethnic groups. To date, only one study (Rey et al., 1999) has examined the performance of Hispanics on the JLO. These authors report median scores and cutoffs for "defective performance" for a group of South American, Central American, and Cuban Spanish-speaking individuals. They found that the distribution for Hispanic participants (median JLO score = 24, cutoff score = 17) was virtually identical to published norms on English-speaking individuals (median JLO score = 25, cutoff score = 15). While the findings of this study suggest that use of the Spanish adaptation of this test is valid, further research is needed.
For further information regarding the JLO, see Benton et al. (1983a, pp. 48-54), Lezak (1995, pp. 400-401), and Lezak et al. (2004, pp. 390-391).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the JLO normative reports, seven criterion variables were deemed critical. The first five of these relate to subject variables, and the remaining two refer to procedural issues.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Interval
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since a strong effect of age on JLO performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the possible association between education and JLO scores, information regarding educational level should be reported for each subgroup.
Reporting of Gender Composition
Given the strong association between gender and JLO performance in favor of males, information regarding gender composition should be reported for each subgroup, and preferably normative data should be presented by gender.
Procedural Variables
Description of Test Version
Full and shortened versions of the JLO exist. Description of the test version allows selection of the most appropriate norms.
Data Reporting
Group means and standard deviations for the number of correct responses (out of 30) should be presented.
SUMMARY OF THE STATUS OF THE NORMS
The information presented differs considerably across the studies reporting data for the JLO; some of these differences are summarized below. Only two studies were designed to provide normative information on the JLO. Benton et al. (1983a) report the original findings of the Benton et al. (1978) study, in which the JLO was developed; and Woodward et al. (1998) report normative data for two shortened, parallel forms of this test. Data for "normal control" groups from clinical comparison studies are also included in this chapter. Studies that did not include normal control groups or did not report means and SDs for the JLO are not reviewed in this chapter (Benton et al., 1981; Iverson, 2001; Montgomery et al., 1993; Ng et al., 2000; Qualls et al., 2000; Raskin et al., 1990; Seidenberg et al., 1995; Sweeney et al., 1991; Trahan, 1998; Winegarden et al., 1998; Woodard et al., 1996). Other studies were not included because JLO scores for normal controls were abnormally low (Desmond et al., 1994), authors used individuals referred for neuropsychological evaluation as their subject pool rather than healthy normal controls (Vanderploeg et al., 1997), or they used modified versions of the JLO for which the administration procedures were not clearly described (Gur et al., 2000; Levin et al., 1991).
The majority of the studies used the 30-item JLO (Form V or H), but three studies provide norms on 15-item short forms. Most studies report the number of correct responses as the outcome data; however, one study reports number of errors and one study reports T scores. While many of the studies present statistical differences between males and females, very few stratify the data by gender. Few studies stratify mean scores by age group, and no study reports data by educational level. Most studies used middle-aged (30s and 40s) or older (55+ years) samples.
Among all of the clinical studies available in the literature, we selected for review those that had well-defined samples, presented means and SDs for the JLO (with the exception of the original study by Benton et al., 1983a), and provided descriptive statistics for demographic factors such as mean age and education for the sample. Additionally, given that shortened versions of the JLO save on administration time and may be desirable for the clinician, one such study with adequate normative data is presented.
In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 15. Table A15.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES UL0.1] Benton, Hamsher, Varney, and Spreen, 1983a (Table A15.2)
The normative data in this manual are based on the original study in which the JLO was constructed and developed (Benton et al., 1978). The original study used 144 controls, seven of whom were left-handed (Benton et al., 1983a), but the later manual reports data for 137 control participants. It is unclear whether it is the seven left-handed individuals whose data are not included in this manual. Patients who evidenced no brain disease were recruited from the general medical services of the University of Iowa to serve as the control group. Normative data are reported for three age groups (16-49, 50-64, 65-75) by gender. The average educational level of the participants was approximately 12 years. Mean JLO scores, without SDs, are reported for each age-by-gender group.

Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. Data are partitioned by age groups and gender, and correction scores for age and gender are provided.
3. Test administration procedures are specified.
4. A distribution of corrected scores is provided along with percentiles and classification labels based on the performance of normal participants relative to those with brain lesions.

¹ Norms for adolescents are available in Baron (2004).

Considerations regarding use of the study
1. No SDs for JLO scores are reported.
2. Control participants are medical patients (with no evidence of brain disease) and thus may represent a biased sample.
3. No other exclusion criteria are provided.
4. Overall sample is adequate, but individual cells are relatively small.

Other comments
1. Specific corrections (point additions) to the achieved scores need to be made, depending on age and gender.

[JLO.2] Eslinger and Benton, 1983 (Table A15.3)

The control sample included 178 volunteers (35 males, 143 females) aged 65-94 years. Participants were recruited from senior-citizen organizations and retirement communities in and around Iowa City. Participants with a history of neurological disease and psychiatric hospitalizations were excluded from the study. The average educational level for males and females was 12.6 and 13.1 years, respectively. The sample was stratified by three age groups (65-74, 75-84, 85-94) but not by gender. Standard JLO administration procedures were used. The results revealed a steady decline in performance with advancing age, with an approximately 0.5 SD decline per decade after the age of 65 years.

Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. Large overall sample size.
3. Data are partitioned by age group, and in particular, old and very old groups are included.
4. Adequate exclusion criteria.
5. Test administration procedures are specified.
6. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. The data are not partitioned by gender, and there are two to three times more females than males in each age group.
2. Raw scores are not reported. Means and SDs are reported in T scores.
3. It is unclear whether this elderly sample is all community-dwelling.

[JLO.3] Eslinger, Damasio, Benton, and Van Allen, 1985 (Table A15.4)

JLO scores were collected on 53 normal volunteers (25 males, 28 females) aged 60-88, recruited through senior-citizen and community organizations in the Iowa City area. Mean age was 73.1, and mean education was 12.0. All were independent, community-dwelling individuals who were screened for neurological disorder (including head injury and alcoholism), psychiatric illness requiring hospitalization, and any disabling medical or physical condition. Subjects considered themselves to be in generally good physical and mental health.

Study strengths
1. Minimally adequate sample size.
2. Information on age, gender, education, geographic area, and recruitment strategies.
3. Means and SDs for the test scores are reported.

Consideration regarding use of the study
1. Data are not stratified by gender or age, although age grouping is fairly narrow.

[JLO.4] Rao, Mittenberg, Bernardin, Haughton, and Leo, 1989 (Table A15.5)

This study examined the effects of focal periventricular white-matter changes on cognitive functioning in healthy adults. The authors selected 40 participants (10 males, 30 females) who had normal brain imaging to serve as controls. Their ages ranged 25-60 years, with a mean of 42.8 (8.1), mean education of 14.0 (2.3), and mean Verbal IQ of 108.9 (11.9). All participants were recruited through newspaper advertisements in the Milwaukee, Wisconsin, area. Additional exclusion criteria were a history of hypertension, cardiac or cerebrovascular disease, neurological illness, head injury, substance abuse, or psychiatric illness. Participants underwent physical and neurological exams. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, WAIS-R Verbal IQ, gender, MMSE scores, geographic area, and recruitment procedures.
2. Adequate sample size.
3. Exclusion criteria are provided.
4. Test administration procedures are described.
5. Means and SDs for the test scores are reported.

Considerations regarding use of study
1. Relatively small sample size for a wide age range.
2. The data are not stratified by age or gender.

[JLO.5] Ska, Poissant, and Joanette, 1990 (Table A15.6)

In this study, 95 nonhospitalized normal elderly controls (19 males, 76 females) were recruited from the community. Exclusion criteria were a history of alcoholism, drug abuse, or neurological or psychiatric illness. The sample was divided into three age groups: 55-64, 65-74, and 75-84. Average educational levels were 10.13 (3.38) for the first age group, 9.46 (3.40) for the second group, and 8.06 (2.77) for the third group. Standard JLO administration procedures were used. The authors found a decline in JLO performance with age, but the various types of JLO error evaluated in this study did not differ for different age groups in the normal controls. Error types differed for Alzheimer's patients vs. normal controls.

Study strengths
1. The sample composition is well described in terms of age, education, gender, and recruitment procedures.
2. The data are stratified by age, and in particular, older age groups are included.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means, SDs, and ranges for the test scores are reported.

Considerations regarding use of the study
1. Overall sample is adequate, but the sample size for the oldest group is relatively small.
2. The data are not stratified by gender. There are two or, in some age groups, three times as many females as males.
3. Data were collected in Canada, which may limit their usefulness for clinical interpretation in the United States.

[JLO.6] Rao, Leo, Bernardin, and Unverzagt, 1991a (Table A15.7)

The study examined the pattern of cognitive deficits in patients with multiple sclerosis using a brief neuropsychological battery. The authors recruited 100 healthy adults (75 males, 25 females) with newspaper advertisements from the Milwaukee, Wisconsin, area. The average age of the participants was 46.0 (11.6), average education was 13.3 (2.0), and average Verbal IQ was 106.5 (6.9). Exclusion criteria were history of substance abuse, psychiatric illness, head injury, or other neurological disorders. All controls were given neurological evaluations and MRI. Only one participant was nonwhite. Standard JLO administration procedures were used. All participants were paid.

Study strengths
1. The sample composition is well described in terms of age, education, WAIS-R Verbal IQ, gender, geographic area, and recruitment procedures.
2. Large sample size.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.

Consideration regarding use of study
1. The data are not partitioned by age or gender.

[JLO.7] Meador, Moore, Nichols, Abney, Taylor, Zamrini, and Loring, 1993 (Table A15.8)

Twelve healthy individuals (eight males, four females) served in this study. Participants were tested in a counterbalanced, repeated-measures design, in which they were administered saline or scopolamine (an anticholinergic) at different times. Participants ranged in age from 20 to 42, with a mean age of 31 years, and had completed 12-20 years of education, with an average of 16 years. Test sessions were separated by at least 72 hours. Their JLO scores under the saline condition are reported in this chapter. Participants were staff employees of the Medical College of Georgia and were paid for their participation. They were screened for use of psychoactive drugs and history of neurological, psychiatric, or "major" medical disease.

Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. Exclusion criteria are provided.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.

Considerations regarding use of study
1. The sample size is small.
2. The data are not partitioned by age or gender.
3. Educational level for the sample is high.
4. Participants were staff of the medical center and may represent a biased sample.
5. Despite counterbalancing methods, JLO scores for six participants (50% of the sample) are actually retest scores and may slightly inflate the mean value reported due to a practice effect.

[JLO.8] Kempen, Kritchevsky, and Feldman, 1994 (Table A15.9)

The authors compared JLO performance of 13 individuals (three males, 10 females) with normal vision to that of individuals who were visually impaired. Participants ranged in age from 55 to 74, with an average age of 65.2 (5.9), and obtained an average of 16.5 (2.9) years of education. Participants were recruited at the University of California San Diego School of Medicine during a routine ophthalmic exam by one of the authors. The only exclusion criterion was poor visual acuity (Snellen distance acuity of ≤20/50). Standard JLO administration procedures were used. Interestingly, the authors found that poor vision did not affect JLO performance.

Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. Test administration procedures are specified.
3. Means and SDs for the test scores are reported.

Considerations regarding use of study
1. Sample size is small.
2. Inadequate exclusion criteria used.
3. The data are not partitioned by age or gender.
4. Educational level for the sample is high.

[JLO.9] York and Cermak, 1995 (Table A15.10)

The authors compared JLO scores between 15 control participants (six males, nine females) and patients with CVA. The average age of participants was 61.89 (8.67) and average education was 15.07 (3.41) years. Normal controls were selected from the orthopedic floor of two rehabilitation hospitals in Portland, Maine. Control participants did not have a history of CVA or other neurological deficits. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
2. Test administration procedures are specified.
3. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Sample size is small.
2. The data are not partitioned by age or gender.
3. Educational level for the sample is high.
4. Exclusion criteria are very limited (e.g., no mention of psychiatric disorders) and not well defined.
5. Controls are hospital patients and may represent a biased sample.

[JLO.10] Ivnik, Malec, Smith, Tangalos, and Petersen, 1996 (Table A15.11)

The study provides age-specific norms for the JLO test obtained in Mayo's Older Americans Normative Studies (MOANS) project, which obtains normative data for elderly individuals on different neuropsychological tests. The total sample consisted of 746 cognitively normal volunteers residing in Minnesota, over age 55, 216 of whom participated in JLO testing. Mean MAYO FSIQ (which differs somewhat from the standard WAIS-R FSIQ) for the whole sample was 106.2 (±14.0) and mean Mayo General Memory Index on the WMS-R was 106.2 (±14.2). For a description of their samples, the authors refer to their earlier publications. Standard JLO administration procedures were used. Age categorization used the midpoint interval technique. The raw score distribution for each test at each midpoint age was "normalized" by assigning standard scores with a mean of 10 and SD of 3, based on actual percentile ranks. The authors provided tables of age-corrected norms for each age group (see below). The procedure for clinical application of these data is described in the original article (Ivnik et al., 1996) as follows:

    first select the table that corresponds to that person's age. Enter the table with the test's raw score; do not use "corrected" or "final" scores for tests that might present their own age- or education-adjustments. Select the appropriate column in the table for that test. The corresponding row in the left-most column in each table provides the MOANS Age-Corrected Scaled Score ... for your subject's raw score; the corresponding row in the right-most column indicates the percentile range for that same score.

Further, linear regressions should be applied to the normalized, age-corrected MOANS scaled scores (A-MSS) derived from the tables, to adjust patient scores for education. Age- and education-corrected scores for the JLO (A&E-MSS) can be calculated as follows:

$$\text{A\&E-MSS}_{\text{JLO}} = K + (W_1 \times \text{A-MSS}_{\text{JLO}}) - (W_2 \times \text{Education})$$

where the following indices are specified for the JLO: $K = 1.54$, $W_1 = 1.10$, $W_2 = 0.23$.

Education should enter the formula as years of formal schooling. The tables of scaled scores per age group provided by the authors should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1996). Therefore, they are not reproduced in this book. Interested readers are referred to the original article. Table A15.11 summarizes sample sizes for the different demographic groups.

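The education adjustment described above is a single linear formula, so a small function can make the arithmetic concrete. The following is a minimal sketch, assuming the JLO constants quoted in this chapter (K = 1.54, W1 = 1.10, W2 = 0.23). The function and parameter names are hypothetical, and the A-MSS input must still be looked up in the age-specific tables of Ivnik et al. (1996), which are not reproduced here. This chapter does not state whether or how the result should be rounded, so no rounding is applied.

```python
# Illustrative sketch only; names are hypothetical, not from Ivnik et al. (1996).
# Constants for the JLO as quoted in this chapter.
JLO_K = 1.54
JLO_W1 = 1.10
JLO_W2 = 0.23


def age_education_corrected_mss(a_mss, education_years,
                                k=JLO_K, w1=JLO_W1, w2=JLO_W2):
    """Return A&E-MSS = K + (W1 * A-MSS) - (W2 * Education).

    a_mss: MOANS age-corrected scaled score taken from the published tables.
    education_years: years of formal schooling.
    No rounding is applied (assumption: the chapter does not specify any).
    """
    return k + (w1 * a_mss) - (w2 * education_years)


if __name__ == "__main__":
    # Hypothetical patient: A-MSS of 10, 12 years of education.
    score = age_education_corrected_mss(10, 12)
    print(f"A&E-MSS = {score:.2f}")
```

For example, an A-MSS of 10 with 12 years of education gives 1.54 + 11.00 - 2.76 = 9.78. Because the education term is subtracted, more years of schooling lower the corrected score for the same A-MSS.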
Study strengths
1. Information regarding age, education, IQ, gender, ethnicity, handedness, and geographic area is reported.
2. The data were stratified by age group based on the midpoint interval technique.
3. The innovative scoring system was well described. The authors developed new indices of performance.

Considerations regarding use of the study
1. Sample size for most age groups is small (ranging 2-45), with the exception of the 80-84 group (n = 69).
2. The measures proposed by the authors are quite complicated and might be difficult to use in clinical practice.
3. Participants with prior history of neurological, psychiatric, or chronic medical illnesses were included.

Other comments
1. The theoretical assumptions underlying this normative project have been presented in Ivnik et al. (1992a,b).
2. The authors cautioned that the validity of the MAYO indices depends heavily on the match of demographic features of the individual to the normative sample presented in this article.
3. Correlation of the JLO with age was -0.25, whereas correlations with education and gender were 0.21 and -0.24, respectively.

[JLO.11] Fleming, Goldberg, Binks, Randolph, Gold, and Weinberger, 1997 (Table A15.12)

The control group in this study comprised 27 paid volunteers (16 males, 11 females) recruited from the Washington, D.C., community. The average age of participants was 26.1 (7.4), and average education was 15.4 (3.2) years. Participants were screened by psychiatrists for substance use and major psychiatric illness, neurological disease, and other medical diagnoses. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, gender, IQ, geographic area, and recruitment procedures.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Sample size is small.
2. The data are not partitioned by age or gender.
3. Educational level for the sample is high.

[JLO.12] Finton, Lucas, Graff-Radford, and Uitti, 1998 (Table A15.13)

JLO scores for 24 normal elderly control participants (13 males, 11 females) were obtained. The average age of participants was 70.4 (6.0), average education was 15.7 (2.8), and average Dementia Rating Scale score was 138.1 (3.5). Control participants had no complaints of cognitive deficits and did not display disorders affecting cognition on physical or neurological examination. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, and gender.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.

Considerations regarding use of study
1. Sample size is small.
2. The data are not partitioned by age or gender.
3. Educational level for the sample is high.
4. Recruitment procedures are not reported.

[JLO.13] Obonsawin, Robertson, Crawford, Perera, Walker, Blackmore, Parker, and Besson, 1998 (Table A15.14)

The study was designed to assess the effects of scopolamine (an anticholinergic) on neuropsychological test performance. Twelve healthy participants (three males, nine females) who were injected with a saline solution served as the normal controls. Participants ranged in age from 20 to 36, with an average age of 40.83 (12.55), and obtained an average score of 35.09 (7.23) on the National Adult Reading Test. The authors do not mention precisely from what region of the United Kingdom participants were drawn. All participants had a medical examination. In addition, abstinence from alcohol for at least 24 hours and from coffee and tea for at least 12 hours was required. Standard JLO administration procedures were used. Participants were paid.

Study strengths
1. The sample composition is well described in terms of age, gender, and estimated IQ.
2. Test administration procedures are specified.
3. Means and SDs for the test scores are provided.

Considerations regarding use of study
1. Sample size is small.
2. Educational level is not reported.
3. Exclusion criteria are not clearly described.
4. Recruitment procedures are not reported.
5. The data are not partitioned by age or gender.

[JLO.14] Meyers, Galinsky, and Volbrecht, 1999 (Table A15.15)

The authors obtained JLO scores on 30 (14 male, 16 female) normally functioning individuals. The average age of participants was 36.70 (20.50), average education was 13.67 (3.47), and average IQ was 113.97 (.3.51). Exclusion criteria were history of neurological disease, closed head injury, motor vehicle accidents, learning disabilities, loss of consciousness, or "other" conditions. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, gender, and IQ.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Sample size is relatively small.
2. Data are not partitioned by age or gender.
3. There is a wide spread in the age of participants (SD is nearly 21 years).
4. Recruitment procedures are not reported.

[JLO.15] Basso, Harrington, Matson, and Lowery, 2000 (Table A15.16)

The authors sampled 52 healthy undergraduate students (26 males, 26 females). The average age of the male participants was 22.04 (3.53) and the average age of the female participants was 22.62 (7.24). The average education for males was 13.92 (1.01) and the average education for females was 13.85 (1.05). Males had an average IQ of 112.67 (3.78) and females had an average IQ of 112.08 (4.20). No statistical differences were found between males and females on age, education, and IQ. All participants were fluent in English. There were 25 Caucasian and one African-American females and 21 Caucasian, three African-American, and two Hispanic males. All participants were assessed to be right-handed by the Edinburgh Handedness Inventory, and all denied a history of learning disability, neurological illness, psychiatric disease, or head trauma during an interview. Standard JLO administration procedures were used.

Study strengths
1. Adequate sample size.
2. The sample composition is well described in terms of age, education, gender, and ethnicity.
3. Data are partitioned by gender.
4. Adequate exclusion criteria.
5. Test administration procedures are specified.
6. Means are reported, and we were able to calculate SDs from the confidence intervals provided.

Consideration regarding use of the study
1. Recruitment procedures are not reported.

[JLO.16] Bell, Hermann, Woodard, Jones, Rutecki, Sheth, Dow, and Seidenberg, 2001 (Table A15.17)

The study examined object-naming ability and depth of semantic knowledge in healthy controls and patients with early-onset temporal lobe epilepsy (TLE), using the JLO as part of a larger neuropsychological battery. The control group included 29 friends, relatives, and spouses of TLE patients (72% female), aged 16-60 years, with a mean age of 34.4 (12.5) years; FSIQ (as measured with the WAIS-III seven-subtest short form) of 69-110, with a mean FSIQ of 97.7 (6.4); and mean education of 13.0 (1.7) years. Exclusion criteria were current substance abuse, psychotropic medication use, medical or psychiatric condition that could affect cognitive functioning, past episode of loss of consciousness >5 minutes, an identified developmental learning disorder, and repetition of a grade in school. Standard JLO administration procedures were used.

Study strengths
1. The sample composition is well described in terms of age, education, gender, FSIQ, and recruitment procedures.
2. Adequate exclusion criteria.
3. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Sample size is small.
2. Sample includes a wide age range.
3. The data are not partitioned by age or gender.

[JLO.17] Montse, Pere, Carme, Francese, and Eduardo, 2001 (Table A15.18)

The authors examined the various types of error made by patients with Parkinson's disease and normal controls on the JLO. A sample of 76 (38 males, 38 females) healthy individuals was selected as a control. Participants ranged in age from 39 to 85, with an average of 63.84 (9.93), and ranged in educational level from 1 to 20 years, with an average of 9.64 (4.17). Participants were free of neurological and psychiatric illness. Controls were spouses or friends of the Parkinson's patients and were recruited from Barcelona, Spain. They were administered Form H of the JLO. A subset of the sample was retested. Standard JLO administration procedures were used. The number of errors on the JLO was used as an outcome variable.

Study strengths
1. Adequate sample size.
2. The sample composition is well described in terms of age, education, gender, geographic area, and recruitment procedures.
3. Test administration procedures are specified.
4. Mean error scores and SDs are reported.

Considerations regarding use of the study
1. The data are not stratified by age or gender.
2. Exclusion criteria are not clearly described, except to say that participants had no history of neurological or psychiatric illness.
3. There is a wide spread in age (39-85) and education (1-23).
4. The data were obtained in Spain, which may limit their usefulness for clinical interpretation in the United States.

[JLO.18] Rahman and Wilson, 2003 (Table A15.19)

This study examined various aspects of cognitive functioning in heterosexual and homosexual males and females. The authors recruited 240 healthy participants. Data are reported for four groups: heterosexual males, homosexual males, heterosexual females, and homosexual females. Participants were aged 18-40 years and were recruited from King's College London sources, the local community, and social networks. The authors admit that the participants "may have been oversampled from university sources." All participants were right-handed. Standard JLO administration procedures were used. Data are stratified by gender, and information regarding age, education, and performance on Raven's Matrices is provided.

Study strengths
1. Relatively large sample size.
2. The sample composition is well described in terms of age, education, gender, Raven's Matrices score, geographic area, and recruitment criteria.
3. Data are partitioned by gender.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.

Considerations regarding use of the study
1. Data are not partitioned by age.
2. Educational level for the sample is high.
3. No exclusion criteria are provided.
4. By the authors' own admission, participants were recruited mostly from university sources, possibly affecting the generalizability of the data.

Short Forms

[JLO.19] Salthouse, Toth, Hancock, and Woodard, 1997 (Short form) (Table A15.20)

Using the shortened format of the JLO, these authors examined the relationship between age and JLO performance (as well as other neuropsychological tests) in a group of healthy adults (n = 124; approximately 47% male, 53% female) aged 18-78 years. Participants were included if reported to be in "reasonably good health," to not be a current student, and to have at least 11 years of education. No other exclusion criteria are reported. Recruitment procedures are not provided, nor is the location from which participants were recruited. Participants were administered a battery of neuropsychological tests in their homes. The data were stratified into three age groupings: 18-39 years [mean age = 29.0 (4.8); mean education = 15.5 (1.7)], 40-59 years [mean age = 49.1 (5.1); mean education = 15.2 (2.5)], and 60-78 years [mean age = 69.2 (5.11); mean education = 15.3 (2.6)]. Test procedures for administering only the odd-numbered items of the JLO were used (Woodard et al., 1996).

Study strengths
1. Sample size is relatively large.
2. The sample composition is well described in terms of age, education, and gender.
3. Data are partitioned by three age groups.
4. Test administration procedures are specified.
5. Means and SDs are reported for each short form.

Considerations regarding use of the study
1. Recruitment procedures are not reported.
2. Exclusion criteria are not well identified.

[JLO.20] Woodard, Benedict, Salthouse, Toth, Zgaljardic, and Hancock, 1998 (Short form) (Table A15.21)

Using the odd and even items of JLO Form V, two different short versions of the test were created, Form E and Form O. These two versions were administered to 82 (22% male) "healthy, nondepressed, community dwelling, geriatric individuals" from western New York State. Mean age, education, and MMSE scores are reported. Participants ranged in age from 55 to 84, with an average age of 65.8 (6.7); educational level ranged 9-20 years, with an average education of 14.0 (2.3) years; and premorbid IQ ranged 86-121, with an average IQ of 110.3 (6.6). The sample was primarily Caucasian (97.6%). Additional raw score to standard score conversion tables are provided in the original article but are not reproduced in this chapter.

Study strengths
1. Sample size is relatively large.
2. The sample composition is well described in terms of age, education, gender, and geographic region.
3. Test administration procedures are specified.
4. Means and SDs are reported for each short form.

Considerations regarding use of the study
1. Recruitment procedures are not reported.
2. Exclusion criteria are not well identified.
3. Data are not partitioned by age or gender.

CONCLUSIONS
As the literature above suggests, the JLO is a clinically useful test for assessing visual-spatial functioning in various patient populations. Older age has been consistently associated with lower JLO performance; however, only three of the studies stratified data by age, and of these, two included data only for older age groups (i.e., ≥55 years). Further, most of the normative studies have been based on small sample sizes, and of the four studies which did have sample sizes >50, three did not stratify by age; if they had, the resulting cells would have had small sample sizes. Consistent gender effects have been reported, with males outperforming females by approximately 2 points. However, only three studies stratified data by gender. Finally, whether education substantially impacts JLO performance is less clear, but of concern, the educational levels of the samples, when reported, have typically been high (i.e., ≥13 years in 10 studies) and data have not been stratified by educational level.²

² Meta-analyses for the JLO were conducted using data reported in this chapter. Although the R² and significance level for the resulting regression were minimally acceptable, we felt that the solution was not empirically supported by the data, as the majority of the data points aggregated at the younger and older age ranges, with a paucity of data in between. Therefore, the results of the meta-analyses are not presented in this chapter.

16 Design Fluency Tests
BRIEF HISTORY OF THE TESTS

The first nonverbal fluency test was developed by Jones-Gotman and Milner (1977) and was designed to be a nonverbal version of the Thurstone and Thurstone (1949) verbal fluency test. This original design fluency test consisted of two trials: a 5-minute unstructured task, in which the patient was instructed to rapidly draw novel, nonnameable designs (excluding scribbles), and a 4-minute trial, in which the patient was required to rapidly draw novel designs containing exactly four lines. However, as rightly pointed out by Hanks et al. (1996), a number of factors, such as the effect of design complexity on production rate and questions regarding what constitutes a perseverative error vs. unique design, make clinical use of this test problematic. Regard et al. (1982) subsequently created the Five-Point Test, a more structured format than the Jones-Gotman/Milner nonverbal fluency test. This test was modified (e.g., dot arrangement was changed and time limit reduced) and used as part I of the subsequent Ruff Figural Fluency Test (RFFT). The RFFT, developed by Ruff and colleagues (Evans et al., 1985; Ruff et al., 1987), consists of five parts, with each part containing 35 items of five dots drawn within a 3 cm square, arranged in a 5" x 7" array drawn in black and white on letter-sized paper. The dots are arranged in a symmetrical, circular pattern for parts I, II, and III, with the two latter parts containing interference objects such as diamond shapes or preconnected dots. In parts IV and V, the five dots are arranged in a nonsymmetrical fashion and there are no interference objects. For each part of the RFFT, the examinee is instructed to make a unique design in each square by connecting two or more dots with a straight line. The scores used for interpretation include total number of novel designs produced in the five 1-minute trials, number of designs repeated (perseverations), and the ratio of perseverations to unique designs produced. However, additional indices of production strategy (Ross et al., 2003; Ruff, 1988) are also available. While this version of design fluency has been gaining widespread use, criticisms have been raised that, although the RFFT structured format may increase scoring reliability, the unstructured Jones-Gotman/Milner version may better measure the initiation and organization found to be deficient in patients with frontal executive dysfunction (suggested by the fact that the free condition on the Jones-Gotman/Milner version is more sensitive than the fixed condition; Harter et al., 1999).

Less commonly used adaptations of design fluency tasks have also been developed: (1) providing a sheet with five lines and instructing participants to produce three different shapes (outcome measure is time to complete three designs; Kivircik et al., 2003); (2) providing four flat pieces of plastic, two straight and two curved, and instructing participants to create as many designs (representational or nonrepresentational, with pieces touching at least once) as possible within 3 minutes (Griffiths, 1991); (3) instructing participants to draw as many designs as possible in 2 minutes using three lines (two vertical, one semicircular) (Allen et al., 1996); and (4) providing participants with three shapes (two straight lines and an arc) and instructing them to make as many different figures as possible using all three shapes within 2 minutes (Hanks et al., 1996). A design fluency task has also been included in the recently published Delis-Kaplan Executive Function System (Delis et al., 2001). This task involves three 1-minute trials: a "basic" trial, in which the participant is presented with 35 boxed five-dot arrays and instructed to make designs consisting of four straight lines which connect to each other; a "filter" task, in which the subject is presented with 35 10-dot arrays (five solid dots and five clear dots) and instructed to draw designs using the empty dots only; and a "switch" trial, in which participants are again presented with the identical 35 10-dot arrays but told to draw designs by switching back and forth between solid and clear dots. The number of correct designs, percentage of errors (errors divided by number of designs), inappropriate designs (e.g., designs with five lines), and perseverative errors are tabulated. The above adaptations will not be covered in this chapter due to the lack of empirical research using these techniques.

While relatively few clinical studies have been conducted with nonverbal fluency tasks, these tests in general have been shown to be sensitive to frontal lobe impairment (Butler et al., 1993; Taylor et al., 1986), with some evidence suggesting more sensitivity to right frontal damage, although functional imaging studies have revealed activation of both frontal lobes in design generation (Elfgren & Risberg, 1998). Jones-Gotman and Milner (1977) found that patients with primarily right frontal or right frontal-central cortical lesions were most impaired on design generation, followed by
patients with right temporal or left frontal lesions. Subsequently, Jones-Gotman (1991a, b) reported that poor performance on design fluency was unique to patients with right frontal lesions. In a comparison of patients with primarily right vs. left frontotemporal lobar degeneration (FTD) (as demonstrated by asymmetrical SPECT perfusion patterns), using a difference score consisting of verbal fluency (FAS) minus design fluency, patients with right FTD displayed higher verbal fluency and lower design fluency than patients with left FTD (Boone et al., 1999). Using the Five-Point Test, Lee et al. (1997) found that patients with frontal lobe dysfunction committed a greater percentage of perseverative errors than non-frontal neurological and psychiatric patients. Furthermore, the Five-Point Test more effectively classified right frontal lobe patients relative to patients with damage to other brain regions. However, Tucha et al. (1999) failed to corroborate this observation; they reported that right and left frontal lobe-lesioned patients performed comparably to each other and to controls. Ruff et al. (1986b) found that patients with severe head injuries produced fewer designs on the RFFT than those with mild head injuries. In a subsequent study, Ruff et al. (1994) assessed RFFT and verbal fluency performance in patients with right frontal, left frontal, right posterior, and left posterior lobe-damaged groups and found that, relative to the other groups, the right frontal group by far performed worse on the RFFT but not on the verbal fluency test. Suchy et al. (2003) observed that the RFFT was better than FAS (phonemic verbal fluency) at discriminating between frontal and temporal foci seizures, although FAS was slightly better in classifying right vs. left seizure foci.
Design fluency performance is also suppressed in conditions associated with nonlateralized brain dysfunction, such as Parkinson's disease and Alzheimer's disease (Fama et al., 1998), attention-deficit hyperactivity disorder (ADHD) (Rapport et al., 2001), obsessive-compulsive disorder (OCD) (Mataix-Cols et al., 1999), suicidal behavior (Bartfai et al., 1990), autism (Turner, 1999), transient global amnesia (Stillhard et al., 1990), amyotrophic
lateral sclerosis (Abrahams et al., 2000), negative symptoms in schizophrenia (Stolar et al., 1994), head injury (Harter et al., 1999), exposure to alcohol in utero (Connor et al., 2001), schizoaffective disorder (Beatty et al., 1993), borderline personality disorder (Judd & Ruff, 1993), "high hostile" males (Williamson & Harrison, 2002), and bilateral anterior cingulate cortex lesions secondary to cingulotomy for chronic pain (Cohen et al., 1999). Normal design fluency has been reported in dyslexia (Griffiths, 1991) and in individuals with post-concussion-type symptoms (Chan, 2001), and improved performance has been reported following glucose ingestion (Allen et al., 1996) and hypnosis (Gruzelier & Warren, 1993), in participants high in need for achievement (Allen et al., 1992), and after pallidotomy for treatment of advanced Parkinson's disease (Lacritz et al., 2000). RFFT scores are modestly related to scores from the Jones-Gotman/Milner version, with correlations ranging from 0.25 for the free condition to 0.38 for the fixed condition (Demakis & Harrison, 1997). Further details regarding the RFFT testing materials, administration procedures, and scoring can be obtained in the test manual and kit (see Appendix 1 for ordering information) and in Lezak et al. (2004). Specific administration and scoring instructions for the Jones-Gotman/Milner design fluency task are contained in Spreen and Strauss (1998, pp. 199-201).
Psychometric Properties of the Design Fluency Tests
The RFFT
The RFFT has demonstrated adequate reliability and construct validity. Test-retest reliability coefficients at 6 months for unique designs produced on the individual trials range 0.58-0.69 (Ruff, 1996). The highest r value is reported for total designs produced for all five trials, ranging from 0.85 when testing intervals are separated by 4-7 weeks (Ross et al., 2003) to 0.88 when testing intervals are separated by approximately 3 weeks (Demakis, 1999). Correlation coefficients of 0.71-0.76 at 6- to 12-month testing intervals have been reported (Basso et al., 1999; Ruff, 1996). An average increase of eight designs has been reported on retesting at approximately 1-month (Ross et al., 2003) and 6-month (Ruff, 1988) intervals, while an average gain of six or seven designs has been documented at 12 months (Basso et al., 1999). However, a much greater practice effect was reported at 3 weeks, equivalent to an average of 17 more unique designs produced on retesting (Demakis, 1999). Berning et al. (1998) reported interrater reliabilities of 0.93 (novel designs), 0.74 (perseverations), and 0.66 (error ratio); Ross et al. (2003) observed intraclass correlations of 0.82-0.96 for eight different RFFT scores; and Sands (1998) documented a coefficient of 0.99 for unique design and perseverative errors. Factor analyses have revealed that total unique designs loaded on initiation measures, whereas error ratios loaded on planning factors (Ruff, 1996).
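The two quantities discussed above, a test-retest correlation and an average practice gain, can both be computed from paired scores. The sketch below assumes a plain Pearson correlation and a simple mean difference; the cited studies may have used other procedures, and the function names and paired scores are hypothetical values made up for illustration.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between paired test and retest scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long score lists with at least two pairs")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    sxx = sum((a - mean_a) ** 2 for a in first)
    syy = sum((b - mean_b) ** 2 for b in second)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores has no variance")
    return sxy / sqrt(sxx * syy)


def mean_gain(first, second):
    """Average retest-minus-test difference (the practice effect)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty score lists")
    return sum(b - a for a, b in zip(first, second)) / len(first)


# Hypothetical total unique designs at testing and retesting (not real data).
time1 = [80, 95, 102, 88, 110]
time2 = [88, 101, 112, 97, 117]
print(round(pearson_r(time1, time2), 3), mean_gain(time1, time2))
```

A high correlation together with a nonzero mean gain, as in this made-up example, illustrates why rank-order stability and practice effects are reported separately.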
Design Fluency Test (Jones-Gotman/Milner Version)
Test-retest correlations at 1 month were reported at 0.69 for novel designs, while values of 0.77, 0.91, and 0.51 were documented for unique scoring parameters of complexity, concrete responses, and drawing quality, respectively (Harter et al., 1999); on average, two additional designs were produced at 1-month retesting. Chan (2001) documented interrater reliabilities of 0.74-0.98. Similarly, Varney et al. (1996) reported 90% agreement between two independent raters on scoring the free condition. Harter et al. (1999) observed interrater reliabilities of 1.00 and 0.98 for novel designs, number of perseverations, and number of scribbles as well as generally high correlations for additional scores reflecting complexity (0.97-0.98), number of concrete responses (0.81-0.90), and number of variations (0.77), although ratings of drawing quality were lower (0.46-0.59). Carter et al. (1998) reported interrater agreements of 0.77 and 0.66 for novel output, 0.82 and 0.70 for perseverative errors, and 0.50 and 0.47 for nameable errors, respectively, for the free and fixed conditions. In addition, an interrater agreement of 0.42
was noted for incorrect number of lines on the fixed condition. However, the agreement coefficients reported by Woodard et al. (1992), based on ratings by graduate students in clinical neuropsychology, were somewhat lower, with values of 0.64 and 0.71 for novel designs, 0.64 and 0.24 for nameable responses, and 0.57 and 0.41 for perseverative responses in the free and fixed conditions, respectively, and 0.68 for wrong number of lines in the fixed condition. The data from Carter and colleagues were based on the stricter scoring criteria of Jones-Gotman, as summarized in Spreen and Strauss (1998).
RELATIONSHIP BETWEEN DESIGN FLUENCY PERFORMANCE AND DEMOGRAPHIC FACTORS
Relatively sparse literature exists on design fluency measures. Only a few studies examining the influence of demographic factors on RFFT performance could be located. In the original normative study, Ruff et al. (1987) found that age and education significantly affect RFFT performance. They documented that unique design production decreased as a function of age, particularly in those with less than a college education. Similar age and education effects were noted for perseverative errors. Interestingly, the authors reported no overall gender differences for design production but revealed a trend toward men committing fewer perseverative errors than women in two subgroups: 40-55 years of age with ≤12 years of education and 55-70 years of age with ≥16 years of education. Overall, the authors concluded that these findings were not robust enough to suggest significant gender differences. Demakis and Harrison (1997) and Ross et al. (2003) also failed to demonstrate gender differences in their samples of college students. Ruff et al. (1987) found no relationship between motor speed or verbal fluency and RFFT performance but did note a significant correlation between Performance IQ (WAIS-R PIQ) and RFFT total design production. All normative studies appear to have used participants in the United States.
Regarding the Jones-Gotman/Milner version, Varney et al. (1996) observed no significant relationship between number of designs generated and age, educational level, or IQ and no differences in performance between men and women. Similarly, Daigneault et al. (1992) found no difference between young and middle-aged adults in design fluency; however, Mittenberg et al. (1989) reported a significant association between age and design fluency performance across a larger age range. Demakis and Harrison (1997) and Mataix-Cols et al. (1999) found no effect of gender on design generation; however, Harter et al. (1999) did demonstrate more novel designs produced by college males compared to college females, although they note that their sample of men was small. These authors also failed to document any effects of age or education on test performance. Interestingly, they observed worse performance in left-handers, in terms of perseverations and scribbled responses, who also performed more poorly on additional measures of design quality. A minimally significant relationship with Performance IQ was observed. Design fluency data have been obtained on participants in the United States as well as Spain (Mataix-Cols et al., 1999) and England (Abrahams et al., 2000).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the design fluency normative reports, six criterion variables were deemed critical. The first five of these are related to subject variables, and the last one refers to procedural issues.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if intellectual level, gender, geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, or recruitment procedures are relevant. Until determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. A significant age effect has been reported for the RFFT, although a relationship between age and the Jones-Gotman/Milner version has not been documented. Given the probable relationship between age and design fluency performance, presentation of data by age groupings is recommended for these tasks.
Reporting of Educational Levels
Educational level has been linked to RFFT scores but not to the Jones-Gotman/Milner version; given the probable relationship between design fluency and education, information regarding educational level should be reported for each subgroup.
Reporting of IQ Information
IQ level (especially PIQ) has been associated with RFFT performance, although no such link has been documented for the Jones-Gotman/Milner version; given the evidence of a relationship between IQ and design fluency performance, information regarding IQ level should be reported.
Procedural Variable
Data Reporting
Test version should be indicated. For the RFFT, group means and standard deviations for the total number of unique designs across five parts of the test should be presented, and preferably data on perseverations should be included. For the Jones-Gotman/Milner version, group means and SDs should be provided for the 5-minute unstructured task and/or the 4-minute structured task.
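The evaluation criteria above can be read as a checklist applied to each normative report. The following sketch shows one way such a checklist might be encoded; the class, field names, messages, and example report are hypothetical, and only the desirable sample size of 50 comes from the text. It is an illustration, not the procedure the authors used to rate studies.

```python
from dataclasses import dataclass

MIN_SAMPLE_SIZE = 50  # "desirable" sample size stated in the text


@dataclass
class NormativeReport:
    """Hypothetical summary of what a normative report provides."""
    label: str
    n: int
    exclusion_criteria_reported: bool
    age_grouped: bool
    education_reported: bool
    iq_reported: bool
    means_and_sds_reported: bool


def unmet_criteria(report: NormativeReport) -> list:
    """Return a message for each criterion the report does not satisfy."""
    checks = [
        (report.n >= MIN_SAMPLE_SIZE, "sample size below 50"),
        (report.exclusion_criteria_reported, "exclusion criteria not described"),
        (report.age_grouped, "data not grouped by age"),
        (report.education_reported, "educational level not reported"),
        (report.iq_reported, "IQ level not reported"),
        (report.means_and_sds_reported, "means and SDs not reported"),
    ]
    return [message for satisfied, message in checks if not satisfied]


# Hypothetical report used only to exercise the checklist.
example_report = NormativeReport(
    label="hypothetical study",
    n=21,
    exclusion_criteria_reported=False,
    age_grouped=False,
    education_reported=True,
    iq_reported=False,
    means_and_sds_reported=True,
)
print(unmet_criteria(example_report))
```

Returning the unmet criteria as messages, rather than a single pass/fail flag, mirrors the way each study summary lists its considerations separately from its strengths.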
SUMMARY OF THE STATUS OF THE NORMS
The RFFT
Only one study was designed to provide normative information on the RFFT: Ruff et al. (1987) presented data for 358 healthy individuals ranging widely in age and educational level. Information on age is provided in all studies, although only Ruff et al. (1987) stratify on age. Similarly, all studies report information on educational level, although again only Ruff et al. (1987) stratify on education. All studies had at least 50 participants with the exception of Demakis (1999). Only Ruff et al. (1986b) did not provide information on gender. Exclusion criteria are specified in all but three studies (Berning et al., 1998; Demakis, 1999; Ruff et al., 1986b). With the exception of one study (Berning et al., 1998), scores for the five RFFT parts are combined. Information regarding geographic area is indicated in all but one study (Demakis & Harrison, 1997), and recruitment strategies are specified in only three studies (Berning et al., 1998; Demakis, 1999; Ross et al., 2003).
Jones-Gotman/Milner Version
Only five studies report normative data for the free and fixed conditions separately (Abrahams et al., 2000; Carter et al., 1998; Demakis & Harrison, 1997; Mataix-Cols et al., 1999; Woodard et al., 1992), while three studies collapse the data (Daigneault et al., 1992; Harter et al., 1999; Rapport et al., 2001), three studies provide information for only the free trial (Boone et al., 1991, 2001; Varney et al., 1996), and one study documents data for only the fixed trial (Beatty et al., 1993). Five studies provide data on samples of <50 (Abrahams et al., 2000; Beatty et al., 1993; Boone et al., 1991, 2001; Mataix-Cols et al., 1999). Data regarding age are provided in every study, and a full age spectrum is represented (i.e., young adult to elderly); however, only one study stratified by age (Daigneault et al., 1992), although four studies recruited college students (Carter et al., 1998; Demakis & Harrison, 1997;
Harter et al., 1999; Mataix-Cols et al., 1999), which likely represents a relatively restricted age range. All studies provide information on educational level, with most studies reporting on individuals with >13 years of education (or in college at the time of testing). Gender distribution is reported in every study. Five studies provide data on IQ level and reveal average to high average intellectual levels. Seven investigations reported geographic area, with six studies indicating actual recruitment strategies. Ethnic makeup is not specifically reported for any study. Exclusion criteria were judged to be at least minimally adequate in all studies. In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. Studies using the RFFT are reviewed first, followed by those using the Jones-Gotman/Milner version. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 16. Table A16.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES
RFFT Version
[RFFT.1] Ruff, Evans, and Marshall, 1986b (RFFT Version) (Table A16.2)
The authors compared the performance of moderately and severely head-injured patients to that of 50 normal control volunteers. The gender composition of the control group is not provided. Participants were an average of 28.2 (8.8) years of age, with an average of 13.2 (1.7) years of education. From the general description provided by the authors, it can be assumed that the control participants were recruited from the San Diego, California, community. No exclusion criteria are provided. Standard test procedures were used. Means and SDs for total unique designs drawn, number of perseverative errors, and error ratios are reported.
¹ Norms for children and adolescents are available in Baron (2004) and Spreen and Strauss (1998).
Study strengths
1. Adequate sample size.
2. The sample composition is well described in terms of age and education.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Exclusion criteria are not described.
2. Recruitment procedures and gender are not reported.
3. The data are not partitioned by age.
[RFFT.2] Ruff, Light, and Evans, 1987 (RFFT Version) (Table A16.3)
These authors collected data on 358 (161 male, 197 female) individuals aged 16-70 years with education of 7-22 years. Participants were recruited from various parts of the United States, with approximately 65% from California, 30% from Michigan, and 5% from the eastern seaboard. Participants were excluded if they had a history of psychiatric hospitalization, chronic polydrug abuse, or neurological disorder. The data are stratified by four age groups (16-24, 25-39, 40-54, 55-70 years) and three educational levels (≤12, 13-15, ≥16 years) but not by gender since no gender effects were found. Standard test procedures were used. Based on the data from this study, a professional manual was published by Psychological Assessment Resources (Ruff & Allen, 1996) in which tables converting raw scores into T scores and percentiles are provided for the different age groups.
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, gender, and geographic area.
3. The data are partitioned by age and education.
4. Adequate exclusion criteria.
5. Test administration procedures are specified.
6. Means and SDs for the test scores are reported.
Consideration regarding use of the study
1. Recruitment procedures are not reported.
Other comments
1. Frequency count of educational levels is reported.
2. The authors removed 20 outlier perseverative scores (i.e., scores ≥2 SD) to approximate a normal distribution in scores for the normative table.
3. While no overall gender effects were found, in two of the subgroups (44-55 years old with ≤12 years of education and 55-70 years old with ≥16 years of education) women committed more errors than men.
[RFFT.3] Demakis and Harrison, 1997 (RFFT Version) (Table A16.4)
This study examined the relationship between three fluency tests (one verbal and two nonverbal). The authors recruited 134 (61 male, 73 female) college students. Participants were an average of 19.1 (2.0) years of age, but there is no information regarding educational level. None of the participants had learning disabilities and all had been screened for current or past neurological or psychiatric disease. Standard test procedures were used. Means and SDs for total number of novel designs are reported.
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age and gender.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Educational level of the sample is not reported.
2. Recruitment procedures are not reported.
Other comments
1. The authors found no gender differences; thus, the data were not partitioned by gender.
[RFFT.4] Fama, Sullivan, Shear, Cahn-Weiner, Yesavage, Tinklenberg, and Pfefferbaum, 1998 (RFFT Version) (Table A16.5)
This study examined the verbal and nonverbal fluency performance of patients with Parkinson's disease. A total of 51 normal controls were included. Control participants ranged in age from 52 to 85 years, with an average of 66.7 (7.4) years. Participants had an average of 16.4 (2.3) years of education and an estimated premorbid IQ of 115.6 (5.9). A subset of controls was recruited from the Palo Alto, California, community via "word-of-mouth." Additional data were selected from an archival database collected in the scope of previous studies (referenced in the article). Participants who were recruited from Palo Alto were paid, but no information on the incentive for participation is provided for those selected from the archival database. Participants were excluded from the study if they had a significant history of psychiatric disorder, neurological illness, past or present alcohol or substance abuse, or other "serious medical conditions," based on an interview and medical examination. Standard test procedures were used. Means and SDs for total number of unique designs drawn are reported.
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, education, and estimated IQ.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Broad age ranges are used, and the data are not partitioned by age.
2. Recruitment procedures are not reported for half of the sample, but reference is made to another study.
3. Educational and estimated IQ levels for the sample are high.
Other comments
1. Demographic information is reported for 51 participants, but the RFFT norms are based on 50 participants.
[RFFT.5] Berning, Weed, and Aloia, 1998 (RFFT Version) (Table A16.6)
This study examined the interrater reliability of the RFFT. Participants were 124 undergraduate (34 male, 90 female) students recruited from introductory psychology courses at the University of Mississippi. Participants ranged in age from 18 to 31 years, with a median of 20 years. They were given course credit for their participation. No exclusion criteria are mentioned. Standard test procedures were used. The number of unique designs and perseverative errors as well as error ratios are presented for each of the five parts and for the whole test (i.e., total scores).
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of age, gender, recruitment procedures, and geographic area.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Exclusion criteria are not described.
2. Educational levels are not reported.
3. While adequate interrater reliability was found, the RFFT results were scored by the participants themselves.
[RFFT.6] Demakis, 1999 (RFFT Version) (Table A16.7)
Response consistency and test-retest reliability were examined in malingering simulators and controls for various neuropsychological tests. Normal controls consisted of 21 undergraduate psychology students (67% female) from a small Midwestern liberal arts college. Participants were an average of 22.5 (7.99) years of age and had an average of 13.6 (1.46) years of education. Standard test procedures were used. There was an approximately 3-week interval between the initial testing and retesting. Participants were paid $10. No exclusion criteria are provided.
Study strengths
1. The sample composition is well described in terms of age, gender, education, geographic area, and recruitment procedures.
2. Test administration procedures are specified.
3. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Sample size is small.
2. No exclusion criteria are reported.
Other comments
1. The study provides data for RFFT practice effects.
[RFFT.7] Ross, Foard, Hiott, and Vincent, 2003 (RFFT Version) (Table A16.8)
RFFT data were collected on 90 college students (55% female; 89.8% right-handed) recruited from introductory psychology courses at an urban, Midwestern university; 44% were Caucasian, 30% were African American, 9% were Hispanic, and 6% were designated "other." Ages ranged from 18 to 69 years, with a mean of 23.9 (7.3), and mean estimated IQ (North American Adult Reading Test [NAART]) was 108.1 (9.2). Exclusion criteria were history of neurological disorder, learning disability, or psychiatric conditions involving medication usage. Standard test procedures were used. Participants received course credit for their participation. Forty-eight participants were retested an average of 35.2 days (range 28-51) later. Means and SDs are reported for numbers of unique designs and perseverations, error ratios, as well as five production strategy scores (rotational strategies, enumerative strategies, total strategic clusters, mean cluster size, and percent designs in strategies).
Study strengths
1. Large sample size.
2. The sample composition is well described in terms of gender, age, estimated IQ, geographic area, recruitment procedures, and ethnicity.
3. Means and SDs are reported for standard scores as well as for five additional strategy scores.
4. Adequate exclusion criteria.
Consideration regarding the use of the study
1. No stratification by age, although the sample would generally represent a narrow age range (college students).
Other comments
1. Data provided on practice effects.
Jones-Gotman/Milner Version
[Design Fluency.1] Boone, Ananth, Philpott, Kaur, and Djenderedjian, 1991 (Jones-Gotman/Milner Version) (Table A16.9)
Design fluency data were collected on 16 controls (nine women, seven men) recruited in southern California through newspaper advertisements and from siblings of patients with OCD as a part of a study on the cognitive characteristics of OCD. Exclusion criteria were history of alcohol or drug abuse, head injury, seizure disorder, cerebral vascular disease or stroke, psychiatric disorder, or any renal, hepatic, or pulmonary disease. Mean age was 35.8 (13.7) years, mean education was 15.2 (2.8) years, and mean WAIS-R (Satz-Mogel) FSIQ was 109.1 (10.9). Means and SDs for number of novel designs generated are reported.
Study strengths
1. The sample composition is well described in terms of age, education, gender, IQ, geographic area, and recruitment strategies.
2. Adequate exclusion criteria.
3. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Small sample size.
2. Data not stratified.
3. Data only reported for the free trial.
[Design Fluency.2] Woodard, Axelrod, and Henry, 1992 (Jones-Gotman/Milner Version) (Table A16.10)
Design fluency scores were gathered on 80 volunteers (35 men, 45 women) as part of a study examining neuropsychological function in older individuals. Exclusion criteria were dementia, current or past neurological illness or injury, substance abuse, drug use, and history of psychiatric disorder. Average age was 69.4 (10.6) years, with a range of 51.5-89.6 years, and average education was 14.6 (3.4) years. Protocols were independently scored by two advanced clinical psychology graduate students who had extensive training in clinical neuropsychology; however, neither rater had experience with the Design Fluency task prior to this study. Each rater based his or her ratings on a reading of the scoring criteria of Jones-Gotman and Milner (1977; Jones-Gotman, personal communication, 1984). Coefficients of agreement were 0.64 and 0.71 for novel designs, 0.64 and 0.24 for nameable responses, and 0.57 and 0.41 for perseverative responses in the free and fixed conditions, respectively, and 0.68 for wrong number of lines in the fixed condition. Means and SDs for total, novel, nameable, and perseverative responses are provided for the free and fixed conditions separately, as well as for wrong number of lines in the fixed condition.
Study strengths
1. Adequate exclusion criteria.
2. Adequate sample size.
3. The sample composition is well described in terms of gender and educational level.
4. Means and SDs for various scores in the free and fixed conditions are reported.
Considerations regarding use of the study
1. Information provided regarding age, but data are not stratified by age.
2. No information regarding recruitment strategies or IQ level (although mean scaled score [10.0] for the Vocabulary subtest of the WAIS-R is reported).
[Design Fluency.3] Daigneault, Braun, and Whitaker, 1992 (Jones-Gotman/Milner Version) (Table A16.11)
Design fluency data were obtained on 128 French-speaking participants in Canada as part of a study investigating the effects of aging on prefrontal lobe skills. Participants were recruited through ads, trade union collaboration, and the help of a large sports center. Exclusion criteria were consumption of more than 24 beers, five bottles of wine, or 15 oz of spirits per week; consumption of cocaine, LSD, or psychostimulants; or any neurological or psychiatric consultation, psychoactive medication, head trauma with hospitalization, or major surgery (e.g., cardiac). Participants were divided into two age groupings: 20-35 years (mean = 27.71, SD = 4.05 years, n = 70) and 45-65 years (mean = 56.62, SD = 5.29 years, n = 58). The younger group contained 38 men and 32 women; they were primarily specialized blue-collar workers, although some specialized white-collar and unskilled blue-collar professions were represented. The older group contained 30 men and 28 women; slightly more than half were specialized blue-collar workers, but some unskilled blue-collar professions, specialized white-collar occupations, and professional occupations were represented. The mean educational level of the younger group was 12.36 (2.09) years, and that of the older group was 12.11 (3.63) years. Means and SDs for total number of different correct drawings are reported. In addition, means and SDs are provided for perseverative errors, defined as: (1) identical drawings; (2) none of the parts of the designs differed, in angle or dimension, by >100%; or (3) the only difference between two designs was a rotation, a mirror representation, or a change in the global dimension. Perseverations were scored by two assistants, and the average ratings are reported. No significant differences in performance between the two groups were detected.
Study strengths
1. Good exclusion criteria.
2. Large overall sample size, and each of the two age groupings has >50 participants.
3. The sample composition is well described in terms of educational level, gender, occupation, geographic area, and recruitment procedures.
4. Means and SDs for total number of correct designs and perseverations are provided.
Considerations regarding use of the study
1. Data were obtained on French-speaking participants in Canada, and thus it is unclear whether they are appropriate for clinical interpretation among English-speaking participants in the United States.
2. No information regarding IQ (although mean scores on the Vocabulary subtest of the French-language WAIS analog are reported).
3. Data are not provided for fixed and free conditions separately.
[Design Fluency.4] Beatty, Jocic, Monson, and Staton, 1993 (Jones-Gotman/Milner Version) (Table A16.12)
Data on the 4-minute fixed trial were obtained on 20 controls (9 male, 11 female) as part of a study on memory and executive function in schizophrenia and schizoaffective disorder. Mean age was 34.7 (7.7) years, and mean education was 13.6 (1.4) years. Nineteen of the participants were working or were students. Exclusion criteria were history of central nervous system disease or injury, major medical illness, major psychiatric disorder, or current alcohol or drug abuse (one apparently had a past history of substance abuse). The authors indicate that an arc, a straight line, and a circle each counted as one line. Participants were told that the lines did not need to touch and that the drawings did not have to represent a real object. Means and SDs are reported for number of designs generated and number of rule violations (i.e., drawings containing fewer or more than four lines).
Study strengths 1. Procedure well specified. 2. Means and SDs reported for number of designs and number of rule violations. 3. The sample composition is well described in terms of age, education, and gender. 4. Good exclusion criteria.
Considerations regarding use of the study 1. Small sample size. 2. No information regarding IQ or recruitment strategy. 3. Data available only for the fixed trial.
PERCEPTUAL ORGANIZATION: VISUOSPATIAL AND TACTILE
[Design Fluency.5] Varney, Roberts, Struchen, Hanson, Franzen, and Connell, 1996 (Jones-Gotman/Milner Version) (Table A16.13)
Data on the 5-minute free condition were collected on 87 volunteers (28 males, 59 females) with no history of neurological or psychiatric illness, loss of consciousness due to head trauma, or severe febrile illness. Mean age was 27.7 (13.1) years, with a range of 18-77, and mean education was 14.4 (2) years, with a range of 12-21. Means and SDs for number of novel designs produced are reported. The agreement rate between two independent raters was 90%. Ninety-five percent of participants produced one or fewer nameable designs, 95% made three or fewer repeated designs, and scribbling errors were rare. No significant relationships between number of designs generated and age or education and no significant differences in performance between men and women were observed.
Study strengths 1. Large overall sample size. 2. The sample composition is well described in terms of age, education, and gender. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are not stratified. 2. Data available only for the free condition. 3. No information regarding IQ, ethnicity, geographic area, or recruitment procedures.
[Design Fluency.6] Demakis and Harrison, 1997 (Jones-Gotman/Milner Version) (Table A16.14)
This study examined the relationship between three fluency tests (one verbal and two nonverbal). The authors recruited 134 (61 male, 73 female) college students, who averaged 19.1 (2.0) years of age. The order of test administration was counterbalanced so that each test was administered first, second, and third the same number of times. None of the participants had learning disabilities or had been screened for current or past neurological or psychiatric disease. Means and SDs for the free and fixed conditions and total score are reported.
Study strengths 1. Relatively large sample size. 2. The sample composition is well described in terms of age and gender. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Educational level of the sample is not reported. 2. Recruitment procedures are not reported.
Other comments 1. The authors found no gender differences; thus, the data were not partitioned by gender.
[Design Fluency.7] Carter, Shore, Harnadek, and Kubu, 1998 (Jones-Gotman/Milner Version) (Table A16.15)
Sixty-six participants (19 males, 47 females), primarily undergraduates in Ontario, Canada, were tested. Inclusion criteria were age 18-60, right-handed, English as first or main language, FSIQ > 79, and no significant neurological, systemic, or psychiatric illness. Mean age was 25.06 (7.83) years, with a range of 19-56; mean education was 15.21 (1.60) years; and mean WAIS-R FSIQ estimate was 100.85 (11.07), with a range of 81-124. Participants were administered the free and fixed conditions per the instructions of Jones-Gotman (in Spreen & Strauss, 1998). Means and SDs for total designs, novel output, perseverative errors, and nameable errors in the free and fixed conditions separately are reported as well as for incorrect number of lines in the fixed condition.
Study strengths 1. The sample composition is well described in terms of age, education, gender, apparent recruitment strategy, FSIQ, language, and geographic area. 2. Adequate exclusion criteria. 3. Means and SDs using the Jones-Gotman (in Spreen & Strauss, 1998) scoring criteria are reported.
Considerations regarding use of the study 1. Data are not stratified by age, although a relatively narrow age range is assumed. 2. High educational level but average IQ.
Other comments 1. Information regarding interrater reliability is reported.
[Design Fluency.8] Harter, Hart, and Harter, 1999 (Jones-Gotman/Milner Version) (Table A16.16)
Design fluency data were obtained on 64 college students (91% female) apparently enrolled at Texas Tech University as part of a study on expanded scoring criteria for the Jones-Gotman/Milner design fluency version. Ages ranged 17-58 years, with an average of 20 years. Eighty-one percent of participants reported no history of neurological disorder; 11 reported history of a blow to the head resulting in loss of consciousness, although only one subject alleged continuing difficulties related to the injury (headaches and reading comprehension problems), and three reported recent signs of possible neurological dysfunction, including seizures, dizziness, fainting, and memory difficulties. Eighty-six percent of the sample were right-handed. Means and SDs for number of novel designs, scribbles, and perseverations as well as additional scores reflecting complexity, number of concrete responses, and drawing quality are provided both for the sample as a whole and with the 12 students with possible neurological dysfunction excluded.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, geographic area, handedness, and recruitment procedure. 3. Means and SDs are reported for several scores.
Considerations regarding use of the study 1. Data are not stratified, although it can be assumed that the sample had a relatively narrow age range (college students). 2. No information regarding IQ. 3. Data are not provided for the free and fixed conditions separately.
Other comments 1. Data regarding interrater reliability and test-retest reliability are provided.
[Design Fluency.9] Mataix-Cols, Barrios, Sanchez-Turet, Vallejo, and Junque, 1999 (Jones-Gotman/Milner Version) (Table A16.17)
Data were collected on 27 (23 women, 4 men) undergraduates in Barcelona, Spain, as part of a study of the impact of subclinical obsessive-compulsive symptoms on design fluency performance. Mean age was 19.1 (1.3) years, and one subject was left-handed. An exclusion criterion was history of a psychiatric disorder. Means and SDs for the free and fixed conditions are reported.
Study strengths 1. The sample composition is well described in terms of age, educational status, gender, geographic area, and recruitment strategy (ethnicity assumed Spanish). 2. Means and SDs for the test scores are reported. 3. Minimally adequate exclusion criteria.
Considerations regarding use of the study 1. Small sample size. 2. No information regarding IQ. 3. Data were collected in Spain, which may limit their usefulness for clinical comparison in the United States.
[Design Fluency.10] Abrahams, Leigh, Harvey, Vythelingum, Grise, and Goldstein, 2000 (Jones-Gotman/Milner Version) (Table A16.18)
Design Fluency performance was measured in 25 controls (16 males, 9 females) as part of a study of cognition in ALS. Mean age was 55.8 (11.6) years, mean education was 14.1 (3.1) years, and mean NART FSIQ was 114.6 (9.9).
Controls were recruited in London from a local volunteer organization, a local e
Study strengths 1. The sample composition is well described in terms of age, education, FSIQ, handedness, gender, geographic area, and recruitment strategies. 2. Minimally adequate recruitment strategy. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small sample size. 2. Data were collected in England, which may limit their usefulness for clinical comparison in the United States. 3. Data are not stratified.
[Design Fluency.11] Boone, Swerdloff, Miller, Geschwind, Razani, Lee, Gaw Gonzalo, Haddad, Rankin, Lu, and Paul, 2001 (Jones-Gotman/Milner Version) (Table A16.19)
Design Fluency performance was assessed in 22 male controls as part of a study on neuropsychological function in adult Klinefelter's syndrome. Participants were recruited from newspaper and radio ads and flyers in the southern California area and paid for their participation. Exclusion criteria were history of learning disability, major psychiatric disorder, substance abuse, or neurological disorder. All participants were fluent in English. Mean age was 34.32 (14.81) years, mean education was 13.36 (2.15) years, mean WAIS-R (Satz-Mogel) VIQ was 106.46 (17.01), and mean PIQ was 107.46 (16.58). Means and SDs for number of novel designs in the free condition are reported.
Study strengths 1. Information regarding age, education, gender, IQ, geographic area, language, and recruitment procedures is reported.
2. Adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are not stratified by age. 2. Small sample size. 3. All male sample. 4. Data are provided only for the free condition.
CONCLUSIONS
The relatively few studies conducted on design fluency tests have generally shown these tasks to be sensitive to frontal lobe impairment. The two different tests selected for review in this chapter, the RFFT and the Jones-Gotman/Milner design fluency task, have unique advantages and disadvantages. The RFFT is a structured test, which is relatively easy to administer and score. This test demonstrates adequate interrater and test-retest reliability. However, criticisms have been raised that, due to its structured format, the RFFT may not measure initiation and organization skills in patients with frontal dysfunction as well as the Jones-Gotman/Milner version does. On the other hand, the influence of design complexity on production rate and questions regarding what constitutes a perseverative error vs. a unique design make clinical use of the Jones-Gotman/Milner version somewhat problematic. A review of the literature for design fluency tests found relatively sparse normative studies. For the RFFT, no studies reported gender effects, but effects of age and education were documented. In general, unique design production decreases and perseverative error rates increase with advanced age, particularly in individuals with less than a college education. Additionally, one study (Ruff et al., 1987) found that total design production increased with higher Performance IQ scores. While normative data and conversion tables have been created for a wide age range in the RFFT manual (Ruff, 1996), it is clear that additional normative studies using larger sample sizes are needed. Significant practice effects have been reported for the RFFT,
particularly when the test-retest interval is 5 weeks or less. For the Jones-Gotman/Milner version, mixed results have been reported regarding the influence of age, education, and gender. Most studies report no effects of education (Harter et al., 1999; Mittenberg et al., 1989; Varney et al., 1996). Some studies have found no age effects (Daigneault et al., 1992; Harter et al., 1999; Varney et al., 1996), while others have reported decreased design fluency production with advancing age (Mittenberg et al., 1989). Similarly, some studies have reported no differences in design production between males and females (Mataix-Cols et al., 1999; Varney et al., 1996), while others have reported significantly greater production of novel designs by males relative to females (Harter et al., 1999). A minimal relationship between Performance IQ and design production has also been reported (Harter et al., 1999).
Meta-analyses were not performed as sufficient amounts of homogeneous data were not available for either version of the test.
17 Tactual Performance Test
BRIEF HISTORY OF THE TEST
The Tactual Performance Test (TPT) is based on the Seguin-Goddard Formboard. It was incorporated by Halstead (1947) in his original neuropsychological battery and subsequently employed by Reitan (1979) in his expansion of the Halstead Battery. Participants are blindfolded, and 10 blocks of differing shapes and a matching 10-hole formboard are placed in front of them. They are instructed to insert the blocks in the board as quickly as they can with their dominant hand only. Following this trial, they are required to place the blocks with their nondominant hand and then both hands together. The blocks and formboard are removed, followed by the blindfold, and the participants are asked to draw the formboard and as many of the block shapes as they can remember in their relative location. Specific instructions are provided in Reitan's (1979) Manual for Administration of Neuropsychological Test Batteries for Adults and Children, Reitan and Wolfson's (1985) The Halstead-Reitan Neuropsychological Test Battery, and Swiercinsky's (1978) Manual for the Adult Neuropsychological Evaluation. Thompson and Parsons (1985) summarize the literature through 1983 on the status of the TPT in terms of test construction, effect of subject variables on performance, interpretation of
scores regarding lateralization and localization, and test performance in specific patient populations; the interested reader is referred to this publication. The TPT involves several different abilities, including tactile perceptual skills, propriospatial ability, tactile/spatial memory, and visual constructional skills. Deficits in TPT performance have been associated with frontal lesions (Halstead, 1947; Shure & Halstead, 1958), posterior dysfunction (Reitan, 1964), and right hemisphere disturbance (Reitan, 1964; Schreiber et al., 1976). Some reports have indicated that increased right-hand time is associated with left hemisphere damage and increased left-hand time is tied to right hemisphere dysfunction (Dodrill, 1978a; Reitan, 1964), but no consistent lateralization findings have been documented for the memory and localization scores (Heilbronner et al., 1991; Thompson & Parsons, 1985). Of interest, normal right-handed males show a left-hand advantage on time and memory scores when the left hand is tested first, suggesting an enhanced role of the right hemisphere in TPT performance (Jenkins & Parsons, 1989). TPT performance has typically been reported in terms of time to complete the task with the preferred hand, nonpreferred hand, both hands, and total time, as well as number of blocks correctly recalled and located.
Reitan (1979) recommends Halstead's (1947) cutoff of 15.6 minutes for total time, five or fewer blocks recalled, and four or fewer blocks located. However, given the significant association of TPT scores with age, IQ, and possibly education and gender, a single cutoff would not appear to be appropriate, particularly in older participants, as documented by the following reports. Price et al. (1980) found that using Halstead's (1947) cutoffs for total time, memory, and localization, 89.8%, 12.2%, and 77.6%, respectively, of a healthy, elderly sample with a mean age of 71.9 were misclassified as brain-damaged. Similarly, Ernst (1987) documented in his sample aged 65-75 years misclassification rates of 77%, 36%, and 89% for total, memory, and localization scores, respectively. Bak and Greene (1980) reported that 40% of their healthy sample aged 50-62 were misclassified on total time. The mean scores of this group fell below the cutoffs for memory and localization, and the mean scores of an older group (67-86) did not surpass the cutoff for any of the TPT measures. Cauthen (1978a), studying a broader age range, documented 22%, 15%, and 53% misidentification rates for total time, memory, and localization in his sample aged 20-60 years; only 2% of 20-year-olds were misclassified on total time, while 63% of those 50-60 years old were miscategorized. Dodrill (1987) documented 21.7%, 5%, and 39.2% misclassification rates for total time, memory, and localization, respectively, in a young control sample. Chavez et al. (1982) noted that the mean localization score (4.87) of their male college students fell below the Reitan cutoff of 5. Bornstein and colleagues (1987b) emphasized that cutoff scores may be useful but only if considered in the context of other neuropsychological information obtained in a test battery and if age, education, and other appropriate adjustments are made. 
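The misclassification rates above all come from the same arithmetic: fixed cutoffs are applied to a healthy sample, and the percentage flagged as impaired is counted. A minimal sketch of that calculation follows. It assumes that a total time strictly greater than 15.6 minutes counts as impaired (the text does not state how a score exactly at the cutoff is treated), and the names `TPTScores`, `flags`, and `misclassification_rates` and all scores are hypothetical illustrations, not data from any cited study.

```python
from dataclasses import dataclass

# Halstead's (1947) cutoffs as quoted in the text. Treating a total time
# strictly greater than 15.6 minutes as "impaired" is an assumption.
TOTAL_TIME_CUTOFF_MIN = 15.6
MEMORY_CUTOFF = 5        # five or fewer blocks recalled
LOCALIZATION_CUTOFF = 4  # four or fewer blocks located


@dataclass
class TPTScores:
    total_time_min: float
    memory: int
    localization: int


def flags(s: TPTScores) -> dict:
    """Return which TPT scores fall in the impaired range under the cutoffs."""
    return {
        "total_time": s.total_time_min > TOTAL_TIME_CUTOFF_MIN,
        "memory": s.memory <= MEMORY_CUTOFF,
        "localization": s.localization <= LOCALIZATION_CUTOFF,
    }


def misclassification_rates(healthy: list) -> dict:
    """Percentage of healthy participants flagged as impaired on each score."""
    n = len(healthy)
    counts = {"total_time": 0, "memory": 0, "localization": 0}
    for s in healthy:
        for name, flagged in flags(s).items():
            counts[name] += int(flagged)
    return {name: 100.0 * c / n for name, c in counts.items()}


# Hypothetical healthy older adults (made-up scores for illustration only).
sample = [
    TPTScores(22.4, 7, 3),
    TPTScores(14.9, 6, 5),
    TPTScores(18.0, 5, 2),
    TPTScores(16.1, 8, 4),
]
rates = misclassification_rates(sample)
print(rates)
```

Running the same function on age-stratified subsamples would show how a single cutoff produces very different false-positive rates across age bands, which is the concern raised by the studies summarized above.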
Reitan and Wolfson (1985) and others (Golden et al., 1981b; Jarvis & Barth, 1984) have suggested that nondominant hand performance should be 30% faster than dominant hand performance, although several investigators have documented that a sizable percentage of normals show superior dominant hand performance (Cauthen, 1978a; Thompson &
Heaton, 1991) and that this percentage increases with age (Goldstein & Braun, 1974; Price et al., 1980; Thompson et al., 1987). Criticisms have emerged regarding the lack of standardization of some aspects of TPT administration, which could substantially influence the obtained test scores. Snow (1987) notes that Reitan recommends discontinuation of a time trial at 15 minutes if the patient is discouraged and not close to finishing the task but continuing if the patient is near completion, while other clinicians (e.g., Russell et al., 1970) routinely stop at 10 minutes. Snow (1987) points out that differing amounts of exposure time to the blocks could influence memory and localization scores. Snow (1987) also observes that no precise scoring criteria have been developed for the memory and localization scores; for example, some clinicians allow a four- or five-point star, while others give credit for a six-point star. Chavez et al. (1982) note that some clinicians follow a consistent order in placing the blocks before blindfolded participants while other examiners randomly arrange the blocks. They also report that the order of block presentation did not affect the time trials with their normal college student population but that a standardized block presentation format was associated with higher memory scores, and a trend was noted toward higher localization scores. Kupke (1983) observed better total time scores in his college sample using a portable version of the TPT rather than the standard equipment. In addition to the above concerns regarding TPT administration and interpretation, Lezak et al. (2004) point out that the TPT is very frustrating and stressful for patients due to its length and difficulty and question its practical utility. Also, because moderately to severely impaired individuals fail the test so completely, little information can be gleaned regarding extent of impairment in these participants (Russell, 1985). 
In an attempt to shorten and simplify the test, Lezak et al. (2004) have recommended substituting the six-block child's TPT (TPT-6) for the 10-block adult version. Russell (1985) reported that TPT-6 scores were highly correlated with the 10-block TPT and that the TPT-6 successfully discriminated
controls from brain-damaged participants. Also, test administration was shortened by two-thirds. Unfortunately, although these reports regarding the utility of substituting the six-block TPT for the 10-block version in adults are promising, the only "normative" data on the six-block TPT in adults are Clark and Klonoff's (1988) 79 coronary bypass surgery patients and Russell's (1985) 1.9 "controls" referred for suspected but unconfirmed brain dysfunction.
In spite of the above-mentioned liabilities, the TPT enjoys some popularity in clinical assessment. It was found to be one of the HRB tests which are most useful in the assessment of brain impairment (total time and location scores; Mutchnick et al., 1991) and to be associated with daily living skills in geriatric patients (Searight et al., 1989), driving ability in head-injured patients (Brooke et al., 1992), and failure to return to premorbid educational or vocational levels in brain tumor survivors (Hochberg & Slotnick, 1980). It was the most sensitive measure in the identification of brain dysfunction in blind patients (Bigler & Tucker, 1981). In addition, TPT performance is lower in verbal (Davis et al., 1989) and nonverbal (Harnadek & Rourke, 1994) learning-disabled individuals, patients with obsessive-compulsive disorder (Insel et al., 1983), inmates referred for psychiatric treatment (Young & Justice, 1998), and alcoholics (Gurling et al., 1991; Hesselbrock et al., 1985; Hochla et al., 1982; Loberg, 1980), especially left-hand performance (Jenkins & Parsons, 1981). Of interest, TPT scores were not impacted by chronic marijuana use in medical students (Rochford et al., 1977). Further discussion of TPT interpretation is provided by Bradford (1992).
Psychometric Properties of the TPT
Initial investigations of interrater reliability revealed moderate agreement for memory scores (71%-76.3%) and poor correspondence for localization scores (56.7%-63.8%; Martin & Greene, 1978); however, more recently Charter et al. (1998) reported interscorer reliabilities of 0.9810 for the memory trial and 0.9773 for the localization trial. Charter and colleagues have also addressed additional psychometric properties of the TPT, reporting internal consistency reliabilities of 0.6588 for the preferred hand, 0.8715 for the nonpreferred hand, 0.8199 for both hands, 0.8953 for total time, 0.69 for memory, and 0.79 for localization (Charter et al., 2000; Charter, 2001c). Reliabilities for the difference scores across trials have been poor (i.e., 0.0892 for preferred hand minus nonpreferred hand, 0.0838 for both hands minus preferred hand, and 0.2065 for both hands minus nonpreferred hand), although the blocks per minute reliabilities have been somewhat higher (i.e., 0.3122 for nonpreferred hand minus preferred hand, 0.6387 for both hands minus preferred hand, and 0.5058 for both hands minus nonpreferred hand) (Charter, 2001b). The test blocks are not of comparable difficulty; for example, on the Memory trial, the rhombus is the most difficult to recall for both patients and controls while the circle is the easiest (Charter & Dutra, 2001a,b). Three blocks from the Memory trial (diamond, oval, and rhombus) and one block from the localization trial (oval) obtained discrimination indexes <0.3, suggesting that these items do not discriminate well between good and poor performers. In addition, on the block placement trials, blocks at the top of the board are more difficult to correctly position than blocks on the bottom (Charter, 2000b). Clark and Klonoff (1988) examined the internal consistency, test-retest reliability, and construct validity of the TPT-6 and concluded that it was reliable and had construct validity characteristics comparable to those of the 10-block TPT.
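Low reliabilities for difference scores are what classical test theory predicts when two reasonably reliable but substantially correlated trials are subtracted from one another. A minimal sketch of that relationship follows. It uses the standard formula for the reliability of a difference between two scores with equal variances; this is an assumption for illustration, since the text does not state how the reported values were computed. The intercorrelations in the loop are hypothetical, and only the two component reliabilities come from the text.

```python
def difference_score_reliability(r_xx: float, r_yy: float, r_xy: float) -> float:
    """Classical reliability of X - Y, assuming X and Y have equal variances.

    r_xx, r_yy: reliabilities of the two component scores
    r_xy: correlation between the two component scores
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Internal consistency reliabilities quoted in the text for the
# preferred-hand and nonpreferred-hand trials.
R_PREFERRED = 0.6588
R_NONPREFERRED = 0.8715

# Hypothetical correlations between the two trials (not from the source).
for r_xy in (0.30, 0.50, 0.70):
    rel = difference_score_reliability(R_PREFERRED, R_NONPREFERRED, r_xy)
    print(f"r_xy={r_xy:.2f} -> difference-score reliability={rel:.3f}")
```

Under these assumptions, the reliability of the difference falls toward zero as the correlation between trials approaches the average of the two component reliabilities, which is one reason difference scores between hands warrant cautious interpretation.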
RELATIONSHIP BETWEEN TPT PERFORMANCE AND DEMOGRAPHIC FACTORS
Age has been consistently related to TPT performance in normal individuals (Bak & Greene, 1980; Cauthen, 1978a; Elias et al., 1993; Fromm-Auch & Yeudall, 1983; Heaton et al., 1986, 1991; Moore et al., 1984; Reed & Reitan, 1962, 1963b; Reitan, 1955d; Yeudall et al., 1987) as well as brain-damaged (Fitzhugh et al., 1964; Prigatano & Parsons, 1976; Reed & Reitan, 1962; Vega & Parsons, 1967), medical (Alekoumbides et al., 1987; Reed & Reitan, 1963b), and psychiatric (Alekoumbides et al., 1987; Ernst et al., 1987; Prigatano & Parsons, 1976) patients, with older age associated with poorer scores. To our knowledge, no published study has failed to document a correlation between TPT and age. Higher IQ also appears to be associated with better TPT performance in various groups, including psychiatric (Warner et al., 1987) and mentally retarded individuals (Matthews, 1974). IQ levels have also been associated with TPT scores in normals (Cauthen, 1978a; Wiens & Matarazzo, 1977), with PIQ showing more of a correlation than VIQ (Wiens & Matarazzo, 1977; Yeudall et al., 1987). The data on the relationship of education and gender with TPT indices have been more equivocal; taken as a whole, the available literature does not demonstrate a convincing relationship between these variables and TPT performance. Ernst (1987), Finlayson et al. (1977), and Yeudall et al. (1987) failed to detect a relationship between educational level and TPT scores in normals, while Heaton et al. (1986, 1991) observed significant differences across educational levels only for the memory score. No association has been detected between education and TPT performance in brain-damaged patients (Finlayson et al., 1977; Prigatano & Parsons, 1976; Vega & Parsons, 1967; Warner et al., 1987), although Vega and Parsons observed an effect of education on memory and timed scores in a medical patient sample and Alekoumbides et al. (1987) apparently detected a relationship between education and TPT scores in a medical and psychiatric sample.
Most publications suggest that there is no difference between males and females on TPT performance in normal participants (Dodrill, 1979; Elias et al., 1990; Filskov & Catanese, 1986; Fromm-Auch & Yeudall, 1983; King et al., 1978; Moore et al., 1984; Pauker, 1980; Thompson et al., 1987; Yeudall et al., 1987) and in patients (Dodrill, 1979). However, Heaton et al. (1986) reported that males outperformed females on total time, although the amount of variance in test scores accounted
for by gender was minimal (Heaton et al., 1991). Ernst (1987) documented better male performance on memory, localization, and all time scores except dominant hand, while Elias et al. (1993) also found a male superiority in memory scores. Conversely, other reports have suggested that women score better than men on memory (Fabian et al., 1981), localization (Chavez et al., 1982), or both scores (Gordon & O'Dell, 1983), although Gordon and O'Dell (1983) comment that the gender differences they observed were not of "practical significance." In the two studies addressing the relationship between lateral preference and test performance, no differences in test scores between right- and left-handers were documented (Gregory & Paul, 1980; Thompson et al., 1987). Arnold et al. (1994) documented a significant effect of acculturation (i.e., higher performance associated with higher acculturation level) on TPT dominant hand, nondominant hand, and total time, with no effect of acculturation on localization and memory scores.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located nine TPT normative reports for adults published since 1965 (Cauthen, 1978a; Dodrill, 1987; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Heaton et al., 1991; Pauker, 1980; Schear, 1986; Yeudall et al., 1987), as well as the original Halstead (1947) and Reitan (1955b, 1959) normative data and three interpretive guides for the HRB (Golden et al., 1981b; Reitan & Wolfson, 1985; Russell et al., 1970). Hundreds of other studies have also reported control subject data, and we have included discussion of 13 of those investigations which involved some unique feature, such as large sample size (~100), retest data, elderly population, non-English-speaking sample, and use of a shorter version of the test (Alekoumbides et al., 1987; Anthony et al., 1980; Bak & Greene, 1980; Bornstein et al., 1987a; Clark & Klonoff, 1988; El-Sheikh et al., 1987; Heaton et al., 1986; Klove & Lochen, cited in Klove, 1974; Matarazzo et al., 1974; Moore et al.,
1984; Russell, 1985; Thompson & Heaton, 1991; Wiens & Matarazzo, 1977). (We also found a study in which military personnel were administered the TPT during various field maneuvers of differing intensity, to evaluate the effect of environmental stress on TPT performance [Arima, 1965]. However, given the questionable relevance of this information for the typical neuropsychological testing session, we decided not to review this publication.) Russell and Starkey (1993) developed the Halstead-Russell Neuropsychological Evaluation System (HRNES), which includes 22 tests. In the context of this system, individual performance is compared to that of 576 brain-damaged participants and 200 participants who were initially suspected of having brain damage but had negative neurological findings. Data were partitioned into seven age groups and three educational/IQ levels. This study will not be reviewed in this chapter because the "normal" group consisted of V.A. patients who presented with symptoms requiring neuropsychological evaluation. For further discussion of the HRNES, see Lezak et al. (2004). Of note, few relevant manuscripts have emerged since the 1980s, perhaps due either to the publication of Heaton et al.'s (1991) comprehensive normative tables or to the increasing use in research and clinical practice of flexible neuropsychological test protocols which include newer tasks rather than traditional fixed neuropsychological batteries. To adequately evaluate the TPT normative reports, seven key criterion variables were deemed critical. The first six of these relate to subject variables, and the remaining one refers to procedural issues. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
As discussed in previous chapters, a minimum of 50 participants per grouping interval is optimal.
Sample Composition Description
As discussed previously, information regarding medical and psychiatric exclusion criteria is important; it is unclear if geographic recruitment region, socioeconomic status or occupation, ethnicity, handedness, and recruitment procedures are relevant. Until this is determined, it is best that this information be provided.
Age Group Intervals
Given the association between age and TPT performance, information regarding the age of the normative sample is critical and normative data should be presented by age intervals.
Reporting of IQ Levels
Given the relationship between TPT performance and IQ, data should be presented by IQ intervals, or at least information regarding intellectual levels should be provided.
Reporting of Educational Levels
Given the possible, although minor, association between educational level and TPT scores, it is preferable that information regarding highest educational level completed be reported.
Reporting of Gender Distribution
Given the possible, although minor, association between gender and TPT scores, it is preferable that information regarding gender be reported.
Procedural Variables
Data Reporting
Means and standard deviations, and preferably ranges, for time in seconds or minutes on the TPT for the dominant and nondominant hands separately and together and the total across all three trials should be reported, as well as means and SDs for memory and localization scores.
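The sample-size criterion (a minimum of 50 participants per grouping interval) lends itself to a mechanical check. A minimal sketch follows, assuming a normative report is summarized as a mapping from grouping-interval labels to participant counts. The function name `undersized_groups`, the report structure, and the counts are hypothetical and are not drawn from any study reviewed here.

```python
MIN_PER_GROUP = 50  # minimum participants per grouping interval, per the criterion


def undersized_groups(group_sizes: dict, minimum: int = MIN_PER_GROUP) -> list:
    """Return labels of grouping intervals with fewer than `minimum` participants."""
    return [label for label, n in group_sizes.items() if n < minimum]


# Hypothetical age-stratified report (counts are made up for illustration).
report = {"20-39": 62, "40-59": 51, "60-79": 34}
small = undersized_groups(report)
print("Meets criterion" if not small else f"Groups below {MIN_PER_GROUP}: {small}")
```

A report with a large total sample can still fail this check within individual strata, which is why the criterion is stated per grouping interval rather than for the sample as a whole.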
SUMMARY OF THE STATUS OF THE NORMS
All but 10 data sets had total sample sizes larger than 100 (Bak & Greene, 1980; Bornstein
et al., 1987a; Clark & Klonoff, 1988; El-Sheikh et al., 1987; Halstead, 1947; Klove & Lochen, cited in Klove, 1974; Reitan, 1955b, 1959; Russell, 1985; Wiens & Matarazzo, 1977). Only three publications consistently had at least 50 participants in individual subject groupings (Ernst, 1987; Heaton et al., 1986; Schear, 1984), although some reports had some subgroups which met this criterion (Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Pauker, 1980; Yeudall et al., 1987). Approximately half of the studies summarized in this chapter present TPT data according to circumscribed age ranges (Bak & Greene, 1980; Cauthen, 1978a; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Heaton et al., 1986, 1991; Moore et al., 1984; Pauker, 1980; Schear, 1984; Wiens & Matarazzo, 1977; Yeudall et al., 1987). Information on IQ levels is reported in all but seven studies (Barrett et al., 2001; El-Sheikh et al., 1987; Ernst, 1987; Heaton et al., 1986; Klove & Lochen, cited in Klove, 1974; Russell, 1985; Schear, 1984), and two reports presented TPT data for age-by-IQ groupings (Cauthen, 1978a; Pauker, 1980). Similarly, educational level was indicated in all but four studies (Bornstein et al., 1987a; Cauthen, 1978a; Moore et al., 1984; Pauker, 1980), and Heaton et al. (1986, 1991) organized data by educational levels. Information on gender composition of the samples was available in all but four reports (Anthony et al., 1980; Harley et al., 1980; Klove & Lochen, cited in Klove, 1974; Thompson & Heaton, 1991); five data sets included only male (Clark & Klonoff, 1988; Schear, 1984; Wiens & Matarazzo, 1977) or nearly all male (Alekoumbides et al., 1987; Russell, 1985) populations. Ernst (1987) and Heaton et al. (1991) presented data separately for males and females.
Information on other subject variables was provided less frequently; handedness was reported in eight studies (Bak & Greene, 1980; Cauthen, 1978a; Clark & Klonoff, 1988; Dodrill, 1987; Fromm-Auch & Yeudall, 1983; Russell, 1985; Schear, 1984; Yeudall et al., 1987), occupation or socioeconomic status was described in five reports (Alekoumbides et al., 1987; Dodrill, 1987; Halstead, 1947; Wiens & Matarazzo, 1977; Yeudall et al., 1987), and information regarding ethnicity was presented in four data sets (Alekoumbides et al., 1987; Dodrill, 1987; Halstead, 1947; Russell, 1985). Exclusion criteria were judged to be adequate in only 11 publications (Anthony et al., 1980; Bak & Greene, 1980; Bornstein et al., 1987a; Dodrill, 1987; Fromm-Auch & Yeudall, 1983; Heaton et al., 1991; Moore et al., 1984; Pauker, 1980; Thompson & Heaton, 1991; Wiens & Matarazzo, 1977; Yeudall et al., 1987). Geographic recruitment areas were specified in all but two publications (Bornstein et al., 1987a; Dodrill, 1987). Fourteen data sets were obtained in the United States (Alekoumbides et al., 1987; Anthony et al., 1980; Bak & Greene, 1980; Halstead, 1947; Harley et al., 1980; Heaton et al., 1986, 1991; Klove & Lochen, cited in Klove, 1974; Reitan, 1955b, 1959; Russell, 1985; Schear, 1984; Thompson & Heaton, 1991; Wiens & Matarazzo, 1977), six in Canada (Cauthen, 1978a; Clark & Klonoff, 1988; Fromm-Auch & Yeudall, 1983; Moore et al., 1984; Pauker, 1980; Yeudall et al., 1987), one in Norway (Klove & Lochen, cited in Klove, 1974), one in Egypt (El-Sheikh et al., 1987), and one in Australia (Ernst, 1987). Total mean time in seconds or minutes and SDs to complete the task across the three time trials as well as means and SDs for Memory and Localization were reported in 19 datasets (Alekoumbides et al., 1987; Anthony et al., 1980; Bak & Greene, 1980; Bornstein et al., 1987a; Cauthen, 1978a; Dodrill, 1987; El-Sheikh et al., 1987; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Heaton et al., 1986, 1991; Klove & Lochen, cited in Klove, 1974; Pauker, 1980; Russell, 1985; Schear, 1984; Thompson & Heaton, 1991; Wiens & Matarazzo, 1977; Yeudall et al., 1987).
Twelve reports provided data for the preferred and nonpreferred hands separately and together (Alekoumbides et al., 1987; Bak & Greene, 1980; Cauthen, 1978a; El-Sheikh et al., 1987; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Heaton et al., 1991; Russell, 1985; Schear, 1984; Thompson & Heaton, 1991; Wiens & Matarazzo, 1977; Yeudall et al., 1987). Three studies reported score ranges (Fromm-Auch & Yeudall, 1983; Halstead, 1947; Harley et al., 1980). Several publications reported supplementary TPT scores such as T-score equivalents (Harley et al., 1980); T scores corrected for age, education, and gender (Heaton et al., 1991); IQ-equivalent scores (Dodrill, 1987); test-retest data (Bornstein et al., 1987a; El-Sheikh et al., 1987; Matarazzo et al., 1974); and data for a six-block version (Clark & Klonoff, 1988; Russell, 1985). In addition, Harley et al. (1980) provide the percentage of patients correctly placing blocks, and Harley et al. (1980), Ernst (1987), and Schear (1984) report means and SDs for number of blocks correctly placed by each hand, both hands, and total. Alekoumbides et al. (1987) present means and SDs for number of blocks placed per minute, and Harley et al. (1980) report mean and SD for time to place each block for the three trials. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 17. Table A17.1, the locator table, summarizes information provided in the studies described in this chapter.1
SUMMARIES OF THE STUDIES
Given that the TPT has typically been used within the context of the HRB, the Halstead (1947) and Reitan (1955b, 1959) data and interpretation formats will be reported first, followed by a summary of the other interpretation formats. Then, the normative publications and control groups from clinical comparison studies will be reviewed in ascending chronological order.
1 Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
Original Studies
[TPT.1] Halstead, 1947 (Table A17.2)
The author obtained data on 28 control participants in Chicago, half of whom had psychiatric diagnoses. The 14 participants without psychiatric diagnoses were nine male and five female civilians aged 15-50 (mean = 25.9), without history of brain injury. The eight participants who carried diagnoses of mild psychoneurosis were male soldiers aged 22-38 (mean = 29.6); some had had combat experience but none had a history of head injury. The last six participants were aged 27-39 and included a depressed military prisoner facing execution, a severely depressed female with suicidal and homicidal impulses tested prior to lobotomy, and a suicidal/homicidal female and a suicidal male tested pre- and post-lobotomy. Educational level ranged from 7-18 years, and the following occupations were represented: artist, entertainer, farmer, housewife, semiskilled and unskilled laborers, professional, secretary, teacher, technician, trade, and student. Ethnic background included American, Balkan, English, French, German, Irish, Polish, Scandinavian, and Scottish. IQ levels ranged from 70-140. Mean total time for the three trials and mean scores for memory and localization are reported for the total group and each control subgroup, as well as the individual scores for each subject. The TPT criterion scores used in calculating the Impairment Index were >15.6 minutes for total time, fewer than six blocks recalled, and fewer than five blocks located.
Study strengths 1. Information provided regarding IQ, education, occupation, age, gender, ethnicity, and geographic recruitment area. Considerations regarding use of the study 1. Small sample size including use of two participants twice. 2. Inclusion of participants with psychiatric diagnoses and post-lobotomy. 3. No reporting of SDs. 4. Undifferentiated age range.
[TPT.2] Reitan, 1955b, 1959 (Table A17.3)
The author obtained TPT scores on 50 participants in Indiana who had apparently been referred for neuropsychological testing and "who had received neurological examinations before testing and showed no signs or symptoms of cerebral damage or dysfunction. . . . None . . . had positive anamnestic findings" (p. 29); but participants hospitalized with paraplegia and neurosis were included.
The sample included 35 men and 15 women, and mean age and educational level were 32.36 (10.78) and 11.58 (2.85), respectively. Mean WAIS VIQ, PIQ, and FSIQ were 110.82 (14.46), 112.18 (14.23), and 112.64 (14.28), respectively. Study strengths 1. Information regarding IQ, education, gender, age, and geographic recruitment area is provided. 2. Adequate sample size. 3. Means and SDs are reported. Considerations regarding use of the study 1. Undifferentiated age range. 2. Insufficient medical and psychiatric exclusion criteria; the sample included participants hospitalized with spinal cord injuries and psychiatric disorders.
Interpretive Guides
[TPT.3] Reitan and Wolfson, 1985
The authors suggest that nondominant hand performance should be about one-third faster than dominant hand time and that performance on the third trial (both hands) should be one-third faster than nondominant hand performance. They provide general guidelines for TPT score interpretation in the form of "severity ranges:" perfectly normal (or better than average), normal, mildly impaired, and seriously impaired. They list the test total completion time in minutes and number of blocks recalled and located that correspond to each severity range. No other information is provided, such as score means and SDs or any
data regarding the normative sample on which these guidelines were developed. Considerations regarding use of the study The authors argue that these norms were meant as "general guidelines" and that "exact percentile ranks corresponding with each possible score are hardly necessary because the other methods of inference are used to supplement normative data in clinical interpretation of results of individual participants" (p. 97). However, we maintain that more precise scores as well as separate normative data for different age, IQ, and educational levels are necessary to avoid false-positive errors in diagnosis. It is not clear how cutoffs were developed (not reproduced here); they do not match the cutoffs recommended by Halstead (1947).
Golden et al. (1981b, pp. 20-21) recommend that if nondominant hand performance is 40% better than dominant hand performance, a dominant hemisphere lesion should be inferred, and if the nonpreferred hand score is <25% better than the preferred hand score, a nondominant hemisphere lesion should be hypothesized.
Russell et al. (1970, pp. 42-44), in constructing their neuropsychological key approach, devised six rating equivalents of TPT raw scores based primarily on "rules of thumb" recommended by P.M. Rennick and apparently derived from a small group of patients. Russell (1984) subsequently modified the ratings as reflected in Table 17.1. Russell et al. (1970) suggest that left hemisphere damage is indicated if the right-hand score is worse than the left-hand score by 2 rating points and vice versa.
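The between-hand guidelines described above can be written out as simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, "percent better" is assumed to mean the percent reduction in time from the dominant to the nondominant hand trial, "40% better" is assumed to mean 40% or more, and "one-third faster" is assumed to mean one-third less time. None of these operational definitions is stated in the sources summarized here.

```python
def percent_improvement(dominant_min: float, nondominant_min: float) -> float:
    """Percent reduction in time from the dominant to the nondominant trial.

    Assumption: this is what 'percent better' means in the guideline.
    """
    if dominant_min <= 0:
        raise ValueError("dominant-hand time must be positive")
    return 100.0 * (dominant_min - nondominant_min) / dominant_min


def golden_hypothesis(dominant_min: float, nondominant_min: float) -> str:
    """Apply the 40% / 25% guideline attributed to Golden et al. (1981b)."""
    gain = percent_improvement(dominant_min, nondominant_min)
    if gain >= 40.0:  # assumption: '40% better' read as 40% or more
        return "dominant hemisphere lesion inferred"
    if gain < 25.0:
        return "nondominant hemisphere lesion hypothesized"
    return "no lateralizing inference"


def reitan_wolfson_expected(dominant_min: float) -> tuple[float, float]:
    """Expected nondominant and both-hands times under the one-third guideline.

    Assumption: 'one-third faster' means each trial takes two-thirds of the
    time of the preceding trial.
    """
    nondominant = dominant_min * 2 / 3
    both = nondominant * 2 / 3
    return nondominant, both


if __name__ == "__main__":
    print(golden_hypothesis(6.0, 5.5))
    print(reitan_wolfson_expected(9.0))
```

Under these assumptions, improvements between 25% and 40% yield no lateralizing inference, which makes the gap between the two thresholds explicit.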
Table 17.1. Rating Equivalents of the TPT Raw Scores, According to Russell et al. (1970)

                                     Rating Equivalents
               0       1         2           3           4           5        6
Dominant       ≤3.9    4-7.9     8-9.9       10-12.9     13-15.9     16-19    20,X
Nondominant    ≤2.9    3-5.4     5.5-6.7     6.8-9.7     9.8-13.9    14-18    19-20,X
Both           ≤1.7    1.8-3.3   3.4-3.9     4-5.7       5.8-9.5     9.6-18   19-20,X
Total          ≤8.9    9-16.6    16.7-20.5   20.6-28.5   28.6-39.7   40-56    57-60
Memory         9-10    6-8       4-5         2-3         1           0        6 = TPT total
Localization   7-10    5-6       3-4         1-2         0           M = 0    6 = TPT total
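As an illustration of how the time-score rows of Table 17.1 might be applied, the following sketch converts a raw time in minutes to a rating and then applies the 2-point between-hand rule described in the text. It is a minimal sketch, not the authors' scoring procedure: the names `UPPER_BOUNDS`, `rating`, and `lateralization` are hypothetical, values falling between two printed ranges are assumed to belong to the higher (worse) rating, an uncompleted trial (the "X" entries) is assumed to be passed as `None`, and "worse by 2 rating points" is assumed to mean 2 or more. The Memory and Localization rows are omitted because their rating 5 and 6 entries depend on other scores.

```python
from bisect import bisect_left
from typing import Optional

# Upper bounds in minutes for ratings 0-5, transcribed from the time-score
# rows of Table 17.1; any time above the last bound is rated 6.
UPPER_BOUNDS = {
    "dominant":    [3.9, 7.9, 9.9, 12.9, 15.9, 19.0],
    "nondominant": [2.9, 5.4, 6.7, 9.7, 13.9, 18.0],
    "both":        [1.7, 3.3, 3.9, 5.7, 9.5, 18.0],
    "total":       [8.9, 16.6, 20.5, 28.5, 39.7, 56.0],
}


def rating(measure: str, minutes: Optional[float]) -> int:
    """Return the 0-6 rating for a time score; None marks an uncompleted trial."""
    if minutes is None:  # assumption: 'X' in the table means not completed
        return 6
    # bisect_left gives the index of the first upper bound >= minutes,
    # which is the rating; times beyond every bound fall through to 6.
    return bisect_left(UPPER_BOUNDS[measure], minutes)


def lateralization(right_rating: int, left_rating: int) -> Optional[str]:
    """Apply the 2-point rule: the worse hand points to the opposite hemisphere."""
    if right_rating - left_rating >= 2:
        return "left hemisphere"
    if left_rating - right_rating >= 2:
        return "right hemisphere"
    return None


if __name__ == "__main__":
    # For a right-handed examinee the dominant row is the right hand.
    right = rating("dominant", 13.5)
    left = rating("nondominant", 5.0)
    print(right, left, lateralization(right, left))
```

Storing only the upper bound of each range keeps the lookup to a single sorted search per score and avoids undefined gaps between adjacent printed ranges.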
Normative Studies and Control Groups from Clinical Comparison Studies
[TPT.4] Klove and Lochen (cited in Klove, 1974) (Table A17.4)
The authors obtained TPT data on 22 American controls from Wisconsin and 22 Norwegian controls as part of a validation study on the ability of the HRB to detect brain damage. Mean age, educational level, and IQ for the American participants were 31.6, 11.1, and 109.3, respectively, and for the Norwegian participants, 32.1, 12.2, and 111.9, respectively. The TPT data are presented in terms of mean for total time in minutes and memory and localization scores for each group. Study strengths 1. This publication is unique in providing TPT data on a Norwegian population. 2. Information regarding educational level, IQ, age, and geographic recruitment area is reported. Considerations regarding use of the study 1. Small sample size. 2. Undifferentiated age ranges. 3. No SDs reported. 4. No exclusion criteria are specified, and no information regarding gender distribution of the sample is provided. 5. Individual times for each hand separately and combined are not reported. 6. Relatively low educational level of the U.S. sample, although IQ is average.
[TPT.5] Wiens and Matarazzo, 1977 (Table A17.5)
The authors collected TPT data on 48 male applicants to a patrolman program in Portland, Oregon, as part of an investigation of the WAIS and MMPI correlates of the HRB. All participants passed a medical exam and were judged to be neurologically normal. Participants were divided into two equal groups, which were comparable in age (23.6 vs. 24.8), education (13.7 vs. 14.0), and WAIS FSIQ (117.5 vs. 118.3). Mean time in minutes and SDs to complete the TPT with the preferred hand, nonpreferred hand, both hands, and total are
provided, as well as means and SDs for memory and localization scores. A random subsample of 29 applicants was readministered the TPT 14-24 weeks following the original administration (Matarazzo et al., 1974) as part of an examination of the Halstead Impairment Index. Means and SDs for TPT total time in minutes and memory and localization are reported for both the original testing and the retest. One of the 29 participants obtained a score outside Halstead's (1947) suggested cutoff for total time, while nearly a third (nine of 29) of the participants scored below Reitan's cutoff on localization. Significant correlations were observed between the time scores and PIQ in both control groups (Wiens & Matarazzo, 1977), while significant correlations were documented between time scores and FSIQ in only one control group; VIQ was not associated with any of the TPT scores, and only the time scores were related to IQ measures. Study strengths 1. Information on test-retest performance. 2. Adequate sample size for the small age range. 3. Adequate medical exclusion criteria. 4. Information provided regarding educational level, IQ, gender, occupation, and geographic recruitment area. 5. Means and SDs are reported. Considerations regarding use of the study 1. High IQ level. 2. High educational level. 3. All-male sample.
[TPT.6] Cauthen, 1978a (Table A17.6)
The author obtained TPT data on 117 participants recruited from hospital volunteers and service clubs in Canada. Those with "evidence of organic dysfunction" associated with head injuries, illnesses, and symptoms of organic dysfunction were excluded (although the author indicates that five participants "were judged to have performed in a manner consistent with central nervous system dysfunction" on the TPT and "apparently were suffering from such dysfunction"). The sample included 35 male and 82 female Caucasian, urban residents. All but three participants
were right-handed, and given that there were no apparent differences in performance between right- and left-handers, the three left-handers were included in the sample. Mean time in minutes and SD are reported for preferred hand, nonpreferred hand, both hands, and total, as are means and SDs for memory and localization. The TPT data are presented in four age (20-29, 30-39, 40-49, and 50-60) by three WAIS IQ (91-111, 112-122, and 123-139) groupings, with individual cell sizes ranging from 5 to 18. Significant differences across age groups were documented on all TPT scores, and significant differences across IQ groups were documented for time for both hands, total time, and memory and localization. Using Halstead's (1947) cutoffs of 15.7 minutes for total time, ≤5 for memory, and ≤4 for localization, 22% of the sample were misclassified as impaired for total time, 15% for memory, and 53% for localization. The author concludes that "the inclusion of over half the normals in the dysfunctional range of performance on location indicates that the cutoff point requires adjustment" (p. 458). The author also observed that the percent exceeding the cutoff for total time increased dramatically across age groups (e.g., 2% for the 20-29 group, up to 63% for the 50-60 group). The author provides revised cutoff scores for each age decade for total time, memory, and localization; when all three cutoffs were employed, 29%-37% of participants in each age group fell below the cutoffs for at least one score. Participants' nondominant hand times were faster than their dominant hand times by an average of 1.3 (1.9) minutes, but 19% failed to show faster nondominant hand performance. Study strengths 1. Large overall sample size. 2. Presentation of data in age-by-IQ groupings. 3. Information regarding handedness, ethnicity, gender, and geographic recruitment area. 4. Means and SDs are reported. Considerations regarding use of the study 1. Small individual cell sizes. 2. No information regarding educational level.
3. Minimal exclusion criteria. 4. No data on individuals with less than average IQ. 5. IQ groupings are somewhat odd, and no information is provided regarding how they were derived. 6. The mean total time for those 30-39 years old with average IQ is in error. 7. Several unusual variations occurred in the data, which are probably due to the small individual cell sizes. For example, the 30-39 group with average to high average IQ had mean scores on preferred hand time which were higher than the 40-49 group with the same IQ level. Also of concern, the 50-60 group in the highest IQ range scored lower than those of lower IQ levels on nonpreferred hand time and memory and localization scores. 8. Data were collected in Canada, raising questions regarding their generalizability for clinical interpretation in the United States.
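The misclassification percentages reported for normal samples come from applying Halstead's (1947) criterion scores (total time over 15.6 minutes, fewer than six blocks recalled, fewer than five blocks located) to each control participant and counting how many fall in the impaired range. A minimal sketch of that tally follows; the class and function names are hypothetical, and the four control records are invented purely to show the mechanics, not taken from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TPTScores:
    total_time_min: float  # total time across the three trials, in minutes
    memory: int            # number of blocks recalled
    localization: int      # number of blocks correctly located


def halstead_flags(scores: TPTScores) -> dict[str, bool]:
    """Flag each score that falls in the impaired range per Halstead (1947)."""
    return {
        "total_time": scores.total_time_min > 15.6,
        "memory": scores.memory < 6,
        "localization": scores.localization < 5,
    }


def percent_flagged(sample: list[TPTScores], measure: str) -> float:
    """Percent of a sample flagged as impaired on one measure."""
    if not sample:
        raise ValueError("sample must not be empty")
    hits = sum(halstead_flags(s)[measure] for s in sample)
    return 100.0 * hits / len(sample)


# Hypothetical control participants, invented for illustration only.
controls = [
    TPTScores(11.2, 8, 6),
    TPTScores(16.4, 7, 4),
    TPTScores(13.9, 5, 3),
    TPTScores(9.8, 9, 5),
]

if __name__ == "__main__":
    for measure in ("total_time", "memory", "localization"):
        print(measure, percent_flagged(controls, measure))
```

In a normal sample every flagged participant is a false positive, so these percentages correspond directly to the misclassification rates discussed for each cutoff.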
[TPT.7] Harley, Leuthold, Matthews, and Bergs, 1980 (Table A17.7)
The authors collected TPT data on 193 V.A. hospitalized patients in Wisconsin aged 55-79. Exclusion criteria were FSIQ <80, active psychosis, unequivocal neurological disease or brain damage, and serious visual or auditory acuity problems. Patients with a diagnosis of chronic brain syndrome were included. Patient diagnoses were as follows: chronic brain syndrome unrelated to alcoholism (28%), psychosis (55%), alcoholism (37%), neurosis (9%), and personality disorder (4%). Mean educational level was 8.8 years. The sample was divided into five age groupings: 55-59 (n = 56), 60-64 (n = 45), 65-69 (n = 35), 70-74 (n = 37), and 75-79 (n = 20). Mean educational level and percent of sample included in each diagnostic classification are reported for each age grouping. The authors also provide test data on a subgroup of 160 participants equated for percent diagnosed with alcoholism across five age groupings. The "alcohol-equated sample" was developed "to minimize the influence that cognitive or motor/sensory differences uniquely attributable to alcohol abuse might have upon group test performance levels" (p. 2). This subsample remained heterogeneous regarding representation of the other diagnostic categories. T-score equivalents for raw scores and mean time in minutes per block are reported for dominant hand, nondominant hand, both hands, and total time per block for the three trials combined by age groupings. In addition, total number of blocks correctly placed; number of blocks correctly placed by the dominant hand, nondominant hand, and both hands; time per block for three trials combined; and percentage of patients correctly placing blocks are provided by age groupings for the total and alcohol-equated samples. We reproduce only the mean, SD, and range for the four time scores.
Study strengths 1. Large sample size in many individual cells, approximately 50. 2. Reporting of data on IQ, educational level, and geographic recruitment area. 3. Data are presented in age groupings. 4. Means and SDs are reported. Considerations regarding use of the study 1. The presence of substantial neurologic (chronic brain syndrome), substance abuse, and major psychiatric disorders in the sample. 2. Low educational level, although IQ levels are average. 3. No information regarding gender, although given that data were obtained in a V.A. setting, the sample is likely all, or nearly all, male. Other comments The scores for the two oldest age groups are identical in the whole sample and the alcohol-equated group because these two groups did not have overrepresentation of alcoholics, so they did not need to be adjusted.
[TPT.8] Pauker, 1980 (Table A17.8)
The author obtained TPT scores on 363 Toronto citizens fluent in English recruited through announcements and notices. Participants were aged 19-71 and included 152 men and 211 women. Exclusion criteria consisted of significant physical disability, sensory deficit, current medical illness, use of medication that might affect test performance, history of actual or suspected brain disorder, and alcoholism. MMPI profiles "could not suggest severe disturbance" or include more than three clinical scales with T scores ≥70 or an F-scale score >80. The TPT was administered according to Reitan's guidelines. Means and SDs for total time in seconds and memory and localization scores are reported for the sample as a whole and by three age groupings (19-34, 35-52, 53-71) by four WAIS IQ levels (89-102, 103-112, 113-122, 123-143). Individual cell sample sizes ranged from 4 to 60. Age-by-IQ categories were determined "in a compromise between what would be desirable and what the obtained sample characteristics and size dictated" (p. 1). No differences in TPT performance between men and women were documented. Study strengths 1. Large sample size, although individual cell sizes are substantially below 50. 2. Presentation of the data in age-by-IQ groupings. 3. Adequate medical and psychiatric exclusion criteria. 4. Information regarding gender, recruitment procedures, language, and geographic recruitment area. 5. Means and SDs are reported. Considerations regarding use of the study 1. No information regarding education. 2. Individual times for each hand separately and together are not reported. 3. Participants were recruited in Canada, raising questions regarding usefulness for clinical interpretation in the United States. 4. The age-by-IQ cell representing participants aged 53-71 with IQ of 89-102 contained only four participants; Pauker comments that this category "should not be considered to be of any more than interest value" (p. 2). 5. At least one subject 53-71 years old in the 123-143 IQ range scored particularly poorly on total time and localization, causing the means to be artificially low and the SDs to be excessively large for this age-by-IQ cell. 6. IQ levels below the average range are not represented.
[TPT.9] Anthony, Heaton, and Lehman, 1980 (Table A17.9)
The authors amassed TPT data on 100 normal volunteers from Colorado as part of a cross-validation of two computerized interpretive programs for the HRB. Participants had no history of medical or psychiatric problems, head trauma, brain disease, or substance abuse. In addition, for 85% of the controls, normal EEGs and neurological exams were obtained; in the remaining 15% of participants, it appears that this information was not available. Mean age was 38.88 (15.80) years, and mean education was 13.33 (2.56) years. Mean WAIS FSIQ, VIQ, and PIQ were 113.54 (10.83), 113.24 (11.59), and 112.26 (10.88), respectively. TPT data are presented in terms of mean and SD time in minutes divided by number of blocks placed and mean and SD for memory and localization scores. Participants incorrectly identified as brain-damaged (according to Russell et al., 1970) were older, less educated, and less intelligent than participants correctly classified as non-brain-damaged. Study strengths 1. Large sample size. 2. Adequate exclusion criteria. 3. Information regarding education, IQ, age, and geographic recruitment area. 4. Means and SDs are reported. Considerations regarding use of the study 1. The large undifferentiated age grouping. 2. The IQ range is high average. 3. No information regarding gender. 4. Individual times for each hand separately and together are not reported.
[TPT.10] Bak and Greene, 1980 (Table A17.10)
The authors gathered TPT data on 30 right-handed Texan participants as part of an investigation of the effect of age on performance on the HRB and the Wechsler Memory Scale. Participants were equally divided into two age groupings: 50-62 and 67-86 years. Participants were fluent in English and denied history of CNS disorders, uncorrected sensory deficits, or illnesses or "incapacities" which might affect test results; participants in poor health were excluded. Mean ages of the two groups were 55.6 (4.44) and 74.9 (6.04) years, respectively. Participants in the first group were born between 1916 and 1929, and participants in the second group were born between 1892 and 1912. Nine individuals in the first group and 10 in the second group were female. Four WAIS subtests were administered (Information, Arithmetic, Block Design, Digit Symbol); mean scores on these measures suggested that IQ levels were within the high average range or higher. Mean educational levels for the two groups were 13.7 (1.91) and 14.9 (2.99) years, respectively. Mean and SD times in seconds for the right hand, left hand, both hands, and total are reported, as are means and SDs for memory and localization. The groups differed significantly on all the time measures. Study strengths 1. Data on a very elderly age group not found in other published reports. 2. Adequate exclusion criteria. 3. Information regarding education, IQ, gender, handedness, fluency in English, and geographic recruitment area. 4. Means and SDs are reported. Considerations regarding use of the study 1. High IQ level. 2. High educational level. 3. The older age grouping spans nearly two decades and may be too broad for optimal clinical interpretation. 4. Small sample sizes.
[TPT.11] Fromm-Auch and Yeudall, 1983 (Table A17.11)
The authors obtained TPT data on 193 Canadian participants (111 male, 82 female) recruited through posted advertisements and personal contacts. Participants are described
as "nonpsychiatric" and "nonneurological." Eighty-three percent of the sample were right-handed, and mean age was 25.4 (8.2) years, with a range of 15-64 years. Mean education was 14.8 (3.0) years, with a range of 8-26 years, and included technical and university training. Mean WAIS FSIQ, VIQ, and PIQ were 119.1 (8.8, range 98-142), 119.8 (9.9, range 95-143), and 115.6 (9.8, range 89-146), respectively. No subject obtained an FSIQ which was lower than the average range. Participants were classified into five age groupings: 15-17 (n = 32), 18-23 (n = 74), 24-32 (n =56), 33-40 (n = 18), and 41-64 (n = 10) (total sample for this test= 190). Mean time in minutes, SD, and range are reported for preferred hand, nonpreferred hand, both hands combined, and total time for each age grouping. Similarly, mean correct blocks, SD, and range are summarized on localization and memory trials. The authors note that none of their participants required more than 15 minutes to place the blocks with preferred, nonpreferred, and both hands. No gender differences were documented, and male and female data were collapsed. Study strengths 1. Large overall sample, with some individual cells approximating 50. 2. Presentation of the data by age grouping. 3. Information regarding mean IQ and educational levels, handedness, gender, recruitment procedures, and geographic recruitment area. 4. Some psychiatric and neurological exclusion criteria. 5. Means and SDs are reported. Considerations regarding use of the study 1. The high intellectual and educational level of the sample. 2. An age grouping of 41-64 with 10 participants would not appear to be particularly useful. 3. Participants were recruited in Canada, raising questions regarding the usefulness of the data for clinical interpretation in the United States. 4. At least one subject in the 18-23 group scored particularly poorly on the time
scores, causing the means to be artificially low and the SDs to be excessively large for this age grouping.
[TPT.12] Moore, Richards, and Hood, 1984 (Table A17.12)
This data set of 284 participants was actually provided by Pauker (1980) but is again included because these authors divide the data into smaller age ranges than those provided by Pauker. Specifically, the following six age groupings were employed: 19-27 (mean = 23.1; 24 men, 32 women), 28-36 (mean = 31.8; 36 men, 28 women), 37-45 (mean = 40.8; 31 men, 28 women), 46-55 (mean = 50.8; 26 men, 34 women), 56-65 (mean = 61.2; 8 men, 12 women), and 66-76 (mean = 69.5; 8 men, 17 women). Mean FSIQs for the six groups were as follows: 115, 112, 111, 116, 115, and 115. Participants were recruited through newspaper advertisements and paid $10. Exclusion criteria were nonfluency in English, history of central nervous system disorder, current treatment for emotional disorder, and major physical illness or disability. Standard test administration procedures were followed. Means for number recalled, number correctly located, average completion time across the three placement trials, and "location proportion" (not defined) are reported.
Study strengths 1. Large sample size, although the two older age groupings have fewer than 50. 2. Presentation of the data in age groupings. 3. Adequate medical and psychiatric exclusion criteria. 4. Information regarding gender, recruitment procedures, and language. 5. Means reported. Considerations regarding use of the study 1. No information regarding education. 2. Individual times for each hand separately and together are not reported. 3. Participants were recruited in Canada, raising questions regarding usefulness for clinical interpretation in the United States. 4. IQ levels below the average range not represented. 5. SDs not reported.
[TPT.13] Schear, 1984 (Tables A17.13, A17.14)
The author reports norms for the TPT from a Kansas neuropsychiatric sample consisting of 556 right-handed males with no "peripheral injuries or defects" which could adversely affect test performance. The sample reflected a large number of diagnostic categories: 35% had evidence of various signs of brain damage (organic brain syndromes, alcohol encephalopathy, epilepsy, etc.) and 49% exhibited psychiatric disturbance (nonorganic psychotic disorders, schizophrenia, alcoholism, etc.). A maximum of 10 minutes was allowed for each of the three TPT timed trials. Means, SDs, and ranges are provided for years of education and time in minutes for right hand, left hand, both hands, and total for five age decades: ≤29 (n = 111), 30-39 (n = 112), 40-49 (n = 111), 50-59 (n = 155), and 60-69 (n = 67). Means, SDs, and ranges are also reported for memory and localization and for number of blocks placed with the right hand, left hand, both hands, and total. Study strengths 1. Large sample size, with individual cells exceeding 50. 2. Presentation of the data by age decades. 3. Information regarding education, gender, handedness, and geographic recruitment area. 4. Data regarding mean number of blocks placed. 5. Means and SDs are reported. Considerations regarding use of the study 1. Insufficient exclusion criteria; participants were diagnosed with organic, psychiatric, and/or medical illnesses. 2. All male sample. 3. No information on IQ.
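The criterion of at least 50 participants per grouping interval, which underlies the cell-size comments in these summaries, can be checked mechanically against the reported age-group sizes. The sketch below uses the group sizes given in the summaries for three studies; the function names and the "all"/"some"/"none" labels are illustrative assumptions rather than terms used by the authors, and the youngest Schear (1984) decade is keyed as "<=29".

```python
MIN_CELL_SIZE = 50  # minimum participants per grouping interval, as stated in the text

# Age-group sample sizes as reported in the study summaries of this chapter.
CELL_SIZES = {
    "Harley et al., 1980": {
        "55-59": 56, "60-64": 45, "65-69": 35, "70-74": 37, "75-79": 20,
    },
    "Fromm-Auch & Yeudall, 1983": {
        "15-17": 32, "18-23": 74, "24-32": 56, "33-40": 18, "41-64": 10,
    },
    "Schear, 1984": {
        "<=29": 111, "30-39": 112, "40-49": 111, "50-59": 155, "60-69": 67,
    },
}


def adequate_cells(cells: dict[str, int], minimum: int = MIN_CELL_SIZE) -> list[str]:
    """Return the grouping labels whose sample size meets the minimum."""
    return [label for label, n in cells.items() if n >= minimum]


def summarize(cell_sizes: dict[str, dict[str, int]],
              minimum: int = MIN_CELL_SIZE) -> dict[str, str]:
    """Label each study by whether all, some, or none of its cells are adequate."""
    summary = {}
    for study, cells in cell_sizes.items():
        ok = adequate_cells(cells, minimum)
        if len(ok) == len(cells):
            summary[study] = "all"
        elif ok:
            summary[study] = "some"
        else:
            summary[study] = "none"
    return summary


if __name__ == "__main__":
    for study, verdict in summarize(CELL_SIZES).items():
        print(f"{study}: {verdict} cells have n >= {MIN_CELL_SIZE}")
```

The resulting labels agree with the status summary earlier in the chapter, where Schear (1984) is listed among the studies that consistently met the criterion and the other two among those in which only some subgroups did.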
[TPT.14] Russell, 1985 (Table A17.15)
The author obtained data on the six-block children's version of the TPT in a sample of 19 Caucasian "control" participants as part of his examination of the use of a shortened version of the TPT in adults. Participants had been admitted to a neurology ward of the Miami V.A. Medical Center for a suspected neurological condition but showed no evidence of brain damage upon neurological evaluation, including brain CT scans. Mean age was 43.5 (13.6) years, and mean educational level was 14.8 (6.4) years. All but two participants were male. Exclusion criteria were severe psychiatric disturbance (e.g., psychosis, severe depression), left-handedness, and inability to use one or both hands. The TPT was administered according to the Rennick procedures (Russell et al., 1970); specifically, if a time trial was not completed after 10 minutes, the trial was discontinued and the score was prorated based on the number of blocks remaining to obtain a combined time score. The 19 control participants were tested as part of a group of 80 participants administered both the six-block and 10-block TPT. Forty participants were given the 10-block version first, and the remaining sample was administered the six-block version first during the course of a comprehensive neuropsychological evaluation. The interval between administration of the two versions ranged from 1 hour to 2 days. Mean time in minutes and SD are reported for the dominant hand, nondominant hand, both hands, and total, as are means and SDs for memory and localization for both versions. Correlations of the time scores between the two tests ranged from 0.62 (nondominant hand) to 0.82 (total time), suggesting that these scores "are measuring approximately the same attributes and the TPT 6 could be substituted for the TPT 10" (p. 73). The correlation for the memory scores was 0.71, but the correlation for localization was only 0.55. Despite the overall strong association between the two versions, the TPT-10 was found to be much more difficult than the TPT-6; the time to complete the TPT-6 was approximately one-third that of the full TPT. The localization score had poor reliability due to its marked variability.
Study strengths 1. Data on the 6-block version in adults and its relationship to the 10-block test are reported.
2. Information on educational level, geographic recruitment area, gender, ethnicity, handedness, and age is given. 3. Means and SDs are reported.
Considerations regarding use of the study 1. Small sample size. 2. Undifferentiated age range. 3. Insufficient exclusion criteria. 4. The mean score for the TPT-10 for both hands appears to be in error since it is twice that of either hand alone. 5. High mean educational level. 6. No data on IQ. 7. Mostly male sample.
[TPT.15] Heaton, Grant, and Matthews, 1986 (Table A17.16)
The authors obtained TPT data on 553 normal controls in Colorado, California, and Wisconsin as part of an investigation into the effects of age, education, and gender on HRB performance. Nearly two-thirds of the sample were male (356 males, 197 females). Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants were aged 15-81, with a mean of 39.3 (17.5). Mean education was 13.3 (3.4) years, with a range of 0-20. The sample was divided into three age categories (<40, 40-59, and ≥60) with sizes of 319, 134, and 100, respectively, and classified into three education categories (<12, 12-15, and ≥16 years) with sizes of 132, 249, and 172, respectively. Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. TPT mean total time in minutes per block and mean memory/localization scores are reported for the six subgroups, as well as percent classified as normal using Russell et al.'s (1970) criteria. Approximately 20%-30% of test score variance was accounted for by age, but only approximately 5%-10% of test score variance was associated with educational level. Significant group differences on all TPT scores were found across the three age groups; significant group differences across educational levels were documented only for the memory score, and a significant age-by-education interaction was documented for memory and localization scores. Significant differences in performance were found between males and females; men significantly outperformed women on total time.
Study strengths 1. Large size of overall sample and individual cells. 2. Information regarding education, gender, age, and geographic recruitment area. 3. Data grouped by age and educational level. 4. Minimally adequate exclusion criteria.
Considerations regarding use of the study 1. No reporting of TPT score SDs. 2. Mean individual WAIS subtest scaled scores reported but not overall IQ scores. 3. Individual times for each hand separately and together are not reported.
[TPT.16] Alekoumbides, Charter, Adkins, and Seacat, 1987 (Table A17.17)
The authors report TPT data on 135 medical and psychiatric inpatients and outpatients without cerebral lesions or histories of alcoholism or cerebral contusion from V.A. hospitals in southern California as part of their development of standardized scores corrected for age and education for the HRB. Among the 41 psychiatric patients, nine were diagnosed as psychotic and 32 as neurotic. In addition to psychiatry services, patients were drawn from medicine (n = 57), neurology (n = 22), spinal cord injury (n = 9), and surgery (n = 6) units. Mean age was 46.85 (17.17) years, ranging 19-82, and mean education was 11.43 (3.20) years, ranging 1-20. Frequency distributions for age and years of education are provided. Mean WAIS FSIQ, VIQ, and PIQ were within the average range, i.e., 105.89 (13.47), 107.03 (14.38), and 103.31 (13.02), respectively; means and SDs for individual age-corrected subtest scores are also reported. All participants except one were male; the majority were Caucasian (93%), with 7% African American. The mean score on a measure of occupational attainment was 11.29. No differences were found in test performance between the two psychiatric groups and the nonpsychiatric group, and the data
were collapsed. Mean and SD for time in minutes to correctly place all the blocks are presented for the preferred hand, nonpreferred hand, both hands, and total time, as well as memory and localization scores. In addition, mean and SD blocks correctly placed per minute are summarized for the preferred hand, nonpreferred hand, both hands, and total time. The latter scores were included because the first set of scores would not discriminate between participants who successfully placed all the blocks except one in the allotted time vs. participants who correctly placed no blocks. Both age and educational level had significant associations with TPT scores in the expected direction, and regression equation information to allow correction of raw scores for age and education is included.
Study strengths 1. Large sample size. 2. Information regarding IQ, age, education, ethnicity, gender, occupational attainment, and geographic recruitment area. 3. Regression equation data for computation of age- and education-corrected scores. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Data were collected on medical and psychiatric patients. 2. Undifferentiated age range (mitigated by the regression equation information). 3. Nearly all male sample.
[TPT.17] Bornstein, Baker, and Douglass, 1987a (Table A17.18)
The authors collected TPT test-retest data on 23 volunteers (14 women, 9 men) aged 17-52, with a mean of 32.3 (10.3) years, as part of an examination of the short-term retest reliability of the HRB. Exclusion criteria consisted of a positive history of neurological or psychiatric illness. Mean VIQ was 105.8 (10.8), with a range of 88-128, and mean PIQ was 105.0 (10.5), with a range of ~121. Participants were administered the HRB in standard order both on initial testing and again 3 weeks later. Means, SDs, and ranges for time, Memory, and Localization scores for both testing sessions are provided, as are raw score change and SD, median raw score change, and mean percent of change. Significant improvements were noted in time, memory, and localization scores. Correlations between such demographic variables as age, education, mean percent of change, and mean change were generally small, with education accounting for up to 17% of variance and age accounting for up to 8% of variance.
Study strengths 1. Information on short-term (3-week) retest data. 2. Information on IQ level, gender, and age. 3. Minimally adequate exclusion criteria. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. No information regarding education. 3. TPT time score is not defined. 4. Individual times for each hand separately and together are not reported. 5. Small sample size.
[TPT.18] El-Sheikh, El-Nagdy, Townes, and Kennedy, 1987 (Table A17.19)
The authors reported TPT data on 32 undergraduate and graduate Egyptians at the American University at Cairo as a part of their cross-cultural investigation of the Luria-Nebraska and Halstead-Reitan batteries. The average age was 20.6 years (range 17-24). No subject had a history of known brain damage. Participants were described as "Arabic and English speaking." TPT instructions were translated into Egyptian colloquial Arabic by the first author and checked by two independent judges fluent in both Arabic and English. In case of disagreement between these two judges, a third judge was consulted. The TPT was administered in English to 23 participants and in Arabic to nine participants and readministered 2 weeks later. Means and SDs for time in minutes are reported for the preferred hand, nonpreferred hand, both hands, and total, as are means and SDs for memory and localization. No differences in performance were found between
administration in English and Arabic. Significant practice effects were documented for time for both hands, total time, and localization.
Study strengths 1. Data obtained on an Arabic sample. 2. Information on test-retest scores. 3. Information regarding educational level, age, and geographic recruitment area. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Small sample size. 2. Minimal exclusion criteria. 3. No information regarding intellectual level. 4. Undifferentiated age range, although it can be assumed it is fairly restricted.
[TPT.19] Dodrill, 1987 (Table A17.20)
The author collected TPT data on 120 participants in Washington during the years 1975-1976 (n = 81) and 1986-1987 (n = 39). Half of the sample was female, and 10% were minorities (six African American, three Native American, two Asian American, one unknown). Eighteen were left-handed. There were 45 students, 11 homemakers, and one retiree; 37 were employed, and 26 were unemployed. Participants were recruited from various sources, including schools, churches, employment agencies, and community service agencies. They were either paid for their participation or offered an interpretation of their abilities. Exclusion criteria were history of "neurologically relevant disease (such as meningitis or encephalitis)," alcoholism, birth complications "of likely neurological significance," oxygen deprivation, peripheral nervous system injury, psychotic or psychotic-like disorders, or head injury associated with unconsciousness, skull fracture, persisting neurological signs, or diagnosis of concussion or contusion. One-third of potential participants failed to meet the above medical and psychiatric criteria, resulting in a final sample of 120. Mean age was 27.73 (11.04) years, and mean education was 12.28 (2.18) years. Participants tested in the 1970s were administered the WAIS, and participants assessed in the 1980s were administered the WAIS-R; WAIS scores were converted to WAIS-R equivalents by subtracting 7 points from the VIQ, PIQ, and FSIQ. Mean FSIQ, VIQ, and PIQ scores were 100.00 (14.35), 100.92 (14.73), and 98.25 (13.39), respectively. IQ scores ranged from 60 to 138 and reflected a normal distribution. Mean time in minutes and SDs are reported for TPT total time, as are means and SDs for memory and localization. In addition, IQ-equivalent scores for various levels of intelligence are presented. Using Halstead's (1947) cutoffs of ≤15.6 minutes for total time and ≥6 and ≥5 for memory and localization, respectively, 21.7%, 5.0%, and 39.2% of a subgroup of the sample were misclassified as brain-damaged.
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria (although the appropriateness of including mentally retarded individuals could be questioned). 3. Information regarding education, IQ, age, occupation, gender ratio, handedness, ethnicity, recruitment procedures, and geographic area. 4. IQ-equivalent scores provided. 5. Data for different IQ levels provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. Individual times for each hand separately and together are not reported. 3. On the IQ-equivalent scores, the two highest IQ groups have lower scores on localization than the 100-120 IQ group.
[TPT.20] Yeudall, Reddon, Gill, and Stefanyk, 1987 (Table A17.21)
The authors obtained TPT data on 225 Canadian participants recruited from posted advertisements in workplaces and personal solicitations. Participants included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified for participation average students in grades 10-12.
Exclusion criteria were evidence of "forensic involvement," head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Participants were classified into four age groupings devised to ensure group homogeneity: 15-20, 21-25, 26-30, and 31-40 years. Information regarding percent right-handers, mean years of education, and mean WAIS/WAIS-R VIQ and PIQ is reported for each age grouping for males and females separately and combined. For the sample as a whole, 88% were right-handed and had completed an average of 14.55 (2.78) years of schooling. The mean FSIQ, VIQ, and PIQ were 112.25 (10.25), 112.60 (10.86), and 108.13 (10.63), respectively. TPT data were gathered by experienced testing technicians who "motivated the participants to achieve maximum performance" partially through the promise of detailed explanations of their test performance. Means and SDs for time in seconds to execute the task with the preferred and nonpreferred hands separately and together are presented, as are means and SDs for Memory and Localization scores, for each age grouping and each age-by-gender grouping. No significant relationships were found between TPT scores and gender or educational level. Significant age effects were noted for the time, memory, and localization scores; and significant correlations were documented between all scores and PIQ, particularly in males. An association between TPT memory score and VIQ was also noted. Because no significant differences were found between men and women, only the combined sample data are reproduced below.
Study strengths 1. Large sample size, with individual cells approximating 50. 2. Data grouped by age. 3. Data availability for a 15-20 age group. 4. Adequate medical and psychiatric exclusion criteria. 5. Information regarding handedness, education, IQ, gender, occupation, geographic recruitment area, and recruitment procedures. 6. Means and SDs are reported.
Considerations regarding use of the study 1. The sample was atypical in terms of its high average intellectual level and high level of education. 2. The data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States due to possible subtle cultural differences. 3. Examination of the data reveals odd, unpredicted variability, with those 21-25 years old performing worse on the time scores than those 26-30 years old.
[TPT.21] Ernst, 1987 (Table A17.22)
The author obtained TPT data on 110 primarily Caucasian (99%) residents of Brisbane, Australia, aged 65-75. Fifty-nine were female and 51 were male, and mean educational level was 10.3 years; men and women did not differ in years of education. Participants were recruited primarily through random selection from the Queensland State electoral roll (n = 97), with the remainder (n = 13) solicited through senior-citizen centers. Exclusion criteria were history of significant head trauma or neurologic disease. Nearly one-half of the sample were diagnosed with at least one chronic disease (hypertension = 33, heart disease = 9, thyroid dysfunction = 7, asthma = 5, emphysema = 2, diabetes = 1), for which they were receiving treatment and which was described as well-controlled. Sixty-six participants were receiving medications, primarily for the diseases listed above. Test administration was according to the Reitan instructions. All participants were administered the Trailmaking Test first, followed by either the TPT or Booklet Category Test. Mean times in minutes and SDs are reported for the preferred hand, nonpreferred hand, both hands, and total. In addition, mean number of blocks and SDs are presented for each time measure and for memory and localization. Using cutoffs of 15.7 minutes for total time and five blocks for memory and four blocks for localization, 77%, 36%, and 89%, respectively, of the sample were classified as impaired. Men outperformed women on memory, localization, and all time scores except dominant hand. No differences in TPT
scores emerged between participants with and without chronic disease, although participants taking medications scored better on memory and localization. Educational level did not appear to be related to TPT performance. A third of the sample (34.5%) failed to show superior nondominant hand performance on the second trial, and this subgroup was not overrepresented by men or women, older age (<70% ), chronic illness, or medication usage and did not show poorer scores on the TPT measures. Participants administered the TPT prior to the Booklet Category Test obtained a higher mean number of blocks placed for the preferred hand, but no other test order effects were documented.
Study strengths 1. Large sample size in a restricted age range. 2. Presentation of the data by gender. 3. Information regarding education, geographic recruitment area, recruitment procedures, and ethnicity. 4. Information regarding test administration order effects. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Approximately half of the participants had at least one chronic illness, and over half were taking prescribed medications. 2. No information regarding IQ. 3. Low mean educational level. 4. Data were collected in Australia and may be unsuitable for clinical use in the United States.
[TPT.22] Clark and Klonoff, 1988 (Table A17.23)
The authors obtained data on the TPT-6 (the six-block children's version) in a sample of 79 male, right-handed, coronary bypass surgery participants in Canada as part of their evaluation of the reliability and construct validity of the shortened TPT in adults. Exclusion criteria were postsurgical complications and stroke or other neurological conditions. Apparently all of the sample were prescribed coronary medications (β-blockers = 75%, calcium channel blockers = 65%, long-acting nitrates = 59%, nitroglycerin = 47%, antiarrhythmics = 7%, digitalis = 7%). In addition, six participants were prescribed antianxiety medication, two were receiving sleeping medication, and one was on an antidepressant. Also, several were receiving treatment for chronic medical illnesses such as diabetes (n = 2), ulcers (n = 3), gout (n = 2), allergies (n = 3), and arthritis (n = 2). Mean age was 55.5 (8.0) years, with a range of 35-68 years, and mean WAIS-R FSIQ was 105.9 (12.2), with a range of 77-137. The majority of the sample had completed at least 9 years of education (84%), and nearly one-third had completed some post-high school work (29.1%). The TPT-6 was administered according to the instructions for the 10-block TPT to each subject 3 weeks before surgery and 3, 12, and 24 months postsurgery. Mean time in minutes and SDs for time to complete the task with each hand, both hands, and total are reported, as are means and SDs for localization and memory for each of the four testing sessions. No significant correlations were noted between measures of presurgical cardiac status and TPT scores, and only one significant difference in test performance was noted across the four largest cardiac medication groups; however, this may have been a chance result given the multiple comparisons. The authors argue that "these findings suggest that the sample is an appropriate normative group in that the stress of a disease state and upcoming surgical intervention was present but the disease per se or specific medications were not directly related to test performance" (p. 177). The authors note that the TPT-6 had good test-retest reliability, with little practice effect across the four testing sessions. Construct validity appeared to be consistent with that of the 10-block TPT.
Study strengths 1. Data for the six-block version. 2. Test-retest data (although it is technically not a true test-retest study since the participants underwent an intervening surgical procedure). 3. Information regarding education, gender, age, IQ, geographic recruitment area, and handedness.
4. Large sample size. 5. Means and SDs are reported.
Considerations regarding use of the study 1. All patients were medically ill and receiving various medications. 2. Undifferentiated age range. 3. Low overall educational level, although IQ is average. 4. Data collected in Canada, raising concerns regarding their usefulness for clinical interpretation in the United States. 5. All-male sample.
[TPT.23] Elias, Podraza, Pierce, and Robbins, 1990 (Table A17.24)
The authors recruited 183 community-dwelling participants (76 men, 107 women) from church groups, businesses, professional organizations, and community service organizations for older persons as part of a study on the impact of hypertension on cognition. Exclusion criteria were major chronic or acute disease, including hypertension; treatment for a neurological disorder; brain trauma; mental illness; and any cardiovascular or cerebrovascular disease. Skilled clerical, supervisory, blue-collar, and professional-executive occupations were represented. Participants were divided into three age groupings: 20-31 (41 men, 47 women), 37-49 (23 men, 38 women), and 55-67 (12 men, 22 women). Mean educational levels for the three groups were 15.4, 15.7, and 14.9, respectively (range 12-20 for each age group); and mean WAIS VIQ and PIQ were 119 and 116 for the youngest group, 122 and 122 for the middle-aged group, and 124 and 121 for the oldest group. Means and SDs for TPT total time (in minutes), Memory (number correct), and Localization (number correct) are reported.
Study strengths 1. Large overall sample size (with individual subgroup sizes of 88, 61, and 34). 2. Adequate exclusion criteria. 3. Information regarding gender, educational level, IQ, and recruitment strategies is provided; and data are stratified by age. 4. Means and SDs are reported for total time, memory, and localization trials.
Considerations regarding use of the study 1. Data on hands separately not reported. 2. No information regarding ethnicity or geographic area (although it can be assumed it was Maine, given the academic affiliations of the authors). 3. High IQ and educational level.
[TPT.24] Thompson and Heaton, 1991 (Table A17.25)
The authors report TPT data on 489 participants apparently recruited from California, Colorado, Ohio, and/or Michigan as part of their examination of the relationship between patterns of TPT performance and other neuropsychological test scores. Exclusion criteria were history of head trauma, neurological illness, substance abuse, "serious" psychiatric illness, and peripheral injury which could interfere with test performance. Mean age and years of education were 39.43 (17.76) and 13.19 (3.46), respectively. Mean WAIS FSIQ was 113.09 (12.07). The TPT was administered by trained technicians according to standard procedures. Mean time in minutes and SDs for dominant hand, nondominant hand, both hands, and total are provided, as are numbers of blocks correctly recalled and located. A reversal in the expected pattern of improvement from dominant to nondominant hand performance occurred in 30% of participants and was associated with relatively poorer scores on some WAIS Performance subtests. Study strengths 1. Large sample size. 2. Information regarding education, IQ, age, and geographic recruitment area. 3. Adequate exclusion criteria. 4. Means and SDs are reported. Considerations regarding use of the study 1. Undifferentiated age range. 2. No information regarding gender distribution. 3. High mean educational and IQ levels.
[TPT.25] Heaton, Grant, and Matthews, 1991
The authors provide normative data on the TPT from 486 (378 in the base sample and 108 in the validation sample) urban and rural participants recruited in several U.S. states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data were collected over a 15-year period through multicenter collaborative efforts; the authors trained the test administrators and supervised data collection. Exclusion criteria were history of learning disability, neurologic disease, illnesses affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or substance abuse. Mean age for the total sample was 42.0 (16.8) years, and mean educational level was 13.6 (3.5) years. Sixty-five percent of the sample were male. Mean WAIS FSIQ, VIQ, and PIQ were 113.8 (12.3), 113.9 (13.$), and 111.9 (11.6), respectively.
Participants were generally paid for their participation and judged to have provided their best efforts on the tasks. The TPT was administered according to Reitan and Wolfson's (1985) instructions, with the exception that time trials were discontinued at 10 minutes unless the subject was progressing well, near to finishing the task, or judged to have the potential for becoming distressed if forced to discontinue before completion. Minutes per block (number of minutes divided by number of blocks correctly placed) was the performance parameter employed for the preferred hand, nonpreferred hand, and both hands and for total time; numbers of blocks recalled and correctly located were used for memory and localization scores. A T-score system with demographic correction was developed on 378 participants and cross-validated on 108 participants. Extensive T-score tables corrected for age, education, and gender are provided; and the interested reader is referred directly to the handbook for these data.
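The minutes-per-block score described above can be illustrated with a minimal Python sketch. The function names and the example trial values are hypothetical, and pooling minutes and blocks across the three trials to obtain the total score is an assumption made for illustration only; the scoring rules in the Heaton et al. (1991) handbook should be consulted for actual clinical use.

```python
from typing import List, Tuple


def minutes_per_block(minutes_elapsed: float, blocks_placed: int) -> float:
    """Minutes elapsed on a trial divided by blocks correctly placed."""
    if blocks_placed <= 0:
        # The ratio is undefined when no block was placed correctly.
        raise ValueError("blocks_placed must be at least 1")
    return minutes_elapsed / blocks_placed


def total_minutes_per_block(trials: List[Tuple[float, int]]) -> float:
    """Assumption: the total score pools minutes and blocks over all trials."""
    total_minutes = sum(minutes for minutes, _ in trials)
    total_blocks = sum(blocks for _, blocks in trials)
    return minutes_per_block(total_minutes, total_blocks)


# Hypothetical examinee: (minutes, blocks placed) for the preferred hand,
# nonpreferred hand, and both hands. The first trial was stopped at
# 10 minutes with 8 of 10 blocks placed.
trials = [(10.0, 8), (6.0, 10), (4.0, 10)]
scores = [minutes_per_block(minutes, blocks) for minutes, blocks in trials]
total = total_minutes_per_block(trials)
print(scores, total)
```

Under this scoring, an examinee whose trial is discontinued with some blocks placed still receives a graded score, rather than being assigned the same maximum time as an examinee who placed fewer blocks.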
Age accounted for 7% (dominant hand) to 29% (localization) of the variance in TPT scores; education was associated with 3% (dominant hand) to 14% (memory) of score variance; and gender accounted for at most 1% of score variance. These demographic variables in combination were associated with 7% (dominant hand) to 34% (total) of score variance. For the sample as a whole, minutes per block for the dominant hand, nondominant hand, both hands, and total were 0.7 (0.8), 0.6 (0.6), 0.4 (0.6), and 0.5 (0.3), respectively. Numbers of blocks recalled and correctly located were 7.6 (1.6) and 4.3 (2.5), respectively. In 2004, the authors published revised norms, based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African American and Caucasian).
Study strengths 1. Large sample size. 2. T scores corrected for age, education, and gender. The 2004 edition presents the data for two race/ethnicity groups. 3. Comprehensive exclusion criteria. 4. Information regarding IQ and geographic recruitment area.
Consideration regarding use of the study 1. Above average mean intellectual level (which is probably less of an issue given that these are WAIS, rather than WAIS-R, IQ data).
Other comments 1. The interested reader is referred to the Fastenau and Adams (1996) critique of the Heaton et al. (1991) norms, and Heaton et al.'s (1996) response to this critique.
[TPT.26] Elias, Robbins, Walter, and Schultz, 1993 (Table A17.26)
TPT data on 427 participants, including those from the 1990 study and reflecting the same exclusion criteria, are provided for men and women separately for six age groupings: 15-24 (37 men, 24 women), 25-34 (40 men, 56 women), 35-44 (36 men, 56 women),
45-54 (25 men, 46 women), 55-64 (25 men, 35 women), and 65+ (24 men, 23 women). Participants with <12 or >19 years of education were excluded because participants outside this range were disproportionately distributed across the age and gender groupings. Mean WAIS Vocabulary and Information subtest scores ranged from 13.9-14.7 and 13.2-13.7, respectively, across the age groups. Means and SDs for total time, Memory, and Localization trials are provided.
Study strengths 1. Large overall sample size, although most individual age x gender cells were <50. 2. Adequate exclusion criteria. 3. Information regarding educational level (although only ranges provided) and WAIS subtests (Information, Vocabulary) is provided, and data are stratified by age and gender. 4. Means and SDs reported.
Considerations regarding use of the study 1. No information regarding ethnicity or geographic area (although it can be assumed it was Maine, given the academic affiliations of the authors). 2. No data reported for hands separately.
[TPT.27] Barrett, Morris, Akhtar, and Michalek, 2001 (Table A17.27)
TPT data were obtained on 1,052 Air Force veteran controls who served in Southeast Asia from 1962 to 1971 in a study examining the effects of Agent Orange on cognition. Subjects averaged 43.9 (7.6) years of age, and 5.3% were African American (with the rest "nonblack"); 37% were officers, 16.6% were enlisted flyers, and 46.0% were enlisted ground crew; most of the officers were college-educated, and most enlisted personnel were high school-educated. No exclusion criteria are listed aside from low exposure to dioxin and epilepsy. Participation was voluntary. Mean scores for localization, memory, dominant hand, nondominant hand, and both hands are reported.
Study strengths 1. Very large sample size. 2. Information regarding age, occupational status, and recruitment strategy with some limited data regarding ethnicity, educational level, and gender (it is assumed all were male).
Considerations regarding use of the study 1. No exclusion criteria. 2. No stratification of data by age or education. 3. No SDs reported for the test scores.
CONCLUSIONS
Review of the validity studies and normative data for the TPT reveals considerable controversy regarding the utility of this test. Although some authors found the TPT to be sensitive to different aspects of brain dysfunction, many pointed to notable drawbacks limiting its practical utility. One of the major liabilities of this test is that it is long and difficult and requires the examiner to blindfold the examinee. This often creates considerable psychological discomfort for the examinee (see Lezak et al., 2004, pp. 469-471). Use of the TPT-6 can alleviate some of the problems associated with the full 10-block version. However, data should be collected on the validity/reliability of the short version and on its comparability to the full version if it is to be used in clinical practice. The clinical usefulness of the TPT would be improved by adjusting cutoffs relative to a subject's age, intellectual level, and possibly education and gender, although the effect of the latter two demographic variables on TPT performance needs to be further explored. Taking demographic factors into account in assigning participants to impaired vs. nonimpaired groups would improve the specificity
of the TPT. This would also reduce excessively high rates of misclassification of "normal" participants in the impaired range reported in the studies reviewed. Another aspect of the test which would benefit from further research attention is standardization of test administration.
The above suggestions address major criticisms voiced by investigators regarding the validity of the TPT. With further improvements in administration and refinement in interpretation guidelines, the utility of this widely used test might be considerably improved.²
² Meta-analyses were not performed for the TPT as the data available for review are heterogeneous in terms of measures reported, country where data were collected, and sample composition, with several studies reporting data for patients with medical or neurological conditions.
V VERBAL AND VISUAL LEARNING AND MEMORY
18 Wechsler Memory Scale (WMS-R, WMS-III, and WMS-IIIA)
BRIEF HISTORY OF THE TEST
The Wechsler Memory Scale (WMS) made its first appearance in 1945 and was one of the first standardized memory batteries. Revisions of the WMS test battery were published in 1987 (i.e., WMS-R) and 1997 (i.e., WMS-III), with an abbreviated administration and scoring version appearing in 2002 (i.e., WMS-IIIA). (Please see Appendix 1 for ordering information.) The WMS-III and WMS-IIIA are the versions generally currently used in standard practice; however, as of this writing (January 2004), the WMS-R is still used in some clinical practices and research settings and, indeed, is still available for purchase from The Psychological Corporation.
WMS (1945)
The authors gratefully acknowledge the contributions of Paul Satz and David Schretlen to an early version of this chapter, as it originally appeared in D'Elia, Satz & Schretlen (1989).
Wechsler's (1945) original test battery was developed with the intention of providing "a rapid, simple and practical memory examination" (p. 87). The original WMS required about 30 minutes to administer and consisted of seven subtests: (1) Personal and Current Information, which included six relatively simple general and personal information questions; (2) Orientation, which asked five questions relating to place and time; (3) Mental Control, which required the patient to count backward from 20, recite the alphabet, and count by 3s, with bonus points awarded for fast, perfect performance; (4) Logical Memory, which tested only immediate auditory memory for two separate orally presented stories; (5) Digits Forward and Digits Backward, which assessed attention span and immediate auditory memory; (6) Visual Reproduction, which tested immediate visual memory for geometric designs after a 10-second exposure; and (7) Associate Learning, which tested recall for an orally presented list of six semantically related (easy) and four unrelated (hard) word pairs over three trials. Although Wechsler (1945) originally introduced two forms of the WMS, only form I was normed by Wechsler for clinical use. Researchers and clinicians, therefore, relied on form I of the WMS to provide evidence for the integrity of memory function. As the popularity of the original WMS increased around the world, numerous normative studies emerged to enhance its clinical utility. Of the 27 normative reports that were published for the original WMS form I, 15
presented data for groups residing in the United States (Abikoff et al., 1987; Bak & Greene, 1980; Bigler et al., 1981a,b; Haaland et al., 1983; Heaton et al., 1991; Hulicka, 1966; Ivnik et al., 1991; Mitrushina & Satz, 1991b; Russell, 1975; Russell & Starkey, 1993; Ryan et al., 1987; Schaie & Strother, 1968a; Trahan et al., 1988; Van Gorp et al., 1989; Wechsler, 1945). Nine studies reported data for individuals living elsewhere, including three studies of Canadian individuals (Klonoff & Kennedy, 1965, 1966; Cauthen, 1977), four of Australian groups (desRosiers & Ivison, 1986; Ivinskis et al., 1971; Ivison, 1977, 1986), and one study each for British (Kear-Colwell & Heller, 1978) and Turkish (Gilleard & Gilleard, 1989) populations. Russell (1975) was the first to add a delayed recall condition to the test, and almost without exception, WMS studies published subsequent to Russell's article included a delayed recall condition on one or more subtests, usually Logical Memory and Visual Reproduction. However, some investigators (Bak & Greene, 1980; Haaland et al., 1983; Ivnik et al., 1991; Trahan et al., 1988) followed Russell's procedure of interposing 30 minutes between the immediate and delayed recall conditions, while others used delays of 45 minutes (Mitrushina & Satz, 1991a,b; Van Gorp et al., 1990), 1 hour (Cauthen, 1977), and even 24 hours (Abikoff et al., 1987). [...] provided WMS normative data for delayed recall of the Logical Memory subtest. Trahan et al. (1988), Ryan et al. (1987), and Ivnik et al. (1991) provided normative data for the delayed recall of the Visual Reproduction subtest of the WMS. Interestingly, no studies provided reliable WMS delayed recall normative data for the Associate Learning subtest for U.S. populations. Delayed recall data for any subtests of the original WMS were scant for subjects under 20 years of age. Finally, in 1987, the long-awaited standardized revision of the test was published.
WMS-R (1987)
The WMS-R was introduced in 1987 and described by the test publisher as a "diagnostic
and screening device for use as part of a general neuropsychological examination, or any other clinical examination requiring the assessment of memory functions" (Wechsler, 1987, p. 1). Several changes were made in this first major standardized revision of the WMS, including modifications of test items and administration procedures. New subtests were added to assess spatial and figural memory, and delayed recall testing on most subtests was incorporated as a standard procedure. Scoring accuracy was greatly improved by the provision of detailed scoring procedures. Overall, the significant improvements in scoring criteria and administration procedures for the WMS-R permitted a more valid assessment of memory than was possible with the original WMS. Although the WMS-R is still used in some laboratories and in some research applications, it has largely gone the way of the original WMS, and the WMS-III and WMS-IIIA should now be considered the versions of standard practice. A brief overview of the WMS-R follows. The full battery includes information and orientation questions, eight short-term memory tasks, and four delayed recall trials, all of which take about 45 minutes to 1 hour to administer. Four of the six information and all five of the orientation questions from the original WMS were retained, and three new questions were added; thus, there are a total of 14 scorable information and orientation questions (as opposed to 11 in the original WMS). The presentation of the Mental Control subtest was left unchanged, except that bonus points are no longer assigned for fast, perfect performance. The Digit Span subtest uses a different sequence of numbers from the original WMS and begins with series that are shorter by one digit. Paragraph 1 of the WMS-R Logical Memory subtest is similar to the one used in the original WMS, with only some slight modifications. Immediate and 30-minute delay recall are assessed. 
A detailed scoring method awards full credit for either a verbatim or a "gist" response. The WMS-R Verbal Paired Associates subtest uses four of the six easy and all four of the hard pairs found on the Associate Learning subtest of the original WMS.
The revised edition provides up to six trials to correctly learn all the pairs, and equal credit is awarded to all pairs, regardless of difficulty. Delayed recall for the pairs is assessed after 30 minutes. Regarding the Visual Reproduction subtest, two of the three original WMS stimulus cards were retained. Two additional cards were added: one containing one design, the other containing two designs. Although this subtest was intended to assess nonverbal learning and memory, all stimulus cards are, unfortunately, easy to verbally encode. Immediate and 30-minute delay recall for the designs are assessed. A detailed set of scoring criteria for the designs was developed. Three additional subtests were added to the WMS-R, which purport to assess aspects of nonverbal (visual) memory. The Visual Paired Associates subtest was developed as an analog to the Verbal Paired Associates (word pairs) subtest. This test presents six colors, one at a time, each in association with a different design. Immediately following presentation of the six pairs, each design is presented alone and the subject is to recall the color that was paired with it. Although this subtest was developed with the intention to "minimize the role of verbal mediation in memorizing and responding to the figure-color pairs," it appears that four of the six designs can be readily verbally encoded. A 30-minute delay recall of the figure-color pairs is also assessed. The Figural Memory subtest was developed as a measure of nonverbal (visual) recognition memory. The subject is shown a set of shaded geometric designs which are difficult to verbally encode. After the designs are removed, multiple-choice recognition memory for the designs is assessed. The Visual Memory Span subtest is a variant of the Corsi Cube Test, which is itself a variant of the Knox Cube Imitation Test. (See Lezak et al., 2004, pp. 354-356 for discussion of Knox Cube and Corsi Block-Tapping tests.)
For this WMS-R subtest, eight like-colored squares are printed on a card in random order. As in Corsi's task, every time the examiner taps the squares in a prearranged sequence, the subject attempts to copy the tapping
pattern. Part two requires the subject to tap out the pattern in the reverse order. The original WMS Memory Quotient (MQ) was replaced with five composite scores intended to differentiate separate memory mechanisms. However, it should be noted that the scores bottom out at 50, so the test may generate an overestimation of memory functioning in individuals with severe memory impairments. The WMS-R is purported to provide norms for individuals aged 16 years 0 months to 74 years 11 months; however, close inspection of the technical manual reveals that the norms for the age groups 18-19, 25-34, and 45-54 were interpolated on the basis of the scores of the adjacent sampled age groups. Although the WMS-R was clearly an improvement over the original WMS, several reviews suggested there was still room for improvement (D'Elia et al., 1989; Loring, 1989; Chelune et al., 1989; Elwood, 1991). (Please refer to Lezak (1995, pp. 502-505) and Spreen and Strauss (1998, pp. 391-415) for information regarding neuropsychological findings as well as further presentation of the merits and limitations of the WMS-R.) In 1997, the new standardization of the WMS appeared (WMS-III).
WMS-III (1997)
The WMS-III is an individually administered battery of 11 subtests of learning, memory, and working memory. Six of the subtests are included in the core battery, and five subtests are considered optional/supplemental. In comparison with the WMS-R, administration time has been reduced. Administering the WMS-III core battery (i.e., six subtests) takes approximately 30-35 minutes, and administering the five supplementary scales adds another 30 minutes to the process. Most of the memory subtests include a delayed recognition procedure so that a differentiation between retrieval vs. encoding deficits can be made. The WMS-III was co-normed with the WAIS-III, and their joint factor index scores permit ability/memory comparisons. The WMS-III includes six subtests from the WMS-R, although most have been altered and
improved (Information and Orientation, Digit Span, Mental Control, Verbal Paired Associates, Logical Memory, and Visual Reproduction). The scoring sensitivities of these subtests have been generally improved by extending the floor and raising the ceiling. The WMS-R Figural Memory and Visual Paired Associates subtests have been deleted from the WMS-III battery. Four new subtests have been added to the test battery (Faces, Family Pictures, Word Lists, and Letter-Number Sequencing). A brief description of the WMS-III subtests follows. With the exception of minor changes to the wording of one question, the Information and Orientation subtest is essentially unchanged from the WMS-R version. Regarding the Logical Memory subtest, in story A, poor old Anna is still living in the same neighborhood and experiencing the same problems. One very minor change to the description of her plight was made. Regarding story B, the macho truck driver from the WMS-R was evidently fired, and a new story is offered about a man living in San Francisco who prefers watching old movies. The new story B was reportedly developed so that it would "be less likely to evoke an emotional reaction from some examinees" (p. 13).
Regarding the immediate recall condition, similar to the WMS-R, the examiner separately requests recall for story A and story B following each presentation. However, unlike the WMS-R, the WMS-III examiner presents story B a second time, to increase the likelihood of learning story B details. Immediate recall for story B is then reassessed. Following an approximately 30-minute delay, recall for stories A and B is again captured. Scoring procedures for verbatim or near-verbatim recall of the stories have been tightened; numerous examples are offered in the manual to improve inter-rater reliability. A supplemental scoring procedure has also been developed to allow examiners to characterize the subjects' "gist" or thematic recall of the stories. Following free recall for both stories after an approximately 30-minute delay, recognition memory is tested, using 30 questions that probe for details about stories A and B. The WMS-III manual does not present any standardized administration guidelines
regarding the speed (slow, medium, fast) of reading the stories to the subject nor does it comment on the intonation, pauses, or inflection of the presentation. It is known that the speed of presentation of similar prose passages can affect delayed recall performance (Shum et al., 1997). To ensure a standardized administration and improve the reliability and validity of the subtest (especially when the test is being administered by trainees, interns, etc.), a cassette-taped version of the two stories using a medium speed (Shum et al., 1997) should be considered for future versions of this subtest. The Faces subtest is new to the WMS-III but will be familiar to anyone who has used the Warrington Faces Test (Warrington, 1984) or the Denman Neuropsychology Memory Scale (Denman, 1984). The WMS-III Faces subtest uses a recognition paradigm to assess visual immediate and visual delayed memory by presenting a series of 24 faces to the subject and then showing a second series of 48 faces and requesting that the subject identify only the 24 faces previously shown. Approximately 30 minutes later, the subject is again shown the series of 48 faces and again requested to identify the 24 faces initially presented. All eight Verbal Paired Associates word pairs from the WMS-R have been replaced with novel, unrelated word pairs. The ceiling has been raised by an expanded word-pair list. The WMS-R provided for up to six learning trials; however, only four learning trials of the word-pair list are administered for the WMS-III. Following an approximately 30-minute delay, cued recall is elicited, with the examiner offering the first word of each pair and the subject expected to provide the second, or "associated," word. There is also a subsequent recognition task, where the examiner reads a list of 24 word pairs and the subject must identify each pair as either "new" or from the previous condition.
The Family Pictures subtest is new to the WMS-III, is purported to be the visual analog to the Logical Memory subtest, and requires recall of the characters, their scene activity, and spatial location. The Family Pictures subtest is a multimodality test of memory in
that the family scenes are visually presented but can also be verbally encoded. The subject is initially shown a "family portrait" card of six family members along with the family dog and told that these will be the characters on four subsequent scene cards. The subject is then shown the four cards for 10 seconds each and requested to try to recall as much about each of the scenes as possible. After all four cards have been exposed, recall for the characters in the scene, their location, and their action is elicited for each card. Approximately 30 minutes later, the recall paradigm is repeated for all four cards. The Word Lists subtest is new to the scale and is considered optional; however, it will be familiar to anyone who has administered the Rey Auditory-Verbal Learning Test (Rey, 1964) or the California Verbal Learning Test (Delis et al., 1987). The WMS-III version presents a list of 12 semantically unrelated words to be learned across four learning trials. Following the learning trials, the subject is presented with a one-trial distracter list to learn, followed by recall of the original list. The subject is told that recall of the first list will be later assessed. Approximately 30 minutes later, recall for the first list is obtained, followed by an auditory recognition protocol. The Visual Reproduction subtest, a former core WMS-R subtest, is optional for the WMS-III and consists of five design cards, some of which are easy to verbally encode. Cards A, C, and D remain from the WMS-R; however, card B has been dropped. Two additional design cards are offered, one of which is a modification of a card resurrected from the original WMS (1945). Similar to the WMS-R, immediate and 30-minute delay recall for the designs are assessed. However, unlike the WMS-R, following the delay condition, there is a 48-item recognition task, a direct copy condition, and a seven-item discrimination condition, allowing for evaluation of motor vs. memory performance.
The scoring criteria have been improved to allow for partial credit, rather than the all-or-nothing WMS-R approach. The Letter-Number Sequencing subtest is new to the WMS-III and is a measure of auditory working memory. The subject is read seven ever-increasing strings of letters and
numbers, the first starting with just a single number-letter pair and the final string containing eight elements in four pairs (e.g., 7-N-1-Q-4-V-3-O). Following each string presentation, the subject must remember the numbers and letters and then repeat them, saying the numbers first in ascending order and then the letters in alphabetical order. The WMS-III Spatial Span subtest stimuli have been modified from the WMS-R two-dimensional, eight like-colored square format to a three-dimensional, 10-block board format patterned after the WAIS-R NI Spatial Span Board (Kaplan et al., 1991). The administration is otherwise identical to the WMS-R: the subject is asked to replicate an increasingly long series of visually presented spatial locations that are tapped out on the stimulus block tops. Every time the examiner taps the blocks in a prearranged sequence, the subject attempts to copy the tapping pattern. Part two of the test requires the subject to tap out the pattern in the reverse order. The Mental Control subtest is considered an optional test and consists of eight items. The ceiling has been raised by expanded content. Counting backward by 3s has been deleted, and six new tasks have been created. As was true for the WMS (1945) but not for the WMS-R (1987), bonus points are awarded for fast, perfect performance. As a result, a possible 40 points can be generated from the eight items. The Digit Span subtest is optional on the WMS-III and is identical to the WAIS-III version. It is similar to the WMS-R version, with the exceptions that the Digit Span Forward test now begins with a two-digit sequence and ends with a nine-digit sequence. For Digit Span Backward, an eight-digit sequence has been added. The administration and scoring procedures are otherwise identical to the WMS-R.
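The Letter-Number Sequencing response rule described above (numbers first in ascending order, then letters in alphabetical order) can be illustrated with a minimal Python sketch. The function name `expected_response` and the sample string are hypothetical and chosen only for illustration; they are not taken from the WMS-III materials, and the sketch does not represent how responses are actually scored.

```python
def expected_response(items):
    """Reorder a mixed string of single digits and letters:
    digits first in ascending order, then letters alphabetically.

    Assumes each item is a single digit or a single letter.
    """
    digits = sorted(ch for ch in items if ch.isdigit())
    letters = sorted(ch.upper() for ch in items if ch.isalpha())
    return digits + letters


# Illustrative eight-element string (four number-letter pairs).
presented = ["7", "N", "1", "Q", "4", "V", "3", "O"]
print(expected_response(presented))
# ['1', '3', '4', '7', 'N', 'O', 'Q', 'V']
```

The sketch makes explicit that the task requires reorganizing the material rather than repeating it, which is why the subtest is treated as a working memory measure rather than a simple span measure.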
Only six of the WMS-III subtests are considered primary (Logical Memory, Verbal Paired Associates, Faces, Family Pictures, Letter-Number Sequencing, and Spatial Span), and all must be administered in order to calculate the WMS-III index scores. As noted earlier, the primary subtests can be administered in 30-35 minutes.
The Information and Orientation, Word Lists, Visual Reproduction, Digit Span, and Mental Control subtests are considered optional and do not contribute to the WMS-III index scores but are used to obtain supplementary information. The structure of the WMS-III is substantially different from that of the WMS-R in that the number of summary indices has been increased from five to eight and they reflect the significant revisions to the content of the test. The WMS-III provides for the calculation of eight primary index scores and four Auditory Process Composite scores. The primary indexes are considered by the publishers to be core to evaluating memory functioning, whereas the Auditory Process Composites are considered supplementary scales to better isolate and characterize various processes in memory functioning, such as single-trial learning vs. learning over trials, retention, and retrieval. The method of calculating the WMS-III index scores is also different from that for the WMS-R (see pp. 203-211, WAIS-III, WMS-III Technical Manual-Updated [Wechsler, 2002a]). There are three primary indices for auditorially processed material (Auditory Immediate, Auditory Delayed, and Auditory Recognition Delayed), two primary indices for visually processed materials (Visual Immediate, Visual Delayed), and three primary indices for multimodality memory functioning (Immediate Memory, Working Memory, and General Memory). The WMS-III and WAIS-III were co-normed, with the WMS-III randomly administered to approximately one-half of the WAIS-III standardization participants within each standardization stratification variable category. The standardization sample for the WAIS-III included 2,450 adults selected to be representative of the U.S. population of adults aged 16-89 years in 1995. From this larger sample, 1,032 adults were tested using the WMS-III protocol. This WMS-III sample was then weighted to match the 1995 census data.
The collected sample was stratified for age, gender, race/ethnicity, educational level, and geographic region. The standardization test sites were divided into four regions of the country: northeast, north central, south, and west. Each age group was required to have an average full-scale IQ score of 100. However, some age groups did not meet the IQ standardization requirement, and others did not meet other demographic criteria. Therefore, a weighted sample was derived to improve the fit to the census data and to more closely approximate an average (IQ) of 100 for each age group. The WMS-III standardization sample includes 1,250 individuals aged 16-89 years, divided into 13 age group categories (16-17, 18-19, 20-24, 25-29, 30-34, 35-44, 45-54, 55-64, 65-69, 70-74, 75-79, 80-84, 85-89). However, although the final WMS-III standardization sample is reported to have 1,250 cases, only 1,032 cases in the standardization are actually unique. Because the WAIS-III and WMS-III were co-normed, one can readily compare performance on the WMS-III with IQ. Both tests have summary indices with means of 100 and standard deviation (SD) scores of 15 points.
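Because the summary indices have a mean of 100 and an SD of 15, an index score can be re-expressed as a z score and, if one additionally assumes that index scores are normally distributed (an assumption made here purely for illustration), as an approximate percentile. The Python sketch below shows that arithmetic, together with the number of duplicated cases implied by the 1,250 reported and 1,032 unique cases. All function and constant names are hypothetical; percentiles used clinically should come from the publisher's tables rather than from this approximation.

```python
from statistics import NormalDist

INDEX_MEAN = 100.0  # mean of the summary indices, as stated in the text
INDEX_SD = 15.0     # SD of the summary indices, as stated in the text


def index_to_z(index_score: float) -> float:
    """Express an index score in SD units from the normative mean."""
    return (index_score - INDEX_MEAN) / INDEX_SD


def index_to_percentile(index_score: float) -> float:
    """Approximate percentile. ASSUMPTION: index scores are normally
    distributed; this is not taken from the test manuals."""
    return 100.0 * NormalDist(INDEX_MEAN, INDEX_SD).cdf(index_score)


# Weighted standardization sample: reported vs. unique cases (from the text).
REPORTED_CASES = 1250
UNIQUE_CASES = 1032
duplicated_cases = REPORTED_CASES - UNIQUE_CASES

print(index_to_z(85))                     # -1.0
print(round(index_to_percentile(85), 1))  # 15.9
print(duplicated_cases)                   # 218
```

The last figure shows that 218 of the 1,250 reported cases (about 17%) are duplicates of cases already in the sample.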
WMS-IIIA (2002)
The WMS-IIIA (Wechsler, 2002b) was developed to allow quick estimates of general memory functioning of individuals aged 16-89 years when extended memory testing was not indicated and/or feasible. The WMS-IIIA is basically a "carve-out" and repackaging of the WMS-III as a screening instrument to help determine whether further memory or neuropsychological evaluation is needed. Specifically, the WMS-IIIA allows reliable assessment of immediate and delayed auditory and visual memory abilities and provides "clinicians with a method for evaluating memory problems in their patients when the clinician does not intend to parse out specific memory factors or make statements regarding brain-behavior relationships" (WMS-IIIA manual, p. 30). The WMS-IIIA consists of test stimuli and normative data derived directly from the WMS-III and its original standardization. A trained examiner can usually administer the WMS-IIIA in 15-20 minutes. The WMS-IIIA consists of four subtests: Logical Memory I, Family Pictures I, Logical Memory II, and Family Pictures II. The test
stimuli, administration, and scoring procedures for the WMS-III Logical Memory subtests are identical to those for the WMS-IIIA Logical Memory subtests. Therefore, on the WMS-IIIA Logical Memory I subtest, the examinee listens to two different stories, A and B, and immediately after hearing each story is asked to retell it from memory. Story B is then read to the examinee a second time, to increase the likelihood of learning story B's details. Immediate recall for story B is then reassessed. The Logical Memory I score is based on the accuracy of the immediate recall of these stories. Without warning, approximately 30 minutes later, the subject is again asked to recall the two stories. The Logical Memory II score reflects the accuracy of the delayed recall of these stories. Identical to the WMS-III Family Pictures subtest, the WMS-IIIA Family Pictures subtest requires the subject to view four different scenes with family members for 10 seconds each. Subjects are then requested to recall as much about each of the scenes as possible. The scoring protocol is identical to that used for the WMS-III, and points are assigned for correct recall of the characters, their scene activity, and spatial location (Family Pictures I). Without warning, approximately 30 minutes later, the recall and scoring paradigm is repeated for all four cards (Family Pictures II). Using the norms tables developed for the WMS-IIIA, the raw scores for each subtest are converted to scaled scores. Each subtest performance score can therefore be expressed in terms of a standard scaled score or percentile. Three additional scores may also be calculated by summing the appropriate subtest scaled scores and converting these sums to standard scores. Specifically, the Immediate Memory Composite is composed of Logical Memory I and Family Pictures I; the Delayed Memory Composite is composed of Logical Memory II and Family Pictures II.
A Total Memory Composite score is based on the sum of the four subtest scaled scores. The WMS-IIIA composites are reported to have psychometric properties approaching those observed for the WMS-III index scores, and the WMS-IIIA overall is reported to have clinical sensitivity similar to the WMS-III. However, although
the WMS-IIIA is a short form of the WMS-III, it should not be considered a parallel form. A major limitation and difference between the two tests is that there is no recognition testing following the delayed recall procedures for the two subtests used to assess memory functioning on the WMS-IIIA. As such, clinicians using the WMS-IIIA are not able to comment on possible contributions to the scores from encoding and/or retrieval difficulties. Also, since the Logical Memory subtest primarily measures auditory verbal learning and memory and the Family Pictures subtest is a multimodality measure of memory (since the pictures can also be verbally encoded), the examiner cannot comment on possible lateralization of memory findings. To perform such analyses, a more comprehensive assessment of memory using the WMS-III and/or other combinations of memory tests, with recognition testing, would be required. The standardization sample for the WMS-IIIA, which was psychometrically derived from the original WMS-III standardization data, included 1,032 adults selected to be representative of the U.S. population of adults aged 16-89 years in 1995. As noted earlier, the original WMS-III standardization test sites were divided into four regions of the country: northeast, north central, south, and west. Participants were originally tested on the WMS-III protocol. The sample was stratified for age, sex, race/ethnicity, educational level, and geographic region. Regarding education, the sample was stratified according to five levels (≤8, 9-11, 12, 13-15, ≥16 years). Interestingly, for individuals aged 16-19 years, information regarding the educational attainment of the parent was used. This initial WMS-IIIA standardization sample was then divided into 13 age group categories (16-17, 18-19, 20-24, 25-29, 30-34, 35-44, 45-54, 55-64, 65-69, 70-74, 75-79, 80-84, 85-89). Each age group was required to have an average FSIQ of 100, but this was found not to be the case. Further, some of the age groups did not meet the 1995 population criteria on some of the stratification variables. As such, "cases were randomly selected within stratification parameters for duplication" and a "weighted sample was derived to improve the
fit to the census data as well as to more closely approximate an average (IQ) of 100 for each age group" (page 8). Therefore, although the final standardization sample is reported to have 1,250 cases, only 1,032 are actually unique. In the final WMS-IIIA standar
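The composite structure of the WMS-IIIA described above (Immediate Memory from Logical Memory I and Family Pictures I, Delayed Memory from Logical Memory II and Family Pictures II, and Total Memory from all four subtests) can be sketched in Python as follows. The function name and the example scaled scores are hypothetical. The sketch stops at the sums of scaled scores: converting those sums to standard scores requires the publisher's norms tables, which are not reproduced or approximated here.

```python
def wms_iiia_scaled_score_sums(lm1: int, fp1: int, lm2: int, fp2: int) -> dict:
    """Sum subtest scaled scores into the three WMS-IIIA composite sums.

    lm1/lm2: Logical Memory I/II scaled scores
    fp1/fp2: Family Pictures I/II scaled scores
    Returns sums only; the lookup from sum to composite standard score
    is a table in the test manual and is deliberately omitted.
    """
    return {
        "immediate_memory_sum": lm1 + fp1,
        "delayed_memory_sum": lm2 + fp2,
        "total_memory_sum": lm1 + fp1 + lm2 + fp2,
    }


# Made-up scaled scores for illustration only.
sums = wms_iiia_scaled_score_sums(lm1=10, fp1=9, lm2=8, fp2=11)
print(sums)
# {'immediate_memory_sum': 19, 'delayed_memory_sum': 19, 'total_memory_sum': 38}
```

Note that none of the three sums includes a recognition score, which mirrors the limitation discussed above: the WMS-IIIA composites cannot separate encoding from retrieval difficulties.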
RELATIONSHIP BETWEEN TEST PERFORMANCE AND DEMOGRAPHIC FACTORS
It has long been accepted that memory functions decline with age (Botwinick, 1981). Examining the normative data provided in the manuals, advancing age does appear to have a negative effect on the memory subtests of the WMS-R, WMS-III, and WMS-IIIA. The WMS-R Visual Reproduction subtest appears to be the most sensitive to deterioration with age (Ivnik et al., 1992b; Lezak et al., 2004), whereas immediate recall performance on the Logical Memory subtest remains relatively stable through middle age and then starts to decline. However, Logical Memory delayed recall performance begins to gradually decline in the 30s-50s, after which the decline accelerates (Wechsler, 1987). Lezak et al. (2004) caution that for older age groups in the WMS-R standardization, education and age are strongly negatively correlated, thereby making it difficult to disentangle their individual impact on performance. The WMS-R manual, however, notes that "the user should keep the effects of age in mind when interpreting Index scores" (p. 77) since these scores reflect performance relative to an age-peer group. Wechsler (1987) notes that an index score of 100 obtained by an older adult in his or her 70s, although average, would not reflect the same level of absolute performance when compared to a young adult in his or her 20s obtaining the same index score. The question of gender differences in performance has been more controversial. On the Visual Reproduction subtest, while some authors have reported no gender effects (Trahan et al., 1988), Ivison (1977) found that women performed slightly worse than men. The WMS-R manual reports that males and females do not significantly differ on the WMS-R indices. Moderate correlations have been generally reported between education and memory functioning (Lezak et al., 2004; Wechsler, 1987). Ivnik et al. (1992b), in examining Visual Reproduction performance, found that education was a significant variable in older adults, with better performance noted for those who were better educated. Years of education were significantly correlated with all WMS-R indexes (Wechsler, 1987). Richardson and Marottoli (1996) note that adjustment for lower levels of education is especially important when testing the elderly, since the "educational attainment of both White and non White Americans aged 75 years and older is less than 12 years (median for Whites = 11.6 years, median for non Whites = 8 years, median for all ethnic = 10.9 years)." As Stern et al. (1992) have emphasized, "utilization of standard non-education corrected normative data will frequently result in normal subjects being misclassified as impaired or demented, especially if they are older and have low education (e.g., <8 years)." The effect of IQ on subtest performance remains unclear. However, an examination of correlation tables provided in the WMS-R manual suggests modest correlations with IQ for many of the subtests.
Regarding demographic influences on WMS-III performance, the WMS-III manual does not specifically present information on the effects of sex and education/IQ on individual aspects of subtest performance; however, examination of the normative data tables does show the expected decline in performance starting in the 30s and accelerating in the 50s-70s. In further analyses of the standardization data, Haaland et al. (2003) report that the oldest groups always performed the poorest. They also noted a more significant age deterioration on the Visual Reproduction subtest than on the Logical Memory subtest for immediate, recall and recognition conditions.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the WMS-R, WMS-III, and WMS-IIIA normative reports, eight key criterion variables were deemed crucial. The first five of these relate to subject variables, and the remaining three relate to procedural variables. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Age Group Interval
Since Wechsler's original intention was to tie memory to IQ and because of the well-documented correlation between memory and age, WAIS-R age interval groupings were adopted as the standard against which to compare the WMS-R studies, whereas the age group intervals of the WAIS-III were used for the WMS-III and WMS-IIIA.
Sample Size
As Wechsler (1987) noted, 50 cases have generally been recommended (Guilford, 1965; Hayes, 1963) as providing a reliable estimate of the population mean. Following Wechsler's (1987) lead and for the purpose of review, a minimum of 50 subjects per age group interval was deemed adequate.
Reporting of IQ and/or Educational Levels
Since performance on the WMS-R and WMS-III is positively correlated with IQ and education, the means and SDs for years of education and/or estimated IQ should be reported for each subgroup, and preferably normative data should be presented by level of education and/or IQ.
Reporting of Gender
Given the equivocal evidence of an effect of gender on performance, information on the gender distribution of the sample should be reported.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, and recruitment procedures are relevant. Until determined, it is best that this information be provided.
Procedural Variables
Inclusion of a Delayed Recall Condition
To assess storage and rates of forgetting over time, the addition of a delayed recall condition to assess long-term memory is essential.
Description of Scoring Procedures
A clear statement of the method used for scoring the WMS-R, WMS-III, and WMS-IIIA is critical. Several methods are available for scoring the WMS-R (Crosson et al., 1984b; Power et al., 1979; Schear, 1986; Schwartz & Ivnik, 1980; Wechsler, 1945). It is important to keep in mind that application of any one of these scoring procedures to the same WMS-R protocol will result in a different score. Therefore, the scoring method used in the normative report must be identical to that used by the clinician, to ensure appropriate comparison.
Data Reporting
Mean and SD scores should be provided, at minimum.
With these requirements for reporting in mind, the normative studies for the WMS-R, WMS-III, and WMS-IIIA were examined.
SUMMARY OF THE STATUS OF THE NORMS
Performance on the WMS batteries is positively correlated with IQ (Lezak et al., 2004); however, reports of WMS-R, WMS-III, and WMS-IIIA comparison data which control for IQ are surprisingly unavailable. Mittenberg et al. (1992) report WMS-R scores for subjects grouped by IQ estimates within each age group interval; however, these results were also based on small sample sizes. Almost every study compensates, to some degree, for this limitation by reporting mean IQ and/or educational level for subject groups. Still, an empirical basis for judgments about the range of normal variance in memory that is attributable to variation in IQ cannot be found in the literature at this time for the WMS-R, WMS-III, or WMS-IIIA.
Seven of the eight WMS-R normative reports reviewed present data from U.S. populations (with Marcopulos et al., 1997, presenting data for low-education, rural individuals), and one study reports normative data based on an Australian sample.
There are no published data available regarding recognition testing on delay for any of the WMS-R or WMS-IIIA subtests. Without assessing delay recognition for the material not freely recalled, one is unsure whether the patient primarily has an encoding or a retrieval problem. Therefore, the WMS-R and WMS-IIIA tests should be considered as providing a limited assessment of memory functioning until recognition testing procedures are developed and data become available. Fortunately, the WMS-III corrects this problem by providing a recognition component following the delayed recall condition on most of the memory subtests.
The WMS-III and WMS-IIIA are adequately normed for ages 16-89 years; however, the WMS-III norms for recognition memory testing appear to be quite forgiving. Caution is warranted when interpreting retention rates for the WMS-III memory subtests. A high retention rate does not necessarily reflect an excellent memory. The reason is that the retention rate score does not allow one to make a distinction between the person who initially remembered a high percentage of the material and later recalled a high percentage and the person who initially remembered only a couple of elements of the material and later recalled the same couple of elements. In other words, as Lacritz and Cullum (2003) noted, "One-hundred percent retention of one word from Verbal Paired Associates still indicates poor performance on that test" (p. 525).
Problems with ceiling effects in the standardization sample on some subtests can also affect scaled score performance. For instance, Lacritz and Cullum (2003) point out that because of ceiling effects in the WMS-III standardization data for the Family Pictures and Faces subtests, subjects under age 45 must retain 97-99% of the learned information just to obtain a scaled score of 10 (i.e., 50th percentile). They also note that because almost everyone in the standardization study obtained perfect recognition scores, in general, the distribution of scores for the WMS-III Auditory Delayed Recognition Index is restricted and negatively skewed. They suggested, therefore, that this should be considered more of a deficit index than an ability index.
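The caution about retention rates can be made concrete with a minimal Python sketch. It assumes that percent retention is computed as (delayed recall / immediate recall) × 100, the savings-score form used by several studies reviewed in this chapter. The function name and the raw scores are illustrative assumptions, not values taken from the WMS-III manual.

```python
def percent_retention(immediate_recall: float, delayed_recall: float) -> float:
    """Percent retention = (delayed recall / immediate recall) x 100."""
    if immediate_recall <= 0:
        raise ValueError("immediate recall must be positive to compute retention")
    return delayed_recall / immediate_recall * 100.0


# Two hypothetical examinees (raw scores are made up for illustration).
strong_learner = percent_retention(immediate_recall=8, delayed_recall=8)
weak_learner = percent_retention(immediate_recall=1, delayed_recall=1)

# Both examinees receive the same retention value, although the second one
# initially learned only a single element of the material.
print(strong_learner, weak_learner)
```

Because the ratio discards the amount originally learned, a retention score is probably best read alongside the immediate recall score rather than in isolation.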
SUMMARIES OF THE STUDIES
This section presents critiques of the various normative studies for the WMS-R, WMS-III, and WMS-IIIA. The studies are reviewed in chronological order. Studies using the WMS-R are presented first, followed by those using the WMS-III and the WMS-IIIA. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 18. Table A18.1, the locator table, summarizes information provided in the studies described in this chapter.
[WMS-R.1] Wechsler, 1987
This is the standardization of the WMS-R. The standardization sample was "designed to represent the normal population of the United States." Each case needed for the standardization sample was prespecified according to age, sex, race, and geographic region. Educational levels of the standardization sample were "0-11 years, 12 years (high school graduation or its equivalent), and 13 years or more." The mean FSIQ estimate of the sample at each age group was 100, with SD of 15 (see Other Comments, below). Although the manual reports that the WMS-R provides "norms stratified at nine age levels" (p. 2), close inspection reveals that normative data were collected for only six of the nine age groups and that the other "normative data" have been estimated statistically by interpolation. Specifically, no data were collected for three of the nine reported age groups (18-19, 25-34, and 45-54 years). The WMS-R is a proprietary test, and the interested reader is referred to the manual for normative data presentation.
Study strengths
1. Age group intervals mirror WAIS-R intervals.
2. Sample size generally meets criteria.
3. Information regarding gender, ethnicity, and geographic region is provided.
4. Delayed recall procedures are followed for four of the nine subtests.
5. Scoring procedures are well described in the manual.
6. Data reporting includes mean and SD scores.
7. Sample composition description is adequate.
Considerations regarding use of the study
1. Normative data are not available by education and/or IQ level, although information regarding IQ and educational characteristics of the sample was described.
2. Exclusion criteria were not fully detailed.
Other comments
1. IQ scores were generally estimated. The complete WAIS-R was administered only to those 35-44 or 55-69 years old. A four-subtest short form (Vocabulary, Arithmetic, Picture Completion, Block Design) was administered to the other four age groups on which normative data were collected.
[WMS-R.2] Cullum, Butters, Tröster, and Salmon, 1990 (Tables A18.2, A18.3)
The sample is described as a group of healthy, "above average" educated, community-dwelling, older adult volunteers recruited via flyers and newspaper advertisements and screened over the telephone for neuropsychological, neurological, psychiatric, and medical disorders. The authors' stated intention was to provide "preliminary" WMS-R norms for healthy elderly individuals over age 74 since the WMS-R standardization cuts off at age 74 years. Educational level is presented for each age group: 50-70, mean = 14.4, SD = 2.7 years; 75-95, mean = 14.6, SD = 3.0 years. Delayed recall for Logical Memory, Verbal Paired Associates, Visual Reproduction, and Visual Paired Associates was assessed at 30 minutes.
Study strengths
1. Adequate exclusion criteria. Sample composition description is adequate.
2. Information regarding education is provided.
3. Delayed recall is assessed.
4. Scoring and method of administration were according to Wechsler (1987).
5. Data reporting included mean and SD scores.
6. Sample size is sufficiently large to allow cautious interpretation.
Considerations regarding use of the study
1. The age group intervals are broad; however, they are probably sufficiently narrow to allow cautious interpretation.
2. High educational level.
3. No information regarding gender.
Other comments
1. Forgetting rates (i.e., "savings scores") are also provided. Savings scores (SS) were calculated for Logical Memory, Visual Reproduction, Verbal Paired Associates, and Visual Paired Associates using the following formula: SS = (delayed recall / immediate recall) × 100.
2. The subject's last learning trial was used as the measure of immediate recall.
3. All subjects were administered the Dementia Rating Scale (DRS) (Mattis, 1976) to assess overall cognitive status.
   Ages     DRS Total Scores
   50-70    141.1 (3.7)
   75-95    139.8 (4.3)
4. The authors caution against application of the standard WMS-R index scores for individuals over age 74 because this "would likely result in an underestimation of older subject's true abilities." Therefore, the WMS-R index scores "must be used only as approximate estimates of memory function" in individuals over age 74.
[WMS-R.3] Mittenberg, Burton, Darrow, and Thompson, 1992 (Tables A18.4, A18.5)
This study provides empirically derived normative data for an age group (25-34) neglected in the WMS-R standardization. The WMS-R manual reports estimated "norms" statistically interpolated from adjacent age groups in the standardization sample. Subjects were volunteers recruited in Florida from "local businesses, evening/weekend adult education and vocational/technical classes" and screened to exclude neurological, psychiatric, and alcohol problems. The sample was designed to match 1980 U.S. Census data stratified on age, gender, ethnicity, and education. Educational level of the sample was described as 0-11 years (n = 12), 12 years (n = 20), and 13+ years (n = 18). Mean WAIS-R estimated FSIQ was 101.3 (SD = 14.58, range 72-131, median = 100). The WAIS-R FSIQ estimate was based on administration of the Vocabulary and Block Design subtests only.
Study strengths
1. Age group interval is adequate.
2. Sample size met minimal criteria.
3. Exclusion criteria are well described and sample composition description is adequate.
4. Educational level and IQ estimate of the sample are described.
5. Delayed recall procedures were according to Wechsler (1987).
6. Scoring procedures were according to Wechsler (1987).
7. Data reporting includes mean and SD scores.
8. Information is provided regarding geographic area and recruitment procedures.
Considerations regarding use of the study
1. Normative data are not reported by educational or IQ level because sample size is small relative to this purpose.
2. No information regarding gender.
Other comments
1. The authors note that the interpolated WMS-R norms presented in the manual underestimate performance on the "Attention/Concentration Index at the lower end of the score distribution and overestimate scores on the Visual Memory and Delayed Recall Indexes at the upper end of the distribution."
[WMS-R.4] Lichtenberg and Christensen, 1992 (Table A18.6)
This study, entitled "Extended Normative Data for the Logical Memory Subtest of the Wechsler Memory Scale-Revised: Responses from a Sample of Cognitively Intact Elderly Medical Patients," provides clinical comparison data for a group of cognitively intact geriatric medical patients aged 70-99 years, seeking treatment in a hospital in Detroit, Michigan, one-third of whom were being seen for hip fractures from falling, one-third for knee replacements due to arthritis, and one-third for stabilization and recovery from a lengthy illness. The sample was comprised of 25 men and 43 women (35 Caucasian, 31 African American) screened for the absence of neurological dysfunction. Information on years of education is provided for each age group interval:
   70-74 group, education = 12.0 (3.7) years
   75-79 group, education = 10.5 (2.8) years
   80-99 group, education = 11.2 (3.1) years
Although the title of this report suggests that normative data are being supplied, the authors actually provide WMS-R Logical Memory subtest clinical comparison data for a sample of cognitively intact geriatric medical inpatients. Indeed, the authors caution that "the data presented here will be best used when applied to geriatric medical patients seeking treatment in an urban medical setting" (p. 746).
Study strengths
1. Age group intervals are generally adequate; however, the 80-99 age category is broad.
2. Information regarding years of education is provided for each age group interval.
3. Delayed recall for the Logical Memory subtest is assessed according to Wechsler (1987).
4. Scoring procedures were according to Wechsler (1987).
5. Data reporting included mean and SD scores.
6. Information regarding gender, ethnicity, and geographic area is provided. Sample composition description is adequate.
Consideration regarding use of the study
1. Sample size within each age group was relatively small.
Other comments
1. Medical history ruled out suspected cerebral damage, and the Mattis Dementia Rating Scale (cutoff score = 129) was administered to ensure intact cognitive functioning.
   70-74 group, DRS = 134.7 (3.6)
   75-79 group, DRS = 134.9 (4.2)
   80-99 group, DRS = 133.3 (3.9)
[WMS-R.5] Ivnik, Malec, Smith, Tangalos, Petersen, Kokmen, and Kurland, 1992b
This study presents age-corrected (aka, age-specific) norms for the WMS-R derived from a sample of 441 cognitively normal individuals aged 56-94 who are participants in Mayo's Older American Normative Studies (MOANS). Age groups were ≤74 (n = 274) and 75-94 (n = 167). Information regarding the IQ of the sample was provided; however, valid WAIS-R Verbal IQ (VIQ), Performance IQ (PIQ), and FSIQ could be calculated only for the 274 individuals aged ≤74 since the standardization data could be obtained from the WAIS-R manual. For this group, the actual WAIS-R IQ summary data are as follows: VIQ = 106.1 (10.0), PIQ = 108.0 (11.7), and FSIQ = 107.3 (10.6). The MAYO values for the entire sample of 441 are as follows: VIQ = 105.5 (10.0), PIQ = 107.3 (11.4), and FSIQ = 106.6 (10.5).
The MOANS recruits participants from two ongoing research projects at the Mayo Clinic in Rochester, Minnesota. One of the goals of research project 1 is to obtain age-specific norms on traditional and experimental cognitive tests. Potential normal volunteers were captured by sampling all recent medical examinations performed at Mayo's Department of Community Internal Medicine (CIM). A participant was deemed "normal" if he or she was able to function independently and lacked any active neurological or psychiatric conditions that might compromise cognitive status. All subjects had a thorough medical screening; however, chronic medical illness was not an exclusion criterion. For instance, persons with diabetes, cardiac problems, and hypertension were included in the "normal" sample. Individuals deemed normal by chart review were added to a possible participants list. Participants culled from research project 1 were willing to undergo a 4-hour outpatient clinic visit, during which a battery of neuropsychological tests was administered.
Research project 2 is a component of the Mayo Clinic's ongoing Alzheimer's Disease Patient Registry. This project follows up all newly diagnosed persons with dementia who present for examination at Mayo's Department of CIM. Every patient is demographically matched to a normal control. The normal control is obtained by recruiting individuals who present for a general medical examination to any CIM internist. To be considered "normal," the individual must have a Mini-Mental State Exam (Folstein et al., 1975) score >24, and the internist must satisfy "him or herself that the patient is indeed cognitively normal" (Ivnik et al., 1990). These individuals also received a neurological examination. Data from a potential recruit were reviewed by a team, which included a geriatrician, neurologists, and neuropsychologists. These individuals had also been administered a battery of neuropsychological tests, but the results of this evaluation were not used in determining the normality of the potential subject. The criteria for normality were otherwise the same as for project 1.
Potential participants were randomly solicited from projects 1 and 2, and approximately 34% agreed to participate. A further exclusion criterion was inability to complete any neuropsychological test that was administered, regardless of reason. The MOANS normative sample consists of primarily well-educated (high school or greater) Caucasian adults living in Rochester, Minnesota, and the immediately surrounding, "primarily agricultural" communities of Olmsted County.
Study strengths
1. Age group intervals are generally adequate, with the exception of the table listing data for an 88-year midpoint. (The age group interval for this midpoint value table is actually 83-94 years.)
2. Sample size is adequate.
3. Sample composition description is well detailed in the current and prior reports (Ivnik et al., 1990, 1991, 1992a; Malec et al., 1992).
4. Reporting of IQ and educational levels was detailed.
5. Delayed recall conditions were administered according to the WMS-R manual.
6. The scoring procedure is well described according to standardized procedures detailed in the test manual.
7. Adequate exclusion criteria with the exception of chronic medical illness.
Considerations regarding use of the study
1. Data reporting does not include presentation of mean and SD scores. Rather, tables are provided which convert WMS-R raw scores to percentile ranks and age-specific scaled scores for midpoint ages occurring at 3-year intervals from 61 through 88. The age range around each midpoint age is ±5 years.
Other comments
1. The sample consists of primarily well-educated, urban, Caucasian adults, which should be considered when using these data.
2. One of the purposes of this study was to extend the norms on the WMS-R above age 74. To accomplish the extension, data were collected on normal volunteers aged 56-74. Data collected from this younger sample were then compared to the actual 56-74 WMS-R norms available in the test manual to determine how the two groups differed. Knowing how the younger sample differed allowed Ivnik and colleagues to "correct" the norms collected from their 74-94 sample so that they would correspond with what probably would have been the WMS-R norms for those 74-94 years old had they actually been collected during the WMS-R standardization. In other words, this research provides a statistically derived estimate of probable WMS-R values for individuals older than age 74. The differences noted in performance values between the WMS-R standardization sample and the MOANS sample most probably reflect the different sampling procedures employed in the two studies as well as differences in sampling national vs. regional populations.
   MAYO procedures convert all WMS-R subtest raw scores to age-corrected, normalized scaled scores "before any other algebraic or tabular conversions" are undertaken. Regarding the "normalizing" procedure, each raw score was first converted to a percentile rank "based upon the actual cumulative percent distribution." Each percentile rank was then converted to a scaled score (M = 10, SD = 3). Within each age group, "the percentile range encompassed by each scaled score was set so that the resultant distribution of scaled scores was as normal as possible," resulting in age-corrected scaled scores. In determining age-corrected MOANS WMS-R summary indices, linear transformations of the sums of MOANS scaled scores were computed to convert the distribution to the accepted WMS-R standard (i.e., M = 100, SD = 15). The resulting summary scores are referred to as MAYO Verbal Memory Index, MAYO Visual Memory Index, MAYO General Memory Index, MAYO Attention/Concentration Index, MAYO Delayed Recall Index, and MAYO Percent Retention Index. The MAYO Percent Retention Index is new and computationally derived.
3. A major assumption of Ivnik and colleagues is as follows:
   persons above age 74 who agreed to serve as normal volunteers are demographically similar to those age 56 through 74, since they were drawn from the same population via identical sampling procedures. We further assume that the differences which exist between our younger sample and the [WMS-R] national sample would be similar for the persons above age 74 if we were able to compare these older persons to a national normative sample of like age. (p. 5)
4. The administration and scoring procedures for the MAYO system are identical to the WMS-R with the exception of the Verbal and Visual Paired Associates I subtests. Specifically, for the Verbal and Visual Paired Associates I subtests, if the criterion was not reached by the third learning trial, additional trials were not administered as required by the WMS-R manual. As a result, the MAYO Delayed Recall Index is not directly comparable to the WMS-R Delayed Recall Index.
5. Since Ivnik and colleagues are applying the same norming procedure to each test in their extensive neuropsychological battery, this should enhance the comparability of test scores across tests.
6. MAYO indices differ from WMS-R indices, even for those aged 56-74 years. Therefore, if you are evaluating someone who was previously tested at age 74 or younger and is now above age 74, you will need to convert the prior exam scores to MOANS equivalents in order to compare old vs. current performance.
7. MAYO and Wechsler summary scores are not interchangeable. Ivnik et al. (1993) note that the difference can be as large as ±17 points.
Due to the innovative nature and the voluminous amount of the normative data, they are not reproduced here. The reader is referred to the original article.
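The normalizing steps described in comment 2 above (raw score to percentile rank, percentile rank to scaled score with M = 10 and SD = 3, and sum of scaled scores to an index with M = 100 and SD = 15) can be sketched in Python as follows. This is only an illustration under stated assumptions and not the MOANS procedure itself: the original authors set the percentile range for each scaled score by hand, whereas the sketch substitutes an inverse normal transformation. The function names, the midpoint treatment of tied scores, the clamping of extreme percentiles, and the raw scores are all hypothetical.

```python
from statistics import NormalDist, mean, pstdev
from typing import Sequence


def percentile_rank(raw: float, age_group_scores: Sequence[float]) -> float:
    """Cumulative percent at `raw` within one age group (ties counted at half weight)."""
    below = sum(1 for score in age_group_scores if score < raw)
    ties = sum(1 for score in age_group_scores if score == raw)
    return 100.0 * (below + 0.5 * ties) / len(age_group_scores)


def scaled_score(percentile: float) -> int:
    """Map a percentile rank to a normalized scaled score (M = 10, SD = 3)."""
    # Assumption: clamp so the inverse normal CDF stays finite at 0 and 100.
    p = min(max(percentile, 0.1), 99.9) / 100.0
    z = NormalDist().inv_cdf(p)
    return round(10 + 3 * z)


def index_score(sum_of_scaled: float, reference_sums: Sequence[float]) -> float:
    """Linearly rescale a sum of scaled scores to the index metric (M = 100, SD = 15)."""
    m, sd = mean(reference_sums), pstdev(reference_sums)
    return 100 + 15 * (sum_of_scaled - m) / sd


# Hypothetical raw scores for one age group (not MOANS data).
age_group = [12, 15, 15, 18, 20, 22, 22, 25, 27, 30]
pct = percentile_rank(22, age_group)
print(pct, scaled_score(pct))
```

Working through percentile ranks first, rather than z-scoring the raw scores directly, is what allows a skewed raw-score distribution to yield approximately normal scaled scores.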
[WMS-R.6] Richardson and Marottoli, 1996 (Table A18.7)
This study provides education-corrected normative data for the performance of adults aged 75-91 years on commonly administered neuropsychological tests, including the WMS-R. The sample consists of a subset of 101 adults (53 males, 48 females) recruited from a larger longitudinal study (Project Safety, n = 1,103), aged 76-91 years, who were active drivers, free from neurological and psychiatric disease, and living independently in an urban community in the northeastern United States (e.g., New Haven, CT). Educational levels were reported as follows:
   Age            Education       Sample size
   76-80 years    10.44 (3.86)    50
   81-91 years    11.59 (3.45)    51
   Range for entire sample: 4th grade through college
Sample sizes for age/education categories are as follows:
   Age      Education     Sample size
   76-80    <12 years     26
   76-80    ≥12 years     24
   81-91    <12 years     18
   81-91    ≥12 years     33
Study strengths
1. Age group intervals are adequate.
2. Adequate exclusion criteria. The sample description was adequate.
3. Education for the sample was reported.
4. Delayed recall was assessed for the Logical Memory and Visual Reproduction subtests.
5. Description of the administration and scoring procedures was according to the WMS-R test manual.
6. Data reporting included mean and SD scores.
7. Information regarding gender and geographic area is reported.
Consideration regarding use of the study
1. Sample size is adequate for data reported by age category; however, it becomes quite small when data are presented by age and education.
Other comments
1. Logical Memory subtest data for the less educated 81-91 group should be used with caution due to small sample size and the finding that the norms deviated significantly from normal when compared to other available normative data.
2. The authors note that since their sample was not medically screened and individuals were not ruled out if they had more common medical diseases (e.g., hypertension and diabetes), the "data should be viewed as reflecting the 'normative' population of urban drivers older than age 75 rather than the performance of 'normal' individuals."
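The sample-size consideration noted for this study can be checked mechanically. The minimal Python sketch below applies the chapter's criterion of at least 50 subjects per age group interval to the cell sizes reported above for Richardson and Marottoli (1996). Extending the same threshold to the age-by-education cells is an assumption made here for illustration, and the function and variable names are hypothetical.

```python
MIN_PER_CELL = 50  # minimum subjects per age group interval adopted in this chapter

# Cell sizes as reported for Richardson and Marottoli (1996).
by_age = {"76-80": 50, "81-91": 51}
by_age_and_education = {
    ("76-80", "<12 years"): 26,
    ("76-80", ">=12 years"): 24,
    ("81-91", "<12 years"): 18,
    ("81-91", ">=12 years"): 33,
}


def undersized_cells(cells, minimum=MIN_PER_CELL):
    """Return only the cells whose sample size falls below the minimum."""
    return {label: n for label, n in cells.items() if n < minimum}


# Age groups alone meet the criterion; every age-by-education cell falls short.
print(undersized_cells(by_age))
print(undersized_cells(by_age_and_education))
```

The same check could be run on any study in the locator table once its cell sizes are entered in this form.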
[WMS-R.7] Marcopulos, McLain, and Giuliano, 1997 (Tables A18.8-A18.11)
The goal of this study was to develop normative data for rural, community-dwelling adults aged 55 and older with no more than 10 years of formal education. Participants were administered a battery of nine neuropsychological tests, which included the WMS-R. The data were obtained from a biracial sample (n = 133; white = 64, African American = 69) of nondemented, healthy adults aged 55 and older (mean age = 76.48, SD = 7.87, range unknown), who attended school for 10 years or less (mean education = 6.65 years, SD = 2.14, range 0-10 years) and were educated and primarily lived in a rural community setting (central Virginia). Subjects were excluded if they reported the presence of a chronic or severe psychiatric disorder, history of extensive psychotropic drug use, long-term substance abuse, neurological disease, electroconvulsive therapy, head injury, or loss of consciousness. Subjects were paid $~ and received a certificate of appreciation for completing a 2-hour test battery, which was administered in one session. Data were reported by age and again by age/education. Age group categories were as follows: 55-64, 65-74, 75-84, and 85+. The upper range of the oldest age category is not known. Education data were reported by age category as follows: 0-4, 5, 7, and 9-10 years. The sample contains primarily females.
Study strengths
1. Age group intervals are generally adequate.
2. Adequate exclusion criteria. Sample composition description is adequate.
3. Education data were reported and grouped by age category.
4. Delayed recall for the Logical Memory and Visual Reproduction subtests is assessed.
5. Data reporting included mean and SD scores.
6. Information on gender, ethnicity, and geographic area is provided.
Considerations regarding use of the study
1. Sample size is generally too small to allow valid use of the normative data.
2. Scoring procedures were not specifically described but presumably followed the protocol in the WMS-R manual.
3. The sample is primarily female; therefore, caution is suggested when applying the data to males.
Other comments
1. The percent retention as a measure of the rate of forgetting (i.e., savings score) was calculated using the following formula: Savings score % = (delayed recall / immediate recall) × 100. The authors found the savings score to be relatively impervious to the effects of age and suggested that it may be "the most sensitive and specific indicator of abnormal memory functioning."
2. The authors highlight the fact that most of the normative data in the literature are based on urban, high school-educated, white adults. They caution that "the use of extant norms with lower educated, rural-dwelling, older adults can overestimate degree of cognitive impairment." Since 26% of the elderly live in nonurban areas and rural individuals tend to be disadvantaged with regard to education, the authors urge the development of more normative data for this group.
[WMS-R.8] Shores and Carstairs, 2000
One of the goals of the Macquarie University Neuropsychological Normative Study (MUNNS) was to develop WMS-R normative data for adults aged 18-34 living in Australia. Participants were administered the WMS-R as part of a larger battery of neuropsychological tests that were also being normed for Australian populations. The WMS-R was administered and scored according to the methods described in the standardization manual. The only deviation from standard procedure was that Logical Memory subtests I and II were administered with the Australian idiom stories reported by Ivison (1993). The data were obtained from a sample of 399 healthy adults (193 males, 206 females) living in the Sydney metropolitan area. The methodology and sample characteristics were described in detail in an earlier report by these researchers (Carstairs and Shores, 2000). Data were reported by age/educational level and again by gender. Education data were reported by age category as follows: <12 (n = 91), 12 (n = 91), and >12 (n = 217) years.
Study strengths
1. Adequate exclusion criteria and sample composition description per Carstairs and Shores (2000).
2. Sample size is adequate.
3. Education data were reported and grouped by age category.
4. Delayed recall for the Logical Memory and Visual Reproduction subtests is assessed.
5. Information on gender and geographic area is provided.
6. The administration and scoring criteria follow the WMS-R manual. Exceptions to the standard administration are appropriately referenced.
Considerations regarding use of the study
1. Age group interval is too broad.
2. Data reporting did not include mean and SD scores; however, raw scores and their corresponding scaled score equivalents can be obtained from the tables.
Other comments
1. The authors present several normative tables regarding various aspects of the WMS-R in their excellent article, and the interested reader is referred to this document for Australian normative information.
[WMS-III.1] Wechsler, 1997
This is the standardization of the WMS-111. The normative information was generated from a national sample stratified by age, sex, race/ethnicity, educational level, and geographic region, representative of the 1995 U.S. population. The data are a subset (n = 1,250) of a larger stratified sample collected while norming the WAIS-III (n = 2,450). Subjects were located "primarily through the use of marketing research firms in 28 U.S. cities in the Northeast, North Central, South, and West regions." Subjects were recruited by random telephone calls, newspaper ads, and flyers posted at senior centers and various community organizations. Several independent examiners recruited and tested additional subjects. Standardization sampling sites (marketing researchers and independent researchers) were located in 47 of the 50 states (Hawaii, Utah, and North Carolina were evidently not included). All participants were paid for taking the tests. Participants were excluded if they were color-blind, had uncorrected hearing loss or visual impairment, were in current treatment for drug or alcohol dependence or consumed more than three alcoholic beverages on more than two nights a week, were taking any psychiatric medications, were seeing any professional for thinking or memory problems, had a disability that would affect motor performance, reported a history of unconsciousness for ~5 minutes, or had a history of any medical or psychiatric condition that could potentially alter cognitive functioning (see WAIS-III,
WMS-III Technical Manual-Updated [Wechsler, 2002a], p. 21 for full details). Performance scores are reported for 11 age groupings in the range 16-79 years. Each of these age group categories contains data on 100 participants. The two oldest age groupings in the range 80-89 years have a sample size of 75. This is a proprietary test, and the normative data are available in the test manual. Scaled scores can be converted to percentiles by examining Table E.1 on page 200 of the WMS-III Administration and Scoring Manual (Wechsler, 1997). Demographically adjusted norms are also available when using the computerized scoring program that is available from the test publisher.
Study strengths
1. Age group intervals are adequate.
2. Sample size met criteria.
3. Information on education, gender, ethnicity, geographic area, and recruitment procedures is provided.
4. Delayed recall and recognition procedures are standard on all appropriate subtests.
5. Scoring procedures are well described in the manual.
6. Adequate exclusion criteria. Sample composition description is adequate.
Considerations regarding use of the study
1. Data reporting did not include mean and SD scores; however, raw scores and their corresponding scaled score equivalents can be obtained from the tables.
Other comments
1. Education is positively correlated with performance on memory tests (Lezak et al., 2004). In 2002, the publisher of the WMS-III made available age/education-corrected norms; however, the age/education-corrected norms may be obtained only by purchasing a computerized scoring program for the WAIS-III/WMS-III (see Appendix 1 for ordering information). Data are not available by age and IQ; however, some comparisons can be made with the WAIS-III since the two tests were co-normed.
2. The WMS-III battery, similar to the WMS and WMS-R, is designed so that individual subtests can be administered, scored, and interpreted as needed.
3. For ages 16-64, an equal number of males and females participated; however, for ages 65-89, the proportions reflected 1995 census data.
[WMS-IIIA.1] Wechsler, 2002b
As noted earlier, the WMS-IIIA is basically a "carve-out" and repackaging of the WMS-III as a screening instrument to help determine whether further memory or neuropsychological evaluation is needed. As such, the WMS-IIIA uses Logical Memory and Family Pictures subtest stimuli that are identical to those used in the WMS-III. The scoring protocols are also identical to those of the WMS-III. Since the norms for this short-form version were carved out of the original WMS-III standardization, details of the standardization sample will not be repeated here. Please refer to the descriptions provided above for the WMS-III for details regarding the standardization. This is a proprietary test, and the normative data are available in the test manual.
Study strengths
1. Age group intervals are adequate.
2. Sample size met criteria.
3. Information on education, gender, ethnicity, geographic area, and recruitment procedures is provided.
4. Delayed recall procedures are standard on the subtests.
5. Scoring procedures are well described in the manual.
6. Adequate exclusion criteria. Sample composition description is adequate.
Considerations regarding use of the study
1. Data reporting did not include mean and SD scores; however, raw scores and their corresponding scaled score equivalents can be obtained from the tables.
2. No recognition testing following the delayed recall condition.
Other comments
1. Although education is positively correlated with performance on memory tests (Lezak et al., 2004), WMS-IIIA normative data are not available by age and education. Data are also not available by age and IQ; however, some comparisons can be made with the WAIS-III since the two tests were co-normed.
2. For ages 16-64, an equal number of males and females participated; however, for ages 65-89, the proportions reflected 1995 census data.
CONCLUSIONS
There are currently three published versions of Wechsler's memory battery that are commercially available: WMS-R (1987), WMS-III (1997), and WMS-IIIA (2002). The strengths and considerations regarding use of the extant comparison data for the various versions of the test battery are summarized below.¹
¹ Meta-analyses were not performed as this review was not intended to summarize all of the voluminous literature available on this test. Conversely, comprehensive sets of norms are available in the test manuals for all versions of the WMS.
WMS-R (1987)
The WMS-R has remained popular since its introduction over 15 years ago, and WMS-R subtests have continued to be included in some research protocols, which ensures that we will continue to see data using this test for some years. However, given the improvements in assessment achieved with the WMS-III, the WMS-R should be phased out and replaced with the WMS-III in clinical settings. Regarding norms for the WMS-R, for Americans of average intelligence in the age ranges 16-17, 20-24, 35-44, 55-64, 65-69, and 70-74, the data provided with the WMS-R should receive the strongest attention. The data presented for age groups 18-19, 25-34, and 45-54 most probably should be used only with appropriate caution since the values were estimated by statistical interpolation. Although "test standardization of necessity involves interpolation between points on a continuous variable like age" (Bowden & Bell, 1992, p. 34), granting permission to a commercial test developer/distributor to interpolate data across decade increments seems unjustified. We feel strongly that reliance on such estimated normative data for interpretive purposes raises professional concerns. Highlighting our concern, Mittenberg et al. (1992) provided empirically derived provisional norms for the 25-34 age group. Differences between these norms and the WMS-R published index scores appear to be clinically significant. (With the more recent development of the WMS-III, the publishers noted that no data were interpolated in the revised standardization.) The WMS-R normative data developed by Ivnik et al. (1992b) are also recommended as long as the demographics of Ivnik's participants match closely the demographic characteristics of the subject under study. Despite the fact that educational level was significantly related to test performance in the standardization sample, WMS-R normative data stratified by age and education, as well as data for various ethnic and cultural groups, are lacking. Although four of the nine subtests of the WMS-R require delayed recall of the material presented, a recognition testing format was not provided, making it impossible to determine whether a poor recall score is due to an encoding problem or a retrieval problem or both. The interested reader is referred to Fastenau (1996a), who has developed multiple-choice recognition stimuli for the WMS-R Logical Memory and Visual Reproduction subtests.
Over 15 years have passed since the introduction of the WMS-R, and considerable research has accumulated regarding its utility with various clinical populations. Although the WMS-R is a better instrument than the original WMS, the WMS-III should be considered the preferred version of the test for the assessment of memory functions. Research using the WMS-R is winding down and should soon cease.
Therefore, clinicians and researchers are encouraged to use the WMS-III. However, it still may be necessary to occasionally use sections of the WMS-R when
evaluating a patient previously assessed on this older form of the test.
WMS-III (1997) and WMS-IIIA (2002)
The larger size of the normative data sample (n = 1,250, aged 16-89) for the WMS-III compared to the WMS-R represents a significant improvement. The WMS-R was normed on only 316 subjects aged 16-74. Because the WMS-III was co-normed with the WAIS-III, one can now perform analyses of difference scores as well as more sophisticated test pattern analyses. The WMS-III/WMS-IIIA standardization norms provided in the manual appear to be adequate for ages 16-89, especially if applied to individuals born and educated in the United States. However, although recognition memory testing has been added to several of the WMS-III subtests, this is an area that needs further improvement in terms of test development and norms. Since education has been consistently found to enhance WMS test performance, future normative studies involving the WMS-III and WMS-IIIA should report data by age/education categories in the test manual. Relative to earlier editions, the WMS-III definitely reflects the improvements in psychometrics and advances in scientific and clinical understanding of learning and memory function and dysfunction that have accrued over the years. However, we do look forward to the WMS-IV.
19 List-Learning Tests
REY AUDITORY-VERBAL LEARNING TEST
The Rey Auditory-Verbal Learning Test (Rey AVLT) has been extensively used to evaluate memory functioning in normal samples and in a variety of clinical samples representing different medical and psychiatric conditions. The test was introduced by the Swiss psychologist Andre Rey (1941) as a measure to assess the inconsistency in relative performance on the recall task vs. the recognition task. The test was described in English by Taylor (1959). The English translation of the Rey AVLT does not correspond exactly with the original French version: three French words were substituted in the translation (bell for belt, moon for sun, and nose for mustache). Later, Rey (1964) modified the procedure to include five free recall trials and a recognition task. Contemporary versions of this test (described in Lezak, 1995; Lezak et al., 2004) include an interference trial (first introduced by Taylor, 1959) and a postinterference recall trial; the subsequent additions are the delayed recall and delayed recognition trials (Lezak, 1995; Lezak et al., 2004; Spreen & Strauss, 1991, 1998).
Variability in Administration of the Rey AVLT
Due to its usefulness in detection and identification of faulty memory mechanisms, the Rey AVLT has gained remarkable popularity among clinicians. Interpretations of clinical data, however, are obscured by considerable variability in administration of the test. There is very little uniformity in various procedural aspects.
1. The administration procedure varies widely. The standard administration includes five successive presentations of the original list of 15 words, followed by free recall on each trial; an interference trial, involving presentation and free recall of another list of 15 words; postinterference free recall of the words from the original list; and a recognition trial (Lezak et al., 2004). A number of studies have also utilized delayed recall and delayed recognition trials (Lezak et al., 2004). In contrast to this standard, some studies use a different number of recall trials, which varies from three (White, 1984) to six (Madison et al., 1986). The interference trial is omitted in several studies (Miceli et al., 1981; Bolla-Wilson & Bleecker, 1986; Bleecker et al., 1988). Uzzell and Oler (1986) omitted the postinterference recall trial and presented recognition immediately following recall of the interference list. Squire
and Shimamura (1986) and Shimamura et al. (1987) modified their procedure to present a recognition trial after each of the five acquisition trials, without using free recall trials. Shimamura et al. (1987) presented the words in a different order on each acquisition trial.
2. The format of the recognition trial varies widely. Rey (1964) and Lezak et al. (2004) described a story format in which all 15 words from the original list are embedded. The original story described by Rey contained twice as many distractor nouns as the one being used in current studies. According to Rey (1964), the story was to be read to the subject, who was instructed to stop the examiner when a word was recognized. The story described by Lezak was to be read by the subject with the instructions to circle recognized words. In addition to the story format, Lezak described other versions of the recognition trial, consisting of lists of 50 words which include words from list A, the interference list, and 20 words phonemically and/or semantically similar to the words from both lists presented to the participants. Presentation of the lists can be either auditory or visual. It should be noted that other versions of the recognition list have been proposed. For example, Ivnik et al. (1987, 1990, 1992c) used a 30-word list for recognition. The order of administration of the recognition trial also varies between studies. The recognition trial is administered after the postinterference recall trial, after the delayed recall trial (with varying delay intervals), or after both postinterference and delayed recall trials, which influences performance on the last recognition trial.
3. The interval for the delay varies from 15 minutes (Miceli et al., 1981) to 60 minutes (Ivnik et al., 1987). Geffen et al. (1990, 1994), Selnes et al. (1991), and Miller (personal communication) used a 20-minute delay, whereas Ivnik et al. (1990, 1992c) and Savage and Gouvier (1992) used a 30-minute delay. These differences in delay intervals should be taken into consideration, even though Lezak et al. (2004) report a minimal decline in recall over a 30-minute interval. In addition, the delay interval is filled with various activities in different studies, which may influence performance on the delayed recall and recognition trials.
4. The rate of presentation differs across studies. According to Rey (1964) and Taylor (1959), each word should be separated by a 1-second interval. Rey also suggested recording the number of words remembered in each 15-second block to provide an indication of the rhythm of recall. According to Lezak et al. (2004), the rate of presentation should be one word per second. Some authors have used a slower rate of presentation, e.g., one word every 2 seconds (White, 1984).
5. Time allowed for recall differs between studies. According to Rey's (1964) instructions, the first presentation of the list should be followed by recall within 60 seconds, and on subsequent acquisition trials, 90 seconds should be allowed for recall. The majority of investigators, however, allow unlimited time for recall. This aspect of administration is not usually described by the authors. The degree of encouragement on the part of the examiner to elicit maximal effort from the subject also varies between the studies.
6. The extent of feedback about the subject's performance on each trial might influence the results. According to Rey (1964) and Taylor (1959), the feedback had to be given to the subject each time the word was repeated within one trial. Rey also instructed test administrators to provide the subject with the feedback on the number of words recalled on each trial and to inform the subject on the fifth recall trial that this is the last trial. The current standard administration does not follow these guidelines. According to Lezak et al. (2004), the examiner should not volunteer information about repetition of the same words
within one trial unless this information is solicited by the subject because this might cause distraction.
In view of the variability in administration procedures for the Rey AVLT and considering the usefulness of the test and extensive research database, Schmidt (1996) published a handbook which summarizes the current status of the administration, scoring, and normative resources for the Rey AVLT. The author outlines the historical development of the test, the role of demographic factors on test performance, and statistical properties with respect to different clinical groups; describes alternate forms and test-retest comparisons; addresses issues of qualitative and quantitative interpretation of the results; and suggests administration instructions, which are derived from the instructions described by Lezak (1976, 1983) and Spreen and Strauss (1991). The review includes summaries of normative data for basic scores, additional scores, and derived indices for different age groups (children, adolescents, adults, and elderly) in the context of reviews of the empirical studies from which these data are gathered. In addition, the author provides metanorms, which are derived by calculating a pooled mean and standard deviation (SD) for the relevant studies. The handbook contains test forms and stimulus sheets which are reprinted from the literature. Peaker and Stewart (1989) also published a review of the Rey AVLT. Additional information on the history and procedural aspects of the Rey AVLT is provided in Lezak et al. (2004) and Spreen and Strauss (1998).
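The pooling of means and SDs across studies mentioned above can be illustrated with a short sketch. The exact pooling formula used for the metanorms is not given here, so the version below is an assumption: it treats the studies as subsamples of one combined sample, so the pooled variance reflects both within-study and between-study variability. The function name and the example values are hypothetical.

```python
from math import sqrt


def pooled_mean_sd(studies):
    """Combine per-study (n, mean, sd) triples into one pooled mean and SD.

    Assumption: studies are treated as subsamples of a single combined
    sample, so the pooled sum of squares has a within-study part,
    (n - 1) * sd**2, and a between-study part, n * (mean - grand_mean)**2.
    """
    total_n = sum(n for n, _, _ in studies)
    if total_n < 2:
        raise ValueError("need at least two observations in total")
    grand_mean = sum(n * m for n, m, _ in studies) / total_n
    sum_sq = sum(
        (n - 1) * sd ** 2 + n * (m - grand_mean) ** 2 for n, m, sd in studies
    )
    return grand_mean, sqrt(sum_sq / (total_n - 1))


# Hypothetical summary statistics (n, mean, SD) for one score in two studies.
example_studies = [(10, 10.0, 2.0), (10, 12.0, 2.0)]
pooled_mean, pooled_sd = pooled_mean_sd(example_studies)
print(round(pooled_mean, 2), round(pooled_sd, 2))
```

A pooled SD computed this way grows when study means differ, which is one reason pooling is usually restricted to studies with comparable samples and administration procedures.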
Functioning of Different Memory Mechanisms, as Assessed by the Rey AVLT
The advantage of the Rey AVLT is that it not only provides a measure of rote verbal memory but also allows partitioning of memory processes into their components and identifying faulty memory mechanisms that lead to memory loss. The Rey AVLT was reported to load on a verbal learning and memory factor in a
factor-analytic study conducted by Ryan et al. (1984). Vakil and Blachstein (1993) factor-analyzed the Rey AVLT performance of 146 normal participants. The basic factors extracted were acquisition and retention. The latter factor was further subdivided into storage and retrieval. This information suggests that verbal memory functioning can be subdivided into separate mechanisms, which is consistent with empirical data accumulated in the field of experimental psychology. The integrity of these mechanisms is reflected in the different indices derived from Rey AVLT performance. Recall on trial I represents immediate memory span for words. In most cases, immediate memory span for words is expected to be consistent with the immediate memory span for digits. A discrepancy between recall on trial I and digit span in favor of digit span can be attributed to information overload (Geffen et al., 1990). Similarly, recall on trial I is expected to be roughly similar to recall on the interference trial because both present equivalent word lists for the first time. Superiority in recall on trial I over the interference trial may indicate the effects of proactive interference on the latter. The opposite pattern might suggest difficulties in changing response set (Lezak et al., 2004). Furthermore, comparison of recall on trial V vs. the postinterference trial provides a measure of retention for newly learned information and its loss due to retroactive interference. Both proactive and retroactive interference are thought to reflect executive and/or memory dysfunction (Blusewicz et al., 1996; Gershberg & Shimamura, 1995; Kareken et al., 1996; Paulsen et al., 1995a; Torres et al., 2001; Yehuda et al., 1995). Fleming et al. (1995), Kareken et al. (1996), and Torres et al. (2001) found increased susceptibility to retroactive interference in schizophrenic patients, which was attributed by Torres et al. (2001) to the frontally mediated central executive dysfunction.
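The two comparisons described above (trial I vs. the interference trial, and trial V vs. the postinterference trial) can be expressed as simple difference scores. The sketch below is illustrative only: the function name, the argument names, and the choice of raw differences rather than ratios are assumptions, and the example scores are hypothetical.

```python
LIST_LENGTH = 15  # number of words on each Rey AVLT list


def interference_indices(trial_1, trial_5, list_b, post_interference):
    """Return difference scores for the two interference comparisons.

    proactive:   recall on trial I minus recall on the interference list;
                 a positive value means trial I recall was higher.
    retroactive: recall on trial V minus postinterference recall;
                 a positive value means words were lost after interference.
    """
    for score in (trial_1, trial_5, list_b, post_interference):
        if not 0 <= score <= LIST_LENGTH:
            raise ValueError("recall scores must be between 0 and 15")
    return {
        "proactive": trial_1 - list_b,
        "retroactive": trial_5 - post_interference,
    }


# Hypothetical protocol: 7 words on trial I, 13 on trial V,
# 5 on the interference list, 10 on postinterference recall.
indices = interference_indices(
    trial_1=7, trial_5=13, list_b=5, post_interference=10
)
print(indices)
```

Whether a given difference is clinically meaningful depends on the normative data for these indices, which this sketch does not supply.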
Change in performance over five acquisition trials indicates a learning curve, and its slope provides a measure of verbal learning. Verbal learning over five acquisition trials is also commonly identified as the recall difference between trial V and trial I. Some authors
define learning as the difference between the highest recall on any trial and recall on trial I (Query & Megran, 1984). Analysis of the recall pattern might shed light on the use of learning strategies and organization of the material. Mungas (1983) introduced a measure of consistency of sequential organization, reflecting the degree to which pairs of words that are recalled consecutively on one trial are recalled consecutively on the next trial. Woodard et al. (1999a) used analytical decomposition of the learning curve over five acquisition trials to examine the relative contribution of acquisition and consolidation mechanisms to the multitrial learning deficit in patients with Alzheimer's disease. This method yields measures of gained items and lost items across consecutive learning trials, which are expressed as gained access (acquisition) and lost access (consolidation failures that lead to rapid intertrial forgetting). The authors assert that this method is useful in uncovering the cognitive processes underlying learning deficits in persons with memory disorders and in characterizing potential areas for remediation. Serial position effects provide information on the functioning of specific memory mechanisms. Primacy and recency effects in the recall pattern provide another index of vulnerability to proactive and retroactive interference. These indices are useful in distinguishing amnesias affecting encoding mechanisms (in which case the predominance of the recency effect would be evident) and those conditions which compromise the efficiency of recall but spare basic encoding processes (demonstrating intact primacy and recency effects). For example, Tierney et al. (1994) reported greater recency than primacy effects in Alzheimer's patients, whereas Parkinson's dementia patients as well as control participants demonstrated intact primacy and recency effects. Similarly, Bigler et al. (1989) reported only recency effects in a sample of Alzheimer's patients.
In contrast, Crockett et al. (1992) did not find differences in the magnitude of primacy/recency effects in patients with anterior and posterior brain damage and in psychiatric inpatients. Stefanova et al. (2002) reported intact primacy/recency effects during five acquisition trials in patients
with ruptured and repaired anterior communicating artery aneurysm. However, on delayed recall, these patients demonstrated neither a primacy nor a recency effect, which indicates very poor recall from any part of the list. The serial position effect also reflects the predictive role of recall efficiency from the middle segment of the list. Ryan et al. (1992) reported significantly lower recall of words from the middle segment in AIDS patients compared to controls in a federal corrections sample. The rate of forgetting over time can be explored through comparison of recall on the postinterference trial and the delayed recall trial. Comparison of performance on recall vs. recognition conditions provides additional insight into faulty encoding, storage, or retrieval mechanisms. Comparison of the number of words recognized from list A and the interference list suggests differences in storage of overlearned vs. once-learned material. Analysis of errors also provides useful information on the integrity of different memory mechanisms. Rey (1964) instructed that, in addition to the number of correct responses, the numbers of false responses, repetitions, and repetitions where the subject questioned recalling the word previously should be recorded. Spreen and Strauss (1998) and Lezak et al. (2004) suggest a notation system to identify the quality of errors on recall and recognition trials. The quality of intrusion errors has been analyzed in several studies. Intrusions of words from list A on recall of the interference list or from the interference list on postinterference recall provide evidence of proactive interference and weakness in source or context memory (Geffen et al., 1990). Extralist intrusions reveal a tendency for semantic or phonetic confusion or confabulatory responses. Intrusions as well as the tendency to repeat words from the list more than once reflect impairment in self-monitoring functions (Lezak et al., 2004). Ivnik et al.
(1990, 1992c) and Geffen et al. (1990, 1994) developed additional measures of performance, which allow exploration of the effect of different memory mechanisms on memory functioning. Similarly, Vakil and Blachstein (1994) developed a measure to assess incidental
learning of temporal order, which provides an additional index of retention. The contribution of the Rey AVLT to the accuracy of diagnostic determination in head injuries, dementias, amnesias, and frontal lobe syndrome has been well documented (Armstrong et al., 1996; Bigler et al., 1989; Blachstein et al., 1993; Glennerster et al., 1996; Guilmette & Rasile, 1995; Heubrock, 1995; Janowsky et al., 1989; Lezak et al., 2004; Lucas & Sonnenberg, 1996; Mitrushina et al., 1994, 1995b; Woodard et al., 1999a). Data for a sample of recent-onset spinal cord injury patients are presented by Kurylo et al. (2001). In addition to its usefulness in the identification of faulty memory mechanisms, several Rey AVLT indices can be used as measures of motivational level and cooperation in the testing procedures (Ashendorf et al., 2003; Bernard, 1990, 1991; Bernard et al., 1993; Binder et al., 1993, 2003; Greiffenstein et al., 1994; King et al., 1998; Nelson et al., 2003; Sherman et al., 2002; Sullivan et al., 2001, 2002). Barrash et al. (2004) developed an expanded version of the test (AVLTX) aimed at detection of inadequate effort or malingering by adding 60-minute delayed recall and recognition trials. The authors identified performance patterns that are highly inconsistent with performance of brain-damaged patients.
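The learning measures discussed earlier in this section (the trial V minus trial I difference, the best-trial minus trial I difference of Query & Megran, 1984, the slope of the learning curve, and the gained and lost items of Woodard et al., 1999a) can be sketched as follows. This is a simplified illustration, not any author's published scoring procedure: the least-squares slope, the set-based definition of gained and lost items, the function names, and the example data are all assumptions.

```python
def learning_indices(recall_counts):
    """Summarize learning over the five acquisition trials (I through V)."""
    if len(recall_counts) != 5:
        raise ValueError("expected recall counts for trials I through V")
    trials = range(1, 6)
    mean_trial = sum(trials) / 5
    mean_recall = sum(recall_counts) / 5
    # Ordinary least-squares slope of recall on trial number (an assumption;
    # the text does not specify how the slope should be estimated).
    slope = sum(
        (t - mean_trial) * (r - mean_recall)
        for t, r in zip(trials, recall_counts)
    ) / sum((t - mean_trial) ** 2 for t in trials)
    return {
        "trial5_minus_trial1": recall_counts[4] - recall_counts[0],
        "best_minus_trial1": max(recall_counts) - recall_counts[0],
        "slope": slope,
    }


def gained_and_lost(trial_words):
    """Count items gained and lost between each pair of consecutive trials.

    trial_words is a list of sets, one set of recalled words per trial.
    """
    return [
        {"gained": len(later - earlier), "lost": len(earlier - later)}
        for earlier, later in zip(trial_words, trial_words[1:])
    ]


# Hypothetical recall counts for trials I-V.
summary = learning_indices([6, 8, 10, 12, 11])

# Hypothetical recalled words on three consecutive trials.
transitions = gained_and_lost(
    [{"drum", "bell"}, {"drum", "coffee", "school"}, {"coffee", "school", "bell"}]
)
print(summary, transitions)
```

In the example, the best-trial definition gives a larger learning score than the trial V definition because recall peaked on trial IV, which shows how the choice of definition can change the reported value.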
Practice Effect and Alternate Forms of the Rey AVLT
The effect of repeated administration of the Rey AVLT was investigated by several authors. Lezak (1982) reported a small but statistically significant practice effect observed at 6- and 12-month retest in a group of normal participants. Mitrushina and Satz (1991a) reported improvement in recall on trial I (attributed to the practice effect) over repeated annual probes, which was consistent for all age groups in a sample of healthy elderly. Crawford et al. (1989) hypothesized that two factors can lead to practice effects on memory tests: (1) retention of specific test material and (2) a metamemoric factor (i.e., exposure to a similar task may facilitate development of optimal strategy). Comparison of the test-retest performance for those participants who were
presented with the same vs. different forms of the Rey AVLT on retest (27 ± 3 days after original testing) indicated that the practice effect was evident only in the group exposed to the same list. Therefore, the practice effect is largely due to the retention of specific test material rather than to the metamemoric factor and can be overcome using an alternate form on the retest. Uchiyama et al. (1995) found a significant practice effect on 1-year longitudinal follow-up in males with a mean age of 36.55 years (SD = 7.19), which was of a comparable magnitude for two alternate forms. Shapiro and Harrison (1990) suggested that when the test-retest sessions are spaced very close to each other (5 days), there remains a general practice effect due to repeated administrations in healthy college students but not among the older patient population. Data on repeated administration are also presented by McCaffrey et al. (2000). These results indicate a need for alternate forms of the AVLT which can be used in longitudinal evaluation of changes over time in patients' verbal memory. A number of investigators have developed alternate forms and reported their psychometric properties (Crawford et al., 1989; Geffen et al., 1994; Ryan et al., 1986; Shapiro & Harrison, 1990). Lezak (1983, 1995) and Lezak et al. (2004) provide several alternate forms. Ryan and Geisser (1986) and Uchiyama et al. (1995) report high comparability between forms A and C provided by Lezak. Criteria for generation of alternate word lists vary between different studies; however, the most salient requirements include a match between the original and alternate lists with regard to the following characteristics: probability of the occurrence of the word in English usage, which is assessed using the Thorndike-Lorge tables (1944); word length (one- or two-syllable nouns); serial position; the imagery value of the words, based on Paivio et al.'s (1968) tables; and control for semantic or phonetic associations between words. A good match of the word lists on the first criterion is especially important since, according to Fuller et al. (1997), low-frequency words result in lower recall and higher recognition than high-frequency words.
Different studies have yielded a wide array of alternate form reliability coefficients, but generally coefficients fell within the acceptable range (>0.60) for up to a 1-month interval and between 0.30 and 0.89 for a 1-year interval. For further information on the psychometric properties of the Rey AVLT, see Franzen (2000), Lezak et al. (2004), Schmidt (1996), and Spreen and Strauss (1998).
Assessment of Auditory-Verbal Learning with the Rey AVLT in Different Languages and Cultures
To account for the effect of demographic variables on Rey AVLT performance, Tuokko and Woodward (1996) developed a demographic correction system (using the Heaton et al., 1991, procedure), which incorporates various neuropsychological measures, including the Rey AVLT. It was developed and validated on samples of community-dwelling and institutionalized Canadian elderly over the age of 65 years. In this system, raw scores are converted to T scores corrected for age and education, which can be plotted on a profile sheet containing suggested optimal cutoff points for impairment. The set-up of the profile sheet allows grouping of the data in accordance with the DSM-III-R criteria for dementia. The authors suggest that application of the system considerably improves the accuracy of diagnostic interpretations of test scores. Lannoo and Vingerhoets (1997) provided data for the Rey AVLT for a large sample of healthy Flemish adults partitioned into two age groups × two education groups × gender. Normative data for a Brazilian version of the Rey AVLT for Brazilian adults 15-93 years old are reported by Diniz et al. (2000). The Portuguese version of the test is provided. Lee et al. (2002) provided data for the Chinese Rey AVLT (C-RAVLT), among other neuropsychological tests, collected in Hong Kong on a sample of 475 Cantonese-speaking Chinese 13-46 years old. Data are reported in three education × two achievement × two gender groups for adolescents and in three education × two gender groups for adults.
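The conversion of raw scores to age- and education-corrected T scores mentioned above for the Tuokko and Woodward (1996) system can be illustrated with a generic regression-based sketch. This is not the published Heaton et al. (1991) or Tuokko and Woodward (1996) procedure: the linear prediction model, the function name, and every coefficient below are hypothetical values chosen only to show the arithmetic.

```python
def demographically_corrected_t(raw, age, education, coef):
    """Convert a raw score to a T score (mean 50, SD 10) after adjusting
    for age and years of education with a linear prediction equation."""
    predicted = (
        coef["intercept"] + coef["age"] * age + coef["education"] * education
    )
    z = (raw - predicted) / coef["residual_sd"]
    return 50.0 + 10.0 * z


# HYPOTHETICAL coefficients; real values must come from a normative study.
HYPOTHETICAL_COEF = {
    "intercept": 13.0,
    "age": -0.125,       # expected change in raw score per year of age
    "education": 0.25,   # expected change per year of education
    "residual_sd": 2.0,  # SD of raw scores around the predicted value
}

# Hypothetical examinee: raw score 5, 72 years old, 12 years of education.
t_score = demographically_corrected_t(
    raw=5.0, age=72, education=12, coef=HYPOTHETICAL_COEF
)
print(t_score)
```

With these illustrative values the predicted raw score is 7.0, so a raw score of 5.0 falls one residual SD below expectation and converts to a T score of 40.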
CALIFORNIA VERBAL LEARNING TEST-SECOND EDITION
The California Verbal Learning Test (CVLT) was originally published by Delis et al. in 1987, and the revised version (CVLT-II), which considerably improved test utility and the normative database, was published in 2000. The structure and administration of the CVLT/CVLT-II are similar to those of the Rey AVLT (see above). However, in contrast to the Rey AVLT, the word lists contain 16 items, which are drawn from four categories.
Structure of the CVLT-II and Description of the Normative Data Provided in the Test Manual
List A, which is used in five acquisition trials, includes four items from each of the following categories: vegetables, animals, ways of traveling, and furniture. List B, which is used in the interference trial, includes items from two of the categories in list A (vegetables and animals) and from two new categories (instruments and parts of building). Recall of list B is followed by short-delay free recall and cued recall of list A. After a 20-minute delay filled with nonverbal testing, long-delay free recall and cued recall are assessed, followed by yes/no recognition of list A. Ten minutes later, an optional forced-choice recognition trial is administered. Performance on the CVLT-II generates 28 variables, which reflect various aspects of learning and memory. Factor analysis using principal components analysis of scores on 19 key variables from the normative reference sample extracted six factors: general verbal learning, response discrimination, primacy-recency effects, organizational strategies, recall efficiency, and acquisition rate. Factor analysis on the mixed clinical sample yielded roughly similar results in terms of the types of factors and the variables loading on each factor. However, the authors argue against data reduction through the use of factors since it leads to loss of information provided by individual variables. The U.S. Census-matched normative reference sample consists of 1,087 adults aged
16-89 years. The test manual contains normative data stratified by seven age groups: 16-19 (n = 150), 20-29 (n = 190), 30-44 (n = 200), 45-59 (n = 150), 60-69 (n = 145), 70-79 (n = 145), and 80-89 (n = 107). In addition, the data were stratified by gender for those variables that revealed significant gender differences. The normative data represent transformations of raw scores to T or z scores with increment values of ± 0.5. The distribution for each variable was hand-smoothed for minor irregularities both within and across age groups. For the variables with skewed distributions precluding z-score transformations, the data are reported in frequency and cumulative frequency. The authors pointed out that the norms for the CVLT-II are less stringent than the norms for the original version of the CVLT; thus, the same raw score on a particular variable may yield a higher standardized score on the CVLT-II than on the original CVLT. Scoring software that computes the multiple raw and standardized scores for the standard, alternate, and short forms of the test is available. The authors report good psychometric properties of the test.
Alternate and Short Forms of the CVLT-II
The alternate form uses words from different categories in comparison to the standard form. The two forms were, however, equated in terms of word frequency. Raw scores on the alternate form were calibrated to raw scores on the standard form using linear equating, based on a sample of 288 participants who were administered both forms in a counterbalanced order. The authors developed a short form of the test to be used for screening purposes or for testing patients who may feel overwhelmed by the full-length form. The short form uses only one list of nine words taken from three categories, which is administered over four acquisition trials. Short-delay free recall follows a distractor task of counting backward from 100 for 30 seconds. The subsequent trials include long-delay free recall following a 10-minute delay filled with non-verbal tasks,
cued recall, yes/no recognition, and an optional forced-choice recognition trial administered after a 5-minute delay. Although the short form yields fewer performance indices, the authors cite literature supporting its usefulness for diagnostic purposes. Raw scores for the short form were calibrated to raw scores for the standard form using equipercentile equating (due to the skewness of some variables on the short form) based on a sample of 278 participants administered both forms in a counterbalanced order.
Review of the Recent Literature on the CVLT and CVLT-II
The CVLT-II manual (Delis et al., 2000) contains an extensive review of the studies that used the original version of the CVLT. Studies examining the performance of various clinical groups, age and gender effects on CVLT performance in healthy groups, detection of exaggeration/malingering, and prediction of everyday functioning, as well as critiques of the first edition of the test, are summarized in the manual. Demographically adjusted norms for the CVLT based on a sample of over 1,000 normal adults aged 20-85 years, stratified by age, education, gender, and race/ethnicity (African American and Caucasian), are presented by Heaton et al. (2004). Additional studies addressing the utility of the CVLT and CVLT-II, which were published after the CVLT-II manual was written, are summarized below.
Effect of Semantic Organization on Recall
Word lists in the CVLT are composed of words taken from four semantic categories. Words from the same category are never presented consecutively, which allows assessment of semantic clustering, reflecting the "extent to which the examinee has actively imposed an organization on the list of words according to shared semantic features" (Delis et al., 2000). This strategy facilitates encoding and retrieval of words. The rationale and computational formulas for semantic clustering indices were revised in the CVLT-II. Whereas the semantic clustering index in the original version used the words recalled during a given trial as the
baseline for calculating expected values (recall-based expectancy), the clustering indices in the second edition use the word list as the baseline (list-based expectancy) (Delis et al., 2000; Stricker et al., 2002). The following recent studies addressed the effect of semantic organization on recall in healthy samples using the CVLT. Shear et al. (2000) examined the effect of semantic cueing on free recall of the word list in 154 healthy young adults. The sample was divided into four groups, each receiving either the stan
organization of information becomes a determining factor in recall efficiency. (It should be pointed out that decline in speed of mental processing is another factor compromising recall rates in the elderly, as slowing the rate of CVLT word list presentation reduced recall differences between young and elderly groups in the study by Weible et al., 2002.)
Anatomical Correlates
Brain mechanisms subserving learning and memory, as measured by the CVLT, were investigated in several recent studies. Baldo et al. (2002) investigated the role of frontal cortex in long-term memory using the CVLT-II in patients with focal frontal lesions and age- and education-matched controls. The performance pattern of frontal patients was characterized by overall low recall, an increased tendency to make intrusion errors, reduced semantic clustering, and impaired yes/no recognition due to endorsement of semantically related words and words from an interference list. The authors concluded that these findings support earlier reports relating false recollections and source memory deficits to semantic confusion attributed to frontal dysfunction. Savage et al. (2001) explored the neural basis of spontaneous and directed semantic organization using positron emission tomography (PET) technology. The CVLT procedure was modified to include three encoding conditions that manipulated semantic organization: spontaneous, directed, and unrelated. This design allowed the authors to manipulate semantic clustering levels over three encoding conditions. The imaging results revealed two distinct activations in the left inferior prefrontal cortex (inferior frontal gyrus) and left dorsolateral prefrontal cortex (middle frontal gyrus), corresponding to the levels of semantic clustering observed in the behavioral data. Blood flow in the orbitofrontal cortex was strongly correlated with semantic clustering scores during immediate free recall (encoding). The authors concluded: orbitofrontal cortex performs an important, and previously unappreciated, role in strategic memory by supporting the early mobilization of effective
behavioral strategies in novel or ambiguous situations. Once initiated, lateral regions of left prefrontal cortex control verbal semantic organization. (p. 219)
Involvement of bilateral frontotemporal areas in verbal memory is documented by Johnson et al. (2001) in their study exploring brain activation on functional magnetic resonance imaging (fMRI) during CVLT performance. The results provide correlational evidence of right frontal and medial temporal lobe (hippocampal) participation in verbal memory, which mediates the success of word processing by the left medial temporal lobe as measured by performance on the CVLT.
Assessment of Learning and Memory in Traumatic Brain Injury
Construct and criterion validity of the CVLT in assessment of learning and memory deficits associated with traumatic brain injury (TBI) have been tested in several studies. The multifactorial structure of the CVLT was reexamined by Wiegner and Donders (1999), who identified a four-factor model in their study with 150 TBI patients, which differed from the six- or five-factor model identified by the test authors. The factors were interpreted as attention span, learning efficiency, delayed recall, and inaccurate recall. Cluster analysis, performed on the variables with the highest factor loadings on each of the four factors, revealed four performance subtypes. Two of them differed in the level of performance (average scores across all cluster variables vs. significantly below average on all variables), which was related to injury severity, with the other two subtypes reflecting distinct performance patterns. The authors confirmed the usefulness of the CVLT in assessing learning abilities in patients with TBI and underscored its sensitivity to the general severity of brain injury. Subsequent studies revisited the issue of level of performance vs. pattern of performance in subtyping TBI patients. Curtiss et al. (2001) used the CVLT and the Wechsler Memory Scale-Revised (WMS-R) Digit Span to derive seven indices of short- and long-term memory processes on two samples of TBI patients
(n = 150 and 151). Cluster analysis on seven indices revealed five distinct performance pattern clusters, which included two intact memory clusters differing in encoding strategy (semantic vs. serial) and three clusters indicative of deficits in consolidation, retention, and retrieval memory processes, respectively. The latter cluster had accompanying problems with increased number of intrusions and perseverations, pointing to poor memory control. In contrast, Demery et al. (2002) found two distinct level of performance clusters (within normal limits and moderate to severe impairment) in their sample of 160 TBI patients. No pattern of performance clusters was identified in this study. Vanderploeg et al. (2001) proposed that impaired learning and memory in TBI patients is due to a consolidation deficit, rather than to deficient encoding or retrieval processes. The authors compared CVLT performance of 55 patients with moderate to severe TBI, 55 controls matched on age and performance on trial 5 and total for trials 1-5 (acquisition-matched), and 55 controls matched on demographic characteristics but not on CVLT performance (demographic-matched). The results revealed comparable rates of learning across groups, indicating no encoding differences. Rate of forgetting was significantly more rapid for the TBI group in comparison to the two control groups, consistent with consolidation problems. Lower rates of proactive interference for TBI patients in comparison to demographic-matched controls provided further evidence of a consolidation problem in TBI patients. Comparably low rates of proactive interference for the TBI and acquisition-matched groups point to impaired acquisition in both. All groups benefited equally from semantic or recognition retrieval cues, indicating no differences in retrieval process. These findings support the authors' assertion of a consolidation deficit underlying memory impairment in TBI.
However, the authors point to the limitation of their study in using a primarily male sample and address the possibility that other subgroups of TBI patients might display encoding and retrieval deficits, as reported in previous studies.
As to the low rates of proactive interference reported by Vanderploeg et al. (2001), Numan et al. (2000) suggested that buildup and release from proactive interference in TBI can be uncovered if appropriate methods of analysis are used. Whereas the commonly used measure of proactive interference (trial 1 recall on list A minus recall on list B) did not reveal any evidence of proactive interference in either the TBI or control group in their study, proactive interference was detected in both groups when relative recall of shared and nonshared category items from the two lists was taken into account. Furthermore, the impact of TBI on specific memory components should be viewed on a continuum. For example, Duchnick et al. (2002) examined the discrepancy between long-delay free recall and recognition discriminability (LDFR/RD), and the discrepancy between free recall and semantically cued recall as indicators of retrieval problems in 122 TBI patients. The results support a continuum of retrieval deficit severity: performance improvement with recognition but not semantic cueing points to more severe retrieval deficit, whereas improvement with both recognition and semantic cueing indicates less severe retrieval deficit. Thus, there is no consensus on the number of factors that represent learning and memory constructs which are measured by the CVLT or on the number or the basis (level vs. pattern of performance) for subtyping TBI patients. However, all studies point to the clinical sensitivity of the instrument to learning and memory deficits in TBI. The differences across studies in hypothesized mechanisms underlying memory impairment in TBI might be due to differences in demographic characteristics of the samples, severity of TBI, degree of recovery, and time elapsed between TBI and assessment.
Assessment of Serial Position Effect in Dementias
Studies using the CVLT to assess dementias focused on a serial position effect. Bayley et al. (2000) examined performance pattern in Alzheimer's disease and amnesia associated with electroconvulsive therapy (ECT). The authors
compared serial position effects produced on trial 1 in 25 patients with mild dementia [Mini-Mental State Exam (MMSE) = 20], 25 patients with very mild dementia (MMSE = 25.5), and 50 age- and education-matched normal controls. Performance of the very mildly demented group was also compared to that of a group of 11 patients with transient amnesia arising from a series of ECT treatments. Primacy and recency effects were defined in this study as recall of the first two and the last two items, respectively. The results indicated significantly lower overall recall as well as a significantly reduced primacy effect, with a normal recency effect in the demented group compared to the control group. Performance of patients with ECT-induced amnesia was comparable to that of very mildly demented patients on most standard CVLT measures; however, they revealed an expected primacy and recency pattern. The authors concluded that a reduction in the primacy effect is an early feature of memory impairment in Alzheimer's disease. They pointed out, however, that it is not a necessary feature of all causes of memory impairment. Paul et al. (2002) examined the serial position effect in vascular dementia. The authors compared performance of 19 patients with mild dementia and 17 with moderate dementia. The mildly demented group demonstrated intact primacy and recency effects, while neither primacy nor recency effects were produced by the moderately demented group. The authors concluded that absence of serial position effects may occur in more advanced dementias regardless of dementia type.
Repeated Administration and Practice Effects
McCaffrey et al. (2001) evaluated CVLT performance of 22 HIV-negative individuals across five assessment probes equally spaced over approximately 1.5 years, in the context of a National Institute of Mental Health study on the natural progression of HIV infection. Total recall (trials 1-5) was analyzed as part of a larger battery. A considerable practice effect was observed on this measure, with a statistically significant improvement at the second administration and relative stability in recall
for the following three administrations. The authors interpreted the initial administration of the test as a period of "adaptation" and learning, with subsequent stabilization in performance, and cautioned test users to consider this trend when interpreting changes in CVLT performance on the retest. Data for repeated administration of the CVLT are also presented in McCaffrey et al. (2000). In a follow-up article based on the same study, Duff et al. (2001) proposed a dual-baseline approach to minimize the practice effect. The test was administered twice within a 2-week period to 18 HIV-positive/symptomatic, 25 HIV-positive/asymptomatic, and 26 HIV-negative participants and later administered two more times at approximately 6-month intervals. The short-term stability coefficients for recall trials for the asymptomatic and control groups ranged 0.~. 77 (with the exception of one coefficient falling below this range). Significant practice effects were found on six of the CVLT measures for all three groups over the initial short test-retest interval. Across four assessments, the authors noted a cubic trend on most of the measures for all three groups, with an initial increase in scores from time I to time II, a subsequent decrease from time II to time III, followed by an increase from time III to time IV. This pattern of performance change across repeated administrations supports the dual-baseline method for controlling practice effects. In addition, the dual-baseline approach led to improvement in a number of stability coefficients. The authors cautioned, however, that this approach is not appropriate for all patient groups or for all neuropsychological instruments. For further information on the psychometric properties of the CVLT/CVLT-II, see Elwood (1995), Franzen (2000), Lezak et al. (2004), and Spreen and Strauss (1998).
Assessment of Effort with the CVLT
The pattern of CVLT performance associated with malingering and insufficient effort has been described in several studies. Demakis (1999) reported that simulators were less likely to consistently recall the same word across successive learning trials. Sweet et al.
(2000b) and Slick et al. (2000) addressed the sensitivity of different cutoff scores to detect suboptimal effort on the CVLT. Baker et al. (2000) developed two formulas to detect incomplete effort on the CVLT. The discriminant function formula, which incorporates scores on three CVLT measures, yielded an acceptable false-positive rate of 7.46%.
Use of the CVLT in Other Languages and Cultures
Nolin (1999) used the French translation of the CVLT in a study on memory functioning in head-injured individuals. Psychometric properties of the test were examined on 309 intact individuals, 25 male closed head injury patients, and 26 bilingual university students. Reliability indices varied depending on the method used. Acceptable translation validity and criterion-related validity were demonstrated. Significant intercorrelations between the French and English versions were found. Kim and Kang (1999) reported normative data for the Korean version of the CVLT (K-CVLT) collected on a sample reflecting Korean census data (n = 357). The data for 22 indices stratified by age and gender are reported. Consistent with the analyses presented in the test manual (Delis et al., 2000), factor analysis of 19 K-CVLT indices extracted six factors accounting for 73.3% of common variance. The scores on various K-CVLT variables were generally lower than the normative data for the English CVLT, and SDs were larger. Barker-Collo et al. (2002) examined the effect of the American content of the CVLT on the performance of New Zealanders. The CVLT and a modified version developed to reflect New Zealand content (NZ-VLT) were administered to 90 healthy adults. The authors report significantly better performance on the NZ-VLT on several measures.
Adaptations and Alternate Versions of the CVLT
The children's version of the CVLT (CVLT-C, Delis et al., 1994) follows the same approach to assessment of learning and memory as the
adult version. The word lists consist of 15 shopping items falling into three semantic categories. Utility of the children's version has been recently addressed by Goodman et al. (1999), Donders (1999a,b), and Beebe et al. (2000). A 9-item dementia version of the CVLT was described by Spreen and Strauss (1998), Woodard et al. (1999b), and Davis et al. (2002). Several test measures were reported to be sensitive to specific memory deficits characteristic of Alzheimer's disease and ischemic vascular dementia (Davis et al., 2002). Stevens (2000) and Stevens et al. (2001) described development of a pictorial version of the CVLT, the Connecticut Pictorial Learning Test (COPLT). Results of two experiments were reported. The analyses performed in the first experiment revealed that the new verbal stimuli to be adapted for pictorial use were relatively equivalent in difficulty to the original CVLT. The authors proposed use of these stimuli as an alternate form III of the CVLT. The second experiment explored psychometric properties of the pictorial format of the test. The authors reported good internal consistency and concurrent validity of the pictorial version.
HOPKINS VERBAL LEARNING TEST
The Hopkins Verbal Learning Test (HVLT) was originally introduced by Brandt in 1991 to allow brief repeated assessments over time for use with demented patients or those who are too impaired for administration of more lengthy and complex tests. The HVLT consists of a 12-item word list, composed of four words drawn from each of three semantic categories. Three free recall trials are followed by a yes/no recognition trial using a list of 24 words, which includes 12 targets and 12 distractors. Half of the distractors are drawn from the same semantic categories as the targets (related distractors) and half, from other categories (unrelated distractors). The HVLT includes six equivalent forms. The items chosen for the word lists represent four very high-frequency responses from each of 18 semantic categories (three categories for each of the six equivalent
forms), drawn from the Battig and Montague (1969) category exemplar collection. The two most common responses were used as nontarget distractors for the recognition tasks. The lists were closely matched for mean frequency of occurrence of the words in the Battig and Montague study and for mean word frequency in printed text (Francis & Kucera, 1982). Interform reliability reported by Brandt (1991) was high, especially for the free recall trials. A revised version of the test (HVLT-R) was introduced in 1996 (Benedict et al., 1996, 1998; Frank & Byrne, 2000) and made commercially available by Brandt and Benedict in 2001. The administration procedure in the revised version was modified to include a 20- to 25-minute delayed recall trial, with recognition testing following the delayed recall. In all other respects, the revised test is identical to the original version. The HVLT-R generates 10 performance indices: recall on each learning trial, total recall across three learning trials, delayed recall, percent retention on delayed recall relative to the higher of the recall scores from learning trials 2 and 3, and four indices of recognition (number of true-positive responses, semantically related false-positive errors, semantically unrelated false-positive errors, and the Recognition Discrimination Index, derived by subtracting the number of false-positive errors from the number of true-positive responses). The normative data reported in the test manual (Brandt & Benedict, 2001) are based on a sample of 1,179 community-dwelling individuals free of neurological or psychiatric disorders (300 men, 879 women), aged 16-92 years, with a mean age of 59.00 (18.62) years and mean education of 13.47 (2.88) years. The normative data stratified into eight age ranges are reported in raw scores.
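The derived HVLT-R indices described above reduce to simple arithmetic on the raw scores. The following Python sketch is illustrative only: the class, function, and field names are hypothetical and are not taken from the test manual, and the scoring rules are limited to what is stated in this section (percent retention relative to the higher of trials 2 and 3; Recognition Discrimination Index as true positives minus false positives).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HvltrRaw:
    """Raw HVLT-R scores for one examinee (field names are hypothetical)."""
    trial1: int
    trial2: int
    trial3: int
    delayed_recall: int
    true_positives: int        # targets correctly endorsed (0-12)
    related_false_pos: int     # semantically related distractors endorsed
    unrelated_false_pos: int   # semantically unrelated distractors endorsed


def total_recall(raw: HvltrRaw) -> int:
    """Sum of words recalled across the three learning trials."""
    return raw.trial1 + raw.trial2 + raw.trial3


def percent_retention(raw: HvltrRaw) -> Optional[float]:
    """Delayed recall as a percentage of the higher of trials 2 and 3.

    Returns None when the denominator is zero (an assumption of this
    sketch; the section does not say how that case is handled).
    """
    best = max(raw.trial2, raw.trial3)
    if best == 0:
        return None
    return 100.0 * raw.delayed_recall / best


def recognition_discrimination(raw: HvltrRaw) -> int:
    """True positives minus all false positives (related + unrelated)."""
    return raw.true_positives - (raw.related_false_pos + raw.unrelated_false_pos)


# Hypothetical examinee, for illustration only.
example = HvltrRaw(trial1=5, trial2=8, trial3=10, delayed_recall=9,
                   true_positives=11, related_false_pos=1, unrelated_false_pos=0)
print(total_recall(example), percent_retention(example),
      recognition_discrimination(example))
```

These raw indices would still need to be referred to the age-stratified normative tables; the sketch does not attempt any raw-to-T-score conversion.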
In addition, the tables reported in the appendix allow raw to T-score conversion for total recall, delayed recall, percent retention, and the Recognition Discrimination Index, using the overlapping age strata technique. Analysis of interform reliability of the HVLT-R suggested high equivalency of the six forms for the free recall trials. However, the six forms for the recognition trial yielded very
small but statistically significant differences in performance. Therefore, the four recognition indices are listed separately for two homogeneous groups of forms: forms 1, 2, and 4 and forms 3, 5, and 6. The structure of the HVLT/HVLT-R also allows examination of the qualitative aspects of performance, such as extent of category clustering, similar to the CVLT. The psychometric properties of the test are generally good. Stability coefficients over a 9-month period in a sample of healthy elderly were moderate yet comparable with similar properties reported for other tests of verbal memory (Rasmusson et al., 1995). Test-retest reliability coefficients for the four primary variables of the HVLT-R over a 6-week period in a normal elderly sample ranged 0.39-0.74 (Benedict et al., 1998). The authors noted that restricted and non-normal distribution of several variables limited the range of reliability coefficients. A considerable practice effect was noted on repeated administration of the same HVLT-R form over four testing probes 2 weeks apart, whereas no practice effect was seen on repeated administration of the alternate forms (Benedict & Zgaljardic, 1998). For further information on the psychometric properties of the HVLT, see Franzen (2000), Lezak et al. (2004), and Spreen and Strauss (1998). The usefulness of the HVLT/HVLT-R in distinguishing between demented/amnesic groups and normal elderly has been addressed in several studies (Brandt, 1991; Brandt & Benedict, 2001; de Jager et al., 2003; Frank & Byrne, 2000; Hogervorst et al., 2001; Krebs, 1994; Kuslansky et al., 2004; Lynch, 2002; O'Connor, 2002; Shapiro et al., 1999). The specificity of the Total Recall and Recognition Discrimination Index of the HVLT ranged 98%-100%, whereas sensitivity was somewhat lower, ranging 87%-94%, in screening for dementia of Alzheimer's type (Brandt, 1991; Hogervorst et al., 2001). Kuslansky et al.
(2004) reported a modest sensitivity of 69% with a specificity of 89% for the HVLT Total Recall in differentiating between nondemented elderly and those meeting DSM-IV criteria for dementia. In a study on the
discriminative validity of the HVLT-R between patients with Alzheimer's disease and demographically matched controls, a three-variable discriminant equation correctly classified 90% of participants (Shapiro et al., 1999). The utility of the HVLT in assessing memory deficits associated with head injuries was demonstrated by Guskiewicz et al. (2001) and Morey et al. (2003). A comparison of the HVLT/HVLT-R and the CVLT/CVLT-II suggests that the HVLT/HVLT-R can adequately assess basic verbal learning capacity. However, it has limitations in assessing more complex and qualitative aspects of verbal learning and memory (Lacritz & Cullum, 1998; Lacritz et al., 2001). On the other hand, the advantages of the HVLT/HVLT-R make it highly suitable in situations where more complex measures cannot be administered. Among its advantages, noted across different studies, are the following: it is brief, easy to administer, and repeatable; has no ceiling effects on many variables for older groups; and does not require adjustment for education. In this chapter, we reviewed a recent article (Friedman et al., 2002) reporting normative data for elderly African Americans based on a sample of 237 participants stratified by two age groups (60-71, 72-84), gender, and three educational levels (< 12, 12, > 12).
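The sensitivity and specificity values quoted earlier in this section follow the standard definitions: sensitivity is the proportion of truly impaired individuals whom the test flags, and specificity is the proportion of unimpaired individuals whom it clears. The Python sketch below shows the arithmetic; the function name and the counts in the example are hypothetical and are not data from any of the cited studies.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2 x 2 table.

    true_pos  : impaired cases classified as impaired
    false_neg : impaired cases classified as unimpaired
    true_neg  : unimpaired cases classified as unimpaired
    false_pos : unimpaired cases classified as impaired
    """
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("each group must contain at least one case")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the calculation:
# 100 impaired and 100 unimpaired examinees.
sens, spec = sensitivity_specificity(true_pos=69, false_neg=31,
                                     true_neg=89, false_pos=11)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Because both values depend on the cutoff score and on the composition of the sample, figures from different studies are not directly comparable unless those conditions match.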
WHO-UCLA AUDITORY VERBAL LEARNING TEST
The WHO-UCLA Auditory Verbal Learning Test (WHO-UCLA AVLT; Maj et al., 1993) was developed by a group of UCLA researchers for the World Health Organization (WHO), for use in a multicultural context. It is designed to minimize the cultural bias inherent in the Rey AVLT while preserving its original format. The test consists of two lists, each containing 15 words from five cross-culturally relevant categories: body parts, animals, tools, household objects, and transportation vehicles. Three exemplars for each of the five categories, which are selected from the 250-item lexicon of universally familiar concepts compiled by Snodgrass and
Vanderwart (1980), are presented in fixed random format. A copy of the WHO-UCLA AVLT and administration instructions can be found in Appendix 3. Due to "universal familiarity" of the words used in this test, its intercultural variability is low (Lezak et al., 2004), which warranted its use by the WHO in a study on the cognitive sequelae of HIV-1 infection across different countries. The test includes five acquisition trials, interference trial, recall after interference, 20- to 30-minute delayed recall, and a verbally presented yes/no delayed recognition out of a list of 30 words, which includes 15 targets and 15 distractors. Normative data for a Spanish translation of the WHO-UCLA AVLT (Ponton et al., 1996), stratified by age and education, are reproduced in this chapter. The construct validity of this version of the test was assessed by Ponton et al. (2000) using factor analysis of a large test battery assessing various cognitive domains. The WHO-UCLA AVLT measures included in the analysis (recall on trial 5, postinterference recall, and 20-minute delayed recall) loaded on a unique factor, with very low loadings on four nonmemory factors extracted in this study. The validity of the Spanish version of the test and demographic predictors of test performance were further examined by Lopez-Carlos (1999) and Mares (2002).
CERAD LIST-LEARNING TEST
The Consortium to Establish a Registry for Alzheimer's Disease (CERAD) battery includes a word list learning task, which taps verbal memory using a list of 10 unrelated words presented visually across three trials at a rate of one every 2 seconds. The order of presentation changes with each trial. Delayed recall and recognition are tested after a brief delay filled with a nonverbal distractor task (Morris et al., 1989). The highest test-retest reliability coefficients (ranging 0.62-0.~) for clinical and control samples were obtained on the total number of words recalled over three learning trials. Kaltreider et al. (2000) compared the utility of the CERAD test and the CVLT in the
assessment of memory deficits in a sample of 138 individuals with the diagnosis of probable Alzheimer's disease. The results revealed modest but statistically significant associations between these tests on many variables, with total number of words learned over three trials showing the strongest association. The authors concluded that the results support the utility of shorter list-learning tasks in patients with mild to moderate Alzheimer's disease and suggested use of the CERAD word list as a good alternative in situations where administration of the CVLT is not feasible. However, they suggest caution in applying interpretive strategies that are derived from more comprehensive measures to shorter measures. The psychometric properties of the CERAD Word List Learning task in assessment of memory functioning in nondemented elderly and in differential diagnosis of dementia across ethnic samples have been examined in many studies (Andel et al., 2003; Cahn et al., 1995; Chen et al., 2000; Fillenbaum et al., 1998; Galasko et al., 1995; Ganguli et al., 1991, 1996; Grady et al., 2002; Guruje et al., 1995; Morris et al., 1989, 1993; Stewart et al., 2001; Unverzagt et al., 1996, 1999; Welsh et al., 1994). Normative data for the CERAD battery based on a sample of 83 healthy African American individuals were reported by Unverzagt et al. (1999). The Word List Learning task is included in a Brazilian-Portuguese version of the CERAD battery (Bertolucci et al., 2001). Norms on CERAD for use in an Australian setting were reported by Collie et al. (1999) on a sample of 243 healthy aging adults. The Word List Learning task discriminated well between normal and demented individuals in a Jamaican sample (Unverzagt et al., 1999).
SELECTIVE REMINDING TEST
The Selective Reminding Test (SRT) was originally introduced by Buschke (Buschke, 1973; Buschke & Fuld, 1974) for assessment of memory and learning. Hannay and Levin (1985) described and examined the psychometric properties of the Verbal Selective
Reminding Test (VSRT), which is based on this paradigm. The test uses a list of 12 unrelated words. It differs from other list-learning tests in that only the words that were omitted by the examinee on the preceding trial are repeated by the examiner, whereas the task for the examinee is to repeat the entire word list on each trial. The administration procedure includes up to 12 free recall trials. The learning criterion, according to Larrabee et al. (1988), is recall of all words for three consecutive trials without reminders. The test structure allows distinguishing between list learning and item learning. It also allows separating retrieval from long-term storage (when items are retrieved without further reminders) and retrieval from short-term storage. The test yields several derived indices of verbal learning, which include Short-Term Retrieval, Long-Term Retrieval, Long-Term Storage, Random Long-Term Retrieval, and Consistent Long-Term Retrieval. Various modifications of the SRT have been reported. Buschke (1984) added a cued recall component to the test. This version, Free and Cued Selective Reminding Test (FCSRT), has been used with demented and amnesic patients and with nondemented elderly (Buschke, 1984; Degenszajn et al., 2001; Grober et al., 1997, 1998, 2000; Ivnik et al., 1997; Petersen et al., 1999). The VSRT normative data, psychometric properties, and use with nondemented samples have been described in several studies (Larrabee & Levin, 1986; Larrabee et al., 1988; Ruff et al., 1989b; Trahan & Larrabee, 1993). The validity of a six-trial version of the test was examined by Smith et al. (1995), Drane et al. (1998), and Larrabee et al. (2000). These studies demonstrated comparable psychometric properties of the 12- and six-trial versions. 
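The selective reminding procedure and the learning criterion described above can be illustrated with a minimal sketch. It assumes that each trial's recall is recorded as a set of words; the function names and the data layout are hypothetical and are not taken from any published scoring program. It covers only the reminding rule and the three-consecutive-trials criterion, not the derived storage and retrieval indices.

```python
from typing import List, Optional, Sequence, Set


def words_to_remind(word_list: Sequence[str], previous_recall: Set[str]) -> List[str]:
    """Words the examiner repeats before the next trial.

    Only the words omitted on the preceding trial are presented again.
    """
    return [word for word in word_list if word not in previous_recall]


def trial_reaching_criterion(
    word_list: Sequence[str],
    recalls: Sequence[Set[str]],
    run_length: int = 3,  # three consecutive perfect trials (Larrabee et al., 1988)
) -> Optional[int]:
    """Return the 1-based trial on which the learning criterion is met, or None."""
    full_list = set(word_list)
    streak = 0
    for trial_number, recalled in enumerate(recalls, start=1):
        # A perfect trial means no reminders are needed before the next one.
        streak = streak + 1 if full_list <= set(recalled) else 0
        if streak == run_length:
            return trial_number
    return None
```

Because a perfect trial leaves nothing to remind, three consecutive perfect trials imply that the last two were completed without reminders, which is how the sketch reads the criterion.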
A number of studies report utility of the original Buschke SRT, the VSRT, and modified versions of the test with various clinical populations, including those with amnesia or Alzheimer's/Parkinson's dementia (Bartok et al., 1997; Boeve et al., 2003; Faglioni et al., 2000a; Kuzis et al., 1999; Masur et al., 1989, 1990; O'Connell & Tuokko, 2002; Sliwinski et al., 1997; Stern et al., 1998, 1999), head
injury (Dikmen et al., 1995; Paniak et al., 1989), multiple sclerosis (Beatty et al., 1996a,b; Chiaravalloti et al., 2003; DeLuca et al., 1998; Demaree et al., 2003; Faglioni et al., 2000b; Pan et al., 2001), temporal lobe epilepsy (Drane et al., 1998), end-stage pulmonary disease (Crews et al., 2001, 2003; Ruchinskas et al., 2000), and brain tumors (Torres et al., 2003) as well as depressed patients receiving ECT (Datto et al., 2001). Use of versions of the SRT with different ethnic and cultural groups has been described. Xavier et al. (2002) used an SRT with a Brazilian sample of nondemented participants 80-95 years old. Butman (2001) included the Buschke SRT as part of a battery for early diagnosis of dementia in Latin America. Two Spanish versions of the VSRT were developed and validated on healthy and demented groups (Campo & Morales, 2004; Campo et al., 2000, 2003). A Hebrew version of the SRT with three parallel forms was developed by Gigi et al. (1999). An adaptation of the SRT for use in the visual in addition to the verbal modality and its validation on a normal sample were reported by Indian researchers (Rao & Andrade, 1998).
OTHER VERBAL AND NONVERBAL LIST-LEARNING TESTS Two parallel forms of a 10-word list-learning test are included in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS), published by Randolph (1998). Artiola i Fortuny et al. (1999) included a 16-word list-learning task in their standardized and validated battery of neuropsychological tests culturally adapted for Spanish-speaking individuals. Normative data based on 390 participants, which were collected in Spain, Mexico, and the United States, are stratified by geographical area × age × education. Toglia and Battig (1978) developed the Affective Auditory Verbal Learning Test, which consists of positively and negatively valenced word lists. Snyder and Harrison (1997) suggested that the affective valency of the word list yielded different magnitudes of primacy vs. recency effects, which may be useful in the
evaluation of individuals suffering from affective disorders. The effect of positively and negatively valenced words on physiological measures of arousal was described by Snyder et al. (1998). Majdan et al. (1996) developed a nonverbal analog of the Rey AVLT, the Aggie Figures Learning Test (AFLT). The three forms of the AFLT, described by the authors, were constructed according to the Rey AVLT format using abstract figures that do not lend themselves to verbal encoding. The authors emphasized that the AFLT is superior to the Rey Visual Design Learning Test (see . et al., 2004), which has only one version and is designed to assess learning over five acquisition trials and recognition only. Rey's analog does not contain an interference list, which precludes evaluation of retention rates. In addition, Rey's stimuli represent simple geometrical designs that can be easily encoded verbally. In addition to these nonverbal analogs, Lezak et al. (2004) described the Pictorial Verbal Learning Test (PVLT), which follows the five-trial learning format of the Rey AVLT. For more information on list-learning tests, refer to Lezak et al. (2004) and Spreen and Strauss (1998).
RELATIONSHIP BETWEEN LIST-LEARNING TEST PERFORMANCE AND DEMOGRAPHIC FACTORS Effect of Age The effect of demographic variables on Rey AVLT performance has been extensively explored in the literature. Studies consistently demonstrate an effect of age on recall, and some studies report an effect of age on recognition (Bleecker et al., 1988; Geffen et al., 1990; Graf & Uttl, 1995; Mitrushina et al., 1991; Mitrushina & Satz, 1991a,b; Query & Berger, 1980; Query & Megran, 1983; Savage & Gouvier, 1992; Selnes et al., 1991; Uchiyama et al., 1995; Wiens et al., 1988). Some investigators suggest that age impacts only specific AVLT scores. For example, total number of words recalled on the five learning trials may be lower in older participants, who may show
more information overload (forward digit span > trial I), confusion regarding information source (e.g., misclassify list B words as list A words), and less efficient retrieval. However, rate of learning (learning curve), loss of information over a distractor or an extended delay, recognition ability, and false-positive errors may be resistant to aging (Bleecker et al., 1988; Bolla-Wilson & Bleecker, 1986; Cohen et al., personal communication; Geffen et al., 1990; Mitrushina et al., 1991; Wiens et al., 1988). Petersen et al. (1992) concluded that learning performance declines uniformly with age but forgetting remains relatively stable across age when adjusted for the amount of material initially learned. Salthouse et al. (1996) suggested direct age-related influences on memory independent of speed of performance, based on the existence of the direct path from age to memory in their structural equation model. Similarly, age-related performance decline was reported by Norman et al. (2000) for a number of CVLT indices. According to the literature review by Delis et al. (2000), several studies using the original CVLT version (Murphy et al., 1997; Woodruff-Pak & Finkbiner, 1995) reported decline in the rate of acquisition, primacy effect, use of semantic clustering, and tendency for intrusion errors on delayed recall trials. Delis et al. (2000) reported a high inverse relationship between age and CVLT-II recall, with age explaining 25% of the variance in total recall for five acquisition trials. Age was also significantly related to several performance indices of the HVLT-R (Brandt & Benedict, 2001; Friedman et al., 2002; Vanderploeg et al., 2000), accounting for 19% of the variance in total recall according to Brandt and Benedict (2001). However, Kuslansky et al. (2004) did not find a relationship between age and performance on the HVLT in a large sample of elderly aged ≥70 years.
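To compare the "variance explained" figures cited above with the correlation coefficients reported elsewhere in this section, it may help to convert one into the other. The sketch below assumes a simple bivariate linear relationship, in which the proportion of variance explained equals the squared correlation; this assumption and the function name are ours, not the test manuals'.

```python
import math


def correlation_magnitude(variance_explained: float) -> float:
    """Absolute correlation implied by a proportion of variance explained.

    Assumes a single-predictor linear model, where variance explained = r squared.
    The sign of r (inverse for age) has to be taken from the source study.
    """
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be a proportion between 0 and 1")
    return math.sqrt(variance_explained)


# Age and CVLT-II total recall: 25% of variance corresponds to |r| = 0.50.
cvlt_ii_age = correlation_magnitude(0.25)
# Age and HVLT-R total recall: 19% of variance corresponds to |r| of about 0.44.
hvlt_r_age = correlation_magnitude(0.19)
```

Under this assumption the two age effects are of similar magnitude, even though the percentages may look quite different at first glance.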
Effect of Education The effect of education is less clear. Cohen et al. (personal communication), Delaney et al. (1992), Query and Berger (1980), Query and Megran (1983), and Uchiyama et al. (1995) observed a relationship between education and Rey AVLT scores, while Bolla-Wilson and
Bleecker (1986), Mitrushina et al. (1991), Petersen et al. (1992), and Wiens et al. (1988) did not find a significant association. Bolla-Wilson and Bleecker (1986) argued, based on their multiple regression analyses, that education does not account for AVLT test score variance over that associated with IQ. The other studies reporting a relationship between education and AVLT performance either did not include an examination of the effects of IQ on scores or simply reported correlations between education and AVLT scores without controlling for the significant association between education and IQ. An effect of education on CVLT performance was reported by Norman et al. (2000). With respect to the CVLT-II, the correlation between total recall over five trials and educational level was 0.29, according to the test manual (Delis et al., 2000). On the HVLT-R, education accounted for 5% of the total variance or less across different variables, according to the test manual (Brandt & Benedict, 2001). Vanderploeg et al. (2000) did not find a relationship between education and HVLT-R scores in a predominantly white, community-dwelling, elderly sample, whereas Friedman et al. (2002) found a significant, moderate effect of education on HVLT-R performance in a sample of African-American elderly. This discrepancy might be due to greater variability in educational ranges in the latter study. Kuslansky et al. (2004) did not find a relationship between education and performance on the HVLT in a large sample of elderly. Effect of Intelligence Level
Bolla-Wilson and Bleecker (1986) and Query and Megran (1983) reported an association between IQ and Rey AVLT measures, although there has been some discrepancy regarding which specific scores are affected. Wiens et al. (1988) reported a relationship between FSIQ and age with recall but not recognition, which suggests the utility of recognition in assessment of pathological conditions, free of confounds from demographic and IQ factors. Similarly, Bleecker et al. (1988) did not find an effect of age, sex, or
vocabulary storage on Rey AVLT recognition performance. Delis et al. (2000) reported a correlation of 0.46 between the CVLT-II total recall over five trials and the WASI Vocabulary raw score. A similar association between CVLT performance and WAIS-R Vocabulary was reported by Keenan et al. (1996). Effect of Gender
Gender differences in Rey AVLT performance have been reported in several studies. Women have outperformed men on the learning, interference, and delayed recall trials (Bleecker et al., 1988; Bolla-Wilson & Bleecker, 1986; Cohen et al., personal communication; Geffen et al., 1990; Vakil & Blachstein, 1997). Other studies, however, did not demonstrate an effect of gender on any of the Rey AVLT measures (Savage & Gouvier, 1992; Wiens et al., 1988). As to the CVLT, Norman et al. (2000) reported a significant relationship between gender and performance, with female superiority on several indices. Delis et al. (1987, 2000) reported a significant association between gender and performance on 13 out of 22 indices of the original version of the CVLT, with women outperforming men on all of the list A and list B recall variables, semantic clustering, percent middle (serial position) recall, and number of free recall and cued recall intrusions, whereas men outperformed women on percent recency (serial position) recall. According to Delis et al.'s (2000) literature review, the 1988 study by Kramer et al. supports gender differences found in the above study. Women recalled about one word per trial more than men, although their results did not differ in terms of error type, learning slope, and forgetting, which was 10%-15% from trial 5 to short-delay recall, with no further forgetting from short-delay to long-delay recall. Similarly, Wiens et al. (1994) found significant gender differences on 26 CVLT variables in a sample of 700 healthy job applicants. In their summary of the literature review, Delis et al. (2000) concluded: "Differences between males and females, in favor of females, are to be expected on the CVLT, and these differences seem to stem from the use of
more efficient, semantically based learning strategies by women than by men" (p. ~8). On the other hand, Randolph et al. (1999) questioned the interpretation of female superiority in the CVLT normative reference sample as evidence for a female advantage in verbal memory functions. The authors proposed that "this advantage was stimulus driven and therefore neither strictly language based or memory based. A different selection of stimulus items on the CVLT might conceivably result in superior performance by men" (p. 495). Such an interpretation was well justified in reference to the original version of the CVLT, which was presented in the form of a shopping list. However, stability of the pattern of gender differences, which is also evident in the second edition of the CVLT, has prompted researchers to revisit this issue, given that the CVLT-II does not use the shopping list format and its items are gender-neutral. In fact, according to Delis et al. (2000), total recall across five acquisition trials for women was on the average five words higher than for men on both the original CVLT and the CVLT-II. Similarly, prominent gender effects were seen on many CVLT-II variables, with women outperforming men. Examination of the effect of gender on HVLT-R performance yielded conflicting results. Gender accounted for at most l 7% of variance on total recall according to the test manual (Brandt & Benedict, 2001). Similarly, Kuslansky et al. (2004) did not find a relationship between gender and performance on the HVLT in a large sample of elderly. However, in a study by Vanderploeg et al. (2000) using multiple linear regression analysis, gender contributed 8.5% of unique variance in the prediction of total recall, with women outperforming men. Similar findings were reported by Friedman et al. (2002) on a sample of African-American elderly.
METHOD FOR EVALUATING THE NORMATIVE REPORTS To adequately evaluate the normative reports, eight key criterion variables were deemed critical. The first six of these relate to subject
variables, and the remaining two refer to procedural issues. Minimal requirements for meeting the criterion variables were as follows. Subject Variables Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean. Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, or recruitment procedures are relevant. Until determined, it is best that this information be provided. Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since an unequivocal effect of age on list-learning test performance has been demonstrated in the literature. Reporting of Educational Levels
Given the association between education and performance on list-learning tests, information regarding educational level should be reported for each subgroup, and preferably normative data should be presented by educational levels. Reporting of Intellectual Levels
Given the relationship between performance on list-learning tests and IQ, information regarding intellectual level should be reported for each subgroup, and preferably normative data should be presented by IQ levels. Reporting of Gender Composition
Given the probable relationship between gender and performance on list-learning tests in favor of females, information regarding gender composition should be reported for each subgroup, and preferably normative data should be presented by gender.
Procedural Variables Description of Administration Procedures
Administration procedures for each list-learning test differ widely among studies. A detailed description of the procedures allows selection of the most appropriate norms or corrections in interpretation of the data to account for differences in administration procedures. Data Reporting
The group mean and standard deviation for the number of words correctly recalled/recognized on each trial should be presented at minimum.
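The eight criterion variables listed above can be treated as a simple checklist when screening a normative report. The following sketch is only an illustration: the class, its field names, and the all-or-none coding of each criterion are hypothetical simplifications, and only the threshold of 50 cases comes from the text.

```python
from dataclasses import dataclass
from typing import List

MIN_SAMPLE_SIZE = 50  # "desirable" sample size named in the text


@dataclass
class NormativeReport:
    """Hypothetical record of what a normative study reports."""
    sample_size: int
    describes_sample_composition: bool
    uses_age_group_intervals: bool
    reports_education: bool
    reports_intellectual_level: bool
    reports_gender_composition: bool
    describes_administration: bool
    reports_means_and_sds: bool


def unmet_criteria(report: NormativeReport) -> List[str]:
    """List the criterion variables that the report does not satisfy."""
    checks = [
        ("sample size", report.sample_size >= MIN_SAMPLE_SIZE),
        ("sample composition description", report.describes_sample_composition),
        ("age group intervals", report.uses_age_group_intervals),
        ("educational levels", report.reports_education),
        ("intellectual levels", report.reports_intellectual_level),
        ("gender composition", report.reports_gender_composition),
        ("administration procedures", report.describes_administration),
        ("data reporting", report.reports_means_and_sds),
    ]
    return [name for name, met in checks if not met]
```

In practice each criterion is a matter of degree, as the study reviews show, so a yes/no coding of this kind is at best a first pass.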
SUMMARY OF THE STATUS OF THE NORMS Studies reporting normative data for the list-learning tests vary in their descriptions of procedural and subject variables and in the grouping of obtained data into categories. Among all the studies available in the literature, we selected for review those based on well-defined samples or that offer some aspects of information that are not routinely reported. Most of the reviewed studies provide data for the Rey AVLT. We did not include normative reports for the CVLT-II as comprehensive normative data are provided in the test manual. One study providing norms for the HVLT-R collected on a large sample of African-American participants was reviewed. In addition, we reviewed one study reporting normative data for the WHO-UCLA AVLT, Spanish version, collected on a large sample of monolingual and bilingual Spanish-speaking participants. The majority of the studies present data in number of words correctly recalled/recognized by participants on different test trials. When reported units deviate from this format, the system used is described in the context of that particular table. The clinician is urged to pay close attention to the sequence of trials (e.g., administration of the recognition trial prior to delayed recall) on which the data set is based as it has a notable effect on rate of recall (see above). In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. Data for the Rey AVLT are presented first, followed by review of one study on the HVLT-R and one study on the WHO-UCLA AVLT. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 19. Table A19.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES [RAVLT.1] Rey, 1941, 1964 (Table A19.2)
The author provided normative data for the five learning trials on a sample of 132 French-speaking Swiss participants. The AVLT performance of five groups is reported: manual laborers (n = 25), professionals (n = 30), students (n = 47), elderly laborers (n = 15), and elderly professionals (n = 15). Age ranges are specified only for the last two groups: elderly laborers ranged 70-90 and elderly professionals ranged 70-88. No other descriptive data are provided, such as mean educational or intellectual level and exclusion criteria. Mean scores and SDs across the five learning trials are reported. No other data are available.
Study strengths 1. Large overall sample size. Considerations regarding use of the study 1. Data were collected more than 50 years ago. 2. No data are available for the interference trials or delayed recall. 3. No description of exclusion criteria, IQ levels, years of education, or composition of the sample by gender. 4. Participants were French-speaking, and as discussed earlier, the French test stimuli differ from the currently used English words (i.e., moustache, sun, and belt were included in the French stimuli).
¹ Normative data for children for the Rey AVLT are available in Baron (2004) and Spreen and Strauss (1998).
Other comments 1. Chiulli et al. (1985) reported that use of Rey's norms led to misclassification of 22% of their control group as impaired (using criteria of 1 SD below the appropriate age and education mean). They recommend caution in using Rey's norms, and this concurs with our own clinical experience. [RAVLT.2] Query and Megran, 1983 (Table A19.3)
The study provides norms for 677 male ambulatory inpatients aged 19-81 treated at North Dakota Veterans Administration Medical Center for a variety of physical complaints. Mean education was 11.44 years; mean IQ was 93.83. Psychotic, severely brain-damaged patients and those suffering from major depressive disorder were not included in the study. The sample was divided into age groups. The standard administration procedure was used; recognition was measured with a story in which the subject was instructed to circle words from the learned list. Learning was measured by the difference in number of words recalled on the highest trial vs. the lowest trial. According to the authors, the data suggest effects of education and IQ on maintenance of learning ability for younger men. A more complicated relationship was found for older men. The authors concluded that there is progressive short-term memory loss with advancing age, which results in declines in recall, then recognition, followed by learning ability. Recognition is highly affected by IQ in older men.
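The learning measure used in this study (words recalled on the highest trial minus words recalled on the lowest trial) can be written out as follows. This is a minimal sketch with a hypothetical function name and made-up example scores; it is not the authors' own scoring code.

```python
from typing import Sequence


def learning_score(trial_scores: Sequence[int]) -> int:
    """Highest-trial recall minus lowest-trial recall across acquisition trials."""
    if not trial_scores:
        raise ValueError("at least one trial score is required")
    return max(trial_scores) - min(trial_scores)


# Illustrative (invented) recall counts for acquisition trials I-V.
example = learning_score([5, 7, 9, 11, 12])  # 12 - 5 = 7
```

Note that this index is not necessarily the same as a trial V minus trial I difference, because the highest and lowest scores need not fall on the last and first trials.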
Study strengths 1. The data are broken down by age group. 2. The administration procedure is well outlined. 3. Sample size is large, and most in
5. Information on gender, education, IQ, and geographic area is presented.
Considerations regarding use of the study 1. Data for acquisition trials II-IV are not provided. 2. Demographic characteristics for each group are not reported, even though some of them were used in the analyses. 3. The exclusion criteria indicate that "severely brain damaged" individuals were not included in the sample. This implies that mildly to moderately brain-damaged patients were included. 4. The data were collected on medical patients in an acute-care hospital. 5. Mean IQs nearly fell within the low average range of intellectual ability, and mean educational level was below average. Thus, the data may be relevant only for individuals with IQs in the 80s and 90s. 6. Data are available only for males. [RAVLT.3] Rosenberg, Ryan, and Prifitera, 1984 (Table A19.4)
Ninety-two male psychiatric and neurological inpatients from the V.A. Medical Center in Chicago were divided into memory-impaired and non-memory-impaired groups. Those participants whose memory quotient (MQ) based on the WMS was 12 or more points lower than the WAIS FSIQ and/or those whose MQ was <85 were classified as memory-impaired. The groups did not differ significantly with respect to age or education. Participants were referred to the psychology service for routine psychological and/or neuropsychological evaluation and selected without regard to diagnostic classification. Mean age, education, and FSIQ for the entire sample were 48.05 (14.03), 11.37 (2.82), and 93.11 (13.43), respectively. The test was administered according to Lezak's (1976) description. The paragraph was used for the recognition task.
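The grouping rule described above (MQ at least 12 points below FSIQ and/or MQ below 85) can be stated compactly. The sketch below is an illustration of that rule only; the function name and the example scores are hypothetical.

```python
def is_memory_impaired(mq: float, fsiq: float) -> bool:
    """Apply the grouping rule: MQ 12 or more points below FSIQ, and/or MQ < 85."""
    discrepancy_met = (fsiq - mq) >= 12
    low_mq = mq < 85
    return discrepancy_met or low_mq


# Invented examples:
# MQ 90, FSIQ 102 -> impaired (12-point discrepancy)
# MQ 80, FSIQ 85  -> impaired (MQ below 85)
# MQ 85, FSIQ 96  -> not impaired (11-point discrepancy, MQ not below 85)
```

Because either condition is sufficient, a participant with a uniformly low profile (e.g., MQ 80, FSIQ 82) is classified as memory-impaired even without a discrepancy, which is worth keeping in mind when interpreting the "non-memory-impaired" data.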
Study strengths 1. Information on age, education, IQ, gender, recruitment procedures, and geographic area is provided. 2. Sample sizes are sufficient.
3. Test administration procedures and the type of recognition task are identified. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Data are not partitioned into age groups. 2. The "normal" control sample is not representative of the overall population. It consists of neurological and psychiatric patients. 3. All-male patient sample. 4. Somewhat low IQ level. [RAVLT.4] Cohen, Andres, and Smolen, Personal Communication (Table A19.5) Participants were elderly volunteers (57 women, 28 men) from Peoria, Illinois, and the surrounding communities, aged 60-89 years. Mean education was 13.8 years; 30.6% of the sample were suffering from hypertensive illness, and 18.8% reported a history of head trauma. These participants were included in the study since the results of preliminary regression analyses indicated a lack of association between test performance and these conditions. However, gender was found to be related to performance. Therefore, norms are presented for males and females separately. The administration procedure included acquisition trials, postinterference recall with immediate recognition, and 30-minute delayed recall followed by delayed recognition. The authors concluded that gender, age, and education are related to performance, with women performing better than men.
Study strengths 1. The data are partitioned into age groups. 2. The data are presented for males and females separately. 3. A comprehensive set of scores is presented. 4. Information regarding education, gender, and geographic area is provided. 5. Test administration procedures are generally specified. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Sample sizes are small. 2. The format of the recognition procedure was not specified.
3. No reported exclusion criteria. Participants with histories of head trauma and chronic illnesses (e.g., hypertension) were included. 4. No information on IQ. 5. An immediate recognition trial was used, which facilitates performance on delayed recall and delayed recognition. [RAVLT.5] Ryan, Geisser, Randall, and Georgemiller, 1986 (Table A19.6) The study provides alternate form reliability and equivalency in a group of diagnostically heterogeneous inpatients from the V.A. Medical Center in Kansas (82 males, three females) referred for psychological and/or neuropsychological assessment. The sample represented a wide range of psychiatric and neurological diagnoses, including undiagnosed patients and vocational counseling clients. This sample, therefore, is not representative of any identifiable diagnostic group. For the original administration of the test, the authors followed the standard procedure described in Lezak (1983). The recognition trial consisted of a list of 50 words which included words from list A and list B (interference list) and 20 words phonemically and/or semantically similar to those in lists A and B. For the alternate form, the authors used list C provided by Lezak (1983) and Taylor (1959). Interference and recognition stimuli were constructed by the authors following Lezak's model. The original and alternate forms were presented in a counterbalanced order, with a mean test-retest interval of 140 minutes. Alternate form reliability coefficients ranged 0.60-0.77. Differences between means were less than 1 point on each of the five acquisition trials, postinterference trial, and recognition trials. The forms were judged to be equivalent measures.
Study strengths 1. Administration procedure is well identified. 2. Information on age, education, IQ, gender, ethnicity, recruitment procedures, and geographic area is provided. 3. Sample size is large.
4. Information on alternate form is provided. 5. Means and SDs are reported.
Considerations regarding use of the study 1. The sample is clinically heterogeneous, consists of inpatients, and cannot be considered "normal"; no exclusion criteria. 2. Undifferentiated age group. 3. Predominantly male sample. 4. Low IQ level of the sample. [RAVLT.6] Bleecker, Bolla-Wilson, Agnew, and Meyers, 1988 (Table A19.7) The authors report AVLT data on a large sample (n = 196) of Maryland participants aged 40-89 drawn from the Johns Hopkins Teaching Nursing Home Study of Normal Aging as part of an investigation of the contribution of age, gender, vocabulary range, and depression to AVLT performance. The sample includes participants described in Bolla-Wilson and Bleecker (1986). Participants with histories of head trauma with loss of consciousness, stroke, seizures, uncontrolled hypertension, congestive heart failure, abnormal thyroid function, electroconvulsive therapy, sleep disorders, coma, psychiatric disorders, or alcohol or drug abuse were excluded. All participants had Mini-Mental State Examination scores >23. WAIS-R Vocabulary raw scores were used to estimate verbal intelligence. The presence of depressive symptoms was assessed with the Be
addition, means and SDs for number of words learned (trial V minus trial I) are reported separately for men and women aged 40-65 and 66-89 (not reproduced in this book). Although males and females were comparable in verbal intelligence, women outperformed men on the acquisition trials, especially among the older participants. Older participants scored lower than younger participants, except for the recognition trials. Perseverations, confabulations, and intrusions were not associated with age or gender. A stepwise regression analysis showed that age and gender accounted for a significant portion of the variance on each acquisition trial. Vocabulary accounted for a significant portion of the variance only on trials IV and V. Performance on the recognition trial was not affected by age, gender, or vocabulary. Overall performance was higher for women in comparison to men, with an increase in this tendency with age.
Study strengths 1. Large sample size, but individual cells are small. 2. Presentation of data by age decades and by gender. 3. Stringent medical and psychiatric exclusion criteria. 4. Use of a screening instrument for depressive symptomatology. 5. Information regarding educational level, estimated verbal intelligence, and geographic recruitment area is provided. 6. Specification of an administration format. 7. Means and SDs are reported. Considerations regarding use of the study 1. Very high educational levels for some age groups. 2. Test administration did not include an interference or delayed recall trial. In addition, the recognition trial followed trial V, and thus, the normative data derived from this recognition trial may not be representative of recognition performance associated with the traditional administration format involving an interference trial prior to the recognition trial.
[RAVLT.7] Wiens, McMinn, and Crossen, 1988 (Tables A19.8-A19.10)
The study reports normative data for 222 job applicants, currently employed in a variety of occupations (white collar and blue collar), who had previously passed basic academic skills tests and physical examinations and were free from physical illness or limitations. The applicants represented an occupational cross-section of the community. Participants were free from alcohol and other substance abuse and ranged in age 19-51 years, with a mean age of 29.1 (6.0). Sample composition was 87% male, 13% female; 94.6% Caucasian, 5.4% racial minority. Standard administration was used with a 1-second interval between each word during presentation. On the recognition trial, participants were to circle words from the learned list in a paragraph. The data are stratified by WAIS-R FSIQ, age, and education. The authors noted better recall at higher IQ levels and an inverse trend between age and all acquisition and delayed recall trials. A proactive interference effect was observed for all groups; recall for the second word list was inferior to the initial recall of the first word list. Performance on the recognition trial was unrelated to age and IQ, which points to its importance in studying pathological memory loss.
Study strengths 1. The data are stratified by age, education, and IQ levels. 2. Administration procedures are well outlined. 3. Exclusion criteria appear to be adequate. 4. Sample sizes vary between the groups, but for the majority of the groups they are adequate. 5. Information regarding recruitment procedures, gender, ethnicity, and geographic area is provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Age groups are restricted to the younger range. 2. Some sample sizes are very small.
[RAVLT.8] Crawford, Stewart, and Moore, 1989 (Tables A19.11, A19.12)
The study compared test-retest performance with the same form of the Rey AVLT and with a parallel version of the AVLT. The parallel version was developed by the authors based on the criteria identified in the article. Sixty participants, free of neurological, psychiatric, and sensory disability, were recruited from nonmedical health-service personnel and the fire service. Participants were divided into pairs matched for gender, age (±3 years), and years of education (±1 year) to form two groups. There were no significant differences between groups in mean estimated IQ based on National Adult Reading Test (NART) scores (106 and 108, respectively). One group was administered the original AVLT and the other, the parallel version. Participants were retested following a delay of 27 days (±3 days), with half of each group receiving the same version and the other half, the alternative version. The standard administration procedure was used. On the recognition condition, participants were asked to inform the examiner of the stimulus words that had been contained in the previously presented word lists. A recognition score was obtained by subtracting the number of false-positive identifications from the number of words correctly identified. The authors concluded that the parallel version can be used as an equivalent form of the AVLT. A significant practice effect was seen for participants who were administered the same versions.
Study strengths 1. Information on an alternate form and practice effects. 2. Administration procedure was described. 3. Adequate exclusion criteria. 4. Information on IQ and geographic recruitment area. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Demographic characteristics of the sample are not described. 2. Relatively small sample sizes. 3. Data are not presented by age groupings. 4. Data were collected in the United Kingdom, and it is unclear to what extent they are appropriate for clinical use in the United States.
[RAVLT.9] Nielsen, Knudsen, and Daugbjerg, 1989 (Table A19.13)
The authors gathered AVLT normative data on 101 Danish participants aged 20-54 [...]. Exclusion criteria included history of head trauma, alcohol abuse, prolonged exposure to organic solvents or other toxic agents, presence of somatic or psychiatric disease "which might adversely affect neuropsychological functioning," or use of medications "which could affect intellectual capacity." Fifty-three participants were male and 48 were female; 91% were right-handed. The sample was classified into three age groupings: 20-29 (n = 35), 30-39 (n = 27), and 40-54 (n = 39); only two participants were older than 50. Mean prorated Verbal IQ (based on a translation of the WAIS-R) was 98.61 (12.21), with a range of 78-140; means and SDs for seven subtests are reported. Information on occupational status indicated that skilled workers were somewhat overrepresented compared to the general population, while self-employed individuals were underrepresented. Test administration appeared to be based on the Lezak (1983) instructions. Specifically, "a list of 15 discrete nouns was read aloud at a rate of one per second for five consecutive learning trials, each followed by a free recall test. Delayed recall was tested 15 min after completion of the fifth learning trial" (p. 39). It is not reported if the words were translated. Means, SDs, and ranges are reported for the five acquisition trials, total score across the five trials, and delayed recall.
Study strengths 1. Data are presented by age groupings. 2. Information regarding gender, IQ, geographic recruitment area, age, handedness, and occupation is presented. 3. Adequate exclusion criteria. 4. Means and SDs are reported. 5. Administration procedure is specified.
Considerations regarding use of the study 1. No information regarding educational level. 2. Some information regarding occupational status is reported, but not all of the categories are defined. 3. It is unknown if the stimulus words were translated. 4. No interference trial was administered; thus, the delayed recall information may not be comparable to that obtained if any interference trial had been administered prior to the delayed trial. 5. Individual cell sizes are rather small. 6. Data were collected on a Danish sample, which might limit their clinical usefulness in the United States.
[RAVLT.10] Roth, Davidoff, Thomas, Doljanac, Dijkers, Berent, Morris, and Yarkony, 1989 (Table A19.14)
The authors obtained AVLT data on 61 paid control participants as part of their examination of neuropsychological deficits in acute spinal cord-injured patients. Mean age was 27.5 (standard error = 1.0), and mean education was 12.8 years (standard error = 0.2). Forty-five participants were male and 16 were female; 39 participants were recruited in Detroit and 22 in Ann Arbor. Exclusion criteria were history of closed head injury and recent high-frequency alcohol or substance abuse. The AVLT appeared to have been administered according to the Lezak (1983) instructions. Means and standard errors are reported for the five acquisition trials, list B, postinterference recall, and recognition.
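Because this study reports standard errors rather than SDs, a reader who wants values comparable to the SDs in other datasets can recover them from the sample size. The following Python sketch assumes that each reported value is a standard error of the mean computed on the full sample of 61, which the summary above does not confirm; the function name is illustrative only.

```python
import math


def sd_from_standard_error(standard_error: float, n: int) -> float:
    """Recover an SD from a standard error of the mean: SD = SE * sqrt(n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return standard_error * math.sqrt(n)


# Demographic values reported for the control sample (n = 61).
# Assumption: both standard errors were computed on all 61 participants.
age_sd = sd_from_standard_error(1.0, 61)
education_sd = sd_from_standard_error(0.2, 61)
print(round(age_sd, 1), round(education_sd, 1))  # 7.8 1.6
```

Under that assumption, the age SD is roughly 7.8 years around a mean of 27.5, which is in line with a fairly narrow age spread. The same conversion could be applied to the test-score standard errors in the table, provided the cell size for each score is known.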
Study strengths 1. Information regarding age, gender, education, and geographic recruitment area is provided. 2. Minimally adequate exclusion criteria. 3. Adequate sample size. 4. Means and standard errors are reported. 5. Test instructions are not specified but appear to be standard.
Considerations regarding use of the study 1. Undifferentiated age range, although the small standard error suggests that the age spread is not large. 2. Recognition procedure is not specified.
[RAVLT.11] Geffen, Moar, O'Hanlon, Clark, and Geffen, 1990 (Tables A19.15-A19.17)
The article provides norms for 153 adults residing in Australia in age groups spanning seven decades (16-86 years; M = 44.5, SD = 20.2). Age groups included approximately equal numbers of males and females and were matched for intelligence, education, and occupation. All participants were physically healthy and free of neurological symptoms by self-report. Participants' occupations (current or preretirement) ranged from unskilled to professional. Average education was 11.2 (2.2) years, with a range of 7-22 years. Estimates of FSIQ were derived from error scores on the National Adult Reading Test (NART; Nelson, 1982). Average estimated IQ was 111.5 (7.3), with a range of 94-127. All participants spoke English as the first language. Standard administration procedures were used for recall trials. Twenty-minute delayed recall followed by a recognition condition was used. The delay was filled with the Digit Span, WMS Logical Memory, and NART. The recognition condition consisted of a list of 50 words which included words from lists A and B as well as 20 phonemically or semantically similar distractor words. Participants were to identify as many of the previously learned words as possible, as well as the specific list of origin (list A or B). A specially designed computer program was used for scoring. The following variables were included in the analyses:
1. Recall trials: recall of list A on trials I-V; total recall over five trials; total number of repeated words; extralist intrusions; recall of list B; postinterference recall (retention); and 20-minute delayed recall.
2. Recognition trial: number of words recognized from lists A and B; a nonparametric signal detection measure of recognition performance, p(A) = 0.5(1 + hit rate - false-positive rate), which corrects the recognition score by taking into account misassignments from list A to list B and vice versa; and total number of false-positive misidentifications.
3. Serial position effect was measured for list A averaged over the five acquisition trials. Serial positions of the words were collapsed into five groupings (words 1-3, 4-6, 7-9, 10-12, 13-15).
4. Functional indices, such as proactive and retroactive interference effects, forgetting, retrieval efficiency, and item information overload, were computed as ratios of pairs of trials.
All of the above indices were presented for males and females separately. Based on participants' performance on different trials, the authors made inferences regarding the memory mechanisms involved. The authors noted significant associations between age, gender, and Rey AVLT performance, with females consistently outperforming males. Performance of younger participants was superior to that of older participants.
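The recognition index p(A) described above can be illustrated with a short Python sketch. It assumes that the hit rate is the proportion of target words correctly endorsed and that the false-positive rate is the proportion of nontarget words incorrectly endorsed; how the original scoring program handled list-of-origin misassignments is not detailed here, so that step is left out. The function name and the example counts are hypothetical.

```python
def recognition_p_a(hits: int, false_positives: int,
                    n_targets: int, n_nontargets: int) -> float:
    """Nonparametric recognition index: p(A) = 0.5 * (1 + HR - FP).

    HR and FP are expressed as proportions (assumption, see lead-in).
    """
    if n_targets <= 0 or n_nontargets <= 0:
        raise ValueError("target and nontarget counts must be positive")
    hit_rate = hits / n_targets
    false_positive_rate = false_positives / n_nontargets
    return 0.5 * (1 + hit_rate - false_positive_rate)


# Hypothetical respondent: 13 of 15 list words endorsed, 2 of 20 distractors
# endorsed. These counts are illustrative and not taken from the article.
example = recognition_p_a(hits=13, false_positives=2,
                          n_targets=15, n_nontargets=20)
print(round(example, 3))
```

With this formulation, perfect discrimination gives 1.0, and a respondent whose hit rate equals the false-positive rate (no discrimination) gives 0.5.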
Study strengths 1. Demographic characteristics are well described in terms of age, occupation, education, IQ, fluency in English, and geographic area. 2. Normative data are presented by age group and separately for males and females. 3. The administration procedure is well described. Additional indices were developed to explore different memory mechanisms. 4. Large overall sample size. 5. Adequate exclusion criteria. 6. Means and SDs are reported. Considerations regarding use of the study 1. The 20-minute delay interval was filled with verbal memory tasks, which might serve as a source of interference. 2. The sample had relatively high estimated IQ values, which might limit generalizability of the data. 3. Estimated IQs of the youngest age group were up to 10 points lower than those of
the other age groups; however, this may have been an artifact of the NART since the teenagers probably had not yet been exposed to all the vocabulary items on the NART and, as a result, the test may have underestimated their IQs. 4. Sample sizes per group are small. 5. Data were obtained in Australia, which may limit usefulness for clinical interpretation in the United States.
[RAVLT.12] Ivnik, Malec, Tangalos, Petersen, Kokmen, and Kurland, 1990 (Tables A19.18-A19.20)
The study provides age-specific norms for the Rey AVLT, derived from a sample of 394 cognitively intact volunteers aged 55-97, living in Olmsted County, Minnesota. Participants received general medical examinations performed by their primary-care physicians prior to enrollment in the study. Participants were excluded if they had an active psychiatric or central nervous system condition, complaints of cognitive difficulty during history taking and systems review, findings on physical examination suggesting disorders with potential to affect cognition, certain types and dosages of psychoactive medication, or prior history of disorders causing residual cognitive deficit; chronic medical illnesses were excluded only when they were reported by the physician to compromise cognition. These normative data are believed to be a reasonably unbiased representation of the elderly living in the geographic region. The standard administration procedure was used, including 30-minute delayed recall and recognition trials. The standard scoring procedure was used. Scores were also provided for the number of errors on the recognition trial. In addition, four summary measures (two "learning" and two "memory" scores) were computed:
1. Total Learning (TL) represents a total of five acquisition trials.
2. Learning Over Trials [LOT = TL - (5 x trial I)] is an estimate of an individual's improvement over trials.
3. Short-Term Percent Retention represents trial VI recall as a proportion of trial V recall.
4. Long-Term Percent Retention reflects the Delayed Recall as a proportion of trial V recall.
The data are broken down into seven age groups. In addition, the data for the summary scores are presented using overlapping intervals at specified age midpoints. The authors provide justification for this approach, asserting its advantage over more conventional group assignment. The latter tables should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1990) and not reproduced here. Therefore, only raw data tables are reproduced in this book.
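The four summary measures listed above are simple arithmetic on the raw trial scores. A minimal Python sketch follows; it assumes the two retention measures are expressed as percentages (as their names suggest), and the class name, function name, and example scores are hypothetical rather than taken from the article.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AvltSummary:
    total_learning: int                  # TL: sum of trials I-V
    learning_over_trials: int            # LOT = TL - (5 x trial I)
    short_term_percent_retention: float  # trial VI as a percentage of trial V
    long_term_percent_retention: float   # delayed recall as a percentage of trial V


def summarize_avlt(acquisition_trials: Sequence[int],
                   trial_vi: int,
                   delayed_recall: int) -> AvltSummary:
    """Compute the four summary measures from raw AVLT scores."""
    if len(acquisition_trials) != 5:
        raise ValueError("expected scores for acquisition trials I-V")
    trial_i = acquisition_trials[0]
    trial_v = acquisition_trials[4]
    if trial_v == 0:
        raise ValueError("percent retention is undefined when trial V recall is 0")
    total_learning = sum(acquisition_trials)
    return AvltSummary(
        total_learning=total_learning,
        learning_over_trials=total_learning - 5 * trial_i,
        short_term_percent_retention=100 * trial_vi / trial_v,
        long_term_percent_retention=100 * delayed_recall / trial_v,
    )


# Hypothetical raw scores, for illustration only.
summary = summarize_avlt([6, 8, 10, 11, 12], trial_vi=10, delayed_recall=9)
print(summary)
```

The sketch refuses to compute retention when trial V recall is zero, since the ratio is undefined in that case; how such cases should be handled clinically is not addressed in the summary above.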
Study strengths 1. Demographic characteristics of the sample are well described in terms of geographic area, age, handedness, gender, education, IQ, and marital status. 2. The data are partitioned into age groups. 3. The scoring system is well described. 4. The sample sizes for each group are large. 5. Stringent exclusion criteria are used. 6. Means and SDs are reported. 7. A new technique using overlapping intervals at specified age midpoints is described.
Considerations regarding use of the study 1. The procedure for the recognition trial is not specified. 2. The technique proposed by the authors is quite complicated and should be used in the context of the detailed procedures for its application described in the original article.
[RAVLT.13] Miller, Selnes, McArthur, Satz, Becker, Cohen, Sheridan, Machado, Van Gorp, and Visscher, 1990 (Table A19.21)
The article described the results obtained on homosexual/bisexual males recruited in the Multi-Center AIDS Cohort Study (MACS), an epidemiological project designed to assess the
natural history of HIV-1 infection. The article explores the effect of HIV serostatus and symptom status on cognitive and motor functioning. The administration procedures outlined in Lezak (1983) were employed.
Study strengths 1. Sample sizes are large. 2. Demographic characteristics of each sample are described in terms of gender, race, age, education, and geographic location. 3. Means and SDs are reported. 4. Administration procedures are specified.
Considerations regarding use of the study 1. The study recruited participants aged 21-72. Norms are presented for all ages combined. They would be most accurate for young and middle-aged adults. 2. The data are presented only for trial V and the total for trials I-V. 3. High educational level of the sample. 4. All-male sample. 5. Exclusion criteria are not specified.
Other comments 1. In addition to other demographic variables, the paper reports race composition (white, black, Hispanic, other), CES Depression Scale scores, and CD4 cells/mm³ count.
[RAVLT.14] Shapiro and Harrison, 1990 (Tables A19.22-A19.24)
The authors developed criteria for word selection to generate new lists to be used as alternate forms of the AVLT. Four AVLT forms were compared on two samples of participants: 1. Seventeen elderly participants were recruited from the patient population at the V.A. Medical Center at Salem, Virginia, with a mean age of 66 years. They were rehabilitating primarily from stroke or limb surgery but also had associated medical illness. Seven of these participants carried a clinical diagnosis of dementia. Exclusion criteria were as follows: acute
illness, change in psychotropic medication during the study, and being present on the ward for less than 2 weeks. 2. Twenty-five participants were undergraduate students, with a mean age of 19 years.
Four alternate forms of the AVLT (forms AB, CD, EF, and GH) were used (one trial for the acquisition and interference lists, respectively). List pair AB represented the standard form used for the Rey AVLT; form CD was the alternate list pair provided by Lezak. The remaining two list pairs were generated based on their match to the above lists, according to the criteria developed by the authors. The order of administration was counterbalanced across dates and participants. Only one form was used per session. The interval between sessions varied from 2 to 13 days (mean = 5, SD = 3.6). The standard administration procedure was used for each form. Means and SDs are reported for the five acquisition trials, list B, and postinterference recall. The authors concluded that all four forms yielded comparable mean recall scores. Alternate form reliability coefficients for each comparison ranged from 0.67 to 0.90. The results suggested that the use of alternate forms may eliminate direct practice effects; however, a general practice effect remained due to repeated administrations when the tests are spaced as much as 5 days apart. This effect persisted for a number of days in healthy college students but not among the older patient population.
Study strengths 1. Established reliability indices for alternate forms. 2. Means and SDs are reported. 3. Test administration procedures are specified.
Considerations regarding use of the study 1. Sample sizes are small. 2. Demographic characteristics of the participants are scarcely described (no information on gender, educational level of the older group, IQ, or age ranges).
3. No exclusion criteria are reported for the younger group. 4. Participants in the older group had either neurological or medical illness.
[RAVLT.15] Mitrushina, Satz, Chervinsky, and D'Elia, 1991 (Tables A19.25-A19.27)
The study explored the effect of age on different memory mechanisms in a sample of 156 healthy elderly participants (62 males, 94 females) aged 57-85. The sample was partitioned into four age groups, which did not differ significantly in level of education or FSIQ. Participants with a history of neurological or psychiatric illness (per self-report) were excluded. MMSE scores for all participants were >24. All participants were native English speakers and active in the community. The standard administration procedure was used; the recognition trial, consisting of a paragraph, followed a 10-minute interval after the last recall trial, during which nonverbal tests were administered. The authors reported recall on five acquisition trials, postinterference recall, recognition, number of false-positive misidentifications [...]
Study strengths 1. The data are divided into age groups. 2. Sample composition is described in terms of gender, native language, age, IQ, education, and geographic area. 3. Administration procedure is specified. 4. Adequate exclusion criteria. 5. Data on a comprehensive set of the Rey AVLT scores are provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. The sample represents highly functioning, well-educated, elderly individuals, which might limit the generalizability of results. 2. Sample sizes for the youngest and oldest groups are relatively small. [RAVLT.16] Mitrushina and Satz, 1991a (Tables A19.28, A19.29)
The study examined the magnitude of the practice effect in repeated administration of neuropsychological measures in a sample of 122 healthy elderly participants (49 males, 73 females) aged 57-85 recruited in southern California. This study represents a longitudinal follow-up of the sample described in Mitrushina et al. (1991). Participants with a history of neurological or psychiatric illness (per self-report) were excluded. Mini-Mental State Exam scores for all participants were >24. All participants were native English speakers and active in the community. Mean age was 70.4 (5.0) years, mean education was 14.1 (2.7) years, and mean WAIS-R FSIQ was 118.2 (13.0). The sample was partitioned into four age groups, which did not differ significantly in level of education or gender. Standard administration procedures were followed. The longitudinal data over three annual probes for trial I, trial V, and postinterference recall are presented. The authors concluded that recall on trial I improved over repeated probes for all age groups, which might be attributed to a practice effect, whereas cross-sectional comparisons revealed decreased scores on this trial with age. The effect of longitudinal testing on verbal learning and forgetting (trial V and postinterference recall trial) was negated by the effect of immediate rehearsal of to-be-remembered information over five trials.
Study strengths 1. The data are divided into age groups. 2. Sample composition is described in terms of age, gender, education, IQ, English fluency, and geographic area. 3. Administration procedure is specified. 4. Adequate exclusion criteria. 5. Means and SDs are reported.
Considerations regarding use of the study 1. The sample represents highly functioning, well-educated, elderly individuals, which might limit the generalizability of results. 2. Sample sizes for the youngest and oldest groups are relatively small. 3. No data for trials II-IV.
[RAVLT.17] Selnes, Jacobson, Machado, Becker, Wesch, Miller, Visscher, and McArthur, 1991 (Table A19.30)
This article summarizes results from the MACS described above (see Miller et al., 1990). Data for 733 seronegative homosexual and bisexual males are presented for the purpose of establishing norms for neuropsychological test performance based on a large sample. The standard administration procedure was used. In addition, recall after a 20-minute delay was assessed, followed by a delayed recognition trial. (Recognition was not tested after administration of the postinterference trial.) The total score for five acquisition trials and the score on trial V as well as postinterference recall, delayed recall, and delayed recognition data for three age groups and three education groups are reported. The authors concluded that age and education are important determinants of performance on the Rey AVLT.
Study strengths 1. Data are stratified by age and education. 2. The demographic composition of the sample is well described in terms of age, gender, ethnicity, education, and geographic area. 3. The administration procedure is well outlined. 4. Sample sizes are large. 5. Means and SDs are reported.
Considerations regarding use of the study 1. The generalizability of the results is limited due to high educational level of the majority of the sample. 2. Data for trials I-IV are not reported. 3. All-male sample. 4. No exclusion criteria are reported.
[RAVLT.18] Delaney, Prevey, Cramer, and Mattson, 1992 (Table A19.31)
The authors collected data on 42 control participants as part of an investigation on partial complex and generalized seizures and memory as well as anticonvulsant efficacy. Participants were recruited in Connecticut, California, Florida, Virginia, Massachusetts, New York, and Minnesota. Exclusion criteria were history of neurological or psychiatric disorder or "current drug history that could affect performance." Mean age was 45.8 years (range 22-67), and mean education was 12.8 years (range ~16). The administration format detailed in Lezak (1983) was employed, with the exception that participants were forewarned that a 20-minute delayed recall would occur. During the 20-minute delay period, additional testing involving motor, attentional, and verbal fluency tasks was conducted. An alternate list (C; Lezak, 1983) was administered 1 month later. Means and SDs for trials I, III, and V; postinterference recall; 20-minute delayed recall; and recognition are reported for both test forms. The two forms correlated highly (acquisition trials, r = 0.61-0.86; delayed recall trials, r = 0.51-0.72), providing support for their comparability. Significant correlations were documented between test scores and age (r = -0.22 to -0.55) and education (r = 0.14 to 0.54).
Study strengths 1. Information on alternate forms is provided. 2. Adequate exclusion criteria. 3. Information regarding age, education, and geographic recruitment area is provided. 4. Most test administration procedures are specified. 5. Means and SDs are reported.
Considerations regarding use of the study 1. The exact recognition procedure was not specified. 2. Participants were forewarned regarding delayed recall. 3. Undifferentiated age range. 4. No information regarding IQ or gender. 5. Somewhat small sample. 6. No data for trials II and IV.
[RAVLT.19] Ivnik, Malec, Smith, Tangalos, Petersen, Kokmen, and Kurland, 1992c (Table A19.32)
The study provides age-specific norms for the Rey AVLT obtained in Mayo's Older Americans Normative Studies (MOANS). Information provided in this article updates previous normative data reported by Ivnik et al. (1990). The present study extends the normative base from 394 to 530 participants, refines scoring procedures, and develops a uniform methodology for producing normative information on many different tests. There were no changes in the administration procedure compared to the earlier publications. The sample consisted of cognitively normal volunteers aged 55-97. Age categorization used the midpoint interval technique. Mean MAYO VIQ, PIQ, and FSIQ (which differ somewhat from standard WAIS-R IQs) for the whole sample were 104.8 (10.4), 106.6 (11.5), and 105.8 (10.8), respectively. The sample is almost exclusively Caucasian and living in an economically stable region. The standard administration procedure was used for recall trials. In addition, 30-minute delayed recall followed by a recognition trial was administered. The recognition trial presented 30 words italicized in Lezak's 1976 paragraph as a two-column list, which the subject reads. Participants indicated recognition of a learned word by crossing it off the list. The AVLT component scores were converted to age-corrected and normalized scaled scores, with a mean of 10 and an SD of 3, which are derived from the cumulative frequency [...]
[...] remembered across trials I-V) corrected for immediate word span (trial I): LOT = TL - (5 x trial I score). This index represents the ability to improve upon trial I performance during each of the subsequent four learning trials.
Recall of list B represents an index of proactive interference on later word span.
The Mayo Auditory-Verbal Delayed Recall Index (MAVDRI) includes the following measures: a. Recall after interference represents memory status after a brief delay. b. Recall after 30-minute delay represents memory status after an extended delay. Recognition efficiency.
The MAYO Auditory-Verbal Percent Retention Index (MAVPRI) reflects the common clinical practice of evaluating information recalled after a delay as a function of the amount of data originally learned and includes the following measures: a. Short-Term Percent Retention (STPR) expresses recall after interference as a proportion of trial V recall: STPR = 100 x (trial VI recall/trial V recall). b. Long-Term Percent Retention (LTPR) expresses delayed recall as a proportion of trial V recall: LTPR = 100 x (delayed recall score/trial V recall).
Conversion of raw and computed scores into scaled scores allows greater comparability of these indices to each other and to performance on other tests. These data should be used in the context of the detailed procedures for their application, which are explained in Ivnik et al. (1992c). Therefore, they are not reproduced in this book. Interested readers are referred to the original article.
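The general idea of a normalized scaled score with a mean of 10 and an SD of 3 can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the MOANS procedure: the actual conversion tables and age-midpoint rules are given only in the original article. The midpoint percentile-rank convention, the function names, and the reference scores are all hypothetical.

```python
from statistics import NormalDist
from typing import Sequence


def percentile_rank(score: float, reference_scores: Sequence[float]) -> float:
    """Percentile rank within a reference group (midpoint convention,
    an assumption: ties count as half below the score)."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    below = sum(1 for s in reference_scores if s < score)
    equal = sum(1 for s in reference_scores if s == score)
    return 100 * (below + 0.5 * equal) / len(reference_scores)


def normalized_scaled_score(pct_rank: float,
                            mean: float = 10.0, sd: float = 3.0) -> int:
    """Map a percentile rank to a normalized scaled score (mean 10, SD 3)."""
    if not 0 < pct_rank < 100:
        raise ValueError("percentile rank must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(pct_rank / 100)  # normal deviate for this rank
    return round(mean + sd * z)


# Hypothetical age-group reference scores, for illustration only.
reference = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
rank = percentile_rank(8, reference)
print(rank, normalized_scaled_score(rank))
```

Because the scaled score is tied to rank in the reference distribution rather than to the raw score itself, indices with very different raw-score ranges end up on the same scale, which is the comparability benefit noted above.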
Study strengths 1. Demographic characteristics of the sample are well described in terms of geographic area, age, gender, education, IQ, handedness, and ethnicity.
2. The data are broken down by age group based on the midpoint interval technique. 3. The administration procedure is well described. 4. The innovative scoring system is well described. The authors developed new indices of performance to explore different memory mechanisms. 5. The sample sizes for each group are large. Considerations regarding use of the study 1. Participants with prior history of neurological, psychiatric, or chronic medical illnesses are included. 2. The technique proposed by the authors is quite complicated and should be used in the context of the detailed procedures for its application described in the original article. Other comments The theoretical assumptions underlying this normative project have been presented in Ivnik et al. (1992a,b). The authors cautioned that validity of the MAYO AVLT indices depends heavily on the match of demographic features of the individual to the normative sample presented in the article. [RAVLT.20] Savage and Gouvier, 1992 (Tables A19.33, A19.34)
The study explores the effect of age and gender on Rey AVLT performance. Participants were 134 undergraduate students, senior citizens from community programs, and others (66 males, 68 females), forming seven age groups ranging from the late teens through the seventh decade. Only those participants who completed 12th grade were included in the study (with the exception of those 16-19 years old). All participants completed an extensive medical history questionnaire. Participants with a history of head injury, alcoholism, mental illness, cardiovascular disease, or other conditions associated with impaired memory functioning were excluded. The standard administration procedure was used, including recall after a 30-minute delay, during which participants filled out personal and medical questionnaires. The number of correctly recalled words from list A as well as the number of commission errors or words incorrectly identified as being on the list were recorded. Immediate and delayed recognition trials were used, which consisted of underlining the learned words in the same paragraph for both recognition trials. The authors concluded that there is no effect of gender for any of the trials. An effect of age was evident for all trials, with the exception of trials I and II and list B.
Study strengths 1. Administration procedures are well described. 2. The data are partitioned by age and gender. 3. Adequate exclusion criteria. 4. Information regarding education, recruitment sources, and geographic area is provided. 5. Means and SDs are reported.
Considerations regarding use of the study 1. An immediate recognition trial was used, which facilitates performance on delayed recall and recognition. 2. No information on mean educational level. 3. Sample sizes for each group are small. 4. Normative data for older groups (60-69 and 70-76) are not reported for the delayed recall trials.
[RAVLT.21] Crossen and Wiens, 1994 (Table A19.35)
The study compares performance on the Rey AVLT and the CVLT. Participants were 60 individuals (52 men, eight women) who had applied for jobs in the civil service involving public safety and had passed medical screening examinations. Mean age was 29.9 (6.2) years, mean education was 14.7 (1.6) years, and mean WAIS-R FSIQ was 106.3, with a range of 88-133. All participants were administered both forms in a counterbalanced order, with an interval of about 2-4 hours between administrations. The Rey AVLT was administered according to standard procedures. On the
recognition trial, participants were to read a paragraph and circle the learned words. The CVLT was administered according to the instructions in the manual (Delis et al., 1987). Data for acquisition trials I-V and the total for five acquisition trials, list B, postinterference recall, and recognition are reported. The authors concluded that the CVLT yielded higher scores than the Rey AVLT. There were significant differences in performance on all parameters of the AVLT and corresponding variables of the CVLT, which amounted to one-half to one full word difference on every trial. The results suggested no order effect and a minimal practice effect for different list-learning tests administered in the same test battery.
Study strengths 1. Information regarding age, gender, IQ, and educational level is provided. 2. Administration procedure is outlined. 3. Sample size is sufficient. 4. Adequate exclusion criteria. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. High educational level. 3. Mostly male sample.
[RAVLT.22] Geffen, Butterworth, and Geffen, 1994 (Tables A19.36-A19.39)
The study explored the equivalence between the original form of the Rey AVLT (form 1) and a new form (form 4), as well as the test-retest reliability of both forms. Participants were 51 volunteers (25 males, 26 females) living in Australia with no self-reported history of head injury or neurological abnormality. Mean age was 31.3 (12.7) years; education ranged 6-20 years, with a mean of 12.2 (2.4); and estimated IQ (based on NART) ranged 100-128, with a mean of 115.6 (6.26).
A new form of the AVLT (form 4) was generated by the authors based on criteria developed by them. All participants were tested on both form 1 and form 4 in a counterbalanced order, with an interval of 6-14 days between sessions. The standard administration procedure was used, with the exception of a 20-minute delay interval filled with verbal cognitive and verbal memory tests. A list of 50 words was used for the delayed recognition trial. Participants were asked to identify words from lists A and B and to specify the list of origin. The authors reported recall on five acquisition trials, the total number of words recalled over five trials, the interference trial, postinterference recall, delayed recall trial, number of words recognized from lists A and B, and a nonparametric measure of recognition performance, p(A) = 0.5(1 + HR - FP), which corrects the recognition score (hit rate, HR) by taking into account false-positives (FP). The authors concluded that the forms are equivalent. In test-retest conditions, the highest reliability was demonstrated by the total number of words learned over the five acquisition trials (r = 0.77) and performance on the postinterference trial (r = 0.70).
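The nonparametric recognition measure p(A) = 0.5(1 + HR - FP) can be illustrated with a minimal Python sketch. It assumes that HR and FP are both expressed as proportions between 0 and 1 (hits divided by the number of targets, false-positives divided by the number of distractors); the function name and the input checks are hypothetical and are not taken from the original study.

```python
def recognition_p_a(hit_rate: float, false_positive_rate: float) -> float:
    """Nonparametric recognition measure p(A) = 0.5 * (1 + HR - FP).

    Assumption: both arguments are proportions in the range 0..1.
    """
    for name, value in (("hit_rate", hit_rate),
                        ("false_positive_rate", false_positive_rate)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")
    return 0.5 * (1.0 + hit_rate - false_positive_rate)


if __name__ == "__main__":
    # Perfect recognition: every target identified, no false-positives.
    print(recognition_p_a(1.0, 0.0))
    # Hit rate equal to the false-positive rate gives the chance value of 0.5.
    print(recognition_p_a(0.5, 0.5))
```

Under this formula a participant who endorses targets and distractors at the same rate receives 0.5, so values above 0.5 indicate discrimination better than chance.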
Study strengths 1. Demographic characteristics of the sample are described in terms of age, gender, education, IQ, and geographic area. 2. Administration and scoring system are described. 3. Sample size is adequate. 4. A comparison of two alternate forms is provided. 5. Means and SDs are reported.
Considerations regarding use of the study 1. The 20-minute delay interval was filled with verbal cognitive and verbal memory tests, which might cause interference with delayed testing. 2. The sample has high estimated IQ values, which might limit the generalizability of data. 3. Data were collected in Australia, which may limit their usefulness for clinical interpretation in the United States. 4. Minimal exclusion criteria. 5. Undifferentiated age range.
[RAVLT.23] Torres, Flashman, O'Leary, and Andreasen, 2001 (Table A19.40)
The authors reported data for 160 healthy adult controls in a study on the effects of interference on word list recall in schizophrenia. The sample included 84 males and 76 females
from the community, with mean age of 29.0 (10.2) and 30.5 (11.7) years and mean education of 14.4 (2.1) and 14.8 (1.9) years, respectively. WAIS-R VIQ and FSIQ as well as parental socioeconomic status are reported. Participants were screened for psychiatric, neurological, and substance abuse history. The Rey AVLT was administered as part of a comprehensive battery, which included measures of various aspects of frontal/executive functions and memory. The standard administration procedure was used (Lezak, 1995). Recall on acquisition trials I and V, interference list, and recall after interference are reported for males and females separately. In addition, the authors analyzed patterns of learning and proactive interference for patients and controls equalized on trial I performance as well as relationships between measures of proactive/retroactive interference and executive or memory functions. These data are not replicated in this book. The authors concluded that increased susceptibility to retroactive interference is related to frontally mediated central executive functions.
Study strengths 1. Large sample size. 2. The sample composition is well described in terms of age, education, gender, VIQ, and FSIQ. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Recruitment procedures are not reported. 2. The data are not partitioned by age group. 3. FSIQ for the sample falls within the high average range. 4. Data for trials II-IV are not reported.
[RAVLT.24] Miller, 2003; Personal Communication (Table A19.41)
The investigation used participants from the MACS. Data were collected from 920
seronegative homosexual and bisexual males for the purpose of establishing norms for neuropsychological test performance based on a large sample. Table A19.41 overlaps with the data reported earlier by Selnes et al. (1991), which were reanalyzed to provide norms stratified by age x education. Mean age for the sample was 38.2 (7.4) years and mean education was 16.3 (2.4) years; 91.5% were Caucasian, 2.6% Hispanic, 4.9% black, and 1% other. All participants were native English speakers. The standard administration procedure was used. In addition, recall after a 20-minute delay was assessed, followed by a delayed recognition trial. (Recognition was not tested after the postinterference trial.) Scores for acquisition trials I and V, totals for five acquisition trials, interference trial, 20-minute delayed recall, delayed recognition, and number of false-positive errors for three age groups x three educational levels are reported.
Study strengths 1. The overall sample size is large, and most of the individual cells have more than 50 participants. 2. Normative data are stratified by age x education. 3. Information on age, education, ethnicity, and native language is reported. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. All-male sample. 2. No information on IQ is reported. 3. No information on exclusion criteria. 4. Data for trials II-IV are not reported.
[HVLT-R.1] Friedman, Schinka, Mortimer, and Borenstein Graves, 2002 (Table A19.42)
Data were obtained from the Hillsborough Elder African American Life Study (HEALS). The authors examined the influence of demographic characteristics on HVLT-R performance in a community-dwelling sample of 237 African-American older adults (108 men, 129 women), aged 60-84 years and living in Tampa, Florida. Mean age and education are
provided for each demographic group separately. Information about the participants was obtained using structured interviews. Form 1 of the test was administered following the standard procedure, with the exception of addition of a cued recall (by category) trial following delayed free recall. Performance indices included Delayed Cued Recall and Learning in addition to the standard indices. The authors found a moderately large effect of age and moderate effects of education and gender on test performance. Therefore, the data for all performance indices are partitioned into two age levels and three educational levels and reported for males and females separately. In addition to raw performance indices, the authors provide tables that convert raw data into percentiles, with corresponding education and gender adjustments for the two age groups. In Table A19.42, we reproduced the data for selected performance indices: Total Recall (sum of trials 1 through 3), Delayed Recall, Percent Retention [(Delayed Recall / higher of the scores on trials 2 and 3) x 100], and Recognition Discrimination Index (total true-positive - total false-positive score). For the normative data on other indices, see the original article.
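As an illustration of how the derived HVLT-R indices above are computed from raw scores, the following is a minimal Python sketch. It assumes that Percent Retention is Delayed Recall divided by the higher of the trial 2 and trial 3 scores, multiplied by 100, and that the Recognition Discrimination Index is the number of true-positives minus the number of false-positives, as the index is usually defined for the HVLT-R. The function names and example scores are hypothetical.

```python
def total_recall(trial1: int, trial2: int, trial3: int) -> int:
    """Total Recall: sum of the three learning trials."""
    return trial1 + trial2 + trial3


def percent_retention(delayed_recall: int, trial2: int, trial3: int) -> float:
    """Percent Retention: Delayed Recall relative to the better of trials 2 and 3."""
    best_learning_trial = max(trial2, trial3)
    if best_learning_trial == 0:
        # The ratio is undefined when no words were recalled on either trial.
        raise ValueError("Percent Retention is undefined when trials 2 and 3 are both 0")
    return delayed_recall / best_learning_trial * 100


def recognition_discrimination_index(true_positives: int, false_positives: int) -> int:
    """Assumed definition: true-positive responses minus false-positive responses."""
    return true_positives - false_positives


if __name__ == "__main__":
    # Hypothetical raw scores for a single examinee.
    print(total_recall(6, 8, 10))
    print(percent_retention(delayed_recall=9, trial2=8, trial3=10))
    print(recognition_discrimination_index(true_positives=12, false_positives=1))
```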
Study strengths 1. Large sample size. 2. The sample composition is described in terms of age, education, gender, and geographic area. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported, as are tables for conversion of raw scores into T scores. 5. The data are partitioned by age, education, and gender.
Considerations regarding use of the study 1. Although information obtained in structured interviews is described, exclusion criteria are not specified. 2. Recruitment procedures are not reported, although a reference is provided to another study. 3. No information on IQ is reported. 4. The administration procedure was modified to include a delayed cued recall trial. This might facilitate performance on the Recognition trial.
[WHO-UCLA AVLT.1] Ponton, Satz, Herrera, Ortiz, Urrutia, Young, D'Elia, Furst, and Namerow, 1996 (Table A19.43)
The WHO-UCLA Auditory Verbal Learning Test, Spanish version, was administered to Spanish-speaking volunteers as part of a larger battery in a project designed to provide standardization of the Neuropsychological Screening Battery for Hispanics (NeSBHIS). Volunteers were recruited through fliers and advertisements in community centers of the greater Los Angeles area over a period of 2 years. Exclusion criteria were a history of neurological or psychiatric disorder, drug or alcohol abuse, and head trauma. Data for a sample of 300 participants with a median educational level of 10 years were analyzed. Participants ranged in age 16-75 years, with a mean of 38.4 (13.5) years. Education ranged 1-20 years, with a mean of 10.7 (5.1) years. The male to female ratio was 40%/60%. The average duration of residence in the United States was 16.4 (14.4) years. Seventy percent of the sample were monolingual Spanish-speaking, and 30% were bilingual. The proportion of the sample respective to their country of origin closely approximates the 1992 U.S. Census distribution. Correlations between Marin and Marin (1991) acculturation scale scores and neuropsychological variables are provided. Five acquisition, interference, recall after interference, and 20-minute delayed recall trials were administered. The authors reported data for recall on trial V, recall after interference, and delayed recall.
Study strengths 1. Large overall sample with acceptable size for most cells. 2. The sample composition is well described in terms of age, education, gender, acculturation information, geographic area, and recruitment procedures. 3. Adequate exclusion criteria.
4. Test administration and scoring procedures are specified. 5. Means and SDs for the test scores are reported. 6. Data are partitioned by gender x age x education.
Considerations regarding use of the study 1. No information on IQ is reported. 2. It is unclear which of the two educational groups included participants with 10 years of education.
RESULTS OF THE META-ANALYSES OF THE REY AVLT DATA (See Appendix 19m)
Data collected from the studies reviewed in this chapter were combined in regression analyses, to describe the relationship between age and Rey AVLT performance and to predict test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analyses and predicted test scores across adult age groups are provided in Appendix 19m. Separate analyses were performed on the data for trial I, trial V, Recall after Interference (RAI), Recognition, and Total Recall for five acquisition trials. In addition, patterns of learning (trial V - trial I) and forgetting (trial V - RAI) were examined. Original data corresponding to three points along the age continuum are presented in order to demonstrate the slope of decline in learning capacity. A separate run of the regression analysis was performed on the forgetting scores. The predicted values for the forgetting curve based on linear regression (R² = 0.743, F(1,18) = 52.38, p < 0.0004) are presented for comparison purposes in the summary table (see Table A19m.6). Supporting statistics for the forgetting scores are not reported. Data for Delayed Recall were not analyzed due to inconsistency in terms of administration procedures and high variability in the delay interval. After data editing for consistency and for outlying scores, the following data were
included in the analyses: eight studies, which generated 24 data points based on a total of 1,910 participants for trial I; eight studies, which generated 23 data points based on a total of 1,901 participants for trial V; seven studies, which generated 20 data points based on a total of 983 participants for RAI; four studies, which generated 14 data points based on a total of 453 participants for Recognition; and six studies, which generated 20 data points based on a total of 1,699 participants for Total Recall. Quadratic regressions of the test scores on age yielded R² of 0.842 for trial I, 0.877 for trial V, 0.923 for RAI, 0.948 for Total Recall, as well as R² of 0.892 based on linear regression for Recognition, indicating that 84%-95% of variance in the test scores for the five measures is accounted for by the models. Based on these models, we estimated scores for the five measures for age intervals between 20 and 79 years. If predicted scores are needed for age ranges outside the reported boundaries, with proper caution (see Chapter 3), they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. The predicted scores are relevant for the following administration sequence: five acquisition trials, interference trial, recall after interference, and recognition (immediate or after a short delay). It should be noted in the context of across-condition comparisons that mean age for the Recognition condition is considerably lower than mean ages for other conditions because data for several large studies based on the older samples were not available for the Recognition condition. Regressions of SDs on age for all five conditions suggest that age does not account for a significant amount of variability in SDs (R² ranges 0.010-0.471). Though some increase in variability with advancing age is expected, this trend was not statistically significant in the collected data.
Therefore, we suggest that the mean SD for the aggregate sample be used across all age groups. Examination of the effects of demographic variables on Rey AVLT scores indicated that education did not contribute significantly to scores in the data available for analyses. The
effect of intelligence level on Rey AVLT performance was not explored due to a scarcity of data available for review. The effect of gender on test performance was examined using a series of t-tests. A statistically significant difference in favor of males was found for trial I, trial V, and Recognition. This is in contrast to some evidence of superiority of females available in the literature. The significant findings of gender differences in our analyses are not reliable as the numbers of data points included in the analyses are very small for both male and female samples (seven and five for trial I; seven and four for trial V; five and five for Recognition). Analyses for RAI, which included nine and eight data points, and for Total Recall, which included eleven and nine data points, did not yield significant gender differences. This suggests that the gender differences found in the earlier analyses are likely due to individual differences between samples.
Strengths of the analyses 1. Total sample size of 1,910 for trial I; 1,901 for trial V; 983 for RAI; 453 for Recognition; and 1,699 for Total Recall. 2. R² of 0.842 for trial I, 0.877 for trial V, 0.923 for RAI, 0.892 for Recognition, and 0.948 for Total Recall, indicating a good model fit. 3. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity, with the exception of the marginally significant tests for normality for RAI and Recognition. 4. Although the data available for different conditions vary considerably in terms of the number of studies included and resulting sample sizes, a comparison of weighted means for different conditions suggests that inter-trial differences are very close to those reported in the literature. 5. The predicted values match closely the metanorms provided by Schmidt (1996) for some variables and are similar in the rate of age-related changes for most of the variables.
Limitations of the analyses 1. Postestimation tests for normality for RAI and Recognition were marginally significant. The Kdensity plot for RAI demonstrated an asymmetrical curve, with a nearly bimodal distribution of residuals, reflected in a small bump at the lower extreme of the curve, possibly pointing to a diagnostic significance of low scores on the RAI. The Kdensity plot for Recognition revealed a negatively skewed distribution of the residuals. These deviations from normality do not affect the estimates of regression coefficients and accuracy of prediction but do influence the results of significance tests. 2. Data for only a narrow range of higher levels of education are available for the analyses (12.~16.0 years or even more narrow, depending on the variable). Mean educational levels for all variables range 13.92-14.10. We were unable to fully explore the effect of education on the test scores because lower educational levels are not represented. Though reports on the relationship between education and test scores are equivocal, a number of studies suggest that higher levels of education are associated with better test performance. Therefore, the predicted values might overestimate expected scores for individuals with lower educational levels. 3. Although the effect of intellectual level on Rey AVLT performance has been reported in several studies, we could not include measures of intellectual level in our analyses due to scarcity of this information in the data available for analyses. 4. Equivocal findings for gender differences in test performance are likely to be due to small numbers of data points for each gender included in the analyses.
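The use of the regression-based norms described in this section can be sketched in Python as follows. The sketch assumes a quadratic model of the form score = b0 + b1(age) + b2(age²) and a single aggregate SD applied across all ages, as suggested above. The coefficients and SD in the example are hypothetical placeholders chosen for easy arithmetic; they are not the values reported in Appendix 19m, which should be taken from the tables themselves.

```python
def predicted_score(age: float, b0: float, b1: float, b2: float) -> float:
    """Predicted test score from a quadratic regression on age."""
    return b0 + b1 * age + b2 * age ** 2


def z_score(observed: float, age: float,
            coefficients: tuple, aggregate_sd: float) -> float:
    """Standardized distance of an observed score from the age-predicted score.

    Assumption: the same aggregate SD is applied to every age group.
    """
    if aggregate_sd <= 0:
        raise ValueError("aggregate_sd must be positive")
    expected = predicted_score(age, *coefficients)
    return (observed - expected) / aggregate_sd


if __name__ == "__main__":
    # HYPOTHETICAL coefficients and SD, for illustration only.
    example_coefficients = (12.0, 0.05, -0.001)
    example_sd = 2.0
    print(predicted_score(50, *example_coefficients))
    print(z_score(10, 50, example_coefficients, example_sd))
```

A negative result indicates an observed score below the value predicted for that age. As noted in the limitations above, predictions may overestimate expected scores for individuals with lower educational levels.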
CONCLUSIONS
A review of the literature on the list-learning tests suggests high clinical utility of these tests due to their sensitivity to disruption in different memory mechanisms reflected in various indices derived from the test performance. Hence the importance of reporting a full range of test scores in clinical practice and in research, including delayed recall and recognition (if administered) to assure optimal use of these tests in identifying faulty memory mechanisms. Moreover, thorough description of administration procedures, especially deviations from standard procedures, is of utmost importance for accurate interpretation of test scores. Due to the considerable effect of age on list-learning test scores, individual sets of data should be referenced to an appropriate age group. Similarly, education, intelligence level, and gender are possible contributing factors to test score variance. Therefore, these demographic characteristics should also be of concern for clinicians and investigators alike. For future research on list-learning tests, classification of data into age groups, educational/IQ levels, and gender would be desirable. Development of new and validation of existing alternate forms for the Rey AVLT would aid clinicians and researchers in minimizing carry-over effects in retesting situations. The utility of different list-learning tests with different diagnostic groups should be further explored. In addition, further studies on different memory mechanisms using concepts adapted from cognitive science would be of great value in understanding the underlying cognitive processes associated with performance on these tests.
20 Benton Visual Retention Test
BRIEF HISTORY OF THE TEST
The Benton Visual Retention Test (BVRT) measures immediate visual memory and visual perceptual-constructional skill and has several administration formats and versions. Each standard version contains 10 line drawings on individual pages measuring 8.5" x 5.5" in an easel-style booklet. The patient uses a response booklet that contains 10 blank pages on which the patient reproduces the designs. The first two designs contain single large stimuli, while the remaining eight pages depict two larger figures and one peripheral figure. There are three alternate versions of the 10 cards: forms C, D, and E. In the first administration format (A), each card is presented for 10 seconds and the patient is to draw the design immediately after removal of the stimulus. In administration B, the clinician presents each stimulus for only 5 seconds prior to the patient's attempt at reproduction. In administration C, the patient copies each design while designs remain in view. In administration D, each stimulus is presented for 10 seconds but a 15-second delay is interspersed between presentation and reproduction. The scores derived from the test include total correct out of 10 and total errors. The upper range of errors is approximately 24, although the possible range extends higher because four errors can be committed on each
card. Six general categories of errors can be scored: omissions, distortions, perseverations, rotations, misplacements, and size errors. Some authors have suggested that application of a background interference procedure to standard administration of the BVRT may have utility (Crockett et al., 1983). Further information regarding administration and scoring can be found in the manual for the BVRT, 5th edition (Benton-Sivan, 1992). A 15-item recognition version was published in French by Benton in 1965 (in Fabrigoule et al., 1998), and in 1977 Benton and colleagues produced an unpublished manuscript in English detailing the recognition administration. In this administration, the usual BVRT three-figure constellation, with two larger figures and one peripheral figure, is presented for 10 seconds. Immediately following removal of the stimulus, the subject is presented with another card that has four similar arrays. Three of these choices are variations on the original stimuli, while the fourth design matches the target exactly. The patient is to choose the alternative which matches the original stimulus. Benton also reported an administration in which the target is presented simultaneously with the four choices, to measure visual discrimination rather than memory; this version subsequently was renamed the Visual Form Discrimination Test.
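The two summary scores described earlier in this section (number correct out of 10 and total errors, broken down into the six error categories) can be tallied as in the following Python sketch. It assumes that a design counts as correct only when no error of any category is scored on it; the data layout, function name, and example protocol are hypothetical, and the scoring rules in the manual take precedence.

```python
from collections import Counter

# The six general error categories named in the text.
ERROR_CATEGORIES = (
    "omission", "distortion", "perseveration",
    "rotation", "misplacement", "size",
)


def score_bvrt(cards):
    """Tally BVRT summary scores.

    cards: a list of 10 lists, one per design, each holding the error
    categories scored on that design (an empty list means no errors).
    Assumption: a design is correct only if it has no scored errors.
    """
    if len(cards) != 10:
        raise ValueError("a standard form has exactly 10 designs")
    tally = Counter()
    for errors in cards:
        for category in errors:
            if category not in ERROR_CATEGORIES:
                raise ValueError(f"unknown error category: {category}")
            tally[category] += 1
    number_correct = sum(1 for errors in cards if not errors)
    return {
        "number_correct": number_correct,
        "total_errors": sum(tally.values()),
        "errors_by_category": dict(tally),
    }


if __name__ == "__main__":
    # Hypothetical protocol: eight error-free designs, two designs with errors.
    protocol = [[] for _ in range(8)] + [["omission"], ["rotation", "distortion"]]
    print(score_bvrt(protocol))
```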
Some studies have indicated that patients with right hemisphere or diffuse lesions perform worse than patients with left hemisphere damage (Crockett et al., 1983), but other data have failed to confirm a relationship between lesion laterality and BVRT performance (Arena & Gainotti, 1978; Benton, 1962). Soininen and colleagues (1994) reported a significant relationship between BVRT scores and volume of the right hippocampus and magnitude of asymmetry between the right and left hippocampi in patients with age-associated memory impairment and that the lowest-scoring patients had reduced volume of both right and left amygdala. Further, patients with anterior lesions have been reported to demonstrate more perseverations than patients with posterior lesions (Vilkki, 1989). A sample of patients with thalamic stroke was described as exhibiting moderate to severe impairment on the BVRT (Radanovic et al., 2003). Poorer BVRT scores have also been associated with larger white-matter hyperintensity volume (Swan et al., 1998). The larger the lesions, the higher the percentage of errors, especially of distortion and laterality (Kasahara et al., 1993, 1995). In addition, lesions in basal ganglia were associated with distortion errors, while lesions in the thalamus were related to perseverations (Kasahara et al., 1995). The relationship between these subcortical lesions and BVRT errors may at least partially explain the decline in BVRT performance observed with age; 70% of elderly participants are found to have these abnormalities on magnetic resonance imaging (MRI) (Kasahara et al., 1995). BVRT performance is also lowered in elderly men with total brain volume below the median (Carmelli et al., 2000). No relationship between BVRT scores and resting regional cerebral glucose metabolism rates has been observed (Haxby et al., 1986).
In mixed dementia patients (Eslinger et al., 1985) and in Alzheimer's disease patients (Jacqmin-Gadda et al., 2000), the BVRT was one of the most sensitive tests included in a dementia screening battery; it is also effective for discriminating mild Alzheimer's disease from Age-Associated Memory Impairment (Youngjohn, 1992). BVRT performance in individuals with memory loss without diagnosis
of dementia is highly predictive of significant decline in visual memory performance 3 years later (Small et al., 1995), and the BVRT was one of three tests predictive of Alzheimer's disease diagnosis 1-3 years later (Dartigues et al., 1997). Individuals with Alzheimer's disease show larger declines in BVRT performance in the 6 years prior to diagnosis (Zonderman et al., 1995), and a greater number of BVRT errors is associated with an increased risk of Alzheimer's disease up to 15 years later (Kawas et al., 2003). The memory trial, but not the copy trial, of the BVRT discriminates normals from patients with very mild Alzheimer's disease (Robinson-Whelen, 1992), and worse performance on the BVRT copy but relatively better performance on BVRT delayed recall predicted a more rapid cognitive decline (Rasmusson et al., 1996). BVRT performance declined less rapidly than language skills over a 2-year period in Alzheimer's patients (Rebok et al., 1990). BVRT performance was reported to differ in individuals with a family history of Alzheimer's disease vs. those without (Smalley et al., 1992) and to distinguish persons with one apolipoprotein E4 allele from those with none (Soininen et al., 1995). Palmer and colleagues (1994) reported that there are two distributions of BVRT scores in offspring of patients with Alzheimer's disease, with 6% of participants performing in a low-scoring cluster hypothesized to be at-risk carriers of a putative Alzheimer's gene. In addition to the sensitivity of quantitative scores to Alzheimer's disease, qualitative analysis of errors has shown promise in the detection of Alzheimer's disease. Specifically, increases in omission and perseveration errors appear to selectively occur in Alzheimer's disease (La Rue et al., 1986; Vollant et al., 1986).
However, error types, with the exception of distortion errors, for both the memory and copy trials may be less useful in tracking the progression of dementia due to the prominence of omission errors as dementia severity increases. BVRT scores are comparable in individuals with Alzheimer's disease and Parkinson's disease with dementia but lower than those observed in Parkinson's disease without dementia (Kuzis et al., 1999). BVRT scores were also lowered in individuals who were
carriers for Huntington's disease but not clinically diagnosed with the condition (Witjes-Ane et al., 2003). Patients with right or left temporal lobe epilepsy score significantly below controls on the BVRT but not differently from each other (Helmstaedter et al., 1995; Mayeux et al., 1980) or from patients with generalized epilepsy (Mayeux et al., 1980). However, the performance of patients with right temporal lobe epilepsy, but not that of left foci patients or controls, appears to be mediated by verbal learning capacity; the selective impairment in right temporal lobe seizure patients becomes apparent when the complexity of the items exceeds the compensatory verbal memory capacity (Helmstaedter et al., 1995). In addition, patients with temporal lobe epilepsy show distinct eye movements when completing the BVRT that vary with laterality of seizure foci and suggest functional overactivation of the epileptogenic hemisphere (Sonobe et al., 1991). Reduction of polypharmacy in epileptic patients has been associated with improvement in BVRT scores (Ludgate et al., 1985). The BVRT was the only cognitive test of 15 administered which was significantly lowered in HIV+ men with no or minor symptoms (Collier et al., 1992). BVRT performance is also suppressed in multiple sclerosis (Ruggieri et al., 2003), with nearly 50% of this population showing declines on this instrument (Arias et al., 1991), although results have been contradictory regarding the relationship between extent of decline on the BVRT and duration of illness (Halligan et al., 1988; Ruggieri et al., 2003). BVRT scores are also lowered in Turner's syndrome (Downey et al., 1991), in females from fragile X families (Miezejeski et al., 1986), and in female carriers of the fragile X syndrome who inherited the fragile X chromosome from their mothers (Hinton et al., 1992).
The BVRT is suppressed in schizophrenic patients (Goldsmith & Brengelmann, 1911) and in patients with tardive dyskinesia (Bartels & Themelis, 1983), with clinical improvement, as measured by the Positive and Negative Syndrome Scale (PANSS), significantly correlated with fewer errors on the BVRT (*ollnik et al., 2002). At least some of the lowered
performance on the BVRT in schizophrenia patients appears to be due to poor visual scanning of the figures (e.g., failure to look at peripheral figures) (Obayashi et al., 2003; Tsunoda et al., 1992). Identification of happy faces was significantly associated with BVRT performance in schizophrenic patients, although identification of sad or neutral faces was not (Silver et al., 2002). Bipolar patients also show deficient performance on the BVRT (Loo et al., 1981), although lithium treatment in this population has not been related to BVRT scores (Engelsmann et al., 1988). BVRT performance of depressed patients is lower relative to controls (Crookes & McDonald, 1972; Dealberto et al., 1996; La Rue et al., 1986; Mormont, 1984) and further suppressed in the presence of psychotic features but not impacted by the effects of medication treatment (Shipley et al., 1981). BVRT scores do not decline post-ECT or in response to raised blood pressure post-ECT (O'Donnell & Webb, 1986). BVRT performance has been reported to be lowered in PTSD (Kirling-Boden & Sundbom, 2003) but not in women traumatized by childhood sexual abuse (Stein et al., 1999). Children and adolescents with learning disabilities have exhibited a high level of errors, especially distortions (Snow, 1998), and children with attention-deficit hyperactivity disorder (ADHD) receiving stimulant medication have also scored significantly lower on the BVRT (Risser & Bowers, 1993). Abusers of cocaine and heroin and polydrug abusers exhibit lowered BVRT performance (Amir & Bahri, 1994; Rosselli et al., 2001b). Cannabis-dependent adolescents score lower on the BVRT than controls but show some improvement in test scores after 6 weeks of abstinence (Schwartz et al., 1989). Alcoholics with cirrhosis of the liver display particularly depressed BVRT scores (Arria et al., 1991). In 1983, at the behest of the World Health Organization and the U.S.
National Institute for Occupational Safety and Health, experts proposed the Neurobehavioral Core Test Battery (NCTB) to identify nervous system effects of chemical exposures in humans worldwide. The BVRT, administered in a recognition format, was included in this battery and in this context has been widely used to assess the
effects of workplace chemical exposures in various cultures. BVRT scores were lower in mercury-exposed Norwegian workers (Ellingsen et al., 2001; Mathiesen et al., 1999), and the BVRT was the most sensitive measure of mercury exposure in Chinese workers, registering an effect size of 6.0 (Zhou et al., 2002). BVRT performance was also lower in Chinese workers exposed to lead (Hai-Wang et al., 1995) and in Indian petrol pump workers (Kumar et al., 1988). Similarly, Indian workers exposed to organophosphates (Misra et al., 1994) and Egyptians who worked with organophosphorous pesticides (Farahat et al., 2003) have demonstrated lower BVRT scores. Japanese workers and female workers from Singapore exposed to toluene, Singapore workers exposed to styrene, Korean painters and printers, and spray painters in Singapore have shown lower BVRT scores (Chia et al., 1994; Foo et al., 1990; Kishi et al., 1993; Lee & Lee, 1993; Ng et al., 1992), with loss of color vision in exposed populations particularly associated with lower BVRT scores (Dick et al., 2004). BVRT scores were not lower in rescue workers 3 years after the Tokyo subway sarin attack (Nishiwaki et al., 2001), in paint formulators in Singapore (Foo et al., 1994), or in Venezuelan workers exposed to organic solvents (Escalona et al., 1995). Some data suggest that patients with elevated diastolic and systolic blood pressure readings show declines on the BVRT (Reinprecht et al., 2003), although some studies have failed to detect this relationship (Swan et al., 1998). Patients post-cardiopulmonary bypass may show slight improvement on the test (Zeitlhofer et al., 1993). Diabetes has not been found to suppress BVRT scores after adjustment for age and education (Robertson-Tchabo et al., 1986).
BVRT performance appears to be highly predictive of 3-year survival status in chronic obstructive pulmonary disease (COPD) patients (Fix et al., 1985), but scores did not improve in COPD patients administered long-term oxygen treatment (Borak et al., 1996), and BVRT scores are not lowered by snoring or sleep-related breathing-stoppage episodes (Dealberto et al., 1996). No difference in BVRT performance was reported in lung cancer patients treated with chemotherapy vs. radiotherapy (Kaasa et al., 1988). Long-term liver transplant survivors show significantly depressed BVRT scores (Lewis & Howdle, 2003). Conflicting data regarding the effects of estrogen on BVRT performance have been reported, with some investigators citing a reduction in BVRT errors with estrogen replacement therapy (Resnick et al., 1997) and others finding no relationship between estrogen level and test scores (Portin et al., 1999). Individuals who have never smoked perform better on the BVRT relative to current smokers or recent quitters, and light drinkers (≤1 drink per day) perform better than nondrinkers (Carmelli et al., 1999). Of more obscure interest, BVRT scores are higher in children who eat breakfast at school rather than at home or who go without, suggesting that ingestion of food 30 minutes prior to test administration enhances performance (Vaisman et al., 1996). BVRT scores did not differ between individuals with psychic experiences and those without (Fenwick et al., 1985), and brain MRI has no significant impact on BVRT scores (Sweetland et al., 1987). BVRT scores were not lower in involuntary vs. voluntary retirees (Swan et al., 1991).
Psychometric Properties of the Test
Factor analysis of BVRT and Benton Visual Form Discrimination (BVFD) scores has suggested that BVRT free recall and recognition scores load on a factor with BVFD while BVRT copy scores load on a separate factor (Moses, 1986). The fact that number correct and error scores are found on the same factor has led to the suggestion that these two scores may be redundant and that no appreciable information is gained by specifying error type over and above total errors (Moses, 1986). In a sample of mentally retarded individuals, one-third of the variance in memory scores was accounted for by copying ability, although free recall and recognition scores were not significantly related (Silverstein, 1962). Factor analyses using a larger battery of neuropsychological test scores have indicated that BVRT loads on visual processing (Snow, 1998) and visual perceptual-motor (Larrabee et al., 1985) factors, rather
than a memory factor, along with visual reproduction immediate, block design, and object assembly. BVRT performance has been significantly correlated with the Bender Gestalt but only modestly associated with verbal memory measures (Fabrigoule et al., 1998; Moses, 1986; Snow, 1998), digit span (Moses, 1986), psychomotor speed (Digit Symbol; Fabrigoule et al., 1998), and verbal abstraction (Fabrigoule et al., 1998). BVRT scores were a significant predictor of performance on Bechara's gambling task (Torralva et al., 2000) and more related to source memory than was the WCST (Dywan et al., 1993). Lowered BVRT scores are associated with commission errors on continuous performance tasks (Dougherty et al., 2003). Correlations between forms have generally been respectable (0.79-0.84), although Amir (2001) reported a more modest relationship between versions D and E at a 2-week retest interval (correct = 0.455, errors = 0.491). Forms may not be of equivalent difficulty; Benton-Sivan (1992) indicates that under administration A, form C may be slightly easier than forms D and E. Test-retest reliability may be somewhat low (number correct = 0.57, number of errors = 0.53) (Youngjohn et al., 1993). Randall et al. (1988) reported interrater reliabilities of 0.85 for number correct and 0.93 for number of errors, while higher values were cited by Prakash and Bhogle (1992, 0.95 for number correct) and Swan et al. (1990, 0.963 for total number correct and 0.974 for total errors). Kappa values for each error type ranged from 0.976 for omissions to 0.737 for size; agreements were lowest for misplacement on design 9 and size errors on design 10 (0.440 and 0.480, respectively; Swan et al., 1990). An initial attempt at developing a shortened version of the memory trials (i.e., <10 items) indicated that correlations between the full and abbreviated versions ranged from 0.829 for five items to 0.987 for nine items.
However, the small savings in administration time was not judged adequate for the sacrifice in interpretation accuracy (Benton, 1972). The 1992 manual does reference an eight-item version for the copy trial.
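The kappa values cited above are chance-corrected agreement indices. As a minimal sketch of how such a value could be computed for a single error type, the following assumes Cohen's kappa for two raters (the source does not state which kappa statistic was used). The function name and the ratings are hypothetical and are not taken from Swan et al. (1990).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 1 = "omission error scored" on a design, 0 = not scored.
rater_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
kappa = cohen_kappa(rater_1, rater_2)
```

In this invented example the raters agree on 9 of 10 designs, but kappa is lower than 0.90 because part of that agreement would be expected by chance alone.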
RELATIONSHIP BETWEEN BVRT PERFORMANCE AND DEMOGRAPHIC FACTORS
Increasing age has been consistently reported to adversely impact BVRT scores (Arenberg, 1978, 1982; Benton, 1974; Benton et al., 1981; Chen et al., 1990; Coman et al., 1999, 2002; Dartigues et al., 1992; Dealberto et al., 1996; Duara et al., 1984; Giambra et al., 1995; Jacobs et al., 1997; Lee & Lee, 1993; Mormont, 1984; Palmer et al., 1994; Prakash & Bhogle, 1992; Resnick et al., 1995; Robinson-Whelen, 1992; Shichita et al., 1986; Shipley et al., 1981; Snow, 1998; Youngjohn et al., 1993; Zappala et al., 1995), with age accounting for approximately 10% of test score variance (Youngjohn et al., 1993). The most pronounced decline occurs in either the 65-74 year decade (Giambra et al., 1995) or at age 75 (Coman et al., 2002). Increase in errors was moderate in men in their 50s and 60s but large for men over age 70 (Arenberg, 1978), with no further decline detected in participants between 80 and 92 (Klonoff & Kennedy, 1965). Age-related losses in BVRT performance are less pronounced in very healthy individuals (Haxby et al., 1986). Qualitative analyses have shown that while all error types increase with age, greater age effects have been observed for distortions, omissions, and rotations (Resnick et al., 1995). Some data suggest that BVRT procedures involving recognition trials may be less affected by age (Anger et al., 1993). In longitudinal studies, educational level and type of activities of daily life somewhat attenuate age-related performance decline (Shichita et al., 1986).
BVRT performance is significantly affected by education (Amir, 2001; Anger et al., 1993; Coman et al., 1999, 2002; Dartigues et al., 1992; Dealberto et al., 1996; Jacobs et al., 1997; Le Carret et al., 2003; Lee & Lee, 1993; Palmer et al., 1994; Ritchie & Hallerman, 1989; Shichita et al., 1986; Youngjohn et al., 1993; Zappala et al., 1995; with the exception of Robinson-Whelen, 1992), and the relationship between BVRT scores and education is more pronounced in lower-education compared to higher-education groups (Kang, 2000;
Shichita et al., 1986). Further, in samples with <3 years of education, illiterate individuals perform worse than literate subjects (Manly et al., 1999). The effect of education on BVRT performance appears to be mediated by enhanced executive abilities rather than visual discrimination skills (Le Carret et al., 2003); for example, participants with more education use a more exhaustive exploration strategy on the BVRT recognition trial (Le Carret et al., 2003). Separate from the effects of education, poorer BVRT scores have been observed in blue-collar workers compared to professionals/managers (Dartigues et al., 1992); however, other investigators have not found a relationship between BVRT performance and occupation (Palmer et al., 1994). BVRT scores have been reported to be similar across a broad range of cultures, although performance may be atypical for very poorly educated samples (Anger et al., 1991). For example, older native Spanish speakers residing in the United States with an average of 8 years of education scored lower than English speakers (Jacobs et al., 1997). Similarly, French speakers showed a trend toward lower performance relative to English speakers in Canada, although this finding was confounded by the lower educational level in the French speakers (Steenhuis & Ostbye, 1995). Older African Americans scored lower than non-Hispanic whites, but after covarying for reading level, group differences disappeared (Manly et al., 2002). U.S.-born vs. foreign-born older non-Hispanic white subjects who were very fluent in English did not differ in BVRT performance (Touradji et al., 2001). Farahat et al.
(2003) reproduce data showing that performance of control participants in Austria, France, Italy, Poland, Hungary, the Netherlands, China, the United States, and Ecuador is comparable, although scores of participants from Nicaragua and Egypt were lower, which the authors suggest may be due to differences in experience with testing and/or educational background. Anger and colleagues (1993) suggest that factors leading to poor performance in Nicaraguan participants were lack of experience with geometric
figures and low education; the sample averaged 3 years of formal education (32% had no formal education, 42% had 1-3 years, and 26% had 4-19 years of education) and 74% were illiterate or only marginally literate. In fact, performance reported in a sample of Venezuelan controls averaging 8 years of education was significantly higher (6.2 vs. 4) than that of the original Venezuelan data. Of note, the effects of age and education on BVRT performance are moderated by level of cognitive impairment; patients with moderate/severe cognitive deficits do not register an effect of these demographic variables on test performance, although patients with more mild cognitive deterioration continue to show a demographic impact on test performance (Coman et al., 1999). Some studies have reported a relationship between IQ and BVRT scores (Amir, 2001; Benton, 1945, 1962; Netherton et al., 1989), although this appears to be confined to low IQ levels; individuals of average or higher IQ perform comparably (Randall et al., 1988). Scores of mentally retarded individuals are impaired, with this group reproducing from memory only two of 10 designs accurately (Silverstein, 1962) and committing an average of 11-12 errors (Silverstein, 1963). Performance on memory trials improves in a stepwise fashion from mentally retarded to borderline, from borderline to low average, and from low average to average, while on the copy trial, performance improves from the mentally retarded to low average level and then plateaus (Randall et al., 1988). Studies have generally indicated no effect of gender on BVRT scores (Amir, 2001; Anger et al., 1993; Chen et al., 1990; Coman et al., 1999, 2002; Dartigues et al., 1992; Youngjohn et al., 1993), although Shichita et al. (1986) did observe lower scores in older Japanese women relative to men, likely related to lower levels of education in women in this cohort.
Some data also suggest that women commit more errors (Giambra et al., 1995) and, in particular, that older women may exhibit more rotation and omission errors than older men but that men show steeper increases in omission errors with age (Resnick et al., 1995).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located a few BVRT normative reports for adults and dozens of other studies which have reported control subject data, and we have confined our discussion to those investigations which involved a sample size of at least 50 and represented the standard test administration (i.e., studies in which only half of the items were administered or a delayed recall trial was given, etc., were excluded). To adequately evaluate the BVRT normative reports, seven key criterion variables were deemed critical. The first five of these relate to subject variables, and the remaining two refer to procedural issues. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
As discussed previously, information regarding medical and psychiatric exclusion criteria is important; it is unclear if geographic recruitment region, gender, socioeconomic status or occupation, ethnicity, and recruitment procedures are relevant. Until determined, it is best that this information be provided.
Age Group Intervals
Given the association between age and BVRT performance, information regarding the age of the normative sample is critical and normative data should be presented by age intervals.
Reporting of Educational Levels
Given the relationship between educational level and BVRT scores, it is preferable that data be stratified by educational level.
Reporting of IQ Levels
Given the probable relationship between BVRT performance and IQ, information regarding intellectual level should be provided.
Procedural Variables
Specification of Test Version
Given the four standard administration options plus the recognition format as well as the three stimulus forms, studies should report which version/administration format was followed.
Data Reporting
Means and standard deviations for total correct and/or total errors are required.
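The seven criterion variables can be read as a checklist applied to each normative report. The following is a minimal sketch of that idea; the record fields, the function name, and the example values are hypothetical and do not describe any study reviewed in this chapter. Only the minimum sample size of 50 comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NormativeReport:
    """Hypothetical summary of what one BVRT normative report provides."""
    sample_size: int
    composition_described: bool  # exclusion criteria, region, gender, etc.
    age_intervals: bool          # data presented by age interval
    education_reported: bool
    iq_reported: bool
    version_specified: bool      # administration and form stated
    means_and_sds: bool          # for total correct and/or total errors


def unmet_criteria(report: NormativeReport, min_n: int = 50) -> List[str]:
    """Return the names of the criterion variables the report does not meet."""
    checks = {
        "sample size": report.sample_size >= min_n,
        "sample composition description": report.composition_described,
        "age group intervals": report.age_intervals,
        "reporting of educational levels": report.education_reported,
        "reporting of IQ levels": report.iq_reported,
        "specification of test version": report.version_specified,
        "data reporting": report.means_and_sds,
    }
    return [name for name, met in checks.items() if not met]


# Invented example: a small study that omits IQ information.
example = NormativeReport(
    sample_size=42,
    composition_described=True,
    age_intervals=True,
    education_reported=True,
    iq_reported=False,
    version_specified=True,
    means_and_sds=True,
)
missing = unmet_criteria(example)
```

For this invented record, the result lists the sample size and IQ criteria as unmet, which mirrors how the "Considerations regarding use of the study" entries are framed in the study summaries.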
SUMMARY OF THE STATUS OF THE NORMS
Twenty-three data sets had total sample sizes ≥100 (Alder et al., 1990; Amir, 2001; Arenberg, 1978; Benton, 1962; Benton-Sivan, 1992; Carmelli et al., 1999; Coman et al., 1999; Dealberto et al., 1996; Giambra et al., 1995; Jacobs et al., 1997; Kawas et al., 2003; Klonoff & Kennedy, 1965; Manly et al., 2002; Palmer et al., 1994; Prakash & Bhogle, 1992; Randall et al., 1988; Reinprecht et al., 2003; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989; Robinson-Whelen, 1992; Steenhuis & Ostbye, 1995; Touradji et al., 2001; Youngjohn et al., 1993). Twenty-two of the studies summarized in this chapter present BVRT data according to circumscribed age ranges and/or age subgroups (Alder et al., 1990; Arenberg, 1978; Benton-Sivan, 1992; Carmelli et al., 1999; Coman et al., 1999; Dealberto et al., 1996; Eslinger et al., 1985; Giambra et al., 1995; Kawas et al., 2003; Klonoff & Kennedy, 1965; Jacobs et al., 1997; Larrabee et al., 1986; Manly et al., 1999, 2002; Palmer et al., 1994; Prakash & Bhogle, 1992; Reinprecht et al., 2003; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989; Steenhuis & Ostbye, 1995; Touradji et al., 2001; Youngjohn et al., 1993). Educational level was also indicated in all but four studies (Alder et al., 1990; Arenberg, 1978;
Benton-Sivan, 1992; Reinprecht et al., 2003), and Youngjohn et al. (1993) and Coman et al. (1999) stratify by age and educational level, while Dealberto et al. (1996) and Robertson-Tchabo and Arenberg (1989) present data by age, gender, and educational level separately. Manly and colleagues (1999) stratify data by literacy in a population with <3 years of formal education. Information on IQ level is reported in one study (Larrabee et al., 1986), with Vocabulary raw score presented in two studies (Alder et al., 1990; Mathiesen et al., 1999); Amir (2001) and Randall et al. (1988; also reproduced in Benton-Sivan, 1992) present BVRT data in IQ groupings. Information on gender composition was provided in all but four reports (Benton, 1962; Benton-Sivan, 1992; Robinson-Whelen, 1992; Touradji et al., 2001); 10 data sets included only male (Alder et al., 1990; Arenberg, 1978; Carmelli et al., 1999; Farahat et al., 2003; Klonoff & Kennedy, 1965; Lee & Lee, 1993; Mathiesen et al., 1999; Palmer et al., 1994; Reinprecht et al., 2003) or nearly all-male (Escalona et al., 1995) populations, and four data sets were composed primarily of females (Jacobs et al., 1997; Larrabee et al., 1986; Manly et al., 1999, 2002). Robertson-Tchabo and Arenberg (1989), Dealberto et al. (1996), and Giambra et al. (1995) reported data separately for males and females. Data regarding ethnicity were presented in 22 data sets (Amir, 2001; Arenberg, 1978; Carmelli et al., 1999; Coman et al., 1999; Dealberto et al., 1996; Escalona et al., 1995; Farahat et al., 2003; Giambra et al., 1995; Jacobs et al., 1997; Larrabee et al., 1986; Lee & Lee, 1993; Manly et al., 1999, 2002; Mathiesen et al., 1999; Palmer et al., 1994; Prakash & Bhogle, 1992; Reinprecht et al., 2003; Resnick et al., 1995; Robinson-Whelen, 1992; Ruggieri et al., 2003; Touradji et al., 2001; Witjes-Ane et al., 2003).
Occupation or socioeconomic status was described in five reports (Arenberg, 1978; Giambra et al., 1995; Klonoff & Kennedy, 1965; Lee & Lee, 1993; Palmer et al., 1994). Exclusion criteria were judged to be adequate in 16 publications (Alder et al., 1990; Arenberg, 1978; Escalona et al., 1995; Eslinger et al., 1985; Farahat et al., 2003; Giambra et al., 1995; Jacobs et al., 1997; Kawas et al., 2003; Larrabee et al.,
1986; Manly et al., 1999, 2002; Mathiesen et al., 1999; Randall et al., 1988; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989; Youngjohn et al., 1993). Geographic recruitment areas were specified in all but one publication (Youngjohn et al., 1993). Six publications present data from the Baltimore Longitudinal Study of Aging (Alder et al., 1990; Arenberg, 1978; Giambra et al., 1995; Kawas et al., 2003; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989), three studies report data from the Washington Heights Inwood Columbia Aging Project based in northern Manhattan, New York (Manly et al., 1999, 2002; Touradji et al., 2001), and additional data sets were collected in the United States in Iowa (Benton, 1962; Benton-Sivan, 1992; Coman et al., 1999; Eslinger et al., 1985), Florida (Larrabee et al., 1986), Mississippi (Randall et al., 1988), Missouri (Robinson-Whelen, 1992), New York (Jacobs et al., 1997), and Massachusetts, Indiana, and California (Carmelli et al., 1999). Data sets were also gathered in Canada (Klonoff & Kennedy, 1965; Steenhuis & Ostbye, 1995), France (Dealberto et al., 1996), Norway (Mathiesen et al., 1999), Sweden (Reinprecht et al., 2003), Italy (Ruggieri et al., 2003), the Netherlands (Witjes-Ane et al., 2003), the United Arab Emirates (Amir, 2001), Korea (Lee & Lee, 1993), Venezuela (Escalona et al., 1995), India (Prakash & Bhogle, 1992), and Egypt (Farahat et al., 2003). Administration A, form C, was the most common version reported, appearing in 16 publications (Alder et al., 1990; Arenberg, 1978; Benton-Sivan, 1992; Carmelli et al., 1999; Coman et al., 1999; Eslinger et al., 1985; Giambra et al., 1995; Kawas et al., 2003; Klonoff & Kennedy, 1965; Larrabee et al., 1986; Mathiesen et al., 1999; Palmer et al., 1994; Prakash & Bhogle, 1992; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989; Robinson-Whelen, 1992), and Randall et al.
(1988) included not only this administration but also administration D, form D, and administration C, form C, while Robinson-Whelen (1992) also included administration C, form D. Four studies collected data for administration A but did not specify the form (although it is assumed to be C; Carmelli et al., 1999; Coman et al., 1999;
Reinprecht et al., 2003; Youngjohn et al., 1993). Amir (2001) employed administration A, form D, followed by form E 2 weeks later, while Prakash and Bhogle (1992) administered forms C, D, and E under administration A conditions. Jacobs et al. (1997) used administration C, form C. Benton (1962) presents data for forms C and E under administrations B and C, and Benton-Sivan (1992) provides data for four administrations (A, B, C, and D). Seven studies report data for the 15-item recognition trial only (Escalona et al., 1995; Jacobs et al., 1997; Lee & Lee, 1993; Manly et al., 1999, 2002; Steenhuis & Ostbye, 1995; Touradji et al., 2001), although the Jacobs et al. (1997) data appear to be for form D rather than C; the Dealberto et al. (1996) data are assumed to be for the recognition trial since the mean correct exceeds 10. Three studies (Farahat et al., 2003; Ruggieri et al., 2003; Witjes-Ane et al., 2003) did not report any information regarding test administration version or format. Total mean number correct was reported in 22 data sets (Amir, 2001; Coman et al., 1999; Dealberto et al., 1996; Escalona et al., 1995; Eslinger et al., 1985; Farahat et al., 2003; Jacobs et al., 1997; Klonoff & Kennedy, 1965; Lee & Lee, 1993; Manly et al., 1999, 2002; Mathiesen et al., 1999; Palmer et al., 1994; Prakash & Bhogle, 1992; Randall et al., 1988; Reinprecht et al., 2003; Robinson-Whelen, 1992; Ruggieri et al., 2003; Steenhuis & Ostbye, 1995; Touradji et al., 2001; Witjes-Ane et al., 2003; Youngjohn et al., 1993) and number of errors in 14 data sets (Alder et al., 1990; Amir, 2001; Arenberg, 1978; Benton, 1962; Eslinger et al., 1985; Giambra et al., 1995; Kawas et al., 2003; Klonoff & Kennedy, 1965; Mathiesen et al., 1999; Randall et al., 1988; Resnick et al., 1995; Robertson-Tchabo & Arenberg, 1989; Robinson-Whelen, 1992; Youngjohn et al., 1993).
Means were reported in all but one study (Coman et al., 1999), and SDs were indicated in all but four studies (Arenberg, 1978; Benton, 1962; Carmelli et al., 1999; Coman et al., 1999). Means and SDs for individual error scores are provided in two publications (Resnick et al., 1995; Robinson-Whelen, 1992). Below, information about the test manual will be reported first, followed by summaries of normative publications and control data from
clinical studies, presented in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 20. Table A20.1, the locator table, summarizes information provided in the studies described in this chapter. 1
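The emphasis on reported means and SDs reflects how normative tables are typically used: a patient's raw score is expressed relative to the matching normative cell. A minimal sketch of that conversion follows, assuming a simple standard (z) score. The function name and all numeric values are hypothetical and are not taken from any table in Appendix 20.

```python
def z_score(raw_score, norm_mean, norm_sd, higher_is_better=True):
    """Express a raw score in SD units relative to a normative cell.

    For error scores, where higher raw values mean worse performance,
    the sign is flipped so that negative z always means below the norm.
    """
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    z = (raw_score - norm_mean) / norm_sd
    return z if higher_is_better else -z


# Invented normative cell for number correct: mean 7.0, SD 1.5.
z_correct = z_score(5, norm_mean=7.0, norm_sd=1.5)

# Invented normative cell for number of errors: mean 5.0, SD 2.0.
z_errors = z_score(9, norm_mean=5.0, norm_sd=2.0, higher_is_better=False)
```

The same arithmetic shows why a report that gives means without SDs, a limitation noted for several of the studies above, cannot support this kind of comparison.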
SUMMARIES OF THE STUDIES
Test Manual [BVRT.1] Benton-Sivan, 1992
The manual for the BVRT, 5th edition, contains normative data for administration A (number correct and number of errors) on 600 participants from the third edition of the manual (Benton, 1963), as well as the Arenberg (1978) data on 769 men and the Randall et al. (1988) data on 120 participants. Exclusion criteria for the Benton (1963) sample included no history of psychosis, no cerebral injury or disease except for mental retardation, and no serious physical depletion as a consequence of somatic disease. A majority of participants were inpatients and outpatients of hospitals in Iowa City and Des Moines, Iowa. For administration B, data are reported for 103 medical patients aged 16-60 with no history or evidence of brain disease. Performance was approximately 1 point lower for number correct than for administration A. The recommendation is to use the manual norms for administration A, subtracting 1 point. Administration C norms were obtained on 200 medical patients with no history or evidence of cerebral disease (Benton, 1962). Almost half the group obtained perfect scores, and 88% made two or fewer errors. The manual also presents norms for an eight-item abbreviated version, based on performance of 100 controls (Benton, 1962). For administration D, performance in participants younger than 60 is reported to be comparable to that for administration A. Values for administration A for expected number correct are presented for three age groups (15-49, 50-59, and 60-69) and six IQ levels (≥110, 95-109, 80-94, 70-79, 60-69, and ≤59). Values for administration A for expected error scores are provided for four age groups (15-44, 45-59, 60-64, and 65-69) and eight IQ levels (≥110, 105-109, 95-104, 90-94, 80-89, 70-79, 60-69, and ≤59). Expected values for administration C number of errors are reported for the 10-design and eight-design versions. Data are presented as expected number of correct responses and errors and apply to all three forms (C, D, E). Data are not reproduced in this book.
1 Norms for children are available in Baron (2004).
Manual strengths (Benton data)
1. Large overall sample size.
2. Data stratified by age and IQ levels.
3. Information regarding geographic area.
4. Data reported for administrations A, B, C, and D and the eight-item abbreviated version.
Considerations regarding use of the manual
1. Inadequate exclusion criteria for the Benton sample (mentally retarded included, most were hospital inpatients or outpatients).
2. No information regarding gender, education, or recruitment strategies.
[BVRT.2] Benton, 1962 (Table A20.2)
BVRT data were obtained on 100 patients on medical and neurological wards in Iowa City hospitals who showed no evidence or history of cerebral disease or injury. Additional exclusion criteria were seizures, head trauma with loss of consciousness, or hospitalization for psychiatric disorder or mental deficiency. Mean age was 41 years (range 16-60), and mean educational level was 10 years. Participants received form C or E, administration C, and form C or E, administration B. Means for number of errors for the copy and memory trials are reported.
Study strengths
1. Large sample size.
2. Information regarding age, education, and geographic area.
3. Test version/administration format specified.
Considerations regarding use of the study
1. Questionable adequacy of exclusion criteria (while patients with seizures, head injury, and psychiatric or mental deficiency-related hospitalizations were excluded, participants were inpatients on medical or neurologic wards).
2. Low educational level.
3. No information regarding gender or IQ.
4. No SDs reported.
5. Data for the differing forms were apparently collapsed.
6. Data not stratified by age.
[BVRT.3] Klonoff and Kennedy, 1965 (Table A20.3)
BVRT data were obtained on 172 Canadian veterans aged 80-92 "managing well on their own in the community;" 30% of War Veteran Allowance recipients ≥80 and residing in the Vancouver area were selected; 88 could not participate due to major loss of visual or auditory acuity or motor dysfunction. Testing was completed during 1963-1964. Information on educational level was available for 155 participants; mean education was 7.04 years. Seventy-three percent were born in the British Isles, 19% were born in Canada, and 8% were born elsewhere in Europe. The majority were unskilled workers (47%), while 12% were semiskilled, 14% were skilled, 10% were clerical, 10% were in a service industry, 5% were semiprofessional, and 2% were professionals. Forty-five percent carried cardiovascular diagnoses, 22% had pulmonary disease, and 15% had psychiatric diagnoses; 25% rated their health as "very good," 51% judged their health to be "good," and 24% designated their health as "fair." The sample was divided into six age groupings: 80 (n = 35), 81 (n = 34), 82 (n = 23), 83 (n = 27), 84-85 (n = 26), and 86+ (n = 27). Administration A, form C, was used and scored according to the Benton method. Means and SDs for number correct and number errors are reported; 34% committed omission errors, 33% produced distortions, 12% had rotations, 11% exhibited perseverations, 8% of the
errors were misplacements, and 2% were size errors.
Study strengths
1. Large overall sample size.
2. Data stratified by age.
3. Information regarding gender, educational level, recruitment strategy, occupational status, and geographic area.
4. Test administration, version, and scoring specified.
5. Means and SDs reported.
Considerations regarding use of the study
1. Questionably adequate exclusion criteria (those with unspecified psychiatric diagnoses included).
2. Language in which testing conducted not specified but assumed to be English.
3. All-male sample.
4. No information regarding IQ.
5. Low educational level.
[BVRT.4] Arenberg, 1978 (Table A20.4)
BVRT data were obtained on 857 men aged 18-102, as part of the Baltimore Longitudinal Study of Aging, who were tested between 1960 and 1973. All were volunteers who had agreed to come to Baltimore city hospitals for physiological, biochemical, and behavioral testing. The sample was primarily Caucasian, well-educated, and of high socioeconomic status, residing in the Baltimore-Washington, DC area. Participants were divided into seven age groups (<30, 30s, 40s, 50s, 60s, 70s, ≥80) and three subgroups based on date of test administration (1960-1964.9, 1965.0-1968.5, and 1968.6-1973.5). Participants were given form C, administration A. Errors were scored according to the 1963 manual by two independent psychologists, and infrequent disagreements were resolved by discussion or a third psychologist. Means for total errors are reported.
Study strengths
1. Large overall sample size, although some individual cells are small.
2. Data stratified by age.
3. Some information regarding gender, age, education, socioeconomic status, ethnicity, geographic area, and recruitment strategies.
4. Test administration format and version specified.
5. Means reported for number of errors.
Considerations regarding use of the study
1. Exclusion criteria not specified, although reported in Giambra et al. (1995).
2. All-male sample.
3. SDs and IQ level not reported.
[BVRT.5] Eslinger, Damasio, Benton, and Van Allen, 1985 (Table A20.5)
BVRT scores were collected on 53 normal volunteers (25 men, 28 women) aged 60-88, recruited through local senior-citizen and community organizations in the Iowa City area. Mean age was 73.1, and mean educational level was 12.0. All were independent, community-dwelling individuals who were screened for neurologic disorder (including head injury and alcoholism), psychiatric illness requiring hospitalization, and any disabling medical or physical condition. Participants considered themselves to be in generally good physical and mental health. Form C, administration A, was used. Total correct (10 possible) and total errors (26 possible) were calculated. Means and SDs are reported.
Study strengths
1. Minimally adequate sample size.
2. Information on age, gender, education, geographic area, and recruitment strategies.
3. Adequate exclusion criteria.
4. Test version/administration format specified.
5. Means and SDs for number correct and errors reported.
Considerations regarding use of the study
1. Data not stratified by age, although age grouping is fairly narrow.
2. No data on IQ level.
[BVRT.6] Larrabee, Levin, and High, 1986 (Table A20.6)
BVRT scores were gathered on 88 reportedly healthy participants aged 60-90 recruited
from retirement apartments and organizations in Galveston County, Florida. Participants had at least 20/40 vision on screening, and hearing was adequate on audiometric assessment. Individuals with a history of psychiatric disorder, stroke, head injury, or other neurologic disease were excluded. No subject had a clinical level of depression on the Zung self-report inventory. Ten participants were identified as having senescent forgetfulness by low memory scores (on at least four scores out of 13), 1 or more SD below age-residualized means, and appearing in a memory disorder cluster on cluster analysis. Mean age of the remaining participants (n = 78) was 72.9 (6.9) years, and mean years of education was 12.2 (3.3); 61 were female and 17 were male; 73 were white and five were African American. Mean VIQ was 112.0 (12.8), and mean PIQ was 114.4 (11.5). Form C, administration A, of the BVRT was given. Means and SDs for number of errors are reported.
Study strengths
1. Large sample size.
2. Information regarding age, education, ethnicity, IQ, gender, geographic area, vision and hearing acuity, and recruitment strategy.
3. Adequate exclusion criteria.
4. Test administration format/version specified.
5. Means and SDs reported for total errors.
Considerations regarding use of the study
1. Data not stratified by age.
2. Exclusion criteria likely "too stringent" (i.e., 10 participants were excluded based on memory test performance). As a result, sample may not be characteristic of healthy individuals in this age range.
[BVRT.7] Randall, Dickson, and Plasay, 1988 (Table A20.7)
BVRT data were obtained on volunteers from psychology undergraduate classes and the Honors College at the University of Southern Mississippi, various church and community organizations, and a residential care facility for the mentally retarded. Participants were administered the Satz-Mogel version of the WAIS-R for placement into IQ ranges. A total of 120 participants were included (69 females, 51 males), with 20 in each of six IQ ranges: mentally retarded (60-69), borderline (70-79), low average (80-89), average (90-109), high average (110-119), and superior (120+). Participants were not compensated. Exclusion criteria were history or evidence of cerebrovascular illness, traumatic head injury, epilepsy, alcoholism, or psychiatric illness. Mean age across groups ranged from 23.87 to 26.20. Administrations A (form C), D (form D), and C (form C) were conducted, although the order of presentation of the cards was changed (i.e., 2, 6, and 10 and then 1, 3, 4, 5, 7, 8, and 9). Means and SDs for number correct and errors are reported for the six IQ groups.
Study strengths
1. Large overall sample size, although individual cell sizes are small.
2. Data stratified by IQ level.
3. Adequate exclusion criteria.
4. Information regarding geographic area, gender, and age.
5. Administration procedures and versions specified.
6. Means and SDs reported for number correct and errors for three administration versions.
Considerations regarding use of the study
1. No information regarding educational level, although this is obviated by the data on IQ.
2. No data on gender for individual cells.
3. Data not stratified by age, although the age range appears adequately narrow.
4. Altered administration format (order of stimuli changed).
[BVRT.8] Robertson-Tchabo and Arenberg, 1989 (Table A20.8)
BVRT data on 1,643 participants from the Baltimore Longitudinal Study of Aging are presented. Data are grouped separately by gender, college degree or no college degree, and seven age decades (20s, 30s, 40s, 50s, 60s,
70s, 80s). Exclusion criteria are described below in Giambra et al. (1995). Data were collected using form C, administration A. Means and SDs (as well as frequencies not reproduced in this book) are reported for errors.
Study strengths 1. Large overall sample size, although some individual cells fall below 50. 2. Data stratified by age, gender, and educational level. 3. Information regarding recruitment strategies and geographic area. 4. Test administration and scoring procedures specified. 5. Means and SDs reported for total errors. 6. Adequate exclusion criteria.
Consideration regarding use of the study 1. No information regarding IQ level.
[BVRT.9] Alder, Adam, and Arenberg, 1990 (Table A20.9)
BVRT data on 277 men are reported from the Baltimore Longitudinal Study of Aging. Data are presented for five age groupings: 25-34 (n = 27), 35-44 (n = 74), 45-54 (n = 101), 55-64 (n = 42), and 65+ (n = 33). Total raw scores for the Vocabulary subtest of the WAIS are provided for each subgroup. Exclusion criteria are described below in Giambra et al. (1995). Data were collected using form C, administration A. Each protocol was scored independently by two psychologists. Means and SDs for number of errors are reported.
Study strengths 1. Large overall sample size, although some individual cells fall below 50. 2. Data stratified by age. 3. Information regarding gender, recruitment strategies, geographic area, and Vocabulary raw score. 4. Test administration and scoring procedures specified. 5. Means and SDs reported for total errors. 6. Adequate exclusion criteria.
Considerations regarding use of the study 1. No data regarding educational level, although this is somewhat obviated by data on the Vocabulary subtest. 2. All-male sample.
[BVRT.10] Prakash and Bhogle, 1992 (Table A20.10)
BVRT scores were collected on 660 participants in India. Exclusion criteria were "evident physical or psychological disorders," and only participants with a minimum level of higher secondary education were included. Participants were aged 15-65; gender distribution was approximately equal (331 male, 329 female). Participants were divided into 10 age groupings: 15-19 (n = 90), 20-24 (n = 86), 25-29 (n = 84), 30-34 (n = 56), 35-39 (n = 53), 40-44 (n = 62), 45-49 (n = 73), 50-54 (n = 55), 55-59 (n = 62), and 60-64 (n = 39) years. Forms C, D, and E were used with administration A. Means and SDs for number correct for each age grouping are provided.
Study strengths 1. Large sample size, and all but one subgroup had >50. 2. Data stratified by age. 3. Information regarding gender, educational level, and geographic area. 4. Test administration format and forms specified (although it is unclear if all versions were given to all participants and, if so, whether administration was in a counterbalanced order). 5. Means and SDs for number correct reported.
Considerations regarding use of the study 1. No information regarding recruitment strategy or IQ level. 2. Three test forms were administered, but only one set of means and SDs is reported; it is unclear whether these represent means for the three forms or just one. 3. Adequacy of exclusion criteria questionable (sample apparently consists of participants who appear normal). 4. Data provided on East Indians, which may limit generalizability for clinical interpretation in the United States.
[BVRT.11] Robinson-Whelen, 1992 (Table A20.11)
BVRT data are reported on 122 Caucasian older normal individuals in good physical health recruited in Missouri as part of a study on the BVRT in normal aging and dementia. Mean age was 72.23 (9.0) years, and mean education was 13.61 (3.4) years. Form C, administration A, followed by form D, Administration C (copy), were used. Means and SDs for number correct, total errors, and eight error types are reported. Study strengths 1. Large sample size. 2. Information regarding age, education, geographic area, and ethnicity. 3. Form and administration version reported. 4. Means and SDs for number correct as well as number of errors and various error types provided. Considerations regarding use of the study 1. Data not stratified by age. 2. No information regarding gender, recruitment strategies, or IQ. 3. No information regarding exclusion criteria; participants are only reported as being in good physical health. [BVRT.12] Lee and Lee, 1993 (Table A20.12)
BVRT data were obtained on 81 male controls as part of a study on the effects of organic solvents on cognition in Korean workers residing in the Seoul area. Participants averaged 34.7 (8.16) years of age, with 12.9 (2.5) years of education. The group was primarily composed of manual workers, guards, clerks, and technicians. They consumed an average of 10.4 (10.2) liters of alcohol per year and smoked an average of 11.6 (33.1) cigarettes per day. They had not been exposed to organic solvents. Participants were administered the NCTB, which includes a 10-second administration followed by selection of responses from a four-choice array. Means and SDs for number correct are provided.
Study strengths 1. Large sample size. 2. Information regarding age, education, gender, occupational status, and alcohol and cigarette use. 3. Test version and administration format reported. 4. Means and SDs provided.
Considerations regarding use of the study 1. Inadequate exclusion criteria. 2. Data not stratified by age. 3. Information on test translation not provided. 4. All-male sample. 5. The stimuli version not reported. 6. Data collected on Korean workers, which may limit generalizability for clinical interpretation in the United States.
[BVRT.13] Youngjohn, Larrabee, and Crook, 1993 (Table A20.13)
BVRT scores were collected on 1,128 (464 male, 664 female) normal volunteers aged 17-84, recruited to participate in testing through news media. Participants were not compensated. Mean age was 58.45 (11.43) years, and mean education was 16.01 (2.2), with a range of 12-25 years. Participants with a history of physical, psychiatric, or neurological conditions that could affect memory (e.g., depression, head trauma, or stroke) were excluded. Administration A was used. Data are presented in five age groupings (18-39, 40-49, 50-59, 60-69, and 70+) and three educational levels (12-14, 15-17, and 18+). Means and SDs for number of designs correctly reproduced and total number of errors are reported.
Study strengths 1. Large overall sample size, although some cells are relatively small. 2. Data stratified by age and educational level. 3. Information regarding gender and recruitment strategy. 4. Adequate exclusion criteria. 5. Test administration format reported, although not stimuli version. 6. Means and SDs provided.
Considerations regarding use of the study 1. Well-educated sample, and no data for participants with <12 years of education. 2. No information on IQ level.
Other comments 1. The authors provide regression equations for predicted BVRT scores:
a. Predicted BVRT number correct (±1.57) = 7.87 - 0.045 (age) + 0.098 (years of education)
b. Predicted BVRT number of errors (±2.88) = 1.73 + 0.088 (age) - 0.126 (years of education)
[BVRT.14] Palmer, Wolkenstein, LaRue, Swan, and Smalley, 1994 (Table A20.14)
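As a worked illustration of the two regression equations listed under [BVRT.13] (Youngjohn, Larrabee, and Crook, 1993), the following minimal Python sketch computes both predicted scores for a given age and educational level. The function and constant names are hypothetical, and reading the ± values as a band around each prediction is an assumption made here, not something the source states.

```python
from typing import NamedTuple


class BvrtPrediction(NamedTuple):
    number_correct: float
    number_of_errors: float


# Error terms printed with each equation (the "+/-" values). Reading them as
# a band around the predicted score is an assumption of this sketch.
CORRECT_ERROR_TERM = 1.57
ERRORS_ERROR_TERM = 2.88


def predict_bvrt(age: float, education_years: float) -> BvrtPrediction:
    """Apply the two regression equations reported for [BVRT.13]."""
    number_correct = 7.87 - 0.045 * age + 0.098 * education_years
    number_of_errors = 1.73 + 0.088 * age - 0.126 * education_years
    return BvrtPrediction(number_correct, number_of_errors)


if __name__ == "__main__":
    # Hypothetical examinee: 60 years old with 16 years of education.
    predicted = predict_bvrt(age=60, education_years=16)
    print(f"number correct: {predicted.number_correct:.2f} (+/- {CORRECT_ERROR_TERM})")
    print(f"number of errors: {predicted.number_of_errors:.2f} (+/- {ERRORS_ERROR_TERM})")
```

Because the [BVRT.13] sample was aged 17-84 with 12-25 years of education, values computed for examinees outside those ranges would be extrapolations and should be treated with corresponding caution.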
BVRT data were obtained on 1,149 primarily Caucasian (98%), community-dwelling older males who resided in California and participated in the Western Collaborative Group Study from 1986 to 1989. Mean age was 71.4 (4.69) years. The sample is described as well educated (approximately 62% attended college) and engaged in high-level occupations (approximately 40% had managerial positions, nearly 50% had technical jobs, and < 10% were laborers or clerical staff). Approximately 30% were hypertensive, 10% were diabetic, and 25% had coronary heart disease. The rating for the group as a whole on a depression screening measure was very low, and average MMSE score was 28.1. Participants were given form C, administration A. Means and SDs for number correct were reported.
Study strengths 1. Very large sample size. 2. Information regarding age, ethnicity, education, occupation, gender, recruitment strategy, and geographic area. 3. Test version and administration format reported. 4. Means and SDs provided.
Considerations regarding use of the study 1. Data not stratified by age but appear to reflect a fairly narrow age range. 2. All-male sample. 3. Well-educated sample. 4. No exclusion criteria and no information regarding IQ.
[BVRT.15] Escalona, Yanes, Feo, and Maizlish, 1995 (Table A20.15)
BVRT scores, collected as part of NCTB administration, were obtained on 67 (56 male, 11 female) controls as part of a study of the effects of organic solvents on cognition in Venezuela. Mean age was 30 years, and mean educational level was 8 years. Participants were aged 16-45; could read at the junior high school level; had no previous history of mental illness, drug abuse, head trauma, epilepsy, or neurotoxic exposure; had not consumed alcohol within 24 hours of testing; and had adequate sleep the night before testing. An NCTB recognition version of the BVRT was administered, involving exposure to the items for 10 seconds and selection of a correct match for a four-choice array. Means and SDs for number correct are reported.
Study strengths 1. Information on geographic area, age, education, and gender. 2. Adequate exclusion criteria. 3. Means and SDs for number correct reported. 4. Test format specified (NCTB recognition trial with four-choice array). Considerations regarding use of the study 1. Low educational level. 2. Data not stratified by age. 3. The stimuli version and IQ level not reported. 4. Method of translation of test instructions not reported. 5. Data obtained on Spanish speakers in Venezuela, which may limit generalizability for clinical interpretation in the United States. [BVRT.16] Giambra, Arenberg, Zonderman, Kawas, and Costa, 1995 (Table A20.16)
BVRT data are reported from the Baltimore Longitudinal Study of Aging, collected on 1,163 men from 1960 and on 558 women from 1978, aged 28-87, until 1992. Exclusion criteria were past or present psychosis, major depression, organic brain syndrome, dementia, Parkinson's disease, stroke, or epilepsy. Participants were recruited through invitation or in response to learning of the project through the media. Participants were mostly white and highly educated: <12 (1.7% of men, 2.3% of women), 12 (7.1% of men, 13.7% of women), 13-16 (45.5% of men, 44.9% of women), >16 (53.7% of men, 39.1% of women) years of education. A large proportion were currently in or retired from administrative or professional positions. Form C, administration A, was employed. Means and SDs for number of errors are reported for men and women separately for 10 age groupings. Each protocol was scored by two psychologists according to the 1974 manual, and disagreements were addressed through consensus or a third, independent rater.
Study strengths 1. Very large overall sample size, although the size of individual age and gender cells is not reported. 2. Stratification of data by narrow age groupings and gender. 3. Adequate exclusion criteria. 4. Information on age, gender, geographic area, recruitment strategy, education, ethnicity, and occupation. 5. Administration and scoring specified. 6. Means and SDs for number of errors reported.
Considerations regarding use of the study 1. High educational level. 2. No information on IQ level.
[BVRT.17] Resnick, Trotman, Kawas, and Zonderman, 1995 (Table A20.17)
BVRT data were analyzed from 2,000 (1,365 men, 635 women) mostly Caucasian participants of the Baltimore Longitudinal Study of Aging, aged 20-102 years; 82% of the men and 69% of the women had completed college, and 61% of the men and 48% of the women had some graduate school education. For exclusion criteria, see Giambra et al. (1995). Participants were partitioned into seven age groups: 20-29 (n = 268), 30-39 (n = 353), 40-49 (n = 270), 50-59 (n = 306), 60-69 (n = 334), 70-79 (n = 340), and 80+ (n = 129) years. Administration A, form C, was employed. Errors were scored according to the 1974 manual. Errors were classified into seven major categories: omissions, distortions, perseverations, rotations, misplacement, size, and addition errors. Errors were scored by two experienced, independent examiners, and disagreements were resolved through consensus. Errors across the 10 cards were summed for a total error score. Means and SDs for the seven error types are reported for men and women separately and together in each age grouping.
Study strengths 1. Very large sample size and large individual cell sizes. 2. Information regarding education, recruitment procedures, ethnicity, and geographic area. 3. Adequate exclusion criteria. 4. Data stratified by age and gender. 5. Administration format and version specified. 6. Means and SDs for seven error types reported.
Considerations regarding use of the study 1. Well-educated sample. 2. No information regarding IQ level.
[BVRT.18] Steenhuis and Ostbye, 1995 (Table A20.18)
BVRT recognition trial data were collected in Canada on 591 participants over age 65 in an epidemiological study based on a representative sample of elderly Canadians. All participants received a final consensus diagnosis of no cognitive impairment based on a dementia screening instrument, although a subset (approximately one-fourth) also underwent subsequent comprehensive testing. Mean age was 78.5 (6.7) years, and mean education was 9.8 (4.0); 61% were female, and 21% spoke French; 18% resided in an institution. The multiple-choice version of the BVRT was administered. Means and SDs for number correct are reported.
Study strengths 1. Very large sample size. 2. Information regarding age, education, gender, geographic area, recruitment strategy, and language. 3. Exclusion of patients with dementia and nondementia cognitive loss. 4. BVRT version specified. 5. Means and SDs for number correct reported.
Considerations regarding use of the study 1. Data not stratified by age but appear to cover a relatively narrow age range. 2. Minimal exclusion criteria. 3. Low educational level. 4. Both English and French speakers included. 5. No information regarding IQ level.
[BVRT.19] Dealberto, Pajot, Courbon, and Alperovitch, 1996 (Table A20.19)
BVRT scores were collected on 1,389 (574 male, 815 female) French participants aged 60-70 as part of a study of the effects of sleep-related breathing disorders on cognition. Mean age was 65 (3) years; 46% of men and 55% of women had <6, 33% of men and 36% of women had 6-12, and 21% of men and 10% of women had >12 years of schooling. The majority of women had never smoked (82.8%), whereas only 23.2% of men were never-smokers; 62.2% of the men were former smokers, as were 12.8% of women, and 14.6% of men and 4.4% of women were current smokers. Most participants ingested 1-3 medications per day (44.6% of men, 44.9% of women), while 28.6% of men and 18.7% of women took no medications; 26.8% of men and 36.4% of women took four or more medications daily. The majority of participants drank 1-40 mL of alcohol per day (58.9% of men, 59.3% of women), 12.9% of men and 38.0% of women drank no alcohol, and 28.2% of men and 2.6% of women imbibed >40 mL per day. No significant medical conditions were present in 42.6% of participants, one condition was present in 34.5%, and two or more in 22.9% of the sample; 14.2% of the sample obtained scores suggesting a high level of depression on a self-report inventory. In the age group 60-64, MMSE was <24 in 3.1% of men and 4.4% of women; and in the age group 65-69, 3.6% of men and 6.8% of women had scores in this range. BVRT means and SDs for number correct are reported separately for two age groups (60-64, 65-70), gender, and three educational levels (<6, 6-12, and >12 years).
Study strengths 1. Very large sample size. 2. Information regarding age, education, gender, geographic area, smoking history, alcohol and medication use, and depressive symptoms. 3. Means and SDs for number correct reported. 4. Data stratified by two age groups, gender, and three educational levels.
Considerations regarding use of the study 1. No exclusion criteria; participants with substantial depressive symptoms (14%) and MMSE scores <24 (3%-7%) were included. 2. Test version and format not specified (mean score is number correct but exceeds 10; therefore, this test appears to be an altered format). 3. Recruitment strategies not reported. 4. Data on French speakers. Method of translation of test instructions not specified. 5. No information regarding IQ level.
[BVRT.20] Jacobs, Sano, Albert, Schofield, Dooneief, and Stern, 1997 (Table A20.20)
BVRT scores were obtained on 118 older English speakers and 118 older Spanish speakers as part of a community-based epidemiological study in northern Manhattan, New York City. English speakers averaged 75.07 (5.90) years of age and 8.85 (3.78) years of education, while Spanish speakers averaged 74.91 (5.71) years of age and 8.41 (3.98) years of education; 75% of the English-speaking sample and 72% of the Spanish-speaking sample were female. Participants were designated as English speakers or Spanish speakers based on which language they elected to use on the examination. Among Hispanic participants, country of origin was primarily the Dominican Republic, Cuba, or Puerto Rico; most had resided in the United States for >15 years; 43% spoke English "not at all," 33% spoke English "not well," 11% spoke English "well," and 13% spoke it "very well"; 97% indicated that Spanish was the primary language spoken in the home. Among the English speakers, 73% had been born and raised in the United States. Immigrants were primarily from European countries, and all had immigrated prior to 1980; 98% spoke English "well" or "very well," and 89% spoke primarily English in the home. Participants with signs of dementia or cognitive impairment, based on neurological mental status testing, were excluded, as were participants with major depression or history of Parkinson's disease, stroke, head injury with loss of consciousness, or alcohol abuse or who were less than age 65. To measure visuoperceptual skills, form C was administered, in which each target stimulus was presented along with a four-choice array. For measurement of visual memory, a multiple-choice version (form D) was administered. Participants were shown each design for 10 seconds and, after removal, asked to choose the design from a four-choice array. Spanish test instructions were translated by a committee of native Spanish speakers from Cuba, Puerto Rico, Spain, and the Dominican Republic and then back-translated. Means and SDs for number correct are reported.
Study strengths 1. Large overall sample size. 2. Information regarding age, education, gender, recruitment strategy, and geographic recruitment area. 3. Data stratified by language. 4. Adequate exclusion criteria. 5. Test administration specified, and method of translating instructions reported. 6. Means and SDs for total correct for forms C and D are reported.
Considerations regarding use of the study 1. Nonstandard administration (multiple-choice response). 2. Mostly female sample. 3. Low educational level. 4. Data not stratified by age, but the age range appears to be adequately narrow.
5. Data obtained on Spanish speakers. It is unknown how translation of the test instructions affected performance. 6. No information on IQ level.
[BVRT.21] Carmelli, Swan, Reed, Schellenberg, and Christian, 1999 (Table A20.21)
BVRT data were obtained in 1985-1986 on 589 white male World War II veterans 59-69 years old as part of a study of the impact of smoking, drinking, and Apo E on cognitive function; 514 twin pairs were selected for study from a larger registry of 16,000 based on geographic location (within 200 miles of Framingham, MA; Indianapolis, IN; and Davis, San Francisco, and Los Angeles, CA); 248 monozygotic twins, 242 fraternal twins, and 99 singletons were studied; 341 participants had ≤12 years of education, and 254 participants had >12 years of schooling. Patients with alcohol use, hypertension, diabetes, coronary heart disease, stroke, transient ischemic attack, myocardial infarction, congestive heart failure, and angina were included. Standard testing administration and scoring procedures were followed. Means for number correct are reported for former smokers (quit ≥10 years, n = 222), former smokers (quit <10 years, n = 72), current smokers (n = 102), and never-smokers (n = 199) as well as for nondrinkers (n = 158), light drinkers (≤1 drink per day, n = 204), moderate drinkers (<3 and >1 drinks per day, n = 150), and heavy drinkers (≥3 drinks per day, n = 83).
Study strengths 1. Large sample size, and most individual cells have >100. 2. Narrow age range. 3. Information on education, gender, geographic area, ethnicity, and recruitment strategies. 4. Means reported for number correct. 5. Data stratified by cigarette and alcohol use.
Considerations regarding use of the study 1. No exclusion criteria. 2. All-male sample. 3. Test version not specified, and no SDs reported. 4. No information regarding IQ level.
[BVRT.22] Coman, Moses, Kraemer, Friedman, Benton, and Yesavage, 1999 (Table A20.22)
Archival BVRT data on a total of 156 (31 male, 125 female) normal participants obtained at the University of Iowa were analyzed. Participants were aged 61-97, with a mean of 77.7 (7.89) years. The sample was primarily Caucasian, with a mean of 12.67 (3.46) years of education (range 4-20). All participants were given administration A according to standard procedures. Based on the results of regression analyses, expected BVRT number correct scores are provided for nine ages (55, 60, 65, 70, 75, 80, 85, 90, 95) by 11 educational levels (8, 9, 10, 11, 12, 13, 14, 15, 16, 18, 20).
Study strengths 1. Relatively large sample size, although it encompassed a 36-year age range. 2. Data presented by age and education. 3. Information regarding gender, ethnicity, and geographic area. 4. Test version and administration procedures reported. 5. Expected number correct scores for age × education groupings are reported.
Considerations regarding use of the study 1. Sample was mostly female. 2. No information regarding recruitment strategies or IQ level. 3. No exclusion criteria. 4. Form not specified. 5. Actual means and SDs not reported.
[BVRT.23] Manly, Jacobs, Sano, Bell, Merchant, Small, and Stern, 1999 (Table A20.23)
BVRT data were collected on 43 literate and 43 illiterate older participants in the Washington Heights Inwood Columbia Aging Project, a community-based epidemiological study conducted in northern Manhattan, New York City. Subjects were drawn from a random sample of older (>65 years) Medicare recipients residing in selected census tracts of Washington Heights and Inwood. The sample was restricted to subjects with 0-3 years of education. Participants were excluded if they had a history of stroke, Parkinson's disease, or alcohol abuse or neurological or functional signs of delirium or dementia based on physician exam (independent of neuropsychological scores). Literacy was determined by self-report to the query "Did you ever learn to read and write?" Both groups were 74% female, and 72% of the literate group were Spanish speakers compared to 86% of the illiterate group. The literate group was 81% Hispanic, 9% African American, and 10% non-Hispanic white, while the illiterate group was 91% Hispanic and 9% African American. Means and SDs for multiple-choice recognition and matching trials are reported.
Study strengths 1. Large overall sample size, and individual groups approach 50. 2. Information regarding age, education, gender, ethnicity, geographic area, and recruitment strategy. 3. Adequate exclusion criteria. 4. Data stratified by literacy. 5. Test format specified, although not stimuli version. 6. Means and SDs reported.
Considerations regarding use of the study 1. Data not stratified by age, but apparently a fairly narrow age range was used. 2. Subjects mostly female. 3. No information regarding IQ level.
[BVRT.24] Mathiesen, Ellingsen, and Kjuus, 1999 (Table A20.24)
BVRT scores were gathered on 52 male controls aged <65 as part of a study on the effects of mercury vapor on cognitive function in Norway. Mean age was 45.5 (10.8) years, and mean education was 9.5 (1.8) years. Exclusion criteria were alcohol abuse; major head injury (loss of consciousness >6 hours); metabolic disorders; neurological, psychiatric, or other diseases causing severe disability; and exposure to known occupational neurotoxicants; 15.4% had experienced mild concussions. Mean WAIS-R Vocabulary scaled score was 8.7 (1.2). Administration A of form C was given. Means and SDs for number correct and number of errors are reported.
Study strengths 1. Adequate sample size. 2. Information regarding age, education, gender, Vocabulary scaled score, and geographic area reported. 3. Adequate exclusion criteria. 4. Test administration format and stimuli specified. 5. Means and SDs reported for number correct and number of errors.
Considerations regarding use of the study 1. Data not stratified by age. 2. Low educational level. 3. All-male sample. 4. It is assumed that the test was administered in Norwegian, but the method of translation is not reported. 5. Data collected in Norway, which may limit generalizability for clinical interpretation in the United States.
[BVRT.25] Amir, 2001 (Table A20.25)
BVRT data are reported for 260 participants (124 males, 136 females) recruited from various educational institutions and workplaces in the United Arab Emirates. The mean age of the males was 21.7 (5.89) years, with a range of 15-44 years, and the mean age of the females was 21.6 (4.66) years, with a range of 15-39 years. Mean educational level of the males was 11.6 (2.71) years and of the females, 12.0 (2.80) years. Data were grouped into four IQ levels (superior, above average, average, below average) by gender by two education (≤9 years, ≥university) groups. Administration A, form D, was administered, followed 2 weeks later by form E. Means and SDs are reported for number correct and errors.
Study strengths 1. Large sample size. 2. Information regarding age, education, gender, IQ, and geographic area. 3. Data stratified by gender, IQ level, and education. 4. Test administration format and stimuli versions specified. 5. Data presented on two forms at 2-week retest interval. 6. Means and SDs for number correct and total errors reported.
Considerations regarding use of the study 1. No apparent exclusion criteria. 2. Data not stratified by age. 3. Recruitment strategy not specified. 4. Data on Arabic-speaking sample, collected in United Arab Emirates, which may limit generalizability for clinical interpretation in the United States.
[BVRT.26] Touradji, Manly, Jacobs, and Stern, 2001 (Table A20.26)
BVRT test scores were obtained on 193 randomly selected English-speaking older community residents drawn from the Washington Heights Inwood Columbia Aging Project, an epidemiological study involving elderly Medicare recipients residing in 13 census tracts of Washington Heights and Inwood. All self-identified as non-Hispanic white; 106 were U.S.-born, and 87 were born outside of the United States (39% in Western Europe, 14% in Austria/Hungary, 10% in Poland, 6% in England/Luxembourg, 6% in the former USSR, 3% in Bulgaria/Romania, 2% in southern Europe, 2% in Turkey, 1% in Iraq/Jordan, 1% in other Eastern Europe, 1% in Scandinavia, 1% in Canada, 1% in the Caribbean, and 1% in North Africa). Only subjects who rated themselves as speaking English "very well" were included in the study. All subjects were rated as nondemented by a physician, independent of neuropsychological test scores. All subjects were >65 years of age. The U.S.-born subjects averaged 75.7 (7.2) years of age and 12.9 (3.5) years of education. Foreign-born subjects averaged 77.9 (7.3) years of age and 12.0 (3.7) years of education. Means and SDs for multiple-choice recognition and matching trials are reported.
Study strengths 1. Large sample sizes. 2. Information regarding age, education, English fluency, ethnicity, geographic area and place of birth, and recruitment strategies. 3. Data stratified by place of birth (U.S.-born vs. foreign-born). 4. Test format specified, although not stimuli form. 5. Means and SDs reported.
Considerations regarding use of the study 1. Data not stratified by age, but apparently a fairly narrow age range was used. 2. Minimal exclusion criteria (i.e., nondemented). 3. No information regarding gender or IQ.
[BVRT.27] Coman, Moses, Kraemer, Friedman, Benton, and Yesavage, 2002 (Table A20.27)
BVRT data are the same as those reported in the 1999 study (BVRT.22). Means and SDs for number correct are now reported for four age groups: 55-64 (n = 6), 65-74 (n = 54), 75-84 (n = 67), and 85+ (n = 29).
Study strengths 1. Same as for the 1999 study, although now means and SDs for the actual data are reported.
Consideration regarding use of the study 1. Same as for the 1999 study, with the exception that actual means and SDs are provided.
[BVRT.28] Manly, Jacobs, Touradji, Small, and Stern, 2002 (Table A20.28)
BVRT test scores were obtained on 192 older African Americans and 192 older non-Hispanic white subjects who participated in the Washington Heights-Inwood Columbia Aging Project, an epidemiological study which drew participants from northern Manhattan, New York City. Subjects were drawn from a random sample of Medicare recipients in selected census tracts of Washington Heights and Inwood. All subjects were >65 years of age; mean age of the African-American sample was 73.9 (5.8) years, and mean educational level was 12.8 (2.8) years; mean age of the white sample was 74.6 (5.9) years, and mean educational level was 13.0 (3.0) years. Each sample was 68.2% female. Mean Wide Range Achievement Test, 3rd edition (WRAT-3) Reading score was 44.2 (7.2) for the African-American sample and 49.3 (4.1) for the white sample. Testing was conducted in English, and only those participants who indicated that they spoke English "very well" were included. Exclusion criteria consisted of Parkinson's disease, stroke, head injury with loss of consciousness, alcohol abuse, serious mental illness such as depression or schizophrenia, or neurological or functional signs of dementia based on a physician's clinical rating (independent of neuropsychological test scores). Means and SDs for number correct on the multiple-choice recognition and matching trials of the BVRT are reported.
Study strengths 1. Adequate sample size. 2. Information provided regarding age, education, gender, fluency in English, reading scores, recruitment strategies, and geographic area. 3. Good exclusion criteria. 4. Data stratified by ethnicity. 5. Test administration format reported (but not stimuli form). 6. Means and SDs for multiple-choice recognition and matching trials reported.
Considerations regarding use of the study 1. Data not stratified by age, but there appears to be a fairly narrow age range. 2. No information regarding IQ level.
[BVRT.29] Farahat, Abdelrasoul, Amr, Shebl, Farahat, and Anger, 2003 (Table A20.29)
BVRT data were collected in 2000 on 50 male controls in Egypt as part of a study on the cognitive effects of exposure to organophosphorous pesticides. The sample was recruited from clerks and administrators; the response rate among invited participants was 79%. Mean age was 42.48 (5.54) years; eight had university degrees, and 42 had secondary education. Exclusion criteria were <12 years of education or medical illnesses such as diabetes, liver or kidney disease, peripheral neuropathy, vitamin deficiency, anemia, drug addiction, long-term treatment with psychotropic drugs, history of head injury including loss of consciousness, or recent exposure to neurotoxic agents. Nineteen participants were smokers, and the sample averaged five cups of coffee or tea per day. Means and SDs for number correct are reported. Study strengths 1. Adequate sample size. 2. Information provided regarding age, education, gender, recruitment strategies, and geographic area.
3. Good exclusion criteria. 4. Mean for number correct reported. Considerations regarding use of the study 1. Data not stratified by age. 2. Test version/administration format not specified. 3. Method of translation of test instructions not specified. 4. SDs not reported. 5. Data collected on Arabic speakers in Egypt, which may limit generalizability for clinical interpretation in the United States. 6. No information on IQ level.
[BVRT.30] Kawas, Corrada, Brookmeyer, Morrison, Resnick, Zonderman, and Arenberg, 2003 (Table A20.30)
These BVRT data were collected as part of the Baltimore Longitudinal Study of Aging. The sample consisted of 1,425 participants (1,004 men, 421 women); 72.4% had a college education or higher. Data are presented for six age groupings: <50 (n = 298), 50-59 (n = 546), 60-69 (n = 815), 70-79 (n = 760), 80-89 (n = 380), and 90+ (n = 40). Mean raw Vocabulary scores from the WAIS are provided for each age grouping. Exclusion criteria are described in Giambra et al. (1995); 144 participants subsequently developed Alzheimer's disease. Administration A was employed (presumably form C). Means and SDs for total number of errors are reported. Study strengths 1. Very large sample size. 2. Data stratified by age. 3. Adequate exclusion criteria. 4. Information regarding gender, education, geographic area, recruitment strategy, and WAIS Vocabulary. 5. Means and SDs for total number of errors reported. Considerations regarding use of the study 1. Approximately 10% of participants were subsequently diagnosed with Alzheimer's disease (although this would be true of any older sample).
2. The exact BVRT stimuli used are not specified (although it is assumed to be form C, given that this is Baltimore Longitudinal Study of Aging data). 3. Mostly male sample. 4. Well-educated sample.
[BVRT.31] Reinprecht, Elmstahl, Janzon, and Andre-Petersson, 2003 (Table A20.31)
BVRT data were obtained on 141 81-year-old men in a prospective study of the effects of hypertension on cognition. All men born in the even months of 1914 and residing in the municipality of Malmo, Sweden, were contacted; 500 of the 560 identified men agreed to participate in an examination in 1982-1983, and 281 surviving men were reinvited to participate in 1995-1996. Of these, 185 agreed to the reevaluation, and BVRT data were available on 141. The men were classified into three groups: no hypertension at ages 68 and 81 (n = 22), hypertension at 81 but not 68 (n = 11), hypertension at 68 and 81 (n = 108). Administration A was followed; stimuli version was not specified. Means and SDs for number correct are reported. Study strengths 1. Large overall sample size, although some individual cells were small. 2. Information regarding age, gender, geographic area, and ethnicity reported. 3. Information on recruitment strategy; sample apparently very representative of this cohort. 4. Data on blood pressure readings. 5. Test administration format reported, although not test stimuli version. 6. Means and SDs for number correct provided. 7. Narrow age range. Considerations regarding use of the study 1. No exclusion criteria. 2. No data on educational level or IQ. 3. All-male sample. 4. Method of translation of test instructions not specified. 5. Data collected in Sweden, which may limit generalizability for clinical interpretation in the United States.
[BVRT.32] Ruggieri, Palermo, Vitello, Gennuso, Settipani, and Piccoli, 2003 (Table A20.32)
BVRT data were obtained on 50 Italian controls (24 males, 26 females) as part of a study on cognitive function in relapsing-remitting multiple sclerosis. Mean age was 30.08 (8.37) years, with a range of 17-45 years, and mean education was 11.76 (3.04) years, with a range of 8 to 18 years. BVRT means and SDs for number correct are reported. Study strengths 1. Minimally adequate sample size. 2. Information available on age, education, and gender. 3. Means and SDs reported. Considerations regarding use of the study 1. Data not stratified by age, and age range unacceptably large. 2. No exclusion criteria listed or IQ level. 3. Test format and version not specified. 4. Recruitment strategies not specified. 5. Method of translation of test instructions not specified. 6. Data collected in Italy, which may limit generalizability for clinical interpretation in the United States.
[BVRT.33] Witjes-Ane, Vegter-van der Vlis, van Vugt, Lanser, Hermans, Zwinderman, van Ommen, and Roos, 2003 (Table A20.33)
BVRT data were obtained between 1993 and 1998 on 88 non-gene carriers as part of a study on cognition in Huntington's disease at Leiden University Medical Center in the Netherlands. The group consisted of 40 men and 48 women and averaged 42 years of age (range 18-64); six had less than a high school education, 56 had completed high school, and 26 had post-high school education. Means and SDs for number correct are reported.
Study strengths 1. Large overall sample size. 2. Information regarding gender, age, education, geographic area, and recruitment strategy. 3. Means and SDs for number correct reported. Considerations regarding use of the study 1. Data not stratified by age. 2. No exclusion criteria aside from being non-gene carriers. 3. Test format and version not specified. 4. Method of translation of test instructions not specified. 5. Data collected on Dutch speakers in the Netherlands, which may limit generalizability for clinical interpretation in the United States. 6. No information on IQ level.
CONCLUSIONS
The BVRT, more than any other neuropsychological measure, has gained wide international usage for a wide range of clinical conditions, with the test used not only in English-speaking populations but also in France, Italy, Sweden, the Netherlands, Egypt, United Arab Emirates, Korea, Venezuela, China, Singapore, Japan, and India. Adequate normative data are available for older male and female English-speaking samples, but data are generally sparse for younger age groups and for individuals with <12 years of education. Considerable data have been accumulated for administration A, form C, but considerably less data are available for the other versions.
¹Meta-analyses were not performed for the BVRT as the data available for review are heterogeneous in terms of measures reported (e.g., number correct vs. total errors), test form/administration used, and country where data were collected. In addition, data from the same studies were reported in several articles, which considerably reduces the number of data points available for analyses.
VI MOTOR FUNCTIONS
21 Finger Tapping Test
BRIEF HISTORY OF THE TEST
The Finger Tapping Test (FTT) is one of the original tests introduced by Halstead and is commonly used as a simple measure of motor speed and motor control. Originally, it was called the Finger Oscillation Test, and the number of taps was recorded for the dominant hand only (Russell et al., 1970). Reitan modified the administration of this test to include the performance of both hands. Tapping speed with the dominant hand was one of the 10 measures used in computing the Impairment Index (Halstead, 1947; Reitan, 1955b). To take into account impaired performance with the hand contralateral to the brain damage, Rennick modified the procedures for computing the Average Impairment Rating to include tapping speed with the most impaired hand, rather than with the dominant hand only (Russell et al., 1970). Interpretation of the norms available in the literature is complicated by heterogeneity of the tapping devices and administration techniques. The most frequently used device is a tapping lever mounted with a key-driven counter. The counter rotates when the tapping key is depressed 0.50 inch. The counter and key are mounted on a board, with the tapping lever located approximately 1.75 inches above the surface of the board and at
a 30-degree angle from the counter. It has been reported that 400 g of pressure is required to depress the lever (Knights and Moule, 1967). This tapper is available from Psychological Assessment Resources and from the Reitan Neuropsychology Laboratory (see Appendix 1 for ordering information). A second finger tapping apparatus, the Digital Tapping Test, is also available. This device consists of an electronic, self-contained timer with digital readout, which automatically begins timing with the first depression of the tapping key and allows for no further recording of taps after exactly 10 seconds have elapsed. This digital finger tapping device requires a static weight of 80 g of pressure to depress the lever and 0.13 inch of travel in the key to change the counter. The majority of the studies summarized in this chapter used the standard manual, key-driven tapping device that is part of the Halstead-Reitan Battery (HRB). The most common administration and scoring technique used in the reviewed studies is based on the instructions for the HRB (Rennick method), in which the Finger Tapping score for each hand is the mean of five consecutive 10-second trials within a range of five taps. A maximum of 10 trials with each hand is allowed, and if the above criterion is not met, the score is the mean of the best five trials (see Lezak, 1995, pp. 680-682;
Lezak et al., 2004; Spreen & Strauss, 1998, for further information). First, all five trials with the preferred hand are completed, followed by the nonpreferred hand trials. The majority of authors also referred to the standard description of the procedures specified by Reitan and Wolfson (1985). Some studies used fewer than five trials per hand, did not enforce the procedure of obtaining five consecutive trials within a range of five taps, or alternated hands after each trial. There was variability in terms of data recording as well. Some studies reported performance for the dominant hand only, worse hand only, total for both hands, average of both hands, etc. A modification of the FTT administration procedure was introduced by Russell and Starkey (1993) as a result of the inclusion of the Finger Tapping Test in their Halstead-Russell Neuropsychological Evaluation System (HRNES). According to their instructions, the subject is to tap, using just the index finger, as fast as possible. The subject is to keep the "heel" of his or her hand on the board and avoid using the whole hand, wrist, or arm. After a brief practice, the subject is instructed to perform six 10-second trials with each hand, in sets of three trials, alternating hands between sets, starting with the dominant hand. A score at least four taps faster or slower than the next highest or lowest score is considered to be an "outlier." This score is eliminated from the calculation of the total score and is replaced with an alternate trial to make a total of six valid trials. This substitution is allowed for two trials only. Fifteen-second rest periods are allowed between trials. Total scores represent the average speed for valid trials with each hand and with both hands. The score used in computing an overall index for the entire battery is based on the average performance with the dominant and nondominant hands instead of the worse-hand performance used in the earlier versions.
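The HRB scoring criterion described earlier in this section (five consecutive trials within a five-tap range, up to 10 trials, otherwise the best five) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a published scoring program: the function name `hrb_tapping_score` is hypothetical, "within a range of five taps" is assumed to mean that the highest and lowest counts in the run differ by no more than five, and "best five trials" is assumed to mean the five highest counts.

```python
from statistics import mean


def hrb_tapping_score(trials, window=5, max_range=5, max_trials=10):
    """Score one hand from 10-second tap counts, listed in the order given.

    Hypothetical helper; the range and "best five" readings are assumptions.
    """
    trials = list(trials)[:max_trials]  # no more than 10 trials are considered
    if len(trials) < window:
        raise ValueError("at least five trials are required")

    # Use the first run of five consecutive trials that falls within the range.
    for end in range(window, len(trials) + 1):
        run = trials[end - window:end]
        if max(run) - min(run) <= max_range:
            return mean(run)

    # Criterion never met: fall back to the five highest counts (assumed "best").
    return mean(sorted(trials, reverse=True)[:window])
```

For example, `hrb_tapping_score([40, 48, 50, 52, 51, 49])` skips the first run of five (a spread of 12 taps) and averages trials two through six, returning 50. Studies that used fewer trials or alternated hands would need a different rule.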
Some studies provide data to allow conversion of raw scores into other units, which facilitates comparison between different tests. For example, a normative system for the expanded HRB developed by Heaton et al. (1991, 2004) allows conversion of raw scores into scaled score equivalents, which can be
further converted into T scores, based on data for a sample of neurologically normal participants stratified by age, education, and gender. Golden et al. (1981b) reported that the FTT is a measure of fine motor control, which is based on motor speed as well as kinesthetic and visual-motor abilities. It has been suggested that the FTT is one of the most sensitive tests in the HRB for determining brain impairment (Russell et al., 1970). In a factor-analytic study, Lansdell and Donnelly (1977) reported that tapping performance may be impaired with most, but not necessarily all, types of cerebral damage. Other authors have also noted that brain impairment generally, but not always, results in a compromise in finger tapping speed (Dodrill, 1978a; Haaland et al., 1977; Lezak, 1995; Lezak et al., 2004; Prigatano & Borgaro, 2003). Golden and colleagues (Golden, 1978; Golden et al., 1981b) suggested that at the cortical level impaired performance may reflect dysfunction of the premotor and motor strip regions of the frontal lobes or abnormalities of sensory feedback secondary to parietal lobe dysfunction. These authors reported that, in general, the greater the deficit on the finger tapping score, the nearer the lesion is to the area of the precentral gyrus. In addition, they indicated that subcortical disruption of sensory or motor tracts, as well as peripheral damage to the extremities, may result in compromised performance on this measure. In spite of the well-documented sensitivity of the FTT to brain dysfunction, psychometric issues related to the optimal balance of sensitivity and specificity of the test have been widely disputed in the literature. Wheeler and Reitan (1963) have reported 79% hit rates in brain-damaged populations when using a level-of-performance criterion.
However, while hit rates across different studies appear to be quite good, the rates of false-positive misclassifications based on the cutoffs proposed by Halstead are unacceptably high (especially for older age groups) (Bornstein, 1986a; Bornstein et al., 1987b; Heaton et al., 1986; Trahan et al., 1987). In McKeever and Abramson's (1991) study on college students, only 10% of left-handed vs. 14% of right-handed females and 39% of left-handed vs.
57% of right-handed males scored within the nonimpaired range, using the original Halstead cutoff criteria. The authors emphasized the need for revising the norms due to the high rate of false-positive "diagnoses," especially for females and left-handers. In addition to its utility in determining the presence of brain dysfunction, the FTT provides an index of lateralized dysfunction due to the contralateral effect of cerebral lesions since independent measures of dominant and nondominant hand performance are customarily obtained. Bornstein (1986d) has systematically studied the magnitude and variability of these intermanual differences. His results suggest that in a normal sample, for approximately 30% of males and 20% of females, the nonpreferred hand was found to be superior to the preferred hand. Trahan et al. (1987) found nonpreferred hand performance to be faster than preferred-hand performance in 14.7% of subjects. Similar findings are reported in clinical studies. According to Massman and Doody (1996), 26% of their patients with probable Alzheimer's disease displayed an exaggerated right-hand advantage (associated with higher educational level), whereas 37% demonstrated a reversal of expected asymmetry. The authors emphasized that asymmetry in motor speed correlated significantly with cognitive asymmetries. In a follow-up on his previous study, Bornstein (1986b) noted that this variability in preferred-hand performance frequently results in interpretive difficulties when the commonly used guideline of a 10% preferred-hand superiority is employed. Fromm-Auch and Yeudall (1983) reported only a 5% difference in favor of the preferred hand in males. Bornstein (1986b) suggests that, in the evaluation of lateralized hemisphere lesions, FTT findings have to be supported by nonmotor tasks and by additional instruments measuring motor performance.
In this study, he evaluated the pattern of motor performance on three motor tests (FTT, Grooved Pegboard Test, and Hand Dynamometer), which were administered to normal and unilateral brain lesion samples. Interestingly, a large degree of variability was observed across these intermanual measures, whereby "a high percentage (approximately 25 percent) of the normal sample obtained scores more than one standard deviation from the control mean on a single measure" (p. 719). Thus, Bornstein emphasized the importance of consistency in the performance pattern across tasks, rather than use of a "rigid application of 'cookbook' formulas or 'rules of thumb'" in FTT interpretation (p. 723). More recently, FTT has been used in studies exploring hemispheric specialization and neural asymmetry, as reflected in handedness and intermanual differences in motor speed (Corey et al., 2001; Nalcaci et al., 2001). Data on test-retest reliability of the FTT vary widely, with reliability coefficients ranging from 0.04 to almost perfect values. A majority of the studies, however, report high reliability ratings for different interprobe intervals (Bornstein et al., 1987a; Ruff & Parker, 1993). Reliability in Charter et al.'s (1987) study, which was expressed as the average item-test correlations, was 0.99 for both hands for normal and mixed samples with over 300 participants, with standard errors of measurement of 0.83 and 0.79 for the preferred and nonpreferred hands, respectively. Data on repeated administration of the FTT are presented by McCaffrey et al. (2000). For further information on the psychometric properties of the FTT, see Franzen (2000), Lezak et al. (2004), and Spreen and Strauss (1998).
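The intermanual comparison discussed above, including the commonly used guideline of a 10% preferred-hand superiority, can be illustrated with a short Python sketch. The function names are hypothetical, and expressing the difference as a percentage of the preferred-hand score is an assumption: the reviewed sources do not all state which denominator they used. As the text notes, a single intermanual index varies widely in normal samples, so the labels below are descriptive only and are not diagnostic categories.

```python
def intermanual_difference_pct(preferred, nonpreferred):
    """Preferred-hand advantage as a percentage of the preferred-hand score.

    The choice of denominator is an assumption made for illustration.
    """
    if preferred <= 0:
        raise ValueError("preferred-hand score must be positive")
    return 100.0 * (preferred - nonpreferred) / preferred


def describe_asymmetry(preferred, nonpreferred, expected_pct=10.0):
    """Label a pair of mean tapping scores relative to the 10% guideline."""
    diff = intermanual_difference_pct(preferred, nonpreferred)
    if diff < 0:
        return "reversal"  # nonpreferred hand is faster
    if diff < expected_pct:
        return "below guideline"
    return "at or above guideline"
```

With hypothetical mean scores of 50 taps (preferred) and 45 taps (nonpreferred), the difference is 10%, whereas 50 and 48 yield 4%, and 45 and 50 are labeled a reversal.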
RELATIONSHIP BETWEEN FTT PERFORMANCE AND DEMOGRAPHIC FACTORS
Empirical investigations report the effect of demographic and situational variables on finger tapping speed, such as age (Bornstein, 1986a; Elias et al., 1993; Fromm-Auch & Yeudall, 1983; Heaton et al., 1986; McCurry et al., 2001; Ruff & Parker, 1993; Trahan et al., 1987), education (Bernard, 1989; Bornstein, 1985; Finlayson et al., 1977; Fromm-Auch & Yeudall, 1983; Heaton et al., 1986; Vega & Parsons, 1967), order of test administration (Harris et al., 1981; Neuger et al., 1981),
anxiety (King et al., 1978), and personality characteristics (Heaton, 1985). Intelligence level was not found to be related to finger tapping speed (Tremont, 1998). Many studies have questioned the feasibility of restricting test interpretation for both males and females to an identical level-of-performance cutoff. Dodrill (1979) has reported considerable gender differences in samples of neurological patients and neurologically intact individuals. Other authors have also reported notable gender differences across demographically diverse samples, with males consistently outperforming females by three to five taps (Bornstein, 1985; Echternacht, 1981; Filskov & Catanese, 1986; Fromm-Auch & Yeudall, 1983; Harris et al., 1981; Heaton et al., 1991, 2004; Hoffman, 1969; King et al., 1978; McKeever & Abramson, 1991; Morrison et al., 1979; Ruff & Parker, 1993; Trautt et al., 1983).
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the FTT data, eight key criterion variables were deemed to be critical. The first six relate to subject variables, and the remaining two refer to procedural variables.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
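One way to see why means from small samples are unstable is the standard error of the mean, which shrinks with the square root of the sample size. The sketch below is an illustration only: the helper name `standard_error` is hypothetical, and the SD of 7 taps used in the usage note is an assumed value, not one taken from any study reviewed here.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: SD divided by the square root of n."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sd / math.sqrt(n)
```

With an assumed SD of 7 taps, `standard_error(7.0, 49)` is 1.0 tap, and a sample one quarter of that size gives a standard error twice as large, so a group mean based on a dozen cases is a considerably less precise estimate of the population mean than one based on roughly 50.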
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since an effect of age on FTT performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the possible association between education and FTT scores, information regarding educational level should be reported for each subgroup.
Grouping by Gender
Given the strong association between gender and FTT performance, normative data should be reported for males and females separately.
Description of Hand Preference Assessment
To address the issue of lateralization in test performance, assessment procedures for hand preference should be fully described. Without this assessment, assumptions regarding functional lateralization cannot be made.
Procedural Variables
Description of Administration Procedures
Administration procedures for the FTT differ widely among studies. Detailed description of the procedures allows selection of the most appropriate norms or corrections to account for deviations in administration procedures.
Data Reporting
Group means and standard deviations for the number of taps averaged over all trials for the dominant and nondominant hands should be presented at minimum.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if order of test administration, geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, and recruitment procedures are relevant. Until determined, it is best that this information be provided.
SUMMARY OF THE STATUS OF THE NORMS
As discussed above, there is a great deal of variability in sample composition, administration, scoring, and interpretation of the FTT. Some of these factors are outlined below.
In addition to normative studies based on "normal" samples, a number of clinical comparison studies have explored differences in FTT performance between clinical groups and "normal control" groups (which are sometimes matched on demographic characteristics). Unfortunately, normal control groups are frequently composed of medical or psychiatric patients. These samples cannot be considered truly "normal" due to possible effects of their illnesses and medications on FTT performance. The majority of studies present data as the number of taps averaged over five trials for each hand. Some studies, however, present average scores for both hands; total score across both hands; total score across both hands over all trials, cumulatively; raw data converted to T scores; or scores for the dominant hand only. Such deviations from the standard method of data reporting are identified in our review of the normative data in the context of each pertinent table. In addition to providing normative data for each hand, several studies report the proportion of participants falling in the impaired range or rates of intermanual differences. Several authors stratified their samples by age, education, and/or gender. Procedures for assessment of handedness are thoroughly described in some studies. Furthermore, some authors divide their samples into groups based on handedness pattern. The majority of investigators recruited mostly young and middle-aged participants. Only a few studies present data for elderly individuals. Several publications provide test-retest data over varying interprobe intervals. Some studies provide data for left-handed samples. Several studies report data on specific ethnic groups (e.g., Japanese Americans) or collected abroad, including Canada, Australia, Italy, Holland, and Colombia. Among all the studies available in the literature, we selected for review those based on well-defined samples or that offer some information not routinely reported.
It should be noted that normative data for airline pilots on the FTT and several other tests are available in Kay (2002) (data are not reproduced in this book).
In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 21. Table A21.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES
Original Studies
Halstead's (1947) original control group included only 28 participants (eight of whom were females), although 30 sets of scores were presented. Apparently, the reason for this inconsistency was that two participants took the tests twice (pre- and postlobotomy), and the results of both test administrations were included in the data pool. Concerning the normative performance of Halstead's control group, Lezak (1995, p. 680) reported that the mean number of finger taps (per 10-second trial) was 50 for the right hand and 45 for the left. Normative cut scores were obtained by comparing the performance of controls with that of neurologically impaired individuals. On the basis of this comparison, Halstead (1947) recommended that a cutoff score of 50 (tapping scores of 50 and below are in the brain-damaged range) be used for the dominant hand in differentiating between normal and impaired participants. The corresponding cutoff for the nondominant hand was 44. Methodological concerns are apparent when using these data as a normative reference. For example, the age group interval being assessed was too broad (range 14-50 years), with an unequal sampling across age ranges. The sample consisted of mostly young people, with an average age of 28.3 years. In addition, gender was not adequately represented. Moreover, the sample consisted of inmates and individuals being treated for psychiatric disturbances, thereby confounding normative interpretations. For example, Halstead (1947) noted the following:
Several gave abnormal scores on the personality tests, thus supporting the psychiatric diagnoses. Their symptoms at the time of testing ranged from mild to severe headaches; from loss of appetite, easy fatiguability, acute or chronic gastrointestinal disturbance to insomnia and minor disturbances in memory functions. (p. 37)
¹Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
Reitan (1955b) reported the results of a study designed to assess the validity of Halstead's (1947) Impairment Index. The sample included 50 non-brain-damaged controls (35 males, 15 females) who were matched in pairs with 50 brain-damaged participants on the basis of race, gender, and as closely as possible chronological age and years of formal education. The mean age of controls was 32.36 (SD = 10.78), and mean education 11.58 (SD = 2.85). Participants received neurological examinations before testing and showed "no signs or symptoms of cerebral damage or dysfunction" (p. 29). The majority of participants comprising the control group did, however, have various diagnoses, such as depression (n = 17), paraplegia (n = 13), acute anxiety state (n = 6), and obsessive-compulsive neurosis (n = 2). The author noted that these patients were included to "minimize the possibility that differences in the test results for the brain-damaged and control groups could be attributed to hospitalization, chronic illness, and possible affective disturbances" (p. 29). Administration procedures followed the Halstead (1947) format. Testing and scoring were completed before the groups were composed or the participants were matched. According to the results of this study (for males only), the mean number of finger taps for the preferred hand was 50.74 (SD = 7.29) for the control group, while for the brain-damaged group it was 45.58 (SD = 7.32). The difference between the means was statistically significant. Data for the nonpreferred hand were not provided. Reitan stated that although further validity studies were needed, the results of this investigation suggested that "the Halstead battery is sufficiently sensitive to the effects of organic brain damage to provide an objective and quantitative basis for detailed study of relationships between brain function and behavior" (p. 35).
These data laid the foundation for extensive research concerning the psychometric properties and clinical utility of the FTT. In spite of their historical value, their use as a normative standard for clinical comparison is not recommended due to the idiosyncratic demographic composition of Halstead's and Reitan's samples. Cutoffs for the FTT based on four performance levels, from "perfectly normal" to "severely impaired," were published by Reitan in his update on the HRB (Reitan, 1985). The FTT has enjoyed wide popularity among researchers and clinicians. Since its introduction by Halstead and Reitan, over 100 studies have addressed performance on the FTT in normal and clinical samples (usually along with other HRB tests or as part of a battery comprising various neuropsychological tests). The most relevant of those studies are reviewed below.
[FT.1] Vega and Parsons, 1967 (Table A21.2)
The HRB performance of brain-damaged and control groups recruited in Oklahoma was compared. The control group included 43 patients hospitalized for causes other than central nervous system (CNS) dysfunction and seven nonhospitalized participants.
Study strengths 1. Group composition was described in terms of age, education, gender, IQ, clinical setting, and geographic area. 2. Relatively large sample size. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Data are presented only for the dominant hand. 2. Data are not partitioned by age group. 3. Procedures used to determine hand preference are not described. 4. Control sample is primarily composed of hospitalized patients. 5. Data are collapsed across genders.
[FT.2] Goldstein and Shelly, 1972 (Table A21.3)
Inpatients at the Topeka Veterans Administration Hospital were used. Participants came
from different services in the hospital and were tested on a referral basis. Most participants were male adults. Participants were classified into brain-damaged and control groups. Controls included general medical and psychiatric patients. Participants for whom definitive diagnostic differentiation could not be made were dropped. The administration procedure conforms with the original Halstead instructions.
Study strengths 1. Sample size is large. 2. Administration procedure is described. 3. Means and SDs for the test scores are reported. 4. Geographic area is indicated.
Considerations regarding use of the study 1. Demographic characteristics for the sample such as age, education, and gender are not reported. 2. Procedures used to determine hand preference are not described. 3. Control sample consists of medical and psychiatric inpatients. 4. Sample is comprised primarily of males; male and female data are collapsed.
[FT.3] Goldstein and Braun, 1974 (Table A21.4)
The study explores changes in speed of performance on bilateral motor tasks as a function of increased age. Participants were considered representative of a "normal" population, without reported history of neurological difficulties. A sample consisting of 201 men and eight women was divided into six age groups. Preference of the right hand was endorsed by all but five participants and confirmed by a lateral dominance examination. Participants who indicated mixed dominance were not included. The procedure generally conforms with the original Halstead instructions. The first set of trials established the mean for the preferred hand. Then, the procedure was repeated for the nonpreferred hand. Means and SDs for each hand as well as percent reversal (percent of participants who tap faster with the nonpreferred hand) are presented.
The authors point to a steady decrease in mean speed for both hands as an expected consequence of the aging process, which might be related to diminished function of interhemispheric neural transfer. The number of reversals from the expected difference between the two hands became substantial in the group of 70-year-olds.
Study strengths 1. Overall sample size is quite large, although some individual cells are small. 2. Sample is divided into six age groups. 3. Hand preference and method for determining handedness are indicated. 4. Test administration procedure is described. 5. Gender is reported. 6. Minimally adequate exclusion criteria. 7. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is comprised primarily of males; male and female data are collapsed. 2. No data on educational level or geographic area.
[FT.4] Finlayson, Johnson, and Reitan, 1977 (Table A21.5)
The study compared male brain-damaged and control samples. Controls included healthy individuals as well as hospitalized medical and psychiatric patients. The sample was divided into three groups based on education: university groups included persons with at least 3 years of college; high school groups included those who had completed grade 12 but had not attended college; grade school groups included persons with <10 years of education. Raw scores were converted into T scores, with a mean of 50 and SD of 10.
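The T-score conversion mentioned above is a linear rescaling of a raw score against the mean and SD of a reference group. A minimal Python sketch of that arithmetic follows; the function name and the reference values in the usage line are hypothetical placeholders, not figures from Finlayson et al.

```python
def t_score(raw, ref_mean, ref_sd):
    """Convert a raw score to a T score (mean 50, SD 10).

    ref_mean and ref_sd are the mean and SD of the reference group.
    The values used in the example below are illustrative only and
    are not published norms.
    """
    if ref_sd <= 0:
        raise ValueError("ref_sd must be positive")
    z = (raw - ref_mean) / ref_sd  # standard (z) score
    return 50 + 10 * z


# Hypothetical example: 45 taps against a reference mean of 50 and SD of 5,
# i.e., a score one SD below the reference mean.
print(t_score(45, 50, 5))
```

Because T scores discard the original metric, a study that reports only T scores cannot be compared directly with studies reporting number of taps unless the reference mean and SD are also given.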
Study strengths 1. Sample composition is well described in terms of age, education, gender, and IQ. 2. Data are presented by education groupings.
Considerations regarding use of the study 1. Data are not partitioned into age groups. 2. All participants are male.
3. Raw scores are not reported. 4. Procedures used to determine hand preference are not described. 5. Control group includes hospitalized medical and psychiatric patients. 6. Small individual cell sizes.
[FT.5] Wiens and Matarazzo, 1977 (Table A21.6)
The authors collected FTT data on 48 male applicants to a patrolman program in Portland, Oregon, as part of an investigation of the WAIS and MMPI correlates of the Halstead-Reitan Battery. All participants passed a medical exam and were judged to be neurologically normal. Participants were divided into two equal groups, which were comparable in age, education, and WAIS full-scale IQ. Group 1 ranged in age 21-27 years and group 2, 21-28 years. Means and SDs are provided for each hand.
Correlations for the two groups between WAIS FSIQ and FTT scores were -0.05 and 0.03 for the preferred hand and -0.11 and -0.44 for the nonpreferred hand. The authors concluded that, for the top half of the population in terms of education and IQ, individual differences in scores on the WAIS do not influence performance on the HRB measures.
Study strengths 1. Demographic characteristics of the sample are presented in terms of gender, age, education, IQ, recruitment procedures, and geographic area. 2. Adequate medical exclusion criteria. 3. Means and SDs for the test scores are reported. 4. Data are provided in a restricted age range.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. High IQ level. 3. Relatively small sample size. 4. All-male sample.
[FT.6] Dodrill, 1978a (Table A21.7)
The study compares epileptic and control groups in the state of Washington. The control group included participants from the community with no evidence of neurological disorder. Of the controls, nine were students, six were housewives, 20 were unemployed, and 15 were employed. Controls were recruited through employment facilities, churches, a community college, a public high school, a volunteer service agency, and a semisheltered workshop.
Study strengths 1. Sample composition is well described in terms of age, education, gender, occupation, geographic area, ethnicity, and recruitment procedures. 2. Data are stratified by gender. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Apparently adequate exclusion criteria, although some controls were recruited from semisheltered workshops. 3. Undifferentiated age range. 4. Relatively small sample sizes in the gender subgroupings.
[FT.7] Dodrill, 1978b (Table A21.8)
Performance on motor tests was compared for a control and three brain-damaged groups. The 25 control participants were Caucasian, right-handed adults over 15 years of age, recruited from community resources in Washington, who had no history of injury or disease that involved the CNS.
Study strengths 1. Sample composition is well described in terms of age, education, gender, handedness, and geographic area. 2. Minimally adequate exclusion criteria for controls. 3. Means and SDs are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Small sample size. 3. Undifferentiated age range. 4. Data are collapsed across genders.
[FT.8] Morrison, Gregory, and Paul, 1979 (Table A21.9)
The study explored interexaminer reliability for test-retest conditions of the FTT. Participants were 60 volunteers from introductory psychology courses with a modal age of 19. All participants were white and from rural backgrounds in Idaho; half the sample were male and half, female. Two conditions were used:
1. Participants were tested by the same examiner twice with a 1-week interval (test-retest condition). 2. Participants were tested by different examiners twice with a 1-week interval (interexaminer condition). Means and SDs for the test-retest and interexaminer conditions are reported. The authors found significant gender differences, with males performing about three taps faster than females.
Study strengths 1. Data are presented for males and females separately. 2. Information on ethnicity, occupation (college students), age, and geographic area is presented. 3. Means and SDs for the test scores are reported. 4. Restricted age grouping.
Considerations regarding use of the study 1. Results are reported only for the dominant hand. Data are averaged over the test and retest conditions. 2. No exclusion criteria. 3. Relatively small individual cell sizes. 4. Procedures for assessment of hand dominance are not described.
[FT.9] Anthony, Heaton, and Lehman, 1980 (Table A21.10)
The purpose of the study was to cross-validate two computerized programs designed to determine the presence, location, and process of brain lesions using scores from the Halstead-Reitan Battery and the WAIS. Patients with structural brain lesions and normal controls were compared. The control group consisted of volunteers with no medical or psychiatric problems and no history of head trauma, brain disease, or substance abuse. The study was conducted in Colorado.
Study strengths 1. Information regarding education, IQ, age, and geographic area is provided. 2. Large sample size. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Undifferentiated age grouping. 3. No information regarding gender; data are collapsed across genders.
[FT.10] Bak and Greene, 1980 (Table A21.11)
The study investigated the effects of age on different neuropsychological measures in healthy, active older adults in Texas aged 50-86 years. Two age groups were compared: 50-62 and 67-86 years. All participants were right-handed. Participants were fluent in English and denied history of CNS disorders, uncorrected sensory deficits or illnesses, or "incapacities" which might affect test results; participants in poor health were excluded. Four WAIS subtests were administered (Information, Arithmetic, Block Design, Digit Symbol); mean scores on these measures suggested that IQ levels were within the high average range or higher. Means and SDs were reported for each hand.
Study strengths 1. The study provides data on a very elderly age cohort not found in other published normative data. 2. Adequate exclusion criteria. 3. Sample composition is well described in terms of age, gender, education, handedness, and geographic area. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures for assessment of handedness are not described. 2. Sample sizes are small. 3. High IQ and educational level for the older age grouping. 4. The older age grouping spans nearly two decades and may be too broad for optimal clinical interpretation. 5. Data are collapsed across genders.
[FT.11] Eckardt and Matarazzo, 1981 (Table A21.12)
Performance on neuropsychological tests for drug-free alcoholic inpatients and nonalcoholic medical inpatients was compared. All participants were male inpatients at V.A. hospitals in California aged 21-60. No psychoactive medication had been ingested by the patients during the 48 hours prior to testing. The nonalcoholic group consisted of medical inpatients from the same hospital who were referred from a variety of services and were neurologically stable during the study. They were assumed to have no recent drinking problem. Sixty percent of participants had some college education. They were tested twice with an interval of 12-22 days between probes.
Study strengths 1. The sample composition is described in terms of age, geographic area, and gender, with cursory information on education. 2. Test-retest data are available. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Data are provided only for males and only for the dominant hand. 3. Small sample size. 4. Controls were medical inpatients. 5. Undifferentiated age range.
[FT.12] Pirozzolo, Hansch, Mortimer, Webster, and Kuskowski, 1982 (Table A21.13)
The study compares performance on neuropsychological measures by Parkinson's disease patients and normal controls. Normal volunteers were matched on age, gender, and education to the patient group and satisfactorily completed a brief screening exam by a neurologist.
Study strengths 1. Sample sizes are sufficiently large. 2. Minimally adequate exclusion criteria. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No data are reported for mean age, education, or gender distribution for controls, although it is assumed that controls approximate the age, gender, and education of patients given that the groups were matched. 2. Method for determining handedness is not reported. 3. Undifferentiated age range.
[FT.13] Rounsaville, Jones, Novelly, and Kleber, 1982 (Table A21.14)
The study compared performance of opiate addicts, epilepsy patients, and a control group. A sample of 29 Comprehensive Employment Training Act (CETA) participants was used for a normal comparison group. Participants with a history of drug or alcohol abuse or of a neurological disorder were excluded. Participants were screened for alcohol and illicit psychoactive substances, and urine specimens were taken at the time of testing.
Study strengths 1. Controls are described in terms of gender, age, education, and percent right-handed. 2. Adequate exclusion criteria.
Considerations regarding use of the study 1. SDs are not provided. 2. The procedures for assessment of hand dominance are not described. 3. Age range is not provided. 4. Sample size is small. 5. Data are collapsed across genders.
[FT.14] Yeudall, Fromm-Auch, and Davies, 1982 (Table A21.15)
The study compares performance on the HRB for delinquent and nondelinquent adolescents in Canada. The delinquent group included adolescents admitted to the primary residential treatment resource for persistent delinquents with severe behavioral disturbances. The nondelinquent group included adolescents from regular classrooms. Handedness was measured by the Annett (1970) Handedness Questionnaire: 88% of the delinquent sample and 83% of the control sample were right-handed.
Study strengths 1. Samples are described in terms of age, gender, IQ, handedness, and geographic area. 2. Procedure for assessment of handedness is identified. 3. Means and SDs for the test scores are reported. 4. Sample sizes are relatively large.
Considerations regarding use of the study 1. Data are collapsed across genders. 2. No apparent exclusion criteria.
Other comments 1. Data were collected in Canada.
[FT.15] O'Donnell, Kurtz, and Ramanaiah, 1983 (Table A21.16)
The study compares neuropsychological test performance of normal, learning-disabled, and brain-damaged young adults. The normal control group consists mostly of college student volunteers without a history of learning problems, blows to the head, or seizures.
Study strengths 1. Composition of the sample is well described in terms of age (college students), education, gender, handedness, and occupation. 2. Minimally adequate exclusion criteria. 3. Narrow age range. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are presented for the dominant hand only. 2. Procedures for assessment of hand dominance are not described. 3. Sample size is relatively small. 4. Data are collapsed across genders. 5. High IQ level.
[FT.16] Prigatano, Parsons, Levin, Wright, and Hawryluk, 1983 (Table A21.17)
The study, conducted in Oklahoma and Canada, explores neuropsychological functioning in mildly hypoxemic patients with chronic obstructive pulmonary disease (COPD). The 25 control participants, free of physical or emotional illnesses, were matched to patients. Study strengths 1. Control sample is described in terms of age, education, gender, IQ, handedness, and geographic area. 2. Minimally adequate exclusion criteria. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures for assessment of hand dominance are not described. 2. Small sample size. 3. Data are collapsed across genders. 4. Undifferentiated age range. Other comments 1. Data were partially collected in Canada. [FT.17] Fromm-Auch and Yeudall, 1983 (Table A21.18)
The authors obtained data on 193 Canadian participants (111 male, 82 female) recruited through posted advertisements and personal contacts. Participants' mean education was 14.8 (3.0) years and mean FSIQ was 119.1 (8.8). Participants are described as "nonpsychiatric" and "non-neurological." Handedness was determined by the writing hand; 83.4% of the sample were right-handed. Strength of handedness was determined by the Annett (1970) Handedness Questionnaire. Means and SDs for each hand are reported for the entire sample and for five age groups stratified by gender. The authors observed a pronounced effect of gender on all motor tests, with females appearing weaker and slower than males. The relationship between age and performance appears to be curvilinear for both genders, with peak performance in the 33-40 year old age range.
Study strengths 1. Sample composition is described in terms of gender, IQ, education, geographic area, and recruitment procedures. 2. Some psychiatric and neurological exclusion criteria were used. 3. Large overall sample size. 4. Method for determining handedness is specified. 5. Normative data are presented for the entire sample and separately for different age and gender groups. 6. Means, SDs, and ranges for the test scores are reported.
Considerations regarding use of the study 1. Sample sizes for some age groups are very small. 2. High intellectual and educational level of the sample.
Other comments 1. The article provides a summary of previously published normative data. 2. Calculation of the educational level included technical or vocational training. 3. Data were collected in Alberta, Canada.
[FT.18] Bornstein, 1985 (Tables A21.19, A21.20)
The author collected data on 365 Canadian individuals (178 males, 187 females) recruited through posted notices on college campuses and unemployment offices, newspaper ads, and senior-citizen groups. Participants were paid for their participation. Participants were aged 18-69, with a mean of 43.3 (17.1) years, and had completed 5-20 years of education, with a mean of 12.3 (2.7) years; 91.5% of the sample were right-handed. No other demographic data or exclusion criteria are reported.
Means and SDs are reported for each hand. Scores are based on the mean of five trials within five taps of each other to a maximum of 10 trials. When not accomplished, the score is the mean of the best five trials. The sample is stratified by age group (20-39, 40-59, 60-69), level of education, and gender.
Study strengths 1. Very large overall sample size. 2. Data are stratified by age, gender, and educational level. 3. This data set is unique in that it reports data for participants with less than a high school education. 4. Information on recruitment procedures and geographic area is provided. 5. Method for determining handedness is specified. 6. Means and SDs for the test scores are reported. 7. Test administration procedures are specified.
Considerations regarding use of the study 1. Individual sample sizes of some cells are small. It is unclear whether the youngest age included was 18 or 20. 2. No reported exclusion criteria.
Other comments 1. Data were collected in Canada.
[FT.19] Villardita, Cultrera, Cupone, and Mejia, 1985 (Table A21.21)
All participants were healthy volunteers residing in Catania, Italy, who had 8-13 years of schooling and scored >23 on the Mini-Mental State Exam (MMSE). The score is the total number of taps recorded for each hand on two trials. The data are presented in four age groupings.
Study strengths 1. Administration procedure is described. 2. Means and SDs for the test scores are reported. 3. Data are presented by age groupings.
Considerations regarding use of the study 1. Demographic characteristics such as gender distribution and mean educational level are not presented. 2. Data for 25-45 years of age are not presented, and data are not stratified by gender. 3. Sample sizes are small. 4. Administration procedures and data reporting deviate from the standard instructions. 5. Method for determining handedness is not reported. 6. Exclusion criteria are not adequate.
Other comments 1. Data were collected in Italy.
[FT.20] Heaton, Nelson, Thompson, Burks, and Franklin, 1985 (Table A21.22)
The authors compared performance of multiple sclerosis patients and normal controls recruited in Colorado. The control group included 100 participants with no history of neurological illness, significant head trauma, or substance abuse.
Study strengths 1. Control sample size is large. 2. Information regarding age, education, gender, and geographic area is reported. 3. Exclusion criteria are adequate. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Data are provided as total for both hands only. 2. Undifferentiated age range. 3. No information regarding handedness. 4. High educational level of controls. [FT.21] Kane, Parsons, and Goldstein, 1985 (Table A21.23)
The study compares performance of brain-damaged and control participants on neuropsychological tests. The control group consists of 46 medical and nonschizophrenic psychiatric V.A. patients with a mean age of 38.9 (11.3) years, recruited in Oklahoma and Pittsburgh. Data for two hands are reported in T scores.
Study strengths 1. Sample is described in terms of age, education, and geographic area. 2. Adequate sample size. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine handedness are not described. 2. Data are reported in T scores rather than number of taps. 3. Control group consists of medical and psychiatric patients. 4. No information on gender; presumably, the majority of participants are males. 5. Undifferentiated age range. [FT.22] Heaton, Grant, and Matthews, 1986 (Table A21.24)
The authors obtained data on 553 normal controls in Colorado, California, and Wisconsin as part of an investigation into the effects of age, education, and gender on Halstead-Reitan Battery performance. The sample consisted of 356 males and 197 females. Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants were aged 15-81 years, with a mean age of 39.3 (17.5) years, and education ranged 0-20 years, with a mean of 13.3 (3.4) years; 7.2% were left-handed. The sample was divided into three age groups and three education groups. Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. The chapter provides a review of different studies exploring the relationship of neuropsychological test performance with age, education, and gender. The authors concluded that different sets of norms should be used for participants of different ages, educational levels, and genders when determining whether an individual's performance is normal or abnormal.
Study strengths 1. Large overall sample size and sizes of individual cells. 2. Information regarding age, education, gender, handedness, and geographic area is provided. 3. Adequate exclusion criteria. 4. Data are grouped by age and educational level.
Considerations regarding use of the study 1. SDs are not provided, which limits utility of the norms. 2. Procedures for assessment of hand dominance are not described. 3. Age groupings are quite large in terms of ranges.
[FT.23] Bornstein, 1986a (Tables A21.25, A21.26)
This study expands the analysis of the data provided in Bornstein (1985). The author examined cutoff levels for impairment and the proportion of participants falling in the impaired range. For the preferred and nonpreferred hands, the clinically employed cutoff criteria for males were 50 and 44 taps, respectively, and those for females were 46 and 40, respectively. Performance below these criteria placed participants into the impaired range. The high proportions of impaired scores obtained are viewed by the author as suggesting caution in using standard cutoff scores. Base rate issues are discussed from the perspective of the validity of test interpretation. The scores are based on the mean of five trials within five taps of each other to a maximum of 10 trials. When not accomplished, the score is the mean of the best five trials. The sample was stratified by age group (18-39, 40-59, 60-69), level of education (
criteria and test administration procedures are specified (Tables A21.25, A21.26).
[FT.24] Polubinski and Melamed, 1986 (Table A21.27)
Participants were students taking introductory psychology classes. All participants were right-handed. The Crovitz-Zener test (1962) was used to assess the degree of hand dominance based on consistency in hand preference for five unimanual tasks. Participants with scores of 25 on this test formed the firm right-handed groups, while those with scores of ≤24 formed the mixed right-handed groups. A switchback design was used, though it remains unclear how many trials were used per hand. Number of taps in 15-second trials was averaged for each hand.
Study strengths 1. Assessment of handedness is well described. 2. Information on age, education, handedness, gender, and occupation (college students) is provided. 3. Relatively large sample for restricted age, education, and handedness groups. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. No exclusion criteria are reported. 2. Nonstandard test administration (15-second trials), and exact test procedures are not specified.
[FT.25] Trahan, Patterson, Quintana, and Biron, 1987 (Table A21.28)
Participants were 713 neurologically intact adults (382 males, 331 females) aged 18-91. The score was the mean for three trials for each hand; trials began with the dominant hand and alternated between hands. The authors found no difference between three-trial and five-trial scores in a subgroup of 102 adults. The authors concluded that the data revealed significant age-related differences. In addition, males performed faster than females at all age levels.
Use of the traditional cutoff suggested by Halstead and Reitan resulted in false-positive rates ranging from 28.0% to 89.5% in the various groups. The percentage of participants displaying an intermanual difference >10% ranged from 22.9% to 55.2% in the various groups. Reversals (nonpreferred hand faster than preferred hand) were observed in 14.7% of the sample. The data challenge the traditional hypothesis regarding "normal" adult tapping performance.
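The two sample-level statistics reported above, the percentage of participants with an intermanual difference >10% and the percentage of reversals, can be illustrated with a short Python sketch. The text does not state how Trahan et al. computed the intermanual difference, so the formula here (difference expressed as a percentage of the preferred-hand score, taken in absolute value) is an assumption, and the function names and tapping scores are hypothetical.

```python
def intermanual_difference_pct(preferred, nonpreferred):
    """Preferred-hand advantage as a percentage of the preferred-hand score.

    Assumed formula: (preferred - nonpreferred) / preferred * 100.
    A negative value indicates a reversal (nonpreferred hand faster).
    """
    if preferred <= 0:
        raise ValueError("preferred-hand score must be positive")
    return (preferred - nonpreferred) / preferred * 100


def summarize_sample(pairs):
    """Summarize (preferred, nonpreferred) mean tapping scores for a sample.

    Returns the percentage of reversals and the percentage of participants
    whose absolute intermanual difference exceeds 10% (absolute value is an
    assumption; the source does not specify the direction).
    """
    if not pairs:
        raise ValueError("pairs must not be empty")
    n = len(pairs)
    reversals = sum(1 for pref, nonpref in pairs if nonpref > pref)
    over_10 = sum(
        1 for pref, nonpref in pairs
        if abs(intermanual_difference_pct(pref, nonpref)) > 10
    )
    return {
        "pct_reversal": 100 * reversals / n,
        "pct_diff_over_10": 100 * over_10 / n,
    }


# Hypothetical (preferred, nonpreferred) scores for four participants
sample = [(50, 44), (48, 47), (45, 46), (52, 46)]
print(summarize_sample(sample))
```

Computed this way, both statistics are base rates in a normal sample, which is why high values argue against treating a single reversal or a large hand difference as evidence of impairment.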
Study strengths 1. Administration procedure is described. 2. Data are stratified by age and gender. 3. Minimally adequate exclusion criteria. 4. Means and SDs for the test scores are reported. 5. Large sample sizes for younger age groupings.
Considerations regarding use of the study 1. Sample composition is minimally described. 2. Procedures for assessment of hand dominance are not described. [FT.26] Yeudall, Reddon, Gill, and Stefanyk, 1987 (Table A21.29)
The authors obtained data on 225 Canadian participants recruited from posted advertisements in workplaces and personal solicitations. Participants included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified for participation average students in grades 10-12. Participants (127 males, 98 females) did not report any history of forensic involvement, head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Handedness was determined by the writing hand. Data were gathered by experienced technicians who "motivated the participants to achieve maximum performance" partially through the promise of detailed explanations of their test performance. The results are presented for the whole sample and stratified by four age groups x gender.
Study strengths 1. Large sample size. 2. Data are stratified by age and gender. 3. Data are available for a 15-20 year age group. 4. Adequate medical and psychiatric exclusion criteria. 5. Information regarding age, education, IQ, gender, occupation, recruitment procedures, and geographic area is provided. 6. Method for determining handedness is specified. 7. Means and SDs for the test scores are reported.
Consideration regarding use of the study 1. High educational level of the sample.
Other comments 1. IQ was measured by the WAIS and WAIS-R. WAIS IQ scores were linearly equated to WAIS-R IQ scores. 2. Data were collected in Canada. 3. Correlations of FTT scores with age and education were 0.20 and 0.06 for the preferred hand and 0.22 and 0.08 for the nonpreferred hand, respectively. The effect of gender on performance was also explored. The authors concluded that age effects were not significant for either hand, but there were gender effects for both the preferred and nonpreferred hands. Therefore, they suggest using the gender norms collapsed across age.
[FT.27] Alekoumbides, Charter, Adkins, and Seacat, 1987 (Table A21.30)
The authors report data on 123 medical and psychiatric inpatients and outpatients from V.A. hospitals in southern California without cerebral lesions or history of alcoholism or cerebral contusion. The sample included 82 participants not suffering from psychiatric illness and 32 neurotic and 9 psychotic participants. In addition to psychiatry services, participants were drawn from medicine, neurology, spinal cord injury, and surgery units. Mean IQ was within the average range; means and SDs for individual age-corrected subtest scores are also reported. This group,
characterized as "normal participants," was recruited from the patient population of a large general hospital and consisted mostly of inpatients. Ages ranged 19-82 years, and education ranged 1-20 years. Seven percent of participants were black, and all but one were male. Most were urban residents. Data were collected in southern California as part of a project on development of standardized scores corrected for age and education for the Halstead-Reitan Battery. Study strengths 1. Sample composition is well described. Information regarding age, IQ, education, ethnicity, gender, occupational attainment, and geographic area is provided. 2. Sample size is large. 3. Regression equation for computation of age- and education-corrected scores is provided. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Sample was heterogeneous in terms of medical diagnoses. Psychiatric patients were included in this sample, which was supposedly representative of "normal" participants. 3. Wide age range is not partitioned by age groups. 4. All participants but one are male. [FT.28] Bornstein, Paniak, and O'Brien, 1987b (Tables A21.31, A21.32)
The authors compared performance of a subset of 134 healthy Canadian participants (49 males, 85 females) from a control sample described in Bornstein (1985, see above) and 94 brain-damaged patients (47 males, 47 females). Both groups were matched on age and education. Performance of brain-damaged patients was considerably lower compared to controls. Classification rates obtained with conventional cutoff scores presented in Russell et al. (1970) were examined. In addition, the distribution of scores for the
two groups was examined to determine the optimal cutoff score resulting in the best overall classification rates, with emphasis on accurate classification of normal participants.
Study strengths 1. Large sample size. 2. Data are provided for males and females separately. 3. Cutoff scores are provided, as well as means and SDs. 4. Information on age, gender, and educational level is provided.
Considerations regarding use of the study 1. Procedures for determination of hand preference are not identified. 2. No exclusion criteria are identified. 3. Undifferentiated age range.
Other comments 1. Data were collected in Canada.
[FT.29] Bornstein, Baker, and Douglass, 1987a (Table A21.33)
The study assessed the test-retest reliability of the FTT over a period of 3 weeks. Participants were 14 women and nine men without a positive history of neurological or psychiatric illness. Ages ranged 17-52 with a mean of 32.3 (10.3) years; mean Verbal IQ was 105.8 (10.8), and mean Performance IQ was 105.0 (10.5).
Participants were administered the Halstead-Reitan Battery in standard order on both initial testing and again 3 weeks later. Means and SDs for raw score change over 3 weeks are -1.1 (5.3) and 0.9 (2.7) for the right and left hands, respectively. Data for the whole sample for both testing probes are provided.
Study strengths 1. Sample composition is described in terms of age, VIQ, PIQ, and gender. 2. Short-term (3-week) retest data are provided. 3. Minimally adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample size is small. 2. Age range is wide; the effect of age on test-retest change is not explored. 3. Educational levels are not specified. 4. Procedures for assessment of hand dominance are not described. It is unclear whether the authors used dominant/nondominant comparisons or right/left comparisons. 5. Data are collapsed across genders. [FT.30] Russell, 1987 (Table A21.34)
The study explored test parameters for the Rennick Index of the Halstead-Reitan Battery. Brain-damaged and control groups were compared. The control group consisted of patients seen in V.A. medical centers in Miami and Cincinnati who were suspected of having a neurological condition but who had negative neurological findings. No other exclusion criteria are described. Tests were administered and scored according to the standard directions given by Russell (1984), with some modification for the FTT. Study strengths 1. Information regarding IQ, education, ethnicity, gender, age, and geographic area is provided. 2. Large sample. 3. Means and SDs for the test scores are reported. 4. Test administration procedures are reported. Considerations regarding use of the study 1. Sample is not partitioned into age groups. 2. Data are averaged for two hands. 3. Control sample consists of medical and psychiatric inpatients, who were suspected of having a neurological condition but had negative neurological findings. 4. Method for determining handedness is not reported. 5. Data were collected mostly on males and are collapsed across genders.
[FT.31] Thompson, Heaton, Matthews, and Grant, 1987 (Table A21.35)
The article presents the percentage of 426 normal participants (279 males, 147 females) scoring in the lateralized lesion range using Golden's (1978) guidelines. Dominant hemisphere dysfunction was defined as superiority of nonpreferred hand performance over preferred hand performance. Nondominant hemisphere dysfunction was identified as preferred hand performance at least 20% better than nonpreferred hand performance. Lateral preference type was assessed based on participants' performance on the Reitan-Klove Lateral Dominance Exam and the Miles ABC Test of Ocular Dominance (Reitan & Wolfson, 1985). The following groups were identified: 1. All right: participants who wrote with their right hand and manifested right lateral preference on all hand, eye, and foot measures. 2. Mixed right: participants who wrote with their right hand but manifested left preference on one or more hand, eye, or foot measures. 3. Left: left-handed participants.
Intermanual percent difference scores were calculated as preferred hand minus nonpreferred hand divided by preferred hand. Participants' mean age is 40.59 (18.27) years and mean education is 13.15 (3.49) years. They had been screened for history of head trauma, neurological illness, substance abuse, serious psychiatric illness, and peripheral injuries that might affect test performance. The authors concluded that age, education, and gender are not significantly related to intermanual difference scores. Study strengths 1. Large sample size. 2. Lateral preference was thoroughly assessed, and three groups are identified. 3. Intermanual differences and a percentage of participants scoring in the lateralized lesion range are reported. 4. Adequate exclusion criteria.
5. Information on age, education, and gender is reported. Considerations regarding use of the study 1. Means and SDs for each group are not reported. 2. Age range is wide; data are not partitioned into age groups, which precludes consideration of the effect of age on intermanual differences. [FT.32] van den Burg, van Zomeren,
Minderhoud, Prange, and Meijer, 1987 (Table A21.36)
The study compares a group of patients with multiple sclerosis and demographically matched controls in northern Holland. The control group consists of 40 healthy participants without a history of neurological disease, who had never been administered psychological tests prior to their participation in the study. The total number of taps in three 10-second trials for both hands constitutes the total score. Study strengths 1. Sample is described in terms of gender, age, education, and geographic area, although education is reported as 5 years of schooling, according to the Dutch educational system. 2. Administration procedure is identified. 3. Means and SDs for the test scores are reported. 4. Sample size is relatively large. 5. Minimally adequate exclusion criteria.
Considerations regarding use of the study 1. Sample is not partitioned into age intervals. 2. Participants' handedness is not identified. 3. Data are reported as totals for both hands over three trials. 4. Data are collapsed across genders. Other comments 1. Data were collected in northern Holland.
[FT.33] Bornstein and Suga, 1988 (Table A21.37)
The authors reported data on 134 healthy older Canadian volunteers aged 55-70, who were paid for their participation. Participants represent a subset of those used in Bornstein (1985). The sample was partitioned into three education groups. Nearly two-thirds of the sample are female (n = 85). Average age is 62.7 (4.3) years, and mean ages of the three education groups are comparable. Exclusion criteria were history of neurological or psychiatric disorder. Study strengths 1. Large overall sample size, and adequate individual cell sizes. 2. Data are partitioned into three education groups; the study is unique in terms of representation of participants with <12 years of education. 3. Information regarding gender, age, and geographic area is provided. 4. Means and SDs for the test scores are reported. 5. Minimally adequate exclusion criteria. 6. Reasonably restricted age grouping. Considerations regarding use of the study 1. Procedure for determination of hand preference is not identified. 2. The >12 years of education category is too large. 3. Data are collapsed across genders. Other comments 1. Data were collected in Canada.
[FT.34] Ardila and Rosselli, 1989 (Table A21.38)
The sample included 346 normal older Colombian adults. Participants had a score of ≥23 on the MMSE, had no neurological or psychiatric background as determined by a neurological and psychiatric screening, and performed adequately in everyday life activities. Data are presented by age × education groups. Study strengths 1. Large overall sample size. 2. Sample is partitioned by age × education groups. 3. Adequate exclusion criteria.
Considerations regarding use of the study 1. Demographic characteristics are cursorily described. 2. Sample size for each cell is small. 3. SDs are not reported. 4. Procedures used to determine hand preference are not described. 5. Data are collapsed across genders.
Other comments 1. Data were collected in Bogota, Colombia. [FT.35] Heaton, Grant, and Matthews, 1991
The authors provided normative data from 486 (378 in base sample, 108 in validation sample) urban and rural participants recruited in several states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data were collected over a 15-year period through multicenter collaborative efforts. Sixty-five percent of the sample were male. Mean age for the total sample was 42.06 (16.8) years, and mean educational level was 13.6 (3.5) years. Mean FSIQ, VIQ, and PIQ were 113.8 (12.3), 113.9 (13.8), and 111.9 (11.6), respectively. Exclusion criteria were history of learning disability, neurological disease, illnesses affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or other substance abuse. The FTT was administered according to the procedures described by Reitan and Wolfson (1985). Participants were generally paid for their participation and judged to have provided their best efforts on the tasks. Average number of taps for five trials per hand are reported. The normative data, which are not reproduced here, are presented in comprehensive tables in T-score equivalents for scaled scores for males and females separately in 10 age groupings (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80 years) by six education groupings (6-8, 9-11, 12, 13-15, 16-17, ≥18 years). For dominant hand performance, 19% of score variance was accounted for by gender, while 9% was attributable to age and 6% to
educational level. A total of 32% of test score variance was accounted for by demographic variables. A similar effect was described for nondominant hand performance, where 20% of score variance was accounted for by gender, while 9% was attributable to age and 6% to educational level. A total of 34% of test score variance was accounted for by demographic variables. For the sample as a whole, mean number of taps for the dominant hand was 49.9 (7.9) and for the nondominant hand, 45.2 (7.3). The interested reader is referred to the Fastenau and Adams (1996) critique of the Heaton et al. (1991) norms and Heaton et al.'s (1996a) response to this critique. In 2004, the authors published the revised norms, which are based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African American and Caucasian).
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria. 3. Detailed description of demographic characteristics in terms of age, education, IQ, geographic area, and gender. 4. Administration procedures were outlined. 5. Normative data are presented in comprehensive tables in T-score equivalents for males and females separately in 10 age groupings by six education groupings.
Consideration regarding use of the study 1. No information regarding how hand preference was determined. [FT.36] Ruff and Parker, 1993 (Tables A21.39, A21.40)
The FTT was administered as part of a comprehensive test battery to 358 normal volunteers recruited in California, Michigan, and the eastern seaboard aged 16-70 years with 7-22 years of education. Participants were screened for psychiatric hospitalizations, chronic polydrug abuse, or neurological disorders.
The score was the mean number of taps over five 10-second trials with alternating hands, starting with the dominant hand. If the criterion was not met, up to two additional trials were given per hand and the highest/lowest scores eliminated from computation of the mean score. Data are stratified by gender × age. Data for a left hand-dominant sample are also reported. The authors reported test-retest reliability for a 6-month interval based on data for five or more participants from each of the 12 demographic cells (30% of sample). Reliability coefficients for women, men, and the total sample were 0.63, 0.70, and 0.70 for the dominant hand and 0.68, 0.75, and 0.76 for the nondominant hand, respectively. Effect of age and gender on motor speed was specifically addressed. The authors explored the ratio of dominant/nondominant hand performance rate.
Study strengths 1. Sample composition is identified in terms of age, education, gender, and geographic area. 2. Assessment of handedness is well described. 3. Test administration procedure is thoroughly described. 4. Data are stratified according to gender and age. 5. Data for a left hand-dominant sample are reported. 6. Means and SDs for the test scores are reported. 7. Adequate exclusion criteria. 8. Sample sizes for each demographic cell are quite large. 9. Good exclusion criteria.
Consideration regarding use of the study 1. Educational levels for each demographic cell are not reported. [FT.37] Russell and Starkey, 1993 (Table A21.41)
This study describes the standardization sample used by the authors in their manual introducing the Halstead Russell Neuropsychological Evaluation System (HRNES) and addressing its psychometric properties. The normative sample consisted of veterans treated at the Cincinnati V.A. Hospital between 1968 and 1971 and the Miami V.A. Medical Center between 1971 and 1989. All participants received neurological examinations. Those who were administered the Halstead tests and the WAIS or WAIS-R were included in the study. Nine percent of the sample were representatives of minority groups. The total sample was divided into a comparison group and a brain-damaged group. The comparison group included "normal" individuals, all males. No subject in this group had a diagnosis of CNS pathology. Presenting symptoms for the majority of these participants were neurosis with memory or somatic complaints or personality disorders with episodes of explosive behavior. Patients diagnosed with schizophrenia or severe depression requiring hospitalization, as well as those with evidence of systemic vascular disease, were not included in the sample. Test scores can be corrected for age and IQ and converted into scaled scores to facilitate comparison with other tests. Statistics are reported for four groups of patients: comparison, left hemisphere damage, right hemisphere damage, and diffuse brain damage. Data only for the comparison group, stratified by gender, are reproduced in this chapter. The authors published an appendix to the manual (HRNES-R; Russell & Starkey, 2001), which contains tables of scaled scores based on the original HRNES norms, demographic corrections, and regression-based predicted scores.
Study strengths 1. Sample composition is identified in terms of age, education, gender, ethnicity, and geographic area. 2. Control sample size for males is large. 3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Classification of participants in the comparison group as normal is questionable since they were suspected of having neurological conditions and referred for neurological evaluation, which yielded negative results. 3. Undifferentiated age range. [FT.38] Dikmen et al., 1999 (Table A21.42)
The FTT was used in a study on the psychometric properties of a broad range of neuropsychological measures, based on a sample of 384 normal or neurologically stable adults who were tested twice as part of several longitudinal studies. A group of friend controls consisted of 138 individuals who had no history of recent trauma and were friends of head-injured patients. Their mean age was 28.5 (12.2) years, and mean education was 12.2 (1.9) years; 60% of the sample were males, and the test-retest interval was 11.1 (0.6) months. A group of trauma controls consisted of 121 individuals who had a recent traumatic injury that did not involve the head. They were tested at baseline 1 month after the trauma and then 11 months later. Their mean age was 31.2 (13.6) years, and mean education was 12.0 (2.6) years; 70% of the sample were males, and the test-retest interval was 10.7 (0.6) months. Both of these groups were tested at the University of Washington under the direction of one of the authors. Twenty percent of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse or a significant traumatic brain injury. The rest of the participants in these samples denied any history of conditions that might affect brain function. The third group, mixed normal controls, consisted of 125 participants who had no history of trauma or disease involving the brain. They were enrolled in longitudinal research projects at multiple sites under the supervision of the neuropsychology laboratories at the University of Colorado and the University of California at San Diego. Their mean age was 43.6 (19.6) years, and mean education was 12.0 (3.3) years; 68% of the sample were males, and the test-retest interval was 5.4 (2.5) months. Data are reported for all groups
combined. Demographic information for all groups combined is also provided. The mean WAIS FSIQ (Wechsler, 1955) on the initial testing for the three groups combined was 108.8 (12.3). The FTT for the dominant and nondominant hands was administered according to the procedures specified by Reitan and Wolfson (1993). Scores represent average number of taps for five trials within 5 points of each other. The authors provide raw scores for performance at two time probes, as well as various measures of test-retest reliability and magnitude of practice effect. Test-retest reliabilities for the FTT were r = 0.77 for the dominant hand and r = 0.78 for the nondominant hand. Study strengths 1. Large sample sizes for the three groups. 2. Sample composition is well described in terms of age, education, gender, IQ, geographic area, and setting. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Information on test-retest reliability is provided.
Considerations regarding use of the study 1. Exclusion criteria are not clearly described. As the authors pointed out, 20% of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. 2. Data are not partitioned by age group. 3. No information regarding how hand preference was determined. [FT.39] McCurry, Gibbons, Uomoto, Thompson, Graves, Edland, Bowen, McCormick, and Larson, 2001 (Table A21.43) The FTT was administered as part of the battery used in a prospective study examining the effects of age and other demographic factors on cognitive test performance in a sample of older Japanese American adults. The sample consisted of 201 nondemented, community- and institution-dwelling elderly
≥70 years of age, who were of at least 50% Japanese heritage and enrolled in the Kame Project, a study of aging and dementia in Seattle-King County, Washington. A two-stage stratified design was used for follow-up sampling. All participants underwent additional interviews with proxy informants and a physical and neurological examination. The test battery was administered by a trained psychometrician and interpreted by a geriatric psychologist. Participants who were judged to be nondemented, based on the consensus of a multispecialty team of clinicians, were included in the study. Participants were aged 70-101 years, with a mean of 79.6 (7.2) years, and had 5-20 years of education, with a mean of 11.0 (3.0) years; 56.2% of the sample were female; 47.9% were Japanese-speaking or spoke mixed English/Japanese; 72.9% were born in the United States; and 94% were right-handed. Assessment was conducted in either English or Japanese, based on the participants' primary language and speaking preference. Test administration instructions and materials were originally translated by bilingual interviewers, then back-translated into English by two professional translators for content comparisons. An average score for five 10-second FTT trials was computed for each hand. To adjust for the stratified sampling design, each individual's test score had to be weighted by the inverse of the sampling fraction for that stratum (for further procedure, see the original article). Weighted means, SDs, medians, and 25th and 75th percentile scores are presented, stratified into two age groups: 70-79 and ≥80. Only means, SDs, and demographic information are reproduced in this chapter. The authors found that age significantly affected rate of finger tapping, whereas education did not influence FTT performance.
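The weighting by the inverse of the sampling fraction mentioned above can be illustrated with a short sketch. This is not the authors' procedure (the original article gives the full method); the function name, the use of a population-style weighted variance, and the example strata and tap counts are all assumptions made for illustration.

```python
import math

def weighted_mean_sd(scores, sampling_fractions):
    """Weighted mean and SD; each score is weighted by 1 / (sampling fraction of its stratum).

    Assumption: the SD is the square root of the weighted (population-style) variance.
    """
    weights = [1.0 / f for f in sampling_fractions]
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, scores)) / total
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, scores)) / total
    return mean, math.sqrt(variance)

# Hypothetical data: two participants from a stratum sampled at 50%
# and one participant from a stratum sampled at 25%.
scores = [40.0, 44.0, 36.0]
fractions = [0.5, 0.5, 0.25]
mean, sd = weighted_mean_sd(scores, fractions)
```

In this toy example the participant from the sparsely sampled stratum counts twice as much as each of the others, which is the sense in which the reported statistics are "weighted."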
Study strengths 1. Large sample. 2. Sample composition is well described in terms of age, education, gender, preferred language, handedness, setting, and geographic area.
3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures for assessment of hand dominance are not described. 2. Weighted statistics are reported. 3. Data are collapsed across genders. Other comments 1. Data were collected on Japanese-American participants. [FT.40] Sackellares and Sackellares, 2001 (Table A21.44) The authors compared manual motor speed in 40 patients with psychogenic pseudoseizures and matched controls. The control group included 40 healthy adults (28 right-handed, 12 left-handed) 18-50 years of age, with a mean of 33.2 years. Participants had no history of neurological or psychiatric disorder. The dominant hand was defined as the preferred writing hand. The test was administered by a trained professional. The Asymmetry Index was calculated as the ratio of dominant minus nondominant hand to dominant hand performance multiplied by 100. Performance rates for both hands and asymmetry indices are provided for right-handed and left-handed samples.
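The Asymmetry Index as described in the text can be written out as a one-line computation. The sketch below follows the verbal definition only; the function name and the example tapping rates are hypothetical.

```python
def asymmetry_index(dominant, nondominant):
    """(dominant - nondominant) / dominant * 100, following the verbal definition in the text."""
    if dominant <= 0:
        raise ValueError("dominant-hand score must be positive")
    return (dominant - nondominant) / dominant * 100.0

# Hypothetical mean tapping rates (taps per 10-second trial).
example = asymmetry_index(50.0, 45.0)
reversed_example = asymmetry_index(45.0, 50.0)  # nondominant hand faster gives a negative index
```

A positive value indicates a dominant-hand advantage; a negative value indicates a reversal.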
Study strengths 1. Adequate sample size. 2. The sample composition is described in terms of age and clinical setting. 3. Minimally adequate exclusion criteria. 4. Method for determining handedness is specified. 5. Means and SDs for the test scores are reported for right-handed and left-handed samples separately.
Considerations regarding use of the study 1. Wide age range; data are not partitioned by age group.
2. Sample is cursorily described; no information on education or gender. 3. Data are collapsed across genders. [FT.41] Prigatano and Borgaro, 2003 (Table A21.45)
The authors investigated "normal" and "abnormal" finger tapping patterns in traumatic brain injury patients and normal controls. The control group included 15 participants from the general population, who were friends of either the patients' families or the experimenters. They had no reported history of psychiatric or neurological disease. All participants were interviewed and administered the Barrow Neurological Institute Screen for Higher Cerebral Functions. The test was not administered to any subject with a peripheral or orthopedic injury to the hand or arm. The FTT was administered by trained professionals. Three test trials with the dominant hand were followed by three trials with the nondominant hand. Additional consecutive trials with each hand followed, until five trials were obtained in which the mean numbers of taps were within 5 points of each other. In addition to comparing finger tapping rates for traumatic brain injury and control participants, the authors provided qualitative interpretation of FTT performance patterns. Rates of tapping for the control group are presented in Table A21.45.
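The "five trials within 5 points of each other" scoring criterion can be sketched as a small function. This is an illustration under stated assumptions, not the administration manual's algorithm: it assumes the five trials must be consecutive, that a spread of exactly 5 taps qualifies, and that no score is returned when the criterion is never met. The function name and the trial counts are hypothetical.

```python
def ftt_score(trials, n_required=5, max_spread=5):
    """Mean of the first run of n_required consecutive trials whose range is <= max_spread.

    Returns None if no such run exists (assumption: no fallback rule is applied).
    """
    for start in range(len(trials) - n_required + 1):
        window = trials[start:start + n_required]
        if max(window) - min(window) <= max_spread:
            return sum(window) / n_required
    return None

# Hypothetical tap counts for one hand, in order of administration.
taps = [42, 50, 51, 49, 52, 48, 50]
score = ftt_score(taps)
```

Here the first trial is slow enough to break the criterion, so the score is taken from trials 2 through 6.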
Study strengths 1. The sample composition is described in
terms of age, education, gender, setting, and recruitment procedures. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small sample size. 2. Procedures for assessment of hand dominance are not described. 3. Data are collapsed across genders.
RESULTS OF THE META-ANALYSES OF THE FINGER TAPPING TEST DATA (See Appendix 21m) Data collected from the studies reviewed in this chapter were combined in regression analyses, to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult age groups are provided in Appendix 21m. Only studies that stratify results by gender were used in the prediction analyses. After initial data editing for consistency and for outlying scores, the following data were included in the analyses: eight studies, which generated 20 data points, based on a total of 963 participants for males, dominant hand; seven studies, which generated 19 data points, based on a total of 933 participants for males, nondominant hand; four studies, which generated 10 data points, based on a total of 560 participants for females, dominant hand; and three studies, which generated nine data points, based on a total of 530 participants for females, nondominant hand. It should be pointed out that the integrity of the results is undermined by the lack of consistency in data reporting. A majority of studies report data for the "dominant hand" and "nondominant hand," while some report for the "right hand" and "left hand." Some of the latter studies include left hand-dominant participants. Though the percent of left-handers is typically small (1%-7.5%), their inclusion confounds the outcome. Also, determination of the dominant hand was based on a wide range of criteria, ranging from comprehensive questionnaires to self-report of the writing hand. Quadratic regressions of FTT scores on age were used for both male and female data for the dominant hand, and linear regressions for the nondominant hand. R² ranged 0.622-0.937.
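To make the distinction between the quadratic and linear models concrete, the sketch below fits both to a handful of age-group means by ordinary least squares and computes R². It is a generic, unweighted illustration and may differ from the procedures described in Chapter 3; the function names are hypothetical, the data points are invented, and age is centered and scaled only to keep the arithmetic well conditioned.

```python
def polyfit_ls(x, y, degree):
    """Unweighted least-squares polynomial fit via the normal equations.

    Returns coefficients [b0, b1, ..., b_degree], lowest order first.
    """
    n = degree + 1
    # Normal equations A b = c with A[i][j] = sum(x^(i+j)) and c[i] = sum(y * x^i).
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    c = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            c[row] -= factor * c[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (c[i] - s) / a[i][i]
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(b * x ** i for i, b in enumerate(coeffs))

def r_squared(x, y, coeffs):
    """Proportion of variance in y accounted for by the fitted polynomial."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(coeffs, xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Invented age-group midpoints and mean tapping scores (not data from this chapter).
ages = [25, 35, 45, 55, 65]
scores = [54.0, 53.0, 51.5, 49.0, 46.0]
z = [(age - 45) / 10 for age in ages]  # centered age in decades

quadratic = polyfit_ls(z, scores, 2)
linear = polyfit_ls(z, scores, 1)
r2_quadratic = r_squared(z, scores, quadratic)
r2_linear = r_squared(z, scores, linear)
```

Because the linear model is nested within the quadratic one, the quadratic R² can never be lower on the same data; whether the extra term is warranted is a separate question that R² alone does not answer.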
Based on the derived models, we estimated FTT scores for age intervals between
20 and 74 years. If predicted scores are needed for age ranges outside the reported boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. Regressions of SDs on age yielded R² ranging 0.663-0.800, indicating increase in variability with advancing age, consistent with the literature. Predicted SDs, based on these models, are reported. Only a few studies reported data on educational level. Therefore, the effect of education on test performance was not examined. Strengths of the analyses 1. Total sample sizes range 530-963 participants. 2. R² of 0.937 for females, dominant hand, indicates a good model fit. However, the number of data points in this analysis is only 10, which might result in somewhat inflated R² values. 3. Postestimation tests for parameter specifications did not indicate problems with normality. There were no problems with homoscedasticity for female data. 4. Differences in mean predicted scores for the dominant vs. nondominant hands are 4.85 (51.61 vs. 46.76) for males and 3.73 (47.47 vs. 43.74) for females, which are consistent with the guideline of a 10% preferred-hand superiority (particularly for males, somewhat smaller for females). It should be noted in the context of this comparison that data only for the dominant hand for a group of 19-year-olds were reported in one study. This did not considerably affect the mean age for the aggregate sample of males. However, the mean weighted age for the aggregate sample of females used in the data analysis for the nondominant hand increased by approximately 3.5 years because of a small number of data points for this group available for analysis. 5. Differences between males and females in mean predicted scores are 4.14 (51.61 vs. 47.47) for the dominant hand and 3.02 (46.76 vs. 43.74) for the nondominant hand, which are consistent
with the effect of gender on test performance described in the literature, with males expected to outperform females by three to five taps. It should be noted in respect to this comparison for the dominant hand, that average age for males is about 5 years greater than for females. Limitations of the analyses 1. R² of 0.686 and 0.622 for the dominant and nondominant hands for males and of 0.779 for females, nondominant hand, are acceptable. However, these values indicate that only 62%-78% of variance in FTT scores is accounted for by the models. 2. Number of data points for females is small. 3. Postestimation tests for parameter specifications indicated marginally acceptable homoscedasticity for males. Variability in scores across age groups is greater than expected by chance, with a considerable increase in variability in the older age groups, as reflected in the size of the confidence intervals. Therefore, the predicted scores for the older age ranges are less accurate than for the younger ranges.
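The dominant versus nondominant differences quoted among the strengths of the analyses can be checked against the 10% preferred-hand guideline with simple arithmetic. The sketch below uses only the four mean predicted scores given in the text; the dictionary layout and function name are illustrative.

```python
# Mean predicted scores quoted in the text (taps per 10-second trial).
predicted = {
    ("male", "dominant"): 51.61,
    ("male", "nondominant"): 46.76,
    ("female", "dominant"): 47.47,
    ("female", "nondominant"): 43.74,
}

def hand_difference(sex):
    """Return (raw difference, difference as a percentage of the dominant-hand score)."""
    dom = predicted[(sex, "dominant")]
    non = predicted[(sex, "nondominant")]
    diff = dom - non
    return round(diff, 2), round(100.0 * diff / dom, 1)

male_diff, male_pct = hand_difference("male")        # 4.85 taps, about 9.4%
female_diff, female_pct = hand_difference("female")  # 3.73 taps, about 7.9%
```

Expressed as a percentage of the dominant-hand score, the advantage is close to 10% for males and somewhat smaller for females, in line with the statement in the text.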
CONCLUSIONS The large number of studies focusing on the psychometric properties of the FTT reflects its wide clinical use. In fact, a survey of neuropsychologists identified the FTT as one of the two neuropsychological tests (along with the Category Test) most frequently used in the assessment of adults (Sellers & Nadler, 1992). A review of the FTT research suggests considerable consistency in the data across different studies. A decreasing rate of tapping as a function of advancing age and lower levels of education is well demonstrated. Gender differences, with males outperforming females, are also unequivocal. In addition, the literature review suggests that test performance is highly affected by disruption of sensory or motor tracts and by peripheral damage to the upper extremities. Because FTT performance is
affected by many factors, the interpretation of tapping speed as indicative of cortical dysfunction should be made with great caution. Following recommendations by Bornstein (1986c), the accuracy of FTT interpretive conclusions must be confirmed by findings from other motor and nonmotor tasks. The importance of clinical judgment, rather than "rule of thumb," is especially apparent in view of the controversy surrounding cutoffs for brain impairment. Unacceptably high false-positive rates with use of the original Halstead cutoffs warrant further research directed at the formulation of revised cutoff scores, assuring an optimal balance of sensitivity and specificity, which would differ for the various demographic groups. Similarly, the issue of intermanual differences remains highly disputed. A 10% dominant hand superiority criterion is clearly consistent with the average performance across the studies presented above. However, a wide range of individual differences documented in numerous studies (including high rates of reversal in intermanual difference) suggests that great
caution should be taken in the interpretation of dominant-nondominant hand comparisons. Despite the large number of empirical studies exploring the psychometric properties of the FTT accumulated to date, some aspects of FTT performance are not sufficiently addressed. For example, normative data for older age groups are scarce. Test-retest concordance should be further explored, to assess the magnitude of the practice effect and to address the issue of test reliability over different interprobe intervals. Similarly, very few studies provide norms for left-handers. Investigation of left-handed groups is a challenging task due to the great variability in cerebral dominance among left-handed individuals, which obscures lateralization assumptions in the interpretation of test results. Since FTT interpretation is based on lateralization assumptions, it is of the utmost importance to report the criteria for assessment of handedness, cutoff scores for subject selection on the basis of handedness pattern, and the number of left-handed individuals in the sample, if they are included.
22 Grip Strength Test (Hand Dynamometer)
BRIEF HISTORY OF THE TEST The Smedley Hand Dynamometer or Grip Strength Test is a part of the Lateral Dominance Examination added by Reitan to the Halstead battery. It is a measure of pure motor ability (Russell & Starkey, 1993). According to Spreen and Strauss (1998), this test measures strength or intensity of voluntary grip movements of each hand. The dynamometer is available from Lafayette Instruments, Psychological Assessment Resources, and the Reitan Neuropsychological Laboratory (see Appendix 1 for ordering instructions). There are several variations in administration and scoring of the test that should be taken into consideration when interpreting the norms. The majority of authors refer to the standard description of the procedures specified by Reitan and Wolfson (1985). The most common procedure is as follows. Parti
dynamometer, allow three trials with each hand, alternating right and left hands, with a 10-second rest between trials. Only the highest record for each hand is used in subsequent computations. Additional information on the administration of this test is provided in Lezak (1995), Lezak et al. (2004), and Spreen and Strauss (1998). Grip strength is most commonly reported in kilograms, averaged across all trials for each hand. Some studies, however, provide data allowing conversion of raw scores into other units that facilitate comparison between different tests. For example, a normative system for the expanded Halstead-Reitan Battery (HRB) developed by Heaton et al. (1991, 2004) converts raw scores into scaled score equivalents, which can be further converted into T scores adjusted for age, education, and gender. Performance on the Hand Dynamometer Test reflects the integrity of the motor strip (Swiercinsky, 1978). Sensitivity of this test to brain dysfunction has been demonstrated in many clinical comparison studies (Bornstein, 1986c; Dodrill, 1978b; Strauss & Wada, 1988). The Hand Dynamometer allows comparison of grip strength between both hands and therefore is sensitive to a lateralized lesion in the hemisphere contralateral to the hand demonstrating deviant performance. Generally, the preferred hand is expected to be 10% stronger than the nonpreferred hand (Reitan & Wolfson, 1985),
with intermanual differences in excess of 20% being suggestive of brain impairment (Golden, 1978). Use of this criterion in normative studies, however, yielded unacceptably high rates of false-positive misclassification, which was especially true for left-handed individuals (Bornstein, 1986c; Koffler & Zehler, 1985; Thompson et al., 1987). The large number of misclassifications is due to a high rate of variability in intermanual differences reported in the above studies, which obscures the interpretive accuracy of the results. Bornstein (1986c) suggests that in the evaluation of left hemisphere lesions, interpretive consistency of performance on motor tasks should be supported by nonmotor tasks and by additional instruments measuring motor performance. In this study, the author evaluated the pattern of performance on three motor tests (Finger Tapping Test, Grooved Pegboard Test, and Hand Dynamometer), which were administered to normal and unilateral brain lesion samples. Interestingly, a large degree of variability was observed across these intermanual measures, whereby "a high percentage (approximately 25 percent) of the normal sample obtained scores more than one standard deviation from the control mean on a single measure" (p. 719). Thus, the author has emphasized the importance of consistency in performance pattern across tasks, rather than use of a "rigid application of 'cookbook' formulas or 'rules of thumb' " (p. 723) in the test interpretation. High test-retest reliability of the Hand Dynamometer is well documented, with coefficients ranging from 0.79 to 0.94 across different studies (see Lezak et al., 2004). Data on repeated administration are also presented by McCaffrey et al. (2000). For further information on the psychometric properties of the Hand Dynamometer, see Lezak et al. (2004) and Spreen and Strauss (1998).
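The two scoring conventions mentioned in this section (the highest reading for each hand versus the average across all trials) give different values from the same readings, which matters when a score is compared against norms built on the other convention. The following is a minimal sketch of that difference only; the function name `hand_score` and the trial readings are hypothetical and do not come from any of the cited sources.

```python
from statistics import mean


def hand_score(trials_kg, method="max"):
    """Summarize one hand's dynamometer readings (kilograms).

    method="max"  -> highest reading across trials
    method="mean" -> average across all trials
    Both names are illustrative labels, not terms from a test manual.
    """
    if not trials_kg:
        raise ValueError("at least one trial is required")
    if method == "max":
        return max(trials_kg)
    if method == "mean":
        return mean(trials_kg)
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical readings for one participant's dominant hand
dominant_trials = [42.0, 45.0, 43.5]
print(hand_score(dominant_trials, "max"))
print(hand_score(dominant_trials, "mean"))
```

Because the highest reading can never be lower than the average, a score computed one way should be compared only with norms computed the same way.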
RELATIONSHIP BETWEEN HAND DYNAMOMETER PERFORMANCE AND DEMOGRAPHIC FACTORS Performance on the Hand Dynamometer varies as a function of several demographic variables. The effect of age on test performance is reported by Anstey and Smith (1999), Bornstein (1986a), Christensen et al. (2001), Fromm-Auch and Yeudall (1983), Heaton et al. (1996b), Koffler and Zehler (1985), and Yeudall et al. (1987), with equivocal findings regarding the timing of the onset of decline in grip strength (after age 40 vs. 60). The effect of education on test performance is questionable: Bornstein (1985) found a considerable effect, whereas Ernst (1988) and Heaton et al. (1991) reported negative findings. Spreen and Strauss (1991) relate Hand Dynamometer performance to participants' height and weight, among other variables. Performance on this test is affected by gender more than performance on any other motor test (Heaton et al., 1991, 2004). Superiority of males in test performance is documented by Fromm-Auch and Yeudall (1983), Koffler and Zehler (1985), Morehouse et al. (2000), Peynircioglu et al. (2000), and Yeudall et al. (1987). Dodrill (1979) related the gender difference on tests that have a strong motor component to hand size. The effect of gender on intermanual differences is questionable; Borod et al. (1984), Ernst (1988), Lewandowski et al. (1982), Morehouse et al. (2000), and Thompson et al. (1987) reported negative findings, whereas Bornstein (1986d) found that males had larger intermanual differences. Moffoot et al. (1994) relate grip strength to affective state, with lower strength in patients suffering from major depression with melancholia. Furthermore, Burton et al. (2002) underscore high intraindividual variability in grip strength as a function of physical and emotional condition, especially negative affect, which is more pronounced in individuals with traumatic brain injuries.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the Hand Dynamometer data, eight key criterion variables were deemed critical. The first six of these relate to subject variables, and the remaining two refer to procedural issues.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, or recruitment procedures are relevant. Until determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is especially relevant for this test since an effect of age on grip strength has been unequivocally demonstrated in the literature.
Reporting of Educational Levels
Given the possible association between educational level and grip strength, information regarding educational level should be reported for each subgroup.
Reporting by Gender
A strong relationship between gender and grip strength has been unequivocally demonstrated in the literature. Therefore, it is imperative that normative data are reported for males and females separately.
Description of Hand Preference Assessment
To address the issue of lateralization in test performance, assessment procedures for hand preference should be fully described. Without this assessment, assumptions regarding functional lateralization cannot be made.
Procedural Variables
Description of Administration Procedures
Administration procedures for the Hand Dynamometer differ among studies. A detailed description allows selection of the most appropriate norms or corrections to account for deviations in administration procedures.
Data Reporting
Group means and standard deviations for grip strength measured in kilograms averaged over all trials, for the dominant and nondominant hands, should be presented at minimum.
SUMMARY OF THE STATUS OF THE NORMS The information presented in studies reporting data for the Hand Dynamometer differs considerably. Some of these differences will be summarized below. In addition to normative studies based on "normal" samples, there are a number of clinical comparison studies that explore differences in test performance between clinical groups and "normal control" groups (which are sometimes matched on demographic characteristics). "Normal control" groups are frequently comprised of medical or psychiatric patients. These samples cannot be considered truly "normal" due to possible effects of their illnesses and medication intake on test performance. Administration and scoring procedures vary among studies. The number of trials with each hand varies between two and four, with the majority of studies using two alternating-hands trials. The majority of the authors report data in kilograms, averaged across all trials for each hand; however, some studies present data in T scores, report strength at the best attempt, or provide scores for the dominant hand only. Starting hand varies, although the majority of studies start with the dominant hand. Such deviations from the standard method of data reporting are identified in our review of the normative data in each pertinent table. In addition to providing normative data for each hand, several studies report the proportion of participants falling in the impaired range or rates of intermanual differences. Several authors stratify their samples by age, education, and/or gender. Procedures for assessment of handedness are thoroughly
described in some studies. Furthermore, some authors divide their samples into groups based on handedness pattern. The majority of studies recruited mostly young and middle-aged participants. Only a few studies present data for elderly individuals. Several studies provide test-retest data over varying interprobe intervals ranging from 14 weeks to 6 months. Among all the studies available in the literature, we selected for review those based on well-defined samples or that offer some information not routinely reported. In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 22. Table A22.1, the locator table, summarizes information provided in the studies described in this chapter. 1
SUMMARIES OF THE STUDIES
[D.1] Matarazzo, Wiens, Matarazzo, and Goldstein, 1974 (Table A22.2)
Participants were 29 normal young men who met strict selection criteria for the Portland Police Department. Participants were aged 21-28 years, educational level ranged from 12 to 16 years, and mean full-scale IQ was 118. Participants were retested 14-24 weeks later, with a median of 20 weeks. Study strengths 1. Sample composition is described in terms of age, gender, education, IQ, and geographic area. 2. Administration procedure is outlined. 3. Data on test-retest are presented. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are provided. Considerations regarding use of the study 1. Procedures for assessment of hand dominance are not described. 2. Sample size is relatively small. 3. All-male sample.
¹ Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
[D.2] Wiens and Matarazzo, 1977 (Table A22.3)
The authors collected data on 48 male applicants to a patrolman program in Portland, Oregon, as part of an investigation of the WAIS and MMPI correlates of the Halstead-Reitan Battery. All participants passed a medical exam and were judged to be neurologically normal. Participants were divided into two equal groups, which were comparable in age, education, and WAIS FSIQ. Group 1 ranged in age 21-27 years and group 2, 21-28 years. Correlations for the two groups between WAIS FSIQ and dynamometer scores were -0.38 and 0.03 for the preferred hand and -0.13 and -0.36 for the nonpreferred hand, respectively. The authors inferred that for the top half of the population in education and IQ, individual differences in scores on the WAIS do not influence performance on the test. Study strengths 1. Demographic characteristics of the sample are presented in terms of gender, age, education, IQ, recruitment procedures, and geographic area. 2. Adequate medical exclusion criteria. 3. Means and SDs for the test scores are reported. 4. Data are provided in a restricted age range. Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. High IQ level. 3. Relatively small sample size. 4. All-male sample.
[D.3] Dodrill, 1978b (Table A22.4)
Performance on motor tests was compared for a control and three brain-damaged groups. The 25 control participants were Caucasian, right-handed adults over 15 years of age, recruited from community resources in Washington, who had no history of injury or disease
that involved the central nervous system (CNS). The Smedley Hand Dynamometer was used. Two trials were given in alternating fashion for each hand, beginning with the right hand. The average of the two trials was used as the final score for each hand. The author concluded that the dynamometer correctly identified lateralization of brain lesions in more instances than other motor tests. Study strengths 1. Sample composition is described in terms of age, education, gender, handedness, and geographic area. 2. Administration procedures are well described. 3. Minimally adequate exclusion criteria. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Small sample size. 3. Data are collapsed across genders. 4. Undifferentiated age range.
[D.4] Dodrill, 1979 (Table A22.5)
The study explored gender differences on various neuropsychological measures. The control group included 47 matched pairs of nonneurological males and females recruited in Washington. Within each pair, participants were matched for age (±5 years) and education (±2 years). All participants were Caucasian and older than 16 years. In addition, groups were matched for Hollingshead's two-factor index of social position. It is assumed, but not stated by the author, that the original Halstead procedure was followed, and the score for the dominant hand was reported. The author reported considerable gender differences on the tests that have very strong motor components, which he related to hand size. Study strengths 1. Sample composition is described in terms of age, education, socioeconomic status, gender, ethnicity, and geographic area. 2. Data are presented for males and females separately. 3. Sample sizes are adequate. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Administration procedures are not clearly described. 2. Samples were not divided into age groups. 3. Procedures for assessment of hand dominance are not described. 4. No apparent exclusion criteria.
[D.5] Rounsaville, Jones, Novelly, and Kleber, 1982 (Table A22.6)
The study compared performance of opiate addicts, epilepsy patients, and a control group. A sample of 29 Comprehensive Employment Training Act (CETA) participants was used for a normal comparison group. Participants with a history of drug or alcohol abuse or of a neurological disorder were excluded. Participants were screened for alcohol and illicit psychoactive substances, and urine specimens were taken at the time of testing. The Stoelting Dynamometer was used; no other test administration information is provided. Study strengths 1. Control participants are described in terms of gender, age, education, and percent right-handed. 2. Adequate exclusion criteria. Considerations regarding use of the study 1. SDs were not provided. 2. Testing procedure is scarcely described. 3. Procedures for assessment of hand dominance are not described. 4. Age range is not provided. 5. Sample size is small. 6. Data are collapsed across genders.
[D.6] Yeudall, Fromm-Auch, and Davies, 1982 (Table A22.7)
The study compares performance on the HRB for delinquent and nondelinquent adolescents
in Canada. The delinquent group included adolescents admitted to the primary residential treatment resource for persistent delinquents with severe behavioral disturbances. The nondelinquent group included adolescents from regular classrooms. Handedness was measured by the Annett (1970) Handedness Questionnaire; 88% of the delinquent sample and 83% of the control sample were right-handed. Study strengths 1. Samples are described in terms of age, gender, IQ, handedness, and geographic area. 2. Procedure for assessment of handedness is identified. 3. Means and SDs for the test scores are reported. 4. Sample sizes are relatively large. Considerations regarding use of the study 1. Data were collapsed across genders. 2. No apparent exclusion criteria. Other comments 1. Data were collected in Canada.
[D.7] Prigatano, Parsons, Levin, Wright, and Hawryluk, 1983 (Table A22.8)
The study, conducted in Oklahoma and Canada, explores neuropsychological functioning in mildly hypoxemic patients with chronic obstructive pulmonary disease (COPD). The 25 control participants, free of physical or emotional illnesses, were matched to patients on age, education, handedness, gender ratio, and social class rating. Data only for the control group are reproduced in this chapter. Study strengths 1. Control sample was described in terms of age, education, gender, IQ, handedness, and geographic area. 2. Minimally adequate exclusion criteria. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures for assessment of hand dominance are not described. 2. Data are presented for the dominant hand only. 3. Small sample size. 4. Data were collapsed across genders. 5. Undifferentiated age range.
[D.8] Fromm-Auch and Yeudall, 1983 (Table A22.9)
The authors obtained data on 193 Canadian participants (111 male, 82 female) recruited through posted advertisements and personal contacts. Mean education was 14.8 (3.0) years, and mean FSIQ was 119.1 (8.8). Participants are described as "nonpsychiatric" and "nonneurological." Handedness was determined by the writing hand; 83.4% of the sample were right-handed. Strength of handedness was determined by the Annett (1970) Handedness Questionnaire. Means and SDs for each hand are reported for the entire sample and for five age groups stratified by gender. The authors concluded that a pronounced effect of gender was seen on all motor tests, with females appearing weaker and slower than males. The relationship between age and performance appears to be curvilinear for both genders, with peak performance in the 33-40 year range. Study strengths 1. Sample composition is described in terms of age, gender, IQ, education, geographic area, and recruitment procedures. 2. Some psychiatric and neurological exclusion criteria were used. 3. The overall sample size is large. 4. Handedness was established. 5. Data are partitioned into five age groups and gender groups. 6. Means, SDs, and ranges for the test scores are reported. Considerations regarding use of the study 1. Sample sizes for some age groups are very small. 2. High intellectual and educational level of the sample. Other comments 1. Provides a summary of previously published normative data. 2. Calculation of the educational level included technical or vocational training. 3. Data were collected in Alberta, Canada.
[D.9] Bornstein, 1985 (Tables A22.10, A22.11)
The author collected data on 365 Canadian individuals (178 males, 187 females) recruited through posted notices on college campuses and unemployment offices, newspaper ads, and senior-citizen groups, who were paid for their participation. Participants were aged 18-69, with a mean of 43.3 (17.1) years, and had completed 5-20 years of education, with a mean of 12.3 (2.7) years; 91.5% of the sample were right-handed. No other demographic data or exclusion criteria are reported. Means and SDs are reported for each hand. The sample is stratified by age group (20-39, 40-59, 60-69), level of education, and gender. Study strengths 2. Data are stratified by age, gender, and educational level. 3. Sample is unique in that it includes participants with less than a high school education. 4. Information on recruitment procedures and geographic area is provided. 5. Method for determining handedness is specified. 6. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Individual sample sizes of some cells are small. It is unclear whether the youngest age included in the study was 18 or 20. 2. No reported exclusion criteria. Other comments 1. Data were collected in Canada.
[D.10] Koffler and Zehler, 1985 (Table A22.12)
In this study, 206 normal (by self-report) adults (100 males, 106 females) aged 20-77 were administered the Hand Dynamometer to obtain normative data. The Stoelting Dynamometer was used, and Reitan's procedure of using the highest reading for each hand was followed. After adjustment for hand size, two alternating trials were given, beginning with the dominant hand. Determination of hand dominance was based on self-report; 87% of participants were right-handed, 9.7% were left-handed, and 3.3% reported mixed dominance. The data were stratified by age and gender. The authors concluded that greater strength of grip is demonstrated by males at all ages. They cautioned that the use of commonly accepted criteria for detection of lateralized motor dysfunction leads to a large number of false-positive errors. Study strengths 1. Administration procedure is well described. 2. Data are presented by age and gender. 3. Method for determining handedness is specified. 4. Overall sample size is adequate, though sizes of individual cells are small. 5. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Demographic characteristics of the sample are only cursorily described. 2. Medical exclusion criteria are unclear.
[D.11] Heaton, Nelson, Thompson, Burks, and Franklin, 1985 (Table A22.13)
The authors compared performance of multiple sclerosis patients and normal control participants recruited in Colorado. The control group included 100 participants with no history of neurological illness, significant head trauma, or substance abuse.
Study strengths 1. Control sample size is large. 2. Information regarding age, education, gender, and geographic area is reported. 3. Exclusion criteria are adequate. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Data are provided as total for both hands only. 2. Undifferentiated age range. 3. No information regarding handedness. 4. High educational level of control participants.
[D.12] Kane, Parsons, and Goldstein, 1985 (Table A22.14)
The study compares performance of brain-damaged and control participants on neuropsychological tests. The control group consists of 46 medical and nonschizophrenic psychiatric Veterans Administration patients, with a mean age of 38.9 (11.3) years, recruited in Oklahoma and Pittsburgh. Data for both hands are reported in T scores. Study strengths 1. Sample is described in terms of age, education, and geographic area. 2. Adequate sample size. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine handedness are not described. 2. Data are reported in T scores rather than in kilograms. 3. Control group consisted of medical and psychiatric patients. 4. No information regarding gender; presumably, the majority of participants are male. 5. Undifferentiated age range.
[D.13] Heaton, Grant, and Matthews, 1986 (Table A22.15)
The authors obtained data on 553 normal controls in Colorado, California, and Wisconsin as part of an investigation into the
effects of age, education, and gender on Halstead-Reitan Battery performance. The sample consisted of 356 males and 197 females. Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants were aged 15-81 years, with a mean age of 39.3 (17.5) years, and had education of 0-20 years, with a mean of 13.3 (3.4) years; 7.2% were left-handed. The sample was divided into three age groups and three education groups. Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. The chapter reviews different studies that explore the relationship of neuropsychological test performance with age, education, and gender. The authors conclude that different sets of norms should be used for participants of different ages, educational levels, and genders when determining whether performance is normal or abnormal. Study strengths 1. Large overall sample size and sizes of individual cells. 2. Information regarding age, education, gender, handedness, and geographic area is provided. 3. Adequate exclusion criteria. 4. Data are grouped by age and educational level. Considerations regarding use of the study 1. SDs are not provided, which limits the utility of the norms. 2. Procedures for assessment of hand dominance are not described. 3. Age groupings are quite large in terms of ranges.
[D.14] Yeudall, Reddon, Gill, and Stefanyk, 1987 (Table A22.16)
The authors obtained data on 225 Canadian participants recruited from posted advertisements in workplaces and personal solicitations. The sample included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified for
participation average students in grades 10-12. Participants (127 males, 98 females) did not report any history of forensic involvement, head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Handedness was determined by the writing hand. Data were gathered by experienced technicians who "motivated the participants to achieve maximum performance" partially through the promise of detailed explanations of their test performance. Standard test administration procedures were used. The results are presented for the whole sample and stratified by four age groups by gender.
Study strengths 1. Large sample size. 2. Data are stratified by age and gender. 3. Data available for a 15-20 age group. 4. Adequate medical and psychiatric exclusion criteria. 5. Administration procedure is well described. 6. Information regarding age, education, IQ, gender, occupation, recruitment procedures, and geographic area is provided. 7. Method for determining handedness is specified. 8. Means and SDs for the test scores are reported. Consideration regarding use of the study 1. High educational level of the sample. Other comments 1. IQ was measured by the WAIS and WAIS-R. WAIS IQ scores were linearly equated to WAIS-R IQ scores. 2. Data were collected in Canada. 3. Correlations of dynamometer scores with age and education were 0.25 and 0.16 for the preferred hand and 0.27 and 0.17 for the nonpreferred hand, respectively. The effect of gender on performance was also explored. The authors concluded that age and gender effects were significant for both hands. Therefore, they suggest using gender norms stratified by age.
[D.15] Thompson, Heaton, Matthews, and Grant, 1987 (Table A22.17)
The article presents the percentage of 426 normal participants (279 males, 147 females) scoring in the lateralized lesion range using Golden's (1978) guidelines. Dominant hemisphere dysfunction was defined as superiority of nonpreferred hand performance over preferred hand performance. Nondominant hemisphere dysfunction was identified when preferred hand performance was at least 20% better than nonpreferred hand performance. Lateral preference type was assessed based on performance on the Reitan-Klove Lateral Dominance Exam and the Miles ABC Test of Ocular Dominance (Reitan & Wolfson, 1985). The following groups were identified:
1. All right: participants who wrote with their right hand and manifested right lateral preference on all hand, eye, and foot measures.
2. Mixed right: participants who wrote with their right hand but manifested left preference on one or more other hand, eye, or foot measures.
3. Left: left-handed participants.
Intermanual percent difference scores were calculated as preferred hand minus nonpreferred hand divided by preferred hand. Mean age was 40.59 (18.27) years, and mean education was 13.15 (3.49) years. Participants had been screened for history of head trauma, neurological illness, substance abuse, serious psychiatric illness, and peripheral injuries that might affect test performance. The authors concluded that neither age nor education nor gender was significantly related to intermanual difference scores.
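The intermanual percent difference score and the two lateralized-range definitions described for this study can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' scoring code: the function names and return labels are hypothetical, and the sketch assumes that the "at least 20%" criterion is evaluated on the percent difference score as defined above (difference divided by the preferred hand score).

```python
def intermanual_percent_difference(preferred_kg, nonpreferred_kg):
    """(preferred - nonpreferred) / preferred, expressed as a percentage."""
    if preferred_kg <= 0:
        raise ValueError("preferred-hand score must be positive")
    return 100.0 * (preferred_kg - nonpreferred_kg) / preferred_kg


def lateralized_range(preferred_kg, nonpreferred_kg, threshold_pct=20.0):
    """Apply the two definitions quoted in the text (labels are hypothetical)."""
    diff = intermanual_percent_difference(preferred_kg, nonpreferred_kg)
    if nonpreferred_kg > preferred_kg:
        # Nonpreferred hand stronger than preferred hand
        return "dominant-hemisphere range"
    if diff >= threshold_pct:
        # Assumption: threshold applied to the percent difference score
        return "nondominant-hemisphere range"
    return "not in lateralized range"


# Hypothetical scores in kilograms
print(intermanual_percent_difference(50.0, 45.0))
print(lateralized_range(50.0, 38.0))
print(lateralized_range(40.0, 42.0))
```

Given the false-positive rates reported for such cutoffs in normal samples, a flag from a rule like this is better read as a prompt to look for converging evidence than as a finding in itself.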
Study strengths 1. Large sample size. 2. Lateral preference was thoroughly assessed, and three groups were identified. 3. Intermanual differences and percentage of participants scoring in the lateralized lesion range are reported. 4. Adequate exclusion criteria. 5. Information on age, education, and gender is reported.
Considerations regarding use of the study 1. Means and SDs for each group are not reported. 2. Data are presented for a wide age range not separated into age groups, which precludes consideration of the effect of age on intermanual differences.
[D.16] Ernst, 1988 (Table A22.18)
The author collected data on 85 uncompensated volunteers from Brisbane, Australia, aged 65-75, recruited from the Queensland State electoral roll. All but one were Caucasian and right-handed. Thirty-nine were males and 46 were females. The sample was derived from 518 names randomly selected based on date of birth and residence. Potential participants were sent information regarding the project and a health questionnaire and asked to participate. Individuals with histories of substance abuse, head trauma, stroke, psychiatric hospitalization, or epilepsy were excluded. A large minority of participants (42%) had a history of at least one treated and/or well-controlled chronic illness (10 heart disease, 17 hypertension, 5 asthma, 2 emphysema, 10 hypo- or hyperthyroidism, 2 diabetes). A majority of participants were currently using prescribed medications (55%) for the above chronic diseases or as a hypertensive preventative. Mean educational level of 10.4 years was comparable to the modal educational level for that age range according to the Australian Bureau of Statistics. A wide range of occupations was represented, including unskilled laborers, homemakers, business persons, teachers, etc. The author concluded that a significant gender difference was demonstrated for both hands. Superiority of the preferred hand was approximately 10%. No gender differences were demonstrated on intermanual ratio for grip strength. Study strengths 1. Ethnic characteristics, gender, education, handedness, age, recruitment procedures, and geographic area are reported. 2. Data are presented for males and females separately.
3. Ratio of dominant/nondominant hands was computed for the entire sample, to indicate the strength of lateralization. 4. Relatively large sample size for constricted age range. 5. Good medical and psychiatric exclusion criteria. 6. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine hand preference were not described. 2. Approximately half of the participants had at least one chronic illness, and over half were taking prescribed medications. Other comments 1. Data were collected in Australia.
[D.17] Heaton, Grant, and Matthews, 1991
The authors provide normative data from 486 (378 base sample, 108 validation sample) urban and rural participants recruited in several states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data were collected over a 15-year period through multicenter collaborative efforts. Sixty-five percent of the sample were male. Mean age for the total sample was 42.06 (16.8), and mean educational level was 13.6 (3.5). Mean FSIQ, Verbal IQ, and Performance IQ were 113.8 (12.3), 113.9 (13.8), and 111.9 (11.6), respectively. Exclusion criteria were history of learning disability, neurological disease, illnesses affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or other substance abuse. The Hand Dynamometer Test was administered according to the procedures provided by Reitan and Wolfson (1985). Participants were paid and judged to have provided their best efforts on the tasks. Average number of kilograms for two trials for each hand separately is reported. The normative data, which are not reproduced here, are presented in comprehensive tables in T-score equivalents for scaled scores
for males and females separately in 10 age groupings (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80) by six education groupings (6-8, 9-11, 12, 13-15, 16-17, ≥18 years). For dominant hand performance, 58% of the score variance was accounted for by gender, while age and educational level accounted for a negligible amount of unique variance in performance (2% and 1%, respectively). A total of 63% of test score variance was accounted for by demographic variables. For nondominant hand performance, 55% of score variance was accounted for by gender, while age and educational level also accounted for a negligible amount of unique variance in performance (4% and 1%, respectively). A total of 62% of test score variance was accounted for by demographic variables. For the sample as a whole, mean scores in kilograms were, for the dominant hand, 43.4 (13.1) and, for the nondominant hand, 39.7 (12.7). The interested reader is referred to the Fastenau and Adams (1996a) critique of the Heaton et al. (1991) norms, and Heaton et al.'s (1996) response to this critique. In 2004, the authors published the revised norms, which are based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African American and Caucasian).
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria. 3. Detailed description of the demographic characteristics of the sample in terms of age, education, IQ, geographic area, and gender. 4. Administration procedures are outlined. 5. Normative data are presented in comprehensive tables in T-score equivalents for males and females separately in 10 age groupings by six education groupings.
Consideration regarding use of the study 1. No information regarding how hand preference was determined.
[D.18] Russell and Starkey, 1993 (Table A22.19)
This study describes the standardization sample used by the authors in their manual introducing the Halstead-Russell Neuropsychological Evaluation System (HRNES) and addressing its psychometric properties. The normative sample consisted of veterans treated at the Cincinnati V.A. Hospital between 1968 and 1971 and the Miami V.A. Medical Center between 1971 and 1989. All participants received neurological examination. Those participants who were administered the Halstead tests and the WAIS or WAIS-R were included in the study. Nine percent of the sample were representatives of minority groups. The total sample was divided into a comparison group and a brain-damaged group. The comparison group included "normal" individuals. No subject in this group had a diagnosis of CNS pathology. Presenting symptoms for the majority of these participants were neurosis with memory or somatic complaints or personality disorders with episodes of explosive behavior. Patients diagnosed with schizophrenia or severe depression requiring hospitalization as well as those with evidence of systemic vascular disease were not included in the sample. Test scores can be corrected for age and IQ and converted into scaled scores to facilitate comparison with other tests. Statistics are reported for four groups of patients: comparison, left hemisphere damage, right hemisphere damage, and diffuse brain damage. Data only for the comparison group, stratified by gender, are reproduced in this chapter. The authors published an appendix to the manual (HRNES-R, Russell & Starkey, 2001), which contains tables of scaled scores based on the original HRNES norms, demographic corrections, and regression-based predicted scores.
Study strengths 1. The sample composition is described in terms of age, education, gender, ethnicity, and geographic area. 2. Control sample size for males is large.
3. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Classification of control participants as "normal" is questionable since they had been suspected of having neurological conditions and were referred for neurological evaluation, which yielded negative results. 3. Undifferentiated age range.
[D.19] Tremont, Hoffman, Scott, and Adams, 1998 (Table A22.20)
The Halstead-Reitan Battery was used in a study examining the effect of intelligence level on neuropsychological test performance. The sample included 157 patients (71 males, 86 females) aged 16-74 years; 143 patients were Caucasian, 8 African American, 4 other, and 2 undetermined. Patients were referred to the University of Oklahoma Neuropsychological Laboratory for neuropsychological evaluation but determined to be neurologically normal based on neurodiagnostic procedures. The sample was divided into below average (FSIQ ≤89), average (FSIQ 90-109), and above average (FSIQ ≥110) ranges, based on performance on the WAIS-R. Data for the dominant hand performance on the Hand Dynamometer for males and females partitioned by intelligence level are presented in Table A22.20. The authors concluded that the Grip Strength Test failed to differentiate between different IQ levels.
Study strengths 1. Relatively large sample. 2. Sample composition is well described in terms of age, education, gender, IQ, geographic area, and clinical setting. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported. 5. Data are presented for males and females separately.
Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Data should be used with caution since they were collected on patients suspected of having neurological conditions and referred for neurodiagnostic procedures, which yielded negative results. 3. Undifferentiated age range. 4. It is unclear how many males and females are in each IQ group.
[D.20] Dikmen et al., 1999 (Table A22.21)
The Hand Dynamometer Test was used in a study on the psychometric properties of a broad range of neuropsychological measures, based on a sample of 384 normal or neurologically stable adults who were tested twice as part of several longitudinal studies. A group of "friend controls" consisted of 138 individuals who had no history of recent trauma and were friends of head-injured patients. Their mean age was 28.5 (12.2) years and mean education was 12.2 (1.9) years; 60% of the sample were males, and the test-retest interval was 11.1 (0.6) months. A group of "trauma controls" consisted of 121 individuals who had a recent traumatic injury that did not involve the head. They were tested at baseline 1 month after trauma and then 11 months later. Their mean age was 31.2 (13.6) years and mean education was 12.0 (2.6) years; 70% of the sample were males, and the test-retest interval was 10.7 (0.6) months. Both groups were tested at the University of Washington under the direction of one of the authors. Twenty percent of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse or a significant traumatic brain injury. The rest of the participants denied any history of conditions that might affect brain function. The third group, "mixed normal controls," consisted of 125 participants who had no history of trauma or disease involving the brain. They were enrolled in longitudinal research projects at multiple sites under the supervision of the neuropsychology laboratories at the University of Colorado and the University of California at San Diego. Their
mean age was 43.6 (19.6) years and mean education was 12.0 (3.3) years; 68% of the sample were males, and the test-retest interval was 5.4 (2.5) months. The data are reported for all groups combined. Demographic information for all groups combined is also provided. The mean WAIS FSIQ (Wechsler, 1955) on the initial testing for the three groups combined was 108.8 (12.3). The test was administered according to the procedures specified by Reitan and Wolfson (1993). The scores represent average number of kilograms across two trials for the dominant and nondominant hands. The authors provide raw scores for performance at two time probes, as well as various measures of test-retest reliability and magnitude of practice effect. The test-retest reliability was r = 0.90 for the dominant hand and r = 0.91 for the nondominant hand.
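The two kinds of statistic mentioned above, a test-retest correlation and the size of a practice effect, can be illustrated with a short sketch. This is not the authors' analysis: the function names `pearson_r` and `mean_practice_effect` are hypothetical, the scores are invented for illustration, and the practice effect is assumed here to be the simple mean of retest minus baseline scores.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_practice_effect(baseline, retest):
    """Mean gain (retest minus baseline); positive values mean stronger grip at retest."""
    return sum(r - b for b, r in zip(baseline, retest)) / len(baseline)


# Hypothetical dominant-hand scores in kilograms; NOT data from the study.
baseline = [38.0, 42.5, 45.0, 51.0, 47.5, 33.0]
retest = [39.0, 41.5, 46.5, 52.0, 49.0, 34.5]

print(f"test-retest r = {pearson_r(baseline, retest):.2f}")
print(f"mean practice effect = {mean_practice_effect(baseline, retest):.2f} kg")
```

A high correlation can coexist with a nonzero practice effect, which is why the two are reported separately.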
Study strengths 1. Large sample sizes for the three groups. 2. Sample composition is well described in terms of age, education, gender, IQ, geographic area, and setting. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Information on test-retest reliability is provided.
Considerations regarding use of the study 1. Exclusion criteria are not clearly described. As the authors pointed out, 20% of friend controls and 46% of trauma controls had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. 2. Data are not partitioned by age group. 3. Data for males and females are not presented separately. 4. No information regarding how hand preference was determined.
[D.21] Triggs, Calvanio, Levine, Heaton, and Heilman, 2000 (Table A22.22)
Grip strength was assessed in a study examining the relationship between hand preference and
hand performance asymmetries. The sample included 30 right-handed and 30 left-handed healthy volunteers aged 21-57 years, with a mean age of 37 (9) years, recruited primarily from hospital staff. The two groups contained equal numbers of men and women. Exclusion criteria were history of brain injury or any medical condition expected to affect performance on the study tasks. All participants had completed high school and most had completed at least 4 years of college. Handedness was assessed using the Annett (1970) Handedness Inventory, modified by Briggs and Nebes, and the Edinburgh Handedness Inventory. Grip strength was measured with a dynamometer, using three trials with each hand.
Study strengths 1. Adequate sample size. 2. Sample composition is described in terms of age, education, and recruitment procedures. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported. 5. Method for determining handedness is specified. 6. Test administration procedures are specified.
Considerations regarding use of the study 1. Gender distribution within each group is not provided. 2. Undifferentiated age range.
[D.22] Christensen, Mackinnon, Korten, and Jorm, 2001 (Table A22.23)
The authors evaluated the "common cause hypothesis" in a longitudinal study of cognitive aging on a large probability sample of healthy participants aged 70 years and older drawn from an electoral roll in Australia. Exclusion criteria were not described; a reference was made to earlier studies. The dominant hand was used first, after a practice trial. Participants swapped hands after two trials and then repeated the procedure. The measure of grip strength is presented as the mean of the scores for two hands averaged
over four trials, for four age groups, for males and females separately. The authors underscored the effect of age on grip strength after controlling for all other variables.
Study strengths 1. Large sample. 2. Sample composition is described in terms of age, education, gender, geographic area, and recruitment procedures. 3. Test administration procedures are thoroughly described. 4. Data are stratified into four age groups. 5. Means and SDs for the test scores are reported.
Consideration regarding use of the study 1. Procedures used to determine hand preference are not described.
Other comments 1. Data were collected in Australia.
RESULTS OF THE META-ANALYSES OF THE HAND DYNAMOMETER TEST DATA (See Appendix 22m)
Data collected from the studies reviewed in this chapter were combined in regression analyses in order to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult age groups are provided in Appendix 22m. Only those studies that stratify the results by gender were used in the prediction analyses. After initial data editing for consistency and for outlying scores, the following data were included in the analyses: nine studies, which generated 15 data points, based on a total of 713 participants for males, dominant hand; seven studies, which generated 13 data points, based on a total of 641 participants for males, nondominant hand; five studies, which generated 11 data points, based on a total of
454 participants for females, dominant hand; and four studies, which generated 10 data points, based on a total of 407 participants for females, nondominant hand. It should be pointed out that the integrity of the results is undermined by the lack of consistency in data reporting. A majority of studies report data for the "dominant hand" and "nondominant hand," while some report for the "right hand" and "left hand." Some of the latter studies include left hand-dominant participants in their samples. Though the percent of left-handers is typically small (1%-7.5%), their inclusion confounds the outcome. Also, determination of the dominant hand was based on a wide range of criteria, ranging from comprehensive questionnaires to self-report of the writing hand. Quadratic regressions of the scores on age were used for female and linear regressions for male data. R² ranged from 0.630 to 0.833. Based on the derived models, we estimated scores for age intervals between 25 and 69 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. Regressions of SDs for Hand Dynamometer scores on age suggest that age does not account for a significant amount of variability in SDs (R² ranges between 0.076 and 0.229). Though some increase in variability with advancing age is expected, this trend was not present in the collected data. Therefore, we suggest that the mean SDs for the aggregate sample be used across all age groups. Education did not contribute to the test scores in the data available for analyses.
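The regression approach described above can be sketched in a few lines. The following is a minimal illustration, not the authors' procedure (which is described in Chapter 3): it fits an ordinary, unweighted least-squares polynomial of study-level mean scores on age and predicts a score for a chosen age. The function names `fit_age_model` and `predict_score` are hypothetical, and all data points are synthetic values placed exactly on a straight line and on a parabola so that the fits are easy to check; none are taken from Appendix 22m.

```python
def fit_age_model(ages, means, degree):
    """Least-squares polynomial fit of mean scores on age.

    Ages are centered on their mean before fitting to keep the normal
    equations well conditioned. Returns (center, coefficients), where
    coefficients[j] multiplies (age - center) ** j.
    """
    center = sum(ages) / len(ages)
    xs = [a - center for a in ages]
    n = degree + 1
    # Normal equations (X'X) b = X'y with X[i][j] = xs[i] ** j
    lhs = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, means)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(lhs[r][col]))
        lhs[col], lhs[pivot] = lhs[pivot], lhs[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = lhs[row][col] / lhs[col][col]
            for k in range(col, n):
                lhs[row][k] -= factor * lhs[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution
    coefs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(lhs[i][k] * coefs[k] for k in range(i + 1, n))
        coefs[i] = (rhs[i] - tail) / lhs[i][i]
    return center, coefs


def predict_score(model, age):
    """Evaluate a fitted model at the given age."""
    center, coefs = model
    return sum(c * (age - center) ** j for j, c in enumerate(coefs))


# Synthetic age midpoints and mean scores (kg); NOT data from the reviewed studies.
ages = [25, 35, 45, 55, 65]
linear_means = [50.0, 48.0, 46.0, 44.0, 42.0]          # lies on 55 - 0.2 * age
quadratic_means = [31.25, 31.05, 30.45, 29.45, 28.05]  # lies on 30 + 0.1 * age - 0.002 * age ** 2

linear_model = fit_age_model(ages, linear_means, degree=1)
quadratic_model = fit_age_model(ages, quadratic_means, degree=2)
```

Calling `predict_score` for an age outside the fitted range extrapolates the curve, which is the situation for which the text advises caution.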
Strengths of the analyses 1. Total sample sizes of 407-713 participants. 2. R² of 0.833 for females, nondominant hand, indicates a good model fit. However, the number of data points in this analysis is only 10, which might result in somewhat inflated R² values. 3. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity.
4. Differences in mean predicted scores for the dominant vs. nondominant hands are 2.39 for males (48.10 vs. 45.71) and 3.25 for females (30.81 vs. 27.56), which, in the case of females, is consistent with the guideline of a 10% preferred-hand superiority. The intermanual difference for males is approximately 5%. 5. Differences between males and females in mean predicted scores are 17.29 (48.10 vs. 30.81) for the dominant hand and 18.15 (45.71 vs. 27.56) for the nondominant hand in favor of males, which is consistent with the strong effect of gender on test performance described in the literature.
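The arithmetic behind items 4 and 5 can be reproduced directly from the mean predicted scores quoted above. In this sketch the function names are hypothetical, and the percentage is taken relative to the dominant-hand score; that denominator is an assumption, since the text does not state which hand serves as the base.

```python
# Mean predicted scores (kg) quoted in the text above.
PREDICTED = {
    "males": {"dominant": 48.10, "nondominant": 45.71},
    "females": {"dominant": 30.81, "nondominant": 27.56},
}


def intermanual_difference(dominant, nondominant):
    """Return (difference in kg, difference as % of the dominant-hand score)."""
    diff = dominant - nondominant
    return diff, 100.0 * diff / dominant


def gender_difference(hand):
    """Male minus female mean predicted score for the given hand."""
    return PREDICTED["males"][hand] - PREDICTED["females"][hand]


for group, scores in PREDICTED.items():
    diff, pct = intermanual_difference(scores["dominant"], scores["nondominant"])
    print(f"{group}: {diff:.2f} kg ({pct:.1f}% of dominant hand)")

for hand in ("dominant", "nondominant"):
    print(f"{hand}: males exceed females by {gender_difference(hand):.2f} kg")
```

With the dominant hand as the base, the female difference comes to roughly 10.5% and the male difference to roughly 5%, in line with the figures given in the text.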
Limitations of the analyses 1. R² of 0.747 and 0.630 for the dominant and nondominant hands for males and of 0.673 for females, dominant hand, are acceptable. However, these values indicate that only 63%-75% of the variance in test scores is accounted for by the models. 2. Number of data points for females is small.
CONCLUSIONS
A review of the above studies suggests high consistency in the data across different reports. Pronounced gender differences, with males outperforming females, represent an unequivocal finding. Decline in grip strength associated with advancing age is also frequently
reported and supported by the meta-analyses described in this chapter. Although high sensitivity of this test to brain impairment is well documented, interpretation of test results from the perspective of lateralization of brain damage should be made with caution. A 10% dominant hand superiority criterion is clearly consistent with the average performance across the studies presented above. However, a wide range of individual differences documented in numerous studies warrants great caution in the interpretation of dominant/nondominant hand comparisons. In addition to high variability of intermanual differences, peripheral dysfunction might influence test performance (e.g., arthritis, hand or arm orthopedic problems, etc.). Unacceptably high false-positive misclassification rates using standard criteria for lateralized impairment warrant further research directed at revision of these criteria. Some aspects of grip strength variability have not received sufficient attention in the literature. For example, normative data for older age groups are scarce. Test-retest concordance should be further explored to assess the magnitude of any practice effect and to address the issue of test reliability over different interprobe intervals. Since interpretation of the results is based on assumptions of cerebral lateralization, it is of the utmost importance to report the criteria for assessment of handedness, cutoff scores for subject selection on the basis of handedness pattern, and number of left-handed individuals in the sample, if they are included.
23 Grooved Pegboard Test
BRIEF HISTORY OF THE TEST
The Grooved Pegboard Test (GPT) acquired its popularity over 30 years ago as part of two neuropsychological batteries. It consists of a metal board with a matrix of slotted holes angled in different directions. The task is to insert 25 metal pegs with ridges along the sides into each hole in sequence. A further description of the test, its applications, and references to the original sources are provided in Lezak (1995) and Lezak et al. (2004). Administration and scoring instructions are also provided by Lafayette Instrument Company, which manufactures the pegboard. Scores represent time in seconds required to complete the matrix with each hand, with higher scores reflecting lower levels of performance. Russell and Starkey (1993) propose a limit of 180 seconds, after which the trial is discontinued. According to their modification, the number of pegs not placed within the time limit is prorated into the time score. Although instructions for the test administration are relatively simple, review of the literature suggests that there is considerable variability in the following aspects of administration and scoring:
1. Administration of practice trials:
a. The trial starts after instructions are given to the subject; no opportunity for practice is offered.
b. The subject is allowed to place a certain number of pegs (the number varies for different studies) prior to the actual trial, as practice.
2. Beginning of timing:
a. An examiner starts timing when he or she cues the subject to start working on the test.
b. Timing starts as the subject drops the first peg into a slot.
3. Number of trials:
a. According to the test manual, one trial is administered per hand, starting with the dominant hand.
b. Two or more trials are administered per hand, alternating dominant and nondominant hands to counteract the practice effect which confounds performance with the nondominant hand; the score is the mean of all trials for each hand.
c. Two trials are administered per hand in a switchback design, where the first trial is performed with the dominant hand, the next two trials with the nondominant hand, and the last trial with the dominant hand; the score is the mean of all trials for each hand.
4. Assessment of laterality: This test is known as a sensitive measure of lateralized brain damage. As such, accurate identification
of handedness is of utmost importance. The majority of studies do not provide a description of laterality assessment. It is based in most cases on participants' self-report of the hand preferred for writing.
Unfortunately, precise test administration procedures are not clearly described by the majority of authors, which hampers the comparability of the norms generated from different studies. The GPT measures psychomotor speed, fine motor control, and rapid visual-motor coordination. Motor abilities measured by this test are more complex than those measured by the Finger Tapping Test and the Hand Dynamometer. Essentially, the GPT is a cognitive-motor task. In contrast, the finger tapping and dynamometer tasks require less task-specific cognitive effort and concentration and can be performed passively. Performance on the GPT is also highly dependent on psychomotor speed (Miller et al., 1990; Lezak, 1995; Lezak et al., 2004). Axelrod and Milner (1997) found the GPT to be a good predictor of low psychomotor speed in veterans of Operation Desert Storm and Operation Desert Shield who displayed cognitive problems. Harnadek and Rourke (1994) report its sensitivity to nonverbal learning disability. However, most commonly the GPT is used for assessment of lateralized cerebral dysfunction. Use of the original cutoff scores for impairment (Heaton et al., 1986) and intermanual differences of about 10% in determination of brain dysfunction yields high rates of false-positive misclassification (Bornstein et al., 1987b). Revised cutoffs have been proposed (Bornstein et al., 1987b; Ryan et al., 1987). Bornstein (1986c) evaluated the pattern of motor performance on three motor tests (Finger Tapping Test, GPT, and Hand Dynamometer), which were administered to normal and unilateral brain lesion samples.
A large degree of variability was observed across these intermanual measures, whereby "a high percentage (approximately 25 percent) of the normal sample obtained scores more than one standard deviation from the control mean on a single measure" (p. 719). Thus, the author has emphasized the importance of consistency in performance pattern across tasks, rather than use of
a "rigid application of 'cookbook' formulas or 'rules of thumb'" (p. 723) in test interpretation. Schmidt et al. (2000) studied the transfer of training between hands by counterbalancing the order of the starting hand in right-handers vs. left-handers. The authors found an effect of opposite-hand training only in left-handed men, which they attributed to a larger corpus callosum in left-handed men. Reliability of the GPT has been addressed in several studies. For instance, Ruff and Parker (1993) report test-retest reliability coefficients of 0.69-0.76 for the dominant hand and 0.68-0.78 for the nondominant hand over a 6-month period. For further information on the psychometric properties of the GPT, see Lezak et al. (2004) and Spreen and Strauss (1998).
RELATIONSHIP BETWEEN GPT PERFORMANCE AND DEMOGRAPHIC FACTORS
The effect of age on GPT performance was quite pronounced across different studies, with slowing associated with advancing age (Bornstein, 1985; Concha et al., 1995; Heaton et al., 1991, 2004; Ruff & Parker, 1993; Ryan et al., 1987; Selnes et al., 1991). The effects of education and gender have been reported but are much weaker (Bornstein, 1985; Concha et al., 1995; Heaton et al., 1991, 2004; Ryan et al., 1987; Selnes et al., 1991). Polubinski and Melamed (1986) and Schmidt et al. (2000) found that females performed faster than males, and Thompson et al. (1987) report greater intermanual differences for females compared to males. However, Strenge et al. (2002) did not find an effect of age or gender on GPT performance in their sample of students 19-30 years old, which is probably due to the restricted age range of the sample. Ryan and colleagues (1987) proposed a regression equation to control for the effects of age and education on GPT performance.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the GPT normative reports, seven criterion variables were deemed
critical. The first five of these relate to subject variables, and the remaining two refer to procedural issues.
Subject Variables
Sample Size
Fifty cases are considered a desirable sample size. Although this criterion is somewhat arbitrary, a large number of studies suggest that data based on small sample sizes are highly influenced by individual differences and do not provide a reliable estimate of the population mean.
Sample Composition Description
Information regarding medical and psychiatric exclusion criteria is important. It is unclear if geographic recruitment region, socioeconomic status, occupation, ethnicity, handedness, or recruitment procedures are relevant. Until determined, it is best that this information be provided.
Age Group Intervals
This criterion refers to grouping of the data into limited age intervals. This requirement is relevant for this test since a strong effect of age on GPT performance has been demonstrated in the literature.
Reporting of Educational Levels
Given the association between education and GPT performance, information regarding educational level should be reported for each subgroup, and preferably normative data should be presented by educational levels.
Reporting of Gender Composition
Given the possible association between gender and GPT performance, information regarding gender composition should be reported for each subgroup, and preferably normative data should be presented for males and females separately.
Description of Hand Preference Assessment
To address the issue of lateralization in test performance, assessment procedures for hand preference should be fully described. Without this assessment, assumptions regarding functional lateralization cannot be made.
Procedural Variables
Description of Administration Procedures
Administration procedures for the GPT differ among studies. Detailed description of the procedures allows selection of the most appropriate norms or corrections to account for deviations in administration procedures.
Data Reporting
Group means and standard deviations for the number of seconds required to complete the matrix for the dominant and nondominant hand should be presented at minimum.
SUMMARY OF THE STATUS OF THE NORMS
A number of studies have reported normative data for the GPT. Studies vary in subject selection, description of procedural and subject variables, and grouping of data into categories. In addition to normative studies based on "normal" samples, there are a number of clinical comparison studies that explore differences in GPT performance between clinical groups and "normal control" groups (which are sometimes matched on demographic characteristics). Unfortunately, normal control groups are frequently comprised of medical or psychiatric patients. These samples cannot be considered truly "normal" due to possible effects of their illnesses and medications on test performance. The majority of studies present the data in number of seconds required to complete the matrix with each hand. Several studies report the proportion of participants falling in the impaired range or rates of intermanual difference. Several authors stratify their samples by age, education, and/or gender. Procedures for the assessment of handedness are described in some studies. Furthermore, some authors divide their samples into groups based on a subject's handedness pattern.
The majority of studies include mostly young and middle-aged participants. Only a few studies present data for elderly individuals. Several studies provide test-retest data over varying interprobe intervals. Some studies provide data for left-handed samples. Among all the studies available in the literature, we selected for review those based on well-defined samples or that offer some information not routinely reported. In this chapter, normative publications and control data from clinical studies are reviewed in ascending chronological order. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 23. Table A23.1, the locat
SUMMARIES OF THE STUDIES
[GPT.1] Rounsaville, Jones, Novelly, and Kleber, 1982 (Table A23.2)
The study compared GPT performance of opiate addicts, epilepsy patients, and controls. A sample of 29 Comprehensive Employment Training Act (CETA) workers was used for a normal comparison group. Participants with a history of drug or alcohol abuse or a neurological disorder were excluded. Participants were screened for alcohol and illicit psychoactive substances, and urine specimens were taken at the time of testing.
Study strengths 1. Sample is described in terms of gender, age, education, and percent of right-handed participants. 2. Adequate exclusion criteria.
Considerations regarding use of the study 1. SDs were not provided. 2. Testing procedure was scarcely described. 3. Procedures for assessment of hand dominance were not described. 4. Age ranges were not provided. It is difficult to extrapolate age limits for use of the presented norms. 5. Sample size is small.
Footnote: Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
[GPT.2] Bornstein, 1985 (Tables A23.3, A23.4)
The author collected data on 365 Canadian individuals (178 males, 187 females) recruited through posted notices on college campuses and unemployment offices, newspaper ads, and senior-citizen groups. Participants were paid; ranged in age from 18 to 69, with a mean age of 43.3 (17.1) years; and had completed 5-20 years of education, with a mean of 12.3 (2.7); also, 91.5% of the sample were right-handed. No other demographic data or exclusion criteria are reported. Means and SDs are reported for each hand. The sample is stratified by age group (20-39, 40-59, 60-69), level of education (
Study strengths 1. Very large overall sample size. 2. Stratification of the data by age, gender, and educational level. 3. This study is unique in that it reports data for participants with less than a high school education. 4. Information on recruitment procedures and geographic area is provided. 5. Method for determining handedness is specified. 6. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Individual sample sizes of some cells are small. It is unclear whether the youngest age included in the study was 18 or 20. 2. No reported exclusion criteria.
Other comments 1. It has been established in several studies that performance on GPT varies as a function of age; however, it has only a weak relationship with education and gender. Therefore, norms broken down by age group are most appropriate for use. 2. Data were collected in Canada.
[GPT.3] Heaton, Nelson, Thompson, Burks, and Franklin, 1985 (Table A23.5)
The authors compared performance of multiple sclerosis patients and normal control participants recruited in Colorado. The control group included 100 participants with no history of neurological illness, significant head trauma, or substance abuse. Study strengths 1. Control sample is large. 2. Information regarding age, education, gender, and geographic area is reported. 3. Exclusion criteria are adequate. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Data are provided as total for both hands only. 2. Undifferentiated age range. 3. No information regarding handedness. 4. High educational level of control participants. [GPT.4] Heaton, Grant, and Matthews, 1986 (Table A23.6)
The authors obtained data on 553 normal controls in Colorado, California, and Wisconsin as part of an investigation into the effects of age, education, and gender on Halstead-Reitan Battery performance. The sample consisted of 356 males and 197 females. Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants were aged 15-81 years, with a mean of 39.3 (17.5), and had education ranging 0-20 years, with a mean of 13.3 (3.4); 7.2% were left-handed. The sample was divided into three age groups and three education groups.
Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. The chapter reviews different studies that explore the relationship of neuropsychological test performance with age, education, and gender. The authors concluded that different sets of norms should be used for participants of different ages, educational levels, and genders when determining whether performance is normal or abnormal. Study strengths 1. Large overall sample size and sizes of individual cells. 2. Information regarding age, education, gender, handedness, and geographic area is provided. 3. Adequate exclusion criteria. 4. Data are grouped by age and educational level. Considerations regarding use of the study 1. SDs are not provided, which limits utility of the norms. 2. Procedures for assessment of hand dominance are not described. 3. Age groupings are quite large in terms of ranges. [GPT.5] Bornstein, 1986a (Tables A23.7, A23.8)
This report expands the analysis of the data provided in Bornstein (1985). The author examined cutoff levels for impairment and the proportion of participants falling in the impaired range. For both preferred and nonpreferred hands, the clinically employed cutoff criterion was 66 seconds. Performance time >66 seconds placed participants into the impaired range. The high proportion of impaired scores is viewed by the author as suggesting caution in using standard cutoff scores. Base rate issues are discussed from the perspective of the validity of test interpretation. The administration and scoring were as follows: The score for the Grooved Pegboard Test was the time required to fill the board according to standard instructions, described in a privately
MOTOR FUNCTIONS
published manual developed by Matthews. The preferred hand trial was administered first, and timing of the trial was not interrupted in the event of a dropped peg. (p. 414)
The sample was stratified by age group (18-39, 40-59, 60-69), level of education ( on use of the study, see [GPT.2] Bornstein, 1985, above. In addition, in the current study, exclusion criteria and test administration procedures are specified.
[GPT.6] Polubinski and Melamed, 1986 (Table A23.9)
Participants were 120 students taking introductory psychology classes. All were right-handed. The Crovitz-Zener Test (Crovitz & Zener, 1962) was used to assess degree of hand dominance based on consistency in hand preference for five unimanual tasks. Participants with a score of 25 on this test formed the firm right-handed groups, while those with scores of ≤24 formed the mixed right-handed groups. A switchback design was used, in which the first and fourth trials were performed with the right hand and the second and third trials were performed with the left hand. The authors found that women performed faster than men and mixed right-handers performed faster than firm right-handers
4. Relatively large sample for restricted age, education, and handedness groups. 5. Means and SDs for the test scores are reported. Consideration regarding use of the study 1. No exclusion criteria.
[GPT.7] Ryan, Morrow, Bromet, and Parkinson, 1987 (Table A23.10)
This report describes the development of the Pittsburgh Occupational Exposures Test (POET) battery. It explored the factor structure of the battery and interrelations of test scores with age and education. The article provides norms for 182 blue-collar workers who do not have a history of exposure to industrial toxins, to be used to assess the effect of industrial toxins on neuropsychological functioning in the clinic. All participants were white, native English-speaking males who had been employed at a heavy industrial plant in eastern Pennsylvania for at least 1 year. Participants had no previous exposure to industrial toxins and no history of neurological or psychiatric disorder or renal or hepatic disease; in addition, they had refrained from alcohol consumption in the 12 hours prior to testing. Since age and education were highly related to test performance, the authors developed a linear regression procedure that controls for the confounding effect of these variables. The prediction of a test score for each individual is based on the following equations:
Dominant hand predicted score = 71.233 + 0.301 (age) - 0.904 (education)
Nondominant hand predicted score = 85.929 + 0.151 (age) - 1.347 (education)
The authors also reported the percent predicted ratio score (ratio of the actual score to the predicted score x 100) that falls at or below the fifth centile for this population. The cutoff values for impairment are as follows: 130 for the dominant hand and 128 for the nondominant hand.
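The regression equations and the percent predicted ratio described above can be illustrated with a short sketch. The function names are hypothetical and are not taken from the original report. The sketch also assumes that age and education are entered in years, and that a ratio at or above the cutoff marks the impaired range, on the grounds that GPT scores are completion times and longer times reflect poorer performance.

```python
# Coefficients and cutoffs are those quoted in the text for Ryan et al. (1987).
# Everything else (names, units, direction of the cutoff) is an assumption.

CUTOFFS = {"dominant": 130.0, "nondominant": 128.0}


def predicted_gpt_time(age, education, hand="dominant"):
    """Return the predicted completion time in seconds for the given hand."""
    if hand == "dominant":
        return 71.233 + 0.301 * age - 0.904 * education
    if hand == "nondominant":
        return 85.929 + 0.151 * age - 1.347 * education
    raise ValueError("hand must be 'dominant' or 'nondominant'")


def percent_predicted(actual, age, education, hand="dominant"):
    """Ratio of the actual score to the predicted score, multiplied by 100."""
    return 100.0 * actual / predicted_gpt_time(age, education, hand)


def exceeds_cutoff(actual, age, education, hand="dominant"):
    """Assumed rule: a ratio at or above the cutoff falls in the impaired range."""
    return percent_predicted(actual, age, education, hand) >= CUTOFFS[hand]


if __name__ == "__main__":
    # Hypothetical examinee: 40 years old, 12 years of education, 95 seconds.
    print(round(predicted_gpt_time(40, 12, "dominant"), 3))
    print(round(percent_predicted(95, 40, 12, "dominant"), 1))
    print(exceeds_cutoff(95, 40, 12, "dominant"))
```

Expressing the observed time as a percentage of the demographically predicted time lets one cutoff per hand serve examinees of different ages and educational levels, which is the stated purpose of the regression procedure.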
Study strengths 1. Sample composition is described in terms of gender, occupation, education, ethnicity, and geographic area. 2. Sample size is large, and most individual cell sizes approach 50. 3. Testing procedure is well described. 4. Sample is divided into four age groups. 5. Means and SDs for the test scores are reported. 6. Adequate exclusion criteria. Considerations regarding use of the study 1. Procedures for assessment of hand dominance are not described. 2. All-male sample. [GPT.8] Bornstein, Baker, and Douglass, 1987a (Table A23.11)
The study assessed test-retest reliability of the GPT over a period of 3 weeks. Participants were 14 women and nine men recruited from a university community without a positive history of neurological or psychiatric illness. Their ages ranged 17-52, with a mean of 32.3 (10.3) years; mean VIQ was 105.8 (10.8), and mean PIQ was 105.0 (10.5). Participants were administered the Halstead-Reitan Battery in standard order both on initial testing and again 3 weeks later. Means and SDs for raw score change over 3 weeks are -2.8 (6.1) and 0.3 (6.4) for the right and left hands, respectively. Data for the whole sample for both testing probes are provided. Study strengths 1. Sample composition is described in terms of age, VIQ, PIQ, and gender. 2. Information on short-term (3-week) retest data is provided. 3. Minimally adequate exclusion criteria. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Sample size is small. 2. Age range is wide; the effect of age on test-retest change is not explored. 3. Procedures for assessment of hand dominance are not described. It is unclear whether the authors used dominant/nondominant comparisons or right/left comparisons. 4. Information on educational level is not reported. 5. Data are collapsed across genders. [GPT.9] Thompson, Heaton, Matthews, and Grant, 1987 (Table A23.12)
The article presents the percentage of 426 normal participants (279 males, 147 females) scoring in the lateralized lesion range using Golden's (1978) guidelines. Dominant hemisphere dysfunction was defined as superiority of nonpreferred hand performance over preferred hand performance. Nondominant hemisphere dysfunction was identified when preferred hand performance was at least 20% better than nonpreferred hand performance. Lateral preference type was assessed based on performance on the Reitan-Klove Lateral Dominance Exam and the Miles ABC Test of Ocular Dominance (Reitan & Wolfson, 1985). The following groups were identified: 1. All right: participants who wrote with their right hand and manifested right lateral preference on all hand, eye, and foot measures. 2. Mixed right: participants who wrote with their right hand but manifested left preference on one or more other hand, eye, or foot measures. 3. Left: left-handed participants. Intermanual percent difference scores were calculated as preferred hand minus nonpreferred hand performance divided by preferred hand. Participants' mean age was 40.59 (18.27) years, and mean education was 13.15 (3.49) years. They had been screened for history of head trauma, neurological illness, substance abuse, serious psychiatric illness, and peripheral injuries that might affect test performance. The authors concluded that females tend to show greater disparity between preferred and nonpreferred hand performance (M = 9.8%, SD = 11.8) than males (M = 6.7%, SD = 12.1). Age and education were not significantly related to intermanual difference scores.
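The intermanual difference score and the two lateralized patterns described above can be sketched as follows. This is an illustration only, not the authors' scoring procedure. The function names are hypothetical. The sketch assumes that scores are completion times (lower is better), that the result is multiplied by 100 to give a percentage, and that "at least 20% better" means the nonpreferred hand time is at least 20% longer than the preferred hand time.

```python
def intermanual_percent_difference(preferred, nonpreferred):
    """(preferred - nonpreferred) / preferred, expressed as a percentage.

    With completion times, a negative value means the preferred hand was faster.
    """
    if preferred <= 0:
        raise ValueError("preferred hand score must be positive")
    return 100.0 * (preferred - nonpreferred) / preferred


def lateralized_pattern(preferred_time, nonpreferred_time, threshold=20.0):
    """Classify a pair of completion times (seconds) under the assumptions above."""
    if nonpreferred_time < preferred_time:
        # Nonpreferred hand outperforms the preferred hand.
        return "dominant hemisphere range"
    slower_by = 100.0 * (nonpreferred_time - preferred_time) / preferred_time
    if slower_by >= threshold:
        # Preferred hand is better by at least the threshold percentage.
        return "nondominant hemisphere range"
    return "within expected range"


if __name__ == "__main__":
    print(intermanual_percent_difference(60, 66))
    print(lateralized_pattern(60, 58))
    print(lateralized_pattern(60, 72))
    print(lateralized_pattern(60, 66))
```

Because timed and count-based motor tests run in opposite directions, the sign convention should be checked against the source tables before applying such a rule to other measures.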
Study strengths 1. Large sample size. 2. Lateral preference was thoroughly assessed, and three groups were identified. 3. Intermanual differences and a percentage of participants scoring in the lateralized lesion range are reported. 4. Adequate exclusion criteria. 5. Information on age, education, and gender is reported. Considerations regarding use of the study 1. Means and SDs for each group are not reported. 2. Data are presented for a wide age range not separated into age groups, which precludes consideration of the effect of age on intermanual differences. [GPT.10] Bornstein and Suga, 1988 (Table A23.13)
The authors report data on 134 healthy older Canadian volunteers aged 55-70, who were paid for their participation. Participants represent a subset of those used in Bornstein (1985). The sample is partitioned into three education groups. Nearly two-thirds of the sample are female (n = 85). The average age for the sample is 62.7 (4.3) years, and the mean ages of the three education groups are comparable. Exclusion criteria were history of neurological or psychiatric disorder. Study strengths 1. Large overall sample size and adequate individual cell sizes. 2. Data are partitioned into three education groups; the study is unique in terms of representation of participants with <12 years of education. 3. Information regarding gender, age, and geographic area is provided. 4. Means and SDs for the test scores are reported. 5. Minimally adequate exclusion criteria. 6. Reasonably restricted age grouping. Considerations regarding use of the study 1. Procedure for determination of hand preference is not identified.
2. The ≥12 years of education category is too large. 3. Data are collapsed across genders. Other comments 1. Data were collected in Canada. [GPT.11] Miller, Selnes, McArthur, Satz, Becker, Cohen, Sheridan, Machado, Van Gorp, and Visscher, 1990 (Tables A23.14, A23.15)
The article provides data for homosexual/bisexual males recruited in the Multi-Center AIDS Cohort Study (MACS), an epidemiological project designed to assess the natural history of HIV-1 infection. The study uses large sample sizes to explore the effect of HIV serostatus and symptom status on cognitive and motor functioning. Handedness was established based on self-report. The test administration procedure followed that outlined by Lezak (1983). The paper reports the percentage of right-handed, ambidextrous, and left-handed individuals; race composition (white, black, Hispanic, other); CES Depression Scale scores (with SDs); and CD4 cell/mm3 count (with SDs). Study strengths 1. Sample sizes are large. 2. The demographic characteristics of each sample are meticulously described in terms of gender, sexual orientation, handedness, ethnicity, age, education, and geographic area. 3. Method for determining handedness is described. 4. Means and SDs for the test scores are reported. 5. Test administration procedures are reported. Considerations regarding use of the study 1. The study recruited participants aged 21-72. Data are presented for all ages combined. 2. No exclusion criteria reported. 3. All-male sample. 4. High educational level.
[GPT.12] Heaton, Grant, and Matthews, 1991
The authors provide normative data from 486 (378 base sample, 108 validation sample) urban and rural participants recruited in several states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data were collected over a 15-year period through multicenter collaborative efforts. Sixty-five percent of the sample were males. Mean age for the total sample was 42.06 (16.8), and mean educational level was 13.6 (3.5). Mean full-scale IQ, VIQ, and PIQ were 113.8 (12.3), 113.9 (13.8), and 111.9 (11.6), respectively. Exclusion criteria were history of learning disability, neurological disease, illnesses affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or other substance abuse. The GPT was administered according to procedures provided by the test manufacturer. Participants were paid and judged to have provided their best efforts on the tasks. Time in seconds to complete the 25-peg placement with each hand separately is reported. The normative data, which are not reproduced here, are presented in comprehensive tables in T-score equivalents for scaled scores for males and females separately in 10 age groupings (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80) by six education groupings (6-8, 9-11, 12, 13-15, 16-17, ≥18 years). For dominant hand performance, 40% of the score variance was accounted for by age, while 17% was attributable to educational level; gender accounted for a negligible amount of unique variance in performance (4%). A total of 47% of test score variance was accounted for by demographic variables. For nondominant hand performance, 39% of score variance was accounted for by age, while 13% was attributable to educational level; again, gender accounted for a negligible amount of unique variance (3%). A total of 42% of test score variance was accounted for by demographic variables.
For the sample as a whole, mean time in seconds was 67.3 (16.1) for the dominant hand, and 72.3 (17.5) for the nondominant hand. The interested reader is referred to the Fastenau and Adams (1996) critique of the Heaton et al. (1991) norms, and Heaton et al.'s (1996a) response to this critique. In 2004, the authors published revised norms, which are based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African American and Caucasian). Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria. 3. Detailed description of demographic characteristics in terms of age, education, IQ, geographic area, and gender. 4. Administration procedures are outlined. 5. Normative data are presented in comprehensive tables in T-score equivalents for males and females separately in 10 age by six education groupings. Consideration regarding use of the study 1. No information regarding how hand preference was determined. [GPT.13] Selnes, Jacobson, Machado, Becker, Wesch, Miller, Visscher, & McArthur, 1991 (Tables A23.16-A23.18)
The article presents data for seronegative homosexual and bisexual males from the MACS, who were earlier described by Miller et al. (1990), for the purpose of establishing normative data for neuropsychological test performance based on a large sample. Participants with a history of head injury with loss of consciousness >1 hour and those who reported drinking ≥21 drinks per week in the previous 6 months were excluded. The paper reports the percentage of right-handed, ambidextrous, and left-handed individuals as well as the race composition (Caucasian, African American) for age and education strata. Handedness was established based on self-report. Standard procedures described by the Lafayette Instrument Company were used.
The authors point out a significant effect of age on GPT performance. Education, however, was not significantly related to performance.
Study strengths 1. Normative data are stratified by age and education. 2. The demographic composition of the sample is described in terms of age, gender, sexual orientation, handedness, ethnicity, and geographic area; demographic composition is described for each age and education cell separately. 3. Means, SDs, as well as scores for percentiles 5 and 10 are presented. 4. Method for determining handedness is reported. 5. Very large overall sample and individual cells. 6. Administration procedures are specified. 7. Minimally adequate exclusion criteria.
Considerations regarding use of the study 1. All-male sample. 2. High educational level.
Other comments
The authors provide an update (personal communication) on the data reported in their 1991 paper, reflecting ongoing data collection and analysis for their longitudinal epidemiological study of HIV infection. They present the data in a combined age and education grouping. All data are from healthy HIV-negative gay and bisexual males.
[GPT.14] Ruff and Parker, 1993 (Tables A23.19, A23.20)
The GPT was administered as part of a comprehensive test battery to 360 normal volunteers recruited in California, Michigan, and the eastern seaboard, aged 16-70 years, with education of 7-22 years. Participants were screened for psychiatric hospitalizations, chronic polydrug abuse, and neurological disorders. Data are stratified by education x gender x age. Data for a left hand-dominant sample are also reported.
The authors report test-retest reliability for a 6-month interval, based on data for five or more participants from each of 12 demographic cells (30% of sample). Reliability coefficients for women, men, and the total sample were 0.76, 0.69, and 0.72 for the dominant hand and 0.78, 0.68, and 0.74 for the nondominant hand, respectively. The effect of age and gender on motor speed was specifically addressed. The authors explore the ratio of dominant/nondominant hand performance.
Study strengths 1. Sample composition is described in terms of age, gender, education, handedness, and geographic area. 2. Assessment of handedness is well described. 3. Test administration procedure is described. 4. Data are presented in gender x education x age groupings. 5. Data for a left hand-dominant sample are reported. 6. Means and SDs for the test scores are reported. 7. Adequate exclusion criteria. 8. Large overall sample size, although some cells are relatively small.
[GPT.15] Russell and Starkey, 1993 (Table A23.21)
This study describes the standardization sample used by the authors in their manual introducing the Halstead-Russell Neuropsychological Evaluation System (HRNES) and addressing its psychometric properties. The normative sample consisted of veterans treated at the Cincinnati Veterans Administration Hospital between 1968 and 1971 and the Miami V.A. Medical Center between 1971 and 1989. All participants received a neurological examination. Those participants who were administered the Halstead tests and the WAIS or WAIS-R were included in the study. Nine percent of the sample were representatives of minority groups. The total sample was divided into a comparison group and a brain-damaged group. The comparison group included "normal"
individuals. No subject in this group had a diagnosis of central nervous system pathology. Presenting symptoms for the majority of these participants were neurosis with memory or somatic complaints or personality disorders with episodes of explosive behavior. Patients diagnosed with schizophrenia or severe depression requiring hospitalization as well as those with evidence of systemic vascular disease were not included in the sample. Test scores can be corrected for age and IQ and converted into scaled scores to facilitate comparison with other tests. Statistics are reported for four groups of patients: comparison, left hemisphere damage, right hemisphere damage, and diffuse brain damage. Data only for the comparison group are reproduced in this chapter. The authors published an appendix to the manual (HRNES-R; Russell & Starkey, 2001), which contains tables of scale scores based on the original HRNES norms, demographic corrections, and regression-based predicted scores. Study strengths 1. Sample composition is described in terms of age, education, gender, ethnicity, and geographic area. 2. Control sample size is large. 3. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Procedures used to determine hand preference are not described. 2. Classification of participants in the comparison group as normal is questionable since they were suspected of having neurological conditions and referred for neurological evaluation, which yielded negative results. 3. Undifferentiated age range. [GPT.16] Dikmen, Heaton, Grant, and Temkin, 1999 (Table A23.22)
The GPT was used in a study on the psychometric properties of a broad range of neuropsychological measures based on a sample of 125 normal or neurologically stable adults, 121 of whom were tested twice. Participants in
this group had no history of trauma or disease involving the brain. They were enrolled in longitudinal research projects at multiple sites under the supervision of the neuropsychology laboratories at the University of Colorado and the University of California at San Diego. Their mean age was 43.6 (19.6) years and mean education was 12.0 (3.3) years; 68% of the sample were males, and the test-retest interval was 5.4 (2.5) months. The other two groups do not have data for the GPT and therefore will not be described in this chapter. The mean WAIS FSIQ (Wechsler, 1955) on the initial testing for the three groups combined was 108.8 (12.3). The GPT was administered and scored according to the instructions provided by the Lafayette Instrument Company. The scores represent the numbers of seconds required to place all 25 pegs on the board, separately for the dominant and nondominant hands. The authors provide raw scores for performance at two time probes, as well as various measures of test-retest reliability and the magnitude of the practice effect. The test-retest reliability for the GPT was r = 0.86 for both dominant and nondominant hands.
Study strengths 1. Large sample size. 2. Sample composition is well described in terms of age, education, gender, IQ, geographic area, and setting. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Information on test-retest reliability is provided. Considerations regarding use of the study 1. Exclusion criteria are not clearly described. 2. Data are not partitioned by age group. 3. No information regarding how hand preference was determined. [GPT.17] Strenge, Niederberger, and Seelhorst, 2002 (Table A23.23)
The GPT was administered as part of a battery in a study examining the relation between
tests of manual dexterity and attentional functions. The sample consisted of 49 right-handed medical students (26 women, 23 men) 19-30 years of age, with a median age of 23 years. Handedness was assessed with the Annett (1970) inventory. The GPT was administered according to standard instructions. The authors controlled for finger size in their analyses. No significant effect of age or gender on test performance was found.
Study strengths 1. Sample size approaches 50. 2. Sample composition is described in terms of age, gender, and setting. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Method for determining handedness is specified.
Considerations regarding use of the study 1. Exclusion criteria are not identified. 2. Recruitment procedures are not reported. 3. Data were obtained on German participants, which may limit their usefulness for clinical interpretation in the United States.
RESULTS OF THE META-ANALYSES OF THE GROOVED PEGBOARD TEST DATA (See Appendix 23m)
Data collected from the studies reviewed in this chapter were combined in regression analyses, to describe the relationship between age and test performance and to predict expected test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult age groups for both genders combined are provided in Appendix 23m. After initial data editing for consistency and for outlying scores, six studies, which generated 15 data points based on a total of 2,382 participants, were included for each hand.
It should be pointed out that the integrity of the results is undermined by the lack of consistency in data reporting. A majority of studies report data for the "dominant hand" and "nondominant hand," while some report for the "right hand" and "left hand." Some of the latter studies include left hand-dominant participants. Though the percentage of left-handers is typically small (5%-7.5%), their inclusion confounds the outcome. Also, determination of the dominant hand was based on a wide range of criteria, ranging from comprehensive questionnaires to self-report of the writing hand. Linear regressions of GPT scores on age yielded R2 of 0.936 and 0.907 for the dominant and nondominant hands, respectively. Based on the derived models, we estimated GPT scores for age intervals between 20 and 64 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. Regressions of SDs on age yielded R2 of 0.489 and 0.468 for the two hands, indicating increase in variability with advancing age, consistent with the literature. Predicted SDs, based on these models, are reported. The effect of education on the test scores was analyzed on the data set described above and on a separate data set that included data broken down by education groups rather than by age groups. Whereas the former analysis indicated no effect of education on the test scores, the latter analysis revealed a significant effect of education on test performance. The t value for education was -2.71 (p = 0.030) for the dominant hand and -3.96 (p = 0.005) for the nondominant hand. Because of the inconsistency in findings from the two data sets and the marginally significant effect for the dominant hand, education-correction tables were not reported.
Strengths of the analyses 1. Total sample size is 2,382 for both hands. 2. R2 of 0.936 for the dominant hand and of 0.907 for the nondominant hand indicate a good model fit.
3. Postestimation tests for parameter specifications did not indicate problems with homoscedasticity. 4. The difference in the mean predicted time for the dominant vs. nondominant hands is 6.23 seconds (66.63 vs. 72.86), which is consistent with the guideline of a 10% preferred-hand superiority. Limitations of the analyses 1. Postestimation tests for parameter specifications indicated positive skew in the test scores. This non-normality of the score distribution does not affect the estimates of regression coefficients and accuracy of prediction but does influence the results of significance tests. 2. Educational level of the aggregate sample is approximately 13.5 years. Effect of education on GPT performance has been reported in several studies and was also evident in our analysis on the additional data set. Time to completion is expected to increase as a function of decrease in educational level. Therefore, predicted values are likely to underestimate expected time to completion for individuals with lower educational levels. 3. Possible gender differences in time to completion and intermanual differences in favor of females have been reported in the literature. However, we were unable to examine these differences because the data available for review were not stratified by gender.
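The comparison of mean predicted times with the 10% preferred-hand guideline can be checked with simple arithmetic. The sketch below uses only the two means quoted above (66.63 and 72.86 seconds). The function name is hypothetical, and because the text does not say which hand serves as the reference for the percentage, both versions are computed.

```python
def hand_difference_summary(dominant_mean, nondominant_mean):
    """Return the raw difference and the difference as a percentage of each hand."""
    difference = nondominant_mean - dominant_mean
    return {
        "difference_seconds": difference,
        "percent_of_dominant": 100.0 * difference / dominant_mean,
        "percent_of_nondominant": 100.0 * difference / nondominant_mean,
    }


if __name__ == "__main__":
    summary = hand_difference_summary(66.63, 72.86)
    for name, value in summary.items():
        print(name, round(value, 2))
```

Under either reference the difference is somewhat below 10% (roughly 9.4% of the dominant hand time and 8.6% of the nondominant hand time), which is close to the guideline.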
CONCLUSIONS Despite the wide variability in administration procedures for the GPT, there is high consistency in the data across different studies. Decline in performance is clearly associated with advancing age. The effects of education and, specifically, gender are more equivocal. Peripheral (orthopedic and muscular) problems impacting performance on the test must also be considered. Because GPT performance is affected by several factors, its interpretation as an indicator of cortical dysfunction should be made with great caution. Following the recommendations of Bornstein (1986c), diagnostic accuracy rests on the consistency of findings across different tasks and different functional domains. Given the unacceptably high false-positive rates with the original cutoffs for impairment, further research should be directed at the formulation of revised cutoff scores to improve specificity. Despite the large body of empirical studies exploring the psychometric properties of the GPT, some aspects of performance are not sufficiently addressed. For example, normative data for older age groups are scarce. Since interpretation of the results is based on assumptions of cerebral lateralization, it is of utmost importance to report the criteria for the assessment of handedness, cutoff scores for subject selection on the basis of handedness pattern, and number of left-handed individuals in the sample (if they are included).
VII CONCEPT FORMATION AND REASONING
24 Category Test
BRIEF HISTORY OF THE TEST The Category Test was developed by Halstead (1947) to assess the ability to "abstract" categorization parameters such as size, shape, number, position, brightness, and color. The original test apparatus consisted of a boxed screen placed in front of the subject, on which were presented visual stimuli in groups of four; the task was to identify which of the four stimuli differed from the other designs by pressing one of four corresponding keys located on a pad below the screen. Feedback was provided in the form of a "chime" for correct responses and a "raspberry buzzer" for incorrect answers. Halstead's (1947) original test included 336 items organized into nine subtests, while the version employed by Reitan and Wolfson (1985) was reduced to 208 stimuli in seven subgroups. The Halstead version is no longer in use, and the data presented in this chapter refer to the Reitan edition. The Category Test involves several different abilities, including attention and concentration, learning and memory, and visuospatial skills, as well as concept formation, abstraction of similarities and differences among stimuli, and modification of problem-solving hypotheses in responses to feedback. The majority of recent studies on the construct validity of the Category Test describe it as a measure
of different aspects of reasoning (Johnstone et al., 1997; Kelly et al., 1992; Perrine, 1993; Shute & Huertas, 1990). However, Leonberger et al. (1991) emphasize visual concentration and visual memory as important determinants of Category Test performance, while Boyle (1988) views this test as a measure of intelligence. Golden and colleagues (1981a) suggest that the Category Test is sensitive to prefrontal lobe disturbance as well as diffuse dysfunction. Adams and colleagues (1993, 1995) explored the association between Category Test performance and frontal tissue glucose metabolism rates. The authors attribute the test's sensitivity to reasoning, concept formation, and abstraction to involvement of three frontal subdivisions in information processing: the cingulate gyrus and the dorsolateral and orbitomedial aspects of the frontal lobes (Adams et al., 1995). Other authors, however, do not relate Category Test performance to any specific brain area (Bornstein, 1986c; Choca et al., 1997). Lowered Category Test performance has been observed in moderate brain injury (Rimel et al., 1982), multiple sclerosis (but not chronic fatigue syndrome; Krupp et al., 1994), and psychiatrically hospitalized male prisoners (Young & Justice, 1998). In addition, schizophrenic patients show decreased performance (Goldstein & Zubin, 1990; Gottschalk & Selin,
1991), with nondelusional patients showing higher scores than delusional patients (Steindl & Boyle, 1995). Category Test performance is also depressed in alcoholics (Loberg, 1980; O'Leary et al., 1977), and scores significantly correlate with MMPI clinical scales in this population, suggesting a relationship between emotional distress and executive function in alcoholism (Johnson-Greene et al., 2002). Of interest, the presence of antisocial personality disorder in alcoholics is associated with higher Category Test performance in men compared to women (Hesselbrock et al., 1985). Category Test scores have also been significantly correlated with ventricular size in depressed patients (Kellner et al., 1986), and they are lower in older compared to younger bipolar patients (Savard et al., 1980). Finally, Category Test performance is suppressed in diabetes (Skenazy & Bigler, 1984) and hypertension (Pentz et al., 1979). Several recent studies have indicated that the Category Test may show particular promise for the detection of malingered cognitive symptoms (DiCarlo et al., 2000; Sweet & King, 2002; Tenhula & Sweet, 1996). Of some concern, a low relationship between Category Test scores and activities of daily living (as measured by the Scale of Competence in Independent Living Skills) in a sample of geriatric patients was reported by Searight et al. (1989). The authors report correlation coefficients between Category Test scores and measures of 16 activities of daily living ranging between -0.03 and 0.37. However, Little and colleagues (1996) indicate that neuropsychological tests, including the Category Test, were more predictive of outcome in their sample of brain-injured patients than were intelligence tests.
The Category Test has been only modestly correlated with the Wisconsin Card Sorting Test (Macinnes et al., 1983a), another measure of problem-solving/abstraction; and in factor analyses, the Category Test has been shown to load with general intellectual and memory measures (Boyle, 1988) and, among WAIS subtests, to primarily correlate with Block Design, Digit Symbol, and Similarities (Stone et al., 1988). In a study in which response time was measured, 15% of the
variability in Category Test error scores was accounted for by spatial memory, while 15% was also explained by response time (Rattan et al., 1986). Interestingly, administering the Category Test prior to the Wisconsin Card Sorting Test results in higher scores on the latter; however, administration of the Wisconsin Card Sorting Test first is associated with poorer scores on the Category Test (Franzen et al., 1993).
Specific test administration instructions are provided in Reitan's (1979) Manual for Administration of Neuropsychological Test Batteries for Adults and Children and in Reitan and Wolfson's (1985) The Halstead-Reitan Neuropsychological Test Battery. Snow (1987) cites concerns regarding lack of standardization of test administration. He points out that the test manual dictates that the examiner assist the subject, but the nature and extent of the help to be offered is not specified. He quotes the following passage from the manual: "it may become necessary to urge [patients] to study the pictures carefully, to ask for . . . description . . . to urge them to try to notice and remember how the pictures change . . . and to try to think of the possible reason when a correct answer occurs" (pp. 26-27).
Halstead (1947) comments that although several methods of scoring were considered, he settled on a single score: total number of errors. He recommended a cutoff of >80 errors as the criterion score in computing his impairment index, while Reitan and Wolfson (1985) used a criterion score of 50 for their shortened test. Given the significant association between Category Test scores and demographic factors and IQ, a single cutoff would not appear to be appropriate, particularly in older participants. For example, Ernst (1987) documented in his sample (65-75 years old) a misclassification rate of 84% on a booklet version of the Category Test. Dodrill (1987) documented a 22.5% misclassification rate in a young control sample. Logue and Allen (1971) recommend that "ultimate interpretation of the significance of critical data on the Category Test rests often not only on the standard cutoff of 50 but also on a subjective evaluation by the psychologist as to
whether the score is to be reasonably expected by a normal subject of this general level of intelligence" (p. 1095). Similarly, Bornstein and colleagues (1987a) emphasize that cutoff scores may be useful but only if considered in the context of other neuropsychological information obtained in a test battery and if age, education, and other appropriate adjustments are made. An item analysis reported by Laatsch and Choca (1991) revealed an uneven progression of item difficulty in successive subtests. In addition, all items on subtests I and II were found to be too easy to yield useful information (see Choca et al., 1997). Charter (1994) examined the frequency of random responding on the Category Test using the formula approximating binomial distribution for a large sample, based on observed scores and probability of guessing. Frequencies of random responses for the 90%, 95%, and 99% confidence intervals presented in this article are summarized in Table 24.1. Charter's review of different studies reported in the literature suggested that no normative participants' scores fall in the random range.
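The logic of a random-range calculation of this kind can be sketched with a normal approximation to the binomial distribution. The Python fragment below is an illustrative sketch only and is not Charter's (1994) published procedure: it assumes that each item offers four response alternatives, so that a pure guess is an error with probability 0.75, and it applies no continuity correction or rounding rule. The function name and its parameters are hypothetical. Because of these assumptions, its output should not be expected to reproduce published values exactly.

```python
import math
from statistics import NormalDist


def random_error_range(n_items, p_error=0.75, confidence=0.95):
    """Range of error counts expected under pure guessing.

    Uses the normal approximation to the binomial distribution.
    p_error = 0.75 is an assumption (four response alternatives).
    """
    mean = n_items * p_error
    sd = math.sqrt(n_items * p_error * (1 - p_error))
    # Two-sided critical value for the requested confidence level
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sd, mean + z * sd


# 208 items is the length of the full test as given in the text
for level in (0.90, 0.95, 0.99):
    low, high = random_error_range(208, confidence=level)
    print(f"{level:.0%}: {low:.1f} to {high:.1f} errors")
```

An observed error total falling inside such a range is compatible with guessing on every item, which is why totals in that region warrant scrutiny before interpretation.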
Alternate Formats
Several authors have observed that a major negative aspect of the Category Test for clinical use is the extensive amount of time which may be required for administration (Golden et al., 1981a). Finlayson and colleagues (1986) note that a quarter of their sample of severely brain-injured patients completed the test in 29 minutes or less; however, a third required 30-39 minutes, another third completed the test in 40-59 minutes, and nearly 10%
Table 24.1. Frequencies of Random Responses for 90%, 95%, and 99% Confidence Intervals, According to Charter (1994)

                       90%           95%           99%
                    Low   High    Low   High    Low   High
Full test           146   167     145   169     141   173
Subtests II, VII     13    19      12    19      12    21
Subtests III-VI      26    35      26    36      24    38

required over 1 hour. Some patients may complete the task in 2 hours or more, and the associated fatigue may result in random responding. Rest periods alleviate fatigue, but these breaks may compromise performance on subtest VII because it involves recall of earlier strategies.
To address the problem of lengthy administration, at least 10 shortened versions of the Category Test have been developed. Some formats have involved administration of three or four of the seven subtests (Calsyn et al., 1980; Moehle et al., 1988), while other formats have used selected items from five or six subtests (Boyle, 1975, 1986; Gregory et al., 1979; LaBreche, 1983; Russell & Levy, 1987) or have split the test in half using even vs. odd items (Kilpatrick, 1970). Kilpatrick documented high correlations between number of errors on odd items, even items, and all items (r = 0.90-0.99). Calsyn et al. (1980; cf. Dunn et al., 1985; Golden et al., 1981a; Taylor et al., 1984) found that scores based on administration of the first four subtests had high correlations with the total score (r = 0.83-0.89) and accounted for 77%-79% of total score variance. They suggested that full test scores can be approximated by multiplying the shortened version score by 1.4 and adding 14. Inclusion of information regarding age, education, and gender has not increased the predictive accuracy of this short form (Pierce et al., 1989). Taylor and colleagues (1984) corroborated a high correlation between the Calsyn short version and the complete test format (r = 0.91), but they noted that use of the short form resulted in a substantially higher misclassification of normals as brain-damaged and that participants with right-sided focal lesions tended to be misidentified as normal. Dunn et al. (1985) suggested that the following equation may lead to more accurate estimates of full Category Test scores in a geriatric population: (short form score x 1.6) + 22.
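The two prorating equations described above are simple linear transformations of the short-form error score. As a minimal sketch, they can be written as follows; the coefficients are those given in the text (Calsyn et al., 1980; Dunn et al., 1985), while the function names and the example score of 30 errors are hypothetical and chosen only for illustration.

```python
def calsyn_full_score_estimate(short_form_errors):
    """Estimated full-test errors: (short-form score x 1.4) + 14."""
    return short_form_errors * 1.4 + 14


def dunn_geriatric_full_score_estimate(short_form_errors):
    """Estimated full-test errors for geriatric samples: (short-form score x 1.6) + 22."""
    return short_form_errors * 1.6 + 22


short_form_errors = 30  # hypothetical short-form score
print(round(calsyn_full_score_estimate(short_form_errors), 1))
print(round(dunn_geriatric_full_score_estimate(short_form_errors), 1))
```

Both functions return estimates only; as noted in the text, prorated scores do not remove the need for norms collected with the short form itself.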
Moehle and colleagues (1988), using multiple regression analyses, reported that a short form composed of subtests IV, VI, and III accounted for the highest percentage of long-form score variance (77%) and is the "psychometrically soundest short form." Their analyses indicated that the Calsyn short form
accounted for only 62% of full-form score variance. Moehle et al. (1988) recommended a cutoff of 26 (which represents the score 1 standard deviation [SD] below the mean for their brain-damaged sample) for use with their short form. They suggested some caution in the use of their version because all their participants on whom the analyses were based were administered the entire Category Test. The authors questioned whether there are unknown order effects or if some subtests (i.e., I and II) have an influence on subsequent subtest performance.
Other investigators have retained at least five or six of the seven subtests from the Reitan version of the Category Test but shortened some of the subtests (Boyle, 1975, 1986; Gregory et al., 1979; Russell & Levy, 1987). Taylor et al. (1990) argue that "because the Category Test is designed to be a test of abstract reasoning and requires the subject to make conceptual shifts among several principles, the same number of principles and the same number of conceptual shifts should be required on a shortened version of the Category Test" (p. 486).
Boyle (1975, 1986) created two parallel 84-item forms of the Category Test, involving half the items from subtests I-IV and VII and 20 items from subtests V and VI. The 84-item version discriminated between a normal and neurological population, and using cutoffs of 38 and 39 errors, only 6%-13% of the brain-damaged participants were misclassified, while 20%-22% of the non-brain-damaged participants were misidentified.
Gregory et al. (1979) developed a 120-item test version employing all subtest I items, the first 16 items from subtest II, and first 32 items from subtests III-VI. While Gregory and colleagues suggest that a cutoff score of 35 best corresponded to the full test cutoff of 51 in their brain-damaged participants vs. college students, Sherrill (1985) found that a cutoff of 29 was a better predictor of the long-form cutoff in his heterogeneous, neuropsychiatric population.
Russell and Levy (1987) devised a 95-item Category Test format composed of five items from subtest I, 10 items from subtest II, and 20 items from subtests III-VI. Items from
subtests V and VI were reorganized so that subtest V included only pure quantitative items and subtest VI consisted of complex counting items. A full Category Test score was calculated by multiplying the short-version score by 2.2. A correlation of 0.97 was obtained between the abbreviated and full test scores in a neurological population. Mean Cronbach's α was 0.71, suggesting moderate item homogeneity across subtests (Boyle et al., 1994).
LaBreche (1983) deleted memory items (subtest VII), discarded the formal scoring of subtests I and II (because they simply orient the patient to the task), and took out redundant items, resulting in 81 items, which he called the "Victoria Revision." Correlation with the full form was 0.96, and the revised test produced a classification rate of 84%, which was comparable to the 83% obtained for the full version. Sherrill (1985) concluded that "having the smallest number of scored items of any of the attractive short forms, the Victoria Revision probably offers the best combination of relatively short administration time and relatively good predictive accuracy for routine clinical use" (p. 350). Kozel and Meyers (1998) reported that short-form and full-form scores obtained on matched groups of head-injured and dementia patients did not significantly differ.
Finally, Wetzel and Boll (1987) published the Short Category Test, which contains 100 Category Test items (the first 20 items from subtests II-VI). Subtest I was eliminated because the first 12 items in subtest II were judged to be adequate to introduce patients to the task. Subtest VII was deleted because it repeated items from previous subtests and was thought to tap memory more than abstraction ability.
This shortened form was administered to a clinical sample, and the same pattern of correlations as observed between the full Category Test and various neuropsychological measures was found for the Short Category Test, with the exception that the Short Category Test was more strongly associated with dominant finger tapping speed. Normative data were derived from 120 control participants, who averaged 15 years of education. Correlations between the Short Category Test and full test ranged from 0.80 to 0.93,
depending on which test was administered first. The classification accuracy rate of 83% matches that found for the full version of the test. Gontkovsky and Souheaver (2002) compared performance on the Short Category Test and Booklet Category Test in a sample of neuropsychology clinic referrals and found that T scores were comparable between the two, although they observed that the short-form cutoff of 46 for older participants needed to be lowered to 41 to match the sensitivity values of the full test version.
Sherrill (1985) found that the Gregory et al. (1979) 120-item version and the Calsyn et al. (1980) 108-item version were highly correlated with each other (0.968) and that, while both were highly correlated with the standard format (0.943-0.981), the 120-item version had the highest correlation with the full test and the smallest standard error of estimate (+7.5), suggesting that it was overall the best predictor of the full test score. Sherrill (1985) suggested that the 120-item version is the most attractive short-form alternative to the 208-item test, particularly for high-functioning participants, since it includes subtest V; a wide variation in scores occurs on this subtest in this population.
Taylor et al. (1990) compared the Gregory et al. (1979), Calsyn et al. (1980), and Russell and Levy (1987) short forms and reported that the Russell and Levy version and the Gregory et al. version were better predictors of total test scores than the Calsyn et al. version and that while the Russell and Levy version had only 95 items, it performed comparably to the 120-item Gregory et al. format. Taylor and colleagues (1990) recommend the following equation for the Russell and Levy format in calculating the predicted total score: (Number of short-form errors x 2.73) - 4.49. These authors suggest that the Russell and Levy formula of short-form errors x 2.2 tends to overestimate total test errors.
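For a given Russell and Levy short-form score, the two prediction rules mentioned above can be placed side by side. The sketch below uses only the coefficients quoted in the text (multiplication by 2.2, and the Taylor et al. [1990] equation); the function names and the example error counts are hypothetical.

```python
def russell_levy_estimate(short_form_errors):
    """Original prorating rule: short-form errors x 2.2."""
    return 2.2 * short_form_errors


def taylor_estimate(short_form_errors):
    """Taylor et al. (1990) equation: (short-form errors x 2.73) - 4.49."""
    return 2.73 * short_form_errors - 4.49


# Hypothetical short-form error counts, for illustration only
for errors in (5, 20, 40):
    print(
        errors,
        round(russell_levy_estimate(errors), 2),
        round(taylor_estimate(errors), 2),
    )
```

Printing both predictions for the same raw score makes the size of the discrepancy at any particular score level easy to inspect.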
While efforts to shorten the Category Test are commendable, a major concern involves the fact that the majority of the shortened versions have been based on analyses derived from administration of the entire Category Test. Few studies have actually administered a shortened version, and it is unknown if some
subtests (i.e., I and II) have an influence on subsequent subtest performance or if unique order effects emerge if subtests are given out of order (Moehle et al., 1988). Moehle and colleagues (1988) rightly express caution in the use of their shortened version for these reasons. Snow (1987) reminds that any shortening of a test necessitates compilation of new normative data: for example, the process of shortening the Category Test may make it less fatiguing, and hence less demanding. Short forms may therefore be less able to discriminate patients with subtle brain dysfunction. Further, when short forms are developed, it is often the case that little work is done in validating the findings obtained with the newer versions of the test. Instead, the old cutting scores continued to be used, with short-form scores merely being prorated to their equivalent lengthier versions. Clearly, when a test is shortened, new norms will be required. (p. 258)
In addition to the length of time required for administration, the lack of portability and cumbersome nature of the Category Test apparatus have been drawbacks (Slay, 1984; Wood & Strider, 1980). Slay (1984) provides instructions for constructing a portable Category Test for the clinician "with a modicum of workshop skills." Several investigators have developed paper-and-pen, card, or booklet forms of the Category Test (Adams & Trenton, 1981; DeFilippis et al., 1979; Kimura, 1981; Wood & Strider, 1980). Adams and Trenton (1981) devised a laminated booklet test form, which provides visual feedback as to the correctness of a response, although the practicality of this method is somewhat in doubt: "the answer sheet was treated by touching each correct answer with a swab containing dimenthylglyoxine. The subject was given a felt tip pen that was treated chemically with nickel chloride in aqueous ammonia. If the subject responded to an item correctly, the circle on the answer sheet would immediately turn red. An incorrect response resulted in a green circle" (p. 299). The high modified split-half Spearman-Brown coefficient documented between the slide-format Category Test and the paper-and-pen version did not
significantly differ from the coefficient obtained for the slide version, and the two test forms were judged to be equivalent. Wood and Strider (1980) report a similar shortened version of the Category Test using a latent image transfer sheet. When a developer pen is applied to the correct rectangle on the answer sheet, the rectangle darkens, providing feedback as to the correctness of the response. No significant difference in performance across psychiatric groups was found between this version and the original Reitan format. Kimura (1981) also developed a card version of the Category Test, in which the patient provides verbal responses to which the examiner responds "right" or "wrong." No significant differences in test performance were noted between groups of neurological patients. Finally, DeFilippis et al. (1979) created a booklet form in which the Category Test slides were reproduced on 8.5" x 11" sheets, which were then placed into notebooks. A piece of cardboard with the numbers 1-4 was placed below the notebook, and participants were instructed to point to the appropriate number for each sheet. The examiner provided feedback as to the correctness of a response by saying "correct" or "incorrect." The booklet format was highly correlated with the original slide version, and no effect of test version was documented in patients and normals administered both formats. Macinnes et al. (1983b) report data validating the Calsyn et al. (1980) short form in conjunction with the DeFilippis booklet format. The DeFilippis et al. (1979) version in particular appears to be in much wider usage than the original slide projector format. A comprehensive summary of the history and current perspectives on the Category Test is offered by Choca et al. (1997).
RELATIONSHIP BETWEEN CATEGORY TEST PERFORMANCE AND DEMOGRAPHIC FACTORS
Corrigan et al. (1987) summarize much of the available literature on the relationship of IQ, education, and age to Category Test scores;
and the interested reader is referred to this publication. A highly consistent relationship has been documented between age and Category Test scores in both normal and patient (brain-damaged, psychiatric, medical) samples (Alekoumbides et al., 1987; Anthony et al., 1980; Bigler et al., 1981a; Boyle, 1986; Boyle et al., 1994; Choca et al., 1997; Corrigan et al., 1987; Elias et al., 1990, 1993; Ernst et al., 1987; Fitzhugh et al., 1964; Goldstein & Zubin, 1990; Heaton et al., 1986, 1991; Mack & Carlson, 1978; Moses et al., 1999; Prigatano & Parsons, 1976; Query, 1979; Reed & Reitan, 1963b; Seidenberg et al., 1984; Vega & Parsons, 1967). However, Willis et al. (1988) did not find a relationship between age and performance on the original version of the Category Test in a sample of 154 healthy elderly individuals, but the age range of their participants was quite narrow (65-79 years).
A negative correlation frequently has been noted between education and Category Test scores, particularly in normal compared to patient groups. Several investigators have reported associations between education and Category Test performance in normal individuals (Anthony et al., 1980; Boyle, 1986; Choca et al., 1997; Ernst, 1987; Finlayson et al., 1977; Heaton et al., 1986, 1991; Prigatano & Parsons, 1976; Vega & Parsons, 1967; Warner et al., 1987; Yeudall et al., 1987). However, the relationship between education and Category Test performance has been equivocal in patient groups, with some authors documenting a significant correlation (Alekoumbides et al., 1987; Boyle, 1986; Lin & Rennick, 1974; Seidenberg et al., 1984) and others failing to detect an association (Corrigan et al., 1987; Finlayson et al., 1977; Prigatano & Parsons, 1976; Vega & Parsons, 1967). Seidenberg et al. (1984), using a multivariate approach, reported that education was a more influential variable in performance than age; however, the age range of their participants was very attenuated: 15-52 years.
A consistent negative relationship has been observed between Category Test scores and IQ in both normal and patient groups, although reports have differed as to whether VIQ or
PIQ is more tied to performance. For example, several studies have indicated that scores are more associated with PIQ (Corrigan et al., 1987; Cullum et al., 1984; Goldstein & Shelly, 1972; Lansdell and Donnelly, 1977), while other publications have documented a stronger association with VIQ or WAIS verbal factor scores (Shore et al., 1971; Yeudall et al., 1987). Corrigan et al. (1987) point out that the correlations with PIQ have been relatively consistent across studies, suggesting a reliable relationship; however, the correlation with VIQ has varied widely and inexplicably across studies. In general, no significant gender differences have been noted in Category Test performance (Dodrill, 1979; Elias et al., 1990, 1993; Fromm-Auch & Yeudall, 1983; Heaton et al., 1986, 1991; Kupke, 1983; Pauker, 1980; Seidenberg et al., 1984; Yeudall et al., 1987), although Ernst (1987) reported that in his elderly sample, men performed slightly better than women on the booklet version of the test. Handedness (Gregory & Paul, 1980; Seidenberg et al., 1984) and socioeconomic status (Seidenberg et al., 1984) do not appear to be related to Category Test scores. However, health status is moderately related, as reported by Willis et al. (1988). Arnold et al. (1994) documented a significant effect of acculturation on performance for the original version of the Category Test in a sample of Mexican Americans, with more acculturated individuals demonstrating higher performance. Similarly, Manly et al. (1998) found that performance of 30% of healthy African Americans was in the impaired range using the norms of Heaton (1992) and Heaton et al. (1991), although significant differences between African Americans and Caucasian (HIV+) individuals disappeared when level of acculturation (including use of black English) was considered.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
Our review of the literature located seven Category Test normative reports for adults
published since 1965 (Dodrill, 1987; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Heaton et al., 1991; Pauker, 1980; Yeudall et al., 1987), as well as the original Halstead (1947) and Reitan (1955b, 1959) normative data and three interpretive guides (Logue & Allen, 1971; Reitan & Wolfson, 1985; Russell et al., 1970). Hundreds of other studies have also reported control subject data, and we will discuss several of those investigations which involved some unique feature, such as large sample size (>100), retest data, elderly population, or non-English-speaking sample (Alekoumbides et al., 1987; Anthony et al., 1980; Barrett et al., 2001; Bornstein et al., 1987a; Elias et al., 1990, 1993; El-Sheikh et al., 1987; Heaton et al., 1986; Klove & Lochen, cited in Klove, 1974; Mack & Carlson, 1978; Russell, 1987; Wiens & Matarazzo, 1977).
Russell and Starkey (1993) developed the Halstead-Russell Neuropsychological Evaluation System (HRNES), which includes the Category Test among 22 tests. In this system and its revised version (HRNES-R), individual performance is compared to that of 576 brain-damaged participants and 200 participants who were initially suspected of having brain damage but had negative neurological findings. Data were partitioned into seven age groups and three educational/IQ levels. This study will not be reviewed in this chapter because the "normal" group consisted of V.A. patients who presented with symptoms requiring neuropsychological evaluation. For further discussion of the HRNES, see Lezak et al. (2004).
Of note, few relevant manuscripts have emerged since the 1980s, perhaps due either to the publication of Heaton et al.'s (1991) comprehensive normative tables or to the escalating use in research and clinical practice of flexible neuropsychological test protocols which include newer tasks rather than traditional fixed neuropsychological batteries.
To adequately evaluate the Category Test normative reports, six key criterion variables were deemed critical. The first five of these relate to subject variables, and the one remaining dimension refers to a procedural issue. Minimal requirements for meeting criterion variables were as follows.
Subject Variables
Sample Size
As discussed in previous chapters, a minimum of at least 50 participants per grouping interval is optimal.
Sample Composition Description
As discussed previously, information regarding medical and psychiatric exclusion criteria is important; it is unclear if geographic recruitment region, gender, socioeconomic status or occupation, ethnicity, handedness, and recruitment procedures are relevant. Until determined, it is best that this information be provided.
Age Group Intervals
Given the association between age and Category Test performance, information regarding the age of the normative sample is critical, and normative data should be presented by age intervals.
Reporting of IQ Levels
Given the relationships between Category Test performance and IQ, data should be presented by IQ intervals, or at least information regarding intellectual level should be provided. In addition, given some evidence that PIQ may be more related to Category Test performance than VIQ, information on PIQ and VIQ separate from FSIQ is desirable.
Reporting of Educational Levels
Given the possible, although minor, association between educational level and Category Test scores, it is preferable that information regarding highest educational level completed be reported.
Procedural Variables
Data Reporting
Means, standard deviations, and preferably ranges for total Category Test errors are required.
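The six criterion variables listed above amount to a checklist that can be applied to each normative report in turn. The sketch below shows one way such a checklist might be recorded; it is an illustrative assumption rather than a procedure used in this chapter. The class, field, and function names are hypothetical, the example report is invented, and the only value taken from the text is the minimum of 50 participants per grouping interval.

```python
from dataclasses import dataclass


@dataclass
class NormativeReport:
    """Hypothetical record of what a normative report provides."""
    smallest_group_n: int               # participants in the smallest grouping interval
    exclusion_criteria_described: bool  # medical and psychiatric exclusion criteria
    data_by_age_intervals: bool
    iq_reported: bool
    education_reported: bool
    means_and_sds_reported: bool


def criteria_met(report):
    """Map each of the six criterion variables to True (met) or False."""
    return {
        "sample size": report.smallest_group_n >= 50,
        "sample composition description": report.exclusion_criteria_described,
        "age group intervals": report.data_by_age_intervals,
        "reporting of IQ levels": report.iq_reported,
        "reporting of educational levels": report.education_reported,
        "data reporting": report.means_and_sds_reported,
    }


# Invented example report, for illustration only
example = NormativeReport(
    smallest_group_n=35,
    exclusion_criteria_described=True,
    data_by_age_intervals=True,
    iq_reported=False,
    education_reported=True,
    means_and_sds_reported=True,
)
for name, met in criteria_met(example).items():
    print(f"{name}: {'met' if met else 'not met'}")
```

A tabulation of this kind only records whether information is present; judging its adequacy still requires reading each report.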
SUMMARY OF THE STATUS OF THE NORMS
All but eight data sets had total sample sizes >100 (Bornstein et al., 1987a; El-Sheikh et al., 1987; Halstead, 1947; Klove & Lochen, cited in Klove, 1974; Mack & Carlson, 1978; Reitan, 1955b, 1959; Wiens & Matarazzo, 1977). Only three publications consistently had at least 50 participants in individual subject groupings (Elias et al., 1993; Ernst, 1987; Heaton et al., 1986), although some reports had some subgroups which met this criterion (Dodrill, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Pauker, 1980; Yeudall et al., 1987). Eleven of the studies summarized in this chapter present Category Test data according to circumscribed age ranges (Elias et al., 1990, 1993; Ernst, 1987; Fromm-Auch & Yeudall, 1983; Harley et al., 1980; Heaton et al., 1986, 1991; Mack & Carlson, 1978; Pauker, 1980; Wiens & Matarazzo, 1977; Yeudall et al., 1987). Information on IQ levels is reported in all but six studies (Barrett et al., 2001; Elias et al., 1993; El-Sheikh et al., 1987; Ernst, 1987; Heaton et al., 1986; Russell, 1985), and one report presented Category Test data in age-by-IQ groupings (Pauker, 1980). Similarly, educational level was also indicated in all but two studies (Bornstein et al., 1987a; Pauker, 1980), and Heaton et al. (1986, 1991) organized data by educational levels. Information on the gender composition of the samples was available in all but three reports (Anthony et al., 1980; Harley et al., 1980; Klove & Lochen, cited in Klove, 1974); four data sets included only male (Barrett et al., 2001; Wiens & Matarazzo, 1977) or nearly all-male (Alekoumbides et al., 1987; Russell, 1987) populations, and one data set was composed primarily of females (Mack & Carlson, 1978). Ernst (1987) and Heaton et al. (1991) presented data separately for males and females.
Information on other subject variables was provided less frequently; data on handedness were indicated in three studies (Dodrill, 1987; Fromm-Auch & Yeudall, 1983; Yeudall et al., 1987); occupation or socioeconomic status was described in seven reports (Alekoumbides et al., 1987; Barrett et al., 2001; Dodrill, 1987; Elias et al., 1990; Halstead, 1947; Wiens & Matarazzo, 1977; Yeudall et al., 1987); and information regarding ethnicity was presented in four data sets (Alekoumbides et al., 1987; Barrett et al., 2001; Dodrill, 1987; Russell, 1987). Exclusion criteria were judged to be adequate in only
10 publications (Anthony et al., 1980; Bornstein et al., 1987a; Dodrill, 1987; Elias et al., 1990, 1993; Fromm-Auch & Yeudall, 1983; Heaton et al., 1991; Pauker, 1980; Wiens & Matarazzo, 1977; Yeudall et al., 1987). Geographic recruitment areas were specified in all but six publications (Barrett et al., 2001; Bornstein et al., 1987a; Dodrill, 1987; Elias et al., 1990, 1993; Mack & Carlson, 1978). Twelve data sets were obtained in the United States (Alekoumbides et al., 1987; Anthony et al., 1980; Barrett et al., 2001; Halstead, 1947; Harley et al., 1980; Heaton et al., 1986, 1991; Klove & Lochen, cited in Klove, 1974; Reitan, 1955b, 1959; Russell, 1987; Wiens & Matarazzo, 1977), three in Canada (Fromm-Auch & Yeudall, 1983; Pauker, 1980; Yeudall et al., 1987), one in Norway (Klove & Lochen, cited in Klove, 1974), one in Egypt (El-Sheikh et al., 1987), and one in Australia (Ernst, 1987).
Total mean number of errors was reported in all data sets, and SDs were indicated in all but four studies (Barrett et al., 2001; Halstead, 1947; Heaton et al., 1986; Klove & Lochen, cited in Klove, 1974). Ranges for number of errors were presented in four publications (Bornstein et al., 1987a; Fromm-Auch & Yeudall, 1983; Halstead, 1947; Harley et al., 1980), and means and SDs for individual subtest scores are provided in two publications (Ernst, 1987; Mack & Carlson, 1978). Some studies reported supplementary Category Test scores, such as IQ-equivalent scores (Dodrill, 1987), test-retest data (Bornstein et al., 1987; El-Sheikh et al., 1987), T-score equivalents for raw scores (Harley et al., 1980), and T-score equivalents corrected for age, education, and gender (Heaton et al., 1991).
The text of study descriptions contains references to the corresponding tables identified by number in Appendix 24. Table A24.1, the locator table, summarizes information provided in the studies described in this chapter.¹
SUMMARIES OF THE STUDIES
Given that the Category Test has typically been used within the context of the Halstead-Reitan Battery (HRB), the Halstead (1947) and Reitan (1955b, 1959; Reitan & Wolfson, 1985) data and interpretation recommendations will be reported first, followed by a summary of the other interpretation formats. Then, the normative publications and control groups from clinical comparison studies will be reviewed in ascending chronological order.
¹Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
Original Studies
[CT.1] Halstead, 1947 (Table A24.2)
The author obtained Category Test data on 28 control participants in Chicago, more than half of whom had psychiatric diagnoses. The 14 participants without psychiatric diagnoses were nine male and five female civilians aged 15-50 (mean = 25.9) without history of brain injury. The eight participants with diagnoses of mild psychoneurosis were male soldiers aged 22-38 (mean = 29.6); some had combat experience but none had a history of head injury. The last six participants were aged 27-39 and included a depressed military prisoner facing execution, a severely depressed female with suicidal and homicidal impulses tested prior to lobotomy, and a suicidal/homicidal female and a suicidal male tested pre- and post-lobotomy. Educational level ranged 7-18 years, and the following occupations were represented: artist, entertainer, farmer, housewife, semiskilled and unskilled laborers, professional, secretary, teacher, technician, trade, and student. Ethnic background included American, Balkan, English, French, German, Irish, Polish, Scandinavian, and Scottish. IQ level ranged 70-140. Mean errors are reported for the total group and each control subgroup, as well as individual scores for each subject. The Category Test criterion score used in calculating the Impairment Index was >80 errors.
Study strength
1. Information provided regarding IQ, education, occupation, ethnicity, geographic recruitment area, age, and gender.
Considerations regarding use of the study
1. Small sample size, including use of two participants twice.
2. Inclusion of participants with psychiatric diagnoses and postlobotomy.
3. No reporting of SDs.
4. Undifferentiated age range.
[CT.2] Reitan, 1955b (see also Reitan, 1959) (Table A24.3)
The author obtained Category Test scores on 50 participants in Indiana who had apparently been referred for neuropsychological testing and "who had received neurological examinations before testing and showed no signs or symptoms of cerebral damage or dysfunction . . . . None . . . had positive anamnestic findings" (p. 29); some were hospitalized with paraplegia or neurosis. The sample included 35 men and 15 women, mean age was 32.36 (10.78), and mean educational level was 11.58 (2.85). Mean WAIS VIQ, PIQ, and FSIQ were 110.82 (14.46), 112.18 (14.23), and 112.64 (14.28), respectively.
Study strengths 1. Information regarding IQ, educational level, gender, age, and geographic recruitment area. 2. Adequate sample size. 3. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. Insufficient medical and psychiatric exclusion criteria; sample included participants hospitalized with spinal cord injuries and psychiatric disorders. 3. High average IQ.
Interpretive Guides
In constructing their neuropsychological key approach, Russell, Neuringer, and Goldstein (1970) devised six rating equivalents of Category Test raw error scores based primarily on "rules of thumb" recommended by P. M. Rennick. Russell (1984) subsequently modified the ratings as reflected in Table 24.2. Logue and Allen (1971) published a predictor table plotting the expected number of Category Test errors for nine WAIS FSIQ values based on Reitan's 1959 Wechsler-Bellevue and Category Test scores on control participants (see Table 24.3).
Use of the table . . . allows a direct comparison of scores actually obtained from a given client on the WAIS and on the Category Test. Where the relationship is not at the predicted level, the examiner can have more confidence that the obtained category score is not an artifact of limited or superior intelligence. (p. 1096)
The authors caution that the expected Category Test scores at the highest IQ values are probably unrealistically low.
Table 24.2. Rating Equivalents of Category Test Raw Scores,* According to Russell et al. (1970)
Rating Equivalent    Errors
0                    [not legible]
1                    26-52
2                    53-78
3                    79-104
4                    105-130
5                    131-156
6                    [not legible]
*A score of 156 is considered random.
[CT.3] Reitan and Wolfson, 1985
The authors provide general guidelines for Category Test score interpretation in the form of "severity ranges": "perfectly normal (or better than average)," "normal," "mildly impaired," and "seriously impaired." They list the number of errors which correspond to each severity range:
0-25: perfectly normal (or better than average)
26-45: normal
46-65: mildly impaired
≥66: seriously impaired
No other information is provided, such as score means or SDs, or any data regarding the normative sample on which these guidelines were developed.
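The Reitan and Wolfson (1985) severity ranges listed above amount to a lookup on total errors. The following is a minimal Python sketch of that lookup; the function name `severity_range` is a hypothetical label chosen for illustration and does not come from the source.

```python
def severity_range(total_errors: int) -> str:
    """Map a Category Test total error score to a Reitan and Wolfson (1985) severity range.

    The boundaries (0-25, 26-45, 46-65, >=66) are the ones quoted in the text;
    the function name and error handling are illustrative assumptions.
    """
    if total_errors < 0:
        raise ValueError("total_errors must be non-negative")
    if total_errors <= 25:
        return "perfectly normal (or better than average)"
    if total_errors <= 45:
        return "normal"
    if total_errors <= 65:
        return "mildly impaired"
    return "seriously impaired"
```

Because the ranges carry no age, education, or IQ adjustment, a sketch like this only reproduces the published labels and should not be read as a diagnostic rule.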
Table 24.3. Predictor Table, According to Logue and Allen (1971)
WAIS FSIQ*    Predicted Category Test Score
140           10
130           15
120           21
110           26
100           32
90            37
80            43
70            48
60            54
*WAIS FSIQ, Wechsler Adult Intelligence Scale full-scale intelligence quotient.
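As an illustration of how the predictor table can be used, the sketch below stores the nine FSIQ/error pairs from Table 24.3 and returns a predicted Category Test score. Linear interpolation between tabled FSIQ values is an assumption added here for convenience and is not described in the source; the names `PREDICTOR_TABLE` and `predicted_errors` are likewise hypothetical.

```python
from bisect import bisect_left

# (WAIS FSIQ, predicted Category Test errors), transcribed from Table 24.3.
PREDICTOR_TABLE = [
    (60, 54), (70, 48), (80, 43), (90, 37), (100, 32),
    (110, 26), (120, 21), (130, 15), (140, 10),
]


def predicted_errors(fsiq: float) -> float:
    """Return the predicted error score for a WAIS FSIQ between 60 and 140.

    Tabled values are returned as-is; values in between are linearly
    interpolated (an assumption, not part of the published table).
    """
    iqs = [iq for iq, _ in PREDICTOR_TABLE]
    if not iqs[0] <= fsiq <= iqs[-1]:
        raise ValueError("FSIQ outside the tabled range of 60-140")
    i = bisect_left(iqs, fsiq)
    if iqs[i] == fsiq:
        return float(PREDICTOR_TABLE[i][1])
    (x0, y0), (x1, y1) = PREDICTOR_TABLE[i - 1], PREDICTOR_TABLE[i]
    return y0 + (y1 - y0) * (fsiq - x0) / (x1 - x0)


# Example: a client with FSIQ 100 who makes 60 errors scores
# 28 errors above the tabled expectation of 32.
discrepancy = 60 - predicted_errors(100)
```

The expected scores at the highest IQ values are flagged in the text as probably unrealistically low, so discrepancies computed near FSIQ 130-140 deserve particular caution.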
Considerations regarding use of the study
The authors argued that these norms were meant as "general guidelines" and that "exact percentile ranks corresponding with each possible score are hardly necessary because the other methods of inference are used to supplement normative data in clinical interpretation of results of individual participants" (p. 977). However, we maintain that more precise scores as well as separate normative data for different age, IQ, and educational levels are necessary to avoid false-positive errors in diagnosis. It is not clear how the cutoffs were developed; they do not match the cutoffs cited by Halstead (1947). The authors report that a cutoff of 50 was recommended by Halstead in computing the Impairment Index, but examination of Halstead's (1947) manuscript revealed that his cutoff was >80, not >50. It appears that Reitan derived his cutoff by computing the ratio of errors to total items for the Halstead administration and applying the same ratio to his 208-item version (i.e., 80/336 ≈ 50/208).
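The proportional rescaling suggested above can be checked with a few lines of arithmetic. This is only a worked illustration of the ratio stated in the text; the variable names are hypothetical, and whether Reitan actually computed the cutoff this way is the chapter authors' inference, not an established fact.

```python
# Item counts and Halstead's cutoff as given in the text.
HALSTEAD_ITEMS = 336
REITAN_ITEMS = 208
HALSTEAD_CUTOFF = 80

# Proportion of items failed at Halstead's cutoff.
error_ratio = HALSTEAD_CUTOFF / HALSTEAD_ITEMS

# The same proportion applied to the 208-item version.
scaled_cutoff = error_ratio * REITAN_ITEMS

print(round(error_ratio, 3))    # 0.238
print(round(scaled_cutoff, 1))  # 49.5
print(round(50 / 208, 3))       # 0.24
```

The two ratios agree to within about 0.002, so rounding the rescaled value gives the familiar cutoff of 50 errors.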
Normative Studies and Control Groups from Clinical Comparison Studies
[CT.4] Klove and Lochen (Cited in Klove, 1974) (Table A24.4)
The authors obtained Category Test data on 22 American controls from Wisconsin and 22 Norwegian controls as part of a validation study on the ability of the HRB to detect brain damage. Mean age, educational level, and IQ for the American participants were 31.6, 11.1, and 109.3, respectively; and mean age, educational level, and IQ for the Norwegian participants were 32.1, 12.2, and 111.9, respectively. Category Test scores are presented in terms of mean errors for each group.
Study strengths 1. This publication is unique in providing Category Test data on a Norwegian population. 2. Information regarding educational level, IQ, age, and geographic recruitment area is reported.
Considerations regarding use of the study 1. Small sample size. 2. Undifferentiated age ranges. 3. No SDs reported. 4. No exclusion criteria specified, and no information regarding gender distribution of the sample is provided.
[CT.5] Wiens and Matarazzo, 1977 (Table A24.5)
The authors collected Category Test data on 48 male applicants to a patrolman program in Portland, Oregon, as part of an investigation of the WAIS and MMPI correlates of the HRB. All participants passed a medical exam and were judged to be neurologically normal. Participants were divided into two equal groups, which were comparable in age (23.6 vs. 24.8), education (13.7 vs. 14.0), and WAIS FSIQ (117.5 vs. 118.3). A random subsample of 29 applicants was readministered the Category Test 14-24 weeks following the original administration. Means and SDs for total errors are reported for both the original testing and retest. One of the 29 participants obtained a score higher than Reitan's suggested cutoff of 50/51 errors. No significant correlations were observed between Category Test scores and FSIQ, VIQ, or PIQ in either control group (Wiens & Matarazzo, 1977).
Study strengths 1. Information on test-retest performance. 2. Relatively large sample size for the small age range. 3. Adequate medical exclusion criteria. 4. Information provided regarding educational level, IQ, gender, and geographic recruitment area. 5. Means and SDs are reported.
Considerations regarding use of the study 1. High IQ level. 2. High educational level. 3. SDs differ markedly between the two control groups, suggesting either unusual variability in scores for the first group or unusual lack of variability in the second group or an error in reporting the data. 4. All-male sample.
[CT.6] Mack and Carlson, 1978 (Table A24.6)
The authors obtained Category Test data on 41 old (range 60-80) and 40 young (range 20-37) participants as part of an investigation into the neuropsychological effects of aging. Older participants with histories of neurological impairment or "signs or symptoms of diseases with neurological significance (or which predispose participants to possible neurological disorder)" were excluded. No screening for neurological impairment was conducted in the younger sample, which was drawn from a university student body and hospital staff. The older sample included three men and 38 women; mean educational level was 14.05 (3.39) years, and mean WAIS FSIQ was 119.90 (15.14). The young sample consisted of 31 female and nine male participants, and mean years of education and mean IQ (based on Shipley scores in 17 participants) were 15.43 (2.65) years and 113.76 (4.89), respectively. The Category Test was computer-administered according to standard instructions, with the exception that after participants made a response by pressing a button, they could change their response. Once they were satisfied with a response, they pushed the "0" key, which was followed by feedback on the correctness of the response, and the slide projector advanced to the next trial. The authors concluded that this modification had little effect given that few participants corrected an initial choice.
Total mean errors and SDs are provided as well as mean errors and SDs for subtests III, IV, V, VI, and VII. The elderly participants performed significantly worse than young controls and comparably to a middle-aged brain-damaged sample. The elderly sample showed particular difficulty on subtests II and IV relative to the younger sample.
Study strengths 1. Data are presented in age groupings. 2. Adequate exclusion criteria in older participants. 3. Information regarding education, IQ, gender, and (in the younger participants) occupation. 4. Data for several individual subtests as well as total errors. 5. Means and SDs are reported.
Considerations regarding use of the study 1. No exclusion criteria for the younger participants. 2. IQ data not available for all participants and two IQ measures used. 3. Minor alterations in test administration format (computer-assisted). 4. Relatively small sample size and relatively large age range within each age grouping. 5. High educational and intellectual levels. 6. Samples are primarily female.
[CT.7] Anthony, Heaton, and Lehman, 1980 (Table A24.7)
The authors amassed Category Test data on 100 normal volunteers from Colorado as part of a cross-validation of two objective, computerized interpretive programs for the HRB. Participants had no history of medical or psychiatric problems, head trauma, brain disease, or substance abuse. In addition, for 85% of the controls, normal EEGs and neurological exams were obtained; in the remaining 15% of participants, it appeared that this information was not available. Mean age was 38.88 (15.80) years, and mean education was 13.33 (2.56) years. Mean WAIS FSIQ, VIQ, and PIQ were 113.54 (10.83), 113.24 (11.59), and 112.26 (10.88), respectively. Category Test data are presented in terms of mean number of errors and SD. Participants incorrectly identified as brain-damaged (according to the Russell et al., 1970, system) were older, less educated, and less intelligent than participants correctly classified as non-brain-damaged.
Study strengths 1. Large sample size. 2. Adequate exclusion criteria. 3. Information regarding education, IQ, age, and geographic recruitment area. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Large undifferentiated age grouping. 2. IQ range is high average. 3. No information regarding gender.
[CT.8] Harley, Leuthold, Matthews, and Bergs, 1980 (Table A24.8)
The authors collected Category Test data on 193 V.A.-hospitalized patients in Wisconsin aged 55-79. Exclusion criteria were FSIQ <80, active psychosis, unequivocal neurological disease or brain damage, and serious visual or auditory acuity problems. Patients with a diagnosis of chronic brain syndrome were included. Patient diagnoses were as follows: chronic brain syndrome unrelated to alcoholism, psychosis, and alcoholic, neurotic, or personality disorder. Mean educational level was 8.8 years. The sample was divided into five age groupings: 55-59 (n = 56), 60-64 (n = 45), 65-69 (n = 35), 70-74 (n = 37), and 75-79 (n = 20). Mean educational level and percent of sample included in each diagnostic classification are reported for each diagnostic classification and for each age grouping. The authors also provide test data on a subgroup of 160 participants equated for percent diagnosed with alcoholism across the five age groupings. The "alcohol-equated sample" was developed "to minimize the influence that cognitive or motor/sensory differences uniquely attributable to alcohol abuse might have upon group test performance levels" (p. 2). This subsample remained heterogeneous regarding representation of the other diagnostic categories. Mean errors and SDs are provided by age groupings for the total and alcohol-equated samples. In addition, T-score equivalents for raw scores are reported as well as "percentage of best raw score," which indicated where a raw score fell within the range of raw scores for that age grouping. We reproduce only the mean, SD, and range for total errors due to space considerations.
Study strengths 1. Large sample size, and many individual cells approximate 50. 2. Reporting of data on IQ, educational level, and geographic recruitment area. 3. Data presented in age groupings. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Presence of substantial neurological (chronic brain syndrome), substance abuse, and major psychiatric disorders in the sample. 2. Low educational level, although IQ levels are average. 3. No information regarding gender, although given that data were obtained in a V.A. setting, the sample is likely all or nearly all male. 4. Odd variability across scores, with those 75-79 years old outperforming the younger age groups and those 60-64 years old outperforming those 55-59 years old.
Other comments The scores for the two oldest age groups are identical in the whole sample and alcohol-equated group because these two groups did not have overrepresentation of alcoholics, so they did not need to be adjusted.
[CT.9] Pauker, 1980 (Table A24.9)
The author obtained Category Test scores on 363 Toronto citizens fluent in English, recruited through announcements and notices. Participants were aged 19-71 and included 152 men and 211 women. Exclusion criteria consisted of significant physical disability, sensory deficit, current medical illness, use of medication that might affect test performance, history of actual or suspected brain disorder, and alcoholism. MMPI profiles "could not suggest severe disturbance" or include more than three clinical scales with T scores ≥70 or an F scale score >80. The Category Test was administered according to Reitan's guidelines. Means and SDs for total errors are reported for the sample as a whole and for three age groupings (19-34, 35-52, 53-71) by four WAIS IQ levels (89-102, 103-112, 113-122, 123-143). Individual cell sizes ranged 4-60. Age-by-IQ categories were determined "in a compromise between what would be desirable and what the obtained sample characteristics and size dictated" (p. 1). No differences in performance between men and women were documented.
Study strengths 1. Large sample size, although individual cell sizes are substantially less than 50. 2. Presentation of data in age-by-IQ groupings. 3. Adequate medical and psychiatric exclusion criteria. 4. Information regarding gender, recruitment procedures, and geographic recruitment area. 5. Means and SDs are reported.
Considerations regarding use of the study 1. No information regarding education. 2. Participants were recruited in Canada, raising questions regarding usefulness for clinical interpretation in the United States. 3. The age-by-IQ cell representing participants aged 53-71 with IQ of 89-102 contained only four participants; Pauker comments that this category "should not be considered to be of any more than interest value" (p. 2). 4. IQ levels below the average range not represented.
[CT.10] Fromm-Auch and Yeudall, 1983 (Table A24.10)
The authors obtained Category Test data on 193 Canadian participants (111 male, 82 female) recruited through posted advertisements and personal contacts. Participants are described as "nonpsychiatric" and "nonneurological." Eighty-three percent of the sample were right-handed, and mean age was 25.4 (8.2) years, with a range of 15-64 years. Mean education was 14.8 (3.0) years, with a range of 8-26 years, and included technical and university training. Mean WAIS FSIQ, VIQ, and PIQ were 119.1 (8.8, range 98-142), 119.8 (9.9, range 95-143), and 115.6 (9.8, range 89-146), respectively. No participants obtained an FSIQ lower than the average range. Participants were classified into five age groupings: 15-17 (n = 32), 18-23 (n = 75), 24-32 (n = 57), 33-40 (n = 18), and 41-64 (n = 10). Mean errors, SDs, and ranges are reported for each age grouping. No gender differences were documented, and male and female data were collapsed. The authors suggest that a cutoff of 50 errors is appropriate only for participants <40 years of age.
Study strengths 1. Large overall sample size, and some individual cells approximate 50. 2. Presentation of data by age groupings. 3. Information regarding mean IQ, educational level, handedness, gender, recruitment procedures, and geographical recruitment area. 4. Some psychiatric and neurological exclusion criteria. 5. Means and SDs are reported.
Considerations regarding use of the study 1. High intellectual and educational level. 2. An age grouping of 41-64 with 10 participants would not appear to be particularly useful. 3. Participants were recruited in Canada, raising questions regarding usefulness for clinical interpretation in the United States. 4. At least one subject in the 18-23 age group scored particularly poorly, causing the mean to be artificially low and the SD to be excessively large for this group.
[CT.11] Heaton, Grant, and Matthews, 1986 (Table A24.11)
The authors obtained Category Test data on 553 normal controls in Colorado, California, and Wisconsin as part of an investigation into the effects of age, education, and gender on HRB performance. Nearly two-thirds of the sample were male (356 males, 197 females). Exclusion criteria were history of neurological illness, significant head trauma, and substance abuse. Participants were aged 15-81 years, with a mean of 39.3 (17.5) years, and had mean education of 13.3 (3.4) years, with a range of 0-20 years. The sample was divided into three age categories (<40, 40-59, and ≥60) with sizes of 319, 134, and 100, respectively, and classified into three education categories (<12, 12-15, ≥16 years) with sizes of 132, 249, and 172, respectively. Testing was conducted by trained technicians, and all participants were judged to have expended their best effort on the task. Mean errors are reported for the six subgroups, as well as percent classified as normal using Russell et al.'s (1970) criteria. Approximately 30% of the test score variance was accounted for by educational level. Significant group differences were found across the three age groups and the three educational levels, and a significant age-by-education interaction was documented. No significant differences in performance were found between males and females.
Study strengths 1. Large size of overall sample and individual cells. 2. Information regarding education, gender, age, and geographic recruitment area. 3. Data are grouped by age as well as educational level.
Considerations regarding use of the study 1. No reporting of SDs. 2. Means reported for individual WAIS subtest scaled scores but not for overall IQ scores.
[CT.12] Dodrill, 1987 (Table A24.12)
The author collected Category Test data on 120 participants in Washington during the years 197~1976 (n = 81) and 1986-1987 (n = 39). Half of the sample was female, and 10% were minorities (six African American, three Native American, two Asian American, one unknown). Eighteen were left-handed, and occupational status included 45 students, 37 employed, 26 unemployed, 11 homemakers, and one retiree. Participants were recruited from various sources, including schools, churches, employment agencies, and community service agencies; and they were either paid for their participation or offered an interpretation of their abilities. Exclusion criteria were history of "neurologically relevant disease (such as meningitis or encephalitis)," alcoholism, birth complications "of likely neurological significance," oxygen deprivation, peripheral nervous system injury, psychotic or psychotic-like disorders, or head injury associated with unconsciousness, skull fracture, persisting neurological signs, or diagnosis of concussion or contusion. One-third of potential participants failed to meet the above medical and psychiatric criteria, resulting in a final sample of 120. Mean age was 27.73 (11.04) years, and mean education was 12.28 (2.18) years. Participants tested in the 1970s were administered the WAIS, and those assessed in the 1980s were administered the WAIS-R; WAIS scores were converted to WAIS-R equivalents by subtracting 7 points from the VIQ, PIQ, and FSIQ. Mean FSIQ, VIQ, and PIQ scores were 100.00 (14.35), 100.92 (14.73), and 98.25 (13.39), respectively. IQ scores ranged 60-138 and reflected a normal distribution. Mean errors and SDs are reported as well as IQ-equivalent scores for various levels of intelligence. Using Reitan's cutoff of 50/51 errors, 22.5% of a subgroup of the sample were misclassified as brain-damaged.
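The WAIS-to-WAIS-R conversion described above is a constant 7-point subtraction applied to each IQ score. A minimal sketch follows; the function name `wais_to_waisr` and the example scores are hypothetical and are shown only to make the adjustment concrete.

```python
# Points subtracted from each WAIS IQ to approximate a WAIS-R
# equivalent, as described for Dodrill (1987) in the text.
WAIS_TO_WAISR_OFFSET = 7


def wais_to_waisr(viq: int, piq: int, fsiq: int) -> tuple[int, int, int]:
    """Convert WAIS VIQ, PIQ, and FSIQ to WAIS-R equivalents."""
    return tuple(score - WAIS_TO_WAISR_OFFSET for score in (viq, piq, fsiq))


# Hypothetical WAIS scores for a participant tested in the 1970s.
equivalents = wais_to_waisr(viq=108, piq=105, fsiq=107)
```

A flat offset treats the WAIS/WAIS-R difference as identical across the IQ range, which is a simplification worth keeping in mind when pooling the two cohorts.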
Study strengths 1. Large sample size. 2. Comprehensive exclusion criteria (although the appropriateness of including individuals with WAIS-R scores falling in the mentally deficient range could be questioned). 3. Information regarding age, education, IQ, occupation, gender ratio, handedness, ethnicity, recruitment procedures, and geographic area. 4. IQ equivalent scores provided. 5. Data for different IQ levels provided. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. On the IQ-equivalent scores, the two highest IQ groups have poorer scores than the 115-120 IQ groups.
[CT.13] Yeudall, Reddon, Gill, and Stefanyk, 1987 (Table A24.13)
The authors obtained Category Test data on 225 Canadian participants recruited from posted advertisements in workplaces and personal solicitations. Participants included meat packers, postal workers, transit employees, hospital lab technicians, secretaries, ward aides, student interns, student nurses, and summer students. In addition, high school teachers identified for participation average students in grades 10-12. Exclusion criteria were evidence of "forensic involvement," head injury, neurological insult, prenatal or birth complications, psychiatric problems, or substance abuse. Participants were classified into four age groupings: 15-20, 21-25, 26-30, and 31-40. Information regarding percent right-handers, mean years of education, and mean WAIS/WAIS-R FSIQ, VIQ, and PIQ are reported for each age grouping for males and females separately and combined. For the sample as a whole, 88% were right-handed, and participants had completed an average of 14.87 (2.99) years of schooling. The mean FSIQ, VIQ, and PIQ were 113.98 (9.83), 114.77 (10.34), and 108.50 (10.34), respectively. Category Test data were gathered by experienced testing technicians who "motivated the participants to achieve maximum performance" partially through the promise of detailed explanations of their test performance. Means and SDs for total errors are presented for each age grouping and each age-by-gender grouping. No significant relationships were found between Category Test scores and gender. Age effects were also not significant, although the authors note that variance effects with age were significant and recommend use of age norms. A significant negative association was found between Category Test scores and education, particularly in males; but education accounted for <10% of test score variance. Significant negative correlations were documented between test scores and VIQ and PIQ, again particularly in males; but only VIQ accounted for >10% of score variance. Because no significant differences were found between men and women, only the combined sample data are reproduced below.
Study strengths 1. Large sample size, and individual cells approximate 50. 2. Grouping data by age. 3. Data availability for a 15-20 age group. 4. Adequate medical and psychiatric exclusion criteria. 5. Information regarding handedness, education, IQ, gender, occupation, geographic recruitment area, and recruitment procedures. 6. Means and SDs are reported.
Considerations regarding use of the study 1. Sample was atypical in terms of its high average intellectual level and high level of education. 2. Data were obtained on Canadian participants, which may limit their usefulness for clinical interpretation in the United States due to possible subtle cultural differences. 3. Examination of the data reveals odd, unpredicted variability, with those 21-25 years old performing worse than those 26-30 years old.
[CT.14] Ernst, 1987 (Table A24.14)
The author obtained Category Test data on 110 primarily Caucasian (99%) residents of Brisbane, Australia, aged 65-75. Fifty-nine were female and 51 were male, and mean educational level was 10.3 years; men and women did not differ in years of education. Participants were recruited primarily through random selection from the Queensland State electoral roll (n = 97), with the remainder (n = 13) solicited through senior-citizen centers. Exclusion criteria were history of significant head trauma or neurological disease. Nearly one-half of the sample were diagnosed with at least one chronic disease (hypertension = 33, heart disease = 9, thyroid dysfunction = 7, asthma= 5, emphysema= 2, diabetes= 1), for which they
were receiving treatment and which was described as well controlled. Sixty-six participants were taking medications, primarily for the diseases listed above. All participants were administered the Trailmaking Test first, and half of the participants were also administered the Tactual Performance Test (TPT) prior to the Booklet Category Test. Mean errors and SDs for each of the seven subtests as well as total errors are reported. Using a cutoff of 51 errors, 84% of the sample were classified as impaired. Men obtained fewer errors than women on subtests III and IV and on total errors. No differences in scores emerged between participants with and without chronic disease. Educational level appeared to be related to scores on subtests IV, V, and VII and total errors. No differences in test performance were documented between those participants who were or were not administered the TPT prior to the Category Test.
Study strengths 1. Large sample size in a restricted age range. 2. Presentation of the data by gender. 3. Information regarding education, geographic recruitment area, recruitment procedures, and ethnicity. 4. Information regarding test administration order effects. 5. Means and SDs are reported.
Considerations regarding use of the study 1. Approximately half of the participants had at least one chronic illness, and over half were taking prescribed medications. 2. No information regarding IQ. 3. Low mean educational level. 4. Data were collected in Australia and may be unsuitable for clinical use in the United States.
[CT.15] Alekoumbides, Charter, Adkins, and Seacat, 1987 (Table A24.15)
The authors report Category Test data on 135 medical and psychiatric inpatients and outpatients without cerebral lesions or histories of alcoholism or cerebral contusion, from V.A. hospitals in southern California as part of their development of standardized scores corrected for age and education for the HRB. Among the 41 psychiatric patients, nine were diagnosed as psychotic and 32 were neurotic. In addition to psychiatry services, patients were drawn from medicine (n = 57), neurology (n = 22), spinal cord injury (n = 9), and surgery (n = 6) units. Mean age was 46.85 (7.17) years, ranging 19-82 years, and mean education was 11.43 (3.20) years, ranging 1-20 years. Frequency distributions for age and years of education are provided. Mean WAIS FSIQ, VIQ, and PIQ were within the average range: 105.89 (13.47), 107.03 (14.38), and 103.31 (13.02), respectively. Means and SDs for individual age-corrected subtest scores are also reported. All participants except one were male; the majority were Caucasian (93%), with 7% African American. The mean score on a measure of occupational attainment was 11.29. No differences were found in test performance between the two psychiatric groups and the nonpsychiatric group, and the data were collapsed. Mean errors and SDs are presented. Both age and educational level had significant associations with Category Test scores in the expected direction, and regression equation information to allow correction of raw scores for age and education is included.
Study strengths 1. Large sample size. 2. Information regarding age, IQ, education, ethnicity, gender, occupational attainment, and geographic recruitment area. 3. Regression equation data for computation of age- and education-corrected scores. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Data were collected on medical and psychiatric patients. 2. Undifferentiated age range (mitigated by the regression equation information). 3. Nearly all-male sample.
[CT.16] Bornstein, Baker, and Douglass, 1987a (Table A24.16)
The authors collected Category Test-retest data on 23 volunteers (14 women, nine men) aged 17-52 years, with a mean age of 32.3 (10.3) years, as part of an examination of the short-term retest reliability of the HRB. Exclusion criteria consisted of a positive history of neurological or psychiatric illness. Mean VIQ was 105.8 (10.8), with a range of 88-128, and mean PIQ was 105.0 (10.5), with a range of 85-121. Participants were administered the HRB in standard order both on initial testing and again 3 weeks later. Means, SDs, and ranges for total errors for both testing sessions are provided, as well as raw score change and SD, median raw score change, and mean percent of change. Significant improvement in performance over the 3-week period was documented. Correlations of such demographic variables as age and education with mean percent of change and mean change were small, with education accounting for up to 7% of variance and age accounting for up to 4% of variance.
Study strengths 1. Information on short-term (3-week) retest data. 2. Information on IQ level, gender, and age. 3. Minimally adequate exclusion criteria. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Undifferentiated age range. 2. Relatively small sample size. 3. No information regarding education.
[CT.17] El-Sheikh, El-Nagdy, Townes, and Kennedy, 1987 (Table A24.17)
The authors report Category Test data on 32 undergraduate and graduate Egyptians at the American University in Cairo as part of their cross-cultural investigation of the Luria-Nebraska Neuropsychological Battery and the HRB. No subject had a history of known brain damage. Participants were described as "Arabic and English-speaking." Category Test instructions were translated into Egyptian colloquial Arabic by the first author and checked by two independent judges fluent in both Arabic and English. In the case of disagreement between these two judges, a third judge was consulted. The Category Test was administered in English to 23 participants and in Arabic to nine participants and readministered 2 weeks later. Mean errors and SDs are reported. No differences in performance were found between the English and Arabic administrations. A significant practice effect was documented over the 2-week interval.
Study strengths 1. Data obtained on an Arabic sample. 2. Information on test-retest scores. 3. Information regarding educational level, age, and geographic recruitment area. 4. Means and SDs are reported.
Considerations regarding use of the study 1. Small sample size. 2. Minimal exclusion criteria. 3. No information regarding intellectual level. 4. Undifferentiated age range, although it can be assumed it is fairly restricted.
[CT.18] Russell, 1987 (Table A24.18)
The author obtained Category Test data on 155 controls during the years 1968-1982 in V.A. hospitals in Cincinnati and Miami for development of a reference scale method for neuropsychological test batteries. The 148 male and seven female participants were suspected of having neurological disorders but had "negative neurological findings." No other exclusion criteria were described. Mean age was 46.19 (12.86) years, and mean education was 12.29 (3.00) years. All but eight of the participants were Caucasian; the remainder were African American. Mean WAIS FSIQ, VIQ, and PIQ were 111.9, 112.3, and 109.90, respectively. Mean errors and SDs are provided. Study strengths 1. Large sample size. 2. Information regarding IQ, education, ethnicity, gender, age, and geographic recruitment area. 3. Means and SDs are reported. Considerations regarding use of the study 1. Undifferentiated age range. 2. Insufficient exclusion criteria; all participants were suspected of having neurological disorders. 3. High mean intellectual level. 4. Mostly male sample.
CATEGORY TEST
[CT.19] Elias, Podraza, Pierce, and Robbins, 1990 (Table A24.19)
Participants were 183 community-dwelling individuals (76 men, 107 women) recruited from church groups, businesses, professional organizations, and community service organizations for older persons as part of a study on the impact of hypertension on cognition. Exclusion criteria were any major chronic or acute disease, including hypertension; treatment for a neurological disorder; brain trauma; mental illness; or any cardiovascular or cerebrovascular disease. Skilled clerical, supervisory, blue-collar, and professional-executive occupations were represented. Participants were divided into three age groupings: 20-31 (41 men, 47 women), 37-49 (23 men, 38 women), and 55-67 (12 men, 22 women). Mean educational levels for the three groups were 15.4, 15.7, and 14.9 years, respectively (range 12-20 for each age group), and mean WAIS VIQ and PIQ were 119 and 116 for the youngest group, 122 and 122 for the middle-aged group, and 124 and 121 for the oldest group, respectively. Means and SDs for number of errors are reported.
Study strengths 1. Large overall sample size (with individual subgroup sizes of 88, 61, and 34). 2. Adequate exclusion criteria. 3. Information regarding gender, educational level, IQ, and recruitment strategies; data stratified by age. 4. Means and SDs reported. Considerations regarding use of the study 1. No information regarding ethnicity or geographic area (although it can be assumed it was Maine, given the academic affiliations of the authors). 2. High educational and IQ levels.
[CT.20] Heaton, Grant, and Matthews, 1991
The authors provide normative data on the Category Test from 486 urban and rural participants recruited in several states (California, Washington, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, and Massachusetts) and Canada. Data
were collected over a 15-year period through multicenter collaborative efforts; the authors trained the test administrators and supervised data collection. Exclusion criteria were history of learning disability, neurological disease, illness affecting brain function, significant head trauma, significant psychiatric disturbance (e.g., schizophrenia), and alcohol or other substance abuse. Mean age for the total sample was 42.0 (16.8) years, and mean educational level was 13.6 (3.5) years. Sixty-five percent of the sample were males. Mean WAIS FSIQ, VIQ, and PIQ were 113.8 (12.3), 113.9 (13.8), and 111.9 (11.6), respectively. Participants were generally paid and judged to have provided their best efforts on the tasks. The Category Test was administered according to Reitan and Wolfson's (1985) instructions. A T-score system with demographic correction was developed on 378 participants and cross-validated on 108 participants. Total number of errors was the performance parameter employed. Age accounted for 38% of the variance in test scores, and education was associated with 20% of score variance; gender did not account for any score variance. These demographic variables in combination were associated with 43% of score variance. Extensive T-score tables corrected for age, education, and gender are provided, and the interested reader is referred directly to the handbook for these data. The comprehensive tables present T-score equivalents for test scaled scores for males and females separately in 10 age groupings (20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80) by six education groupings (6-8, 9-11, 12, 13-15, 16-17, 18+ years). For the sample as a whole, the mean number of errors was 39.6 (25.6). In 2004, the authors published revised norms based on a sample of over 1,000 normal adults. In addition to age, education, and gender stratification, the data are partitioned by race/ethnicity (African American and Caucasian).
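As a rough illustration of what a T score expresses, the sketch below converts a raw error count to the conventional T metric (mean 50, SD 10) using only the whole-sample mean and SD of errors reported above (39.6 and 25.6). This is not the demographically corrected procedure of Heaton et al. (1991), which works through scaled scores and age, education, and gender adjustments; the function name and the choice to let higher T scores indicate better performance are assumptions made for the example.

```python
def t_score_from_errors(errors, mean_errors, sd_errors):
    """Convert a raw error count to an uncorrected T score (mean 50, SD 10).

    The sign is flipped so that more errors give a lower T score
    (assumption: higher T = better performance).
    """
    z = (mean_errors - errors) / sd_errors
    return 50.0 + 10.0 * z


# Whole-sample values reported for the 1991 normative sample.
SAMPLE_MEAN_ERRORS = 39.6
SAMPLE_SD_ERRORS = 25.6

if __name__ == "__main__":
    for raw in (20, 39.6, 65.2):
        t = t_score_from_errors(raw, SAMPLE_MEAN_ERRORS, SAMPLE_SD_ERRORS)
        print(f"errors={raw}: uncorrected T={t:.1f}")
```

A score computed this way ignores the patient's age and education, which is exactly the gap the published correction tables are meant to close; for clinical use, the tables themselves should be consulted.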
Study strengths 1. Large sample size. 2. T scores corrected for age, education, and gender. The 2004 edition presents data for two race/ethnicity groups. 3. Adequate exclusion criteria. 4. Information regarding IQ and geographic recruitment area.
Consideration regarding use of the study 1. Above average mean intellectual level (which is probably less of an issue given that this is WAIS rather than WAIS-R IQ data). Other comments 1. The interested reader is referred to the Fastenau and Adams (1996) critique of the Heaton et al. (1991) norms, and Heaton et al.'s 1996 response to this critique.
[CT.21] Elias, Robbins, Walter, and Schultz, 1993 (Table A24.20)
Category Test data on 427 participants, including those from the 1990 study and reflecting the same exclusion criteria, are provided for men and women separately for six age groupings: 15-24 (37 men, 24 women), 25-34 (40 men, 56 women), 35-44 (36 men, 56 women), 45-54 (25 men, 46 women), 55-64 (25 men, 35 women), and 65 and over (24 men, 23 women). Participants with <12 and >19 years of education were excluded because participants outside this range were disproportionately distributed across the age and gender groupings. Mean WAIS Vocabulary and Information subtest scores ranged 13.9-14.7 and 13.2-13.7, respectively, across the age groups. Means and SDs for number of errors are provided.
Study strengths 1. Large overall sample size, although most individual age × gender cells were <50. 2. Adequate exclusion criteria. 3. Information regarding educational level (although only ranges provided) and WAIS subtests (Information, Vocabulary) provided and data stratified by age and gender. 4. Means and SDs reported. Considerations regarding use of the study 1. No information regarding ethnicity or geographic area (although it can be assumed it was Maine, given the academic affiliations of the authors). 2. High IQ level.
[CT.22] Barrett, Morris, Akhtar, and Michalek, 2001 (Table A24.21)
Test data were obtained on 1,052 Air Force veteran controls who served in Southeast Asia from 1962 to 1971 in a study examining the effects of Agent Orange on cognition. Participants averaged 43.9 (7.6) years of age, and 5.3% were African American (with the rest "nonblack"); 37% were officers, 16.6% were enlisted flyers, and 46.0% were enlisted ground crew; most of the officers were college-educated, and most enlisted personnel were high school-educated. No exclusion criteria are listed aside from epilepsy and low exposure to dioxin. Participation was voluntary. Mean number of errors is reported.
Study strengths 1. Huge sample size. 2. Information regarding age, occupational status, and recruitment strategy, with some limited data regarding ethnicity, educational level, and gender (assumed all were male). Considerations regarding use of the study 1. No exclusion criteria or information regarding IQ. 2. No stratification of data by age or education. 3. No SD reported for number of errors.
RESULTS OF THE META-ANALYSES OF THE CATEGORY TEST DATA (See Appendix 24m)
Data collected from the studies reviewed in this chapter were combined in regression analyses, to describe the relationship between age and test performance and to predict estimated test scores for different age groups. Effects of other demographic variables were explored in follow-up analyses. The general procedures for data selection and analysis are described in Chapter 3. Detailed results of the meta-analysis and predicted test scores across adult
age groups are provided in Table A24m.1 in Appendix 24m. After initial data editing for consistency and for outlying scores, 11 studies, which generated 25 data points based on a total of 1,579 participants, were included in the analyses. Linear regression of the number of errors on age yielded an R² of 0.839, indicating that 84% of variance in the scores is accounted for by the model. Based on this model, we estimated number of errors for age intervals between 16 and 79 years. If predicted scores are needed for age ranges outside the reported age boundaries, with proper caution (see Chapter 3) they can be calculated using the regression equations included in the tables, which underlie calculations of the predicted scores. Linear regression of SDs on age yielded an R² of 0.469, indicating increase in variability with advancing age, consistent with the literature. Predicted SDs, based on this model, are reported. Examination of the effects of demographic variables on the number of errors indicated that education did not contribute to test scores beyond its association with age in the data available for analyses. IQ had a significant effect on number of errors. However, the limited number of studies that reported IQ does not allow close examination of this relationship. The effect of gender was not examined as information on gender distribution was not available in the data reviewed. Strengths of the Analyses 1. Total sample size of 1,579 participants. 2. R² of 0.839, indicating a good model fit. 3. Postestimation tests for parameter specifications did not indicate problems with normality or homoscedasticity. 4. Significant effect of IQ is evident in the data, consistent with the literature.
Limitation of the Analyses 1. An effect of education on test scores in normal groups has been reported in several studies. Examination of this relationship did not yield meaningful results in our data due to close association of education with age.
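The kind of analysis summarized above, regressing mean number of errors on age and reading off R² and predicted scores, can be sketched in a few lines. The example below is only an illustration: it uses unweighted ordinary least squares, and the age and error values are hypothetical placeholders, not the 25 data points used in the meta-analysis. The actual procedures (including any weighting) are those described in Chapter 3, and the actual regression equations are those given in Appendix 24m.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_resid / ss_total
    return intercept, slope, r_squared


def predict(intercept, slope, age):
    """Predicted number of errors at a given age under the fitted line."""
    return intercept + slope * age


if __name__ == "__main__":
    # HYPOTHETICAL study-level data points (mean age, mean errors).
    ages = [22.0, 31.0, 40.0, 52.0, 61.0, 70.0]
    errors = [26.0, 31.0, 37.0, 48.0, 55.0, 66.0]
    b0, b1, r2 = fit_line(ages, errors)
    print(f"errors = {b0:.2f} + {b1:.2f} * age, R^2 = {r2:.3f}")
    print(f"predicted errors at age 45: {predict(b0, b1, 45):.1f}")
```

Predictions for ages outside the range of the fitted data are extrapolations, which is why the text advises proper caution when using the regression equations beyond the reported age boundaries.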
CONCLUSIONS A large number of studies document the popularity of the Category Test in clinical assessment. The major drawbacks of the original version of the test are its length and lack of portability. To overcome these problems, more recent modifications focus on development of short forms and creation of booklet formats of the test. Despite the greater convenience offered by these modified versions, their psychometric properties are not yet sufficiently assessed. Most of the studies address the reliability and validity of the short forms based on analyses of extrapolated items from the full version of the test, rather than on data for the actual short version of the test. This suggests caution in using these data in diagnostic decision making and prompts further investigation of the psychometric properties of these versions. Correspondence between the original version and the booklet version of the test also deserves more attention. Further research also needs to focus on standardization of instructions for the test. The clinical utility of the Category Test would also be improved by adjusting cutoff criteria relative to participants' age, education, and intelligence level. Consideration of demographic factors in assigning participants to impaired vs. nonimpaired groups would improve the specificity of the Category Test. This would reduce the excessively high rates of misclassification of "normal" participants in the impaired range reported in the literature.
25 Wisconsin Card Sorting Test
BRIEF HISTORY OF THE TEST
The original version of the Wisconsin Card Sorting Test (WCST) was developed by Berg and colleagues (Berg, 1948; Grant & Berg, 1948). It was designed to assess abstract reasoning and the ability to adapt cognitive strategies to one's changing environment. For this reason, the WCST is believed to measure a complex range of executive functions, including planning, organizing, abstract reasoning, concept formation, cognitive set maintenance and shifting ability, and inhibiting impulsive responses (Lezak, 1995; Lezak et al., 2004; Spreen & Strauss, 1998). The WCST was primarily based on abstract reasoning and learning research conducted with primates (Zable & Harlow, 1946) and the Weigl Color-Form Sorting Test, designed for assessment of reasoning skills in humans (Weigl, 1941). The original version of the WCST involves sorting either 60 response cards (Berg, 1948) or 64 response cards (Grant & Berg, 1948) to four key stimulus cards. If participants do not achieve the criterion number of sorts, the examiner rearranges the deck of cards and testing is continued until the expected number of categorical sorts is achieved or the second deck of cards is sorted. This original format has been modified over time, and the most popular version of the test now involves sorting two decks of
64 cards to four key stimulus cards (Grant & Berg, 1948; Heaton, 1981; Heaton et al., 1993). In their revised WCST manual, Heaton et al. (1993) present a comprehensive review of the variations in testing materials and administration procedures used in previous studies. Briefly, past versions have varied in the number of response cards employed (e.g., two 48-card decks, two 60-card decks, or two 64-card decks), type of designs used (e.g., standard figures or modified figures), and presentation style of the stimuli (e.g., systematic or nonsystematic configurations in random or standardized order). Furthermore, test administration procedures, such as discontinuation rules and scoring criteria for the specific outcome measures, have varied widely among past studies. Heaton (1981) published the first comprehensive administration, scoring, and normative manual on the two-deck, 64-card version of the WCST. In the first manual, he standardized the Grant and Berg (1948) testing procedures, documented precise scoring procedures, and presented normative data based on relevant demographic factors. Heaton et al. (1993) later revised and updated the manual to include wider normative age ranges (6.5-89 years), further clarify scoring criteria with explicit examples, and revise the scoring forms to facilitate recording responses and calculating outcome measures.
Since the publication of Heaton's manuals (Heaton, 1981; Heaton et al., 1993), the 64-card, two-deck version of the WCST has gained popularity and is the most frequently administered format of this test for both clinical and research purposes. In this version, the participant is presented with two decks of 64 response cards (one deck at a time) and asked to sort each card in the deck, one card at a time, to one of four key reference cards presented in a predetermined order. A single red triangle is printed on the first reference card, two green stars are on the second card, three yellow crosses are printed on the third card, and four blue circles are printed on the fourth card. Each of the response cards in the two decks is unique in terms of design form (triangle, star, cross, or circle), color (red, green, yellow, or blue), and number of items (one, two, three, or four designs per card). The participant is required to match response cards to key cards based on one of three sorting principles (color, form, or number). However, there are times when a response card matches a key card on two or more principles (e.g., color and form), and it is up to the examinee to decide the principle he or she must follow. The first sorting principle is color, followed by form, and then by number. Testing is concluded when this sequence of sorting is completed two times by the participant or when all 128 cards have been sorted. Participants are provided with very little specific information as to how to execute the task. Essentially, they are told to take one card from the top of the deck and place it beneath one of the four key cards. They are not given any indication as to what principle to use when sorting and are provided only with the feedback "right" or "wrong" after each card sort. Thus, it is the task of the participant to determine, from this ambiguous feedback, what principle he or she is to use for sorting. 
Additionally, when the appropriate number of sorts has been made to a given category, the examiner changes the sorting principle without alerting the participant. Other modified versions of the WCST, in which the participant is notified when the sorting principle is about to change, will be discussed in later parts of this chapter (Nelson, 1976).
The WCST has a number of useful outcome measures, which can be derived with use of (sometimes complex) sets of scoring rules (see the revised WCST manual for details; Heaton et al., 1993). Briefly, some of the outcome measures include the following: (1) Trials to Complete First Category refers to the number of trials taken to complete sorting to the first principle; (2) Categories Completed (sometimes referred to as Categories Achieved) is an indication of the total number of correct principles sorted, and credit for obtaining one category sort is given when the participant completes 10 consecutive correct matches to one principle; (3) Failure to Maintain Set (sometimes labeled Set Failure) refers to the inability to complete the category sort and occurs when participants have obtained at least five accurate categorical matches before shifting their sorting strategy; (4) Percent Conceptual Level Responses is a measure of the participant's insight into the correct sorting principle and is based on "consecutive correct responses occurring in runs of three or more" (Heaton et al., 1993); (5) Learning to Learn indicates the participant's "average change in conceptual efficiency across the consecutive categories (stages) of the WCST" (Heaton et al., 1993; calculating this measure requires a number of sequential steps that are best described in the Heaton et al. manual); (6) Total Errors refers to the number of sorting errors made throughout the task; and (7) Perseverative Errors is the number of repetitive errors (i.e., sorting to a single, wrong principle repeatedly despite negative feedback) while (8) Nonperseverative Errors reflects errors but not those that are repetitively made to a single, wrong sorting principle. 
Other outcome measures include Percent Conceptual Level Responses, Percent Perseverative Responses, Percent Perseverative Errors, Percent Nonperseverative Errors, and "Other" Responses (also referred to as Unique Errors, matches that are not based on any of the three sorting principles). A more detailed description of these outcome variables as well as specific scoring criteria can be found in the WCST revised manual (Heaton et al., 1993). As mentioned above, Heaton et al. (1993) have very clearly presented the scoring criteria
and supplemented them with detailed examples in the revised manual. Prior to the development of the revised manual, Axelrod et al. (1992b) created supplementary scoring material and found that the interrater reliability among novice raters did not change with the additional material but that scoring time was reduced by 41%. Nonetheless, Greve (1993) rightly points out that scoring errors are still likely to occur due to the complex nature of calculating some of the WCST outcome measures. Fortunately, clinicians and researchers can now use computer scoring software, which very precisely and quickly produces accurate scores as well as normative data corrected by demographic factors such as age and education. However, scoring data with this software requires that the stimulus cards be sorted in a predetermined order. In a study using the computerized scoring method, Greve (1993) found that a novice could score at the "expert" level.
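The administration and tallying rules described above can be illustrated with a short sketch. It is a simplified illustration and not the Heaton et al. (1993) scoring procedure or any published scoring software: it assumes each trial is recorded as a card plus the index of the key card chosen, gives "right" or "wrong" feedback against the current principle, changes the principle after 10 consecutive correct sorts, and stops after six categories (color, form, number, twice) or 128 cards. Only Categories Completed, Total Errors, and Trials to Complete First Category are tallied; perseverative scoring is deliberately left out because its rules are considerably more complex. All names (`score_sorts`, `KEY_CARDS`, and so on) are hypothetical.

```python
# Each card is (color, form, number of items).
KEY_CARDS = [
    ("red", "triangle", 1),
    ("green", "star", 2),
    ("yellow", "cross", 3),
    ("blue", "circle", 4),
]
FIELD = {"color": 0, "form": 1, "number": 2}
PRINCIPLES = ["color", "form", "number"] * 2  # sequence completed two times
CRITERION = 10    # consecutive correct sorts needed to complete a category
MAX_TRIALS = 128  # two decks of 64 cards


def score_sorts(trials):
    """Tally a simplified WCST record.

    trials: list of (card, key_index) pairs, where key_index is the
    position (0-3) of the key card under which the card was placed.
    """
    stage = 0           # index into PRINCIPLES
    run = 0             # current run of consecutive correct sorts
    total_errors = 0
    trials_to_first = None
    feedback = []
    for n, (card, key_index) in enumerate(trials[:MAX_TRIALS], start=1):
        if stage == len(PRINCIPLES):
            break  # all six categories completed; testing ends
        idx = FIELD[PRINCIPLES[stage]]
        correct = card[idx] == KEY_CARDS[key_index][idx]
        feedback.append("right" if correct else "wrong")
        if correct:
            run += 1
            if run == CRITERION:
                # Principle changes without alerting the participant.
                stage += 1
                run = 0
                if trials_to_first is None:
                    trials_to_first = n
        else:
            run = 0
            total_errors += 1
    return {
        "categories_completed": stage,
        "total_errors": total_errors,
        "trials_to_first_category": trials_to_first,
        "feedback": feedback,
    }


if __name__ == "__main__":
    # A participant who keeps sorting by color after the principle changes.
    card = ("red", "star", 2)
    record = [(card, 0)] * 20  # always placed under the red triangle
    result = score_sorts(record)
    print({k: v for k, v in result.items() if k != "feedback"})
```

In the demonstration record, the participant matches on color throughout, so the first category is completed and every later sort is marked "wrong" once the principle silently changes to form, which is the situation the perseverative error measures are designed to capture.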
Anatomical Correlates and Effect of Brain Pathology on the WCST
In one of the earliest clinical studies using the WCST, McFie and Piercy (1952) found that the laterality of the lesion, rather than its site within the hemisphere, was the cause of poor sorting performance. However, later studies by Milner (1963; 1964) found that patients with dorsolateral prefrontal cortex damage performed worse on the WCST than those with damage to the orbitomedial or other brain regions. Using her version of the WCST, Nelson (1976) found that patients with frontal lobe lesions obtained fewer categories and made more perseverative errors than those with lesions elsewhere in the brain. Following these early findings, numerous other studies have demonstrated similar results (Anderson et al., 1991; Bornstein, 1986c; Drewe, 1974; Heaton, 1981; Robinson et al., 1980). A recent meta-analysis by Demakis (2003) of the WCST using published studies on patients with damage to frontal and "nonfrontal" brain regions as well as left frontal vs. right frontal brain-damaged patients reported similar findings. Demakis found that patients with frontal damage achieve far fewer categories and commit more perseverative, but not nonperseverative, errors than patients with nonfrontal damage. The results of the meta-analysis further indicated that the lateralization of the lesion did not affect WCST performance in frontal lobe-damaged patients. Nonetheless, several clinical studies have found that the WCST is incapable of discriminating patients with frontal lesions from those with lesions in other brain regions (Anderson et al., 1991; Axelrod et al., 1996; Drewe, 1974; Grafman et al., 1990; Hermann et al., 1988; Horner et al., 1996; Mountain & Snow, 1993; Stuss et al., 1983). More recently, functional neuroimaging studies have confirmed that WCST performance is related to activation of dorsolateral prefrontal regions (Berman et al., 1995; Konishi et al., 1998; Nagahama et al., 1998). Additionally, associations between perseverative behavior on the WCST and prefrontal lesions, prefrontal volumes, and prefrontal cortex activation have been solidly established (Cabeza & Nyberg, 2002; Raz et al., 1998; Rogers et al., 2000; Stuss et al., 2000). In young healthy subjects, WCST performance has been shown to activate the dorsolateral prefrontal cortex (Berman et al., 1995), and in older adults it has been associated with prefrontal cortex volume (Head et al., 2002). Esposito et al. (1999) used PET to study the relationship between regional cerebral blood flow (rCBF) and the WCST and Raven Progressive Matrices in individuals aged 18-80. Among other findings, they report that the left dorsolateral prefrontal cortex was activated by both tasks in the young group, but in the older group activation decreased in this region for the WCST but not the Raven Progressive Matrices. Frontal lobe dysfunction has been speculated to be the primary reason for the neuropsychological and psychiatric symptoms in schizophrenia (Weinberger et al., 1994). 
Accordingly, functional MRI and single-photon emission computed tomography (SPECT) studies have repeatedly shown less prefrontal activity in schizophrenics compared to controls on the WCST (Berman et al., 1993; Catafau et al., 1994; Kawasaki et al., 1993; Parellada et al., 1994; Sagawa et al., 1990a,b; Seidman
et al., 1994; Weinberger and Berman, 1986; Weinberger et al., 1986, 1989), with some data indicating an absence of activation for the right frontal region in schizophrenics (Volz et al., 1997). Using transcranial Doppler sonography, a noninvasive method designed to measure the CBF velocity of the basal cerebral arteries, Schuepbach et al. (2002a,b) found that there was an increase in blood flow in controls, but not in schizophrenics, after shifting set on the WCST. Haut et al. (1996) found that patients with schizophrenia performed similarly on the WCST to patients with right frontal lobe tumors but not to those with left frontal lobe tumors, nonfrontal tumors, or healthy controls. Evoked-potential studies of healthy individuals link WCST perseverative errors to the frontal-extrastriate network (Barcelo, 1999). In patients with Parkinson's disease, P300 latency response (a physiological measure of cognitive processing) was correlated with number of WCST categories completed (Iijima et al., 2000); and in both depressed and nondepressed elderly, longer P300 latency was related to greater total errors and poorer ability to maintain set but not to perseverative errors (Kindermann et al., 2000). While studies have demonstrated that regions of the frontal lobes are activated during performance of the WCST, the test's specificity to activation confined exclusively to the prefrontal cortex has not been demonstrated in all studies. This is not surprising given that the WCST most likely requires multiple, parallel cognitive processes (Anderson et al., 1991; Dehaene & Changeux, 1991). 
Functional imaging studies have found the WCST task to activate other brain regions, such as the bilateral inferior parietal lobe and the inferior posterior temporal lobe (Berman et al., 1995), the bilateral supramarginal gyrus and the anterior cingulate cortex (Konishi et al., 1998), and the bilateral inferior parietal lobe and the left superior occipital gyrus (Nagahama et al., 1996) in addition to the frontal regions. Test version and administration procedures might explain some of the variability observed (Stuss et al., 2000). Interestingly, poor performance on the WCST does not always lead to lack of CBF activity in the prefrontal cortex. In fact, some
studies have found increased rCBF in the prefrontal cortex in patients with Huntington's disease (Goldberg et al., 1990; Weinberger et al., 1988) and Down syndrome (Schapiro et al., 1999), despite poor performance on the task. Based on these findings, Schapiro et al. (1999) caution that prefrontal activation may be an indication of mental effort exerted rather than level of task performance alone.
Brief Overview of Clinical Findings Using the WCST
In the past several years, the WCST has been used in hundreds of clinical studies on patients with various psychiatric and neurological disorders to characterize frontal systems dysfunction. These studies have added valuable information regarding the clinical utility of this test and have assisted in determining the expected pattern of performance for clinical disorders. Due to the large volume of such studies, not all will be reviewed in this chapter. Instead, a brief overview of the clinical findings is provided below. As described earlier, patients with predominantly prefrontal lobe lesions perform poorly on the WCST (Anderson et al., 1991; Bornstein, 1986c; Heaton, 1981; Milner, 1963; Milner, 1964; Nelson, 1976; Robinson et al., 1980). Patients with various neurodegenerative diseases, such as Alzheimer's disease (Binetti et al., 1995; Bondi et al., 1993), Huntington's disease (Paulsen et al., 1995b), frontotemporal dementia (FTD; Boone et al., 1999; Razani et al., 2001), and Parkinson's disease (Alevriadou et al., 1999; Beatty & Monson, 1990; Paolo et al., 1995; Tomer et al., 2002), also exhibit impaired WCST performance. Bondi et al. (1993) found that Nelson's version of the WCST was quite sensitive and specific in differentiating patients with Alzheimer's disease (even those in the mild stages) from normal controls. Using receiver operating characteristic (ROC) curves, the authors found that the number of categories completed best classified Alzheimer's patients and controls (sensitivity of 94% and specificity of 87% using optimal cutoff scores) but that number of perseverative errors best distinguished Alzheimer's patients
with mild symptoms from controls (sensitivity of 74% and specificity of 87% using optimal cutoff scores). Paulsen et al. (1995b), also using ROC curves, found the WCST to be quite effective at discriminating patients with Alzheimer's and Huntington's dementia from normal controls (90% classification accuracy) but that accuracy rates for discriminating between the two dementia groups were far lower (63% classification accuracy). Razani et al. (2001) examined WCST performance in FTD patients with asymmetrical left- or right-sided anterior hypoperfusion and in those with Alzheimer's disease. In general, while all three dementia groups performed poorly on the WCST relative to normal controls, the FTD patients with right-sided hypoperfusion displayed the greatest difficulty. Patients with right-FTD committed significantly more perseverative errors than patients with left-FTD and Alzheimer's disease, and the right-FTD patients scored significantly below the left-FTD patients on Percent Conceptual Level Responses. Documented deficits in patients with Parkinson's disease include inability to maintain set (Lees & Smith, 1983; Flowers & Robertson, 1985; Taylor et al., 1986) and increased number of total and perseverative, but not nonperseverative, errors (Paolo et al., 1995). Similar to patients with frontal lobe lesions, patients with Parkinson's disease seem to benefit from cues to shift set (Fimm et al., 1994; Hsieh et al., 1995), even when those cues do not explicitly direct them to the correct solution (van Spaendonck et al., 1995); but compared to patients with frontal lobe damage, the performance of patients with Parkinson's disease improves after the second shift in set (van Spaendonck et al., 1995), while patients with frontal lesions continue to have difficulty throughout the task (Heaton, 1981). Interestingly, Parkinson's patients without dementia also appear to perform worse than healthy controls (Bondi et al., 1993; Caltagirone et al., 1989; Tsai et al., 1994). 
In a study using Nelson's modified version of the WCST, Paolo et al. (1996b) found that Parkinson's patients who were matched on age, education, and overall intellectual ability to normal controls committed more total errors,
perseverative responses, and perseverative errors. In this same study, Parkinson's patients with dementia committed more total errors than patients with Alzheimer's disease but the two groups performed equally poorly on all other WCST measures. Alevriadou et al. (1999) found that, in addition to greater perseverations and set failures, patients with Parkinson's disease who did not display cognitive impairment required more trial administrations relative to controls, a finding that is in contrast to Taylor et al. (1986) and Cooper et al. (1991). Overall, the findings on patients with Parkinson's disease suggest that they require more trials to develop a problem-solving strategy but that even after developing a strategy they tend either to have difficulty switching from one sorting principle to another (perseverative) or to abandon their strategy prematurely (failure to maintain set). A number of researchers have demonstrated that patients with traumatic brain injury (TBI) commit more perseverative errors relative to controls (Ferland et al., 1998; Segalowitz et al., 1992; Stuss, 1987). Little et al. (1996) explored the implications of WCST performance for daily life activities in patients with TBI and found that the WCST correlated moderately, and better than the Category Test, with a daily living functional measure. Coelho (2002) analyzed the speech patterns of patients with closed head injuries and found that performance on the WCST may predict sentence complexity, organization, and content. Sherer et al. (2003) found that Perseverative Errors, Perseverative Responses, Categories Completed, and Conceptual Level Responses (which clustered on one principal component) on both the full and 64-card versions of the WCST predicted the functional level of patients with closed head injuries at the time of discharge from rehabilitation. 
There has been an abundance of research with the WCST in the past two decades on patients with schizophrenia and schizophrenia-related disorders. Among other WCST outcome measures, most studies report that patients with schizophrenia exhibit increased perseverative errors and/or complete few
categorical sorts (Beatty et al., 1994; Bustini et al., 1999; Ismail et al., 2000; Martinez et al., 2002; Morice, 1990; Parellada et al., 1994, 2000; Raine et al., 1992; Rossi et al., 2000; Schuepbach et al., 2002b; Weinberger et al., 1986), and these deficits do not appear to be a result of inadequate effort or motivation (Ilonen et al., 2000). In fact, factor-analytic studies have found that WCST scores that cluster on a perseveration-type factor are the most diagnostically useful in characterizing executive dysfunction in schizophrenics (Koren et al., 1998). However, WCST performance appears to vary by schizophrenia subtype and/or symptoms (i.e., those with greater frontal lobe abnormalities display greater WCST dysfunction; Braff et al., 1991). Support for this notion comes from studies that have found a relationship between chronicity of the illness and WCST perseverative errors (Braff et al., 1991; Butler et al., 1992), between symptom severity and increased perseverative errors (Bornstein et al., 1990), between disorganized symptoms and fewer categorical sorts as well as greater perseverative errors (Dahan et al., 2002), and between level of insight and WCST performance (Chen et al., 2001; Lysaker & Bell, 1994), and from studies that have shown alterations in WCST performance as a result of the type of neuroleptic medication used (i.e., risperidone was better than olanzapine; Rybakowski & Borkowska, 2001). Additional studies have shown that more perseverative errors are committed by paranoid relative to nonparanoid patients (Abbruzzese et al., 1996), by patients with negative symptoms compared to positive symptoms (Braff, 1989), and by those with early-onset (mean age 27.6 years) relative to late-onset (mean age 54.7 years) schizophrenia (Jeste et al., 1995). 
In contrast, other investigators have not found differences in WCST scores between patients with positive and negative symptoms or a relationship between WCST performance and chronicity of illness or neuroleptic treatment (Parellada et al., 2000). Gambini et al. (1992) found that factors such as education affect WCST scores in schizophrenic patients and suggest that studies control for this demographic factor. In schizophrenic patients, deficits in
spatial working memory may be responsible for number of perseverative errors and categories achieved but not nonperseverative errors (Gooding & Tallent, 2002). Individuals with schizotypal personality disorder display some of the same deficits on the WCST as patients with schizophrenia, including completing fewer categorical sorts, more perseverative errors, and more set failures compared to controls (Lenzenweger & Korfine, 1994; Trestman et al., 1995). Most studies suggest that nonschizophrenic siblings of schizophrenic patients tend to perform relatively normally on the WCST (Scarone et al., 1993; Ismail et al., 2000; Yurgelun-Todd & Kinney, 1993), but at least one study has found that schizophrenics and their healthy siblings committed greater percentages of perseverative errors relative to controls (Saoud et al., 2000). Another study found that the offspring of schizophrenic patients achieved fewer categorical sorts and committed more perseverative errors and responses relative to healthy controls (Wolf et al., 2002).
Studies examining WCST performance in other psychiatric populations have produced interesting, but at times mixed, results. Some studies have reported impairment on the WCST in patients with obsessive-compulsive disorder, particularly decreases in categories completed and/or increases in number of errors committed (Christensen et al., 1992; Harvey, 1986; Head et al., 1989; Malloy, 1987), but others have not found such impairments (Boone et al., 1991; Gross-Isseroff et al., 1996). Similarly, some investigators have reported poor performance on the WCST in patients with depression, with patients typically achieving fewer categorical sorts and committing more errors and set failures (Austin et al., 1992; Axelrod et al., 1994a,b; Boone et al., 1995; Channon, 1996; Martin et al., 1991), while others have not found marked deficits (Fossati et al., 1999, 2001). Ishikawa et al.
(2001) found that "successful" (i.e., nonconvicted) psychopaths committed fewer total errors and fewer nonperseverative errors and achieved a greater number of category sorts relative to "unsuccessful" (i.e., convicted) psychopaths and normal controls. No differences, however, were noted between the unsuccessful
psychopaths and controls on any of the WCST measures.
Patients with multiple sclerosis (MS) have been shown to attain fewer categories and commit more perseverative responses and perseverative errors on the WCST, secondary to frontal lobe dysfunction (Beatty et al., 1989, 1995; Heaton et al., 1985; Rao et al., 1991b). Poor performance on the WCST is correlated with frontal lobe lesion volumes in patients with MS (Arnett et al., 1994).
Studies of the effects of chronic and acute alcohol use on the WCST have provided intriguing results. Early studies found that chronic alcoholics tend to be quite perseverative on the WCST (Parsons, 1975; Tarter and Parsons, 1971). Sullivan et al. (1993) compared the WCST performance of patients with chronic alcoholism to patients with schizophrenia, frontal lobe lesions, and normal controls and found that the alcoholic group scored highest on an "insufficient sorting" factor (i.e., failed to maintain set) relative to the schizophrenic and frontal lobe groups, who performed worse on a perseverative factor score. Other studies have demonstrated that WCST scores are predictors of risk for alcoholism (i.e., WCST scores correlate with MacAndrew Alcoholism Scale; Deckel, 1999). Interestingly, acute alcohol intoxication can also lead to increased perseverative errors on the WCST (Lyvers & Maltzman, 1991).
Poor WCST performance has also been linked with specific health factors. Fewer categorical sorts and increased perseverative errors have been reported for those with chronic obstructive pulmonary disease (Crews et al., 2001). Boone et al. (1992) found that an increase in the size of white-matter lesions in older adults, particularly when the lesions exceeded a total volume of 10 cm², led to an increase in the number of perseverative errors and a decrease in the number of categories completed.
Boone (1999) also found that risk for vascular illness was a significant predictor of Categories Completed, Total Errors, Perseverative Responses, Percent Conceptual Level Response, and Trials to First Category. This finding is consistent with those of Dywan et al. (1992), who reported that 28% of WCST performance in older adults was accounted for
by cardiovascular health. Takashima et al. (2003) found that factors such as age, greater multiple lacunar infarcts, and lower HDL cholesterol best explained scores on tests of executive functioning, including a modified computer version of the WCST. Hanninen et al. (1997) found that individuals with age-associated memory impairment also perform poorer than controls on the number of categories achieved, overall correct responses, and perseverative errors.
The identification of malingering has been studied with the WCST. A number of investigators have developed regression equations based on various WCST outcome measures that discriminate between malingerers and nonmalingerers (Bernard et al., 1996; Suhr & Boyer, 1999). The basic premise behind deriving such formulas is that it is more difficult for individuals to feign a specific pattern of performance. Bernard et al. (1996) used Categories Completed and Perseverative Errors in deriving their malingering formulas. Categories Completed was considered the "obvious" measure since most individuals who attempt to perform poorly would know not to score well on this outcome measure. In contrast, Perseverative Errors were considered the "subtle" measure since most individuals would not intuit the significance of committing this type of error. Using this method, Bernard et al. (1996) adequately classified brain-injured individuals (sensitivity 86%) and simulated malingerers (specificity 94%). Classification rates were somewhat lower when discriminating simulated malingerers from a mixed group of neurological patients (i.e., sensitivity 58%, specificity 90%). Bernard et al. (1996) found a false-positive rate of approximately 5%, which was replicated by Donders (1999b). Given the high correlation between Categories Completed and Perseverative Errors, Suhr and Boyer (1999) chose failure to maintain set and Categories Completed in their set of logistic regression formulas used for the detection of malingering.
These investigators found that these formulas classified undergraduate simulating malingerers and undergraduate nonmalingerers at relatively high rates, 70.7% and 87.1% respectively. Patients suspected of malingering were discriminated
from brain-injured patients at even higher rates (i.e., sensitivity = 82.4% and specificity = 87.5%). Subsequent studies, however, have found that classification with either Bernard et al.'s (1996) or Donders' (1999b) formulas can produce false-positive rates as high as 41% depending on the clinical sample used and the severity of illness (Greve & Bianchini, 2002). It is clear that further studies using either regression-based formulas or cutoff scores on the WCST are needed to adequately evaluate whether this test is useful in detecting malingerers.
Modifications and Alternate Formats of the WCST
Modified Card Sort Test (MCST)
This version has also been referred to as the modified WCST (mWCST; Bondi et al., 1993; Lineweaver et al., 1999; Nagahama et al., 2003). Nelson (1976) altered the WCST substantially by reducing the number of stimulus cards, eliminating ambiguous responses (i.e., cards with more than one shared attribute with the stimulus cards), and altering examiner feedback to participants. Because this version eliminates the ambiguity in responding, it is thought to be best for patients with severe impairment. De Zubicaray and Ashton (1996) offer an excellent review of the studies that have used the MCST. In this version, the same four reference (key) cards as the 128-card version are used; however, the stimulus cards that share more than one attribute with the reference card (e.g., color and number) have been eliminated, thereby leaving two decks of 24 response cards (Nelson, 1976). Thus, the response cards share only one attribute with the stimulus cards (i.e., color with the first card, form with the second card, and number with the third card) and no attribute with the fourth card. Whichever category the participant matches is determined to be the first principle, and each sorting principle changes after six, not ten, consecutive correct sorts. Once participants obtain six correct sorts to a category, they are instructed to find another rule to which to sort. The test is discontinued after completion of six categories. Nelson
(1976) suggested obtaining the following outcome measures: Categories Completed, Total Errors, Nonperseverative Errors, Perseverative Errors, and Percent Perseverative Errors ([Perseverative Errors/Total Errors] × 100). Additional scores, such as Conceptual Level Responses and Trials to Complete First Category, can also be obtained using Heaton et al.'s (1993) scoring criteria (Nagahama et al., 2003). Investigators who argue that this shortened, simplified version of the WCST is not sensitive to detecting mild cognitive impairment have altered the MCST to include another deck of 24 cards (total of 72 unambiguous cards) and have changed the instructions so that participants are no longer alerted when the sorting principle is changed (Hart et al., 1988; Jenkins & Parsons, 1978). Direct comparison of the MCST and the WCST is difficult given the substantial differences between the two tests; however, some studies have demonstrated the MCST's sensitivity to detection of brain damage (Bondi et al., 1993; Nelson, 1976; Vanden Broek et al., 1993). Lineweaver et al. (1999) reported moderate test-retest reliability for this version (see Psychometric Properties of the WCST below) and provided extensive normative data as well as raw score to standard score conversion tables that correct for age and educational level.
64-Card WCST Version
This shorter version of the 128-card WCST is gaining popularity among clinicians and researchers given the shortened administration time as well as the similarity in administration procedures and scoring criteria to the 128-card version. Throughout the literature, this version has also been referred to as the "abbreviated WCST" or the "WCST-64." Test administration is virtually identical to the 128-card version, except that only the first deck (64 cards) is presented to participants (Kongs et al., 2000). As with the full version, 10 correct sorts to the predetermined principle constitute a complete categorical sort and the task is discontinued when all of the 64 cards have been sorted. Axelrod et al. (1992a,b) found a moderate correlation between this
version and the full 128-card version for all measures (most coefficients ranging 0.64-0.74), except for total correct responses, which was low (coefficient 0.22) yet statistically significant. The authors argued for the validity of this shortened version for use with healthy individuals but noted that its clinical utility would have to be further studied. Studies have since found this shortened version to yield clinically meaningful data for patient groups such as those with Parkinson's disease (Paolo et al., 1996a,b; Robinson et al., 1991). However, Axelrod et al. (1996) argue against simple conversion of scores to percentages to obtain demographically c
these are not norms that are used in individual cases. For the demographically corrected T scores, they found that correlations between the WCST-64 and the full version were relatively high (r = 0.75-0.88) but that the WCST-64 captured only approximately 59% of the full WCST scores when a 5-point margin of error was set. Merrick et al. (2003) reported similar findings for a group of head-injured patients. They found that T scores for perseverative responses were on average over half a standard deviation (SD) lower for the WCST-64 relative to the full version for these patients. In fact, for a quarter of the sample, T scores for perseverative responses obtained for the WCST-64 were over 10 points (1 SD) lower relative to the full version. These findings suggest that for brain-damaged patients the 64-card version may not be equivalent to the full WCST.
Computerized Administration Version
A number of commercial and noncommercial computer administration and scoring software programs exist for the WCST (Harris, 1988; Heaton, 1993, 2003a,b; Keller & Davis, 1998). Most computerized versions are designed to administer the 128-card or the 64-card WCST. In most cases, the testing procedures are kept similar, if not identical, to the administration manual (Heaton, 1981; Heaton et al., 1993). Thus, the four reference cards appear at the top of the computer screen, and the task of the participant is to sort each card that appears at the bottom (typically center) of the screen to one of the reference cards by pressing a computer key or using the mouse. Written feedback is provided to the participant (i.e., the word right or wrong appears after each match). In some computerized versions, a combination of written and auditory cues (i.e., different tones for correct and incorrect responses) are provided. Artiola i Fortuny and Heaton (1996) compared the standard (manual) WCST administration to a computerized version in a sample of healthy adults in Madrid, Spain, and found no differences between any of the outcome measures, except trials to complete first category. Specifically, it appears that individuals require an average of six more trials to obtain
the first category in the computerized compared to the manual administration. The authors speculate that individuals may need more time to become familiar with the computer-testing format. Conversely, Feldstein et al. (1999) found that central tendency measures and variability scores differed for one of the computerized versions (Keller & Davis, 1998) relative to the manual administration of the WCST.
Psychometric Properties of the Test
There is surprisingly sparse information available regarding the reliability of the 128-card WCST. The Heaton et al. (1993) manual reports adequate internal reliability coefficients (range 0.37-0.72) that are based on the generalizability theory (i.e., how well the test depicts a participant's true score) for healthy children and adolescents. The manual, however, does not report test reliability data for adults who are over 18 years of age. Paolo et al. (1996a) found moderate to low test-retest reliability for testing probes separated by 1 year. They found correlation coefficients ranging from as low as 0.12 (for Learning to Learn) to 0.65 (for Categories Completed). They also reported stability coefficients for a 1-year retesting period to be rather low, ranging from 0.55 (Nonperseverative Errors) to 0.66 (Total Errors). These authors expressed concern over the reliability and poor stability of the WCST measures and indicated that the WCST may not accurately measure change in problem-solving skills in normal adults. Ingram et al. (1999) reported test-retest (testing probes separated by 1-71 days) reliability coefficients of 0.34 (total correct responses) to 0.83 (perseverative responses) for 11 WCST outcome measures in a group of patients with untreated sleep apnea. Paolo et al. (1996a) also found practice effects (for testing probes separated by 1 year), ranging from one-third to nearly one-half of a standard deviation, for all outcome measures of the 128-card version of the WCST, with the exception of Categories Completed, Trials to Complete First Category, Learning to Learn, and Nonperseverative Errors. They note that the performance of their sample was quite
similar to that of the Heaton et al. (1993) sample, suggesting that their findings can be generalized to that normative data set. Ferland et al. (1998) also found significant improvement in WCST performance (e.g., 38% for Perseverative Errors and 40% for Perseverative Responses) of patients with TBI when testing probes were separated by 5 months. Although WCST scores improved for the normal control group as well, these differences did not reach statistical significance. Lineweaver et al. (1999) reported modest test-retest (probes separated by 1 year) correlation coefficients for Nelson's MCST. Coefficients were 0.46 for Nonperseverative Errors, 0.56 for Categories Completed, and 0.64 for Perseverative Errors. However, no significant practice effects, particularly for categories completed and perseverative errors, were observed. Relatively strong intra- and interrater scoring reliabilities have been obtained for hand scoring of the WCST. Axelrod et al. (1992a) reported interrater scoring reliability coefficients of 0.93 for Perseverative Responses, 0.92 for Perseverative Errors, and 0.88 for Nonperseverative Errors. Additionally, they found excellent consistency in the way raters applied the scoring rules, with correlation coefficients ranging 0.91-0.96. Despite adequate inter- and intrascorer reliability, Axelrod et al. (1994b) found that accuracy in scoring can improve even more for novice scorers if they use the written supplements produced by Axelrod et al. (1992a). Greve (1993) found that clinical neuropsychologists who used the Heaton (1981) manual were less accurate than "experts" (neuropsychologists with at least 5 years of experience with the WCST) and novices. Finding that the computer scoring method assisted novices in their scoring accuracy, Greve (1993) recommended its use. Heaton et al. (1993) cite two studies that have demonstrated adequate concurrent validity for the WCST. 
Shute and Huertas (1990) found that in a group of college students Perseverative Errors loaded on a Piagetian measure of formal operational reasoning. Similarly, Perrine (1993) found that Total Errors, Categories Completed, and Perseverative Responses were
correlated with an attribution identification test (a measure of concept formation). The validity of the WCST as a measure of executive functioning and frontal lobe dysfunction has also been demonstrated in various patient populations (see Brief Overview of Clinical Findings Using the WCST, above). Wildgruber et al. (2000) examined the sensitivity and specificity of the WCST Perseverative Responses scores of patients with frontal lobe damage, nonfrontal lobe damage, a mixed group of brain-damaged individuals, and controls. They found that WCST Perseverative Responses was not very sensitive at detecting frontal lobe patients (only identified 65.4%) or very specific at classifying controls (60.9%). Other studies, however, have found the WCST to be effective at discriminating healthy controls from brain-damaged individuals (discrimination accuracy was approximately 71% with all outcome measures). Rossi et al. (2000) found that the overall classification rate of the WCST was 60.59% when discriminating between patients with schizophrenia, bipolar disorder, and healthy controls. However, the classification rates of patient groups (schizophrenia 48.5%, bipolar 40%) were far lower than that of controls (85.9%). Additional studies with clinical populations have found that discrimination between various psychiatric and neurological conditions (e.g., schizophrenia, mood disorders, and head injury) is difficult with the WCST (Axelrod et al., 1994a); however, this is not unexpected given that all of these disorders involve frontal lobe dysfunction. Feldstein et al. (1999) assessed whether the normative data reported by Heaton et al. (1993), in which the standard (manual) administration procedure was used, are equivalent to various WCST computer administration procedures (e.g., keyboard, mouse, or touch screen).
The authors found that while there was no statistical difference between the mean scores for all WCST measures (with the exception of failure to maintain set), significant differences existed in central tendency, dispersion, and distribution shapes between the published normative data (manual administration) and the computerized administration.
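The sensitivity, specificity, and classification-rate figures quoted in this section all derive from a two-by-two table of test decisions against known group membership. The following is a minimal sketch of that arithmetic in Python. The class name, function names, and counts are hypothetical, chosen only so that the resulting percentages resemble those quoted above; they are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationCounts:
    """Counts from a 2x2 classification table (all values hypothetical)."""
    true_positives: int   # target-group cases flagged by the WCST cutoff
    false_negatives: int  # target-group cases missed by the cutoff
    true_negatives: int   # comparison cases correctly not flagged
    false_positives: int  # comparison cases incorrectly flagged


def sensitivity(c: ClassificationCounts) -> float:
    """Proportion of target-group cases that the cutoff identifies."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ClassificationCounts) -> float:
    """Proportion of comparison cases that the cutoff correctly leaves unflagged."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


def false_positive_rate(c: ClassificationCounts) -> float:
    """Proportion of comparison cases wrongly flagged (1 - specificity)."""
    return c.false_positives / (c.true_negatives + c.false_positives)


def overall_accuracy(c: ClassificationCounts) -> float:
    """Proportion of all cases classified correctly."""
    total = (c.true_positives + c.false_negatives
             + c.true_negatives + c.false_positives)
    return (c.true_positives + c.true_negatives) / total


# Hypothetical example: 26 target-group cases and 23 comparison cases.
example = ClassificationCounts(true_positives=17, false_negatives=9,
                               true_negatives=14, false_positives=9)

if __name__ == "__main__":
    print(f"sensitivity:         {sensitivity(example):.1%}")
    print(f"specificity:         {specificity(example):.1%}")
    print(f"false-positive rate: {false_positive_rate(example):.1%}")
    print(f"overall accuracy:    {overall_accuracy(example):.1%}")
```

Because the false-positive rate is simply the complement of specificity, a formula that looks acceptable in one sample can misclassify many genuine patients in another sample with a different mix or severity of cases.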
Investigators have found a weak relationship between the WCST and another popular test of reasoning, the Category Test (Crockett et al., 1986; Donders & Kirsch, 1991; Pendleton & Heaton, 1982; Perrine, 1993). Perrine (1993) attributes this weak relationship (of approximately 30% shared variance) to the fact that the two tests measure different underlying aspects of conceptual processes. He argues that the WCST measures "attribute identification" (selection of critical features for encoding and categorizing), while the Category Test assesses "rule learning" (relating two or more concepts with a logical rule). Interesting test order effects have been observed by Brandon and Chavez (1985) when administering the Category Test and the WCST. Essentially, these authors found that while administering the WCST first did not alter participants' scores on the Category Test, perseverative responses and total errors tended to decrease for the WCST when the Category Test preceded the WCST. This effect was present across various administration delays, including back-to-back test administrations and when administration of the tests was separated by 1 hour or by 24 hours.
Recent investigations into the ecological validity of the WCST have demonstrated that it can predict the ability to carry out activities of daily living and the type of occupational position one is likely to hold (Kibby et al., 1998; Little et al., 1996). Furthermore, accuracy on a shopping task (Rempfer et al., 2003) can be predicted from the number of perseverative responses. Trials to First Category and total correct responses can predict task orientation at a vocational work placement in a sample of schizophrenics (Lysaker et al., 1995).
Factor Structure of WCST
Several studies have examined the latent structure of the WCST and have produced varying results. There appears to be a discrepancy in the number of factors obtained, and this may be due to the number of WCST outcome measures used in a particular study and whether these selected measures contain redundancy (e.g., perseverative errors and percent perseverative errors) or to the type of
sample used (type of clinical sample or control groups). In general, fewer factors seem to be obtained when nonclinical samples are used. Several investigators have found that in healthy adults a unitary factor best accounts for the WCST outcome measures (Boone et al., 1998; Bowden et al., 1998; Goldman et al., 1996; Pineda & Merchan, 2003). However, Salthouse et al. (1996) examined a group of healthy adults aged 18-94 years and reported the existence of two factors, with the majority of the WCST measures (e.g., Total Errors and Perseverative Errors/Responses, Categories Completed) loading on the first factor. Paolo et al. (1995) reported three underlying factors for healthy older controls, with Categories Completed, Total Errors, Perseverative Responses/Errors, Nonperseverative Errors, and Conceptual Level Responses loading on the first factor, Failure to Maintain Set loading on the second factor, and Learning to Learn and Trials to First Category loading on the third factor. These findings are essentially the same as those reported by Greve et al. (1999) for a sample of head-injured patients. Studies using various clinical samples have most typically found three factors that best account for the WCST outcome measures (Greve et al., 1998; Sullivan et al., 1993; Weigner & Donders, 1999). Sullivan et al. (1993) examined 11 WCST scores of 58 individuals (schizophrenics, alcoholics, normal controls) and found a three-factor solution that accounted for 91% of the total variability. They labeled factor 1 "Perseverations" (7 of the 11 scores), factor 2 "Inefficient Sorting" (2 scores), and factor 3 "Nonperseverative Errors" (2 scores). According to Sullivan et al., factors 1 and 3 required executive and memory skills, while factor 2 appeared unrelated to either type of skill. Koren et al.
(1998) replicated the same three-factor structure in a group of patients with schizophrenia and controls and found the perseveration factor to be best for distinguishing patients from controls. Greve (1993) examined 270 patients and controls and found only two factors that accounted for 91% of the total variability. They called factor 1 "Problem Solving" and factor 2
"Failure to Maintain Set" (since this outcome measure predicted the most variability). Greve et al. (1997) again found essentially the same two factors in a group of 135 college students and a mixed clinical sample of 139. These factors are quite similar to Sullivan et al.'s (1993) Perseveration and Inefficient Sorting factors. Greve et al. (1996) suggested that factor 1 reflected problem solving, while factor 2 was interpreted as a measure of attentional processes, which was later confirmed with a color overlay study. In follow-up studies of etiologically mixed groups of patients with TBI and chronic, severe TBI, however, Greve and colleagues found a three-factor solution which they labeled "Cognitive Flexibility" (contained Total Correct Responses, Percent Conceptual Level Responses, Categories Completed, Perseverative Errors, and Perseverative Responses), "Problem-Solving" (contained Nonperseverative Errors), and "Response Maintenance" (contained Failure to Maintain Set; Greve et al., 1999, 2002). Weigner and Donders (1999), using a higher-functioning group of TBI patients than Greve and colleagues, found a virtually identical three-factor solution to that of Greve et al. (1999). Nagahama et al. (2003) also found a three-factor structure for Nelson's MCST. Using patients with Alzheimer's disease, patients with mild cognitive impairment, and healthy controls, Nagahama et al. (2003) found that Perseverative Errors, Categories Completed, Trials to Complete First Category, and Conceptual Level Responses loaded on the first factor, termed "Perseveration," and accounted for 57% of the common variance. Failure to Maintain Set loaded on the second factor, termed "Insufficient Sorting," and accounted for 23% of the common variance. Nonperseverative Errors loaded on the final factor, termed "Nonperseverative Error," and accounted for 10% of the common variance. These factor structures are quite similar to those reported by Greve and colleagues and Sullivan et al. (1993).
While it is clear from years of research that the WCST taps into specific executive skills, it appears to not heavily load on those executive functioning tests that require speed (Boone et al., 1993a; Welsh et al., 1991). Boone et al.
(1998) examined the factor structure of the WCST and three other tests of executive function (Stroop, Verbal Fluency, and Auditory Consonant Trigrams) in a group of 250 patients and controls. They found that the four outcome measures of the WCST (Categories, Percent Conceptual Level Responses, Errors, Percent Perseverative Responses) loaded robustly on their own factor (and independent of the other executive tests), accounting for 23% of the total variance. A word of caution comes from Bowden et al. (1998), in a study examining the reliability and validity of the WCST in a group of alcohol-dependent individuals and a group of college students. They found only one underlying factor out of six WCST outcome measures examined. The authors suggest that patients' "pattern" of performance on the different WCST outcome measures should be interpreted with caution given that all of the WCST outcome measures were accounted for by one factor. Further, based on their data, they suggest that test-retest assessment over time may not produce accurate clinical interpretation.
RELATIONSHIP BETWEEN WCST PERFORMANCE AND DEMOGRAPHIC FACTORS
Effect of Age
Age-related declines in WCST performance on both the 128-card and the 64-card versions have been widely and consistently reported (Anderson et al., 1991; Arbuckle & Gold, 1993; Axelrod & Henry, 1992; Axelrod et al., 1993, 1996; Crockett et al., 1986; Daigneault et al., 1992; Davis et al., 1990; Head et al., 2002; Heaton et al., 1993; Kramer et al., 1994; Laiacona et al., 2000; Mejia et al., 1998; Parkin & Walter, 1991, 1992; Salthouse et al., 1996, 2003; Spencer & Raz, 1994), but a slight discrepancy appears regarding the age at which WCST scores begin to decline and the specific WCST measures that show age-related decline. Heaton et al. (1993), using 899 individuals aged 6.5-90 years, reported a quadratic effect for age on all WCST measures, with the proportion of variance accounted for by age
ranging 2%-21%. They found a significant improvement in WCST scores from ages 6.5 to approximately 19 years, with little change over ages 20-50 years, and then a relatively sharper decline in performance after 60 years. Virtually identical age effects were found when the same data were reanalyzed using responses from only the first deck of cards (in the manual for the 64-card version; Kongs et al., 2000). These results are consistent with other findings using the 64-card version (Axelrod et al., 1993). The reports of Heaton et al. (1993) and Kongs et al. (2000) of declining WCST performance after the sixth decade of life are consistent with those of other investigators (Axelrod & Henry, 1992; Compton et al., 2000; Craik et al., 1990). Axelrod and Henry (1992) compared the WCST performance of Heaton's (1981) sample of 40-year-olds to their sample of individuals in their 50s, 60s, 70s, and 80s. They found that only for Perseverative Responses did the age-related decline begin at age 50 and that, for Categories Completed and Perseverative Errors, differences were observed for those 60 years and older. In a study examining highly educated individuals (university faculty), Compton et al. (2000) found that there were essentially no changes in any of the WCST measures in those in their 30s, 40s, and 50s but that those over 60 years completed fewer categories, required more trials, had fewer percent correct responses, committed more perseverative errors, and had lower conceptual level responses than those in their 30s and 40s. In contrast, while Boone et al. (1993) also found age-related declines in WCST performance, they report that performance does not deteriorate until after the age of 70 years. Further, individuals who were 70 years or older also only displayed deficits on Total Errors and Conceptual Level Responses relative to individuals who were 60 years or younger. These findings are more consistent with the results of Haaland et al.
(1987), who found that in a sample aged 17-87 years only those 80-87 years old performed poorer than those 64-69 years old. These authors also found that the 80-year-old group committed a greater total number of errors but that this group completed fewer categories relative to the 60-year-old group.
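The quadratic age effect and the "proportion of variance accounted for by age" described above can be illustrated with an ordinary least-squares fit of a second-degree polynomial, followed by the usual R² calculation. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the example scores are synthetic values generated from an exact inverted-U curve (not normative data), and the cited authors' actual regression procedures may differ.

```python
def fit_quadratic(ages, scores):
    """Least-squares fit of score = b0 + b1*d + b2*d**2, with d = age - mean age.

    Centering age keeps the normal equations well conditioned.
    Needs at least three distinct ages. Returns (mean_age, [b0, b1, b2]).
    """
    mean_age = sum(ages) / len(ages)
    d = [a - mean_age for a in ages]
    # Sums of powers of d and cross-moments with the scores.
    power_sums = [sum(di ** k for di in d) for k in range(5)]
    moments = [sum(s * di ** k for di, s in zip(d, scores)) for k in range(3)]
    # Augmented 3x3 normal-equation matrix.
    rows = [[power_sums[i + j] for j in range(3)] + [moments[i]]
            for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(3):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [vr - factor * vc
                           for vr, vc in zip(rows[r], rows[col])]
    return mean_age, [rows[i][3] / rows[i][i] for i in range(3)]


def variance_explained(ages, scores, mean_age, coeffs):
    """R squared: share of score variance reproduced by the fitted curve."""
    b0, b1, b2 = coeffs
    predicted = [b0 + b1 * (a - mean_age) + b2 * (a - mean_age) ** 2
                 for a in ages]
    mean_score = sum(scores) / len(scores)
    ss_residual = sum((s - p) ** 2 for s, p in zip(scores, predicted))
    ss_total = sum((s - mean_score) ** 2 for s in scores)
    return 1.0 - ss_residual / ss_total


# Synthetic illustration only: an exact inverted-U curve peaking at age 35.
example_ages = [10, 20, 30, 40, 50, 60, 70, 80]
example_scores = [50 - 0.01 * (age - 35) ** 2 for age in example_ages]

example_mean_age, example_coeffs = fit_quadratic(example_ages, example_scores)
example_r2 = variance_explained(example_ages, example_scores,
                                example_mean_age, example_coeffs)

if __name__ == "__main__":
    print("coefficients (centered age):", example_coeffs)
    print("proportion of variance explained:", example_r2)
```

A negative quadratic coefficient corresponds to the pattern described in the text: improvement through adolescence, a plateau in mid-adulthood, and decline in later decades. With real, noisy scores the explained variance would fall well below 1.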
Other studies which have not used continuous age groupings but rather younger vs. older groups have also found age-related differences on the WCST. Beatty (1993) found that older participants (mean age 70 years) performed poorer than middle-aged (mean age 40) and young (mean age 20) participants on Categories Completed, Number of Perseverative Errors, and Total Errors but not on Nonperseverative Errors, Trials to First Category, or Failure to Maintain Set. He also noted greater within-group variability for the older group. Parkin and Walter (1991) observed a difference between their sample of "young" adults, who were an average of 34 years, and their "old" adults, who were an average of 80 years, on all four of the WCST measures they examined (Categories Completed, Total Errors, Perseverative Errors, and Nonperseverative Errors). These findings were replicated in a second study (Parkin & Walter, 1992). In a sample aged 60-80, Craik et al. (1990) found a significant correlation between age and Perseverative Errors. Spencer and Raz (1994) found an interesting age-by-education interaction effect; their older group (mean age 69.5 years) was more highly educated than their younger group (mean age 23.8 years), but the younger group outperformed the older group on the three WCST scores measured (Total Errors, Perseverative Errors, and Categories Completed). Using her version of the WCST, Nelson (1976) reported that age had a deleterious effect on the number of categories achieved but not on the types of error. Similarly, Isingrini and Vazou (1997), using the same version, found that individuals aged 25-46 years achieved a greater number of categories and committed fewer total and perseverative errors than a group of individuals aged 70-79 years. Lineweaver et al. (1999) observed that age correlated with various measures of Nelson's MCST, including number of categories achieved and number of nonperseverative errors.
However, they noted these age effects to be rather subtle until the eighth decade of life, when a sharper decline in performance was detected. Caffarra et al. (2004) also found age effects for perseverative errors and number of categories achieved.
In sum, robust findings of age-related decline in WCST performance have been reported, and while there is some discrepancy regarding the age at which WCST performance begins to deteriorate, studies tend to agree that there is little change in performance between ages 20 and 50 years (Boone et al., 1993; Compton et al., 2000; Heaton et al., 1993; Yeudall et al., 1986). The changes appear to occur after the age of 60 (Heaton et al., 1993), with sharper deterioration in performance occurring in the seventh and eighth decades of life (Beatty, 1993; Boone et al., 1993; Craik et al., 1990; Haaland et al., 1987; Lineweaver et al., 1999; Parkin & Walter, 1991). Additionally, there appears to be relatively good agreement that the total number of errors increases with increasing age and that number of categories completed decreases with age (Beatty, 1993; Boone et al., 1993; Haaland et al., 1987; Heaton et al., 1993; Parkin & Walter, 1991; Spencer & Raz, 1994). Some investigators have reported increased perseverative errors with increasing age (Craik et al., 1990; Heaton et al., 1993; Parkin & Walter, 1991; Spencer & Raz, 1994), while others have shown age-related increases in set failures (Beatty, 1993; Heaton et al., 1993). Both positive findings (Lineweaver et al., 1999; Heaton et al., 1993; Parkin & Walter, 1991) and negative findings (Beatty, 1993; Boone et al., 1993) regarding the relationship between nonperseverative errors and age have been reported. Conceptual level responses have also been shown to decrease with age (Boone et al., 1993; Compton et al., 2000; Heaton et al., 1993).
Effect of Education
The effect of education on the WCST has been well documented in the literature (Boone et al., 1993a, 1998; Heaton, 1981; Heaton et al., 1993; Laiacona et al., 2000; Stratta et al., 1993). In his first manual, Heaton (1981) reported that the effects of education on WCST measures were most apparent after 15 years of formal education. For their large sample of adults (aged 20 years and older), Heaton et al. (1993) reported a steady, linear relationship between most age-adjusted WCST measures and education, with poorer WCST performance for lower educational levels. The only two measures that did not exhibit a relationship with education were Failure to Maintain Set and Learning to Learn. Laiacona et al. (2000) reported a similar improvement in WCST scores with greater education in a group of healthy Italian individuals. Education particularly affected Perseverative Responses and Nonperseverative Errors. Compton et al. (1997) found that education was the most predictive demographic factor of the number of set failures in a highly educated group of healthy adults aged 25-72 years. However, in another sample of 102 highly educated adults, Compton et al. (2000) did not find a correlation between education and any of the WCST measures. Boone et al. (1993) found that healthy middle-aged and older adults with >16 years of education outperformed those with ≤12 years of education on most of the WCST measures (e.g., Total Errors, Perseverative Responses, Percent Perseverative Errors, and Percent Conceptual Level Responses). In a later study, Boone et al. (1998), using stepwise multiple regression analyses, found that education emerged as a significant predictor for Perseverative Responses, Percent Conceptual Level Responses, and Total Errors.
One study reported an interaction between gender, education, and WCST performance. In a group of healthy individuals aged 15-40 years, Yeudall et al. (1986) found that only "other" (unique) responses correlated with education in women but that in men Perseverative Errors, Nonperseverative Errors, and Total Errors correlated significantly with education. Interestingly, in a Hispanic, Spanish-speaking sample, Mejia et al. (1998) found no difference in any of the WCST measures between adults with very little education (2-5 years) and those with more education (6-11 years). However, performances reported for both groups were strikingly poor.
It is possible that, as indicated by Heaton (1981), effects of education emerge only in those with a college education.
Lineweaver et al. (1999) demonstrated that education affects performance on Nelson's MCST. Specifically, they found that individuals with an elementary school education committed more perseverative errors and completed fewer categorical sorts relative to those with greater education. Similarly, Caffarra et al. (2004) found a relationship between education and MCST (Perseverative Errors and Categories Completed).
Effect of Gender
Most studies have not found effects of gender on the WCST (Heaton et al., 1993; Kongs et al., 2000; Laiacona et al., 2000; Yeudall et al., 1987). Heaton et al. (1993), in their normative sample of 899 children, adolescents, and adults, found no gender-related differences on any of the WCST outcome measures. In contrast, using an older sample of healthy participants (mean age approximately 62 years), Boone et al. (1993) found that females outperformed males on the following six WCST measures: Categories Completed, Total Errors, Perseverative Responses, Percent Perseverative Errors, Percent Conceptual Level Responses, and Trials to First Category. Women and men performed similarly on Set Failure and "Other" Responses. It is possible that the superior performance by women relative to men appears only in middle to old age. Ferland et al. (1998) found an opposite gender effect in a group of young undergraduate college students. The authors administered a modified version of the 128-card WCST (in which all 128 cards were administered regardless of whether or not six categories were completed) to healthy and brain-injured individuals. The test was administered twice (separated by 5 months), and the authors found that males outperformed females when the test scores were collapsed over the two administrations. No gender effects, however, were present when only the first (standard) test administration was examined. Thus, these findings may be a result of differential practice effects for males vs. females, a possibility that was not further examined in this study. No gender differences have been reported for Nelson's MCST (Caffarra et al., 2004; Lineweaver et al., 1999).
Effect of Intellectual Level
Very few studies have specifically examined the relationship between the WCST and intellectual functioning, although those that are available have observed a relationship between WCST scores and IQ. Heaton (1981) reported a correlation between the majority of the WCST outcome measures and FSIQ. Similarly, Merriam et al. (1999) found that IQ correlated with most WCST measures in a group of healthy controls as well as in patients with major depression or schizophrenia. Boone et al. (1993a) did not find differences on any of the WCST measures for four IQ levels (90-109, 110-119, 120-129, and 130+). However, none of their participants had lower than average IQ; thus, the nonsignificant findings may have been due to a restricted range in IQ. In a later study, using multiple regression analyses, Boone et al. (1998) found that FSIQ accounted for significant test score variability in all of the WCST measures examined (Categories Completed, Perseverative Responses, Percent Conceptual Level Responses, Total Errors, and Trials to First Category). Isingrini and Vazou (1997), using Nelson's MCST, found that better performance on the WAIS Similarities subtest and Cattell's Matrices (a measure of fluid intelligence) related to better performance on Categories Completed, Total Errors, and Perseverative Errors. Yet, no relationship between these MCST scores and WAIS Vocabulary or Information subtests was found. The authors interpreted these results to mean that the MCST (and perhaps the WCST) is correlated with fluid, but not crystallized, intelligence. Further studies are needed to better delineate the relationship between intelligence and WCST performance.
Effect of Ethnicity
To date, very few studies have examined the effects of ethnicity on WCST performance. In a study comparing manual and computer administrations of the WCST, Artiola i Fortuny and Heaton (1996) found that the performance of healthy adults from Madrid, Spain, was quite similar to that of North American samples when raw scores were converted into demographically corrected standard scores using the Heaton et al. (1993) norms. Rey et al. (1999) examined the performance of 75 Hispanic individuals on the WCST and other neuropsychological tests. Participants came from Cuba, Peru, Venezuela, Puerto Rico, Panama, Colombia, Honduras, and Nicaragua. While no statistical analyses are reported, the authors note that means and SDs for all of the WCST measures of their Hispanic sample appear comparable to those of Heaton's (1981) normative sample. Artiola i Fortuny et al. (1999) have included the WCST in their standardized and validated battery of neuropsychological tests culturally adapted for Spanish-speaking individuals. Normative data based on 390 participants, which were collected in Spain, Mexico, and the United States, are stratified by geographical area × age × education.
METHOD FOR EVALUATING THE NORMATIVE REPORTS
To adequately evaluate the WCST normative reports, six key criterion variables were deemed critical. The first four of these relate to subject variables, and the last two refer to procedural issues. Minimal requirements for meeting the criterion variables were as follows.
Subject Variables
Sample Size
As discussed in previous chapters, a minimum of 50 subjects per grouping interval is optimal.
Sample Composition Description
As discussed earlier, information regarding medical and psychiatric exclusion criteria is important; it is unclear if geographic recruitment region, socioeconomic status or occupation, ethnicity, gender, and recruitment procedures are relevant, so until determined, it is best that this information be provided.
Age Group Interval
Given the association between age and WCST performance, information regarding the age of the normative sample is critical and normative data should be presented by age interval.
Grouping by Educational Level
Given consistent evidence of effects of educational level on WCST performance, normative data should be grouped by educational level.
Procedural Variables
Description of Administration Procedures
Due to variability in administration procedures, a detailed description, including identification of the version of the test administered, is desirable. This would allow one to select the most appropriate norms or to make corrections in interpretation of the data.
Data Reporting
To facilitate interpretation of the data, group means and standard deviations should be presented at minimum for categories achieved and one of the perseverative measures (e.g., errors or responses).
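The six criterion variables above can be read as a simple checklist. The following is a minimal sketch of such a checklist in Python; the class name, field names, and the idea of reducing each criterion to a single yes/no value are illustrative assumptions, not part of the evaluation method described in this chapter. Only the threshold of 50 subjects per grouping interval comes from the text.

```python
from dataclasses import dataclass
from typing import List

# Minimum number of subjects per grouping interval named in the text.
MIN_N_PER_INTERVAL = 50


@dataclass
class NormativeReport:
    """Hypothetical summary of one normative report (field names are illustrative)."""
    smallest_cell_n: int            # smallest n in any grouping interval
    describes_sample: bool          # exclusion criteria, recruitment, demographics
    grouped_by_age: bool            # data presented by age interval
    grouped_by_education: bool      # data grouped by educational level
    describes_administration: bool  # procedures and test version identified
    reports_means_and_sds: bool     # categories achieved plus one perseverative measure


def unmet_criteria(report: NormativeReport) -> List[str]:
    """Return the names of the criterion variables the report does not meet."""
    checks = [
        ("sample size", report.smallest_cell_n >= MIN_N_PER_INTERVAL),
        ("sample composition description", report.describes_sample),
        ("age group interval", report.grouped_by_age),
        ("grouping by educational level", report.grouped_by_education),
        ("description of administration procedures", report.describes_administration),
        ("data reporting", report.reports_means_and_sds),
    ]
    return [name for name, met in checks if not met]


# Example: a made-up report with small cells and no education grouping.
example = NormativeReport(
    smallest_cell_n=20,
    describes_sample=True,
    grouped_by_age=True,
    grouped_by_education=False,
    describes_administration=True,
    reports_means_and_sds=True,
)
print(unmet_criteria(example))
```

In practice several of these criteria are matters of degree (for example, how fully exclusion criteria are described), so a yes/no flag is only a rough stand-in for the narrative "study strengths" and "considerations" given for each study.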
SUMMARY OF THE STATUS OF THE NORMS
Data reporting for the WCST differs across studies. Some of these differences will be summarized below. Our review of the literature located WCST normative reports for adults as well as administration manuals containing comprehensive normative data (Heaton, 1981; Heaton et al., 1993; Kongs et al., 2000). Hundreds of other clinical studies have also reported control subject data. The majority of studies report the mean age, education, and gender distribution for the sample and/or for the age groups. Some studies report WAIS IQs or estimated intelligence levels and ethnic composition.
Some studies present data divided into age groups. Few studies classify participants into education groups or present data for males and females separately; few studies report data for males only or present data in age-by-education cells. Data collected on individuals from Spain, Italy, South America (e.g., Peru, Venezuela, and Colombia), and Central America (e.g., Panama, Honduras, and Nicaragua) are presented in this chapter. Test-retest data are reported in some studies, with typically 1-year intertrial intervals. Issues of reliability and/or practice effects are discussed in these studies. The studies vary in the WCST outcome measures reported. The majority report four or more measures. In the studies reviewed below, all available WCST scores reported will be presented. Among all the studies available in the literature, we selected for review those based on well-defined samples. Additionally, comprehensive normative data are available in the administration and scoring manuals of the 128-card and 64-card versions of the WCST (Heaton, 1981; Heaton et al., 1993; Kongs et al., 2000). Thus, among all of the available studies for these two versions of the WCST, only those that were published in 1993 and later and contain sample sizes of ≥50 will be reviewed in this chapter. We hope to accomplish two goals with our reviews of these studies: (1) to help the reader quickly examine the most recent normative information in order to supplement the Heaton et al. (1993) norms and (2) to cover the most recent literature in order to address any cohort effects (e.g., changes in scores over the past decade) in the WCST. For Nelson's MCST, all of the studies containing sample sizes >50 are reviewed. Summaries of the studies are presented in ascending chronological order for each version of the test separately. Studies using the original 128-card WCST administration procedure are presented first, followed by those using the 64-card version and then those using Nelson's MCST. The text of study descriptions contains references to the corresponding tables identified by number in Appendix 25. Table A25.1, the locator table, summarizes information provided in the studies described in this chapter.¹
¹ Norms for children are available in Baron (2004) and Spreen and Strauss (1998).
SUMMARIES OF THE STUDIES
WCST 128-Card Administration Version
WCST Manual
[WCST.1] Heaton, Chelune, Talley, Kay, and Curtiss, 1993 (WCST 128-Card Version)
The Wisconsin Card Sorting Test Manual: Revised and Expanded is a modified edition of Heaton's (1981) original WCST administration and scoring manual. The goals of the authors for this new edition were to include wider normative age ranges (6.5-89 years), to further clarify scoring criteria with explicit examples, and to revise scoring forms to facilitate recording participants' responses and calculating outcome measures. The authors combined essentially six different samples to obtain normative data for 899 healthy individuals. The six samples are described below.
Sample 1: This sample consisted of 453 (48% male) normal children and adolescents aged 6 years and 6 months to 17 years and 11 months recruited from public schools. Data regarding race were available only for 379 participants. Whites comprised 78% of the sample, 11% were black, and 2% were classified as "other." Exclusion criteria were a neurological dysfunction, learning disability, emotional disorder, or attention disorder. There is no mention of the geographic location from which this sample was collected.
Sample 2: This sample consisted of "49 students and friends of students who lived in the community surrounding a large urban area in the southwestern United States." Participants were 49% male, all 18 years of age, and had 12-15 years of education.
Sample 3: This sample was collected from Texas and Colorado and consisted of 150 (83% male) healthy adult participants aged 15-77 years, with educational levels ranging 7-20 years. This sample originally served as controls for a study of pesticide poisoning conducted by Heaton et al. (1991) and was described in the first WCST manual (Heaton, 1981). Exclusion criteria are presented in the original Heaton et al. (1991) study and include a history of learning disabilities, neurological illness, "significant" head injury, "serious" psychiatric illness (e.g., schizophrenia), or substance abuse.
Sample 4: Fifty (34% male) healthy participants, recruited from Colorado, comprised this sample. Participants were aged 58-84 years, with educational levels of 8-20 years. There is no mention of exclusion criteria for this sample.
Sample 5: This sample consisted of 124 (91% male, 9% female) commercial pilots primarily recruited in Colorado (only five participants were recruited in Washington, DC). Participants were aged 24-65 years and had 14-20 years of education.
Sample 6: This sample was based on a study by Axelrod and Henry (1992). Seventy-three healthy individuals (45% male, 55% female) were "recruited from a health promotion project, from independent living retirement residences, and from the general community in the Detroit metropolitan area." Ages ranged 51-89 years and years of education ranged up to 20 years. Exclusion criteria are presented in the original study by Axelrod and Henry (1992) and include a history of psychiatric hospitalization or use of psychotropic medication, history of substance abuse, neurological disorder, head injury resulting in >5 minutes of loss of consciousness, significant illness such as diabetes or chronic obstructive pulmonary disease (COPD) requiring long-term medical treatment, or Mini-Mental Status Exam (MMSE) scores of ≤24.
The manual provides regression-based raw to T-score conversions for Total Errors, Percent Errors, Perseverative Responses, Percent Perseverative Responses, Perseverative Errors, Percent Perseverative Errors, Nonperseverative Errors, Percent Nonperseverative Errors, and Percent Conceptual Level Responses. Additionally, raw score to percentile conversions are provided for Categories Completed, Trials to Complete First Category, Failure to Maintain Set, and Learning to Learn. The reader is referred to the manual, which stratifies the data for these WCST outcome measures based on gender, 14 child and adolescent age groups (aged ≤19 years), eight adult age groups (aged 20-79 years), and six education groups (≤8, 9-11, 12, 13-15, 16-17, ≥18 years). Standard test administration and scoring criteria are well described. Normative data based on this WCST manual for Perseverative Responses are also reported in the Comprehensive Norms for an Expanded Halstead-Reitan Neuropsychological Battery: Demographic Corrections, Research Findings, and Clinical Applications (Heaton et al., 1991), and Perseverative Errors are reproduced in the most updated edition (Heaton et al., 2004).
Study strengths
1. Sample composition is well described in terms of age, gender, and education.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the entire sample as well as T scores for the groups stratified by age and education are presented.
5. Sample is stratified into numerous age-by-education groupings.
Considerations regarding use of the study
1. Overall sample size is adequate, but individual cells for certain WCST outcome measures are relatively small.
2. Recruitment procedures were not well described for some of the subsamples.
3. Exclusion criteria are not specified for some of the subsamples.
Other comments
1. The interested reader is referred to Fastenau and Adams' (1996) critique of the Heaton et al. (1991) norms and to Heaton et al.'s 1996 response to this critique.
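To make the idea of a regression-based raw to T-score conversion concrete, the sketch below shows the general form such a conversion can take: a score is predicted from demographic variables, the difference between the observed and predicted score is standardized, and the result is placed on the conventional T-score metric (mean 50, SD 10). This is a generic illustration only. The function names, the linear form of the prediction, and every coefficient in the example are assumptions made up for demonstration; they are not the equations or values published in the WCST manual, which should be consulted for clinical use.

```python
def predicted_score(age: float, education: float,
                    intercept: float, b_age: float, b_edu: float) -> float:
    """Demographically predicted raw score from a simple linear model.

    The linear form and all coefficients are hypothetical placeholders.
    """
    return intercept + b_age * age + b_edu * education


def t_score(raw: float, predicted: float, residual_sd: float,
            higher_is_better: bool = True) -> float:
    """Convert a raw score to a T score (mean 50, SD 10).

    For error-type measures, where a higher raw score means worse
    performance, the standardized difference is sign-flipped so that a
    higher T score always indicates better performance.
    """
    z = (raw - predicted) / residual_sd
    if not higher_is_better:
        z = -z
    return 50.0 + 10.0 * z


# Illustrative values only: a 70-year-old with 12 years of education
# who made 24 perseverative errors, using made-up coefficients.
expected = predicted_score(age=70, education=12,
                           intercept=6.5, b_age=0.25, b_edu=-0.5)
example_t = t_score(raw=24, predicted=expected, residual_sd=8.0,
                    higher_is_better=False)
print(expected, example_t)
```

With these made-up coefficients the predicted score is 18 errors, so 24 observed errors fall 0.75 residual SDs on the poorer side of expectation and the resulting T score is 42.5.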
Normative Studies and Control Groups in Clinical Comparison Studies for the WCST
[WCST.2] Beatty, 1993 (WCST 128-Card Version) (Table A25.2)
The author examined the test performance of 65 (31 male, 34 female) healthy participants on the WCST and the California Card Sorting Test. Participants were "young" (18-34 years; mean age = 25.5, SD = 5.7), "middle-aged" (35-49 years; mean age = 40.6, SD = 6.1), and "old" (≥60 years; mean age = 70.9, SD = 6.5) individuals who served as normal controls for the author's previous studies. They were aged 18-75 and had education of 8-20 years. Individuals with a history of medical illness such as diabetes, head injuries, neurological disease, psychiatric illness, or substance abuse "that could affect their performance" were excluded from the study. Standard procedures based on the Heaton (1981) manual were used. Means and SDs are reported. The authors found that on the California Card Sorting Test the older subjects achieved fewer sorts but that they did not have increased verbal or nonverbal perseverative responses and they were able to explain their correct sorting strategies as well as the younger group. On the WCST, numbers of perseverative responses and perseverative errors were greater for the older relative to the younger group.
Study strengths
1. Sample composition is well described in terms of age, education, and gender.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
5. Sample is stratified into three age groups.
Considerations regarding use of the study
1. Overall sample size is adequate; however, individual cells are relatively small, and precise sizes are not provided (e.g., "age groups of 20-21 individuals" per cell).
2. Data are not stratified by education.
3. Educational levels are relatively high.
4. Sample composition is not well described in terms of recruitment procedures and geographic location.
[WCST.3] Boone, Ghaffarian, Lesser, Hill-Gutierrez, and Berman, 1993 (WCST 128-Card Version) (Table A25.3)
The purpose of this study was to provide further information regarding the effects of age, education, IQ, and gender on the WCST in older adults. The sample was recruited through newspaper advertisements, flyers, and personal contact from the Los Angeles, California, area. It consisted of 91 (35 males, 56 females) healthy adults who were fluent in English and aged 45-83 years, with an average education of 14.5 (2.5) years and an average IQ of 115.89 (12.97). Exclusion criteria were a history of psychotic or major affective disorder, current or past history of substance abuse, documented neurological illness, and significant medical illness that could affect the central nervous system (e.g., diabetes). Individuals were also excluded from the study based on abnormal neurological examination, significant metabolic abnormalities detected in blood tests, or abnormal MRI findings. Seventy-one subjects were white, 10 were African American, five were Asian, and five were Hispanic. Standard procedures based on the Heaton (1981) manual were used, and the protocols were computer-scored using the Harris (1988) software. The results of the study indicated that healthy middle-aged and older adults with >16 years of education outperformed those with ≤12 years of education on most of the WCST measures (e.g., Total Errors, Perseverative Responses, Percent Perseverative Errors, and Percent Conceptual Level Responses). Also, females scored higher than males on almost all of the measures, and individuals older than 70 years performed more poorly than the younger subjects on Total Errors and Percent Conceptual Level Responses.
Study strengths
1. Sample composition is well described in terms of age, education, ethnicity, gender, IQ, recruitment procedures, and geographic location.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
5. Sample is stratified first into three age groups, then by gender, and finally into three educational levels.
Considerations regarding use of the study
1. Overall sample size is adequate, but individual cells are relatively small.
2. Educational levels are relatively high.
[WCST.4] Stratta, Rossi, Mancini, Cupillari, Matteri, and Casacchia, 1993 (WCST 128-Card Version) (Table A25.4)
The authors examined the effects of education on WCST performance in a group of patients with schizophrenia and healthy controls. Sixty-one control participants were recruited from among relatives and employees of the S. Salvatore Hospital in L'Aquila, Italy. Subjects were excluded from the study if they had a personal history of substance abuse, head injury, "serious" medical illness, or psychiatric disorder or a family history of psychiatric illness. Participants were right-handed, with an average age of 31.93 (5.95) years and 12.65 (4.3) years of education. The data are stratified into three education groups (0-8, 9-13, and ≥14 years). Standard procedures based on the Heaton (1981) manual were used. Differences between the schizophrenic and control groups attenuated when the data were stratified by educational level.
Study strengths
1. Sample composition is well described in terms of age, education, recruitment procedures, and geographic location.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
5. Sample is stratified into three education groups.
Considerations regarding use of the study
1. Overall sample size is adequate, but individual cells are relatively small.
2. Gender composition of the sample is not reported.
3. Data were obtained on Italian subjects, which may limit their usefulness for clinical interpretation in the United States.
[WCST.5] Kramer, Humphrey, Larish, Logan, and Strayer, 1994 (WCST 128-Card Version) (Table A25.5)
The authors collected WCST data on controls as part of a study examining "whether inhibitory failures are general or specific in nature." The sample included 62 (26 male, 36 female) healthy individuals. The data were stratified into two age groups. The "young" individuals were aged 18-28, with an average age of 20.6 and an average of 16.4 (1.1) years of education. The "old" individuals were aged 60-74, with an average age of 67.8 years and an average of 16.3 (1.8) years of education. No SDs for age were reported. The younger group had an average IQ of 117.8 (8.5) based on the Kaufman Brief Intelligence Test, and the older group had an average IQ of 117.6 (8.4). Standard procedures based on the Heaton (1981) manual were used. With respect to the WCST, the authors found a significant age effect, with the younger group outperforming the older group on all but Trials to First Category and Conceptual Level Responses.
Study strengths
1. Sample composition is well described in terms of age, education, gender, and IQ.
2. Test administration procedures are specified.
3. Means and SDs for the test scores are reported.
4. Sample is stratified into two age groups.
Considerations regarding use of the study
1. Overall sample size is adequate, but individual cells are relatively small.
2. Exclusion criteria are not described.
3. Recruitment procedures were not reported.
4. Educational levels and IQ are relatively high.
[WCST.6] Spencer and Raz, 1994 (WCST 128-Card Version) (Table A25.6)
The authors obtained WCST data on controls as part of a study examining memory for facts, sources, and contextual detail as it relates to aging. A total of 64 (23 male, 41 female) healthy participants were divided into two age groups. The "young adults" were aged 18-35 years, with an average of 23.8, and the "older adults" were aged 65-80 years, with an average of 69.5. The older group had an average of 15.3 years of education, which was significantly greater than that of the younger age group, who had an average of 13.5 years of education. No SDs were provided for age or education. Subjects were recruited from the undergraduate psychology subject pool, advertisements, and personal invitations. Participants were excluded if they had a history of head injury, diabetes, epilepsy, "severe" substance abuse, neurological disease, use of psychotropic medications, sleep deprivation, or color blindness. The test was administered using a computer-administered version of the WCST (Neurosoft Corp., McLean, VA), which follows the standard testing procedures. With this computerized version, the participant receives auditory (2,000 Hz tone for correct and 20 Hz tone for incorrect) and written ("right" or "wrong") feedback from the computer after each response. The authors found age-related declines on all of the WCST scores they examined and further reported that Perseverative Errors was "inversely related to both factual and contextual memory tests," with the relationship to contextual memory being stronger.
Study strengths
1. Sample composition is well described in terms of age, education, gender, and recruitment procedures.
2. Adequate exclusion criteria.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
5. Sample is stratified into two age groups.
Considerations regarding use of the study
1. Overall sample size is adequate, but individual cells are relatively small.
2. Educational levels are relatively high for the older group and significantly different for the two age groups (older group has higher education than younger group).
[WCST.7] Paolo, Troester, Axelrod, and Koller, 1995 (WCST 128-Card Version) (Table A25.7)
These authors conducted principal components analyses separately for patients with Parkinson's disease and normal controls. A total of 187 (69 male, 118 female) control participants who were part of a longitudinal study were recruited from the Kansas City, KS, community and retirement centers. Participants were an average age of 69.74 (6.96) years, with an average of 14.91 (2.57) years of education. Ninety-seven percent of participants were white, 1.1% were black, and 1.1% were Hispanic. Participants were excluded if they had a history of stroke, psychiatric illness, "significant" head trauma, substance abuse, or evidence of neurological disorder that may compromise cognition or scores <130 on the Dementia Rating Scale (DRS). Standard procedures based on the Heaton (1981) manual and computerized scoring based on the Heaton (1993) scoring software were used. Principal components analysis revealed three WCST factors for the control participants and for patients with Parkinson's disease, but the factor structures were not the same for the two groups.
Study strengths
1. Sample composition is well described in terms of age, education, gender, ethnicity, recruitment procedures, and geographic location.
2. Relatively large sample size.
3. Adequate exclusion criteria.
4. Test administration procedures are specified.
5. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Sample is not stratified by age or education groups.
2. Educational levels are relatively high.
[WCST.8] Artiola i Fortuny and Heaton, 1996 (WCST 128-Card Version) (Table A25.8)
The authors examined manual and computer test administrations of the WCST in a group of 119 (51 male, 68 female) healthy participants from Madrid, Spain, who were primarily monolingual Spanish-speaking. Participants were part of a larger normative study. Exclusion criteria were a history of learning disability, "significant" head trauma, neurological illness, toxic exposure, major psychiatric illness, or substance abuse. For the total sample, participants were aged 15-59 years, with an average of 27.32 (9.11), and had 11-18 years of education, with an average of 14.35 (2.25). Sixty participants with an average age of 27.32 (10.82) and an average of 14.13 (2.33) years of education were administered the standard (manual) version of the WCST. Fifty-nine participants with an average age of 27.32 (7.06) years and an average of 14.58 (2.16) years of education were administered the computerized version of the WCST (i.e., WCST-CV2; Heaton et al., 1993). Protocols administered in the standard format were computer-scored. Testing was administered in Spanish to all participants, using standard procedures (Heaton et al., 1993). The study revealed that the demographically corrected WCST data for this Spanish-speaking sample were quite similar to the published normative data derived from North American subjects.
Study strengths
1. Sample composition is well described in terms of age, education, gender, recruitment procedures, and geographic location.
2. Relatively large sample size.
3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study
1. Sample is not stratified by age or education groups.
2. Educational levels are relatively high.
3. Data were obtained on subjects from Spain, which may limit their usefulness for clinical interpretation in the United States.
[WCST.9] Hoff, Riordan, Morris, Cestaro, Wieneke, Alpert, Wang, and Volkow, 1996 (WCST 128-Card Version) (Table A25.9)
The study examined the effects of crack cocaine use on cognitive functioning. A total of
54 male normal control participants were recruited from the community. Participants were an average of 32.1 (9.7) years of age and had an average of 15.4 (2.4) years of education. Exclusion criteria were a history of medical, neurological, or psychiatric illnesses; substance abuse; or learning disability. Forty-eight participants were White, four were African American, and two were Hispanic. It appears that the Heaton et al. (1993) WCST administration was used, but the procedures are not referenced nor are they well described. Among other findings, the authors report that, surprisingly, crack cocaine use is associated with better performance on Categories Completed.
Study strengths 1. Sample composition is well described in terms of age, education, ethnicity, and gender. 2. Adequate exclusion criteria. 3. Relatively large sample size. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age or education groups. 2. Educational levels are relatively high. 3. Test administration procedures are not specified. 4. Recruitment procedures were not reported.
[WCST.10] Paolo, Axelrod, and Troester, 1996a (WCST 128-Card Version) (Table A25.10)
The authors examined the test-retest reliability and practice effects of the WCST in a group of older adults. A total of 87 (25 male, 62 female) participants, with an average age of 68.8 (6.21) years and an average of 14.a (2.42) years of education who were part of a longitudinal study, were assessed at two different times. Testing probes were separated by approximately 1 year. Ninety-five percent of participants were White, 2% were African American, and 2% were Hispanic. Participants were excluded from the study if they had a history of stroke, "significant" head trauma, substance abuse, or neurological disorders. Additionally, participants had to have a DRS score of >130 at each testing session and could not display a >10-point drop at the second testing probe. Standard procedures based on the Heaton (1981) manual and computerized scoring based on the Heaton (1993) scoring software were used. Normalized age- and education-corrected standard scores (Heaton et al., 1993) were used for Total Errors, Perseverative Errors, and Percent Conceptual Level Responses. The study revealed practice effects for testing probes separated by approximately 1 year, with performance on most WCST measures improving by 5-7 standard points. The authors provide a number of indices, including discrepancy scores, to assist clinicians in better interpreting test-retest change scores.
Study strengths 1. Sample composition is well described in terms of age, education, ethnicity, and gender. 2. Relatively large sample size. 3. Means and SDs for some test scores are reported. 4. Adequate exclusion criteria.
Considerations regarding use of the study 1. Sample is not stratified by age or education groups. 2. Educational levels are relatively high. 3. Recruitment procedures were not reported.
[WCST.11] Rosselli and Ardila, 1996 (WCST 128-Card Version) (Table A25.11)
The effects of substance abuse on cognitive functioning were assessed in this study. A total of 63 males aged 15-48 years, with an average age of 25.61 (7.54) years and an average of 10.5 (4.58) years of education, participated in this study. Participants were recruited from Bogota, Colombia, and were Spanish-speaking. Standard procedures based on the Heaton (1981) manual were used. The authors report poorer performance on virtually all WCST measures for cocaine and
polysubstance abusers relative to normal controls.
Study strengths 1. Sample composition is well described in terms of age, education, gender, recruitment procedures, and geographic location. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age or education groups. 2. Exclusion criteria are not described. 3. All-male sample. 4. Data were obtained on subjects from Colombia, which may limit their usefulness for clinical interpretation in the United States. [WCST.12] Salthouse, Fristoe, and Rhee, 1996 (WCST 128-Card Version) (Table A25.12)
The authors examined the effects of age on various neuropsychological measures, including the WCST. A total of 259 (approximately 63% female) healthy participants aged 18-94, with an average age of 51.4 (18.4) years and approximately 15 years of education (no SD available), were recruited. All participants were self-reported to be in good, very good, or excellent health. Standard procedures based on the Heaton et al. (1993) manual were used. A significant age-related decline in most WCST measures was reported; however, the strong intercorrelation among the WCST scores and with other neuropsychological test scores suggests "only a portion of the age-related influences on many commonly used neuropsychological measures is specific and potentially localized."
Study strengths 1. Sample composition is well described in terms of age, gender, and education. 2. Relatively large sample size. 3. Test administration procedures are specified.
4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age or education groups. 2. Educational levels are relatively high. 3. Exclusion criteria are not described. 4. Recruitment procedures are not reported. [WCST.13] Compton, Bachman, and Logan, 1997 (WCST 128-Card Version) (Table A25.13)
These authors examined age-associated changes in intelligence and other cognitive domains in a group of university faculty. Participants were 52 (30 male, 22 female) nonpsychology faculty members of the Georgia College and State University. Participants were aged 25-72, with an average of 47.74 (11.77) years, and ranged in educational level from 16 to 20 years, with an average of 18.44 (1.69). Participants were recruited by phone. Due to the "invasive nature of the protocol" and privacy issues, given that psychology graduate students conducted the testing, faculty from the Psychology Department were not included in the sample. The computerized administration version of the standard procedures based on the Heaton (1981) manual was used. The authors found that age predicted the number of categories completed and the average response time on the WCST, while education predicted the number of set failures in this highly educated group.
Study strengths 1. Sample composition is well described in terms of age, education, gender, recruitment procedures, and geographic location. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age and education groups. 2. Educational levels are relatively high.
3. Exclusion criteria are not described. 4. Precise computerized version of the WCST is not specified.
[WCST.14] Fristoe, Salthouse, and Woodard, 1997 (WCST 128-Card Version) (Table A25.14)
The authors examined the processes mediating age-related differences in WCST performance. A total of 97 individuals participated. Participants were divided into younger and older groups. The younger group consisted of 48 (25% male) participants aged 18-3 years, with an average of 26.7 (5.7), and 13~ (1.3) years of education. The older group [...]
Study strengths 1. Sample composition is well described in terms of age, gender, and education. 2. Relatively large sample size, with both subgroups approaching 50. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Data are stratified into two age groupings.
Considerations regarding use of the study 1. Recruitment procedures are not reported. 2. Exclusion criteria are not clearly described.
[WCST.15] Artiola i Fortuny, Heaton, and Hermosillo, 1998 (WCST 128-Card Version) (Table A25.15)
The authors examined differences in performance on various neuropsychological tests, including the WCST, in Spanish-speaking individuals from the USA-Mexico border and those from Spain. The study included a total of 390 participants aged 15-76 years, with 0-20 years of education. Of these, 185 (47 male, 138 female) were from the USA-Mexico border, were an average of 42.2 (13.5) years of age, and had an average of 9.6 (6.1) years of education; 205 (91 male, 114 female) were from Madrid, Spain, with an average age of 36.3 (16.1) years and an average education of 12.7 (4.4) years. The USA-Mexico participants were Mexicans who lived within Mexico or in close proximity to the USA-Mexico border and Mexican Americans whose years of residency in the United States and years of education in Mexico varied. Exclusion criteria for all participants were history of neurological illness, use of psychoactive medication, chronic medical conditions (e.g., diabetes, hypertension), complaints of current cognitive or emotional problems, history of substance use, and learning problems or disability. Also, participants had to declare Spanish as their primary language and to demonstrate native fluency in Spanish. Both the standard manual (Heaton et al., 1993) and the computerized WCST were administered. Data for Categories Completed and Perseverative Responses are reported.
Study strengths 1. Sample composition is well described in terms of age, gender, education, and ethnicity. 2. Large sample size. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Data are stratified by USA-Mexico and Spain samples.
Considerations regarding use of the study 1. Data are not stratified by age or education groups.
2. Scores for only two WCST measures (Categories and Perseverative Responses) are reported.
[WCST.16] Boone, 1998 (WCST 128-Card Version) (Table A25.16)
The effects of various demographic and health-risk factors, such as age, education, IQ, and vascular status, were examined in a group of middle-aged and older individuals. Participants were 155 (53 male, 102 female) healthy adults aged 45-84, with a mean of 63.07 (9.29), who had an average of 14.57 (2.55) years of education and an average IQ of 115.41 (14.11). Exclusion criteria were a history of psychotic or major affective disorder, current or past history of substance abuse, documented neurological illness, or significant medical illness that could affect the central nervous system (CNS; e.g., diabetes). Individuals were also excluded based on abnormal neurological examination or significant metabolic abnormalities detected in blood tests. Individuals who reported or showed medical evidence of current or past hypertension, arrhythmia, large white-matter hyperintensity on MRI (>10 cm²), coronary artery bypass graft, angina, or old myocardial infarction were classified as having vascular illness. Fifty-one participants were classified as having vascular disease, and 104 participants were classified as being "healthy." Given that age, IQ, and vascular status significantly predicted most WCST measures, the data were stratified first by vascular status (vascular and healthy) and two age groups (<65, ≥65) and then by vascular status (vascular and healthy) and three IQ groups (average, high average, and ≥superior). Standard procedures based on the Heaton (1981) manual were used, and data were computer-scored. The results indicated that vascular status, age, IQ, education, and gender were significant predictors of WCST performance.
Study strengths 1. Sample composition is well described in terms of age, gender, education, IQ, and recruitment procedures. 2. Adequate exclusion criteria.
3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Data are stratified into two vascular status by two age groups and into two vascular status by three IQ groups.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells are relatively small. 2. Educational levels are relatively high.
[WCST.17] Mejia, Pineda, Alvarez, and Ardila, 1998 (WCST 128-Card Version) (Table A25.17)
The authors examined the effects of age, gender, and education on a variety of memory and executive function tests, including the WCST. Participants were 60 (21 male, 39 female) healthy Colombian adults aged 55-85 years, with an average age of 69.66 years. Educational level ranged from 2 to 11 years. All participants were native Spanish speakers and were recruited from Medellin, Colombia. Exclusion criteria were a history of psychiatric illness, neurological disorder, or psychotropic medication at the time of testing. The data are stratified first by two age groupings (55-70, 71-85 years) and then by two education groupings (2-5, 6-11 years). Standard procedures based on the Heaton (1981) manual were used. These authors did not find significant differences on the WCST between their samples aged 55-70 and 71-85 years. Likewise, they did not find differences between those with 2-5 years and those with 6-11 years of education. They did, however, observe that those who attended rural schools committed fewer perseverative errors than those who attended urban schools.
Study strengths 1. Sample composition is well described in terms of age, gender, ethnicity, education, and geographic location. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
5. Sample is stratified into two age and two education groups.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells are relatively small. 2. Recruitment procedures are not reported. 3. Data were obtained on subjects from Colombia, which may limit their usefulness for clinical interpretation in the United States.
[WCST.18] Basso, Bornstein, and Lang, 1999 (WCST 128-Card Version) (Table A25.18)
This study examined practice effects for most common tests of executive functioning, including the WCST. Participants were 82 healthy males recruited via community newspaper advertisements. The authors note that no females were recruited due to "logistic" limitations. Of the 82 participants, 50 were retested in 1 year. Those participants were an average of 32.50 (9.27) years of age and had an average of 14.98 (1.93) years of education. There were 48 Caucasians, one African American, and one Hispanic. Exclusion criteria were psychiatric disorder, neurological disease, head injury, learning disability, or other medical illness. Standard procedures based on the Heaton et al. (1993) manual were used. Among other findings, the authors report significant practice effects for all WCST measures with the exception of Categories Completed and Failure to Maintain Set for testing probes separated by 12 months.
Study strengths 1. Sample composition is well described in terms of age, gender, education, and recruitment procedures. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age and education groups. 2. Educational levels are relatively high.
[WCST.19] Gooding, Kwapil, and Tallent, 1999 (WCST 128-Card Version) (Table A25.19)
The WCST performance of college students with schizotypal traits and normal controls was examined. Control participants were 104 (43 male, 61 female) college students from the University of Wisconsin, Madison. Individuals with a history of psychotic illness and/or psychoactive substance use disorder, family history of psychotic disorder, learning disability, epilepsy, TBI, or other medical illnesses were excluded from the study. Subjects were an average of 18.72 (0.86) years of age, with an average prorated WAIS-R IQ (using Vocabulary and Block Design subtests) of 116.26 (12.56). A computerized version of the WCST (Harris, 1988), based on standard procedures described in the Heaton et al. (1993) manual, was used. Among other findings, the results revealed that college students with schizotypal traits achieve fewer categorical sorts, commit more perseverative errors, and have more set failures compared to controls.
Study strengths 1. Sample composition is well described in terms of age, gender, IQ, education, and geographic location. 2. Relatively large sample size. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age and education groups, although it can be assumed that it is a homogeneous sample with narrow ranges on these variables. 2. Recruitment procedures and educational levels are not reported. 3. Relatively high IQ.
[WCST.20] Merriam, Thase, Haas, Keshavan, and Sweeney, 1999 (WCST 128-Card Version) (Table A25.20)
The WCST performance of a group of patients with major depression was compared to
those with schizophrenia and normal controls. Control participants were 61 healthy individuals aged 18-50, with an average age of 26.08 (7.67) years, an average of 14.66 (2.39) years of education, and an average IQ of 103.90 (9.22). Exclusion criteria were a history of electroconvulsive therapy, neurological illness, head injury, or substance dependence within 6 months of testing. A computerized version of the WCST was administered following the Heaton et al. (1993) standard procedures; however, there is no mention of the specific computerized version used. Patients with depression performed more poorly than controls, but not more poorly than patients with schizophrenia, on virtually all WCST measures, and their scores were related to the severity of their illness.
Study strengths 1. Sample composition is well described in terms of age, education, and IQ. 2. Relatively large sample size. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age and education groups. 2. Educational levels are relatively high. 3. Gender and recruitment procedures were not reported. 4. The exact computerized version of the WCST used is not specified.
[WCST.21] Rey, Feldman, Rivas-Vazquez, Levin, and Benton, 1999 (WCST 128-Card Version) (Table A25.21)
The authors report normative data for a number of neuropsychological tests, including the WCST, for Hispanics. Participants included 75 (56 male, 19 female) healthy individuals, with an average age of 33.45 (19.75) and an average of 14.53 (3.25) years of education. Participants reflected the following nationalities: 53 were from Cuba, 3 were from Peru, 1 was from Venezuela, 6 were from Puerto Rico, 1 was from Panama, 6 were from Colombia, 1 was from Honduras, 8 were from Nicaragua, and 19 "other." Data were collected in Dade County, Florida. All participants were primarily Spanish-speaking, and the test instructions for the neuropsychological instruments were "adapted or translated from the original versions." The data are stratified into two education groupings (12-15, >15 years). Standard procedures based on the Heaton (1981) manual were used. The WCST scores of their sample were comparable to those of the age- and education-matched sample reported by Heaton (1981).
Study strengths 1. Sample composition is well described in terms of age, education, gender, ethnicity, language, and geographic location. 2. Test administration procedures are specified. 3. Means and SDs for the test scores are reported. 4. Data are stratified into two education groupings.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells are relatively small. 2. Exclusion criteria are not described. 3. Recruitment procedures not reported. 4. Educational levels for the overall sample are relatively high.
[WCST.22] Snitz, Curtis, Zald, Katsanis, and Iacono, 1999 (WCST 128-Card Version) (Table A25.22)
The study examined the relationship of spatial working memory to neuropsychological functioning and oculomotor activity in patients with schizophrenia and healthy individuals. Control participants were 54 (19 male, 35 female) individuals who lived in the Minneapolis, Minnesota, community. Control participants were recruited via flyers placed in various clinics (e.g., general medical clinics, dental clinics, dermatology clinics) in the same hospital from which the schizophrenic patients were recruited. Most of the sample were patients with nonneurological conditions, and a smaller number were employees
of the hospital. Participants were also recruited from vocational/technical schools and medical clinics in a university hospital. Exclusion criteria were history of substance use disorder, diagnosis of major affective or psychotic disorder, history of neurological disorder or any other medical illness affecting CNS functioning, history of head injury, or mental retardation. Participants averaged 36.0 (13.4) years of age, 15.0 (1.7) years of education, and IQ of 109.7 (13.4). The authors used the computerized version of the WCST developed by Rezai (1988, personal communication) but do not mention whether this version follows the standard testing procedures. Among other findings, these authors report that the working memory impairment in patients with schizophrenia was related to fewer categories completed.
Study strengths 1. Sample composition is well described in terms of age, education, gender, IQ, recruitment procedures, and geographic location. 2. Relatively large sample size. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are not stratified by age or education groupings. 2. Educational levels are relatively high. 3. Test administration procedures are not specified.
[WCST.23] Tallent and Gooding, 1999 (WCST 128-Card Version) (Table A25.23)
The authors examined the relationship between working memory and WCST performance in individuals with schizotypal traits. This was a replication and extension of earlier work. Control participants were 63 (22 male, 41 female) undergraduate students at the University of Wisconsin, Madison, who were an average age of 19.11 (1.03) and had an average prorated WAIS-R IQ (using Vocabulary and Block Design subtests) of 114.63 (11.93). Participants whose primary language was not English, those with a history of
"serious" head injury, learning disability, epilepsy, history of DSM Axis I disorder, psychosis in the family, or medical condition that would interfere with completing the tasks (e.g., color blindness) were excluded from the study. A computerized version of the WCST (Harris, 1988), based on standard procedures described in the Heaton et al. (1993) manual, was used. Individuals with schizotypal traits performed more poorly than controls on Categories Completed and Failure to Maintain Set. Additionally, a negative relationship between working memory performance and number of perseverative errors (r = -0.17) and Trials to Complete First Category (r = -0.15) was noted.
Study strengths 1. Sample composition is well described in terms of age, gender, IQ, current education status, and geographic location. 2. Relatively large sample size. 3. Adequate exclusion criteria. 4. Test administration procedures are specified. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are not stratified by age and education groupings. 2. Recruitment procedures are not reported. 3. Relatively high IQ.
[WCST.24] Compton, Bachman, Brand, and Avet, 2000 (WCST 128-Card Version) (Table A25.24)
The authors examined the relationship between age and WCST performance in a group of highly educated professionals. Participants were 102 (53 male, 49 female) healthy adults >30 years of age, with an average of 19.5 years of education (SD for education is not reported). Of these, 87 were current college professors and 17 were in professions which required a high level of cognitive skill (e.g., a significant amount of reading and writing). All participants were recruited from Atlanta, Georgia. English was their primary language, and 94 were Caucasian. The data were stratified into four
age groupings: 30-39 (mean = 34.9, SD = 3.71; mean education = 18.95), 40-49 (mean = 46.19, SD = 2.80; mean education = 19.18), 50-59 (mean = 54.03, SD = 3.20; mean education = 19.92), and 60 years and older (mean = 65.49, SD = 5.72; mean education = 19.47). A computerized version of the WCST (Loong, personal communication, 1990), based on standard procedures described in the Heaton (1981) manual, was used. The results revealed a significant relationship between age and most WCST measures, including response time (r = 0.49), Categories Completed (r = -0.36), total trials (r = 0.32), total correct responses (r = -0.36), and Percent Conceptual Level Responses (r = -0.53). However, no relationship between education and WCST scores was reported, which was likely due to the restricted educational range.
Study strengths 1. Sample composition is well described in terms of age, education, gender, recruitment procedures, and geographic location. 2. Relatively large sample size, although individual cells are small. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Data are stratified into four age groupings.
Considerations regarding use of the study 1. Data are not stratified by education, sample reflects a narrow educational range, and educational level is high. 2. Exclusion criteria are not described.
[WCST.25] Ismail, Cantor-Graae, and McNeil, 2000 (WCST 128-Card Version) (Table A25.25)
The WCST performance of patients with schizophrenia, their unaffected siblings, and normal controls was compared. Control participants were 75 (59 male, 16 female) healthy adults aged 20-54, with an average age of 35.9 years (no SD). Participants were excluded if they had a personal history of psychosis, affective disorder, schizophrenia-related personality disorder, head injury, neurological disorder, or substance abuse or a family history of psychosis. The educational level of the controls was reported to be comparable to that of the patient group, who had an average of 13.2 (3.8) years of education; however, the exact education of the controls was not reported. Standard procedures based on the Heaton (1981) manual were used. Patients with schizophrenia committed more perseverative errors and achieved fewer categorical sorts than their unaffected siblings and controls, but no differences were found between controls and unaffected siblings.
Study strengths 1. Sample composition is well described in terms of age and gender. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age or education groups. 2. Actual educational level is not reported. 3. Recruitment procedures are not reported.
[WCST.26] Laiacona, Inzaghi, De Tanti, and Capitani, 2000 (WCST 128-Card Version) (Table A25.26)
The authors developed a new "global efficiency" score for the WCST and normative data for this new measure as well as other WCST scores for a sample of Italian participants. Participants were adults (100 male, 105 female) aged 15-85 years, with an average age of 46.5, and ranging in education from 5 to 17 years, with an average of 11.4 years. Nineteen were admitted to the Valduce Hospital located in Costa Masnaga, Italy. The illnesses for which they were admitted to the hospital were not related to neurological or psychiatric disorders, and the average hospital stay was 20 (1.4) days. The remaining 186 participants were healthy individuals recruited from the same geographic area, and some were
relatives of patients attending the hospital. All participants were reported to be free from medical illnesses that affect cognitive performance (e.g., substance use). However, the authors note that they did not want a "hypernormal" subject; thus, selection criteria were "not too selective." They admit that their sample may have included participants with mild hypertension and diabetes who were receiving medication. The data were partitioned by gender, six age groups (15-29, 30-39, 40-49, 50-59, 60-69, 70-85 years), and four education groups (5-6, 8-12, 13-16, 17-24 years). However, no data are reported for age groups 15-29 and 30-39 with 5-6 years of education. Standard procedures based on the Heaton (1981) manual were used. A global score was created by multiplying the number of categories completed by 10 and subtracting this value from the total number of trials administered:
$$\text{Global score} = n_{\text{trials administered}} - \left(n_{\text{categories completed}} \times 10\right)$$
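As an illustration only, the global score arithmetic described above can be sketched in Python. The function name `wcst_global_score` and the example values are hypothetical, and the range checks are assumptions based on the standard 128-card administration (at most 128 trials, at most 6 categories, 10 consecutive correct sorts per category); they are not part of the authors' published scoring procedure.

```python
def wcst_global_score(trials_administered: int, categories_completed: int) -> int:
    """Return trials administered minus 10 points per category completed.

    Hypothetical helper: only the arithmetic comes from the text above.
    """
    # Assumed limits for the standard 128-card WCST (not stated by the authors).
    if not 0 <= trials_administered <= 128:
        raise ValueError("trials_administered must be between 0 and 128")
    if not 0 <= categories_completed <= 6:
        raise ValueError("categories_completed must be between 0 and 6")
    # Assumption: each category requires 10 consecutive correct sorts,
    # so at least 10 trials are needed per completed category.
    if trials_administered < 10 * categories_completed:
        raise ValueError("too few trials for the number of categories completed")
    return trials_administered - 10 * categories_completed


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(wcst_global_score(128, 0))  # all cards used, no category completed
    print(wcst_global_score(100, 4))  # intermediate performance
    print(wcst_global_score(60, 6))   # six categories in the minimum 60 trials
```

Under this formula, a lower value corresponds to more efficient performance, since fewer trials are spent beyond those needed to complete each category.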
Study strengths 1. Sample composition is well described in terms of age, gender, education, ethnicity/language, recruitment procedures, and geographic location. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Sample is stratified by gender, six age groups, and four education groups.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells are relatively small. 2. Data were obtained on subjects from Italy, which may limit their usefulness for clinical interpretation in the United States.
[WCST.27] Rossi, Arduini, Daneluzzo, Bustini, Prosperini, and Stratta, 2000 (WCST 128-Card Version) (Table A25.27)
This study examined neuropsychological functioning in patients with schizophrenia or bipolar disorder and healthy controls. Controls were 64 (30 male, 34 female) healthy individuals who were an average of 26.4 (5.44) years of age and had 14.69 (2.99) years of education. Participants were employees and relatives or acquaintances of staff in the clinical and administrative areas of the S. Salvatore Hospital in L'Aquila, Italy. Exclusion criteria were a personal history of head injury, substance abuse, "serious" neurological or physical disease, or psychiatric disorder or a family history of psychosis or personality disorder. A computerized version of the WCST (Schneider, 1989; personal communication), based on standard procedures described in the Heaton et al. (1993) manual, was used. The only exception to the standard scoring procedure was "that the first ambiguous error repeating the previously correct principle was not scored as a perseveration." Among other findings, this study reports that a discriminant function analysis was able to correctly classify 85.9% of controls but only 48.5% of schizophrenic patients and 40% of bipolar patients.
Study strengths 1. Sample composition is well described in terms of age, education, gender, recruitment procedures, ethnicity/language, and geographic location. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not stratified by age and education groups. 2. Data were obtained on subjects from Italy, which may limit their usefulness for clinical interpretation in the United States.
[WCST.28] Razani, Boone, Miller, Lee, and Sherman, 2001 (WCST 128-Card Version) (Table A25.28)
This study examined cognitive function in patients with frontotemporal dementia or
Alzheimer's disease. Control participants were 104 (33 male, 71 female) healthy older adults with an average of 60.36 (9.64) years of age, an average of 14.82 (3.31) years of education, and an average WAIS-R FSIQ of 116.81 (14.06). Data for control participants were selected from a larger pool of archival data. Exclusion criteria were a history of head injury, major affective or psychotic disorder, seizures, or substance abuse within the past 5 years. Standard procedures based on the Heaton et al. (1993) manual were used.
Study strengths 1. Sample composition is well described in terms of age, gender, education, and IQ. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. Considerations regarding use of the study 1. Recruitment procedures are not reported. 2. Educational and IQ levels are high. 3. Data are not stratified by age and education.
[WCST.29] Salthouse, Atkinson, and Berish, 2003 (WCST 128-Card Version) (Table A25.29)
The authors examined age-related issues and executive functioning in a group of 261 (35% male, 65% female) healthy adults. Participants were aged 18-84 years and had an average of approximately 16 years of education. They were recruited via newspaper advertisements and flyers to participate in a battery of neuropsychological tests requiring three sessions of approximately 2 hours' duration. No specific exclusion criteria are listed, but the authors mention that six participants were excluded from the analysis for not completing the battery, for difficulty in understanding instructions, and/or for obtaining WAIS-III Vocabulary scaled scores of <4. The data were stratified into three age groupings: 18-39 (mean age = 27.7, SD = 6.4; mean education = 15.5, SD = 3.3; mean Vocabulary scaled score = 12.0, SD = 3.5), 40-59 (mean age = 49.0, SD = 5.0; mean education = 16.0, SD = 2.4; mean Vocabulary scaled score = 12.2, SD = 2.7), and 60-84 (mean age = 70.3, SD = 6.2; mean education = 16.4, SD = 2.9; mean Vocabulary scaled score = 12.8, SD = 2.4). A computerized version of the WCST (Woodard, 1994), based on standard procedures described in the Heaton (1981) manual, was used.
Study strengths 1. Sample composition is well described in terms of age, education, gender, and recruitment procedures. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Means and SDs for the test scores are reported. 5. Data are stratified into three groupings. Considerations regarding use of the study 1. Exclusion criteria are not clearly described. 2. Educational levels are high.
WCST 64-Card Administration Version
WCST-64 Manual
[WCST.30] Kongs, Thompson, Iverson, and Heaton, 2000 (WCST 64-Card Version)
The Wisconsin Card Sorting Test-64 Card Version professional manual is designed to provide normative data on the shortened, 64-card version of the WCST. This manual essentially reanalyzes the first 64 responses (from the first deck of cards) in the sample reported in the WCST revised manual (Heaton et al., 1993). Data were not available for two participants; thus, this sample consisted of 897 children, adolescents, and adults. For a description of the various samples, recruitment procedures, and demographic information, see WCST.1, above. The manual provides regression-based raw to T-score conversions for Total Errors, Percent Errors, Perseverative Responses, Percent Perseverative Responses, Perseverative Errors, Percent Perseverative Errors, Nonperseverative Errors, Percent Nonperseverative Errors, and Percent Conceptual Level Responses. Additionally, raw score to percentile conversions are provided for Categories Completed, Trials to Complete First Category, Failure to Maintain Set, and Learning to Learn. The reader is referred to the manual, which stratifies the data for these WCST outcome measures based on gender, 14 child and adolescent age groups (ages ~19 years), eight adult age groups (ages 20-79 years), and six education groups (≤8, 9-11, 12, 13-15, 16-17, ≥18 years). Standard test administration and scoring criteria are well described.
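As a reading aid, the generic arithmetic behind a T score can be sketched in a few lines of Python. This is only an illustration of rescaling a raw score against a normative mean and SD to the T metric (mean 50, SD 10); it does not reproduce the regression-based conversions in the manual, which must be taken from the manual's own tables. The function names, the `higher_is_better` flag, and the numeric values are hypothetical and are not drawn from any normative table.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a normative mean and SD."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return (raw - norm_mean) / norm_sd


def t_score(raw, norm_mean, norm_sd, higher_is_better=True):
    """Rescale a z score to the T metric (mean 50, SD 10).

    For error scores, where a higher raw value means worse performance,
    pass higher_is_better=False so that a higher T still means better
    performance (an assumed convention for this sketch).
    """
    z = z_score(raw, norm_mean, norm_sd)
    if not higher_is_better:
        z = -z
    return 50.0 + 10.0 * z


# Hypothetical values for illustration only, NOT taken from any norms.
example = t_score(raw=20, norm_mean=15.0, norm_sd=5.0, higher_is_better=False)
print(example)
```

With these made-up inputs, an error score one SD above the normative mean maps to a T score one SD below 50. When a study reports stratified cells, the mean and SD should come from the cell that matches the patient's age and education.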
Study strengths 1. Sample composition is well described in terms of age, gender, and education. 2. Adequate exclusion criteria. 3. Test administration procedures are specified. 4. Means and SDs for the entire sample and T scores for the groups stratified by age and education are presented. 5. Sample is stratified into numerous age-by-education groupings.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells for certain outcome measures are relatively small. 2. Recruitment procedures were not well described for some subsamples. 3. Exclusion criteria are not specified for some subsamples.
Other comments 1. The interested reader is referred to Fastenau and Adams's (1996) critique of Heaton et al.'s normative approach and to Heaton et al.'s (1996) response to this critique.
Normative Studies and Control Groups in Clinical Comparison Studies for the WCST-64
[WCST.31] Axelrod, Jiron, and Henry, 1993 (WCST 64-Card Version) (Table A25.30)
The authors examined performance of healthy adults on the 64-card version of the WCST. Participants were 140 (55 male, 85 female) adults aged 20-90. The data were partitioned into seven age groups by decade of life (i.e., 20s, 30s, 40s, 50s, 60s, 70s, and 80s). Average educational level ranged from 14.4 (3.0) to 15.6 (1.2). Subjects aged 20-49 were either undergraduate students at Wayne State University or recruited via newspaper advertisements from the Detroit community. Participants aged 50-89 were part of a previously published study (Axelrod & Henry, 1992). For these older participants, Axelrod and Henry (1992) list the following exclusion criteria: history of psychiatric hospitalization or use of psychotropic medication, substance abuse, neurological disorder, head injury resulting in >5 minutes of loss of consciousness, significant illness such as diabetes or COPD requiring long-term medical treatment, or MMSE scores 2:::24. No exclusion criteria are listed for participants younger than 50 years. Standard procedures of the 128-card WCST based on the Heaton (1981) manual were used, and participants in the youngest three decade groups sorted all 128 cards even after six categories were completed. For all participants, only data from the first 64 cards were analyzed. Results of trend analyses revealed a significant age-related decline for Categories Completed and an increase for Total Errors, Perseverative Errors, and Perseverative Responses for the 64-card version of the WCST.
Study strengths 1. Sample composition is well described in terms of age, gender, education, and recruitment strategies. 2. Test administration procedures are specified. 3. Means and SDs for the entire sample are reported. 4. Sample is stratified into seven age groupings.
Considerations regarding use of the study 1. Overall sample size is adequate, but individual cells for certain measures are relatively small. 2. Recruitment procedures and exclusion criteria were not well described for some subsamples. 3. Educational levels are high; data are not stratified by education.
[WCST.32] Paolo, Axelrod, Troester, Blackwell, and Koller, 1996b (64-Card Version) (Table A25.31)
The authors examined performance of patients with Parkinson's disease or Alzheimer's disease and normal controls on the 64-card version of the WCST. Control participants were 35 (22 male, 13 female) older adults, with an average age of 71.34 (5.73), an average education of 13.11 (2.03) years, and an average DRS score of 137.37 (3.36). They were part of a longitudinal study of neurodegenerative disease. Participants were recruited via advertisements, and all were interviewed, completed a health questionnaire, and received neurological examinations. Exclusion criteria were history of stroke, psychiatric disorder, "significant" head trauma, substance abuse, or neurological disorders. Standard procedures based on the Heaton et al. (1993) manual were used, but for the purposes of this study only the responses on the first 64 cards were scored and analyzed. The Heaton (1993) computer scoring software was used.
Study strengths 1. Sample composition is well described in terms of age, education, gender, and recruitment procedures. 2. Test administration procedures are specified. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Small sample size. 2. Sample is not stratified by age and education groupings.
[WCST.33] Lopez-Carlos, Salazar, Villasenor, Saucedo, and Peña, 2003 (64-Card Version) (Tables A25.32, A25.33)
The authors investigated the effects of demographic variables on cognitive abilities in Spanish-speaking individuals with low education. The WCST-64 was administered to 59 monolingual, Spanish-speaking Latino men with ≤10 years of formal education in the Los Angeles, California, community. Participants were an average of 28.89 (8.37) years old and had an average of 5.82 (2.49) years of education. Exclusion criteria consisted of any self-report of head injury, neurological insults, prenatal or birth complications, learning disabilities, psychiatric problems, or substance abuse. Standard administration procedures were used. Participants were tested in Spanish. Selected subtests from the WAIS-III (Mexican version) were included in the battery. Mean performance on the Marin and Marin (1991) acculturation scale for this sample was 17.61 (6.19). For the Los Angeles group, Picture Vocabulary subscale scores from the Woodcock-Johnson-III Tests of Achievement (M = 5.36, SD = 6.01) and the Batería Woodcock-Muñoz-R, Pruebas de habilidad cognitiva-R (M = 29.77, SD = 5.37) were used to assess level of English and Spanish word expressive abilities. The results are presented by education groupings (0-6, 7-10) and by age-and-education groupings (18-29 years old, 0-6 or 7-10 years of education; 30-49 years old, 0-6 or 7-10 years of education). The authors found a significant difference in performance on the WCST (number correct and Categories Completed) between the two education groups. However, the two age groups did not differ significantly on any of the sections of the WCST.
Study strengths 1. Sample composition is well described in terms of age, education, gender, and geographic area. 2. Data availability for a healthy, employable, monolingual Spanish-speaking group with low educational level. 3. Sample is stratified into two education groups and two age-by-education groups. 4. Adequate exclusion criteria. 5. Means and SDs are reported.
Considerations regarding use of the study 1. All-male sample. 2. Small sample sizes.
MCST Administration Version: Normative Studies and Control Groups in Clinical Comparison Studies [WCST.34] Bondi, Monsch, Butters, Salmon, and Paulsen, 1993 (MCST Version) (Table A25.34)
This study examined the utility of the MCST in differentiating patients at various stages of Alzheimer's disease and normal controls. Control participants were 75 (27 male, 48 female) older adults, with an average age of 71.1 (7.6) years, an average of 13.7 (2.6) years of education, and an average MMSE score of 28.9 (1.2). Precise educational levels are not reported. Participants were spouses of patients or recruited via newspaper advertisements from the San Diego, California, community. Exclusion criteria were a history of substance abuse, learning disability, or "serious" neurological or psychiatric disorders. Standard MCST administration and scoring procedures (Nelson, 1976) were used. ROC analyses showed Categories Completed and Perseverative Errors to be more sensitive than Nonperseverative Errors at discriminating Alzheimer's patients from controls.
Study strengths 1. Sample composition is well described in terms of age, education, gender, recruitment procedures, and geographic location. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Data are not partitioned by age and education groups.
[WCST.35] Van den Broek, Bradshaw, and Szabadi, 1993 (MCST Version) (Table A25.35)
The sensitivity and specificity of the MCST were assessed in a group of patients with brain lesions and normal controls. The control sample consisted of 77 (19 male, 58 female) Caucasian participants from the United Kingdom who were an average of 35.2 (12.8) years old and had an average WAIS IQ of 109.5 (13.2). Secretarial and support staff of a local hospital and university were recruited. The exclusion criterion was a history of neurological or psychiatric disorder. Standard MCST administration and scoring procedures (Nelson, 1976) were used. Among other findings, the authors report that this version of the WCST did not discriminate well between frontal and non-frontal lobe lesion patients but that it did discriminate between controls and patients (regardless of the site of lesion).
Study strengths 1. Sample composition is well described in terms of age, gender, IQ, and recruitment procedures. 2. Relatively large sample size. 3. Test administration procedures are specified. 4. Adequate exclusion criteria. 5. Means and SDs for the test scores are reported.
Considerations regarding use of the study 1. Sample is not partitioned by age and education groups. 2. No information on education is reported. 3. Data were obtained on subjects from the United Kingdom, which may limit their usefulness for clinical interpretation in the United States.
[WCST.36] Isingrini and Vazou, 1997 (MCST Version) (Table A25.36)
Performance on tests of frontal lobe function and intelligence was assessed in a group of healthy adults. Participants were 107 (52 male, 55 female) adults. The study was conducted in France, and the data were divided into two age groups: 25-46 (mean age = 35.5, SD = 7.58; mean education = 12.5, SD = 3.03) and 70-99 (mean age = 80.59, SD = 8.58; mean education = 8.54, SD = 1.18). Older participants who resided independently or in senior-citizen residential homes were recruited. There is no mention of the recruitment procedures for the younger group. Participants reported good health and were not on medications that affect cognitive functioning. However, no other exclusion criteria were reported. Standard MCST administration and scoring procedures (Nelson, 1976) were used. The study found that MCST scores best correlated with measures of fluid intelligence.
Study strengths 1. Sample composition is well described in terms of age, education, and gender. 2. Test administration procedures are specified. 3. Means and SDs for the test scores are reported. 4. Sample is stratified into two age groupings. Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small. 2. Recruitment procedures not reported. 3. Data were obtained on French subjects, which may limit their usefulness for clinical interpretation in the United States. 4. Low educational level for the older subjects; data not stratified by education.
[WCST.37] Lineweaver, Bondi, Thomas, and Salmon, 1999 (MCST Version) (Table A25.37)
The authors conducted a normative study on the MCST. Participants were 229 (97 male, 132 female) healthy, community-dwelling, older adults aged 45-91 years, with an average age of 69.06 (8.58) years and an average of 13.60 (4.57) years of education. Seventy-eight percent were white, 21% were Mexican American or Spanish American, 1% were African American, and the remaining 1% were Cuban American. Participants were part of a longitudinal study conducted by the University of California, San Diego Alzheimer's Disease Research Center. Exclusion criteria were a history of substance abuse, psychiatric illness, or neurological disorder. The data were partitioned into four age groups (45-59, 60-69, 70-79, 80-91 years) and four education groups (1-6, 7-12, 13-16, 17-20 years). Additionally, the authors provide demographically corrected norms and raw score to scaled score conversions, which have not been reproduced in this chapter. Standard MCST administration and scoring procedures (Nelson, 1976) were used. The authors found significant effects of age and education, but not gender, on the MCST. Additionally, for testing probes separated by 1 year, significant increases for Nonperseverative Errors were found. Practice effects for Categories Completed and Perseverative Errors were not observed.
Study strengths 1. Sample composition is well described in terms of age, education, gender, ethnicity, and recruitment procedures. 2. Test administration procedures are specified. 3. Adequate exclusion criteria. 4. Means and SDs for the test scores are reported. 5. Sample is stratified into four age and four education groupings.
Considerations regarding use of the study 1. Overall sample is adequate, but individual cells are relatively small.
CONCLUSIONS
A tremendous number of clinical studies, including those on patients with brain lesions and neurological and psychiatric patients, especially patients with schizophrenia and head injury, as well as neuroimaging studies, have established the WCST as a useful clinical tool for detecting executive or frontal system dysfunction. Up to three factor structures have been identified that best explain all the WCST outcome measures, but when used individually, Perseverative Errors, Perseverative Responses, and Categories Completed appear to be the most useful clinical measures (Heaton, 1981; Heaton et al., 2004; Milner, 1963). Practice effects have been noted on most WCST and MCST measures in healthy adults but need to be further investigated in clinical populations. A review of the literature indicates a strong effect of age, a moderate effect of education and intelligence, and equivocal results for the effect of gender on the various versions of the WCST. Age-related decline in WCST performance has been consistently documented, with virtually no changes between the ages of 20 and 60 years but a relatively steep decline during the sixth, seventh, and eighth decades of life. Better performance on the WCST appears to be related to higher educational level, particularly when educational level is >15 years. While relatively few studies have explicitly examined the relationship between intelligence and WCST performance, most have found that higher intellectual functioning leads to better WCST performance. There are mixed reports for the effects of gender, with most studies reporting equivalent performance between males and females.
Very few studies have examined the relationship between ethnicity and WCST scores. Of the few cross-cultural studies available, primarily on Hispanic samples, there appear not to be large differences between these samples and the normative samples developed on North American populations. It is clear that additional research is needed to better understand the effects of factors such as culture, ethnicity, and multilingualism on the WCST. Additionally, more normative information is needed for individuals with low educational levels. Very few of the existing WCST studies included individuals with <12 years of education.²
² Meta-analyses were not performed on the WCST data as this chapter was not intended to summarize all of the voluminous literature available on this test. Conversely, comprehensive sets of norms are available in the literature.
References
Abbruzzese, M., Ferri, S., & Scarone, S. (1996). Performance on the Wisconsin Card Sorting Test in schizophrenia: Perseveration in clinical subtypes. Psychiatry Research, 64(1), 27-33.
Abikoff, H., Alvir, J., Hong, G., Sukoff, R., Orazio, J., Solomon, S., et al. (1987). Logical Memory Subtest of the Wechsler Memory Scale: Age and education norms and alternate-form reliability of two scoring systems. Journal of Clinical and Experimental Neuropsychology, 9(4), 435-448.
Abraham, E., Axelrod, B. N., & Ricker, J. H. (1996). Application of the Oral Trail Making Test to a mixed clinical sample. Archives of Clinical Neuropsychology, 11(8), 697-701.
Abrahams, S., Leigh, P. N., Harvey, A., Vythelingum, G. N., Grise, D., & Goldstein, L. H. (2000). Verbal fluency and executive dysfunction in amyotrophic lateral sclerosis (ALS). Neuropsychologia, 38(6), 734-747.
Abwender, D. A., Swan, J. G., Bowerman, J. T., & Connolly, S. W. (2001). Qualitative analysis of verbal fluency output: Review and comparison of several scoring methods. Assessment, 8(3), 323-336.
Acevedo, A., Loewenstein, D. A., Barker, W. W., Harwood, D. G., Luis, C., Bravo, M., et al. (2000). Category Fluency Test: Normative data for English- and Spanish-speaking elderly. Journal of the International Neuropsychological Society, 6(7), 760-769.
Acker, M. B., & Davis, J. R. (1989). Psychology test scores associated with late outcome in head injury. Neuropsychology, 3(3), 123-133.
Adams, K. M., Gilman, S., Koeppe, R. A., Kluin, K. J., Brunberg, J. A., Dede, D., et al. (1993). Neuropsychological deficits are correlated with frontal hypometabolism in positron emission tomography studies of older alcoholic patients. Alcoholism: Clinical and Experimental Research, 17(2), 205-210.
Adams, K. M., Gilman, S., Koeppe, R., Kluin, K., Junek, L., Lohman, M., et al. (1995). Correlation of neuropsychological function with cerebral metabolic rate in subdivisions of frontal lobes of older alcoholic patients measured with [18F]fluorodeoxyglucose and positron emission tomography. Neuropsychology, 9(3), 275-280.
Adams, R. L., & Trenton, S. L. (1981). Development of a paper-and-pen form of the Halstead Category Test. Journal of Consulting and Clinical Psychology, 49(2), 298-299.
Akshoomoff, N. A., & Stiles, J. (1995a). Developmental trends in visuospatial analysis and planning: I. Copying a complex figure. Neuropsychology, 9(3), 364-377.
Akshoomoff, N. A., & Stiles, J. (1995b). Developmental trends in visuospatial analysis and planning: II. Memory for a complex figure. Neuropsychology, 9(3), 378-389.
Akshoomoff, N. A., & Stiles, J. (2003). Children's performance on the ROCF and the development of spatial analysis. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications (pp. 393-409). Lutz, FL: Psychological Assessment Resources.
Albert, M. S., Heller, H. S., & Milberg, W. (1988). Changes in naming ability with age. Psychology and Aging, 3(2), 173-178.
Alder, A. G., Adam, J., & Arenberg, D. (1990). Individual-differences assessment of the relationship between change in and initial level of adult cognitive functioning. Psychology and Aging, 5(4), 560-568.
Alekoumbides, A., Charter, R. A., Adkins, T. G., & Seacat, F. (1987). The diagnosis of brain damage by the WAIS, WMS, and Reitan battery utilizing standardized scores corrected for age and education. International Journal of Clinical Neuropsychology, 9(1), 11-28.
Alevriadou, A., Katsarou, Z., Bostantjopoulou, S., Kiosseoglou, G., & Mentenopoulos, G. (1999). Wisconsin Card Sorting Test variables in relation to motor symptoms in Parkinson's disease. Perceptual and Motor Skills, 89(3, Pt 1), 824-830.
Allegri, R. F., Mangone, C. A., Villavicencio, A. F., Rymberg, S., Taragano, F. E., & Baumann, D. (1997). Spanish Boston Naming Test norms. Clinical Neuropsychologist, 11(4), 416-420.
Allen, J., Blanton, P., Johnson-Greene, D., Murphy-Farmer, C., & Gross, A. (1992). Need for achievement and performance on measures of behavioral fluency. Psychological Reports, 71(2), 471-478.
Allen, J. B., Gross, A. M., Aloia, M. S., & Billingsley, C. (1996). The effects of glucose on nonmemory cognitive functioning in the elderly. Neuropsychologia, 34(5), 459-465.
American Academy of Neurology (1996). Assessment: Neuropsychological testing of adults. Considerations for neurologists. Report of the Therapeutics and Technology Assessment Subcommittee. Neurology, 47(2), 592-599.
American Psychological Association (1999). Standards for Educational and Psychological Testing.
American Psychological Association (2002). Ethical principles of psychologists and code of conduct. American Psychologist, 57(12), 1060-1073.
Amieva, H., Lafont, S., Auriacombe, S., Rainville, C., Orgogozo, J.-M., Dartigues, J.-F., et al. (1998). Analysis of error types in the Trail Making Test evidences inhibitory deficit in dementia of the Alzheimer type. Journal of Clinical and Experimental Neuropsychology, 20(2), 280-285.
Amir, T. (2001). Benton Visual Retention Test: Reliability, gender, and the effect of extended practice on the performance of participants from UAE. Bulletin of the Faculty of Arts, Cairo University, 61(2), 7-17.
Amir, T., & Bahri, T. (1994). Effect of substance abuse on visuographic function. Perceptual and Motor Skills, 78(1), 235-241.
Anastasi, A. (1988). Norms and the interpretations of test scores. In A. Anastasi (Ed.), Psychological Testing (6th ed.) (pp. 71-108). New York: Macmillan.
Andel, R., McCleary, C. A., Murdock, G. A., Fiske, A., Wilcox, R. R., & Gatz, M. (2003). Performance on the CERAD Word List Memory task: A comparison of university-based and community-based groups. International Journal of Geriatric Psychiatry, 18(8), 733-739.
Anderson, C. V., Bigler, E. D., & Blatter, D. D. (1995). Frontal lobe lesions, diffuse damage, and neuropsychological functioning in traumatic brain-injured patients. Journal of Clinical and Experimental Neuropsychology, 17(6), 900-908.
Anderson, S. W., Damasio, H., Jones, R. D., & Tranel, D. (1991). Wisconsin Card Sorting Test performance as a measure of frontal lobe damage. Journal of Clinical and Experimental Neuropsychology, 13, 909-922.
Anger, W. K., Cassitto, M. G., Liang, Y.-X., Amador, R., Hooisma, J., Chrislip, D. W., et al. (1993). Comparison of performance from three continents on the WHO-recommended neurobehavioral core test battery. Environmental Research, 62, 125-147.
Anil, A. E., Kivircik, B. B., Batur, S., Kabakci, E., Kitis, A., Güven, E., et al. (2003). The Turkish version of the Auditory Consonant Trigram Test as a measure of working memory: A normative study. Clinical Neuropsychologist, 17(2), 159-169.
Annett, M. (1970). A classification of hand preference by association analysis. British Journal of Psychology, 61, 303-321.
Anstey, K. J., & Smith, G. A. (1999). Interrelationships among biological markers of aging, health, activity, acculturation, and cognitive performance in late adulthood. Psychology and Aging, 14(4), 605-618.
Anstey, K. J., Matters, B., Brown, A. K., & Lord, S. R. (2000). Normative data on neuropsychological tests for very old adults living in retirement villages and hostels. Clinical Neuropsychologist, 14(3), 309-317.
Antes, G., & Oxman, A. D. (for the Cochrane Collaboration) (2001). The Cochrane Collaboration in the 20th century. In M. Egger, G. D. Smith, & D. G. Altman (Eds.), Systematic reviews in health care: Meta-analysis in context. London: BMJ.
Anthony, W. Z., Heaton, R. K., & Lehman, R. A. W. (1980). An attempt to cross-validate two actuarial systems for neuropsychological test interpretation. Journal of Consulting and Clinical Psychology, 48(3), 317-326.
Arbuckle, T. Y., & Gold, D. P. (1993). Aging, inhibition, and verbosity. Journals of Gerontology, 48(5), P225-P232.
Arbuthnott, K., & Frank, J. (2000). Trail Making Test, part B as a measure of executive control: Validation using a set-switching paradigm. Journal of Clinical and Experimental Neuropsychology, 22(4), 518-528.
Ardila, A., & Rosselli, M. (1989). Neuropsychological characteristics of normal aging. Developmental Neuropsychology, 5(4), 307-320.
Ardila, A., & Rosselli, M. (2003). Educational effects on ROCF performance. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources.
Ardila, A., Rodriguez-Menendez, G., & Rosselli, M. (2002). Current issues in neuropsychological assessment with Hispanic/Latinos. In F. R. Ferraro (Ed.), Minority and cross-cultural aspects of neuropsychological assessment. Studies on neuropsychology, development, and cognition. Lisse: Swets & Zeitlinger.
Ardila, A., Rosselli, M., & Rosas, P. (1989). Neuropsychological assessment in illiterates: Visuospatial and memory abilities. Brain and Cognition, 11(2), 147-166.
Arena, R., & Gainotti, G. (1978). Constructional apraxia and visuoperceptive disabilities in relation to laterality of cerebral lesions. Cortex, 14(4), 463-473.
Arenberg, D. (1978). Differences and changes with age in the Benton Visual Retention Test. Journal of Gerontology, 33(4), 534-540.
Arenberg, D. (1982). Estimates of age changes on the Benton Visual Retention Test. Journal of Gerontology, 37(1), 87-90.
Arias Bal, M. A., Vazquez-Barquero, J. L., Pena, C., Miro, J., & Berciano, J. A. (1991). Psychiatric aspects of multiple sclerosis. Acta Psychiatrica Scandinavica, 83(4), 292-296.
Arima, J. K. (1965). Performance of normal males on the Halstead Tactual Performance Test under severe environmental stress. Perceptual and Motor Skills, 21, 83-90.
Armengol, C. G. (2002). Stroop test in Spanish: Children's norms. Clinical Neuropsychologist, 16(1), 67-80.
Armstrong, C., Onishi, K., Robinson, K., D'Esposito, M., Thompson, H., Rostami, A., et al. (1996). Serial position and temporal cue effects in multiple sclerosis: Two subtypes of defective memory mechanisms. Neuropsychologia, 34(9), 853-862.
Army Individual Test Battery (1944). Manual of directions and scoring. Washington, DC: War Department, Adjutant General's Office.
Arnett, J. A., & Labovitz, S. S. (1995). Effect of physical layout in performance of the Trail Making Test. Psychological Assessment, 7(2), 220-221.
Arnett, P. A., Rao, S. M., Bernardin, L., Grafman, J., Yetkin, F. Z., & Lobeck, L. (1994). Relationship between frontal lobe lesions and WCST performance by patients with multiple sclerosis. Neurology, 44, 420-424.
Arnold, B. R., Montgomery, G. T., Castaneda, I., & Longoria, R. (1994). Acculturation and performance of Hispanics on selected Halstead-Reitan neuropsychological tests. Assessment, 1(3), 239-248.
Arria, A. M., Tarter, R. E., Kabene, M. A., Laird, S. B., Moss, H., & Van Thiel, D. M. (1991). The role of cirrhosis in memory functioning of alcoholics. Alcoholism: Clinical and Experimental Research, 15(6), 932-937.
Artiola i Fortuny, L., & Heaton, R. K. (1996). Standard versus computerized administration of the Wisconsin Card Sorting Test. Clinical Neuropsychologist, 10(4), 419-424.
Artiola i Fortuny, L., Heaton, R. K., & Hermosillo, D. (1998). Neuropsychological comparisons of Spanish-speaking participants from the U.S.-Mexico border region versus Spain. Journal of the International Neuropsychological Society, 4(4), 363-379.
Artiola i Fortuny, L., Hermosillo Romo, D. H., Heaton, R. K., & Pardee, R. E. III (1999). Manual de normas y procedimientos para la batería neuropsicológica en español. Tucson, AZ: mPress.
Ashendorf, L., O'Bryant, S. E., & McCaffrey, R. J. (2003). Specificity of malingering detection strategies in older adults using the CVLT and WCST. Clinical Neuropsychologist, 17(2), 255-262.
Au, R., Joung, P., Nicholas, M., Obler, L. K., et al. (1995). Naming ability across the adult life span. Aging and Cognition, 2(4), 300-311.
Audenaert, K., Brans, B., Van Laere, K., Lahorte, P., Versijpt, J., van Heeringen, K., Dierckx, R. (2000). Verbal fluency as a prefrontal activation probe: A validation study using 99mTc-ECD brain SPET. European Journal of Nuclear Medicine, 27(12), 1800-1808.
Aupperle, R. L., Beatty, W. W., Shelton, F., & Gontkovsky, S. T. (2002). Three screening batteries to detect cognitive impairment in multiple sclerosis. Multiple Sclerosis, 8, 382-389.
Austin, M. P., Ross, M., O'Carroll, R. E., Ebmeier, K. P., & Goodwin, G. M. (1992). Cognitive dysfunction in major depression. Journal of Affective Disorders, 25, 21-30.
Axelrod, B. N. (2002). Are normative data from the 64-card version of the WCST comparable to the full WCST? Clinical Neuropsychologist, 16(1), 7-11.
Axelrod, B. N., & Goldman, R. S. (1996). Use of demographic corrections in neuropsychological interpretation: How standard are standard scores? Clinical Neuropsychologist, 10(2), 159-162.
Axelrod, B. N., & Henry, R. R. (1992). Age-related performance on the Wisconsin Card Sorting, Similarities, and Controlled Oral Word Association Tests. Clinical Neuropsychologist, 6(1), 16-26.
Axelrod, B. N., & Milner, I. B. (1997). Neuropsychological findings in a sample of Operation Desert Storm veterans. Journal of Neuropsychiatry and Clinical Neurosciences, 9(1), 23-38. Axelrod, B. N., Goldman, R. S., & Woodard, J. L. (1992a). Interrater reliability in scoring the Wisconsin Card Sorting Test. Clinical Neuropsychologist, 6(2), 143-155. Axelrod, B. N., Henry, R. R., & Woodard, J. L. (1992b). Analysis of an abbreviated form of the Wisconsin Card Sorting Test. Clinical Neuropsychologist, 6(1), 27-31. Axelrod, B. N., Jiron, C. C., & Henry, R. R. (1993). Performance of adults ages 20 to 90 on the Abbreviated Wisconsin Card Sorting Test. Clinical Neuropsychologist, 7(2), 205-209. Axelrod, B. N., Goldman, R. S., Tompkins, L. M., & Jiron, C. C. (1994a). Poor differential performance on the Wisconsin Card Sorting Test in schizophrenia, mood disorder, and traumatic brain injury. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7(1), 20-24. Axelrod, B. N., Greve, K. W., & Goldman, R. S. (1994b). Comparison of four Wisconsin Card Sorting Test scoring guides with novice raters. Assessment, 1(2), 115-121. Axelrod, B. N., Goldman, R. S., Heaton, R. K., Curtiss, G., et al. (1996). Discriminability of the Wisconsin Card Sorting Test using the standardization sample. Journal of Clinical and Experimental Neuropsychology, 18(3), 338-342. Axelrod, B. N., Aharon-Peretz, J., Tomer, R., & Fisher, T. (2000a). Creating interpretation guidelines for the Hebrew Trail Making Test. Applied Neuropsychology, 7(3), 186-188. Axelrod, B. N., Heilbronner, R., Barth, J., Larrabee, G., Faust, D., Pliskin, N., et al. (2000b). Test security: Official position statement of the National Academy of Neuropsychology. Archives of Clinical Neuropsychology, 15(5), 383-386. Axelrod, B. N., Tomer, R., Fisher, T., & Aharon-Peretz, J. (2001). Preliminary analyses of Hebrew verbal fluency measures. Applied Neuropsychology, 8(4), 248-250. Baddeley, A. (1986). Working memory.
Oxford Psychology Series 11. New York: Oxford University Press. Baddeley, A. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology: Human Experimental Psychology. Special Issue: Working Memory, 49A(1), 5-28. Bak, J. S., & Greene, R. L. (1980). Changes in neuropsychological functioning in an aging population. Journal of Consulting and Clinical Psychology, 48(3), 395-399.
Baker, R., Donders, J., & Thompson, E. (2000). Assessment of incomplete effort with the California Verbal Learning Test. Applied Neuropsychology, 7(2), 111-114. Baldo, J. V., Shimamura, A. P., Delis, D. C., Kramer, J., & Kaplan, E. (2001). Verbal and design fluency in patients with frontal lobe lesions.
Journal of the International Neuropsychological Society, 7(5), 586-596. Baldo, J. V., Delis, D., Kramer, J., & Shimamura, A. P. (2002). Memory performance on the California Verbal Learning Test-II: Findings from patients with focal frontal lesions. Journal of the
International Neuropsychological Society, 8(4), 539-546. Barbarotto, R., Laiacona, M., Frosio, R., Vecchio, M., Farinato, A., & Capitani, E. (1998). A normative study on visual reaction times and two Stroop colour-word tests. Italian Journal of Neurological Sciences, 19(3), 161-170. Barcelo, F. (1999). Electrophysiological evidence of two different types of error in the Wisconsin Card Sorting Test. Neuroreport, 10(6), 1299-1303. Bardwell, W. A., Ancoli-Israel, S., Berry, C. C., & Dimsdale, J. E. (2001). Neuropsychological effects of one-week continuous positive airway pressure treatment in patients with obstructive sleep apnea: A placebo-controlled study. Psychosomatic Medicine, 63(4), 579-584. Barker-Collo, S. L. (2001). The 60-item Boston Naming Test: Cultural bias and possible adaptations for New Zealand. Aphasiology, 15(1), 85-92. Barker-Collo, S., Clarkson, A., Cribb, A., & Grogan, M. (2002). The impact of American content on California Verbal Learning Test performance: A New Zealand illustration. Clinical Neuropsychologist, 16(3), 290-299. Barncord, S. W., & Wanlass, R. L. (1999). Paper or plastic: Another ecological consideration in neuropsychological assessment. Applied Neuropsychology, 6(2), 121-122. Barncord, S. W., & Wanlass, R. L. (2001). The Symbol Trail Making Test: Test development and utility as a measure of cognitive impairment. Applied Neuropsychology, 8(2), 99-103. Baron, I. S. (2004). Neuropsychological evaluation of the child. New York: Oxford University Press. Barr, A., & Brandt, J. (1996). Word-list generation deficits in dementia. Journal of Clinical and Experimental Neuropsychology, 18(6), 810-822. Barr, W. B. (2003). Assessment of temporal lobe epilepsy using the ROCF. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure
usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Barrash, J., Suhr, J., & Manzel, K. (2004). Detecting poor effort and malingering with an expanded version of the Auditory Verbal Learning Test (AVLTX): Validation with clinical samples. Journal of Clinical and Experimental Neuropsychology, 26(1), 125-140. Barreca, S. R., Finlayson, M. A. J., Gowland, C. A., & Basmajian, J. V. (1999). Use of the Halstead Category Test as a cognitive predictor of functional recovery in the hemiplegic upper limb: A cross-validation study. Clinical Neuropsychologist, 13(2), 171-181. Barresi, B. A., Nicholas, M., Tabor Connor, L., Obler, L. K., & Albert, M. L. (2000). Semantic degradation and lexical access in age-related naming failures. Aging, Neuropsychology, and Cognition, 7(3), 169-178. Barrett, D. H., Morris, R. D., Akhtar, F. Z., & Michalek, J. E. (2001). Serum dioxin and cognitive functioning among veterans of Operation Ranch Hand. Neurotoxicology, 22, 491-502. Barrett-Connor, E., & Goodman-Gruen, D. (1999). Cognitive function and endogenous sex hormones in older women. Journal of the American Geriatrics Society, 47(11), 1289-1293. Bartels, M., & Themelis, J. (1983). Computerized tomography in tardive dyskinesia: Evidence of structural abnormalities in the basal ganglia system. Archiv für Psychiatrie und Nervenkrankheiten, 233(5), 371-379. Bartfai, A., Winborg, I. M., Nordstroem, P., & Asberg, A. (1990). Suicidal behavior and cognitive flexibility: Design and verbal fluency after attempted suicide. Suicide and Life-Threatening Behavior, 20(3), 254-266. Bartok, J. A., Wilson, C. S., Giordani, B., Keys, B. A., Persad, C. C., Foster, N. L., et al. (1997). Varying patterns of verbal recall, recognition, and response bias with progression of Alzheimer's disease. Aging, Neuropsychology, and Cognition, 4(4), 266-272. Baser, C. A., & Ruff, R. M. (1987). Construct validity of the San Diego Neuropsychological Test Battery. Archives of Clinical Neuropsychology, 2(1), 13-32. Basso, M. R., Bornstein, R. A., & Lang, J. M. (1999). Practice effects on commonly used measures of executive function across twelve months. Clinical Neuropsychologist, 13(3), 283-292. Basso, M. R., Harrington, K., Matson, M., & Lowery, N. (2000).
Sex differences on the WMS-III: Findings concerning verbal paired associates and faces. Clinical Neuropsychologist, 14(2), 231-235.
Bate, A. J., Mathias, J. L., & Crawford, J. R. (2001). Performance on the Test of Everyday Attention and standard tests of attention following severe traumatic brain injury. Clinical Neuropsychologist, 15(3), 405-422. Battig, W. F., & Montague, W. E. (1969). Category norms of verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology, 80(3), 1-46. Bayles, K. A., & Tomoeda, C. K. (1983). Confrontation naming impairment in dementia. Brain and Language, 19, 98-112. Bayles, K. A., Salmon, D. P., Tomoeda, C. K., Jacobs, D., Caffrey, J. T., Kaszniak, A. W., et al. (1989). Semantic and letter category naming in Alzheimer's patients: A predictable difference. Developmental Neuropsychology, 5(4), 335-347. Bayley, P. J., Salmon, D. P., Bondi, M. W., Bui, B. K., Olichney, J., Delis, D. C., et al. (2000). Comparison of the serial position effect in very mild Alzheimer's disease, mild Alzheimer's disease, and amnesia associated with electroconvulsive therapy. Journal of the International Neuropsychological Society, 6(3), 290-298. Beatty, W. W. (1993). Age differences on the California Card Sorting Test: Implications for the assessment of problem solving by the elderly. Bulletin of the Psychonomic Society, 31(6), 511-514. Beatty, W. W., & Monson, N. (1990). Problem solving in Parkinson's disease: Comparison of performance on the Wisconsin and California Card Sorting Tests. Journal of Geriatric Psychiatry and Neurology, 3, 163-171. Beatty, W. W., Goodkin, D. E., Monson, N., & Beatty, P. A. (1989). Cognitive disturbances in patients with relapsing remitting multiple sclerosis. Archives of Neurology, 46, 1113-1119. Beatty, W. W., Jocic, Z., Monson, N., & Staton, R. D. (1993). Memory and frontal lobe dysfunction in schizophrenia and schizoaffective disorder. Journal of Nervous and Mental Disease, 181(1), 448-453. Beatty, W. W., Jocic, Z., Monson, N., & Katzung, V. M. (1994).
Problem solving by schizophrenic and schizoaffective patients on the Wisconsin and California Card Sorting Tests. Neuropsychology, 8(1), 49-54. Beatty, W. W., Hames, K. A., Blanco, C. R., Paul, R. H., et al. (1995). Verbal abstraction deficit in multiple sclerosis. Neuropsychology, 9(2), 198-205. Beatty, W. W., Krull, K. R., Wilbanks, S. L., Blanco, C. R., Hames, K. A., & Paul, R. H. (1996a).
Further validation of constructs from the Selective Reminding Test. Journal of Clinical and Experimental Neuropsychology, 18(1), 52-55. Beatty, W. W., Wilbanks, S. L., Blanco, C. R., Hames, K. A., Tivis, R., & Paul, R. H. (1996b). Memory disturbance in multiple sclerosis: Reconsideration of patterns of performance on the Selective Reminding Test. Journal of Clinical and Experimental Neuropsychology, 18(1), 56-62. Beatty, W. W., Testa, J. A., English, S., & Winn, P. (1997). Influences of clustering and switching on the verbal fluency performance of patients with Alzheimer's disease. Aging, Neuropsychology, and Cognition, 4(4), 273-279. Beatty, W. W., Salmon, D. P., Troester, A. I., & Tivis, R. D. (2002). Do primary and supplementary measures of semantic memory predict cognitive decline by patients with Alzheimer's disease? Aging, Neuropsychology, and Cognition, 9(1), 1-10. Bechtoldt, H. P., Benton, A. L., & Fogel, M. L. (1962). An application of factor analysis in neuropsychology. Psychological Record, 12, 147-156. Becker, J. T. (1988). Working memory and secondary memory deficits in Alzheimer's disease.
Journal of Clinical and Experimental Neuropsychology, 10(6), 739-753. Becker, J. T., Huff, F. J., Nebes, R. D., Holland, A., & Boller, F. H. (1988). Neuropsychological function in Alzheimer's disease: Patterns of impairment and rate of progression. Archives of Neurology, 45, 263-268.
Beebe, D. W., Ris, M. D., & Dietrich, K. N. (2000). The relationship between CVLT-C process scores and measures of executive functioning: Lack of support among community-dwelling adolescents.
Journal of Clinical and Experimental Neuropsychology, 22(6), 779-792. Bell, B. D., Davies, K. G., Hermann, B. P., & Walters, G. (2000). Confrontation naming after anterior temporal lobectomy is related to age of acquisition of the object names. Neuropsychologia, 38(1), 83-92. Bell, B. D., Hermann, B. P., Woodard, A. R., Jones, J. E., Rutecki, P. A., Sheth, R., et al. (2001). Object naming and semantic knowledge in temporal lobe epilepsy. Neuropsychology, 15(4), 434-443. Bench, C. J., Frith, C. D., Grasby, P. M., Friston, K. J., Paulesu, E., Frackowiak, R. S. J., et al. (1993). Investigations of the functional anatomy of attention using the Stroop Test. Neuropsychologia, 31, 907-922. Benedict, R. H. B., Schretlen, D., Groninger, L., Dobraski, M., et al. (1996). Revision of the Brief
Visuospatial Memory Test: Studies of normal performance, reliability, and validity. Psychological Assessment, 8(2), 145-153. Benedict, R. H. B., Schretlen, D., Groninger, L., & Brandt, J. (1998). Hopkins Verbal Learning Test-Revised: Normative data and analysis of inter-form and test-retest reliability. Clinical Neuropsychologist, 12(1), 43-55. Benedict, R. H. B., & Zgaljardic, D. J. (1998). Practice effects during repeated administrations of memory tests with and without alternate forms. Journal of Clinical and Experimental Neuropsychology, 20(3), 339-352. Benito-Cuadrado, M. M., Esteba-Castillo, S., Boehm, P., Cejudo-Bolivar, J., & Pena-Casanova, J. (2002). Semantic verbal fluency of animals: A normative and predictive study in a Spanish population. Journal of Clinical and Experimental Neuropsychology, 24(8), 1117-1122. Bennett-Levy, J. (1984). Determinants of performance on the Rey-Osterrieth Complex Figure test: An analysis, and a new technique for single-case assessment. British Journal of Clinical Psychology, 23, 109-119. Benton, A. (1945). A visual retention test for clinical use. Archives of Neurology and Psychiatry, 54, 212-216. Benton, A. (1962). The Visual Retention Test as a constructional praxis task. Confinia Neurologica, 22, 141-155. Benton, A. (1963). Revised Visual Retention Test: Clinical and experimental applications (3rd ed.). New York: The Psychological Corporation. Benton, A. (1967). Problems of test construction in the field of aphasia. Cortex, 3, 32-58. Benton, A. (1972). Abbreviated versions of the Visual Retention Test. Journal of Psychology, 80, 189-192.
Benton, A. (1974). Revised Visual Retention Test: Clinical and experimental applications (4th ed.). San Antonio, TX: The Psychological Corporation. Benton, A., & Hamsher, K. (1978). Multilingual Aphasia Examination manual. Iowa City: University of Iowa. Benton, A., Hannay, H. J., & Varney, N. R. (1975). Visual perception of line direction in patients with unilateral brain disease. Neurology, 25(10), 907-910. Benton, A., Varney, N. R., & Hamsher, K. (1978). Visuospatial judgment: A clinical test. Archives of
Neurology, 35(6), 364-367. Benton, A., Eslinger, P. J., & Damasio, A. R. (1981). Normative observations on neuropsychological test performances in old age. Journal of Clinical Neuropsychology, 3(1), 33-42.
Benton, A., Hamsher, K., Varney, N. R., & Spreen, O. (1983a). Contributions to neuropsychological assessment: A clinical manual. New York: Oxford University Press. Benton, A., Hamsher, K., Varney, N. R., & Spreen, O. (1983b). Visual Form Discrimination. New York: Oxford University Press. Benton, A., Hamsher, K., & Sivan, A. B. (1994a). Multilingual Aphasia Examination. Iowa City: AJA Associates. Benton, A., Sivan, A. B., Hamsher, K., Varney, N. R., & Spreen, O. (1994b). Contributions to neuropsychological assessment: A clinical manual (2nd ed.). New York: Oxford University Press. Benton-Sivan, A. (1992). Benton Visual Retention Test (5th ed.). San Antonio: Psychological Corporation. Berg, E. A. (1948). A simple objective technique for measuring flexibility in thinking. Journal of General Psychology, 39, 15-22. Berman, K. F., Doran, A. R., Pickar, D., & Weinberger, D. R. (1993). Is the mechanism of prefrontal hypofunction in depression the same as in schizophrenia? Regional cerebral blood flow during cognitive activation. British Journal of Psychiatry, 162, 183-192. Berman, K. F., Ostrem, J. L., Randolph, C., Gold, J., Goldberg, T. E., Coppola, R. E., et al. (1995). Physiological activation of a cortical network during performance of the Wisconsin Card Sorting Test: A positron emission tomography study. Neuropsychologia, 33, 1027-1046. Bernard, L. C. (1989). Halstead-Reitan Neuropsychological Test performance of black, Hispanic, and white young adult males from poor academic backgrounds. Archives of Clinical Neuropsychology, 4, 267-274. Bernard, L. C. (1990). Prospects for faking believable memory deficits on neuropsychological tests and the use of incentives in simulation research.
Journal of Clinical and Experimental Neuropsychology, 12, 715-728. Bernard, L. C. (1991). The detection of faked deficits on the Rey Auditory Verbal Learning Test: The effect of serial position. Archives of Clinical Neuropsychology, 6, 81-88. Bernard, L. C., Houston, W., & Natoli, L. (1993). Malingering on neuropsychological memory tests: Potential objective indicators. Journal of Clinical Psychology, 49, 45-53. Bernard, L. C., McGrath, M. J., & Houston, W. (1996). The differential effects of simulating malingerers, closed head injury, and other CNS pathology on the Wisconsin Card Sorting Test: Support for the "pattern of performance"
hypothesis. Archives of Clinical Neuropsychology, 11, 231-245. Berning, L. C., Weed, N. C., & Aloia, M. S. (1998). Interrater reliability of the Ruff Figural Fluency Test. Assessment, 5(2), 181-186. Bernstein, J. H. (2003). Interpreting the ROCF productions of children. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Bernstein, J. H., & Waber, D. P. (1996). Developmental Scoring System for the Rey-Osterrieth Complex Figure (DSS-ROCF): Professional manual. Odessa, FL: Psychological Assessment Resources. Berry, D. T. R., & Carpenter, G. C. (1992). Effect of four different delay periods on recall of the Rey-Osterrieth Complex Figure by older persons. Clinical Neuropsychologist, 6(1), 80-84. Berry, D. T. R., Allen, R. S., & Schmitt, F. A. (1991). Rey-Osterrieth Complex Figure: Psychometric characteristics in a geriatric sample. Clinical Neuropsychologist, 5(2), 143-153. Bertolucci, P. H. F., Okamoto, I. H., Brucki, S. M. D., Siviero, M. O., Toniolo Neto, J., & Ramos, L. R. (2001). Applicability of the CERAD neuropsychological battery to Brazilian elderly. Arquivos de Neuro-Psiquiatria, 59(3-A), 532-536. Bherer, L., Belleville, S., & Peretz, I. (2001). Education, age, and the Brown-Peterson technique. Developmental Neuropsychology, 19(3), 237-251. Bieliauskas, L. A., Adams, K. M., Fennell, E., Hammeke, T., & Rourke, B. (1997a). Assessment of neuropsychological testing. Neurology, 49, 1182-1183. Bieliauskas, L. A., Fastenau, P. S., Lacy, M. A., & Roper, B. L. (1997b). Use of the odds ratio to translate neuropsychological test scores into real-world outcomes: From statistical significance to clinical significance. Journal of Clinical and Experimental Neuropsychology, 19(6), 889-896. Bigler, E. D. (2003). Neuroimaging and the ROCF. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth
Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Bigler, E. D., & Dodrill, C. B. (1997). Assessment of neuropsychological testing: Comment. Neurology, 49(4), 1180-1181. Bigler, E. D., & Tucker, D. M. (1981). Comparison of Verbal IQ, Tactual Performance, Seashore Rhythm and Finger Oscillation tests in the blind and brain-damaged. Journal of Clinical Psychology, 37(4), 849-851. Bigler, E. D., Steinman, D. R., & Newton, J. S. (1981a). Clinical assessment of cognitive deficit
in neurologic disorder: I. Effects of age and degenerative disease. International Journal of Clinical Neuropsychology, 3(3), 5-13. Bigler, E. D., Steinman, D. R., & Newton, J. S. (1981b). Clinical assessment of cognitive deficit in neurologic disorder. II: Cerebral trauma. Clinical Neuropsychology, 3(3), 13-18. Bigler, E. D., Rosa, L., Schultz, F., Hall, S., & Harris, J. (1989). Rey Auditory-Verbal Learning and Rey-Osterrieth Complex Figure Design performance in Alzheimer's disease and closed head injury. Journal of Clinical Psychology, 45(2), 277-280. Binder, L. M. (1982). Constructional strategies on complex figure drawings after unilateral brain damage. Journal of Clinical Neuropsychology, 4(1), 51-58. Binder, L. M., Villanueva, M. R., Howieson, D., & Moore, R. T. (1993). The Rey AVLT recognition memory task measures motivational impairment after mild head trauma. Archives of Clinical Neuropsychology, 8, 137-147. Binder, E. F., Storandt, M., & Birge, S. J. (1999). The relation between psychometric test performance and physical performance in older adults. Journals of Gerontology: Series A: Biological Sciences and Medical Sciences, 54(8), M428-M432. Binder, L. M., Kelly, M. P., Villanueva, M. R., & Winslow, M. M. (2003). Motivation and neuropsychological test performance following mild head injury. Journal of Clinical and Experimental Neuropsychology, 25(3), 420-430. Binetti, G., Magni, E., Padovani, A., Cappa, S. F., et al. (1995). Release from proactive interference in early Alzheimer's disease. Neuropsychologia, 33(3), 379-384. Binetti, G., Magni, E., Padovani, A., Cappa, S. F., Bianchetti, A., & Trabucchi, M. (1996). Executive dysfunction in early Alzheimer's disease.
Journal of Neurology, Neurosurgery, and Psychiatry, 60, 91-93. Blachstein, H., Vakil, E., & Hoffien, D. (1993). Impaired learning in patients with closed-head injuries: An analysis of components of the acquisition process. Neuropsychology, 7(4), 530-535. Bleecker, M. L., Bolla-Wilson, K., Agnew, J., & Meyers, D. A. (1988). Age-related sex differences in verbal memory. Journal of Clinical Psychology, 44(3), 403-411. Blusewicz, M. J., Kramer, J. H., & Delmonico, R. L. (1996). Interference effects in chronic alcoholism.
Journal of the International Neuropsychological Society, 2(2), 141-145. Boeve, B., McCormick, J., Smith, G., Ferman, T., Rummans, T., Carpenter, T., et al. (2003). Mild
cognitive impairment in the oldest old. Neurology, 60(3), 477-480. Boll, T. J., & Reitan, R. M. (1973). Effect of age on performance on the Trail Making Test. Perceptual and Motor Skills, 36, 691-694. Bolla, K. I., Lindgren, K. N., Bonaccorsy, C., & Bleecker, M. L. (1990). Predictors of verbal fluency (FAS) in the healthy elderly. Journal of Clinical Psychology, 46(5), 623-628. Bolla-Wilson, K., & Bleecker, M. L. (1986). Influence of verbal intelligence, sex, age, and education on the Rey Auditory Verbal Learning Test. Developmental Neuropsychology, 2(3), 203-211. Bondi, M. W., Kaszniak, A. W., Bayles, K. A., & Vance, K. (1993). Contributions of frontal system dysfunction to memory and perceptual abilities in Parkinson's patients. Neuropsychology, 7, 89-102. Bondi, M. W., Monsch, A. U., Butters, N., Salmon, D. P., & Paulsen (1993). Utility of a modified version of the Wisconsin Card Sorting Test in the detection of dementia of the Alzheimer type. Clinical Neuropsychologist, 7(2), 161-170. Boone, K. B. (1999). Neuropsychological assessment of executive functions: Impact of age, education, gender, intellectual level, and vascular status on executive test scores. In B. L. Miller & J. L. Cummings (Eds.), The human frontal lobes:
Functions and disorders. The science and practice of neuropsychology series (pp. 247-260). New York: Guilford Press. Boone, K. B. (2000). The Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure. Journal of Clinical and Experimental Neuropsychology, 22(3), 430-432. Boone, K. B., Miller, B. L., Rosenberg, L., Durazo, A., McIntyre, H., & Weil, M. (1988). Neuropsychological and behavioral abnormalities in an adolescent with frontal lobe seizures. Neurology, 38, 583-586. Boone, K. B., Miller, B. L., Lesser, I. M., Hill, E., & D'Elia, L. (1990). Performance on frontal lobe tests in healthy, older individuals. Developmental Neuropsychology, 6(3), 215-223. Boone, K. B., Ananth, J., Philpott, L., Kaur, A., & Djenderedjian, A. (1991). Neuropsychological characteristics of nondepressed adults with obsessive-compulsive disorder. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 4, 96-109. Boone, K. B., Miller, B. L., Lesser, I. M., Mehringer, C. M., Hill-Gutierrez, E., Goldberg, M. A., et al. (1992). Neuropsychological correlates of white-matter lesions in healthy elderly subjects: A threshold effect. Archives of
Neurology, 49, 549--554.
Boone, K. B., Ghaffarian, S., Lesser, I. M., Hill-Gutierrez, E., & Berman, N. G. (1993a). Wisconsin Card Sorting Test performance in healthy, older adults: Relationship to age, sex, education, and IQ. Journal of Clinical Psychology, 49(1), 54-60. Boone, K. B., Lesser, I. M., Hill-Gutierrez, E. H., Berman, N. G., & D'Elia, L. F. (1993b). Rey-Osterrieth Complex Figure performance in healthy, older adults: Relationship to age, education, sex and IQ. Clinical Neuropsychologist, 7(1), 22-28. Boone, K. B., Miller, B. L., & Lesser, I. M. (1993c). Frontal lobe cognitive functions in aging: Methodologic considerations. Dementia, 4, 232-236. Boone, K. B., Lesser, I. M., Miller, B. L., Wohl, M., Berman, N., Lee, A., et al. (1995). Cognitive functioning in older depressed outpatients: Relationship of presence and severity of depression to neuropsychological test scores. Neuropsychology, 9(3), 390-398. Boone, K. B., Ponton, M. O., Gorsuch, R. L., Gonzalez, J. J., & Miller, B. L. (1998). Factor analysis of four measures of prefrontal lobe functioning. Archives of Clinical Neuropsychology, 13(1), 585-595. Boone, K. B., Miller, B. L., Lee, A., Berman, N., Sherman, D., & Stuss, D. T. (1999). Neuropsychological patterns in right versus left frontotemporal dementia. Journal of the International Neuropsychological Society, 5(7), 616-622. Boone, K. B., Swerdloff, R. S., Miller, B. L., Geschwind, D. H., Razani, J., Lee, A., et al. (2001). Neuropsychological profiles of adults with Klinefelter syndrome. Journal of the International Neuropsychological Society, 7(4), 446-456. Borak, J., Sliwinski, P., Pobiasz, M., Gorecka, D., & Zielinski, J. (1996). Psychological status of COPD patients before and after one year of long-term oxygen therapy. Monaldi Archives of Chest Diseases, 51(1), 7-11. Boringa, J. B., Lazeron, R., Reuling, I., Ader, H., Pfennings, L., Lindeboom, J., et al. (2001).
The Brief Repeatable Battery of Neuropsychological Tests: Normative values allow application in multiple sclerosis clinical practice. Multiple Sclerosis, 7, 263-267. Borkowski, J., Benton, A., & Spreen, O. (1967). Word fluency and brain damage. Neuropsychologia, 5, 135-140. Bornstein, M. H. (1973). Color vision and color naming: A psychological hypothesis of cultural difference. Psychological Bulletin, 80, 257-285. Bornstein, R. A. (1985). Normative data on selected neuropsychological measures from a nonclinical
sample. Journal of Clinical Psychology, 41(5), 651-658. Bornstein, R. A. (1986a). Classification rates obtained with "standard" cut-off scores on selected neuropsychological measures. Journal of Clinical and Experimental Neuropsychology, 8(4), 413-420. Bornstein, R. A. (1986b). Contribution of various neuropsychological measures to detection of frontal lobe impairment. International Journal of Clinical Neuropsychology, 8(1), 18-22. Bornstein, R. A. (1986c). Normative data on intermanual differences on three tests of motor performance. Journal of Clinical and Experimental Neuropsychology, 8(1), 12-20. Bornstein, R. A. (1986d). Consistency of intermanual discrepancies in normal and unilateral brain lesion patients. Journal of Consulting and Clinical Psychology, 54(5), 719-723. Bornstein, R. A. (1990). Neuropsychological test batteries in neuropsychological assessment. In A. A. Boulton, G. B. Baker, & M. Hiscock (Eds.), Neuromethods-17: Neuropsychology. Clifton, NJ: Humana Press. Bornstein, R. A., & Suga, L. J. (1988). Educational level and neuropsychological performance in healthy elderly subjects. Developmental Neuropsychology, 4(1), 17-22. Bornstein, R. A., Baker, G. B., & Douglass, A. B. (1987a). Short-term retest reliability of the Halstead-Reitan Battery in a normal sample. Journal of Nervous and Mental Disease, 175(4), 229-232. Bornstein, R. A., Paniak, C., & O'Brien, W. (1987b). Preliminary data on classification of normal and brain-damaged elderly subjects. Clinical Neuropsychologist, 1(4), 315-323. Borod, J. C., Goodglass, H., & Kaplan, E. (1980). Normative data on the Boston Diagnostic Aphasia Examination, Parietal Lobe Battery, and the Boston Naming Test. Journal of Clinical Neuropsychology, 2(3), 209-215. Borod, J. C., Caron, H. S., & Koff, E. (1984). Left-handers and right-handers compared on performance and preference measures of lateral dominance. British Journal of Psychology, 75(2), 177-186. Botwinick, J. (1981). Neuropsychology of aging. In S.
Filskov & T. Boll (Eds.), Handbook of Clinical Neuropsychology. New York: Wiley. Bowden, S., & Bell, R. (1992). Relative usefulness of the WMS and WMS-R: A comment on D'Elia et al. (1989). Journal of Clinical and Experimental Neuropsychology, 14(2), 340-346. Bowden, S., Fowler, K. S., Bell, R. C., Whelan, G., Clifford, C. C., Ritter, A. J., et al. (1998). The reliability and internal validity of the Wisconsin
Card Sorting Test. Neuropsychological Rehabilitation, 8(3), 243-254. Bowles, N. L., & Poon, L. W. (1985). Aging and retrieval of words in semantic memory. Journal of Gerontology, 40(1), 71-77. Boyd, J. L. (1981). A validity study of the Hooper Visual Organization Test. Journal of Consulting and Clinical Psychology, 49, 15-19. Boyd, J. L. (1982a). Reply to Rathbun and Smith: Who made the Hooper blooper? Journal of Consulting and Clinical Psychology, 50, 284-285. Boyd, J. L. (1982b). Reply to Woodward. Journal of Consulting and Clinical Psychology, 50(2), 289-290. Boyle, G. J. (1975). Shortened Halstead Category Test. Australian Psychologist, 10(1), 81-84. Boyle, G. J. (1986). Clinical neuropsychological assessment: Abbreviating the Halstead Category Test of brain dysfunction. Journal of Clinical Psychology, 42(4), 615-625. Boyle, G. J. (1988). What does the neuropsychological category test measure? Archives of Clinical Neuropsychology, 3, 69-76. Boyle, G. J., Ward, J., & Steindl, S. R. (1994). Psychometric properties of Russell's short form of the Booklet Category Test. Perceptual and Motor Skills, 79(1, Pt 1), 128-130. Bradford, D. T. (1992). Interpretive Reasoning and the Halstead-Reitan Tests. Brandon, VT: Clinical Psychology. Brady, C. B., Spiro, A., III, McGlinchey-Berroth, R., Milberg, W., & Gaziano, J. M. (2001). Stroke risk predicts verbal fluency decline in healthy older men: Evidence from the Normative Aging Study.
Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 56B(6), P340-P346. Braff, D. L. (1989). Sensory input deficits and negative symptoms in schizophrenic patients. American Journal of Psychiatry, 146(8), 1006-1011. Braff, D. L., Heaton, R. K., Kuck, J., Cullum, M., Maranville, J., Grant, I., et al. (1991). The generalized pattern of neuropsychological deficits in outpatients with chronic schizophrenia with heterogeneous Wisconsin Card Sorting Test results. Archives of General Psychiatry, 48(10), 891-898. Brandon, A. D., & Chavez, E. L. (1985). Order and delay effects on neuropsychological test presentation: The Halstead Category and Wisconsin Card Sorting Tests. Clinical Neuropsychology, 7(3), 152-153. Brandt, J. (1991). The Hopkins Verbal Learning Test: Development of a new memory test with six equivalent forms. Clinical Neuropsychologist, 5(2), 125-142.
Brandt, J., & Benedict, R. H. B. (2001). Hopkins Verbal Learning Test-Revised. Lutz, FL: Psychological Assessment Resources. Brebion, G., Smith, M. J., Gorman, J. M., & Amador, X. (1996). Reality monitoring failure in schizophrenia: The role of selective attention. Schizophrenia Research, 22, 173-180. Breteler, M. M., van Amerongen, N. M., van Swieten, J. C., Claus, J. J., Grobbee, D. E., van Gijn, J., et al. (1994). Cognitive correlates of ventricular enlargement and cerebral white matter lesions on magnetic resonance imaging. The Rotterdam Study. Stroke, 25, 1109-1115. Brislin, R. W. (1983). Cross-cultural research in psychology. Annual Review of Psychology, 34, 363-400. Brittain, J. L., LaMarche, J. A., Reeder, K. P., Roth, D. L., & Boll (1991). Effects of age and IQ on Paced Auditory Serial Addition Task (PASAT) performance. Clinical Neuropsychologist, 5(2), 163-175. Brooke, M. M., Questad, K. A., Patterson, D. R., & Valois, T. A. (1992). Driving evaluation after traumatic brain injury. American Journal of Physical Medicine Rehabilitation, 71(3), 177-182. Brooks, D. N. (1972). Memory and head injury. Journal of Nervous and Mental Disease, 155(5),
350-355. Brooks, J., Fos, L., Greve, K., & Hammond, J. S. (1999). Assessment of executive function in patients with mild traumatic brain injury. The Journal of Trauma, 46(1), 159-163. Brown, G. G., Kindermann, S. S., Siegle, G. J., Granholm, E., Wong, E. C., & Buxton, R. B. (1999). Brain activation and pupil response during covert performance of the Stroop Color Word task. Journal of the International Neuropsychological Society, 5(4), 308-319. Brown, J. (1958). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 10, 12-21. Buchanan, R. W., Strauss, M. E., Kirkpatrick, B., Holstein, C., Breier, A., & Carpenter, W. T. (1994). Neuropsychological impairments in deficit vs. nondeficit forms of schizophrenia. Archives of General Psychiatry, 51, 804-811. Burgess, P. W. (2003). Assessment of executive function. In P. W. Halligan, U. Kischka, & J. C. Marshall (Eds.), Handbook of clinical neuropsychology. New York: Oxford University Press. Burton, C. L., Hultsch, D. F., Strauss, E., & Hunter, M. A. (2002). Intraindividual variability in physical and emotional functioning: Comparison of adults with traumatic brain injuries and healthy adults. Clinical Neuropsychologist, 16(3), 264-279.
Buschke, H. (1973). Selective reminding for analysis of memory and learning. Journal of Verbal Learning and Verbal Behavior, 12(5), 543-550. Buschke, H. (1984). Cued recall in amnesia. Journal of Clinical Neuropsychology, 6(4), 433-440. Buschke, H., & Fuld, P. (1974). Evaluating storage, retention, and retrieval in disordered memory and learning. Neurology, 24(11), 1019-1025. Bustini, M., Stratta, P., Daneluzzo, E., Pollice, R., Prosperini, P., & Rossi, A. (1999). Tower of Hanoi and WCST performance in schizophrenia: Problem-solving capacity and clinical correlates. Journal of Psychiatric Research, 33, 285-290. Butler, M., Retzlaff, P. D., & Vanderploeg, R. (1991). Neuropsychological test usage. Professional Psychology: Research and Practice, 22(6), 510-512. Butler, R. W., Jenkins, M. A., Sprock, J., & Braff, D. L. (1992). Wisconsin Card Sorting Test deficits in chronic paranoid schizophrenia: Evidence for a relatively discrete subgroup? Schizophrenia Research, 7, 169-176. Butler, R. W., Rorsman, I., Hill, J. M., & Tuma, R. (1993). The effects of frontal brain impairment on fluency: Simple and complex paradigms. Neuropsychology, 7(4), 519-529. Butman, T. J. (2001). Designing an instrument of early diagnosis of dementia in primary care/Hacia un protocolo clinico de deteccion precoz de demencia en asistencia primaria. Acta Psiquiatrica y Psicologica de America Latina, 47(1), 79-87. Butters, N., Granholm, E., Salmon, D. P., & Grant, I. (1987). Episodic and semantic memory: A comparison of amnesic and demented patients.
Journal of Clinical and Experimental Neuropsychology, 9, 479-497. Cabeza, R., & Nyberg, L. (2002). Imaging cognition: An empirical review of 275 PET and fMRI studies. Journal of Cognitive Science, 12, 1-47. Caffarra, P., Vezzadini, G., Dieci, F., Zonato, F., & Venneri, A. (2002). Rey-Osterrieth Complex Figure: Normative values in an Italian population sample. Neurological Sciences, 22(6), 443-447. Caffarra, P., Vezzadini, G., Dieci, F., Zonato, F., & Venneri, A. (2004). Modified card sorting test: Normative data. Journal of Clinical and Experimental Neuropsychology, 26(2), 246-250. Cahn, D. A., Salmon, D. P., Butters, N., Wiederholt, W. C., Corey-Bloom, J., Edelstein, S. L., et al. (1995). Detection of dementia of the Alzheimer type in a population-based sample: Neuropsychological test performance. Journal of the International Neuropsychological Society, 1(3), 252-260.
Cahn, D. A., Marcotte, A. C., Stern, R. A., Arruda, J. E., Akshoomoff, N. A., & Leshko, I. C. (1996). The Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure: A study of children with attention deficit hyperactivity disorder. Clinical Neuropsychologist, 10(4), 397-406. Cahn-Weiner, D. A., Boyle, P. A., & Malloy, P. F. (2002). Tests of executive function predict instrumental activities of daily living in community-dwelling older individuals. Applied Neuropsychology, 9(3), 187-191. Caine, E. D. (1986). The neuropsychology of depression: The pseudodementia syndrome. In I. Grant & K. M. Adams (Eds.), Neuropsychological assessment of neuropsychiatric disorders (pp. 221-243). New York: Oxford University Press. Calero, M. D., Arnedo, M. L., Navarro, E., Ruiz-Pedrosa, M., & Camero, C. (2002). Usefulness of a 15-item version of the Boston Naming Test in neuropsychological assessment of low-educational elders with dementia. Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 57(2), P187-P191.
Calsyn, D. A., O'Leary, M. R., & Chaney, E. F. (1980). Shortening the Category Test. Journal of Consulting and Clinical Psychology, 48(6), 788-789. Caltagirone, C., Carlesimo, A., Nocentini, U., & Vicari, S. (1989). Defective concept formation in parkinsonians is independent from mental deterioration. Journal of Neurology, Neurosurgery, and Psychiatry, 52, 334-337. Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological test usage: Implications in professional psychology. Professional Psychology: Research and Practice, 31(2), 141-154. Campo, P., & Morales, M. (2003). Reliability and normative data for the Benton Visual Form Discrimination Test. Clinical Neuropsychologist, 17(2), 220-225. Campo, P., & Morales, M. (2004). Normative data and reliability for a Spanish version of the verbal Selective Reminding Test. Archives of Clinical Neuropsychology, 19(3), 421-435. Campo, P., Morales, M., & Juan-Malpartida, M. (2000). Development of two Spanish versions of the verbal Selective Reminding Test. Journal of
Clinical and Experimental Neuropsychology, 22, 279-285. Campo, P., Morales, M., & Martinez-Castillo, E. (2003). Discrimination of normal from demented elderly on a Spanish version of the Verbal Selective Reminding Test. Journal of Clinical and Experimental Neuropsychology, 25(7), 991-999.
Canadian Study of Health and Aging Working Group (1994). The Canadian Study of Health and Aging: Study methods and prevalence of dementia. Canadian Medical Association Journal, 150, 899-913. Caplan, B. (1985). Stimulus effects in unilateral neglect? Cortex, 21(1), 69-80. Caplan, B., & Caffery, D. (1996). Visual form discrimination as a multiple-choice visual memory test: Illustrative data. Clinical Neuropsychologist, 10(2), 152-158. Caplan, B., & Schultheis, M. (1998). An interpretative table for the Visual Form Discrimination Test. Perceptual and Motor Skills, 87, 1203-1207. Caplan, B., & Shechter, J. (1995). The role of nonstandard neuropsychological assessment in rehabilitation: History, rationale, and examples. In L. A. Cushman & M. J. Scherer (Eds.), Psychological assessment in medical rehabilitation. Washington, DC: American Psychological Association.
Carew, T. G., Lamar, M., Cloud, B. S., Grossman, M., & Libon, D. J. (1997). Impairment in category fluency in ischemic vascular dementia. Neuropsychology, 11(3), 400-412. Carmelli, D., Swan, G. E., Reed, T., Schellenberg, G. D., & Christian, J. C. (1999). The effect of apolipoprotein E epsilon4 in the relationships of smoking and drinking to cognitive function. Neuroepidemiology, 18(3), 125-133. Carmelli, D., DeCarli, C., Swan, G. E., Kelly-Hayes, M., Wolf, P. A., Reed, T., et al. (2000). The joint effect of apolipoprotein E epsilon4 and MRI findings on lower-extremity function and decline in cognitive function. Journals of Gerontology: Series A: Biological Sciences and Medical Sciences, 55A(2), M103-M109.
Carr, E. K., & Lincoln, N. B. (1988). Inter-rater reliability of the Rey figure copying test. British Journal of Clinical Psychology, 27, 267-268. Carstairs, J. R., & Shores, E. A. (2000). The Macquarie University Neuropsychological Normative Study (MUNNS): Rationale and methodology. Australian Psychologist, 35, 36-40. Carter, C. S., Mintun, M., & Cohen, J. D. (1995). Interference and facilitation effects during selective attention: An H₂¹⁵O PET study of Stroop task performance. Neuroimage, 2, 264-272. Carter, C. S., Mintun, M., Nichols, T., & Cohen, J. D. (1997). Anterior cingulate gyrus dysfunction and selective attention deficits in schizophrenia: [¹⁵O]H₂O PET study during single-trial Stroop task performance. American Journal of Psychiatry, 154(12), 1670-1675.
Carter, S. L., Shore, D., Harnadek, M. C. S., & Kubu, C. S. (1998). Normative data and interrater reliability of the Design Fluency Test. Clinical Neuropsychologist, 12(4), 531-534. Casey, M. B., Winner, E., Hurwitz, I., & DaSilva, D. (1991). Does processing style affect recall of the Rey-Osterrieth or Taylor Complex Figures?
Journal of Clinical and Experimental Neuropsychology, 13, 600-606. Catafau, A. M., Parellada, E., Lomena, F. J., Bernardo, M., Pavia, J., Ros, D., et al. (1994). Prefrontal and temporal blood flow in schizophrenia: Resting and activation technetium-99m-HMPAO SPECT patterns in young neuroleptics-naive patients with acute disease. Journal of Nuclear Medicine, 35, 935-941. Cattell, J. (1886). The time it takes to see and name objects. Mind, 11, 63-65. Cauthen, N. (1977). Extension of the Wechsler Memory Scale norms to the older age groups. Journal of Clinical Psychology, 33, 208-212. Cauthen, N. (1978a). Normative data for the Tactual Performance Test. Journal of Clinical Psychology, 34(2), 456-460. Cauthen, N. (1978b). Verbal fluency: Normative data. Journal of Clinical Psychology, 32(1), 126-129. Cavalli, M., De Renzi, E., Faglioni, P., & Vitale, A. (1981). Impairment of right brain-damaged patients on a linguistic cognitive task. Cortex, 17, 546-556. Cerhan, J. H., Ivnik, R. J., Smith, G. E., Tangalos, E. C., Petersen, R. C., & Boeve, B. F. (2002). Diagnostic utility of letter fluency, category fluency, and fluency difference scores in Alzheimer's disease. Clinical Neuropsychologist, 16(1), 35-42. Cermak, L. S., & Butters, N. (1972). The role of interference and encoding in the short-term memory deficits of Korsakoff patients. Neuropsychologia, 10, 89-95. Chan, A. S., & Poon, M. W. (1999). Performance of 7- to 95-year-old individuals in a Chinese version of the category fluency test. Journal of the International Neuropsychological Society, 5(6), 525-533. Chan, R. C. K. (2001). Base rate of post-concussion symptoms among normal people and its neuropsychological correlates. Clinical Rehabilitation, 15(3), 266-273. Channon, S. (1996). Executive dysfunction in depression: The Wisconsin Card Sorting Test. Journal of Affective Disorders, 39, 107-114. Charter, R. A. (1994). Determining random responding for the Category, Speech-Sounds
Perception, and Seashore Rhythm tests. Journal of Clinical and Experimental Neuropsychology, 16(5), 744-748. Charter, R. A. (1999). Sample size requirements for precise estimates of reliability, generalizability, and validity coefficients. Journal of Clinical and Experimental Neuropsychology, 21(4), 559-566. Charter, R. A. (2000a). Internal consistency reliability of the Tactual Performance Test trials. Perceptual and Motor Skills, 91(2), 460-462. Charter, R. A. (2000b). Item difficulty analysis of the tactual performance test trials. Perceptual and Motor Skills, 91(3, Pt 1), 903-909. Charter, R. A. (2001a). Coefficients alpha for the Tactual Performance Test trials. Perceptual and Motor Skills, 92(3, Pt 1), 893. Charter, R. A. (2001b). Difference score reliability for Tactual Performance Test trials. Perceptual and Motor Skills, 92(3, Pt 1), 941-942. Charter, R. A. (2001c). Tactual performance test trials: Internal consistency reliability using the Gilmer-Feldt coefficient. Perceptual and Motor Skills, 93(2), 363-366. Charter, R. A., & Dutra, R. L. (2001a). Item difficulty of the Tactual Performance Test Location score. Perceptual and Motor Skills, 93(3), 899-900. Charter, R. A., & Dutra, R. L. (2001b). Tactual Performance Test: Item analysis of the Memory and Location scores. Perceptual and Motor Skills, 92(3, Pt 1), 899-902. Charter, R. A., Adkins, T. G., Alekoumbides, A., & Seacat, G. F. (1987). Reliability of the WAIS, WMS, and Reitan Batteries: Raw scores and standardized scores corrected for age and education. International Journal of Clinical Neuropsychology, 9(1), 28-32. Charter, R. A., Walden, D. K., & Hoffman, C. (1998). Interscorer reliabilities for memory and localization scores of the Tactual Performance Test. Clinical Neuropsychologist, 12(2), 245-247. Charter, R. A., Dutra, R. L., & Rapport, L. J. (2000). Tactual performance test: Internal consistency reliability of the memory and location scores.
Perceptual and Motor Skills, 91(1), 143-146. Chavez, E. L., Schwartz, M. M., & Brandon, A. (1982). Effects of sex of subjects and method of block presentation on the Tactual Performance Test. Journal of Consulting and Clinical Psychology, 50(4), 600-601. Chelune, G., Bornstein, R. L., & Prifitera, A. (1989). The Wechsler Memory Scale-Revised: Current status and applications. In J. Rosen, P. McReynolds, & G. Chelune (Eds.), Advances in psychological assessment. New York: Plenum Press.
Chen, E. Y. H., Kwok, C. L., Chen, R. L., & Kwong, P. P. K. (2001). Insight changes in acute psychotic episodes: A prospective study of Hong Kong Chinese patients. Journal of Nervous and Mental Disease, 189(1), 24-30. Chen, H., & Ho, C. (1986). Developmental study of the reversed Stroop effect in Chinese-English bilinguals. Journal of General Psychology, 113, 121-125. Chen, P., Ratcliff, G., Belle, S. H., Cauley, J. A., DeKosky, S. T., & Ganguli, M. (2000). Cognitive tests that best discriminate between presymptomatic AD and those who remain nondemented. Neurology, 55(12), 1847-1853. Chen, Y. L. R., Chen, Y. H. E., & Lieh, M. F. (2000). Semantic verbal fluency deficit as a familial trait marker in schizophrenia. Psychiatry Research, 95(2), 133-148. Chen, Z. Q., Yu, J. H., & Cao, S. H. (1990). Reference values of indicators for WHO neurobehavioral core test battery. Chinese Medical Journal, 103(1), 61-65. Cherrier, M. M., Mendez, M. F., Dave, M., & Perryman, K. M. (1999). Performance on the Rey-Osterrieth Complex Figure test in Alzheimer disease and vascular dementia. Neuropsychiatry,
Neuropsychology, and Behavioral Neurology, 12(2), 95-101. Chervinsky, A. B., Mitrushina, M., & Satz, P. (1992). Comparison of four methods of scoring the Rey-Osterrieth Complex Figure Drawing Test on four age groups of normal elderly. Brain Dysfunction, 5, 267-287. Chia, S. E., Jeyaratnam, J., Ong, C. N., Ng, T. P., & Lee, H. S. (1994). Impairment of color vision among workers exposed to low concentrations of styrene. American Journal of Industrial Medicine, 26(4), 481-488. Chiaravalloti, N. D., Demaree, H., Gaudino, E. A., & De Luca, J. (2003). Can the repetition effect maximize learning in multiple sclerosis? Clinical Rehabilitation, 17(1), 58-68. Chiu, H. F. K., Chan, C. K. Y., Lam, L. C. W., Ng, K., Li, S., Wong, M., et al. (1997). The modified Fuld Verbal Fluency Test: A validation study in Hong Kong. Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 52B(5), P247-P250. Chiulli, S. J., Haaland, K. Y., Ellis, H. C., & Rhodes, J. M. (1985, February). Recall and
recognition memory with a variant of the Rey Auditory-Verbal Learning Test in a clinically depressed population. Paper presented at the 13th Annual Convention of the International Neuropsychological Society, San Diego, CA.
Chiulli, S. J., Yeo, R., Haaland, K., & Garry, P. (1989). Complex figure copy and recall in the elderly. Journal of Clinical and Experimental Neuropsychology, 11, 95. Chiulli, S. J., Haaland, K. Y., LaRue, A., & Garry, P. J. (1995). Impact of age on drawing the Rey-Osterrieth Figure. Clinical Neuropsychologist, 9(3), 219-224. Choca, J. P., Laatsch, L., Wetzel, L., & Agresti, A. (1997). The Halstead Category Test: A fifty year perspective. Neuropsychology Review, 7(2), 61-75. Christensen, A.-L. (1974). Luria's neuropsychological investigation. Copenhagen: Munksgaard. Christensen, H., Mackinnon, A. J., Korten, A. E., Jorm, A. F., Henderson, A. S., Jacomb, P., et al. (1999). An analysis of diversity in the cognitive performance of elderly community dwellers: Individual differences in change scores as a function of age. Psychology and Aging, 14(3), 365-379. Christensen, H., Mackinnon, A. J., Korten, A., & Jorm, A. F. (2001). The "common cause hypothesis" of cognitive aging: Evidence for not only a common factor but also specific associations of age with vision and grip strength in a cross-sectional analysis. Psychology and Aging, 16(4), 588-599. Christensen, K. J., Kim, S. W., Dysken, M. W., & Hoover, K. M. (1992). Neuropsychological performance in obsessive-compulsive disorder. Biological Psychiatry, 31, 4-18. Christensen, K. J., Riley, B. E., Heffernan, K. A., Love, S. B., & Sta Maria, M. E. M. (2002). Neuropsychological tests in the elderly: Methods and sample characteristics of a GRECC study. Clinical Neuropsychologist, 16(1), 43-50. Chronicle, E. P., & MacGregor, N. A. (1998). Are PASAT scores related to mathematical ability? Neuropsychological Rehabilitation, 8(3), 273-282.
Cicchetti, D. V. (1999). Sample size requirements for increasing the precision of reliability estimates: Problems and proposed solutions. Journal
of Clinical and Experimental Neuropsychology, 21(4), 567-570. Cicerone, K. D. (1997). Clinical sensitivity of four measures of attention to mild traumatic brain injury. Clinical Neuropsychologist, 11(3), 266-272. Cicerone, K. D., & Azulay, J. (2002). Diagnostic utility of attention measures in postconcussion syndrome. Clinical Neuropsychologist, 16(3), 280-289. Clark, C., & Klonoff, H. (1988). Reliability and construct validity of the six-block Tactual Performance
Test in an adult sample. Journal of Clinical and Experimental Neuropsychology, 10(2), 175-184. Clark, C., Jacova, C., Klonoff, H., Kremer, B., et al. (1997). Pathological association and dissociation of functional systems in multiple sclerosis and Huntington's disease. Journal of Clinical and Experimental Neuropsychology, 19(1), 63-76. Coelho, C. A. (2002). Story narratives of adults with closed head injury and non-brain-injured adults: Influence of socioeconomic status, elicitation task, and executive functioning. Journal of Speech, Language, and Hearing Research, 45(6), 1232-1248. Coello, E., Ardila, A., & Rosselli, M. (1990). Is there a cognitive marker in major depression? International Journal of Neuroscience, 50, 137-145. Coen, R. F., Maguire, C., Swanwick, G. R., Kirby, M., Burke, T., Lawlor, B. A., et al. (1996). Letter and category fluency in Alzheimer's disease: A prognostic indicator of progression. Dementia, 7(5), 246-250. Coffey, C. E., Ratcliff, G., Saxton, J. A., Bryan, R. N., Fried, L. P., & Lucke, J. F. (2001). Cognitive correlates of human brain aging: A quantitative magnetic resonance imaging investigation. Journal of Neuropsychiatry and Clinical Neurosciences, 13(4), 471-485.
Cohen, M. J., & Stanczak, D. E. (2000). On the reliability, validity, and cognitive structure of the Thurstone Word Fluency Test. Archives of Clinical Neuropsychology, 15(3), 267-279. Cohen, R. A., Kaplan, R. F., Zuffante, P., Moser, D. J., Jenkins, M. A., Salloway, S., et al. (1999). Alteration of intention and self-initiated action associated with bilateral anterior cingulotomy. Journal
of Neuropsychiatry and Clinical Neurosciences, 11(4), 444-453. Cohen, R. A., Brumm, V., Zawacki, T. M., Paul, R., Sweet, L., & Rosenbaum, A. (2003). Impulsivity and verbal deficits associated with domestic violence. Journal of the International Neuropsychological Society, 9(5), 760-770. Cohn, N. B., Dustman, R. E., & Bradford, D. C. (1984). Age-related decrements in Stroop Color Test performance. Journal of Clinical Psychology, 40, 1244-1250. Collie, A., Shafiq-Antonacci, R., Maruff, P., Tyler, P., & Currie, J. (1999). Norms and the effects of demographic variables on a neuropsychological battery for use in healthy ageing Australian populations. Australian and New Zealand Journal of Psychiatry, 33(4), 568-575. Collier, A. C., Marra, C., Coombs, R. W., Claypoole, K., Cohen, W., et al. (1992). Central nervous system manifestations in human immunodeficiency
virus infection without AIDS. Journal of Acquired Immune Deficiency Syndromes, 5(3), 229-241. Colombo, F., & Assal, G. (1992). Adaptation francaise du test de denomination de Boston. Versions abregees. European Review of Applied Psychology, 42(1), 67-73. Comalli, P. E., Wapner, S., & Werner, H. (1962). Interference effects of Stroop Color-Word Test in childhood, adulthood, and aging. Journal of Genetic Psychology, 100, 47-53. Coman, E., Moses, J. A., Jr., Kraemer, H. C., Friedman, L., Benton, A. L., & Yesavage, J. (1999). Geriatric performance on the Benton Visual Retention Test: Demographic and diagnostic considerations. Clinical Neuropsychologist, 13(1), 66-77. Coman, E., Moses, J. A., Jr., Kraemer, H. C., Friedman, L., Benton, A. L., & Yesavage, J. (2002). Interactive influences on BVRT performance level: Geriatric considerations. Archives of Clinical Neuropsychology, 17(6), 595-610. Compton, D. M., Bachman, L. D., & Logan, J. A. (1997). Aging and intellectual ability in young, middle-aged, and older educated adults: Preliminary results from a sample of college faculty. Psychological Reports, 81(1), 79-90. Compton, D. M., Bachman, L. D., Brand, D., & Avet, T. L. (2000). Age-associated changes in cognitive function in highly educated adults: Emerging myths and realities. International Journal of Geriatric Psychiatry, 15(1), 75-85. Concha, M., Selnes, O. A., McArthur, J. C., & Nance-Sproson, T. (1995). Normative data for a brief neuropsychological test battery in a cohort of injecting drug users. International Journal of the Addictions, 30(7), 823-841. Conners, C. K., Epstein, J., Stern, R. A., March, J., Sparrow, E., & Javorsky, D. J. (2003). Subtyping attention-deficit/hyperactivity disorder (ADHD): Use of the ROCF. In J. A. Knight (Ed.), The
handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Connor, A., Franzen, M., & Sharp, B. (1988). Effects of practice and differential instructions on Stroop performance. International Journal of Clinical Neuropsychology, 10, 1-4. Connor, P. D., Sampson, P. D., Bookstein, F. L., Barr, H. M., & Streissguth, A. P. (2001). Direct and indirect effects of prenatal alcohol damage on executive function. Developmental Neuropsychology, 18(3), 331-354. Constantinou, M., Ashendorf, L., & McCaffrey, R. J. (2002). When the third party observer of a neuropsychological evaluation is an audio-recorder. Clinical Neuropsychologist, 16(3), 407-412.
Cooper, H., & Hedges, L. V. (1994). The handbook of research synthesis. New York: Russell Sage Foundation. Cooper, J. A., Sagar, H. J., Jordan, N., Harvey, N. S., & Sullivan, E. V. (1991). Cognitive impairment in early untreated Parkinson's disease and its relationship to motor disability. Brain, 114, 2095-2122.
Copersino, M. L., Serper, M., & Allen, M. H. (2003). Rapid screening for cognitive impairment in the psychiatric emergency service: II. A flexible test strategy. Psychiatric Services, 54(3), 314-316. Corey, D. M., Hurley, M. M., & Foundas, A. L. (2001). Right and left handedness defined: A multivariate approach using hand preference and hand performance measures. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 14(3), 144-152. Corrigan, J. D., & Hinkeldey, N. S. (1987). Relationships between parts A and B of the Trail Making Test. Journal of Clinical Psychology, 43(4), 402-409. Corrigan, J. D., Agresti, A. A., & Hinkeldey, N. S. (1987). Psychometric characteristics of the Category Test: Replication and extension. Journal of Clinical Psychology, 43(3), 368-376. Corwin, J., & Bylsma, F. W. (1993). Psychological examination of traumatic encephalopathy. The Complex Figure Copy Test. Clinical Neuropsychologist, 7, 4-21. Craddick, R. A., & Stern, M. R. (1963). Practice effects on the Trail Making Test. Perceptual and Motor Skills, 17(3), 651-653. Craik, F. I., Byrd, M., & Swanson, J. M. (1987). Patterns of memory loss in three elderly samples. Psychology and Aging, 2, 79-86. Craik, F. I., Morris, L. W., Morris, R. G., & Loewen, E. R. (1990). Relations between source amnesia and frontal lobe functioning in older adults. Psychology and Aging, 5(1), 148-151. Crawford, J. R. (1992). Current and premorbid intelligence measures in neuropsychological assessment. In J. R. Crawford & D. M. Parker (Eds.),
A handbook of neuropsychological assessment
(pp. 21-49). Hillsdale, NJ: Lawrence Erlbaum. Crawford, J. R., & Garthwaite, P. H. (2002). Investigation of the single case in neuropsychology: Confidence limits on the abnormality of test scores and test score differences. Neuropsychologia, 40(8), 1196-1208. Crawford, J. R., & Howell, D. C. (1998a). Comparing an individual's test score against norms
derived from small samples. Clinical Neuropsychologist, 12(4), 482-486. Crawford, J. R., & Howell, D. C. (1998b). Regression equations in clinical neuropsychology: An evaluation of statistical methods for comparing predicted and obtained scores. Journal of Clinical and Experimental Neuropsychology, 20(5), 755-762. Crawford, J. R., Stewart, L. E., & Moore, J. W. (1989). Demonstration of savings on the AVLT and development of a parallel form. Journal of Clinical and Experimental Neuropsychology, 11(6), 975-981. Crawford, J. R., Howell, D. C., & Garthwaite, P. H. (1998a). Payne and Jones revisited: Estimating the abnormality of test score differences using a modified paired samples t test. Journal of Clinical and Experimental Neuropsychology, 20(6), 898-905. Crawford, J. R., Obonsawin, M. C., & Allan, K. M. (1998b). PASAT and components of WAIS-R performance: Convergent and discriminant validity. Neuropsychological Rehabilitation, 8(3), 255-272. Crawford, J. R., Garthwaite, P. H., Howell, D. C., & Venneri, A. (2003). Intra-individual measures of association in neuropsychology: Inferential methods for comparing a single case with a control or normative sample. Journal of the International Neuropsychological Society, 9(7), 989-1000. Crews, W. D., Jr., Harrison, D. W., & Rhodes, R. D. (1999). Neuropsychological test performances of young depressed outpatient women: An examination of executive functions. Archives of Clinical Neuropsychology, 14(6), 517-529. Crews, W. D., Jefferson, A. L., Bolduc, T., Elliott, J. B., Ferro, N. M., Broshek, D. K., et al. (2001). Neuropsychological dysfunction in patients suffering from end-stage chronic obstructive pulmonary disease. Archives of Clinical Neuropsychology, 16(1), 643-652. Crews, W. D., Jr., Jefferson, A. L., Broshek, D. K., Rhodes, R. D., Williamson, J., Brazil, A. M., et al. (2003). Neuropsychological dysfunction in patients with end-stage pulmonary disease: Lung transplant evaluation.
Archives of Clinical Neuropsychology, 18(4), 353-362. Crockett, D., Clark, C. M., Browning, J., & MacDonald, J. (1983). An application of the background interference procedure to the Benton Visual Retention Test. Journal of Clinical Neuropsychology, 5(2), 181-185. Crockett, D. J., Blisker, D., Hurwitz, T., & Kozak, J. (1986). Clinical utility of three measures of
frontal lobe dysfunction in neuropsychiatry samples. International Journal of Neurosciences, 30, 241-248. Crockett, D. J., Hadjistavropoulos, T., & Hurwitz, T. (1992). Primacy and recency effects in the assessment of memory using the Rey Auditory Verbal Learning Test. Archives of Clinical Neuropsychology, 7, 97-107. Crookes, T. G., & McDonald, K. G. (1972). Benton's Visual Retention Test in the differentiation of depression and early dementia. British Journal of Social and Clinical Psychology, 11(1), 66-69. Crossen, J. R., & Wiens, A. N. (1994). Comparison of Auditory-Verbal Learning Test (AVLT) and California Verbal Learning Test (CVLT) in a sample of normal subjects. Journal of Clinical and Experimental Neuropsychology, 16(2), 190-194. Crossley, M., D'Arcy, C., & Rawson, N. S. B. (1997). Letter and category fluency in community-dwelling Canadian seniors: A comparison of normal participants to those with dementia of the Alzheimer or vascular type. Journal of Clinical and Experimental Neuropsychology, 19(1), 52-62. Crosson, B., Hughes, C., Roth, D., & Mankowski, P. (1984a). Review of Russell's (1975) norms for the Logical Memory and Visual Reproduction subtests of the Wechsler Memory Scale. Journal of Consulting and Clinical Psychology, 52(4), 635-641.
Crosson, B., Hughes, C., Roth, D., & Mankowski, P. (1984b). Use of errors and correct ideas in scoring Wechsler Memory Scale stories. Paper presented at the meeting of the International Neuropsychological Society, Houston, TX. Crosson, B., Benefield, H., Cato, M. A., Sadek, J. R., Moore, A. B., Wierenga, C. E., et al. (2003). Left and right basal ganglia and frontal activity during language generation: Contributions to lexical, semantic, and phonological processes. Journal of the International Neuropsychological Society, 9(7), 1061-1077. Crovitz, H. F., & Zener, K. (1962). A group-test for assessing hand-and-eye dominance. American Journal of Psychology, 75, 271-276. Crowe, S. F. (1998a). Decrease in performance on the verbal fluency test as a function of time: Evaluation in a young healthy sample. Journal of Clinical and Experimental Neuropsychology, 20(3), 391-401. Crowe, S. F. (1998b). The differential contribution of mental tracking, cognitive flexibility, visual search, and motor speed to performance on parts A and B of the Trail Making Test. Journal of Clinical Psychology, 54(5), 585-591.
Cruice, M. N., Worrall, L. E., & Hickson, L. M. H. (2000). Boston Naming Test results for healthy older Australians: A longitudinal and cross-sectional study. Aphasiology, 14(2), 143-155. Cullum, C. M., Steinman, D. R., & Bigler, E. D. (1984). Relationship between fluid and crystallized cognitive functions using Category Test and WAIS scores. International Journal of Clinical Neuropsychology, 6(3), 172-174. Cullum, C. M., Butters, N., Troster, A., & Salmon, D. (1990). Normal aging and forgetting rates on the Wechsler Memory Scale-Revised. Archives of
Clinical Neuropsychology, 5, 23-30. Curtiss, G., Vanderploeg, R. D., Spencer, J., & Salazar, A. M. (2001). Patterns of verbal learning and memory in traumatic brain injury. Journal of
the International Neuropsychological Society, 7(5), 574-585.
Dahan, C., Amado, I., Bayle, F., Gut, A., Willard, D., Bourdel, M.-C., et al. (2002). Correlation between clinical syndromes and neuropsychological tasks in unmedicated patients with recent onset schizophrenia. Psychiatry Research, 113(1-2), 83-92. Daigneault, S., Braun, C. M. J., & Whitaker, H. A. (1992). Early effects of normal aging on perseverative and non-perseverative prefrontal measures. Developmental Neuropsychology, 8, 99-114. Dalrymple-Alford, J. C., Kalders, A. S., Jones, R. D., & Watson, R. W. (1994). A central executive deficit in patients with Parkinson's disease. Journal of
Neurology, Neurosurgery, and Psychiatry, 57(3), 36
Davies, K., Bell, B. D., Bush, A. J., Hermann, B. P., Dohan, F. C., Jr., & Jaap, A. S. (1998). Naming decline after left anterior temporal lobectomy correlates with pathological status of resected hippocampus. Epilepsia, 39, 407-419. Davis, K. L., Price, C. C., Kaplan, E., & Libon, D. J. (2002). Error analysis of the nine-word California Verbal Learning Test (CVLT-9) among older adults with and without dementia. Clinical Neuropsychologist, 16(1), 81-89. Davis, R. D., Adams, R. E., Gates, D. O., & Cheramie, G. M. (1989). Screening for learning disabilities: A neuropsychological approach. Journal of Clinical Psychology, 45(3), 423-429. Dawes, R. M., Faust, D., & Meehl, P. E. (1993). Statistical prediction versus clinical prediction: Improving what works. In G. L. Keren & C. Lewis (Eds.), A handbook for data analysis in the behavioral sciences: Methodological issues (pp. 351-367). Hillsdale, NJ: Lawrence Erlbaum. Dawes, R. M., Faust, D., & Meehl, P. E. (2002). Clinical versus actuarial judgment. In T. Gilovich, D. Griffin, et al. (Eds.), Heuristics and biases: The psychology of intuitive judgment (pp. 716-729). New York: Cambridge University Press. Dawson, L. K., & Grant, I. (2000). Alcoholics' initial organizational and problem-solving skills predict learning and memory performance on the Rey-Osterrieth Complex Figure. Journal of
the International Neuropsychological Society, 6(1), 12-19. Dealberto, M.-J., Pajot, N., Courbon, D., & Alperovitch, A. (1996). Breathing disorders during sleep and cognitive performance in an older community sample: The EVA study. Journal
of the American Geriatrics Society, 44(11),
1287-1294. Deary, I. J., Ebmeier, K. P., MacLeod, K. M., Dougall, N., Hepburn, D. A., Frier, B. M., et al. (1994). PASAT performance and the pattern of uptake of 99mTc-exametazime in brain estimated with single photon emission tomography. Biological Psychology, 38(1), 1-18. Deary, I. J., Langan, S. J., Hepburn, D. A., & Frier, B. M. (1991). Which abilities does the PASAT test? Personality and Individual Differences, 12, 983-987. Deckel, A. W. (1999). Tests of executive functioning predict scores on the MacAndrew Alcoholism Scale. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 23(2), 209-223. Deckersbach, T., Otto, M. W., Savage, C. R., Baer, L., & Jenike, M. A. (2000a). The relationship between semantic organization and memory in
obsessive-compulsive disorder. Psychotherapy and Psychosomatics, 69(2), 101-107. Deckersbach, T., Savage, C. R., Henin, A., Mataix-Cols, D., Otto, M. W., Wilhelm, S., et al. (2000b). Reliability and validity of a scoring system for measuring organizational approach in the Complex Figure Test. Journal of Clinical and Experimental Neuropsychology, 22(5), 640-648. Decoufle, P., Holmgreen, P., Calle, E., & Weeks, M. (1991). Nonresponse and intensity of follow-up in an epidemiologic study of Vietnam-era veterans. American Journal of Epidemiology, 133, 83-93. D'Elia, L. F., Satz, P., & Schretlen, D. (1989). Wechsler Memory Scale: A critical appraisal of the normative studies. Journal of Clinical and Experimental Neuropsychology, 11(4), 551-568. D'Elia, L., Satz, P., Uchiyama, C., & White, T. (1999). Color Trails Test, Professional manual. Odessa, FL: Psychological Assessment Resources. DeFilippis, N. A., McCampbell, E., & Rogers, P. (1979). Development of a booklet form of the Category Test: Normative and validity data. Journal of Clinical Neuropsychology, 1(4), 339-342. Degenszajn, J., Caramelli, P., Caixeta, L., & Nitrini, R. (2001). Encoding process in delayed recall impairment and rate of forgetting in Alzheimer's disease. Arquivos de Neuro-Psiquiatria, 59(2,A), 171-174. Dehaene, S., & Changeux, J. P. (1991). The Wisconsin Card Sorting Test: Theoretical analysis and modeling in a neuronal network. Cerebral Cortex, 1, 62-79. de Jager, C. A., Hogervorst, E., Combrinck, M., & Budge, M. M. (2003). Sensitivity and specificity of neuropsychological tests for mild cognitive impairment, vascular cognitive impairment and Alzheimer's disease. Psychological Medicine, 33(6), 1039-1050. Delaney, R. C., Prevey, M. L., Cramer, J., & Mattson, R. H. (1992). Test-retest comparability and control subject data for the Rey Auditory-Verbal Learning Test and Rey-Osterrieth/Taylor complex figures. Archives of Clinical Neuropsychology, 7(6), 523-528. Delis, D. C., Kramer, J. 
H., Kaplan, E., & Ober, B. A. (1987). California Verbal Learning Test Manual. San Antonio, TX: Harcourt Brace Jovanovich. Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1994). California Verbal Learning Test-Children's Version. San Antonio, TX: Psychological Corporation. Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (2000). California Verbal Learning Test (2nd ed.). San Antonio, TX: Psychological Corporation.
Delis, D. C., Kaplan, E., & Kramer, J. (2001). Delis-Kaplan Executive Function System. San Antonio, TX: Psychological Corporation. Dellatolas, G., Braga, L., Souza, L., Filho, G. N., Queiroz, E., & Deloche, G. (2003). Cognitive consequences of early phase of literacy. Journal of the International Neuropsychological Society, 9(5), 771-782. Deloche, G., Hannequin, D., Dordain, M., Perrier, D., et al. (1996). Picture confrontation oral naming: Performance differences between aphasics and normals. Brain and Language, 53(1), 105-120. DeLuca, J., Gaudino, E., Diamond, B., Christodoulou, C., & Engel, R. (1998). Acquisition and storage deficits in multiple sclerosis. Journal of Clinical and Experimental Neuropsychology, 20(3), 376-390. Demakis, G. J. (1999). Serial malingering on verbal and nonverbal fluency and memory measures: An analog investigation. Archives of Clinical Neuropsychology, 14(4), 401-410. Demakis, G. J. (2003). A meta-analytic review of the sensitivity of the Wisconsin Card Sorting Test to frontal and lateralized frontal brain damage. Neuropsychology, 17(2), 255-264. Demakis, G. J., & Harrison, D. W. (1997). Relationships between verbal and nonverbal fluency measures: Implications for assessment of executive functioning. Psychological Reports, 81(2), 443-448.
Demakis, G. J., Mercury, M. G., Sweet, J. J., Rezak, M., Eller, T., & Vergenz, S. (2003). Qualitative analysis of verbal fluency before and after unilateral pallidotomy. Clinical Neuropsychologist, 17(3), 322-330. Demaree, H., Gaudino, E., & DeLuca, J. (2003). The relationship between depressive symptoms and cognitive dysfunction in multiple sclerosis. Cognitive Neuropsychiatry, 8(3), 161-171. Demery, J. A., Pedraza, O., & Hanlon, R. E. (2002). Differential profiles of verbal learning in traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 24(6), 818-827. Demick, J., & Harkins, D. (1997). Role of cognitive style in the driving skills of young, middle-aged, and older adults (American Association of Retired Persons Andrus Foundation Final Grant Report). Washington, DC: AARP. Demick, J., & Wapner, S. (1985, August). Age differences in processes underlying sequential activity (Stroop Color-Word Test). Paper presented at the Eastern Psychological Association annual meeting, Los Angeles, CA. Demick, J., Salas-Passeri, J., & Wapner, S. (1986, August). Age differences among preschoolers in
processes underlying sequential activity. Paper presented at the Eastern Psychological Association annual meeting, New York, NY. Denman, S. (1984). Denman Neuropsychology Memory Scale. Charleston, SC: Author. Desmond, D. W., Glenwick, D. S., Stern, Y., & Tatemichi, T. K. (1994). Sex differences in the representation of visuospatial functions in the human brain. Rehabilitation Psychology, 39(1), 3-14. D'Esposito, M., Onishi, K., Thompson, H., Robinson, K., Armstrong, C., & Grossman, M. (1996). Working memory impairments in multiple sclerosis: Evidence from a dual-task paradigm. Neuropsychology, 10(1), 51-56. DesRosiers, G., & Ivison, D. (1986). Paired associate learning: Normative data for differences between high and low associate word pairs. Journal of Clinical and Experimental Neuropsychology, 8(6), 637-642. DesRosiers, G., & Kavanagh, D. (1987). Cognitive assessment in closed head injury: Stability, validity and parallel forms for two neuropsychological measures of recovery. International Journal of Clinical Neuropsychology, 9(4), 162-173. de Zubicaray, G., & Ashton, R. (1996). Nelson's (1976) Modified Card Sorting Test: A review. Clinical Neuropsychologist, 10(3), 245-254. Diamond, B. J., & DeLuca, J. (1996). Rey-Osterrieth Complex Figure Test performance following anterior communicating artery aneurysm. Archives of Clinical Neuropsychology, 11(1), 21-28. Diamond, B. J., DeLuca, J., Kim, H., & Kelley, S. M. (1997). The question of disproportionate impairments in visual and auditory information processing in multiple sclerosis. Journal of Clinical and Experimental Neuropsychology, 19(1), 34-42. Diamond, R., Krengel, M., White, R. F., & Javorsky, D. J. (2003). The ROCF in assessment of individuals exposed to toxicants. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. DiCarlo, M. A., Gfeller, J. D., & Oliveri, M. V. (2000). 
Effects of coaching on detecting feigned cognitive impairment with the Category Test. Archives of Clinical Neuropsychology, 15(5), 399-413. Dick, F., Semple, S., Soutar, A., Osborne, A., Cherrie, J. W., & Seaton, A. (2004). Is colour vision impairment associated with cognitive impairment in solvent exposed workers? Occupational and Environmental Medicine, 61(1), 76-78. Diehr, M. C., Heaton, R. K., Miller, W., & Grant, I. (1998). The Paced Auditory Serial Addition Task
(PASAT): Norms for age, education, and ethnicity. Assessment, 5(4), 375-387. Diehr, M. C., Cherner, M., Wolfson, T. J., Miller, S. W., Grant, I., Heaton, R. K., et al. (2003). The 50 and 100-item short forms of the Paced Auditory Serial Addition Task (PASAT): Demographically corrected norms and comparisons with the full PASAT in normal and clinical samples. Journal of Clinical and Experimental Neuropsychology, 25(4), 571-585. Dikmen, S. S., Machamer, J. E., Winn, H. R., & Temkin, N. R. (1995). Neuropsychological outcome at 1-year post head injury. Neuropsychology, 9(1), 80-90. Dikmen, S. S., Heaton, R. K., Grant, I., & Temkin, N. R. (1999). Test-retest reliability and practice effects of Expanded Halstead-Reitan Neuropsychological Test Battery. Journal of the International Neuropsychological Society, 5(4), 346-356. Diniz, L. F. M., da Cruz, M. d. F., Torres, V. d. M., & Cosenza, R. M. (2000). O teste de aprendizagem auditivo-verbal de Rey: Normas para uma população brasileira [The Rey Auditory-Verbal Learning Test: Norms for a Brazilian sample]. Revista Brasileira de Neurologia, 36(3), 79-83. Doan, Q. T., & Swerdlow, N. R. (1999). Preliminary findings with a new Vietnamese Stroop test. Perceptual and Motor Skills, 89(1), 173-182. Dodrill, C. B. (1978a). A neuropsychological battery for epilepsy. Epilepsia, 19, 611-623. Dodrill, C. B. (1978b). The Hand Dynamometer as a neuropsychological measure. Journal of Consulting and Clinical Psychology, 46(6), 1432-1435. Dodrill, C. B. (1979). Sex differences on the Halstead-Reitan Neuropsychological Battery and on other neuropsychological measures. Journal of Clinical Psychology, 35(2), 236-241. Dodrill, C. B. (1987). What's normal: Presidential address. Paper presented at the first annual meeting of the Pacific Northwest Neurological Association, Seattle, WA. Dodrill, C. B., & Troupin, A. S. (1975). Effects of repeated administrations of a comprehensive neuropsychological battery among chronic epileptics. 
Journal of Nervous and Mental Disease, 161(3), 185-190. Donders, J. (1999a). Cluster subtypes in the standardization sample of the California Verbal Learning Test-Children's Version. Developmental Neuropsychology, 16(2), 163-175. Donders, J. (1999b). Specificity of a malingering formula for the Wisconsin Card Sorting Test. Journal of Forensic Neuropsychology, 1, 35-42.
Donders, J. (2001). A survey of report writing by neuropsychologists. II: Test data, report format, and document length. Clinical Neuropsychologist, 15(2), 150-161. Donders, J., & Kirsch, N. (1991). Nature and implications of selective impairment on the Booklet Category Test and the Wisconsin Card Sorting Test. Clinical Neuropsychologist, 5, 78-82. Downey, J., Elkin, E. J., Ehrhardt, A. A., Meyer-Bahlburg, H. F., Bell, J. J., & Morishima, A. (1991). Cognitive ability and everyday functioning in women with Turner syndrome. Journal of Learning Disabilities, 24(1), 32-39. Drane, D. L., Loring, D. W., Lee, G. P., & Meador, K. J. (1998). Trial-length sensitivity of the Verbal Selective Reminding Test to lateralized temporal lobe impairment. Clinical Neuropsychologist, 12(1), 68-73. Drane, D. L., Yuspeh, R. L., Huthwaite, J. S., & Klingler, L. K. (2002). Demographic characteristics and normative observations for derived Trail Making Test indices. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 15(1), 39-43. Drebing, C. E., Van Gorp, W. G., Stuck, A. E., Mitrushina, M., & Beck, J. (1994). Early detection of cognitive decline in higher cognitively functioning older adults: Sensitivity and specificity of a neuropsychological screening battery. Neuropsychology, 8(1), 31-38. Drewe, E. A. (1974). The effect of type and area of brain lesion on Wisconsin Card Sorting Test performance. Cortex, 10, 159-170. Duara, R., Grady, C., Haxby, J., Ingvar, D., Sokoloff, L., Margolin, R. A., et al. (1984). Human brain glucose utilization and cognitive function in relation to age. Annals of Neurology, 16(6), 703-713. Duchnick, J. J., Vanderploeg, R. D., & Curtiss, G. (2002). Identifying retrieval problems using the California Verbal Learning Test. Journal
of Clinical and Experimental Neuropsychology, 24(6), 840-852. Duff, K., Westervelt, H. J., McCaffrey, R. J., & Haase, R. F. (2001). Practice effects, test-retest stability, and dual baseline assessments with the California Verbal Learning Test in an HIV sample. Archives of Clinical Neuropsychology, 16(5), 461-476. Duff, K., Patton, D., Schoenberg, M. R., Mold, J., Scott, J. G., & Adams, R. L. (2003). Age- and education-corrected independent normative data for the RBANS in a community dwelling elderly sample. Clinical Neuropsychologist, 17(3), 351-366.
Dugbartey, A. T., Townes, B. D., & Mahurin, R. K. (2000). Equivalence of the Color Trails Test and Trail Making Test in nonnative English-speakers. Archives of Clinical Neuropsychology, 15(5), 425-431. Duley, J. F., Wilkins, J. W., Hamby, S. L., Hopkins, D. G., Burwell, R. D., & Barry, N. S. (1993). Explicit scoring criteria for the Rey-Osterrieth and Taylor complex figures. Clinical Neuropsychologist, 7(1), 29-38. Dunn, E. J., Margolis, R. B., & Taylor, J. M. (1985). Short forms of the Category Test: Applications for geriatric patients. International Journal of Clinical Neuropsychology, 7(1), 29-31. Dwyer, C. A. (1996). Cut scores and testing: Statistics, judgment, truth, and error. Psychological Assessment, 8(4), 360-362. Dye, O. A. (1979). Effects of practice on Trail Making Test performance. Perceptual and Motor Skills, 48(1), 296. Dyer, F. N. (1973). The Stroop phenomenon and its use in the study of perceptual, cognitive, and response processes. Memory and Cognition, 1, 106-120. Dywan, J., Segalowitz, S. J., & Unsal, A. (1992). Speed of information processing, health, and cognitive performance in older adults. Developmental Neuropsychology, 8(4), 473-490. Dywan, J., Segalowitz, S. J., Henderson, D., & Jacoby, L. L. (1993). Memory for source after traumatic brain injury. Brain and Cognition, 21(1), 20-43. Echternacht, R. (1981). Neuropsychological assessment of motor functioning. The Finger Tapping Test: Data on an adult female inpatient psychiatric population. Clinical Neuropsychology, 3(2), 8-9. Eckardt, M. J., & Matarazzo, J. D. (1981). Test-retest reliability of the Halstead impairment index in hospitalized alcoholic and nonalcoholic males with mild to moderate neuropsychological impairment. Journal of Clinical Neuropsychology, 3(3), 257-269. Eckman, P. S., & Shean, G. D. (2000). Impairment in test performance and symptom dimensions of schizophrenia. Journal of Psychiatric Research, 34(2), 147-153. Egan, V. (1988). 
PASAT: Observed correlations with IQ. Personality and Individual Differences, 9(1), 179-180. Egger, M., Smith, G. D., & Sterne, A. C. (2001). Uses and abuses of meta-analysis. Clinical Medicine, 1(6), 478-484. Egger, M., Ebrahim, S., & Smith, G. D. (2002). Where now for meta-analysis? International Journal of Epidemiology, 31, 1-5.
Elfgren, C. I., & Risberg, J. (1998). Lateralized frontal blood flow increases during fluency tasks: Influence of cognitive strategy. Neuropsychologia, 36(6), 505-512. Elfgren, C. I., Ryding, E., & Passant, U. (1996). Performance on neuropsychological tests related to single photon emission computerised tomography findings in frontotemporal dementia. British Journal of Psychiatry, 169, 416-422. Elias, M. F., Podraza, A. M., Pierce, T. W., & Robbins, M. A. (1990). Determining neuropsychological cut scores for older, healthy adults. Experimental Aging Research, 16(4), 209-220. Elias, M. F., Robbins, M. A., Walter, L. J., & Schultz, N. R. (1993). The influence of gender and age on Halstead-Reitan Neuropsychological Test performance. Journals of Gerontology, 48(6), P278-P281. Ellingsen, D. G., Bast-Pettersen, R., Efskind, J., & Thomassen, Y. (2001). Neuropsychological effects of low mercury vapor exposure in chloralkali workers. Neurotoxicology, 22(2), 249-258. El-Sheikh, M., El-Nagdy, S., Townes, B. D., & Kennedy, M. C. (1987). The Luria-Nebraska and Halstead-Reitan neuropsychological test batteries: A cross-cultural study in English and Arabic. International Journal of Neuroscience, 32, 757-764. Elvevag, B., Weinstock, D. M., Akil, M., Kleinman, J. E., & Goldberg, T. E. (2001). A comparison of verbal fluency tasks in schizophrenic patients and normal controls. Schizophrenia Research, 51(2-3), 119-126. Elvevag, B., Fisher, J. E., Gurd, J. M., & Goldberg, T. E. (2002). Semantic clustering in verbal fluency: Schizophrenic patients versus control participants. Psychological Medicine, 32(5), 909-917. Elwan, O., Hassan, A., Naseer, M., Fahmy, M., Elwan, F., Kader, A., et al. (1996). Brain aging in normal Egyptians: Neuropsychological, electrophysiological and cranial tomographic assessment.
Journal of Neurological Sciences, 136, 73-80.
Elwan, O., Hassan, A., Naseer, M., Elwan, F., Deif, R., Serafy, O., et al. (1997). Brain aging in a sample of normal Egyptians: Cognition, education, addiction and smoking. Journal of Neurological Sciences, 148, 79-86. Elwood, R. W. (1991). Factor structure of the Wechsler Memory Scale-Revised (WMS-R) in a clinical sample: A methodological reappraisal. Clinical Neuropsychologist, 5, 329-337. Elwood, R. W. (1993). Psychological tests and clinical discriminations: Beginning to address the base rate problem. Clinical Psychology Review, 13(5), 409-419.
Elwood, R. W. (1995). The California Verbal Learning Test: Psychometric characteristics and clinical application. Neuropsychology Review, 5(3), 173-201. Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8(4), 341-349. Engelsmann, F., Katz, J., Ghadirian, A. M., & Schachter, D. (1988). Lithium and memory: A long-term follow-up study. Journal of Clinical Psychopharmacology, 8(3), 207-212. Epker, M. O., Lacritz, L. H., & Cullum, C. M. (1999). Comparative analysis of qualitative verbal fluency performance in normal elderly and demented populations. Journal of Clinical and Experimental Neuropsychology, 21(4), 425-434. Epperson, R. C., & Cripe, L. (1985). Relationship of PASAT performance and IQ. Unpublished manuscript. Ernst, J. (1987). Neuropsychological problem-solving skills in the elderly. Psychology and Aging, 2(4), 363-365. Ernst, J. (1988). Language, grip strength, sensory-perceptual, and receptive skills in a normal elderly sample. Clinical Neuropsychologist, 2(1), 30-40. Ernst, J., Warner, M. H., Townes, B. D., Peel, J. H., & Preston, M. (1987). Age group differences on neuropsychological battery performance in a neuropsychiatric population: An international descriptive study with replications. Archives of Clinical Neuropsychology, 2, 1-12. Escalona, E., Yanes, L., Feo, O., & Maizlish, N. (1995). Neurobehavioral evaluation of Venezuelan workers exposed to organic solvent mixtures.
American Journal of Industrial Medicine, 27, 15-27. Escandell, V. A. (2002). Cross-cultural neuropsychology in Saudi Arabia. In F. R. Ferraro (Ed.),
Minority and cross-cultural aspects of neuropsychological assessment. Studies on neuropsychology, development, and cognition (pp. 299-325). Lisse: Swets & Zeitlinger. Eslinger, P. J., & Benton, A. L. (1983). Visuoperceptual performances in aging and dementia: Clinical and theoretical implications. Journal of Clinical Neuropsychology, 5(3), 213-220. Eslinger, P. J., & Grattan, L. M. (1993). Frontal lobe and frontal-striatal substrates for different forms of human cognitive flexibility. Neuropsychologia, 31(1), 17-28. Eslinger, P. J., Damasio, H., Graff-Radford, N. R., & Damasio, A. R. (1984). Examining the relationship between computed tomography and neuropsychological measures in normal and demented elderly. Journal of Neurology, Neurosurgery, and Psychiatry, 47(12), 1319-1325.
Eslinger, P. J., Damasio, A. R., Benton, A. L., & Van Allen, M. (1985). Neuropsychological detection of abnormal mental decline in older persons. Journal of American Medical Association, 253, 670-674. Eson, M. E., Yen, J. K., & Bourke, R. S. (1978). Assessment of recovery from serious head injury.
Journal of Neurology, Neurosurgery, and Psychiatry, 41(11), 1036-1042. Esposito, G., Kirkby, B. S., Van Horn, J. D., Ellmore, T. M., & Berman, K. F. (1999). Context-dependent, neural system-specific neurophysiological concomitants of ageing: Mapping PET correlates during cognitive activation. Brain, 122(5), 963-979. Estes, W. K. (1974). Learning theory and intelligence. American Psychologist, 29, 740-749. Evans, R. W., Ruff, R. M., & Gualtieri, C. T. (1985). Verbal fluency and figural fluency in bright children. Perceptual and Motor Skills, 61(3, Pt 1), 699-709. Fabian, M. S., Jenkins, R. L., & Parsons, O. A. (1981). Gender, alcoholism and neuropsychological functioning. Journal of Consulting and Clinical Psychology, 49(1), 138-140. Fabrigoule, C., Rouch, I., Taberly, A., Letenneur, L., Commenges, D., Mazaux, J.-M., et al. (1998). Cognitive process in preclinical phase of dementia. Brain, 121(1), 135-141. Faglioni, P., Bertolani, L., Botti, C., & Merelli, E. (2000a). Verbal learning strategies in patients with multiple sclerosis. Cortex, 36(2), 243-263. Faglioni, P., Saetti, M. C., & Botti, C. (2000b). Verbal learning strategies in Parkinson's disease. Neuropsychology, 14(3), 456-470. Fama, R., Sullivan, E. V., Shear, P. K., Cahn-Weiner, D. A., Yesavage, J. A., Tinklenberg, J. R., et al. (1998). Fluency performance patterns in Alzheimer's disease and Parkinson's disease. Clinical Neuropsychologist, 12(4), 487-499. Fama, R., Sullivan, E. V., Shear, P. K., Cahn-Weiner, D. A., Marsh, L., Lim, K. O., et al. (2000). Structural brain correlates of verbal and nonverbal fluency measures in Alzheimer's disease. Neuropsychology, 14(1), 29-41. Farahat, T. M., Abdelrasoul, G. M., Amr, M. M., Shebl, M. M., Farahat, F. M., & Anger, W. K. (2003). Neurobehavioural effects among workers occupationally exposed to organophosphorous pesticides. Occupational and Environmental Medicine, 60, 279-286. Farmer, A. (1990). 
Performance of normal males on the Boston Naming Test and the Word Test. Aphasiology, 4(3), 293-296. Farver, P. F., & Farver, T. B. (1982). Performance of normal older adults on tests designed to
measure parietal lobe functions. American Journal of Occupational Therapy, 36(7), 444-449. Fastenau, P. S. (1996a). An elaborated administration of the Wechsler Memory Scale-Revised. Clinical Neuropsychologist, 10(4), 425-434. Fastenau, P. S. (1996b). Development and preliminary standardization of the "Extended Complex Figure Test" (ECFT). Journal of Clinical and Experimental Neuropsychology, 18(1), 63-76. Fastenau, P. S. (1998). Validity of regression-based norms: An empirical test of the comprehensive norms with older adults. Journal of Clinical and Experimental Neuropsychology, 20(6), 906-916. Fastenau, P. S. (2002a). Examination of the appropriateness of 30-50-year-old ECFT norms for younger adults: Supporting evidence. Archives of Clinical Neuropsychology, 17(8), 835. Fastenau, P. S. (2002b). The Extended Complex Figure Test (ECFT). Los Angeles: Western Psychological Services. Fastenau, P. S. (2003). Extended Complex Figure Test (ECFT): Rationale and empirical support for recognition and matching. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex
Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Fastenau, P. S., & Adams, K. M. (1996). Heaton, Grant and Matthews' comprehensive norms: An overzealous attempt [Book review]. Journal of
Clinical and Experimental Neuropsychology, 18(3), 444-448. Fastenau, P. S., Denburg, N. L., & Mauer, B. A. (1998). Parallel short forms for the Boston Naming Test: Psychometric properties and norms for older adults. Journal of Clinical and Experimental Neuropsychology, 20(6), 828-834. Fastenau, P. S., Denburg, N. L., & Hufford, B. J. (1999). Adult norms for the Rey-Osterrieth Complex Figure Test and for supplemental recognition and matching trials from the Extended Complex Figure Test. Clinical Neuropsychologist, 13(1), 30-47. Fastenau, P. S., Evans, J. D., Johnson, K. E., & Bond, G. R. (2002). Multicultural training in clinical neuropsychology. In F. R. Ferraro (Ed.),
Minority and cross-cultural aspects of neuropsychological assessment. Studies on neuropsychology, development, and cognition. Lisse: Swets & Zeitlinger. Faust, D. (1991). Forensic neuropsychology: The art of practicing a science that does not yet exist [Special Issue: Forensic clinical neuropsychology]. Neuropsychology Review, 2(3), 205-231. Faust, D., Ziskin, J., & Hiers, J. (1991). Brain
damage claims: Coping with neuropsychological
evidence (Vols. 1, 2). Los Angeles: Law and Psychology Press. Feinstein, A., Brown, R., & Ron, M. (1994). Effects of practice of serial tests of attention in healthy subjects. Journal of Clinical and Experimental Neuropsychology, 16, 436-447. Feldstein, S. N., Keller, F. R., Portman, R. E., Durham, R. L., Klebe, K. J., & Davis, H. P. (1999). A comparison of computerized and standard versions of the Wisconsin Card Sorting Test. Clinical Neuropsychologist, 13(3), 303-313. Fenwick, P., Galliano, S., Coate, M. A., Rippere, V., & Brown, D. (1985). "Psychic sensitivity," mystical experience, head injury and brain pathology. British Journal of Medical Psychology, 58(1), 35-44. Ferland, M. B., Ramsay, J., Engeland, C., & O'Hara, P. (1998). Comparison of the performance of normal individuals and survivors of traumatic brain injury on repeat administrations of the Wisconsin Card Sorting Test. Journal of Clinical and Experimental Neuropsychology, 20(4), 473-482. Ferman, T. J., Ivnik, R. J., & Lucas, J. A. (1998). Boston Naming Test discontinuation rule: Rigorous versus lenient interpretations. Assessment, 5(1), 13-18. Ferraro, F. R., & Barth, J. (2003). Speeded lexical decision performing on 15-item forms of the Boston Naming Test. Psychology and Education: An Interdisciplinary Journal, 40(2), 38-40. Ferraro, F. R., & Bercier, B. (1996). Boston Naming Test performance in a sample of Native American elderly adults. Clinical Gerontologist, 17(1), 58-60. Ferraro, F. R., Blaine, T., Flaig, S., & Bradford, S. (1998). Familiarity norms for the Boston Naming Test stimuli. Applied Neuropsychology, 5(1), 43-47. Ferraro, F. R., Bercier, B. J., Holm, J., & McDonald, J. D. (2002). Preliminary normative data from a brief neuropsychological test battery in a sample of Native American elderly. In F. R. Ferraro (Ed.), Minority and cross-cultural aspects of neuropsychological assessment. Lisse: Swets & Zeitlinger. Feyereisen, P. (1997). A meta-analytic procedure shows an age-related decline in picture naming: Comments on Goulet, Sica, and Kahn (1994). Journal of Speech, Language, and Hearing Research, 40, 1328-1333. Fillenbaum, G. G., Huber, M., & Taussig, I. M. (1997). Performance of elderly white and African American community residents on the abbreviated CERAD Boston Naming Test. Journal of Clinical and Experimental Neuropsychology, 19(2), 204-210. Fillenbaum, G. G., Peterson, B., Welsh-Bohmer, K. A., Kukull, W. A., & Heyman, A. (1998). Progression of Alzheimer's disease in black and white patients: The CERAD experience, part XVI. Neurology, 51(1), 154-158. Fillenbaum, G. G., Heyman, A., Huber, M. S., Ganguli, M., & Unverzagt, F. W. (2001). Performance of elderly African American and white community residents on the CERAD Neuropsychological Battery. Journal of the International Neuropsychological Society, 7(4), 502-509. Fillenbaum, G. G., Unverzagt, F. W., Ganguli, M., Welsh-Bohmer, K. A., & Heyman, A. (2002). The CERAD Neuropsychology Battery: Performance of representative community and tertiary care samples of African-American and European-American elderly. In F. R. Ferraro (Ed.), Minority and cross-cultural aspects of neuropsychological assessment. Studies on neuropsychology, development, and cognition. Lisse: Swets & Zeitlinger. Filskov, S. B., & Catanese, R. A. (1986). Effects of sex and handedness on neuropsychological testing. In S. B. Filskov & T. J. Boll (Eds.), Handbook of Clinical Neuropsychology (Vol. 2). New York: Wiley. Fimm, B., Bartl, G., Zimmermann, P., & Wallesch, C. (1994). Different mechanisms underlie shifting set on external and internal cues in Parkinson's disease. Brain and Cognition, 25, 287-304. Finlayson, M. A. J., Johnson, K. A., & Reitan, R. M. (1977). Relationship of level of education to neuropsychological measures in brain-damaged and non-brain-damaged adults. Journal of Consulting and Clinical Psychology, 45(4), 536-542. Finlayson, M. A., Sullivan, J. F., & Alfano, D. P. (1986). Halstead's Category Test: Withstanding the test of time. Journal of Clinical and Experimental Neuropsychology, 8(6), 706-709. Fisher, L. M., Freed, D. M., & Corkin, S. (1990). Stroop Color-Word Test performance in patients with Alzheimer's disease. Journal of Clinical and Experimental Neuropsychology, 12, 745-758. Fisher, N. J., Tierney, M. C., Snow, W. G., & Szalai, J. P. (1999). Odd/even short forms of the Boston Naming Test: Preliminary geriatric norms. Clinical Neuropsychologist, 13(3), 359-364. Fisk, J. D., & Archibald, C. J. (2001). Limitations of the Paced Auditory Serial Addition Test as a measure of working memory in patients with multiple sclerosis. Journal of the International
Neuropsychological Society, 7(3), 363-372.
Fitz, A. G., Conrad, P. M., Hom, D. L., Sarff, P., & Majovski, L. V. (1992). Hooper Visual Organization Test performance in lateralized brain injury. Archives of Clinical Neuropsychology, 7, 243-250. Fitzhugh, K. B., Fitzhugh, L. C., & Reitan, R. M. (1964). Influence of age upon measures of problem solving and experimental background in subjects with longstanding cerebral dysfunction. Journal of Gerontology, 19, 132-134. Fix, A. J., Daughton, D., Kass, I., Bell, C. W., & Golden, C. J. (1985). Cognitive functioning and survival among patients with chronic obstructive pulmonary disease. International Journal of Neuroscience, 27(1-2), 13-17. Flanagan, J. L., & Jackson, S. T. (1997). Test-retest reliability of three aphasia tests: Performance of non-brain damaged older adults. Journal of Communication Disorders, 30, 33-43. Fleming, K., Goldberg, T. E., Gold, J. M., & Weinberger, D. R. (1995). Verbal working memory dysfunction in schizophrenia: Use of a Brown-Peterson paradigm. Psychiatry Research, 56(2), 155-161. Fleming, K., Goldberg, T. E., Binks, S., Randolph, C., et al. (1997). Visuospatial working memory in patients with schizophrenia. Biological Psychiatry, 41(1), 43-49. Fletcher, R. H., Fletcher, S. W., & Wagner, E. H. (1996). Clinical epidemiology: The essentials (3rd ed.). Philadelphia: Williams & Wilkins. Fletcher-Janzen, E., Strickland, T. L., & Reynolds, C. R. (2000). Handbook of cross-cultural neuropsychology. Netherlands: Kluwer Academic Publishers. Flicker, C., Ferris, S., Crook, T., & Bartus, R. (1987). Implications of memory and language dysfunction in the naming deficit of senile dementia. Brain and Language, 31, 187-200. Flinton, M. J., Lucas, J. A., Graff-Radford, N. R., & Uitti, R. J. (1998). Analysis of visuospatial errors in patients with Alzheimer's disease or Parkinson's disease. Journal of Clinical and Experimental Neuropsychology, 20(2), 186-193. Flowers, K. A., & Robertson, C. (1985). 
The effect of Parkinson's disease on the ability to maintain a mental set. Journal of Neurology, Neurosurgery, and Psychiatry, 48, 517-529. Fluck, E., Fernandes, C., & File, S. E. (2001). Are lorazepam-induced deficits in attention similar to those resulting from aging? Journal of Clinical Psychopharmacology, 21(2), 126-130. Flynn, J. R. (1984). The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin, 95(1), 29-51.
Flynn, J. R. (1998). WAIS-III and WISC-III gains in the United States from 1972 to 1995: How to compensate for obsolete norms. Perceptual and Motor Skills, 86(3, Pt 2), 1231-1239. Folbrecht, J. R., Charter, R. A., Walden, D. K., & Dobbs, S. M. (1999). Psychometric properties of the Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure. Clinical Neuropsychologist, 13(4), 442-449. Folstein, M., Folstein, S., & McHugh, P. (1975). Mini-Mental State: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189-198. Fontenot, D. J., & Benton, A. L. (1972). Perception of direction in the right and left visual fields. Neuropsychologia, 10(4), 447-452. Foo, S. C., Jeyaratnam, J., & Koh, D. (1990). Chronic neurobehavioural effects of toluene.
British Journal of Industrial Medicine, 47(1),
480-484. Foo, S. C., Lwin, S., Chia, S. E., & Jeyaratnam, J. (1994). Chronic neurobehavioural effects in paint formulators exposed to solvents and noise.
Annals of the Academy of Medicine, Singapore, 23(5), 650-654. Fos, L. A., Greve, K. W., South, M. B., Mathias, C., & Benefield, H. (2000). Paced Visual Serial Addition Test: An alternative measure of information processing speed. Applied Neuropsychology, 7(3), 140-146. Fossati, P., Amar, G., Raoux, N., Ergis, A. M., & Allilaire, J. F. (1999). Executive functioning and verbal memory in young patients with unipolar depression and schizophrenia. Psychiatry Research, 89(3), 171-187. Fossati, P., Ergis, A.-M., & Allilaire, J.-F. (2001). Problem-solving abilities in unipolar depressed patients: Comparison of performance on the modified version of the Wisconsin and the California sorting tests. Psychiatry Research, 104(2), 1~156.
Fossum, B., Holmberg, H., & Reinvang, I. (1992). Spatial and symbolic factors in performance on the Trail Making Test. Neuropsychology, 6(1), 71-75. Foster, H. G., Hillbrand, M., & Silverstein, M. (1993). Neuropsychological deficit and aggressive behavior: A prospective study. Progress in
Neuro-Psychopharmacology and Biological Psychiatry, 17(6), 939-946. Francis, W. N., & Kučera, H. (1982). Frequency
analysis of English usage: Lexicon and grammar. Boston: Houghton Mifflin. Frank, E. M., McDade, H. L., & Scott, W. K. (1996). Naming in dementia secondary to Parkinson's,
Huntington's, and Alzheimer's diseases. Journal
of Communication Disorders, 29, 183-197.
Frank, R. M., & Byrne, G. J. (2000). The clinical utility of the Hopkins Verbal Learning Test as a screening test for mild dementia. International Journal of Geriatric Psychiatry, 15(4), 317-324.
Franklin, R. D. (2003). Prediction in forensic and neuropsychology. Hillsdale, NJ: Lawrence Erlbaum. Franzen, M. D. (2000). Reliability and validity in neuropsychological assessment (2nd ed.). New York: Kluwer Academic/Plenum. Franzen, M. D., Smith, S. S., Paul, D. S., & MacInnes, W. D. (1993). Order effects in the administration of the Booklet Category Test and Wisconsin Card Sorting Test. Archives of Clinical Neuropsychology, 8(2), 105-110. Franzen, M. D., Haut, M. W., Rankin, E., & Keefover, R. (1995). Empirical comparison of alternate forms of the Boston Naming Test. Clinical Neuropsychologist, 9(3), 225-229. Franzen, M. D., Paul, D., & Iverson, G. L. (1996). Reliability of alternate forms of the Trail Making Test. Clinical Neuropsychologist, 10(2), 125-129. Frazier, T. W., Adams, N. L., Strauss, M. E., & Redline, S. (2001). Comparability of the Rey and Mack forms of the Complex Figure Test. Clinical Neuropsychologist, 15(3), 337-344. Freides, D. (1993). Proposed standard of professional practice: Neuropsychological reports display all quantitative data. Clinical Neuropsychologist, 7(2).~235.
Freides, D. (1995). Interpretations are more benign than data? Clinical Neuropsychologist, 9, 248. Friedman, L., Jesberger, J. A., & Meltzer, H. Y. (1991). A model of smooth pursuit performance illustrates the relationship between gain, catch-up saccade rate, and catch-up saccade amplitude in normal controls and patients with schizophrenia. Biological Psychiatry, 30(6), 537-556. Friedman, L., Kenny, J. T., Jesberger, J. A., Choy, M. M., & Meltzer, H. Y. (1995). Relationship between smooth pursuit eye-tracking and cognitive performance in schizophrenia. Biological Psychiatry, 37(4), 265-272. Friedman, M. A., Schinka, J. A., Mortimer, J. A., & Borenstein Graves, A. (2002). Hopkins Verbal Learning Test-Revised: Norms for elderly African Americans. Clinical Neuropsychologist, 16(3), 356-372. Fristoe, N. M., Salthouse, T. A., & Woodard, J. L. (1997). Examination of age-related deficits on the Wisconsin Card Sorting Test. Neuropsychology, 11(3), 428-436. Frith, C. D., Friston, K. J., Herold, S., Silbersweig, D., Fletcher, P., Cahill, C., et al. (1995). Regional
brain activity in chronic schizophrenic patients during the performance of a verbal fluency task. British Journal of Psychiatry, 167, 343-349. Fromm-Auch, D., & Yeudall, L. T. (1983). Normative data for the Halstead-Reitan neuropsychological tests. Journal of Clinical Neuropsychology, 5(3), 221-238. Fujii, D. E., Lloyd, H. A., & Miyamoto, K. (2000). The salience of visuospatial and organizational skills in reproducing the Rey-Osterrieth Complex Figure in subjects with high and low IQs. Clinical Neuropsychologist, 14(4), 551-554. Fukui, T., Sugita, K., Sato, Y., Takeuchi, T., & Tsukagoshi, H. (1994). Cognitive functions in subjects with incidental cerebral hyperintensities. European Neurology, 34, 272-276. Fuld, P. A. (1981). The Fuld Object Memory Test. Chicago: Stoelting Instrument. Fuller, K. H., Gouvier, W. D., & Savage, R. M. (1997). Comparison of list B and list C of the Rey Auditory Verbal Learning Test. Clinical Neuropsychologist, 11(2), 201-204. Furry, C. A., & Baltes, P. B. (1973). The effect of age differences in ability-extraneous performance variables on the assessment of intelligence in children, adults, and the elderly. Journal of Gerontology, 28, 73-80. Gaddes, W. H., & Crockett, D. J. (1975). Spreen-Benton aphasia tests: Normative data as a measure of normal language development. Brain and Language, 2(3), 257-280. Gaillard, W. D., Hertz-Pannier, L., Mott, S. H., Barnett, A. S., LeBihan, D., & Theodore, W. H. (2000). Functional anatomy of cognitive development: fMRI of verbal fluency in children and adults. Neurology, 54(1), 180-185. Galasko, D., Edland, S. D., Morris, J. C., Clark, C., Mohs, R., & Koss, E. (1995). The Consortium to Establish a Registry for Alzheimer's Disease (CERAD): XI. Clinical milestones in patients with Alzheimer's disease followed over 3 yrs. Neurology, 45(8), 1451-1455. Galindo, G., & Cortes, J. F. (2003). The ROCF and the Complex Figure for Children in Spanish-speaking populations. In J. A. Knight (Ed.), The
handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz,
FL: Psychological Assessment Resources. Gambini, O., Macciardi, F., Abbruzzese, M., & Scarone, S. (1992). Influence of education on WCST performances in schizophrenic patients. International Journal of Neuroscience, 67(1-4), 105-109. Ganguli, M., Ratcliff, G., Huff, F., Belle, S., Kano, M., Fischer, L., et al. (1991). Effects of age, gender,
and education on cognitive tests in a rural elderly community sample: Norms from the Monongahela Valley Independent Elders Survey. Neuroepidemiology, 10, 42-52. Ganguli, M., Seaberg, E., Belle, S., Fischer, L., & Kuller, L. H. (1993). Cognitive impairment and the use of health services in an elderly rural population: The MoVIES project. Journal of the American Geriatrics Society, 41, 1065-1070. Ganguli, M., Seaburg, E. C., Ratcliff, G. G., & Belle, S. H. (1996). Cognitive stability over 2 years in a rural elderly population: The MoVIES project. Neuroepidemiology, 15(1), 42-50. Gansler, D. A., Fucetola, R., Krengel, M., Stetson, S., Zimering, R., & Makaly, C. (1998). Are there cognitive subtypes in adult attention deficit/hyperactivity disorder? Journal of Nervous and Mental Disease, 186(12), 776-781. Garb, H. N., & Schramke, C. J. (1996). Judgment research and neuropsychological assessment: A narrative review and meta-analyses. Psychological Bulletin, 120(1), 140-153. Gasquoine, P. G. (2001). Research in clinical neuropsychology with Hispanic American participants: A review. Clinical Neuropsychologist, 15(1), 2-12. Gaudino, E. A., Geisler, M. W., & Squires, N. K. (1995). Construct validity in the Trail Making Test: What makes part B harder? Journal of Clinical and Experimental Neuropsychology, 17(4), 529-535. Geffen, G. M., Moar, K. J., O'Hanlon, A. P., Clark, C. R., & Geffen, L. B. (1990). Performance measures of 16- to 86-year-old males and females on the Auditory Verbal Learning Test. Clinical Neuropsychologist, 4(1), 45-63. Geffen, G. M., Bate, A., Wright, M., Rozenbilds, U., & Geffen, L. (1993). A comparison of cognitive impairments in dementia of the Alzheimer type and depression in the elderly. Dementia, 4(5), 294-300. Geffen, G. M., Butterworth, P., & Geffen, L. B. (1994). Test-retest reliability of a new form of the Auditory Verbal Learning Test (AVLT). Archives of Clinical Neuropsychology, 9(4), 303-316. George, M. S., Ketter, T. A., Parekh, P. 
I., Rosinksy, N., Ring, H., Casey, B. J., et al. (1994). Regional brain activity when selecting a response despite interference: An H₂¹⁵O PET study of the Stroop and an emotional Stroop. Human Brain Mapping, 1, 194-209. Gershberg, F. B., & Shimamura, A. P. (1995). Impaired use of organizational strategies in free recall following frontal lobe damage. Neuropsychologia, 33(10), 1305-1333.
Gerson, A. (1974). Validity and reliability of the Hooper Visual Organization Test. Perceptual and Motor Skills, 39, 95-100. Giambra, L. M., Arenberg, D., Kawas, C., Zonderman, A. B., & Costa, P. T. (1995). Adult life span changes in immediate visual memory and verbal intelligence. Psychology and Aging, 10(1), 123-139. Gigi, A., Schnaider-Beeri, M., Davidson, M., & Prohovnik, I. (1999). Validation of a Hebrew selective reminding test. Israel Journal of Psychiatry and Related Sciences, 36(1), 11-17. Gilandas, A. J., Touyz, S., Beumont, P. J. V., & Greenberg, H. P. (1984). Handbook of neuropsychological assessment. Sydney: Grune & Stratton. Gilleard, E., & Gilleard, C. (1989). A comparison of Turkish and Anglo-American normative data on the Wechsler Memory Scale. Journal of Clinical Psychology, 45(1), 114-117. Giovagnoli, A. R. (1996). Trail Making Test: Normative values from 287 normal adult controls. Italian Journal of Neurological Sciences, 17(4), 305-310. Giovagnoli, A. R., & Avanzini, G. (1996). Forgetting rate and interference effects on a verbal memory distractor task in patients with temporal lobe epilepsy. Journal of Clinical and Experimental
Neuropsychology, 18, 259-264. Giovannetti, T., Goldstein, R. Z., Schullery, M., Barr, W. B., & Bilder, R. M. (2003). Category fluency in first-episode schizophrenia. Journal of
the International Neuropsychological Society, 9(3), 384-393. Gladsjo, J. A., Schuman, C. C., Evans, J. D., Peavy, G. M., Miller, S. W., & Heaton, R. K. (1999). Norms for letter and category fluency: Demographic corrections for age, education, and ethnicity. Assessment, 6(2), 147-178. Glass, G. V. (1976). Primary, secondary, and meta-analysis. Educational Researcher, 5, 3-5. Gleissner, U., & Elger, C. E. (2001). The hippocampal contribution to verbal fluency in patients with temporal lobe epilepsy. Cortex, 37(1), 55-63. Glennerster, A., Palace, J., Warburton, D., & Oxbury, S. (1996). Memory in myasthenia gravis: Neuropsychological tests of central cholinergic function before and after effective immunologic treatment. Neurology, 46(4), 1138-1142. Goethe, K. E., Mitchell, J. E., Marshall, D. W., Brey, R. L., Cahill, W. T., Leger, G. D., et al. (1989). Neuropsychological and neurological function of human immunodeficiency virus seropositive asymptomatic individuals. Archives of Neurology, 46(2), 129-133.
Gold, A. E., MacLeod, K. M., Deary, I. J., & Frier, B. M. (1995). Hypoglycemia-induced cognitive dysfunction in diabetes mellitus: Effect of hypoglycemia unawareness. Physiology and Behavior, 58(3), 501-511. Goldberg, T. E., Berman, K. F., Mohr, E., & Weinberger, D. R. (1990). Regional cerebral blood flow and cognitive function in Huntington's disease and schizophrenia: A comparison of patients matched for performance on a prefrontal-type task. Archives of Neurology, 47, 418-422. Golden, C. (1978). Stroop experimental uses. Chicago: Stoelting. Golden, C. (1981). A standardized version of Luria's neuropsychological tests. In S. Filskov & T. Boll (Eds.), Handbook of clinical neuropsychology. New York: Wiley-Interscience. Golden, C. J., Kupperman, S. K., MacInnes, W. D., & Moses, J. A. (1981a). Cross-validation of an abbreviated form of the Halstead Category Test.
Journal of Consulting and Clinical Psychology, 49(4), 606-607. Golden, C. J., Osmon, D. C., Moses, J. A., Jr., & Berg, R. A. (1981b). Interpretation of the Halstead-Reitan Neuropsychological Test Battery: A Casebook Approach. New York: Grune & Stratton. Goldman, R. S., Axelrod, B. N., Heaton, R. K., Chelune, G. J., et al. (1996). Latent structure of the WCST with the standardization samples. Assessment, 3(1), 73-78. Goldman, W. P., Baty, J. D., Buckles, V. D., Sahrmann, S., & Morris, J. C. (1998). Cognitive and motor functioning in Parkinson disease: Subjects with and without questionable dementia. Archives of Neurology, 55(5), 674-680. Goldsmith, R. W., & Brengelmann, J. C. (1971). Interactions between personality, abnormality and test conditions in a battery of tests. Archives of Psychology, 123(3), 217-224. Goldstein, D., Mercury, M., Azrin, R., Millsaps, C., Ventura, T., & Pliskin, N. (2000). Cultural considerations on the Boston Naming Test: The effects of race and geographic region. Journal of the International Neuropsychological Society, 6, 143. Goldstein, G. (1997). The clinical utility of standardized or flexible battery approaches to neuropsychological assessment. In G. I. Goldstein & T. M. Incagnoli (Eds.), Contemporary approaches to neuropsychological assessment. New York: Plenum. Goldstein, G., & Zubin, J. (1990). Neuropsychological differences between young and old schizophrenics with and without associated neurological dysfunction. Schizophrenia Research, 3(2), 117-126.
Goldstein, G., & Shelly, C. H. (1972). Statistical and normative studies of the Halstead neuropsychological test battery relevant to a neuropsychiatric hospital setting. Perceptual and Motor Skills, 34, 603-620. Goldstein, G., & Shelly, C. H. (1975). Similarities and differences between psychological deficit in aging and brain damage. Journal of Gerontology, 30(4), 448-455. Goldstein, S. G., & Braun, L. S. (1974). Reversal of expected transfer as a function of increased age. Perceptual and Motor Skills, 38, 1139-1145. Gollan, T. H., Montoya, R. I., & Werner, G. A. (2002). Semantic and letter fluency in Spanish-English bilinguals. Neuropsychology, 16(4), 562-576. Gontkovsky, S. T., & Souheaver, G. T. (2002). T-score and raw-score comparisons in detecting brain dysfunction using the Booklet Category Test and the Short Category Test. Perceptual and Motor Skills, 94(1), 319-322. Gonzalez, E. A., Dieter, J. N. I., Natale, R. A., & Tanner, S. L. (2001). Neuropsychological evaluation of higher functioning homeless persons: A comparison of an abbreviated test battery to the Mini-Mental State Exam. Journal of Nervous and Mental Disease, 189(3), 176-181. Goodglass, H. (1980). Disorders of naming following brain injury. American Scientist, 68(6), 647-655. Goodglass, H., Kaplan, E., & Barressi, B. (2001).
The Boston Diagnostic Aphasia Examination (BDAE) (3rd ed.). Odessa, FL: Psychological Assessment Resources. Gooding, D. C., & Tallent, K. A. (2002). Spatial working memory performance in patients with schizoaffective psychosis versus schizophrenia: A tale of two disorders? Schizophrenia Research, 53(3), 209-218. Gooding, D. C., Kwapil, T. R., & Tallent, K. A. (1999). Wisconsin Card Sorting Test deficits in schizotypic individuals. Schizophrenia Research, 40(3), 201-209. Goodman, A. M., Delis, D. C., & Mattson, S. N. (1999). Normative data for 4-year-old children on the California Verbal Learning Test-Children's Version. Clinical Neuropsychologist, 13(3), 274-282. Gordon, H. W., & Lee, P. A. (1986). A relationship between gonadotropins and visuospatial function. Neuropsychologia, 24(4), 563-576. Gordon, N. G. (1972). The Trail Making Test in neuropsychological diagnosis. Journal of Clinical Psychology, 28, 167-169. Gordon, N. G., & O'Dell, J. W. (1983). Sex differences in neuropsychological performance. Perceptual and Motor Skills, 56, 126.
Gorsuch, R. L. (1983). The theory of continuous norming. In R. L. Gorsuch (chair), Continuous norming: An alternative to tabled norms? Symposium conducted at the 91st Annual Convention of the American Psychological Association, Anaheim, August 26-30. Gottschalk, L. A., & Selin, C. (1991). Comparative neurobiological and neuropsychological deficits in adolescent and adult schizophrenic and nonschizophrenic patients. Psychotherapy and Psychosomatics, 55(1), 32-41. Goul, W. R., & Brown, M. (1970). Effects of age and intelligence on Trail Making Test performance and validity. Perceptual and Motor Skills, 30, 319-326. Goulet, P., Ska, B., & Kahn, H. J. (1994). Is there a decline in picture naming with advancing age? Journal of Speech and Hearing Research, 37(3), 629-644. Gouvier, W. D. (1999). Base rates and clinical decision making in neuropsychology. In J. J. Sweet (Ed.), Forensic neuropsychology: Fundamentals
and practice. Studies on neuropsychology, development, and cognition. Lisse: Swets & Zeitlinger. Gouvier, W. D. (2001). Are you sure you're really telling the truth? [Special Issue: Controversies in
neuropsychology] NeuroRehabilitation, 16(4), 215-219. Gouvier, W. D., Hayes, J. S., & Smiroldo, B. B. (1998). The significance of base rates, test sensitivity, test specificity, and subjects' knowledge of symptoms in assessing TBI sequelae and malingering. In C. R. Reynolds (Ed.), Detection
of malingering during head injury litigation. Critical issues in neuropsychology (pp. 55-79). NY: Plenum Press. Gouvier, W. D., Pinkston, J. B., Santa Maria, M. P., & Cherry, K. E. (2002). Base rate analysis in cross-cultural clinical psychology-Diagnostic accuracy in the balance. In F. R. Ferraro (Ed.), Minority
and cross-cultural aspects of neuropsychological assessment. Studies on neuropsychology, development, and cognition. Lisse: Swets & Zeitlinger. Grady, D., Yaffe, K., Kristof, M., Lin, F., Richards, C., & Barrett-Connor, E. (2002). Effect of postmenopausal hormone therapy on cognitive function: The Heart and Estrogen/Progestin Replacement Study. American Journal of Medicine, 113(7), 543-548. Graf, P., & Uttl, B. (1995). Component processes of memory: Changes across the adult lifespan. Swiss Journal of Psychology, 54(2), 11~130. Graf, P., Uttl, B., & Tuokko, H. (1995). Color- and Picture-Word Stroop Tests: Performance
changes in old age. Journal of Clinical and Experimental Neuropsychology, 17, 390-415. Grafman, J., Jonas, B., & Salazar, A. (1990). Wisconsin Card Sorting Test performance based on location and size of neuroanatomical lesion in Vietnam veterans with penetrating head injury. Perceptual and Motor Skills, 71(3, Pt 2), 1120-1122. Grant, D. A., & Berg, E. A. (1948). Behavioral analysis of degree of reinforcement and ease of shifting to new responses in a Weigl-type card-sorting problem. Journal of Experimental Psychology, 38, 404-411. Grant, I., Prigatano, G. P., Heaton, R. K., McSweeny, A. J., Wright, E. C., & Adams, K. M. (1987). Progressive neuropsychologic impairment and hypoxemia: Relationship in chronic obstructive pulmonary disease. Archives of General Psychiatry, 44(11), 999-1006. Green, B., & Hall, J. (1984). Quantitative methods for literature review. Annual Review of Psychology, 35, 31-53. Greene, R. L., & Farr, S. P. (1985, August). Multiple regression of moderator variables on Trail Making Test performance. Paper presented at the annual meeting of the American Psychological Association, Los Angeles, CA. Gregg, E. W., Yaffe, K., Cauley, J. A., Rolka, D. B., Blackwell, T. L., Narayan, K. M., & Cummings, S. R. (2000). Is diabetes associated with cognitive impairment and cognitive decline among older women? Study of Osteoporotic Fractures Research Group. Archives of Internal Medicine, 160(2), 17~180. Gregory, R., & Paul, J. (1980). The effects of handedness and writing posture on neuropsychological test results. Neuropsychologia, 18, 231-235. Gregory, R. J., Paul, J. J., & Morrison, M. W. (1979). A short form of the Category Test for adults. Journal of Clinical Psychology, 35(4), 795-798. Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6(3), 218-224. Greve, K. W. (1993). 
Can perseverative responses on the Wisconsin Card Sorting Test be scored accurately? Archives of Clinical Neuropsychology, 8, 511-517. Greve, K. W. & Bianchini, K. J. (2002). Using the Wisconsin Card Sorting Test to detect malingering: An analysis of the specificity of two methods in nonmalingering normal and patient samples. Journal of Clinical and Experimental Neuropsychology, 24, 48-54.
Greve, K. W., Williams, M. C., Haas, W. G., Littell, R. R., & Reinoso, C. (1996). The role of attention in Wisconsin Card Sorting Test performance. Archives of Clinical Neuropsychology, 11(3), 215-222. Greve, K. W., Brooks, J., Crouch, J. A., Williams, M. C., et al. (1997). Factorial structure of the Wisconsin Card Sorting Test. British Journal of Clinical Psychology, 36(2), 283-285. Greve, K. W., Ingram, F., & Bianchini, K. J. (1998). Latent structure of the Wisconsin Card Sorting Test in a clinical sample. Archives of Clinical Neuropsychology, 13(7), 597-609. Greve, K. W., Bianchini, K. J., Hartley, S. M., & Adams, D. (1999). The Wisconsin Card Sorting Test in stroke rehabilitation: Factor structure and relationship to outcome. Archives of Clinical Neuropsychology, 14, 497-509. Greve, K. W., Lindberg, R. F., Bianchini, K. J., & Adams, D. (2000). Construct validity and predictive value of the Hooper Visual Organization Test in stroke rehabilitation. Applied Neuropsychology, 7(4), 215-222. Greve, K. W., Love, J. M., Sherwin, E., Mathias, C. W., Ramzinski, P., & Levy, J. (2002). Wisconsin Card Sorting Test in chronic severe traumatic brain injury: Factor structure and performance subgroups. Brain Injury, 16(1), 29-40. Greve, K. W., Bianchini, K. J., & Adams, D. (2003a). The ROCF in stroke rehabilitation and recovery. In J. A. Knight (Ed.), The handbook of
Rey-Osterrieth Complex Figure usage: Clinical and research applications (pp. 525-541). Lutz, FL: Psychological Assessment Resources. Greve, K. W., Hartley, S. M., Houston, R. J., Bianchini, K. J., Adams, D., & Stanford, M. S. (2003b). The ROCF in patients with vascular lesions of the cerebellum. In J. A. Knight (Ed.),
The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications (pp. 611-624). Lutz, FL: Psychological Assessment Resources. Griffiths, P. (1991). Word-finding ability and design fluency in developmental dyslexia. British Journal of Clinical Psychology, 30(1), 47-60. Grigsby, J., & Kaye, K. (1995). Alphanumeric sequencing and cognitive impairment among elderly persons. Perceptual and Motor Skills, 80(3, Pt 1), 732-734. Grigsby, J., Kaye, K., & Busenbark, D. (1994). Alphanumeric sequencing: A report on a brief measure of information processing used among persons with multiple sclerosis. Perceptual and Motor Skills, 78(3, Pt 1), 883-887. Grober, E., Merling, A., Heimlich, T., & Lipton, R. (1997). Free and cued selective reminding in the
elderly. Journal of Clinical and Experimental Neuropsychology, 19(5), 643-654. Grober, E., Lipton, R., Katz, M., & Sliwinski, M. (1998). Demographic influences on free and cued selective reminding performance in older persons. Journal of Clinical and Experimental Neuropsychology, 20(2), 221-226. Grober, E., Lipton, R., Hall, C., & Crystal, H. (2000). Memory impairment on free and cued selective reminding predicts dementia. Neurology, 54(4), 827-832. Groff, M. G., & Hubble, L. M. (1981). A factor analytic investigation of the Trail Making Test.
International Journal of Clinical Neuropsychology, 3(4), 11-13.
Gronwall, D. (1977a). Paced Auditory Serial Addition task: A measure of recovery from concussion. Perceptual and Motor Skills, 44(2), 367-373. Gronwall, D. (1977b). PASAT (Paced Auditory
Serial Addition Test): Manual of instructions and norms. Victoria: University of Victoria. Gronwall, D., & Sampson, H. (1974). The psychological effects of concussion. Auckland, New Zealand: Auckland University Press-Oxford University Press. Gronwall, D., & Wrightson, P. (1974). Delayed recovery of intellectual function after minor head injury. Lancet, 2, 605-609. Gronwall, D., & Wrightson, P. (1981). Memory and information processing capacity after closed head injury. Journal of Neurology, Neurosurgery,
and Psychiatry, 44, 88~95.
Gross-Isseroff, R., Sasson, Y., Voet, H., Hendler, T., Luca-Haimovici, K., Kandel-Sussman, et al. (1996). Alternation learning in obsessive-compulsive disorder. Biological Psychiatry, 39(8), 733-738. Groth-Marnat, G. (2000). Introduction to neuropsychological assessment. In G. Groth-Marnat (Ed.), Neuropsychological assessment in clinical practice. New York: Wiley. Groth-Marnat, G. (2003). Handbook of psychological assessment (4th ed.). New York: Wiley. Gruenewald, P. J., & Lockhead, G. R. (1980). The free recall of category examples. Journal of Experimental Psychology: Human Learning and
Memory, 6(3), 225-240. Gruzelier, J., & Warren, K. (1993). Neuropsychological evidence of reductions on left frontal tests with hypnosis. Psychological Medicine, 23(1), 93-101. Guilford, J. P. (1965). Fundamental statistics in psychology and education. New York: McGraw-Hill. Guilmette, T. J., & Rasile, D. (1995). Sensitivity, specificity, and diagnostic accuracy of three
verbal memory measures in the assessment of mild brain injury. Neuropsychology, 9(3), 338-344. Guo, Q., Lu, C., & Hong, Z. (2000). Application of Rey-Osterrieth Complex Figure Test in Chinese normal older people. Chinese Journal of Clinical Psychology, 8(4), 205-207. Gur, R. C., Alsop, D., Glahn, D., Petty, R., Swanson, C. L., Maldjian, J. A., et al. (2000). An fMRI study of sex differences in regional activation to a verbal and a spatial task. Brain and Language, 74(2), 157-170. Gurd, J. M. (2000). Verbal fluency deficits in Parkinson's disease: Individual differences in underlying cognitive mechanisms. Journal of Neurolinguistics, 13(1), 47-55. Gurling, H. M., Curtis, D., & Murray, R. M. (1991). Psychological deficit from excessive alcohol consumption: Evidence from a co-twin control study. British Journal of Addiction, 86(2), 151-155. Guruje, O., Unverzargt, F., Osuntokun, B., Hendrie, H., Baiyewu, O., Ogunniyi, A., et al. (1995). The CERAD Neuropsychological Test Battery: Norms from a Yoruba-speaking Nigerian sample. West African Journal of Medicine, 14, 29-33. Guskiewicz, K. M., Ross, S. E., & Marshall, S. W. (2001). Postural stability and neuropsychological deficits after concussion in collegiate athletes [Special Issue: Concussion in athletes]. Journal of Athletic Training, 36(3), 263-273. Haaland, K. Y., Cleeland, C. S., & Carr, D. (1977). Motor performance after unilateral hemisphere damage in patients with tumor. Archives of Neurology, 34, 556-559. Haaland, K. Y., Linn, R., Hunt, W., & Goodwin, J. (1983). A normative study of Russel's variant of the Wechsler Memory Scale in a healthy elderly population. Journal of Consulting and Clinical Psychology, 51(6), 87~1. Haaland, K. Y., Vranes, L. F., Goodwin, J. S., & Garry, P. J. (1987). Wisconsin Card Sort Test performance in a healthy elderly population. Journal of Gerontology, 42(3), 345-346. Haaland, K. Y., Price, L., & LaRue, A. (2003). 
What does the WMS-III tell us about memory changes with normal aging? Journal of the International Neuropsychological Society, 9, 89-96. Haddock, C. K., Rindskopf, D., & Shadish, W. R. (1998). Using odds ratios as effect sizes for meta-analysis of dichotomous data: A primer on methods and issues. Psychological Methods, 3(3), 339-353. Halligan, F. R., Reznikoff, M., Friedman, H. P., & LaRocca, N. G. (1988). Cognitive dysfunction
and change in multiple sclerosis. Journal of Clinical Psychology, 44(4), 540-548. Halligan, P. W., Cockburn, J., & Wilson, B. A. (1991). The behavioural assessment of visual neglect. Neuropsychological Rehabilitation, 1(1), 5-32. Halstead, W. C. (1947). Brain and intelligence. Chicago: University of Chicago Press. Hamby, S. L., Wilkins, J. W., & Barry, N. S. (1993). Organizational quality on the Rey-Osterrieth and Taylor complex figure tests: A new scoring system. Psychological Assessment, 5(1), 27-33. Hamby, S. L., Bardi, C. A., & Wilkins, J. W. (1997). Neuropsychological assessment of relatively intact individuals: Psychometric lessons from an HIV+ sample. Archives of Clinical Neuropsychology, 12(6), 545-556. Hameleers, P. A. H. M., Van Boxtel, M. P. J., Hogervorst, E., Riedel, W. J., Houx, P. J., Buntinx, F., et al. (2000). Habitual caffeine consumption and its relation to memory, attention, planning capacity and psychomotor performance across multiple age groups. Human Psychopharmacology: Clinical and Experimental, 15(8), 573-581. Hamilton, M. (1960). A rating scale for depression.
Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56-62. Hanks, R. A., Allen, J. B., Ricker, J. H., & Deshpande, S. A. (1996). Normative data on a measure of design fluency: The Make A Figure Test. Assessment, 3(4), 459-466. Hannay, H. J., & Levin, H. S. (1985). Selective Reminding Test: An examination of the equivalence of four forms. Journal of Clinical and Experimental Neuropsychology, 7(3), 251-263. Hannay, H. J., Falgout, J. C., Leli, D. A., Katholi, C. R., et al. (1987). Focal right temporooccipital blood flow changes associated with judgment of line orientation. Neuropsychologia, 25(5), 755-763. Hanninen, T., Hallikainen, M., Koivisto, K., Paartanen, K., Laakso, M. P., Riekkinen, P. J., et al. (1997). Decline of frontal lobe functions in subjects with age-associated memory impairment. Neurology, 48, 148-153. Harnadek, M. C. S., & Rourke, B. P. (1994). Principal identifying features of the syndrome of nonverbal learning disabilities in children. Journal of Learning Disabilities, 27(3), 144-154. Harris, J. G., Cullum, C. M., & Puente, A. E. (1995). Effects of bilingualism on verbal learning and memory in Hispanic adults. Journal of
the International Neuropsychological Society, 1(1), 10-16.
REFERENCES Harris, M., Cross, H., & VanNieuwkerk, R. (1981). The effects of state depression, induced depression and sex on the Finger Tapping and Tactual Performance Tests. Clinical Neuropsychology, 3(4), 28-34. Harris, M. E. (1988). Wisconsin Card Sorting Test computer version. Odessa, FL: Psychological Assessment Resources. Harris, M. J., & Rosenthal, R. (1985). Mediation of interpersonal expectancy effects: 31 metaanalyses. Psychological BuUetin, 97, 363-386. Hart, R. P., Kwentus, J. A., Wade, J. B., & Taylor, J. R. (1988). Modified Wisconsin Sorting Test in elderly normal, depressed and demented patients. Clinical Neuropsychologist, 2(1), 49-56. Harter, S., Hart, C., & Harter, G. (1999). Expanded scoring criteria for the Design Fluency Test: Reliability and validity in neuropsychological and college samples. Archives of Clinical Neuropsychology, 14(5), 419-432. Hartlage, L. C. (2001). Neuropsychological testing of adults: Further considerations for neurologists. Archives of Clinical Neuropsychology, 16(3), 201-213. Hartman, M., & Potter, G. (1998). Sources of age differences on the Rey-Osterrieth Complex Figure Test. Clinical Neuropsychologist, 12(4), 513-524. Harvey, N. S. (1986). Impaired cognitive setshifting in obsessive-compulsive neurosis. IRCS Medical Science, 14, 936-937. Hasselblad, V., & Hedges, L. V. (1995). Metaanalysis of screening and diagnostic tests. Psychological BuUetin, 117(1), 167-178. Haut, M. W., Cahill, J., Cutlip, W. D., Stevenson, J. M., Makela, E. H., Bloomfield, S. M. (1996). On the nature of Wisconsin Card Sorting Test performance in schizophrenia. Psychiatry Research, 65(1), 15-22. Hawkins, K. A., & Bender, S. (2002). Norms and the relationship of Boston Naming Test performance to vocabulary and education: A review. Aphasiology, 16(12), 1143-1153. Hawkins, K. A., Sledge, W. H., Orleans, J. F., Quinlan, D. M., et al. (1993). Nonnative implications of the relationship between reading vocabulary and Boston Naming Test performance. 
Archives of Clinical Neuropsychology, 8(6), 525-537. Haxby, J. V., Grady, C. L., Duara, R., Robertson-Tchabo, E. A., Koziarz, B., Cutler, N. R., et al. (1986). Relations among age, visual memory, and resting cerebral metabolism in 40 healthy men.
Brain and Cognition, 5(4), 412-427. Hayes, W. L. (1963). Statistics. New York: Rinehart & Winston.
Hays, J. R. (1995). Trail Making Test norms for psychiatric patients. Perceptual and Motor Skills, 80(1), 187-194. Head, D., Bolton, D., & Hymas, N. (1989). Deficits in cognitive shifting ability in patients with obsessive-compulsive disorder. Biological Psychiatry, 25, 929-937. Head, D., Raz, N., Gunning-Dixon, F., Williamson, A., & Acker, J. D. (2002). Age-related differences in the course of cognitive skill acquisition: The role of regional cortical shrinkage and cognitive resources. Psychology and Aging, 17(1), 72-84. Heaton, R. K. (1981). Wisconsin Card Sorting Test manual. Odessa, FL: Psychological Assessment Resources. Heaton, R. K. (1985). Importance of demographic
variables in interpreting scores on the Halstead-Reitan Battery. Paper presented at the 13th annual meeting of the International Neuropsychological Society, San Diego, CA. Heaton, R. K. (1992). Comprehensive norms for an
expanded Halstead-Reitan Battery: A supplement for the WAIS-R. Odessa, FL: Psychological Assessment Resources. Heaton, R. K. (1993). Wisconsin Card Sorting Test Computer version 2.0. Odessa, FL: Psychological Assessment Resources. Heaton, R. K. (2003a). Wisconsin Card Sorting Test computer version 4.0. Odessa, FL: Psychological Assessment Resources. Heaton, R. K. (2003b). Wisconsin Card Sorting
Test-64 research edition computer version 2.0. Odessa, FL: Psychological Assessment Resources. Heaton, R. K., Vogt, A. T., Hoehn, M. M., Lewis, J. A., Crowley, T. J., & Stallings, M. A. (1979). Neuropsychological impairment with schizophrenia vs. acute and chronic cerebral lesions. Journal of Clinical Psychology, 35(1), 46-53. Heaton, R. K., Nelson, L. M., Thompson, D. S., Burks, J. S., & Franklin, G. M. (1985). Neuropsychological findings in relapsing-remitting and chronic-progressive multiple sclerosis. Journal of Consulting and Clinical Psychology, 53(8), 103-110. Heaton, R. K., Grant, I., & Matthews, C. G. (1986). Differences in neuropsychological test performance associated with age, education, and sex. In I. Grant & K. Adams (Eds.),
Neuropsychological assessment of neuropsychiatric disorders. New York: Oxford University Press. Heaton, R. K., Grant, I., & Matthews, C. (1991).
Comprehensive norms for an expanded Halstead-Reitan Neuropsychological Battery: Demographic
corrections, research findings, and clinical applications. Odessa, FL: Psychological Assessment Resources. Heaton, R. K., Chelune, G. J., Talley, J. L., Kay, G. G., & Curtiss, G. (1993). Wisconsin Card
Sorting Test manual: Revised and expanded. Odessa, FL: Psychological Assessment Resources. Heaton, R. K., Matthews, C., Grant, I., & Avitable, N. (1996a). Demographic corrections with comprehensive norms: An overzealous attempt or a good start? Journal of Clinical and Experimental Neuropsychology, 18(3), 449-458. Heaton, R. K., Ryan, L., Grant, I., & Matthews, C. G. (1996b). Demographic influences on neuropsychological test performance. In I. Grant & K. M. Adams (Eds.), Neuropsychological assessment of neuropsychiatric disorders (2nd ed.). New York: Oxford University Press. Heaton, R. K., Avitable, N., Grant, I., & Matthews, C. G. (1999). Further crossvalidation of regression-based neuropsychological norms with an update for the Boston Naming Test. Journal of
Clinical and Experimental Neuropsychology, 21(4), 571-582. Heaton, R. K., Temkin, N., Dikmen, S., Avitable, N., Taylor, M. J., Marcotte, T. D., et al. (2001). Detecting change: A comparison of three neuropsychological methods, using normal and clinical samples. Archives of Clinical Neuropsychology, 16(1), 75-91. Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised comprehensive norms for an
expanded Halstead-Reitan Battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults. Lutz,
FL: Psychological Assessment Resources. Hedges, L. (1982). Estimation of effect size from a series of independent experiments. Psychological Bulletin, 92, 490-499. Hedges, L., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press. Heilbronner, R. L., Henry, G. K., Buck, P., Adams, R. L., & Fogle, T. (1991). Lateralized brain damage and performance on Trail Making A and B, Digit Span forward and backward, and TPT memory and location. Archives of Clinical Neuropsychology, 6(4), 251-258. Helmstaedter, C., Pohl, C., & Elger, C. E. (1995). Relations between verbal and nonverbal memory performance: Evidence of confounding effects particularly in patients with right temporal lobe epilepsy. Cortex, 31(2), 345-355. Hemsley, D. (1974). Relationship between two tests of visual retention. Perceptual and Motor Skills, 39, 1132-1134.
Henderson, L. W., Frank, E. M., Pigatt, T., Abramson, R. K., & Houston, M. (1998). Race, gender, and educational level effects on Boston Naming Test scores. Aphasiology, 12(10), 901-911. Henderson, V. W., Mack, W., Freed, D. M., Kempler, D., & Andersen, E. S. (1990). Naming consistency in Alzheimer's disease. Brain and
Language, 39, 530-538. Hermann, B. P., Wyler, A. R., & Richey, E. T. (1988). Wisconsin Card Sorting Test performance in patients with complex partial seizure of temporal-lobe origin. Journal of Clinical and Experimental Neuropsychology, 10, 467-476. Hesselbrock, M. N., Weidenman, M. A., & Reed, H. B. (1985). Effect of age, sex, drinking history and antisocial personality on neuropsychology of alcoholics. Journal of Studies on Alcohol, 46(4), 313-320. Heubrock, D. (1995). Error analysis in neuropsychological assessment of verbal memory and learning. European Journal of Psychological Assessment, 11(1), 21-28. Hilgert, L. D., & Treloar, J. H. (1985). The relationship of Hooper Visual Organization Test to sex, age and intelligence of elementary school children. Measurement and Evaluation in Counseling and Development, 17(4), 203-206. Hinton, V. J., Dobkin, C. S., Halperin, J. M., Jenkins, E. C., Brown, W. T., Ding, X. H., et al. (1992). Mode of inheritance influences behavioral expression and molecular control of cognitive deficits in female carriers of the fragile X syndrome. American Journal of Medical Genetics, 43(1-2), 87-95. Ho, A. K., Sahakian, B. J., Robbins, T. W., Barker, R. A., Rosser, A. E., & Hodges, J. R. (2002). Verbal fluency in Huntington's disease: A longitudinal analysis of phonemic and semantic clustering and switching. Neuropsychologia, 40(8), 1277-1284. Hochberg, F. H., & Slotnick, B. (1980). Neuropsychologic impairment in astrocytoma survivors. Neurology, 30(2), 172-177. Hochla, N. N., Fabian, M. S., & Parsons, O. A. (1982). Brain-age quotients in recently detoxified alcoholic, recovered alcoholic and nonalcoholic women. Journal of Clinical Psychology, 38(1), 207-212. Hodges, J. R., Salmon, D. P., & Butters, N. (1991). The nature of the naming deficit in Alzheimer's and Huntington's disease. Brain, 114, 1547-1558. Hoff, A. L., Riordan, H., Morris, L., Cestaro, V., Wieneke, M., Alpert, R., et al. (1996). Effects
of crack cocaine on neurocognitive function. Psychiatry Research, 60(2-3), 167-176. Hoffman, D. T. (1969). Sex differences in preferred finger tapping rates. Perceptual and Motor Skills, 29, 676. Hogervorst, E., Combrinck, M., Lapuerta, P., Rue, J., Swales, K., & Budge, M. (2001). The Hopkins Verbal Learning Test and screening for dementia. Dementia and Geriatric Cognitive Disorders, 13(1), 13-20. Holdwick, D. J., Jr., & Wingenfeld, S. A. (1999). The subjective experience of PASAT testing: Does the PASAT induce negative mood? Archives of Clinical Neuropsychology, 14(3), 273-284. Hom, J. (2003). Forensic neuropsychology: Are we there yet? Archives of Clinical Neuropsychology, 18, 827-845. Hom, J., & Reitan, R. M. (1990). Generalized cognitive function after stroke. Journal of Clinical and Experimental Neuropsychology, 12, 644-654. Honn, V. J., Para, M. F., Whitacre, C. C., & Bornstein, R. A. (1999). Effect of exercise on neuropsychological performance in asymptomatic HIV infection. AIDS and Behavior, 3(1), 67-74. Hooper, H. E. (1958, 1983, 1997). Hooper Visual Organization Test (VOT). Los Angeles: Western Psychological Services. Horner, M. D., Flashman, L. A., Freides, D., Epstein, C. M., & Bakay, R. A. E. (1996). Temporal lobe epilepsy and performance on the Wisconsin Card Sorting Test. Journal of Clinical and Experimental Neuropsychology, 18, 310-313. Horton, A. M., & Roberts, C. (2003). Demographic effects on the Trail Making Test in a drug abuse treatment sample. Archives of Clinical Neuropsychology, 18(1), 49-56. Houx, P. J., Jolles, J., & Vreeling, F. (1993). Stroop interference: Aging effects assessed with the Stroop Color-Word Test. Experimental Aging Research, 19, 209-224. Hsieh, S., & Riley, N. (1997, November). Neuropsychological performance in the People's Republic of China: Age and educational norms for four attention tasks. Paper presented at the National Academy of Neuropsychology, Las Vegas, NV. Hsieh, S., Lee, C. Y., & Tai, C. T. (1995).
Set-shifting aptitude in Parkinson's disease: External versus internal cues. Psychological Reports, 77, 339-349. Hubley, A. M., & Tombaugh, T. N. (2003). Taylor Complex Figure: Comparability to the ROCF
and norms. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Hubley, A. M., & Tremblay, D. (2002). Comparability of total score performance on the Rey-Osterrieth Complex Figure and a modified Taylor Complex Figure. Journal of Clinical and Experimental Neuropsychology, 24(3), 370-382. Hubley, A. M., Tombaugh, T. N., & Hemingway, D. (2003). A modification of the Taylor Figure and the development of new figures for older adults. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Huff, F. J., Collins, C., Corkin, S., & Rosen, T. J. (1986a). Equivalent forms of the Boston Naming Test. Journal of Clinical and Experimental Neuropsychology, 8(5), 556-562. Huff, F. J., Corkin, S., & Growdon, J. H. (1986b). Semantic impairment and anomia in Alzheimer's disease. Brain and Language, 28, 235-249. Hugdahl, K., & Franzon, M. (1985). Visual half-field presentations of incongruent color-words reveal mirror-reversal of language lateralization in dextral and sinistral subjects. Cortex, 21, 359-374. Hughes, D. L., & Bryan, J. (2002). Adult age differences in strategy use during verbal fluency performance. Journal of Clinical and Experimental Neuropsychology, 24(5), 642-654. Huhtaniemi, P., Haier, R. J., Fedio, P., & Buchsbaum, M. S. (1983). Neuropsychological characteristic of college males who show attention dysfunction. Perceptual and Motor Skills, 57, 399-406.
Hulicka, I. M. (1966). Age differences in Wechsler Memory Scale scores. Journal of Genetic Psychology, 190, 135-145. Hultsch, D. F., Hammer, M., & Small, B. J. (1993). Age differences in cognitive performance in later life: Relationships to self-reported health and activity life style. Journals of Gerontology, 48(1), P1-P11. Hunter, J. E., Schmidt, F. L., & Jackson, G. B. (1982). Meta-analysis: Cumulating research findings across studies. Beverly Hills: Sage. Iijima, M., Osawa, M., Iwata, M., Miyazaki, A., & Tei, H. (2000). Topographic mapping of P300 and frontal cognitive function in Parkinson's disease. Behavioural Neurology, 12(3), 143-148. Ilonen, T., Taiminen, T., Lauerma, H., Karlsson, H., Helenius, H. Y. M., Tuimala, P., et al. (2000). Impaired Wisconsin Card Sorting Test performance in first-episode schizophrenia: Resource or
motivation deficit? Comprehensive Psychiatry,
41(5), 385-391. Ingraham, L. J., & Aiken, C. B. (1996). An empirical approach to determining criteria for abnormality in test batteries with multiple measures. Neuropsychology, 10(1), 120-124. Ingraham, L. J., Chard, F., Wood, M., & Mirsky, A. F. (1988). A Hebrew language version of the Stroop test. Perceptual and Motor Skills, 67(1), 187-192. Ingram, F., Greve, K. W., Ingram, P., & Soukup, V. M. (1999). Temporal stability of the Wisconsin Card Sorting Test in an untreated patient sample. British Journal of Clinical Psychology, 38, 209-211. Inman, V. W., & Parkinson, S. R. (1983). Differences in Brown-Peterson recall as a function of age and retention interval. Journal of Gerontology, 38, 58-64. Insel, T. R., Donnelly, E. F., Lalakea, M. L., Alterman, I. S., & Murphy, D. L. (1983). Neurological and neuropsychological studies of patients with obsessive-compulsive disorder. Biological Psychiatry, 18(7), 741-751. Isaacs, B., & Kennie, A. T. (1973). The Set Test as an aid to the detection of dementia in old people. British Journal of Psychiatry, 123, 467-470. Ishikawa, S. S., Raine, A., Lencz, T., Bihrle, S., & Lacasse, L. (2001). Autonomic stress reactivity and executive functions in successful and unsuccessful criminal psychopaths from the community. Journal of Abnormal Psychology, 110(3), 423-432. Isingrini, M., & Vazou, F. (1997). Relation between fluid intelligence and frontal lobe functioning in older adults. International Journal of Aging and Human Development, 45(2), 99-109. Ismail, B., Cantor-Graae, E., & McNeil, T. F. (2000). Minor physical anomalies in schizophrenia: Cognitive, neurological and other clinical correlates. Journal of Psychiatric Research, 34(1), 45-56. Iverson, G. L. (2001). Can malingering be identified with the Judgment of Line Orientation Test? Applied Neuropsychology, 8(3), 167-173. Iverson, G. L., Sherman, E. M. S., & Smith-Seemiller, L. (1997a).
Evaluation of a short form of the Visual Form Discrimination Test for assessing cognitive decline associated with dementia. Journal of Cognitive Rehabilitation, 15, 20-21. Iverson, G. L., Slick, D., & Smith-Seemiller, L. (1997b). Screening for visual-perceptual deficits following closed head injury: A short form of the Visual Form Discrimination Test. Brain Injury, 11, 125-128.
Iverson, G. L., Franzen, M. D., & Lovell, M. R. (1999). Normative comparisons for the Controlled Oral Word Association Test following acute traumatic brain injury. Clinical Neuropsychologist, 13(4), 437-441. Iverson, G. L., Woodward, T. S., & Smith-Seemiller, L. (2000). Internal consistency and concurrent validity of two short forms of the Visual Form Discrimination Test. Applied Neuropsychology, 7, 108-110. Iverson, G. L., Lange, R. T., Green, P., & Franzen, M. (2002). Detecting exaggeration and malingering with the Trail Making Test. Clinical Neuropsychologist, 16(3), 398-406. Ivinskis, A., Allen, S., & Shaw, E. (1971). An extension of Wechsler Memory Scale norms to lower age groups. Journal of Clinical Psychology, 27, 354-357. Ivison, D. (1977). The Wechsler Memory Scale: Preliminary findings toward an Australian standardization. Australian Psychologist, 12, 303-312. Ivison, D. (1986). Anna Thompson and the American Liner New York: Some normative data.
Journal of Clinical and Experimental Neuropsychology, 8(3), 317-320. Ivison, D. (1993). Logical memory in the Wechsler Memory Scales: Does the order of passages affect difficulty in an university sample? Clinical Neuropsychologist, 7(2), 215-218. Ivnik, R. J., Sharbrough, F. W., & Laws, E. R. (1987). Effects of anterior temporal lobectomy on cognitive function. Journal of Clinical Psychology, 43, 128-137. Ivnik, R. J., Malec, J. F., Tangalos, E. G., Petersen, R. C., Kokmen, E., & Kurland, L. T. (1990). The Auditory-Verbal Learning Test (AVLT): Norms for ages 55 and older.
Psychological Assessment: A Journal of Consulting and Clinical Psychology, 2, 304-312.
Ivnik, R. J., Smith, G., Tangalos, E., Petersen, R., Kokmen, E., & Kurland, L. (1991). Wechsler Memory Scale: IQ-dependent norms for persons ages 65 to 97 years. Psychological Assessment, 3(2), 156-161. Ivnik, R. J., Malec, J. F., Smith, G. E., Tangalos, E. G., Petersen, R. C., Kokmen, E., et al. (1992a). Mayo's Older Americans Normative Studies: WAIS-R norms for ages 56 to 97. Clinical Neuropsychologist, 6(Suppl.), 1-30. Ivnik, R., Malec, J., Smith, G., Tangalos, E., Petersen, R., Kokmen, E., et al. (1992b). Mayo's Older Americans Normative Studies: WMS-R norms for ages 56 to 94. Clinical Neuropsychologist, 6(Suppl.), 49-82.
Ivnik, R. J., Malec, J. F., Smith, G. E., Tangalos, E. G., Petersen, R. C., Kokmen, E., et al. (1992c). Mayo's Older Americans Normative Studies: Updated AVLT norms for ages 56 to 97. Clinical Neuropsychologist, 6, 83-104. Ivnik, R., Smith, G., Malec, J., Tangalos, E., & Parisi, J. (1993). Comparison of Wechsler vs. Mayo summary scores in a clinical sample. Journal of Clinical Psychology, 49(4), 534-542. Ivnik, R. J., Malec, J. F., Smith, G. E., Tangalos, E. G., & Petersen, R. C. (1996). Neuropsychological tests' norms above age 55: COWAT, BNT, MAE Token, WRAT-R Reading, AMNART, STROOP, TMT, and JLO. Clinical Neuropsychologist, 10(3), 262-278. Ivnik, R., Smith, G., Lucas, J., Tangalos, E., Kokmen, E., & Petersen, R. (1997). Free and cued selective reminding test: MOANS norms. Journal of Clinical and Experimental Neuropsychology, 19(5), 676-691. Ivnik, R., Smith, G., Petersen, R., Boeve, B., Kokmen, E., & Tangalos, E. G. (2000). Diagnostic accuracy of four approaches to interpreting neuropsychological test data. Neuropsychology, 14, 163-177. Ivnik, R. J., Smith, G. E., Cerhan, J. H., Boeve, B. F., Tangalos, E. G., & Petersen, R. C. (2001). Understanding the diagnostic capabilities of cognitive tests. Clinical Neuropsychologist, 15(1), 114-124. Jackson, S. T., & Tompkins, C. A. (1991). Supplemental aphasia tests: Frequency of use and psychometric properties. Clinical Aphasiology, 20, 91-99. Jacobs, D. M., Sano, M., Dooneief, G., Marder, K., Bell, K., & Stern, Y. (1995). Neuropsychological detection and characterization of preclinical Alzheimer's disease. Neurology, 45, 957-962. Jacobs, D. M., Sano, M., Albert, S., Schofield, P., Dooneief, G., & Stern, Y. (1997). Cross-cultural neuropsychological assessment: A comparison of randomly selected, demographically matched cohorts of English- and Spanish-speaking older adults. Journal of Clinical and Experimental Neuropsychology, 19(3), 331-339.
Jacqmin-Gadda, H., Fabrigoule, C., Commenges, D., Letenneur, L., & Dartigue, J. F. (2000). A cognitive screening battery for dementia in the elderly. Journal of Clinical Epidemiology, 53(10), 980-987. Janowsky, J. S., & Thomas-Thrapp, L. J. (1993). Complex figure recall in the elderly: A deficit in memory or constructional strategy? Journal of Clinical and Experimental Neuropsychology, 15(2), 159-169.
Janowsky, J. S., Shimamura, A. P., Kritchevsky, M., & Squire, L. R. (1989). Cognitive impairment following frontal lobe damage and its relevance to human amnesia. Behavioral Neuroscience, 103, 548-560.
Jarvis, P. E., & Barth, J. T. (1984). Halstead-Reitan Test Battery: An interpretive guide. Odessa, FL: Psychological Assessment Resources. Javorsky, D., & Stern, R. A. (1999). Validity of the Boston Qualitative Scoring System (BQSS) for the Rey-Osterrieth Complex Figure in discriminating between Alzheimer's and vascular dementia. Journal of the International Neuropsychological Society, 5, 120. Jenkins, R. L., & Parsons, O. A. (1978). Cognitive deficits in male alcoholics as measured by a modified Wisconsin Card Sorting Test. Alcohol Technical Reports, 7, 76-83. Jenkins, R. L., & Parsons, O. A. (1981). Neuropsychological effect of chronic alcoholism on tactual-spatial performance and memory in males. Alcoholism: Clinical and Experimental Research, 5(1), 26-33. Jenkins, R. L., & Parsons, O. A. (1989). Hemispheric asymmetry in the processing of tactual-spatial material of low verbal codability in normal subjects. Archives of Clinical Neuropsychology, 4(4), 311-321. Jensen, A. (1965). Scoring the Stroop Test. Acta Psychologica, 24, 398-408. Jensen, A., & Rohwer, W. (1966). The Stroop Color-Word Test: A review. Acta Psychologica, 25, 36-93. Jeste, D. V., Harris, M. J., Krull, A., Kuck, J., McAdams, L. A., & Heaton, R. (1995). Clinical and neuropsychological characteristics of patients with late-onset schizophrenia. American Journal of Psychiatry, 152(5), 722-730. Johnson, D. A., Roethig-Johnson, K., & Middleton, J. (1988). Development and evaluation of an attentional test for processing capacity in a normal sample. Journal of Child Psychology and Psychiatry, 2, 199-208. Johnson, S. C., Saykin, A. J., Flashman, L. A., McAllister, T. W., & Sparling, M. B. (2001). Brain activation on fMRI and verbal memory ability: Functional neuroanatomic correlates of CVLT performance. Journal of the International Neuropsychological Society, 7(1), 55-62. Johnson, S. K., DeLuca, J., Diamond, B. J., & Natelson, B. H. (1996).
Selective impairment of auditory processing in chronic fatigue syndrome: A comparison with multiple sclerosis and healthy controls. Perceptual and Motor Skills, 83(1), 51-62.
Johnson-Greene, D., Adams, K. M., Gilman, S., & Junek, L. (2002). Relationship between neuropsychological and emotional functioning in severe chronic alcoholism. Clinical Neuropsychologist, 16(3), 300-309. Johnson-Selfridge, M., Zalewski, C., & Aboudarham, J. (1998). The relationship between ethnicity and word fluency. Archives of Clinical Neuropsychology, 13(3), 319-325. Johnstone, B., & Wilhelm, K. L. (1997). The construct validity of the Hooper Visual Organization Test. Assessment, 4(3), 243-248. Johnstone, B., Holland, D., & Hewett, J. E. (1997). The construct validity of the Category Test: Is it a measure of reasoning or intelligence? Psychological Assessment, 9(1), 28-33. Jones, B., Teng, E., Folstein, M., & Harrison, K. (1993). A new bedside test of cognition for patients with HIV infection. Annals of Internal Medicine, 119, 1001-1004. Jones, B. P., Mirsky, A., & Duncan, C. C. (2003). ROCF performance, attention disorders, and neuropsychiatric disorders. In J. A. Knight (Ed.),
The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Jones-Gotman, M. (1991a). Localization of lesions by neuropsychological testing. Epilepsia, 32(Suppl. 5), S41-S52. Jones-Gotman, M. (1991b). Presurgical psychological assessment in children: Special tests. Journal of Epilepsy, 3(Suppl. 1), 93-102. Jones-Gotman, M., & Milner, B. (1977). Design fluency: The invention of nonsense drawings after focal cortical lesions. Neuropsychologia, 15(Suppl. 5), 653-674. Joyce, E., Blumenthal, S., & Wessely, S. (1996). Memory, attention, and executive function in chronic fatigue syndrome. Journal of Neurology, Neurosurgery, and Psychiatry, 60(5), 495-503. Judd, P. H., & Ruff, R. M. (1993). Neuropsychological dysfunction in borderline personality disorder. Journal of Personality Disorders, 7(4), 275-284. Jung, R. E., Yeo, R. A., Chiulli, S. J., Sibbitt, W. L., Jr., Weers, D. C., Hart, B. L., et al. (1999). Biochemical markers of cognition: A proton MR spectroscopy study of normal human brain. Neuroreport, 10(16), 3327-3331. Kaasa, S., Olsnes, B. T., & Mastekaasa, A. (1988). Neuropsychological evaluation of patients with inoperable non-small cell lung cancer treated with combination chemotherapy and radiotherapy. Acta Oncologica, 27(3), 241-246.
Kalechstein, A. D., van Gorp, W. G., & Rapport, L. J. (1998). Variability in clinical classification of raw test scores across normative data sets. Clinical Neuropsychologist, 12(3), 339-347. Kalinowski, A. G., Weinstein, C. S., & Seidman, L. J. (2003). Organizational and retrieval deficits on the ROCF in schizophrenia. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex
Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Kaltreider, L. B., Cicerello, A. R., Lacritz, L. H., Weiner, M. F., Honig, L. S., Rosenberg, R. N., et al. (2000). Comparison of the Cerad and CVLT list-learning tasks in Alzheimer's disease.
Clinical Neuropsychologist, 14(3), 269-274. Kanaya, T., Scullin, M. H., & Ceci, S. J. (2003). The Flynn effect and U.S. policies: The impact of rising IQ scores on American society via mental retardation diagnoses. American Psychologist, 58(10), 778-790. Kane, R. L., Parsons, O. A., & Goldstein, G. (1985). Statistical relationships and discriminative accuracy of the Halstead-Reitan, Luria-Nebraska, and Wechsler IQ scores in the identification of brain damage. Journal of Clinical and Experimental Neuropsychology, 7(3), 211-223. Kang, S. K. (2000). The applicability of WHO-NCTB in Korea. Neurotoxicology, 21(5), 697-701. Kanter, G. (1984). PASAT performance and intelligence: A relationship? The International Journal of Clinical Neuropsychology, 6, 84. Kaplan, E. (1988). A process approach to neuropsychological assessment. In T. Boll & B. K. Bryant (Eds.), Clinical neuropsychology and
brain function: Research, measurement and practice (Vol. 7). Washington, DC: American Psychological Association. Kaplan, E., Goodglass, H., & Weintraub, S. (1978).
The Boston Naming Test. Experimental edition. Boston: Kaplan and Goodglass. Kaplan, E., Goodglass, H., & Weintraub, S. (1983). The Boston Naming Test. Philadelphia: Lea and Febiger. Kaplan, E., Fein, D., Morris, R., & Delis, D. C. (1991). WAIS-R as a neuropsychological instrument: WAIS-R-NI manual. New York: Psychological Corporation. Kaplan, E., Goodglass, H., & Weintraub, S. (2000). Boston Naming Test, Second edition. Philadelphia: Lippincott Williams & Wilkins. Kareken, D. A., Moberg, P. J., & Gur, R. C. (1996). Proactive inhibition and semantic organization: Relationship with verbal memory in patients with schizophrenia. Journal of the International Neuropsychological Society, 2(6), 486-493.
Kasahara, H., Tanno, M., Yamada, H., Endoh, K., Kobayashi, M., et al. (1993). MRI study of the brain in aged volunteers: T2 high signal intensity lesions and high cortical function. Nippon Ronen Igakkai Zasshi, 30(10), 892-900. Kasahara, H., Yamada, H., Tanno, M., Kobayashi, M., Karasawa, A., Endo, K., et al. (1995). Magnetic resonance imaging study of the brain in aged volunteers: T2 high intensity lesions and higher order cortical function. European Archives of Psychiatry and Clinical Neuroscience, 49(5-6), 273-279. Kaskie, B., & Storandt, M. (1995). Visuospatial deficit in dementia of the Alzheimer type. Archives of Neurology, 52, 422-425. Katz, L. J., Wood, D. S., Goldstein, G., Auchenbach, R. C., & Geckle, M. (1998). The utility of neuropsychological tests in evaluation of attention-deficit/hyperactivity disorder (ADHD) versus depression in adults. Assessment, 5(1), 45-51. Kawas, C. H., Corrada, M. M., Brookmeyer, R., Morrison, A., Resnick, S. M., Zonderman, A. B., et al. (2003). Visual memory predicts Alzheimer's disease more than a decade before diagnosis. Neurology, 60(7), 1089-1093. Kawasaki, Y., Maeda, Y., Suzuki, M., Urata, K., Higashima, M., Kiba, K., et al. (1993). SPECT analysis of regional cerebral blood flow changes in patients with schizophrenia during Wisconsin Card Sorting Tests. Schizophrenia Research, 10, 109-116. Kay, G. G. (2002). Guidelines for the psychological evaluation of air crew personnel. Occupational Medicine, 17(2), 227-245. Kear-Colwell, J. J., & Heller, M. (1978). A normative study of the Wechsler Memory Scale. Journal of Clinical Psychology, 34(2), 437-442. Keenan, P. A., Ricker, J. H., Lindamer, L. A., Jiron, C. C., & Jacobson, M. W. (1996). Relationship between WAIS-R Vocabulary and Performance on the California Verbal Learning Test. Clinical Neuropsychologist, 10(4), 455-458. Kehrer, C. A., Sanchez, P. N., Habif, U., Rosenbaum, J. G., & Townes, B. D. (2000).
Effects of a significant-other observer on neuropsychological test performance. Clinical Neuropsychologist, 14(1), 67-71. Kelland, D. Z., & Lewis, R. F. (1994). Evaluation of the reliability and validity of the Repeatable Cognitive-Perceptual-Motor Battery. Clinical Neuropsychologist, 8(3), 295-308. Kelland, D. Z., & Lewis, R. F. (1996). The Digit Vigilance Test: Reliability, validity, and sensitivity to diazepam. Archives of Clinical Neuropsychology, 11(4), 339-344. Keller, F. R., & Davis, H. P. (1998). Colorado assessment tests (version 1.0) [Computer software]. Colorado Springs: Colorado Assessment Tests.
Kellner, C. H., Rubinow, D. R., & Post, R. M. (1986). Cerebral ventricular size and cognitive impairment in depression. Journal of Affective Disorders, 10(3), 215-219. Kelly, M. D., Kundert, D. K., & Dean, R. S. (1992). Factor analysis and matrix invariance of the HRNB-C Category Test. Archives of Clinical Neuropsychology, 7, 411-418. Kempen, J. H., Kritchevsky, M., & Feldman, S. T. (1994). Effect of visual impairment on neuropsychological test performance. Journal of
Clinical and Experimental Neuropsychology, 16(2), 223-231. Kempler, D., Teng, E. L., Dick, M., Taussig, I. M., & Davis, D. S. (1998). The effects of age, education, and ethnicity on verbal fluency. Journal
of the International Neuropsychological Society, 4(6), 531-538. Kennedy, K. J. (1981). Age effects on Trail Making Test performance. Perceptual and Motor Skills, 52(2), 671-675. Kibby, M. Y., Schmitter-Edgecombe, M., & Long, C. J. (1998). Ecological validity of neuropsychological tests: Focus on the California Verbal Learning Test and the Wisconsin Card Sorting Test. Archives of Clinical Neuropsychology, 13(6), 523-534. Kilander, L., Nyman, H., Boberg, M., & Lithell, H. (2000). The association between low diastolic blood pressure in middle age and cognitive function in old age. A population-based study. Age and Ageing, 29(3), 243-248. Killgore, W. D. S., & Adams, R. L. (1999). Prediction of Boston Naming Test performance from vocabulary scores: Preliminary guidelines for interpretation. Perceptual and Motor Skills, 89(1), 327-337. Kilpatrick, D. G. (1970). The Halstead Category Test of brain dysfunction: Feasibility of a short form. Perceptual and Motor Skills, 30, 577-578. Kim, H., & Na, D. L. (1999). Normative data on the Korean version of the Boston Naming Test.
Journal of Clinical and Experimental Neuropsychology, 21(1), 127-133.
Kim, J. K., & Kang, Y. (1999). Normative study of the Korean-California Verbal Learning Test (K-CVLT). Clinical Neuropsychologist, 13(3), 365-369. Kimbarow, M. L., Vangel, S. J., Jr., & Lichtenberg, P. A. (1996). The influence of demographic variables on normal elderly subjects' performance on the Boston Naming Test. Clinical Aphasiology, 24, 135-144. Kimura, S. D. (1981). A card form of the Reitan-modified Halstead Category Test. Journal of
Consulting and Clinical Psychology, 49(1), 1~146.
Kindermann, S. S., Kalayam, B., Brown, G. G., Burdick, K. E., & Alexopoulos, G. S. (2000). Executive functions and P300 latency in elderly depressed patients and control subjects. American Journal of Geriatric Psychiatry, 8(1), 57-65. King, G. D., Hannay, H. J., Masek, B. J., & Burns, J. W. (1978). Effects of anxiety and sex on neuropsychological tests. Journal of Consulting and Clinical Psychology, 46(2), 375-376. King, J. H., Gfeller, J. D., & Davis, H. P. (1998). Detecting simulated memory impairment with the Rey Auditory Verbal Learning Test: Implications of base rates and study generalizability.
Journal of Clinical and Experimental Neuropsychology, 20(5), 603-612. King, M. C. (1981). Effects of non-focal brain dysfunction on visual memory. Journal of Clinical Psychology, 37(3), 638-643. Kirk, U. (1992a). Confrontation naming in normally developing children: Word-retrieval or word knowledge? Clinical Neuropsychologist, 6(2), 156-170. Kirk, U. (1992b). Evidence for early acquisition of visual organization ability: A developmental study. Clinical Neuropsychologist, 6(2), 171-177. Kirk, U., & Kelly, M. S. (1986). Scoring scale for the Rey-Osterrieth Complex Figure. Paper presented at the meeting of the International Neuropsychological Society, Denver, CO. Kirshner, H. S., Webb, W. G., & Kelly, M. P. (1984). The naming disorder of dementia. Neuropsychologia, 22, 23-30. Kishi, R., Harabuchi, I., Katakura, Y., Ikeda, T., & Miyake, H. (1993). Neurobehavioral effects of chronic occupational exposure to organic solvents among Japanese industrial painters. Environmental Research, 62(2), 303-313. Kivircik, B. B., Yener, G. G., Alptekin, K., & Aydin, H. (2003). Event-related potentials and neuropsychological tests in obsessive-compulsive disorder. Progress in Neuro-Psychopharmacology and Biological Psychiatry, 27(4), 601-606. Kivling-Boden, G., & Sundbom, E. (2003). Cognitive abilities related to post-traumatic symptoms among refugees from the former Yugoslavia in psychiatric treatment. Nordic Journal of Psychiatry, 57(3), 191-198. Kixmiller, J. S., Verfaellie, M., Mather, M. M., & Cermak, L. S. (2000). Role of perceptual and organizational factors in amnesics' recall of the Rey-Osterrieth Complex Figure: A comparison of three amnesic groups. Journal of Clinical and Experimental Neuropsychology, 22(2), 198-207.
Klein, M., Ponds, R. W. H. M., Houx, P. J., & Jolles, J. (1997). Effect of test duration on age-related differences in Stroop interference.
Journal of Clinical and Experimental Neuropsychology, 19, 77-82. Klicpera, C. (1983). Poor planning as a characteristic of problem-solving behavior in dyslexic children: A study with the Rey-Osterrieth Complex Figure test. Acta Paedopsychiatrica, 49, 7~2.
Klimczak, N. J., Donovick, P. J., & Burright, R. (1997). The malingering of multiple sclerosis and mild traumatic brain injury. Brain Injury, 11(5), 343-352. Klonoff, H., & Kennedy, M. (1965). Memory and perceptual functioning in octogenarians and nonagenarians in the community. Journal of Gerontology, 20, 328-333. Klonoff, H., & Kennedy, M. (1966). A comparative study of cognitive functioning in old age. Journal of Gerontology, 21, 239-243. Klove, H. (1974). Validation studies in adult clinical neuropsychology. In R. M. Reitan & L. A. Davison (Eds.), Clinical Neuropsychology: Current Status and Applications (pp. 211-236). Washington, DC: Winston. Klusman, L. E., Cripe, L. I., & Dodrill, C. B. (1989). Analysis of errors on the Trail Making Test. Perceptual and Motor Skills, 68, 1199-1204. Knesevich, J. W., LaBarge, E., & Edwards, D. (1986). Predictive value of the Boston Naming Test in mild senile dementia of the Alzheimer type. Psychiatry Research, 19, 155-161. Knight, J. A. (2003). The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Knight, J. A., Kaplan, E., & Ireland, L. (2003). Survey findings of Rey-Osterrieth Complex Figure usage. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Knights, R. M., & Moule, A. D. (1967). Normative and reliability data on finger and foot tapping in children. Perceptual and Motor Skills, 25, 717-720. Koffler, S. P., & Zehler, D. (1985). Normative data for the hand dynamometer. Perceptual and Motor Skills, 61, 589-590. Kohn, S. E., & Goodglass, H. (1985). Picture-naming in aphasia. Brain and Language, 24(2), 266-283. Kohnert, K. J., Hernandez, A. E., & Bates, E. (1998). Bilingual performance on the Boston
Naming Test: Preliminary norms in Spanish and English. Brain and Language, 65(3), 422-440. Kongs, S. K., Thompson, L. L., Iverson, G. L., & Heaton, R. K. (2000). Wisconsin Card Sorting Test-64 Card Version. Lutz, FL: Psychological Assessment Resources. Konishi, S., Nakajima, K., Uchida, I., Kameyama, M., Nakahara, K., Sekihara, K., et al. (1998). Transient activation of inferior prefrontal cortex during cognitive set shifting. Nature Neuroscience, 1, 80-84. Koren, D., Seidman, L. J., Harrison, R. H., Lyons, M. J., Kremen, W. S., Caplan, B., et al. (1998). Factor structure of the Wisconsin Card Sorting Test: Dimensions of deficit in schizophrenia. Neuropsychology, 12(2), 289-302. Kortte, K. B., Horner, M. D., & Windham, W. K. (2002). The Trail Making Test, part B: Cognitive flexibility or ability to maintain set? Applied Neuropsychology, 9(2), 106-109. Koss, E., Ober, B. A., Delis, D. C., & Friedland, R. P. (1984). The Stroop Color-Word Test: Indicator of dementia severity. International Journal of Neuroscience, 24, 53-61. Kozel, J., & Meyers, J. E. (1998). A cross-validation study of the Victoria Revision of the Category Test. Archives of Clinical Neuropsychology, 13(3), 327-332. Kozora, E., & Cullum, C. M. (1995). Generative naming in normal aging: Total output and qualitative changes using phonemic and semantic constraints. Clinical Neuropsychologist, 9(4), 313-320. Kramer, A. F., Humphrey, D. G., Larish, J. F., Logan, G. D., & Strayer, D. L. (1994). Aging and inhibition: Beyond a unitary view of inhibitory processing in attention. Psychology and Aging, 9(4), 491-512. Kramer, J. H., Delis, D. C., & Daniel, M. H. (1988). Sex differences in verbal learning. Journal of Clinical Psychology, 44(6), 907-915. Krebs, R. (1994). The Hopkins Verbal Learning Test: An alternative to the MMSE? Gerontologist, 34(5), 692. Kritz-Silverstein, D., & Barrett-Connor, E. (2002). Hysterectomy, oophorectomy, and cognitive function in older women.
Journal of the American Geriatrics Society, 50(1), 55-61. Krupp, L. B., Sliwinski, M., Masur, D. M., Friedberg, F., & Coyle, P. K. (1994). Cognitive functioning and depression in patients with chronic fatigue syndrome and multiple sclerosis. Archives of Neurology, 51(7), 7~710. Kuehn, S. M., & Snow, W. G. (1992). Are the Rey and Taylor figures equivalent? Archives of Clinical Neuropsychology, 7, 445-448.
Kujala, P., Portin, R., Revonsuo, A., & Ruutiainen, J. (1995). Attention related performance in two cognitively different subgroups of patients with multiple sclerosis. Journal of Neurology, Neurosurgery, and Psychiatry, 59(1), 77-82. Kulik, J. (1983). Review of G. V. Glass et al., Meta-analysis in social research [Book review]. Evaluation News, 4, 101-105. Kumar, P., Gupta, B. N., Pandya, K. P., & Clerk, S. H. (1988). Behavioral studies in petrol pump workers. International Archives of Occupational and Environmental Health, 61(1-2), 35-38. Kupke, T. (1983). Effect of subject sex, examiner sex, and test apparatus on Halstead Category and Tactual Performance tests. Journal of Consulting and Clinical Psychology, 51(4), 624-626. Kurylo, M., Temple, R. O., Elliott, T. R., & Crawford, D. (2001). Rey Auditory Verbal Learning Test (AVLT) performance in individuals with recent-onset spinal cord injury. Rehabilitation Psychology, 46(3), 247-261. Kuslansky, G., Katz, M., Verghese, J., Hall, C. B., Lapuerta, P., LaRuffa, G., et al. (2004). Detecting dementia with the Hopkins Verbal Learning Test and the Mini-Mental State Examination. Archives of Clinical Neuropsychology, 19(1), 89-104. Kuzis, G., Sabe, L., Tiberti, C., Merello, M., Leiguarda, R., & Starkstein, S. E. (1999). Explicit and implicit learning in patients with Alzheimer disease and Parkinson disease with dementia.
Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 12(4), 265-269. Laatsch, L., & Choca, J. (1991). Understanding the Halstead Category Test by using item analysis.
Psychological Assessment: A Journal of Consulting and Clinical Psychology, 3, 701-704. Labarge, A. S., McCaffrey, R. J., & Brown, T. A. (2003). Neuropsychologists' abilities to determine the predictive value of diagnostic tests. Archives of Clinical Neuropsychology, 18(2), 165-175. LaBarge, E., Edwards, D., & Knesevich, J. W. (1986). Performance of normal elderly on the Boston Naming Test. Brain and Language, 27, 380-384. LaBarge, E., Balota, D. A., Storandt, M., & Smith, D. S. (1992). An analysis of confrontation naming errors in senile dementia of the Alzheimer type. Neuropsychology, 6(1), 77-95. Labreche, T. M. (1983). The Victoria Revision of the Halstead Category Test. Unpublished doctoral dissertation, University of Victoria, Canada. Lacritz, L. H., & Cullum, C. M. (1998). The Hopkins Verbal Learning Test and CVLT:
A preliminary comparison. Archives of Clinical Neuropsychology, 13(1), 623-628. Lacritz, L. H., & Cullum, M. (2003). The WAIS-III and WMS-III: Practical issues and frequently asked questions. In D. Tulsky, D. Saklofske, G. Chelune, R. Heaton, R. Ivnik, R. Bornstein, et al. (Eds.), Clinical interpretation of the WAIS-III and WMS-III. San Diego: Academic Press. Lacritz, L. H., Cullum, C. M., Frol, A. B., Dewey, R. B., Jr., & Giller, C. A. (2000). Neuropsychological outcome following unilateral stereotactic pallidotomy in intractable Parkinson's disease [Special Issue: Neurobehavioral issues in the neurosurgical treatment of movement disorders, Part II: Pallidotomy and pallidal stimulation]. Brain and Cognition, 42(3), 364-378. Lacritz, L. H., Cullum, C. M., Weiner, M. F., & Rosenberg, R. N. (2001). Comparison of the Hopkins Verbal Learning Test-Revised to the California Verbal Learning Test in Alzheimer's disease. Applied Neuropsychology, 8(3), 180-184. Lacy, M. A., Gore, P. A., Jr., Pliskin, N. H., Henry, G. K., Heilbronner, R. L., & Hamer, D. P. (1996). Verbal fluency task equivalence. Clinical Neuropsychologist, 10(3), 305-308. Lafleche, G., & Albert, M. S. (1995). Executive function deficits in mild Alzheimer's disease. Neuropsychology, 9(3), 313-320. Laiacona, M., Inzaghi, M. G., De Tanti, A., & Capitani, E. (2000). Wisconsin Card Sorting Test: A new global score, with Italian norms, and its relationship with the Weigl sorting test. Neurological Sciences, 21(5), 279-291. Lamar, M., Zonderman, A. B., & Resnick, S. (2002). Contribution of specific cognitive processes to executive functioning in an aging population. Neuropsychology, 16(2), 156-1~. Lamberty, G. J., Putnam, S. H., Chatel, D. M., Bieliauskas, L. A., et al. (1994). Derived Trail Making Test indices: A preliminary report.
Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7(3), 230-234. Lannoo, E., & Vingerhoets, G. (1997). Flemish normative data on common neuropsychological tests: Influence of age, education, and gender. Psychologica Belgica, 37(3), 141-155. Lansdell, H., & Donnelly, E. F. (1977). Factor analysis of the Wechsler Adult Intelligence Scale subtests and the Halstead-Reitan Category and Tapping tests. Journal of Consulting and Clinical Psychology, 45, 412-416. Lansing, A. E., Ivnik, R. J., Cullum, C. M., & Randolph, C. (1999). An empirically derived short form of the Boston Naming Test. Archives of Clinical Neuropsychology, 14(6), 481~7.
Larrabee, G. J. (2003). Detection of malingering using atypical performance patterns on standard neuropsychological tests. Clinical Neuropsychologist, 17(3), 410-425. Larrabee, G. J., & Curtiss, G. (1995). Construct validity of various verbal and visual memory tests. Journal of Clinical and Experimental Neuropsychology, 17(4), 536-547. Larrabee, G. J., & Levin, H. S. (1986). Memory self-ratings and objective test performance in a normal elderly sample. Journal of Clinical and Experimental Neuropsychology, 8(3), 27~284. Larrabee, G. J., Largen, J. W., & Levin, H. S. (1985). Sensitivity of age-decline resistant ("hold") WAIS subtests to Alzheimer's disease.
Journal of Clinical and Experimental Neuropsychology, 7(5), 497-504. Larrabee, G. J., Levin, H. S., & High, W. M. (1986). Senescent forgetfulness: A quantitative study. Developmental Neuropsychology, 2(4), 373-385. Larrabee, G. J., Trahan, D. E., Curtiss, G., & Levin, H. S. (1988). Normative data for the Verbal Selective Reminding Test. Neuropsychology, 2(3-4), 173-182. Larrabee, G. J., Trahan, D. E., & Levin, H. S. (2000). Normative data for a six-trial administration of the Verbal Selective Reminding Test. Clinical Neuropsychologist, 14(1), 110-118. Larrain, C. M., & Cimino, C. R. (1998). Alternate forms of the Boston Naming Test in Alzheimer's disease. Clinical Neuropsychologist, 12(4), 5~.
La Rue, A., D'Elia, L., Clark, E., Spar, J., & Jarvik, L. (1986). Clinical tests of memory in dementia, depression, and healthy aging. Psychology and Aging, 1(1), 69-77. LaRue, A., Romero, L., Ortiz, I., Liang, H. C., & Lindeman, R. D. (1999). Neuropsychological performance of Hispanic and non-Hispanic older adults: An epidemiologic survey. Clinical Neuropsychologist, 13(4), 474-486. Lawton, M. P., & Brody, E. M. (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. Gerontologist, 9, 179-188. Le Carret, N., Rainville, C., Lechevallier, N., Lafont, S., Letenneur, L., & Fabrigoule, C. (2003). Influence of education on the Benton Visual Retention Test performance as mediated by a strategic search component. Brain and Cognition, 53(2), 408-411. Leckliter, I. N., & Matarazzo, J. D. (1989). The influence of age, education, IQ, gender, and alcohol abuse on Halstead-Reitan Neuropsychological
Test Battery performance. Journal of Clinical Psychology, 45(4), 485-512. LeDorze, G., & Durocher, J. (1992). The effects of age, educational level, and stimulus length on naming in normal subjects. Journal of Speech Language Pathology and Audiology, 16, 21-29. Lee, G. P., Strauss, E., Loring, D. W., McCloskey, L., et al. (1997). Sensitivity of figural fluency on the Five-Point Test to focal neurological dysfunction. Clinical Neuropsychologist, 11(1), 59-68. Lee, S.-H., & Lee, S. H. (1993). A study on the neurobehavioral effects of occupational exposure to organic solvents in Korean workers. Environmental Research, 60, 227-232. Lee, T. M. C., & Chan, C. C. H. (2000a). Are Trail Making and Color Trails tests of equivalent constructs? Journal of Clinical and Experimental Neuropsychology, 22(4), 529-534. Lee, T. M. C., & Chan, C. C. H. (2000b). Comparison of the Trail Making and Color Trails tests in a Chinese context: A preliminary report. Perceptual and Motor Skills, 90(1), 187-190. Lee, T. M. C., Cheung, C. C. Y., Chan, J. K. P., & Chan, C. C. H. (2000). Trail making across languages. Journal of Clinical and Experimental Neuropsychology, 22(6), 772-778. Lee, T. M. C., Yuen, K. S. L., & Chan, C. C. H. (2002). Normative data for neuropsychological measures of fluency, attention, and memory measures for Hong Kong Chinese. Journal of
Clinical and Experimental Neuropsychology, 24(5), 615-632. Lees, A. J., & Smith, E. (1983). Cognitive deficits in
the early stages of Parkinson's disease. Brain, 106, 257-270. Lees-Haley, P., Smith, H., Williams, C., & Dunn, J. (1996). Forensic neuropsychological test usage: An empirical survey. Archives of Clinical Neuropsychology, 11(1), 45-51. Leggio, M. G., Silveri, M. C., Petrosini, L., & Molinari, M. (2000). Phonological grouping is specifically affected in cerebellar patients: A verbal fluency study. Journal of Neurology, Neurosurgery, and Psychiatry, 69(1), 102-106. Leng, N. R. C., & Parkin, A. J. (1989). Aetiological variation in the amnesic syndrome: Comparisons using the Brown-Peterson task. Cortex, 25, 251-259. Lenzenweger, M. F., & Korfine, L. (1994). Perceptual aberrations, schizotypy, and the Wisconsin Card Sorting Test. Schizophrenia Bulletin, 20(2), 345-357. Leonberger, F. T., Nicks, S. D., Goldfader, P. R., & Munz, D. C. (1991). Factor analysis of the Wechsler Memory Scale-Revised and the Halstead-Reitan Neuropsychological Battery. Clinical Neuropsychologist, 5, 83-88. Levin, B. E., Llabre, M. M., Reisman, S., Weiner,
W. J., Sanchez-Ramos, J., Singer, C., et al. (1991). Visuospatial impairment in Parkinson's disease. Neurology, 41(3), 365-369. Levin, H. S., Benton, A. L., & Grossman, R. G. (1982). Neurobehavioral consequences of closed head injury. New York: Oxford University Press. Levin, H. S., Mattis, S., Ruff, R. M., Eisenberg, H. M., Marshall, L. F., Tabaddor, K., High, W. M., & Frankowski, R. F. (1987). Neurobehavioral outcome following minor head injury: A three-center study. Journal of Neurosurgery, 66, 234-243. Levin, H. S., Song, J., Ewing-Cobbs, L., Chapman, S. B., & Mendelsohn, D. (2001). Word fluency in relation to severity of closed head injury, associated frontal brain lesions, and age at injury in children. Neuropsychologia, 39(2), 122-131. Lewandowski, L., Kobus, D. A., Church, K. L., & Van Orden, K. (1982). Neuropsychological implications of hand preference versus hand grip performance. Perceptual and Motor Skills, 55(1), 311-314. Lewis, F. C., & Soares, L. (2000). Relationship between semantic paraphasias and related nonverbal factors. Perceptual and Motor Skills, 91(2), 366-372. Lewis, M. B., & Howdle, P. D. (2003). Cognitive dysfunction and health-related quality of life in long-term liver transplant survivors. Liver Transplantation and Surgery, 9(11), 1145-1148. Lewis, R. (1995). Digit Vigilance Test: Professional User's Guide. Odessa, FL: Psychological Assessment Resources. Lewis, R., & Rennick, P. (1979). Manual for the
Repeatable Cognitive-Perceptual-Motor Battery. Grosse Pointe Park, MI: Axon. Lewis, S., Campbell, A., Takushi-Chinen, R., Brown, A., Dennis, G., Wood, D., et al. (1997). Visual Organization Test performance in an African American population with acute unilateral cerebral lesions. International Journal of Neuroscience, 91(3-4), 295-302. Lezak, M. D. (1976). Neuropsychological assessment. New York: Oxford University Press. Lezak, M. D. (1982). The test-retest stability and
reliability of some tests commonly used in neuropsychological assessment. Paper presented at
the meeting of the International Neuropsychological Society, Deauville, France. Lezak, M. D. (1983). Neuropsychological assessment (2nd ed.). New York: Oxford University Press. Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Lezak, M. D., Howieson, D. B., & Loring, D. W. (2004). Neuropsychological assessment (4th ed.). New York: Oxford University Press. Liberman, J. N., Stewart, W., Selnes, O., & Gordon, B. (1994). Rater agreement for the Rey-Osterrieth Complex Figure test. Journal of Clinical Psychology, 50(4), 615-624. Libon, D. J., Glosser, G., Malamut, B. L., Kaplan, E., Goldberg, E., Swenson, R., et al. (1994). Age, executive functions, and visuospatial functioning in healthy older adults. Neuropsychology, 8, 38-43. Libon, D. J., Freeman, R. Q., Giovannetti, T., Lamar, M., Cloud, B. S., Stern, R. A., et al. (2003). The ROCF and visuoconstructional impairment in cortical and subcortical dementia. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex
Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Lichtenberg, P., & Christensen, B. (1992). Extended normative data for the Logical Memory subtest of the Wechsler Memory Scale-Revised: Responses from a sample of cognitively intact elderly medical patients. Psychological Reports, 71, 745-746. Lichtenberg, P. A., Ross, T., & Christensen, B. (1994). Preliminary normative data on the Boston Naming Test for an older urban population. Clinical Neuropsychologist, 8(1), 109-111. Lichtenberg, P. A., Ross, T. P., Youngblade, L., & Vangel, S. J. (1998). Normative studies research project test battery: Detection of dementia in African American and European American urban elderly patients. Clinical Neuropsychologist, 12(2), 146-154. Light, R. J., & Pillemer, D. B. (1984). Summing up: The science of reviewing research. Cambridge, MA: Harvard University Press. Lin, Y. G., & Rennick, P. M. (1974). Correlations between performance on the Category Test and the Wechsler Adult Intelligence Scale in an epileptic sample. Journal of Clinical Psychology, 31(1), 62-65. Lineweaver, T. T., Bondi, M. W., Thomas, R. G., & Salmon, D. P. (1999). A normative study of Nelson's (1976) modified version of the Wisconsin Card Sorting Test in healthy older adults. Clinical Neuropsychologist, 13(3), 328-347. Little, A. J., Templer, D. I., Persel, C. S., & Ashley, M. J. (1996). Feasibility of the neuropsychological spectrum in prediction of outcome following head injury. Journal of Clinical
Psychology, 52, 455-460.
Llorente, A. M., Williams, J., Satz, P., & D'Elia, L. F. (2003). Children's Color Trails Test-Professional Manual. Lutz, FL: Psychological Assessment Resources. Loberg, T. (1980). Alcohol misuse and neuropsychological deficits in men. Journal of Studies on Alcohol, 41(1), 119-128. Locascio, J. L., Growdon, J. H., & Corkin, S. (1995). Cognitive test performance in detecting, staging, and tracking Alzheimer's disease. Archives of Neurology, 52, 1087-1099. Loewenstein, D. A., Rubert, M. P., Argueelles, T., & Duara, R. (1995). Neuropsychological test performance and prediction of functional capacities among Spanish-speaking and English-speaking patients with dementia. Archives of Clinical Neuropsychology, 10(2), 75-88. Loewenstein, D. A., Barker, W. W., Harwood, D. G., Luis, C., Acevedo, A., Rodriguez, I., et al. (2000). Utility of a modified Mini-Mental State Examination with extended delayed recall in screening for mild cognitive impairment and dementia among community dwelling elders. International Journal of Geriatric Psychiatry, 15(5), 434-440. Logue, P. E., & Allen, K. (1971). WAIS-predicted Category Test scores with the Halstead Neuropsychological Battery. Perceptual and Motor Skills, 33, 1~1096.
Lombardi, W. J., Andreason, P. J., Sirocco, K. Y., Rio, D. E., Gross, R. E., Umhau, J. C., et al. (1999). Wisconsin Card Sorting Test performance following head injury: Dorsolateral frontostriatal circuit activity predicts perseveration.
Journal of Clinical and Experimental Neuropsychology, 21(1), 2-16.
Loo, H., Bonnel, J., Etevenon, P., Benyacoub, J., & Slowen, P. (1981). Intellectual efficiency in manic-depressive patients treated with lithium: A control study. Acta Psychiatrica Scandinavica, 64(5), 423-430. Loong, J. W. K. (1990). The Wisconsin Card Sorting Test [Computer software]. San Luis Obispo, CA: Loong. Loonstra, A. S., Tarlow, A. R., & Sellers, A. H. (2001). COWAT metanorms across age, education, and gender. Applied Neuropsychology, 8(3), 161-166. Lopez, M. N., Arias, G. P., Hunter, M. A., Charter, R. A., & Scott, R. R. (2003). Boston Naming Test: Problems with administration and scoring. Psychological Reports, 92(2), 468-472. Lopez, M. N., Lazar, M. D., & Oh, S. (2003). Psychometric properties of the Hooper Visual Organization Test. Assessment, 10(1), 66-70. Lopez-Carlos, E. (1999). Validity study of the World Health Organization/University of California at
Los Angeles Auditory Verbal Learning Test.
Dissertation Abstracts International. Section B: The Sciences and Engineering, 59(9-B), 5095. Lopez-Carlos, E., Salazar, X. F., Villasenor, T., Saucedo, C., & Peña, R. (2003). Validez y datos
normativos de la pruebas de nominacion en personas con educacion limitada. Poster presented at the Neuropsicologia-Congreso Latinoamericano por la Sociedad Latinoamericana de Neuropsicologia. Toronto, Canada. Lorig, T. S., Gehring, W. J., & Hyrn, D. L. (1986). Period analysis of the EEG during performance of the Trail Making Test. International Journal of Clinical Neuropsychology, 8(3), 97-99. Loring, D. W. (1989). The Wechsler Memory Scale-Revised, or the Wechsler Memory Scale-Revisited? Clinical Neuropsychologist, 3(1), 59-69. Loring, D. W., & Meador, K. J. (2003). The Medical College of Georgia (MCG) Complex Figures: Four forms for follow-up. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications.
Lutz, FL: Psychological Assessment Resources. Loring, D. W., Lee, G. P., & Meador, K. J. (1988). Revising the Rey-Osterrieth: Rating right hemisphere recall. Archives of Clinical Neuropsychology, 3, 239-247. Loring, D. W., Martin, R. L., Meador, K. J., & Lee, G. P. (1990). Psychometric construction of the Rey-Osterrieth Complex Figure: Methodological considerations and interrater reliability. Archives of Clinical Neuropsychology, 5, 1-14. LoSasso, G. L., Rapport, L. J., Axelrod, B. N., & Reeder, K. P. (1998). Intermanual and alternate-form equivalence on the Trail Making Tests.
Journal of Clinical and Experimental Neuropsychology, 20(1), 107-110. Lu, L., & Bigler, E. D. (2000). Performance on original and a Chinese version of Trail Making Test part B: A normative bilingual sample. Applied Neuropsychology, 7(4), 243-246. Lu, L., & Bigler, E. D. (2002). Normative data on Trail Making Test for neurologically normal, Chinese-speaking adults. Applied Neuropsychology, 9(4), 219-225. Lu, P. H., Boone, K. B., Cozolino, L., & Mitchell, C. (2003). Effectiveness of the Rey-Osterrieth Complex Figure Test and the Meyers and Meyers Recognition Trial in the detection of suspect effort. Clinical Neuropsychologist, 17(3), 426-440. Lucas, J. A., Ivnik, R. J., Smith, G. E., Bohac, D. L., Tangalos, E. G., Graff-Radford, N. R., et al.
(1998). Mayo's Older Americans Normative Studies: Category fluency norms. Journal of
Clinical and Experimental Neuropsychology, 20(2), 194-200. Lucas, M. D., & Sonnenberg, B. R. (1996). Neuropsychological trends in the Parkinsonism-plus syndrome: A pilot study. Journal of Clinical and Experimental Neuropsychology, 18(1), 88-97. Ludgate, J., Keating, J., O'Dwyer, R., & Callaghan, N. (1985). An improvement in cognitive function following polypharmacy reduction in a group of epileptic patients. Acta Neurologica Scandinavica, 71(6), 448-452. Luria, A. R. (1980). Higher cortical functions in man. New York: Basic Books. Lynch, W. J. (2002). Assessment in traumatic brain injury: Update on recent developments. Journal of Head Trauma Rehabilitation, 17(1), 66-70. Lyness, S. A., Eaton, E. M., & Schneider, L. S. (1994). Cognitive performance in older and middle-aged depressed outpatients and controls.
Journal of Gerontology: Psychological Sciences, 49, P129-P136. Lysaker, P., & Bell, M. (1994). Insight and cognitive impairment in schizophrenia: Performance on repeated administration of the Wisconsin Card Sorting Test. Journal of Nervous and
Mental Disorders, 182, 656-660. Lysaker, P., Bell, M., & Beam-Goulet, J. (1995). Wisconsin Card Sorting Test and performance in schizophrenia. Psychiatry Research, 56, 45-51. Lyvers, M., & Maltzman, I. (1991). Selective effects of alcohol on Wisconsin Card Sorting Test performance. British Journal of Addiction, 86(4), 399-407. MacInnes, W. D., Golden, C. J., McFadden, J., & Wilkening, G. N. (1983a). Relationships between the Booklet Category Test and the Wisconsin Card Sorting Test. International Journal of Neuroscience, 21(34), 257-264. MacInnes, W. D., McFadden, J. M., & Golden, C. J. (1983b). A short-portable version of the Category Test. International Journal of Neuroscience, 18, 41-44.
Mack, J. L., & Carlson, N. J. (1978). Conceptual deficits and aging: The Category Test. Perceptual and Motor Skills, 46, 1~128.
Mack, W. J., Freed, D. M., Williams, B. W., & Henderson, V. W. (1992). Boston Naming Test: Shortened versions for use in Alzheimer's disease. Journals of Gerontology, 47(3), P154-P158. MacKay, A. J., Connor, L. T., Albert, M. L., & Obler, L. K. (2002). Noun and verb retrieval in healthy aging. Journal of the International Neuropsychological Society, 8(6), 764-770.
MacLeod, C. (1991). Half century of research on the Stroop effect: An integrative review. Psychological Bulletin, 109, 163-203. Maddocks, D., & Saling, M. (1996). Neuropsychological deficits following concussion. Brain Injury, 10(2), 99-103. Madison, L. S., George, C., & Moeschler, J. B. (1986). Cognitive functioning in the fragile-X syndrome: A study of intellectual, memory and communication skills. Journal of Mental Deficiency Research, 30, 12~148.
Maj, M., Janssen, R., Satz, P., Zaudig, M., Starace, F., Boor, D., et al. (1991). The World Health Organization's cross-cultural study on neuropsychiatric aspects of infection with the human immunodeficiency virus (HIV-1): Preparation and pilot phase. British Journal of Psychiatry, 159, 351-356.
Maj, M., D'Elia, L., Satz, P., Janssen, R., Zaudig, M., Uchiyama, C., et al. (1993). Evaluation of two new neuropsychological tests designed to minimize cultural bias in the assessment of HIV-1 seropositive persons: A WHO study. Archives of Clinical Neuropsychology, 8, 123-135. Majdan, A., Sziklas, V., & Jones-Gotman, M. (1996). Performance of healthy subjects and patients with resection from the anterior temporal lobe on matched tests of verbal and visuoperceptual learning. Journal of Clinical and Experimental Neuropsychology, 18(3), 416-430. Malec, J., Ivnik, R., Smith, G., Tangalos, E., Petersen, R., Kokmen, E., et al. (1992). Mayo's Older Americans Normative Studies: Utility of corrections for age and education for the WAIS-R. Clinical Neuropsychologist, 6(Suppl.), 31-47. Malina, A., Regan, T., Bowers, D., & Millis, S. (2001). Psychometric analysis of the Visual Form Discrimination Test. Perceptual and Motor Skills, 92(2), 449-455. Malloy, P. (1987). Frontal lobe dysfunction in obsessive-compulsive disorder. In Perecman, E. (Ed.), The frontal lobes revisited (pp. rol-223). Hillsdale, NJ: Lawrence Erlbaum. Manly, J. J., Miller, S. W., Heaton, R. K., Byrd, D., Reilly, J., Velasquez, R. J., et al. (1998). The effect of African-American acculturation on neuropsychological test performance in normal and HIV-positive individuals. Journal of the International Neuropsychological Society, 4(3), 291-302. Manly, J. J., Jacobs, D. M., Sano, M., Bell, K., Merchant, C. A., Small, S. A., et al. (1999). Effect of literacy on neuropsychological test performance in nondemented, education-matched elders. Journal of the International Neuropsychological Society, 5(3), 191-202.
Manly, J. J., Jacobs, D. M., Touradji, P., Small, S. A., & Stern, Y. (2002). Reading level attenuates differences in neuropsychological test performance between African American and white elders. Journal of the International Neuropsychological Society, 8(3), 341-348. Marcopulos, B. A., McLain, C. A., & Giuliano, A. J. (1997). Cognitive impairment or inadequate norms? A study of healthy, rural, older adults with limited education. Clinical Neuropsychologist, 11(2), 111-131. Mares, M. (2002). Demographic predictors of verbal learning and memory indices on the World Health Organization-University of California at Los Angeles Auditory Verbal Learning Test in a Hispanic sample. Dissertation Abstracts International. Section B: The Sciences and Engineering, 62(10-B), 4793. Margolin, D., Pate, D. S., Friedrich, F. J., & Elia, E. (1990). Dysnomia in dementia and in stroke patients: Different underlying cognitive deficits.
Journal of Clinical and Experimental Neuropsychology, 12(4), 597-612. Marie, R. M., Rioux, P., Eustache, F., Travere, J. M., et al. (1995). Clues and functional neuroanatomy of verbal working memory: A study about the resting brain glucose metabolism in Parkinson's disease. European Journal of Neurology, 2, 83-94. Marien, P., Mampaey, E., Vervaet, A., Saerens, J., & De Deyn, P. P. (1998). Normative data for the Boston Naming Test in native Dutch-speaking Belgian elderly. Brain and Language, 65(3), 447-467. Marin, G., & Marin, B. V. (1991). Research with Hispanic populations. Newbury Park, CA: Sage Publications, Inc. Martin, A., & Fedio, P. (1983). Word production and comprehension in Alzheimer's disease: The breakdown of semantic knowledge. Brain and Language, 19, 124-141. Martin, D. J., Oren, Z., & Boone, K. (1991). Major depressive's and dysthymic's performance on the Wisconsin Card Sorting Test. Journal of Clinical Psychology, 47, 685-690. Martin, N. J., & Franzen, M. D. (1989). The effect of anxiety on neuropsychological function. International Journal of Neuropsychology, 11, 1-8. Martin, P. W., & Greene, R. L. (1978, April). Interjudge reliability of memory and location scores on
the Halstead-Reitan Tactual Performance Test. Paper presented at the meeting of the Southwestern Psychological Association, New Orleans. Martin, R. C., Sawrie, S., Hugg, J., Gilliam, F., Faught, E., & Kuzniecky, R. (1999). Cognitive
correlates of 1H MRSI-detected hippocampal abnormalities in temporal lobe epilepsy. Neurology, 53(9), 2052-2058. Martin, R. C., Sawrie, S. M., Edwards, R., Roth, D. L., Faught, E., Kuzniecky, R. I., et al. (2000). Investigation of executive function change following anterior temporal lobectomy: Selective normalization of verbal fluency. Neuropsychology, 14(4), 501-508. Martin, S. E., Engleman, H. M., Deary, I. H., & Douglas, N. J. (1996). The effect of sleep fragmentation on daytime function. American Journal of Respiratory and Critical Care Medicine, 153, 1328-1332. Martin, T. A., Hoffman, N. M., & Donders, J. (2003). Clinical utility of the Trail Making Test ratio score. Applied Neuropsychology, 10(3), 163-169. Martinez, A. A., Penades, R., Vieta, E., Colom, F., Reinares, M., Benabarre, A., Salamero, M., & Gastro, C. (2002). Executive function in patients with remitted bipolar and schizophrenia and its relationship with functional outcome. Psychotherapy and Psychosomatics, 71, 39-46. Mason, C. F., & Ganzler, H. (1964). Adult norms for the Shipley Institute of Living Scale and Hooper Visual Organization Test based on age and education. Journal of Gerontology, 19, 419-424. Massman, P. J., & Doody, R. S. (1996). Hemispheric asymmetry in Alzheimer's disease is apparent in motor functioning. Journal of Clinical and Experimental Neuropsychology, 18(1), 110-121. Mast, B. T., MacNeill, S. E., & Lichtenberg, P. A. (2000). Clinical utility of the normative studies research project test battery among vascular dementia patients. Clinical Neuropsychologist, 10, 173-180. Masur, D. M., Fuld, P. A., Blau, A. D., Thal, L. J., Levin, H. S., & Aronson, M. K. (1989). Distinguishing normal and demented elderly with the Selective Reminding Test. Journal of Clinical and Experimental Neuropsychology, 11(5), 615-630.
Masur, D. M., Fuld, P. A., Blau, A. D., Crystal, H., & Aronson, M. K. (1990). Predicting development of dementia in the elderly with the Selective Reminding Test. Journal of Clinical and Experimental Neuropsychology, 12(4), 529-538. Mataix-Cols, D., Barrios, M., Sanchez-Turet, M., Vallejo, J., & Junque, C. (1999). Reduced design fluency in subclinical obsessive-compulsive subjects. Journal of Neuropsychiatry and Clinical Neurosciences, 11(3), 395-397. Matarazzo, J. D. (1990). Psychological assessment versus psychological testing. American Psychologist, 45(9), 999-1017.
Matarazzo, J. D., Wiens, A. N., Matarazzo, R. G., & Goldstein, S. G. (1974). Psychometric and clinical test-retest reliability of the Halstead impairment index in a sample of healthy, young, normal men. Journal of Nervous and Mental Disease, 158(1), 37-49.
Matarazzo, R. G. (1995). Psychological report standards in neuropsychology. Clinical Neuropsychologist, 9(3), 249-250.
Mathiesen, T., Ellingsen, D. G., & Kjuus, H. (1999). Neuropsychological effects associated with exposure to mercury vapor among former chloralkali workers. Scandinavian Journal of Work, Environment and Health, 25(4), 342-350.
Matthews, C. G. (1974). Application of neuropsychological test methods in mentally retarded subjects. In R. M. Reitan & L. A. Davison (Eds.), Clinical neuropsychology: Current status and applications. Washington, DC: Hemisphere.
Matthews, K. A., Cauley, J., Yaffe, K., & Zmuda, J. M. (1999). Estrogen replacement therapy and cognitive decline in older community women. Journal of the American Geriatrics Society, 47(5), 518-523.
Mattis, S. (1976). Mental status examination for organic mental syndrome in the elderly patient. In L. Bellak & T. Karasu (Eds.), Geriatric psychiatry. New York: Grune & Stratton.
Mattis, S. (1988). Dementia Rating Scale. Odessa, FL: Psychological Assessment Resources.
Mayeux, R., Brandt, J., Rosen, J., & Benson, F. (1980). Interictal memory and language impairment in temporal lobe epilepsy. Neurology, 30, 120-125.
Mayr, U. (2002). On the dissociation between clustering and switching in verbal fluency: Comment on Troyer, Moscovitch, Winocur, Alexander and Stuss. Neuropsychologia, 40(5), 562-566.
McCaffrey, R. J., Krahula, M. M., & Heimberg, R. G. (1989). An analysis of the significance of performance errors on the Trail Making Test in polysubstance users. Archives of Clinical Neuropsychology, 4(4), 393-398.
McCaffrey, R. J., Ortega, A., Orsillo, S. M., Nelles, W. B., et al. (1992). Practice effects in repeated neuropsychological assessments. Clinical Neuropsychologist, 6(1), 32-42.
McCaffrey, R. J., Ortega, A., & Haase, R. F. (1993). Effects of repeated neuropsychological assessments. Archives of Clinical Neuropsychology, 8(6), 519-524.
McCaffrey, R. J., Cousins, J. P., Westervelt, H. J., Martynowicz, M., et al. (1995). Practice effects with the NIMH AIDS Abbreviated Neuropsychological Battery. Archives of Clinical Neuropsychology, 10(3), 241-250.
McCaffrey, R. J., Duff, K., & Westervelt, H. J. (2000). Practitioner's guide to evaluating change with neuropsychological assessment instruments. New York: Kluwer Academic/Plenum.
McCaffrey, R. J., Westervelt, H., & Haase, R. F. (2001). Serial neuropsychological assessment with the National Institute of Mental Health (NIMH) AIDS Abbreviated Neuropsychological Battery. Archives of Clinical Neuropsychology, 16(1), 9-18.
McCarthy, D. (1972). Manual for the McCarthy Scales for Children's Abilities. New York: Psychological Corporation.
McCracken, L. M., & Franzen, M. D. (1992). Principal-components analysis of the equivalence of alternate forms of the Trail Making Test. Psychological Assessment, 4(2), 235-238.
McCurry, S. M., Gibbons, L. E., Uomoto, J. M., Thompson, M. L., Graves, A. B., Edland, S. D., et al. (2001). Neuropsychological test performance in a cognitively intact sample of older Japanese American adults. Archives of Clinical Neuropsychology, 16, 447-459.
McFie, J. & Piercy, M. F. (1952). Intellectual impairment with localized cerebral lesions. Brain, 75, 292-311.
McKeever, W. F., & Abramson, M. (1991). Halstead and Halstead-Reitan norms for Finger Tapping Test are severely biased against females and left-handers. Journal of Clinical and Experimental Neuropsychology, 13(1), 91.
McKhann, G., Drachman, D., Folstein, M., Katzman, R., Price, D., & Stadlan, E. M. (1984). Clinical diagnosis of Alzheimer's disease: Report of the NINCDS-ADRDA Work Group. Neurology, 34, 939-944.
Meador, K. J., Loring, D. W., Allen, M. E., Zamrini, E. Y., Moore, E. E., Abney, O. L., et al. (1991). Comparative cognitive effects of carbamazepine and phenytoin in healthy adults. Neurology, 41(10), 1537-1540.
Meador, K. J., Moore, E. E., Nichols, M. E., Abney, O. L., Taylor, H. S., Zamrini, E. Z., et al. (1993). The role of cholinergic systems in visuospatial processing in memory. Journal of Clinical and Experimental Neuropsychology, 15(5), 832-842.
Mehta, Z. & Newcombe, F. (1996). Dissociable contributions of the two cerebral hemispheres to judgments of line orientation. Journal of International Neuropsychological Society, 2, 335-339.
Mejia, S., Pineda, D., Alvarez, L. M., & Ardila, A. (1998). Individual differences in memory and executive function abilities during normal aging. International Journal of Neuroscience, 95(3-4), 271-284.
Merriam, E. P., Thase, M. E., Haas, G. L., Keshavan, M. S., & Sweeney, J. A. (1999). Prefrontal cortical dysfunction in depression determined by Wisconsin Card Sorting Test performance. American Journal of Psychiatry, 156(5), 780-782.
Merrick, E. E., Donders, J., & Wiersum, M. (2003). Validity of the WCST-64 after traumatic brain injury. Clinical Neuropsychologist, 17(2), 153-158.
Merten, T. (2002). A short version of the Hooper Visual Organization Test: Development and validation. Clinical Neuropsychologist, 16(2), 136-144.
Merten, T., & Beal, C. (2000). An analysis of the Hooper Visual Organization Test with neurological patients. Clinical Neuropsychologist, 14(4), 521-529.
Mesulam, M. M. (1985). Principles of behavioral and cognitive neurology. New York: Oxford University Press.
Mesulam, M. M. (2000). Principles of behavioral and cognitive neurology (2nd ed.). New York: Oxford University Press.
Meyer, G. J., Finn, S. E., Eyde, L. D., Kay, G. G., Moreland, K. L., Dies, R. R., et al. (2001). Psychological testing and psychological assessment: A review of evidence and issues. American Psychologist, 56(2), 128-165.
Meyers, J. E., & Lange, D. (1994). Recognition subtest for the Complex Figure. Clinical Neuropsychologist, 8(2), 153-166.
Meyers, J. E., & Meyers, K. R. (1992). A training manual for the clinical scoring of the Rey-Osterrieth Complex Figure and the recognition subtest. Sioux City, IA: Author.
Meyers, J. E., & Meyers, K. R. (1995a). Rey Complex Figure Test under four different administration procedures. Clinical Neuropsychologist, 9(1), 63-67.
Meyers, J. E., & Meyers, K. R. (1995b). Rey Complex Figure Test and Recognition trial: Professional manual. Lutz, FL: Psychological Assessment Resources.
Meyers, J. E., & Meyers, K. R. (1996). Rey Complex Figure Test and Recognition trial: Supplemental norms for children and adolescents. Lutz, FL: Psychological Assessment Resources.
Meyers, J. E., Galinsky, A. M., & Volbrecht, M. (1999). Malingering and mild brain injury: How low is too low? Applied Neuropsychology, 6(4), 208-216.
Miceli, G., Caltagirone, C., Gainotti, G., Masullo, C., & Silveri, M. C. (1981). Neuropsychological correlates of localized cerebral lesions in nonaphasic brain-damaged patients. Journal of Clinical Neuropsychology, 3, 53-63.
Mickanin, J., Grossman, M., Onishi, K., Auriacombe, S., & Clark, C. (1994). Verbal and nonverbal fluency in patients with probable Alzheimer's disease. Neuropsychology, 8, 385-394.
Miezejeski, C. M., Jenkins, E. C., Hill, A. L., Wisniewski, K., et al. (1986). A profile of cognitive deficit in females from fragile X families. Neuropsychologia, 24(3), 405-409.
Miller, B. L., Lesser, I. M., Boone, K. B., Hill, E., Mehringer, C. M., & Wong, K. (1991). Brain lesions and cognitive function in late-life psychosis. British Journal of Psychiatry, 158, 76-82.
Miller, E. (1984). Verbal fluency as a function of a measure of verbal intelligence and in relation to different types of cerebral pathology. British Journal of Clinical Psychology, 23, 53-57.
Miller, E. (1985). Possible frontal impairments: A test using a measure of verbal fluency. British Journal of Clinical Psychology, 24, 211-212.
Miller, E. N. (2003). An update on the 1991 article by Selnes, Jacobson, Machado, Becker, Wesch, Miller, Visscher and McArthur.
Miller, E. N., Selnes, O. A., McArthur, J. C., Satz, P., Becker, J. T., Cohen, B. A., et al. (1990). Neuropsychological performance in HIV-1-infected homosexual men: The Multicenter AIDS Cohort Study (MACS). Neurology, 40(2), 197-204.
Miller, L. S., & Rohling, M. L. (2001). A statistical interpretive method for neuropsychological test data. Neuropsychology Review, 11(3), 143-169.
Millis, S. R., Rosenthal, M., & Lourie, I. F. (1994). Predicting community integration after traumatic brain injury with neuropsychological measures. International Journal of Neuroscience, 79(3-4), 165-167.
Millis, S. R., Rosenthal, M., Novack, T. A., Sherer, M., Nick, T. G., Kreutzer, J. S., et al. (2001). Long-term neuropsychological outcome after traumatic brain injury. Journal of Head Trauma Rehabilitation, 16(4), 343-355.
Milner, B. (1962). Laterality effects in audition. In V. B. Mountcastle (Ed.), Interhemispheric relations and cerebral dominance. Baltimore: Johns Hopkins University Press.
Milner, B. (1963). Effects of different brain lesions on card sorting test. Archives of Neurology, 9, 90-100.
Milner, B. (1964). Some effects of frontal lobectomy in man. In J. M. Warren & K. A. Akert (Eds.), The frontal granular cortex and behavior (pp. 331-334). New York: McGraw-Hill.
Milner, B. (1970). Memory and the medial temporal regions of the brain. In K. H. Pribram & D. E. Broadbent (Eds.), Biology of memory (pp. 29-50). New York: Academic Press.
Milner, B. (1972). Disorders of learning and memory after temporal lobe lesions in man. Clinical Neurosurgery, 19, 421-446.
Milner, B. (1975). Psychological aspects of focal epilepsy and its neurosurgical management. Advances in Neurology, 8, 299-321.
Miner, T., & Ferraro, F. R. (1998). The role of speed of processing, inhibitory mechanisms, and presentation order in Trail-Making Test performance. Brain and Cognition, 38(2), 246-253.
Misra, U. K., Prasad, M., & Pandey, C. M. (1994). A study of cognitive functions and event related potentials following organophosphate exposure. Electromyography and Clinical Neurophysiology, 34(4), 197-203.
Mitropoulou, V., Harvey, P. D., Maldari, L. A., Moriarty, P. J., New, A. S., Silverman, J. M., et al. (2002). Neuropsychological performance in schizotypal personality disorder: Evidence regarding diagnostic specificity. Biological Psychiatry, 52(12), 1175-1182.
Mitrushina, M., & Satz, P. (1991a). Effect of repeated administration of a neuropsychological battery in the elderly. Journal of Clinical Psychology, 47(6), 790-801.
Mitrushina, M., & Satz, P. (1991b). Changes in cognitive functioning associated with normal aging. Archives of Clinical Neuropsychology, 6, 49-60.
Mitrushina, M., & Satz, P. (1995). Repeated testing of normal elderly with the Boston Naming Test. Aging: Clinical and Experimental Research, 7, 123-127.
Mitrushina, M., Satz, P., & Van Gorp, W. (1989). Some putative cognitive precursors in subjects hypothesized to be at-risk for dementia. Archives of Clinical Neuropsychology, 4, 323-333.
Mitrushina, M., Satz, P., & Chervinsky, A. B. (1990). Efficiency of recall on the Rey-Osterrieth Complex Figure in normal aging. Brain Dysfunction, 3, 148-150.
Mitrushina, M., Satz, P., Chervinsky, A., & D'Elia, L. (1991). Performance of four age groups of normal elderly on the Rey Auditory-Verbal Learning Test. Journal of Clinical Psychology, 47(3), 351-357.
Mitrushina, M., D'Elia, L., Satz, P., Uchiyama, C., Mathews, A., & Harker, J. (1993). A comparison of selective attention deficits in normal elderly and AIDS patients. Developmental Brain Dysfunction, 6, 324-328.
Mitrushina, M., Drebing, C., Uchiyama, C., Satz, P., Van Gorp, W., & Chervinsky, A. (1994). The pattern of deficit in different memory components in normal aging and dementia of Alzheimer's type. Journal of Clinical Psychology, 50(4), 591-596.
Mitrushina, M., Fogel, T., D'Elia, L., Uchiyama, C., & Satz, P. (1995a). Performance on motor tasks as an indication of increased behavioral asymmetry with advancing age. Neuropsychologia, 33(3), 359-364.
Mitrushina, M., Uchiyama, C., & Satz, P. (1995b). Heterogeneity of cognitive profiles in normal aging: Implications for early manifestations of Alzheimer's disease. Journal of Clinical and Experimental Neuropsychology, 17(3), 374-382.
Mittenberg, W., Seidenberg, M., O'Leary, D. S., & DiGiulio, D. V. (1989). Changes in cerebral functioning associated with normal aging. Journal of Clinical and Experimental Neuropsychology, 11, 918-932.
Mittenberg, W., Burton, D., Darrow, E., & Thompson, G. (1992). Normative data for the WMS-R: 25 to 34 year olds. Psychological Assessment, 4(3), 363-368.
Moberg, M., Ferraro, F. R., & Petros, T. V. (2000). Lexical properties of the Boston Naming Test stimuli: Age differences in word naming and lexical decision latency. Applied Neuropsychology, 7(3), 147-153.
Moehle, K. A., Fitzhugh-Bell, K. B., Engleman, E., & Hennon, D. (1988). Statistical and diagnostic adequacy of a short form of the Halstead Category Test. International Journal of Neuroscience, 42, 107-112.
Moehle, K. A., Rasmussen, J. L., & Fitzhugh-Bell, K. B. (1990). Factor analysis of neuropsychological tests in an adult sample. International Journal of Clinical Neuropsychology, 12(3-4), 107-115.
Moering, R. G., Schinka, J. A., Mortimer, J. A., & Graves, A. B. (2004). Normative data for elderly African Americans for the Stroop Color and Word Test. Archives of Clinical Neuropsychology, 19(1), 61-71.
Moffoot, A. P. R., O'Carroll, R. E., Bennie, J., Carroll, S., Dick, H., Ebmeier, K. P., et al. (1994). Diurnal variation of mood and neuropsychological function in major depression with melancholia. Journal of Affective Disorders, 32, 257-269.
Moher, D., Cook, D. J., Eastwood, S., Olkin, I., Rennie, D., & Stroup, D. (1999). Improving the quality of reporting of meta-analysis of randomized controlled trials: The QUOROM statement. Lancet, 354, 1896-1900.
Monsch, A. U., Bondi, M. W., Butters, N., Salmon, D. P., Katzman, R., & Thal, L. J. (1992). Comparisons of verbal fluency tasks in the detection of dementia of the Alzheimer type. Archives of Neurology, 49(12), 1253-1258.
Monsch, A. U., Bondi, M. W., & Butters, N. (1994). A comparison of category and letter fluency in Alzheimer's disease. Neuropsychology, 8, 25-30.
Montgomery, P., Silverstein, R., Wichmann, R., & Fleischaker, K. (1993). Spatial updating in Parkinson's Disease. Brain and Cognition, 23, 113-126.
Montse, A., Pere, V., Carme, J., Francese, V., & Eduardo, T. (2001). Visuospatial deficits in Parkinson's disease assessed by Judgment of Line Orientation Test: Error analyses and practice effects. Journal of Clinical and Experimental Neuropsychology, 23(5), 592-598.
Moore, T. E., Richards, B., & Hood, J. (1984). Aging and the coding of spatial information. Journal of Gerontology, 39(2), 210-212.
Morehouse, S. A., Szeliga, F., & DiTommaso, E. (2000). Characteristics of the bimanual deficit using grip strength. Laterality: Asymmetries of Body, Brain and Cognition, 5(2), 167-185.
Morey, C. E., Cilo, M., Berry, J., & Cusick, C. (2003). The effect of Aricept in persons with persistent memory disorder following traumatic brain injury: A pilot study. Brain Injury, 17(9), 809-815.
Morgan, J. E., & Caccappolo-van Vliet, E. (2001). Advanced years and low education: The case against the comprehensive norms. Journal of Forensic Neuropsychology, 2(1), 53-69.
Morice, R. (1990). Cognitive inflexibility and prefrontal dysfunction in schizophrenia and mania. British Journal of Psychiatry, 157, 50-54.
Mormont, C. (1984). The influence of age and depression on intellectual and memory performances. Acta Psychiatrica Belgica, 84(2), 127-134.
Morris, J. C., Heyman, A., Mohs, R. C., Hughes, S. P., van Belle, G., Fillenbaum, G., et al. (1989). The Consortium to Establish a Registry for Alzheimer's Disease (CERAD). Part I. Clinical and neuropsychological assessment of Alzheimer's disease. Neurology, 39, 1159-1165.
Morris, J. C., Edland, S., Clark, C., Galasko, D., Koss, E., Mohs, R., et al. (1993). The Consortium to Establish a Registry for Alzheimer's Disease (CERAD): IV. Rates of cognitive change in the longitudinal assessment of probable Alzheimer's disease. Neurology, 43(12), 2457-2465.
Morrison, M. W., Gregory, R. J., & Paul, J. J. (1979). Reliability on the Finger Tapping Test and a note on sex differences. Perceptual and Motor Skills, 48, 139-142.
Morrow, L. A., Muldoon, S. B., & Sandstrom, D. J. (2001). Neuropsychological sequelae associated with occupational and environmental exposure to chemicals. In R. E. Tarter, M. Butters, & S. R. Beers (Eds.), Medical neuropsychology (2nd ed.). New York: Kluwer Academic/Plenum.
Moses, J. A. (1986). Factor structure of Benton's Tests of Visual Retention, Visual Construction, and Visual Form Discrimination. Archives of Clinical Neuropsychology, 1(2), 147-156.
Moses, J. A., Jr., Pritchard, D. A., & Adams, R. L. (1999). Normative corrections for the Halstead Reitan Neuropsychological Battery. Archives of Clinical Neuropsychology, 14(5), 445-454.
Mount, D. L., Hogg, L., & Johnstone, B. (2002). Applicability of the 15-item versions of the Judgment of Line Orientation Test for individuals with traumatic brain injury. Brain Injury, 16(12), 1051-1055.
Mountain, M. A. & Snow, W. G. (1993). Wisconsin Card Sorting Test as a measure of frontal pathology: A review. Clinical Neuropsychologist, 7, 108-118.
Mungas, D. (1983). Differential clinical sensitivity of specific parameters of the Rey Auditory-Verbal Learning Test. Journal of Consulting and Clinical Psychology, 51(6), 848-855.
Mungas, D., Marshall, S. C., Weldon, M., Haan, M. & Reed, B. R. (1996). Age and education correction of the Mini-Mental State Examination for English and Spanish-speaking older adults. Psychology and Aging, 12, 718-725.
Mungas, D., Reed, B. R., & Kramer, J. H. (2003). Psychometrically matched measures of global cognition, memory, and executive function for assessment of cognitive decline in older persons. Neuropsychology, 17(3), 380-392.
Murkin, J., Newman, S., Stump, D., & Blumenthal, J. (1995). Statement of consensus on assessment of neurobehavioral outcomes after cardiac surgery. Annals of Thoracic Surgery, 59, 1289-1295.
Murphy, C., Nordin, S., & Acosta, L. (1997). Odor learning, recall, and recognition memory in young and elderly adults. Neuropsychology, 11(1), 126-137.
Mutchnick, M. G., Ross, L. K., & Long, C. J. (1991). Decision strategies for cerebral dysfunction: IV. Determination of cerebral dysfunction. Archives of Clinical Neuropsychology, 6(4), 259-270.
Nabors, N. A., Vangel, S. J., & Lichtenberg, P. A. (1996). Visual Form Discrimination test with elderly medical inpatients. Clinical Gerontologist, 17(1), 43-53.
Nabors, N. A., Vangel, S. J., Lichtenberg, P. A., & Walsh, P. (1997). Normative and clinical utility of the Hooper Visual Organization Test with geriatric medical inpatients. Journal of Clinical Geropsychology, 3(3), 191-198.
Nadler, J. D., Grace, J., White, D. A., Butters, M. A., & Malloy, P. F. (1996). Laterality differences in quantitative and qualitative Hooper performance. Archives of Clinical Neuropsychology, 11(3), 223-229.
Nagahama, Y., Fuyama, H., Yamauchi, H., Matsuszaki, S., Konishi, H., Shibasaki, H., et al. (1996). Cerebral activation during performance of a card sorting test. Brain, 119, 1667-1675.
Nagahama, Y., Sadoto, N., Yamauchi, H., Katsumi, Y., Hayashi, T., Fukuyama, H., et al. (1998). Neural activity during attention shifts between object features. Neuroreport, 9, 2633-2638.
Nagahama, Y., Okina, T., Suzuki, N., Matsuzaki, S., Yamauchi, H., Nabatame, H., et al. (2003). Factor structure of a modified version of the Wisconsin Card Sorting Test: An analysis of executive deficit in Alzheimer's disease and mild cognitive impairment. Dementia and Geriatric Cognitive Disorders, 16(2), 103-112.
Nalcaci, E., Kalaycioglu, C., Cicek, M., & Gene, Y. (2001). The relationships between handedness and fine motor performance. Cortex, 37(4), 493-500.
Naugle, R. I., & McSweeny, A. J. (1995). On the practice of routinely appending neuropsychological data to reports. Clinical Neuropsychologist, 9(3), 245-247.
Naugle, R. I., & McSweeny, A. J. (1996). More thoughts on the practice of routinely appending raw data to reports: Response to Freides and Matarazzo. Clinical Neuropsychologist, 10(3), 313-314.
Nebes, R. D. (1989). Semantic memory in Alzheimer's disease. Psychological Bulletin, 106, 377-394.
Nebes, R. D., & Brady, C. B. (1990). Preserved organization of semantic attributes in Alzheimer's disease. Psychology and Aging, 5, 574-579.
Nebes, R. D., Martin, D. C., & Horn, L. C. (1984). Sparing of semantic memory in Alzheimer's disease. Journal of Abnormal Psychology, 93, 321-330.
Nehemkis, A. M., & Lewinsohn, P. M. (1972). Effects of left and right cerebral lesions on the naming process. Perceptual and Motor Skills, 35, 787-798.
Neils, J., Brennan, M. M., Cole, M., Boller, F., & Gerdeman, B. (1988). The use of phonemic cueing with Alzheimer's disease patients. Neuropsychologia, 26, 351-354.
Neils, J., Baris, J. M., Carter, C., Dell'aira, A. L., Nordloh, S. J., Weiler, E., et al. (1995). Effects of age, education, and living environment on Boston Naming Test performance. Journal of Speech and Hearing Research, 38, 1143-1149.
Nell, V. (2000). Cross-cultural neuropsychological assessment: Theory and practice. Hillsdale, NJ: Lawrence Erlbaum.
Nelson, H. E. (1976). A modified card sorting test sensitive to frontal lobe defects. Cortex, 12(4), 313-324.
Nelson, H. E. (1982). National Adult Reading Test (NART): Test manual. Windsor, Ontario: NFER Nelson.
Nelson, N. W., Boone, K., Dueck, A., Wagener, L., Lu, P., & Grills, C. (2003). Relationships between eight measures of suspect effort. Clinical Neuropsychologist, 17(2), 263-272.
Netherton, S. D., Elias, J. W., Albrecht, N. N., Acosta, C., et al. (1989). Changes in the performance of parkinsonian patients and normal aged on the Benton Visual Retention Test. Experimental Aging Research, 15(1-2), 13-18.
Neuger, G. J., O'Leary, D. S., Berent, S., Fishburne, F. J., Giordani, B., Boll, T. J., et al. (1981). Order effects on the Halstead-Reitan Neuropsychological Test Battery and allied procedures. Journal of Consulting and Clinical Psychology, 49, 722-730.
Newcombe, F. (1969). Missile wounds of the brain. London: Oxford University Press.
Ng, T. P., Lim, L. C., & Win, K. K. (1992). An investigation of solvent-induced neuro-psychiatric disorders in spray painters. Annals of the Academy of Medicine, Singapore, 21(6), 797-803.
Ng, V. W. K., Eslinger, P. J., Williams, S. C. R., Brammer, M. J., Bullmore, E. T., Andrew, C. M., et al. (2000). Hemispheric preference in visuospatial processing: A complementary approach with fMRI and lesion studies. Human Brain Mapping, 10(2), 80-86.
Nicholas, L. E., Brookshire, R. H., MacLennan, D. L., Schumacher, J. G., & Porrazzo, S. A. (1989). Revised administration and scoring procedures for the Boston Naming Test and norms for non-brain-damaged adults. Aphasia, 5(6), 569-580.
Nicholas, M., Obler, L., Albert, M., & Goodglass, H. (1985). Lexical retrieval in healthy aging. Cortex, 21, 595-606.
Nicholas, M., Obler, L., Au, R., & Albert, M. L. (1996). On the nature of naming errors in aging and dementia: A study of semantic relatedness. Brain and Language, 54, 184-195.
Nielsen, H., Knudsen, L., & Daugbjerg, O. (1989). Normative data for eight neuropsychological tests based on a Danish sample. Scandinavian Journal of Psychology, 30(1), 37-45.
Nielsen, H., Lolk, A., & Kragh-Sorensen, P. (1995). Normative data for eight neuropsychological tests, gathered from a random sample of Danes aged 64 to 83 years. Nordisk Psykologi, 47(4), 241-255.
Nishiwaki, Y., Maekawa, K., Ogawa, Y., Asukai, N., Minami, M., et al. (2001). Effects of sarin on the nervous system in rescue team staff members and police officers 3 years after the Tokyo subway sarin attack. Environmental Health Perspectives, 109(11), 1169-1173.
N'Kaoua, B., Lespinet, V., Barsse, A., Rougier, A., & Claverie, B. (2001). Exploration of hemispheric specialization and lexico-semantic processing in unilateral temporal lobe epilepsy with verbal fluency tasks. Neuropsychologia, 39(6), 635-642.
Nolin, P. (1999). Analyses psychometriques de l'adaptation francaise du California Verbal Learning Test (CVLT). [Psychometric analyses of the French version of the California Verbal Learning Test (CVLT)]. Revue Quebecoise de Psychologie, 20(1), 39-55.
Norman, M. A., Evans, J. D., Miller, S. W., & Heaton, R. K. (2000). Demographically corrected norms for the California Verbal Learning Test. Journal of Clinical and Experimental Neuropsychology, 22(1), 80-94.
Norris, M. P., Blankenship-Reuter, L., Snow-Turek, A. L., & Finch, J. (1995). Influence of depression on verbal fluency performance. Aging and Cognition, 2(3), 206-215.
Numan, B., Sweet, J. J., & Ranganath, C. (2000). Use of the California Verbal Learning Test to detect proactive interference in the traumatically brain injured. Journal of Clinical Psychology, 56(4), 553-562.
Nyberg, L., Winocur, G., & Moscovitch, M. (1997). Correlation between frontal lobe functions and explicit and implicit stem completion in healthy elderly. Neuropsychology, 11(1), 70-76.
Obayashi, S., Matsushima, E., Ando, H., Ando, K., & Kojima, T. (2003). Exploratory eye movements during the Benton Visual Retention Test: Characteristics of visual behavior in schizophrenia. Psychiatry and Clinical Neurosciences, 57(4), 409-415.
Ober, B. A., Dronkers, N. F., Koss, E., Delis, D. C., & Friedland, R. P. (1986). Retrieval from semantic memory in Alzheimer-type dementia. Journal of Clinical and Experimental Neuropsychology, 8, 75-92.
Obonsawin, M. C., Robertson, A., Crawford, J. R., Perera, C., Walker, S., Blackmore, L., et al. (1998). Non-mnestic cognitive function in the scopolamine model of Alzheimer's disease. Human Psychopharmacology Clinical and Experimental, 13(6), 439-450.
O'Connell, M. E., & Tuokko, H. (2002). The 12-item Buschke Memory Test: Appropriate for use across levels of impairment. Applied Neuropsychology, 9(4), 226-233.
O'Connor, M. K. (2002). The predictive utility of the Hopkins Verbal Learning Test-Revised in older adults with depression versus dementia of the Alzheimer's type. Dissertation Abstracts International. Section B: The Sciences and Engineering, 63(1-B), 543.
O'Donnell, J. P., MacGregor, L. A., Dabrowski, J. J., Oestreicher, J. M., et al. (1994). Construct validity of neuropsychological tests of conceptual and attentional abilities. Journal of Clinical Psychology, 50(4), 596-600.
O'Donnell, M. P., & Webb, M. G. (1986). Post-ECT blood pressure rise and its relationship to cognitive and affective change. British Journal of Psychiatry, 149, 494-497.
O'Donnell, P. O., Kurtz, J., & Ramanaiah, N. V. (1983). Neuropsychological test findings for normal, learning-disabled and brain-damaged young adults. Journal of Consulting and Clinical Psychology, 51(5), 726-729.
Ojemann, G. A., Sutherling, W. W., Lesser, R. P., Dinner, D. S., Jayakar, P., & Saint-Hilaire, J. M. (1993). Cortical stimulation. In J. Engel (Ed.), Surgical treatment of the epilepsies (2nd ed., pp. 399-414). New York: Raven.
O'Leary, M. R., Donovan, D. M., & Chaney, E. F. (1977). The relationship of perceptual field orientation to measures of cognitive functioning and current adaptive abilities of alcoholics and nonalcoholics. Journal of Nervous and Mental Disease, 165(4), 275-282.
Osterrieth, P. A. (1944). Le test de copie d'une figure complexe. Archives de Psychologie, 30, 206-356.
Osterrieth, P. A. (1993). The complex figure copy test. Clinical Neuropsychologist, 7(1), 3-21.
Ostrosky-Solis, F., Ardila, A., & Rosselli, M. (1997). NEUROPSI: Evaluación Neuropsicológica Breve en Español. Manual, Instructivo y Protocolo de Aplicación [NEUROPSI: A brief neuropsychological evaluation in Spanish. Manual, instructions, and application protocol]. Mexico City: Bayer de Mexico.
Ostrosky-Solis, F., Jaime, R. M., & Ardila, A. (1998). Memory abilities during normal aging. International Journal of Neuroscience, 93(1-2), 151-162.
Ostrosky-Solis, F., Ardila, A., & Rosselli, M. (1999). NEUROPSI: A brief neuropsychological test battery in Spanish with norms by age and educational level. Journal of the International Neuropsychological Society, 5(5), 413-433.
Oswald, W., & Roth, E. (1978). Der Zahlen-Verbindungs-Test (ZVT). Göttingen: Hogrefe.
Pachana, N. A., Boone, K. B., Miller, B. L., Cummings, J. L., & Berman, N. (1996). Comparison of neuropsychological functioning in Alzheimer's disease and frontotemporal dementia. Journal of the International Neuropsychological Society, 2, 505-510.
Paivio, A., Yuille, J. C., & Madigan, S. A. (1968). Concreteness, imagery, and meaningfulness values for 925 nouns. Journal of Experimental Psychology, 76(Monogr. Suppl.), 1-25.
Palmer, B. W., Boone, K. B., Chang, L., Lee, A., & Black, S. (1994). Cognitive deficits and personality patterns in maternally versus paternally inherited myotonic dystrophy. Journal of Clinical and Experimental Neuropsychology, 16(5), 784-795.
Palmer, C., Wolkenstein, B., LaRue, A., Swan, G., & Smalley, S. (1994). Commingling analysis of memory performance in elderly men. Genetic Epidemiology, 11, 443-449.
Pan, J. W., Krupp, L. B., Elkins, L. E., & Coyle, P. K. (2001). Cognitive dysfunction lateralizes with NAA in multiple sclerosis. Applied Neuropsychology, 8(3), 155-160.
Panek, P. E., Rush, M. C., & Slade, A. L. (1984). Locus of the age-Stroop interference relationship. Journal of Genetic Psychology, 145(2), 209-216.
Paniak, C. E., Shore, D. L., & Rourke, B. P. (1989). Recovery of memory after severe closed-head injury: Dissociations in recovery of memory parameters and predictors of outcome. Journal of Clinical and Experimental Neuropsychology, 11(5), 631-644.
Pantelis, C., Egan, G., Pipingas, A., Maruff, P., O'Keefe, G., Velakoulis, D., et al. (1996). Practice dependent alterations in activation of the anterior cingulate cortex during the Stroop task: A positron emission tomography study. Neuroimage, 3, S193.
Paolo, A. M., Troester, A. I., Axelrod, B. N., & Koller, W. C. (1995). Construct validity of the WCST in normal elderly and persons with Parkinson's disease. Archives of Clinical Neuropsychology, 10(5), 463-473.
Paolo, A. M., Axelrod, B. N., & Troester, A. I. (1996a). Test-retest stability of the Wisconsin Card Sorting Test. Assessment, 3(2), 137-143. Paolo, A. M., Axelrod, B. N., Troester, A. 1., Blackwell, K. T., & Koller, W. C. (1996b). Utility of a Wisconsin Card Sorting Test Short Fonn in persons with Alzheimer's and Parkinsoo's disease. Journal uf Clinical and Expe1'imental Neuropsychology, 18(6), 892-897. Paolo, A. M., Cluff, B. R., & Ryan, J. J. (1996c). Influence of perceptual organization and naming abilities on the Hooper Visual Organization Test. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 9(4), 254--257. Pardo,J. V.,Pardo,P.J.,Janer, K. W., &Raichle, M. E. (1990). The anterior cingulate cortex mediates processing selection in the Stroop attentional conflict paradigm. Proceedings of the National Actukmy of Science, 87,256-259. Parellada, E., Cataqfau, A. M., Bernardo, M., Lomena, F., Gonzalex-Monclus, E., & Setnain, J. (1994). Prefrontal dysfunction in young acute neuroleptic-naive schizophrenic patients: A resting and activation SPECT study. Psychiatry Research: Neuroimaging, 55, 131-139. Parellada, E., Catarineu, S., Catafau, A., Bernardo, M., & Lomena, F. (2000). Psychopathology and Wisconsin Card Sorting Test perfonnance in young unmedicated schizophrenic patients. Psychopathology, 33(1), 14-18. Parkin, A. J., & Java, R. I. (1999). Deterioration of frontal lobe function in nonnal aging: Influences of fluid intelligence versus perceptual speed. Neuropsychology, 13(4), 539--545. Parkin, A. J., & Lawrence, A. (1994). A dissociation in the relation between memory tasks and frontal lobe tests in the normal elderly. Neuropsychologia, 32(12), 1523--1532. Parkin, A. J., & Walter, B. M. (1991). Aging, shorttenn memory, and frontal dysfunction. Psychobiology, 19(2), 175-179. Parkin, A. J., & Walter, B. M. (1992). Recollective experience, nonnal aging, and frontal dysfunction. Psychology and Aging, 7(2), 290-298. Parkinson, S. R., Inman, V. 
W., & Dannenbaum, S. E. (1985). Adult age differences in short-term forgetting. Acta Psychologica, 60, 83-101.
Parks, R. W., Loewenstein, D. A., Dodrill, K. L., Barker, W. W., Yoshii, F., Chang, J. Y., et al. (1988). Cerebral metabolic effects of a verbal fluency test: A PET scan study. Journal of Clinical and Experimental Neuropsychology, 10, 565-575.
Parks, R. W., Levine, D. S., Long, D. L., Crockett, D. J., Dalton, I. E., Weingartner, H., et al. (1992).
Parallel distributed processing and neuropsychology: A neural network model of Wisconsin Card Sorting and Verbal Fluency. Neuropsychology Review, 3(2), 213-233.
Parsons, O. A. (1975). Brain damage in alcoholics: Altered states of unconsciousness. In M. M. Gross (Ed.), Alcohol intoxication and withdrawal. Experimental studies (No. 2). New York: Plenum.
Parsons, O. A., Maslow, H. I., Morris, F., & Denny, J. P. (1964). Trail Making Test performance in relation to certain experimenter, test and subject variables. Perceptual and Motor Skills, 19, 199-206.
Pauker, J. D. (1980). Norms for the Halstead-Reitan Neuropsychological Test Battery based on a nonclinical adult sample. Address presented at the meeting of the Canadian Psychological Association, Calgary, Alberta, Canada.
Pauker, J. D. (1988). Constructing overlapping cell tables to maximize the clinical usefulness of normative test data: Rationale and an example from neuropsychology. Journal of Clinical Psychology, 44(6), 930-933.
Paul, R. H., Cohen, R., Moser, D., Ott, B., Zawacki, T., & Gordon, N. (2001). Performance on the Hooper Visual Organizational Test in patients diagnosed with subcortical vascular dementia: Relation to naming performance. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 14(2), 93-97.
Paul, R. H., Cohen, R. A., Moser, D. J., Zawacki, T. M., & Gordon, N. (2002). The serial position effect in mild and moderately severe vascular dementia. Journal of the International Neuropsychological Society, 8(4), 584-587.
Paulsen, J. S., Heaton, R. K., Sadek, J. R., Perry, W., et al. (1995a). The nature of learning and memory impairments in schizophrenia. Journal of the International Neuropsychological Society, 1(1), 88-99.
Paulsen, J. S., Salmon, D. P., Monsch, A. U., Butters, N., et al. (1995b). Discrimination of cortical from subcortical dementias on the basis of memory and problem-solving tests. Journal of Clinical Psychology, 51(1), 48-58.
Peaker, A., & Stewart, L. E. (1989).
Rey's Auditory Verbal Learning Test: A review. In J. R. Crawford & D. M. Parker (Eds.), Developments in clinical and experimental neuropsychology. New York: Plenum.
Peirson, A. R., & Jansen, P. (1997). Comparability of the Rey-Osterrieth and Taylor forms of the Complex Figure Test. Clinical Neuropsychologist, 11(3), 244-248.
Pendleton, M. G., & Heaton, R. K. (1982). A comparison of the Wisconsin Card Sorting Test and the Category Test. Journal of Clinical Psychology, 38, 392-396.
Pendleton, M. G., Heaton, R. K., Lehman, R. A. W., & Hulihan, D. (1982). Diagnostic utility of the Thurstone Word Fluency Test in neuropsychological evaluations. Journal of Clinical Neuropsychology, 4, 307-317.
Pentz, C. A., Elias, M. F., Wood, W. G., Schultz, N. A., & Dineen, J. (1979). Relationship of age and hypertension to neuropsychological test performance. Experimental Aging Research, 5(4), 351-372.
Perlmuter, L. C., Tun, P., Sizer, N., McGlinchey, R. E., & Nathan, E. M. (1987). Age and diabetes related changes in verbal fluency. Experimental Aging Research, 13, 9-14.
Perret, E. (1974). The left frontal lobe of man and the suppression of habitual responses in verbal categorical behaviour. Neuropsychologia, 12, 323-330.
Perrine, K. (1993). Differential aspects of conceptual processing in the Category Test and Wisconsin Card Sorting Test. Journal of Clinical and Experimental Neuropsychology, 15, 461-473.
Petersen, R. C., Smith, G., Kokmen, E., Ivnik, R. J., et al. (1992). Memory function in normal aging. Neurology, 42(2), 396-401.
Petersen, R. C., Smith, G. E., Waring, S. C., Ivnik, R. J., Tangalos, E. G., & Kokmen, E. (1999). Mild cognitive impairment: Clinical characterization and outcome. Archives of Neurology, 56(3), 303-308.
Peterson, B., Anderson, A., Skudlarski, P., Zhang, H., & Gore, J. (1996). An fMRI study of the Stroop effect. Neuroimage, 3, S195.
Peterson, L. R., & Peterson, M. J. (1959). Short-term retention of individual verbal items. Journal of Experimental Psychology, 58, 193-198.
Peynircioglu, Z. F., Thompson, J. L. W., & Tanielian, T. B. (2000). Improvement strategies in free-throw shooting and grip-strength tasks. Journal of General Psychology, 127(2), 145-156.
Phelps, E. A., Hyder, F., Blamire, A. M., & Shulman, R. G. (1997).
fMRI of the prefrontal cortex during overt verbal fluency. Neuroreport: An International Journal for the Rapid Communication of Research in Neuroscience, 8(2), 561-565.
Philpot, M. P., Banerjee, S., Needham-Bennett, H., Costa, D. C., & Ell, P. J. (1993). 99mTc-HMPAO single photon emission tomography in late life depression: A pilot study of regional cerebral blood flow at rest and during a verbal fluency
task. Journal of Affective Disorders, 28(4), 233-240.
Piatt, A. L., Fields, J. A., Paolo, A. M., Koller, W. C., & Troester, A. I. (1999a). Lexical, semantic, and action verbal fluency in Parkinson's disease with and without dementia. Journal of Clinical and Experimental Neuropsychology, 21(4), 435-443.
Piatt, A. L., Fields, J. A., Paolo, A. M., & Troester, A. I. (1999b). Action (verb naming) fluency as an executive function measure: Convergent and divergent evidence of validity. Neuropsychologia, 37(13), 1499-1503.
Pieniadz, J., & Kelland, D. (2001). Reporting scores in neuropsychological assessment: Ethnicality, validity, practicality, and more. In C. G. Armengol, E. Kaplan, & E. Moes (Eds.), The consumer-oriented neuropsychological report. Lutz, FL: Psychological Assessment Resources.
Pierce, T. W., Elias, M. F., Keohane, P. J., Podraza, A. M., Robbins, M. A., & Schultz, N. R. (1989). Validity of a short form of the Category Test in relation to age, education and gender. Experimental Aging Research, 15(3), 137-141.
Pihlajamaeki, M., Tanila, H., Hanninen, T., Koenoenen, M., Laakso, M., Partanen, K., et al. (2000). Verbal fluency activates the left medial temporal lobe: A functional magnetic resonance imaging study. Annals of Neurology, 47(4), 470-476.
Pimental, P. A., & Ross, C. (2003). ROCF productions in right- and left-hemisphere lesion patients. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources.
Pineda, D. A., & Merchan, V. (2003). Executive function in young Colombian adults. International Journal of Neuroscience, 113(3), 397-410.
Pirozzolo, F. J., Hansch, E. C., Mortimer, J. A., Webster, D. D., & Kuskowski, M. A. (1982). Dementia in Parkinson disease: A neuropsychological analysis. Brain and Cognition, 1, 71-83.
Polubinski, J. P., & Melamed, L. E. (1986). Examination of the sex difference on a Symbol Digit Substitution Test.
Perceptual and Motor Skills, 62, 975-982.
Ponsford, J., & Kinsella, G. (1992). Attentional deficits following closed-head injury. Journal of Clinical and Experimental Neuropsychology, 14(5), 822-838.
Pontius, A., & Yudowitz, B. (1980). Frontal lobe system dysfunction in some criminal actions as shown in the narratives test. Journal of Nervous and Mental Disease, 168, 111-117.
Ponton, M. O., & Ardila, A. (1999). The future of neuropsychology with Hispanic populations in the United States. Archives of Clinical Neuropsychology, 14(1), 565-580.
Ponton, M. O., Satz, P., Herrera, L., Ortiz, F., Urrutia, C. P., Young, R., et al. (1996). Normative data stratified by age and education for the Neuropsychological Screening Battery for Hispanics (NeSBHIS): Initial report. Journal of the International Neuropsychological Society, 2(2), 96-104.
Ponton, M. O., Gonzalez, J. J., Hernandez, I., Herrera, L., & Higareda, I. (2000). Factor analysis of the Neuropsychological Screening Battery for Hispanics (NeSBHIS). [Special Issue: Assessment of Spanish-speaking populations]. Applied Neuropsychology, 7(1), 3i-39.
Poreh, A. M., & Shye, S. (1998). Examination of the global and local features of the Rey-Osterrieth Complex Figure using faceted smallest space analysis. Clinical Neuropsychologist, 12(4), 453-467.
Poreh, A. M., Ross, T. P., & Whitman, R. D. (1995). Reexamination of executive functions in psychosis-prone college students. Personality and Individual Differences, 18(4), 535-539.
Portin, R., Saarijarvi, S., Joukamaa, M., & Salokangas, R. K. R. (1995). Education, gender and cognitive performance in a 62-year-old normal population: Results from the Turva Project. Psychological Medicine, 25, 1295-1298.
Portin, R., Polo-Kantola, P., Polo, O., Koskinen, T., Revonsuo, A., Irjala, K., et al. (1999). Serum estrogen level, attention, memory and other cognitive functions in middle-aged women. Climacteric, 2(2), 115-123.
Powell, G. E. (1979). The relationship between intelligence and verbal and spatial memory. Journal of Clinical Psychology, 35(2), 335-340.
Power, D. G., Logue, P. E., McCarty, S. M., Rosenstiel, A. K., & Ziezat, H. A. (1979). Inter-rater reliability of the Russell revision of the Wechsler Memory Scale: An attempt to clarify some ambiguities in scoring. Journal of Clinical Neuropsychology, 1,
~-
Prakash, I. J., & Bhogle, S. (1992). Benton's Visual Retention Test: Norms for different age groups. Journal of the Indian Academy of Applied Psychology, 18(1-2), 33-36.
Prevey, M., Delaney, R., Cramer, J., Mattson, R., & VA Epilepsy Cooperative Study 264 Group (1998). Complex partial and secondarily generalized seizure patients: Cognitive functioning prior to treatment with antiepileptic medication. Epilepsy Research, 30(1), 1-9.
Price, L. J., Fein, G., & Feinberg, I. (1980). Neuropsychological assessment of cognitive function in the elderly. In L. W. Poon (Ed.), Aging in the 1980's. Washington, DC: American Psychological Association Press.
Prigatano, G. P., & Borgaro, S. R. (2003). Qualitative features of finger movement during the Halstead finger oscillation test following traumatic brain injury. Journal of the International Neuropsychological Society, 9(1), 128-133.
Prigatano, G. P., & Parsons, O. A. (1976). Relationship of age and education to Halstead test performance in different patient populations.
Journal of Consulting and Clinical Psychology, 44(4), 527-533.
Prigatano, G. P., Parsons, O. A., Levin, D. C., Wright, E., & Hawryluk, G. (1983). Neuropsychological test performance in mildly hypoxemic patients with chronic obstructive pulmonary disease. Journal of Consulting and Clinical Psychology, 51(1), 108-116.
Psychological Assessment Resources (1990). Wisconsin Card Sorting Test: Scoring program (Version 3.0). Odessa, FL: Author.
Puckett, J. M., & Lawson, W. M. (1989). Absence of adult age differences in forgetting in Brown-Peterson task. Acta Psychologica, 72, 159-175.
Puente, A. E., & Ardila, A. (2000). Neuropsychological assessment of Hispanics. In E. Fletcher-Janzen, T. L. Strickland, et al. (Eds.), Handbook of cross-cultural neuropsychology. Critical issues in neuropsychology. Amsterdam: Kluwer Academic.
Pujol, J., Vendrell, P., Deus, J., Kulisevsky, J., Marti-Vilalta, J. L., Garcia, C., et al. (1996). Frontal lobe activation during word generation studied by functional MRI. Acta Neurologica Scandinavica, 93, 403-410.
Qualls, C. E., Bliwise, N. G., & Stringer, A. Y. (2000). Short forms of the Benton Judgment of Line Orientation Test: Development and psychometric properties. Archives of Clinical Neuropsychology, 15(2), 159-163.
Query, W. T. (1979). Category Test score as related to age in two brain-damaged populations. Journal of Clinical Psychology, 35(4), 802-804.
Query, W. T., & Berger, R. A. (1980). AVLT memory scores as a function of age among general medical, neurologic and alcoholic patients. Journal of Clinical Psychology, 36(4), 1009-1012.
Query, W. T., & Megran, J. (1983). Age-related norms for AVLT in a male patient population. Journal of Clinical Psychology, 39(1), 136-138.
Query, W. T., & Megran, J. (1984). Influence of depression and alcoholism on learning, recall
and recognition. Journal of Clinical Psychology, 40(4), 1097-1100.
Radanovic, M., Azambuja, M., Mansur, L. L., Porto, C. S., & Scaff, M. (2003). Thalamus and language: interface with attention, memory and executive functions. Arquivos de Neuro-psiquiatria, 61(1), 34-42.
Rahman, Q., & Wilson, G. D. (2003). Large sexual-orientation-related differences in performance on mental rotation and judgement of line orientation tasks. Neuropsychology, 17(1), 25-31.
Raine, A., Lencz, T., Reynolds, G. P., Harrison, G., Sheard, C., Medley, I., et al. (1992). An evaluation of structural and functional prefrontal deficits in schizophrenia: MRI and neuropsychological measures. Psychiatry Research: Neuroimaging, 45(2), 123-137.
Randall, C. M., Dickson, A. L., & Plasay, M. T. (1988). The relationship between intellectual function and adult performance on the Benton Visual Retention Test. Cortex, 24(2), 277-289.
Randolph, C. (1998). Repeatable Battery for the Assessment of Neuropsychological Status: Manual. San Antonio, TX: Psychological Corporation.
Randolph, C., Braun, A. R., Goldberg, T. E., & Chase, T. N. (1993). Semantic fluency in Alzheimer's, Parkinson's and Huntington's disease: Dissociation of storage and retrieval failures. Neuropsychology, 7, 82-88.
Randolph, C., Tierney, M. C., Mohr, E., & Chase, T. N. (1998). The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS): Preliminary clinical validity. Journal of Clinical and Experimental Neuropsychology, 20(3), 310-319.
Randolph, C., Lansing, A. E., Ivnik, R. J., Cullum, C. M., & Hermann, B. P. (1999). Determinants of confrontation naming performance. Archives of Clinical Neuropsychology, 14(6), 489-496.
Rao, S. L., & Andrade, C. (1998). Selective Reminding Test to measure verbal and visual memory. Indian Journal of Clinical Psychology, 25(2), 149-153.
Rao, S. M., Mittenberg, W., Bernardin, L., Haughton, V., & Leo, G. J. (1989). Neuropsychological test findings in subjects with leukoaraiosis. Archives of Neurology, 46, 40-44.
Rao, S. M., & Cognitive Function Study Group of the National Multiple Sclerosis Society (1990). A manual for the Brief Battery of Neuropsychological Tests in multiple sclerosis. Milwaukee: Medical College of Wisconsin.
Rao, S. M., Leo, G. J., Bernardin, L., & Unverzagt, F. (1991a). Cognitive dysfunction in multiple sclerosis: I. Frequency, patterns, and prediction. Neurology, 41(5), 685-691.
Rao, S. M., Leo, G. J., Ellington, L., Nauertz, T., Bernardin, L., & Unverzagt, F. (1991b). Cognitive dysfunction in multiple sclerosis: II. Impact on employment and social functioning. Neurology, 41(5), 692-696.
Rapport, L. J., Dutra, R. L., Webster, J. S., Charter, R., & Morrill, B. (1995). Hemispatial deficits on the Rey-Osterrieth Complex Figure drawing. Clinical Neuropsychologist, 9(2), 169-179.
Rapport, L. J., Charter, R. A., Dutra, R. L., Farchione, T. J., & Kingsley, J. J. (1997). Psychometric properties of the Rey-Osterrieth Complex Figure: Lezak-Osterrieth versus Denman scoring systems. Clinical Neuropsychologist, 11(1), 46-53.
Rapport, L. J., Van Voorhis, A., Tzelepis, A., & Friedman, S. R. (2001). Executive functioning in adult attention-deficit hyperactivity disorder. Clinical Neuropsychologist, 15(4), 479-491.
Rapport, L. J., & Webster, J. S. (2003). Assessment of unilateral neglect using the ROCF. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources.
Raskin, S. A., Borod, J. C., Wasserstein, J., Bodis-Wollner, I., Coscia, L., & Yahr, M. D. (1990). Visuospatial orientation in Parkinson's disease. International Journal of Neuroscience, 51(1-2), 9-18.
Raskin, S. A., Sliwinski, M., & Borod, J. C. (1992). Clustering strategies on tasks of verbal fluency in Parkinson's disease. Neuropsychologia, 30(1), 95-99.
Rasmussen, K., Jeppesen, H. J., & Sabroe, S. (1993). Psychometric tests for assessment of brain function after solvent exposure. American Journal of Industrial Medicine, 24(5), 553-565.
Rasmusson, D. X., Bylsma, F. W., & Brandt, J. (1995). Stability of performance on the Hopkins Verbal Learning Test. Archives of Clinical Neuropsychology, 10(1), 21-26.
Rasmusson, D. X., Carson, K. A., Brookmeyer, R., Kawas, C., & Brandt, J. (1996). Predicting rate of cognitive decline in probable Alzheimer's disease. Brain and Cognition. Special Issue: The dementias, 31(2), 133-147.
Rasmusson, D. X., Zonderman, A. B., Kawas, C., & Resnick, S. M. (1998). Effects of age and dementia on the Trail Making Test. Clinical Neuropsychologist, 12(2), 169-178.
Rathbun, J., & Smith, A. (1982). Comment on the validity of Boyd's validation study of the Hooper Visual Organization Test. Journal of Consulting and Clinical Psychology, 50, 281-283.
Rattan, G., Dean, R. S., & Fischer, W. E. (1986). Response time as a dependent measure on the
Category Test of the Halstead-Reitan Neuropsychological Test Battery. Archives of Clinical Neuropsychology, 1(2), 175-182.
Ravdin, L. D., Katzen, H. L., Agrawal, P., & Relkin, N. R. (2003). Letter and semantic fluency in older adults: Effects of mild depressive symptoms and age-stratified normative data. Clinical Neuropsychologist, 17(2), 195-202.
Ravnkilde, B., Videbech, P., Rosenberg, R., Gjedde, A., & Gade, A. (2002). Putative tests of frontal lobe function: A PET-study of brain activation during Stroop's Test and verbal fluency. Journal of Clinical and Experimental Neuropsychology, 24(4), 534-547.
Raz, N., Gunning-Dixon, F., Head, D., Dupuis, J. H., & Acker, J. D. (1998). Neuroanatomical correlates of cognitive aging: Evidence from structural magnetic resonance imaging. Neuropsychology, 12, 95-112.
Razani, J., Boone, K., Miller, B. L., Lee, A., & Sherman, D. (2001). Neuropsychological performance of right- and left-frontotemporal dementia compared to Alzheimer's disease. Journal
of the International Neuropsychological Society, 7(4), 468-480.
Reader, M. J., Harris, E. L., Schuerholz, L. J., & Denckla, M. B. (1994). Attention deficit hyperactivity disorder and executive dysfunction. Developmental Neuropsychology, 10(4), 493-512.
Rebok, G., Brandt, J., & Folstein, M. (1990). Longitudinal cognitive decline in patients with Alzheimer's disease. Journal of Geriatric Psychiatry and Neurology, 3(2), 91-97.
Reed, H. B. C., & Reitan, R. M. (1962). The significance of age in the performance of a complex psychomotor task by brain-damaged and nonbrain-damaged subjects. Journal of Gerontology, 17, 193-196.
Reed, H. B. C., & Reitan, R. M. (1963a). A comparison of the effects of the normal aging process with the effects of organic brain damage on adaptive abilities. Journal of Gerontology, 18, 177-179.
Reed, H. B. C., & Reitan, R. M. (1963b). Changes in psychological test performance associated with the normal aging process. Journal of Gerontology, 18, 271-274.
Regard, M., & Landis, T. (1994). The "smiley": A graphical expression of mood in right anterior cerebral lesions. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 7(4), 303-307.
Regard, M., Strauss, E., & Knapp, P. (1982). Children's production on verbal and non-verbal fluency tasks. Perceptual and Motor Skills, 55, 839-844.
Reinprecht, F., Elmstahl, S., Janzon, L., & Andre-Petersson, L. (2003). Hypertension and changes of cognitive function in 81-year-old men: A 13-year follow-up of the population study "Men born in 1914," Sweden. Journal of Hypertension, 21, 57-66.
Reitan, R. M. (1955a). Certain differential effects of left and right cerebral lesions in human adults. Journal of Comparative and Physiological Psychology, 48, 474-477.
Reitan, R. M. (1955b). Investigation of the validity of Halstead's measures of biological intelligence. Archives of Neurology and Psychiatry, 73, 28-35.
Reitan, R. M. (1955c). The relation of the Trail Making Test to organic brain damage. Journal of Consulting Psychology, 19(5), 393-394.
Reitan, R. M. (1955d). The distribution according to age of a psychologic measure dependent upon organic brain functions. Journal of Gerontology, 10, 338-340.
Reitan, R. M. (1958). Validity of the Trail Making Test as an indicator of organic brain damage. Perceptual and Motor Skills, 8, 271-276.
Reitan, R. M. (1959). The comparative effects of brain damage on the Halstead impairment index and the Wechsler-Bellevue scale. Journal of Clinical Psychology, 15, 281-285.
Reitan, R. M. (1964). Psychological deficits resulting from cerebral lesions in man. In J. M. Warren & K. A. Akert (Eds.), The frontal granular cortex and behavior. New York: McGraw-Hill.
Reitan, R. M. (1971). Trail Making Test results for normal and brain-damaged children. Perceptual and Motor Skills, 33(2), 575-581.
Reitan, R. M. (1979). Manual for administration of
neuropsychological test batteries for adults and children. Tucson, AZ: Neuropsychology Press.
Reitan, R. M. (1985). The Halstead-Reitan Neuropsychological Test Battery. Tucson, AZ: Neuropsychology Press.
Reitan, R. M., & Wolfson, D. (1985). The Halstead-Reitan Neuropsychological Test Battery. Theory and clinical interpretation. Tucson, AZ: Neuropsychology Press.
Reitan, R. M., & Wolfson, D. (1988). Traumatic brain injury. Volume II: Recovery and rehabilitation. Tucson, AZ: Neuropsychology Press.
Reitan, R. M., & Wolfson, D. (1993). The Halstead-Reitan Neuropsychological Test Battery: Theory and clinical interpretation (2nd ed.). Tucson, AZ: Neuropsychology Press.
Reitan, R. M., & Wolfson, D. (1995a). Category Test and Trail Making Test as measures of frontal lobe functions. Clinical Neuropsychologist, 9(1), 50-56.
Reitan, R. M., & Wolfson, D. (1995b). Influence of age and education on neuropsychological test results. Clinical Neuropsychologist, 9(2), 151-158.
Reitan, R. M., & Wolfson, D. (2001). Critical evaluation of "Assessment: Neuropsychological testing of adults." Archives of Clinical Neuropsychology, 16(3), 215-226.
Reiter, J. C. (2000). Measuring cognitive processes underlying picture naming in Alzheimer's and cerebrovascular dementia: A general processing tree approach. Journal of Clinical and Experimental Neuropsychology, 22(3), 351-369.
Rempfer, M. V., Hamera, E. K., Brown, C. E., & Cromwell, R. L. (2003). The relations between cognition and the independent living skill of shopping in people with schizophrenia. Psychiatry Research, 117(2), 103-112.
Rende, B., Ramsberger, G., & Miyake, A. (2002). Commonalities and differences in the working memory components underlying letter and category fluency tasks: A dual-task investigation. Neuropsychology, 16(3), 309-321.
Resnick, S. M., Trotman, K. M., Kawas, C., & Zonderman, A. B. (1995). Age-associated changes in specific errors on the Benton Visual Retention Test. Journals of Gerontology. Series B: Psychological Sciences and Social Sciences, 50B(3), P171-P178.
Resnick, S. M., Metter, E. J., & Zonderman, A. B. (1997). Estrogen replacement therapy and longitudinal decline in visual memory: A possible protective effect? Neurology, 49(6), 1491-1497.
Retzlaff, P., Butler, M., & Vanderploeg, R. D. (1992). Neuropsychological battery choice and theoretical orientation: A multivariate analysis. Journal of Clinical Psychology, 48(5), 666-672.
Rey, A. (1941). L'examen psychologique dans les cas d'encephalopathie traumatique. Archives de Psychologie, 28, 286-340.
Rey, A. (1964). L'examen clinique en psychologie. Paris: Presses Universitaires de France.
Rey, A., & Osterrieth, P. A. (1993). Translations of excerpts from Andre Rey's "Psychological examination of traumatic encephalopathy" and P. A. Osterrieth's "The Complex Figure Copy Test." Clinical Neuropsychologist, 7(1), 4-21.
Rey, G. J., & Benton, A. L. (1991). Examen de afasia multilingue. Iowa City: AJA Associates.
Rey, G. J., Feldman, E., Rivas-Vazquez, R., Levin, B. E., & Benton, A. (1999). Neuropsychological test development and normative data on Hispanics. Archives of Clinical Neuropsychology, 14(7), 593-601.
Reynolds, C. R. (2002). Comprehensive Trail Making Test. Lutz, FL: Psychological Assessment Resources.
Rezai, K. (1988). Wisconsin Card Sorting Test. Version 1.1. Iowa City: University of Iowa.
Rich, J. B., Troyer, A. K., Bylsma, F. W., & Brandt, J. (1999). Longitudinal analysis of phonemic clustering and switching during word-list generation in Huntington's disease. Neuropsychology, 13(4), 525-531.
Richardson, E. D., & Marottoli, R. A. (1996). Education-specific normative data on common neuropsychological indices for individuals older than 75 years. Clinical Neuropsychologist, 10(4), 375-381.
Ricker, J. H., & Axelrod, B. N. (1994). Analysis of an oral paradigm for the Trail Making Test. Assessment, 1(1), 47-51.
Ricker, J. H., & Axelrod, B. N. (1995). Hooper Visual Organization Test: Effects of object naming ability. Clinical Neuropsychologist, 9(1), 57-62.
Ricker, J. H., Axelrod, B. N., & Houtler, B. D. (1996). Clinical validation of the oral Trail Making Test. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 9(1), 50-53.
Rimel, R. W., Giordani, B., Barth, J. T., & Jane, J. A. (1982). Moderate head injury: Completing the clinical spectrum of brain trauma. Neurosurgery, 11(3), 344-351.
Ripich, D. N., Petrill, S. A., Whitehouse, P. J., & Ziol, E. W. (1995). Gender differences in language of AD patients: A longitudinal study. Neurology, 45, 299-302.
Risser, M. G., & Bowers, T. G. (1993). Cognitive and neuropsychological characteristics of attention deficit hyperactivity disorder children receiving stimulant medications. Perceptual and Motor Skills, 77(3, Pt 1), 1023-1031.
Ritchie, K., & Hallerman, E. (1989). Cross-validation of a dementia screening test in a heterogeneous population. International Journal of Epidemiology, 18(3), 717-719.
Robert, P. H., Lafont, V., Medecin, I., Berthet, L., Thauby, S., Baudu, C., et al. (1998). Clustering and switching strategies in verbal fluency tasks: Comparison between schizophrenics and healthy adults. Journal of the International Neuropsychological Society, 4(6), 539-546.
Roberts, P. M., Garcia, L. J., Desrochers, A., & Hernandez, D. (2002). English performance of proficient bilingual adults on the Boston Naming Test. Aphasiology, 16(4-6), 635-645.
Robertson-Tchabo, E. A., & Arenberg, D. (1989). Assessment of memory in older adults. In T. Hunt & J. Clyde (Eds.), Testing older adults: A reference guide for geropsychological assessments. Austin, TX: Pro-Ed.
Robertson-Tchabo, E. A., Arenberg, D., Tobin, J. D., & Plotz, J. B. (1986). A longitudinal study of cognitive performance in noninsulin dependent (type II) diabetic men. Experimental Gerontology, 21(4-5), 459-467.
Robinson, A. L., Heaton, R. K., Lehman, R. A., & Stilson, D. (1980). The utility of the Wisconsin Card Sorting Test in detecting and localizing frontal brain lesions. Journal of Consulting and Clinical Psychology, 48, 605-614.
Robinson, L. J., Kester, D. B., Saykin, A. J., Kaplan, E. F., et al. (1991). Comparison of two short forms of the Wisconsin Card Sorting Test. Archives of Clinical Neuropsychology, 6(1-2), 27-33.
Robinson-Whelen, S. (1992). Benton Visual Retention Test performance among normal and demented older adults. Neuropsychology, 6(3), 261-269.
Rochford, J., Grant, I., & LaVigne, G. (1977). Medical students and drugs: Further neuropsychological and use pattern considerations. International Journal of the Addictions, 12(8), 1057-1065.
Rodriguez-Aranda, C. (2003). Reduced writing and reading speed and age-related changes in verbal fluency tasks. Clinical Neuropsychologist, 17(2), 203-215.
Rogers, R. D., Andrews, T. C., Grasby, P. M., Brooks, D. J., & Robbins, T. W. (2000). Contrasting cortical and subcortical activation produced by attentional-set shifting and reversal learning in humans. Journal of Cognitive Neuroscience, 12, 142-162.
Rohling, M. L., Langhinrichsen-Rohling, J., & Miller, L. S. (2003a). Actuarial assessment of malingering: Rohling's interpretative method. In R. D. Franklin (Ed.), Prediction in forensic and neuropsychology: Sound statistical practices. Hillsdale, NJ: Lawrence Erlbaum.
Rohling, M. L., Williamson, D. J., Miller, L. S., & Adams, R. L. (2003b). Using the Halstead-Reitan Battery to diagnose brain damage: A comparison of the predictive power of traditional techniques to Rohling's interpretive method. Clinical Neuropsychologist, 17(4), 531-543.
Rollnik, J. D., Borsutzky, M., Huber, T. J., Mogk, H., Seifert, J., Emrich, H. M., et al. (2002). Short-term cognitive improvement in schizophrenics treated with typical and atypical neuroleptics. Neuropsychobiology, 45(2), 74-80.
Roman, D. D., Edwall, G. E., Buchanan, R. J., & Patton, J. H. (1991). Extended norms for the Paced Auditory Serial Addition Task. Clinical Neuropsychologist, 5(1), 33-40.
Roman, G. C., Tatemichi, T. K., Erkinjuntti, T., Cummings, J. L., Masdeu, J. C., Garcia, J. H., et al. (1993). Vascular dementia: Diagnostic criteria for research studies. Report of the NINDS-AIREN International Workshop. Neurology, 43(2), 250-260.
Rosen, W. G. (1980). Verbal fluency in aging and dementia. Journal of Clinical Neuropsychology, 2(2), 135-146.
Rosenberg, S., Ryan, J. J., & Prifitera, A. (1984). Rey Auditory-Verbal Learning Test performance of patients with and without memory impairment. Journal of Clinical Psychology, 40(3), 785-787.
Rosenfeld, B., Sands, S. A., & Van Gorp, W. G. (2000). Have we forgotten the base rate problem? Methodological issues in the detection of distortion. Archives of Clinical Neuropsychology, 15(4), 349-359.
Rosenthal, R. (1979). The "file drawer" problem and tolerance for null results. Psychological Bulletin, 86, 638-641.
Rosenthal, R. (1983). Assessing the statistical and social importance of the effects of psychotherapy. Journal of Consulting and Clinical Psychology, 51, 4-13.
Rosenthal, R. (1984). Meta-analytic procedures for social research. Beverly Hills, CA: Sage.
Rosenthal, R. (1995). Writing meta-analytic reviews. Psychological Bulletin, 118(2), 183-192.
Ross, S. R., Millis, S. R., & Rosenthal, M. (1997). Neuropsychological prediction of psychosocial outcome after traumatic brain injury. Applied Neuropsychology, 4(3), 165-170.
Ross, T. P. (2003). The reliability of cluster and switch scores for the Controlled Oral Word Association Test. Archives of Clinical Neuropsychology, 18(2), 153-164.
Ross, T. P., & Lichtenberg, P. A. (1998). Expanded normative data for the Boston Naming Test for use with urban, elderly medical patients. Clinical Neuropsychologist, 12(4), 475-481.
Ross, T. P., Lichtenberg, P. A., & Christensen, B. K. (1995). Normative data on the Boston Naming Test for elderly adults in a demographically diverse medical sample. Clinical Neuropsychologist, 9(4), 321-325.
Ross, T. P., Foard, E. L., Hiott, F. B., & Vincent, A. (2003). The reliability of production strategy scores for the Ruff Figural Fluency Test. Archives of Clinical Neuropsychology, 18(8), 879-891.
Rosselli, M., & Ardila, A. (1991). Effects of age, education, and gender on the Rey-Osterrieth Complex Figure. Clinical Neuropsychologist, 5, 370-376.
Rosselli, M., & Ardila, A. (1996). Cognitive effects of cocaine and polydrug abuse. Journal of Clinical and Experimental Neuropsychology, 18(1), 122-135.
Rosselli, M., Ardila, A., Araujo, K., Weekes, V. A., Caracciolo, V., Padilla, M., et al. (2000). Verbal fluency and repetition skills in healthy older Spanish-English bilinguals. [Special Issue: Assessment of Spanish-speaking populations]. Applied Neuropsychology, 7(1), 17-24.
Rosselli, M., Ardila, A., Bateman, J. R., & Guzman, M. (2001a). Neuropsychological test scores, academic performance, and developmental disorders in Spanish-speaking children. Developmental Neuropsychology, 20(1), 355-373.
Rosselli, M., Ardila, A., Lubomski, M., Murray, S., & King, K. (2001b). Personality profile and neuropsychological test performance in chronic cocaine-abusers. International Journal of Neuroscience, 110(1-2), 55-72.
Rosselli, M., Ardila, A., Salvatierra, J., Marquez, M., Matos, L., & Weekes, V. A. (2002a). A cross-linguistic comparison of verbal fluency tests. International Journal of Neuroscience, 112(6), 759-776.
Rosselli, M., Ardila, A., Santisi, M. N., Arecco, M. d. R., Salvatierra, J., Conde, A., et al. (2002b). Stroop effect in Spanish-English bilinguals. Journal of the International Neuropsychological Society, 8(6), 819-827.
Rosser, A., & Hodges, J. R. (1994). Initial letter and semantic category fluency in Alzheimer's disease, Huntington's disease, and progressive supranuclear palsy. Journal of Neurology, Neurosurgery, and Psychiatry, 57(11), 1389-1394.
Rossi, A., Arduini, L., Daneluzzo, E., Bustini, M., Prosperini, P., & Stratta, P. (2000). Cognitive function in euthymic bipolar patients, stabilized schizophrenic patients, and healthy controls. Journal of Psychiatric Research, 34, 333-339.
Rossini, E. D., & Karl, M. A. (1994). The Trail Making Test A and B: A technical note on structural nonequivalence. Perceptual and Motor Skills, 78(2), 625-626.
Roth, E., Davidoff, G., Thomas, P., Doljanac, R., Dijkers, M., Berent, S., et al. (1989). A controlled study of neuropsychological deficits in acute spinal cord injury patients. Paraplegia, 27(6), 480-489. Rounsaville, B. J., Jones, C., Novelly, R. A., & Kleber, H. (1982). Neuropsychological functioning in opiate addicts. Journal of Nervous and Mental Disease, 170(4), 209-216. Royan, J., Tombaugh, T. N., Rees, L., & Francis, M. (2004). The Adjusting-Paced Serial Addition
Test (Adjusting-PSAT): Thresholds for speed of information processing as a function of stimulus modality and problem complexity. Archives of Clinical Neuropsychology, 19(1), 131-143. Ruchinskas, R. (2003). Limitations of the Oral Trail Making Test in a mixed sample of older individuals. Clinical Neuropsychologist, 17(2), 137-142. Ruchinskas, R., Broshek, D., Crews, W. D., Barth, J., Francis, J., & Robbins, M. (2000). A neuropsychological normative database for lung transplant candidates. Journal of Clinical Psychology in Medical Settings, 7(2), 107-112. Ruff, R. (1988). Ruff Figural Fluency Test. San Diego: Neuropsychological Resources. Ruff, R. M. (1994). What role does depression play on the performance of the Ruff 2 & 7 Selective Attention Test? Perceptual and Motor Skills, 78(1), 63-66. Ruff, R. M. (1996). Ruff Figural Fluency Test: Professional manual. Odessa, FL: Psychological Assessment Resources. Ruff, R. M., & Allen, C. C. (1996). Ruff 2 & 7 Selective Attention Test: Professional manual. Odessa, FL: Psychological Assessment Resources. Ruff, R. M., & Crouch, J. A. (1991). Neuropsychological test instruments in clinical trials. In E. Mohr & P. Brouwers (Eds.), Handbook of clinical trials: The neurobehavioral approach. Lisse: Swets & Zeitlinger. Ruff, R. M., & Jurica, P. J. (2003). The ROCF and frontal lobe damage. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Ruff, R. M., & Parker, S. B. (1993). Gender- and age-specific changes in motor speed and eye-hand coordination in adults: Normative values for the Finger Tapping and Grooved Pegboard Tests. Perceptual and Motor Skills, 76, 1219-1230. Ruff, R. M., Evans, R. W., & Light, R. H. (1986a). Automatic detection vs. controlled search: A paper-and-pencil approach. Perceptual and Motor Skills, 62(2), 407-416. Ruff, R. M., Evans, R., & Marshall, L. F. (1986b).
Impaired verbal and figural fluency after head injury. Archives of Clinical Neuropsychology, 1(2), 87-101. Ruff, R. M., Levin, H. S., & Marshall, L. F. (1986c). Neurobehavioral methods of assessment and the study of outcome in minor head injury. Journal of Head Trauma Rehabilitation, 1(2), 43-52. Ruff, R. M., Light, R. H., & Evans, R. W. (1987). The Ruff Figural Fluency Test: A normative study with adults. Developmental Neuropsychology, 3(1), 37-51.
Ruff, R. M., Baser, C. A., Johnston, J. W., Marshall, L. F., et al. (1989a). Neuropsychological rehabilitation: An experimental study with head-injured patients. Journal of Head Trauma Rehabilitation, 4(3), 20-36. Ruff, R. M., Light, R. H., & Quayhagen, M. (1989b). Selective Reminding Tests: A normative study of verbal learning in adults. Journal of Clinical and Experimental Neuropsychology, 11(4), 539-550. Ruff, R. M., Niemann, H., Allen, C. C., Farrow, C. E., & Wylie, T. (1992). The Ruff 2 & 7 Selective Attention Test: A neuropsychological application. Perceptual and Motor Skills, 75(3, Pt 2), 1311-1319. Ruff, R. M., Marshall, L. F., Crouch, J., Klauber, M. R., Levin, H. S., Barth, J., et al. (1993). Predictors of outcome following severe head trauma: Follow-up data from the Traumatic Coma Data Bank. Brain Injury, 7(2), 101-111. Ruff, R. M., Allen, C. C., Farrow, C. E., Niemann, H., & Wylie, T. (1994). Figural fluency: Differential impairment in patients with left versus right frontal lobe lesions. Archives of Clinical Neuropsychology, 9(1), 41-55. Ruff, R. M., Light, R. H., Parker, S. B., & Levin, H. S. (1996). Benton Controlled Oral Word Association Test: Reliability and updated norms. Archives of Clinical Neuropsychology, 11(4), 329-338. Ruff, R. M., Light, R. H., Parker, S. B., & Levin, H. S. (1997). The psychological construct of word fluency. Brain and Language, 57, 394-405. Ruffolo, L. F., Guilmette, T. J., & Willis, W. G. (2000). Comparison of time and error rates on the Trail Making Test among patients with head injuries, experimental malingerers, patients with suspect effort on testing, and normal controls. Clinical Neuropsychologist, 14(2), 223-230. Ruffolo, J. S., Javorsky, D. J., Tremont, G., Westervelt, H. J., & Stern, R. A. (2001). A comparison of administration procedures for the Rey-Osterrieth Complex Figure: Flowcharts versus pen switching. Psychological Assessment, 13(3), 299-305. Ruggieri, R.
M., Palermo, R., Vitello, G., Gennuso, M., Settipani, N., & Piccoli, F. (2003). Cognitive impairment in patients suffering from relapsing-remitting multiple sclerosis with EDSS ≤ 3.5. Acta Neurologica Scandinavica, 108, 323-326. Russell, E. W. (1975). A multiple scoring method for the assessment of complex memory functions. Journal of Consulting and Clinical Psychology, 43, 800-809. Russell, E. W. (1984). Theory and development of pattern analysis methods related to the
Halstead-Reitan Battery. In P. E. Logue & J. M. Schear (Eds.), Clinical neuropsychology: A multidisciplinary approach (pp. 50-98). Springfield, IL: Thomas. Russell, E. W. (1985). Comparison of the TPT 10 and 6 hole form board. Journal of Clinical Psychology, 41(1), 68-81. Russell, E. W. (1987). A reference scale method for constructing neuropsychological test batteries. Journal of Clinical and Experimental Neuropsychology, 9(4), 376-392. Russell, E. W. (1988). Renorming Russell's version of the Wechsler Memory Scale. Journal of Clinical and Experimental Neuropsychology, 10(2), 235-249. Russell, E. W., & Levy, M. (1987). Revision of the Halstead Category Test. Journal of Consulting and Clinical Psychology, 55(6), 898-901. Russell, E., & Starkey, R. (1993). Halstead-Russell neuropsychological evaluation system (HRNES). Los Angeles: Western Psychological Services. Russell, E. W., & Starkey, R. I. (2001). Halstead-Russell neuropsychological evaluation system-revised (HRNES-R). Los Angeles: Western Psychological Services. Russell, E. W., Neuringer, C., & Goldstein, G. (1970). Assessment of brain damage: A neuropsychological key approach. New York: Wiley. Rutschmann, J., Cornblatt, B., & Erlenmeyer-Kimling, L. (1980). Auditory recognition memory in adolescents at risk for schizophrenia: Report on a verbal continuous recognition task. Psychiatry Research, 3(2), 151-161. Ryan, C. M., Morrow, L. A., Bromet, E. J., & Parkinson, D. K. (1987). Assessment of neuropsychological dysfunction in the workplace: Normative data from the Pittsburgh Occupational Exposures Test Battery. Journal of Clinical and Experimental Neuropsychology, 9(6), 665-679. Ryan, J. J., & Geisser, M. E. (1986). Validity and diagnostic accuracy of an alternate form of the Rey Auditory Verbal Learning Test. Archives of Clinical Neuropsychology, 1, 209-217. Ryan, J. J., Rosenberg, S. J., & Mittenberg, W. (1984). Factor analysis of the Rey Auditory-Verbal Learning Test.
International Journal of Clinical Neuropsychology, 6(4), 239-241. Ryan, J. J., Geisser, M. E., Randall, D. M., & Georgemiller, R. J. (1986). Alternate form reliability and equivalency of the Rey Auditory Verbal Learning Test. Journal of Clinical and Experimental Neuropsychology, 8(5), 611-616. Ryan, J. J., Paolo, A. M., & Brungardt, T. M. (1990). Standardization of the Wechsler Adult
Intelligence Scale-Revised for persons 75 years and older. Psychological Assessment, 2, 404-411. Ryan, J. J., Paolo, A. M., & Skrade, M. (1992). Rey Auditory Verbal Learning Test performance of a federal corrections sample with acquired immunodeficiency syndrome. International Journal of Neuroscience, 64, 177-181. Rybakowski, J., & Borkowska, A. (2001). The effect of treatment with risperidone, olanzapine or phenothiazine on cognitive functions in patients with schizophrenia. International Journal of Psychiatry in Clinical Practice, 5, 249-256. Sackellares, D. K., & Sackellares, J. C. (2001). Impaired motor function in patients with psychogenic pseudoseizures. Epilepsia, 42(12), 1600-1606. Sackett, D. L., Straus, S. E., Richardson, W. S., Rosenberg, W., & Haynes, R. B. (2000). Evidence-based medicine. New York: Churchill Livingstone. Sacks, T. L., Clark, C. R., Pols, R. G., & Geffen, L. B. (1991). Comparability and stability of performance of six alternate forms of the Dodrill-Stroop Color-Word Test. Clinical Neuropsychologist, 5, 220-225. Sagawa, K., Kawakatsu, S., Komatani, A., & Totsuka, S. (1990a). Frontality, laterality, and cortical-subcortical gradient of cerebral blood flow in schizophrenia: Relationship to symptoms and neuropsychological functions. Neuropsychobiology, 24(1), 1-7. Sagawa, K., Kawakatsu, S., Shibuya, I., Oiji, A., Morinobu, S., Komatani, A., et al. (1990b). Correlation of regional cerebral blood flow with performance on neuropsychological tests in schizophrenic patients. Schizophrenia Research, 3(4), 241-246. Salthouse, T. A., & Fristoe, N. (1995). A process analysis of adult age effects on a computer-administered Trail Making Test. Neuropsychology, 9, 518-528. Salthouse, T. A., Fristoe, N., & Rhee, S. H. (1996). How localized are age-related effects on neuropsychological measures? Neuropsychology, 10(2), 272-285. Salthouse, T. A., Toth, J. P., Hancock, H. E., & Woodard, J. L. (1997).
Controlled and automatic forms of memory and attention: Process purity and the uniqueness of age-related influences.
Journals of Gerontology. Series B: Psychological Sciences and Social Sciences, 52B(5), P216-P228. Salthouse, T. A., Toth, J., Daniels, K., Parks, C., Pak, R., Wolbrette, M., et al. (2000). Effects of aging on efficiency of task switching in a variant of the Trail Making Test. Neuropsychology, 14(1), 102-111.
Salthouse, T. A., Atkinson, T. M., & Berish, D. E. (2003). Executive functioning as a potential mediator of age-related cognitive decline in normal adults. Journal of Experimental Psychology: General, 132(4), 566-594. Sampson, H. (1956). Pacing and performance on a serial addition task. Canadian Journal of Psychology, 10, 219-225. Samuels, I., Butters, N., & Fedio, P. (1972). Short term memory disorders following temporal lobe removals in humans. Cortex, 8, 283-298. Samuels, I., Butters, N., Fedio, P., & Cox, C. (1980). Deficits in short-term auditory memory for verbal material following right temporal removals in humans. International Journal of Neuroscience, 11(2), 101-107. Sasher, T. M., & Fastenau, P. S. (2001). Preliminary child normative data for the Extended Complex Figure Test (ECFT). Clinical Neuropsychologist, 15, 258. Saoud, M., d'Amato, T., Gutknecht, C., Triboulet, P., Bertaud, J.-P., Marie-Cardine, M., et al. (2000). Neuropsychological deficit in siblings discordant for schizophrenia. Schizophrenia Bulletin, 26(4), 893-902. Satz, P. (1988). Neuropsychological testimony: Some emerging concerns. Clinical Neuropsychologist, 2, 89-100. Satz, P. (1993). Brain reserve capacity on symptom onset after brain injury: A formulation and review of evidence for threshold theory. Neuropsychology, 7, 273-295. Satz, P., & Mogel, S. (1962). An abbreviation of the WAIS for clinical use. Journal of Clinical Psychology, 18, 77-79. Savage, C. R., & Otto, M. W. (2003). Evaluating nonverbal memory in obsessive-compulsive disorder with the ROCF. In J. A. Knight (Ed.),
The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Savage, C. R., Baer, L., Keuthen, N. J., Brown, H. D., Rauch, S. L., & Jenike, M. A. (1999). Organizational strategies mediate nonverbal memory impairment in obsessive-compulsive disorder. Biological Psychiatry, 45(7), 905-916. Savage, C. R., Deckersbach, T., Wilhelm, S., Rauch, S. L., Baer, L., Reid, T., et al. (2000). Strategic processing and episodic memory impairment in obsessive compulsive disorder. Neuropsychology, 14(1), 141-151. Savage, C. R., Deckersbach, T., Heckers, S., Wagner, A. D., Schacter, D. L., Alpert, N. M., et al. (2001). Prefrontal regions supporting spontaneous and directed application of verbal
learning strategies. Evidence from PET. Brain, 124(1), 219-231. Savage, R. M., & Gouvier, W. D. (1992). Rey Auditory-Verbal Learning Test: The effects of age and gender, and norms for delayed recall and story recognition trials. Archives of Clinical Neuropsychology, 7, 407-414. Savard, R. J., Rey, A. C., & Post, R. M. (1980). Halstead-Reitan Category Test in bipolar and unipolar affective disorders. Relationship to age and phase of illness. Journal of Nervous and Mental Disease, 168(5), 297-304. Sawrie, S. M., Martin, R. C., Gilliam, F. G., Faught, R. E., Maton, B., Hugg, J. W., et al. (2000). Visual confrontation naming and hippocampal function: A neural network study using quantitative 1H magnetic resonance spectroscopy. Brain, 123(4), 770-780. Saxton, J., Ratcliff, G., Munro, C. A., Coffey, E. C., Becker, J. T., Fried, L., et al. (2000a). Normative data on the Boston Naming Test and two equivalent 30-item short forms. Clinical Neuropsychologist, 14(4), 526-534. Saxton, J., Ratcliff, G., Newman, A., Belle, S., Fried, L., Yee, J., et al. (2000b). Cognitive test performance and presence of subclinical cardiovascular disease in the cardiovascular health study. Neuroepidemiology, 19(6), 312-319. Saxton, J. A., Becker, J. T., & Wisniewski, S. (2003). The ROCF and dementia. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Scarone, S., Abbruzzese, M., & Gambini, O. (1993). The Wisconsin Card Sorting Test discriminates schizophrenic patients and their siblings. Schizophrenia Research, 10(2), 103-107. Schaie, K. W. (1983). The Seattle Longitudinal Study: A 21-year exploration of psychometric intelligence in adulthood. In K. W. Schaie (Ed.), Longitudinal studies of adult psychological development (pp. 64-135). New York: Guilford. Schaie, K. W., & Parham, I. A. (1977). Cohort-sequential analyses of adult intellectual development.
Developmental Psychology, 13(6), 649-653. Schaie, K. W., & Strother, C. R. (1968a). Cognitive and personality variables in college graduates of advanced age. In G. A. Talland (Ed.), Human aging and behavior. New York: Academic Press. Schaie, K. W., & Strother, C. R. (1968b). A cross-sectional study of age changes in cognitive behavior. Psychological Bulletin, 70, 671-680. Schapiro, M. B., Berman, K. F., Alexander, G. E., Weinberger, D. R., & Rapoport, S. I. (1999).
Regional cerebral blood flow in Down syndrome adults during the Wisconsin Card Sorting Test: Exploring cognitive activation in the context of poor performance. Biological Psychiatry, 45(9), 1190-1196. Schatz, P. (2001). Commentary on "Psychometric concerns in neuropsychological testing" [Special Issue: Controversies in neuropsychology]. NeuroRehabilitation, 16(4), 303. Schear, J. M. (1984). Neuropsychological assessment of the elderly in clinical practice. In P. E. Logue & J. M. Schear (Eds.), Clinical neuropsychology: A multidisciplinary approach. Springfield, IL: Thomas. Schear, J. M. (1986). Utility of half-credit scoring of Russell's revision of the Wechsler Memory Scale. Journal of Clinical Psychology, 42(5), 783-787. Schear, J. M., & Sato, S. D. (1989). Effects of visual acuity and visual motor speed and dexterity on cognitive test performance. Archives of Clinical Neuropsychology, 4(1), 25-32. Schloesser, R., Hutchinson, M., Joseffer, S., Rusinek, H., Saarimaki, A., Stevenson, J., et al. (1998). Functional magnetic resonance imaging of human brain activity in a verbal fluency task. Journal of Neurology, Neurosurgery, and Psychiatry, 64(4), 492-498. Schmidt, I. W., Brouwer, W. H., Vanier, M., & Kemp, F. (1996). Flexible adaptation to changing task demands in severe closed head injury patients: A driving simulator study. Applied Neuropsychology, 3(3-4), 155-165. Schmidt, M. (1996). Rey Auditory and Verbal Learning Test: A handbook. Los Angeles: Western Psychological Services. Schmidt, S. L., Oliveira, R. M., Rocha, F. R., & Abreu-Villaca, Y. (2000). Influences of handedness and gender on the grooved pegboard test. Brain and Cognition, 44(3), 445-454. Schmitt, F. A., Bigley, J. W., McKinnis, R., Logue, P. E., Evans, R. W., & Drucker, J. L. (1988). Neuropsychological outcome of zidovudine (AZT) treatment of patients with AIDS and AIDS-related complex. New England Journal of Medicine, 319(24), 1573-1578. Schmitter-Edgecombe, M., Vesneski, M., & Jones, D. W. R.
(2000). Aging and word-finding: A comparison of spontaneous and constrained naming tests. Archives of Clinical Neuropsychology, 15(6), 479-493. Schneider, W. (1989). Enhancing a standard experimental delivery system (MEL) for advanced psychological experimentation. Behavioral Research Methods, Instruments, and Computers.
Schonfield, A. D., Davidson, H., & Jones, H. (1983). An example of age-associated interference in memorizing. Journal of Gerontology, 38, 204-210. Schreiber, D. J., Goldman, H., Kleinman, K. M., Goldfader, P. R., & Snow, M. Y. (1976). The relationship between independent neuropsychological and neurological detection and localization of cerebral impairment. Journal of Nervous and Mental Disease, 162(5), 360-365. Schreiber, H., Rothmeier, J., Becker, W., Jurgens, R., Born, J., Stolz-Born, G., et al. (1995). Comparative assessment of saccadic eye movements, psychomotor and cognitive performance in schizophrenics, their first-degree relatives and control subjects. Acta Psychiatrica Scandinavica, 91, 195-201. Schreiber, H. E., Javorsky, D. J., Robinson, J. E., & Stern, R. A. (1999). Rey-Osterrieth Complex Figure performance in adults with attention deficit hyperactivity disorder: A validation study of the Boston Qualitative Scoring System. Clinical Neuropsychologist, 13(4), 509-520. Schretlen, D. J., Munro, C. A., Anthony, J. C., & Pearlson, G. D. (2003). Examining the range of normal intraindividual variability in neuropsychological test performance. Journal of the International Neuropsychological Society, 9(6), 864-870. Schuepbach, D., Goenner, F., Staikov, I., Mattie, H. P., Hell, D., & Brenner, H.-D. (2002a). Temporal modulation of cerebral hemodynamics under prefrontal challenge in schizophrenia: A transcranial Doppler sonography study. Psychiatry Research: Neuroimaging, 115(3), 155-170. Schuepbach, D., Merlo, M. C. G., Goenner, F., Staikov, I., Mattie, H. P., Dierks, T., et al. (2002b). Cerebral hemodynamic response induced by the Tower of Hanoi puzzle and the Wisconsin Card Sorting Test. Neuropsychologia, 40(1), 39-53. Schultheis, M. T., Caplan, B., Ricker, J. H., & Woessner, R. (2000). Fractioning the Hooper: A multiple-choice response format. Clinical Neuropsychologist, 14(2), 196-201. Schwartz, M. S., & Ivnik, R. J. (1980, September).
Wechsler Memory Scale I: Toward a more objective and systematic scoring system for the Logical Memory and Visual Reproduction subtests. Paper presented at the meeting of the American Psychological Association, Montreal, Canada. Schwartz, R. H., Gruenewald, P. J., Klitzner, M., & Fedio, P. (1989). Short-term memory impairment in cannabis-dependent adolescents. American Journal of Diseases of Children, 143(10), 1214-1219.
Searight, H. R., Dunn, E. J., Grisso, T., Margolis, R. B., & Gibbons, J. L. (1989). The relation of the Halstead-Reitan Neuropsychological Battery to ratings of everyday functioning in a geriatric sample. Neuropsychology, 3, 135-145. Segalowitz, S. J., Unsal, A., & Dywan, J. (1992). CNV evidence for the distinctiveness of frontal and posterior neural processes in a traumatic brain-injured population. Journal of Clinical and Experimental Neuropsychology, 14, 545-565. Seidel, W. T. (1994). Applicability of the Hooper Visual Organization Test to pediatric population: Preliminary findings. Clinical Neuropsychologist, 8, 59-68. Seidenberg, M., Gamache, M. P., Smith, M., Sackellares, J. C., Beck, N. C., Giordani, B., et al. (1984). Subject variables and performance on the Halstead Neuropsychological Test Battery: A multivariate analysis. Journal of Consulting and Clinical Psychology, 52(4), 658-662. Seidenberg, M., Hermann, B., Noe, A., & Wyler, A. R. (1995). Depression in temporal lobe epilepsy: Interaction between laterality of lesion and Wisconsin Card Sort performance. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 8(2), 81-87. Seidenberg, M., Hermann, B., Wyler, A. R., Davies, K., Dohan, F. C., Jr., & Leveroni, C. (1998). Neuropsychological outcome following anterior temporal lobectomy in patients with and without the syndrome of mesial temporal lobe epilepsy. Neuropsychology, 12(2), 303-316. Seidman, L. J., Yurgelun-Todd, D., Kremen, W. S., Woods, B. T., Goldstein, J. M., Faraone, S. V., et al. (1994). Relationship between prefrontal and temporal lobe MRI measures to neuropsychological performance in chronic schizophrenia. Biological Psychiatry, 35, 235-246. Seidman, L. J., Benedict, K. B., Biederman, J., Bernstein, J. H., et al. (1995). Performance of children with ADHD on the Rey-Osterrieth Complex Figure: A pilot neuropsychological study. Journal of Child Psychology and Psychiatry and Allied Disciplines, 36(8), 1459-1473. Seidman, L.
J., Biederman, J., Faraone, S. V., Weber, W., et al. (1997). Toward defining a neuropsychology of attention deficit-hyperactivity disorder: Performance of children and adolescents from a large clinically referred sample. Journal of Consulting and Clinical Psychology, 65(1), 150-160. Sellers, A. H., & Nadler, J. D. (1992). A survey of current neuropsychological assessment procedures used for different age groups. Psychotherapy in Private Practice, 11(3), 47-57.
Selnes, O. A., Jacobson, L., Machado, A. M., Becker, J. T., Wesch, J., Miller, E. N., et al. (1991). Normative data for a brief neuropsychological screening battery. Perceptual and Motor Skills, 73, 539-550. Shapiro, A. M., Benedict, R. H. B., Schretlen, D., & Brandt, J. (1999). Construct and concurrent validity of the Hopkins Verbal Learning Test-Revised. Clinical Neuropsychologist, 13(3), 348-358. Shapiro, D. M., & Harrison, D. W. (1990). Alternate forms of the AVLT: A procedure and test of form equivalency. Archives of Clinical Neuropsychology, 5, 405-410. Shawaryn, M. A., Schiaffino, K. M., LaRocca, N. G., & Johnston, M. V. (2002). Determinants of health-related quality of life in multiple sclerosis: The role of illness intrusiveness. Multiple Sclerosis, 8(4), 310-318. Shea, B., Dube, C., & Moher, D. (2001). Assessing the quality of reports of systematic reviews: The QUOROM statement compared to other tools. In M. Egger, G. D. Smith, & D. G. Altman (Eds.), Systematic reviews in health care: Meta-analysis in context. London: BMJ. Shean, G., Burnett, T., & Eckman, F. S. (2002). Symptoms of schizophrenia and neurocognitive test performance. Journal of Clinical Psychology, 58(7), 723-731. Shear, P. K., Wells, C. T., & Brock, A. M. (2000). The effect of semantic cuing on CVLT performance in healthy participants. Journal of Clinical and Experimental Neuropsychology, 22(5), 649-655. Sherer, M., Nick, T. G., Millis, S. R., & Novack, T. A. (2003). Use of the WCST and the WCST-64 in the assessment of traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 25(4), 512-520. Sherman, A. M., & Massman, P. J. (1999). Prevalence and correlates of category versus letter fluency discrepancies in Alzheimer's disease. Archives of Clinical Neuropsychology, 14(5), 411-418. Sherman, D. S., Boone, K., Lu, P., & Razani, J. (2002). Re-examination of a Rey Auditory Verbal Learning Test/Rey Complex Figure discriminant function to detect suspect effort.
Clinical Neuropsychologist, 16(3), 242-250. Sherman, E. M. S., Strauss, E., & Spellacy, F. (1997). Validity of the Paced Auditory Serial Addition Test (PASAT) in adults referred for neuropsychological assessment after head injury. Clinical Neuropsychologist, 11(1), 34-45. Sherrill, R. E. (1985). Comparison of three short forms of the Category Test. Journal of Clinical
and Experimental Neuropsychology, 7(3), 231-238. Shichita, K., Hatano, S., Ohashi, Y., Shibata, H., & Matuzaki, T. (1986). Memory changes in the Benton Visual Retention Test between ages 70 and 75. Journal of Gerontology, 41(3), 385-386. Shimamura, A. P., Salmon, D. P., Squire, L. R., & Butters, N. (1987). Memory dysfunction and word priming in dementia and amnesia. Behavioral Neuroscience, 101, 347-351. Shipley, J. E., Kupfer, D. J., Spiker, D. G., Shaw, D. H., Coble, P. A., Neil, J. F., et al. (1981). Neuropsychological assessment and EEG sleep in affective disorders. Biological Psychiatry, 16(10), 907-918. Shoqeirat, M. A., Mayes, A., MacDonald, C., Meudell, P., et al. (1990). Performance on tests sensitive to frontal lobe lesions by patients with organic amnesia: Leng & Parkin revisited. British Journal of Clinical Psychology, 29(4), 401-408. Shore, C., Shore, H., & Pihl, R. O. (1971). Correlations between performance on the category test and the Wechsler Adult Intelligence Scale. Perceptual and Motor Skills, 32, 70. Shores, E. A., & Carstairs, J. R. (2000). The Macquarie University Neuropsychological Normative Study (MUNNS): Australian norms for the WAIS-R and WMS-R. Australian Psychologist, 35(1), 41-59. Shorr, J., Delis, D., & Massman, P. (1992). Memory for the Rey-Osterrieth Figure: Perceptual clustering, encoding, and storage. Neuropsychology, 6, 43-50. Shum, D. H., McFarland, K. A., & Bain, J. D. (1990). Construct validity of eight tests of attention: Comparison of normal and closed head injured samples. Clinical Neuropsychologist, 4(2), 151-162. Shum, D., Murray, R., & Eadie, K. (1997). Effect of speed of presentation on administration of the Logical Memory subtest of the Wechsler Memory Scale-Revised. Clinical Neuropsychologist, 11(2), 188-191. Shure, G. H., & Halstead, W. C. (1958). Cerebral localization of intellectual processes. Psychological Monographs: General and Applied, 72(12), 1-40. Shute, G. E., & Huertas, V. (1990).
Developmental variability in frontal lobe function. Developmental Neuropsychology, 6(1), 1-11. Shuttleworth, E. C., & Huber, S. J. (1988). The naming disorder of dementia of Alzheimer type. Brain and Language, 34, 222-234. Siegert, R. J., & Cavana, C. M. (1997). Norms for older New Zealanders on the Trail-Making Test. New Zealand Journal of Psychology, 26(2), 25-31.
Silver, H., Shlomo, N., Turner, T., & Gur, R. C. (2002). Perception of happy and sad facial expressions in chronic schizophrenia: Evidence for two evaluative systems. Schizophrenia Research, 55(1-2), 171-177. Silverstein, A. B. (1962). Perceptual, motor, and memory functions in the Visual Retention Test. American Journal of Mental Deficiency, 66, 613-617. Silverstein, A. B. (1963). Qualitative analysis of performance on the Visual Retention Test. American Journal of Mental Deficiency, 68, 109-113. Simard, M., van Reekum, R., & Myran, D. (2003). Visuospatial impairment in dementia with Lewy bodies and Alzheimer's disease: A process analysis approach. International Journal of Geriatric Psychiatry, 18, 387-391. Simkins-Bullock, J., Brown, G. G., Greiffenstein, M., Malik, G. M., & McGillicuddy (1994). Neuropsychological correlates of short-term memory distractor tasks among patients with surgical repair of anterior communicating artery aneurysms. Neuropsychology, 8(2), 246-254. Sjogren, P., Olsen, A. K., Thomsen, A. B., & Dalberg, J. (2000). Neuropsychological performance in cancer patients: The role of oral opioids, pain and performance status. Pain, 86(3), 237-245. Ska, B., Poissant, A., & Joanette, Y. (1990). Line orientation judgment in normal elderly and subjects with dementia of Alzheimer's type. Journal of Clinical and Experimental Neuropsychology, 12(5), 695-702. Skelton-Robinson, M., & Jones, S. (1984). Nominal dysphasia and the severity of senile dementia. British Journal of Psychiatry, 145, 168-171. Skenazy, J. A., & Bigler, E. D. (1984). Neuropsychological findings in diabetes mellitus. Journal of Clinical Psychology, 40(1), 246-258. Slay, D. K. (1984). A portable Halstead-Reitan category test. Journal of Clinical Psychology, 40(4), 1023-1027. Slick, D. J., Iverson, G. L., & Green, P. (2000). California Verbal Learning Test indicators of suboptimal performance in a sample of head-injury litigants.
Journal of Clinical and Experimental Neuropsychology, 22(5), 569-579. Slick, D. J., Hinkin, C. H., van Gorp, W. G., & Satz, P. (2001). Base rate of WMS-R Malingering Index in a sample of non-compensation-seeking men infected with HIV-1. Applied Neuropsychology, 8(3), 185-189. Sliwinski, M., Buschke, H., Stewart, W. F., Masur, D., et al. (1997). The effect of dementia risk factors on comparative and diagnostic selective reminding
norms. Journal of the International Neuropsychological Society, 3(4), 317-326. Sliwinski, M., Lipton, R., Buschke, H., & Wasylyshyn, C. (2003). Optimizing cognitive test norms for detection. In R. Petersen (Ed.), Mild cognitive impairment: Aging to Alzheimer's disease. New York: Oxford University Press. Small, B. J., Graves, A. B., McEvoy, C. L., Crawford, F. C., Mullan, M., & Mortimer, J. A. (2000). Is APOE-epsilon4 a risk factor for cognitive impairment in normal aging? Neurology, 54(11), 2082-2088. Small, G. W., La Rue, A., Komo, S., Kaplan, A., & Mandelkern, M. A. (1995). Predictors of cognitive change in middle-aged and older adults with memory loss. American Journal of Psychiatry, 152(12), 1757-1764. Smalley, S. L., Wolkenstein, B. H., LaRue, A., Woodward, J. A., Jarvik, L. F., & Matsuyama, S. S. (1992). Commingling analysis of memory performance in offspring of Alzheimer patients. Genetic Epidemiology, 9(5), 333-345. Smith, G. E., & Ivnik, R. J. (2003). Normative neuropsychology. In R. C. Petersen (Ed.), Mild cognitive impairment: Aging to Alzheimer's disease. New York: Oxford University Press. Smith, G. E., Ivnik, R. J., Malec, J. F., Petersen, R. C., Kokmen, E., Tangalos, E. G., et al. (1992). Mayo's Older Americans Normative Studies (MOANS): Factor structure of a core battery. Psychological Assessment, 4(3), 382-390. Smith, R. L., Goode, K. T., La Marche, J. A., & Boll, T. J. (1995). Selective Reminding Test short form administration: A comparison of two through twelve trials. Psychological Assessment, 7(2), 177-182. Smith, S., Murdoch, B., & Chenery, H. (1989). Semantic abilities in dementia of the Alzheimer's type. Brain and Language, 36, 314-324. Smith, Y., Giordani, B., Lajiness-O'Neill, & Zubieta, J. (2001). Long-term estrogen replacement is associated with improved nonverbal memory and attentional measures in postmenopausal women. Fertility and Sterility, 76(6), 1101-1107. Snodgrass, J. G. (1984). Concepts and their surface representations.
Journal of Verbal Learning and Verbal Behavior, 23, 3-22. Snodgrass, J. G., & Vanderwart, M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174-215. Snow, J. H. (1998). Clinical use of the Benton Visual Retention Test for children and adolescents
with learning disabilities. Archives of Clinical Neuropsychology, 13(1), 629-636. Snow, W. G. (1987). Standardization of test administration and scoring criteria: s
Somerville, J., Tremont, G., & Stern, R. A. (2000). The Boston Qualitative Scoring System as a measure of executive functioning in Rey-Osterrieth Complex Figure performance. Journal of Clinical and Experimental Neuropsychology, 22(5), 613-621. Sonobe, N., Kanno, M., Ito, M., Uchiyama, M., Takahashi, Y., Yashima, Y., et al. (1991). Lateral asymmetry of eye movements in temporal lobe epileptic patients with unilateral foci. International Journal of Psychophysiology, 11(3), 253-256. Soukup, V. M., Ingram, F., Grady, J. J., & Schiess, M. C. (1998). Trail Making Test: Issues in normative data selection. Applied Neuropsychology, 5(2), 65-73. Sovcikova, E., & Bronis, M. (1985). Evaluation of mental workload by Stroop Colour-Word Test. Studia Psychologica, 27, 245-248.
Spencer, W. D., & Raz, N. (1994). Memory for facts, source, and context: Can frontal lobe dysfunction explain age-related differences? Psychology and Aging, 9(1), 149-159. Spreen, O., & Benton, A. L. (1969). Neurosensory Center comprehensive examination for aphasia: Manual of directions. Victoria: Neuropsychology Laboratory, University of Victoria. Spreen, O., & Strauss, E. (1991). A compendium of neuropsychological tests. New York: Oxford University Press. Spreen, O., & Strauss, E. (1998). A Compendium of neuropsychological tests (2nd ed.). New York: Oxford University Press. Squire, L. R., & Shimamura, A. P. (1986). Characterizing amnesic patients for neurobehavioral study. Behavioral Neuroscience, 100, 866-877. Stanczak, D. E., & Triplett, G. (2003). Psychometric properties of the Mid-Range Expanded Trail Making Test: An examination of learning-disabled and non-learning-disabled children. Archives of Clinical Neuropsychology, 18(2), 107-120. Stanczak, D. E., Lynch, M. D., McNeil, C. K., & Brown, B. (1998). The Expanded Trail Making Test: Rationale, development, and psychometric properties. Archives of Clinical Neuropsychology, 13(5), 473-487. Stanczak, E. M., Stanczak, D. E., & Templer, D. I. (2000). Subject-selection procedures in neuropsychological research: A meta-analysis and prospective study. Archives of Clinical Neuropsychology, 15(7), 587-601. Stanczak, D. E., Stanczak, E. M., & Awadalla, A. W. (2001). Development and initial validation of an Arabic version of the Expanded Trail Making Test: Implications for cross-cultural assessment. Archives of Clinical Neuropsychology, 16(2), 141-149. Stanton, B. A., Jenkins, C. D., Savageau, J. A., & Zyzanski, S. J. (1984). Age and educational differences on the Trail Making Test and Wechsler Memory Scales. Perceptual and Motor Skills, 58, 311-318. Steenhuis, R. E., & Ostbye, T. (1995). Neuropsychological test performance of specific diagnostic groups in the Canadian Study of Health and Aging (CHSA).
Journal of Clinical and Experimental Neuropsychology, 17(5), 773-785. Stefanova, E. D., Kostic, V., Ziropadja, L., Markovic, M., & Ocic, G. (2002). Serial position learning effects in patients with aneurysms of the anterior communicating artery. Journal of Clinical and Experimental Neuropsychology, 24(5), 687-694.
Steffens, D. C., Wagner, H. R., Levy, R. M., Horn, K. A., & Krishnan, K. R. R. (2001). Performance feedback deficit in geriatric depression. Biological Psychiatry, 50(5), 358-363. Stein, M. B., Hanna, C., Vaerum, V., & Koverola, C. (1999). Memory functioning in adult women traumatized by childhood sexual abuse. Journal of Traumatic Stress, 12(3), 527-534. Stein, M. B., Kennedy, C. M., & Twamley, E. W. (2002). Neuropsychological function in female victims of intimate partner violence with and without posttraumatic stress disorder. Biological Psychiatry, 52(11), 1079-1088. Steindl, S. R., & Boyle, G. J. (1995). Use of the Booklet Category Test to assess abstract concept formation in schizophrenic disorders. Archives of Clinical Neuropsychology, 10(3), 205-210. Stern, R. A., Javorsky, D. J., Singer, E. A., Singer, N. G., Duke, L. M., Somerville, J. A., et al. (1999). The Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure (version 3.1). Odessa, FL: Psychological Assessment Resources. Stern, R. A., Singer, E. A., Duke, L. M., Singer, N. G., Morey, C. E., & Daughtrey, E. W. (1994). The Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure: Description and interrater reliability. Clinical Neuropsychologist, 8, 309-322. Stern, Y., Andrews, H., Pittman, J., Sano, M., Tatemichi, T., Lantigua, R., et al. (1992). Diagnosis of dementia in a heterogeneous population. Archives of Neurology, 49, 453-460. Stern, Y., Tang, M. X., Jacobs, D. M., Sano, M., Marder, K., et al. (1998). Prospective comparative study of the evolution of probable Alzheimer's disease and Parkinson's disease dementia.
Journal of the International Neuropsychological Society, 4(3), 279-284. Stern, Y., Albert, S., Tang, M.-X., & Tsai, W.-Y. (1999). Rate of memory decline in AD is related to education and occupation: Cognitive reserve? Neurology, 53(9), 1942-1947. Sterne, J., Egger, M., & Smith, G. D. (2001). Investigating and dealing with publication and other biases in meta-analysis. British Medical Journal, 323, 101-105. Stevens, M. C. (2000). Pictorial presentation of verbal stimuli: A semantic memory study using an adaptation of the California Verbal Learning Test. Dissertation Abstracts International. Section B: The Sciences and Engineering, 60(8-B), 4254. Stevens, M. C., Fein, D. A., & Markus, E. (2001). Connecticut Pictorial Learning Test: A pictorial
version of the California Verbal Learning Test. Clinical Neuropsychologist, 15(1), 95-108. Stewart, R., Richards, M., Brayne, C., & Mann, A. (2001). Cognitive function in UK community-dwelling African Caribbean elders: Normative data for a test battery. International Journal of Geriatric Psychiatry, 16(5), 518-527. Stillhard, G., Landis, T., Schiess, R., Regard, M., & Sigler, G. (1990). Bitemporal hypoperfusion in transient global amnesia: 99m-Tc-HM-PAO SPECT and neuropsychological findings during and after an attack. Journal of Neurology, Neurosurgery, and Psychiatry, 53(4), 339-342. Stolar, N., Berenbaum, H., Banich, M. T., & Barch, D. (1994). Neuropsychological correlates of alogia and affective flattening in schizophrenia. Biological Psychiatry, 35(3), 164-172. Stone, B. J., Gray, J. W., Dean, R. S., & Wheeler, T. E. (1988). An examination of the Wechsler Adult Intelligence Scale (WAIS) subtests from a neuropsychological perspective. International Journal of Neuroscience, 40(1-2), 31-39. Storandt, M., & Hill, R. D. (1989). Very mild senile dementia of the Alzheimer type: II. Psychometric test performance. Archives of Neurology, 46,
383-386. Stratta, P., Rossi, A., Mancini, F., Cupillari, M., Matteri, & Casacchia (1993). Wisconsin Card Sorting Test performance and educational level in schizophrenic and control samples. Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 6(3), 149-153. Strauss, E., & Spreen, O. (1990). A comparison of the Rey and Taylor figures. Archives of Clinical Neuropsychology, 5, 417-420. Strauss, E., & Wada, J. (1988). Hand preference and proficiency and cerebral speech dominance determined by the carotid amytal test. Journal of
Clinical and Experimental Neuropsychology, 10, 169-174. Strauss, E., Spellacy, F., Hunter, M., & Berry, T. (1994). Assessing believable deficits on measures of attention and information processing capacity. Archives of Clinical Neuropsychology, 9(6), 483-490. Strenge, H., Niederberger, U., & Seelhorst, U. (2002). Correlation between tests of attention and performance on Grooved and Purdue Pegboards in normal subjects. Perceptual and Motor Skills, 95(2), 507-514. Stricker, J. L., Brown, G. G., Wixted, J., Baldo, J. V., & Delis, D. C. (2002). New semantic and serial clustering indices for the California Verbal Learning Test-second edition: Background, rationale, and formulae. Journal of the International Neuropsychological Society, 8(3), 425-435.
Strickland, T., D'Elia, L., James, R., & Stein, R. (1997). Stroop Color-Word performance of African Americans. Clinical Neuropsychologist, 11, 87-90. Stroop, J. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18, 643-662. Stuss, D. T. (1987). Contribution of frontal lobe injury to cognitive impairment after closed head injury: Methods of assessment and recent findings. In H. S. Levin, J. Grafman & H. M. Eisenberg (Eds.), Neurobehavioral recovery from head injury. New York: Oxford University Press. Stuss, D. T., Benson, D. F., Kaplan, E. F., Weir, W. S., Naiser, M. A., Lieberman, I., & Ferrill, D. (1983). The involvement of orbitofrontal cerebrum in cognitive tasks. Neuropsychologia, 21, 235-248. Stuss, D. T., Ely, P., Hugenholtz, H., Richard, M. T., LaRochelle, S., Poirier, C. A., et al. (1985). Subtle neuropsychological deficits in patients with good recovery after closed head injury. Neurosurgery, 17(1), 41-47. Stuss, D. T., Stethem, L. L., & Poirier, C. A. (1987). Comparison of three tests of attention and rapid information processing across six age groups. Clinical Neuropsychologist, 1, 139-152. Stuss, D. T., Stethem, L. L., & Pelchat, G. (1988). Three tests of attention and rapid information processing: An extension. Clinical Neuropsychologist, 2, 246-250. Stuss, D. T., Stethem, L. L., Hugenholtz, H., & Richard, M. T. (1989). Traumatic brain injury: A comparison of three clinical tests and analysis of recovery. Clinical Neuropsychologist, 3, 145-156. Stuss, D. T., Alexander, M. P., Hamer, L., Palumbo, C., Dempster, R., Binns, M., et al. (1998). The effects of focal anterior and posterior brain lesions on verbal fluency. Journal of the International Neuropsychological Society, 4(3), 265-278. Stuss, D. T., Levine, B., Alexander, M. P., Hong, J., Palumbo, C., Hamer, L., et al. (2000). Wisconsin Card Sorting Test performance in patients with focal frontal and posterior brain damage: Effects of lesion location and test structure on separable cognitive processes. Neuropsychologia, 38(4), 388-402. Stuss, D. T., Bisschop, S. M., Alexander, M. P., Levine, B., Katz, D., & Izukawa, D. (2001). The Trail Making Test: A study in focal lesion patients. Psychological Assessment, 13(2), 230-239. Suchy, Y., Sands, K., & Chelune, G. J. (2003). Verbal and nonverbal fluency performance
before and after seizure surgery. Journal of Clinical and Experimental Neuropsychology, 25(2), 190-200. Suhr, J. A. & Boyer, D. (1999). Use of the Wisconsin Card Sorting Test in the detection of malingering in student simulator and patient samples. Journal of Clinical and Experimental Neuropsychology, 21, 701-708. Sullivan, E. V., Mathalon, D. H., Zipursky, R. B., Kerteen-Tucker, Z., Knight, R. T., & Pfefferbaum, A. (1993). Factors of the Wisconsin Card Sorting Test as measures of frontal-lobe function in schizophrenia and in chronic alcoholism. Psychiatry Research, 46, 175-199. Sullivan, K., & Bowden, S. C. (1997). Which tests do neuropsychologists use? Journal of Clinical Psychology, 53(1), 657-661. Sullivan, K., Keane, B., & Deffenti, C. (2001). Malingering on the RAVLT: Part I. Deterrence strategies. Archives of Clinical Neuropsychology, 16(1), 627-641. Sullivan, K., Deffenti, C., & Keane, B. (2002). Malingering on the RAVLT: Part II. Detection strategies. Archives of Clinical Neuropsychology, 17(3), 223-233. Sutton, A., Abrams, K. R., Jones, D. R., Sheldon, T. A., & Song, F. (2000). Methods for meta-analysis in medical research. Chichester: Wiley. Swan, G. E., Morrison, E., & Eslinger, P. (1990). Interrater agreement on the Benton Visual Retention Test. Clinical Neuropsychologist, 4(1), 37-44. Swan, G. E., Dame, A., & Carmelli, D. (1991). Involuntary retirement, type A behavior, and current functioning in elderly men: 27-year follow-up of the Western Collaborative Group Study. Psychology and Aging, 6(3), 384-391. Swan, G. E., DeCarli, C., Miller, B. L., Reed, T., Wolf, P. A., Jack, L. M., et al. (1998). Association of midlife blood pressure to late-life cognitive decline and brain morphology. Neurology, 51(4), 986-993. Sweeney, J. A., Haas, G. L., Keilp, J. G., & Long, M. (1991). Evaluation of the stability of neuropsychological functioning after acute episodes of schizophrenia: One-year followup study. Psychiatry Research, 38(1), 63-76. Sweet, J.
J., & King, J. H. (2002). Category test validity indicators: Overview and practice recommendations. Journal of Forensic Neuropsychology. Special Issue: Detection of Response Bias in Forensic Neuropsychology: Part II, 3(1-2), 241-274. Sweet, J. J., Moberg, P. J., & Westergaard, C. K. (1996). Five-year follow-up survey of practices
and beliefs of clinical neuropsychologists. Clinical Neuropsychologist, 10(2), 202-221. Sweet, J. J., Moberg, P. J., & Suchy, Y. (2000a). Ten-year follow-up survey of clinical neuropsychologists: Part I. Practices and beliefs. Clinical Neuropsychologist, 14(1), 18-37. Sweet, J. J., Wolfe, P., Sattlberger, E., Numan, B., Rosenfeld, J. P., Clingerman, S., et al. (2000b). Further investigation of traumatic brain injury versus insufficient effort with the California Verbal Learning Test. Archives of Clinical Neuropsychology, 15(2), 105-113. Sweetland, J., Kertesz, A., Prato, F. S., & Nantau, K. (1987). The effect of magnetic resonance imaging on human cognition. Magnetic Resonance Imaging, 5(2), 129-135. Swerdlow, N. R., Filion, D., Geyer, M. A., & Braff, D. L. (1995). "Normal" personality correlates of sensorimotor, cognitive, and visuospatial gating. Biological Psychiatry, 37, 286-299. Swets, J. A. (1996). Signal detection theory and ROC analysis in psychology and diagnostics: Collected papers. In Scientific psychology series. Hillsdale, NJ: Lawrence Erlbaum. Swiercinsky, D. P. (1978). Manual for the adult neuropsychological evaluation. Springfield, IL: Thomas. Swiercinsky, D. P. (1979). Factorial pattern description and comparison of functional abilities in neuropsychological assessment. Perceptual and Motor Skills, 48(1), 231-241. Takashima, Y., Yao, H., Koga, H., Endo, K., Matsumoto, T., Uchino, A., et al. (2003). Frontal lobe dysfunction caused by multiple lacunar infarction in community-dwelling elderly subjects. Neurological Sciences, 214, 37-41. Tallent, K. A., & Gooding, D. C. (1999). Working memory and Wisconsin Card Sorting Test performance in schizotypic individuals: A replication and extension. Psychiatry Research, 89(3), 161-170. Tamkin, A. S., & Hyer, L. A. (1984). Testing for cognitive dysfunction in the aging psychiatric patient. Military Medicine, 149(7), 397-399. Tamkin, A. S., & Jacobsen, R. (1984).
Age-related norms for the Hooper Visual Organization Test. Journal of Clinical Psychology, 40(6), 1459-1463. Tang, C., & Liu, Y. (1993). Impairment of visual form discrimination in Parkinson's disease. Acta Psychologica Sinica, 25, 258-263. Tang, H. W., Liang, Y. X., Hu, X. H., & Yang, H. G. (1995). Alterations of monoamine metabolites and neurobehavioral function in lead-exposed workers. Biomedical and Environmental Sciences, 8(1), 23-29.
Tarter, R. E., & Parsons, O. A. (1971). Conceptual shifting in chronic alcoholics. Journal of Abnormal Psychology, 77, 71-75. Taylor, A. E., Saint-Cyr, J. A., & Lang, A. E. (1986). Frontal lobe dysfunction in Parkinson's disease. Brain, 109, 845-883. Taylor, D. J., Hunt, C., & Glaser, B. (1990). A cross-validation of the revised Category Test. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 2(4), 486-488. Taylor, E. M. (1959). Psychological appraisal of children with cerebral defects. Cambridge, MA: Harvard University Press. Taylor, H. C., & Russell, J. T. (1939). The relationship of validity coefficients to the practical effectiveness of tests in selection: Discussion and tables. Journal of Applied Psychology, 23, 565-578. Taylor, J. M., Goldman, H., Leavitt, J., & Kleinman, K. N. (1984). Limitations of the brief form of the Halstead Category Test. Journal of Clinical Neuropsychology, 6(3), 341-344. Taylor, L. B. (1969). Localization of cerebral lesions by psychological testing. Clinical Neurosurgery, 16, 269-287. Taylor, L. B. (1979). Psychological assessment of neurosurgical patients. Functional Neurosurgery, 165-180. Taylor, R. (1998a). Order effects within the Trail Making and Stroop tests in patients with neurologic disorders. Journal of Clinical and Experimental Neuropsychology, 20(5), 750-754. Taylor, R. (1998b). Continuous norming: Improved equations for the WAIS-R. British Journal of Clinical Psychology, 37(4), 451-456. Taylor, S. F., Kornblum, S., Lauber, E. J., Minoshima, S., & Koeppe, R. A. (1997). Isolation of specific interference processing in the Stroop task: PET activation studies. Neuroimage, 6, 81-92. Teknos, K. S., Bernstein, J. H., & Seidman, L. J. (2003). ROCF performance of attention-deficit/hyperactivity disordered children. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Tenhula, W. N., & Sweet, J. J.
(1996). Double cross-validation of the Booklet Category Test in detecting malingered traumatic brain injury. Clinical Neuropsychologist, 10(1), 104-116. Teunissen, C. E., De Vente, J., von Bergmann, K., Bosma, H., van Boxtel, M. P. J., De Bruijn, C., et al. (2003). Serum cholesterol, precursors and metabolites and cognitive performance in an aging population. Neurobiology of Aging, 24(1), 147-155.
Thompson, L. L., & Heaton, R. K. (1989). Comparison of different versions of the Boston Naming Test. Clinical Neuropsychologist, 3(2), 184-192. Thompson, L. L., & Heaton, R. K. (1991). Pattern of performance on the Tactual Performance Test. Clinical Neuropsychologist, 5(4), 322-328. Thompson, L. L., & Parsons, O. A. (1985). Contribution of the TPT to adult neuropsychological assessment: A review. Journal of Clinical and Experimental Neuropsychology, 7(4). Thompson, L. L., Heaton, K. R., Matthews, C. G., & Grant, I. (1987). Comparison of preferred and nonpreferred hand performance on four neuropsychological motor tasks. Clinical Neuropsychologist, 1(4), 324-334. Thompson, M. D., Scott, J. G., Dickson, S. W., Schoenfeld, J. D., Ruwe, W. D., & Adams, R. L. (1999). Clinical utility of the Trail Making Test practice time. Clinical Neuropsychologist, 13(4), 450-455.
Thorndike, E. L., & Lorge, I. (1944). The Teacher's word book of 30,000 words. New York: Teachers College, Columbia University. Thurstone, L. L. (1944). A factorial study of perception. Chicago: University of Chicago Press. Thurstone, L. L., & Thurstone, T. G. (1949). Examiner manual for the SRA Primary Mental Abilities Test. Chicago: Science Research Associates. Thurstone, L. L., & Thurstone, T. G. (1962). Primary mental abilities (Rev.). Chicago: Science Research Associates. Tierney, M. C., Nores, A., Snow, W. G., Fisher, R. H., Zorzitto, M. L., & Reid, D. W. (1994). Use of the Rey Auditory Verbal Learning Test in differentiating normal aging from Alzheimer's and Parkinson's dementia. Psychological Assessment, 6, 129-134. Tiersky, L. A., Cicerone, K. D., Natelson, B. H., & DeLuca, J. (1998). Neuropsychological functioning in chronic fatigue syndrome and mild traumatic brain injury: A comparison. Clinical Neuropsychologist, 12(4), 503-512. Toglia, M. P., & Battig, W. F. (1978). Handbook of word norms. Hillsdale, NJ: Lawrence Erlbaum. Tombaugh, T. N. (1999). Administrative manual
for the Adjusting-Paced Serial Addition Test (Adjusting-PSAT). Ottawa: Ottawa Concussion Clinic, Carleton University. Tombaugh, T. N. (2004). Trail Making Test A and B: Normative data stratified by age and education. Archives of Clinical Neuropsychology, 19, 203-214. Tombaugh, T. N., & Hubley, A. M. (1991). Four studies comparing the Rey-Osterrieth and Taylor
complex figures. Journal of Clinical and Experimental Neuropsychology, 13, 587-599. Tombaugh, T. N., & Hubley, A. M. (1997). The 60-item Boston Naming Test: Norms for cognitively intact adults aged 25 to 88 years. Journal of
Clinical and Experimental Neuropsychology, 19(6), 922-932. Tombaugh, T. N., Hubley, A.M., Faulkner, P., & Schmidt, J. P. (1990). The Rey-Osterrieth and
Taylor complex figures: Comparative studies, modified figures and normative data for the Taylor figure. Paper presented at the 18th annual meeting of the International Neuropsychology Society, Orlando, FL. Tombaugh, T. N., Faulkner, P., & Hubley, A. M. (1992a). Effects of age on the Rey-Osterrieth and Taylor complex figures: Test-retest data using an intentional learning paradigm. Journal
of Clinical and Experimental Neuropsychology, 14, 647-661. Tombaugh, T. N., Schmidt, J. P., & Faulkner, P. (1992b). A new procedure for administering the Taylor Complex Figure: Normative data over a 60-year age span. Clinical Neuropsychologist, 6(1), 63-79. Tombaugh, T. N., Kozak, J., & Rees, L. (1999). Normative data stratified by age and education for two measures of verbal fluency: FAS and animal naming. Archives of Clinical Neuropsychology, 14(2), 167-177. Tomer, R., & Levin, B. E. (1993). Differential effects of aging on two verbal fluency tasks. Perceptual and Motor Skills, 76, 465-466. Tomer, R., Fisher, T., Giladi, N., & Aharon-Peretz, J. (2002). Dissociation between spontaneous and reactive flexibility in early Parkinson's disease.
Neuropsychiatry, Neuropsychology, and Behavioral Neurology, 15(2), 106-112. Torralva, T., Dorrego, F., Sabe, L., Chemerinski, E., & Starkstein, S. E. (2000). Impairments of social cognition and decision making in Alzheimer's disease. International Psychogeriatrics, 12(3), 359-368. Torres, I. J., Flashman, L. A., O'Leary, D. S., & Andreasen, N. C. (2001). Effects of retroactive and proactive interference on word list recall in schizophrenia. Journal of the International Neuropsychological Society, 7(4), 481-490. Torres, I. J., Mundt, A. J., Sweeney, P. J., Llanes-Macy, S., Dunaway, L., Castillo, M., et al. (2003). A longitudinal neuropsychological study of partial brain radiation in adults with brain tumors. Neurology, 60(7), 1113-1118. Toshima, T., Toma, C., Demick, J., & Wapner, S. (1992). Age and cross-cultural differences in
processes underlying sequential cognitive activity. In B. Wilpert, H. Motoaki, & J. Misumi (Eds.), General psychology and environmental
psychology: Proceedings of the 22nd International Congress of Applied Psychology (p. 189). Hillsdale, NJ: Lawrence Erlbaum. Toshima, T., Demick, J., Miyatani, M., Ishii, S., & Wapner, S. (1996). Cross-cultural differences in processes underlying sequential cognitive activity. Japanese Psychological Research, 38(2), 90-96. Touradji, P., Manly, J. J., Jacobs, D. M., & Stern, Y. (2001). Neuropsychological test performance: A study of non-Hispanic white elderly. Journal of
Clinical and Experimental Neuropsychology, 23(5), 643-649. Trahan, D. E. (1998). Judgment of line orientation in patients with unilateral cerebrovascular lesions. Assessment, 5(3), 227-235. Trahan, D. E., & Larrabee, G. J. (1993). Clinical and methodological issues in measuring rate of forgetting with the verbal selective reminding test. Psychological Assessment, 5(1), 67-71. Trahan, D. E., Patterson, J., Quintana, J., & Biron, R. (1987). The Finger Tapping Test: A reexamination of traditional hypotheses regarding normal adult performance. Paper presented at the 15th annual meeting of the International Neuropsychological Society, Washington, DC. Trahan, D., Quintana, J., Willingham, A., & Goethe, K. (1988). The Visual Reproduction subtest: Standardization and clinical validation of a delayed recall procedure. Neuropsychology, 2(1), 29-39. Trautt, G. M., Chavez, E. L., Brandon, A. D., & Steyaert, J. (1983). Effects of test anxiety and sex of subject on neuropsychological test performance: Finger Tapping, Trail Making, Digit Span, and Digit Symbol tests. Perceptual and Motor Skills, 56, 923-929. Tremont, G., Hoffman, R. G., Scott, J. G., & Adams, R. L. (1998). Effect of intellectual level on neuropsychological test performance: A response to Dodrill (1997). Clinical Neuropsychologist, 12(4), 560-567. Trenerry, M. R., Crosson, B., DeBoe, J., & Leber, W. R. (1989). Stroop Neuropsychological Screening Test, manual. Odessa, FL: Psychological Assessment Resources. Trenerry, M. R., Crosson, B., DeBoe, J., & Leber, W. R. (1990). Visual Search and Attention Test. Odessa, FL: Psychological Assessment Resources. Trestman, R. L., Keefe, R. S. E., Mitropoulou, V., Harvey, P. D., deVegvar, M. L., Lees-Roitman,
S., et al. (1995). Cognitive function and biological correlates of cognitive performance in schizotypal personality disorder. Psychiatry Research, 59(1-2), 127-136. Trichard, C., Martinot, J. L., Alagille, M., Masure, M. C., Hardy, P., Ginestet, D., et al. (1995). Time course of prefrontal lobe dysfunction in severely depressed inpatients: A longitudinal neuropsychological study. Psychological Medicine, 25, 79-85. Triggs, W. J., Calvanio, R., Levine, M., Heaton, R. K., & Heilman, K. M. (2000). Predicting hand preference with performance on motor tasks. Cortex, 36(5), 679-689. Troester, A. I., Fields, J. A., Testa, J. A., Paul, R. H., Blanco, C. R., Hames, K. A., et al. (1998). Cortical and subcortical influences on clustering and switching in the performance of verbal fluency tasks. Neuropsychologia, 36(4), 295-304. Troyer, A. K. (2000). Normative data for clustering and switching on verbal fluency tasks. Journal of
Clinical and Experimental Neuropsychology, 22(3), 370-378. Troyer, A. K., & Wishart, H. A. (1997). A comparison of qualitative scoring systems for the Rey-Osterrieth Complex Figure Test. Clinical Neuropsychologist, 11(4), 381-390. Troyer, A. K., Moscovitch, M., & Winocur, G. (1997). Clustering and switching as two components of verbal fluency: Evidence from younger and older healthy adults. Neuropsychology, 11(1), 138-146. Troyer, A. K., Moscovitch, M., Winocur, G., Alexander, M. P., & Stuss, D. (1998a). Clustering and switching on verbal fluency: The effects of focal frontal- and temporal-lobe lesions. Neuropsychologia, 36(6), 499-504. Troyer, A. K., Moscovitch, M., Winocur, G., Leach, L., & Freedman, M. (1998b). Clustering and switching on verbal fluency tests in Alzheimer's and Parkinson's disease. Journal of the International Neuropsychological Society, 4(2), 137-143. Tsai, C. H., Lu, C. S., Hua, M. S., Lo, W. L. & Lo, S. K. (1994). Cognitive dysfunction in early onset parkinsonism. Acta Neurologica Scandinavica, 89, 9-14. Tsunoda, M., Kurachi, M., Yuasa, S., Kadono, Y., Matsui, M., & Shimizu, A. (1992). Scanning eye movements in schizophrenic patients: Relationship to clinical symptoms and regional cerebral blood flow using 123I-IMP SPECT. Schizophrenia Research, 7(2), 159-168. Tucha, O., Smely, C., & Lange, K. W. (1999). Verbal and figural fluency in patients with mass
lesions of the left or right frontal lobes. Journal of Clinical and Experimental Neuropsychology, 21(2), 229-236. Tuokko, H., & Woodward, T. S. (1996). Development and validation of a demographic correction system for neuropsychological measures used in the Canadian Study of Health and Aging. Journal of Clinical and Experimental Neuropsychology, 18(4), 479-616. Tupler, L. A., Welsh, K. A., Asare-Aboagye, Y., & Dawson, D. V. (1995). Reliability of the Rey-Osterrieth Complex Figure in use with memory-impaired patients. Journal of Clinical and Experimental Neuropsychology, 17(4), 566-579. Tupper, D. E. (1999). Introduction: Neuropsychological assessment apres Luria. [Special Issue: Part II: International extensions of Luria's neuropsychological investigation]. Neuropsychology Review, 9(2), 57-61. Turner, M. A. (1999). Generating novel ideas: Fluency performance in high-functioning and learning disabled individuals with autism. Journal of Child Psychology and Psychiatry and Allied Disciplines, 40(2), 189-201. Uchiyama, C. L., D'Elia, L. F., Dellinger, A. M., Becker, J. T., et al. (1995). Alternate forms of the Auditory-Verbal Learning Test: Issues of test comparability, longitudinal reliability, and moderating demographic variables. Archives of Clinical Neuropsychology, 10(2), 133-145. Unverzagt, F. W., Hall, K. S., Torke, A. M., Rediger, J. D., et al. (1996). Effects of age, education and gender on CERAD neuropsychological test performance in an African American sample. Clinical Neuropsychologist, 10(2), 180-190. Unverzagt, F. W., Morgan, O. S., Thesiger, C. H., Eldemire, D. A., Luseko, J., Pokuri, S., et al. (1999). Clinical utility of CERAD neuropsychological battery in elderly Jamaicans. Journal of the International Neuropsychological Society, 5(3), 255-259. Uttl, B., & Graf, P. (1997). Color-Word Stroop Test performance across the adult life span. Journal of Clinical and Experimental Neuropsychology, 19(3), 405-420. Uzzell, B. P., & Oler, J. (1986). Chronic low-level mercury exposure and neuropsychological functioning. Journal of Clinical and Experimental Neuropsychology, 8(5), 581-593. Vaisman, N., Voet, H., Akivis, A., & Vakil, E. (1996). Effect of breakfast timing on the cognitive functions of elementary school students. Archives of Pediatric and Adolescent Medicine, 150(10), 1089-1092.
Vakil, E., & Blachstein, H. (1993). Rey Auditory-Verbal Learning Test: Structure analysis. Journal of Clinical Psychology, 49, 883-890. Vakil, E., & Blachstein, H. (1994). A supplementary measure in the Rey AVLT for assessing incidental learning of temporal order. Journal of Clinical Psychology, 50(2), 240-245. Vakil, E., & Blachstein, H. (1997). Rey AVLT: Developmental norms for adults and the sensitivity of different memory measures to age. Clinical Neuropsychologist, 11(4), 356-369. Valdois, S., Poissant, A., & Joanette, Y. (1989). Visual form discrimination in normal aging and dementia of the Alzheimer's type. Journal of Clinical and Experimental Neuropsychology, 11, 91. van Boxtel, M. P. J., ten Tusscher, M. P. M., Metsemakers, J. F. M., Willems, B., & Jolles, J. (2001). Visual determinants of reduced performance on the Stroop Color-Word Test in normal aging individuals. Journal of Clinical and Experimental Neuropsychology, 23(5), 620-627. Van den Broek, M. D., Bradshaw, C. M., & Szabadi, E. (1993). Utility of the Modified Wisconsin Card Sorting Test in neuropsychological assessment. British Journal of Clinical Psychology, 32(3), 333-343. Vanderploeg, R. D., LaLone, L. V., Greblo, P., & Schinka, J. A. (1997). Odd-even short forms of the Judgment of Line Orientation Test. Applied Neuropsychology, 4, 244-246. Vanderploeg, R. D., Schinka, J. A., Jones, T., Small, B. J., Graves, A. B., & Mortimer, J. A. (2000). Elderly norms for the Hopkins Verbal Learning Test-Revised. Clinical Neuropsychologist, 14(3), 318-324. Vanderploeg, R. D., Crowell, T. A., & Curtiss, G. (2001). Verbal learning and memory deficits in traumatic brain injury: Encoding, consolidation, and retrieval. Journal of Clinical and Experimental Neuropsychology, 23(2), 185-195. Van Gorp, W. G., & McMullen, W. J. (1997). Possible sources of bias in forensic neuropsychological evaluations. Clinical Neuropsychologist, 11(2), 180-187. Van Gorp, W. G., Satz, P., Kiersch, M. E., & Henry, R.
(1986). Normative data on the Boston Naming Test for a group of normal older adults. Journal of Clinical and Experimental Neuropsychology, 8(6), 702-705. Van Gorp, W. G., Satz, P., Miller, E., & Visscher, E. (1989). Neuropsychological performance in HIV-1 immunocompromised patients: A preliminary report. Journal of Clinical and Experimental Neuropsychology, 11(5), 763-773.
Van Gorp, W. G., Satz, P., & Mitrushina, M. (1990). Neuropsychological processes associated with normal aging. Developmental Neuropsychology, 6(4), 279-290. van Spaendonck, K. P. M., Berger, H. J. C., Horstink, M. W. I. M., Borm, G. F., & Cools, A. R. (1995). Card sorting performance in Parkinson's disease: A comparison between acquisition and shifting performance. Journal of Clinical and Experimental Neuropsychology, 17(6), 918-925.
Varney, N. R. (1981). Letter recognition and visual form discrimination in aphasic alexia. Neuropsychologia, 19, 795-800. Varney, N. R., Roberts, R. J., Struchen, M. A., Hanson, T. V., Franzen, M. D., & Connell (1996). Design fluency among normals and patients with closed head injury. Archives of Clinical Neuropsychology, 11(4), 345-353. Vega, A., & Parsons, O. A. (1967). Cross-validation of the Halstead-Reitan Tests for brain damage.
Journal of Consulting Psychology, 31, 619-625. Verma, S. K., Pershad, D., & Khanna, R. (1993). Hooper's Visual Organization Test: Item analysis on Indian subjects. Indian Journal of Clinical Psychology, 20(1), 5-10. Vernon, P. A. (1993). Der Zahlen-Verbindungs-Test and other trail-making correlates of general intelligence. Personality and Individual Differences, 14(1), 35-40. Veroff, A. E. (1980). The neuropsychology of aging: Qualitative analysis of visual reproductions.
Psychological Research, 41, 259-268. Vickers, D., & Lee, M. D. (1998). Never cross the path of a traveling salesman: The neural network generation of Halstead-Reitan trail making tests.
Behavior Research Methods, Instruments, and Computers, 30(3), 423-431. Vickers, D., Vincent, N., & Medvedev, A. (1996). The geometric structure, construction, and interpretation of path-following (trail-making) tests. Journal of Clinical Psychology, 52(6), 651-661. Vilkki, J. (1989). Differential perseverations in verbal retrieval related to anterior and posterior left hemisphere lesions. Brain and Langpage, 36(4), 543-554. Villardita, C., Cultrera, S., Cupone, V., & Mejia, R. (1985). Neuropsychological test performances and normal aging. International Workshop: Psychiatry in aging and dementia. Archives of Gerontology and Geriatrics, 4(4), 311-319. Vingerhoets, G., Lannoo, E., & Wolters, M. (1998). Comparing the Rey-Osterrieth and Taylor Complex Figures: Empirical data and metaanalysis. Psychologica Belgica, 38(2), 109-119.
Visser, R. S. H. (1973). Manual of the Complex Figpre Test (CFT). Amsterdam: Swets & Zeitlinger. Vlahou, C. H., & Kosmidis, M. H. (2002). The Greek Trail Making Test: Preliminary nonnative data for clinical and research use. Psychology:
Journal
of the
HeUenic Psychological Society,
9(3), 336--352.
Vollant, M., Lafitte, M. L., & Rapin, J. R. (1986). Some specific errors in VRT of Benton in detection of senile dementia of Alzheimer type. In: A. Bes (Ed.), Senile demmtia: Early detection (pp. 631~). John Libbey Eurotext. Volz, H. P., Gaser, C., Haeger, F., Rzanny, R., Mentzel, H. J., Kreitschmann-Andermahr, 1., et al. (1997). Brain activation during cognitive stimulation with the Wisconsin Card Sorting Test-A functional MRI study on healthy volunteers and schizophrenics. Psychiatric Re-
search: Neuroimaging, 75, 145-157. Waber, D. P., & Holmes, J. M. (1985). Assessing children's copy productions of the Rey-Osterrieth Complex Figure. Journal of Clinical and Experimental Neuropsychology, 7(3), 264-280. Waber, D. P., & Holmes, J. M. (1986). Assessing children's memory productions of the ReyOsterrieth Complex Figure. Journal of Clinical and Experimental Neuropsychology, 8, 563--580. Waber, D., Bernstein, J., & Merola, J. (1989). Remembering the Rey-Osterrieth Complex Figure: A dual-code, cognitive neuropsychological model. Developmental Neuropsychology,
5, 1-15. Waber, D. P., Isquith, P. K., Kahn, C. M., & Romero, I. (1994). Metacognitive factors in the visuospatial skills of long-term survivors of acute lymphoblastic leukemia: An experimental approach to the Rey-Osterrieth Complex Figure Test. Developmental Neuropsychology, 10(4), 349-367. Waber, D. P. (2003). Parsing children's productions of the ROCF: What develops? In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex
Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Wahlin, T.-B. R., Baeckman, L., Wahlin, A., & Winblad, B. (1996). Trail Making Test performance in a community-based sample of healthy very old adults: Effects of age on completion time, but not on accuracy. Archives of Gerontology and Geriatrics, 22(1), 87-102. Waldmann, B. W., Dickson, A. L., Monahan, M. C., & Kazelskis, R. (1992). The relationship between intellectual ability and adult performance on the Trail Making Test and the Symbol Digit
606
REFERENCES
Modalities Test. Journal of Clinical Psychology, 48(3), 360-363. Walsh, P. F., Lichtenberg, P. A., & R(lwe, R. J. (1997). Hooper Visual Organization Test perfonnance in geriatric rehabilitation patients. Clinical Gerontologist, 17(4), 3-11. Wang, P. J. (1977). Visual organization ability in brain damaged adults. Perceptual firid Motor Skills, 45, 723-728. Ward, T. (1997). A note of caution for clinicians using the Paced Auditory Serial Addition task. British Journal of Clinical Psycholdgy, 36(2),
303--307. Warkentin, S., c!L Passant, U. (1997). functional imaging of the frontal lobes in organic: dementia. Dementia and Geriatrtc Cognitive l)isorders, 8(2), 105-109. Warner, M. H., Ernst, J., Townes, B. D., Peel, J. H., & Preston, M. (1987). Relationships between IQ and neuropsychological measures in ·neuropsychiatric populations: Within-laboratory and cross-cultural replications using WAIS and WAIS-R. Journal of Clinical and Experimental Neuropsychology, 9(5), 545-562. : Warrington, E. (1984). Recognition Merr»:Jry TestFaces. Windsor, UK: NFER-Nelson. Warrington, E., & Rabin, P. (1970). Perceptual matching in patients with cerebral leSions. Neuropsychologia, 8(4), 475-487. Wechsler, D. (1945). A standardized memory scale for clinical use. Journal of Psychology, 19,87-95. Wechsler, D. (1955). WAlS Technical Monual. San Antonio, TX: Psychological Corporation. Wechsler, D. (1987). Wechsler Memory ScaleRevised. San Antonio, TX: Psychological Corporation/Harcourt Brace Jovanovich. Wechsler, D. (1997). WMS-III. Administration and scoring manual. San Antonio, TX: Ps;chological Corporation/Harcourt Brace JovanoviCh. Wechsler, D. (2002a). WAlS-lll, WMS..Jll technical manual~pdated. San Antonio, TX: Psychological Corporation. Wechsler, D. (2002b). WMS-III abbreviatedManual. San Antonio, TX: Psychological Corporation. Wecker, N. S., Kramer, J. H., Wisniewski, A., Delis, D. C., & Kaplan, E. (2000). Age ·effects on executive ability. Neuropsycholol!lJ, 14(3), 4{K}-414. . Wedding, D., & Faust, D. (1989). Clinical judgment and decision making in neuropsychology.
Archives
of
Clinical NeuropsycholJgy, 4(3),
233-265. Wegesin, D. J., Jacobs, D. M., Zubin, N. R., Ventura, P. R., & Stem, Y. (2000). Souree memory
and encoding strategy in nonnal aging. Journal
of Clinical
and Experimental Neuropsychology,
22(4), 455-464. Weible, J. A., Nuest, B. D., Welty, J., Pate, W. E., & Turner, M. L. (2002). Demonstrating the effects of presentation rate on aging memory using the California Verbal Learning Test (CVLT). Aging, Neuropsychology, and Cognition, 9(1), 38-47. Weigl, E. (1941). On the psychology of so-called processes of abstraction. Journal of Abnormal
and Social Psychology, 36, 3-33. Weigner, S., & Donders, J. (1999). Perfonnance on the Wisconsin Card Sorting Test after traumatic brain injury. Assessment, 6, 179-187. Weinberger, D. R., Bennan, K. F., & Zec, R. F. (1986). Physiologic dysfunction of dorsolateral prefrontal cortex in schizophrenia. I. Regional cerebral blood How evidence. Archives of General Psychiatry, 43, 114-124. Weinberger, D. R., Bennan, K. F., Iadarola, M., Driesen, N., & Zec, R. F. (1988). Prefrontal cortical blood How and cognitive function in Huntington's disease. Journal of NeurolorJ.J. Neurosurgery, and Psychiatry, 51, 94-104. Weinberger, D. R., Bennan, K. F., & Illowsky, B. P., (1989). Physiologic dysfunction of dorsolateral prefrontal cortex in schizophrenia. III. A new cohort and evidence for a monoaminergic mechanism. Archives of General Psychiatry, 45, 609--615.
Weinberger, D. R., Aloia, M. S., Goldberg, T. E., & Bennan, K. F. (1994). The frontal lobes and schizophrenia. Journal of Neuropsychiatry and Clinical Neuroscience, 6, 419-427. Weingartner, H., Burns, S., Diebel, R., & LeWitt, P. A. (1984). Cognitive impainnents in Parkinson's disease: Distinguishing between effort-demanding and automatic cognitive processes. Psychiatry Research, 11, 223-235. Weinstein, C., Kaplan, E., Casey, M., & Hurwitz, I. (1990). Delineation of female perfonnance on the Rey-Osterrieth Complex Figure. Neuropsychology, 4, 117-127. Weiss, K. M. (1996). A simple clinical assessment of attention in schizophrenia. Psychiatry Research, 60(2-3), 147-154. Welch, L. W., Doineau, D., Johnson, S., & King, D. (1996). Educational and gender normative data for the Boston Naming Test in a group of older adults. Broin and Language, 53, 260-266. Welsh, K. A., Butters, N., Hughes, J., Mobs, R., & Heyman, A. (1991). Detection of abnonnal memory decline in mild cases of Alzheimer's disease using CERAD neuropsychological measures. Archives of Neurology, 48(3), 278-281.
REFERENCES Welsh, K. A., Butters, N., Mobs, R. C., Beeldy, D., Edland, S., Fillenhaum, G., et al. (1994). The Consortium to Establish a Registry for Alzheimer's Disease (CERAD): V. A normative study of the neuropsychological battery. Neurology, 44(4), 609-614. Welsh, K. A., Fillenbaum, G., Wilkinson, W., Heyman, A., et al. (1995). Neuropsychological test performance in African-American and White patients with Alzheimer's disease. Neurology, 45(12), 2207-2211. Wentworth-Rohr, I., Mackintosh, R. M., & Fialkoff, B. S. (1974). The relationship of Hooper VOT score to sex, education, intelligence and age. Journal of Clinical Psychology, 30(1), 73-75. Wetzel, L., & Boll, T. (1987). Short Category Test, Booklet Format. Los Angeles, CA: Western Psychological Services. Wetzel, L., & Murphy, G. S. (1991). Validity of the use of a discontinue rule and evaluation of discriminability of the Hooper VISual Organization Test. Neuropsychology, 5(2), 119--122. Wheeler, L., & Reitan, R. M. (1963). Discriminant functions applied to the problem of predicting cerebral damage from behavioral tests: A crossvalidation study. Perceptual and Motor Skills, 16, 681-701. Whelihan, W. M., & Lesher, E. L. (1985). Neuropsychological changes in frontal functions with aging. Developmental Neuropsychology, 1, 371-380.
White, A. J. (1984). Cognitive impairment of acute mountain sickness and acetazolamide. Aviation, Space and Environmental Medicine, 55, 589-603. Whitfl.eld, K. E., Fillenbaum, G. G., Pieper, C., Albert, M. S., Berkman, L. F., Blazer, D. G., et al. (2000). The effect of race and healthrelated factors on naming and memory. The MacArthur Studies of Successful Aging. Journal of Aging and Health, 12(1), 69--89. Wiederholt, W. C., Cahn, D., Butters, N. M., Salmon, D.P., et al. (1993). Effects of age, gender and education on selected neuropsychological tests in an elderly community cohort. Journal of the American Geriatrics Society, 41(6), 639--647. Wiegner, S., & Donders, J. (1999). Performance on the California Verbal Learning Test after traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 21(2), 159--170. Wiens, A.M., & Matarazzo, J.D. (1977). WAIS and MMPI correlates of the Halstead-Reitan Neuropsychology Battery in normal male subjects. Journal of Neroous and Mental Disease, 164(2), 112-121.
607
Wiens, A. N., McMinn, M. R., & Crossen, J. R. (1988). Rey Auditory-Verbal Learning Test: Development of norms for healthy young adults. Clinical Neuropsychologist, 2(1), 67-87. Wiens, A. N., Tindall, A. G., & Crossen, J. R. (1994). California Verbal Learning Test: A normative data study. Clinical Neuropsychologist, 8(1), 75-90. Wiens, A. N., Fuller, K. H., & Crossen, J. R. (1997). Paced Auditory Serial Addition Test: Adult norms and moderator variables. Journal of Clinical and Experimental Neuropsychology, 19(4), 473-483. Wiggs, C. L., Weisberg, J., & Martin, A. (1999). Neural correlates of semantic and episodic memory retrieval. Neuropsychologta, 37(1), 103-118. Wilde, M. C., Boeke, C., & Sherer, M. (2000). Wechsler Adult Intelligence Scale-Revised block design broken configuration errors in nonpenetrating traumatic brain injury. Applied Neuropsychology, 7, 208-214. Wildgruber, D., Kischka, U., Fassbender, K., & Ettlin, T. M. (2000). The Frontal Lobe Score. Part II: Evaluation of its clinical valdity. Clinical Rehabilitation, 14(3), 272-278. Williams, A. D. (2001). Psychometric concerns in neuropsychological testing. NeuroRehabtlitation, 16(4), 221-224. Williams, B. W., Mack, W., & Henderson, V. W. (1989). Boston Naming Test in Alzheimer's disease. Neuropsychologta, 27(8), 1073-1079. Williamson, J. B., & Harrison, D. W. (2003). Functional cerebral asymmetry in hostility: A dual task approach with fluency and cardiovascular regulation. Brain and Cognition, 52(2), 167-174. Willis, L., Yeo, R., Thomas, P .• & Garry, P. G. (1988). Differential declines in cognitive function with aging: The possible role of health status. Developmental Neuropsychology, 4(1), 23-28. Wdson, B. A., & Watson, P. (2003). Performance of people with nonprogressive brain injury and organic memory impairment on the ROCF. In J. A. Knight (Ed.), The handbook of Rey-Osterrieth Complex Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Wilson, B. 
A., Cockburn, J., & Halligan, P. (1987). Behaoioml Inattention Test. Titch6.eld, UK: Thames Valley Test CoJGaylord, MI: National Rehabilitation Services. Winegarden, B. J., Yates, B. L., Moses, J. A., Jr., Benton, A. L., & Faustman, W. 0. (1998). De,velopment of an optimally reliable short form for Judgment of Line Orientation. Clinical Neuropsychologist, 12(3), 311-314.
REFERENCES
608
Wingenfeld, S. A., Holdwick, D. J., Jr., Davis, J. L., & Hunter, B. B. (1999). Normative data on computerized paced auditory serial addition task performance. Clinical Neuropsychologist, 13(3), 268-273. Witjes-Ane, M.-N. W .• Vegter-van der Vlis, M., van Vugt. J. P. P., Lanser, J. B. K., Hermans, J., Zwinderman, A., et al. (2003). Cognitive and motor functioning in gene carriers for Huntington's disease: A baseline study. Journal of Neuropsychiatnj and Clinical Neurosciences, 15(1), 7-16. Wolf, F. M. (1986). Meta-analysis: Quantitative methods for research synthesis. Newbury Park, CA: Sage. Wolf, L. E., Comblatt, B. A., Roberts, S. A., Shapiro, B. M., & Erlenmeyer-Kimling, L. (2002). Wisconsin Card Sorting deficits in the offspring of schizophrenics in the New York liigh-Risk Project. Schizophrenia Research, 57(2-3), 173-182. Wong, T. M. (2000). Neuropsychological assessment and intervention with Asian Americans. In E. Fletcher-Janzen, T. L. Strickland, et al. (Eds.), Handbook of cross-cultural r&europsychology. Critical issues in neuropsychology. Amsterdam: Kluwer. Wood, F. B., Ebert, V., & Kinsbourne, M. (1982). The episodic-semantic memory distinction in memory and amnesia: Clinical and experimental observations. In L. Cermak (Ed.), Memory and amnesia. Hillsdale, NJ: Lawrence Erlbaum. Wood, W. D .• & Strider, M.A. (1980). Comparison of two methods of administering the Halstead Category Test. Journal of Clinical Psrchology, 36(2), 476-479. Woodard, J. L. (1994). Personal programComputerized version of the Wisconain Card Sorting Test. Woodard, J. L., Axelrod, B. N., & Henry, R. R. (1992). Interrater reliability of scoring :parameters for the Design Fluency Test. Neuropsychology, 6(2), 173-178. Woodard, J. L., Benedict, R. H. B., Roberts, V. J., Goldstein, F. C., Kinner, K. M., Caprulio, D. X., et al. (1996). Short-form alternative~ to the Judgment of Line Orientation Test. Journal of Clinical and Experimental Neuropsychology, 18(6), 898-904. Woodard, J. 
L., Benedict, R. H. B., Salthouse, T. A., Toth, J. P., Zgaljardic, D. J., & Hancodc, H. E. (1998). Normative data for equivalent. parallel forms of the Judgment of Line Orientation Test. Journal of Clinical and Experimental N~ropsy chology, 20(4), 457-462.
Woodard, J. L., Dunlosky, J., & Salthouse, T. A. (1999a). Task decomposition analysis of intertrial free recall performance on the Rey Auditory Verbal Learning Test in normal aging and Alzheimer's disease. Journal of Clinical and Experimental Neuropsychology, 21(5), 666--676. Woodard, J. L., Goldstein, F. C., Roberts, V. J., & McGuire, C. (1999b). Convergent and discriminant validity of the CVLT (dementia version). Journal of Clinical and Experimental Neuropsychology, 21(4), 553-558. Woodruff-Pak, D. S., & Finkbiner, R. G. (1995). Larger nondeclarative than declarative deficits in learning and memory in human aging. Psychology and Aging, 10(3), 416-426. Woodward, A. C. (1982). The Hooper Visual Organization Test: A case against its use in neuropsychological assessment. Journal of Consulting and Clinical Psychology, 50(2), 286-
288. Woodward, J. L., Benedict, R. H. B., Roberts, V. J., Goldstein, F. C., Kinner, K. M., Capruso, D. X., et al. (1996). Short-form alternatives to the Judgment of Line Orientation Test. Journal of Clinical and Experimental Neuropsychology, 18, 898-904. Worrall, L. E., Yiu, E.M-L., Hickson, L. M. H., & Barnett, H. M. (1995). Normative data for the Boston Naming Test for Australian elderly. Aphasiology, 9(6), 541-551. Xavier, F., Ferraz, M., Trentini, C., Freitas, N., & Moriguchi, E. (2002). Bereavement-related cognitive impairment in an oldest-old communitydwelling Brazilian sample. Journal of Clinical and Experimental Neuropsychology, 24(3), 294--301. Yaffe, K., Cauley, J., Sands, L., & Browner, W. (1997). Apolipoprotein E phenotype and cognitve decline in a prospective study of elderly community women. Archives of Neurology, 54(9), 1110-1114. Yaffe, K., Grady, D., Pressman, A., & Cummings, S. (1998). Serum estrogen levels, cognitive performance, and risk of cognitive decline in older community women. Journal of the American Geriatrics Society, 46(7), 816--821. Yaffe, K., Blackwell, T., Gore, R., Sands, L., Reus, V., & Browner, W. S. (1999a). Depressive symptoms and cognitive decline in nondemented elderly women: A prospective study. Archives of General Psychiatry, 56(5), 425-430. Yaffe, K., Browner, W., Cauley, J., Launer, L., & Harris, T. (1999b). Association between hone mineral density and cognitive decline in older women. Journal of the American Geriatrics Society, 47(10), 1176-1182.
REFERENCES Yaffe, K., Lui, L.-Y., Zmuda, J., & Cauley, J. (2002). Sex honnones and cognitive function in older men. Journal of the American Geriatrics Society, 50(4), 707-712. Yamazaki, A. (1985). Interference in the Stroop color-naming task. Japanese Journal of Psychology, 56, 185-191. Yeates, K. 0., Patterson, C. M., Waber, D. P., & Bernstein, J. H. (2003). Constructional and figural memory skills following pediatric closed-head injury: Evaluation using the ROCF. In J. A. Knight (Ed.), The handbook of Rsy-Osterrieth Complex
Figure usage: Clinical and research applications. Lutz, FL: Psychological Assessment Resources. Yehuda, R., Keefe, R. S. E., Harvey, P. D., Levengood, R. A., Gerber, D. K., Geni, J., et al. (1995). Learning and memory in combat veterans with posttraumatic stress disorder. American Journal of Psychiatry, 152(1), 137-139. Yesavage, J. A., Brink, T. L., Rose, T. L., Lum, 0., Huang, V., Adey, M., et al. (1983). Development and validity of a Geriatric Depression Scale: A preliminary report. Journal of Psychiatric Research, 17, 37-49. Yeudall, L. T., Fromm-Auch, D., & Davies, P. (1982). Neuropsychological impainnent of persistent delinquency. Journal of Nervous and Mental Disease, 170(5), 257-265. Yeudall, L. T., Reddon, J. R., Gill, D. M., & Stefanyk, W. 0. (1987). Nonnative data for the Halstead-Reitan Neuropsychological Tests stratified by age and sex. Journal of Clinical Psychology, 43(3), 346--367. Yeudall, L. T., Fromm, D., Reddon, J. R., & Stefanyk, W. 0. (1986). Normative data stratified by age and sex for 12 neuropsychological tests. Journal of Clinical Psychology, 43(3), 918-946. Ylikoski, R., Ylikoski, A., Erkinjuntti, T., Sulkava, R., Raininko, R., & Tilvis, R. (1993). White matter changes in healthy elderly persons correlate with attention and speed of mental processing. Archives of Neurology, 50, 818-824. York, C. D., & Cennak, S. A. (1995). VISual perception and praxis in adults after stroke. American
JournalofOccupatianalTherapy,49(6),543-550. York, M. K., Levin, H. S., Grossman, R. G., Lai, E. C., & Krauss, J. K. (2003). Clustering and switching in phonemic fluency following pallidotomy for the treatment of Parkinson's disease.
Journal of Clinical and Experimental Neuropsychology, 25(1), 110-121. Young, M. H., & Justice, J. (1998). Neuropsychological functioning of inmates referred for psychiatric treatment. Archives of Clinical Neuropsychology, 13(3), 303-318.
609 Youngjohn, J. R., Larrabee, G. J., & Crook, T. H. (1992). Discriminating age-associated memory impainnent from Alzheimer's disease. Psychological Assessment, 4(1), 54-59. Youngjohn, J. R., Larrabee, G. J., & Crook, T. H. (1993). New adult age- and education-correction nonns for the Benton Visual Retention Test. Clinical Neuropsychologist, 7(2), 155-160. Yurgelun-Todd, D. A., & Kinney, D. K. (1993). Patterns of neuropsychological deficits that discriminate schizophrenic individuals from siblings and control subjects. Journal of Neuropsychiatry and Clinical Neurosciences, 5(3), 294-300. Zable, M., & Harlow, H. F. (1946). The performance of rhesus monkeys on a series of object quality and positional discriminations and discrimination reversals. Journal of Comparotive Psychology, 39, 1. Zachary, R. A., & Gorsuch, R. L. (1985). Continuous nonning: Implications for the WAIS-R. Journal of Clinical Psychology, 41(1), 86-94. Zakzanis, K. K. (1998). The reliability of metaanalytic review. Psychological Reports, 83(1), 215-222. Zalewski, C., Thompson, W., & Gottesman, I. (1994). Comparison of neuropsychological test perfonnance in PTSD, generalized anxiety disorder, and control Vietnam veterans. Assessment, 1(2), 133-142. Zappala, G., Measso, G., Cavarzeran, F., Grigoletto, F., Lebowitz, B., Pirozzolo, F., et al. (1995). Aging and memory: Corrections for age, sex and education for three widely used memory tests. Italian Journal of Neurological Sciences, 16(3), 177-184. Zec, R. F., Landreth, E. S., Fritz, S., Grames, E., Hasara, A., Fraizer, W., et al. (1999). A comparison of phonemic, semantic, and alternating word fluency in Parkinson's disease. Archives of Clinical Neuropsychology, 14(3), 255-264. Zeitlhofer, J., Asenbaum, S., Spiss, C., Wimmer, A., Mayr, N., Wolner, E., et al. (1993). Central nervous system function after cardiopulmonary bypass. European Heart Journal, 14(1), 885-890. Zhou, W., Liang, Y., & Christiani, D. C. (2002). 
Utility of the WHO neurobehavioral core test battery in Chinese workers-A meta-analysis. Environmental Research, 88(2), 94-102. Zondennan, A. B., Giambra, L. M., Arenberg, D., Resnick, S., & Costa, P. (1995). Changes in immediate visual memory predict cognitive impainnent. Archives of Clinical Neuropsychology, 10(2), 111-123.
Appendix 1: Where to Buy the Tests
Several of the tests mentioned in this book are available from more than one distributor. An asterisk placed before the name of the test
indicates that the company listed at left is also the primary publisher of the test.
TEST PUBLISHER/DISTRIBUTOR AND TEST NAMES

Editorial Medica Panamericana
Alberto Alcocer, 24-6a
28036 Madrid, Spain
(3491)-457-0203 [Phone]
www.medicapanamericana.com
Test names:
  *Boston Naming Test, Spanish version

Lafayette Instrument
3700 Sagamore Parkway North
P.O. Box 5729
Lafayette, IN 47903
1-800-428-7545 [Phone orders]
1-765-423-4111 [Fax orders]
www.Lafayetteinstrument.com
Test names:
  *Lafayette Hand Dynamometer
  *Grooved Pegboard Test

NormativeData.com
35 S. Raymond Ave., #304
Pasadena, CA 91105-1993
1-626-304-9995 [Fax orders only]
www.NormativeData.com
Test names:
  *WHO-UCLA Auditory-Verbal Learning Test
  *Stroop (Comalli/Kaplan versions in Spanish and/or English)

Psychological Assessment Resources
16204 N. Florida Ave.
Lutz, FL 33549
1-800-331-TEST [Phone orders]
1-866-727-2884 [24-hour order line]
1-800-727-9329 [Fax orders]
www.parinc.com
Test names:
  *Color Trails Test (Adult & Children's versions)
  Grooved Pegboard Test
  Finger Tapping Test
  Lafayette Hand Dynamometer
  Boston Naming Test-Revised
  *Rey Complex Figure & Recognition
  *Intermediate Booklet Category Test
  *Portable Tactual Performance Test
  *Stroop Test (Trenerry version)
  Stroop Color & Word Test (Golden version)
  Ruff Figural Fluency Test
  *Wisconsin Card Sorting Test (64 SP, 64 V2, and V4 versions)
  Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure
  Hopkins Verbal Learning Test-Revised
  Rey Auditory-Verbal Learning Test: A Handbook
  Benton Judgment of Line Orientation
  Benton Visual Form Discrimination Test
  Ruff 2 & 7
  Digit Vigilance Test

The Psychological Corporation
555 Academic Court
San Antonio, TX 78204-2498
1-800-211-8378 [Phone orders]
1-800-232-1223 [Fax orders]
www.PsychCorp.com
Test names:
  Boston Naming Test-Revised
  Rey Complex Figure & Recognition
  *Wechsler Memory Scale-Revised
  *Wechsler Memory Scale-III, IIIA
  California Verbal Learning Test, 2nd ed.
  Wisconsin Card Sorting Test (64 V2 and V3 versions)
  Paced Auditory Serial Addition Test

Reitan Neuropsychological Laboratory
P.O. Box 66080
Tucson, AZ 85728
1-520-577-2970 [Phone orders]
1-520-577-2940 [Fax orders]
www.ReitanLabs.com
Test names:
  *Category Test
  *Finger Tapping Test
  *Tactual Performance Test
  *Trail Making Test (1945 version)
  Hand Dynamometer

Riverside Publishing
425 Spring Lake Drive
Itasca, IL 60143-2079
1-800-323-9540 [Phone orders]
1-630-467-7192 [Fax orders]
www.riversidepublishing.com
Test names:
  Boston Naming Test-Revised
  Stroop Color & Word Test (Golden version)
  Lafayette Hand Dynamometer
  Grooved Pegboard Test

Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
1-800-648-8857 [Phone orders]
1-310-478-7838 [Fax orders]
www.wpspublish.com
Test names:
  *Short Category Test, Booklet Format
  *Stroop Color & Word Test (Golden version)
  *Hooper Visual Organization Test
  Rey Auditory-Verbal Learning Test
  Wisconsin Card Sorting Test

Appendix 2a: Subject Instructions for ACT According to Boone et al. (1990) and Boone (1999)
The examiner instructs the patient: "I'm going to say three letters, and I want you to say them back to me. Ready? Q, L, X."

Administer first five trials. Write down patient's responses in the response column. In the next column, indicate the number of correct responses (i.e., 0, 1, 2, or 3). If more than one or two errors are present, the test should be discontinued. If the patient cannot repeat the letters reliably, this would point to a linguistic or hearing problem, in which case frontal lobe functioning cannot be assessed with this test.

"Now I'm going to say three letters, and then I'm going to say a number. After I say the number, I want you to count backward out loud by 3s from the number. For example, if the number was 100, you would say, '100, 97, 94,' etc. After a few seconds of counting, I will stop you and will want you to tell me the letters. In other words, you have to do two things at once: hold the letters in mind while you are counting backward by 3s. It is a difficult task, and it is normal to make mistakes. Let's try one. X, C, P, 194. Start counting."

The numbers in the "delay" column indicate how many seconds the person should count backward by 3s. Begin timing when the patient actually says a number out loud. You may allow the patient to attempt to consolidate the information for approximately 12 seconds prior to the actual commencement of counting; do not allow more time than this.

A perseveration is scored if the person says a letter that is incorrect and one which was said in the trial directly preceding the current one; a total of 57 perseverations are possible. A sequence error occurs when the patient says the letters out of sequence; a total of 20 sequence errors are possible.
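
The per-trial scoring rules above (number correct, perseverations, sequence errors) can be sketched in code. This is only an illustrative sketch, not part of the published procedure: the function name `score_act_trial` is hypothetical, and two readings of the rules are assumed, namely that a perseveration is an incorrect letter taken from the stimulus of the directly preceding trial, and that a sequence error means the correctly recalled letters were reported in a different order than presented.

```python
def score_act_trial(stimulus, response, previous_stimulus=None):
    """Score one ACT trial; returns (n_correct, n_perseverations, sequence_error)."""
    target = stimulus.upper()
    said = [ch for ch in response.upper() if ch.isalpha()]
    previous = set(previous_stimulus.upper()) if previous_stimulus else set()

    # Correct letters, in the order the patient reported them (each counted once).
    recalled = []
    for ch in said:
        if ch in target and ch not in recalled:
            recalled.append(ch)

    # Assumed reading of the rule: an incorrect letter that belonged to the
    # stimulus of the directly preceding trial counts as a perseveration.
    perseverations = {ch for ch in said if ch not in target and ch in previous}

    # Assumed reading: a sequence error means the correctly recalled letters
    # were reported in a different order than they were presented.
    presented_order = [ch for ch in target if ch in recalled]
    sequence_error = recalled != presented_order

    return len(recalled), len(perseverations), sequence_error


print(score_act_trial("NDJ", "N X J", previous_stimulus="XCP"))  # (2, 1, False)
print(score_act_trial("XCP", "P C X"))                           # (3, 0, True)
```

With three letters per trial and no preceding trial for the first one, this reading allows at most 3 perseverations on each of the remaining 19 trials, which agrees with the stated maximum of 57.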
Appendix 2b: Auditory Consonant Trigrams (Boone et al., 1990; Boone, 1999)
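Stimulus  Starting Number  Delay (Seconds)  Response  # Correct  Perseveration  Sequence
QLX       -                0                ________  _________  _____________  ________
SZB       -                0                ________  _________  _____________  ________
HJT       -                0                ________  _________  _____________  ________
CPW       -                0                ________  _________  _____________  ________
DLH       -                0                ________  _________  _____________  ________
XCP       194              18               ________  _________  _____________  ________
NDJ       75               9                ________  _________  _____________  ________
FXB       28               3                ________  _________  _____________  ________
JCN       180              9                ________  _________  _____________  ________
BCQ       167              18               ________  _________  _____________  ________
KMC       20               3                ________  _________  _____________  ________
RXT       188              18               ________  _________  _____________  ________
KFN       82               9                ________  _________  _____________  ________
MBW       47               3                ________  _________  _____________  ________
TDH       141              9                ________  _________  _____________  ________
LRP       51               3                ________  _________  _____________  ________
ZWS       117              18               ________  _________  _____________  ________
PHQ       89               9                ________  _________  _____________  ________
XGD       158              18               ________  _________  _____________  ________
CZQ       91               3                ________  _________  _____________  ________

Number Correct:
0" Delay   ____/15
3" Delay   ____/15
9" Delay   ____/15
18" Delay  ____/15
TOTAL      ____/60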
Appendix 2c: Subject Instructions for ACT According to Stuss et al. (1987, 1988)
The examiner starts by saying: "I am going to say three letters and when I am through, I am going to knock like this. When I do I want you to say the letters back to me." The examiner says letters out loud at the rate of 1 per second and records the patient's answers.

After the five trials with "0" delay, the examiner continues the test. "This time, I am going to say three letters followed immediately by a number. As soon as you get the number, I want you to start counting backward out loud until I knock as before." (Examiner demonstrates by knocking on the desk) "When I knock, I want you to recall the three letters. Do you have any questions?" If the instructions are clearly understood, the examiner starts with the delayed trials. If not understood, repeat with examples.

For this part of the test there are three delayed-recall conditions, which are "3," "9," and "18" randomly alternating. All trials are presented independently of the patient's performance. The examiner says the letters and the numbers and immediately starts the stopwatch until the corresponding delayed-recall period elapses. He then knocks on the desk and records the letters reported by the patient.

SUPPLEMENTARY INSTRUCTIONS

On this test, it is important to maintain interference conditions. The examiner must make sure that the patient is counting during the delayed period. Some patients tend to repeat the letters after the examiner instead of starting counting immediately. In this instance, the patient should be told not to repeat the letters but to start counting as soon as he or she hears the number. The examiner may have to encourage him or her to count out loud by counting with him or her at the beginning or if the patient hesitates. Some patients have great difficulty counting by 3s backward. In such cases, the patient is asked to count backward by 1s instead. However, this procedure is nonstandard and should be noted. On this test, only one presentation for each trial is allowed.

SCORING

For each trial, the letters given by the patient are recorded verbatim (i.e., in the order reported by the patient) in the first column of the score sheet. The number of correct letters identified is noted in the second column. The number of correct letters for each delayed condition over 15 is registered at the bottom of the score sheet. The summation of these subscores constitutes the total score of the test over 60.
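
The summation described under scoring can be illustrated with a short sketch. It assumes each trial has already been scored as a number of correct letters (0 to 3) and is labeled with the delay printed on the score sheet; the function name `summarize_act` and the sample data are hypothetical, not part of the published form.

```python
from collections import defaultdict


def summarize_act(trials):
    """Sum per-trial correct letters into delay subscores and a total.

    `trials` is an iterable of (delay_seconds, number_correct) pairs.
    """
    subscores = defaultdict(int)
    for delay, number_correct in trials:
        if not 0 <= number_correct <= 3:
            raise ValueError("number correct per trial must be 0, 1, 2, or 3")
        subscores[delay] += number_correct
    total = sum(subscores.values())
    return dict(subscores), total


# Hypothetical protocol: five trials at each of four delay conditions.
trials = [(0, 3)] * 5 + [(9, 2)] * 5 + [(18, 1)] * 5 + [(36, 0)] * 5
subscores, total = summarize_act(trials)
print(subscores)  # {0: 15, 9: 10, 18: 5, 36: 0}
print(total)      # 30
```

With five trials of three letters per condition, each subscore is out of 15, and four conditions give the total out of 60. The function keys on whatever delay values are supplied, so it does not depend on a particular set of delays.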
Appendix 2d: Auditory Consonant Trigrams (Stuss et al., 1987, 1988)
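Stimulus  Starting Number  Delay (Seconds)  Responses  Number Correct
QLX       -                0                _________  ______________
SZB       -                0                _________  ______________
HJT       -                0                _________  ______________
GPW       -                0                _________  ______________
DLH       -                0                _________  ______________
XCP       194              18               _________  ______________
NDJ       75               9                _________  ______________
FXB       128              36               _________  ______________
JCN       180              9                _________  ______________
BGQ       167              18               _________  ______________
KMC       120              36               _________  ______________
RXT       188              18               _________  ______________
KFN       82               9                _________  ______________
MBW       147              36               _________  ______________
TDH       141              9                _________  ______________
LRP       151              36               _________  ______________
ZWS       117              18               _________  ______________
PHQ       89               9                _________  ______________
XGD       158              18               _________  ______________
CZQ       191              36               _________  ______________

Number Correct
0" Delay   ____
9" Delay   ____
18" Delay  ____
36" Delay  ____
TOTAL      ____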
Two
Last Two
Appendix 3: WHO-UCLA Auditory Verbal Learning Test: Instructions and Test Forms
GENERAL INSTRUCTIONS FOR EXAMINERS

Instructions to Subjects: The instructions to subjects are printed on the test form. Be sure to indicate before Trial II that the subject should try to remember as many words as he can, "... including the words you remembered on the first trial." After the last trial (Trial V), you should remember to tell the subject that "I want you to try to remember as many of those words as possible because I'm going to ask you about them again a little later." These instructions are printed on the test form for "Recall Following Interference."

Instructions to Examiners: Use a clipboard to hold the test forms out of the subject's view. Read the words at the rate of approximately 1 word per second. On the last word, drop the pitch of your voice to indicate that you are finished. If necessary, prompt the subject to begin recall. In general, you should not look at the subject while reading the list or while recording his responses since this kind of eye contact makes many people nervous.

Code all responses by placing a check mark in the relevant box. Place additional check marks when items are repeated. If the subject gives a word that is not on the list, record the intrusion in the spaces provided below the 15-item list. At the end of each trial write down the total numbers of correct responses, repetitions, and intrusions. If the subject makes an intrusion, wait until he indicates that he is finished, then prompt the subject by saying, "You said __, __ was not on the list."
If the subject makes an intrusion that may reflect poor hearing on the part of the subject or poor pronunciation on the part of the examiner (such as saying "Pie" instead of "Eye"), count the item as correct the first time, and correct the subject by saying, "You said Pie, the correct word is Eye." If the subject continues to produce the same intrusion on subsequent trials, correct the subject in the same manner but score the response as an intrusion. Make allowances for translation and pronunciation difficulties on the part of nonnative speakers of the language being used for test administration.
If a subject asks if a particular response is correct, answer him truthfully. If a subject asks if he has already said a particular word, again, answer truthfully, but count the word as a repetition if he has already said it once. In general, feel free to answer any of the subject's questions about the task, including the number of trials and the number of words on the list.
If the subject is not producing at least 10 responses by the third trial, encourage the subject to try a little longer.
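The tallying rule described in the examiner instructions (a check mark for each list word, additional check marks for repeated items, and intrusions for words not on the list) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `score_recall_trial` and the dictionary keys are assumptions and are not part of the test form. The sketch does not handle the mishearing allowance (for example, "Pie" for "Eye"), which still requires the examiner's judgment.

```python
# The 15-item primary list as printed on the test form.
PRIMARY_LIST = ["arm", "cat", "axe", "bed", "plane", "ear", "dog", "hammer",
                "chair", "car", "eye", "horse", "knife", "clock", "bike"]


def score_recall_trial(responses, word_list=PRIMARY_LIST):
    """Hypothetical helper: tally one recall trial from responses in spoken order."""
    targets = set(word_list)
    seen = set()
    correct = 0
    repeats = 0
    intrusions = 0
    for raw in responses:
        word = raw.strip().lower()
        if word not in targets:
            intrusions += 1      # word is not on the list
        elif word in seen:
            repeats += 1         # list word said again within the same trial
        else:
            seen.add(word)
            correct += 1         # first mention of a list word
    return {"correct": correct, "repeats": repeats, "intrusions": intrusions}


trial_1 = score_recall_trial(["Cat", "dog", "chair", "cat", "table", "eye"])
print(trial_1)
```

The same function could be applied to the interference list (Trial VI) by passing a different `word_list`.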
ACQUISITION TRIALS
Trial I: "The next task may seem a bit difficult in the beginning, but usually it gets easier as we go along. I am going to read for you a long list of words. Once I'm done, I'd like to see how many of the words you can recall. You can repeat the words in any order that you prefer; you don't have to use the same order that I use. Then, I am going to read the same list for you a few more times, to see how many of the words you can eventually learn. Ready?"
Trial II: "That was a good beginning. Now I'm going to read the same list again, and again I would like to see how many of the words you can recall, including the words you remembered on the first trial. Again, listen very carefully. Ready?"
Trials III-V: "Very good. I'm going to read the list again. Again, listen carefully and try to remember as many words as you can. Ready?"
INTERFERENCE LIST, RECALL FOLLOWING INTERFERENCE
Instructions for Trial VI: After Trial V of the primary word list, say, "Very good. I want you to try to remember as many of those words as possible because I'm going to ask you about them again a little later." Then say, "Now I am going to read for you a different list of words. Once again, when I'm done, I'd like to see how many of the words you can recall. Ready?" Read the interference list (boot, monkey, etc.), and record responses under Trial VI. Unlike earlier trials, you should not correct any intrusions that the subject makes.
Instructions for Trial VII: After the subject has recalled as much as possible from the interference list, say, "Now I'd like to see how many words you can recall from the first list, the one we went through five times. Tell me as many words as you can remember from the first list." Record responses under Trial VII.
30-MINUTE DELAYED RECALL AND RECOGNITION
Instructions for Trial VIII: Without reading the list again, say, "Remember the long list of words I read to you five times? I'd like you now to tell me as many of the words from that list as you can remember." Do not correct the subject if he/she makes any intrusions.
Instructions for Trial IX: Immediately following the delayed recall say, "Next I would like to see how many of the words you can recognize. Say Yes if you hear a word you think was part of the original list we went through five times. If you think that the word is not from that list, say No. Make sure you only say Yes to those words you are sure you remember as being a part of that list we went through five times." Read the words in order from left to right. Circle the word if the subject says 'Yes.' If the subject hesitates and fails to answer within a few seconds, say, "If you are not sure, just make your best guess." Words from the original list are capitalized.
Scoring: "Correct Recognitions" is the total number of circled words that are capitalized. "False Identifications" is the total number of circled words that are not capitalized.
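The Trial IX scoring rule can be illustrated with a short Python sketch. It assumes that every circled word comes from the recognition form, and that the capitalized items on the form are exactly the 15 words of the original list; the function name `score_recognition` is hypothetical and is not part of the test materials.

```python
# Words from the original list (printed in capitals on the recognition form).
ORIGINAL_LIST = {"arm", "cat", "axe", "bed", "plane", "ear", "dog", "hammer",
                 "chair", "car", "eye", "horse", "knife", "clock", "bike"}


def score_recognition(circled_words):
    """Hypothetical helper: score Trial IX from the words circled by the examiner."""
    circled = {w.strip().lower() for w in circled_words}
    correct_recognitions = len(circled & ORIGINAL_LIST)     # circled and capitalized
    false_identifications = len(circled - ORIGINAL_LIST)    # circled, not capitalized
    return correct_recognitions, false_identifications


correct, false_ids = score_recognition(["HORSE", "DOG", "mirror", "EYE", "bus"])
print(correct, false_ids)
```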
Trial I | Trial II | Trial III | Trial IV | Trial V
Arm
Cat
Axe
Bed
Plane
Ear
Dog
Hammer
Chair
Car
Eye
Horse
Knife
Clock
Bike
Correct:
Repeats:
Intrusions:
Copyright © 1990 by Paul Satz, Ph.D., Alexander Chervinsky, Ph.D., and Louis F. D'Elia, Ph.D. All rights reserved. Form design by E.N. Miller.
Trial VI | Trial VII
Boot | Arm
Monkey | Cat
Bowl | Axe
Cow | Bed
Finger | Plane
Dress | Ear
Spider | Dog
Cup | Hammer
Bee | Chair
Foot | Car
Hat | Eye
Butterfly | Horse
Kettle | Knife
Mouse | Clock
Hand | Bike
Correct:
Repeats:
Intrusions:
Copyright © 1990 by Paul Satz, Ph.D., Alexander Chervinsky, Ph.D., and Louis F. D'Elia, Ph.D. All rights reserved. Form design by E.N. Miller.
Trial VIII
Arm
Cat
Axe
Bed
Plane
Ear
Dog
Hammer
Chair
Car
Eye
Horse
Knife
Clock
Bike
Correct:
Repeats:
Trial IX-Oral Recognition
mirror HORSE leg DOG HAMMER KNIFE candle motorcycle AXE CLOCK
CHAIR PLANE turtle table CAT lips tree ARM truck EYE
fish EAR BIKE snake stool bus nose BED sun CAR
Correct Recognitions: ___
False Identifications: ___
Copyright © 1990 by Paul Satz, Ph.D., Alexander Chervinsky, Ph.D., and Louis F. D'Elia, Ph.D. All rights reserved. Form design by E.N. Miller.
Appendix 4: Locator and Data Tables for the Trailmaking Test (TMT)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 4.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A4.1. Locator Table for the Trail-Making Test (TMT) Study
TMT.1 Davies, 1968 page 72 Table A4.2
TMT.2 Goul & Brown, 1970 page 73 Table A4.3
TMT.3 Wiens & Matarazzo, 1977 page 73 Table A4.4
TMT.4 Eson et al., personal communication page 74 Table A4.5
Age•
20-39 40-49 50-59 60-69 70-79
20-29 30-39 40-49 50-59 60-72
n
540 180
90 90 90 90 106 26 25 24 16 15
British adults: 50 M, 40 F in each decade of age. Mean times corresponding to several percentile ranges are presented. Ss were Canadian workers' compensation non-brain-injured, hospitalized patients. Data are stratified by five age groups.
48
All males, neurologically normal. Divided into two equal groups. Random sample of 29 were retested 14-24 weeks later.
63 15 16 16 16
Older participants. Data are provided for 4 age groups.
23.6 24.8
63.2 67.0 72.0 78.3
Sample Composition
IQ/Education•
Location England
Educ. 6-13
Canada
Educ. 13.7 14.0 FSIQ 117.5 118.3
Portland, OR
USA
Table A4.1. (Contd.) Age•
n
TMT.5 Harley et al., 1980 page 74 Table A4.6
55-79
193
55-79
160
TMT.6 Anthony et al., 1980 page 74 Table A4.7
38.88 (15.80)
100
TMT. 7 Bak & Greene, 1980 page 75 Table A4.8
50-62 55.6 (4.4) 67-86 74.9 (6.0)
Study
TMT.8 Kennedy, 1981 page 75 Table A4.9
20-29
30-39 40-49 50-59 60-69
Sample Composition V.A.-hospitalized patients (patients with chronic brain syndrome were included). Participants in alcohol-equated sample. Both samples are divided into 5 age groups.
IQ/Education•
Location
Educ. 8.8
Wisconsin
Healthy control group. Data are provided for Trails B only.
Educ. 13.33 (2.56) FSIQ 113.5 (10.8)
Colorado
15
6 M, 9F
Educ. 13.7 (1.91)
Texas
15
5 M, 10 F healthy participants
14.9 (2.99)
150 30 30 30 30 30
Ss were employees of a mental health center. Five age groups are represented; equal M/F ratio.
111 M, 82 F volunteers described as nonpsychiatric & nonneurological; 83% are R-handed. Sample partitioned into 5 age groups.
Educ. (est. IQ) 13.73 (123.43) 13.53 (127.10) 13.11 (127.40) 11.59 (123.30) 12.50 (128.54)
Canada
Educ.
Canada
TMT.9 Fromm-Auch & Yeudall, 1983 page 76 Table A4.10
15-64 25.4 (8.2) 15-17 18-23 24-32 33-40 41-64
193
TMT.10 Bornstein, 1985 page 76 Table A4.11
1~9
365
178 M, 187 F paid volunteers. Neurologically healthy. Data are presented in age x gender x education cells.
Educ. 12.3 (2.7) (5-20 years) < high school ≥ high school
Canada
553
356 M, 197 F. Data for Trails B are reported. Sample was divided into 3 age groups & 3 education categories; % of Ss classified as normal is provided.
Educ. 13.3 (3.4) <12 (132) 12-15 (249) ≥16 (172)
Colorado, California, Wisconsin
Ss were medical and psychiatric V.A. inpatients without cerebral lesions or histories of alcoholism or cerebral contusions. All Ss except for one were male.
Educ. 1-20 11.43 (3.20) FSIQ 105.9 (13.5)
Southern California
9 M, 14 F volunteers. Data on 3-week retest are provided.
VIQ 105.8 (10.8) PIQ 105.0 (10.5)
Canada/ Ohio
32 75 57 18 10
43.3 (17.1) 2~9 ~9
~26
14.8 (3.0) FSIQ 119.1 (8.8)
~9
TMT.11 Heaton et al., 1986 page 77 Table A4.12
15-81 39.3 (17.5) <40 40-59
≥60 TMT.12 Alekoumbides et al., 1987 page 77 Table A4.13
19-82 46.85 (17.17)
118
TMT.13 Bornstein et al., 1987a page 78 Table A4.14
17-52 32.3 (10.3)
23
Table A4.1. (Contd.) Age•
n
TMT.14 Dodrill, 1987 page 78 Table A4.15
27.77 (11.04)
120
60 M, 60 F volunteers. Data for various intelligence levels are presented.
TMT.15 Ernst, 1987 page 79 Table A4.16
65-75
110
51 M, 59 F volunteers. Time to completion and number of errors are provided.
60 10 10 10 10 10 10
Canadian English- or French-
16-19 20-29 30-39 40-49 50-59 60-69
Study
TMT.16 Stuss et al., 1987 page 80 Tables A4.17, A4.18
Sample Composition
speaking Ss; 55% male, 18% L-handed; 6 age groups represented. Data for test/retest (1 week) are provided.
TMT.17 Yeudall et al., 1987 page 80 Table A4.19
15-20 21-25 26-30 31-40
225
Normal adults; 127 M, 98 F, data are stratified by 4 age groups x gender.
TMT.18 Bornstein & Suga, 1988 page 81 Table A4.20
55-70 62.7 (4.3)
134
Healthy elderly paid volunteers; 49 M, 85 F. No history of neurological or psychiatric disorders. Data are divided into 3 education groups: 17M, 29 F 16M, 28 F 16M, 28 F Data are divided into 3 age groups for original test and 1-week retest. Expansion of the TMT.16 study
IQ/Education• Educ. 12.28 (2.18) FSIQ 100 (14.3) Educ. 10.3
Location Washington
Brisbane, Australia
Educ. 14.3 (2.62) ≤12 >12
Canada
Educ. 10-17 14.55 (2.78) FSIQ 112.25 (10.25)
Canada
Educ. 11.7 (2.9)
Canada
Range, mean: 5-10,8.5 11-12, 11.7 >12, 15.0
16-29 30-49 50-69
90 30 30 30
TMT.20 Van Gorp et al., 1990 page 82 Table A4.22
57-85 57-65 66-70 71-75 76-85
156 28 45 57 26
Elderly Ss with no history of neurological or psychiatric disorders; 61% F; 4 age groups presented.
Educ. 14.4 (2.86) FSIQ 117.21 (21.59)
California
TMT.21 Heaton et al., 1991, 2004 page 82 Data are not reproduced in this book
42.1 (16.8) 20-34 35-39 40-44 45-49
486
Volunteers; 65% of the sample were males. Data are presented in T-score equivalents for M and F separately in 10 age groupings by 6 education groupings. In the 2004 edition, age range is expanded to include 85 years, and the data are presented for African-American and Caucasian participants separately.
Educ. 13.6 (3.5) FSIQ 113.8 (12.3)
California, Washington, Texas, Oklahoma, Wisconsin, Illinois, Michigan. New York, Virginia, Massachusetts, Canada
TMT.19 Stuss et al., 1988 page 81 Table A4.21
50-54 55-59 60-64 65-69 70-74 75-80
Canada Educ./range 14.1 (1.34) 11-18 14.9 (3.95) 5-20 13.2 (2.38) 8-18
Table A4.1. (Contd.) Study
Age•
TMT.22 Selnes et al., 1991 page 83 Table A4.23
n 733
25-34 35-44 45-54
TMT.23 Elias et al.,
15-24
1993 page 84 Table A4.24
25-34 35-44
TMT.24 Cahn et al.,
78.4 (6.8)
Sample Composition Ss from MACS study. Seronegative homosexual and bisexual males. Data are stratified by 3 age groups and 3 education levels.
IQ/Education•
Location
College
MACS centers at Baltimore,
Chicago, Los Angeles, & Pittsburgh
427
Healthy volunteers; 187 M, 240 F. Data are stratified by 6 age groups x gender.
Educ. 12-19
Maine
238
Cognitively intact elderly participants in Rancho Bernardo Study. Data for the entire sample and optimal cutoffs are provided.
13.8 (2.6)
California
359
167 M, 192 F; 332 R-handed; normal elderly volunteers. The article provides tables for age correction and a regression equation for education correction. Tables are not reproduced.
MAYO FSIQ 106.2 (14.0)
Minnesota
45-54 55-64 ≥65
1995 page 84 Table A4.25
TMT.25 Ivnik et al., 1996 page 85 Table A4.26
56-59 60-64
65-69 70-74 75-79 80-84
85-89
90-94
TMT.26 Richardson &
54 81 65 57 53 27 17 5
81.5 (3.3) 76-80 81-91
101
32.1 (9.7)
54
Paid male volunteers, control group; strict selection criteria.
15.4 (2.4)
New York
TMT.28 Salthouse
18-39
40
et al., 1997 page 87 Table A4.30
38
60-78
37
15.5 (1.7) 15.2 (2.5) 15.3 (2.6)
Atlanta, GA
40-59
Healthy adults, 47% M. Data are stratified into 3 age groups.
60-69 70-79 80-89
203 262 179 23
Nondemented elderly sample, participants in BLSA study, majority are males; sample is partitioned into 4 age groups.
16.0 (2.9)
Baltimore, MD
21.7 (5.24)
110
Undergrad. students, 22 M, 88 F. Performance was compared for A-B and B-A order of presentation.
Undergrad. students
North Dakota
23.4 (3.1)
98
Marottoli, 1996 page 86 Tables A4.27, A4.28
TMT.27 Hoff et al., 1996 page 86 Table A4.29
TMT.29 Rasmusson et al., 1998 page 87 Table A4.31
TMT.30 Miner & Ferraro, 1998 page 88 Table A4.32
TMT.31 Crowe, 1998b page 88 Table A4.33
90-96
All autonomously living elderly Ss, current drivers; 53 M, 48 F. Data are provided for Trails B for younger-old and older-old by two educational levels.
Undergrad. students, 49 M, 49 F
Educ. 11.0 (3.7) <12
New Haven,
CT
≥12
14.0 (2.3)
Australia
Table A4.1. (Contd.) Study
TMT.32 Tremont et al., 1998 page 88 Table A4.34
TMT.33 Basso et al., 1999 page 89 Table A4.35
TMT.34 Crews et al., 1999 page 89 Table A4.36
TMT.35 Dikmen et al., 1999 page 90 Table A4.37
TMT.36 Binder et al., 1999
Age"
n
16-74
157
32.50 (9.27)
50
20.20 (3.47)
30
Blacksburg, VA
St. Louis, MO
29.1 (12.1)
49
14.3 (1.9)
Rhode Island
13.23 (2.85)
Washington County, MD Pittsburgh, PA
31.9%
Pennsylvania
TMT. 41 Stuss et al.,
53.4
19
2001
(13.6)
page 93 TableA4.43
page 94 Table A4.45
14.40
13.5 (3.0)
413
TMT.43 Stein et al., 2002
Control sample of 30 women
Normal elderly sample, aged 70 or above, 25% M, 87% Caucasian. Data on the number of lines drawn within 180 sec. for both parts A and B and time to completion are reported. Control sample consisting of students and employees of social services agency. Time to completion and number of errors are reported. Elderly volunteers who participated in the multicenter Memory and Aging Study; 44.9% male.
72.90 60-86
2001 page93 Table A4.44
Tulsa, OK/ Ohio
125
483
TMT.42 Bell et al.,
14.98 (1.93)
82.3 (4.4)
74.9 (4.4)
2000 page 92 Table A4.42
Oklahoma
Washington, Colorado, California
page 91 Table A4.40
TMT.40 Small et al.,
13.12 (3.26)
12.1 (2.6)
357
2000 page 92 Table A4.41
Patients referred for evaluation with negative findings; 71 M, 86 F. Data are stratified by 3 levels of intelligence. Data for healthy men on 2 testing probes over a 12-month interval.
Normal and neurologically stable adults; some had neurological conditions; 66% M. Data on test-retest reliability and practice effect are provided.
73.63 (4.45)
TMT.39 Chen et al.,
Location
384
page 91 TableA4.39 TMT.38 Saxton et al., 2000
IQ/Education•
34.2 (16.7)
page 91 Table A4.38 TMT.37 Ruffolo et al., 2000
Sample Composition
Control elderly volunteers who participated in the MoVIES study, 37.5% male. Data are reported for the entire sample. Results of ROC analysis are reported. Normal elderly volunteers, approximately equal number of M and F. Data are stratified by 2 age groups and 2 APOE genotype groups. Control group; 8 M, 11 F. Time to completion, B-A difference and (B-A)/A proportion are reported.
< high school
13.76-14.58
South Florida
13.7 (2.5)
Canada
34.4 (12.5)
29
Sample included friends, relatives, and spouses of TLE patients; 28% male.
13.0 (1.7)
Wisconsin
29.4 (10.7)
22
Control group of women. Time to completion and B-A difference are reported.
13.9 (1.5)
California
Table A4.1. (Contd.) Study
Age•
n
TMT.44 Drane et al., 2002 page 94 Table A4.46
18-20 00-29 30-39 40-49 50-59
18 39 53 46 38 36 36 19
286 healthy adults; 205 M, 80 F.
D~ for women with established disease; 2 groups: i:RT treatment and placebo.
60-69
70-79 80-90
TMT.45Grady et al., 2002 page 95 Table A4.47
66.3 (6.4)
517
67.3 (6.3)
546
TMT.46 Miller, 2003 (an update on Selnes et al., 1991) page 95 Table A4.48
38.0 (7.5)
949
TMT.47 Tombaugh, 2004 page 96 Table A4.49
25--34 35-44 45-59 18-24 25--34 35-44
911
Sample Composition
IQ/Education•
Time to completion, B-A difference, and B:A ratio scores are reported.
toronary
Location
12.98 (2.65)
USA
12.7 (2.7)
Multicenter, USA
12.7 (2.7)
MACS centers
16.3 (2.4)
Seronegative homosexual and bisexual males from the MACS study, native English speakers. Data are partitioned by age x education.
<16 16 >16
Healthy Canadian volunteers; 418 M, 503 F. Data are presented in age x education groups.
Canada
0-12 12+
45-54 55-59 60-64
65-69 70-74 75-79 80-84 85-89
• Age column and IQ/Education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by authors.
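Several studies in the locator table report derived scores in addition to time to completion: total time (A + B), the B-A difference, the B:A ratio, and the (B-A)/A proportion. A minimal Python sketch of these computations follows. The function name `tmt_derived_scores` and the example times are illustrative assumptions; they are not taken from any of the studies listed.

```python
def tmt_derived_scores(time_a, time_b):
    """Hypothetical helper: derived Trail-Making scores from completion times in seconds."""
    if time_a <= 0:
        raise ValueError("Trails A time must be positive")
    difference = time_b - time_a
    return {
        "A+B": time_a + time_b,          # total time
        "B-A": difference,               # difference score
        "B:A": time_b / time_a,          # ratio score
        "(B-A)/A": difference / time_a,  # proportion score
    }


# Invented example: Trails A in 30 s, Trails B in 75 s.
scores = tmt_derived_scores(30.0, 75.0)
print(scores)
```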
Table A4.2. [TMT.1] Davies, 1968: Mean Time in Seconds Corresponding to 10th, 25th, 50th, 75th, and 90th Percentile Ranks for Trails A and B per Age Group for a Sample of British Adults
Trails B Percentile
Trails A Percentile Age
10
25
50
75
90
10
25
50
75
90
180
20s+30s
42 45 49 67 105
.
26 28 29 35
21
129 151 177 282 450
94 100 135 172 292
69 78 98 119 196
55
40s
50 59 67 104 168
31
90 90 90 90
45 49
n
50s 60s 70s
3l 38
so
54
22 25 29 38
57 75 89 132
55 64 79
Table A4.3. [TMT.2] Goul and Brown, 1970: Data for Trails A, B, and Total Time (A+ B) for a Sample of Non-Brain-Injured, Hospitalized Canadian Patients Trails• Age
FSIQ
VIQ
PIQ
A
B
A+B
20-29
103.8 (12.1)
102.6 (12.6)
104.7 (13.0)
36.1 (10.0) 19-62 35.0
85.7 (38.7) 47-245 76.0
121.8 (42.2) 72-290 111.5
106.4 (11.3)
113.7 (8.9)
35.5 (9.4) 19--58 34.0
79.6 (20,4) 32-115 80.0
114.0 (25.3) 51-173 115.0
105.8 (9.1)
104.5 (9.2)
40.0 (13.3) 16--70 38.0
105.2 (42.2) 51-225 99.5
145.0 (48.5) 72-277 142.0
111.2 (11.5)
111.6 (9.0)
45.3 (13.6)
103.2 (43.3)
148.4 (53.0)
n 26
Range Median 30-39
25
110.1 (8.9)
Range Median 40-49
24
105.3 (7.9)
Range Median 16
50-59
112.7 (8.6)
Range Median 15
60-72
104.2 (12.2)
103.2 (13.2)
105.4 (11.6)
Range Median Total
106
5~190
22-68
107.0 (10.6)
105.6 (11.8)
107.9 (11.2)
80-253
46.0
98.5
140.0
68.9 (21.2) 35-120 68.0
158.8 (49.5) 147.0
227.7 (63.2) 123--347 219.0
101.7 (46.2)
144.3 (58.6)
~272
42.9 (17.3)
•Mean time in seconds, standard deviations, ranges, and medians.
Table A4.4. [TMT.3] Wiens and Matarazzo, 1977: Data for Two Equal Groups of Male Applicants to a Patrolman Program and Retest Data 14-24 Weeks Later for a Random Subsample Trails• Initial Test n
Age
Education
FSIQ
VIQ
PIQ
A
B
24
23.6 (21-27)
13.7 (12-16)
117.5 (8.3)
117.4 (8.4)
115.4 (10.5)
23.83 (6.61)
56.42 (12.79)
24
24.8
14.0 (12-16)
118.3
116.4
118.2
20.54
51.04
(6.8)
(6.9)
(8.6)
(4.43)
(11.46)
21.76 (5.65)
54.17 (12.54)
Retest A
B
21.72 (5.86)
51.28 (12.29)
(1~139)
(21-28)
(1~131)
29
24
(21-28)
14 (12-16)
118
•Mean time in seconds and standard deviations.
116
118
Table A4.5. [TMT.4] Eson et al., Personal Communication: Data for a Sample of Older Participants
n | Age | Trails A• | Trails B•
15 | 63.2 | 38.9 (15.6) | 115.3 (57.8)
16 | 67.0 | 42.1 (14.2) | 134.9 (57.7)
16 | 72.0 | 46.2 (24.0) | 145.2 (134.9)
16 | 78.3 | 80.1 (64.8) | 234.8 (117.0)
•Mean time in seconds and standard deviations.
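One common way to compare an individual completion time with a normative mean and standard deviation such as those in Table A4.5 is a z score. The Python sketch below is an illustration only, not a procedure prescribed by the studies: the function name `z_score` and the raw time of 54.5 seconds are assumptions, and because completion times are often positively skewed, a z score gives only a rough indication of standing. Since higher times mean slower performance, a positive z here indicates slower-than-average completion.

```python
def z_score(raw_time, norm_mean, norm_sd):
    """Hypothetical helper: standardize a completion time against a normative group."""
    if norm_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_time - norm_mean) / norm_sd


# Trails A norms for the group with mean age 63.2 in Table A4.5: 38.9 (15.6).
# The raw time of 54.5 seconds is an invented example value.
z = z_score(54.5, 38.9, 15.6)
print(round(z, 2))
```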
Table A4.6. [TMT.5] Harley et al., 1980: Data for the Whole Sample and for an Alcohol-Equated Sample of Veterans Administration Inpatients with Various Psychiatric Diagnoses and Alcoholism, Stratified into Five Age Groups, Patients with Chronic Brain Syndrome Included
Trails•
WAIS Age
n
FSIQ
VIQ
PIQ
Education
A
8
10.1
67.04 (29.75)
175.5 (99.98) 60-676
Total sample 55-59
60-64
~9
70-74
75-79
56
98.57
99.39
(11.43)
(12.92)
80-129
77-131
97.00 (10.65) 72-129
98.58 (9.93) 80-121
101.27 (11.42) 78-123
95.00 (9.82) 78-116
9.8
97.51 (11.18) 80-130
100.37 (12.51) 80-135
93.66 (10.20) 68-120
8.7
100.41 (9.92) 82-125
102.95 (11.81) 80-133
97.24 (10.08) 75-114
8.8
20
101.75 (10.18) 81-119
101.40 (11.40) 77-117
102.15 (9.95) 83-119
6.5
47
99.00 (11.73) 80-129
100.00 (13.02) 77-131
98.00 (11.13) 72-129
10.1
45
35
37
~160
63.67 (24.30)
158.67
~114
70-275
87.89 (75.60) 27-470
219.43 (120.60) ~8
95.24 (6U7)
(126.90)
35-353
83-606
85.80 (43.64) 28-180
225.15 (81.16) 100-410
65.89 (28.21)
178.43 (107.29) 60-176
(55.11)
237.16
Alcohol-equated sample 55-59
~140
60-64
33
96.00 (9.43) 80-117
99.00 (11.33) 78-123
93.00 (9.30) 78-112
9.3
65-69
23
99.00 (12.06) 80-130
102.00 (13.06) 80-135
95.00 (11.52) 68-120
8.8
71.82 (23.08) ~114
174.97 (52.68) 80-275
88.78 (88.34) 27-470
210.26 (135.24) 80-678
Table A4.6. (Contd.) Trails•
WAIS
n
FSIQ
VIQ
PIQ
Education
A
8
70-74
37
100.00 (9.92) 82-125
103.00 (11.91) 80-133
97.00 (10.08) 75-114
8.8
95.24 (64.17) 35-353
237.16 (126.90) 83-606
75-79
20
102.00 (10.18) 31-119
101.00 (11.40) 77-117
102.00 (9.95)
6.5
85.80 (43.64) 28-180
255.15 (81.16) 100-410
Age
~119
"Mean time in seconds, standard deviations, and ranges.
Table A4.7. [TMT.6] Anthony et al., 1980: Data for Trails B for a Control Group
n | Age | Education | FSIQ | VIQ | PIQ | Trails B•
100 | 38.88 (15.80) | 13.33 (2.56) | 113.54 (10.83) | 113.24 (11.59) | 112.26 (10.88) | 68.58 (32.72)
•Mean time in seconds and standard deviations.
Table A4.8. [TMT.7] Bak and Greene, 1980: Data for Healthy Right-Handed Older Adults
Age | n | Education | M/F ratio | Trails A• | Trails B• | WAIS† Info | WAIS† Arithmetic | WAIS† Block Design | WAIS† Digit Symbol
55.6 (4.44), 50-62 | 15 | 13.7 (1.91) | 6/9 | 32.53 (12.58) | 81.67 (30.76) | 20.13 (3.38) | 11.87 (1.92) | 34.33 (6.79) | 54.67 (12.19)
74.9 (6.04), 67-86 | 15 | 14.9 (2.99) | 5/10 | 41.60 (10.33) | 109.00 (38.84) | 21.07 (3.84) | 13.60 (2.97) | 28.07 (5.36) | 39.47 (12.11)
•Mean time in seconds and standard deviations.
†Wechsler Adult Intelligence Scale data are presented in raw scores.
Table A4.9. [TMT.8] Kennedy, 1981: Data for Trails A, B, and Total Time (A + B) for a Sample of Healthy Canadian Employees of a Mental Health Center: Males and Females are Equally Represented within each Age Grouping
n | Age• | Education | IQ Estimate† | Trails A‡ | Trails B‡ | A+B‡
30 | 20-29, 25.77 | 13.73 | 123.43 | 25.03 (8.94) | 59.58 (28.78) | 84.62 (33.11)
30 | 30-39, 34.34 | 13.53 | 127.10 | 28.88 (9.70) | 70.28 (27.79) | 99.13 (34.57)
30 | 40-49, 45.79 | 13.11 | 127.40 | 29.68 (7.67) | 78.80 (26.81) | 108.48 (30.32)
30 | 50-59, 53.74 | 11.59 | 123.30 | 37.73 (19.01) | 96.01 (39.25) | 133.74 (55.98)
30 | 60-69, 64.24 | 12.50 | 128.54 | 35.22 (12.36) | 95.02 (34.62) | 130.23 (40.67)
•Age range and mean age are provided for each group.
†Intelligence quotient estimate is based on performance on the Ammons Quick Test.
‡Mean time in seconds and standard deviations.
Table A4.10. [TMT.9] Fromm-Auch ani Yeudall, 1983: Data for a Sample of Healthy Canadian Adults•
'
n
Age
J\
B
32
15-17
23.. (5.9)
47.7 (10.4) 25.4-81.0 51.3 (14.6) 23.3-101.0
15.~9.0
76
1~23
26.!7
~
12. . .1 57
24.• (7.f)
11.~.0 18
10
27.; (8.. ) 16.0-$2.7 41-64
'
29.7 (8.-4) 16.5--f.O
53.2 (15.6) 29.1-98.0 62.1 (17.5) 39.0-111.0 73.6 (19.4) 41.9-102.0
"Mean education 14.8 years, mean full-scale int~gence quotient 119.1. tMean time in seconds, standard deviations, and ;ranges.
Table A4.11. [TMT.10] Bornstein, 1985: Data for a Sample of Healthy Canadian Adults for Trails A and B• Stratified by Three Age Groupings and Two Educational Levels (<High School, ≥High School) for Males and Females Separately
Female
Male
≥HS
≥HS
A
B
A
B
A
B
A
B
20-39
28.3 (8.4)
70.0 (28.7)
23.8 (6.8)
53.9 (18.3)
23.2 (5.5)
56.4 (21.3)
22.0 (6.0)
53.5 (20.5)
40-59
38.9 (12.5)
107.8 (52.2)
28.6 (9.6)
74.1 (35.8)
30.5 (9.2)
76.7 (25.7)
27.3 (9.2)
60-69
37.6 (8.5)
119.4 (42.3)
35.3 (10.2)
78.3 (26.1)
40.7 (12.9)
96.4 (27.3)
34.5 (8.9)
n=21
n=13
n=86
n=13
n=17
n=l6
n=50
n=22
n=23
n=43
n=22
Table A4.12. [TMT.11] Heaton et al., 1986: Data for a Normal Control Sample on Trails B and Percent Classified as Normal using Russell et al.'s (1970) Criteria, Stratified by Age and by Education
Group | n | Trails B• | Percent Normal | WAIS SS Mean†
Age <40 | 319 | 58.5 | 91.5 | 11.9
Age 40-59 | 134 | 78.3 | 74.6 | 11.2
Age ≥60 | 100 | 116.8 | 33.0 | 9.7
Education <12 | 132 | 102.2 | 54.6 | 9.5
Education 12-15 | 249 | 69.7 | 79.9 | 11.2
Education ≥16 | 172 | 57.9 | 89.5 | 12.9
•Mean time in seconds.
†Mean scaled scores for Wechsler Adult Intelligence Scale subtests.
Table A4.13. [TMT.12] Alekoumbides et al., 1987: Data for a Sample of Medical and Psychiatric Veterans Administration Patients without a History of Neurological Disorder: Mostly Males and Mostly Inpatients
n | Age | Education | FSIQ | VIQ | PIQ | Trails A• | Trails B•
118 | 46.85 (17.17) | 11.43 (3.20) | 105.89 (13.47) | 107.03 (14.38) | 103.31 (13.02) | 48.60 (23.79) | 120.49 (78.90)
•Mean time in seconds and standard deviations.
87.4 (27.1) n=34
•Mean time in seconds and standard deviations.
n
65.6 (28.5)
Table A4.14. [TMT.13] Bornstein et al., 1987a: Data for a Sample of Healthy Volunteers for Trails A and B• over Two Testing Probes 3 Weeks Apart
n | Age | VIQ | PIQ
23 | 32.3 (10.3) | 105.8 (10.8) | 105.0 (10.5)
Measure | Trails A | Trails B
Test | 25.6 (6.8), 17-41 | 52.1 (15.1), 34-97
Retest | 21.5 (5.6), 15-35 | 47.4 (16.5), 25-73
Raw Score Change† | 3.1 (4.9) | 6.9 (11.4)
Median Raw Score Change | 3 | 6.5
Mean % of Change | 13 | 9
•Mean time in seconds, standard deviations, and ranges.
†Change from test to retest.
Table A4.15. [TMT.14] Dodrill, 1987: Data for the Whole Sample of Healthy Participants and for the Different Intelligence Levels Trails•
n 120
Age
Education
27.73 (11.04)
12.28 (2.18)
M/F Ratio
60/60
FSIQ
VIQ
PIQ
A
B
100.00 (14.35)
100.92 (14.73)
98.25 (13.39)
25.37 (9.17)
66.02 (34.17)
Trails
Trails
n
FSIQ
A
B
FSIQ
A
B
7 18 34 64 93 101 75 60 48
130 125 120 115 110 105 100
20 21 20
50
22
>89 80-89 70-79 <70
<30 30-39 40-49 >49
<76 76-103 104-180 >180
95 90
33
85 80 75 70
28 30
19 10
23 24
47 48 49 53 56
25
60
26 26
62 68 82 99 135 159
33
39
•Mean time in seconds and standard deviations.
Table A4.16. [TMT.15] Ernst, 1987: Data for Neurologically Healthy Older Australian Adults•
Gender | n | Measure | Trails A† | Trails B†
Male | 51 | Time | 40.5 (21.1) | 98.2 (52.9)
Male | 51 | Errors | 0.3 (0.6) | 0.8 (1.1)
Female | 59 | Time | 42.3 (14.4) | 108.7 (79.2)
Female | 59 | Errors | 0.3 (0.5) | 0.7 (0.9)
•Mean education is 10.3 years.
†Mean time in seconds with standard deviations and number of errors with standard deviations.
Table A4.17. [TMT.16a] Stuss et al., 1987: Data for the Initial Test, Retest 1 Week Later, and Both Testing Probes Combined for a Sample of Healthy Canadian Adults Partitioned into Six Age Groups
n | Age | M | F | R | L | Education | A• Combined | A• Test 1 | A• Test 2 | B• Combined | B• Test 1 | B• Test 2
10 | 17.3 (0.95), 16-19 | 5 | 5 | 8 | 2 | 12.3 (0.95), 11-13 | 21.8 (7.0) | 21.8 (5.3) | 22.2 (8.7) | 45.5 (17.2) | 49.0 (21.2) | 41.8 (13.1)
10 | 23.0 (2.67), 20-29 | 6 | 4 | 6 | 4 | 16.2 (1.39), 14-18 | 17.3 (5.1) | 18.5 (5.1) | 16.2 (5.0) | 37.8 (12.1) | 41.6 (11.4) | 34.0 (12.7)
10 | 33.9 (2.88), 30-39 | 5 | 5 | 7 | 3 | 16.7 (3.86), 10-20 | 20.1 (4.6) | 21.9 (6.3) | 18.4 (2.9) | 46.5 (12.6) | 46.3 (13.7) | 46.8 (11.4)
10 | 44.2 (3.12), 40-49 | 6 | 4 | 9 | 1 | 15.5 (2.88), 10-20 | 27.9 (8.2) | 29.2 (9.0) | 26.5 (7.5) | 64.3 (19.7) | 64.1 (16.3) | 64.4 (23.0)
10 | 55.3 (2.98), 50-59 | 6 | 4 | 9 | 1 | 11.7 (2.41), 8-16 | 35.6 (20.9) | 38.5 (18.2) | 32.7 (23.8) | 77.3 (42.8) | 83.1 (44.3) | 71.4 (41.3)
10 | 63.7 (3.13), 60-69 | 5 | 5 | 10 | 0 | 14.3 (2.00), 12-17 | 33.6 (11.8) | 37.3 (14.7) | 29.9 (8.9) | 70.3 (21.7) | 73.3 (20.3) | 67.3 (23.2)
M, F: gender; R, L: handedness.
•Mean time in seconds and standard deviations.
Table A4.18. [TMT.16b] Stuss et al., 1987: Data for the Whole Sample Stratified by Gender and Education
Group | n | Trails A• | Trails B•
Males | 33 | 25.80 (14.85) | 55.40 (30.97)
Females | 27 | 26.32 (10.08) | 55.10 (21.27)
≤High school | 27 | 28.56 (15.06) | 63.58 (32.33)
>High school | 33 | 23.98 (10.37) | 51.63 (20.44)
•Mean time in seconds and standard deviations.
Table A4.19. [TMT.17] Yeudall et al., 1987: Data for Healthy Canadian Adults Stratified by Age for the Entire Sample and for Males and Females Separately
n
Trails
%RightHanded
WAIS-R FSIQ
A
8
12.16 (1.75)
79.03
111.75 (10.16)
24.75 (8.19)
49.17 (15.21)
Age Group
AW!
Education
15-20
17.16
lis)
Entire....,. (n=JJS) 62
(I.
73
21-25
22.'lb (1.-1))
14.82 (1.88)
86.30
109.79 (9.97)
24.53 (7.93)
50.36 (12.96)
48
26-30
28.06 (I.$)
15.50 (2.65)
89.58
113.95 (10.61)
24.49 (7.22)
51.94 (15.75)
42
31-40
34.~
16.50 (3.11)
90.48
116.09 (9.51)
25.74 (7.53)
59.35 (17.1J)
14.55 (2.78)
85.78
(6.~)
112.25 (10.25)
24.81 (7.75)
52.05 (15.36)
(2 ...,)
225
15-40
24.~
Femala(n=98) 30
15-20
17.7r3 (UN)
12.10 (1.52)
73.33
110.32 (10.64)
25.74 (9.10)
47.69 (15.11)
36
21-25
22.83 (l.S.)
14.53 (1.99)
83.33
107.28 (9.14)
25.71 (9.16)
51.76 (12.39)
16
26-30
28.69
14.94 (2.32)
93.75
113.10 (11.37)
25.71 (7.15)
51.47 (11.52)
16.19 (2.29)
87.50
114.27 (11.32)
25.49 (6.00)
57.29 (12.38)
14.12 (2.43)
82.65
119.19 (10.46)
25.69 (8.28)
51.37 (13.34)
(1.~)
16
31-40
33.88 (2.~)
98
15-40
24.0J (5.~)
Table A4.19. (Contd.)
Males (n = 127)
n | Age Group | Age | Education | % Right-Handed | WAIS-R FSIQ | Trails A• | Trails B•
32 | 15-20 | 17.78 (2.09) | 12.22 (1.96) | 84.38 | 113.00 (9.72) | 23.83 (7.26) | 50.56 (15.41)
37 | 21-25 | 22.57 (1.96) | 15.11 (1.74) | 89.19 | 112.30 (10.27) | 23.34 (6.39) | 48.99 (13.52)
32 | 26-30 | 27.75 (1.57) | 15.78 (2.79) | 87.50 | 114.38 (10.43) | 23.88 (7.28) | 52.18 (17.65)
26 | 31-40 | 34.69 (2.41) | 16.69 (3.55) | 92.31 | 117.31 (8.21) | 25.89 (8.45) | 60.63 (19.60)
127 | 15-40 | 25.15 (6.29) | 14.87 (2.99) | 88.19 | 113.87 (9.83) | 24.12 (7.27) | 52.57 (16.79)
•Mean time in seconds and standard deviations.
Table A4.20. [TMT.18] Bornstein and Suga, 1988: Data for a Sample of Healthy Older Canadian Volunteers Stratified by Education
n | Education• | Age† | M | F | Trails A‡ | Trails B‡
46 | 5-10, 8.5 | 62.3 | 17 | 29 | 38.9 (11.5) | 102.0 (39.5)
44 | 11-12, 11.7 | 62.9 | 16 | 28 | 33.6 (10.3) | 82.5 (34.5)
44 | >12, 15.0 | 63.0 | 16 | 28 | 34.0 (10.7) | 80.9 (30.9)
•Range and mean number of years for education.
†Age range for the sample is 55-70 years.
‡Mean time in seconds and standard deviations.
Table A4.21. [TMT.19] Stuss et al., 1988: Data for a Sample of Healthy Canadian Adults Stratified into Three Age Groups, for Two Testing Sessions One Week Apart
n | Age | M | F | R | L | Education | A• Test | A• Retest | B• Test | B• Retest
30 | 22.43 (2.67), 16-29 | 16 | 14 | 22 | 8 | 14.1 (1.34), 11-18 | 21.48 (6.44) | 19.68 (7.32) | 48.77 (18.66) | 42.18 (15.54)
30 | 40.63 (2.97), 30-49 | 14 | 16 | 26 | 4 | 14.9 (3.95), 5-20 | 27.58 (9.43) | 22.95 (6.23) | 61.30 (17.88) | 61.52 (22.79)
30 | 61.77 (3.0), 50-69 | 14 | 16 | 28 | 2 | 13.2 (2.38), 8-18 | 36.73 (13.68) | 29.30 (14.73) | 76.97 (30.52) | 67.10 (28.37)
M, F: gender; R, L: handedness.
•Mean time in seconds and standard deviations.
Table A4.22. [TMT.20] Van Gorp et al., 1990: Data for Four Age Groups and for the Whole Sample of Healthy Older Adults•
n | Age | VIQ | PIQ | Trails A† | Trails B†
28 | 57-65 | 117.20 (11.33) | 109.20 (11.56) | 41.50 (7.38) | 84.40 (24.60)
45 | 66-70 | 114.80 (17.03) | 111.47 (16.83) | 43.20 (14.98) | 105.20 (43.43)
57 | 71-75 | 122.88 (11.38) | 115.08 (11.94) | 50.08 (12.88) | 97.79 (30.40)
26 | 76-85 | 110.55 (11.25) | 101.00 (8.78) | 59.73 (15.95) | 153.09 (62.60)
156 | 57-85 | 117.65 (13.53) | 110.62 (13.49) | 48.70 (14.47) | 107.55 (45.63)
•Mean education for the sample is 14.14 (2.86) years, 61% females.
†Mean time in seconds and standard deviations.
Table A4.23. [TMT.22] Selnes et al., 1991: Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study, Stratified by Age and Education
Group | n | Mean Age | Education | Trails A• Mean (SD) | A 5th Percentile | A 10th Percentile | Trails B• Mean (SD) | B 5th Percentile | B 10th Percentile
By age
25-34 | 309 | 31.0 (2.6) | 16.1 (2.2) | 19.0 (5.9) | 27 | 24 | 49.5 (17.1) | 80 | 74
35-44 | 290 | 39.3 (2.9) | 16.4 (2.3) | 20.8 (5.5) | 32 | 29 | 52.5 (18.6) | 83 | 78
45-54 | 97 | 48.5 (2.6) | 16.7 (2.6) | 23.1 (7.3) | 37 | 35 | 53.9 (20.3) | 87 | 79
By education
<College | 229 | 36.1 (7.4) | 13.7 (1.2) | 22.8 (7.1) | 31 | 30 | 51.8 (20.7) | 87 | 79
College | 202 | 35.6 (7.2) | 16.0 (0.0) | 19.2 (5.8) | 32 | 25 | 51.4 (17.1) | 83 | 75
>College | 302 | 38.4 (7.8) | 18.6 (1.3) | 20.1 (5.5) | 30 | 28 | 50.2 (15.8) | 79 | 73
•Mean time in seconds and standard deviations.
Table A4.24. [TMT.23] Elias et al., 1993: Data for a Sample of Healthy Adults Stratified by Age and Gender
Age Group | Men n | Men Education | Men Trails A• | Men Trails B• | Women n | Women Education | Women Trails A• | Women Trails B•
15-24 | 37 | 14.81 (1.54) | 21.46 (1.25) | 52.73 (4.60) | 24 | 14.50 (1.53) | 20.17 (.85) | 43.00 (2.34)
25-34 | 40 | 14.85 (1.70) | 29.92 (2.47) | 58.85 (3.50) | 56 | 14.62 (1.68) | 23.34 (1.07) | 52.75 (2.72)
35-44 | 36 | 14.56 (1.70) | 25.89 (1.46) | 62.06 (4.21) | 56 | 14.64 (2.08) | 29.57 (1.93) | 56.96 (2.36)
45-54 | 25 | 14.72 (2.26) | 26.24 (1.65) | 58.04 (3.21) | 46 | 14.89 (2.38) | 27.72 (1.30) | 56.50 (1.87)
55-64 | 25 | 14.72 (1.57) | 37.48 (2.10) | 75.36 (3.70) | 35 | 14.40 (1.90) | 34.31 (1.64) | 78.54 (4.46)
≥65 | 24 | 14.54 (2.15) | 39.58 (3.78) | 86.46 (10.22) | 23 | 14.65 (2.52) | 40.70 (3.24) | 86.57 (6.71)
•Mean time in seconds and standard deviations.
Table A4.25. [TMT.24] Cahn et al., 1995: Data for a Control Sample of Cognitively Intact Elderly
Table A4.27. [TMT.26a] Richardson and Marottoli, 1996: Demographic Characteristics of the Healthy Elderly Sample
Trails•
n
Age
Education
238
78.4 (6.8)
13.8 (2.6)
M/F ratio 97/141
A
B
47.9 (1.4)
123.5 (3.4)
•Mean time in seconds and standard deviations.
Table A4.26. [TMT.25] Ivnik et al., 1996: Demographic Description of the Healthy Sample Partitioned into Groups Used in TMT Testing Characteristic
Total Sample
Younger-Old
Older-Old
50
51
Age
81.47 (3.30)
78.80 (1.07)
84.08 (2.56)
Education
11.02 (3.68)
10.44 (3.86)
11.59 (3.45)
Mini-Mental State Exam
26.97 (2.55)
26.56 (3.03)
27.37 (1.92)
%Female
47.5%
46.0%
49.0%
%White
90.1%
82.0%
98.0%
%Black
9.9%
18.0%
2.0%
n
101
n
Age group 56-59 60--64 65-69 70--74 75-79 80--84 85-89 90--94 95+
54
81 65 57 53 27 17 5 0
Table A4.28. [TMT.26b] Richardson and Marottoli, 1996: Data for Trails B• Stratified into Two Age Groups By Two Education Groups (in Years of Schooling) Age 7&-80
81-91 Education
Education ~7
8-11 12 13--15 1&-17 ~18
2 33 135 87 67 35
~12
<12
n Trails B
24
26 197.17 (71.03)
119.17 (33.47)
<12 18 195.47 (69.70)
~12
33 137.30 (55.93)
•Mean time in seconds and standard deviations.
Gender Male Female
167 192
Race Caucasian Black
358 1
Table A4.29. [TMT.27] Hoff et al., 1996: Data for
332 17 10
n
Age
Education
A
B
54
32.1 (9.7)
15.4 (2.4)
22.7 (8.7)
55.7 (19.3)
the Control Sample
Handedness Right Left Mixed
Total
359
"Mean time in seconds and standard deviations.
Table A4.30. [TMT.28] Salthouse et al., 1997: Data for a Sample of Healthy Participants Stratified into Three Age Groups

Age Group   Mean Age     n    Education    %Male   Trails A*     Trails B*
18-39       29.0 (4.8)   40   15.5 (1.7)   42.5    21.0 (4.6)    53.6 (20.3)
40-59       49.1 (5.1)   38   15.2 (2.5)   50.0    26.2 (7.9)    66.7 (21.3)
60-78       69.2 (5.1)   37   15.3 (2.6)   48.6    32.9 (10.5)   87.2 (36.6)

*Mean time in seconds and standard deviations.
Table A4.31. [TMT.29] Rasmusson et al., 1998: Data for a Sample of Nondemented Highly Educated Elderly Participants* Partitioned into Four Age Groups

Age Group   Mean Age   n     Education (range)    Trails A†     Trails B†      % A Errors‡   % B Errors‡
60-69       64.8       203   15.7 (2.8) 8-20      32.1 (11.0)   81.2 (35.2)    9.8           34.5
70-79       74.2       262   16.0 (3.0) 8-20      40.7 (15.5)   103.3 (51.1)   13.4          39.0
80-89       83.4       179   16.4 (2.9) 7-20      48.6 (17.5)   132.1 (55.9)   8.8           48.2
90-96       91.5       23    16.6 (2.9) 12-20     52.1 (19.6)   153.0 (68.2)   5.3           57.9

*Majority of the sample are males. †Mean time in seconds and standard deviations. ‡Percent of participants who made errors on parts A and B.
Table A4.32. [TMT.30] Miner and Ferraro, 1998: Data for Two Administration Conditions (A-B, B-A) for a Sample of Undergraduate Students

TMT Order      n     Age           M/F Ratio   Trails A*      Trails B*
Total sample   110   21.7 (5.24)   22/88
A-B            55                              22.93 (6.25)   45.68 (10.74)
B-A            55                              21.04 (6.99)   49.17 (13.04)

*Mean time in seconds and standard deviations.
Table A4.33. [TMT.31] Crowe, 1998: Data for a Sample of Undergraduate Students

n    Age          Education    M/F Ratio   Trails A*    Trails B*
98   23.4 (3.1)   14.0 (2.3)   49/49       24.7 (5.9)   50.3 (11.8)

*Mean time in seconds and standard deviations.
Table A4.34. [TMT.32] Tremont et al., 1998: Data for Patients Referred for Evaluation which Yielded Negative Findings, Stratified by Three Levels of Intelligence WAIS-R FSIQ Below Average
35
n
38
84
34.03 (13.8)
Age
Above Average
Average
40.55 (16.73)
41.71 (14.65)
Education
11.53 (2.76)
12.62 (2.76)
15.63 (3.37)
FSIQ
84.89 (4.84)
99.15 (8.05)
119.92 (7.55)
VIQ
85.74 (7.29)
98.32 (9.13)
118.79 (8.86)
PIQ
86.06 (7.92)
101.12 (9.22)
117.42 (10.24)
Trails A"
40.43 (18.07)
34.84 (14.41)
28.76 (10.61)
Trails B"
124.51 (76.49)
93.20 (49.11)
68.21 (32.72)
"Mean time in seconds and standard deviations.
Table A4.35. [TMT.33] Basso et al., 1999: Data for a Sample of Healthy Men on Two Testing Probes over a 12-Month Interval

                                                    Test                           Retest
n    Age            Education      WAIS-R FSIQ      Trails A*      Trails B*       Trails A*      Trails B*
50   32.50 (9.27)   14.98 (1.93)   109.30 (12.29)   21.52 (7.54)   48.70 (17.76)   21.32 (7.36)   47.72 (19.33)

*Mean time in seconds and standard deviations.
Table A4.36. [TMT.34] Crews et al., 1999: Data for a Control Sample of Women

n    Age            Education      WAIS-R Vocabulary*   Trails A†      Trails B†
30   20.20 (3.47)   14.40 (1.33)   13.50 (2.08)         21.13 (5.78)   47.43 (13.33)

*Wechsler Adult Intelligence Scale-Revised Vocabulary scaled score. †Mean time in seconds and standard deviations.
Table A4.37. [TMT.35] Dikmen et al., 1999: Test-Retest Data for Normal and Neurologically Stable Adults*

n     Age           Education    M/F Ratio   WAIS FSIQ†     Test-retest interval
384   34.2 (16.7)   12.1 (2.6)   66/34       108.8 (12.3)   9.1 (3.0)

Trails A‡                       Trails B‡
Time 1          Time 2          Time 1          Time 2
26.52 (11.66)   25.56 (11.66)   72.05 (45.22)   68.19 (46.13)

*A number of participants had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. †Wechsler Adult Intelligence Scale full-scale intelligence quotient (Wechsler, 1955). ‡Mean time in seconds and standard deviations.
Table A4.38. [TMT.36] Binder et al., 1999: Data for a Normal Elderly Sample

n     Age          Education    % Male   % Caucasian   GDS* score   Blessed† score
125   82.3 (4.4)   13.5 (3.0)   25       87            1.8 (1.8)    2.1 (2.1)

Trails A                     Trails B
Time‡         Lines§         Time‡          Lines§
53.5 (25.3)   24.0 (0.3)     124.1 (39.9)   22.6 (3.6)

*Geriatric Depression Scale. †Short Blessed Orientation-Memory-Concentration Test. ‡Mean time in seconds and standard deviations. §Number of lines correctly drawn within the time limit of 180 seconds for both parts A and B.
Table A4.39. [TMT.37] Ruffolo et al., 2000: Data for the Control Sample

                                 Trails A                     Trails B
n    Age           Education     Errors*       Time†          Errors*       Time†
49   29.1 (12.1)   14.3 (1.9)    0.14 (0.41)   26.6 (7.9)     0.47 (0.77)   57.2 (17.2)

*Mean number of performance errors and standard deviations. †Mean time in seconds and standard deviations.
Table A4.40. [TMT.38] Saxton et al., ~: Data for a Sample of Elderly Free of Cardiovascular Disease · Trails• n
357
Age
Education
73.63
13.23 (2.85)
(4.45)
%~e
44~
A
8
43.61 (1.00)
114.53 (3.10)
•Mean time in seconds and standard deviations.
Table A4.41. [TMT.39] Chen et al., 2000: Data for the Control Sample of Nondemented Elderly*

n     Age          %Male   Trails A†       Trails B†
483   74.9 (4.4)   37.5    48.02 (17.09)   130.49 (62.21)

*Participants with lower than high school education, 31.9%. †Mean time in seconds and standard deviations.
Table A4.42. [TMT.40] Small et al., 2000: Data for a Sample of Normal Elderly Stratified by Two Age Groups and the Presence of the APOE-ε4 Allele

Group              n     Age            %Male   Education      Trails A*       Trails B*
APOE-ε4-negative
  Young-old        156   67.56 (3.63)   50.6    14.58 (2.84)   38.34 (13.24)   89.47 (34.27)
  Old-old          166   78.27 (3.01)   51.8    13.92 (2.78)   46.33 (18.38)   128.34 (66.18)
APOE-ε4-positive
  Young-old        46    66.91 (3.53)   37.0    13.78 (2.53)   38.04 (13.71)   109.58 (51.25)
  Old-old          45    77.76 (2.96)   57.8    13.76 (2.60)   43.91 (15.69)   132.55 (77.56)

*Mean time in seconds and standard deviations.
Table A4.43. [TMT.41] Stuss et al., 2001: Data for the Control Sample

n    Age           Education    M/F Ratio   NART Score
19   53.4 (13.6)   13.7 (2.5)   8/11        113.2 (5.6)

Trails A*     Trails B*     Difference B-A   Proportion (B-A)/A
30.8 (17.0)   64.2 (26.2)   33.4 (17.6)      1.3 (.8)

*Mean time in seconds and standard deviations. NART, National Adult Reading Test.
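The two derived indices reported for this sample, the difference B-A and the proportion (B-A)/A, can be computed for an individual examinee as in the minimal sketch below. The function name and return structure are illustrative assumptions, not part of the source. Note that a mean of per-person proportions (as is presumably tabled here) need not equal the proportion computed from the group means.

```python
def tmt_derived_scores(time_a: float, time_b: float) -> dict:
    """Return the B-A difference and the (B-A)/A proportion for one examinee.

    time_a, time_b: completion times in seconds for Trails A and Trails B.
    """
    if time_a <= 0:
        raise ValueError("Trails A time must be positive")
    difference = time_b - time_a
    return {"difference": difference, "proportion": difference / time_a}


# Illustration only: applying the formulas to the group means in the table.
scores = tmt_derived_scores(30.8, 64.2)
print(scores)
```

Applied to the group means, the difference reproduces the tabled 33.4, while the proportion comes out near 1.08 rather than 1.3, which illustrates why individual-level computation matters.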
Table A4.44. [TMT.42] Bell et al., 2001: Data for the Control Sample

n    Age           Education    % Male   FSIQ*        Trails A†    Trails B†
29   34.4 (12.5)   13.0 (1.7)   28       97.7 (6.4)   24.3 (6.3)   57.9 (18.7)

*Wechsler Adult Intelligence Scale-III full-scale intelligence quotient based on seven-subtest short form. †Mean time in seconds and standard deviations.

Table A4.45. [TMT.43] Stein et al., 2001: Data for the Control Sample of Women

n    Age           Education    Trails A*    Trails B*     Difference B-A
22   29.4 (10.7)   13.9 (1.5)   23.9 (8.6)   55.0 (18.3)   31.1 (14.9)

*Mean time in seconds and standard deviations.
Table A4.46. [TMT.44] Drane et al., 2002: Data for a Sample* of Healthy Adults Partitioned into Eight Age Groups

Age Group   n    Trails A†       Trails B†        Difference B-A   Ratio B:A
18-20       18   23.22 (6.56)    52.94 (20.10)    29.72 (16.21)    2.31 (0.58)
20-29       39   26.12 (9.78)    60.92 (33.17)    35.31 (27.72)    2.36 (0.78)
30-39       53   28.02 (8.78)    72.30 (28.55)    44.13 (26.72)    2.72 (1.21)
40-49       46   31.00 (11.21)   81.26 (23.69)    50.04 (20.28)    2.80 (0.93)
50-59       38   36.29 (16.41)   103.42 (50.26)   67.24 (39.35)    2.94 (0.88)
60-69       36   39.60 (12.14)   105.23 (41.15)   65.60 (33.84)    2.70 (0.77)
70-79       36   45.58 (18.91)   152.59 (88.42)   109.14 (73.87)   3.49 (1.76)
80-90       19   56.37 (20.20)   170.21 (84.68)   113.84 (70.73)   3.05 (1.05)
Table A4.48. [TMT.46] Miller, 2003 (Update on Selnes et al., 1991): Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the MACS Study, Stratified by Age x Education Trails• Age 25-34
Education <16
A
B
Mean (SD)
28.41 (12.86) 46 23.50 (7.37) 36 23.03 (7.38)
n
35
58.76 (18.39) 123 50.80 (18.70) 96 47.08 (17.01) 110 52.53 (18.67) 329
Mean (SD)
n 16
Mean (SD)
n >16
Total
Mean (SD)
n 35-44
<16
Mean (SD)
n 16
Mean (SD)
n
•Mean education for the sample= 12.98 (2.65) years.
>16
tMean time in seconds and standard deviations.
Mean (SD)
n Total
Mean (SD)
n
Table A4.47. [TMT.45] Grady et al., 2002: Data for Trails B for a Sample• of Women with Established Coronary Disease, Stratified into Estrogen/Progestin Replacement Treatment and Placebo Groups %
<16
Total
Education
White
Treatment
517
66.3 (6.4)
12.7 (2.7)
90.9
156.2 (77.5)
Placebo
546
67.3 (6.3)
12.7 (2.7)
90.5
151.5 (77.5)
Mean (SD)
n Mean (SD)
n Total
<16
• All participants were younger than 80 years of age. tMean time in seconds and standard deviations.
Mean (SD)
n >16
Age
Mean (SD)
n 16
Trails st
n
Group
45-59
Mean (SD)
n 16
Mean (SD)
n >16
Mean (SD) .
n Total
Mean (SD)
n
25.29 (10.14) 117 28.52 (7.86) 63 27.80 (8.65) 59 24.60 (6.68) 80 26.76 (7.83) 202
422
31.25 (10.51) 40 33.11 (15.04) 27 29.84 (9.15) 62 30.96 (11.00) 129
70.65 (23.60) 60 64.65 (23.53) 40 60.05 (17.90) 96 64.23 (21.37) 196
29.22 (10.33) 149 27.70 (10.58) 122 26.12 (8.21) 177 27.58 (9.69) 448
63.82 (23.52)
•Mean time in seconds and standard deviations.
65.52 (26.87) 124 59.64 (22.31) 121 53.53 (16.65) 177 58.80 (22.22)
307 57.12 (21.77) 257 53.31 (17.68)
383 57.75 (21.29) 947
• Table A4.49. [TMT.47] Tombaugh, 2004: Data for a Sample of Healthy Canadian Adults Stratified by Age and Education• Age Group 18-24
25-34
35-44
45-54
55-59
61)..$
60-64
70-74
80-84
75-79
85-89
Education Group
n
0-12
12+
0-12
12+
0-12
12+
0-12
12+
0-12
12+
0-12
12+
0-12
12+
76
30
74
84
16
13
81.94 (1.41)
34 81.56 (1.52)
86.38 (1.50)
86.31 (1.65)
Age
20.17 (1.48)
29.42 (2.87)
39.74 (2.94)
48.54 (2.96)
56.90 (1.31)
57.05 (1.45)
62.33 (1.28)
61.94 (1.50)
67.04 67.22 (1.63) (1.43)
71.99 (1.4)
72.07 (1.60)
77.32 (1.35)
34 77.21 (1.49)
Education
12.92 (1.01)
14.18 (1.61)
13.59 (2.06)
13.68 (2.80)
11.05 (1.05)
15.32 (1.93)
10.84 (1.27)
15.45 (1.31)
10.87 15.91 (1.71) (1.87)
10.50 (1.72)
15.43 (2.21)
10.80 (1.50)
15.29 (1.80)
10.48 (1.54)
15.50 (2.54)
9.88 (1.96)
16.23 (2.45)
Trails At
22.93 (6.87)
24.40 28.54 (8.71) (10.09)
31.78 35.10 31.72 (9.93) (10.94) (10.14)
33.22 (9.10)
31.32 39.14 33.84 (6.96) (11.84) (6.69)
42.47 (15.15)
40.13 (14.48)
50.81 (17.44)
41.74 (15.32)
58.19 (23.31)
55.32 (21.28)
57.56 (21.54)
63.46 (29.22)
Trails st
48.97 50.68 58.46 63.76 78.84 68.74 74.55 64.58 91.32 67.12 (12.69) (12.36) (16.41) (14.42) (19.09) (21.02) (19.55) (18.59) (28.89) (9.31)
109.95 (35.15)
86.27 (24.07)
130.61 (45.74)
100.68 152.74 (44.16) (65.68)
155
33
39
41
58
37
55
31
65
32
132.15 167.69 140.54 (42.95) (78.50) (75.38)
•Male/female ratio for the sample is 408/503. tMean time in seconds and standard deviations.
Appendix 4m: Meta-Analysis Tables for the Trailmaking Test (TMT)
Table A4m.1. Results of the Meta-Analysis and Predicted Scores for the TMT, Trails A (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis     28
Years of publication                           1980-2004
Number of data points used in the analysis     89
  (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants                   6,317

Variable              n*    x†       SD†     Range
Sample size
  Mean                89    49.33    59.29   10-483
Age
  Mean                89    53.94    21.92   16-91.5
  SD                  89    3.52     3.02    0.5-16.7
Education
  Mean                69    13.87    1.49    8.5-16.7
  SD                  64    2.38     0.65    0.3-3.9
IQ
  Mean                21    116.69   7.48    97.7-128.5
  SD                  16    10.64    3.25    5.6-17.0
Percent male          70    46.16    29.98   0-100
Test score means
  Combined mean       89    35.79    11.53   19.0-60.5
  Combined SD         89    12.21    5.56    0.9-25.3

*Number of data points differs for different analyses due to missing data. †Weighted means and standard deviations.
APPENDIX 4M
Table A4m.1. (Contd.)
Predicted number of seconds to completion and SDs per age group* (TMT-A)

            Predicted   95% CI                     Predicted   95% CI
Age Range   Score       Lower Band   Upper Band    SD          Lower Band   Upper Band
16-19       23.97       22.33        25.62         7.63        6.59         8.67
20-24       24.05       23.07        25.02         7.63        7.09         8.18
25-29       24.46       23.74        25.18         7.78        7.15         8.39
30-34       25.23       24.24        26.21         8.05        7.03         9.07
35-39       26.34       25.04        27.65         8.48        7.10         9.86
40-44       27.81       26.28        29.33         9.04        7.40         10.68
45-49       29.62       27.99        31.24         9.75        7.95         11.54
50-54       31.78       30.17        33.40         10.59       8.74         12.44
55-59       34.30       32.74        35.86         11.58       9.76         13.40
60-64       37.16       35.59        38.74         12.71       10.98        14.45
65-69       40.38       38.56        42.19         13.98       12.34        15.63
70-74       43.94       41.57        46.31         15.40       13.74        17.05
75-79       47.85       44.63        51.07         16.95       15.08        18.82
80-84       52.11       47.78        56.44         18.65       16.30        20.99
85-89       56.73       51.07        62.38         20.49       17.43        23.54

*Based on the equations:
Predicted test score = 26.50094 - 0.2665049 × age + 0.0069935 × age²
Predicted SD = 8.760348 - 0.1138093 × age + 0.0028324 × age²
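The two quadratic equations above can be evaluated directly, as in the sketch below. The coefficients are taken from the equations; the function names are illustrative. The band-midpoint convention (for example, 22.5 for the 20-24 band) is an assumption inferred from the fact that it reproduces the tabled values; the source does not state which age within each band was used.

```python
# Coefficients (constant, age, age squared) from the TMT-A equations above.
TMT_A_SCORE = (26.50094, -0.2665049, 0.0069935)
TMT_A_SD = (8.760348, -0.1138093, 0.0028324)


def quadratic(coefs, age):
    """Evaluate b0 + b1*age + b2*age^2."""
    b0, b1, b2 = coefs
    return b0 + b1 * age + b2 * age ** 2


def band_midpoint(lower, upper):
    """Assumed convention: a band such as 20-24 spans ages 20 up to 25."""
    return (lower + upper + 1) / 2


age = band_midpoint(20, 24)
predicted = quadratic(TMT_A_SCORE, age)
predicted_sd = quadratic(TMT_A_SD, age)
print(f"age {age}: predicted {predicted:.2f} s, SD {predicted_sd:.2f}")
```

Evaluating at an examinee's exact age instead of the band midpoint is a possible alternative, but the tabled values and confidence bands are given per age band only.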
Correction for education

Years of Education   12      13      14   15      16      17
Correction Factor    +2.62   +1.31   0    -1.31   -2.62   -3.93

With every year of education above or below 14, we suggest correcting the obtained score by adding or subtracting 1.31 to or from the predicted score given in the table for the relevant age group. SD for the person's actual age group should be used with the education-corrected scores. Extrapolation of this correction outside the boundaries of 12-17 years of education should be made with caution as empirical data are not available beyond these educational ranges.
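As a worked illustration of the correction, the sketch below adjusts the predicted score for education and then expresses an obtained time as a z-score using the SD for the age group. The function and parameter names are hypothetical, and the z-score step is a conventional use of a predicted mean and SD rather than a procedure spelled out in the source. Because the score is a completion time, a positive z means slower than expected.

```python
def education_corrected_z(observed, predicted, predicted_sd,
                          education_years, per_year=1.31, reference_years=14):
    """Return (education-corrected expected score, z-score).

    The correction lowers the expected time by `per_year` seconds for each
    year of education above `reference_years` and raises it below.
    """
    expected = predicted - per_year * (education_years - reference_years)
    z = (observed - expected) / predicted_sd
    return expected, z


# Example: age band 60-64 (predicted 37.16, SD 12.71), 12 years of education,
# obtained Trails A time of 50 seconds.
expected, z = education_corrected_z(50.0, 37.16, 12.71, 12)
print(f"expected {expected:.2f} s, z = {z:.2f}")
```

For education outside 12-17 years the same arithmetic extrapolates the correction, which the text above advises treating with caution.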
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations   89
Number of clusters       28
R²                       0.905
F                        F(2,27) = 146.47, p < 0.000

Term       Coefficient   SE      t        p        95% CI
Age        -0.2665049    0.166   -1.60*   0.121*   -0.608 to 0.075
Age²       0.0069935     0.002   3.85     0.001    0.003 to 0.011
Constant   26.50094      3.179   8.34     0.000    19.98 to 33.02

*Significance test for age centered (sample means - aggregate mean): t = 12.80, p = 0.000.

Prediction
Predicted age range    16-90 years
Mean predicted score   35.05 (10.~)
SEe                    1.05
95% CI                 32.99-37.11
Figure A4m.1. A scatterplot illustrating the dispersion of the data points around the regression line for TMT-A (horizontal axis: age). The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final dataset
Pooled estimates for fixed effect                   37.898
Pooled estimates for random effect                  32.166
Q(df), p                                            Q(88) = 721.02, p < 0.000
Moment-based estimate of between-study variance     130.721

Tests for model fit: addition of a quadratic term
Model       R²      Adjusted R²   BIC       BIC'
Linear      0.838   0.836         134.376   -157.335
Quadratic   0.905   0.903         91.403    -200.307

BIC' difference of 42.972 provides very strong support for the quadratic model.

Tests for parameter specifications
Normality of the residuals   Shapiro-Wilk W test    W = 0.990, p = 0.706
Homoscedasticity             White's general test   22.246, p < 0.000
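The BIC' comparison can be checked with a few lines of arithmetic. The sketch below subtracts the quadratic model's BIC' from the linear model's and labels the difference using the commonly cited Raftery (1995) grades (0-2 weak, 2-6 positive, 6-10 strong, above 10 very strong). Attributing the "very strong" wording to those grades is an assumption; the source does not name the scale it used.

```python
def bic_prime_support(bic_linear, bic_quadratic):
    """Return (difference, evidence label) favouring the quadratic model.

    A lower (more negative) BIC' indicates the better-supported model, so a
    positive difference favours the quadratic model.
    """
    diff = bic_linear - bic_quadratic
    if diff > 10:
        label = "very strong"
    elif diff > 6:
        label = "strong"
    elif diff > 2:
        label = "positive"
    elif diff > 0:
        label = "weak"
    else:
        label = "none"
    return diff, label


# BIC' values for the linear and quadratic TMT-A models reported above.
diff, label = bic_prime_support(-157.335, -200.307)
print(f"BIC' difference = {diff:.3f} ({label})")
```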
Significance tests for regression with SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations   89
Number of clusters       28
R²                       0.602

Term       Coefficient   SE      t        p        95% CI
Age        -0.1138093    0.129   -0.89*   0.384*   -0.378 to 0.150
Age²       0.0028324     0.001   2.18     0.038    0.000 to 0.006
Constant   8.760348      2.385   3.67     0.001    3.87 to 13.65

*Significance test for age centered (sample means - aggregate mean): t = 9.75, p < 0.000.

Prediction
Mean predicted SD   11.91 (4.28)
SEe                 0.82
95% CI              10.31-13.51

Effects of demographic variables

Education
Est. tau² without education   83.81
Est. tau² with education      79.48
Regression of test means on education and age
Number of observations        68
Number of clusters            25
R²                            0.917

Term        Coefficient   SE      t       p       95% CI
Education   -1.308        0.435   -3.00   0.006   -2.21 to -0.41

IQ
Regression of test means on IQ and age
Number of observations        21
Number of clusters            7
R²                            0.946

Term   Coefficient   SE      t      p       95% CI
IQ     0.0678098     0.045   1.51   0.181   -0.042 to 0.177

Gender
t-test by gender
n            X male   X female   M-F difference   t        p
17M, 15F     26.449   27.153     -0.704           -0.343   0.367
Table A4m.2. Results of the Meta-Analysis and Predicted Scores for the TMT, Trails B (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis     29
Years of publication                           1980-2004
Number of data points used in the analysis     89
  (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants                   6,360

Variable              n*    x†       SD†     Range
Sample size
  Mean                89    50.69    65.44   10-483
Age
  Mean                89    56.69    22.03   16-91.5
  SD                  89    3.56     3.06    0.5-16.7
Education
  Mean                69    13.82    1.56    8.5-16.7
  SD                  64    2.38     0.65    0.3-3.9
IQ
  Mean                21    116.88   7.80    97.7-128.5
  SD                  16    11.25    3.26    5.6-17.0
Percent male          70    48.08    29.47   0-100
Test score means
  Combined mean       89    95.08    38.93   43.0-170.2
  Combined SD         89    38.97    23.27   1.9-88.4

*Number of data points differs for different analyses due to missing data. †Weighted means and standard deviations.
Predicted number of seconds to completion and SDs per age group* (TMT-B)

            Predicted   95% CI                     Predicted   95% CI
Age Range   Score       Lower Band   Upper Band    SD          Lower Band   Upper Band
16-19       53.92       49.21        58.63         20.12       14.76        25.48
20-24       53.77       50.54        56.99         19.19       14.93        23.45
25-29       54.72       51.69        57.74         18.87       15.09        22.65
30-34       56.84       52.89        60.80         19.29       15.30        23.29
35-39       60.15       55.05        65.25         20.46       15.88        25.03
40-44       64.63       58.50        70.77         22.37       17.11        27.62
45-49       70.29       63.29        77.30         25.02       19.10        30.94
50-54       77.13       69.40        84.86         28.42       21.86        34.97
55-59       85.15       76.76        93.54         32.55       25.37        39.74
60-64       94.34       85.24        103.45        37.44       29.55        45.33
65-69       104.71      94.71        114.72        43.07       34.33        51.80
70-74       116.26      105.03       127.50        49.44       39.63        59.24
75-79       128.99      116.11       141.88        56.55       45.38        67.72
80-84       142.90      127.86       157.94        64.41       51.54        77.28
85-89       157.98      140.26       175.71        73.01       58.07        87.95

*Based on the equations:
Predicted test score = 64.07469 - 0.9881013 × age + 0.0235581 × age²
Predicted SD = 29.8444 - 0.8080508 × age + 0.0148732 × age²
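To show how the tabled TMT-B predictions relate to the equations, the sketch below regenerates the predicted score and predicted SD for every age band. Coefficients come from the equations above. Evaluating each band at its midpoint (18 for 16-19, 22.5 for 20-24, and so on) is an assumption that happens to reproduce the tabled values; the confidence bands are not reproduced because they depend on the coefficient covariance, which is not given.

```python
# Coefficients (constant, age, age squared) from the TMT-B equations above.
TMT_B_SCORE = (64.07469, -0.9881013, 0.0235581)
TMT_B_SD = (29.8444, -0.8080508, 0.0148732)

AGE_BANDS = [(16, 19)] + [(lo, lo + 4) for lo in range(20, 90, 5)]


def predict(coefs, age):
    """Evaluate b0 + b1*age + b2*age^2."""
    b0, b1, b2 = coefs
    return b0 + b1 * age + b2 * age ** 2


rows = []
for lo, hi in AGE_BANDS:
    mid = (lo + hi + 1) / 2  # assumed midpoint convention
    rows.append((f"{lo}-{hi}",
                 round(predict(TMT_B_SCORE, mid), 2),
                 round(predict(TMT_B_SD, mid), 2)))

for band, score, sd in rows:
    print(f"{band:>5}  {score:7.2f}  {sd:6.2f}")
```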
Correction for education

Years of Education   12       13      14   15      16       17
Correction Factor    +12.90   +6.45   0    -6.45   -12.90   -19.35

With every year of education above or below 14, we suggest correcting the obtained score by adding or subtracting 6.45 to or from the predicted score given in the table for the relevant age group. Standard deviation for the person's actual age group should be used with the education-corrected scores. Extrapolation of this correction outside the boundaries of 12-17 years of education should be made with caution as empirical data are not available beyond these educational ranges.
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations   89
Number of clusters       29
R²                       0.876
F                        F(2,28) = 74.01, p < 0.000

Term       Coefficient   SE      t        p        95% CI
Age        -0.9881013    0.428   -2.31*   0.029*   -1.865 to -0.111
Age²       0.0235581     0.004   5.35     0.000    0.014 to 0.032
Constant   64.07469      8.368   7.66     0.000    46.93 to 81.22

*Significance test for age centered (sample means - aggregate mean): t = 11.92, p = 0.000.

Prediction
Predicted age range    16-90 years
Mean predicted score   88.12 (34.13)
SEe                    4.26
95% CI                 79.77-96.47
Figure A4m.2. A scatterplot illustrating the dispersion of the data points around the regression line for TMT-B (horizontal axis: age). The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final dataset
Pooled estimates for fixed effect                   87.131
Pooled estimates for random effect                  75.955
Q(df), p                                            Q(88) = 114431.30, p < 0.000
Moment-based estimate of between-study variance     1154.290

Tests for model fit: addition of a quadratic term
Model       R²      Adjusted R²   BIC       BIC'
Linear      0.810   0.808         365.177   -143.195
Quadratic   0.876   0.873         331.763   -176.610

BIC' difference of 33.414 provides very strong support for the quadratic model.

Tests for parameter specifications
Normality of the residuals   Shapiro-Wilk W test    W = 0.979, p = 0.166
Homoscedasticity             White's general test   23.806, p < 0.000
Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations   89
Number of clusters       29
R²                       0.676
F                        F(2,28) = 28.44, p < 0.000

Term       Coefficient   SE      t        p        95% CI
Age        -0.8080508    0.332   -2.43*   0.022*   -1.488 to -0.128
Age²       0.0148732     0.003   4.43     0.000    0.008 to 0.022
Constant   29.8444       7.159   4.17     0.000    15.18 to 44.51

*Significance test for age centered (sample means - aggregate mean): t = 7.53, p < 0.000.

Prediction
Mean predicted SD   35.35 (18.03)
SEe                 3.82
95% CI              27.86-42.83
Effects of demographic variables

Education
Est. tau² without education   837.1
Est. tau² with education      813.2
Regression of test means on education and age
Number of observations        68
Number of clusters            26
R²                            0.907

Term        Coefficient   SE      t       p       95% CI
Education   -6.446        2.515   -2.56   0.017   -11.63 to -1.27
IQ Regression of test means on IQ and age # of observations Number of clusters ~
21 8 Q~ (continued)
APPENDIX 4M
Term   Coefficient   SE      t      p       95% CI
IQ     0.4133416     0.286   1.44   0.192   -0.264 to 1.091

Gender
t-test by gender
n            X male   X female   M-F difference   t       p
17M, 15F     59.809   59.431     0.379            0.084   0.467

Table A4m.3. Summary Table of Predicted Values for the TMT

            Part A            Part B
Age Range   Time     SD       Time     SD
16-19       23.97    7.63     53.92    20.12
20-24       24.05    7.63     53.77    19.19
25-29       24.46    7.78     54.72    18.87
30-34       25.23    8.05     56.84    19.29
35-39       26.34    8.48     60.15    20.46
40-44       27.81    9.04     64.63    22.37
45-49       29.62    9.75     70.29    25.02
50-54       31.78    10.59    77.13    28.42
55-59       34.30    11.58    85.15    32.55
60-64       37.16    12.71    94.34    37.44
65-69       40.38    13.98    104.71   43.07
70-74       43.94    15.40    116.26   49.44
75-79       47.85    16.95    128.99   56.55
80-84       52.11    18.65    142.90   64.41
85-89       56.73    20.49    157.98   73.01

Correction for education*

Years of Education   Part A   Part B
12                   +2.62    +12.90
13                   +1.31    +6.45
14                   0        0
15                   -1.31    -6.45
16                   -2.62    -12.90
17                   -3.93    -19.35

*To be added to or subtracted from the predicted score for the respective age group.
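For quick reference use, the summary values can be held in a small lookup structure, as in the sketch below. It returns the education-corrected expected time and the SD for a given part, age, and years of education. The names (`BAND_STARTS`, `expected_time`) are illustrative assumptions. The 35-39 Part A value is entered as 26.34, the value given by the regression equation and lying inside the 95% CI reported in Table A4m.1.

```python
from bisect import bisect_right

BAND_STARTS = [16, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# (predicted time in seconds, predicted SD) per age band, from Table A4m.3.
PART_A = [(23.97, 7.63), (24.05, 7.63), (24.46, 7.78), (25.23, 8.05),
          (26.34, 8.48), (27.81, 9.04), (29.62, 9.75), (31.78, 10.59),
          (34.30, 11.58), (37.16, 12.71), (40.38, 13.98), (43.94, 15.40),
          (47.85, 16.95), (52.11, 18.65), (56.73, 20.49)]
PART_B = [(53.92, 20.12), (53.77, 19.19), (54.72, 18.87), (56.84, 19.29),
          (60.15, 20.46), (64.63, 22.37), (70.29, 25.02), (77.13, 28.42),
          (85.15, 32.55), (94.34, 37.44), (104.71, 43.07), (116.26, 49.44),
          (128.99, 56.55), (142.90, 64.41), (157.98, 73.01)]

TABLES = {"A": PART_A, "B": PART_B}
PER_YEAR = {"A": 1.31, "B": 6.45}  # seconds per year of education


def expected_time(part, age, education_years):
    """Return (education-corrected expected time, SD) for Part 'A' or 'B'."""
    if not 16 <= age < 90:
        raise ValueError("age is outside the tabled range of 16-89")
    time, sd = TABLES[part][bisect_right(BAND_STARTS, age) - 1]
    # Reference level is 14 years; more education lowers the expected time.
    return time - PER_YEAR[part] * (education_years - 14), sd


time_b, sd_b = expected_time("B", 72, 16)
print(f"expected Part B time {time_b:.2f} s, SD {sd_b:.2f}")
```

The function does not reject education values outside 12-17 years; a caller may wish to flag those cases, since the correction is an extrapolation there.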
Appendix 5: Locator and Data Tables for the Color Trails Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 5.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A5.1. Locator Table for the Color Trails Test (CTT) Study
Age•
CIT.l D'Elia et al., 1994 page 103
30-44
Data are not reproduced in this book
n
18-29 45-59 60-74 75--89
CIT.2 Ponton et al., 1996 page 104 Table A5.2, A5.3
16-29 30-39 40-49 50-75
CIT.3 Hsieh & Riley,
30-39 40-49 50-59 60-69 70-83
43 39 33 32
64-74 75-97
240 106
page 105 Table A5.4-A5.6
CIT.4 LaRue et al., 1999 page 105 Table A5.7-A5.10
42 66
27
Education•
Location
Data are stratified at each age category by 6 education categories: < 8 years, 9-11 years, 12 years, 13-15 years, 16 years, ~17 years
USA
Data are stratified at each age category by< 10 and ~10 years of education
Southern California
Urban-dwelling, Mandarinspeaking adults; 93 males and 84 females
Data collected on individuals with 1-17 years of education; however, data reported by age categories only
Mainland China
Community-dwelling, non-Hispanic and Hispanic men and women who are bilingual Spanish/ English. Data reflect number of circles with digits that were correctly completed in 60 seconds.
Data are stratified by age and educational level. Education categories: 0-6 years, 7-9 years, 10-12 years, and > 12 years for Hispanics; 0-12 and > 12 for non-Hispanics
New Mexico
Medically & psychiatrically healthy adults residing in a variety of settings
1,528
1997
Sample Composition
Sample is 88% male
180 female, 120 male medically & psychiatrically healthy Hispanics
45
30
•Age column and education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors.
APPENDIX 5
Table A5.2. [CTT.2a] Ponton et al., 1996: Data for a Sample of Spanish-Speaking Adult Females Stratified by Two Age Groups and Two Education Groups (Means and Standard Deviations)

Age Group   Years of Education   n    Color Trails 1   Color Trails 2   Raven's Total
16-29       <10                  12   49.33 (30.75)    128.58 (45.19)   38.33 (13.24)
16-29       ≥10                  30   34.43 (12.69)    80.03 (23.43)    42.73 (10.28)
30-39       <10                  22   49.62 (13.76)    114.05 (49.80)   34.77 (13.82)
30-39       ≥10                  44   34.96 (11.91)    84.48 (27.40)    42.48 (9.70)
40-49       <10                  16   56.38 (14.16)    134.06 (54.15)   29.19 (12.49)
40-49       ≥10                  11   34.09 (6.44)     79.46 (28.42)    41.36 (9.52)
50-75       <10                  25   70.28 (44.10)    146.72 (62.21)   23.36 (7.63)
50-75       ≥10                  20   44.20 (15.73)    99.20 (25.75)    42.30 (10.38)
Table A5.3. [CTT.2b] Ponton et al., 1996: Data for a Sample of Spanish-Speaking Adult Males Stratified by Two Age Groups and Two Education Groups (Means and Standard Deviations)

Age Group   Years of Education   n    Color Trails 1   Color Trails 2   Raven's Total
16-29       <10                  11   49.27 (15.70)    100.64 (20.38)   35.09 (10.00)
16-29       ≥10                  11   37.28 (13.20)    88.36 (26.39)    42.68 (11.86)
30-39       <10                  13   49.23 (10.92)    116.31 (39.35)   38.08 (10.44)
30-39       ≥10                  18   39.28 (10.44)    84.33 (17.11)    46.28 (7.76)
40-49       <10                  12   59.08 (15.22)    129.17 (39.64)   28.58 (13.84)
40-49       ≥10                  17   36.88 (8.12)     91.24 (16.12)    45.94 (6.62)
50-75       <10                  18   53.06 (20.00)    136.22 (34.71)   31.78 (8.63)
50-75       ≥10                  6    55.17 (21.86)    113.83 (39.78)   42.50 (10.17)
Table A5.4. [CTT.3a] Hsieh and Riley, 1997: Data for a Sample of Mandarin-Speaking Adults in China: Sample Size for Each Age/Education Category

        Years of Education
Age     1-6   7-10   11-17
30-39   6     15     22
40-49   6     19     14
50-59   6     15     12
60-69   22    6      4
70-83   28    2      0
Total   68    57     52
Table A5.5. [CTT.3b] Hsieh and Riley, 1997: Data for a Sample of Mandarin-Speaking Adults in China: Effect of Age on Test Performance

Age Group      n    Color Trails 1   Color Trails 2    Interference index
30-39          43   42.05 (15.69)    89.95 (37.47)     1.24 (0.80)
40-49          39   50.97 (20.66)    104.74 (35.65)    1.28 (0.97)
50-59          33   56.76 (27.63)    138.58 (67.75)    1.59 (1.11)
60-69          32   129.66 (80.99)   225.31 (103.06)   0.98 (0.73)
70-83          30   162.97 (98.55)   306.47 (188.38)   1.01 (0.75)
Table A5.6. [CTT.3c] Hsieh and Riley, 1997: Data for a Sample of Mandarin-Speaking Adults in China: Effect of Education on Test Performance

Years of Education   n    Color Trails 1   Color Trails 2    Interference index
0-6                  96   132.9 (92.45)    243.10 (154.30)   1.12 (0.73)
7-9                  59   55.6 (30.42)     128.0 (77.17)     1.28 (1.08)
10-17                22   48.2 (25.5)      97.9 (39.90)      1.56 (1.11)
Table A5.7. [CTT.4a] LaRue et al., 1999: Data for a Sample of Bilingual Hispanics in New Mexico, aged 65-74 years: Effects of Age and Education on Test Performance*

                 Years of Education
                 0-6            7-9            10-12          >12
Color Trails 1   18.44 (5.26)   20.73 (4.42)   21.67 (4.14)   22.77 (4.50)
Sample size      n=39           n=56           n=92           n=53
Color Trails 2   10.19 (3.84)   11.93 (3.37)   12.50 (4.00)   14.09 (4.72)
Sample size      n=37           n=55           n=92           n=53

*Performance was measured as the number of digits correctly traced in 60 seconds.
Table A5.8. [CTT.4b] LaRue et al., 1999: Data for a Sample of Bilingual Hispanics in New Mexico, aged 75-97 years: Effects of Age and Education on Test Performance*

                 Years of Education
                 0-6            7-9            10-12          >12
Color Trails 1   14.41 (5.77)   17.15 (5.50)   19.08 (4.48)   20.73 (4.05)
Sample size      n=37           n=34           n=24           n=11
Color Trails 2   7.84 (3.25)    9.33 (3.25)    9.71 (4.12)    11.73 (4.82)
Sample size      n=32           n=33           n=24           n=11

*Performance was measured as the number of digits correctly traced in 60 seconds.
Table A5.9. [CTT.4c] LaRue et al., 1999: Data for a Sample of Non-Hispanic Caucasians in New Mexico, aged 65-74 years: Effects of Age and Education on Test Performance*

                 Years of Education
                 0-12           >12
Color Trails 1   23.46 (3.14)   23.49 (2.91)
Sample size      n=84           n=181
Color Trails 2   15.46 (4.38)   15.77 (4.81)
Sample size      n=84           n=181

*Performance was measured as the number of digits correctly traced in 60 seconds.
Table A5.10. [CTT.4d] LaRue et al., 1999: Data for a Sample of Non-Hispanic Caucasians in New Mexico, aged 75-97 years: Effects of Age and Education on Test Performance*

                 Years of Education
                 0-12           >12
Color Trails 1   21.10 (4.90)   21.18 (5.24)
Sample size      n=67           n=94
Color Trails 2   12.41 (3.62)   13.08 (4.10)
Sample size      n=66           n=92

*Performance was measured as the number of digits correctly traced in 60 seconds.
Appendix 6: Locator and Data Tables for the Stroop Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 6.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A6.1. Locator Table for the Stroop Test Study/Version STROOP.l/Golden Golden, 1978 page 116 Data are not reproduced in this book
STROOP..2/Trenerry Trenerry et al., 1989 page 117 Data are not reproduced in this book STROOP.3/Comalli Comalli et al., 1962 page 117 Table A6.2
STROOP.4/Comalli Eson, personal communication page 118 Table A6.3
Age•
n
Sample Composition t
IQ/Education•
Locationt USA
15--45 46-64
65-80
18-49 x=30.34 (8.57) 50-79 x=62.68 (7.93)
106
17-19 35-44 65-80
18 14 16 15
63.2 67.0 72.0 78.3
15 16 16 16
25-34
50
43 males, 63 females; 26 males, 24 males; Nonneurological, nonpsychiatric Age 17-44 years, college students
14.68 (2.44)
USA
14.70 (3.24)
MA
Age 65-80 from community old-age club
NY
APPENDIX 6
662
Table A6.1. (Contd.) Age•
Study/Version
n
Sample Compositiont
IQ/Education•
Locationt
STROOP.5/Comalli Stuss et al., 1985 page US Table A6.4
29.2 (12.0)
20
13 males, 7 females, English/French
12.5 (2.0) 106.6 (13.4)
Canada
STROOP.6/Comalli Boone et al., 1990 page US Table A6.5
50-59
25 males, 36 females; 51 white, 4 black, 3 Asian, 3 Hispanic; fluent English; Nonneurological, nonpsychiatric, no substance abuse
14.34 (2.63) 113.79 (13.51)
S.CA
70-79
25 21 15
STROOP.7/ComalJi
35.8 (13.7)
16
7 males, 9 females; 19% left-handed; 14 white, 2 Asian; fluent English; no substance abuse, nonneurological, nonpsychiatric Communitydwelling individuals in good health 53 males, 102 females; nonneurological, nonpsychiatric, no substance abuse; ftuent English
15.2 (2.8) 2LD 109.1 (10.9)
S.CA
60-69
Boone et al., 1991 page 119 TableA6.6
STROOP.8/Coma1Ji Demick & Harkins, 1997 page 120 Tables A6.7-A6.10 STROOP.9/Comalli Boone, 1999 page 120 Table A6.11
STROOP.IO/Comalli Boone et al., 2001 page 121 Table A6.12
231 24M, 32 F 21M, 31 F 23M, 31 F
20-39 40-59 60-74 75+ <65 Average IQ High average Superior IQ ?:65 Average IQ High average SuperioriQ
34.32 (14.81)
40-49 STROOP.ll/Kaplan D'Elia et al., unpublished 50-59
38M, 28 F
1q•
33 23
:
35
I I
uj
20
High school
Boston, MA
plus some college 14.57 (2.55)
S.CA
Data for male controls; fluent English
13.36 (2.15)
S.CA
White male pilots,
16.1 (1.9) 15.6 (2.0)
USA
16 24 22
118 79
passed med exam
data page 121 Table A6.13 STROOP.l2/Kaplan Schiltz, personal communication page 122 Table A6.14
18-20
50
13.36 (0.63) 28 males, 22 females; range 13-15 college students; no head injury with loss of consciousness; native English-speaking old-age club
S.CA
Table A6.1. (Contd.) Study/Version
Age•
n
STROOP.l3/Kaplan Strickland et a!., 1997 page 122 Table A6.15
19--41 30.17 (6.34)
42
STROOP.l4/Kaplan Miller, 2003 page 122 Table A6.16
40.57 (7.5)
692
24-36
STROOP.l5/Golden Ingraham et a!., 1988 page 123 Table A6.17
Sample Compositiont
IQ/Education•
15 males, 27 females; African American; nonneurological, nonpsychiatric, no substance abuse
14.76 (2.2)
Seronegative homosexual and bisexual males from MACS, native English speakers; data partitioned by age x education
16.3 (2.3) <16 16 >16
46
College students and college-educated adults; 28 males, 18 females; Golden version with new randomization, bold typeface, and Hebrew lettering
College
28.4 (3.2)
Location† S.CA
MACS centers
Israel
STROOP.16/Golden Connor et al., 1988 page 123 Table A6.18
18-32
40
17 males, 23 females
College students
wv
STROOP.17/Golden Fisher et al., 1990 page 124 Table A6.19
72.9 (8.3)
36
13 males, 23 females; no ocular disease
14.6 (2.7)
S.CA
20-35
70
58
38 males, 32 females; 30 males, 28 females; French language; no substance abuse, nonneurological, nonpsychiatric; unskilled blue-collar to professional
12.36 (2.09)
27.71 (4.05) 45-65 56.62 (5.29)
STROOP.18/Golden Daigneault et al., 1992 page 124 Table A6.20
STROOP.19/Golden Swerdlow et al., 1995 page 125 Table A6.21
STROOP.20/Golden Ivnik et al., 1996 page 125 Table A6.22
56-59 60-64 65-69
70-74 75-79 80-84 85-89 90-94
72
Normal controls; 35 males, 38 females; sample divided by MMPI criteria and gender; interference ratio reported in addition to means
54 81 65 57 52 27 16 4
165 males, 191 females; 355 white, 1 black; 329 right-handed, 17 left-handed, 10 mixed-handed; nonneurological, nonpsychiatric
Canada
12.11 (3.63)
≤7 (n=2) 8-11 (n=34) 12 (n=133) 13-15 (n=86) 16-17 (n=66) ≥18 (n=35)
MN
(continued)
Table A6.1. (Contd.) Study/Version
Age•
n
STROOP.21/Golden Doan & Swerdlow, 1999 page 126 Table A6.23
34.4 (13.1)
30
31.2 (11.9)
30
STROOP.22/Golden Rapport et al., 2001 page 127 Table A6.24
33.2 (13.2)
32
STROOP.23/Golden Rosselli et al., 2002 page 127 Table A6.25
31.98 (13.14) 35.90 (13.08) 40.91 (15.17)
STROOP.24/Golden Lopez-Carlos et al., 2003 page 128 Tables A6.26-A6.29
28.23 (8.74)
71
40
.
~1
us
18-29
30-49
STROOP.25/Golden Cohen et al., 2003 page 129 Table A6.30
30.5 (10.7)
2o
STROOP.26/Golden Moering et al., 2004 page 129 Table A6.31
60-84
236
STROOP.27/Dodrill Dodrill, 1978a page 130 Table A6.32
27.34 (8.41)
50
STROOP.28/Dodrill Sacks et al., 1991 page 130 Table A6.33
22.4 (5) range 18-32
12
STROOP.29/Victoria Regard, 1981, cited in Spreen & Strauss, 1991 page 131 Table A6.34
20-35 x=26.7
.,
Sample Composition†
IQ/Education•
Location†
Vietnamese speakers, English speakers, Vietnamese translation used Control sample; 19 males, 13 females; undergraduate students
14.3 (3.5)
English-Spanish bilinguals, English monolinguals, Spanish monolinguals
14.92 (2.35) 15.35 (2.45) 14.25 (3.49)
S.FL
All-male sample; monolingual Spanish-speaking Latino manual laborers; data partitioned by age, education, age x education, and country; Spanish version administered
6.66 (2.54)
Los Angeles, Mexico
Male control group
11.8 (3.3) FSIQ 100.7 (11.0)
Massachusetts
African-American older adults; data stratified by 2 age x gender x education
<12 12 >12
Tampa, FL
30 males, 20 females; 49 white, 1 nonwhite; 9 students, 26 unemployed, 15 employed; nonneurological Male, normal vision
11.96 (2.01)
WA
13.7 (2.3) 109.1(9.5)
Australia
Right-handed young adults
Average IQ
Canada
S.CA
15.4 (1.6)
14.8 (2.5) FSIQ 108.0 (7.7)
0-6
7-10
Table A6.1.
(Contd.)
Study/Version
Age•
n
STROOP.30/Victoria Spreen & Strauss, 1991 page 131 Table A6.35
50-59 60-69 70-79
19 28 24 15
80-94 79.04 (6.59)
STROOP.31/Trenerry Anstey et al., 2000 page 131 Table A6.36
Sample Composition†
369
IQ/Education•
Healthy volunteers
13.2 (3.1)
Canada
Data collected on old and very old adults living in retirement villages and hostels; data presented in raw scores for the entire sample and in percentile distribution for the sample stratified by age x education; 14% male
11.25 (2.79)
Australia
0-9 10-12 13+
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors. †S.CA, Southern California; S.FL, Southern Florida.
Table A6.2. [Stroop.3] Comalli et al., 1962 (Comalli Version): Data for a Nonclinical Sample
Age Grouping   n    Word Reading   Color Naming   Color Interference
17-19          18   40.5           56.1           103.0
25-34          14   39.4           60.9           106.2
35-44          16   42.6           57.9           109.9
65-80          15   45.1           68.9           165.1
Table A6.3. [Stroop.4] Eson, Personal Communication (Comalli Version): Data for a Nonclinical Sample of Older Participants
Mean Age   n    Stroop A      Stroop B      Stroop C
63.2       15   46.9 (11.4)   64.9 (16.2)   148.9 (45.2)
67.0       16   51.8 (18.6)   71.2 (18.4)   165.2 (59.2)
72.0       16   53.4 (20.3)   71.6 (20.4)   177.4 (63.8)
78.3       16   67.9 (23.3)   83.9 (20.3)   231.1 (72.2)
Table A6.4. [Stroop.5] Stuss et al., 1985 (Comalli Version): Data for the Control Sample
n                    20
Age                  29.2 (12.0)
Education            12.5 (2.0)
WAIS IQ              106.6 (13.4)
Color Interference   64.0 (12.9)
Table A6.5. [Stroop.6] Boone et al., 1990 (Comalli Version): Data for the Control Sample•
Age Grouping         50-59            60-69            70-79
n                    25               21               15
Word Reading
  Time               40.25 (4.95)     46.05 (9.02)     44.07 (6.39)
  Errors             0.42 (0.72)      0.55 (1.15)      0.40 (0.63)
Color Naming
  Time               57.75 (9.74)     63.30 (8.25)     74.27 (17.16)
  Errors             2.00 (2.11)      1.55 (1.99)      2.00 (2.27)
Color Interference
  Time               120.79 (38.07)   135.50 (29.81)   148.67 (41.09)
  Errors             1.75 (1.42)      2.20 (2.12)      1.60 (1.50)
•The sample included 25 men and 36 women; mean education 14.34 (2.63) years; mean WAIS-R FSIQ 113.79 (13.51); 51 participants were white, 4 African American, 3 Asian, 3 Hispanic.
Table A6.6. [Stroop.7] Boone et al., 1991 (Comalli Version): Data for the Control Sample
n                    16
Age                  35.8 (13.7)
Education            15.2 (2.2)
WAIS-R FSIQ          109.1 (10.9)
Color Interference   112.9 (22.5)
Table A6.7. [Stroop.8a] Demick and Harkins, 1997 (Comalli Version): Data for a Sample of Healthy Adults•, 20-39 Age Grouping
                     M (SD)         Range
Number of errors
  Card A             0.5 (0.8)      0.0-3.0
  Card B             1.4 (1.5)      0.0-6.0
  Card C             3.5 (2.9)      0.0-14.0
Total Time(s)
  Card A             45.0 (6.7)     28.0-65.0
  Card B             62.1 (12.0)    41.0-91.0
  Card C             109.4 (29.5)   70.0-233.0
Color difficulty factor (total time on B/total time on A)
                     1.4 (0.3)      0.8-2.3
Interference factor (total time on C - total time on B)
                     47.3 (24.6)    14.0-151.0
•Mean age 24.2 (6.2) years, n=56; 24 M, 32 F.
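The two derived indices reported in Tables A6.7-A6.10 follow directly from the card times: the color difficulty factor is the Card B time divided by the Card A time, and the interference factor is the Card C time minus the Card B time. A minimal Python sketch of that arithmetic is shown here; the function names are hypothetical, and the inputs are the 20-39 group means from Table A6.7. Because the tabled factors are presumably averaged over participants, applying the formulas to group means only approximates the published values.

```python
def color_difficulty_factor(time_a: float, time_b: float) -> float:
    """Total time on Card B divided by total time on Card A."""
    if time_a <= 0:
        raise ValueError("Card A time must be positive")
    return time_b / time_a


def interference_factor(time_b: float, time_c: float) -> float:
    """Total time on Card C minus total time on Card B (seconds)."""
    return time_c - time_b


# Group mean times (seconds) for the 20-39 age grouping, Table A6.7.
card_a, card_b, card_c = 45.0, 62.1, 109.4

print(round(color_difficulty_factor(card_a, card_b), 2))  # ratio of group means
print(round(interference_factor(card_b, card_c), 1))      # difference of group means
```

For an individual examinee, the same functions would be applied to that person's own card times before comparing the result with the tabled mean and SD for the matching age group.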
Table A6.8. [Stroop.8b] Demick and Harkins, 1997 (Comalli Version): Data for a Sample of Healthy Adults•, 40-59 Age Grouping
                     M (SD)         Range
Number of errors
  Card A             0.3 (0.9)      0.0-5.0
  Card B             0.9 (1.3)      0.0-7.0
  Card C             1.3 (1.8)      0.0-8.0
Total Time(s)
  Card A             43.7 (8.8)     32.4-85.0
  Card B             59.2 (11.5)    36.0-93.0
  Card C             104.5 (20.6)   63.0-155.0
Color difficulty factor (total time on B/total time on A)
                     1.4 (0.2)      1.0-2.0
Interference factor (total time on C - total time on B)
                     45.3 (15.4)    17.0-82.0
•Mean age 48.9 (5.5) years; n=55; 21 M, 34 F.
Table A6.9. [Stroop.8c] Demick and Harkins, 1997 (Comalli Version): Data for a Sample of Healthy Adults•, 60-74 Age Grouping
                     M (SD)         Range
Number of errors
  Card A             0.5 (0.9)      0.0-4.0
  Card B             1.7 (2.0)      0.0-11.0
  Card C             2.8 (3.5)      0.0-13.0
Total Time(s)
  Card A             48.5 (11.2)    25.0-86.0
  Card B             69.4 (14.1)    46.0-123.0
  Card C             142.4 (26.2)   88.0-204.0
Color difficulty factor (total time on B/total time on A)
                     1.4 (0.3)      0.8-2.2
Interference factor (total time on C - total time on B)
                     73.0 (23.1)    32.0-142.0
•Mean age 68.9 (3.8) years; n=54; 23 M, 31 F.
Table A6.10. [Stroop.8d] Demick and Harkins, 1997 (Comalli Version): Data for a Sample of Healthy Adults•, 75+ Age Grouping
                     M (SD)         Range
Number of errors
  Card A             0.6 (2.1)      0.0-16.0
  Card B             2.1 (2.2)      0.0-8.0
  Card C             3.2 (4.7)      0.0-229.0
Total Time(s)
  Card A             50.2 (9.1)     33.0-76.0
  Card B             77.6 (19.0)    44.0-133.0
  Card C             156.6 (49.8)   66.0-344.0
Color difficulty factor (total time on B/total time on A)
                     1.5 (0.3)      1.0-2.7
Interference factor (total time on C - total time on B)
                     79.0 (40.4)    2.0-216.0
•Mean age 80.7 (3.6) years; n=66; 38 M, 28 F.
Table A6.11. [Stroop.9] Boone, 1999 (Comalli Version): Data for a Nonclinical Sample• of Adults Ranging in Age from 45 to 84, Partitioned by Age x IQ
Age    Average IQ              High Average IQ         Superior IQ
<65    132.64 (34.51) (n=33)   128.65 (26.87) (n=23)   110.29 (22.37) (n=35)
≥65    164.65 (51.90) (n=20)   153.75 (56.99) (n=16)   137.08 (33.14) (n=24)
•Mean education for the sample is 14.57 (2.55); mean WAIS-R FSIQ 115.41 (14.11); 53 males, 102 females.
Table A6.12. [Stroop.10] Boone et al., 2001 (Comalli Version): Data on 22 Male Controls
Age                  34.32 (14.81)
Education            13.36 (2.15)
WAIS-R FSIQ          107.14 (15.89)
Word Reading         43.64 (8.16)
Color Naming         58.91 (9.96)
Color Interference   112.36 (21.48)
Table A6.13. [Stroop.11] D'Elia et al., Unpublished Data (Kaplan Version): Data for a Sample of Male Pilots Stratified into Two Age Groups
Age Grouping         40-49          50-59
n                    118            79
Education            16.1 (1.9)     15.6 (2.0)
Color Naming
  Time               60.0 (10.2)    63.9 (15.7)
  Near-misses        0.63 (1.0)     0.33 (0.63)
  Errors             0.39 (0.67)    0.39 (0.67)
Word Reading
  Time               45.8 (8.8)     47.2 (9.1)
  Near-misses        0.18 (0.41)    0.28 (0.58)
  Errors             0.26 (0.52)    0.20 (0.49)
Color Interference
  Time               105.7 (21.4)   112.4 (22.6)
  Near-misses        0.72 (1.4)     0.94 (1.7)
  Errors             0.70 (1.1)     0.95 (1.8)
Table A6.14. [Stroop.12] Schiltz, Personal Communication (Kaplan Version): Data for a Sample of 50 Healthy Undergraduate Students
               First Half                     Total
Color Naming   23.02 (3.28) (range 16-30)     51.14 (6.79) (range 37-67)
Word Reading   17.62 (2.78) (range 13-25)     38.46 (5.10) (range 30-54)
Interference   43.22 (9.23) (range 27-74)     89.40 (17.76) (range 58-132)
Table A6.15. [Stroop.13] Strickland et al., 1997 (Kaplan Version): Data for a Sample of 42 Healthy African-American Adults• (15 Male, 27 Female) Between 19 and 41 Years of Age
                     Time             Near-Misses    Errors
Color Naming         59.26 (17.57)    0.43 (0.74)    1.28 (1.29)
Word Reading         43.62 (7.17)     0.24 (0.53)    0.50 (0.67)
Color Interference   109.98 (23.42)   1.05 (.82)     2.74 (2.07)
•Mean age for the sample is 30.17 (6.34) years, and mean education is 14.76 (2.24) years.
Table A6.16. [Stroop.14] Miller, 2003 (Kaplan Version): Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in MACS,• Stratified by Age x Education
        Education   Color Naming          Word Reading          Interference
Age     (years)     Mean (SD)       n     Mean (SD)      n      Mean (SD)        n
25-34   <16         59.56 (10.31)   32    45.31 (9.06)   32     115.41 (27.23)   46
        16          55.03 (9.57)    32    43.25 (8.31)   32     99.84 (22.02)    55
        >16         54.41 (8.82)    27    40.33 (5.60)   27     99.78 (23.79)    45
        Total       56.44 (9.80)    91    43.11 (8.07)   91     104.73 (25.20)   146
35-44   <16         59.08 (9.48)    73    45.97 (9.30)   73     115.16 (29.22)   99
        16          55.97 (9.72)    79    42.68 (6.56)   79     103.21 (22.98)   97
        >16         53.44 (9.69)    87    40.19 (6.35)   86     102.90 (25.16)   125
        Total       55.69 (9.78)    239   42.79 (7.78)   238    106.78 (26.39)   321
45-59   <16         59.85 (11.30)   54    46.11 (8.13)   54     120.34 (30.88)   65
        16          61.92 (12.93)   36    47.75 (8.19)   36     113.52 (23.16)   46
        >16         57.81 (12.19)   102   42.56 (7.86)   102    108.27 (24.60)   114
        Total       59.16 (12.13)   192   44.53 (8.25)   192    112.83 (26.69)   225
Total   <16         58.98 (10.26)   159   45.89 (8.82)   159    116.82 (29.28)   210
        16          57.22 (10.83)   147   44.05 (7.63)   147    104.67 (23.21)   198
        >16         55.62 (11.01)   216   41.33 (7.10)   215    104.56 (24.85)   284
        Total       57.10 (10.81)   522   43.49 (8.03)   521    108.31 (26.41)   692
•Multi-Center AIDS Cohort Study.
Table A6.17. [Stroop.15] Ingraham et al., 1988 (Golden Version): Data for 46 College-Educated Israelis on a Hebrew Version (18 Female, 28 Male)
Age                         28.4 (3.2), range 24-36
Reading                     99.6 (11.0)
Naming colors               77.0 (10.4)
Color Interference          47.1 (10.1)
Golden Interference score   3.9 (8.3)
Table A6.18. [Stroop.16] Connor et al., 1988 (Golden Version): Data for a Sample of 40 College Students (17 Male, 23 Female)•
             Pretest          Post-Test        Follow-Up†
Word         113.52 (14.72)   123.22 (19.28)   130.87 (17.00)
Color        81.22 (9.38)     93.80 (16.85)    99.18 (14.67)
Color-word   49.75 (7.53)     70.62 (15.74)    75.07 (16.30)
•Age range for the sample is 18-25, with the exception of one 32-year-old participant. †The test was administered at the baseline (pretest), after five practice sessions (post-test), and at 1-week follow-up.
Table A6.19. [Stroop.17] Fisher et al., 1990 (Golden Version): Data for the Control Sample
n                     36
Age                   72.9 (8.3)
Education             14.6 (2.7)
Male/female ratio     13/23
Word                  96.6 (15.8)
Color                 64.9 (13.9)
Color-word            33.4 (10.8)
Interference scores   -5.2 (8.6)
Table A6.20. [Stroop.18] Daigneault et al., 1992 (Golden Version): Data for Two Age Groups of French-Speaking Canadian Adults
Age Group            20-35          45-65
n                    70             58
Mean age             27.71 (4.05)   56.62 (5.29)
Mean education       12.36 (2.09)   12.11 (3.63)
Male/female ratio    38/32          30/28
Color Interference   48.80 (8.63)   37.87 (7.67)
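Tables A6.17 and A6.19 report a Golden interference score without restating how it is derived. The sketch below assumes the commonly cited Golden formulation, in which a predicted Color-Word score is computed as (Word x Color) / (Word + Color) and subtracted from the obtained Color-Word score; the tables themselves do not confirm that this exact formula was used in these studies, and the function names are hypothetical. Applying it to group means gives values close to, but not identical with, the tabled interference means, which is expected if the published means were averaged over individual participants.

```python
def golden_predicted_color_word(word: float, color: float) -> float:
    """Predicted Color-Word score under the assumed Golden formula."""
    if word + color <= 0:
        raise ValueError("Word + Color must be positive")
    return (word * color) / (word + color)


def golden_interference(word: float, color: float, color_word: float) -> float:
    """Obtained Color-Word score minus the predicted Color-Word score."""
    return color_word - golden_predicted_color_word(word, color)


# Group means from Table A6.17 (Reading, Naming colors, Color Interference).
ingraham = golden_interference(99.6, 77.0, 47.1)

# Group means from Table A6.19 (Word, Color, Color-word).
fisher = golden_interference(96.6, 64.9, 33.4)

print(f"Table A6.17 group means: {ingraham:.2f} (tabled mean 3.9)")
print(f"Table A6.19 group means: {fisher:.2f} (tabled mean -5.2)")
```

A positive value indicates better-than-predicted Color-Word performance and a negative value indicates more interference than predicted, which matches the sign pattern in the two tables.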
Table A6.21. [Stroop.19] Swerdlow et al., 1995 (Golden Version): Data on Normals Divided into "Psychosis-Prone" vs. "Non-Psychosis-Prone" Groups and Male vs. Female
                     Normal MMPI     Abnormal MMPI   Male            Female
n                    46              26              34              38
Word                 112.36 (2.59)   106.39 (3.16)   106.24 (3.13)   113.76 (2.53)
Color                80.74 (1.56)    78.15 (3.02)    76.53 (2.43)    84.72 (1.64)
Interference         49.70 (1.16)    44.00 (1.75)    46.12 (1.54)    49.00 (1.34)
Interference ratio   1.65 (0.03)     1.80 (0.06)     1.69 (0.05)     1.71 (0.04)
Table A6.22. [Stroop.20] Ivnik et al., 1996 (Golden Version): Demographic Description of the Sample Used in the Mayo Older Americans Normative Studies
                n
Age groups
  56-59         54
  60-64         81
  65-69         65
  70-74         57
  75-79         52
  80-84         27
  85-89         16
  90-94         4
Education
  ≤7            2
  8-11          34
  12            133
  13-15         86
  16-17         66
  ≥18           35
Gender
  Male          165
  Female        191
Race
  Caucasian     355
  Black         1
Handedness
  Right         329
  Left          17
  Mixed         10
Table A6.23. [Stroop.21] Doan and Swerdlow, 1999 (Golden Version): Data for 30 Vietnamese-Speaking Participants (on a Vietnamese Version) and 30 English-Speaking Participants
                           English         Vietnamese
Age                        31.2 (11.9)     34.4 (13.1)
Education                  15.4 (1.6)      14.3 (3.5)
Gender                     12M, 18F        13M, 17F
Word                       108.5 (12.22)   103.85 (18.68)
Color                      76.25 (10.79)   72.10 (18.06)
Interference               44.50 (9.93)    43.20 (14.27)
Color-interference ratio   1.77 (0.35)     1.72 (0.42)
"Cost"                     1.25 (8.45)     0.94 (11.19)
Table A6.24. [Stroop.22] Rapport et al., 2001 (Golden Version): Data for a Sample of 32 Controls (19 M, 13 F)
Age         33.2 (13.2)
Education   14.8 (2.5)
FSIQ        108.0 (7.7)
Word        100.9 (13.4)
Color       80.3 (10.4)
Table A6.25. [Stroop.23] Rosselli et al., 2002 (Golden stimuli but scores are time to complete all items): Data for Spanish Monolinguals, Spanish-English Bilinguals, and English Monolinguals
                   Monolingual Spanish   Bilingual        Monolingual English
n                  11                    71               40
Age                40.91 (15.17)         31.98 (13.14)    35.90 (13.80)
Education          14.25 (3.49)          14.92 (2.35)     15.35 (2.45)
Gender             3M, 8F                32M, 39F         13M, 27F
Spanish Stroop
 Word Reading
  Time             45.73 (5.39)          46.89 (10.01)
  Errors           0.00 (0.00)           0.09 (0.42)
 Color Naming
  Time             63.56 (12.18)         68.76 (16.14)
  Errors           0.09 (0.26)           0.28 (0.72)
 Interference
  Time             97.91 (27.44)         112.85 (30.18)
  Errors           0.64 (0.93)           0.76 (1.20)
English Stroop
 Word Reading
  Time                                   47.20 (14.34)    43.68 (8.59)
  Errors                                 0.42 (0.20)      0.15 (0.70)
 Color Naming
  Time                                   72.07 (17.94)    61.98 (12.53)
  Errors                                 0.23 (0.54)      0.20 (0.56)
 Interference
  Time                                   114.24 (32.22)   108.40 (30.17)
  Errors                                 0.59 (1.69)      0.68 (1.07)
Table A6.26. [Stroop.24a] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education Stratified by Age Group
Age Group   n    Block Design•   Word             Color           Color-Word
18-29       71   29.94 (10.35)   102.00 (20.87)   67.60 (17.29)   39.97 (13.65)
30-49       44   28.41 (12.06)   104.67 (25.27)   64.81 (18.31)   36.42 (12.94)
•Wechsler Adult Intelligence Scale-III Block Design raw scores (Mexican version).
Table A6.27. [Stroop.24b] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education Stratified by Education Group
Education Group   n    Block Design•   Word             Color           Color-Word
0-6               56   25.75 (10.74)   99.96 (26.10)    61.16 (17.65)   33.44 (11.53)
7-10              59   32.78 (10.22)   105.91 (18.39)   71.64 (16.22)   43.53 (13.36)
•WAIS-III Block Design raw scores (Mexican version).
Table A6.28. [Stroop.24c] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education Stratified by Age and Education Group
Age     Education   n    Block Design•   Word             Color           Color-Word
18-29   0-6         30   26.50 (9.46)    97.40 (25.86)    62.53 (16.63)   33.47 (10.98)
18-29   7-10        41   32.46 (10.20)   105.45 (15.65)   71.40 (16.99)   44.85 (13.54)
30-49   0-6         26   24.88 (11.97)   103.04 (26.59)   59.52 (19.02)   33.40 (12.40)
30-49   7-10        18   33.50 (10.52)   106.94 (23.88)   72.17 (14.80)   40.61 (12.85)
•WAIS-III Block Design raw scores (Mexican version).
Table A6.29. [Stroop.24d] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education Stratified by Country Group
Country Group      n    Block Design•   Word             Color           Color-Word
Los Angeles, USA   65   27.69 (10.26)   103.25 (24.64)   64.92 (17.76)   36.48 (13.32)
Jalisco, Mexico    50   31.52 (11.66)   102.72 (19.90)   68.58 (17.50)   41.32 (13.24)
•WAIS-III Block Design raw scores (Mexican version).
Table A6.30. [Stroop.25] Cohen et al., 2003 (Golden Version): Data for a Male Control Sample
Age                   30.5 (10.7)
Education             11.8 (3.3)
FSIQ                  100.7 (11.0)
Stroop Interference   56.2 (6.3)
Table A6.31. [Stroop.26] Moering et al., 2004 (Golden Version): Data for a Sample of Elderly African Americans
Years of Education <12
12
>12
Total
6 16.33 (1.51) 65.83 (1.60) 81.83 (18.80) 35.67 (11.11) 25.17 (7.28)
54 10.93 (2.28) 65.13 (2.77) 72.30 (14.57) 32.20 (12.81) 21.11 (6.24)
11 15.00 (1.79) 64.18 (3.16) 87.45 (8.24) 53.91 (14.52) 27.82 (4.09)
57 11.19 (2.48) 64.46 (3.33) 79.02 (11.94) 38.33 (13.55) 24.02 (5.66)
Age range 60-71
Male n Education Age Word Color Color-word Female n Education Age Word Color Color-word
37 9.73 (0.81) 65.11 (2.93) 70.49 (13.64) 31.46 (13.98) 20.57 (6.38) 30 9.37 (1.22) 65.53 (3.61) 74.93 (10.93) 34.67 (12.34) 21.87 (5.85)
i1 l2·00 :t.82 .82) 'f3.18 (:(4.55) ~.82
(9.77) ~.73
{4.73)
]j;
too
~.63
<. .93)
~.87
(
.93)
~.50
.42)
2$.44 (~.49)
Table A6.31. (Contd.) Years of Education
Age range 72-84 Male n Education
Age Word Color Color-word Female n Education Age Word Color Color-word
<12
12
>12
Total
48 7.29 (2.19) 77.31 (2.13) 54.75 (17.27) 27.50 (10.49) 19.96 (3.89)
4 12.00
2 17.00 (1.41) 74.50 (3.54) 80.50 (7.78) 45.50 (21.92) 27.00 (15.56)
54 8.00 (3.00) 77.24 (2.22) 56.52 (17.31) 28.28 (10.96) 20.28 (4.52)
4 15.50 (1.92) 79.75 (3.10) 89.50 (10.41) 56.25 (24.07) 31.50 (10.75)
71 9.25 (2.67) 77.56 (2.83) 69.04 (14.94) 30.69 (11.01) 21.70 (5.46)
23 15.61 (1.75) 68.21 (6.66) 85.74 (11.81) 48.83 (17.20) 27.70 (7.13)
236 9.82 (2.90) 71.53 (6.91) 69.33 (16.69) 32.33 (12.54) 21.80 (5.63)
56 8.27 (1.88) 77.61 (2.73) 66.36 (14.59) 28.48 (7.95) 20.57 (4.38)
77.75 (2.63) 65.75 (7.37) 29.00 (5.42) 20.75 (3.40)
11 12.00 76.64 (3.04) 75.27 (10.39) 32.64 (5.87) 23.91 (4.21)
All ages
n Education
Age Word Color Color-word
171 8.50 (1.90) 72.75 (6.56) 65.50 (16.28) 29.94 (11.18) 20.63 (5.02)
42 12.00 68.30 (7.01) 75.95 (12.84) 33.05 (6.86) 23.36 (4.72)
Table A6.32. [Stroop.27] Dodrill, 1978a (Dodrill Version): Data for a Control Sample of 50 Participants•
Part I
88.62 (17.23)
Part II
225$0 (59~)
Part I+ part II
314.~
Part II- part I
137.18 (50.74)
(71.04)
•Mean age 27.34 (8.41) years, mean education 11.96 (2.01) years.
Table A6.33. [Stroop.28] Sacks et al., 1991 (Dodrill Version): Data for a Sample of 12 Male Australian University Students•
Dodrill Form       72.6 (18.2)
Alternate Forms
  1                63.7 (13.4)
  2                68.8 (21.0)
  3                68.6 (13.3)
  4                67.4 (18.1)
  5                66.0 (18.0)
•Mean age 22.4 (5.0) years, mean education 13.7 (2.3) years, mean FSIQ 109.1 (9.5).
Table A6.34. [Stroop.29] Regard, 1981, cited in Spreen and Strauss, 1998 (Victoria Version): Data for a Sample of 40 Right-Handed Young Adults, Age 20-35
Naming color of dots (D)
  Time     10.10 (2.01)
  Errors   0.03 (0.16)
Naming color print of noncolor words (W)
  Time     12.00 (2.49)
  Errors   0.03 (0.16)
Naming color print of color words (C)
  Time     19.25 (5.18)
  Errors   0.23 (0.53)
Table A6.35. [Stroop.30] Spreen and Strauss, 1991 (Victoria Version): Data for a Sample of Healthy Older Participants• Age Groupings
n
50-59
60-69
70-79
80-94
19
28
24
15
13.74 (2.58)
12.71 (1.90)
15.00 (5.07) 0.08 (0.28)
18.87 (4.67) 0.20 (0.56)
16.58 (3.34)
16.32 (3.33) 0.04 (0.19)
19.04 (5.10)
24.13 (5.13) 0.13
Naming color of dots (D)
Time Errors Naming color print of noncolor words
Time Errors
(0.35)
Naming color print of color words
Time
28.90 (7.62) 0.42 (0.77)
Errors
38.38
31.82 (9.86) 0.36 (0.68)
61.13 (30.94) 2.73 (2.46)
(13.29) 0.71 (1.16)
•Mean education for the sample is 13.2 (3.1) years.
Table A6.36. [Stroop.31] Anstey et al., 2000 (Trenerry Version): Data for 259 Retired Australian Elderly Partitioned by Age Group and Education Age Group 62-69
70-79
80-89
90-95
Education Group
0-9
10-12
5 10 25 50 75 90 95
47 47 60 64 74
42 42 42 49 52
n
7
7
6
5 10 25 50 75 90 95
42 42 59 89 99
83 83 83
47 47 72
99 112
93
n
7
Percentile
13+
0-9
10-12
13+
0-9
10-12
13+
0-9
10-12
13+
52 52 54 66 71
49 49 56
47 48 50 56 64 73 86 56
42 49 57 56 58 65 75 26
48 51 57
50 52 58 64 75 86 103
44 45 50 56
56 56 60 61
49 49 49 59
63
85
56 56 59 69 81
71 108 25
7
4
3
31 47
51 54
63
63
77
76 96 112 112 29
Color Naming
63 73 86 120 39
63 76
88 103 31
44
Interference
7
100
6
39 45 54 78 92 103 112 39
93 105 112 58
5 17 41 69 90 100 110 28
8 13 44 59 80 90 95 45
5 32 58 73
6 6 22 53
85
5 8 55
3 37 78
83
99 103 24
7
4
5
Appendix 6m: Meta-Analysis Tables for the Stroop Test (Golden Version, Interference Condition)
Table A6m.1. Results of the Meta-Analysis and Predicted Scores for the Stroop Test, Interference Condition (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis     6
Years of publication                           1988-2004
Number of data points used in the analysis     10
  (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants                   490
Variable            n•    x̄†      SD†     Range
Sample size
  Mean              10    44.40   17.14   20-71
Age
  Mean              10    48.95   22.13   22.0-77.6
  SD                10    6.44    3.93    2.0-11.9
Education
  Mean              9     12.53   2.30    8.0-15.4
  SD                9     2.58    .68     Uh'3.6
Percent male        4     44.48   8.66    36.1-54.3
Test score means
  Combined mean     10    38.71   12.71   20.3-56.2
  Combined SD       10    7.90    3.00    4.5-10.8
•Number of data points differs for different variables due to missing data. †Weighted means and standard deviations.
Table A6m.1. (Contd.)
Predicted scores and SDs per age group• (Stroop, Golden version, Interference condition)
                              95% CI
Age Range   Predicted Score   Lower Band   Upper Band
25-29       49.66             44.43        54.88
30-34       47.10             42.36        51.85
35-39       44.55             40.03        49.07
40-44       41.99             37.39        46.60
45-49       39.45             34.48        44.42
50-54       36.89             31.33        42.46
55-59       34.34             28.01        40.67
60-64       31.79             24.59        38.99
65-69       29.24             21.09        37.39
70-74       26.68             17.53        35.84
Standard deviation for all age groups is 7.90.
•Based on the equation: Predicted test score = 63.69403 - 0.5104828 * age
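The prediction equation in the footnote can be applied directly. The sketch below evaluates it at the midpoint of each 5-year age band; the midpoint convention (lower bound + 2.5 years) is an assumption inferred from the fact that it reproduces the tabled predicted scores to within rounding, and it is not stated in the table. The `z_score` helper is a hypothetical illustration of comparing an observed Golden Color-Word score with the age-predicted mean, using the aggregate SD of 7.90 that the table recommends for all age groups.

```python
INTERCEPT = 63.69403     # constant from the footnote equation
AGE_SLOPE = -0.5104828   # age coefficient from the footnote equation
AGGREGATE_SD = 7.90      # SD suggested for all age groups


def predicted_score(age: float) -> float:
    """Predicted Stroop interference-condition score for a given age."""
    return INTERCEPT + AGE_SLOPE * age


def z_score(observed: float, age: float) -> float:
    """Hypothetical helper: standardized distance from the predicted mean."""
    return (observed - predicted_score(age)) / AGGREGATE_SD


# Evaluate at assumed band midpoints (25-29 -> 27.5, ..., 70-74 -> 72.5).
for low in range(25, 75, 5):
    midpoint = low + 2.5
    print(f"{low}-{low + 4}: {predicted_score(midpoint):.2f}")

# Illustrative only: an observed score of 25 at age 62.
print(f"z = {z_score(25.0, 62.0):.2f}")
```

The equation was fitted to study means for ages 25-74, so predictions outside that range would be extrapolations.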
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations   10
Number of clusters       6
R2                       0.791
F                        F(1,5) = 18.07, p = 0.008
Term       Coefficient   SE      t       p       95% CI
Age        -0.5104828    0.120   -4.25   0.008   -0.819 to -0.202
Constant   63.69403      5.184   12.29   0.000   50.369 to 77.019
Prediction
Predicted age range     25-74 years
Mean predicted score    38.17 (7.73)
SEe                     3.08
95% CI                  32.12-44.22
(continued)
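The t values and confidence intervals in the coefficient table can be checked from the coefficients and standard errors. The sketch below assumes the intervals are coefficient ± t_crit x SE with 5 degrees of freedom (number of clusters minus one, consistent with the reported F(1,5)); the critical value 2.5706 is the two-sided 95% quantile of Student's t for 5 df. The function names are hypothetical, and small discrepancies are expected because the tabled standard errors are rounded.

```python
T_CRIT_DF5 = 2.5706  # two-sided 95% critical value, Student's t, 5 df (assumed df)


def t_statistic(coefficient: float, standard_error: float) -> float:
    """Coefficient divided by its standard error."""
    return coefficient / standard_error


def confidence_interval(coefficient: float, standard_error: float,
                        t_crit: float = T_CRIT_DF5) -> tuple[float, float]:
    """Symmetric interval: coefficient +/- t_crit * SE."""
    half_width = t_crit * standard_error
    return coefficient - half_width, coefficient + half_width


# (coefficient, SE) pairs as tabled for the age regression.
terms = {"Age": (-0.5104828, 0.120), "Constant": (63.69403, 5.184)}

for name, (coef, se) in terms.items():
    low, high = confidence_interval(coef, se)
    print(f"{name}: t = {t_statistic(coef, se):.2f}, 95% CI {low:.3f} to {high:.3f}")
```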
Table A6m.1. (Contd.)
Figure A6m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Stroop Test. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect                    29.129
Pooled estimates for random effect                   35.721
Q(df), p                                             Q(9) = 1659.82, p < 0.000
Moment-based estimate of between-study variance      161.556
Tests for model fit: addition of a quadratic term
Model       R2      Adjusted R2   BIC      BIC'
Linear      0.791   0.764         44.113   -13.330
Quadratic   0.791   0.731         46.415   -11.028
BIC' difference of 2.302 provides positive support for the linear model.
Tests for parameter specifications
Normality of residuals: Shapiro-Wilk W test          W = 0.941, p = 0.562
Homoscedasticity: White's general test               2.171, p = 0.338
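The model-fit comparison above rests on the difference between the two BIC' values. The sketch below recomputes that difference and labels it using the commonly cited Raftery grades of evidence (0-2 weak, 2-6 positive, 6-10 strong, above 10 very strong); the table does not name its source for the word "positive", so the grading thresholds are an assumption, and the function name is hypothetical.

```python
def bic_support(bic_first: float, bic_second: float) -> tuple[float, str, str]:
    """Return (absolute difference, evidence grade, which model is preferred).

    The model with the lower (more negative) BIC' is preferred.
    """
    difference = abs(bic_first - bic_second)
    if difference < 2:
        grade = "weak"
    elif difference < 6:
        grade = "positive"
    elif difference < 10:
        grade = "strong"
    else:
        grade = "very strong"
    preferred = "first" if bic_first < bic_second else "second"
    return difference, grade, preferred


# BIC' values as tabled: linear model first, quadratic model second.
difference, grade, preferred = bic_support(-13.330, -11.028)
print(f"BIC' difference = {difference:.3f}; {grade} support for the {preferred} model")
```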
Table A6m.1. (Contd.)
Significance tests for regression with SDs
A regression of SDs on age yielded an R2 of 0.015 (F(1,5) = 0.06, p = 0.822). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
Est. tau2 without education   178.8
Est. tau2 with education      128.1
Regression of test means on education and age
Number of observations        9
Number of clusters            5
R2                            0.792
Term        Coefficient   SE      t      p       95% CI
Education   0.772         1.308   0.59   0.586   -2.858 to 4.403
Table A6m.2. Comparison of means and standard deviations for the aggregate sample for three conditions•
                        Word Reading    Color Naming    Color Interference
Mean score              92.09 (14.32)   58.07 (11.83)   34.24 (8.12)
Aggregate n             342
Number of studies       4
Number of data points   7
Mean age                54.19 (23.68)
Mean education          12.79 (2.34)
•Four studies containing data for all three conditions are compared.
Appendix 7: Locator and Data Tables for Auditory Consonant Trigrams
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 7.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A7.1. Locator Table for Auditory Consonant Trigrams (ACT)
ACT.1 Stuss et al., 1987, page 137, Tables A7.2, A7.3
  Age•: 16-19, 20-29, 30-39, 40-49, 50-59, 60-69
  n: 60 (10 per age group)
  Sample Composition: Data presented by age, gender, and education groups; 33 male, 27 female; 49 right-handed; nonpsychiatric, nonneurological, no substance abuse
  IQ/Education•: 14.5 (2.63)
  Version Used: 9, 18, 36
  Location: Canada
ACT.2 Stuss et al., 1988, page 137, Tables A7.4, A7.5
  Age•: 16-29, 30-49, 50-69
  n: 90 (30 per age group)
  Sample Composition: Data presented by age groups; 44 male, 46 female; 76 right-handed; nonpsychiatric, nonneurological
  IQ/Education•: 14.1 (1.34), 14.9 (3.95), 13.2 (2.38)
  Version Used: 9, 18, 36
  Location: Canada
ACT.3 Boone et al., 1990, page 138, Table A7.6
  Age•: 50-59, 60-69, 70-79
  n: 61 (25, 21, 15)
  Sample Composition: Data presented by age groups; 25 men, 36 women; all but 10 white; nonpsychiatric, nonneurological, no substance abuse, no major medical condition
  IQ/Education•: 14.34 (2.63); IQ 113.8 (13.5)
  Version Used: 3, 9, 18
  Location: California
ACT.4 Boone et al., 1991, page 139, Data are not reproduced in this book
  Age•: 35.8 (13.7)
  n: 16
  Sample Composition: Data presented for healthy controls; 7 men, 9 women; nonpsychiatric, nonneurological, no substance abuse, no major medical condition; total score for the sample provided
  IQ/Education•: 15.2 (2.8); IQ 109.1 (10.9)
  Version Used: 3, 9, 18
  Location: California
ACT.5 Boone, 1999, page 139, Table A7.7
  Age•: <65 (Average IQ, High average IQ, Superior IQ); ≥65 (Average IQ, High average IQ, Superior IQ)
  n: 155 (<65: 32, 23, 37; ≥65: 20, 16, 23)
  Sample Composition: Data presented by IQ and age groups; 53 male, 102 female; no substance abuse, nonpsychiatric, nonneurological
  IQ/Education•: 14.6 (2.6); IQ 115.4 (14.1)
  Version Used: 3, 9, 18
  Location: California
ACT.6 Anil et al., 2003, page 140, Table A7.8
  Age•: 16-25, 26-45, 46-65
  n: 236
  Sample Composition: Volunteers were assessed in Turkey using a translated and modified version of the ACT; data partitioned by 3 age x 3 education groups
  IQ/Education•: 8-10, 11-14, >14
  Version Used: 3, 9, 18
  Location: Turkey
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Data for the 9-, 18-, and 36-Second Delay Version
Table A7.2. [ACT.1a] Stuss et al., 1987: Data for 60 Healthy Canadian English- or French-Speaking Volunteers, Partitioned by Age Group•
                              Age Group
Delay Interval                16-19        20-29        30-39        40-49        50-59        60-69
Combined visits
  9 seconds                   12.4 (1.6)   12.6 (2.3)   12.7 (2.6)   11.1 (2.6)   10.9 (2.8)   12.3 (1.6)
  18 seconds                  12.0 (2.1)   12.2 (2.7)   12.6 (2.8)   11.0 (2.5)   10.5 (3.2)   11.2 (1.8)
  36 seconds                  10.0 (3.0)   9.8 (2.8)    11.3 (2.8)   9.9 (2.8)    8.2 (3.6)    9.9 (2.1)
First visit
  9 seconds                   11.8 (1.9)   12.2 (2.5)   12.8 (1.8)   11.0 (2.8)   10.9 (2.5)   12.2 (1.6)
  18 seconds                  11.8 (2.2)   11.7 (2.9)   12.5 (3.0)   10.0 (2.8)   9.8 (3.4)    10.5 (1.7)
  36 seconds                  9.9 (2.3)    8.7 (3.3)    11.0 (2.5)   8.7 (3.4)    8.2 (3.1)    10.0 (1.8)
Second visit (1 week later)
  9 seconds                   13.0 (1.4)   13.0 (2.0)   12.6 (3.5)   11.2 (2.5)   11.0 (3.0)   12.3 (1.7)
  18 seconds                  12.2 (2.4)   12.6 (2.6)   12.7 (2.5)   12.0 (2.2)   11.1 (2.9)   11.8 (2.0)
  36 seconds                  10.1 (3.6)   10.9 (2.3)   11.5 (3.0)   11.0 (2.1)   8.2 (4.1)    9.8 (2.4)
•Mean age for the sample 39.6 (2.62) years, and mean education is 14.5 (2.63) years; 33 males, 27 females.
Table A7.3. [ACT.1b] Stuss et al., 1987: Data Collapsed Across Age Groups, Partitioned by Gender and Educational Level
Delay Interval   Males          Females        ≤High School   >High School
9 seconds        11.71 (2.65)   12.35 (1.94)   11.46 (2.60)   12.44 (2.06)
18 seconds       11.17 (2.74)   12.04 (2.31)   11.13 (2.55)   11.91 (2.56)
36 seconds       9.20 (3.13)    10.61 (2.57)   9.32 (3.25)    10.26 (2.67)
Table A7.4. [ACT.2a] Stuss et al., 1988: Demographic Characteristics for the Sample of Healthy Canadian Volunteers Stratified into Three Age Groups
             Gender    Hand Preference   Age            Education (Years)
Group   n    M    F    R    L            Mean (SD)      Mean (SD)      Range
1       30   16   14   22   8            22.43 (2.67)   14.10 (1.34)   11-18
2       30   14   16   26   4            40.63 (2.97)   14.90 (3.95)   5-20
3       30   14   16   28   2            61.77 (3.0)    13.20 (2.38)   8-18
Table A7.5. [ACT.2b] Stuss et al., 1988: Performance Across Three Age Groups for the Initial Test and Retest 1 Week Later
                 Age Group
                 16-29                     30-49                     50-69
Delay Interval   Test         Retest       Test         Retest       Test         Retest
9 seconds        12.0 (2.2)   12.6 (2.0)   12.0 (2.5)   12.1 (2.9)   11.5 (2.3)   11.7 (2.3)
18 seconds       11.4 (2.8)   12.3 (2.4)   10.5 (3.1)   12.0 (2.6)   10.2 (2.5)   10.7 (2.9)
36 seconds       9.4 (2.7)    10.9 (2.9)   9.9 (3.0)    11.1 (2.4)   8.7 (2.9)    8.6 (3.5)
Data for the 3-, 9-, and 18-Second Delay Version
Table A7.6. [ACT.3] Boone et al., 1990: Data for the Control Sample Stratified into Three Age Groups•
Age Group              50-59          60-69          70-79
n                      25             21             15
Total                  44.76 (7.36)   48.15 (8.02)   42.50 (7.70)
Perseverative errors   6.36 (3.81)    4.20 (2.78)    5.71 (2.87)
Altered sequence       2.00 (1.50)    1.85 (1.73)    2.71 (2.55)
3 seconds              12.56 (2.02)   12.95 (2.42)   11.21 (3.17)
9 seconds              9.44 (3.79)    10.75 (3.34)   8.93 (2.70)
18 seconds             8.20 (3.56)    9.65 (3.59)    7.50 (3.32)
•Mean education for the sample 14.34 (2.63), mean WAIS-R FSIQ 113.79 (13.51); 25 male, 36 female; 51 Caucasian, 4 African American, 3 Asian, 3 Hispanic.
Table A7.7. [ACT.5] Boone, 1999: Data for a Nonclinical Sample• of Adults Ranging in Age from 45 to 84, Partitioned by Age x IQ
Age    Average IQ            High Average IQ       Superior IQ
<65    45.81 (6.05) (n=32)   45.91 (6.45) (n=23)   50.38 (8.01) (n=37)
≥65    39.95 (9.99) (n=20)   43.31 (9.23) (n=16)   49.22 (6.02) (n=23)
•Mean education for the sample 14.57 (2.55); mean WAIS-R FSIQ 115.41 (14.11); 53 males, 102 females.
Table A7.8. [ACT.6] Anil et al., 2003: Data Collected in Turkey on 236 Volunteers who Were Tested with a Translated Version of the ACT
                   Age Group
                   16-25                                  26-45                                  46-65
Education Group•   M            H            U            M            H            U            M            H            U
Delay interval:
  0 seconds        14.9 (0.3)   14.9 (0.3)   15.0 (0.0)   14.6 (0.9)   14.9 (0.3)   15.0 (0.0)   14.6 (0.8)   14.7 (0.6)   15.0 (0.0)
  3 seconds        12.8 (3.0)   13.4 (1.9)   14.5 (1.0)   9.7 (2.8)    11.9 (2.3)   14.0 (1.3)   11.3 (2.8)   12.4 (1.7)   13.6 (1.7)
  9 seconds        8.9 (2.7)    11.0 (3.0)   12.5 (2.1)   8.2 (3.0)    10.1 (2.4)   11.5 (3.0)   8.6 (3.1)    8.3 (2.5)    9.9 (2.5)
  18 seconds       8.8 (4.0)    11.2 (3.3)   12.4 (2.8)   6.6 (3.6)    9.0 (3.3)    11.4 (3.1)   5.9 (2.8)    7.3 (2.5)    9.6 (3.2)
•M, middle school; H, high school; U, university.
Appendix 8: Locator and Data Tables for the Paced Auditory Serial Addition Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 8.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A8.1. Locator Table for the Paced Auditory Serial Addition Test (PASAT) Study/Version
Age•
n
PASAT.1/ Gronwall Gronwall, 1977 page 146 Table A8.2
14-55
60
16-19
60
PASAT.2/ Gronwall Stuss et al., 1987 page 146 Table A8.3
20-29
30-39 40-49
50-59 60-69
Sample Composition Control group: 10 individuals who had experienced accidents but no head injuries, 10 naval "ratings" and 40 1st-year university students; no exclusion criteria M 33, F 27; community volunteers; non-neurological or psychiatric conditions; native language either French or English
IQ/
Education•
Location
New Zealand
12.30
Ottawa, Canada
(0.95) 16.20 (1.39)
16.70 (3.86)
15.50 (2.88)
11.70
(2.41) 14.30 (2.00) (continued)
Table A8.1. (Contd.) Study/Version PASAT.3/ Gronwall Stuss et al., 1988 page 147 Table A8.4
Age•
n
16-29
90
30-49 50-69
Sample Composition M 46, F 44; see above study (PASAT.2) for description
IQ/ Education• 14.10 (1.34)
Location
Ottawa, Canada
14.90 (3.95) 13.20 (2.38)
PASAT.4/ Gronwall Rao et al., 1989 page 148 Table A8.5
42.8 (8.1)
PASAT.5/ Gronwall Stuss et al., 1989 page 148 Table A8.6
29.7 (12.4) 17-57
26
M 20, F 6; no neurological or psychiatric conditions
13.2 (3.0) 7-20
PASAT.6/ Gronwall Rao et al., 1991a page 149 Table A8.7
46.0 (11.6)
100
M 25, F 75; paid volunteers; no neurological or psychiatric conditions; 99 out of 100 were Caucasian
13.3 (2.0)
Wisconsin
PASAT.7/ Gronwall Strauss et al., 1994 page 149 Table A8.8
23.70 (2.53) 20-35
10
M 4, F 6; undergraduate students from University of Victoria
15.21 (0.79)
Victoria, Canada
PASAT.8/ Gronwall Zalewski et al., 1994 page 149 Table A8.9
(38.0)
241
Nonpsychiatric and non neurological veterans; 189 Caucasians, 35 African Americans, 11 Hispanics, and 6 "other"
PASAT.9/ Gronwall Crawford et al., 1998b page 150 Table A8.10
25.00 (3.27)
152
M 77, F 75; no neurological or psychiatric conditions; no systemic illness
16-29
38.10 (5.67) 30-49 60.70 (7.41) 50-74
40
M 10, F 30; no neurological, physical, or psychiatric conditions and neurological exams; normal brain imaging
14.0 (2.3)
Wisconsin
WAIS-R Verbal IQ: 108.1 (6.3) Ottawa, Canada
13.6
12.97 (2.86) WAIS-R FSIQ: 105.0
United Kingdom
Table A8.1. (Contd.) Sample Composition
IQ/ Education•
45
Neurologically normal; recruited from nonmedical hospital staff; no serious medical, neurological, or psychiatric disorders; no substance abuse
12.8 (1.9)
19.8 (3.85)
20
Undergraduate students; native English speakers; no hearing problems; no history of repeating grades; no neurological problems, head trauma, substance abuse, attention problems, learning disability, or current medication use
PASAT.12/ Gronwall Honn et al., 1999 page 151 Table A8.13
32.5 (6.3)
76
HIV-negative males, 13.2% and 13.4% had history of marijuana abuse or dependence; no intravenous drug use, head injuries, or neurological or psychiatric conditions
PASAT.13/ Gronwall (computerized) Wingenfeld et al., 1999 page 152 Table A8.14
21.0 (5.1) 17-48
168 M 80, F 88; college students; native English speakers; no neurological or psychiatric conditions; no emotional problems, learning disability, attentional problems; 88% Caucasian, 4% African American, 4% Asian American, and 4% "other"
PASAT.14/ Gronwall Bate et al., 2001 page 152 Table A8.15
30.2 (10.3)
35
PASAT.15/ Gronwall Boringa et al., 2001 page 153 Table A8.16
45.8 22-73
Study/Version
Age•
n
PASAT.10/ Gronwall Prevey et al., 1998 page 150 Table A8.11
44.4 (11.4)
PASAT.11/ Gronwall (computerized) Holdwick & Wingenfeld, 1999 page 151 Table A8.12
PASAT.16/ Gronwall Fluck et al., 2001 page 153 Table A8.17
Young M: 21.1 (0.4) Young F: 20.9 (0.2)
M 20, F 15; no psychiatric or neurological conditions; no intellectual disability, substance abuse, or hemiplegia of the dominant hand; native English speakers
140 M 62, F 78; community volunteers; no neurological or psychiatric conditions; no substance abuse, learning disability, or serious head injury 60
M 30, F 30; college students; nonpsychiatric; no medication; four groupings based on age and gender
Location Various locations, USA
14.6 (2.4)
College students
Arkansas
12.6 (2.0)
Australia
Verbal IQ (NART-R) 101.1 (9.1) <9 years 9-10 years >10 years
Amsterdam
113.0 (1.5)
London, UK
112.4 (1.7) (continued)
Table A8.1. (Contd.) Study/Version
PASAT.17/ Gronwall Snyder et al., 2001 page 153 Table A8.18
PASAT.18/ Levin Brittain et al., 1991 page 154 Tables A8.19, A8.20
Age•
n
Sample Composition
IQ/ Education•
Middle-aged M: 57.5 (1.3)
117.7 (1.8)
Middle-aged F: 60.3 (0.7)
113.3 (2.2)
37.97 (12.94)
21.0
35
526
(2.1)
M 9, F 26; staff, volunteer workers, and students recruited from the Queen Elizabeth II Health Science Centre, Dalhousie University, and Multiple Sclerosis Society; no psychiatric or neurological conditions
M 233, F 293; healthy volunteers; no psychiatric or neurological conditions; 391 Caucasian and 135 non-Caucasian
<25
31.4 (4.1) 25-39
14.1 (2.3)
Location
Nova Scotia, Canada
WAIS-R
Vocabulary score: 54.5 (7.0) 13.0 (1.3) Shipley IQ: 105.0 (9.1) 14.0 (2.2) Shipley IQ: 103.0 (10.4)
13.0 (3.1) Shipley IQ: 101.0 (12.6) 12.0 (2.5) Shipley IQ: 106.0 (15.1)
46.2 (4.3) 40-54
67.0 (7.1) 60-75
PASAT.19/ Levin Roman et al., 1991 page 155 Table A8.21
19.0 (1.6) 18-27
40.0 (5.1)
33-50
69.0 (4.1) 60-75
143
M 66, F 77; undergraduate students and employees of Baylor University, business organizations, and senior centers; no psychiatric or neurological conditions
12.0 (0.77) FSIQ: 110 (12.3) 15.0 (2.6) IQ: 110 (12.3) 15.0 (3.2) IQ: 107 (11.0)
Texas
Table A8.1. (Contd.) Sample Composition
Study/Version
Age•
n
PASAT.20/
33.3 (12.4) 18-59
40
Normal controls: no neurological or psychiatric conditions
20-49
821
M 672, F 149; no neurological or psychiatric conditions; 699 Caucasian, 46 African American, 31 Hispanic, and 13 Native American
Levin Cicerone, 1997 page 155 Table A8.22
PASAT.21/ Levin Wiens et al., 1997 page 156 Tables A8.23, A8.24
M: 29.2 (6.1) F: 29.2 (5.6)
IQ/ Education•
Location
14.9 (2.2)
New Jersey
14.6 (1.5) FSIQ: 106.6 (11.0)
Pacific Northwest, USA
14.5 (1.6) FSIQ: 105.4 (11.1)
PASAT.22/ Levin Tiersky et al., 1998 page 156 Table A8.25
PASAT.23/ Levin Stein et al., 2002 page 156 Table A8.26
PASAT.24/ Levin Diamond et al., 1997 page 157 Table A8.27
PASAT.25/ PASAT-50 Diehr et al., 1998 page 157 Table A8.28
PASAT.26/ PASAT-50, -100, -200 Diehr et al., 2003 page 158 Table A8.29
37.1 (2.4)
20
All female; from local community; no neurological or psychiatric conditions
15.0 (0.55)
New Jersey
29.4 (10.7)
22
All female; fluent in English; no psychiatric or neurological conditions
13.9 (1.5);
California
40.9 (8.9) 31-56
22
Recruited from Kessler Institute and its local community; no psychiatric or neurological conditions; no substance abuse
15.4 (2.2)
New Jersey
39.7 (12.1) 20-68
566
M 61%, F 39%; no psychiatric or neurological conditions; 55% African American, 45% Caucasian
14.2 (2.6) 9-20
California
39.7 (12.1) 20-68
560
M 342, F 218; No psychiatric or neurological conditions
14.2 (2.6) 9-20
California
• Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by authors.
Gronwall's Administration Version

Table A8.2. [PASAT.1] Gronwall, 1971a (Gronwall's Administration Version): Data for Two Testing Probes Rounded to the Nearest Whole Number for a Sample of 60 New Zealander Controls•

                    PASAT Trials
Testing Occasion    2.4 sec    2.0 sec    1.6 sec    1.2 sec
Test                46 (6)     40 (7)     32 (8)     22 (5)
Retest              50 (5)     45 (5)     39 (6)     31 (4)

•Age range for the sample is 14-55. The majority of participants are 17-25 years of age.
Table A8.3. [PASAT.2] Stuss et al., 1987 (Gronwall's Administration Version): Data for a Sample of Canadian Volunteers Stratified into Six Age Groups

                                                                   PASAT Trials
Age Group   Mean Age       n    M/F Ratio   Education      2.4 sec       2.0 sec       1.6 sec       1.2 sec
1st visit
16-19       17.3 (0.915)   10   5/5         12.3 (0.95)    45.7 (12.3)   42.8 (15.2)   35.3 (14.4)   24.0 (9.6)
20-29       23.0 (2.67)    10   6/4         16.2 (1.39)    51.3 (6.2)    45.7 (8.8)    41.3 (12.0)   33.6 (9.2)
30-39       33.9 (2.88)    10   5/5         16.7 (3.86)    44.8 (8.1)    42.7 (10.4)   35.6 (11.7)   27.6 (8.1)
40-49       44.2 (3.12)    10   6/4         15.5 (2.88)    49.1 (7.3)    45.0 (6.5)    30.6 (13.8)   23.0 (15.0)
50-59       55.3 (2.98)    10   6/4         11.7 (2.41)    35.3 (15.6)   25.5 (16.4)   17.6 (15.1)   8.2 (11.5)
60-69       63.7 (3.13)    10   5/5         14.3 (2.00)    47.7 (8.8)    41.7 (9.6)    40.3 (10.7)   30.7 (8.7)
2nd visit
16-19                                                      52.0 (10.2)   47.4 (11.6)   40.6 (13.2)   28.7 (11.7)
20-29                                                      56.8 (4.6)    55.3 (5.0)    48.8 (6.0)    35.0 (7.9)
30-39                                                      51.9 (8.1)    48.7 (10.3)   43.2 (13.7)   32.0 (11.8)
40-49                                                      55.1 (5.2)    50.6 (7.9)    43.0 (9.2)    30.9 (9.7)
50-59                                                      41.1 (15.8)   32.6 (19.8)   24.4 (16.6)   14.5 (12.4)
60-69                                                      54.0 (5.1)    51.9 (6.4)    45.7 (8.3)    37.2 (7.2)
Table A8.4. [PASAT.3] Stuss et al., 1988 (Gronwall's Administration Version): Data for a Sample of Canadian Volunteers Stratified into Three Age Groups

                                                                   PASAT Trials
Age Group   Mean Age       n    M/F Ratio   Education     2.4 sec         2.0 sec         1.6 sec         1.2 sec
1st visit
16-29       22.43 (2.67)   30   16/14       14.1 (1.34)   47.40 (10.12)   42.00 (12.50)   35.97 (12.97)   27.40 (9.86)
30-49       40.63 (2.97)   30   14/16       14.9 (3.95)   43.43 (10.16)   41.87 (10.16)   33.10 (12.20)   24.63 (10.55)
50-69       61.77 (3.00)   30   14/16       13.2 (2.38)   43.47 (13.60)   35.60 (14.58)   30.83 (15.85)   21.20 (14.44)
2nd visit
16-29                                                     53.73 (7.30)    50.23 (9.17)    43.37 (11.02)   31.20 (10.24)
30-49                                                     52.57 (7.89)    49.23 (8.67)    41.93 (10.56)   31.60 (10.12)
50-69                                                     49.20 (11.40)   45.00 (15.30)   37.10 (15.18)   27.27 (13.50)
Table A8.5. [PASAT.4] Rao et al., 1989 (Gronwall's Administration Version): Data for a Normal Control Sample

                                                                            PASAT Trials
n    M/F Ratio   Age          Education    WAIS-R Verbal IQ   MMSE•        3 sec        2 sec
40   10/30       42.8 (8.1)   14.0 (2.3)   106.5 (5.8)        29.8 (0.5)   48.5 (9.6)   37.1 (10.1)

•Mini-Mental State Exam (Folstein et al., 1975).
Table A8.6. [PASAT.5] Stuss et al., 1989 (Gronwall's Administration Version): Data for Two Testing Probes for Canadian Normal Controls

n    M/F Ratio   Mean Age      Mean Education
26   20/6        29.7 (12.4)   13.2 (3.0)

PASAT Trials   Visit 1       Visit 2• (Retest)
2.4 sec        47.1 (10.9)   51.3 (11.0)
2.0 sec        41.8 (11.7)   47.2 (11.9)
1.6 sec        36.1 (11.8)   40.2 (12.3)
1.2 sec        26.3 (10.2)   29.3 (12.0)

•Test-retest interval was approximately 1 week.
Table A8.7. [PASAT.6] Rao et al., 1991a (Gronwall's Administration Version): Data for a Normal Control Sample

                                                                 PASAT Trials
n     M/F Ratio   Age           Education    Premorbid IQ•   3 sec        2 sec
100   25/75       46.0 (11.6)   13.3 (2.0)   107.2 (11.2)    48.5 (9.6)   37.1 (10.1)

•Barona et al. (1984).
Table A8.8. [PASAT.7] Strauss et al., 1994 (Gronwall's Administration Version): Data for a Control Sample of Canadian Undergraduate Students

                                             PASAT Trials
n    M/F Ratio   Age           Education      2.0 sec      1.6 sec
10   4/6         23.7 (2.58)   15.21 (0.79)   43.6 (9.1)   35.1 (9.1)
Table A8.9. [PASAT.8] Zalewski et al., 1994 (Gronwall's Administration Version): Total Score for Trials 2.4 and 1.2 for a Control Sample of Veterans•

n     Age    Education   Total PASAT Score
241   38.0   13.6        111.16 (52.4)

•Ethnic composition: White (n = 189), Black (n = 35), Hispanic (n = 11), Other (n = 6).
Table A8.10. [PASAT.9] Crawford et al., 1998b (Gronwall's Administration Version): Data for a Sample of Healthy Volunteers from United Kingdom Stratified into Three Age Groups

Age Group      n     M/F Ratio   Age             Education      Total PASAT Score for 4 Trials   WAIS-R FSIQ     NART• Errors
Total sample   152   77/75       40.21 (13.89)   12.97 (2.86)   151.6 (40.32)                    105.0 (14.08)   18.0 (9.01)
16-29          38                25.00 (3.27)                   169.2 (30.12)
30-49          78                38.10 (5.67)                   149.8 (40.29)
50-74          36                60.70 (7.41)                   136.9 (43.79)

•National Adult Reading Test (Nelson, 1982).
Table A8.11. [PASAT.10] Prevey et al., 1998 (Gronwall's Administration Version): Data for Two Trials for the Control Sample

                                PASAT Trials
n    Age           Education    2.4 sec       2.0 sec
45   44.4 (11.4)   12.8 (1.9)   31.0 (11.4)   30.0 (10.8)
Table A8.12. [PASAT.11] Holdwick and Wingenfeld, 1999 (Gronwall's Administration Version, Computerized): Data for a Sample of College Students•

                               PASAT Trials
n    % Female†   Age           2.4 sec        2.0 sec        1.6 sec        1.2 sec
20   65          19.8 (3.85)   45.60 (7.68)   44.60 (8.56)   40.20 (7.32)   33.25 (7.74)

•Ethnic distribution: 82.5% Caucasian, 10% African American, 3.8% Hispanic, 3.8% other. †Value represents information for the entire sample, not just for the control group.
Table A8.13. [PASAT.12] Honn et al., 1999 (Gronwall's Administration Version): Data for Three Trials for an HIV-Negative Male Control Sample Subdivided into Exerciser and Non-Exerciser Groups

                                                               PASAT Trials
Group            n    Age          Education    WAIS-R IQ      2.4 sec       2.0 sec       1.6 sec
Exercisers       38   31.1 (8.7)   15.0 (2.2)   107.7 (11.1)   42.3 (12.5)   40.3 (10.4)   32.5 (9.4)
Non-exercisers   38   32.9 (9.9)   14.3 (2.3)   105.9 (14.2)   39.8 (11.7)   33.6 (10.9)   30.2 (8.5)
Table A8.14. [PASAT.13] Wingenfeld et al., 1999 (Gronwall's Administration Version, Computerized): Data for a Sample of College Students Stratified by Gender and Two Age Groups

               Gender                            Age Group
PASAT Trials   Male (n=80)    Female (n=88)     17-29 (n=156)   30-48 (n=12)
2.4 sec        48.37 (8.78)   45.46 (10.10)     47.30 (9.17)    41.00 (12.88)
2.0 sec        44.33 (9.30)   41.01 (10.53)     42.83 (9.86)    39.42 (12.50)
1.6 sec        38.25 (9.58)   36.09 (10.19)     37.42 (9.47)    33.25 (14.72)
1.2 sec        30.47 (9.73)   27.66 (10.02)     29.29 (9.69)    25.25 (12.86)
Table A8.15. [PASAT.14] Bate et al., 2001 (Gronwall's Administration Version): Data for an Australian Control Sample

                                                                PASAT Trials
n    M/F Ratio   Age           Education    Verbal IQ (NART-R)•   2.4 sec       2.0 sec       1.6 sec       1.2 sec
35   20/15       30.2 (10.3)   12.6 (2.0)   101.1 (9.1)           43.1 (12.0)   40.8 (10.6)   36.7 (10.1)   29.5 (6.0)

•National Adult Reading Test-Revised (Crawford, 1992).
Table A8.16. [PASAT.15] Boringa et al., 2001 (Gronwall's Administration Version): Data for Two Trials Collected in Amsterdam on a Sample of Healthy Volunteers

                                       PASAT Trials
n     M/F Ratio   Age                  3.0 sec       2.0 sec
140   62/78       45.8 (range 22-73)   48.7 (10.7)   38.2 (11.0)
Table A8.17. [PASAT.16] Fluck et al., 2001 (Gronwall's Administration Version): Data for Healthy Volunteers Collected in London

               Young                        Middle-Aged
               Male          Female         Male          Female
n              15            15             15            15
Age            21.1 (0.4)    20.9 (0.2)     57.5 (1.3)    60.3 (0.7)
WAIS-R FSIQ    113.0 (1.5)   112.4 (1.7)    117.7 (1.8)   113.3 (2.2)
PASAT Trials
2.4 sec        55.4 (1.3)    53.6 (1.8)     48.4 (2.5)    41.6 (3.4)
2.0 sec        45.6 (2.3)    46.9 (2.1)     38.9 (2.3)    32.5 (3.2)
1.6 sec        38.7 (2.4)    38.2 (2.8)     30.1 (2.7)    24.4 (2.4)
1.2 sec        28.6 (2.7)    31.9 (2.7)     22.5 (2.1)    17.5 (1.9)
Table A8.18. [PASAT.17] Snyder et al., 2001 (Gronwall's Administration Version): Data for the Canadian Control Group

n    M/F Ratio   Age•            Education•   Average† PASAT   Dyad‡ PASAT
35   9/26        37.97 (12.94)   14.1 (2.3)   35.53 (9.85)     11.66 (6.17)

•Data obtained from initial study (Fisk & Archibald, 2001). †Traditional scoring method in which the total correct responses for each trial are summed and divided by 4. ‡Based on the dyad scoring method in which pairs of correct responses were counted; these scores were then summed and divided by 4.
Levin's Administration Version

Table A8.19. [PASAT.18a] Brittain et al., 1991 (Levin's Administration Version): Demographic Characteristics of the Sample of Healthy Participants

Age Group   Mean Age     n     M/F Ratio   Race White/Other   Education    Shipley IQ
<25         21.0 (2.1)   145   55/90       79/66              13.0 (1.3)   105.0 (9.1)
25-39       31.4 (4.1)   164   67/97       114/50             14.0 (2.2)   103.0 (10.4)
40-54       46.2 (4.3)   95    50/45       79/16              13.0 (3.1)   101.0 (12.6)
>55         67.0 (7.1)   122   82/82•      119/3              12.0 (2.5)   106.0 (15.1)

•The authors report the gender ratio as 82/82, but this is likely a misprint given that n = 122.
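The consistency check behind the footnote to Table A8.19 (the reported male and female counts should sum to the group n) can be sketched as follows. The figures are those printed in the table; the variable and function names are illustrative only.

```python
# (males, females, n) for each age group, as printed in Table A8.19.
groups = {
    "<25": (55, 90, 145),
    "25-39": (67, 97, 164),
    "40-54": (50, 45, 95),
    ">55": (82, 82, 122),
}


def inconsistent_groups(data):
    """Return the labels of groups whose M/F counts do not add up to n."""
    return [label for label, (m, f, n) in data.items() if m + f != n]


flagged = inconsistent_groups(groups)
print(flagged)
```

Only the oldest group is flagged, since 82 + 82 = 164 rather than 122; the other three groups sum to their reported n.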
Table A8.20. [PASAT.18b] Brittain et al., 1991 (Levin's Administration Version): Data• for a Healthy Sample Stratified by Four Age and Three IQ Levels
Age Group
PASAT Trial
<25
2.4 sec 2.0 sec 1.6 sec . 1.2 sec
Total trials
25-39
2.4 sec 2.0 sec 1.6 sec 1.2 sec
Total trials
'
<94
90-109
>109
(n='h 15."* (9.4f) 18.7 (9.1 ) 'JJJ.7 (8.7 ) 29. (7.2) 83. (29. )
(n=89) 10.79 (10.79) 14.84 (8.66) 19.10 (8.43) 24.22 (7.77) 68.88 (29.68)
(n=49) 16.63 (7.24) 11.06 (8.81) 12.59 (8.28) 18.59 (7.52) 49.08 (28.23)
(n=1) 18. (11.
(n=95) 9.78 (8.16) 14.98 (8.28) 19.26 (8.47) 25.64 (7.14) 69.65 (28.45)
(n=S4) 6.37 (6.08) 10.74 (6.81) 13.31 (7.66) 'JJJ.61 (7.18) 51.26 (24.13)
(n=47) 10.79 (9.35) 15.15 (8.48) 18.36 (9.21) 25.34 (6.68) 69.43 (30.08)
(n=31) 5.65 (4.53) 9.97 (7.18) 14.84 (7.48) 21.61 (7.13) 51.90 (23.26)
(n=S4) 21.78 (10.45) 23.61 (7.86) 25.04 (6.08) 29.26 (6.42) 99.76 (27.01)
(n=SO) 14.24 (10.36) 16.54 (9.32) 'JJJ.90 (8.50) 25.12 (7.22) 76.82 (31.59)
'JJJ.4 (10. 24.8 (8. 28.6 (5.4
92.~
(31.9 2.4 sec
(n = 1-;t 26.351 (9.8~
2.0 sec 1.6 sec 1.2 sec
Total trials
>54
2.4 sec 2.0 sec
28.06: (8.57) 29.82 (6.64) 32.76j (6.90) 117.00: (28.39) (n=18) 28.83: (6.84) 28.39: (6.70~
1.6 sec 1.2 sec
28.50: (4.87i 32.441 (4.76~
Total trials
118.06' ('JJJ.37)~
•Mean error scores.
Shipley IQ
Table A8.21. [PASAT.19] Roman et al., 1991 (Levin's Administration Version): Raw Scores and Percent Correct for Three Adult Age Groups

                        Young Adult      Middle-Aged      Older Adult
                        (18-27 years)    (33-50 years)    (60-75 years)
n                       62               40               41
% female                58               50               51
Age                     19 (1.60)        40 (5.1)         69 (4.1)
Education               12 (0.77)        15 (2.6)         15 (3.2)
Estimated IQ•           110 (12.30)      110 (12.3)       107 (11.0)
PASAT raw score
2.4 sec                 45 (4.3)         44 (4.9)         37 (9.1)
2.0 sec                 39 (7.8)         38 (7.6)         31 (9.2)
1.6 sec                 36 (7.7)         33 (8.9)         27 (8.5)
1.2 sec                 28 (6.7)         28 (9.3)         20 (6.1)
Total (4 trials)        148 (23.5)       144 (27.0)       115 (29.9)
PASAT percent correct
2.4 sec                 91.68 (8.9)      90.70 (10.3)     75.95 (18.6)
2.0 sec                 79.53 (16.2)     77.50 (15.3)     63.04 (18.9)
1.6 sec                 72.79 (15.9)     68.22 (18.2)     54.95 (17.4)
1.2 sec                 57.92 (13.8)     57.22 (18.9)     41.63 (12.2)
Total (4 trials)        75.00 (12.0)     73.00 (14.0)     59.00 (15.3)

•Prorated IQ using the Vocabulary and Block Design subtests of the WAIS-R.
Table A8.22. [PASAT.20] Cicerone, 1997 (Levin's Administration Version): Data for a Control Sample

n    Age           Education    Total PASAT Score for 4 Trials
40   33.3 (12.4)   14.9 (2.2)   144.0 (27.0)
Table A8.23. [PASAT.21a] Wiens et al., 1997 (Levin's Administration Version): Data for a Sample of Healthy Participants Stratified by Gender Male n
Age Group WAIS-R FSIQ
Female
672
149
PASAT Trials
29.2 (5.6)
Education
14.6 (1.5)
14.5 (1.6)
106.6 (11.0)
105.4 (11.1)
43.2 (5.6)
43.5 (5.8)
110-119 2.4 sec
39.3 (7.3)
39.1 (6.8)
2.0 sec
1.6 sec
35.1 (8.1)
34.9 (8.1)
1.6 sec
1.2 sec
29.2 (7.9)
27.8 (7.3)
1.2 sec
146.8 (25.2)
145.2 (24.5)
PASAT trials
2.0 sec
Total trials
Total correct
120-129 2.4 sec 2.0 sec
Table A8.24. [PASAT.21b] Wiens et al., 1997 (Levin's Administration Version): Data for a Sample of Healthy Participants Stratified by Age and IQ
1.6 sec 1.2 sec
Age Group WAIS-R FSIQ
80-89
PASAT Trials
20-29
30-39
(n=27) 42.1 (5.6) 35.7 (6.1) 28.2 (9.1) 24.3 (7.4) 130.3 (22.4)
(n=10) 40.9 (4.8) 35.1 (6.8) 31.1 (8.5) 24.9 (7.0) 132.0 (24.0)
(n= 116) 42.0 (5.5) 2.0 sec 37.6 (6.9) 32.5 1.6 sec (7.5) 1.2 sec 26.1 (7.0) Total correct 138.2 (22.9)
(n=72) 40.8 (6.0) 36.1 (8.2) 31.6 (9.4) 24.6 (7.9) 133.1 (27.9)
2.4 sec 2.0 sec 1.6 sec 1.2 sec Total correct
90-99
2.4 sec
Total correct
40-49 ≥130
2.4 sec 2.0 sec 1.6 sec 1.2 sec Total correct
(n=7) 33.7 (16.3) 32.7 (11.8) 27.1 (10.5) 19.0 (6.5) 112.6 (39.9)
40-49
(n=94) 41.8 (6.5) 37.8 (7.1) 34.1 (8.0) 28.5 (7.8) 142.2 (25.0)
(n =11) 38.4 (8.8) 36.4 (10.6) 33.7 (10.8) 26.4 (10.9) 135.1 (38.7)
(n=95) 44.7 (4.4) 40.8 (6.8) 37.6 (7.2) 31.3 (7.3) 154.5 (22.8)
(n =72) 43.9 (4.8) 39.4 (7.5) 36.4 (7.0) 30.0 (7.4) 149.7 (23.5)
(n = 11) 41.9 (7.2) 39.6 (7.0) 35.0 (7.5) 27.9 (8.2) 144.4 (27.2)
(n =33) 46.9 (2.5) 43.7 (4.4) 40.3 (6.4) 35.1 (7.2) 166.0 (18.4)
(n=44) 44.8 (5.4) 40.7 (6.7) (7.0) 31.7 (7.4) 155.6 (22.4)
(n = 12) 47.2 (2.3) 43.7 (3.6) 39.7 (7.6) 32.1 (7.6) 162.7 (17.0)
(n= 12) 48.0 (1.4) 46.5 (2.2) 42.2 (6.3) 35.2 (5.3) 171.8 (13.4)
(n=7) 46.0 (3.9) 39.7 (6.8) 38.1 (6.8) 30.7 (6.8) 154.6 (22.8)
(n=4) 45.0 (3.7) 43.8 (6.6) 35.0 (5.2) 28.8 (4.1) 152.5 (16.1)
(n= 192) 44.2 (4.4) 2.0 sec 40.4 (6.3) 1.6 sec 35.6 (7.1) 1.2 sec 29.9 (6.8) Total correct 150.1 (20.5)
29.2 (6.1)
2.4 sec
30-39
100-109 2.4 sec
Age
WAIS-R FSIQ
20-29
38.5
Table A8.25. [PASAT.22] Tiersky et al., 1998 (Levin's Administration Version): Data for an All-Female Control Sample

n    Age          Education      Total PASAT Score
20   37.1 (2.4)   15.00 (0.55)   143.40 (5.08)
Table A8.26. [PASAT.23] Stein et al., 2002 (Levin's Administration Version): Data for an All-Female Control Sample

n    Age           Education    Total PASAT• Score
22   29.4 (10.7)   13.9 (1.5)   124.4 (29.8)

•Mean and SD based on n = 20.
Table A8.27. [PASAT.24] Diamond et al., 1997 (Levin's Administration Version): Data for a Control Sample

                                                        PASAT
n    Age          Education    Verbal IQ (NAART)•   2.4 sec      2.0 sec      1.6 sec      1.2 sec      Total
22   40.9 (8.9)   15.4 (2.2)   113.6 (13.0)         42.0 (7.3)   38.0 (8.5)   33.0 (8.0)   27.0 (7.9)   140.0 (29.0)

•North American Adult Reading Test (Blair & Spreen, 1989).
PASAT-50, PASAT-100, and PASAT-200 Administration Versions

Table A8.28. [PASAT.25a] Diehr et al., 1998 (PASAT-200 Administration Version): Data for a Control Sample Stratified by Ethnicity, Education, and Age Groupings

                                      Age Group
                   n     Education    20-34                  35-49                  50-68
Ethnicity•
African American   172   13.2         111.5 (28.3) (n=77)    106.4 (29.7) (n=60)    92.8 (26.0) (n=35)
Caucasian          132   13.3         135.1 (26.0) (n=51)    123.5 (27.7) (n=47)    109.9 (30.1) (n=34)
Education level
9-11 years         70                 99.3 (27.2) (n=24)     97.9 (33.6) (n=26)     84.3 (31.7) (n=20)
12-15 years        304                120.9 (29.7) (n=128)   113.9 (29.9) (n=107)   101.2 (29.2) (n=69)
16-20 years        192                129.4 (27.8) (n=64)    128.9 (35.2) (n=80)    120.8 (31.2) (n=48)

•A smaller portion of the sample who had 12-15 years of education was reported.
Table A8.29. [PASAT.26] Diehr et al., 2003 (PASAT-50, PASAT-100, and PASAT-200 Administration Versions): Data for the Control Sample

n     Age           Education    % Caucasian/African American   PASAT-50 (3 sec)   PASAT-100 (3 and 2.4 sec)   PASAT-200 (3, 2.4, 2.0, and 1.6 sec)
560   39.7 (12.1)   14.2 (2.6)   45/55                          37.4 (9.6)         68.4 (18.0)                 115.9 (32.6)
Appendix 9: Locator and Data Tables for the Cancellation Tests
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of the studies in the text of Chapter 9. Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A9.1. Locator Table for the Cancellation Tests Study
Age•
n
Sample Composition
IQ/Education•
Location
RUFF.2&7.1 Ruff & Allen, 1996 page 164 Data are not reproduced in this book
16-24 25-39 40-45 55-70
360
180M, 180 F; no exclusion criteria provided
<12 13-15 16
California, Michigan
RUFF.2&7.2 Ruff
16-24 25-39 40-45 55-70
68 83 60 48
107 M, 152 F; no exclusion criteria provided
≤12
13-15 ≥16
California, Michigan
31.2 (4.1)
60
Normal controls; no neurological or psychiatric condition
30.2 (10.3)
35
et al., 1986a page 165 Table A9.2 RUFF.2&7.3 Ruff
et al., 1992 page 165 Table A9.3 RUFF.2&7.4 Bate et al., 2001 page 166 Table A9.4
20M, 15 F; participants were native English speakers; no neurological or psychiatric conditions
12.9 (1.5)
California, Michigan, New York
12.6 (2.0)
Australia
(continued)
Table A9.1. (Contd.) Study DVT.1 Heaton et al., 1991 &
2004 page 166 Data are not reproduced in this book
Age•
~ Sample Composition
n
20-34 35-39 40-44 45--49
200 bvr data in 1~1 manual; !ovrdatain manual; ly half were
50-54
~can-American ~halfwere
55-59
60-64 65-69
IQ/Education• 7-8
9-11 12 13-15 16-17 18-20
California, Washington DC, Colorado, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York,
Virginia, Manitoba (Canada)
ucasian in 2004 sakple; no neurological or
70-74 75-79
Location
$hiatric conditions; are presented
;by. . . .
80-85
scores
cation, and for African~erican and
a
£ely
DVT.2 Prigatano et al., 1983 page 167 Table A9.5
59.6 (9.0)
25
H
adult volunteers screened
w.,.-e ~illnesses that
10.5 (3.3)
Winnipeg, Canada; Oklahoma
10.2 (3.6)
USA
t interfere
~their neuropsychological
g. COPD, medications
i
eart or lung disease,
and diabetes DVT.3 Grant et al., 1987 page 168 Table A9.6 DVT.4 Kelland & Lewis, 1994 page 168
63.1 (10.3)
99
75 p
20.0 (2.8)
20
10 ~. 10 F; college students; nti neurological or psf<:hiatric conditions
24 F; no neurological or hiatric conditions
13.1 (1.3)
0
Table A9.7 DVT.5 Barncord & Wanlass, 1999 page 169 Table A9.8
19.80 (3.95)
10
DVT.6 Smith et al., 2001 page 169 Table A9.9
65.0 (4.0) 67.0 (6.0)
16
DVT.7 Stein et al., 2002 page 169 Table A9.10
29.4 (10.7)
22
""£' """"" """"""' n exclusion criteria "ded
12.80 (0.63)
PI
13
I
All-f-tmale sample;
~tmenopausal; nor neurological or p hiatric ditions; data rted for enopausal en on HRT and not on HRT
15.0 (2.0) 16.0 (3.0)
Michigan
13.9 (1.5)
California
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by authors.
Table A9.2. [RUFF 2&7.2] Ruff et al., 1986a: Data for a Sample of Healthy Volunteers Stratified into Three Age and Three Education Groups

                        Education (years)
                        ≤12                               13-15                             ≥16
Age Group x Condition   n    M/F Ratio   Score            n    M/F Ratio   Score            n    M/F Ratio   Score
Automatic Detection•
16-24                   32   16/16       148.9 (29.8)     26   13/13       164.1 (28.9)     10   5/5         165.1 (29.4)
25-39                   26   10/16       142.8 (27.1)     30   11/19       156.3 (23.5)     27   12/15       153.5 (40.2)
40-45                   18   5/13        131.7 (26.9)     17   7/10        132.3 (29.6)     25   11/14       152.2 (31.9)
55-70                   14   3/11        124.7 (21.8)     14   3/11        131.3 (27.5)     20   11/9        137.1 (24.2)
Controlled Search•
16-24                   32   16/16       134.2 (24.6)     26   13/13       142.0 (24.9)     10   5/5         143.7 (20.6)
25-39                   26   10/16       126.4 (24.9)     30   11/19       140.3 (20.5)     27   12/15       138.1 (22.1)
40-45                   18   5/13        119.2 (19.5)     17   7/10        122.4 (22.1)     25   11/14       131.9 (21.3)
55-70                   14   3/11        113.7 (18.4)     14   3/11        122.0 (21.7)     20   11/9        123.0 (19.2)

•Capital letters are used as distractors in the Automatic Detection condition and digits other than 2 and 7 are used as distractors in the Controlled Search condition.
Table A9.3. [RUFF 2&7.3] Ruff et al., 1992: Speed and Accuracy Data for a Sample of Healthy Volunteers

n    Age          Education    Speed•         Accuracy†
60   31.2 (4.1)   12.9 (1.5)   284.4 (47.2)   94.4 (4.7)

•Speed: sum of hits for both conditions (i.e., digit-digit and digit-letter conditions). †Accuracy: (total hits for both conditions - total errors for both conditions)/(total hits for both conditions).
Table A9.4. [RUFF 2&7.4] Bate et al., 2001: Data for a Sample of Australian Volunteers

n    M/F Ratio   Age           Education    Verbal IQ (NART-R)•   Digits Correctly Cancelled
35   20/15       30.2 (10.3)   12.6 (2.0)   101.1 (9.1)           262.5 (44.7)

•National Adult Reading Test-Revised (Crawford, 1992).
Table A9.5. [DVT.3] Prigatano et al., 1983: Data for a Sample of Healthy Volunteers

n    Age          Education    DVT• Total Time
25   59.6 (9.0)   10.5 (3.3)   390.9 (100.0)

•DVT data reported for 21 participants.

Table A9.6. [DVT.4] Grant et al., 1987: Data for a Sample of Nonpatient Control Volunteers

n    M/F Ratio   Age           Education    DVT Total Time
99   75/24       63.1 (10.3)   10.2 (3.6)   424.6 (100.6)
Table A9.7. [DVT.5] Kelland and Lewis, 1994: Data for a Sample of College Students Alternate Form
Standard Form n
MIF Ratio
Age
20
10/10
20.0 (2.8)
13.1 (1.3)
Table A9.8. [DVT.6] Barncord and Wanlass, 1999: Data for a Sample of College Students

n    M/F Ratio   Age            Education      DVT Total Items
10   5/5         19.80 (3.95)   12.80 (0.63)   360.20 (23.22)
Table A9.9. [DVT.7] Smith et al., 2001: Data for Postmenopausal Females on HRT and Not on HRT•

Groups   n    Age          Education    DVT Total Errors
HRT      16   65.0 (4.0)   15.0 (2.0)   4.63 (4.1)
No HRT   13   67.0 (6.0)   16.0 (3.0)   12.58 (13.1)

•HRT, hormone replacement therapy.
Week 1
Week2
Week 1
Week2
314.62 (57.20)
289.00 (49.10)
309.79 (52.30)
284.42 (43.30)
Table A9.10. [DVT.8] Stein et al., 2002: Data for an All-Female Sample

n    Age           Education    DVT Total Time
22   29.4 (10.7)   13.9 (1.5)   350.4 (80.2)
Appendix 10: Locator and Data Tables for the Boston Naming Test (BNT)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of the studies in the text of Chapter 10. Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A10.1. Locator Table for the Boston Naming Test (BNT) Study BNT.1 Van Gorp et al., 1986 page 183 Table A10.2
Age•
n
59-95
78
59-64
12 20
65-69
70-74 75-79
≥80
Location
Normal, independently living elderly (29M, 49F)
FSIQ: 122
Los Angeles, CA
Neurologically intact males, data are presented for 5 age decades
8-22 14.6 (2.2)
Control group (52M, 58F), rigorous exclusion criteria
High IQ
24 13 9
BNT.2 Farmer, 1990 20-69 page 183 43.9 (14.3) Table A10.3 20-29
125
30-39 40-49 50-59 60-69
25
BNT.3 Boone
63.06
110
et al., 1995
(9.19)
page 184 Table A10.4
IQ/ Education•
Sample Composition
California
25 25 25 25 Los Angeles, CA
14.84 (2.61) (continued)
Table A10.1. (Contd.) Study
Age•
n
BNT.4 Mitrushina & Satz, 1995 page 184 Tables A10.5, A10.6
57-85
122
57-65 71-75 76-85
19 40 47 16
65-97
323
BNT.5 Neils et al., 1995 page 185 Table A10.7
BNT.6 Ross et al., 1995 page 185 Table A10.8
~70
65-74 75-84 85-97
70.43 (7.8)
136
55-59
7 29
Independently living, neurologically intact elderly (74.3% F); interrater reliability data are provided; error analysis was performed
65-69 70-74 75-79
IQ/ Education• 14.1 (2.7) FSIQ: 118.2 (13.0)
6-9 10-12 >12
11.3 (3.1)
Location Los Angeles, CA
Northern Kentucky, Cincinnati, OH
Michigan
Australia
35
≥85
31 14 13 7
76.2
20
80-84
BNT.9 Ivnik et al., 1996 page 187 Table A10.11
Neurologically intact volunteers (244 F, 79 M); 167 were living independently, 156 were institutionalized in extended-care facilities; data are presented in an age-by-education-by-living environment matrix
Geriatric medical inpatients from an urban rehab hospital with a variety of physical diagnoses, some of whom were 2-3 weeks post-orthopedic surgery
60-64
BNT.8 LaHeche & Albert, 1995 page 186 Table A10.10
Normal, independently living elderly (49M, 73F) Test-retest data over 3 annual probes are provided
123 40 40 43
70-74 75-79
≥80 BNT.7 Worrall et al., 1995 page 186 Table A10.9
Sample Composition
Control group (9M, 11F), nondemented elderly
14.7
Massachusetts
663
Normal elderly volunteers, tables for age correction and a regression equation for education correction; tables are not reproduced
Mayo FSIQ: 106.2 (14.0)
Minnesota
176
Neurologically intact volunteers (74M, 102F) representative of the regional population across most demographic parameters; data are presented for 5 age groups, age x education, and age x gender cells; suggested cutoffs are presented
12.28 (3-18)
Middle Tennessee
56-59 60-64
65-69 70-74 75-79 80-84
85-89
90-94
≥95 BNT.10 Welch et al., 1996 page 188 Tables A10.12-A10.14
74.0 60-64
65-69 70-74 75-79 80-93
Table A10.1.
(Contd.)
Study
Age•
n
BNT.11 Hoff et al., 1996 page 188 Table A10.15
32.1 (9.7)
54
BNT.12 Ponton et al., 1996 page 189 Table A10.16
38.4 (13.5)
300
BNT.13 Tombaugh & Hubley, 1997 page 189 Tables A10.17, A10.18
16-29 30-39 40-49 50-75 25-88 25-34
219 22
35-44 45-54
28 33
55-59
24
60-64
19 22 18
65-69 70-74 75-79
BNT.14 Henderson
BNT.15 Stuss
29
17-87
100
54.4 (14.4)
37
72.2 (7.0)
108
57-68 69-76 77-85
35 38 35
page 190 Table A10.20
BNT.16 Fastenau et al., 1998 page 191 Table A10.21
BNT.17 Ross & Lichtenberg, 1998 page 191 Table A10.22
76.1 (7.1)
IQ/ Education•
Location
Paid male volunteers, strict selection criteria
15.4 (2.4)
New York
Spanish-speaking healthy volunteers; MIF ratio 40/60%; data are partitioned by gender (2) x age (4) x education (2); 30-item Spanish version was administered
10.7 (5.1)
Los Angeles, CA
Community-dwelling volunteers (46% male) with no known history of neurological or psychiatric illness, head injury, or stroke
12.9 (2.3)
Canada
50 African-American and 50 Caucasian participants, with 25M, 25F in each group; rigorous exclusion criteria
10-17
South Carolina
Control group (19M, 18F) without neurological or psychiatric disorder
13.9 (2.3)
Canada
Nonneurological, community-dwelling adults (47% female); rigorous exclusion criteria
14.1 (2.4)
Indiana
Neurologically intact medical patients from an urban hospital (73% female), data are presented in age-by-education cells
11.1 (3.2)
South Carolina
<10 >10
24
80-88 et al., 1998 page 190 Table A10.19 et al., 1998
Sample Composition
233
65-69 70-74 75-79
<12 ~12
80-84 85-95
BNT.18 Kohnert et al., 1998 page 192 Table A10.23
20.82 (2.6)
100
Bilingual young adults of Mexican-American descent recruited from Univ. CA San Diego, Univ. CA Santa Barbara, and San Diego community, with Spanish as the primary language; test was administered in both Spanish and English
14.4 (1.7)
San Diego, CA
(continued)
Table A10.1. (Contd.) Study
BNT.19 Randolph et al., 1999 page 193 Table A10.24
Age•
n
73.6 (10.3)
719
BNT.20 Killgore & Adams, 1999 page 193 Table A10.25
45.7 (15.1)
62
BNT.21 Heaton et al., 1999 page 194 Table A10.26
BNT.22 Rosselli et al., 2000 page 194 Table A10.27
68.71 (5.47)
96
18
BNT.23 Schmitter-Edgecombe et al., 2000 page 194 Table A10.28
BNT.24 Saxton et al., 2000 page 195 Table A10.29
60-80
61.3 (8.1) 63.4 (10.1) 60.6 (9.7)
45 19
18-22
26
58-74
26
75-93
26
73.63 (4.45)
357
Sample Composition
Neurologically normal elderly (60% female), almost exclusively white; data are stratified by gender and broken down by age group based on overlapping midpoints technique; data are also presented by education
Patients referred for neuropsychological evaluation, which yielded no evidence of neurological impairment (28M, 34F)
IQ/Education• 13.4 (2.9)
Location
Minnesota
<12 12
>12 13.1 (2.7) FSIQ 95.1 (12.0)
Massachusetts, Oklahoma
Healthy community residents (51.8% male), rigorous exclusion criteria
13.5 (2.45)
San Diego, CA
Monolingual Spanish-speaking (4M, 14F), monolingual English-speaking (15M, 30F), and bilingual healthy participants (9M, 10F); data are reported for three language groups
Healthy students and community-dwelling individuals (7M, 19F in each age group)
13.3 (4.8) 16.6 (2.4) 14.5 (3.6)
Florida
Elderly volunteers who participated in the multicenter Memory and Aging Study (44.9% male)
13.23 (2.85)
Washington
13.38 (0.75) 17.58 (2.23) 16.65 (3.03) County, MD Pittsburgh, PA
BNT.25 Bell et al., 2001 page 195 Table A10.30
34.4 (12.5)
29
Sample included friends, relatives, and spouses of temporal lobe epilepsy patients (28% male)
13.0 (1.7)
Wisconsin
BNT.26 Roberts et al., 2002 page 196 Table A10.31
37.1 (8.99) 39.6 (6.59) 34.9 (8.14)
42
Monolingual English speakers
Florida, Canada
32
Spanish/English bilinguals
49
French/English bilinguals; data reported for 3 language groups
16.29 (3.05) 17.91 (3.38) 15.0 (2.93)
Table A10.1. (Contd.) Study
Age•
n
BNT.27 Coffey et al., 2001 page 196 Table A10.32
74.85 (4.95)
320
BNT.28 Giovannetti et al., 2003 page 197 Table A10.33
25.2 (6.07)
31
Sample Composition
IQ/ Education•
Neurologically healthy elderly from Cardiovascular Health Study (38% male); data are reported for males and females separately
12.98 (2.87)
Pittsburgh, PA Hagerstown, MD
21M, 10F healthy adults recruited from a medical center community
Education: 15.0 (1.48) FSIQ: 109.3 (11.51)
New York, Pennsylvania
Location
• Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors.
Table A10.2. [BNT.1] Van Gorp et al., 1986: Data for a Sample of Healthy Elderly Stratified into Five Age Groups
Age Group          59-64          65-69          70-74          75-79          80+
n                  12             20             24             13             9
Education          13.58 (2.37)   14.40 (2.38)   15.25 (3.30)   14.23 (3.58)   15.23 (4.59)
WAIS-R Verbal IQ   122            123            130            115            118
BNT score          56.75 (3.05)   55.60 (4.29)   54.46 (5.17)   51.69 (6.20)   51.56 (7.00)
Cutoff             51             47.12          44.12          39.29          37.56
Table A10.3. [BNT.2] Farmer, 1990: Data for Five Age Groups and the Entire Sample of Healthy Adult Males
Age Group   n     Age             Education      BNT Scores
20-29       25    24.08 (2.53)    14.88 (1.67)   56.04 (3.60)
30-39       25    33.92 (3.35)    15.04 (2.32)   57.04 (2.25)
40-49       23    44.60 (2.50)    15.12 (1.94)   57.76 (2.19)
50-59       25    53.56 (2.77)    14.40 (2.26)   58.24 (1.88)
60-69       25    63.68 (2.93)    13.64 (2.50)   58.28 (3.19)
All         125   43.97 (14.31)   14.62 (2.19)   57.47 (2.79)
Table A10.4. [BNT.3] Boone et al., 1995: Data for the Control Group
n     Age            Education      Men   Women   VIQ              BNT Scores
110   63.06 (9.19)   14.84 (2.61)   52    58      115.78 (14.18)   54.97 (6.30)
Table A10.5. [BNT.4a] Mitrushina and Satz, 1995: Demographic Characteristics for the Total Sample of Healthy Elderly and for Four Age Groups*
Age Groups    All            57-65          66-70          71-75          76-85
Education     14.1 (2.7)     14.4 (2.0)     13.7 (1.8)     14.5 (3.1)     14.0 (3.6)
Age           70.4 (5.0)     62.6 (2.5)     68.2 (1.2)     72.9 (1.4)     78.3 (2.5)
WAIS-R FSIQ   118.2 (13.0)   115.0 (12.1)   119.4 (15.2)   119.9 (11.3)   114.4 (12.3)
n             122            47             16             19             40
*The sample included 49 males and 73 females.
Table A10.6. [BNT.4b] Mitrushina and Satz, 1995: Boston Naming Test Scores for Four Age Groups and for the Total Sample of Healthy Elderly Over Three Testing Probes
Age Group   Time 1       Time 2       Time 3
57-65       56.0 (3.3)   56.2 (2.8)   56.0 (2.4)
66-70       56.1 (3.1)   56.0 (2.9)   56.1 (2.9)
71-75       53.7 (7.3)   54.6 (5.3)   54.2 (6.9)
76-85       51.2 (7.3)   51.1 (8.6)   51.4 (7.9)
Total       54.5 (5.9)   54.7 (6.2)   54.8 (5.7)
Table A10.7. [BNT.5] Neils et al., 1995: Data for the Elderly Sample Stratified Into Three Age Groups, Three Educational Levels, and Two Living Environment Settings: Noninstitutionalized/Institutionalized Educational Level
Age Group
6-9
10--12
12+
All
47.58 (6.14) n=12 42.79 (10.99) n=19 36.00 (12.46) n=17 41.58 (11.36)
53.00 (6.63) n=22 50.73 (5.72) n=22 45.53 (10.70) n=19 49.95 (8.29) n=63
53.10 (6.55) n=20 48.55 (7.96) n=20 49.88 (7.19) n=16 50.55 (7.40)
51.83 (6.77)
n=56
46.95 (8.78) n=l9 39.95 (10.05) n=19 38.11 (7.48) n=18 41.73 (9.51)
49.54 (6.42) n=13 48.30 (6.62) n=20 40.20 (7.62) n=15 46.10 (7.87)
n=56
n=48
Noninstitutionalized (n=167) 65-74
75-84
85-97
Total
n=48
n=54 47.54 (8.89) n=61 43.75 (11.72) n=52
Institutionalized
(n=156) 65-74
35.14 (6.77) n=14 36.90 (11.84) n=19 34.53 (9.78) n=19 35.56 (9.80) n=52
75-84
85-97
Total
44.09
(9.59)
n=46 41.82 (10.71)
n=58 37.40 (8.60) n=52
Table A10.8. [BNT.6] Ross et al., 1995: Data for a Sample of Geriatric Medical Inpatients From an Urban Rehabilitation Hospital with a Variety of Physical Diagnoses, Some of Whom Were 2-3 Weeks Post-Orthopedic Surgery
Age     n    Education    DRS* Scores   BNT Scores
70-74   40   11.3 (3.1)   133.2 (4.5)   43.1 (11.7)
75-79   40   10.6 (3.3)   133.4 (4.8)   40.1 (10.9)
≥80     43   10.2 (3.2)   131.4 (4.8)   35.8 (11.3)
*DRS, Depression Rating Scale.
Table A10.9. [BNT.7] Worrall et al., 1995: Data for a Sample of Healthy Older Australians
Age Groups   n    BNT Score      BNT Range   Recommended Cutoff
55-59        7    52.57 (3.10)   47-57       46.37
60-64        29   53.65 (5.60)   36-60       42.45
65-69        35   54.17 (4.47)   39-59       45.23
70-74        31   52.29 (6.38)   34-60       39.53
75-79        14   49.43 (8.01)   32-60       33.41
80-84        13   47.46 (7.54)   33-58       32.38
85+          7    47.14 (6.12)   39-57       34.90
Table A10.10. [BNT.8] Lafleche and Albert, 1995: Data for the Control Sample
n    Age    Education   M/F Ratio   BNT Scores
20   76.2   14.7        9/11        50.75 (7.40)
Table A10.11. [BNT.9] Ivnik et al., 1996: Demographic Description of the Sample Partitioned into Groups Used in Boston Naming Testing
Age groups   n
56-59        55
60-64        87
65-69        79
70-74        85
75-79        125
80-84        132
85-89        71
90-94        24
95+          5
Education    n
≤7           4
8-11         103
12           218
13-15        163
16-17        115
≥18          60
Gender       n
Males        263
Females      400
Race         n
Caucasians   662
Blacks       1
Handedness   n
Right        607
Left         27
Mixed        29
Total        663
Table A10.12. [BNT.10a] Welch et al., 1996: Boston Naming Test Scores for Five Age Groups* of Healthy Elderly
Age Group   n    BNT Score     Cutoff
60-64       20   51.6 (5.4)    45
65-69       37   53.4 (4.7)    45
70-74       40   50.1 (8.5)    42
75-79       28   42.9 (12.4)   35
80-93       51   44.7 (9.6)    35
*The sample included 74 males and 102 females with a mean education of 12.28 years (range 3-18).
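The recommended cutoffs in Table A10.9 (Worrall et al., 1995) are numerically consistent with the group mean minus two standard deviations. That rule is inferred here from the tabled values and is not stated in the table itself. A minimal Python sketch of the arithmetic follows; the function name `two_sd_cutoff` and the dictionary layout are illustrative assumptions, not part of the source.

```python
def two_sd_cutoff(mean: float, sd: float, k: float = 2.0) -> float:
    """Return mean - k * sd.

    k = 2 is an assumption inferred from the values in Table A10.9;
    the table does not state the rule explicitly.
    """
    return mean - k * sd


# (mean, SD) pairs copied from Table A10.9 (Worrall et al., 1995)
worrall_1995 = {
    "55-59": (52.57, 3.10),
    "60-64": (53.65, 5.60),
    "65-69": (54.17, 4.47),
    "70-74": (52.29, 6.38),
    "75-79": (49.43, 8.01),
    "80-84": (47.46, 7.54),
    "85+": (47.14, 6.12),
}

for age_group, (mean, sd) in worrall_1995.items():
    # Print each age group with its reconstructed cutoff, rounded to 2 decimals
    print(f"{age_group}: {two_sd_cutoff(mean, sd):.2f}")
```

The same rule does not reproduce the cutoffs in Table A10.12 (Welch et al., 1996), so it should not be assumed for other studies.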
Table A10.13. [BNT.10b] Welch et al., 1996: Boston Naming Test Scores for Five Age Groups by Two Educational Levels
            <12 years           ≥12 years
Age Group   n    BNT Score      n    BNT Score
60-64       6    49.8 (5.4)     14   52.4 (5.5)
65-69       5    49.2 (3.6)     32   54.0 (4.5)
70-74       7    38.4 (12.3)    33   52.6 (4.7)
75-79       17   36.6 (10.9)    11   53.4 (4.9)
80-93       20   40.7 (11.3)    31   47.2 (7.5)
Table A10.14. [BNT.10c] Welch et al., 1996: Boston Naming Test Scores for Five Age Groups for Males and Females Separately
            Males               Females
Age Group   n    M (SD)         n    M (SD)
60-64       10   54 (4.3)       10   49.2 (5.6)
65-69       22   54.5 (3.7)     15   51.7 (5.4)
70-74       11   52.9 (8.8)     29   49.0 (8.2)
75-79       11   46.7 (12.0)    17   40.4 (12.4)
80-93       31   45.7 (9.7)     20   43.9 (9.7)
Table A10.15. [BNT.11] Hoff et al., 1996: Data for the Control Sample
n    Age          Education    BNT Scores
54   32.1 (9.7)   15.4 (2.4)   56.3 (3.5)
Table A10.16. [BNT.12] Ponton et al., 1996: Boston Naming Test Scores* for a Sample of 300 Spanish-Speaking, Healthy Participants Stratified by Gender, Age, and Education
Age Group 16-29
30-39
50-75
40-49
Education (Years) <10
>10
>10
<10
>10
<10
>10
13 22!39 (2l29)
18 26.56 (2.68)
12 22.92 (3.68)
17 27.82 (1.94)
18 24.11 (2.14)
6 27.17 (2.56)
44
16 21.81 (2.29)
11 25.64
25
20 24.15 (2.85)
Males
n
11
25
(SD)
22.09 (3.42)
22.52 (3.28)
12 20.17 (2.29)
30
x Femala n
x
(SD)
21.77 (3.95)
22 21.59
24.57 (2.82)
(2~)
"Ponton-Satz 30-item Spanish version was ~tered. I
Table A10.17. [BNT.13a] Tombaugh and Hubley, 1997: Boston Naming Test Scores for a Sample of Healthy Adults Stratified by Age, Education, and Gender
            n     SR           SR+SC        SR+SC+PC*
Age
25-34       22    55.9 (2.8)   56.0 (2.9)   58.4 (2.5)
35-44       28    55.5 (3.9)   56.1 (3.6)   58.2 (2.2)
45-54       33    54.8 (4.1)   55.4 (3.6)   58.5 (1.8)
55-59       24    55.2 (3.6)   56.0 (3.1)   58.8 (1.3)
60-64       19    55.6 (3.5)   56.6 (2.9)   58.7 (1.7)
65-69       22    54.9 (3.9)   55.8 (3.5)   58.4 (1.7)
70-74       18    52.5 (4.6)   54.3 (4.4)   57.2 (2.7)
75-79       24    51.7 (5.5)   53.4 (4.6)   57.8 (1.9)
80-88       29    53.1 (4.0)   54.3 (3.8)   58.1 (1.6)
Education
9-12        123   53.4 (4.4)   54.5 (3.9)   57.8 (2.1)
13-21       96    55.6 (3.7)   56.3 (3.2)   58.8 (1.7)
(1.57)
20.52 (4.12)
Table A10.17. (Contd.)
            n     SR           SR+SC        SR+SC+PC*
Gender
Male        100   54.9 (4.3)   55.9 (3.7)   58.5 (1.8)
Female      119   53.9 (4.1)   54.8 (3.7)   58.0 (2.0)
Total       219   54.3 (4.2)   55.3 (3.7)   58.3 (2.0)
*SR = Spontaneous Response; SR+SC = Sum of Spontaneous Response (SR) + Stimulus Cue (SC); SR+SC+PC = Sum of Spontaneous Response (SR) + Stimulus Cue (SC) + Phonemic Cue (PC). Because scores on the Boston Naming Test are not normally distributed, standard deviations should not be used to compute normative data.
© Swets & Zeitlinger, 1997.
Table A10.18. [BNT.13b] Tombaugh and Hubley, 1997: Boston Naming Test Norms Expressed as Percentiles for Age and Education Age Group 25-69
70-88 Education (Years)
Percentiles
9-12 (n=78)
13-21 (n=70)
9-12 (n=45)
13-21 (n=26)
59 58 55 53 49
60 59 58 55 53
58 56 53 48 45
58 56 54 52 48
59 58 56 53 49
59 58 56 54 51
60 60 58 56 53
59 58 55 52 47
59 58 56 53 49
60 58 57 54 51
60 60 59 57 56
60
60
60
60
60
60 59 58
60 39 58 57 54
59 58 57
59 58 56
52.1 (12.2) 11.3 (0.9) 51.0 (9.0) 41%
47.5 (12.4) 15.1 (1.9) 57.2 (7.4) 41%
78.0 (4.8) 11.2 (1.0) 53.3 (9.1) 56%
78.0 (4.7) 14.9 (1.4) 58.6 (6.9) 54%
59.0 (16.9) 12.9 (2.3) 54.3 (8.8) 46%
Total (n=219)
Sll 90 75 50 25
10
Sll+SC 90 75 50 25
10
Sll+SC+PC 90 75 50 25
10
60
DemognJplaica Age Education Vocabulary• %Male
"Raw scores on the Wechsler Adult Intelligence Scale-Revised Vocabulary subtest. SR, spontaneous response; SC, stimulus cue; PC, phonemic cue.
© Swets & Zeitlinger, 1997.
Table A10.19. [BNT.14] Henderson et al., 1998: Demographic Information for the Race by Gender Subgroups of a Healthy Adult Sample and the Mean Boston Naming Test Scores for the Race, Gender, and Education Groups n
Age
Education
Males
25
Females
25
41.36 (17.95) 49.76 (19.74)
15.48 (2.21) 14.58 (1.88)
BNT
Table A10.21. [BNT.16] Fastenau et al., 1998: Data for Three Age Groups and the Whole Sample* of Healthy Elderly Age Group
n
57-68
Education
Female
35
64.3 (3.2)
14.8 (2.5)
49
56.8 (2.8)
69-76
38
72.2 (2.3)
14.1 (2.5)
47
54.0 (4.4)
77-85
35
80.3 (2.4)
13.5 (2.3)
46
53.8 (5.4)
108
72.2 (7.0)
14.1 (2.4)
47
54.8 (4.5)
Caucaaiaru
African Americaru Males
Females
Total sample 25 25
32.84 (13.88) 38.84 (12.49)
14.52 (1.87) 15.16 (1.95)
BNTt Scores
Mean Age
%
"The sample was predominantly Caucasian (95%). tBoston Naming Test administration procedure was not standard: the test items were reorganized.
Race
African American
50
Caucasian
50
Gender Female
50
Male
50
56.54 (5.40) 55.56 (4.34) 56.12 (4.89) 55.98
Table A10.22. [BNT.17] Ross and Lichtenberg, 1998: Boston Naming Test Scores Stratified by Age and Education for Neurologically Intact Medical Patients from an Urban Hospital Education
(4.95)
~12
<12
Education Noncollege
23
College
77
51.91 (6.37) 57.28
Age
n
BNT Score
n
BNT Score
65--69
16
41.1 (12.9)
18
48.7 (7.8)
70-74
27
38.1 (11.3)
39
45.8 (9.8)
75-79
26
37.0 (10.5)
36
44.9 (8.9)
80-84
22
36.4 (10.3)
15
43.4 (11.9)
85-95
17
29.6 (10.1)
17
41.7 (11.7)
(3.58)
Table A10.20. [BNT.15] Stuss et al., 1998: Data for the Control Sample
n    Age           Education    M/F Ratio   NART* IQ      BNT Score
37   54.4 (14.4)   13.9 (2.3)   19/18       113.8 (6.1)   55.5 (4.1)
*NART, National Adult Reading Test.
Table A10.23. [BNT.18] Kohnert et al., 1998: Data for the Spanish and English Versions of the Test*
                                           BNT Scores†
n     Age           Education    M/F Ratio   Spanish                      English
100   20.82 (2.6)   14.4 (1.7)   41/59       32.00 (8.83) 15.23-48.78     46.66 (6.64) 34.06-59.26
*All participants were bilingual, with Spanish as the primary home language, and a mean age of English acquisition of 4.6 (3.0) years.
†Boston Naming Test scores with SDs and 95% confidence intervals.
Table A10.24. [BNT.19] Randolph et al., 1999: Data for the Sample of 719 Participants Stratified by Gender and Broken Down into Age Groups Using Overlapping Midpoints Technique
            Males               Females
Age Range   n     BNT Score     n     BNT Score
50-60       53    56.4 (3.2)    45    55.0 (4.3)
53-63       67    55.9 (3.4)    80    55.3 (3.7)
56-66       85    55.5 (3.4)    98    55.2 (3.7)
59-69       93    55.5 (3.2)    109   54.5 (3.8)
62-72       90    55.1 (3.7)    111   54.3 (3.8)
65-75       99    54.9 (4.1)    112   53.8 (4.3)
68-78       99    54.4 (4.2)    127   52.6 (5.2)
71-81       95    53.5 (5.3)    156   51.6 (5.9)
74-84       100   52.3 (5.9)    173   50.6 (6.4)
77-87       87    51.3 (6.4)    177   49.5 (6.9)
80-90       65    50.9 (7.1)    157   48.8 (7.4)
83+         41    50.5 (6.7)    115   47.3 (8.2)
Education (years)   n     BNT Score
<12                 102   49.1 (7.0)
12                  235   52.2 (6.1)
>12                 382   53.4 (6.0)
Table A10.25. [BNT.20] Killgore and Adams, 1999: Data for a Sample of Patients Referred for a Neuropsychological Evaluation but Cleared of Neurological Impairment
n    Age           Education    WAIS-R FSIQ   M/F Ratio   BNT Score
62   45.7 (15.1)   13.1 (2.7)   95.1 (12.0)   28/34       52.8 (6.2)
Table A10.26. [BNT.21] Heaton et al., 1999: Data for a Healthy Elderly Sample
n    Age            Education      % Male   % Caucasian   BNT Score
96   68.71 (5.47)   13.50 (2.45)   51.8     82.4          54.01 (4.43)
Table A10.27. [BNT.22] Rosselli et al., 2000: Data for Monolingual Spanish-Speaking, Monolingual English-Speaking, and Bilingual Participants BNT Scores
Language Group
n
Age
Education
M/F Ratio
Spanish
Spanish
18
61.3 (8.1)
13.3 (4.8)
4/14
51.1 (4.1)
English
45
63.4 (10.1)
16.6 (2.4)
15/30
Bilingual
19
60.6 (9.7)
14.5 (3.6)
9/10
English
54.9 (4.8) 52.9 (6.1)
52.4 (7.1)
Table A10.28. [BNT.23] Schmitter-Edgecombe et al., 2000: Data for a Sample of Healthy Adults Stratified into Three Age Groups
Age Group   n    Age     Education      M/F Ratio   FSIQ*           BNT Scores
18-22       26   18.93   13.38 (0.75)   7/19        113.15 (7.88)   53.54 (3.39)
58-74       26   66.29   17.58 (2.28)   7/19        114.58 (9.60)   57.50 (2.03)
75-93       26   79.19   16.65 (3.03)   7/19        115.04 (9.88)   55.35 (3.53)
*Estimated Wechsler Adult Intelligence Scale-Revised full-scale IQ based on four subtests.
Table A10.29. [BNT.24] Saxton et al., 2000: Data for a Sample of Elderly Free of Cardiovascular Disease
n     Age            Education      % Male   BNT Score
357   73.63 (4.45)   13.23 (2.85)   44.9     52.95 (0.33)*
*It is unclear whether this value represents SD.
Table A10.30. [BNT.25] Bell et al., 2001: Data for the Control Group
n    Age           Education    % Male   FSIQ*        BNT Score
29   34.4 (12.5)   13.0 (1.7)   28       97.7 (6.4)   53.6 (3.2)
*Wechsler Adult Intelligence Scale-III full-scale IQ based on seven-subtest short form.
Table A10.31. [BNT.26] Roberts et al., 2002: Data for Monolingual English Speakers, Spanish/English Bilinguals, and French/English Bilinguals
Language Group    n    Age           Education      BNT Scores
English           42   37.1 (8.99)   16.29 (3.05)   50.9 (3.45)
Spanish/English   32   39.6 (6.59)   17.91 (3.38)   42.6 (8.04)
French/English    49   34.9 (8.14)   15.0 (2.93)    39.5 (7.43)
Table A10.32. [BNT.27] Coffey et al., 2001: Data for a Sample of Neurologically Healthy Elderly
               n     Age            Education      MMSE* scores   Vocabulary† scores   BNT scores
Whole sample   320   74.85 (4.95)   12.98 (2.87)   28.29 (1.50)   47.52 (13.26)        51.08 (7.50)
Males          122   75.20 (5.36)   13.30 (3.09)   28.16 (1.53)   47.83 (13.73)        51.45 (8.69)
Females        198   74.63 (4.68)   12.78 (2.91)   28.38 (1.48)   47.33 (13.99)        50.86 (6.69)
*MMSE, Mini-Mental State Exam.
†WAIS-R Vocabulary.
Table A10.33. [BNT.28] Giovannetti et al., 2003: Data for the Control Sample
n    Age           Education     M/F Ratio   WAIS-R FSIQ     BNT Score
31   25.2 (6.07)   15.0 (1.48)   21/10       109.3 (11.51)   54.7 (3.7)
Appendix 10m: Meta-Analysis Tables for the Boston Naming Test (BNT)
Table A10m.1. Results of the Meta-Analysis and Predicted Scores for the Boston Naming Test (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis: 14
Years of publication: [first year illegible]-2003
Number of data points used in the analysis: 42 (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants: 1,684
Variable            n*   Mean†    SD†     Range
Sample size
  Mean              42   29.51    27.31   7-173
Age
  Mean              42   67.91    15.26   24.1-87.0
  SD                41   2.53     2.50    [illegible]-14.4
Education
  Mean              35   13.79    1.50    11.0-16.6
  SD                34   2.64     1.04    0.5-4.0
IQ
  Mean              6    116.10   2.60    113.8-119.9
  SD                6    11.77    2.65    6.1-15.2
Percent male        20   79.93    34.34   0-100
Test score means
  Combined mean     42   52.25    3.26    45.7-58.3
  Combined SD       42   5.92     2.03    1.9-9.7
*Number of data points differs for different analyses due to missing data.
†Weighted means and standard deviations.
Table A10m.1. (Contd.)
Predicted Scores and SDs per age group* (BNT)
            Predicted   95% CI       95% CI       Predicted   95% CI       95% CI
Age Range   Score       Lower Band   Upper Band   SD          Lower Band   Upper Band
25-29       55.71       54.93        57.00        3.08        2.47         3.69
30-34       56.37       55.60        57.14        3.09        2.71         3.47
35-39       56.76       55.93        57.59        3.18        2.78         3.58
40-44       56.89       55.97        57.81        3.35        2.82         3.88
45-49       56.75       55.77        57.74        3.61        2.98         4.24
50-54       56.35       55.34        57.37        3.95        3.29         4.62
55-59       55.68       54.69        56.68        4.38        3.74         5.02
60-64       54.75       53.82        55.69        4.89        4.32         5.46
65-69       53.56       52.74        54.38        5.48        5.00         5.97
70-74       52.10       51.42        52.77        6.16        5.65         6.67
75-79       50.37       49.85        50.89        6.92        6.18         7.66
80-84       48.39       47.91        48.86        7.77        6.64         8.91
*Based on the equations:
Predicted test score = 47.36842 + 0.4489501 × age - 0.0052924 × age²
Predicted SD = 4.542304 - 0.0992503 × age + 0.0016771 × age²
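The two regression equations given beneath the predicted-score table can be evaluated directly for any age in the modelled range. The sketch below is illustrative only: the function names are hypothetical, and evaluating each five-year band at its midpoint (for example, 27.5 for the 25-29 band) is an assumption that happens to reproduce the tabled values rather than a procedure stated in the source.

```python
def predicted_bnt_score(age: float) -> float:
    """Quadratic regression of BNT test means on age (coefficients from Table A10m.1)."""
    return 47.36842 + 0.4489501 * age - 0.0052924 * age ** 2


def predicted_bnt_sd(age: float) -> float:
    """Quadratic regression of BNT SDs on age (coefficients from Table A10m.1)."""
    return 4.542304 - 0.0992503 * age + 0.0016771 * age ** 2


# Evaluate each five-year band from 25-29 through 80-84.
# Using low + 2.5 as the band midpoint is an assumption (see lead-in).
for low in range(25, 85, 5):
    midpoint = low + 2.5
    print(
        f"{low}-{low + 4}: "
        f"score {predicted_bnt_score(midpoint):.2f}, "
        f"SD {predicted_bnt_sd(midpoint):.2f}"
    )
```

The equations describe the aggregate data for ages 25 to 84 only; values computed outside that range are extrapolations.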
Significance tests for regression with the test scores
Ordinary least squares regression of test means on age (quadratic)
Number of observations: 42
Number of clusters: 14
R²: 0.850
Term       Coefficient   SE      t       p        95% CI
Age        0.4489501     0.070   6.45*   0.000*   0.299 to 0.599
Age²       -0.0052924    0.001   -8.46   0.000    -0.007 to -0.004
Constant   47.36842      1.567   30.23   0.000    43.98 to 50.75
*Significance test for age centered (sample means - aggregate mean): t = -14.74, p = 0.000.
Prediction
Predicted age range: 25-84 years
Mean predicted score: 54.47 (2.81)
SEe: 0.41
95% CI: 53.66-55.28
(continued)
Table A10m.1. (Contd.)
[Figure: scatterplot of Boston Naming Test scores against age]
Figure A10m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Boston Naming Test. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 55.083
Pooled estimates for random effect: 53.961
Q(df), p: Q(41) = 465.64, p < 0.0001
Moment-based estimate of between-study variance: 4.654
Tests for model fit: addition of a quadratic term
Model       R²      Adjusted R²   BIC      BIC'
Linear      0.611   0.602         28.180   -35.968
Quadratic   0.850   0.842         -8.091   -72.238
BIC' difference of 36.270 provides very strong support for the quadratic model.
Tests for parameter specifications
Normality of the residuals, Shapiro-Wilk W test: W = 0.914, p = 0.442
Homoscedasticity, White's general test: 2.397, p = 0.663
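The BIC' comparison between the linear and quadratic models reported above reduces to a subtraction. The sketch below reproduces the 36.270 difference from the tabled BIC' values and attaches a verbal grade. The grading thresholds (0-2 weak, 2-6 positive, 6-10 strong, above 10 very strong) are the commonly cited rule of thumb attributed to Raftery and are an assumption here, since the source does not say which scale it used; the function names are hypothetical.

```python
def bic_difference(bic_simpler: float, bic_richer: float) -> float:
    """Difference in BIC' between the simpler and the richer model.

    A positive value favours the richer model (lower BIC' is better).
    """
    return bic_simpler - bic_richer


def strength_of_evidence(diff: float) -> str:
    """Verbal grade for an absolute BIC difference.

    Thresholds follow a commonly cited rule of thumb (assumed, not from the source).
    """
    magnitude = abs(diff)
    if magnitude > 10:
        return "very strong"
    if magnitude > 6:
        return "strong"
    if magnitude > 2:
        return "positive"
    return "weak"


# BIC' values copied from the model-fit table in Table A10m.1
bic_prime_linear = -35.968
bic_prime_quadratic = -72.238

diff = bic_difference(bic_prime_linear, bic_prime_quadratic)
print(f"BIC' difference: {diff:.3f} ({strength_of_evidence(diff)})")
```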
Table A10m.1. (Contd.)
Significance tests for regression with SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations: 42
Number of clusters: 14
R²: 0.583
F(df), p: F(2,13) = 45.34, p [value not legible]
Term       Coefficient   SE      t        p        95% CI
Age        -0.0992503    0.089   -1.11*   0.287*   -0.292 to 0.094
Age²       0.0016771     0.001   1.97     0.071    -0.000 to 0.004
Constant   4.542304      2.067   2.20     0.047    0.08 to 9.01
*Significance test for age centered (sample means - aggregate mean): t = 4.61, p < 0.000.
Prediction
Mean predicted SD: 4.66 (1.60)
SEe: 0.31
95% CI: 4.05-5.26
Effects of demographic variables
Education
Est. tau² without education: 3.51
Est. tau² with education: 1.32
Regression of test means on education and age
Number of observations: 35
Number of clusters: 13
R²: 0.854
Term        Coefficient   SE      t       p       95% CI
Education   -0.050        0.136   -0.36   0.722   -0.35 to 0.25
IQ
Regression of test means on IQ and age
Number of observations: 6
Number of clusters: 3
R²: 0.961
Term        Coefficient   SE      t       p       95% CI
IQ          0.1061451     0.140   0.76    0.527   -0.495 to 0.707
Gender
t-test by gender
n        Mean male   Mean female   M-F difference   t       p
6M, 6F   51.617      48.167        3.450            1.246   0.121
Appendix 11: Locator and Data Tables for the Verbal Fluency Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of the studies in the text of Chapter 11. The locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A11.1. Locator Table for the Verbal Fluency Test Study
Age•
VF.l Cauthen, 1978b page 209 Table A11.2
20--59
VF.2 Yeudall et al., 1986 page 210 Tables All.3-A11.5
15-40
225
15--20 21-25 26-30 31-40
62 73 48 42
VF.3 Gordon & Lee, 1986 page 211 Table All.6
26.5
VF.4 Bolla et al., 1990 page 211 Table A11.7
39-89 64.3 (13.5)
VF.5 Selnes et al., 1991 page 212 Table A11.8
VF.6 Axelrod & Henry, 1992 page 212 Table A11.9
~60
n 51 64
IQ!Education •
Version
Alberta,
FSIQ: 115.6 (8.7) 111.5 (13.1)
127 M, 98 F volunteers; data presented in 4 age groups for M and F separately and combined
Education: 14.6 (2.8) FSIQ: 118.6 (8.8)
FAS
250
90 M, 160 F university students
RJ14.0 (1.0)
Letter Fluency
Pittsburgh
199
80 M, 119 F volunteer participants in a study on normal aging; data divided by 3 levels of raw WAIS-R Vocabulary scores
8-22
FAS
Maryland
Subjects from MACS; seronegative homosexual males; data presented for 3 age groups and 3 educational levels; mean education= 16 years
FAS, Animals
Baltimore, Chicago, Los Angeles, Pittsburgh
FAS
Michigan
Canada
F, T,J, P
Alberta, Canada
14.7 (3)
College (202)
35--44 45--54
309 290 97
50-89
80
50--59 55.3 (2.5) 60-69 65.2 (2.6)
10M lOF
15.4 (2.5)
10M 10 F
14.4 (2.5)
Healthy, independently living volunteers
Letters S, G, U,N,
Location
Subjects in younger group gathered from different sources ( 12 M, 39 F), older subjects lived primarily in institutional settings (28 M, 36 F)
733
25--34
>
Sample Composition
>College (302) Education:
(continued)
....
Table A11.1. (Contd.) Age•
n
70-79 74.3 (2.9) 80--89 83.4 (3.0)
10M 10 F
14.5 (4.2)
8M 12 F
14.5 (4.1)
VF.7 Monsch et al., 1992 page 212 Table A1l.l0
71.2 (7.9)
53
17 M, 36 F healthy elderly volunteers
13.6 (2.7)
FAS 3 categories (including Animals, Fruits and Vegetables), first names, supermarket
San Diego, CA
VF.8 SimkinsBullock, 1994 page 213 Table A1l.ll
52.6 (15.6)
19
10M, 9 F healthy volunteers
Education: 13.26 (2.5) FSIQ: 102.0 (12.85)
FAS categories (includes Animals and Fruits or Vegetables)
Michigan
VF.9 Parkin & Lawrence, 1994 page 213 Table A1l.l2
71.9 (4.8)
22
4 M, 18 F healthy elderly volunteers
Education: 9.4 (1.3) NART FSIQ: 106.1 (12.6)
FAS
UK
VF.10 Friedman et al., 1995 page 214 Table A11.13
35.8 (11.0)
24
Healthy volunteers recruited primarily from hospital staff
FAS
Ohio
VF.ll Kozora & Cullum, 1995 page 214 Tables A11.14, A11.15 ·
50-89 50-59 60-69 70-79 80-89
174 41 43 47 43
FAS,
USA
Study
= IIW
Sample Composition
Volunteers screened for major medical and psychiatric disorders
IQ!Education •
Education: 14.3 (2.3) 14.2 (2.3) 14.3 (3.1) 14.9 (3.3)
Version
Animals, Supermarket, First names, U.S. States
Location
> '"0 '"0
VF.12 Norris et al., 1995 page 215 Table A11.16
129
3 samples were used: I. Community elderly living independently
FAS
Texas
54
VF.13 Cahn et al., 1995 page 216 Table A11.17
78.4 (6.8)
238
Cognitively intact elderly participants of Rancho Bernardo Study; data for the entire sample and optimal cutoffs are provided
13.8 (2.6)
FAS
California
VF.14 lvnik et al., 1996 page 216 Table A11.18
56 to 95+
743
Normal elderly volunteers; tables for age correction and a regression equation for education correction; data tables not reproduced
Mayo FSIQ: 106.2 (14.0)
COWA
Minnesota
16-70
360
90 90 90 90
Education 7-22 groups: :::>12 13--15
COWA
16-24 25-39
Native English speakers: 180 M, 180 F; data are reported for 3 education groups, M & F separately; tables for data conversion to T scores and percentile ranks are provided; testretest and internal consistency data reported
California, Michigan, eastern seaboard
Paid male volunteers, strict selection criteria
15.4 (2.4)
COWA
New York
VF.l5 Ruff et al., 1996 page 217 Tables A11.19, A11.20
40-54
55-70
VF.16 Hoff et al., 1996 page 219 Table A11.21
32.1 (9.7)
> -c -c
60--86 73.1 (6.1) 62-89 75.3 (7.5) 18-28 19.4 (1.8)
16.7 (2.3)
m
z
0 35
40
2. Institutionalized elderly with MMSE scores ~20 3. Undergraduate students
12.4 (3.7)
X
13.6 (1.1)
Interrater reliability data are provided.
54
~16
(continued)
....... IIW
.....
Table A11.1. (Contd.)
~
N
Study
VF.l7 Ponton et al., 1996 page 219 Table A11.22
VF.l8 Crossley et al., 1997 page 220 Table A11.23
Age•
n
38.4 (13.5)
300
16-29 30--39 40-49 50-75 628
Sample Composition
IQ!Education•
Spanish-speaking healthy volunteers; MIF ratio 40%160%; data partitioned by gender (2) x age (4) x education (2)
10.7 (5.1)
Community-dwelling seniors, cognitively normal
Version
Location
FAS
Los Angeles, CA
4 education groups: 0-6 7-9 10--12 13+
FAS, Animal Naming
Canada
<10 >10
65-74 75--84 85+
(635)
73.7 (8.7)
38
18M, 20 F elderly volunteers screened for health problems
Education: 13.4 (3.4)
FAS, Animal Naming
Oklahoma
VF.20 Nyberg et al., 1997 page 220 Table A11.25
77.3
39
Healthy elderly
13.6
FAS
Canada. Sweden
VF..21 Salthouse et al., 1997
18-39
40
page 221 Table Al1.26
40--59
38
Healthy adults, 47% M; data stratified into 3 age groups
60--78
37
15.5 (1.7) 15.2 (2.5) 15.3 (2.6)
73.0 (7.6)
317
Healthy elderly volunteers from 5 ethnic groups (Chinese, Hispanic Vietnamese, white, and African-American) were assessed in their native language; data grouped by age, education, gender, ethnicity
10.3 (5.0)
VF.l9 Beatty et al., 1997 page 220 Table A11.24
VF..22 Kempler et al., 1998 page 221 Table Al1.27
54-74 75-99
0--8 ;:::9
CFL, Animals, Furniture, Vegetables
Animal Naming
Atlanta
California
>
"tt "tt
VF.I3 Stuss et al., 1998 page222 Table A11.28 VF.24 JohnsonSelfridge et al., 1998 page222 Table All.29
54.4 (14.4)
37.9 (2.6) 37.8 (2.7) 37.9 (2.6)
37
200
Control group (19M, 18 F) without neurological or psychiatric disorder Data for 3 ethnic groups of male veterans: white,
13.9 (2.3)
13.5 (2.3) 12.9 (2.1) 13.3 (2.4)
FAS, Animal Naming
Canada
FAS, Animal Naming
USA
"tl
200
black,
200
hispanic; sample includes medically and psychiatrically ill participants
28.5 (12.2)
81
Normal or neurologically stable adults; 20% had neurological conditions; 60% M; data on test-retest reliability and practice effect are provided
12.2 (1.9)
FAS
Washington
VF.26 Manly et al., 1999 page 224 Table A11.31
2:;65
187
Illiterate and literate nondemented elders; 74% F; data are presented for education-matched and uneducated samples for English-and Spanishspeaking participants separately
0-3
Animals, Food, Clothing
New York
VF.27 Boone, 1999 page224 Table A11.32
45-84
155
53 M, 102 F healthy elderly volunteers; data stratified by FSIQ level (average, high average, superior)
Education: 14.57 (2.55) FSIQ: 115.41 (14.11)
FAS
California
VF..28 Demakis, 1999 page 225 Table A11.33
22.5 (7.99)
21
Undergraduate students, 67% F
13.6 (1.46)
COWA
Illinois
VF.29 Epker et al., 1999 page 225 Table A11.34
70.6 (4.7)
65
22 M, 43 F healthy elderly volunteers
14.3 (2.9)
FAS, Animal Naming
Texas
VF..25 Dikmen et al., 1999 page 223 Table All.30
63.07 (9.29)
> "tl
......
~ ~
(continued)
.....
Table A11.1. (Contd.)
~
....
Study
Age•
VF.30 Tombaugh, et al., 1999 page 226 Tables A11.35-A11.37
1&-95
FAS 895
1&-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89 90-95
Animal735
n
VF.31 Basso et al., 1999 page 226 Table A11.38
32.50 (9.27)
50
VF.32 Gladsjo et al., 1999 page 227 Tables All.39, A11.40
20-34 35-49
VF.33 Binder et al., 1999 page 227 Table A11.41
Sample Composition Large samples of healthy subjects obtained from 2 studies; data are stratified by 3 age x 3 educational levels, as well as by 9 age groups, 4 education groups, and 2 genders
IQ/Education• 0-21 0-8 9-12 13-16 17-21
Version
Location
FAS, Animal Naming
Canada
Data for healthy men on 2 testing probes over a 12-month interval
14.98 (1.93)
FAS
Tulsa, OK/ Ohio
768
Healthy adults, 52% M; data stratified by age (3) x education (3); data are also presented for African Americans and Caucasians separately
<11 12-15 1&-20
FAS, Animal Naming
San Diego, CA
82.3 (4.4)
125
Normal elderly sample, aged 70 or above, 25% M, 87% Caucasian
13.5 (3.0)
Animal Naming
St. Louis, MO
VF.34 Fama et al., 2000 page 228 Table Al1.42
66.7 (7.4)
51
Healthy elderly, rigorous exclusion criteria
16.4 (2.3)
FAS, Animals, Inanimate Objects
California
VF.35 Troyer, 2000 page228 Table All.43
18-91 59.8 (20.7)
Healthy adults, 30%M
5-21 13.9 (2.9)
FAS/CFL,
Canada
50+
FAS 257 CFL 154 Animal407 Supermarket 156
Animals,
> "'tt
Supermarket
m
"'tt
z
0
X
VF.36 Acevedo et al., 2000 page 229 Tables A11.44-A11.47
69.1 (6.9)
424
64.9 (7.7)
278
50-59 60-69 70-79
English speakers, 26% Male; Spanish speakers, 30.8% M; data stratified by age, education, and gender
14.4 (2.5)
Animals, Vegetables, Fruits
Florida
m
z
13.4 (3.2)
0
><
8-12 13-16 2:17
VF.37Chen et al., 2000 page 230 Table A11.48
74.9 (4.4)
483
Control elderly volunteers who participated in the MoVIES project, 37.5% M
31.9%
Total for letters P and S Total for Animals and Fruits
Pennsylvania
VF.38 Anstey et al., 2000 page 230 Tables A11.49, All.SO
79.04 (6.59)
280
Normative data collected on old and very old adults living in retirement villages and hostels; data presented in raw scores for the entire sample and in percentile distribution for the sample stratified by age x education; 14% M
11.25 (2.79)
FAS
Australia
VF.39 Brady et al., 2001 page 231 Table A11.51
66.41 (6.73)
235
Healthy elderly, all-male sample
14.03 (2.62)
Animals
Boston
VF.40 Rosselli et al., 2002a page 231
50-84 61.76 (9.3)
82
28M, 54 F; English monolingual, Spanish monolingual,
2-23 14.8 (3.6)
FAS, Animals
Florida
12.7 (2.7)
Animals
Multicenter, USA
Table All.52 VF.41 Grady et al., 2002 page 232 Table All.53
>
"tl "tl
bilingual 66.3 (6.4)
517
67.3 (6.3)
546
Data for women with established coronary disease, 2 groups: ERT and placebo
12.7 (2.7) (continued)
..... "'-! c.n
;:j cr-
Table A11.1. (Contd.)

VF.42 Giovannetti et al., 2003 (page 233; Table A11.54)
Age*: 25.2 (6.07)
n: 31
Sample composition: 21 M, 10 F healthy adults recruited from a medical center community
IQ/Education*: Education: 15.0 (1.48); FSIQ: 109.3 (11.51)
Version: Animals
Location: New York, Pennsylvania

VF.43 Lopez-Carlos et al., 2003 (page 233; Tables A11.55-A11.58)
Age*: 28.23 (8.74); 18-29, 30-49
n: 115
Sample composition: All-male sample; monolingual Spanish-speaking Latino manual laborers; data partitioned by age, education, country, and age (2) x education (2)
IQ/Education*: 6.66 (2.54); 0-6, 7-10
Version: PMR, Animals
Location: Los Angeles, CA; Jalisco, Mexico

VF.44 Miller, 2003 (an update on Selnes et al., 1991) (page 234; Table A11.59)
Age*: 37.5 (6.9); 25-34, 35-44, 45-59
n: 728
Sample composition: Seronegative homosexual and bisexual males from MACS, native English speakers; data partitioned by age x education
IQ/Education*: 16.3 (2.3); <16, 16, >16
Version: FAS, Foods
Location: MACS Centers

VF.45 Ravdin et al., 2003 (page 234; Table A11.60)
Age*: 60-69, 70-79, 80-92
n: 34, 80, 35
Sample composition: Healthy elderly, 32 M, 117 F; data partitioned by 3 age groups
IQ/Education*: 15.57 (2.67); est. VIQ 120.44 (5.74)
Version: CFL, Animals, Fruits, Vegetables
Location: New York

*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
APPENDIX 11
Table A11.2. [VF.1] Cauthen, 1978b: Number of Words for Eight Letters* Generated by Two Age Groups

Age 20-59 (n = 51); IQ range 100-140; mean FSIQ = 115.6 (8.7)
Letter   M (SD)
S        16.2 (4.6)
G        11.7 (3.7)
U        6.1 (2.0)
N        9.3 (2.7)
F        12.9 (3.8)
T        13.3 (3.7)
J        8.1 (3.1)
P        13.4 (3.4)

Age ≥60 (n = 64); values are M (SD)
Letter   IQ 80-106 (n = 21)   IQ 107-118 (n = 21)   IQ 119-140 (n = 22)
S        8.8 (4.4)            10.9 (3.9)            13.9 (4.8)
G        6.9 (3.1)            8.9 (3.3)             10.4 (3.7)
U        3.1 (1.8)            3.9 (1.7)             5.5 (2.0)
N        5.5 (2.8)            6.0 (3.0)             8.7 (3.1)
F        8.0 (2.9)            10.1 (2.7)            12.9 (4.3)
T        7.6 (3.5)            9.8 (2.7)             13.0 (3.5)
J        3.7 (2.3)            4.7 (1.8)             7.1 (2.9)
P        7.4 (4.1)            10.0 (3.3)            13.6 (4.7)

*Data for the younger group (12 M, 39 F) are presented for the whole sample. Data for the older group (28 M, 36 F, mostly institutionalized elderly) are stratified by the full-scale IQ (FSIQ) level.
Table A11.3. [VF.2a] Yeudall et al., 1986: Data* for the Whole Sample of Healthy Adults and for Four Age Groups

Age Group             15-20           21-25           26-30           31-40           15-40
n                     62              73              48              42              225
Age                   17.76 (1.96)    22.70 (1.40)    28.06 (1.52)    34.38 (2.46)    24.66 (6.16)
Education             12.16 (1.75)    14.82 (1.88)    15.50 (2.65)    16.50 (3.11)    14.55 (2.78)
WAIS FSIQ             118.14 (8.73)   116.45 (8.57)   120.03 (9.12)   121.87 (8.17)   118.56 (8.81)
F                     13.82 (4.36)    14.99 (4.37)    15.65 (4.42)    16.83 (4.04)    15.15 (4.41)
A                     12.48 (3.87)    13.33 (4.89)    13.08 (3.41)    14.50 (3.66)    13.26 (4.13)
S                     15.87 (4.52)    16.63 (4.97)    16.54 (4.70)    18.10 (4.89)    16.67 (4.80)
Average of 3 trials   14.06 (3.82)    14.98 (4.29)    15.09 (3.34)    16.48 (3.61)    15.03 (3.90)

*Total number of words generated for each letter and average for three letters.
Table A11.4. [VF.2b] Yeudall et al., 1986: Data for Males

Age Group             15-20           21-25           26-30           31-40           15-40
n                     32              37              32              26              127
Age                   17.78 (2.09)    22.57 (1.26)    27.75 (1.57)    34.69 (2.41)    25.15 (6.29)
Education             12.22 (1.96)    15.11 (1.74)    15.78 (2.79)    16.69 (3.55)    14.87 (2.99)
WAIS FSIQ             119.21 (8.36)   118.61 (8.83)   120.30 (8.97)   122.92 (7.06)   119.96 (8.45)
F                     14.03 (4.48)    14.83 (4.84)    15.84 (4.07)    16.58 (4.31)    15.25 (4.50)
A                     13.00 (3.91)    13.22 (5.52)    13.03 (3.31)    14.50 (4.13)    13.38 (4.34)
S                     15.81 (4.79)    16.94 (5.05)    17.44 (5.19)    17.88 (4.41)    16.98 (4.90)
Average of 3 trials   14.28 (3.96)    15.00 (4.73)    15.44 (3.37)    16.32 (3.77)    15.20 (4.04)
Table A11.5. [VF.2c] Yeudall et al., 1986: Data for Females

Age Group             15-20           21-25           26-30           31-40           15-40
n                     30              36              16              16              98
Age                   17.73 (1.84)    22.83 (1.54)    28.69 (1.25)    33.88 (2.53)    24.03 (5.95)
Education             12.10 (1.52)    14.53 (1.99)    14.94 (2.32)    16.19 (2.29)    14.12 (2.43)
WAIS FSIQ             116.91 (9.15)   114.29 (7.86)   119.29 (9.78)   120.30 (9.73)   116.79 (8.99)
F                     13.60 (4.29)    15.14 (3.90)    15.25 (5.16)    17.25 (3.64)    15.03 (4.31)
A                     11.93 (3.82)    13.44 (4.24)    13.19 (3.71)    14.50 (2.85)    13.11 (3.88)
S                     15.93 (4.31)    16.31 (4.94)    14.75 (2.91)    18.44 (5.72)    16.29 (4.68)
Average of 3 trials   13.82 (3.71)    14.96 (3.86)    14.40 (3.27)    16.73 (3.43)    14.81 (3.73)
Table A11.6. [VF.3] Gordon and Lee, 1986: Scores on a Letter Fluency Test for a Sample of University Students Aged 18-35 Years

          n     Score*
Males     90    40.76 (11.46)
Females   160   43.12 (10.84)

*Total number of words for three letters (with SD).
Table A11.7. [VF.4] Bolla et al., 1990: FAS Data for a Healthy Elderly Sample Divided into Three Groups Based on Vocabulary Scores, Stratified by Gender
Males on Vocabulary ≤53
n
54-60
Females on Vocabulary 2:61
~53
54-60
2:61
32
25
23
33
39
47
61 (12)
63
65
65
(15)
(17)
61 (11)
(15)
69 (17)
Education
13 (03)
14 (03)
17 (03)
13 (03)
15 (03)
16 (03)
Vocabulary•
47 (OS)
57
65
(02)
(02)
45 (06)
52 (02)
(02)
43 (12)
47 (09)
42 (09)
46 (12)
49 (12)
Age
FASt
38 (12)
*Raw Wechsler Adult Intelligence Scale-Revised Vocabulary scores. †Total number of words for three letters (with SD).
65
Table A11.8. [VF.5] Selnes et al., 1991: Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study, Stratified by Age and Education
FAS"
Animals
Percentiles
Percentiles
n
Mean Age
Education
Mean (SD)
5th
lOth
Mean (SD)
5th
lOth
By age 25--34
309
17
26
29
14
17
97
25
29
23.4 (5.8) 23.4 (5.4) 23.3 (4.7)
15
45-54
45.7 (12.7) 46.1 (12.6) 45.9 (12.3)
30.5
290
16.1 (2.2) 16.4 (2.3) 16.7 (2.6)
26
35-44
31.0 (2.6) 39.3 (2.9) 48.5 (2.6)
15
17
36.1 (7.4) 35.6 (7.2) 38.4 (7.8)
13.7 (1.2) 16.0 (0.0) 18.6 (1.3)
41.7 (11.6) 46.2 (12.3) 49.0 (12.4)
23
26
13
15
28
31
16
17
29
32
16
18
Age
By education
229
College
202
>College
302
"Total number of words for three letters.
22.0 (5.3) 23.1 (4.8) 24.6 (5.7)
Table A11.9. [VF.6] Axelrod and Henry, 1992: FAS Data for Four Age Groups of Healthy Adults

Variables (M (SD) unless noted)    50s           60s           70s           80s
Age                                55.3 (2.5)    65.2 (2.6)    74.3 (2.9)    83.4 (3.0)
Education                          15.4 (2.5)    14.4 (3.0)    14.5 (4.2)    14.5 (4.1)
Male/female ratio                  10/10         10/10         10/10         8/12
Ethnicity (Black/Caucasian)        4B, 16C       4B, 16C       4B, 16C       2B, 18C
WAIS-R Vocabulary*                 10.3 (2.1)    10.3 (2.3)    9.8 (2.8)     10.0 (2.5)
Health rating†                     4.3 (0.7)     4.0 (0.9)     4.0 (0.9)     3.9 (0.8)
Number of physician appointments   2.4 (3.0)     2.7 (2.6)     3.6 (3.2)     3.1 (2.5)
F words                            14.6 (3.8)    14.2 (4.7)    11.8 (3.2)    13.1 (4.1)
A words                            11.5 (4.3)    11.2 (3.7)    10.8 (4.3)    11.5 (5.1)
S words                            14.0 (3.9)    14.2 (3.8)    13.4 (3.9)    13.2 (5.4)
Total FAS words                    41.1 (9.9)    39.6 (10.7)   36.0 (9.3)    37.8 (14.0)

*Wechsler Adult Intelligence Scale-Revised scaled scores. †Self-rating of health on a 5-point scale, where higher values indicate better health.
Table A11.10. [VF.7] Monsch et al., 1992: Data* for the Control Group on FAS, Category, First Names, and Supermarket Fluency

n    Age          Education    M/F Ratio   FAS           Category†    First Names   Supermarket
53   71.2 (7.9)   13.6 (2.7)   17/36       41.2 (12.5)   48.4 (9.8)   22.6 (5.8)    22.8 (4.7)

*Total number of words for all trials per task. †Category fluency task included animals, fruits, and vegetables trials.
Table A11.11. [VF.8] Simkins-Bullock et al., 1994: Data for the Control Sample n
Age
Education
FSIQ
M/F Ratio
19
52.6 (15.6)
13.26 (2.5)
JP2.0 (12.85)
10/9
43.58 (9.63)
37.95 (6.54)
*Total number of words for three letters. †Category fluency condition included animals and fruits or vegetables trials.
Table A11.12. [VF.9] Parkin and Lawrence, 1994: Data for the Control Sample

n    Age          Education   Est. FSIQ*     M/F Ratio   FAS†
22   71.9 (4.8)   9.4 (1.3)   106.1 (12.6)   4/18        36.9 (10.7)

*Full-scale IQ was estimated using the National Adult Reading Test. †Total number of words for three letters.
Table A11.13. [VF.10] Friedman et al., 1995: Data for the Control Sample
n
Age
24
35.8 (11.0)
44.29 (12.5)
"Total number of words for three letters.
Table A11.14. [VF.11a] Kozora and Cullum, 1995: Demographic Characteristics for a Sample of Healthy Adults Partitioned into Four Age Groups

Age Group                     50-59          60-69         70-79          80-89
n                             41             43            47             43
Mean age (SD)                 54.5 (3.0)     64.6 (2.8)    74.6 (2.5)     83.8 (3.1)
Mean education (SD)           14.3 (2.3)     14.2 (2.3)    14.3 (3.1)     14.9 (3.3)
Male/female ratio             21/20          16/27         15/32          16/27
Mean Vocabulary score* (SD)   57.15 (6.37)   58.8 (5.63)   60.02 (8.72)   58.79 (8.87)

*Wechsler Adult Intelligence Scale-Revised Vocabulary raw scores.
Table A11.15. [VF.11b] Kozora and Cullum, 1995: Data for the Letter and Category Fluency Conditions for the Four Age Groups

Age Group        50-59            60-69            70-79            80-89
Letter Total     41.23 (12.10)    45.76 (14.26)    46.49 (10.46)    40.74 (11.19)
F                14.05 (4.55)     15.69 (5.25)     15.98 (3.91)     14.21 (3.69)
A                12.98 (4.08)     14.17 (4.92)     14.40 (4.26)     12.95 (4.44)
S                14.13 (5.19)     15.91 (5.36)     16.11 (4.51)     13.49 (4.40)
Category Total   108.55 (17.08)   105.13 (18.40)   92.53 (16.23)    82.63 (17.36)
Animals          20.95 (4.16)     21.07 (5.08)     18.96 (4.67)     15.81 (4.51)
Supermarket      26.85 (6.75)     25.58 (5.11)     22.60 (5.27)     19.93 (5.41)
First names      29.21 (5.67)     26.76 (6.74)     23.73 (5.90)     20.47 (5.27)
U.S. states      30.77 (7.24)     31.60 (7.40)     26.67 (5.78)     27.81 (8.05)
Table A11.16. [VF.12] Norris et al., 1995: Data for a Sample of Independently Living Elderly (60-86 Years Old), Institutionalized Elderly (62-89 Years Old), and Undergraduate Students (18-28 Years Old)

Variable                 Old (Community)   Old (Institution)   Young
n                        54                35                  40
Age                      73.1 (6.1)        75.3 (7.5)          19.4 (1.8)
Education                16.7 (2.3)        12.4 (3.7)          13.6 (1.1)
Depression†              3.8 (3.6)         9.7 (7.4)           6.5 (4.3)
Functional status‡       8.1 (0.3)         14.0 (3.1)          -*
Verbal fluency§          36.9 (10.1)       21.5 (9.8)          40.5 (7.8)
Mini-Mental State Exam   27.5 (2.1)        24.7 (3.0)          -*

*Young participants did not receive these measures due to anticipated ceiling and floor effects. †Depression was assessed with the Geriatric Depression Scale. ‡Functional status was measured with the Functional Assessment Scale. §Total number of words for three letters.
Table A11.17. [VF.13] Cahn et al., 199.1: Data for a Control Sample of Cognitively Intact Elderly

n     Age          Education    M/F Ratio   Letter Fluency*
238   78.4 (6.8)   13.8 (2.6)   97/141      38.3 (0.78)†

*Total number of words for three letters. †The SD is considerably lower than expected.

Table A11.18. [VF.14] Ivnik et al., 1996: Demographic Description of the Healthy Sample Partitioned into Groups Used in Controlled Oral Word Association Testing

                      n
Age groups
  56-59               55
  60-64               90
  65-69               85
  70-74               93
  75-79               146
  80-84               149
  85-89               84
  90-94               33
  95+                 8
Education
  ≤7                  8
  8-11                121
  12                  239
  13-15               181
  16-17               128
  ≥18                 66
Gender
  Males               286
  Females             457
Race
  Caucasians          741
  Blacks              2
Handedness
  Right               682
  Left                29
  Mixed               32
Total                 743

Table A11.19. [VF.15a] Ruff et al., 1996: Data* for the Controlled Oral Word Association Version for a Sample of Healthy Adults 16-70 Years of Age Stratified by Three Educational Groups x Gender

Education Groups   Males (n = 180)   Correction   Females (n = 180)   Correction   Total Sample (n = 360)
≤12                36.9 (9.8)        +3           35.9 (9.6)          +4           36.5 (9.9)
13-15              40.5 (9.4)        -1           39.4 (10.1)         -1           40.0 (9.7)
≥16                41.0 (9.3)        -1           46.5 (11.2)         -7           43.8 (10.6)
Total              39.5 (9.8)                     40.6 (11.2)                      40.1 (10.5)

*Means, SDs, and correction factors to adjust raw scores for level of education.
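The education correction in Table A11.19 (Ruff et al., 1996) can be illustrated with a short Python sketch. It assumes that each correction factor is added to the raw Controlled Oral Word Association score for the matching gender and education band; the function and dictionary names are hypothetical and are not taken from the source.

```python
# Correction factors transcribed from Table A11.19 (Ruff et al., 1996).
# Assumption: the factor is added to the raw score for the matching cell.
CORRECTION = {
    "male": {"<=12": 3, "13-15": -1, ">=16": -1},
    "female": {"<=12": 4, "13-15": -1, ">=16": -7},
}


def education_band(years):
    """Map years of education onto the three bands used in the table."""
    if years <= 12:
        return "<=12"
    if years <= 15:
        return "13-15"
    return ">=16"


def corrected_cowa(raw_score, gender, education_years):
    """Return the education-corrected score (raw score plus correction factor)."""
    return raw_score + CORRECTION[gender][education_band(education_years)]


if __name__ == "__main__":
    # A man with 11 years of education and a raw score of 30.
    print(corrected_cowa(30, "male", 11))
```

The corrected score is the value that would then be looked up in the percentile table for this test.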
Table A11.20. [VF.15b] Ruff et al., 1996: Percentile Ranks, Normalized T Scores, and Interpretation for the Education-Corrected Scores for the Controlled Oral Word Association Test

Corrected Score   Percentile   T Score
≤17               1            26.7
20                2            29.5
21                3            31.2
23                4            32.5
25                5            33.5
26                8            35.8
27                9            36.6
28                10           37.2
29                13           38.7
30                16           40.2
31                19           41.2
32                21           41.9
33                27           43.9
34                30           44.7
35                34           45.9
36                38           46.9
37                43           48.2
38                47           49.2
39                51           50.3
40                58           52.0
41                61           52.8
42                64           53.6
43                67           54.4
44                69           55.0
45                72           55.8
46                76           57.0
47                78           57.7
48                80           58.5
49                82           59.1
50                85           60.4
51                87           61.3
52                89           62.3
53                91           63.4
54                92           64.1
55                94           65.5
56                95           66.5
58                97           68.9
60                98           70.6
≥64               99           73.3

Interpretation (in ascending order of corrected score): Seriously deficient, Deficient, Borderline, Low average, Average, High average, Superior, Very superior
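The normalized T scores in Table A11.20 are consistent with the usual definition T = 50 + 10z, where z is the standard normal deviate for the percentile rank. The Python sketch below illustrates that relationship. Whether Ruff et al. derived their values in exactly this way is an assumption, and small differences from the tabulated T scores are expected because the printed percentiles are rounded. The function name is hypothetical.

```python
from statistics import NormalDist


def normalized_t(percentile):
    """Convert a percentile rank (strictly between 0 and 100) to a normalized T score."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must lie strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100)  # standard normal deviate
    return 50 + 10 * z


if __name__ == "__main__":
    for p in (1, 16, 50, 99):
        print(p, round(normalized_t(p), 1))
```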
Table A11.21. [VF.16] Hoff et al., 1996: Data for the Controlled Oral Word Association for the Control All-Male Sample

n    Age          Education    COWA Score
54   32.1 (9.7)   15.4 (2.4)   43.7 (10.0)
Table A11.22. [VF.17] Ponton et al., 1996: FAS Data* for a Sample of 300 Spanish-Speaking Participants Stratified by Gender x Age x Education

Age Group    16-29                           30-39                           40-49                            50-75
Education    <10            >10              <10            >10              <10            >10               <10             >10
Males
  n          11             25               13             18               12             17                18              6
  X (SD)     24.18 (9.95)   31.84 (7.85)     31.39 (9.92)   33.00 (9.34)     24.33 (6.91)   35.18 (10.64)     24.33 (12.66)   35.83 (7.94)
Females
  n          12             30               22             44               16             11                25              20
  X (SD)     24.00 (8.79)   26.37 (9.01)     22.91 (9.95)   35.18 (10.49)    23.56 (8.79)   34.64 (7.55)      20.88 (11.98)   33.00 (9.88)

*Total number of words for three letters.
Table A11.23. [VF.18] Crossley et al., 1997: Data for the FAS* and Animal Naming for a Sample of Cognitively Intact Seniors Partitioned by Age, Gender, and Educational Level

                    FAS                   Animal Naming
                    M      SD     n       M      SD    n
Age group
  65-74             24.0   12.4   139     14.2   4.3   144
  75-84             25.8   11.5   343     14.2   3.8   343
  ≥85               24.0   10.8   146     12.5   3.8   148
Gender
  Male              23.2   12.1   258     14.2   4.2   258
  Female            26.2   11.0   370     13.6   3.9   377
Education (years)
  0-6               16.2   6.9    140     12.1   3.1   149
  7-9               23.7   9.9    170     13.4   3.8   169
  10-12             27.0   10.2   202     14.1   3.9   203
  ≥13               34.2   12.6   115     16.3   4.1   113
Total sample        25.0   11.6   628     13.8   4.3   635

*Total number of words for three letters.
Table A11.24. [VF.19] Beatty et al., 1997: Data for the FAS and Animal Naming for the Control Sample

n    Age          Education    MMSE         M/F Ratio   FAS Score*    Animal Naming Score
38   73.7 (8.7)   13.4 (3.4)   28.7 (1.6)   18/20       36.8 (13.6)   18.0 (4.9)

*Total number of words for three letters. MMSE, Mini-Mental State Exam.
Table A11.25. [VF.20] Nyberg et al., 1997: FAS Data for a Healthy Elderly Sample

n    Age (Range)    Education (Range)   FAS Score*
39   77.3 (66-87)   13.6 (8-22)         42.51 (9.77)

*Total number of words for three letters.
Table A11.26. [VF.21] Salthouse et al., 1997: Data for a Sample of Healthy Participants Stratified into Three Age Groups

Age Group            18-39        40-59        60-78
Mean Age             29.0 (4.8)   49.1 (5.1)   69.2 (5.1)
n                    40           38           37
Education            15.5 (1.7)   15.2 (2.5)   15.3 (2.6)
% Male               42.5         50.0         48.6
Letter Fluency
  C                  16.2 (4.7)   14.9 (4.9)   14.8 (5.2)
  F                  14.9 (4.2)   13.7 (4.6)   14.6 (4.7)
  L                  15.0 (4.1)   13.4 (3.3)   13.5 (4.9)
Category Fluency
  Animals            20.6 (5.1)   18.8 (5.7)   17.1 (4.6)
  Furniture          12.9 (2.8)   12.3 (3.4)   11.8 (3.3)
  Vegetables         13.8 (3.3)   14.0 (3.9)   14.0 (3.3)
Table A11.27. [VF.22] Kempler et al., 1998: Data for Animal Naming for a Sample of Healthy Adults Stratified by Age, Education, Gender, and Ethnicity

Group                n     Age, X (SD)   Education, X (SD)   Animals (SD)
Age
  54-74              195   70.0 (4.2)    10.3 (5.1)          16.0 (5.0)
  75-99              122   80.8 (4.6)    10.1 (5.0)          14.4 (4.3)
Education
  0-8                112   73.3 (7.5)    4.6 (2.5)           13.5 (4.2)
  ≥9                 205   72.7 (7.7)    13.3 (3.0)          16.4 (4.7)
Gender
  Male               112   73.4 (7.3)    11.5 (4.6)          16.4 (4.4)
  Female             205   72.7 (7.8)    9.6 (5.2)           14.7 (4.8)
Ethnicity
  African American   54    72.8 (9.1)    11.6 (4.7)          15.2 (4.4)
  White              58    76.6 (7.6)    12.3 (3.8)          16.7 (4.2)
  Chinese            67    72.5 (7.3)    10.9 (5.5)          15.3 (5.1)
  Hispanic           78    71.9 (7.1)    8.5 (5.4)           12.8 (3.9)
  Vietnamese         60    71.6 (5.8)    8.6 (4.2)           17.3 (5.2)
Total                317   73.0 (7.6)    10.3 (5.0)          15.5 (4.6)

Immigrant groups     Years in US*   Age at Immigration   % at Home†
  Chinese            11.8 (9.3)     60.9 (9.1)           98%
  Hispanic           27.6 (19.6)    44.0 (19.1)          82%
  Vietnamese         7.0 (6.4)      64.3 (8.2)           98%

*Number of years residing in United States. †Percent who speak only their native language at home.
Table A11.28. [VF.23] Stuss et al., 1998: Data for the Control Group• Stratified by Age and Gender Age 40--64
21-39 Tasks
Females
Males 10
9
Females
Males
16
Females
9
8
Letter fluency (FAS)†
53.2 (13.1)
48.4 (10.3)
44.1 (10.2)
42.7 (11.0)
37.1 (10.1)
47.5 (14.8)
Animal fluency
26.3 (2.6)
23.0 (3.9)
18.7 (3.3)
22.4 (6.0)
16.7 (1.7)
18.1 (5.9)
Letter-based errors
2.1 (2.5)
1.4 (1.8)
2.3 (2.0)
2.2 (1.6)
2.3 (2.4)
3.0 (2.2)
Semantic errors
0.7 (1.0)
0.1 (0.3)
0.7 (0.9)
1.1 (1.4)
0.6 (1.3)
0.4 (0.7)
n
10
Males
65--81
•Mean education for the sample is 13.9 (2.3) years; mean National Adult Reading Test IQ is 113.8 (6.1). tTotal number of words for three letters.
Table A11.29. [VF.24] Johnson-Selfridge et al., 1998: FAS and Animal Naming Data for a Sample of Male Veterans Stratified into Three Ethnic Groups
Ethnic Group
n
Age
White
200
Black Hispanic
Animals
Animals adjusted
Score
scoret
33.6
21.2 (4.6)
<JJJ.7
32.5 (10.0)
35.1
18.3 (4.8)
18.8
31.7 (9.6)
31.0
18.7 (5.2)
18.7
Education
WRAT-R Reading
FAS Score•
37.9 (2.6)
13.5 (2.3)
62.8 (14.3)
35.5 (11.9)
200
37.8 (2.7)
12.9 (2.1)
50.4 (13.6)
200
37.9 (2.6)
13.3 (2.4)
60.2 (13.9)
FAS adjusted scoret
•Total number of words for three letters. tscores after covarying for income, education, and Wide Range Achievement Test-Revised (WRAT-R) Reading scores.
Table A11.30. [VF.25] Dikmen et al., 1999: Test-Retest FAS Data for a Group of Normal, Neurologically Stable Adults*

n    Age           Education    M/F Ratio   WAIS FSIQ*     Test-Retest Interval   FAS Score†, Time 1   FAS Score†, Time 2
81   28.5 (12.2)   12.2 (1.9)   60/40       108.8 (12.3)   11.1 (0.6)             43.25 (10.75)        44.47 (10.36)

*Demographic information is provided for a larger sample of 138 participants; mean Wechsler Adult Intelligence Scale full-scale IQ (Wechsler, 1955) is reported for the three groups used in this study combined; 20% of the sample had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury. †Total number of words for three letters.
Table A11.31. [VF.26] Manly et al., 1999: Data for Illiterate and Literate Elders (>65 Years of Age) with 0-3 Years of Education
n
Category Fluency*
English-speaking elders Education-matched sample Literate
43 43
Illiterate Uneducated sample Literate
26 47
Illiterate
11.79 (3.97) 11.55 (3.14) 11.34 (3.48) 12.15 (3.07)
Spanish-speaking elders Education-matched sample Literate Illiterate
32 32
Uneducated sample Literate
17
Illiterate
43
10.88 (3.90) 11.59 (3.21) 10.51 (3.06) 12.38 (3.02)
*Score represents number of words averaged over three conditions (animals, food, and clothing).
Table A11.32. [VF.27] Boone, 1999: FAS Data for a Healthy Elderly Sample* Partitioned into Three IQ Levels

WAIS-R IQ Level   Average        High Average   Superior
n                 53             39             59
FAS score† (SD)   36.45 (9.26)   38.87 (9.22)   44.31 (11.88)

*Mean age for the sample is 63.07 (9.29), mean education is 14.57 (2.55), mean full-scale IQ is 115.41 (14.11). †Total number of words for three letters.
Table A11.33. [VF.28] Demakis, 1999: FAS Data for the Control Sample

n    Age           Education     M/F Ratio   FAS Score*
21   22.5 (7.99)   13.6 (1.46)   33/67       37.8 (11.1)

*Total number of words for three letters.
Table A11.34. [VF.29] Epker et al., 1999: Data for the FAS and Animal Naming for the Control Sample

n    Age          Education    M/F Ratio   MMSE           FAS Score*      Animals Score
65   70.6 (4.7)   14.3 (2.9)   22/43       28.45 (1.44)   45.31 (12.67)   19.49 (4.67)

*Total number of words for three letters. MMSE, Mini-Mental State Exam.
Table A11.35. [VF.30a] Tombaugh et al., 1999: Data for the FAS and Animal Naming for a Sample of Healthy Adults Stratified by Demographic Groups

              FAS*                    Animal Naming
              n       M (SD)          n     M (SD)
Education
  0-8         163     24.9 (10.7)     140   13.9 (3.9)
  9-12        664     36.7 (12.2)     377   16.7 (4.6)
  13-16       392     42.6 (11.6)     173   19.0 (5.2)
  17-21       81      43.9 (12.3)     44    19.5 (5.2)
Age
  16-19       19      39.3 (12.0)     19    21.5 (4.4)
  20-29       106     41.2 (9.2)      41    19.9 (5.0)
  30-39       132     43.1 (11.4)     43    21.5 (5.5)
  40-49       121     43.5 (12.2)     45    20.7 (4.2)
  50-59       144     42.1 (11.1)     43    20.1 (4.9)
  60-69       220     38.5 (13.7)     92    17.6 (4.7)
  70-79       334     34.8 (12.8)     228   16.1 (4.0)
  80-89       200     28.9 (11.7)     200   14.3 (3.9)
  90-95       24      28.2 (11.0)     24    13.0 (3.8)
Gender
  Male        559     37.0 (13.0)     310   17.4 (5.1)
  Female      741     37.8 (13.1)     425   16.5 (5.0)
Total         1,300   37.5 (13.1)     735   16.9 (5.0)

*Total number of words for three letters.
Table A11.36. [VF.30b] Tombaugh et al., 1999: Data* for the FAS Stratified by Three Age Groups x Three Education Groups
Age 60-79
16-59
80-95
Education Percentile Score
0-8 (n=12)
9-12 (n=268)
90 80 70 60 50 40 30 20 10
48 45 42 39 36 35 30 27
32
M
38.5 (12.0)
(SD)
34
13-21 (n=242)
0-8 (n=76)
9-12 (n=292)
13-21 (n=185)
0-8 (n=75)
39 36 31 27 25 22 20 17 13
54 47 43 39 35 32 28 24 21
59 53 49 45 41 38 36 34 27
33 29
28
61 55 51 49 45 42 38 35 30
40.5 (10.7)
44.7 (11.2)
25.3 (11.1)
35.6 (12.5)
42.0 (12.1)
56
50 47 43 40 38 35
9-12 (n=102)
13-21 (n=46) 56
24 22 21 19 17 13
42 38 34 31 29 27 24 22 18
22.4 (8.2)
29.8 (11.4)
37.0 (11.2)
26
47 43 39 36 33 30 28 23
•Total number of words for three letters.
Table A11.37. [VF.30c] Tombaugh et al., 1999: Data for Animal Naming Stratified by Three Age x Three Education Groups
Age 60-79
16-59
80-95
Education Percentile Score
0-8 (n=12)
9-12 (n=268)
13-21 (n=242)
0-8 (n=76)
9-12 (n=292)
13-21 (n=185)
0-8 (n=75)
9-12 (n=102)
13-21 (n=46)
90 75 50 25 10
23 20 17 15
30 25 23 18 16
20 17 14 12 11
22 19 17 14 12
25 22 19 16 13
18 16 13 11 9
19 17 14 12 11
24 20 16 14 12
M (SD)
19.8 (4.2)
21.9 (5.4)
14.4 (3.4)
16.4 (4.3)
18.2 (4.2)
13.1 (3.8)
13.9 (3.4)
16.3 (4.3)
26
Table A11.38. [VF.31] Basso et al., 1999: Data for a Sample of Healthy Men on Two Testing Probes over a 12-Month Interval

n    Age            Education      WAIS-R FSIQ      FAS Score*, Test   FAS Score*, Retest
50   32.50 (9.27)   14.98 (1.93)   109.30 (12.29)   47.68 (10.82)      48.42 (12.06)

*Total number of words for three letters.
Table A11.39. [VF.32a] Gladsjo et al., 1999: Data for the FAS and Animal Naming for a Healthy Sample* Stratified by Age and Education

Education    0-11 (n = 103)                 12-15 (n = 415)                16-20 (n = 250)
Age Range    FAS†            Animal         FAS             Animal         FAS             Animal
20-34        38.21 (13.43)   17.74 (5.52)   40.30 (9.59)    21.11 (5.90)   44.38 (10.54)   22.88 (4.73)
35-49        33.32 (11.93)   18.36 (6.63)   40.63 (11.43)   19.82 (6.26)   47.27 (13.33)   22.28 (5.57)
50-101       31.47 (13.21)   15.28 (3.80)   38.63 (11.98)   18.05 (4.81)   41.81 (12.75)   19.35 (4.42)

*Sample is 52% male, 45% African American, 55% Caucasian.
†Total number of words for three letters.
Table A11.40. [VF.32b] Gladsjo et al., 1999: Data for the FAS and Animal Naming for Healthy African-American and Caucasian Participants, Stratified by Age

             African Americans* (n = 346)   Caucasians† (n = 422)
Age Range    FAS‡            Animal         FAS‡            Animal
20-34        38.94 (10.49)   19.44 (5.17)   45.37 (9.45)    24.79 (5.20)
35-49        38.61 (12.57)   19.11 (6.16)   47.00 (11.81)   22.96 (5.64)
50-101       33.87 (12.96)   16.68 (4.71)   40.32 (12.39)   18.59 (4.63)

*Mean age 39.2 (12.6) years, mean education 13.4 (~.5) years.
†Mean age 59.0 (19.6) years, mean education 14.5 (2.8) years.
‡Total number of words for three letters.
Table A11.41. [VF.33] Binder et al., 1999: Data for a Normal Elderly Sample

n     Age          Education    % Male   % Caucasian   GDS* Score   Blessed† Score   Animal Naming
125   82.3 (4.4)   13.5 (3.0)   25       87            1.8 (1.8)    2.1 (2.1)        15.5 (4.5)

*Geriatric Depression Scale. †Short Blessed Orientation-Memory-Concentration Test.
Table A11.42. [VF.34] Fama et al., 2000: Data for the FAS, Animal Naming, and Inanimate Objects Naming Conditions for the Control Sample

n    Age          Education    FAS Score*    Animal Naming Score   Inanimate Objects Naming Score
51   66.7 (7.4)   16.4 (2.3)   41.2 (12.9)   22.1 (4.4)            26.1 (7.3)

*Total number of words for three letters.
Table A11.43. [VF.35] Troyer, 2000: Data• for the Letter and Category Fluency Conditions for a Sample of Healthy Adults Version FAS/CFL'
n
Age
Education
M/F Ratio
Test Score
257/154
59.8 (20.7)
13.9 (2.9)
aono
42.5 (11.7)
Animals
407
19.5 (5.3)
Supermarket
156
22.9 (5.8)
"Demographic information is provided for a larger sample. tTbe mean for FAS/CFL performance.
Table A11.44. [VF.36a] Acevedo et al., 2000: Data for the Category Fluency Test for Healthy English-Speaking Participants

             n     Animal Naming   Vegetables   Fruits       Total
Age
  50-59      37    18.4 (4.9)      16.0 (4.1)   16.0 (4.1)   50.4 (10.6)
  60-69      107   17.1 (4.2)      14.4 (3.9)   13.7 (3.7)   45.2 (9.6)
  70-79      172   15.2 (4.3)      13.6 (3.5)   12.5 (3.1)   41.3 (8.4)
Education
  8-12       112   15.0 (4.3)      14.2 (3.8)   13.0 (3.1)   42.2 (8.7)
  13-16      154   16.3 (4.0)      14.0 (3.7)   13.3 (3.9)   43.6 (9.5)
  17+        50    18.8 (5.4)      14.7 (3.9)   13.9 (3.6)   47.4 (10.7)
Gender
  Male       82    16.2 (4.6)      11.9 (2.8)   11.9 (3.3)   40.0 (8.7)
  Female     234   16.3 (4.5)      15.0 (3.7)   13.8 (3.6)   45.0 (9.5)
Total              16.2 (4.5)      14.2 (3.8)   13.3 (3.6)   43.7 (9.6)

Table A11.45. [VF.36b] Acevedo et al., 2000: Data for the Category Fluency Test for Healthy Spanish-Speaking Participants

             n     Animal Naming   Vegetables   Fruits       Total
Age
  50-59      64    16.3 (3.9)      13.0 (3.6)   13.2 (3.3)   42.6 (8.4)
  60-69      97    17.2 (5.3)      13.1 (4.0)   13.4 (3.4)   43.6 (10.0)
  70-79      76    16.3 (4.4)      12.3 (3.6)   12.8 (3.6)   41.3 (9.7)
Education
  8-12       105   15.8 (4.6)      12.6 (3.7)   12.4 (3.3)   40.9 (9.5)
  13-16      94    17.1 (4.1)      13.1 (3.9)   13.8 (3.5)   44.0 (9.6)
  17+        38    17.7 (5.7)      12.8 (3.4)   13.5 (3.2)   44.0 (8.8)
Gender
  Male       73    16.6 (5.7)      11.3 (3.8)   12.2 (3.5)   40.1 (10.7)
  Female     164   16.7 (4.2)      13.5 (3.6)   13.5 (3.3)   43.7 (8.8)
Total              16.7 (4.7)      12.8 (3.6)   13.1 (3.4)   42.6 (9.5)
Table A11.46. [VF.36c] Acevedo et al., 2000: Data for English Speakers Stratified by Gender x Age and Gender x Education

Age             Men                                      Women
                50-59        60-69        70-79          50-59         60-69        70-79
n               7            30           45             30            77           127
Animals         16.4 (3.3)   16.4 (4.9)   16.0 (4.7)     18.9 (5.1)    17.3 (3.9)   15.0 (4.2)
Fruits          12.3 (2.3)   11.7 (3.5)   11.9 (3.4)     16.9 (3.9)    14.4 (3.5)   12.7 (3.0)
Vegetables      11.7 (1.7)   11.8 (2.8)   12.0 (3.0)     17.0 (3.8)    15.4 (3.8)   14.2 (3.5)
Total fluency   40.3 (4.5)   40.0 (9.7)   39.8 (8.6)     52.7 (10.2)   47.2 (8.8)   41.9 (8.3)

Education       Men                                      Women
                8-12         13-16        17+            8-12          13-17        17+
n               25           42           15             87            112          35
Animals         15.6 (4.4)   16.1 (4.4)   17.4 (5.8)     14.8 (4.3)    16.4 (3.9)   19.4 (5.2)
Fruits          11.9 (3.4)   11.7 (3.3)   12.3 (3.3)     13.3 (2.9)    13.9 (4.0)   14.6 (3.6)
Vegetables      12.2 (2.3)   11.7 (3.1)   12.0 (3.0)     14.8 (3.9)    14.8 (3.6)   15.9 (3.7)
Total fluency   39.8 (8.3)   39.4 (8.8)   41.7 (9.3)     42.9 (8.8)    45.2 (9.3)   49.9 (10.4)
Table A11.47. [VF.36d] Acevedo et al., 2000: Data for Spanish Speakers Stratified by Gender x Age and Gender x Education

Age             Men                                          Women
                50-59         60-69         70-79            50-59        60-69        70-79
n               15            32            26               49           65           50
Animals         15.5 (3.4)    18.0 (7.2)    15.4 (4.2)       16.6 (4.1)   16.7 (4.0)   16.7 (4.5)
Fruits          11.1 (3.0)    12.7 (3.9)    12.4 (3.2)       13.8 (3.1)   13.7 (3.2)   13.0 (3.8)
Vegetables      11.5 (3.4)    11.0 (3.7)    11.6 (4.2)       13.5 (3.6)   14.1 (3.7)   12.7 (3.3)
Total fluency   38.3 (7.8)    41.7 (12.3)   39.3 (9.9)       43.9 (8.2)   44.6 (8.6)   42.2 (9.5)

Education       Men                                          Women
                8-12          13-16         17+              8-12         13-17        17+
n               39            21            13               66           73           25
Animals         16.3 (5.4)    16.8 (5.2)    17.1 (7.7)       15.6 (4.2)   17.2 (3.8)   18.0 (4.5)
Fruits          12.2 (3.7)    12.8 (3.2)    11.5 (3.2)       12.6 (3.1)   14.1 (3.6)   14.5 (2.7)
Vegetables      11.5 (4.0)    10.6 (3.7)    10.9 (3.4)       13.1 (3.6)   13.8 (3.7)   13.8 (3.0)
Total fluency   40.3 (10.9)   40.2 (10.6)   39.5 (10.9)      41.2 (8.6)   45.1 (9.1)   46.4 (6.5)
Table A11.48. [VF.37] Chen et al., 2000: Data for the Control Sample of Nondemented Elderly*

n     Age          % Male   Letter Fluency†   Category Fluency‡
483   74.9 (4.4)   37.5     23.46 (7.26)      27.70 (6.31)

*Lower than high school education, 31.9%. †Total number of words for letters P and S. ‡Total number of words for Animals and Fruits categories.
Table A11.49. [VF.38a] Anstey et al., 2000: Data for a Sample of Australian Elderly*

n     Age            Education      % Male   FAS score†
280   79.04 (6.59)   11.25 (2.79)   14       32.76 (11.33)

*Participants with Mini-Mental State Exam scores as low as 17 were included. †Total number of words for three letters.
Table A11.50. [VF.38b] Anstey et al., 2000: Percentile Distribution for the Sample Stratified by Age x Education
Age
62-69
70-79
90-95
80-89 Education
n
0-9
10-12
~13
0-9:
10-12
~13
0-9
10-12
~13
0-9
10-12
~13
7
7
6
43
60
29
36
29
26
8
4
5
13 13 16 26
44
27 27 32 47 51
10 15 22· 28! I 37. 46:
15 18 22 31 42 46 49
15 23 31 38 47 51 49
12 16 21 33 37 43 47
17 22 26 32 40 49
17 21 29 36 43 49 58
13 13 25 31 36
20 20 21 30 42
22 22 24
20
10 10 36 41
p~
5 10 25 50 75 90 95
ssl
Table A11.51. [VF.39] Brady et al., 2001:tData for the Initial Test and Retest 3 Years La~r for a 1 Sample of Healthy Males
n
Age
235
66.41 (6.73)
Education
14.03 (2.62)
Males
Test 1 ~
1009&
50
Table A11.53. [VF.41] Grady et al., 2002: Data for the Animal Naming Test for a Sample• of Women with Established Coronary Disease, Stratified into Estrogen/Progestin Replacement Treatment and Placebo Groups
Test 2
19.0 : (4.8>
I
18.3 (4.9)
Table A11.52. [VF.40] Rosselli et -~L 2002a: Performance on the FAS and Animal NlllllflgTests for Three Linguistic Groups 1
33 56
Animal Group
n
Age
Education
%White Race
Naming Score
Treabnent 517 66.3 (6.4)
12.7 (2.7)
90.9
15.9 (4.8)
Placebo
12.7 (2.7)
90.5
16.6 (4.8)
546 67.3 (6.3)
•All participants were <80 years of age.
: Animal n
Monolinguals English
45
Spanish
18
FAS Score•
37.7 (5.3) 34.9 (4.1)
I Naming Score
I
16.8 (5.2) 16.7 (3.8)
Bilinguals
Table A11.54. [VF.42] Giovannetti et al., 2003: Data for the Animal Naming Test for the Control Sample Animal n
English
19
Spanish
19
34.9 (4.8) 35.2 (4.8)
14.2 (4.1) 14.5 (3.8)
Age
31 25.2 (6.07)
Education
15.0 (1.48)
WAIS-R MIF Ratio FSIQ"
21110
109.3 (11.51)
Naming Score
24.9 (5.5)
"Wechsler Adult Intelligence Scale-Revised full-scale IQ. "Total number of words for three letters.
Table A11.55. [VF.43a] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education on the PMR Version of Phonemic Fluency and Animal Naming, Stratified by Age Group

Age Group   n    Vocabulary*     PMR Scores†     Animal Naming Scores
18-29       71   16.86 (9.37)    31.45 (9.92)    16.83 (4.28)
30-49       44   19.41 (10.36)   33.95 (12.67)   19.21 (5.66)

*Wechsler Adult Intelligence Scale-III Vocabulary raw scores (Mexican version). †Total number of words for three letters.

Table A11.56. [VF.43b] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education on the PMR Version of Phonemic Fluency and Animal Naming, Stratified by Education Group

Education Group   n    Vocabulary*     PMR Scores†     Animal Naming Scores
0-6               56   14.38 (8.33)    28.87 (10.14)   16.70 (4.49)
7-10              59   21.12 (10.02)   35.71 (10.96)   18.71 (5.24)

*Wechsler Adult Intelligence Scale-III Vocabulary raw scores (Mexican version). †Total number of words for three letters.
Table A11.57. [VF.43c] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education on the PMR Version of Phonemic Fluency and Animal Naming, Stratified by Age x Education

Age     Education   n    Vocabulary*     PMR Scores†     Animal Naming Scores
18-29   0-6         30   12.63 (6.49)    27.24 (8.70)    15.34 (4.54)
18-29   7-10        41   19.95 (10.04)   34.50 (9.72)    17.90 (3.79)
30-49   0-6         26   16.38 (9.85)    30.76 (11.48)   18.28 (3.95)
30-49   7-10        18   23.78 (9.74)    38.39 (13.23)   20.50 (7.36)

*Wechsler Adult Intelligence Scale-III Vocabulary raw scores (Mexican version). †Total number of words for three letters.
Table A11.58. [VF.43d] Lopez-Carlos et al., 2003: Data for Monolingual Spanish Speakers with ≤10 Years of Education on the PMR Version of Phonemic Fluency and Animal Naming, Stratified by Country Group

Country Group      n    Vocabulary*     PMR Scores†     Animal Naming Scores
Los Angeles, USA   65   16.18 (8.52)    30.00 (11.09)   17.16 (4.46)
Jalisco, Mexico    50   19.98 (10.96)   35.62 (10.30)   18.52 (5.53)

*Wechsler Adult Intelligence Scale-III Vocabulary raw scores (Mexican version). †Total number of words for three letters.
Table A11.59. [VF.44] Miller, 2003 (An Update on Selnes et al., 1991): Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study, Stratified by Age x Education

Age     Education (Years)   FAS*, Mean (SD)   n     Animal Naming, Mean (SD)   n
25-34   <16                 40.35 (10.98)     88    21.10 (4.14)               30
25-34   16                  48.61 (11.99)     80    22.52 (4.77)               27
25-34   >16                 49.54 (12.43)     81    23.80 (5.72)               25
25-34   Total               46.00 (12.47)     249   22.39 (4.93)               82
35-44   <16                 41.78 (11.80)     100   22.16 (5.73)               32
35-44   16                  43.52 (12.86)     87    22.32 (4.81)               25
35-44   >16                 49.16 (12.44)     156   24.35 (5.61)               48
35-44   Total               45.58 (12.77)     343   23.20 (5.52)               105
45-59   <16                 41.05 (10.41)     39    20.27 (3.58)               11
45-59   16                  42.22 (12.06)     32    22.00 (3.79)               7
45-59   >16                 48.56 (13.47)     65    23.82 (5.42)               22
45-59   Total               44.91 (12.75)     136   22.53 (4.87)               40
Total   <16                 41.10 (11.23)     227   21.44 (4.82)               73
Total   16                  45.36 (12.62)     199   22.37 (4.61)               59
Total   >16                 49.13 (12.63)     302   24.08 (5.54)               95
Total   Total               45.60 (12.65)     728   22.79 (5.19)               227

*Total number of words for three letters.
Table A11.60. [VF.45] Ravdin et al., 2003: Data for a Sample of Healthy Elderly* Stratified by Three Age Groups

Age Group                 60-69           70-79           80-92
n                         34              80              35
Age                       66.26 (2.59)    74.56 (2.67)    83.71 (3.66)
Education                 16.00 (2.17)    15.54 (2.62)    15.23 (3.21)
Estimated VIQ†            120.00 (5.36)   121.24 (5.32)   119.14 (6.83)
Letter fluency total‡     49.56 (11.57)   46.81 (10.76)   44.46 (13.28)
  C                       17.35 (4.97)    16.64 (4.27)    16.03 (5.51)
  F                       17.29 (3.66)    15.50 (4.10)    15.43 (4.34)
  L                       14.91 (4.46)    14.74 (4.10)    13.00 (5.09)
Semantic fluency total§   51.35 (10.25)   46.91 (10.03)   41.60 (8.10)
  Animals                 20.68 (5.59)    19.09 (5.38)    16.34 (4.07)
  Fruits                  14.41 (3.39)    13.63 (3.41)    12.17 (3.43)
  Vegetables              16.26 (4.00)    14.20 (3.93)    13.17 (2.86)

*The sample included 32 males and 117 females. †Verbal IQ is estimated with the AMNART.
‡Total number of words for three letters. §Total number of words for three categories.
Appendix 11m: Meta-Analysis Tables for the Verbal Fluency Test
Table A11m.1. Results of the Meta-Analysis and Predicted Scores for the FAS (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis: 18
Years of publication: 1986-2000
Number of data points used in the analysis: 30 (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants: 3,469

Variable              n*    x̄†       SD†     Range
Sample size
  Mean                30    79.20    86.81   19-411
Age
  Mean                30    49.55    19.66   17.8-74.3
  SD                  30    6.42     4.51    1.4-20.7
Education
  Mean                29    14.31    1.98    9.4-17.0
  SD                  29    2.33     0.82    0.8-4.2
IQ
  Mean                7     113.91   7.72    102-121.9
  SD                  7     10.45    2.33    8.2-14.1
Percent male          17    54.33    28.41   0-100
Test score means
  Combined mean       30    41.84    3.65    35.0-49.4
  Combined SD         30    11.20    1.41    7.8-13.6

*Number of data points differs for different analyses due to missing data. †Weighted means and SDs.
Table A11m.1. (Contd.)

Predicted number of words generated and SDs per age group* (FAS)

Age Range   Predicted Score   95% CI Lower Band   95% CI Upper Band
18-19       42.13             40.11               44.16
20-24       43.20             41.47               44.92
25-29       44.21             42.70               45.72
30-34       44.87             43.43               46.30
35-39       45.17             43.74               46.61
40-44       45.13             43.68               46.58
45-49       44.73             43.30               46.16
50-54       43.99             42.63               45.35
55-59       42.89             41.66               44.12
60-64       41.44             40.36               42.51
65-69       39.64             38.69               40.59
70-74       37.48             36.49               38.48

*Based on the equation:
Predicted test score = 34.29763 + 0.5537161 × age - 0.0070315 × age²

Correction for education
Years of Education    10      11      12      13      14    15      16      17
Correction Factor     -2.00   -1.50   -1.00   -0.50   0     +0.50   +1.00   +1.50

With every year of education above or below 14, we suggest correcting the obtained score by adding or subtracting 0.50 to or from the predicted score given in the table for the relevant age group. Extrapolation of this correction outside the boundaries of 10-17 years of education should be made with caution as empirical data are not available beyond these education ranges.
Standard deviation for all age groups is 11.10.
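The regression equation and the education correction above can be applied directly to an individual's age and years of education. The following is a minimal Python sketch, not part of the original appendix: the function names are hypothetical, and the z-score conversion assumes that the aggregate SD of 11.10 is used for every age group, as the table suggests. The tabled predicted scores appear to correspond to the equation evaluated near the middle of each age band (for example, 42.5 for ages 40-44).

```python
def fas_predicted_score(age, education_years=14.0):
    """Predicted FAS total from the Table A11m.1 equation, with the
    suggested education correction applied to the predicted score."""
    base = 34.29763 + 0.5537161 * age - 0.0070315 * age ** 2
    # 0.50 words per year of education above or below 14; the source
    # cautions against extrapolating beyond 10-17 years of education.
    return base + 0.50 * (education_years - 14.0)


def fas_z_score(observed, age, education_years=14.0, sd=11.10):
    """Standardize an observed FAS total against the predicted score
    (assumes the aggregate SD of 11.10 applies to all ages)."""
    return (observed - fas_predicted_score(age, education_years)) / sd


if __name__ == "__main__":
    print(round(fas_predicted_score(42.5), 2))      # 45.13, the 40-44 entry
    print(round(fas_predicted_score(42.5, 12), 2))  # 44.13 after a -1.00 correction
    print(round(fas_z_score(33, 42.5, 12), 2))
```

Applying the correction to the predicted score rather than to the observed score follows the wording of the correction note; either choice yields the same difference between observed and expected performance.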
Table A11m.1. (Contd.)
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 30
Number of clusters: 18
R²: 0.711
F(2,17) = 321.40, p < 0.000

Term       Coefficient   SE      t       p        95% CI
Age        0.5537161     0.109   5.07*   0.000*   0.323 to 0.784
Age²       -0.0070315    0.001   -6.41   0.000    -0.009 to -0.005
Constant   34.29763      2.428   14.13   0.000    29.18 to 39.42

*Significance test for age centered (sample means - aggregate mean): t = -6.62, p = 0.000.

Prediction
Predicted age range: 18-74 years
Mean predicted score: 42.91 (2.89)
SEe: 0.71
95% CI: 41.52-44.29
Figure A11m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the FAS. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 42.419
Pooled estimates for random effect: 42.219
Q: Q(29) = 270.34, p < 0.000
Moment-based estimate of between-study variance: 9.861
Tests for model fit: addition of a quadratic term

Model       R²      Adjusted R²   BIC      BIC'
Linear      0.478   0.459         47.038   -16.083
Quadratic   0.711   0.690         32.659   -30.462

BIC' difference of 14.379 provides very strong support for the quadratic model.
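The model-comparison figures for the FAS can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the source: it assumes that the adjusted R² values follow the usual formula with n equal to the 30 data points, and that the reported BIC' difference is the linear value minus the quadratic value. Both assumptions reproduce the tabled numbers.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R-squared for n_obs observations and n_predictors terms."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values transcribed from the FAS model-fit table (30 data points).
models = {
    "linear": {"r2": 0.478, "k": 1, "bic_prime": -16.083},
    "quadratic": {"r2": 0.711, "k": 2, "bic_prime": -30.462},
}

# A lower BIC' indicates the preferred model; the difference measures the support.
bic_prime_difference = models["linear"]["bic_prime"] - models["quadratic"]["bic_prime"]

if __name__ == "__main__":
    for name, m in models.items():
        print(name, round(adjusted_r2(m["r2"], 30, m["k"]), 3))
    print(round(bic_prime_difference, 3))  # 14.379
```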
Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.983, p = 0.891
Homoscedasticity (White's general test): 1.672, p = 0.796

Significance tests for regression with the standard deviations
A regression of SDs on age yielded an R² of 0.018 (F(1,17) = 0.35, p = 0.561). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables

Education
Est. tau² without education: 1.375
Est. tau² with education: 0
Regression of test means on education and age
Number of observations: 29
Number of clusters: 17
R²: 0.768

Term        Coefficient   SE      t      p       95% CI
Education   0.4976037     0.202   2.47   0.025   0.070 to 0.925

IQ
Regression of test means on IQ and age
Number of observations: 7
Number of clusters: 4
R²: 0.914

Term   Coefficient   SE      t      p       95% CI
IQ     0.1135096     0.036   3.17   0.050   -0.000 to 0.227

Gender
t-test by gender:

n        X̄ male (SD)      X̄ female (SD)    M-F diff.   t        p
6M, 6F   43.783 (1.655)   43.872 (1.707)   -0.088      -0.037   0.515
Table A11m.2. Results of the Meta-Analysis and Predicted Scores for Animal Naming (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis: 11
Years of publication: 1991-2003
Number of data points used in the analysis: 25 (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants: 2,823

Variable              n*    x̄†       SD†     Range
Sample size
  Mean                25    86.67    81.58   31-411
Age
  Mean                25    62.60    18.13   25 . ~7 . 5
  SD                  25    5.47     3.67    2.3-20.7
Education
  Mean                24    14.42    1.89    10.5-17.0
  SD                  24    2.49     0.91    0.8-4.7
IQ
  Mean                1     109.30
  SD                  1     11.51
Percent male          14    48.38    27.46   26.0-100
Test score means
  Combined mean       25    18.94    3.15    13.3-24.9
  Combined SD         25    4.65     0.54    3.7-5.

*Number of data points differs for different analyses due to missing data. †Weighted means and SDs.
Predicted number of words generated and SDs per age group* (Animal Naming)

Age Range   Predicted Score   95% CI Lower Band   95% CI Upper Band
25-29       24.28             23.34               25.22
30-34       23.52             22.62               24.41
35-39       22.75             21.89               23.62
40-44       21.99             21.15               22.84
45-49       21.23             20.39               22.08
50-54       20.47             19.62               21.33
55-59       19.71             18.83               20.59
60-64       18.95             18.03               19.87
65-69       18.19             17.22               19.16
70-74       17.43             16.40               18.46
75-79       16.67             15.57               17.77
80-84       15.91             14.73               17.08
85-87       15.38             14.14               16.61

*Based on the equation:
Predicted test score = 28.45972 - 0.1521419 × age

Standard deviation for all age groups is 4.65.
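The linear equation for Animal Naming can be evaluated for any age within the predicted range of 25-87 years. The sketch below is illustrative and not part of the source: the function names are hypothetical, and the z-score assumes the single aggregate SD of 4.65 stated above. Evaluating the equation at mid-band ages reproduces the tabled predicted scores.

```python
ANIMAL_NAMING_SD = 4.65  # aggregate SD suggested for all age groups


def animal_naming_predicted(age):
    """Predicted Animal Naming score from the Table A11m.2 equation."""
    return 28.45972 - 0.1521419 * age


def animal_naming_z(observed, age):
    """Standardize an observed score against the age-predicted score."""
    return (observed - animal_naming_predicted(age)) / ANIMAL_NAMING_SD


if __name__ == "__main__":
    # Mid-band ages for the 25-29, 60-64, and 80-84 rows of the table.
    for age in (27.5, 62.5, 82.5):
        print(age, round(animal_naming_predicted(age), 2))
```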
Figure A11m.2. A scatterplot illustrating the dispersion of the data points around the regression line for Animal Naming. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 25
Number of clusters: 11
R²: 0.764
F(2,10) = 177.27, p < 0.000

Term       Coefficient   SE      t        p       95% CI
Age        -0.1521419    0.011   -13.31   0.000   -0.178 to -0.127
Constant   28.45972      0.679   41.93    0.000   26.95 to 29.97

Prediction
Predicted age range: 25-87 years
Mean predicted score: 19.73 (2.93)
SEe: 0.49
95% CI: 18.76-20.70
Tests for assumptions and model fit Tests for heterogeneity in the final data set Pooled estimates for fixed effect
18.753 18.768 Q<24> = 1312.69, p
Pooled estimates for random effect
Q
10.565
Tests for model fit: addition of a quadratic term

Model       R²      Adjusted R²   BIC      BIC'
Linear      0.764   0.754         17.198   -32.916
Quadratic   0.765   0.743         20.376   -29.739

BIC' difference of 3.178 provides positive support for the linear model.
Table A11m.2. (Contd.)

Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.955, p = 0.322
Homoscedasticity (White's general test): 1.439, p = 0.487

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.319 (F(1,10) = 10.62, p = 0.009). Therefore, the SD for the aggregate sample is suggested for use with all age groups.

Effects of demographic variables

Education
Est. tau² without education: 5.443
Est. tau² with education: 2.911
Regression of test means on education and age
Number of observations: 24
Number of clusters: 11
R²: 0.790

Term        Coefficient   SE      t      p       95% CI
Education   0.2975954     0.154   1.93   0.083   -0.046 to 0.642

IQ
Sufficient information for inclusion of IQ into regression analysis was not available.

Gender
Sufficient information for a t-test by gender was not available.
Appendix 12: Locator and Data Tables for the Rey-Osterrieth Complex Figure (ROCF)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 12.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A12.1. Locator Table for the Rey-Osterrieth Complex Figure (ROCF) Study
Age•
ROCF.1 Powell, 1979 page 255 Table A12.2
41.0 (14.05)
ROCF.2 King, 1981 page 255 Table A12.3
39.6 (21.4)
n 64
71
<30 30--60 >60
Sample Composition
IQ/ Education•
Trials Reported
Location
Right-handed patients referred for neurological screening, but brain damage was ruled out
VIQ: 107.70 (16.80) PIQ: 83.70 (21.55)
%retention on 40-minute
Control group: healthy volunteers or patients with nonneurological or psychiatric conditions; divided into 3 age subgroups
Education: 11.4 (2.9) FSIQ: 104.5 (18.1)
Copy.
Male college students with good or poor attentional abilities (as determined by performance on CPT) were compared Volunteers: 76 M, 31 F: no history of head injury or epilepsy
College students
Copy, 3-minute recall
Maryland
IQ: 104.9 (7.6)
Copy, 40-minute recall (quantified technique)
England
London
recall
Canada
40-minute recall, %recall,
ROCF.3 Huhtaniemi et al., 1983 page 256 Table A12.4
19-29 22
25
ROCF.4 Bennett-Levy, 1984 page 256 Table A12.5
29.3 (9.3) 17-49
107
ROCF.5 Speers & Hibbler, unpublished manuscript page 257 Table A12.6
35.00
40
20 M, 20 F normal volunteers, modified ROCF was administered
Education: 16.15 (2.77) 10-22 Q.T. IQ: 107.93 (8.73) 87-123
Copy, immediate recall, 30-minute recall, 24-hour recall (Percent recall)
California
200
Normal Spanish-speaking, right-handed subjects; data are stratified into 5 age ranges, by gender and by education
Education: Illiterate vs. ≥10
Copy
Colombia
(10.79) 23-70
ROCF.6 Ardila et al., 1989
16-25
page 258
26-35
Table A12.7
36-45 46-55
ROCF.7 Van Gorp et al., 1990 page 259 Table A12.8
ROCF.8 Berry et al., 1991 page 259 Table A12.9
57-85 57-65 66-70 71-75 76-85 50-79
156 28 45
57 107
(2.9)
64
ROCF.9 Tombaugh & Hubley, 1991 page 260 Table A12.10, A12.11
67
ROCF.10 Berry & Carpenter, 1992 page 261 Table A12.12
68 (8.5)
Education: 14.1 (2.9) FSIQ: 117.21 (12.59)
Copy, 3-minute recall
55 M, 52 F, elderly Caucasians without cardiac, neurological, or psychiatric disease, or psychoactive medication; Rey and Taylor figures were administered; test-retest data for 1-year interval are provided for 41 subjects
Education: 15 (2.9)
Copy,
Study 1 compared performance on ROCF and Taylor figures; itemized scoring systems were used; study 2 addressed similar issues with the addition of 2 scoring systems-itemized and Osterrieth-Taylor
3rd-year undergraduate students
Copy, immediate recall, 4-minute recall, 20-minute recall
Canada
Healthy older volunteers with no history of neurological or psychiatric illness; divided into 4 equal groups, each exposed to different delay intervals (15, 30, 45, and 60 minutes)
Education: 15 (3.1)
Copy, immediate recall, 4 delay intervals
USA
26
65
31M 29F
Los Angeles,
Healthy elderly without history of neurological or psychiatric disorder; 62 M, 94 F; 4 age groups
CA
Kentucky
immediate recall, 30-minute recall
(continued)
Table A12.1. (Contd.)
Age*
Study
ROCF.11 Delaney et al., 1992 page 262 Table A12.13
ROCF.12 Kuehn & Snow,
n
22-67 45.8
42
46.7
38
1992 page 262 Table A12.14
Sample Composition
Study compared performance of normal adults on ROCF and Taylor figures in test-retest paradigm with intervals of 1 month
Study compared scores on copy, recall, and percent recall for ROCF and Taylor figures in a sample of patients referred for evaluation of possible brain damage; group 1 was presented with ROCF first; group 2 was presented with Taylor figure first
ROCF.13 Boone et al.,
45-59
1993b page 263 Table A12.15
60-69
ROCF.14 Chiulli et al.,
70-93
153
70-74
46 58 49
91
70-83
1995
page 264 Table A12.16
75-79 80-91
IQ/ Education• 6-16 12.8
Trials Reported
Copy, immediate recall, 20-minute recall
Location 7 V.A. Medical Center facilities: CT, CA, FL, VA, MA, MN, Ontario, Canada
Copy, 40-minute recall, percent recall
Canada
Los Angeles, CA
Education: 10.9 FSIQ: 94.7 Education: 13.5 FSIQ: 93.4
Fluent English-speaking, healthy older adults; 34 M, 57 F; 3 age groupings and 4 IQ levels
Education: 14.5 (2.5) FSIQ: 115.9 (13.0) 90-109 110-119 120-129 130-139
Copy, 3-minute recall, %retention
Healthy elderly without any serious medical illnesses and not taking medications; 3 age groups; data on proportion adopting a configural approach are provided
Education:
Copy,
15.3 (2.4) 15.0 (3.6) 13.9 (3.0)
immediate recall, 30-minute recall
ROCF.15 Meyers & Meyers, 1995a page 264 Table A12.17
ROCF.16 Ponton et al., 1996 page 265 Table A12.18
ROCF.17 Rapport et al., 1997 page 265 Table A12.19
ROCF.18 Hartman & Potter, 1998
page 266 Table A12.20
ROCF.19 Ostrosky-Solis et al., 1998 page 266 Table A12.21
Group means: 21.2-23.8
38.4 (13.5)
30 in each of 4 groups
300
16-29 30-39 40-49 50-75 18-84 55.01 (14.31)
318
22.3 18-32 69.8 60-81 20-29 30-39 40-49 50-59 60-69 70-79 80-89
15 15 15 15 15 15 15
Undergraduate students randomly assigned to 4 experimental groups; modified scoring procedure was used
Spanish-speaking healthy volunteers; M/F ratio 40%/60%; data are partitioned by gender (2) x age (4) x education (2)
Veterans referred to a V.A. hospital assessment service; majority were inpatients; 312 M, 6 F; standard and Denman scoring systems were compared
12.2-12.6
Copy, immediate or 3-minute recall, 30-minute recall
Iowa
Copy, 10-minute recall
Los Angeles, CA
10.7 (5.1)
<10 >10
12.62 (2.77)
Two age groups were compared: students (13 M, 17 F) and healthy older adults (12 M, 18 F); BQSS and extended 36-point scoring system were compared
15.3
A sample of 105 healthy Spanish-speaking volunteers was partitioned into 7 age groups
>6
Copy, immediate recall
Copy, immediate recall
North Carolina
Copy, immediate recall, 20-minute recall
Mexico City
16.7
(continued)
Table A12.1. (Contd.) Study
Age•
n
Sample Composition
ROCF.20 Fastenau et al., 1999 page 267 Data are not reproduced
62.9 (14.2)
211
Healthy adults, 45% M, 95% Caucasian; Extended Complex Figure Test developed by the authors was used; data for score conversion are presented in overlapping age groups using midpoint interval technique
14.9 (2.6)
ROCF.21 Schreiber et al., 1999 page 267 Table A12.22
29.5 (11.5)
18
Healthy controls (9 M, 9 F); BQSS and 36-point scoring systems were compared
15.1 (1.7)
ROCF.22 Deckersbach et al., 2000 page 268 Table A12.23
35.13 (12.6)
55
ROCF.23 Miller, 2003 (an update on Selnes et al., 1991) page 268 Table A12.24
40.4 (7.4)
729
in this book
25-34 35-44 45-59
Control sample (38% M); 2 scoring systems measuring organizational approach and Meyers & Meyers' (1995b) system were compared
Seronegative homosexual and bisexual males from MACS; data are partitioned by age x education
IQ/ Education•
Trials Reported
Location Indianapolis
Copy
Boston, MA
16.7 (2.3)
Copy, immediate recall
Massachusetts
16.2 (2.4)
Copy, immediate recall, 20-minute recall
MACS centers
< 16 16 > 16
*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
APPENDIX 12
Table A12.2. [ROCF.1] Powell, 1979: Percent Retention of the ROCF Following a 40-Minute Delay Compared to the Original Copy Score in Individuals Referred for a Neuropsychological Evaluation but Cleared of Neurological Impairment

n    Age            Gender (M/F)   WAIS VIQ         WAIS PIQ        Retention
64   41.0 (14.05)   43/21          107.70 (16.80)   83.70 (21.55)   56.79 (22.25)
Table A12.3. [ROCF.2] King, 1981: Data for the Total Sample and Three Age Groups of Healthy Volunteers and Patients with Non-neurological or Psychiatric Conditions

                                                             ROCF
        n    Age           Education    WAIS FSIQ      Copy         40-Minute Recall   % Recall
Total   71   39.6 (21.4)   11.4 (2.9)   104.5 (18.1)   31.1 (4.5)   16.4 (7.1)         52.3 (20.4)
        36   <30                                       33.0 (2.8)   20.0 (6.4)         60.4 (18.1)
        17   30-60                                     30.5 (4.7)   13.4 (6.0)         44.5 (19.9)
        18   >60                                       27.8 (5.4)   12.2 (5.9)         44.3 (19.9)
Table A12.4. [ROCF.3] Huhtaniemi et al., 1983: Data for Two Groups of Male College Students

                                ROCF
                 WAIS FSIQ    Copy           3-Minute Recall
Good attention   122 (5.2)    30.23 (4.62)   24.92 (5.62)
Poor attention   107 (12.0)   28.92 (5.04)   22.67 (6.81)
Table A12.5. [ROCF.4] Bennett-Levy, 1984: Data for Copy Score, Strict and Lax 40-Minute Delayed Recall, Copy Time, Symmetry, Good Continuation, and Strategy Total for the Entire Sample (Consisting of Medical Patients and Healthy Adults) and for Males and Females Separately

n            Age          IQ            Symmetry     Good Continuation   Strategy Total   Copy Score   Strict Recall   Lax Recall   Copy Time
107          29.3 (9.3)   104.9 (7.6)   10.1 (3.2)   13.2 (3.1)          23.4 (5.0)       28.1 (4.2)   16.3 (5.2)      20.9 (5.8)   158.0 (51.5)
76 Males     29.3 (9.4)   104.0 (7.9)   10.5 (3.1)   13.7 (3.1)          24.2 (4.8)       28.6 (3.9)   17.1 (5.3)      21.9 (5.6)   159.1 (52.8)
31 Females   29.2 (9.1)   106.9 (6.2)   8.9 (3.2)    12.2 (2.8)          21.1 (5.0)       26.9 (4.7)   14.5 (4.7)      18.3 (5.7)   153.6 (47.8)
Table A12.6. [ROCF.5] Speers and Hibbler, Unpublished Manuscript: Data for a Sample of Intact Participants on the Copy Condition Based on the Author's Unique Scoring System and Percent Recalled on Three Recall Conditions

                                                                ROCF % retention
n    Age             Education      Est. IQ*        Copy   Immediate   30 Minutes   24 Hours
40   35.00 (10.79)   16.15 (2.77)   107.93 (8.73)   99     87          84           84
     (23-70)         (10-22)        (87-123)

*IQ was estimated with the Quick Test.
Table A12.7. [ROCF.6] Ardila et al., 1989: Data for the Copy Condition for a Sample* of Healthy Colombians: Illiterates and Those Who Completed at Least 10 Years of Education

        Illiterate        Educated
Age     Men     Women     Men     Women
16-25   21.7    19.7      35.5    35.1
26-35   25.2    16.7      35.6    35.3
36-45   25.3    13.1      34.3    34.1
46-55   22.6    14.5      34.9    35.2
56-65   12.0    10.6      34.7    34.8

*Sample size is 10 for each cell.
Table A12.8. [ROCF.7] Van Gorp et al., 1990: Data for a Sample of Healthy Elderly

n    Age     VIQ            PIQ            Copy          3-Minute Recall
28   57-65   117.2 (11.3)   109.2 (11.6)   32.50 (4.7)   14.45 (5.3)
45   66-70   114.8 (17.0)   111.5 (16.8)   32.93 (3.4)   14.13 (7.8)
57   71-75   122.9 (11.4)   115.1 (11.9)   31.73 (3.4)   11.13 (6.7)
26   76-85   110.6 (11.3)   101.0 (8.8)    30.14 (5.6)   8.41 (5.9)
Table A12.9. [ROCF.8] Berry et al., 1991: Data for the ROCF and Taylor Figures for 54 Healthy Elderly Subjects, as Well as for Baseline and 1-Year Follow-up ROCF Scores in a Subset of 41 Subjects*

n     Age        Education   M/F Ratio
107   65 (8.6)   15 (2.9)    55/52

                           Copy         Immediate    30-Minute Recall
ROCF (n = 54)              33.2 (2.1)   23.4 (6.1)   22.5 (6.0)
Taylor (n = 54)            32.9 (2.3)   24.8 (6.3)   23.3 (7.1)
ROCF, Baseline (n = 41)    32.6 (2.4)   17.8 (5.1)   17.2 (5.1)
ROCF, 1 Year (n = 41)      31.6 (2.8)   17.5 (5.1)   17.9 (5.0)

*A modified scoring system was used in this study.
Table A12.10. [ROCF.9a] Tombaugh and Hubley, 1991: Data for the Sample of Undergraduate Students on the Copy and Three Recall Conditions Based on the Itemized Scoring Systems for the ROCF and Taylor Figures

Figure   n    Copy         Immediate Recall   4-Minute Delay   20-Minute Delay
ROCF     31   69.8 (2.1)   44.1 (13.2)        46.4 (13.4)      48.7 (12.7)
Taylor   33   69.6 (1.5)   52.1 (11.6)        54.2 (10.5)      55.5 (10.0)
Table A12.11. [ROCF.9b] Tombaugh and Hubley, 1991: Data for a Sample of Undergraduate Students on the Copy and Three Recall Conditions for the ROCF and Taylor Figures Based on Two Scoring Systems

Figure   n    Copy         Immediate Recall   20-Minute Delay   30-Day Delay
Itemized scoring system
ROCF     33   69.0 (1.8)   46.9 (11.0)        50.3 (12.2)       29.5 (12.1)
Taylor   34   69.6 (1.5)   56.4 (11.6)        59.4 (10.5)       39.2 (10.0)
Osterrieth-Taylor scoring system
ROCF     33   34.9 (1.2)   23.5 (5.1)         25.5 (6.0)        14.6 (6.1)
Taylor   34   35.1 (1.4)   28.6 (5.9)         30.3 (5.3)        19.8 (8.1)
Table A12.12. [ROCF.10] Berry and Carpenter, 1992: Data for a Sample of Healthy Elderly on the Copy and Two Recall Conditions for Each Experimental Group (Based on Length of Delay Interval)

Delay Period   Sample Size   Age          Education    Copy         Immediate Recall   Delayed Recall
15             15            67.3 (7.8)   15.1 (2.6)   30.8 (3.4)   19.3 (3.9)         19.2 (3.2)
30             15            69.2 (9.9)   15.2 (3.6)   31.0 (4.3)   19.1 (7.6)         18.4 (8.1)
45             15            67.5 (8.5)   15.2 (2.7)   32.5 (2.7)   22.6 (6.3)         22.1 (5.5)
60             15            67.4 (8.4)   15.3 (3.4)   33.4 (2.4)   20.1 (7.5)         18.9 (6.9)
Table A12.13. [ROCF.11] Delaney et al., 1992: Data for a Control Sample on the Copy, Immediate Recall, and 20-Minute Delayed Recall Conditions for the ROCF and Taylor Figures

n    Age (Range)    Education (Range)
42   45.8 (22-67)   12.8 (6-16)

                            ROCF         Taylor
Copy                        33.8 (2.1)   33.6 (2.2)
Immediate recall            21.0 (7.8)   26.1 (6.4)
20-minute delayed recall    20.8 (8.0)   25.7 (7.2)
Table A12.14. [ROCF.12] Kuehn and Snow, 1992: Data for Patients Referred for a Neuropsychological Evaluation on Copy, 40-Minute Delayed Recall, and Percent Recall for the Rey and Taylor Figures*

                                               Copy
Group†   n    Gender      Education   FSIQ    ROCF         Taylor
1        19   12M, 7F     10.9        94.7    31.5 (4.0)   32.7 (3.4)
2        19   7M, 12F     13.5        93.4    31.0 (6.5)   30.5 (7.4)

         Absolute recall             Percent recall
Group†   ROCF         Taylor         ROCF          Taylor
1        11.2 (5.2)   9.2 (7.5)      35.2 (14.6)   29.5 (22.1)
2        9.2 (6.3)    14.2 (6.9)     28.0 (16.8)   46.0 (17.0)

*Mean age for the sample is 46.7 years. †Group 1, ROCF administered first; Group 2, Taylor administered first.
Table A12.15. [ROCF.13] Boone et al., 1993b: Data for a Sample of Healthy Elderly for Three Age Groupings and Four Full-Scale IQ (FSIQ) Levels

n    Age     Education    FSIQ           Copy         3-Minute Recall   % Retention
38   45-59   14.6 (2.6)   114.7 (14.2)   34.2 (1.8)   18.9 (6.1)        55.0 (17.1)
31   60-69   14.4 (2.1)   114.5 (12.8)   33.8 (2.8)   17.3 (5.2)        51.7 (13.8)
22   70-83   14.5 (2.9)   119.4 (10.6)   31.3 (4.7)   13.8 (5.0)        43.8 (14.8)

n    FSIQ      Age           Education    Copy         3-Minute Recall   % Retention
32   90-109    60.3 (9.8)    13.4 (2.2)   32.6 (4.5)   15.2 (4.9)        46.3 (13.3)
23   110-119   62.1 (9.0)    14.2 (2.2)   33.5 (2.0)   16.7 (5.6)        49.6 (16.1)
21   120-129   63.0 (9.3)    15.1 (2.4)   33.7 (2.2)   18.9 (5.4)        56.0 (14.8)
15   130-139   62.9 (10.4)   16.4 (2.4)   34.3 (2.3)   19.4 (7.6)        56.0 (20.2)
Table A12.16. [ROCF.14] Chiulli et al., 1995: Data for a Sample of Healthy Elderly for Three ROCF Conditions
Age Group 70-74
75-78
80-91
n
46
58
49
Age
72.7
82.4
(1.1)
(3.0)
Educcdion Gender(~)
c.,
Accuracy Approach•
15.3
15.0
13.9
(2.4)
(3.6)
(3.0)
52%
59%
49%
32.6 (2.8) 39%
31.0 (4.0)" 36%
29.8 (4.6) 35%
17.2 (6.2) 55%
14.2 (6.6) 41%
12.9 (6.4) 40%
16.9 (6.3) 55%
14.2 (6.2) 52%
12.4 (6.0) 41%
IrruneditJte reccdl Accuracy Approach
:JO..tninutedeltJyetlreccdl Accuracy Approach
"Proportion of subjects adopting a configura! approach.
Table A12.17. [ROCF.15] Meyers and Meyers, 1995a: Data for Undergraduate Students on the Copy Condition and Different Combinations of Three Recall Conditions/Recognition Trial for Each Experimental Group* (n = 30 for Each Group)
Group
Age
Education
Gender
Copy
Immediate Recall
1
23.6 (7.4)
12.2 (0.6)
10M 20F
34.7 (1.7)
26.7 (4.6)
2
21.2 (4.2)
12.4 (0.7)
17M 13 F
35.5 (0.9)
3
23.8 (5.4)
12.6 (0.8)
11M 19 F
35.2 (1.0)
4
21.6 (4.4)
12.6 (0.9)
18M 12 F
35.5 (0.6)
26.6 (4.3)
3-Minute Delay
30-Minute Delay
Recognition
26.6 (4.4)
21.9 (1.3)
27.6 (4.0)
27.7 (3.9)
21.6 (1.3)
27.2 (3.6)
27.4 (3.6)
21.5 (1.5)
25.3 (3.7)
20.9 (1.6)
"Scoring was based on the procedure developel by Meyers and Meyers (see text).
Table A12.18. [ROCF.16] Ponton et al., 1996: Data for a Sample of 300 Spanish-Speaking Healthy Participants Stratified by Gender, Age, and Education

                                Males                                  Females
Age Group   Education (Years)   n    Copy x̄ (SD)    Recall* x̄ (SD)    n    Copy x̄ (SD)    Recall* x̄ (SD)
16-29       <10                 11   30.27 (4.13)   18.32 (5.70)      12   30.00 (4.09)   20.13 (7.47)
            >10                 25   32.76 (3.13)   21.34 (6.31)      30   31.57 (2.83)   19.77 (5.22)
30-39       <10                 13   29.15 (5.68)   14.77 (7.77)      22   27.46 (6.24)   17.25 (6.28)
            >10                 18   31.67 (3.69)   22.17 (6.54)      44   32.05 (4.58)   20.16 (6.08)
40-49       <10                 12   29.50 (4.21)   16.58 (7.97)      16   27.44 (4.77)   15.19 (5.46)
            >10                 17   31.35 (3.69)   21.71 (5.62)      11   31.64 (2.98)   18.73 (5.95)
50-75       <10                 18   26.19 (4.96)   14.06 (4.30)      25   23.52 (7.97)   11.50 (6.26)
            >10                 6    30.83 (4.71)   18.67 (8.85)      20   29.90 (4.97)   16.85 (5.16)

*Ten-minute delayed recall.
Table A12.19. [ROCF.17] Rapport et al., 1997: Total Scores and Individual Item Scores for a Sample of Patients Referred to the Veterans Administration Hospital Assessment Service, Scored According to Denman and Standard Systems

                   Denman Copy   Denman Recall*   Standard Copy   Standard Recall
Total score
  M                51.79         23.52            26.01           11.94
  SD               15.57         15.35            7.89            7.64
Individual items
  M                2.16          0.98             1.44            0.66
  SD               0.33          0.44             0.17            0.30

*Immediate recall.
Table A12.20. [ROCF.18] Hartman and Potter, 1998: Data• for Two Age Groups: Students and Healthy Older Adults Group
n
Young
30
Old
30
Education
MIF Ratio
22.3 18-32
15.3
69.8 60-81
16.7
Age (Range)
•The extended 36-point scoring system was used.
Copy
Immediate Recall
13/17
31.1 (3.6)
53.6 (3.2)
12/18
23.7 (5.2)
15.5 (5.5)
Table A12.21. [ROCF.19] Ostrosky-Solis et al., 1998: Data for Seven Age Groups* of Healthy Spanish Speakers Living in Mexico City
Age Group
Mean
20--29
24.4 (2.9) 32.8 (2.8) 44.6 (3.1) 54.2 (2.1) 63.3 (2.8) 74.8 (2.0) 83.4 (3.1)
MIF Ratio
Age
30-39 40-49 50-59 60-69
70-79
80-89
1114 5/10
Immediate
20-Minute
Copy
Recall
Delayed Recall
35.1 (1.3) 35.0
25.8 (4.9) 24.1 (4.7) 19.9 (4.8) 19.2 (4.5) 13.4 (7.4) 12.2 (4.7) 8.9 (4.3)
24.1 (6.8) 24.6 (4.4) 20.4 (5.6) 16.8 (6.9) 15.8 (8.7) 10.8 (4.8) 10.0 (3.9)
(1.1)
619
34.6 (1.6) 34.2 (1.5) 30.3 (5.8) 29.4
4/11
'
817 619
(1.1)
29.2 (5.0)
4/11
"Each group included 15 participants.
Table A12.22. [ROCF.21] Schreiber et al., 1999: Data for the Control Group

n    Age           Education    M/F Ratio   Copy
18   29.5 (11.5)   15.1 (1.7)   9/9         30.7 (3.4)

Table A12.23. [ROCF.22] Deckersbach et al., 2000: Data* for the Control Group

n    Age            Education    % Male   Copy           Immediate Recall
55   35.13 (12.6)   16.7 (2.3)   38       33.81 (2.71)   20.84 (7.47)

*Scores are based on Meyers and Meyers' (1995b) scoring system.
Table A12.24. [ROCF.23] Miller, 2003 (An Update on Selnes et al., 1991): Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study, Stratified by Age x Education

                             Copy                  Immediate Recall      Delayed Recall
Age     Education (Years)    Mean (SD)       n     Mean (SD)       n     Mean (SD)       n
25-34   <16                  34.54 (2.51)    57    22.65 (6.80)    57    22.07 (6.78)    57
        16                   34.81 (1.79)    48    22.83 (6.71)    48    23.03 (6.32)    48
        >16                  35.50 (0.81)    43    26.73 (5.31)    43    26.05 (5.29)    43
        Total                34.91 (2.00)   148    23.90 (6.59)   148    23.54 (6.41)   148
35-44   <16                  34.06 (2.49)   111    21.80 (6.41)   110    21.90 (6.15)   109
        16                   34.46 (2.00)   111    22.66 (6.30)   111    22.63 (6.47)   110
        >16                  35.00 (1.72)   134    23.20 (6.08)   134    22.75 (6.07)   133
        Total                35.00 (2.10)   356    22.60 (6.62)   355    22.45 (6.22)   352
45-59   <16                  33.92 (2.66)    64    19.93 (6.51)    63    19.63 (6.80)    63
        16                   34.47 (2.33)    48    20.68 (6.86)    48    20.10 (7.00)    48
        >16                  34.41 (2.47)   113    21.94 (6.56)   113    21.34 (6.57)   113
        Total                34.28 (2.59)   225    21.10 (6.64)   224    20.59 (6.74)   224
Total   <16                  34.14 (2.54)   232    21.50 (6.59)   230    21.32 (6.55)   229
        16                   34.55 (2.07)   207    22.24 (6.55)   207    22.13 (6.63)   206
        >16                  34.84 (1.99)   290    23.23 (6.34)   290    22.69 (6.34)   289
        Total                34.54 (2.22)   729    22.40 (6.51)   727    22.10 (6.50)   724
Appendix 12m: Meta-Analysis Tables for the Rey-Osterrieth Complex Figure (ROCF)
Table A12m.1. Results of the Meta-Analysis and Predicted Scores for the ROCF, Copy Condition (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis: 9
Years of publication: 1900-2003
Number of data points used in the analysis: 19 (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants: 1,340

Variable              n*    x̄†       SD†     Range
Sample size
  Mean                19    46.64    51.40   15-356
Age
  Mean                19    62.73    19.27   21.2-82.4
  SD                  19    4.00     3.52    1.0-11.3
Education
  Mean                19    14.33    0.98    12.2-16.2
  SD                  19    2.68     0.90    0.1-3.6
IQ
  Mean                5     108.09   6.12    101.0-115.1
  SD                  5     11.74    3.06    8.~16.8
Percent male          11    53.92    21.04   33-100
Test score means
  Combined mean       19    32.20    1.79    29.~.5
  Combined SD         19    3.59     1.46    0.6-5.6

*Number of data points differs for different analyses due to missing data. †Weighted means and SDs.
Table A12m.1. (Contd.)

Predicted scores and SDs per age group (ROCF, Copy)*

            Predicted   95% CI       95% CI       Predicted   95% CI       95% CI
Age Range   Score       Lower Band   Upper Band   SD          Lower Band   Upper Band
22-24       35.04       34.84        35.24        1.10        0.80         1.41
25-29       34.99       34.85        35.14        1.39        1.10         1.67
30-34       34.88       34.64        35.11        1.70        1.42         1.97
35-39       34.69       34.34        35.04        2.01        1.73         2.29
40-44       34.43       33.99        34.87        2.32        2.02         2.63
45-49       34.11       33.60        34.61        2.64        2.29         2.98
50-54       33.71       33.17        34.26        2.95        2.56         3.34
55-59       33.25       32.70        33.80        3.26        2.82         3.70
60-64       32.72       32.19        33.25        3.57        3.08         4.07
65-69       32.11       31.63        32.60        3.89        3.33         4.44
70-74       31.44       31.03        31.86        4.20        3.58         4.81
75-79       30.70       30.37        31.00        4.51        3.83         5.19

*Based on the equations:
Predicted test score = 34.40434 + 0.0595862 × age - 0.0013855 × age²
Predicted SD = -0.333026 + 0.0625042 × age
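Because the ROCF copy condition has both an age-dependent predicted score and an age-dependent predicted SD, an observed score can be standardized against values specific to the examinee's age. The sketch below is illustrative and not part of the source; the function names are hypothetical. Evaluating both equations at age 62.5 reproduces the tabled values for the 60-64 band.

```python
def rocf_copy_predicted_score(age):
    """Predicted ROCF copy score from the quadratic equation in Table A12m.1."""
    return 34.40434 + 0.0595862 * age - 0.0013855 * age ** 2


def rocf_copy_predicted_sd(age):
    """Predicted SD of the ROCF copy score from the linear equation in Table A12m.1."""
    return -0.333026 + 0.0625042 * age


def rocf_copy_z(observed, age):
    """Standardize an observed copy score using the age-specific mean and SD."""
    return (observed - rocf_copy_predicted_score(age)) / rocf_copy_predicted_sd(age)


if __name__ == "__main__":
    print(round(rocf_copy_predicted_score(62.5), 2))  # 32.72
    print(round(rocf_copy_predicted_sd(62.5), 2))     # 3.57
    print(round(rocf_copy_z(29.0, 62.5), 2))
```

The equations were fitted for ages 22-79, so values computed outside that range are extrapolations.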
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 19
Number of clusters: 9
R²: 0.899
F(2,8) = 561.89, p < 0.000

Term       Coefficient   SE      t       p        95% CI
Age        0.0595862     0.035   1.69*   0.130*   -0.022 to 0.141
Age²       -0.0013855    0.000   -4.40   0.002    -0.002 to -0.001
Constant   34.40434      0.719   47.82   0.000    32.74 to 36.06

*Significance test for age centered (sample means - aggregate mean): t = -25.20, p = 0.000.

Prediction
Predicted age range: 22-79 years
Mean predicted score: 33.51 (1.47)
SEe: 0.20
95% CI: 33.11-33.90
Figure A12m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey-Osterrieth Complex Figure Copy. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 34.686
Pooled estimates for random effect: 33.590
Q: Q(18) = 354.90, p < 0.000
Moment-based estimate of between-study variance: 0.977

Tests for model fit: addition of a quadratic term

Model       R²      Adjusted R²   BIC       BIC'
Linear      0.836   0.827         -9.399    -31.444
Quadratic   0.899   0.886         -15.563   -37.608

BIC' difference of 6.164 provides very strong support for the quadratic model.

Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.983, p = 0.975
Homoscedasticity (White's general test): 8.921, p < 0.063
Significance tests for regression with the SD
Ordinary least-squares regression of SDs on age (linear)
Number of observations: 19
Number of clusters: 9
R²: 0.685
F = 79.10, p < 0.000

Term       Coefficient   SE      t       p       95% CI
Age        0.0625042     0.007   8.89    0.000   0.046 to 0.079
Constant   -0.333026     0.268   -1.24   0.249   -0.951 to 0.285

Prediction
Mean predicted SD: 2.95 (1.21)
SEe: 0.22
95% CI: 2.51-3.39
Effects of demographic variables

Education
Est. tau² without education: 0.4261
Est. tau² with education: 0.0000
Regression of test means on education and age
Number of observations: 19
Number of clusters: 9
R²: 0.899

Term        Coefficient   SE      t      p       95% CI
Education   0.0126504     0.163   0.08   0.940   -0.36 to 0.39

Gender
Information for the t-test by gender was not available.
Table A12m.2. Results of the Meta-Analysis and Predicted Scores for the ROCF, Immediate Recall (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis: 7
Years of publication: 1991-2003
Number of data points used in the analysis: 12 (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants: 1,086
Table A12m.2. (Contd.)

Variable              n*    x̄†       SD†     Range
Sample size
  Mean                12    60.63    73.51   15-356
Age
  Mean                12    53.46    22.66   22.0-82.4
  SD                  12    5.61     4.09    1.0-11.3
Education
  Mean                12    14.47    1.34    12.2-16.2
  SD                  12    2.36     1.21    0.1-3.6
IQ
  Mean                0
  SD                  0
Percent male          9     55.18    24.94   33-100
Test score means
  Combined mean       12    *>.53    4.39    12.9-26.7
  Combined SD         12    6.36     1.16    4.3-7.8

*Number of data points differs for different analyses due to missing data. †Weighted means and standard deviations.
Predicted scores and SDs per age group* (ROCF, Immediate Recall)
                                95% CI                                   95% CI
  Age Range   Predicted Score   Lower Band   Upper Band   Predicted SD   Lower Band   Upper Band
  22-24       24.92             22.34        27.50        4.87           4.09         5.65
  25-29       24.82             22.70        26.85        5.49           4.87         6.10
  30-34       24.58             22.44        26.12        6.07           5.47         6.68
  35-39       24.18             21.76        26.81        6.55           5.85         7.26
  40-44       23.64             20.92        26.36        6.93           6.11         7.74
  45-49       22.95             20.06        25.84        7.19           6.30         8.09
  50-54       22.11             19.21        25.01        7.35           6.43         8.27
  55-59       21.12             18.39        23.85        7.40           6.51         8.30
  60-64       19.98             17.58        22.38        7.35           6.54         8.16
  65-69       18.69             16.70        20.69        7.19           6.52         7.86
  70-74       17.26             15.54        18.98        6.92           6.42         7.42
  75-79       15.67             13.67        17.67        6.55           6.16         6.93
*Based on the equations:
  Predicted test score = 23.5187 + 0.1292929 × age - 0.0029745 × age²
  Predicted SD = 0.34854 + 0.2456015 × age - 0.0021371 × age²
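The two quadratic equations for ROCF Immediate Recall can be evaluated for an individual age as follows. This is an illustrative sketch only: the names `Prediction` and `predict_immediate_recall` are hypothetical, and the restriction to ages 22-79 is taken from the predicted age range reported for this analysis. Evaluating the equations at 27.5 years appears to reproduce the tabled values for the 25-29 band, but the exact ages the authors used for each band are not stated, so that choice is an assumption.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    """Predicted mean and SD for ROCF Immediate Recall at one age."""
    score: float
    sd: float


def predict_immediate_recall(age: float) -> Prediction:
    """Evaluate the quadratic regression equations from Table A12m.2."""
    # The regression was fitted for ages 22-79; refuse to extrapolate.
    if not 22 <= age <= 79:
        raise ValueError("age is outside the predicted range of 22-79 years")
    score = 23.5187 + 0.1292929 * age - 0.0029745 * age ** 2
    sd = 0.34854 + 0.2456015 * age - 0.0021371 * age ** 2
    return Prediction(score, sd)


if __name__ == "__main__":
    result = predict_immediate_recall(27.5)
    print(f"score = {result.score:.2f}, SD = {result.sd:.2f}")
```

Refusing ages outside 22-79 is a deliberate design choice in this sketch, since a quadratic fitted to that range can behave poorly when extrapolated.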
[Scatterplot omitted; horizontal axis: age]
Figure A12m.2. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey-Osterrieth Complex Figure Immediate Recall. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
  Number of observations    12
  Number of clusters        7
  R2                        0.822
  F(2,6) = 17.69, p < 0.003
  Term        Coefficient    SE       t        p         95% CI
  Age         0.1292929      0.246    0.53*    0.618*    -0.472 to 0.731
  Age2        -0.0029745     0.002    -1.24    0.260     -0.009 to 0.003
  Constant    23.5187        5.229    4.50     0.004     10.72 to 36.31
*Significance test for age centered (sample means - aggregate mean): t = -5.87, p = 0.001.
Prediction
  Predicted age range     22-79 years
  Mean predicted score    21.66 (3.14)
  SEe                     1.22
  95% CI                  19.28-24.05
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
  Pooled estimates for fixed effect     21.891
  Pooled estimates for random effect    21.076
  Q                                     Q(11) = 301.24, p < 0.000
                                        12.478
Tests for model fit: addition of a quadratic term
  Model        R2       Adjusted R2   BIC       BIC'
  Linear       0.775    0.752         25.770    -15.397
  Quadratic    0.822    0.782         25.440    -15.727
BIC' difference of 0.330 provides weak support for the quadratic model.
Tests for parameter specifications
  Normality of the residuals: Shapiro-Wilk W test, W = 0.876, p = 0.076
  Homoscedasticity: White's general test, 3.275, p = 0.513
Significance tests for regression with the SD
Ordinary least-squares regression of SDs on age (quadratic)
  Number of observations    12
  Number of clusters        7
  R2                        0.694
  F(2,6) = 7.58, p = 0.023
  Term        Coefficient    SE       t        p         95% CI
  Age         0.2456015      0.074    3.31*    0.016*    0.064 to 0.427
  Age2        -0.0021371     0.001    -3.09    0.021     -0.004 to -0.000
  Constant    0.34854        1.621    0.22     0.837     -3.62 to 4.31
*Significance test for age centered (sample means - aggregate mean): t = 2.39, p = 0.054.
Prediction
  Mean predicted SD    6.66 (0.81)
  SEe                  0.36
  95% CI               5.94-7.37
Effects of demographic variables
Education
  Est. tau2 without education    14.24
  Est. tau2 with education       14.47
Regression of test means on education and age
  Number of observations    12
  Number of clusters        7
  R2                        0.823
  Term         Coefficient    SE       t        p        95% CI
  Education    -0.1463632     0.617    -0.24    0.820    -1.66 to 1.36
Gender
  Information for the t-test by gender was not available.
Table A12m.3. Results of the Meta-Analysis and Predicted Scores for the ROCF, Long-Delayed Recall (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
  Number of studies included in the analysis    7
  Years of publication                          1991-2003
  Number of data points used in the analysis    11
    (a data point denotes a study or a cell in education/gender-stratified data)
  Total number of participants                  1,056
  Variable              n*    x̄†       SD†      Range
  Sample size
    Mean                11    62.07    75.73    15-356
  Age
    Mean                11    55.37    21.86    22.0-82.4
    SD                  11    5.65     4.30     1.0-11.3
  Education
    Mean                11    14.64    1.26     12.2-16.2
    SD                  11    2.46     1.19     0.1-3.6
  IQ
    Mean                0
    SD                  0
  Percent male          8     58.04    25.77    33-100
  Test score means
    Combined mean       11    19.95    4.38     12.4-26.6
    Combined SD         11    6.67     1.17     4.4-8.1
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Predicted scores and SDs, per age group* (ROCF, Delayed Recall†)
                                95% CI
  Age Range   Predicted Score   Lower Band   Upper Band
  22-24       25.18             23.75        26.62
  25-29       24.87             23.51        26.23
  30-34       24.41             22.68        26.15
  35-39       23.85             21.69        26.01
  40-44       23.17             20.69        25.64
  45-49       22.38             19.75        25.01
  50-54       21.49             18.87        24.09
  55-59       20.47             18.04        22.90
  60-64       19.35             17.24        21.46
  65-69       18.12             16.38        19.86
  70-74       16.78             15.22        18.33
  75-79       15.33             13.40        17.25
Standard deviation for all age groups is 6.67.
*Based on the equation:
  Predicted test score = 25.39903 + 0.0416485 × age - 0.0022144 × age²
†The predicted scores are relevant for the Copy-Immediate Recall-Delayed Recall administration sequence (can be used with caution if 3-Minute Delayed Recall is administered instead of Immediate Recall, but not both). The length of the long-delay interval varies widely in the data reviewed (see text).
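One way to use the Long-Delayed Recall equation together with the single aggregate SD of 6.67 is to express an individual's raw score as a z-score relative to the age-predicted mean. The sketch below is an assumption about how a reader might apply the table, not a procedure the table itself prescribes; the function names and the example raw score of 18 are hypothetical.

```python
AGGREGATE_SD = 6.67  # SD suggested for all age groups in Table A12m.3


def predicted_delayed_recall(age: float) -> float:
    """Age-predicted ROCF Long-Delayed Recall mean from the quadratic equation."""
    return 25.39903 + 0.0416485 * age - 0.0022144 * age ** 2


def delayed_recall_z(raw_score: float, age: float) -> float:
    """Express a raw score as a z-score against the age-predicted mean."""
    return (raw_score - predicted_delayed_recall(age)) / AGGREGATE_SD


if __name__ == "__main__":
    # Hypothetical examinee: raw score 18, evaluated at 27.5 years.
    print(f"predicted mean = {predicted_delayed_recall(27.5):.2f}")
    print(f"z = {delayed_recall_z(18, 27.5):.2f}")
```

A constant SD is used here because the regression of SDs on age was not significant for this measure, so the aggregate value is the one suggested for all age groups.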
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
  Number of observations    11
  Number of clusters        7
  R2                        0.862
  F(2,6) = 33.14, p < 0.0006
  Term        Coefficient    SE       t        p         95% CI
  Age         0.0416485      0.207    0.20*    0.848*    -0.466 to 0.549
  Age2        -0.0022144     0.002    -1.05    0.332     -0.007 to 0.003
  Constant    25.39903       3.982    6.38     0.001     15.65 to 35.14
*Significance test for age centered (sample means - aggregate mean): t = -5.96, p = 0.001.
Prediction
  Predicted age range     22-79 years
  Mean predicted score    21.28 (3.29)
  SEe                     1.03
  95% CI                  19.27-23.30
[Scatterplot omitted; horizontal axis: age]
Figure A12m.3. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey-Osterrieth Complex Figure Long-Delayed Recall. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
  Pooled estimates for fixed effect     21.303
  Pooled estimates for random effect    20.400
  Q                                     Q(10) = 302.99, p < 0.000
                                        13.552
Tests for model fit: addition of a quadratic term
  Model        R2       Adjusted R2   BIC       BIC'
  Linear       0.836    0.817         21.213    -17.458
  Quadratic    0.862    0.828         21.646    -17.025
BIC' difference of 0.433 provides weak support for the linear model.
Tests for parameter specifications
  Normality of the residuals: Shapiro-Wilk W test, W = 0.849, p = 0.041
  Homoscedasticity: White's general test, 2.539, p = 0.638
Significance tests for regression with the SD
A regression of SDs on age yielded an R2 of 0.482 (F(2,6) = 3.38, p = 0.104). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
  Est. tau2 without education    14.96
  Est. tau2 with education       16.94
Regression of test means on education and age
  Number of observations    11
  Number of clusters        7
  R2                        0.863
  Term         Coefficient    SE       t       p        95% CI
  Education    -0.1076512     0.556    0.19    0.853    -1.25 to 1.47
Gender
  Information for the t-test by gender was not available.
Table A12m.4. Summary Table of Predicted Scores for the ROCF
               Copy              Immediate Recall    Long-Delayed Recall
  Age Range    Score    SD       Score    SD         Score    SD
  22-24        35.04    1.10     24.92    4.87       25.18    6.67
  25-29        34.99    1.39     24.82    5.49       24.87    6.67
  30-34        34.88    1.70     24.58    6.07       24.41    6.67
  35-39        34.69    2.01     24.18    6.55       23.85    6.67
  40-44        34.43    2.32     23.64    6.93       23.17    6.67
  45-49        34.11    2.64     22.95    7.19       22.38    6.67
  50-54        33.71    2.95     22.11    7.35       21.49    6.67
  55-59        33.25    3.26     21.12    7.40       20.47    6.67
  60-64        32.72    3.57     19.98    7.35       19.35    6.67
  65-69        32.11    3.89     18.69    7.19       18.12    6.67
  70-74        31.44    4.20     17.26    6.92       16.78    6.67
  75-79        30.70    4.51     15.67    6.55       15.33    6.67
Appendix 13: Locator and Data Tables for the Hooper Visual Organization Test (HVOT)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of the studies in the text of Chapter 13.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A13.1. Locator Table for the Hooper Visual Organization Test (HVOT)
HVOT.1 Rao et al., 1991a (page 275, Table A13.2)
  Age*: 46.0 (11.6)
  n: 100
  Sample composition: Control group of rigorously screened participants (25 M, 75 F), paid for their participation
  IQ/Education*: 13.3 (2.0)
  Location: Milwaukee, WI
HVOT.2 Libon et al., 1994 (page 276, Table A13.3)
  Age*: 64-74; 75-94
  n: 23; 14
  Sample composition: Healthy right-handed elderly participants (8 M, 15 F), (4 M, 10 F)
  IQ/Education*: 13.4 (2.7); 12.4 (2.0)
  Location: Philadelphia, PA
HVOT.3 Richardson & Marottoli, 1996 (page 276, Table A13.4)
  Age*: 81.5 (3.3); 76-80, 81-91
  n: 101
  Sample composition: Autonomously living, current drivers (53 M, 48 F); data are provided for younger-old and older-old by two educational levels
  IQ/Education*: 11.0 (3.7); <12, ≥12
  Location: New Haven, CT
HVOT.4 Walsh et al., 1997 (page 277, Table A13.5)
  Age*: 73.2 (7.7)
  n: 32
  Sample composition: Cognitively intact geriatric rehabilitation inpatients (10 M, 22 F)
  IQ/Education*: 11.7 (2.9)
  Location: Detroit, MI
HVOT.5 Lichtenberg et al., 1998 (page 277, Table A13.6)
  Age*: 76.9 (5.9)
  n: 74
  Sample composition: Cognitively intact geriatric rehabilitation inpatients (19 M, 55 F); 38 African American, 36 European American
  IQ/Education*: 10.8 (3.0)
  Location: Detroit, MI
*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors.
Table A13.2. [HVOT.1] Rao et al., 1991a: Data for a Control Sample
  n      Age            Education     M/F Ratio   HVOT Score
  100    46.0 (11.6)    13.3 (2.0)    25/75       25.9 (2.4)
Table A13.3. [HVOT.2] Libon et al., 1994: Data for a Sample of Healthy Older Adults Stratified into Two Age Groups
  Age Group   n     Mean Age      Education     M/F Ratio   MMSE Score    GDS Score    HVOT Score
  64-74       23    69.7 (3.3)    13.4 (2.7)    8/15        28.1 (1.1)    4.2 (2.7)    23.1 (4.1)
  75-94       14    81.0 (4.3)    12.4 (2.0)    4/10        28.5 (1.0)    2.9 (3.0)    19.9 (3.4)
Table A13.4. [HVOT.3] Richardson and Marottoli, 1996: Data for a Sample of Healthy Elderly Stratified by Two Age Groups x Two Education Groups
  Age/Education   n     Age             Education             %Male    %Black   HVOT Score
  76-80                 78.80 (1.07)    10.44 (3.86)          54.0%    18.0%
    <12           26                                                            17.90 (4.01)
    ≥12           24                                                            21.69 (4.02)
  81-91                 84.08 (2.56)    [illegible] (3.45)    51.0%    2.0%
    <12           18                                                            17.62 (6.17)
    ≥12           33                                                            19.71 (2.97)
Table A13.5. [HVOT.4] Walsh et al., 1997: Data for a Sample of Cognitively Intact Geriatric Rehabilitation Patients
  n     Age           Education     M/F Ratio   HVOT Score
  32    73.2 (7.7)    11.7 (2.9)    10/22       18.6 (4.9)
Table A13.6. [HVOT.5] Lichtenberg et al., 1998: Data for a Sample of Cognitively Intact Geriatric Rehabilitation Patients
  n     Age           Education     M/F Ratio   African American/European American Ratio   HVOT Score
  74    76.9 (5.9)    10.8 (3.0)    19/55       38/36                                      18.32 (4.03)
Appendix 14: Locator and Data Tables for the Visual Form Discrimination Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 14.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A14.1. Locator Table for the Visual Form Discrimination Test (VFDT)
VFDT.1 Benton et al., 1983b (page 281, Table A14.2)
  Age: 19-54; 55-74
  n: 58; 27
  Sample composition: Patients without history of brain disease or healthy subjects; almost equal number of males and females in each age category; data are partitioned by age and gender
  Education: Regarding total sample of 85, 72 >12 years, 13 <12 years
  Location: USA
VFDT.2 Campo & Morales, 2003 (page 282, Table A14.3)
  Age: 18-39; 40-59
  n: 222; 175
  Sample composition: Healthy, unpaid volunteers living in south and southwest Spain; 191 M, 206 F; all independently functioning; data are partitioned by age and education
  Education: Average education: males 12.34 (4.14) years, females 11.85 (3.84) years; no significant education difference between men/women; data reported by age (18-39, 40-59 years)/education (6-8, >9 years) categories
  Location: Spain (south/southwest)
Table A14.2. [VFDT.l] Benton et al., 1983b: Mean, Median, and Range Performance Data for a Sample of Adults Stratified by Two Age Groups by Gender Age Group 19-54
55-74
Gender Male
Female
Male
Female
(n=28)
(n=30)
(n= 15)
(n= 12)
29.3 30.0 23-32
30.3 31.0 27--32
VFDT
Mean Median Range
30.8 31.0 28-32
29.9 30.0
24--32
Benton et al.'s score interpretation is as follows: 26-32, within normal limits; 24 or 25, borderline or mildly defective; ≤23, severely defective.
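The cutoffs in this interpretation rule can be written as a small lookup. The sketch below is illustrative: the function name `classify_vfdt` is hypothetical, and treating 0 as the lowest possible total score is an assumption, since the rule above only states the upper bound of 32.

```python
def classify_vfdt(total_score: int) -> str:
    """Map a VFDT total score to the interpretation bands given by Benton et al."""
    # Assumption: valid totals run from 0 to 32; only the maximum of 32
    # is stated in the interpretation rule.
    if not 0 <= total_score <= 32:
        raise ValueError("VFDT total score must be between 0 and 32")
    if total_score >= 26:
        return "within normal limits"
    if total_score >= 24:
        return "borderline or mildly defective"
    return "severely defective"


if __name__ == "__main__":
    for score in (30, 25, 23):
        print(score, "->", classify_vfdt(score))
```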
Table A14.3. [VFDT.2] Campo and Morales, 2003: Means and SDs for each Variable of the VFDT Stratified by Age and Years of Education
  Age Group               18-39                       40-59
  Years of Education      6-8 (n=52)   >9 (n=170)     6-8 (n=63)   >9 (n=112)
  Age
    Mean                  29.29        28.15          49.32        48.73
    SD                    5.70         6.25           5.26         5.39
  Years of education
    Mean                  7.81         13.96          7.44         13.84
    SD                    0.49         3.33           0.84         3.20
  Total score
    Mean                  30.27        30.72          28.81        30.16
    SD                    2.63         1.87           3.30         2.20
  Correct responses
    Mean                  14.88        15.22          13.76        14.77
    SD                    1.64         1.15           2.40         1.46
  Peripheral errors
    Mean                  0.50         0.37           1.29         0.62
    SD                    0.80         0.79           1.63         1.01
  Distortion errors
    Mean                  0.25         0.21           0.52         0.31
    SD                    0.52         0.45           0.69         0.57
  Rotation errors
    Mean                  0.36         0.20           0.43         0.29
    SD                    0.79         0.45           0.71         0.51
Administered and scored according to Benton et al. (1983b); error types categorized according to Kaskie and Storandt (1995).
Appendix 15: Locator and Data Tables for the Judgment of Line Orientation Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 15.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A15.1. Locator Table for the Judgment of Line Orientation Test (JLO) Study
JLO.1 Benton et al., 1983b
page 288
Age•
n
Sample Composition
IQIEducation•
Location
16-49 50-64 65-74
137
65 M, 72 F; general medical patients with no evidence of brain disease
12
Iowa
65-74 75-84
179
25 M, 162 F; healthy controls recruited from senior or retirement organizations with no history of neurological disease or psychiatric hospitalization 25M, 28 F; healthy controls recruited from senior-citizen and community organizations; no history of neurological or psychiatric disorder M 10, F 30; volunteers recruited from newspaper ads and screened for hypertension, cardiac or cerebrovascular disease, neurological illness, head injury, substance abuse, and psychiatric illness
13.7 12.5 12.1
Iowa
12.0
Iowa
Table A15.2
JLO.2 Eslinger & Benton, 1983
page 289 Table A15.3 JLO.3 Eslinger et al., 1985 page 289 Table A15.4 JLO.4 Rao et al., 1989 page 289 Table A15.5
85-94
73.1
53
60-88
42.8 (8.1)
40
14.0 (2.3) WAIS-R Verbal IQ: 108.1 (6.3)
JLO.5 Ska et al., 1990 page 290 Table A15.6
JLO.6 Rao et al., 1991 page 290 Table A15.7
JLO.7 Meader et al., 1993 page 291 Table A15.8
JLO.8 Kempen et al., 1994 page 291 Table A15.9
JLO.9 York & Cermak, 1995 page 291 Table A15.10
JLO.10 Ivnik et al., 1996 page 292 Table A15.11
JLO.11 Fleming et al., 1997 page 293 Table A15.12
JLO.12 Finton et al., 1998 page 293 Table A15.13
Age• 55-64 65-74 75-84
46.0 (11.6)
n
95
100
Sample Composition
IQ/Education •
Location
M 19, F 76; healthy volunteers with no history of alcoholism, drug abuse, neurological or psychiatric illness
10.13 (3.38) 9.46 (3.40) 8.06 (2.77)
Canada
M 75, F 25; paid volunteers were screened with neurological evaluation and MRI; exclusion criteria included history of substance abuse, psychiatric illness, head injury, and other neurological disorders; 99 participants were Caucasian
13.3 (2.0);
Wisconsin
WAIS-R Verbal IQ: 106.5 (6.9)
12
M 8, F 4; paid volunteers were members of the staff at Medical College of Georgia; none had history of neurological, psychiatric, or "major" medical disease and none used psychoactive drugs; study used repeated measures design in which participants were administered either saline or scopolamine at least 72 hours apart
65.2 (5.9)
13
M 3, F 10; volunteers with normal vision were recruited during a routine ophthalmic exam by one of the authors at the UCSD School of Medicine; the only exclusion criterion was Snellen distance acuity of 20/50
16.5 (2.9)
California
61.89 (8.67)
15
M 6, F 9; volunteers were orthopedic patients from one of two rehabilitation hospitals; screened for history of cerebrovascular accidents and other neurological deficits
15.07 (3.41)
Maine
2 5 12 24 24 69 40 16 3
M 71, F 145; volunteers were part of the MOANS project; exclusion criteria were psychiatric or neurological illness
27
M 16, F 11; paid volunteers were recruited from the community and screened for substance use, psychiatric illness, neurological disease, and other medical diagnoses
20--42 (31)
50--59 60--64 65-69 70-74 75-79
80-84 85-89 90-94 95+ 26.1 (7.4)
70.4 (6.0)
24
12-20 (16)
≤7 8-11 12 13-15
Georgia
Minnesota
16-17
≥18
M 13, F 11; elderly control participants were screened via physical and neurological exams for cognitive deficits;
15.4 (3.2)
Washington, D.C.
WAIS-R Prorated IQ: 101.0 (13.8) 15.7 (2.8)
JLO.13 Obonsawin et al., 1998 (page 293, Table A15.14)
  Age*: 40.83 (12.55)
  n: 12
  Sample composition: M 3, F 9; paid healthy volunteers were injected with a saline solution after abstaining from alcohol, tea, and coffee; all participants were screened via medical examination
  IQ/Education*: NART: 35.09 (7.23)
  Location: United Kingdom
JLO.14 Meyers et al., 1999 (page 294, Table A15.15)
  Age*: 36.70 (20.5)
  n: 30
  Sample composition: M 14, F 16; healthy volunteers were screened for a history of a number of conditions, including neurological disease, closed head injury, motor vehicle accidents, learning disabilities, and loss of consciousness
  IQ/Education*: 13.67 (3.47); WAIS FSIQ: 113.97 (13.51)
JLO.15 Basso et al., 2000 (page 294, Table A15.16)
  Age*: M: 22.04 (3.53); F: 22.62 (7.24)
  n: Total 52
  Sample composition: M 26, F 26; volunteers were undergraduate students screened via interview for history of learning disability, neurological illness, psychiatric disease, and head trauma; all were right-handed, and while most of the sample was Caucasian, a small percentage were non-white
  IQ/Education*: M: 13.92 (1.01); F: 13.85 (1.05)
JLO.16 Bell et al., 2001 (page 294, Table A15.17)
  Age*: 34.4 (12.5); 16-60
  n: 29
  Sample composition: 72% F; volunteers were friends and family members of early-onset temporal lobe epilepsy patients; exclusion criteria were substance abuse, medical or psychiatric history affecting cognition, use of psychotropic medication, loss of consciousness longer than 5 minutes, developmental learning disorder, and repetition of a grade
  IQ/Education*: 13.0 (1.7); FSIQ: 97.7 (6.4)
JLO.17 Montse et al., 2001 (page 295, Table A15.18)
  Age*: 63.84 (9.93); 39-85
  n: 76
  Sample composition: M 38, F 38; volunteers were friends and spouses of Parkinson's disease patients who were free of neurological and psychiatric illness
  IQ/Education*: 9.64 (4.17)
  Location: Barcelona, Spain
JLO.18 Rahman & Wilson, 2003 (page 295, Table A15.19)
  Age*: 32.58 (5.87)
  n: 240
  Sample composition: M 120, F 120; healthy hetero- and homosexual volunteers were recruited from King's College, local community, and social networks; no exclusion criteria provided
  IQ/Education*: 16.27 (3.54)
  Location: London, UK
JLO.19 Salthouse et al., 1997 (page 296, Table A15.20)
  Age*: 18-39; 40-59; 60-78
  n: 40; 38; 37
  Sample composition: M 47%; healthy nonstudent volunteers with at least 11 years of education; no other exclusion criteria provided; short form was used
  IQ/Education*: 15.53 (2.30)
JLO.20 Woodward et al., 1998 (page 296, Table A15.21)
  Age*: 65.8 (6.7); 55-84
  n: 82
  Sample composition: Volunteers were geriatric individuals who were "community dwellers"; sample was healthy and not diagnosable for depression; 97% were Caucasian; short form was used
  IQ/Education*: 14.0 (2.3); 9-20
  Location: New York
*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A15.2. [JLO.1] Benton et al., 1983a: Data* for the Sample of General Medical Patients with No Evidence of Brain Disease, Stratified by Gender and Three Age Groupings
               Age Groups
               16-49    50-64    65-74
  Males
    Mean       25.6     24.3     22.7
    n          27       17       21
  Females
    Mean       23.3     22.2     20.8
    n          31       26       15
*SDs are not reported.
Table A15.3. [JLO.2] Eslinger and Benton, 1983: Mean JLO T Scores for a Sample of Healthy Elderly, Stratified into Three Age Groups
  Age Group   n     M/F* Ratio   Education   JLO T Scores
  65-74       87    22/65        13.7        53.0 (7.4)
  75-84       67    10/57        12.5        48.0 (9.6)
  85-94       25    3/22         12.1        44.3 (14.6)
*Average educational levels for males and females are 12.6 and 13.1, respectively.
Table A15.4. [JLO.3] Eslinger et al., 1985: Mean JLO Correct Responses for Normal Controls
  n     M/F Ratio   Age     Education   JLO (Correct)
  53    25/28       73.1    12.0        24.8 (3.7)
Table A15.5. [JLO.4] Rao et al., 1989: Mean JLO Correct Responses for Normal Controls
  n     M/F Ratio   Age           Education     WAIS-R Verbal IQ   MMSE*         JLO (Correct)
  40    10/30       42.8 (8.1)    14.0 (2.3)    108.9 (11.9)       29.8 (0.5)    26.9 (4.7)
*Mini-Mental State Exam (Folstein et al., 1975).
Table A15.6. [JLO.5] Ska et al., 1990: Mean JLO Correct Responses for a Sample of Healthy Elderly Partitioned into Three Age Groups
  Age Group   Mean Age        n     M/F Ratio   Education       JLO (Correct)   Range of JLO scores
  55-64       59.20 (2.97)    38    7/31        10.13 (3.38)    24.39 (3.98)    16-30
  65-74       68.85 (2.90)    40    6/34        9.46 (3.40)     23.75 (3.64)    15-30
  75-84       77.88 (2.62)    17    6/11        8.06 (2.77)     21.71 (5.02)    14-30
Table A15.7. (JW.6] Rao et al., 1991: iData for a
Table A15.11. (JW.10] Ivnik et al.,
Normal Control Sample
1996: Demographic Description of the Sample Partitioned into Groups Used in JW Testing
n
MIF Ratio
100 75125
WAIS-R I JLO Education Verbal IQI (Correct)
Age 46.0 (11.6)
13.3 (2.0)
106.5 (6.9)
n
27.2 (4.1)
1
Age groupe 56-59 60-64
2 5 12 24
65-$
Table A15.8. (JW.7] Meader et al.,
70-74
1~: Mean
JW Correct Responses for Healthy Staff Employees of a Medical College n
MIF Ratio
12
814
' JLO
Age
Education
(Correct)
31
16
(range 20-42)
(range 12-20)
27.33 (2.35)
~7
1~: Mean Correct Responses for Non-C~'vely Impaired Patients o£ the University of C · rnia San Diego School of Medicine ·
JLO 65.2
51 33 23
Gender Males Females
145
(5.9)
48 60
71
.RGce
Caucasians
215
B~b
25.0
16.5 (2.9)
1
8-11 12 13-15 16-17 ;?:18
n Ratio Education Age (Correct) -------------------------------3110
16 3
Education
Table A15.9. (JW.8] Kempen et al.,
13
45 69 40
95+
JW
MIF
75-79 80-84 85-89 90-94
(8.0)
Bantleclneu
-------------------------------
ru~t
201 6 9
Left Mixed
Table A15.10. (JW.9] York and
Ce~
TotGl
1995:
216
Data for Mean JW Correct Respon*s for a Sample of Orthopedic Rehabilitation Pa~nts n 15
MIF Ratio
Age
619
61.89 (8.67)
Education
JLO (Correct)
15.07 (3.41)
24.6 (5.14)
Table A15.12. (JW.ll] Fleming et al., f)97: Data for Mean JW Correct Responses for a Control Group
:
n
MIF Ratio
Age
Educatio1
27
16/11
26.1
15.4 . (3.2) .
(7.4)
WAIS-R Prorated IQ
WRAT-R
JLO
Reading
(Correct)
101
101
24.1
(13.8)
(14.8)
(4.9)
Table A15.13. [JLO.12] Finton et al., 1998: Mean JLO Correct Responses for a Sample of Healthy Participants
  n     Age           M/F Ratio   Education     DRS* Score     JLO (Correct)
  24    70.4 (6.0)    13/11       15.7 (2.8)    138.1 (3.5)    24.5 (3.7)
*Dementia Rating Scale.
Table A15.14. [JLO.13] Obonsawin et al., 1998: Mean JLO Correct Responses for a Sample of Healthy Participants from the United Kingdom
  n     M/F Ratio   Age              NART* Score     Raven's Matrix Score   JLO (Correct)
  12    3/9         40.83 (12.55)    35.09 (7.23)    51.17 (7.81)           24.92 (2.54)
*National Adult Reading Test.
Table A15.15. [JLO.14] Meyers et al., 1999: Mean JLO Correct Responses for a Sample of Healthy Participants
  n     M/F Ratio   Age             Education       WAIS FSIQ         JLO (Correct)
  30    14/16       36.70 (20.5)    13.67 (3.47)    113.97 (13.51)    26.70 (3.16)
Table A15.16. [JLO.15] Basso et al., 2000: Mean JLO Correct Responses Stratified by Gender for a Sample of Undergraduate Students
             n     Age             Education       Estimated FSIQ*   JLO† (Correct)
  Males      26    22.04 (3.53)    13.92 (1.01)    112.67 (3.78)     25.27 (4.08)
  Females    26    22.62 (7.24)    13.85 (1.05)    112.08 (4.20)     22.12 (4.13)
*Regression method developed by Barona et al. (1984).
†SDs were calculated from the 95% confidence intervals provided by the authors.
Table A15.17. [JLO.16] Bell et al., 2001: Mean JLO Correct Responses for a Control Sample
  n     Age            Education     %Male   FSIQ*         JLO (Correct)
  29    34.4 (12.5)    13.0 (1.7)    28      97.7 (6.4)    24.7 (3.8)
*WAIS-III FSIQ seven-subtest short form.
Table A15.18. [JLO.17] Montse et al., 2001: Mean JLO Error Scores for a Sample of Healthy Participants from Barcelona, Spain
  n     M/F Ratio   Age                          Education                  JLO Errors
  76    38/38       63.84 (9.93) (range 39-85)   9.64 (4.17) (range 1-23)   15.85 (7.30)
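The footnote to Table A15.16 notes that the SDs were calculated from the 95% confidence intervals provided by the authors. The source does not say which formula was used, so the sketch below shows one common back-calculation under a stated assumption: the interval is treated as mean ± 1.96 × SD / √n (normal approximation). The function name `sd_from_ci` is hypothetical, and the interval in the usage example is constructed for illustration rather than taken from the original study.

```python
import math


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Recover an SD from a confidence interval around a sample mean.

    Assumes the interval is mean +/- z * SD / sqrt(n). With small samples a
    t critical value in place of z would give a slightly smaller SD.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    half_width = (upper - lower) / 2.0
    return half_width * math.sqrt(n) / z


if __name__ == "__main__":
    # Illustrative interval only: built from a mean of 25.27, SD of 4.08, n = 26.
    half = 1.96 * 4.08 / math.sqrt(26)
    print(f"SD = {sd_from_ci(25.27 - half, 25.27 + half, 26):.2f}")
```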
Table A15.19. [JLO.18] Rahman and Wilson, 2003: Mean JLO Correct Responses for the Control Sample Stratified by Sexual Orientation and Gender
                  n     Age             Education       Raven's Matrix   JLO (Correct)
  Heterosexual
    Males         60    29.91 (6.60)    15.96 (3.29)    47.05 (7.41)     28.40 (1.84)
    Females       60    26.80 (5.87)    16.65 (3.29)    46.86 (6.73)     24.81 (3.43)
  Homosexual
    Males         60    32.08 (5.66)    16.51 (3.86)    44.83 (6.68)     24.18 (3.75)
    Females       60    29.61 (5.35)    15.95 (3.71)    45.55 (6.46)     25.56 (3.75)
Short Form
Table A15.20. [JLO.19] Salthouse et al., 1997 (Short Form): Mean JLO Correct Responses for a Healthy Sample
  Age Group   n     %Male   Age           Education     JLO (Correct)
  18-39       40    42.5    29.0 (4.8)    15.5 (1.7)    12.7 (1.7)
  40-59       38    50      49.1 (5.1)    15.2 (2.5)    12.0 (2.2)
  60-78       37    48.6    69.2 (5.1)    15.3 (2.6)    12.1 (2.3)
Table A15.21. [JLO.20] Woodward et al., 1998 (Short Form): Mean JLO Correct Responses for a Healthy, Nondepressed Geriatric Sample
  n                        82
  Age                      65.8 (6.7) (range 55-84)
  Education                14.0 (2.3) (range 9-20)
  MMSE* Score              27.5 (2.0) (range 21-30)
  JLO (Correct), Form O    11.6 (2.4) (range [illegible]-15)
  JLO (Correct), Form E    11.5 (2.1) (range 5-15)
*Mini-Mental State Exam (Folstein et al., 1975).
Appendix 16: Locator and Data Tables for the Design Fluency Tests
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of the study in the text of Chapter 16.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A16.1. Locator Table for the Design Fluency Tests Study
Age"
n
RFFT.1 Ruff et al., 1986b page 303 Table A16.2
28.2 (8.8)
50
RFFT.2 Ruff et al.,
16-24 25-39 40-54 55-70
358
19.1 (2.0)
134
66.7 (7.4)
51
1987
page 303 Table A16.3
RFFT.3 Demakis & Harrison, 1997
page 304 Table A16.4 RFFT.4 Fama et al., 1998
page 304 Table A16.5
Sample Composition
IQ/Education•
Location
Normal control volunteers
13.2 (1.7)
California
161 M, 197 F; recruited participants
<12 13-15 >16
California, Michigan
were excluded if they had a history of psychiatric hospitalization, chronic polydrug abuse, or neurological disorder; data are stratified by 4 age groups and 3 educational levels
61 M, 73 F; controls had no learning disabilities and were screened for current or past neurological or psychiatric disease
Normal controls; participants were excluded from the study if they had a significant history of psychiatric disorders, neurological illness, past or present history of alcohol or substance abuse, or other serious medical conditions
College students
16.4 (2.3)
California
PIQ: 115.6 (5.9)
Age•
RFFT.5 Berning et al., 1998 page 305 Table A16.6
20
RFFT.6 Demakis, 1999 page 305 Table A16.7
22.50 (7.99)
n 124
Sample Composition
IQ/Education•
Location
34 M, 90 F; undergraduate students were recruited from psychology courses at the University of Mississippi and given course credit for participation
College students
Mississippi
21
Undergraduate students (67% F); data are partitioned by firm- and mixed-handedness groups
13.60 (1.46)
Midwestern United States
RFFT.7 Ross et al., 23.90 (7.32) 2003 page 305 Table A16.8
90
College students (55% F) recruited from introductory psychology courses, received course credit for participation; 44% Caucasian, 30% African American, 9% Hispanic, and 6% other; exclusion criteria were history of neurological disorder, learning disability, or psychiatric conditions involving medication usage; 48 subjects were retested an average of 35.2 days later
FSIQ: 108.1 (9.2)
Midwestern United States
Design Fluency.1 35.8 Boone et al., 1991 (13.7) page 306 Table A16.9
16
7 M, 9 F; controls recruited; exclusion criteria included history of alcohol or drug abuse, head injury, seizure disorder, cerebral vascular disease or stroke, psychiatric history, or any renal, hepatic, or pulmonary disease
Southern California
Design Fluency.2 69.4 (10.6) Woodard et al., 1992 page 306 Table A16.10
80
35 M, 45 F; volunteers were screened for dementia, current or past neurological illness or injury, substance abuse, drug use, and history of psychiatric disorder
14.6 (3.4)
Design Fluency.3 20-35 Daigneault et al., 1992 45-65 page 306 Table A16.11
70
Subjects were divided into 2 age groups: 38M, 32 F (younger group); 30M, 28 F (older group); French-speaking; exclusion criteria included consumption of more than 24 beers, 5 bottles of wine, or 15 oz of spirits per week; consumption of cocaine, LSD, or psychostimulants; any neurological or psychiatric consultation, psychoactive medication, head trauma with hospitalization, or major surgery;
12.36 (2.09) 12.11 (3.63)
58
15.2 (2.8) FSIQ: 109.1 (10.9)
34.7 (7.7)
20
9 M, 11 F; participants were screened for a history of central nervous system disease or injury, major medical illness, major psychiatric disorder, or current alcohol or drug abuse; one of the subjects apparently had a past history of substance abuse
13.6 (1.4)
Design Fluency.5 27.7 Varney et al., 1996 (13.1) page 308 Table A16.13
87
28 M, 59 F; volunteers had no history of neurological or psychiatric illness, loss of consciousness due to head trauma, or severe febrile illness
14.4 (2.0)
134
61 M, 73 F; controls had no learning disabilities and were screened for current or past neurological or psychiatric disease
College students
Design Fluency.4 Beatty et al., 1993 page 307 Table A16.12
Design Fluency.6 Demakis & Harrison, 1997 page 308 Table A16.14
19.1 (2.0)
Canada
Age•
n
Design Fluency.7 Carter et al., 1998 page 308 Table A16.15
25.06 (7.83)
66
Design Fluency.8 Harter et al., 1999 page 309 Table A16.16
20
52
Design Fluency.9 Mataix-Cols et al., 1999 page 309 Table A16.17
19.1 (1.3)
27
Design Fluency.10 55.8 (11.6) Abrahams et al.,
49
16 M, 9 F; right-handed controls were screened for neurological disorder or significant head injury
14.1 (3.1) FSIQ: 114.6 (9.9)
London, UK
22
22 all-male, paid participants were recruited and screened for a history of learning disability, major psychiatric disorder, substance abuse, or neurological disorder
13.36 (2.15)
Southern California
2000
page 309 Table A16.18 Design Fluency.11 34.32 Boone et al., 2001 (14.81) page 310 Table A16.19
IQ/Education*
Sample Composition
19 M, 47 F; undergraduates were primarily recruited; inclusion criteria included age 18-60, right-handed, English as first or main language, FSIQ >79, and no significant neurological, systemic, or psychiatric illness
15.21 (1.60) FSIQ: 100.85 (11.07)
64 college students (91% F) enrolled at Texas Tech University; 81% reported a history of neurological disorder; means and SDs were provided both for the sample as a whole and with the 12 students with possible neurological dysfunction excluded
College students
4 M, 23 F; undergraduates; one subject was left-handed; exclusion criteria included history of psychiatric disorder
College students
Location Ontario, Canada
Texas
Barcelona, Spain
FSIQ: 107.14 (15.89)
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A16.2. [RFFT.1] Ruff et al., 1986b: Data for All Five RFFT Parts Combined for Healthy Individuals from the San Diego, California, Region

n    Age          Education    RFFT Total Unique Designs   RFFT Perseverative Errors   Error Ratio
50   28.2 (8.8)   13.2 (1.7)   103.0 (23.0)                5.8 (7.3)                   0.06*

*SD is not provided.
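The error ratio reported for the RFFT can be illustrated with a short sketch. It assumes the commonly used definition (total perseverative errors divided by total unique designs), which the table itself does not state; the function name `error_ratio` is hypothetical. Note that a ratio computed from group means, as here, need not equal the mean of individual participants' ratios.

```python
def error_ratio(perseverative_errors: float, unique_designs: float) -> float:
    """Assumed definition: perseverative errors / unique designs."""
    if unique_designs <= 0:
        raise ValueError("unique_designs must be positive")
    return perseverative_errors / unique_designs


# Group means from Table A16.2 (Ruff et al., 1986b):
# 5.8 perseverative errors, 103.0 unique designs.
print(round(error_ratio(5.8, 103.0), 2))
```

Under this assumption, the group means in Table A16.2 reproduce the tabled error ratio to two decimal places.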
Table A16.3. [RFFT.2] Ruff et al., 19~: Unique Designs and Perseverations for All Five Parts of the RFFT Stratified by Age Group and Education Level for Healthy Individuals
1
Education (Years)
<12 Unique Designs
12-15 Perseverative• ' Errors
Unique Designs
Perseverative" Errors
29
108.2 (18.6)
5.3 (4.4)
6.9 (7.5)
36
99.9 (23.6)
7.9 (6.2)
32
(22.4)
67.5 (19.9)
7.4 (6.2)
31
Age Group
n
16-24
30
103.3 (23.8)
4.6 (6.2)
25-39
31
82.1 (18.7)
40--45
28
86.0
55-70
27
>16
n
Unique Designs
Perseverative" Errors
21
113.1 (21.2)
6.8 (3.9)
7.1 (5.8)
34
104.1 (25.4)
5.7 (5.9)
93.9 (17.8)
7.3 (6.1)
32
106.3 (16.2)
5.1 (4.5)
77.2 (17.7)
9.0 (7.6)
27
84.4 (20.8)
9.3 (6.4)
n
"The authors did not include 20 outlier scores (~2 SD from the mean) for the entire sample in these mean values.
Table A16.4. [RFFT.3] Demakis and Harrison, 1997: Data for All Five Parts of the RFFT Combined for a Sample of College Students

n     Male/Female Ratio   Age          RFFT Unique Designs
134   61/73               19.1 (2.0)   86.92 (21.91)
Table A16.5. [RFFT.4] Fama et al., 1 : Data for All Five Parts of the RFFT Combined for a Group of Healthy Individuals
RFFT Unique Designs
Estimated
n
Age
Education
NART* IQ
51
66.7 (7.4)
16.4 (2.3)
115.6 (3.9)
*National Adult Reading Test (Nelson, 1982). †Mini-Mental State Exam (Folstein et al., 1975). ‡Data are based on 50 participants.
28.8 (1.1)
76.1 (12.1)
Table A16.6. [RFFT.5] Berning et al., 1998: Data for Each Part of the RFFT for a Sample of Undergraduate Students*

n = 124; Median Age = 20

RFFT Measure     Part I        Part II       Part III      Part IV       Part V        Total
Unique Designs   17.8 (4.7)    19.6 (4.5)    19.6 (4.2)    20.2 (4.4)    21.7 (4.5)    99.0 (18.9)
Perseverations   0.9 (1.7)     1.0 (1.3)     1.3 (1.7)     1.4 (1.9)     1.6 (1.8)     6.2 (5.6)
Error Ratio      0.05 (0.13)   0.05 (0.07)   0.07 (0.08)   0.07 (0.12)   0.08 (0.09)   0.06 (0.05)

*The sample includes 34 males and 90 females.
Table A16.7. [RFFT.6] Demakis, 1999: Test-Retest Data for Five RFFT Parts Combined for a Sample of Undergraduate Students

                                          RFFT Total Unique Designs
n    Age            Education      % Female   Initial Test    Retest (3 Weeks)
21   22.50 (7.99)   13.60 (1.46)   67         100.9 (24.5)    117.7 (26.9)
Table A16.8. [RFFT.7] Ross et al., 2003: Initial Test and 1-Month Retest Data from College Undergraduates (55% Female) in the Midwest

                                                          RFFT Scores
Testing Session   n    Age            FSIQ            Total Designs   Perseverative   Error Ratio
Initial testing   90   23.90 (7.32)   108.10 (9.20)   106.3 (23.1)    8.4 (8.8)       0.0790 (0.0521)
Retest            48                                  114.5 (24.6)    8.2 (8.1)       0.0753 (0.0695)
Table A16.9. [Design Fluency.1] Boone et al., 1991: Mean Scores for Free Condition in Controls from Southern California (9 Women, 7 Men)

n    Age           Education    FSIQ           Total Unique Designs
16   35.8 (13.7)   15.2 (2.8)   109.1 (10.9)   26.0 (11.0)
Table A16.10. [Design Fluency.2] Woodard et al., 1992: Mean Scores for Free and Fixed Conditions in 80 Older Normals (35 Men, 45 Women)

Free condition                           Fixed condition
Novel        Nameable    Perseverative   Novel        Nameable    Perseverative   Wrong Number of Lines
17.0 (9.3)   1.9 (2.2)   8.6 (7.5)       12.0 (7.1)   0.8 (1.3)   4.6 (4.2)       1.4 (2.1)

Mean age = 69.4 (10.6), mean education = 14.6 (3.4), mean WAIS-R Vocabulary scaled score = 10.0.
Table A16.11. [Design Fluency.3] Daigneault et al., 1992: Data for French-Speaking Healthy Canadian Volunteers

Age Group   n    Age            Education      Free Condition   Perseverative Errors
20-35       70   27.71 (4.05)   12.36 (2.09)   24.33 (12.40)    0.01 (0.08)
44-65       58   56.62 (5.29)   12.11 (3.63)   28.48 (15.00)    0.01 (0.02)
Table A16.12. [Design Fluency.4] Beatty et al., 1993: Data for Fixed Condition in Controls (9 Males, 11 Females)

n    Age          Education    Fixed Condition   Rule Violations
20   34.7 (7.7)   13.6 (1.4)   25.4 (12.6)       0.6 (2.0)
Table A16.13. [Design Fluency.5] Varney et al., 1996: Data for the Free Condition in Controls (28 Males, 59 Females)

n    Age                 Education          Free Condition
87   27.7 (13.1) 1S-77   14.4 (2.0) 12-21   16.1 (9.13) 4-51
Table A16.14. [Design Fluency.6] Demakis and Harrison, 1997: Data for the Free and Fixed Conditions in College Students

n     Male/Female Ratio   Age          Free Condition   Fixed Condition   Total Designs
134   61/73               19.1 (2.0)   18.58 (9.52)     18.24 (5.94)      36.82 (13.42)
Table A16.15. [Design Fluency.7] Carter et al., 1998: Data for the Free and Fixed Conditions in College Undergraduates in Canada*

     Free Condition                           Fixed Condition
n    Novel        Perseverations   Nameable    Novel        Perseverations   Nameable    Not 4 Lines
66   13.9 (6.3)   7.1 (7.8)        0.2 (0.4)   16.7 (6.1)   6.4 (5.5)        0.4 (0.8)   2.5 (2.3)

*Sample of 19 males, 47 females; age = 25.06 (7.83); education = 15.21 (1.60); FSIQ = 100.85 (11.07).
Table A16.16. [Design Fluency.8] Harter et al., 1999: Data for the Free Condition in College Undergraduates (91% Female)

n    Age   Total Unique Designs   Perseverations   Scribbled Responses
52   20    8.92 (5.03)            0.00 (0.00)      0.06 (0.23)

Table A16.17. [Design Fluency.9] Mataix-Cols et al., 1999: Data for College Undergraduates (23 Women, 4 Men) in Spain

n    Age          Free Condition   Fixed Condition
27   19.1 (1.3)   25.1 (13.9)      10.8 (2.8)

Table A16.18. [Design Fluency.10] Abrahams et al., 2000: Data for the Free and Fixed Conditions in Right-handed Older Controls (16 Males, 9 Females)

n    Age           Education    FSIQ          Free Condition   Fixed Condition
25   55.8 (11.6)   14.1 (3.1)   114.6 (9.9)   25.0 (7.8)       22.2 (7.7)
Table A16.19. [Design Fluency.11] Boone et al., 2001: Data for the Free Condition in Male Controls from Southern California

n    Age             Education      FSIQ             Total Unique Designs
22   34.32 (14.81)   13.36 (2.15)   107.14 (15.89)   29.00 (15.14)
Appendix 17: Locator and Data Tables for the Tactual Performance Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 17. Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A17.1. Locator Table for the Tactual Performance Test (TPT)
TPT.1 Halstead, 1947 (page 318; Table A17.2)
  Age*: 15-50
  n: 28
  Sample Composition: 14 subjects without psychiatric diagnosis or history of brain injury; 14 subjects with psychiatric diagnosis
  IQ/Education*: Education: 7-18; IQ: 70-140
  Location: Chicago

TPT.2 Reitan, 1955b, 1959 (page 318; Table A17.3)
  Age*: 32.36 (10.78)
  n: 50
  Sample Composition: 35 M, 15 F volunteers hospitalized with paraplegia and neurosis
  IQ/Education*: Education: 11.58 (2.85); FSIQ: 112.6 (14.3)
  Location: Indiana

TPT.3 Reitan & Wolfson, 1985 (page 319; data are not reproduced in this book)
  Sample Composition: No information provided regarding the normative sample; cutoffs for "severity ranges" (perfectly normal, normal, mildly impaired, seriously impaired) are presented
  Location: USA

TPT.4 Klove & Lochen (in Klove, 1974) (page 320; Table A17.4)
  Age*: 31.6, 32.1
  n: 22, 22
  Sample Composition: American and Norwegian controls; no exclusion criteria reported
  IQ/Education*: Education: 11.1, 12.2; FSIQ: 109.3, 111.9
  Location: Wisconsin, Norway

TPT.5 Wiens & Matarazzo, 1977 (page 320; Table A17.5)
  Age*: 23.6 (21-27), 24.8 (21-28)
  n: 48
  Sample Composition: All males, neurologically normal; divided into 2 groups; random sample of 29 were retested 14-24 weeks later
  IQ/Education*: Education: 13.7, 14.0; FSIQ: 117.5 (105-139), 118.3 (108-131)
  Location: Portland, OR

TPT.6 Cauthen, 1978a (page 320; Table A17.6)
  Age*: Sample: 20-60; Groups: 20-29, 30-39, 40-49, 50-60
  n: 117
  Sample Composition: 35 M, 82 F subjects recruited from hospital volunteers and service clubs; all but 3 were right-handed; data represented in age x IQ cells
  IQ/Education*: WAIS IQ: 91-111, 112-122, 123-139
  Location: Canada

TPT.7 Harley et al., 1980 (page 321; Table A17.7)
  Age*: Sample: 55-79; Groups: 55-59, 60-64, 65-69, 70-74, 75-79
  n: 193 (56, 45, 35, 37, 20)
  Sample Composition: V.A. hospitalized patients; T-score equivalents reported
  IQ/Education*: Education: 8.8; IQ: >80
  Location: Wisconsin

TPT.8 Pauker, 1980 (page 322; Table A17.8)
  Age*: Sample: 19-71; Groups: 19-34, 35-52, 53-71
  n: 363
  Sample Composition: Volunteers fluent in English; 152 M, 211 F; no physical disability, sensory deficit, current medical illness, brain disorder or alcoholism; data presented in age x IQ cells
  IQ/Education*: WAIS IQ: 89-102, 103-112, 113-122, 123-143
  Location: Toronto, Canada

TPT.9 Anthony et al., 1980 (page 323; Table A17.9)
  Age*: 38.88 (15.80)
  n: 100
  Sample Composition: Normal volunteers, no history of medical or psychiatric problems, head injury, brain disease or substance abuse
  IQ/Education*: Education: 13.33 (2.56); FSIQ: 113.5 (10.8)
  Location: Colorado

TPT.10 Bak & Greene, 1980 (page 323; Table A17.10)
  Age*: 50-62, 55.6 (4.44); 67-86, 74.9 (6.04)
  n: 15, 15
  Sample Composition: Participants were equally divided in 2 age groupings; subjects were fluent in English and denied history of neurological problems; 1st group had 9 F, 2nd group had 10 F
  IQ/Education*: Education: 13.7 (1.91), 14.9 (2.99)
  Location: Texas

TPT.11 Fromm-Auch & Yeudall, 1983 (page 323; Table A17.11)
  Age*: Sample: 15-64, 25.4 (8.2); Groups: 15-17, 18-23, 24-32, 33-40, 41-64
  n: 190 (32, 74, 56, 18, 10)
  Sample Composition: 111 M, 82 F; participants described as nonpsychiatric and nonneurological; 83% right-handed; 5 age groupings
  IQ/Education*: Education: 8-26, 14.8 (3.0); FSIQ: 119.1 (8.8)
  Location: Canada
Table A17.1. (Contd.) Study TPT.12 Moore et al., 1984 page 324 Table A17.12
Age" 19-27 28-36
n
Sample Composition
284
Data for healthy adults recruited through newspaper ads are partitioned into 6 age groups; various performance measures and time to completion are reported
56 64
37-45 46-55 56-65 66-76 Sample: 20-69 Groups: 20-29 30-39 40-49 50-59 60-69
59 60 20 25 556
TPT.14 Russell, 1985 page 325 Table A17.15
43.5 (13.6)
TPT.15 Heaton et al., 1986 page 326 Table A17.16
15-81 39.3 (17.5)
19 Caucasian controls admitted to a neurology ward for suspected neurological condition but showed no evidence of brain damage; all but 2 were male; 6-block version used; 17 M, 2 F
553 (319, 134, 100) 356 M, 197 F; exclusion criteria included history of neurological illness, significant head trauma, and substance abuse; sample was divided into 3 age groups and 2 education groups; % classification as normal provided
111 Medical and psychiatric V.A. inpatients & outpatients without cerebral lesion or history of alcoholism or cerebral contusions; all except for one were male
23 Volunteers: 9 M, 14 F; no history of neurological or psychiatric illness; test-retest data for a 3-week interval are provided
TPT.13 Schear, 1984 page 325 Tables A17.13, A17.14
<40 40-59
≥60
TPT.16 Alekoumbides et al., 1987 page 326 Table A17.17
19-82 46.85 (17.17)
TPT.17 Bornstein et al., 1987a page 327 Table A17.18
17-52 32.3 (10.3)
TPT.18 El-Sheikh et al., 1987 page 327 Table A17.19
17-24 20.6 (1.4)
TPT.19 Dodrill, 1987 page 328 Table A17.20
27.73 (11.04)
111 112 111 155 67
Neuropsychiatric sample; 35% had evidence of various signs of organic brain syndromes, alcohol encephalopathy, epilepsy, etc.; 49% exhibited nonorganic psychotic disorders, schizophrenia, alcoholism, etc.; 5 age decades are represented
Undergraduate and graduate students with no history of brain damage; test-retest data are provided 120 60 M, 60 F volunteers; data for various intelligence levels are presented
IQ/Education•
Location
FSIQ: 115 112 111 116 115 115 Education: 2-22
Canada
Education: 14.8 (6.4)
Miami
Education:
Colorado, California,
~20
Kansas
13.3 (3.4) <12 (132) 12-15 (249) ≥16 (172)
Wisconsin
Education: 1-20 11.43 (3.20) FSIQ:
S. California
105.9 (13.5) Verbal IQ: 88--128 105.8 (10.8) Performance IQ: 85--121 105.0 (10.5)
32
Cairo, Egypt
Education: 12.28 (2.18) FSIQ: 100 (14.4)
Washington
Table A17.1. (Contd.) IQ/Education•
Location
225 Volunteers classified in 4 age groupings; 88% were right-handed; 127 M, 98 F
Education: 14.55 (2.78) FSIQ: 112.25 (10.25)
Canada
110 51 M, 59 F volunteers
Education: 10.3
Brisbane, Australia
WAIS-R FSIQ: 105.9 (12.2)
Canada
138 Healthy participants; data are partitioned into 3 age groups
Education: 15.4 15.7 14.9
Maine
489 Healthy volunteers
Education: 13.19 (3.46) FSIQ: 113.09 (12.07)
California, Colorado, Ohio, Michigan
486 Volunteers: urban and rural; data collected over 15 years through multicenter collaborative efforts; strict exclusion criteria; 65% M; data are presented in T-score equivalents for M and F separately in 10 age groupings by 6 education groupings; in the 2004 edition, age range is expanded to 85 years, and the data are presented for African-American and Caucasian participants separately
427 Healthy participants; data are partitioned into 6 age groups by gender
Education: 13.6 (3.5) FSIQ: 113.8 (12.3) Education groups: 6-8 9-11 12 13-15 16-17 18+
California, Washington, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, Massachusetts, Canada
Study
Age"
n
TPT.20 Yeudall et al., 1987 page 328 Table A17.21
Sample: 15-40 Groups: 15-20 21-25
Sample Composition
26-30
31-40
TPT.21 Ernst, 1987 page 329 Table A17.22
65-75
69.6 (2.7)
TPT.22 Clark & Klonoff, 1988 page 330 Table A17.23
35-68 55.5 (8.0)
TPT.23 Elias et al., 1990 page 331 Table A17.24
TPT.24 Thompson & Heaton, 1991 page 331 Table A17.25
20-31 37-49 55-67 39.43 (17.76)
TPT.25 Heaton et al., 1991, 2004 page 332 Data are not reproduced in this book
42.1 (16.8) Groups: 20-34 35-39 40-44
45-49 50-54 55-59 60-64
65-69 70-74 75-80
TPT.26 Elias et al., 1993 page 332 Table A17.26
Groups: 15-24 25-34 35-44
79 All male, right-handed, coronary bypass surgery patients; test-retest data; 6-block version was used
Education: 12-19
Maine
45-54 55-64
≥65
TPT.27 Barrett et al., 2001 page 333 Table A17.27
43.9 (7.6)
1052 Air Force veteran controls; presumably all male; SDs for the test data are not provided
High school and college
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors.
Table A17.2. [TPT.1] Halstead, 1947: Data for the Control Group (Including Patients with Psychiatric Diagnoses): Mean Total Time for the Three Trials and Mean Scores for Memory and Localization for the Total Group and for Each Subgroup

                      TPT
                n     Total             Memory       Localization
Total           30*   10.8              8.2          5.7
Civilian        14    9.5 (5.2-19.6)    8.4 (6-10)   6.1 (2-10)
Military        10    12.0 (6.3-15.9)   7.8 (6-9)    5.6 (1-8)
Miscellaneous   6     11.7 (5.~13.3)    8.2 (7-10)   4.9 (1-7)

*Two participants were tested twice.
Table A17.3. [TPT.2] Reitan, 1955b, 1959: Data for Individuals Referred for Neuropsychological Evaluation, with Negative Neurological Findings: Mean Total Time for the Three Trials and Mean Scores for Memory and Localization

                                     WAIS                                               TPT
n    Age             Education      VIQ              PIQ              FSIQ             Total          Memory        Localization
50   32.36 (10.78)   11.58 (2.85)   110.82 (14.46)   112.18 (14.23)   112.64 (14.28)   12.59 (5.20)   7.65 (1.41)   5.29 (2.12)

Table A17.4. [TPT.4] Klove and Lochen (in Klove, 1974): Data for American and Norwegian Controls: Means for Total Time in Minutes, Memory, and Localization Scores for Each Group

                                                 TPT
             n    Age    Education   WAIS IQ     Total   Memory   Localization
Americans    22   31.6   11.1        109.3       14.0    7.2      4.3
Norwegians   22   32.1   12.2        111.9       13.7    7.5      5.2
Table A17.5. [TPT.5] Wiens and Matarazzo, 1977: Data for Male Applicants to Patrolman Program: Mean Time in Minutes and SD to Complete the TPT with the Preferred Hand, Nonpreferred Hand, Both Hands, and Total, as Well as Mean and SD for Memory and Localization Scores for Two Equal Groups of Subjects*

                                                 TPT
n    Age            Education      WAIS FSIQ     Preferred     Nonpreferred   Both          Total         Memory        Localization
24   23.6 (21-27)   13.7 (12-16)   117.5 (8.3)   4.85 (1.92)   3.02 (1.20)    1.87 (0.89)   9.74 (3.16)   8.46 (0.98)   5.67 (1.74)
24   24.8 (21-28)   14.0 (12-16)   118.3 (6.8)   4.38 (1.24)   2.97 (1.03)    1.83 (0.84)   9.18 (2.42)   8.67 (0.76)   6.13 (2.61)

                                      Test                                        Retest
n    Age          Education    FSIQ   Total         Memory        Localization    Total         Memory        Localization
29   24 (21-28)   14 (12-16)   118    9.36 (2.73)   8.38 (0.82)   5.34 (2.41)     8.19 (2.70)   8.72 (0.88)   7.10 (1.82)

*The data for a subset of 29 subjects include means and SDs for total time in minutes, memory, and localization scores for the original testing and retest.
Table A17.6. [TPT.6] Cauthen, 1978a: Data for Canadian Volunteers Presented in Four Age by Three IQ Groupings: Mean Times in Minutes and SD for Preferred Hand, Nonpreferred Hand, Both Hands, and Total, as Well as Means and SD for Memory and Localization

n    Age     IQ        Preferred   Nonpreferred   Both        Total        Memory      Localization
18   20-29   91-111    5.1 (1.4)   4.3 (1.3)      3.0 (1.2)   12.4 (3.2)   6.8 (1.4)   3.8 (2.1)
11           112-122   4.3 (0.9)   3.6 (1.4)      2.2 (0.9)   10.1 (2.0)   8.1 (1.6)   5.5 (2.1)
14           123-139   4.7 (1.2)   3.0 (1.1)      1.8 (0.7)   9.6 (2.1)    8.2 (0.9)   6.4 (1.6)
5    30-39   91-111    7.4 (2.5)   4.0 (2.3)      2.5 (0.9)   13.9 (5.3)   5.8 (1.8)   4.2 (0.8)
17           112-122   6.4 (2.8)   4.4 (2.5)      2.5 (1.5)   13.2 (5.9)   7.8 (1.1)   5.5 (2.0)
6            123-139   4.5 (1.1)   3.4 (0.9)      1.9 (0.3)   9.5 (2.3)    8.5 (1.0)   5.2 (3.2)
9    40-49   91-111    6.9 (2.2)   5.5 (1.8)      2.9 (0.8)   15.3 (3.5)   6.1 (1.3)   3.3 (1.9)
8            112-122   5.4 (1.9)   4.6 (1.7)      3.0 (2.1)   12.9 (5.2)   7.6 (1.1)   4.1 (2.2)
10           123-139   5.7 (1.8)   4.0 (1.5)      1.6 (0.4)   11.4 (3.4)   6.7 (1.5)   4.5 (2.1)
7    50-60   91-111    8.5 (2.6)   6.2 (1.8)      3.9 (2.1)   18.6 (3.3)   5.3 (1.4)   2.1 (1.1)
4            112-122   7.1 (3.1)   5.3 (2.8)      3.4 (2.0)   16.0 (7.0)   6.5 (3.1)   3.8 (3.1)
8            123-139   6.3 (1.9)   7.4 (3.1)      3.5 (1.2)   17.2 (5.0)   5.5 (1.1)   2.8 (1.8)
Table A17.7. [TPT.7] Harley et al., 1980: Data for Veterans Administration-Hospitalized Patients: Means and SDs for Time in Minutes per Block for Dominant and Nondominant Hands, Both Hands as Well as Total Time per Block for the Three Trials Combined by Age Groupings
n
Age
WAIS FSIQ
Education
Dominant
Nondominant
Both
Total
56
55-59
98.57 (11..43)
10.1
2.53 (2.42)
2.30 (2.54)
1.83 (2.70)
2.21 (3.31)
45
60-64
9.8
2.27 (1.73)
1.94 (1.35)
1.77 (1.93)
2.70 (3.81)
35
65-69
8.7
3.56 (3.26)
2.37 (1.67)
2.38 (2.16)
2.17 (1.86)
37
70-74
8.8
3.68 (3.24)
4.24 (3.77)
2.43 (2.59)
3.25 (3.41)
20
75-79
6.5
4.30 (2.64)
2.15 (2.51)
2.06 (1.43)
2.63 (1.42)
10.1
2.72 (2.58)
2.47 (2.72)
1.99 (2.89)
2.40 (3.57)
9.3
2.41 (1.81)
2.10 (1.41)
2.03 (2.00)
2.94 (3.92)
8.8
3.45 (3.16)
1.93 (1.29)
2.10 (2.14)
(1.47)
8.8
3.68 (3.24)
4.24 (3.77)
2.43 (2.59)
3.25 (3.41)
6.5
4.30 (2.64)
3.15 (2.51)
2.06 (1.43)
2.63 (1.42)
Total sample
~129
98.68 (9.93) ~121
97.51 (11.18) ~laG
100.41 (9.!E) 82-125 IOU5 (IO.i8) 81-119
Alcohol-equated sample 47
55-59
33
60-64
23
65-69
99.00 (ll.'IJ) ~~9
96.00 (9.43) ~117
99.00 (12.06)
1.72
~ISO
37
70-74
20
7~79
100.00 (9.99) 82-1t5 102.00 (10.18) 81-U9
Table A17.8. [TPT.8] Pauker, 1980: Data for Canadian Volunteers: Means and SDs for Total Time in Seconds, as Well as Memory and Localization Scores for the Sample as a Whole and for Three Age Groupings by Four WAIS IQ Levels

                       WAIS IQ
Age                    89-102             103-112            113-122            123-143            89-143
19-34   n              21                 53                 60                 28                 162
        Total          932.57 (294.57)    830.79 (298.21)    692.95 (267.62)    616.71 (174.93)    755.93 (285.73)
        Memory         7.10 (1.48)        7.70 (1.41)        8.33 (1.13)        8.61 (1.07)        8.01 (1.35)
        Localization   4.71 (2.19)        4.66 (2.46)        5.68 (2.39)        6.36 (2.09)        5.34 (2.41)
35-52   n              20                 34                 56                 25                 135
        Total          1139.80 (257.07)   961.79 (320.49)    888.11 (288.75)    788.64 (344.50)    925.53 (318.45)
        Memory         6.30 (1.66)        6.74 (1.56)        7.43 (1.31)        8.16 (1.43)        7.22 (1.56)
        Localization   2.40 (1.60)        3.38 (2.16)        4.05 (2.12)        4.08 (2.31)        3.64 (2.16)
53-71   n              4                  15                 27                 20                 66
        Total          1591.25 (417.50)   1342.60 (346.88)   1108.70 (317.15)   1189.90 (409.95)   1215.71 (375.07)
        Memory         5.00 (0.82)        5.33 (1.63)        6.22 (1.48)        6.55 (1.73)        6.05 (1.62)
        Localization   0.75 (0.96)        2.07 (1.71)        3.07 (1.66)        2.00 (2.25)        2.38 (1.92)
19-71   n              45                 102                143                73                 363
        Total          1083.22 (340.02)   949.73 (355.55)    847.87 (322.77)    823.63 (386.83)    902.60 (356.10)
        Memory         6.56 (1.62)        7.03 (1.70)        7.58 (1.48)        7.89 (1.62)        7.36 (1.64)
        Localization   3.33 (2.30)        3.85 (2.43)        4.55 (2.38)        4.38 (2.81)        4.17 (2.50)
Table A17.9. [TPT.9] Anthony et al., 1980: Data for Normal Volunteers: Mean Time in Minutes and SD Divided by Number of Blocks Placed and Means and SDs for Memory and Localization Scores

                                      WAIS                                               TPT
n     Age             Education      FSIQ             VIQ              PIQ              Time          Memory        Localization
100   38.88 (15.80)   13.33 (2.56)   113.54 (10.83)   113.24 (11.59)   112.26 (10.88)   0.44 (0.25)   7.80 (1.49)   4.64 (2.15)
Table A17.10. [TPT.10] Bak and Greene, 1980: Data for a Healthy Sample: Mean Time in Seconds and SDs for the Right Hand, Left Hand, Both Hands, and Total Time, as Well as Means and SDs for Memory and Localization Scores

                                         TPT
n    Age                 Education      Right             Left              Both              Total               Memory        Localization
15   50-62 55.6 (4.44)   13.7 (1.91)    365.20 (151.49)   306.80 (161.51)   169.80 (71.03)    841.80 (326.01)     5.27 (1.49)   2.07 (1.53)
15   67-86 74.9 (6.04)   14.9 (2.99)    571.60 (289.36)   514.00 (276.64)   301.20 (116.14)   1,386.80 (627.34)   5.07 (2.02)   1.60 (1.55)
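Tables in this appendix report TPT times in different units: seconds in Table A17.10, minutes in several others (for example, Table A17.5). The sketch below shows the unit conversion needed before comparing such values; the helper name `seconds_to_minutes` is hypothetical, and the conversion says nothing about whether the underlying samples or administration procedures are comparable.

```python
def seconds_to_minutes(seconds: float) -> float:
    """Convert a time reported in seconds to minutes."""
    return seconds / 60.0


# Mean total times from Table A17.10 (Bak and Greene, 1980), in seconds.
total_time_seconds = {"50-62": 841.80, "67-86": 1386.80}

for age_group, seconds in total_time_seconds.items():
    print(age_group, round(seconds_to_minutes(seconds), 2))
```

Means and SDs are both in the same unit, so the same division applies to an SD when a table reports one.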
Table A17.11. [TPT.11] Fromm-Auch and Yeudall, 1983: Data for Canadian Volunteers: Mean Time in Minutes, SDs, and Range for Preferred Hand, Nonpreferred Hand, Both Hands Combined, and Total Time for Each Age Grouping*

             TPT
n    Age     Preferred            Nonpreferred         Both                Total                  Localization     Memory
32   15-17   4.6 (1.2) 2.6-6.8    3.3 (1.2) 1.1-0.4    1.7 (0.5) 0.8-3.3   9.5 (2.1) 4.7-14.1     6.8 (2.5) 1-10   8.9 (1.0) ~10
74   18-23   5.1 (2.2) 1.9-13.5   3.5 (1.6) 1.1-10.8   2.1 (1.3) 0.4-9.3   10.6 (4.5) 4.2-29.1    5.7 (2.1) 1-10   8.2 (1.3) 4-10
56   24-32   4.5 (1.8) 1.7-9.5    3.1 (1.1) 1.5-7.1    1.8 (0.8) 0.5-4.6   9.4 (3.0) 3.8-18.8     5.5 (1.8) 2-9    8.3 (1.1) ~10
18   33-40   4.9 (1.7) 1.9-9.0    3.7 (1.0) 2.2-5.9    2.3 (0.8) 1.4-4.4   10.9 (2.9) 5.9-19.4    5.6 (2.2) 1-9    8.6 (1.1) ~10
10   41-64   5.6 (1.5) 4.0-9.0    4.2 (1.6) 2.4-8.1    2.5 (1.2) 1.4-5.5   12.2 (3.6) 8.3-20.6    4.9 (1.8) 2-7    7.7 (1.3) 6-9

*Mean correct blocks, SD, and range are reported for localization and memory trials.
Table A17.12. [TPT.12] Moore et al., 1984: Data for 183 Normal Volunteers Stratified into Six Age Groups

                        n                                    Mean Recall
Age Group   Mean Age    Male   Female   Total   Mean FSIQ    Shape   Location   Location proportion   Completion time (seconds)
19-27       23.1        24     32       56      115          8.3     5.7        0.67                  737
28-36       31.8        36     28       64      112          7.6     4.6        0.59                  805
37-45       40.8        31     28       59      111          7.2     3.4        0.46                  893
46-55       50.8        26     34       60      116          6.8     3.6        0.50                  1,032
56-65       61.2        8      12       20      115          5.8     1.9        0.33                  1,281
66-76       69.5        8      17       25      115          5.0     1.7        0.32                  1,377
Table A17.13. [TPT.13a] Schear, 1984: Data for Patients with Neurological and Psychiatric Disturbance: Means, SDs, and Ranges for Time in Minutes Required for Completion and Number of Blocks for the Right Hand, Left Hand, and Both Hands for Five Age Decades

                                   Right                                      Left                                       Both
n     Age     Education            Minutes              Blocks               Minutes              Blocks               Minutes              Blocks
111   20-29   11.72 (1.50) 6-16    6.98 (2.30) 1.~10    8.95 (2.14) 1-10     5.39 (2.62) 1.3-10   9.36 (1.78) 2-10     3.63 (2.29) 0.7-10   9.85 (1.08) 1-10
112   30-39   12.11 (2.21) 6-18    7.04 (2.43) 2.4-10   8.66 (2.49) 0-10     6.30 (2.75) 1.6-10   8.81 (2.39) 0-10     4.51 (2.80) 0.9-10   9.53 (1.52) 1-10
111   40-49   11.71 (2.75) 5-21    7.69 (2.37) 2.2-10   7.81 (3.11) 0-10     6.96 (2.50) 2.4-10   8.41 (2.89) 0-10     5.07 (2.74) 1-10     9.32 (2.05) 1-10
156   50-59   11.16 (3.61) 2-22    8.70 (1.85) 3.3-10   7.35 (3.03) 0-10     8.20 (2.21) 2.3-10   7.45 (3.24) 0-10     6.51 (2.92) 1.2-10   8.40 (2.86) 0-10
67    60-69   11.13 (3.60) 3-20    8.75 (1.81) 3.5-10   6.85 (3.25) 0-10     8.35 (2.24) 2.6-10   6.39 (3.51) 0-10     7.36 (3.00) 2.1-10   7.54 (3.06) 0-10
Table A17.14. [TPT.13b] Schear, 1984: Data for Patients with Neurological and Psychiatric Disturbance: Means, SDs, and Ranges for Time in Minutes and Number of Blocks*

              Total
n     Age     Minutes                Blocks               Memory             Localization
111   20-29   16.01 (6.57) 3.~       28.15 (4.34) 4-30    7.14 (1.99) 2-10   3.43 (2.79) 0-10
112   30-39   17.86 (7.25) 6.5-30    27.00 (5.29) 4-30    6.40 (1.85) 0-10   2.54 (2.20) 0-8
111   40-49   19.73 (6.58) 5.6-30    25.54 (7.09) 2-30    6.02 (2.01) 2-10   1.94 (2.01) 0-9
156   50-59   23.41 (6.24) 8.4-30    23.18 (8.28) 0-30    5.49 (1.92) 1-10   1.47 (1.73) 0-7
67    60-69   24.46 (6.56) 8.6-30    20.78 (9.05) 0-30    4.96 (1.96) 1-9    1.45 (1.46) 0-5

*Data are also reported for memory and localization.
Table A17.15. [TPT.14] Russell, 1985: Data for Veterans Administration Patients: Mean Time in Minutes and SDs for the Dominant Hand, Nondominant Hand, Both Hands, and Total Time, as Well as Mean and SD for Memory and Localization Scores for the 10-Block and 6-Block Versions

                                               TPT
            n    Age           Education    Dominant      Nondominant   Both           Total          Memory        Localization
10 blocks   19   43.5 (13.6)   14.8 (6.4)   6.88 (3.36)   5.67 (3.25)   13.33 (2.03)   15.68 (6.94)   7.42 (1.64)   4.10 (2.57)
6 blocks                                    1.88 (0.99)   1.41 (1.12)   0.87 (0.41)    4.17 (2.16)    4.84 (0.83)   3.84 (1.54)
Table A17.16. [TPT.15] Heaton et al., 1986: Data for a Sample of Normal Controls: Mean Total Time in Minutes per Block and Mean Memory and Localization Scores for the Six Subgroups, as Well as Percent Classified as Normal Using Russell et al.'s (1970) Criteria

                                          TPT                                   % Classified Normal
n     Subgroup          WAIS Mean SS*     Total Time   Memory   Localization    Total Time   Memory   Localization
319   Age <40           11.9              0.39         8.1      5.3             87.5         97.8     65.5
134   Age 40-59         11.2              0.50         7.5      4.0             69.4         90.3     41.8
100   Age ≥60           9.7               0.85         6.2      2.0             23.0         69.0     9.0
132   Education <12     9.5               0.64         6.9      3.6             53.7         78.8     35.6
249   Education 12-15   11.2              0.47         7.7      4.4             75.9         93.2     49.0
172   Education ≥16     12.9              0.43         8.0      5.0             78.5         96.5     61.1

*Mean WAIS scaled scores are also reported for each group.
Table A17.17. [TPT.16] Alekoumbides et al., 1987: Data for Veterans Administration Inpatients: Mean Time in Minutes and SD to Correctly Place all the Blocks for the Preferred Hand, Nonpreferred Hand, Both Hands, and Total Time•
Time
WAIS
n
111
Blocks per Minute
Age
Education
FSIQ
VIQ
PIQ
Preferred
Nonpreferred
Both
Total
Preferred
Nonpreferred
Both
Total
Memory
Localization
46.85 (17.17)
11.43 (3.20)
105.89 (13.47)
107.03 (14.38)
103.1 (13.02)
7.80
6.18 (3.51)
4.31
(0.90)
2.10 (1.20)
3.42 (2.08)
2.05
(3.43)
18.29 (9.87)
1.58
(3.73)
(1.06)
6.28 (2.01)
2.67 (2.29)
•Mean number of blocks correctly placed per minute and SD are summarized for the preferred hand, nonpreferred hand, both hands, and total time. Memory and localization scores as well as demographic information are also provided.
Table A17.18. [TPT.17] Bornstein et al., 1987a: Data for Healthy Volunteers: Means and SDs for Total Time, Memory, and Localization Scores for Both Testing Sessions, as Well as Raw Score Change and SD, Median Raw Score Change, and Mean Percent of Change

n = 23; Age = 32.3 (10.3); VIQ = 105.8 (10.8); PIQ = 105.0 (10.5)

               Test         Retest      Raw Score Change   Median Raw Score Change   Mean % of Change
Time           10.7 (4.2)   7.4 (2.6)   3.25 (3.3)         2.2                       27
Memory         8.4 (0.9)    8.9 (1.0)   0.65 (0.61)        1.0                       6
Localization   5.1 (2.3)    6.3 (2.9)   0.88 (1.9)         1.0                       34
Table A17.19. [TPT.18] El-Sheikh et al., 1987: Data for Egyptian Students: Mean Times in Minutes and SDs for the Dominant Hand, Nondominant Hand, Both Hands, and Total, as Well as Means and SDs for Memory and Localization Scores for Both Test and Retest*

                   Test           Retest
Dominant hand      4.95 (1.96)    3.32 (1.43)
Nondominant hand   3.68 (1.28)    2.83 (1.43)
Both hands         2.35 (0.84)    1.74 (0.71)
Total              11.01 (3.38)   7.71 (3.03)
Memory             8.34 (1.41)    8.59 (1.93)
Localization       5.53 (2.59)    6.88 (2.46)

*n = 32; mean age, 20.6 (1.4).
Table A17.20. [TPT.19] Dodrill, 1987: Data for a Sample of Volunteers: Mean Time in Minutes for TPT Total Time and Mean Scores for Memory and Localization for the Total Sample and for Various Levels of Intelligence

                                      WAIS-R
n     Age             Education      FSIQ             VIQ              PIQ             Total Time     Memory        Localization
120   27.73 (11.04)   12.28 (2.18)   100.00 (14.35)   100.92 (14.73)   98.25 (13.39)   13.65 (7.21)   7.86 (1.26)   4.97 (2.36)

n
7 18 34 64 93 101 75 60 48 33 19 10
Total Time
Memory
Localization
FSIQ 130 125 120 115 110 105 100 95 90 85 80 75 70
10.8 10.9 10.9 10.8 11.6 12.1 12.4 12.9 14.2 17.7 21.0 26.5
8
8 8 8 8
8 8 8 8 7 7 7 6
5 5 6 6 6 6 6 5 5 4 4 3 3
Table A17.21. [TPT.20] Yeudall et al., 1987: Data for Canadian Volunteers: Means and SDs for Time in Seconds to Execute the Task With the Preferred and Nonpreferred Hands Separately and Together, as Well as Means and SDs for Memory and Localization Scores, for Each Age Grouping and for the Total Sample

n     Age     Education      % Right Hand   FSIQ             Preferred         Nonpreferred     Combined          Memory        Localization
62    15-20   12.16 (1.75)   79.03          111.75 (10.16)   286.80 (101.90)   195.49 (84.93)   106.90 (43.71)    8.73 (l.Oi)   6.47 (2.44)
73    21-25   14.82 (1.88)   86.30          109.79 (9.97)    312.96 (131.61)   209.16 (86.44)   128.36 (74.42)    8.11 (1.29)   5.51 (2.10)
48    26-30   15.50 (2.65)   89.58          113.95 (10.61)   265.66 (92.45)    181.86 (67.06)   103.35 (42.34)    8.13 (1.55)   5.42 (2.23)
42    31-40   16.50 (3.11)   90.48          116.09 (9.51)    278.11 (101.94)   206.61 (69.08)   134.01 (53.13)    8.19 (1.35)   5.24 (1.97)
225   15-40   14.55 (2.78)   85.78          112.25 (10.25)   288.94 (111.22)   199.00 (79.26)   118.07 (5i.75)    8.30 (1.33)   5.70 (2.24)
Table A17.22. [TPT.21] Ernst, 1987: Data for Australian Volunteers: Mean Times in Minutes and SDs for the Preferred Hand, Nonpreferred Hand, Both Hands, and Total Time*

              Dominant                 Nondominant              Both                    Total
n    Gender   Time         Blocks      Time         Blocks      Time        Blocks      Time          Blocks       Memory      Localization
51   M        9.1 (4.1)    9.5 (1.5)   7.4 (3.4)    9.7 (1.7)   5.1 (2.9)   9.9 (1.0)   21.4 (9.0)    29.2 (3.0)   6.6 (1.6)   2.8 (1.9)
59   F        10.3 (4.1)   9.0 (2.2)   10.1 (4.2)   9.1 (2.0)   6.6 (3.8)   9.8 (0.9)   26.9 (10.9)   28.0 (3.8)   5.9 (1.6)   2.0 (1.8)

*In addition, mean number of blocks and SDs are presented for each time measure and for Memory and Localization.
Table A17.23. [TPT.22] Clark and Klonoff, 1988: Data for Male Coronary Bypass Surgery Patients in Canada: Mean Time in Minutes and SDs for Time to Complete the Task with the Right Hand, Left Hand, Both Hands, and Total, as Well as Mean and SD for Memory and Localization for Each of the Four Testing Probes for the 6-Block TPT Version

                                                       TPT
                       Age           WAIS-R FSIQ    Right         Left          Both          Total         Memory        Localization
Presurgery             55.5 (8.01)   105.9 (12.2)   2.37 (1.16)   1.74 (0.86)   1.05 (0.59)   5.16 (2.29)   4.46 (1.11)   3.33 (1.59)
Postsurgery 3 months                                2.07 (0.86)   1.83 (1.10)   0.97 (0.49)   4.87 (2.16)   4.64 (1.09)   3.47 (1.67)
12 months                                           2.06 (0.82)   1.63 (0.81)   1.05 (0.60)   4.73 (1.87)   4.i8 (1.05)   3.69 (1.48)
24 months                                           2.19 (1.01)   1.65 (0.91)   0.92 (0.42)   4.77 (1.87)   4.i2 (1.10)   3.44 (1.65)
Table A17.24. [TPT.23] Elias et al., 1990: Data for 183 Healthy Volunteers Partitioned into Three Age Groups

            n                            WAIS          TPT
Age Group   Male   Female   Education    VIQ    PIQ    Total Time (Minutes)   Memory        Localization
20-31       41     47       15.7         119    116    9.40 (3.52)            8.00 (1.20)   6.10 (2.10)
37-49       23     38       15.4         122    122    11.90 (5.20)           7.80 (1.30)   4.90 (2.20)
55-67       12     22       14.9         124    121    15.70 (8.90)           7.10 (1.20)   3.20 (1.90)
Table A17.25. [TPT.24] Thompson and Heaton, 1991: Data for Healthy Volunteers: Mean Times in Minutes and SDs for Dominant Hand, Nondominant Hand, Both Hands, and Total Time, as well as Memory and Localization Scores

n     Age             Education     WAIS IQ          Dominant      Nondominant   Both          Total           Memory        Localization
489   39.43 (17.76)   13.9 (3.46)   113.09 (12.07)   6.78 (7.56)   5.72 (6.13)   3.51 (5.72)   14.85 (10.20)   7.59 (1.58)   4.43 (2.45)
Table A17.26. [TPT.26] Elias et al., 1993: Data for 427 Healthy Volunteers Partitioned into Six Age Groups by Gender*

                            TPT
            n               Total Time (Minutes)          Memory                      Localization
Age Group   Male   Female   Male           Female         Male          Female        Male          Female
15-24       37     24       9.60 (0.83)    9.55 (0.78)    8.27 (0.21)   7.83 (0.31)   6.51 (0.36)   6.17 (0.50)
25-34       40     56       10.20 (0.58)   11.03 (0.53)   7.77 (0.26)   7.70 (0.19)   5.92 (0.37)   4.93 (0.28)
35-44       36     56       10.88 (0.61)   11.81 (0.56)   7.86 (0.24)   7.04 (0.25)   4.64 (0.34)   4.55 (0.34)
45-54       25     46       12.38 (0.89)   13.62 (0.81)   7.24 (0.23)   6.87 (0.19)   3.76 (0.32)   3.76 (0.29)
55-64       25     35       17.29 (2.15)   14.70 (0.98)   6.80 (0.27)   6.86 (0.23)   3.40 (0.40)   2.91 (0.33)
≥65         24     23       16.49 (1.18)   18.66 (1.70)   6.62 (0.22)   6.30 (0.38)   3.12 (0.40)   3.00 (0.53)

*Education range for the sample is 12-19 years.
Table A17.27. [TPT.27] Barrett et al., 2001: Data for Air Force Veteran Controls: Localization and Memory Scores and Mean Times in Minutes for Dominant Hand, Nondominant Hand, Both Hands, and Total Time*

n       Age          Localization   Memory   Dominant   Nondominant   Both   Total
1,052   43.9 (7.6)   2.59           6.16     7.98       7.06          3.85   18.87

*SDs are not provided for the test scores.
Appendix 18: Locator and Data Tables for the Wechsler Memory Scale (WMS-R, WMS-III, and WMS-IIIA)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 18. Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A18.1. Locator Table for the Wechsler Memory Scale (WMS-R, WMS-III, and WMS-IIIA)

WMS-R.1 Wechsler, 1987 (page 346; data are not reproduced in this book)
  Age•: 16-17, 18-19, 20-24, 25-34, 35-44, 45-54, 55-64, 65-69, 70-74
  n: 53, 0, 50, 0, 54, 0, 54, 55, 50
  Sample Composition: Sample designed to represent the normal population of the U.S.; stratified sample based on age, sex, race, and geographic region
  IQ/Education•: Mean WAIS-R FSIQ was 110 (15); however, the full WAIS-R was administered only to the 35-44 and 55-69 groups; a 4-test short form was given to the other 4 age groups on whom data were collected
  Length of Delay: 30 minutes
  Scoring Method: Wechsler (1987)
  Subtests Administered: Full WMS-R
  Special Notes: Normative data for 18-19, 25-34, and 45-54 age groups were statistically interpolated (i.e., no data were actually collected)
  Location: USA (standardization)

WMS-R.2 Cullum et al., 1990 (page 347; Tables A18.2, A18.3)
  Age•: 50-70, 75-95
  n: 47, 32
  Sample Composition: Group of healthy, well-educated, community-dwelling older adult volunteers recruited via flyers and newspaper and subsequently screened via telephone "for neuropsychological risk factors, history of neurological disorder, learning disability, major psychiatric disorder, major medical illness or substance abuse;" also excluded if taking any medications which might negatively affect performance on memory tests; low-dose antihypertensives were allowed
  IQ/Education•: 14.4 (2.7) years, 14.6 (3.0) years
  Length of Delay: 30 minutes
  Scoring Method: Wechsler (1987)
  Subtests Administered: Full WMS-R
  Special Notes: Provides WMS-R "preliminary norms" for "normal elderly subjects;" forgetting rates for verbal and nonverbal material also provided
  Location: San Diego, CA
= N
\0
Table A18.1. (Contd.)

Study: WMS-R.3 Mittenberg et al., 1992; page 348; Tables A18.4, A18.5
Age* (n): 25-34 (50)
Sample Composition: Sample designed to match 1980 U.S. Census data stratified on age, gender, ethnicity, and education; differs from WMS-R standardization in that all subjects reside in Florida; recruited from "local businesses, weekend/evening adult education and vocational/technical classes"
IQ/Education*: Prorated WAIS-R FSIQ based on Vocabulary and Block Design subtests only; mean = 101.3 (14.6), range = 72-131, median = 100
Length of Delay: 30 minutes
Scoring Method: Wechsler (1987)
Subtests Administered: Full WMS-R
Special Notes: Provides a statistically derived estimate of probable WMS-R values for individuals >14 years old
Location: Florida

Study: WMS-R.4 Lichtenberg & Christensen, 1992; page 348; Table A18.6
Age* (n): 70-74 (25), 75-79 (23), 80-99 (18)
Sample Composition: Consecutive admission sample of cognitively intact geriatric medical patients from an urban hospital; about 1/3 had hip fractures, 1/3 had knee replacement due to arthritis, 1/3 showed "deconditioning from lengthy illness;" sample comprised of 43 F, 23 M; 35 Caucasian, 31 black
IQ/Education*: No information; however, Mattis Dementia Rating Scale cutoff was 129 to insure intact cognition
Length of Delay: 30 minutes
Scoring Method: Wechsler (1987)
Subtests Administered: WMS-R Logical Memory subtest only
Special Notes: Provides clinical comparison data (not normative data) for geriatric medical patients seeking treatment in an urban medical setting
Location: Detroit, MI

Study: WMS-R.5 Ivnik et al., 1992b; page 349; data are not reproduced in this book
Age* (n): Total sample: 56-74 (274), 75-94 (167). Data tables use midpoint interval method: 56-66 (154), 59-69 (161), 62-72 (169), 65-75 (168), 68-78 (178), 71-81 (160), 74-84 (151), 77-87 (123), 80-90 (84), 83-94 (53)
Sample Composition: Sample represents a combination of (1) patients who had a medical exam at the Mayo Clinic and were deemed "normal" because they lacked active neurological or psychiatric conditions that would compromise cognitive functioning (chronic medical illness was not an exclusion criterion, all were able to function independently) and (2) "normal controls" from a research project at Mayo's Alzheimer's Disease Patient Registry; criterion for normality determination was as above; sample primarily well-educated, Caucasian older adults from an agricultural region
IQ/Education*: For total sample (n = 441) WAIS-R IQ: VIQ = 105.5 (10.0), PIQ = 107.3 (11.4), FSIQ = 106.6 (10.5). Years of education: ≤7 (0.7%), 8-11 (13.6%), 12 (39%), 13-15 (22%), 16-17 (15%), ≥18 (9.8%)
Length of Delay: 30 minutes
Scoring Method: WMS-R converted to Mayo summary scores
Subtests Administered: Full WMS-R; however, only allowed 3 learning trials during both the Visual and Verbal Paired Associates I subtests; no additional trials were administered if criterion reached by third trial
Special Notes: Data are reported using the midpoint interval technique (Pauker, 1988) to "maximize available information." NOTE: Mayo and WMS-R summary scores are not interchangeable
Location: Rochester and Olmsted County, MN

Study: WMS-R.6 Richardson & Marottoli, 1996; page 351; Table A18.7
Age* (n): 81.5 (3.3); 76-80, 81-91 (total n = 101)
Sample Composition: All autonomously living elderly, current drivers; 48 F, 53 M; free from neurological and psychiatric illness
IQ/Education*: Education: 11.0 (3.7)
Length of Delay: 30 minutes
Scoring Method: Wechsler (1987)
Subtests Administered: Logical Memory and Visual Reproduction subtests
Special Notes: Data are presented in age by education cells: <12, ≥12 years
Location: New Haven, CT

Study: WMS-R.7 Marcopulos et al., 1997; page 352; Tables A18.8-A18.11
Age* (n): 76.48 (7.87); 55-64, 65-74, 75-84, ≥85 (total n = 131)
Sample Composition: Healthy adults over 55 years of age, living in a rural setting
IQ/Education*: Education: mean = 6.65 (2.14), range = 0-10
Length of Delay: 30 minutes
Scoring Method: Presumably Wechsler (1987)
Subtests Administered: Logical Memory and Visual Reproduction subtests
Special Notes: The study aimed at development of normative data for rural community-dwelling older adults with no more than 10 years of formal education; data are reported by age and age by education; percent retention was calculated
Location: Rural central Virginia

Study: WMS-R.8 Shores & Carstairs, 2000; page 353; data are not reproduced in this book
Age* (n): 18-34 (399)
Sample Composition: Healthy, generally well-educated adults (193 M, 206 F)
IQ/Education*: Three education levels reported: <12 (n = 91), 12 (n = 91), >12 (n = 217) years for Visual Reproduction and Visual Memory Span subtests
Length of Delay: 30 minutes
Scoring Method: Wechsler (1987), with Logical Memory subtest scored according to Ivison, 1993
Subtests Administered: Full WMS-R
Special Notes: Provides normative data for urban Australians; data reported by age/education categories for Visual Reproduction and Visual Memory Span subtests; performance for men vs. women noted for Logical Memory, Visual Paired Associates, and Verbal Paired Associates subtests; tables convert performance raw to scaled scores
Location: Sydney, Australia

(continued)
Table A18.1. (Contd.)

Study: WMS-III.1 Wechsler, 1997; page 353; data are not reproduced in this book
Age* (n): 15-79 partitioned into 11 age groupings (1,250)
Sample Composition: Nationally collected standardization sample
Length of Delay: Approximately 30 minutes
Scoring Method: Wechsler (1997)
Subtests Administered: Full WMS-III
Special Notes: Raw scores are reported which can be converted to scaled scores
Location: USA (standardization)

Study: WMS-III-A.1 Wechsler, 2002; page 354; data are not reproduced in this book
Age* (n): 15-79 partitioned into 11 age groupings (1,250)
Sample Composition: Nationally collected standardization sample; data are from WMS-III standardization sample
Length of Delay: Approximately 30 minutes
Scoring Method: Wechsler (1997)
Subtests Administered: Full WMS-III-A (Logical Memory & Family Pictures)
Special Notes: Raw scores are reported which can be converted to scaled scores
Location: USA (standardization)

*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A18.2. [WMS-R.2a] Cullum et al., 1990: Data for a Sample of Healthy Older Adults Partitioned into Two Age Groups

                                 Age 50-70 (n = 47)     Age 75-95 (n = 32)
                                 M        SD            M        SD
WMS-R subtests
Digit Span                       14.9     3.7           15.3     3.7
Visual Span                      14.7     2.4           14.1     2.5
Logical Memory I                 29.2     7.7           25.0     7.5
Logical Memory II                25.6     7.9           20.9     8.4
Verbal Paired Associates I       19.7     2.9           18.3     2.8
Verbal Paired Associates II      7.6      0.7           6.9      1.2
Visual Reproduction I            34.3     3.9           29.1     7.3
Visual Reproduction II           29.1     6.1           20.1     9.1
Visual Paired Associates I       14.1     3.7           12.7     3.5
Visual Paired Associates II      5.6      0.9           4.8      1.4
Figural Memory                   6.5      1.7           6.5      1.5
WMS-R raw summary scores
Attention/Concentration          64.7     9.3           64.5     11.2
Verbal Memory                    78.0     16.3          68.5     16.8
Visual Memory                    55.0     7.1           48.3     9.7
General Memory                   133.0    20.7          116.6    22.0
Delayed Memory                   81.1     12.8          64.5     17.4
Table A18.3. [WMS-R.2b] Cullum et al., 1990: Savings Scores for the Two Age Groups

                             Age 50-70 (n = 47)     Age 75-95 (n = 32)
Savings Score                M%       SD            M%       SD
Logical Memory               87       13            83       18
Verbal Paired Associates     96       8             88       13
Visual Reproduction          85       15            68       25
Visual Paired Associates     97       20            84       23

Table A18.4. [WMS-R.3a] Mittenberg et al., 1992: Data for a Sample of 50 Healthy Adults 25-34 Years of Age: Raw Score Means and SDs for the WMS-R

Subtest                          M        SD       Range
Information/Orientation          13.88    0.33     13-14
Mental Control                   4.98     1.13     2-6
Figural Memory                   7.20     1.49     4-10
Logical Memory I                 26.04    6.81     11-40
Visual Paired Associates I       15.12    3.39     5-18
Verbal Paired Associates I       20.80    3.16     12-24
Visual Reproduction I            34.52    5.39     16-41
Digit Span                       14.96    3.71     7-22
Visual Memory Span               17.66    3.62     10-25
Logical Memory II                22.38    7.38     9-37
Visual Paired Associates II      5.60     0.95     2-6
Verbal Paired Associates II      7.62     0.86     4-8
Visual Reproduction II           32.88    6.59     13-41
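
The savings scores in Table A18.3 express delayed recall as a percentage of immediate recall. The table does not state the formula, so the Python sketch below assumes the common definition (delayed score divided by immediate score, times 100); the function name `savings_score` is illustrative only. Applied to the group means in Table A18.2 it gives only an approximation of the published figures, which are presumably means of individual savings scores rather than ratios of group means, and it is meaningful only where the immediate and delayed subtests share the same scale.

```python
def savings_score(delayed: float, immediate: float) -> float:
    """Percent of immediately recalled material retained at delay (assumed definition)."""
    if immediate <= 0:
        raise ValueError("immediate recall score must be positive")
    return 100.0 * delayed / immediate


# Group means for ages 50-70 taken from Table A18.2 (Cullum et al., 1990).
logical_memory = savings_score(delayed=25.6, immediate=29.2)
visual_reproduction = savings_score(delayed=29.1, immediate=34.3)

print(f"Logical Memory savings: {logical_memory:.1f}%")
print(f"Visual Reproduction savings: {visual_reproduction:.1f}%")
```

Both results fall within about one percentage point of the corresponding entries in Table A18.3.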
Table A18.5. [WMS-R.3b] Mittenberg et al., 1992: WMS-R Weighted Raw Score Composites

Composites                   M         SD
Verbal Memory                72.58     15.30
Visual Memory                56.46     8.91
General Memory               129.04    20.25
Delayed Recall               80.74     15.16
Attention/Concentration      69.58     13.40
Table A18.6. [WMS-R.4] Lichtenberg and Christensen, 1992: Data for a Sample of Cognitively Intact Geriatric Medical Patients: Means and SDs for Logical Memory I and II from the WMS-R

                                 Logical Memory I     Logical Memory II
Group      Age       n           M        SD          M        SD
Overall    -         66          17.8     5.8         13.7     6.6
Group 1    70-74     25          19.4     5.4         15.7     6.7
Group 2    75-79     23          16.5     6.3         12.9     6.3
Group 3    80-99     18          17.3     5.4         12.1     6.6
Table A18.7. [WMS-R.6] Richardson and Marottoli, 1996: Data for a Healthy Elderly Sample (n = 101), Partitioned by Two Age Groups and Two Educational Levels: Means and SDs for Logical Memory I and II and Visual Reproduction I and II

                            Age Group 76-80                   Age Group 81-91
Education                   <12 (n=26)      ≥12 (n=24)        <12 (n=18)       ≥12 (n=33)
Logical Memory I            14.17 (6.48)    19.24 (5.62)      14.29 (4.70)     19.57 (7.13)
Logical Memory II           8.88 (5.24)     12.70 (5.49)      9.12 (4.41)      12.69 (8.44)
Visual Reproduction I       20.29 (6.91)    28.24 (4.77)      20.31 (7.96)     23.70 (6.80)
Visual Reproduction II      8.63 (6.57)     16.15 (7.33)      11.56 (10.16)    11.97 (8.57)
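
A common way to use a normative cell such as those in Table A18.7 is to convert a raw score into a z-score and an approximate percentile. The following Python sketch is an illustration under stated assumptions, not a procedure taken from the source: it assumes scores in the cell are roughly normally distributed, the raw score of 12 is a hypothetical patient value, and the function names are invented for the example.

```python
from statistics import NormalDist


def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a raw score against a normative mean and SD."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return (raw - norm_mean) / norm_sd


def percentile_from_z(z: float) -> float:
    """Approximate percentile rank, assuming a normal distribution."""
    return 100.0 * NormalDist().cdf(z)


# Normative cell from Table A18.7: Logical Memory I, age 76-80, education >= 12 years.
NORM_MEAN, NORM_SD = 19.24, 5.62

hypothetical_raw = 12  # illustrative patient score, not from the source
z = z_score(hypothetical_raw, NORM_MEAN, NORM_SD)
print(f"z = {z:.2f}, percentile = {percentile_from_z(z):.1f}")
```

With small cells (here n = 24) and skewed score distributions the normal approximation can be rough, so the percentile should be read as an estimate.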
Table A18.8. [WMS-R.7a] Marcopulos et al., 1997: Data for a Sample of Rural Older Adults with 10 Years of Education or Less: Means and SDs for WMS-R Logical Memory I by Age and Educational Level*

                      Years of Education
                      0-4                 5-6                 7-8                 9-10                Total by Age
Age                   n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)
55-64                 1    4.0 (-)        2    19.0 (11.3)    2    16.5 (3.5)     2    14.0 (1.4)     7    14.7 (7.1)
65-74                 5    11.2 (5.9)     8    14.0 (7.9)     20   16.9 (5.8)     10   16.7 (5.3)     43   15.7 (6.2)
75-84                 12   11.8 (6.4)     15   13.4 (7.4)     28   14.1 (6.6)     8    19.4 (9.3)     63   14.2 (7.3)
85+                   5    7.4 (2.4)      1    0.0 (-)        10   7.9 (6.4)      2    12.5 (12.0)    18   7.8 (6.1)
Total by education    23   10.4 (5.7)     26   13.5 (8.0)     60   14.1 (6.8)     22   17.0 (7.3)

*Total sample n = 131; mean score = 13.8, SD = 7.2, range 0-32.
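
The "Total by Age" entries in Table A18.8 can be cross-checked against the education cells, because a marginal mean is the n-weighted average of its cell means. The Python sketch below does this for the 65-74 row; the helper name `pooled_mean` is illustrative, and agreement is expected only up to the rounding of the published cell means.

```python
def pooled_mean(cells: list[tuple[int, float]]) -> float:
    """n-weighted mean of (n, mean) cells."""
    total_n = sum(n for n, _ in cells)
    if total_n == 0:
        raise ValueError("cells must contain at least one participant")
    return sum(n * mean for n, mean in cells) / total_n


# Age 65-74 row of Table A18.8: (n, mean) for 0-4, 5-6, 7-8, and 9-10 years of education.
cells_65_74 = [(5, 11.2), (8, 14.0), (20, 16.9), (10, 16.7)]

row_n = sum(n for n, _ in cells_65_74)
row_mean = pooled_mean(cells_65_74)
print(row_n, round(row_mean, 1))
```

The cell counts sum to the printed row total of 43, and the weighted mean rounds to the printed value of 15.7.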
Table A18.9. [WMS-R.7b] Marcopulos et al., 1997: Data for a Sample of Rural Older Adults with 10 Years of Education or Less: Means and SDs for WMS-R Logical Memory II by Age and Educational Level*

                      Years of Education
                      0-4                 5-6                 7-8                 9-10                Total by Age
Age                   n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)
55-64                 1    2.0 (-)        2    14.0 (4.2)     2    11.0 (1.4)     2    11.0 (0.0)     7    10.6 (4.4)
65-74                 5    4.4 (3.0)      8    8.9 (7.5)      20   11.4 (6.7)     10   12.7 (5.4)     43   10.4 (6.6)
75-84                 12   7.4 (5.0)      15   8.5 (6.1)      28   9.4 (7.1)      8    14.1 (9.2)     63   9.4 (6.9)
85+                   4    1.8 (2.1)      1    0.0 (-)        10   3.8 (3.7)      2    10.5 (13.4)    17   3.9 (5.2)
Total by education    22   5.5 (4.6)      26   8.7 (6.5)      60   9.2 (6.8)      22   12.9 (7.1)

*Total sample n = 130; mean score = 9.1, SD = 6.5, range 0-28.
Table A18.10. [WMS-R.7c] Marcopulos et al., 1997: Data for a Sample of Rural Older Adults with 10 Years of Education or Less: Means and SDs for WMS-R Visual Reproduction I by Age and Education Level*

                      Years of Education
                      0-4                 5-6                 7-8                 9-10                Total by Age
Age                   n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)
55-64                 1    12.0 (-)       2    25.0 (1.4)     2    25.5 (4.9)     2    31.5 (4.9)     7    25.1 (7.1)
65-74                 5    8.0 (5.4)      8    15.3 (5.7)     20   18.6 (7.7)     10   21.6 (9.2)     43   17.4 (8.5)
75-84                 13   12.8 (5.9)     15   12.7 (5.7)     28   18.4 (8.8)     8    14.9 (8.1)     64   15.5 (7.9)
85+                   5    10.2 (4.2)     2    3.5 (2.1)      10   9.2 (6.1)      2    15.0 (5.7)     19   9.5 (5.7)
Total by education    24   11.2 (5.5)     27   13.7 (7.2)     60   17.2 (8.7)     22   19.5 (9.3)

*Total sample n = 133; mean score = 15.8, SD = 8.4, range 0-35.
Table A18.11. [WMS-R.7d] Marcopulos et al., 1997: Data for a Sample of Rural Older Adults with 10 Years of Education or Less: Means and SDs for WMS-R Visual Reproduction II by Age and Education Level*

                      Years of Education
                      0-4                 5-6                 7-8                 9-10                Total by Age
Age                   n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)         n    M (SD)
55-64                 1    2.0 (-)        2    22.5 (4.9)     2    18.0 (7.1)     2    12.5 (17.7)    7    15.4 (10.8)
65-74                 5    4.6 (5.9)      8    6.1 (4.8)      20   11.3 (7.9)     10   11.7 (7.5)     43   9.7 (7.4)
75-84                 13   5.5 (5.3)      15   6.2 (5.9)      28   9.9 (10.9)     8    6.9 (4.3)      64   7.7 (8.4)
85+                   4    5.8 (4.3)      2    0.0 (0.0)      10   2.4 (4.4)      2    2.5 (3.5)      18   2.9 (4.2)
Total by education    23   5.2 (5.0)      27   6.9 (7.0)      60   9.4 (9.5)      22   9.2 (7.5)

*Total sample n = 132; mean score = 8.1, SD = 8.2, range 0-34.
Appendix 19: Locator and Data Tables for the List-Learning Tests
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 19.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A19.1. Locator Table for the List-Learning Tests

Study: RAVLT.1 Rey, 1941, 1964; page 375; Table A19.2
Age* (n): not reported (132)
Sample Composition: French-speaking Swiss stratified into 5 occupational groups
IQ/Education*: -
Trials Reported: I-V
Location: Switzerland

Study: RAVLT.2 Query & Megran, 1983; page 376; Table A19.3
Age* (n): 19-81 (677): 19-24 (54), 25-29 (88), 30-34 (109), 35-39 (54), 40-44 (50), 45-49 (52), 50-54 (83), 55-59 (81), 60-64 (57), 65-69 (26), 70-81 (23)
Sample Composition: The study provides norms for male ambulatory inpatients treated for a variety of physical complaints at a VAMC
IQ/Education*: Education: 11.44; WAIS IQ: 93.83
Trials Reported: I & V, postinterference recall, recognition, V-I difference
Location: North Dakota

Study: RAVLT.3 Rosenberg et al., 1984; page 376; Table A19.4
Age* (n): 48.62 (16.60) (47); 47.51 (13.59) (45)
Sample Composition: VAMC psychiatric and neurological inpatients classified as memory-impaired and non-memory-impaired
IQ/Education*: Education: 10.81 (3.01); 11.87 (2.58)
Trials Reported: I-V, postinterference recall, recognition
Location: Illinois

Study: RAVLT.4 Cohen et al., personal communication; page 377; Table A19.5
Age* (n): 60-64, 65-69, 70-74, 75-89 (81)
Sample Composition: Normative data are provided for elderly volunteers per age group, for males and females separately (53 F, 28 M)
IQ/Education*: Education: 13.8
Trials Reported: I-V, interference, postinterference recall, immediate recognition, 30-minute delayed recall, delayed recognition
Location: Peoria, IL

Study: RAVLT.5 Ryan et al., 1986; page 377; Table A19.6
Age* (n): 45.86 (14.05) (85)
Sample Composition: VAMC inpatients; alternate form reliability for AVLT assessed
IQ/Education*: Education: 11.85 (2.51)
Trials Reported: I-V, postinterference recall, and recognition for alternate forms
Location: Kansas
Study: RAVLT.6 Bleecker et al., 1988; page 378; Table A19.7
Age* (n): 40-49, 50-59, 60-69, 70-79, 80-89 (196)
Sample Composition: The study presents norms for healthy subjects broken down by age group and gender (87 M, 109 F)
IQ/Education*: Means range 13-18 years for different groups
Trials Reported: I-V, recognition
Location: Maryland

Study: RAVLT.7 Wiens et al., 1988; page 379; Tables A19.8-A19.10
Age* (n): 19-51; 29.1 (6.0) (222)
Sample Composition: The study presents normative data for healthy job applicants, representing an occupational cross-section of the community; 193 M, 29 F; data are stratified by FSIQ, age, and education
IQ/Education*: ≥12 for all groups
Trials Reported: I-V, postinterference recall, recognition, derived indices
Location: Oregon

Study: RAVLT.8 Crawford et al., 1989; page 379; Tables A19.11, A19.12
Age* (n): - (60)
Sample Composition: The study compared test-retest performance over 27 days with the same form of RAVLT and with a parallel version developed by the authors, in healthy adults
IQ/Education*: -
Trials Reported: I-V, interference, postinterference recall, recognition
Location: UK

Study: RAVLT.9 Nielsen et al., 1989; page 380; Table A19.13
Age* (n): 20-54 (101); groups: 20-29 (35), 30-39 (27), 40-54 (39)
Sample Composition: 53 M, 48 F; majority had undergone minor surgery and were tested several weeks postsurgery
IQ/Education*: VIQ: 98.6 (12.2)
Trials Reported: I-V, total for trials I-V, 15-minute delayed recall
Location: Denmark

Study: RAVLT.10 Roth et al., 1989; page 380; Table A19.14
Age* (n): 27.5 (SE = 1.0) (61)
Sample Composition: 45 M, 16 F; controls in a study on neuropsychological deficits in acute spinal cord injury
IQ/Education*: Education: 12.8
Trials Reported: I-V, interference, postinterference recall, recognition (means and SEs reported)
Location: Detroit, Ann Arbor, MI

(continued)
Table A19.1. (Contd.)

Study: RAVLT.11 Geffen et al., 1990; page 381; Tables A19.15-A19.17
Age* (n): 16-86 (153): 16-19 (25), 20-29 (20), 30-39 (23), 40-49 (23), 50-59 (20), 60-69 (22), 70-86 (20)
Sample Composition: Norms provided for adults aged 16-86, by age and gender; variety of performance indices explore different memory mechanisms; equal number of males and females
IQ/Education*: Education: 11.2 (2.2)
Trials Reported: Standard recall, delayed recall, and recognition trials, plus serial position and functional indices
Location: Australia

Study: RAVLT.12 Ivnik et al., 1990; page 382; Tables A19.18-A19.20
Age* (n): 55-97 (394): 55-59 (45), 60-64 (53), 65-69 (64), 70-74 (67), 75-79 (69), 80-84 (49), 85-97 (47)
Sample Composition: The study provides raw data and summary scores for an elderly sample stratified into 7 age groups; 145 M, 249 F
IQ/Education*: Education: <8 years to >17 years
Trials Reported: I-V, interference, postinterference recall, 30-minute delayed recall, recognition, errors, and 4 summary scores
Location: Olmsted County, MN

Study: RAVLT.13 Miller et al., 1990; page 382; Table A19.21
Age* (n): Age range: 21-72; 37.20 (7.52) (769), 35.66 (6.47) (727), 36.90 (7.04) (84)
Sample Composition: The study compared performance of 3 groups of homosexual/bisexual men: 1. Seronegative; 2. Asymptomatic seropositive; 3. Symptomatic seropositive
IQ/Education*: 16.36 (2.34), 15.70 (2.44), 16.06 (2.50)
Trials Reported: V, total for trials I-V
Location: MACS centers at Baltimore, Chicago, L.A., & Pittsburgh

Study: RAVLT.14 Shapiro & Harrison, 1990; page 383; Tables A19.22-A19.24
Age* (n): 66 (17), 19 (25)
Sample Composition: Four alternate forms of AVLT were compared; 2 were generated according to the criteria developed by the authors; 2 subject samples were used: VAMC patients and undergraduate students
Trials Reported: I-V, interference, postinterference recall
Location: Virginia
Study: RAVLT.15 Mitrushina et al., 1991; page 384; Tables A19.25-A19.27
Age* (n): 57-65 (28), 66-70 (45), 71-75 (57), 76-85 (26)
Sample Composition: Norms are provided for highly functioning elderly stratified into 4 age groups; 62 M, 94 F
IQ/Education*: 14.2, 14.0, 14.6, 13.3
Trials Reported: I-V, postinterference recall, recognition, false-positives, rates of acquisition and forgetting, primacy/recency effect
Location: California

Study: RAVLT.16 Mitrushina & Satz, 1991a; page 384; Tables A19.28, A19.29
Age* (n): 57-65 (19), 66-70 (40), 71-75 (47), 76-85 (16)
Sample Composition: Performance of highly functioning elderly sample is compared over 3 longitudinal annual probes; 49 M, 73 F
IQ/Education*: 14.4, 13.7, 14.5, 14.0
Trials Reported: I, V, postinterference recall
Location: California

Study: RAVLT.17 Selnes et al., 1991; page 385; Table A19.30
Age* (n): 25-34, 35-44, 45-54 (733)
Sample Composition: The study reports normative data collected on a large sample of seronegative homosexual/bisexual men; data are stratified by age and education
IQ/Education*: Education: college
Trials Reported: V, total for trials I-V, postinterference recall, delayed recall, delayed recognition
Location: MACS centers at Baltimore, Chicago, L.A., & Pittsburgh

Study: RAVLT.18 Delaney et al., 1992; page 385; Table A19.31
Age* (n): 45.8; 22-67 (42)
Sample Composition: Controls without histories of neurological or psychiatric problems
IQ/Education*: 12.8; 6-16
Trials Reported: I, III, V, postinterference recall, delayed recall, delayed recognition
Location: Connecticut, California, Florida, Virginia, Massachusetts, New York, Minnesota

Study: RAVLT.19 Ivnik et al., 1992c; page 386; Table A19.32
Age* (n): Age range 56-97; divided into groups based on midpoint interval (530)
Sample Composition: The study provides age-specific norms for elderly; scoring procedures were developed, which convert raw and computed scores into scaled scores; tables are not reproduced in this book
IQ/Education*: Education: ≤7 to ≥18 years
Trials Reported: Raw scores for various recall and recognition trials are reported in the earlier article
Location: Minnesota

(continued)
Table A19.1. (Contd.)

Study: RAVLT.20 Savage & Gouvier, 1992; page 387; Tables A19.33, A19.34
Age* (n): 15-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-76 (134)
Sample Composition: The study provides normative data for healthy adults, stratified by age group and gender; 66 M, 68 F
IQ/Education*: ≥12 years
Trials Reported: I-V, interference, postinterference recall, recognition, 30-minute delayed recall, delayed recognition
Location: Louisiana

Study: RAVLT.21 Crossen & Wiens, 1994; page 387; Table A19.35
Age* (n): 29.9 (6.2) (60)
Sample Composition: Performance on RAVLT and CVLT was compared on a group of healthy job applicants; 52 M, 8 F
IQ/Education*: Education: 14.7 (1.6); FSIQ: 106.3
Trials Reported: I-V, total, interference, postinterference recall, recognition

Study: RAVLT.22 Geffen et al., 1994; page 388; Tables A19.36-A19.39
Age* (n): 31.3 (12.7) (51)
Sample Composition: The study explored equivalence between the original form of the RAVLT and a new form on a sample of healthy volunteers; 25 M, 26 F
IQ/Education*: Education: 12.2 (2.4)
Trials Reported: I-V, interference, postinterference recall, 20-minute delayed recall, recognition
Location: Brisbane, Australia

Study: RAVLT.23 Torres et al., 2001; page 388; Table A19.40
Age* (n): 29.0 M (10.2) (84); 30.5 F (11.7) (76)
Sample Composition: Data for 160 healthy controls, stratified by gender
IQ/Education*: 14.4 M (2.1); 14.8 F (1.9)
Trials Reported: I, V, interference trial, postinterference recall
Location: USA

Study: RAVLT.24 Miller, 2003 (an update on Selnes et al., 1991); page 389; Table A19.41
Age* (n): 38.2 (7.4); 25-34, 35-44, 45-59 (920)
Sample Composition: Seronegative homosexual and bisexual males from MACS; data are partitioned by age x education
IQ/Education*: 16.3 (2.4); Education: <16, 16, >16
Trials Reported: I, V, total for I-V, interference trial, 20-minute delayed recall, delayed recognition, number of false-positive errors
Location: MACS centers

Study: HVLT-R.1 Friedman et al., 2002; page 389; Table A19.42
Age* (n): 60-71, 72-84 (237)
Sample Composition: Healthy African-American sample; 108 M, 129 F; data are stratified by 2 age levels, 3 educational levels, and gender; tables for conversion into percentiles are also provided
IQ/Education*: Education: <12, 12, >12
Trials Reported: Standard indices, plus delayed cued recall and learning index
Location: Florida

Study: WHO-UCLA AVLT.1 Ponton et al., 1996; page 390; Table A19.43
Age* (n): 38.4 (13.5); 16-29, 30-39, 40-49, 50-75 (300)
Sample Composition: Spanish-speaking healthy volunteers; M/F ratio 40%/60%; data are partitioned by gender x 4 age groups x 2 education groups
IQ/Education*: 10.7 (5.1); <10, >10
Trials Reported: V, recall after interference and 20-minute delayed recall
Location: California

*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A19.2. [RAVLT.1] Rey, 1941, 1964: Data for French-Speaking Swiss Participants Trial
JI
III
IV
v
7.0 2.1
10.5 l.9
12.9 1.6
13.4 2.0
13.9 1.2
8.6 1.5
1i.8
13.4 1.4
13.8 1.1
14.0 1.0
8.9 1.9
1•. 7 ~.8
12.8 1.5
13.5 1.3
14.5 0.7
3.7 1.4
•.6 1.4
8.4 2.4
8.7 2.3
9.5 2.2
4.0 2.9
7.2 •. 9
8.5 2.5
10.0 3.3
10.9 2.9
Subject Groups Manual laborers (n=25) Mean SD Professionals (n=30) Mean SD Students (n =47) Mean SD Elderly laborers (70-90 years, n = 15) Mean SD Elderly professionals (70-88 years, n = 15) Mean SD
Table A19.3. [RAVLT.2] Query and Megran, 1983: Data for Male Ambulatory Inpatients Treated at North Dakota Veterans Administration Medical Center for Physical Complaints*
Trial I
Age 19-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 65-69 70-81
Trial V
Postinterference Recall
Recognition
n
M
SD
M
SD
: M
SD
M
SD
V-1 Difference
54 88 109 54 50 52 83 81 57 26 23
6.15 5.98 5.68 5.49 5.51 5.10 5.01 4.53 4.09 4.12 3.14
1.46 1.43 1.17 1.77 1.44 1.27 1.62 2.50 1.61 1.26 1.50
11.50 11.27 10.71 11.80 11.14 10.43 9.38 8.80 7.54 7.29 5.86
0.63 1.87 4.19 1.94 1.59 1.92 2.66 5.04 2.57 6.12 2.04
. 9.80 ' 9.91 I 9.os 9.55 9.37 ' 8.18 I 7.12 . 5.96 5.81 ' 5.21 3.45
1.66 2.36 2.94 2.05 2.60 2.91 3.40 4.28 2.64 2.58 2.92
12.81 12.16 13.03 13.45 12.86 12.23 11.48 10.75 9.96 9.50 8.91
2.26 2.51 1.57 1.12 2.25 1.70 2.67 6.41 2.62 3.33 3.64
5.35 5.29 5.03 6.31 5.63 5.33 4.37 4.27 3.45 3.17 2.72
•Mean education for the sample is 11.44 y~ mean IQ is 93.83.
Table A19.4. [RAVLT.3] Rosenberg et al., 1984: Data for Male Psychiatric and Neurological Inpatients from the Veterans Administration Medical Center in Chicago*

                              Non-memory-impaired    Memory-impaired
n                             45                     47
Age                           47.51 (13.59)          48.62 (16.60)
Years of Education            11.87 (2.58)           10.81 (3.01)
RAVLT Trial I                 4.96 (1.78)            3.91 (1.93)
RAVLT Trial II                6.81 (2.31)            5.00 (2.06)
RAVLT Trial III               8.66 (2.33)            5.71 (2.27)
RAVLT Trial IV                9.40 (2.53)            6.13 (2.77)
RAVLT Trial V                 9.71 (3.04)            6.89 (2.91)
Recall after Interference     7.81 (3.71)            4.07 (2.79)
Recognition                   11.53 (3.06)           8.18 (4.05)

*Mean full-scale IQ for the entire sample is 93.11 (13.43).
Table A19.5. [RAVLT.4] Cohen et al., Personal Communication: Data for Elderly Volunteers*

                                     Age Group
                                     60-64           65-69           70-74           75-89
Females
n                                    23              13              12              9
Trial I                              5.70 (1.48)     5.46 (1.39)     5.67 (1.23)     4.94 (1.63)
Trial II                             8.13 (1.69)     8.88 (2.08)     8.42 (1.38)     8.17 (1.80)
Trial III                            10.00 (1.84)    10.31 (1.81)    10.04 (1.48)    9.33 (1.92)
Trial IV                             11.58 (2.35)    11.00 (1.97)    10.67 (1.66)    10.72 (2.28)
Trial V                              11.83 (2.39)    11.54 (2.61)    10.71 (1.82)    11.33 (2.12)
List B                               6.13 (2.20)     5.81 (1.93)     5.17 (1.48)     5.50 (1.17)
List B errors                        0.00 (0.00)     0.23 (0.44)     0.33 (0.65)     0.44 (0.73)
Trial VI (PostIR)                    10.15 (2.94)    9.23 (3.21)     8.21 (2.23)     7.78 (3.87)
Intrusions from list B               0.09 (0.29)     0.08 (0.28)     0.25 (0.45)     0.33 (0.71)
Trial V-I                            6.04 (1.60)     6.08 (2.17)     5.04 (1.32)     6.39 (2.00)
Trial V-VI                           1.67 (1.87)     2.31 (1.75)     2.50 (1.17)     3.56 (2.60)
Total errors                         0.70 (1.40)     1.38 (1.80)     2.08 (2.61)     2.00 (2.24)
Immediate recognition                12.65 (2.10)    12.46 (2.40)    12.25 (2.05)    12.22 (3.67)
Delayed (30-minute) recall           9.98 (2.66)     9.38 (2.95)     9.00 (3.08)     8.61 (3.00)
Delayed (30-minute) recognition      12.61 (1.70)    12.69 (1.75)    11.58 (2.50)    11.33 (3.88)
Males
n                                    8               7               9               4
Trial I                              5.24 (1.50)     6.00 (1.00)     4.22 (1.32)     3.75 (2.50)
Trial II                             6.88 (1.46)     8.36 (1.49)     6.44 (1.13)     6.50 (0.58)
Trial III                            8.25 (2.38)     9.79 (1.78)     7.22 (1.80)     8.25 (1.71)
Trial IV                             9.38 (1.92)     10.71 (2.36)    8.22 (1.64)     9.75 (2.22)
Trial V                              10.50 (1.69)    12.14 (1.86)    8.72 (0.90)     9.25 (1.89)
List B                               5.38 (1.22)     5.57 (1.40)     4.06 (1.55)     4.25 (1.71)
List B errors                        0.25 (0.46)     0.43 (0.54)     0.11 (0.33)     0.25 (0.50)
Trial VI (PostIR)                    6.75 (2.71)     8.57 (3.50)     5.94 (1.98)     8.50 (1.73)
Intrusions from list B               0.00 (0.00)     0.14 (0.38)     0.11 (0.33)     0.25 (0.50)
Trial V-I                            5.25 (2.19)     6.14 (1.95)     4.61 (1.87)     5.50 (1.73)
Trial V-VI                           3.75 (2.12)     3.57 (1.99)     2.78 (1.50)     0.75 (0.50)
Total errors                         4.38 (6.68)     1.29 (0.95)     1.67 (3.08)     1.25 (1.89)
Immediate recognition                9.71 (1.89)     11.43 (2.22)    11.89 (1.27)    11.75 (2.50)
Delayed (30-minute) recall           7.71 (1.80)     7.71 (2.98)     4.89 (1.54)     8.25 (2.63)
Delayed (30-minute) recognition      10.43 (1.72)    11.14 (3.08)    11.89 (2.03)    11.50 (2.08)

*Mean education for the sample is 13.8 years. PostIR, postinterference recall.
Table A19.6. [RAVLT.5] Ryan et al., 1986: Data for Inpatients from the Veterans Administration Medical Center in Kansas Referred for Psychological and/or Neuropsychological Assessment
WAIS-R FSIQ
Age
n
Gender
Race
85
82M 3F
61 White 24 Nonwhite
Education
45.86
19.74 (11.11)
(14.05)
11.85 (2.51)
Trial I
II
III
IV
v
Postinterference
Recognition"
Total I-V
M SD
4.69 1.70
6.34 2.08
7.29 2.55
8.35
3.03
8.87 3.20
6.18 3.75
9.82 3.64
35.55 11.08
Alternate M SD
4.46 1.78
6.05 2.39
6.93 2.59
7.39 2.87
7.99 2.81
5.44 3.27
10.39 2.91
32.81 11.18
r
0.63
0.62
0.60
0.62
0.71
0.66
0.65
0.77
Form
Original
"For this variable, n=84.
Table A19.7. [RAVLT.6] Bleecker et al., 1988: Data for Healthy Volunteers from the Johns Hopkins Teaching Nursing Home Study of Normal Aging
AVLT
n
~
Gender
Education
v~·
BOlt
I
II
III
IV
v
Recognition
15
40-49
M
15 (3.1)
~ (!l3)
3 (2.5)
7.2 (1.6)
9.5 (2.3)
10.7 (3.0)
11.6 (2.6)
12.3 (2.6)
14.1 (1.3)
16
40-49
F
15 (3.0)
55 (10.2)
5 (4.5)
7.7 (1.6)
10.9 (2.1)
12.3 (2.0)
12.6 (1.8)
13.6 (1.7)
14.5 (0.9)
20
50-59
M
13 (3.2)
51 (10.2)
5 (5.4)
6.5 (1.6)
9.8 (2.0)
11.5 (2.0)
11.8 (2.0)
12.8 (2.2)
14.4 (0.9)
22
50-59
F
14 (2.9)
54 (10.2)
5 (3.6)
7.4 (1.3)
10.0 (1.5)
12.0 (1.5)
12.8 (1.6)
13.6 (1.7)
14.5 (0.8)
5 (5.4)
5.6 (1.8)
8.0 (1.9)
9.9 (1.9)
10.3 (2.5)
11.3 (1.8)
13.9 (1.7)
6 (4.7)
6.6 (1.8)
9.6 (2.0)
11.1 (2.1)
12.2 (2.2)
12.6 (2.3)
14.4 (1.0)
23 29 18
60-69 60-69 70-79
M F M
13 (2.8)
sa
14 (2.6)
5S
15 (2.8)
56 (7,7)
3 (3.0)
6.3 (1.6)
8.2 (2.2)
9.6 (2.2)
10.0 (2.3)
10.8 (2.2)
14.2 (0.6)
(7'.0) (9.0)
29
70-79
F
15 (3.4)
59 (9.8)
5 (5.1)
6.3 (1.7)
9.4 (2.2)
10.4 (2.2)
11.8 (2.3)
12.5 (2.1)
14.5 (0.8)
11
80-89
M
18 (1.9)
61 (M)
5 (3.4)
5.0 (1.2)
7.1 (1.4)
8.0 (2.1)
8.7 (1.9)
9.2 (2.1)
13.0 (4.1)
15 (2.0)
58 (7J))
7 (3.5)
5.6 (1.6)
8.2 (1.5)
9.7 (2.3)
11.0 (1.8)
11.1 (1.4)
13.9 (1.4)
13
80-89
F
*Wechsler Adult Intelligence Scale-Revised Vocabulary raw scores.
†Beck Depression Inventory.
Table A19.8. [RAVLT.7a] Wiens et al., 1988: Data for Job Applicants by Wechsler Adult Intelligence Scale-Revised (WAIS-R) Full-Scale IQ (FSIQ)
"0
m
Trial
WAIS-R FSIQ
n
I
II
III
IV
80--89
5
8.0 (2.5)
10.4 (1.7)
10.8 (2.2)
90-99
29
7.1 (1.6)
9.7 (1.8)
Recognition
Distractor Trial Ust (B)
Words Learned (Trial V-I)
Percentage Recall
Errors
10.6 (2.4)
14.0 (0.7)
6.6 (2.6)
3.0 (1.7)
99.5 (22.0)
13.0 (2.0)
11.2 (2.2)
14.0 (1.1)
6.0 (1.6)
5.9 (2.2)
v
Postinterference Recall
11.0 (2.1)
11.0 (3.0)
11.4 (2.1)
12.2 (1.8)
z
Repetitions
Total (Trials I-V)
0
2.2 (1.8)
2.6 (3.2)
51.2 (10.9)
1.0
86.7 (14.5)
3.2 (3.9)
5.6 (5.7)
53.4 (7.4)
100--109
81
7.2 (1.8)
9.9 (2.5)
11.8 (2.0)
12.4 (1.9)
12.9 (1.8)
11.6 (2.3)
14.2 (0.9)
6.5 (1.5)
5.7 (1.9)
90.1 (12.0)
2.2 (2.9)
5.2 (5.8)
54.2 (8.2)
110-119
55
7.5 (1.7)
10.4 (2.2)
11.9 (1.9)
13.1 (1.4)
13.2 (1.6)
12.1 (2.3)
14.0 (1.2)
6.8 (1.5)
5.7 (2.1)
91.8 (12.1)
2.1 (2.4)
7.0 (6.9)
56.1 (7.0)
120--129
38
7.7 (1.8)
10.7 (2.2)
12.7 (1.7)
13.3 (1.5)
13.7 (1.7)
12.6 (1.9)
14.4 (0.8)
7.2 (1.9)
6.0 (1.8)
92.5 (9.9)
2.0 (2.3)
5.4 (6.3)
58.1 (7.2)
130-139
3
10.0 (2.6)
12.3 (2.5)
13.7 (1.5)
15.0 (0)
14.7 (0.6)
14.3 (1.2)
15.0 (O)
7.7 (1.5)
4.7 (2.5)
97.9 (10.4)
2.0 (1.7)
0.7 (1.2)
65.7 (6.7)
><
.....
© Swets & Zeitlinger (1988).
Table A19.9. [RAVLT.7b] Wiens et al., 1988: Data for Job Applicants by Age

                               Age 20-29       Age 30-39       Age 40-49
n                              126             71              12
Trial I                        7.4 (1.7)       7.4 (1.9)       7.3 (2.2)
Trial II                       10.4 (2.2)      9.9 (2.5)       9.8 (2.7)
Trial III                      12.2 (1.9)      11.7 (2.0)      11.4 (2.6)
Trial IV                       13.0 (1.7)      12.4 (1.8)      12.3 (1.8)
Trial V                        13.4 (1.7)      12.7 (1.8)      12.5 (2.5)
Postinterference Recall        12.1 (2.2)      11.7 (2.2)      11.2 (3.1)
Recognition                    14.2 (1.0)      14.2 (1.1)      13.8 (0.9)
Distractor Trial List B        6.8 (1.6)       6.5 (1.7)       6.6 (1.8)
Words Learned (Trial V-I)      6.0 (2.0)       5.3 (1.9)       5.5 (2.6)
Percentage Recall              90.4 (12.4)     92.0 (12.7)     88.9 (10.8)
Errors                         2.2 (3.0)       2.3 (2.5)       2.7 (2.3)
Repetitions                    5.8 (6.1)       5.0 (5.5)       7.3 (10.1)
Total (Trials I-V)             56.3 (7.4)      54.2 (8.3)      53.3 (10.3)

© Swets & Zeitlinger (1988).
Table A19.10. [RAVLT.7c] Wiens et al., 1988: Data for Job Applicants by Education Trial Years of Education
n
I
II
III
IV
v
Postinterference Recall
Recognition
DistractorTrial ListB
Words Learned (Trial V-1)
Percentage Recall
12
34
7.0 (1.6)
9.9 (3.5)
11.7 (2.0)
11.9 (3.5)
12.4 (2.3)
11.4 (2.41
13.9 (.1.2)
6.6
5.3 (.2.0)
93.1
(1.8}
(J.l..5)
Repetitions
Total (Trials 1-V)
1.9
6.6
52.9
{l.JI)
~6.7)
~7~)
Errors
13
25
7.5 (1.2)
10.1 (2.4)
11.9 (2.4)
12.7 (1.7)
13.2 (1.6)
12.1 (2.1)
13.9 (1.2)
6.0 (1.6)
5.7 (1.6)
91.4 (10.4)
1.4 (1.7)
7.1 (7.3)
55.3 (7.7)
14
50
7.2 (1.9)
9.9 (2.3)
ll.8 (2.1)
13.0 (1.9)
13.2 (2.0)
12.3 (2.2)
14.4 (0.8)
6.7 (1.8)
6.0 (2.2)
93.3 (ll.O)
2.5 (3.2)
6.5 (6.9)
55.2 (8.4)
15
19
7.4 (2.2)
10.3 (2.9)
12.4 (2.0)
12.6 (2.1)
13.2 (1.9)
11.4 (2.7)
14.3 (0.7)
6.6 (0.9)
5.7 (2.1)
86.0 (50.7)
2.3 (3.5)
2.6 (2.7)
55.8 (9.5)
16
80
7.6 (1.9)
10.5 (2.2)
12.1 (1.8)
13.0 (1.4)
13.3 (1.5)
12.0 (2.1)
14.2 (0.9)
6.8 (1.7)
5.7 (2.0)
90.0 (11.8)
2.6 (3.0)
5.0 (5.6)
56.5 (7.2)
5
7.8 (2.6)
10.4 (3.0)
12.4
13.8 (1.3)
13.4 (1.7)
ll.2 (3.1)
13.4 (1.8)
57.8
5.6 (3.0)
83.0 (16.6)
1.2 (1.6)
4.2 (1.6)
57.8 (7.7)
~17
(1.1)
(1.1)
© Swets & Zeitlinger (1988).
Table A19.11. [RAVLT.8a] Crawford et al., 1989: Data for Healthy Participants* (n = 60), Recruited from Nonmedical Health Service Personnel and Fire Service in the UK; Scores for Matched Groups Receiving the Original AVLT and the Parallel Version

                             Original        Parallel
Trial I                      8.30 (1.80)     7.37 (1.67)
Trial II                     11.00 (2.11)    10.50 (2.52)
Trial III                    11.80 (2.36)    11.70 (2.25)
Trial IV                     12.73 (2.02)    12.63 (1.50)
Trial V                      13.20 (1.61)    13.00 (1.48)
List B                       6.90 (2.22)     6.43 (1.89)
Postinterference Recall      11.90 (2.55)    11.43 (2.00)
Recognition                  25.37 (2.68)    25.13 (2.96)

*Demographic information for the sample is not available.
Table A19.12. [RAVLT.8b] Crawford et al., 1989: Data for Healthy Participants (n = 60); Scores for Groups Receiving Either the Same or a Different AVLT Version at 27-Day Retest

                             Same-version group                Different-version group
                             Test            Retest            Test            Retest
Trial I                      7.87 (1.76)     10.53 (2.39)      7.53 (1.76)     7.50 (2.13)
Trial II                     11.10 (2.19)    12.87 (1.81)      10.40 (2.43)    10.27 (2.16)
Trial III                    11.93 (2.00)    13.67 (1.40)      11.57 (2.56)    11.87 (2.00)
Trial IV                     13.03 (1.59)    13.90 (1.35)      12.33 (1.88)    12.70 (1.82)
Trial V                      13.33 (1.56)    14.13 (1.14)      12.87 (1.80)    12.90 (2.11)
List B                       6.70 (2.40)     7.70 (2.29)       6.63 (1.70)     6.23 (1.91)
Postinterference Recall      11.93 (1.95)    13.43 (1.68)      11.40 (2.58)    11.93 (2.77)
Recognition                  25.30 (2.47)    26.67 (2.47)      25.20 (3.15)    24.57 (3.67)
Table A19.13. [RAVLT.9] Nielsen et al., 1989: Data for Danish Participants; Majority Were Tested Several Weeks Post-Minor Surgery Trial 15-Minute Delayed Recall
n
Age
I
II
m
IV
v
Total
35
20-29
6.31 (1.53) 4-10
9.77 (2.00) 6-14
11.31 (1.97) 7-14
12.31 (1.79) 8-15
12.74 (1.46) 9-15
52.46 (7.21) 37~
11.91 (1.76) 9-15
27
5.85 (1.15) 4-8
9.04 (1.55) 6-12
10.26 (1.60) 6-14
11.59 (1.71) 7-15
12.22 (1.76) 8-15
48.96 (6.19) 33-64
11.22 (2.56) 5-15
39
5.67 (1.31) 4-9
8.18 (1.89) 4-12
10.08 (2.29) 5-15
10.77 (2.02) 6-15
11.41 (.2.14) 7-15
46.10 (7.99) 27-63
9.92 (2.73) 4-15
Table A19.14. [RAVLT.10] Roth et al., 1989: Data for the Control Group

n                          61
Age                        27.5 (1.0)*
Education                  12.8 (0.2)
Trial I                    7.4 (0.2)
Trial II                   10.4 (0.3)
Trial III                  11.9 (0.3)
Trial IV                   12.8 (0.2)
Trial V                    13.7 (0.2)
List B                     6.7 (0.3)
Postinterference Recall    12.4 (0.3)
Recognition                14.0 (0.2)

*Standard error.
Table A19.15. [RAVLT.11a] Geffen et al., 1990: Data for Healthy Australian Adults Stratified by Age Group: Males*

Age Group                  16-19          20-29          30-39          40-49          50-59          60-69          70-86
                           (n=13)         (n=10)         (n=10)         (n=11)         (n=11)         (n=10)         (n=10)
Trials
  I                        6.9 (1.8)      8.4 (1.2)      6.0 (1.8)      6.4 (1.8)      6.5 (2.0)      4.9 (1.1)      3.6 (0.8)
  II                       9.7 (1.7)      10.8 (1.9)     8.0 (2.4)      9.0 (2.3)      8.6 (2.0)      6.4 (1.2)      5.7 (1.7)
  III                      11.5 (1.2)     11.3 (1.6)     9.7 (2.7)      9.8 (2.0)      10.1 (1.6)     8.0 (2.6)      6.8 (1.6)
  IV                       12.8 (1.5)     12.2 (1.8)     10.9 (2.8)     11.5 (1.9)     10.7 (1.9)     8.5 (2.7)      8.3 (2.7)
  V                        12.5 (1.3)     12.2 (2.2)     11.4 (2.6)     10.9 (2.0)     11.8 (2.6)     8.9 (2.0)      8.2 (2.5)
  Total                    53.4 (5.4)     54.9 (7.0)     46.0 (10.9)    47.5 (8.3)     47.6 (8.5)     36.7 (8.4)     32.6 (8.3)
Total repeats              5.9 (5.6)      8.0 (4.6)      3.0 (3.6)      4.1 (2.9)      7.3 (7.5)      5.0 (3.6)      5.1 (8.6)
Extra list intrusions      0.39 (0.65)    0.90 (1.29)    1.20 (3.12)    0.55 (0.82)    0.73 (1.19)    0.30 (0.68)    0.90 (1.67)
List B                     6.9 (1.9)      6.5 (1.8)      5.3 (1.6)      6.1 (2.1)      5.0 (2.3)      4.9 (1.6)      3.5 (1.3)
Postinterference recall    11.2 (1.6)     11.1 (1.7)     9.7 (2.3)      9.7 (2.5)      9.6 (2.9)      7.2 (2.8)      6.4 (1.7)
20-minute delayed recall   11.3 (1.7)     10.6 (2.4)     10.4 (2.3)     10.5 (2.7)     10.0 (2.6)     7.1 (3.8)      5.6 (1.6)
Recognition
  List A                   14.4 (0.9)     14.2 (0.8)     13.5 (1.5)     14.2 (1.0)     13.9 (0.9)     12.4 (2.8)     11.5 (2.6)
  List B                   8.4 (2.8)      8.2 (2.7)      4.4 (2.0)      6.9 (2.6)      4.7 (2.9)      4.9 (2.7)      3.0 (2.5)
  p(A) List A              0.95 (0.04)    0.90 (0.05)    0.92 (0.04)    0.92 (0.06)    0.90 (0.06)    0.82 (0.13)    0.81 (0.10)
  p(A) List B              0.77 (0.09)    0.76 (0.09)    0.64 (0.01)    0.71 (0.09)    0.65 (0.10)    0.65 (0.09)    0.59 (0.08)
Misassignments
  A to B                   0.77 (1.01)    1.00 (1.89)    0.70 (1.25)    1.18 (1.33)    1.18 (1.54)    2.20 (1.32)    0.80 (1.03)
  B to A                   0.08 (0.28)    0.40 (0.84)    0.60 (0.70)    0.18 (0.60)    0.27 (0.65)    1.40 (1.78)    1.00 (1.25)
Total false-positives      1.39 (1.76)    3.20 (1.99)    1.50 (1.35)    2.64 (2.66)    2.91 (2.55)    4.30 (3.95)    3.10 (2.96)

*Data for additional indices developed by the authors are included.
© Swets & Zeitlinger (1990).
Table A19.16. [RAVLT.11b] Geffen et al., 1990: Data for Healthy Australian Adults Stratified by Age Group: Females*

Age Group                  16-19          20-29          30-39          40-49          50-59          60-69          70-86
                           (n=12)         (n=10)         (n=13)         (n=11)         (n=9)          (n=12)         (n=10)
Trials
  I                        7.8 (1.9)      7.7 (1.0)      8.0 (2.0)      6.8 (1.5)      6.4 (1.5)      6.0 (2.2)      5.6 (1.4)
  II                       10.5 (2.0)     10.5 (2.0)     10.8 (2.1)     9.4 (1.5)      8.2 (2.4)      9.0 (2.0)      6.9 (2.1)
  III                      12.3 (1.2)     12.2 (2.3)     11.5 (1.7)     11.4 (1.7)     10.2 (2.1)     10.8 (2.0)     8.9 (1.9)
  IV                       12.5 (1.7)     12.0 (1.6)     12.9 (1.3)     11.7 (2.1)     11.1 (1.9)     11.3 (1.4)     10.1 (1.9)
  V                        13.3 (1.5)     12.9 (1.5)     12.7 (1.3)     12.8 (1.4)     11.6 (2.1)     11.9 (1.6)     10.1 (1.2)
  Total                    56.5 (6.0)     55.3 (6.6)     55.9 (6.3)     52.1 (7.1)     47.6 (7.7)     49.0 (7.1)     41.6 (6.6)
Total repeats              5.5 (6.5)      10.6 (14.3)    5.0 (5.8)      8.0 (4.8)      4.9 (3.7)      4.8 (2.8)      3.5 (4.8)
Extra list intrusions      0.92 (1.38)    1.20 (1.40)    1.23 (1.74)    0.83 (1.19)    0.78 (1.30)    0.67 (1.07)    0.50 (0.97)
List B                     7.7 (1.3)      7.9 (2.0)      6.5 (1.5)      5.2 (1.3)      4.6 (1.9)      5.3 (1.1)      4.2 (1.9)
Postinterference recall    11.9 (2.5)     11.6 (2.5)     12.1 (1.9)     11.1 (2.4)     9.9 (2.8)      9.8 (1.6)      7.8 (1.8)
20-minute delayed recall   11.4 (2.5)     11.0 (2.0)     12.2 (2.5)     11.1 (2.3)     10.2 (2.7)     10.3 (2.3)     8.3 (2.1)
Recognition
  List A                   13.8 (2.0)     14.4 (0.8)     14.2 (1.7)     14.4 (0.8)     13.7 (1.1)     13.8 (1.1)     13.6 (2.0)
  List B                   7.8 (3.1)      8.0 (2.9)      8.9 (4.1)      7.4 (2.8)      5.7 (2.4)      7.5 (3.6)      7.5 (3.7)
  p(A) List A              0.92 (0.08)    0.91 (0.09)    0.89 (0.08)    0.88 (0.07)    0.88 (0.08)    0.90 (0.06)    0.84 (0.11)
  p(A) List B              0.74 (0.10)    0.75 (0.10)    0.78 (0.13)    0.73 (0.09)    0.68 (0.07)    0.74 (0.11)    0.73 (0.10)
Misassignments
  A to B                   0.33 (0.49)    0.40 (0.97)    0.31 (0.63)    1.17 (1.64)    1.56 (1.94)    1.42 (1.44)    0.90 (1.37)
  B to A                   0.25 (0.45)    0.30 (0.48)    0.00 (0.00)    0.17 (0.58)    0.33 (0.71)    0.58 (0.67)    1.10 (0.88)
Total false-positives      2.33 (2.96)    3.60 (3.92)    4.23 (3.37)    4.58 (2.68)    3.67 (3.71)    2.92 (3.12)    5.60 (5.72)

*Data for additional indices developed by the authors are included.
© Swets & Zeitlinger (1990).
Table A19.17. [RAVLT.11c] Geffen et al., 1990: Mean (SD) Number of Words Recalled in Five Grouped Serial Positions of Words in List A Averaged Over the Five Acquisition Trials
Age Serial Position
~9
40-49
50-59
60-69
10.3 (2.9) 8.6 (3.1) 6.7 (3.4) 10.0 (2.1) 10.4 (3.3)
12.9 (1.9) 9.0 (2.7) 6.0 (3.0) 8.8 (2.6) 10.8 (2.3)
11.2 (3.5) 10.5
(1.9)
12.3 (2.1) 11.1 (1.8) 7.7 (3.0) 12.0 (2.4) 11.8 (2.1)
9.0 (2.7) 6.2 (3.7) 3.5 (2.4) 8.6 (3.2) 9.4 (1.4)
5.2 (2.6) 2.4 (2.5) 7.6 (3.8) 8.1 (2.8)
12.5 (1.9) 11.6 (1.7) 10.2 (2.3) 11.3 (1.5) 11.0 (2.9)
12.4 (2.3) 11.0 (1.8) 8.4 (3.8) 11.5 (1.8) 12.0 (2.7)
13.2 (1.6) 11.2 (2.1) 8.7 (2.8) 11.2 (2.7) 11.6 (2.4)
12.6 (2.7) 9.8 (2.4) 8.2 (2.6) 10.9 (2.5) 10.6 (2.1)
11.9 (1.7) 10.1 (1.8) 8.0 (2.9) 9.1 (1.6) 9.9 (3.1)
10.3 (2.6) 8.1 (2.3) 4.9 (2.0) 8.4 (2.8) 9.9 (3.3)
16-19
20-29 '
12.7 (2.2) 10.3 (1.9) 7.8 (2.7) 10.6 (1.9) 12.0
70-86
Malea I (1-3)
II (4-6) III (7-9) IV
(1~12)
v (1~15)
F...,.,_ I (1-3)
II (4-6) III (7-9) IV (1~12) V(1~15)
(2.2)
7.2 (3.2) 8.7 (3.0) 10.1 (2.1)
11.7 (3.5) 8.9 (2.3) 7.3 (3.4) 9.0 (2.9)
10.7 (1.9)
© Swets & Zeitlinger (1990).
Table A19.18. [RAVLT.12a] Ivnik et al., 1990: Demographic Description of the Cognitively Intact Sample Partitioned into Groups Used in the RAVLT Testing

                         n
Age
  55-59                  45
  60-64                  53
  65-69                  64
  70-74                  67
  75-79                  69
  80-84                  49
  85-97                  47
Handedness
  Right-handed           369
  Left-handed            14
  Mixed/both             11
Gender
  Men                    145
  Women                  249
Marital status
  Single                 51
  Married                235
  Divorced               7
  Widowed                100
  No response            1
Education (years)
  <8                     9
  8-11                   60
  12-15                  226
  16-17                  60
  >17                    39
9.3 (2.3)
Table A19.19. [RAVLT.12b] Ivnik et al., 1990: Data for the Cognitively Intact Sample Stratified into Seven Age Groups Trials Recall after Interference
30-Minute Delayed Recall
Recognition
Errors
5.3 1.7 2-9
11.2 2.5 >15
10.4 3.1 0-15
14.0 1.3 10-15
0.6 0.9 0-3
11.9 2.0 7-15
4.9 1.5
9.9 3.1
~
10.0 3.1 4-15
~15
13.9 1.5 8-15
0.8 1.2 0-5
10.6 2.4 6-15
11.2 2.4 6-15
4.7 1.5 1-9
9.1 3.2 0-15
8.3 3.5 0-15
13.3 2.0 8-15
0.9 0.9 0-3
10.2 2.3 ~14
10.5 2.6 5-15
4.1 1.5 1-8
8.3 2.9 1-14
7.4 3.1 0-13
12.7 2.1 6-15
1.0 1.2 0-5
9.2 2.2 4-15
10.1 2.2 5-15
4.2 2.0 1-10
7.8 2.7 2-15
6.9 2.9 0-14
12.5 2.4 6-15
1.5 1.6 0-7
8.6 2.5 1-14
9.0 2.5 4-15
3.5 1.6
0-8
6.7 2.5 1-14
5.5 3.3 0-12
12.3 2.4 2-15
1.2 1.4 0-i
7.9 2.4
9.1 2.3
~15
~14
3.1 1.4 0-7
6.2 2.6 2-14
5.4 2.7 0-13
12.3 2.3 6-15
1.5 1.6 0-7
Age Group
n
lJ5.-.S9
45
M
so Range 60-64 M SD Range
53
65-69 M SD Range
64
7~74
67
M SD Range 75-79 M
IV
v
6.8 1.6 4-10
9.5 2.2 6-14
11.4 2.0 6-15
12.4 1.9 7-15
13.1 1.9 7-15
6.4 1.9
9.0 2.3
~13
~14
10.6 2.3 6-14
11.7 2.7 7-15
5.7 1.6 1-10
8.6 2.1 5-13
9.7 2.3 4-15
5.5 1.5 2-9
7.8 1.8 ~12
9.1 2.1 4-13
7.0 1.9
8.2 2.2
~12
~15
5.0 1.5 1-8
Range 80-84 M SD Range
49
85-97 M
47
Range
III
List B
65
so
so
II
4.4 1.2 2-7
6.5 1.5
7.7 2.1
~10
~12
4.0 1.5 0-7
6.0 1.8 2-10
7.4 2.2 2-12
Table A19.20. [RAVLT.l2c] Ivnik et al., 1990: Summary Scores for the Cognitively Intact Sample Stratified into Seven Age Groups
Age
n
55-59
45
M SD Range
60-64
Long-Term Percent Retention
53.2 8.2 33-67
19.3 5.8 3-34
85.0 12.9 45-100
79.1 18.7 0-108
49.7 9.1 32-68
17.8 7.0 .2-30
82.8 18.3 44-118
81.7 18.0 30-118
45.8 9.5 22-66
17.2 6.1 (-3)-28
80.5 21.2 0-133
72.3 23.6 0-125
43.1 9.1 1~1
15.6 6.9 (-1)-33
78.7 18.4 12-120
68.4 23.6 0-111
39.6 8.7 21-63
14.5 6.6 2-31
76.5 18.9 25-133
67.1 21.6 0-110
36.2 7.4 21-51
14.3 7.3 1-30
74.0 21.7 25-133
60.1 33.4 0-150
34.4 8.6 1-56
14.4 6.6 0-31
69.1 22.7 25-133
58.7 23.4 0-100
67
M SD Range
7fl-79
Short-Term Percent Retention
64
M SD Range 7~74
Learning Over Trials
53
M SD Range
6S-69
Total Learning
69
M SD Range
80-84 M SD Range
49
85-97 M SD Range
47
Table A19.21. [RAVLT.13] Miller et al., 1990: Data for Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study

                         Seronegative       Asymptomatic, seropositive    Symptomatic, seropositive
Race*
  W                      92                 91                            90
  B                      2                  2                             2
  H                      4                  6                             5
  O                      2                  2                             3
CES Depression Scale     9.08 (9.03)        9.44 (9.27)                   15.21 (11.19)
CD4                      970.42 (332.46)    561.90 (277.98)               277.22 (269.45)
Age                      37.20 (7.52)       35.66 (6.47)                  36.90 (7.04)
Education                16.36 (2.34)       15.70 (2.44)                  16.06 (2.50)
Trial V                  12.83 (1.85)       12.64 (1.88)                  12.40 (2.20)
Trials I-V Total         52.75 (8.05)       52.18 (8.30)                  50.51 (9.52)

*W, white; B, black; H, hispanic; O, others (percentages).
Table A19.22. [RAVLT.l4a] Shapiro and Harrison, 1990: The Original AVLT (List AB), the Alternate List (CD), and Two New Lists (EF and GH) List cot
List AB" Drum Curtain
Ben
Coffee School Parent Moon Garden Hat Farmer Nose Turkey Color House River
Desk Ranger Bird Shoe Stove Mountain Glasses Towel Cloud Boat
Book Flower Train Rug Meadow
Bowl Dawn Judge Grant
Harp
Plane County Pool Seed Sheep Meal
Insect
Salt Finger Apple Chimney Button Key
Lamb Gum Pencil Church Fish
ListEF
Dog
Glass Rattle
Street Grass Door Arm Star Wd'e Wmdow City Pupil Cabin Lake
Coat
Pipe
Bottle Peach Chair
Skin Fire Clock
Baby Ocean Palace
Up Bar Dress Steam Coin Rock Army Building Friend Storm Village
een
"Rey (1964). tLezak (1983, 1995, 2004).
Table A19.23. [RAVLT.14b] Shapiro and Harrison, 1990: Data for the Original and Three Alternate Forms for a Sample of 25 University Undergraduates*

Trial                      List AB         List CD         List EF         List GH
I                          7.00 (1.63)     7.40 (1.63)     6.84 (1.93)     7.28 (2.39)
II                         10.20 (2.24)    10.08 (1.87)    9.76 (1.90)     10.00 (2.25)
III                        11.76 (2.45)    11.40 (1.71)    11.12 (2.26)    11.80 (2.08)
IV                         12.40 (2.20)    12.40 (1.68)    12.16 (1.82)    12.52 (1.78)
V                          13.04 (2.09)    12.80 (1.55)    12.92 (2.14)    13.40 (1.35)
List B                     7.20 (1.89)     6.76 (1.59)     7.04 (1.62)     7.52 (1.50)
Postinterference recall    12.00 (2.61)    11.64 (2.12)    11.68 (2.64)    12.24 (2.31)

*Mean age 19 years.
List GH
Tower Wheat Queen Sugar Home
Boy Doctor Camp Flag Letter Com Nail Cattle Shore
Body
Sky
Dollar Valley Butter Han Diamond Winter Mother Christmas Meat Forest Gold Plant Money Hotel
Table A19.24. [RAVLT.14c] Shapiro and Harrison, 1990: Data for the Original and Three Alternate Forms for a Sample of 17 VAMC Medical Patients*

Trial                      List AB        List CD        List EF        List GH
I                          4.06 (1.43)    3.29 (1.96)    3.52 (1.55)    3.41 (1.37)
II                         5.52 (1.66)    4.94 (2.08)    4.76 (2.61)    4.71 (1.57)
III                        6.12 (2.00)    5.59 (2.09)    5.76 (2.31)    5.76 (2.44)
IV                         6.41 (2.12)    5.71 (2.26)    6.47 (2.65)    5.65 (2.18)
V                          6.47 (2.72)    6.47 (2.83)    6.88 (3.16)    6.47 (3.24)
List B                     3.35 (1.66)    3.41 (1.66)    3.18 (1.55)    3.41 (1.97)
Postinterference recall    4.06 (3.65)    3.29 (2.87)    3.71 (3.67)    4.17 (2.96)

*Mean age 66 years.
Table A19.25. [RAVLT.15a] Mitrushina et al., 1991: Demographic Characteristics for the Sample of Normal Elderly*

Age Groups      All             57-65           66-70           71-75           76-85
n               156             28              45              57              26
Age             70.7 (5.4)      62.8 (2.3)      68.2 (1.3)      72.9 (1.4)      78.7 (2.7)
Education       14.1 (2.9)      14.2 (2.0)      14.0 (2.0)      14.6 (3.4)      13.3 (3.6)
WAIS-R FSIQ     117.2 (12.6)    115.8 (12.6)    119.3 (14.5)    118.7 (10.8)    112.0 (11.7)

*The sample includes 62 males and 94 females.
Table A19.26. [RAVLT.15b] Mitrushina et al., 1991: Average Recall for Four Age Groups of Normal Elderly

Age Group                  57-65         66-70         71-75         76-85
Trials
  I                        6.4 (1.5)     5.9 (1.6)     5.1 (1.8)     5.1 (1.6)
  II                       8.8 (2.4)     8.5 (2.3)     7.5 (2.2)     6.8 (2.1)
  III                      10.4 (2.5)    9.8 (2.3)     8.7 (2.4)     8.3 (2.3)
  IV                       11.4 (2.3)    11.3 (2.3)    9.7 (2.7)     9.5 (2.8)
  V                        12.1 (2.4)    11.5 (3.0)    10.3 (2.9)    9.7 (2.8)
Postinterference Recall    10.3 (3.0)    9.1 (3.3)     8.4 (3.5)     7.7 (3.4)
Recognition                13.2 (1.3)    13.0 (1.1)    12.7 (1.9)    12.6 (1.9)
FP*                        0.8 (1.4)     1.0 (1.6)     1.1 (1.7)     0.8 (1.2)
V-VI                       1.9 (1.7)     2.4 (2.5)     1.9 (2.2)     2.0 (2.6)
V-I                        5.7 (2.0)     5.6 (2.9)     5.2 (2.5)     4.7 (2.7)

*False-positive errors.
Table A19.27. [RAVLT.15c] Mitrushina et al., 1991: Primacy/Recency Effects for the Entire Sample*

Trial        I      II     III    IV     V
Beginning    1.9    3.0    3.4    3.9    3.9
Middle       1.1    1.7    2.3    2.7    3.1
End          2.6    3.0    3.3    3.6    3.6

*Provides mean number of words recalled within each segment of the list across five acquisition trials.
Table A19.28. [RAVLT.16a] Mitrushina and Satz, 1991a: Demographic Characteristics for a Sample of Normal Elderly Stratified into Four Age Groups*

Age Groups          57-65         66-70         71-75         76-85
n                   19            40            47            16
Mean age            62.2 (2.5)    68.2 (1.2)    72.9 (1.4)    78.3 (2.5)
Mean education      14.4 (2.0)    13.7 (1.8)    14.5 (3.1)    14.0 (3.6)
Male/female (%)     10/90         15/85         30/70         22/78

*Mean Wechsler Adult Intelligence Scale-Revised full-scale IQ for the sample is 118.2 (13.0).
Table A19.29. [RAVLT.16b] Mitrushina and Satz, 1991a: Performance over Three Longitudinal Probes* for Four Age Groups of Normal Elderly

Age Group       Probe    Trial I      Trial V       Postinterference recall
57-65 (n=19)    T1       6.7 (1.6)    12.4 (2.6)    10.7 (3.2)
                T2       6.4 (1.3)    12.3 (2.5)    10.8 (3.2)
                T3       7.9 (2.3)    11.9 (2.8)    10.5 (3.4)
66-70 (n=40)    T1       6.0 (1.6)    11.8 (2.5)    9.5 (3.0)
                T2       6.2 (1.8)    11.7 (2.5)    9.9 (3.2)
                T3       7.3 (1.8)    12.1 (2.3)    10.4 (2.8)
71-75 (n=47)    T1       5.1 (2.0)    10.4 (2.7)    8.7 (3.5)
                T2       5.4 (1.7)    10.1 (3.3)    8.4 (3.6)
                T3       6.4 (1.8)    10.4 (3.2)    8.9 (3.6)
76-85 (n=16)    T1       5.1 (1.5)    10.3 (2.4)    8.4 (3.5)
                T2       5.8 (1.2)    10.6 (3.2)    8.5 (3.8)
                T3       6.0 (1.9)    9.8 (4.0)     7.9 (4.7)

*Three annual probes.
Table A19.30. [RAVLT.17] Selnes et al., 1991: Data for Seronegative Homosexual and Bisexual Males who Participated in the Multi-Center AIDS Cohort Study

Age Group                    25-34         35-44         45-54
n                            309           290           97
Education                    16.1 (2.2)    16.4 (2.3)    16.7 (2.6)
Race                         96.4% C*      96.6% C       95.9% C
Total score                  54.4 (7.8)    51.4 (8.1)    49.5 (7.9)
Trial V                      13.0 (1.8)    12.6 (1.8)    12.3 (1.8)
Recall after interference    11.3 (2.4)    10.7 (2.6)    10.6 (2.8)
Delayed recall               11.1 (2.5)    10.4 (2.9)    10.2 (3.2)
Delayed recognition          14.4 (0.9)    14.1 (1.2)    14.0 (1.2)

Education                    <College      College       >College
n                            229           202           302
Mean age                     36.1 (7.4)    35.6 (7.2)    38.4 (7.8)
Race                         94.8% C       96.0% C       96.7% C
Total score                  50.7 (7.5)    53.2 (8.1)    53.3 (8.3)
Trial V                      12.6 (1.7)    12.8 (1.9)    12.9 (1.8)
Recall after interference    10.5 (2.5)    10.9 (2.5)    11.2 (2.6)
Delayed recall               10.1 (2.7)    10.8 (2.9)    10.9 (2.8)
Delayed recognition          14.1 (1.2)    14.4 (0.9)    14.2 (1.1)

*C, Caucasian.
Table A19.31. [RAVLT.18] Delaney et al., 1992: Data for the Control Sample

n            42
Age          45.8 (22-67)
Education    12.8 (6-16)

Form                        A             C
Trial I                     6.0 (2.1)     6.1 (2.2)
Trial III                   10.1 (2.4)    10.0 (2.4)
Trial V                     11.6 (2.5)    11.8 (2.8)
Postinterference Recall     9.9 (3.2)     9.9 (3.3)
20-Minute Delayed Recall    9.9 (3.1)     9.2 (3.5)
Recognition                 13.6 (1.8)    14.0 (1.2)
Table A19.32. [RAVLT.19] Ivnik et al., 1992c: Demographic Description of the Cognitively Intact Sample Partitioned into Groups Used in the RAVLT Testing

                         n
Age groups
  56-59                  41
  60-64                  72
  65-69                  83
  70-74                  82
  75-79                  105
  80-84                  76
  85-89                  49
  90-97                  22
Handedness
  Right-handed           501
  Left-handed            16
  Mixed/both             13
Gender
  Males                  200
  Females                330
Marital status
  Single                 69
  Married                318
  Divorced               12
  Widowed                131
Race
  Caucasian              528
  Black                  1
  Hispanic               1
Education (years)
  ≤7                     8
  8-11                   84
  12                     192
  13-15                  117
  16-17                  82
  ≥18                    47

Table A19.33. [RAVLT.20a] Savage and Gouvier, 1992: Data for Healthy Adults: Trials I-V

                                Trials
Age      Gender     n     I            II           III           IV            V
16-19    Males      10    6.2 (0.8)    7.7 (1.8)    9.9 (1.5)     11.1 (2.3)    12.3 (1.6)
         Females    10    6.5 (1.7)    7.7 (2.3)    10.6 (2.4)    11.7 (1.2)    12.9 (0.2)
20-29    Males      10    6.4 (2.0)    8.4 (2.4)    9.6 (2.3)     10.1 (3.1)    10.5 (1.9)
         Females    9     5.7 (1.0)    7.3 (1.3)    8.0 (2.7)     9.6 (2.3)     10.3 (2.0)
30-39    Males      9     5.5 (1.1)    7.6 (1.0)    9.0 (2.3)     9.5 (2.7)     9.8 (2.1)
         Females    10    6.2 (3.2)    8.6 (3.8)    10.8 (2.3)    10.7 (2.2)    11.8 (1.9)
40-49    Males      9     6.0 (1.2)    7.3 (2.0)    9.1 (1.8)     9.8 (1.2)     10.4 (2.1)
         Females    10    5.7 (1.5)    7.3 (2.2)    9.1 (2.7)     9.4 (3.2)     10.4 (3.1)
50-59    Males      9     5.7 (0.8)    8.1 (1.0)    9.1 (1.8)     9.4 (2.5)     9.3 (2.2)
         Females    10    5.6 (1.6)    7.9 (1.5)    8.8 (2.2)     11.3 (2.4)    11.6 (2.2)
60-69    Males      9     5.0 (1.0)    6.0 (1.1)    7.4 (2.5)     8.2 (2.3)     8.4 (2.3)
         Females    10    5.6 (1.2)    7.4 (1.5)    7.0 (2.7)     9.0 (2.5)     9.3 (1.9)
70-76    Males      10    5.3 (1.4)    6.3 (1.2)    7.5 (1.6)     7.9 (2.4)     8.1 (2.4)
         Females    9     5.6 (1.1)    6.5 (1.7)    6.5 (1.8)     6.7 (1.6)     7.4 (1.6)
Table A19.34. [RAVLT.20b] Savage and Gouvier, 1992: Data for Healthy Adults: List B Through Delayed Recognition

Age      Gender     List B        Postinterference    Immediate      30-Minute         Commission     Delayed
                                  Recall              Recognition    Delayed Recall    Errors         Recognition
16-19    Males      5.9 (0.74)    10.6 (2.5)          14.1 (1.2)     9.9 (2.5)         0.60 (0.96)    13.9 (1.2)
         Females    5.4 (1.2)     12.1 (2.6)          14.4 (0.51)    11.4 (2.6)        0.60 (0.96)    14.3 (0.67)
20-29    Males      4.7 (1.6)     10.1 (3.0)          14.2 (1.6)     10.0 (3.4)        0.30 (0.67)    14.0 (1.6)
         Females    5.3 (2.1)     8.7 (2.3)           14.0 (1.1)     8.6 (2.1)         0.55 (0.88)    13.8 (1.6)
30-39    Males      4.6 (2.1)     8.3 (2.7)           13.1 (2.0)     9.0 (3.3)         0.11 (0.33)    12.6 (2.4)
         Females    5.0 (2.2)     10.7 (2.9)          14.0 (0.94)    11.7 (2.8)        0.40 (0.52)    13.6 (2.8)
40-49    Males      5.0 (1.0)     8.1 (1.9)           13.1 (1.9)     7.6 (2.0)         0.11 (0.33)    13.5 (2.1)
         Females    4.7 (1.4)     8.2 (3.4)           13.3 (2.2)     7.6 (3.5)         0.90 (1.4)     12.9 (2.8)
50-59    Males      4.3 (2.0)     8.3 (2.4)           12.8 (1.7)     7.5 (2.7)         0.88 (1.4)     12.5 (2.1)
         Females    4.5 (1.6)     9.2 (1.9)           12.0 (2.8)     9.4 (2.3)         0.60 (1.1)     13.3 (2.4)
Table A19.35. [RAVLT.21] Crossen and Wiens, 1994: Data for Job Applicants (n = 60):* Comparison of AVLT and CVLT Scores

                           AVLT          CVLT
Trial I                    7.0 (1.6)     7.5 (1.6)
Trial II                   9.7 (2.0)     10.5 (1.9)
Trial III                  11.1 (2.0)    11.7 (2.1)
Trial IV                   11.7 (2.1)    12.5 (2.0)
Trial V                    12.2 (1.8)    13.0 (1.8)
Total Words                51.7 (7.5)    55.1 (7.7)
List B                     7.0 (2.0)     7.9 (1.9)
Postinterference Recall    10.6 (3.1)    11.7 (2.3)
Recognition                13.4 (1.2)    14.7 (1.4)

*The sample includes 52 men and 8 women with mean age of 29.9 (6.2) years, mean education of 14.7 (1.6) years, and mean WAIS-R FSIQ of 106.3.

Table A19.36. [RAVLT.22a] Geffen et al., 1994: Data for the Rey AVLT Forms 1 and 4 for a Sample of 51 Healthy Australian Volunteers*

                            Form 1          Form 4
Trial I                     6.82 (1.47)     6.82 (1.58)
Trial II                    9.35 (1.98)     8.90 (1.98)
Trial III                   10.92 (1.97)    10.76 (1.99)
Trial IV                    11.55 (2.12)    11.53 (2.05)
Trial V                     12.47 (1.91)    12.00 (1.99)
Total (I-V)                 51.12 (7.42)    50.02 (7.68)
List B                      6.02 (1.73)     5.68 (1.68)
Postinterference recall     10.88 (2.91)    10.65 (2.71)
20-minute delayed recall    10.82 (2.99)    10.33 (2.83)
Recognition A total         13.71 (1.60)    13.65 (1.48)
Recognition A p[A]          0.90 (0.08)     0.90 (0.07)
Recognition B total         6.94 (2.92)     7.59 (2.77)
Recognition B p[A]          0.71 (0.10)     0.73 (0.09)

*The sample includes 25 males and 26 females with a mean age of 31.3 (12.7) years, mean education of 12.2 (2.4) years, and mean National Adult Reading Test-estimated IQ of 115.6 (6.26).
Table A19.37. [RAVLT.22b] Geffen et al., 1994: Frequency (Approximate Occurrence per Million) and Length (Number of Letters) of Words, Comparing Rey AVLT Form 1 with Form 4, List A

Form 1                                  Form 4
List A     Frequency    Length          List A      Frequency    Length
Drum       11           4               Pipe        20           4
Curtain    13           7               Wall        160          4
Bell       18           4               Alarm       16           5
Coffee     78           6               Sugar       34           5
School     492          6               Student     131          7
Parent     15           6               Mother      216          6
Moon       60           4               Star        25           4
Garden     60           6               Painting    59           8
Hat        56           3               Bag         42           3
Farmer     23           6               Wheat       9            5
Nose       60           4               Mouth       103          5
Turkey     9            6               Chicken     37           7
Color      141          6(5)            Sound       204          5
House      591          5               Door        312          4
River      165          5               Stream      51           6

Mean       119.47       5.2             Mean        94.6         5.2
Range      9-591        3-7             Range       9-312        3-8
Table A19.38. [RAVLT.22c] Geffen et al., 1994: Frequency (Approximate Occurrence per Million) and Length (~mber of Letters) of Words, Comparing Rey AVLT Form i witllForm 4, List B . Fonn4
Form 1 ListB
Frequency
Desk Ranger Bird Shoe Stove Mountain Glasses Towel Cloud Boat Lamb Gun Pencil Church Fish
65 2 31 14 15 33 29 6 28 72 7 118 34 348 35
Mean Range
J.-.348
56.47
Length 4 6 4 4 5 8 6 5 5 4 4 3 6 6 4
ListB Bench
Frequency
Soap
22
Sky Ship
58 83 6 28 157 20 2
5 7 4 4 6 5 6 4 3 4 4 6 5 6 4
42.33 2-157
4.87 3-7
Officer Cage Sock Fridge• Cliff Bottle
Goat
Bullet Paper Chapel Crab•
4.93 3-8
-The actual words are not present in the tab!~ of word frequency.
35 101 9 4 23 11 76
Length
Table A19.39. [RAVLT.22d] Geffen et al., 1994: Word List for Testing Rey AVLT Recognition (Form 4)•
Eye (SA) Crab (B) Star (A) Rag (PA) Bun (PA) Cage (B) Cliff (B) Sugar (A) Cream (PA) Stream (A)
Alarm (A) Aunt (SA) Bag (A) Creek (SA) Officer (B)
Mouth (A) Arrow (SB)
Student (A) Hail (PA) Paper (B)
Soap (B) Wall (A) Clock (SA) Sound (A)
Ship (B)
Bottle (B)
Car (PA)
Seat (SB)
Mother (A)
Sock (B) Tone (SA) Fridge (B)
Duck(SA) Wheat(A) Floor (SPA)
Bench (B) Bullet (B) Night (SA)
Rock (SPB)
Chapel (B)
Door(A)
Chicken (A) Coat (PB)
Bridge (PB)
Bread (SA) Pipe (A) Ball (PA)
Painting (A)
Goat (B)
Sky (B)
•A, words from list A; B, words from list B; P, phonemic associate of words on lists A and B; S, semantic associate of words on lists A and B.
Table A19.40. [RAVLT.23] Torres et al., 2001: Data for the Control Group

                             Males          Females
n                            84             76
Age                          29.0 (10.2)    30.5 (11.7)
Education                    14.4 (2.1)     14.8 (1.9)
WAIS-R VIQ                   113 (14)       109 (12)
WAIS-R FSIQ                  114 (13)       112 (12)
Trial I                      6.6 (2.1)      7.2 (2.1)
Trial V                      12.7 (2.1)     13.2 (1.6)
Interference Trial           6.3 (2.1)      6.7 (2.4)
Recall after Interference    11.6 (2.9)     12.3 (2.3)
Table A19.41. [RAVLT.24] Miller, 2003 (Update on Selnes et al., 1991): Data for a Sample of Seronegative Homosexual/Bisexual Males Participating in the Multi-Center AIDS Cohort Study, Stratified by Age × Education
Education Age 25-34
(years)
False-
Trial I
Trial V
Total 1-V
Interference Trial
Delayed Recall
Delayed Recognition
Positives
6.67 (1.72) 102
12.63 (1.67) 102
51.36 (7.02) 102
7.01 (2.03) 67
10.63 (2.62) 67
14.0 (1.04) 67
0.58 (1.13) 67
7.18 (1.72) 96
13.13 (1.84) 96
55.39 (7.47) 96
6.80 (1.94)
11.62 (2.24)
14.36 (0.94)
0.39 (0.72)
66
66
66
7.41 (1.74) 35
13.49 (1.41) 110
56.91 (6.59) 98
7.26 (2.03)
11.97 (1.90)
14.30 (0.93)
0.36 (0.65)
66
66
66
66
7.08 (1.75) 296
13.07 (1.68) 296
54.50 (7.40) 296
7.03 (2.00) 199
11.40 (2.33) 199
14.22 (0.98) 199
0.45 (0.86) 199
6.50 (1.72) 128
12.40 (1.80) 128
50.35 (7.51) 128
6.11 (1.87) 81
10.12 (2.87) 80
13.49 (1.76) 80
0.55 (0.94) 80
6.69 (1.59) 112
12.69 (1.78) 112
52.11 (7.50) 112
6.24 (2.13) 79
10.29 (3.04)
13.68 (1.53)
0.48 (0.77)
79
79
79
6.85 (1.76) 177
12.80 (1.88) 177
52.71 (8.21) 177
7.17 (2.17) 121
10.72 (2.73) 121
13.88 (1.46) 121
0.35 (0.57) 1.21
6.70 (1.71) 417
12.65 (1.83) 417
51.82 (7.86) 417
6.60 (2.13) 281
1G.43 (.2.87)
13.71 (1.57)
0.44 (0.75)
6.21 (1.97) 61
11.87 (1.74} 61
47.72 (7.40} 61
5.85 (2.36} 40
9.95 (2.28) 39
13.41 (1.71) 39
0.72 (1.12) 39
6.27 (1.40) 44
11.93 (1.96} 44
48.89 (8.68) 44
5.94 (1.75) 33
9.27 (3.51) 33
13.88 (1.75) 33
0.67 (1.34) 33
6.30 (1.69) 100
12.54 (1.89} 100
50.79 (8.04) 100
5.92 (1.69} 71
10.47 (2.89} 70
13.86 (1.34) 70
0.56 (1.11) 70
6.27 (1.72}
12.21 (1.88)
49.47 (8.08)
5.90 (1.90) 144
10.05 (2.92) 142
13.74 (1.55) 142
0.63 (1.16) 142
<16
Mean (SO) n
16
Mean (SO) n
>16
Mean (SO) n
Total Mean (SO) n
35-44
<16
Mean (SO) n
16
Mean (SO) n
>16
Mean (SO) n
Total Mean (SO) n
45-59
280
280
280
<16
Mean (SO) n
16
Mean (SO) n
>16
Mean (SO} n
Total Mean (SO) n
205
205
205
Table A19.41. (Contd.) Interference
Delayed
Delayed
False-
Trial I
TnalV
Total 1-V
Trial
Recall
Recognition
Positives
6.50 (1.78) 291
12.37 (1.76) 291
50.15 (7.41) 291
6.38 (2.09) 188
10.27 (2.67) 186
13.66 (1.55) 186
0.60 (1.05) 186
6.80 (1.64) 252
12.72 (1.87) 252
52.79 (8.03) 252
6.39 (2.01) 178
10.60 (2.98) 178
13.97 (1.42) 178
0.48 (0.88) 178
6.85 (1.78) 375
12.91 (1.81) 375
53.30 (8.08) 375
6.85 (2.09) 258
10.97 (2.65) 257
13.98 (1.32) 257
0.41 (0.78) 257
6.72 (1.75) 918
12.69 (1.82) 918
52.16 (7.97) 918
6.58 (2.08) 624
10.65 (2.77) 621
13.88 (1.42) 621
0.49 (0.90) 621
Education Age Total
(years)
<16
Mean
(SD) n
16
Mean
(SD) n
> 16 Mean
(SD) n
Total Mean
(SD) n
Table A19.42. [HVLT-R.1] Friedman et al., 2002: Performance on the Hopkins Verbal Learning Test-Revised for a Sample of Healthy African-American Elderly Stratified by Gender and Education
Performance Index
Gender
Years of Education
Performance Index
Years of Gender Education
Percent Retention Male
M (SO )
Table A19.42. (Contd.)
n
< 12
91.05 (10.61 )
37
12
5.45 (14.11)
11
n
Ages 60-71
Total Recall
Male
Female
Total
Delayed Recall
Male
Female
Total
M (SO)
<12
16.89 (3.40)
37
> 12
1.00 (24.62)
6
12
17.09 (2.21)
11
Total
.80 (13.56)
54
> 12
18.33 (3.98)
6
<12
92.97 (13.18)
30
Total
17.09 (3.23)
54
12
95.63 (1 .53)
16
< 12
16.93 (2.96}
30
> 12
.64 (19.21 )
11
12
19.56 (2.22)
16
Total
92.11 (16.21)
57
> 12
21.64 (4.88}
11
< 12
91.91 (11.77)
67
Total
18.58 (3.70)
57
12
91.4 (17.33)
27
< 12
16.91 (3.18)
67
> 12
3.35 (20.57)
17
12
18.56 (2.50)
27
Recognition Discrimination
< 12
9.03 (1.74)
37
> 12
20.47 (4.74)
17
Index
12
9.36 (2.06}
11
Total
17.86 (3.54)
ll1
> 12
9.67 (1.97)
6
< 12
6.43 (1.54)
37
Total
9.17 (1. 1)
54
12
5.82 (1.08)
11
< 12
9.37 (2.13}
30
> 12
5.67 (2.16)
6
12
9.63 (1.54)
16
Total
6.22 (1.54)
54
> 12
10.73 (1.74)
11
< 12
6.47 1.38
30
Total
9.70 (1.95}
57
12
7.37 (1.31)
16
< 12
9.18 {1.91)
67
> 12
7.45 (2.50)
11
12
9.52 (1.74)
27
Total
6.91 (1.67)
57
> 12
10.35 (1.84)
17
< 12
6.45 (1.46}
67
Total
9.44 (1.89)
111
12
6.74 (1.43) 6.82 (2.48)
27
6.58 (1.64)
111
> 12
Total
17
Female
Total
Male
Fe mal
Total
(continued )
867
APPENDIX 19 Table A19.42. (Contd.) Perfonnance Index
Table A19.42. (Contd.)
Years of Gender Education
M (SD)
Perfonnance Index
n
Gender
Aga7~
Total Recall
12 Male
13.71 (4.09) 14.25 (2.06) 15.00 (-)
49
Total
13.78 (3.93)
54
<12
15.96 (4.05)
57
12
20.00 (2.65)
11
>12
18.50 (6.35)
4
Total
16.72 (4.23)
72
<12
14.92 (4.20)
106
12
18.47 (3.58)
15
>12
17.80 (5.72)
5
Total
15.46 (4.34)
126
<12
5.08 (1.74)
49
12
4.75 (2.06)
4
>12
3.00
1
Total
5.02 (1.75)
54
<12
6.05 (2.03)
57
12
7.82 (1.17)
11
>12
5.75 (3.30)
4
Total
6.31 (2.08)
72
<12
5.60 (1.96)
106
12
7.00 (1.96)
15
>12
5.20 (3.11)
5
Total
5.75 (2.04)
126
<12
89.14 (16.47)
49
<12 12 >12
Female
Total
Delayed Recall
Years of Education
Male
>12
Total
Percent Retention Male
80.75 (28.32) 50.00
n
4
(-)
4 Total
87.80 (17.99)
54
<12
89.47 (17.45)
57
12
96.45 (12.18)
11
>12
72.25 (29.02)
4
Total
89.58 (17.92)
72
<12
89.32 (16.93)
106
12
92.27 (18.16)
15
>12
67.80 (27.03)
5
Total
88.82
1 Female
Total
126
(17.90) Recognition Discrimination
Male
Index
<12
7.53 (2.72)
49
12
8.00 (1.83)
4
>12
9.00
1
(-)
(-)
Female
M (SD)
Female
Total
© Swets &
Zeitlinger (2002).
Total
7.59 (2.64)
54
<12
8.44 (2.61)
57
12
11.27 (0.90)
11
>12
10.00 (2.45)
4
Total
8.96 (2.61)
72
<12
8.02 (2.69)
106
12
10.40 (1.88)
15
>12
9.80 (2.17)
5
Total
8.37 (2.70)
126
Table A19.43. [WHO-UCLA AVLT.l] Ponton et al., 1996: Data for a Sample of 300 Spanish-Speaking Healthy Participants Stratified by Gender, Age, and Education Age Group 1~29
30-39
50-75
40-49
Education (Yean)
<10
Mala Trial V
recall Recall after interference 20-minute delayed recall
Fetn4Jlee Trial V
recall Recall after interference 20-minute delayed recall
n
x (SD) x (SD) x (SD)
n
x (SD) x (SD) x (SD)
>10
11 12.73 (1.56) 11.73 (1.35) 12.36 (1.91)
25
12 13.33 (1.56) 11.58 (1.73) 11.75 (2.18)
30
13.12 (1.90) 12.24 (2.67) 12.52 (2.08) 13.53 (1.94) 12.37 (2.31) 12.93 (2.45)
<10
>10
<10
>10
<10
>10
13 12.23 (1.64) 11.46 (2.15) 11.23 (2.42)
18 13.33 (1.33) 11.56 (2.09) 12.61 (1.61)
12 12.92 (1.78) 10.50 (2.97) 11.42 (2.35)
17 13.53 (1.42) 13.00 (2.03) 13.18 (1.78)
18 12.11 (1.68) 10.50 (2.09) 10.83 (2.15)
6 12.67 (1.51) 11.00 (1.67) 11.83 (1.60)
22 12.77 (2.22) 11.59 (2.72) 11.86 (2.59)
44
16 12.56 (0.96) 10.56 (1.63) 11.06 (1.61)
11 13.27 (2.05) 12.09 (1.87) 12.50 (1.90)
25
20
11.52 (1.94) 10.24 (2.62) 10.63 (2.36)
13.20 (1.32) 10.75 (2.61) 12.45 (1.96)
13.77 (1.41) 12.11 (2.10) 12.89 (2.01)
Appendix 19m: Meta-Analysis Tables for the Rey Auditory-Verbal Learning Test (Rey AVLT)
Table A19m.1. Results of the Meta-Analysis and Predicted Scores for the Rey AVLT, Trial I (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample Number o£ studies iDcluded in the analysis Yean of publication Number o£ data points used in the analysis
8 1988-2003 24
(a data point denotes a study or a cell in education/gender-stratified data)
Total number o£ participants Variable
1,910
n•
xt
sot
24
50.99
64.78
9-417
24
57.37 2.98
19.36 3.26
19.0-82.0
24 18 14
14.07 1.99
0.76 1.06
12.8-16.0 0.5-3.6
2 2
112.98 12.49
1.41 0.71
112-114 12-13
16
39.73
41.37
24
6.09 1.71
0.89 0.29
Range
Sample size Mean
Age Mean SD
1.~11.7
Education Mean SD
IQ Mean SD
Percent fiiGle
~100
TatKOtWmeana Combined mean Combined SD
24
4.4-7.4 1.2-2.2
•Number of data points differs for different analyses due to missing data. tweighted means and SDs.
(continued)
APPENDIX 19M
Table A19m.1. (Contd.)

Predicted number of words recalled and SDs, per age group* (Rey AVLT, Trial I)

                                   95% CI
Age Range    Predicted Score    Lower Band    Upper Band
20-24        7.10               6.78          7.41
25-29        7.10               6.78          7.43
30-34        7.08               6.71          7.44
35-39        7.01               6.59          7.42
40-44        6.90               6.46          7.35
45-49        6.76               6.31          7.21
50-54        6.58               6.15          7.01
55-59        6.36               5.98          6.74
60-64        6.10               5.80          6.41
65-69        5.81               5.61          6.00
70-74        5.47               5.41          5.54
75-79        5.10               4.97          5.24

Standard deviation for all age groups is 1.71.
*Based on the equation:
Predicted test score = 6.581533 + 0.0399874 × age - 0.0007624 × age²

Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations    24
Number of clusters        8
R²                        0.842
F(df), p                  F(2,7) = ~.38, p < 0.000

Term        Coefficient    SE          t        p         95% CI
Age         0.0399874      0.031       1.28*    0.241*    -0.034 to 0.114
Age²        -0.0007624     0.000323    -2.36    0.050     -0.002 to 0.000
Constant    6.581533       0.580       11.34    0.000     5.21 to 7.95

*Significance test for age centered (sample means - aggregate mean): t = -7.16, p = 0.000.

Prediction
Predicted age range       20-79 years
Mean predicted score      6.45 (0.69)
SEe                       0.16
95% CI                    6.13-6.77
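
The quadratic age equation reported for Rey AVLT Trial I can be evaluated directly for any age within the predicted range (20-79 years). The following is a minimal sketch, not part of the original tables: the function name is hypothetical, and evaluating each 5-year band at its midpoint (22.5 for 20-24, and so on) is an assumption that appears to reproduce the tabled predictions to within rounding.

```python
def predicted_trial_i(age: float) -> float:
    """Predicted Rey AVLT Trial I mean, using the coefficients in Table A19m.1."""
    return 6.581533 + 0.0399874 * age - 0.0007624 * age ** 2


# Assumption: each 5-year band is represented by its midpoint (e.g., 22.5 for 20-24).
for low in range(20, 80, 5):
    midpoint = low + 2.5
    print(f"{low}-{low + 4}: {predicted_trial_i(midpoint):.2f}")
```

Because the coefficients are printed to a limited number of digits, values computed this way may differ from the table in the last decimal place.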
Figure A19m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey AVLT trial I. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit

Tests for heterogeneity in the final data set
Pooled estimates for fixed effect                    6.354
Pooled estimates for random effect                   6.130
Q(df), p                                             Q(23) = 445.01, p < 0.000
Moment-based estimate of between-study variance      0.702

Tests for model fit: addition of a quadratic term
Model        R²       Adjusted R²    BIC        BIC'
Linear       0.762    0.751          -42.698    -31.258
Quadratic    0.842    0.827          -49.378    -37.938

BIC' difference of 6.680 provides strong support for the quadratic model.
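
The BIC' comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the BIC' values are -31.258 for the linear model and -37.938 for the quadratic model, as the table layout suggests, and it grades the difference with the commonly cited rule of thumb attributed to Raftery (0-2 weak, 2-6 positive, 6-10 strong, above 10 very strong). Those thresholds are an assumption, since the table does not state which scale the authors used, and the function name is hypothetical.

```python
def bic_support(delta: float) -> str:
    """Grade an absolute BIC difference using Raftery-style thresholds (assumed)."""
    d = abs(delta)
    if d < 2:
        return "weak"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"


# Assumed assignment of the tabled BIC' values to the two models.
bic_prime_linear = -31.258
bic_prime_quadratic = -37.938

# A lower BIC' indicates the preferred model; the difference measures the support.
delta = bic_prime_linear - bic_prime_quadratic
print(f"BIC' difference = {delta:.3f} ({bic_support(delta)} support for the quadratic model)")
```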
Tests for parameter specifications
Normality of the residuals: Shapiro-Wilk W test      W = 0.964, p = 0.525
Homoscedasticity: White's general test               8.055, p = 0.090

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.335 (F(1,7) = 8.57, p = 0.022). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Table A19m.1. (Contd.) EtTeets of delllOJI'8Phic variables Education Est. tau2 without education Est. tau2 with education Regression of test means on education and age Number of observations Number of clusters B_2
Term
Coefficient
Education
0.188231
t
SE
'f
0.179
Gender t-test by gender
0 0 18 7
0.850 p
95%CI
0.334
-0.250 to 0.627
n
X male (SO)
X female (SO)
M-F difference
t
p
7M,5F
6.964 (1.869)
5.794 (1.566)
1.170
3.148
0.005
Table A19m.2. Results of the Meta-Analysis and Predicted Scores for the Rey AVLT, Trial V (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis: 8
Years of publication: 1988-2003
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 23
Total number of participants: 1,901
Variable           n*  X†      SD†    Range
Sample size
  Mean             23  49.34   59.06  12-417
Age
  Mean             23  58.54   17.97  19.0-82.0
  SD               23  2.51    3.00   1.0-11.7
Education
  Mean             17  14.02   0.74   12.8-16.0
  SD               14  2.09    1.08   0.[?]-[?].6
IQ
  Mean             2   113.11  1.41   112-114
  SD               2   12.56   0.70   12-13
Percent male       15  38.86   38.69  0-100
Test score means
  Combined mean    23  11.55   1.18   9.[?]-13.4
  Combined SD      23  2.36    0.39   1.6-3.0
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Predicted number of words recalled and SDs, per age group* (Rey AVLT, Trial V)
Age range  Predicted score  95% CI lower band  95% CI upper band
20-24      12.85            12.48              13.23
25-29      12.96            12.66              13.26
30-34      12.99            12.66              13.34
35-39      12.96            12.55              13.37
40-44      12.85            12.38              13.32
45-49      12.66            12.17              13.16
50-54      12.41            11.93              12.89
55-59      12.08            11.65              12.50
60-64      11.67            11.34              12.00
65-69      11.20            10.99              11.39
70-74      10.64            10.53              10.76
75-79      10.02            9.74               10.30
Standard deviation for all age groups is 2.36.
*Based on the equation:
Predicted test score = 11.46148 + 0.0948657 * age - 0.0014639 * age^2
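As a rough check on how the tabulated Trial V predictions follow from the equation, the sketch below evaluates it once per 5-year band. The use of band midpoints (22.5, 27.5, ..., 77.5) is an assumption inferred from the numbers, not something the appendix states, and the function and variable names are hypothetical. Under that assumption the results agree with the tabulated predicted scores to within about 0.01.

```python
# Coefficients copied from the Trial V prediction equation above.
INTERCEPT = 11.46148
AGE_COEF = 0.0948657
AGE_SQ_COEF = -0.0014639


def predicted_trial_v(age: float) -> float:
    """Predicted Rey AVLT Trial V score for a given age in years."""
    return INTERCEPT + AGE_COEF * age + AGE_SQ_COEF * age ** 2


# Assumption: each 5-year band is evaluated at its midpoint (22.5 for 20-24, etc.).
midpoints = [22.5 + 5 * i for i in range(12)]
predictions = [predicted_trial_v(a) for a in midpoints]

# Predicted scores as tabulated for Trial V, for comparison.
tabulated = [12.85, 12.96, 12.99, 12.96, 12.85, 12.66,
             12.41, 12.08, 11.67, 11.20, 10.64, 10.02]
max_gap = max(abs(p - t) for p, t in zip(predictions, tabulated))

for a, p in zip(midpoints, predictions):
    print(f"{a - 2.5:.0f}-{a + 1.5:.0f}: {p:.2f}")
print(f"largest gap from the tabulated values: {max_gap:.3f}")
```

The same pattern applies to the other quadratic equations in this appendix by swapping in their coefficients.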
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 23
Number of clusters: 8
R2: 0.877
F(df), p: F(2,7) = 98.10, p < 0.000
Term      Coefficient  SE     t      p       95% CI
Age       0.0948657    0.045  2.10*  0.074*  -0.012 to 0.202
Age^2     -0.0014639   0.000  -3.20  0.015   -0.003 to -0.000
Constant  11.46148     0.906  12.65  0.000   9.32 to 13.60
*Significance test for age centered (sample means - aggregate mean): t = -8.27, p = 0.000.
Prediction
Predicted age range  Mean predicted score  SEe   95% CI
20-79 years          12.11 (1.01)          0.18  11.76-12.46
[Scatterplot omitted; x-axis: age.]
Figure A19m.2. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey AVLT trial V. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 12.481
Pooled estimates for random effect: 11.807
Q(df), p: Q(22) = 421.02, p < 0.000
Moment-based estimate of between-study variance: 0.894
Tests for model fit: addition of a quadratic term
Model      R2     Adjusted R2  BIC      BIC'
Linear     0.733  0.720        -24.482  -27.218
Quadratic  0.877  0.864        -39.110  -41.846
BIC' difference of 14.628 provides very strong support for the quadratic model.
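The verbal labels attached to the BIC' differences in these tables ("positive", "strong", "very strong") can be reproduced with a small helper. The cut-offs used below (2, 6 and 10) are the conventional grades for BIC differences; they are an assumption here, since the appendix does not state them, although they match every label it uses. The function name is hypothetical.

```python
def bic_support(bic_diff: float) -> str:
    """Grade an absolute BIC' difference between two models (assumed cut-offs)."""
    d = abs(bic_diff)
    if d < 2:
        return "weak"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"


# BIC' values for Trial V, taken from the model-fit table above.
linear_bic_prime = -27.218
quadratic_bic_prime = -41.846

# The model with the lower (more negative) BIC' is preferred, so a positive
# difference here favours the quadratic model.
difference = linear_bic_prime - quadratic_bic_prime
print(round(difference, 3), bic_support(difference))
```

The grade only describes the size of the difference; which model it favours is read from the sign, as in the Recognition table, where the smaller BIC' belongs to the linear model.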
Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.943, p = 0.212
Homoscedasticity (White's general test): 5.667, p = 0.226
Significance tests for regression with the SDs
A regression of SDs on age yielded an R2 of 0.263 (F(1,7) = 7.18, p = 0.032). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
Est. tau2 without education: 0
Est. tau2 with education: 0
Regression of test means on education and age
Number of observations: 17
Number of clusters: 7
R2: 0.875
Term       Coefficient  SE     t     p      95% CI
Education  0.1582632    0.223  0.71  0.504  -0.387 to 0.703
Gender
t-test by gender
n       X male (SD)     X female (SD)   M-F difference  t      p
7M, 4F  12.747 (0.386)  11.820 (1.035)  0.927           2.189  0.028
Table A19m.3. Results of the Meta-Analysis and Predicted Scores for the Rey AVLT, Recall after Interference (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis: 7
Years of publication: 1988-2001
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 20
Total number of participants: 983
Variable           n*  X†      SD†    Range
Sample size
  Mean             20  39.62   25.69  12-126
Age
  Mean             20  58.61   17.99  19.0-82.0
  SD               20  2.58    3.17   1.0-11.7
Education
  Mean             14  13.92   0.59   12.8-14.8
  SD               11  2.02    1.10   0.5-3.6
IQ
  Mean             2   113.09  1.41   112-114
  SD               2   12.55   0.70   12-13
Percent male       12  35.34   37.21  0-100
Test score means
  Combined mean    20  9.72    1.55   6.7-12.3
  Combined SD      20  2.93    0.39   2.2-3.5
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Predicted number of words recalled and SDs, per age group* (Rey AVLT, Recall after Interference)
Age range  Predicted score  95% CI lower band  95% CI upper band
20-24      11.88            11.65              12.12
25-29      11.88            11.60              12.15
30-34      11.79            11.42              12.16
35-39      11.64            11.19              12.09
40-44      11.41            10.91              11.91
45-49      11.11            10.59              11.62
50-54      10.73            10.24              11.22
55-59      10.28            9.85               10.71
60-64      9.76             9.42               10.09
65-69      9.16             8.95               9.37
70-74      8.49             8.34               8.64
75-79      7.75             7.44               8.05
Standard deviation for all age groups is 1.83.
*Based on the equation:
Predicted test score = 11.01093 + 0.0718987 * age - 0.0014714 * age^2
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 20
Number of clusters: 7
R2: 0.923
F(df), p: F(2,6) = 309.74, p < 0.000
Term      Coefficient  SE     t      p       95% CI
Age       0.0718987    0.040  1.79*  0.123*  -0.026 to 0.170
Age^2     -0.0014714   0.000  -3.48  0.013   -0.003 to -0.000
Constant  11.01093     0.715  15.40  0.000   9.26 to 12.76
*Significance test for age centered (sample means - aggregate mean): t = -10.15, p = 0.000.
Prediction
Predicted age range  Mean predicted score  SEe   95% CI
20-79 years          10.49 (1.42)          0.18  10.13-10.84
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 10.439
Pooled estimates for random effect: 9.862
Q(df), p: Q(19) = 451.56, p < 0.000
Moment-based estimate of between-study variance: 3.420
[Scatterplot omitted; x-axis: age.]
Figure A19m.3. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey AVLT Recall After Interference. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for model fit: addition of a quadratic term
Model      R2     Adjusted R2  BIC      BIC'
Linear     0.839  0.830        -17.296  -33.511
Quadratic  0.922  0.913        -28.934  -45.149
BIC' difference of 11.638 provides very strong support for the quadratic model.
Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.902, p = 0.045
Homoscedasticity (White's general test): 7.404, p = 0.116
Significance tests for regression with the SDs
A regression of SDs on age yielded an R2 of 0.111 (F(1,6) = 2.18, p = 0.190). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
Est. tau2 without education: 0.436
Est. tau2 with education: 0.000
Regression of test means on education and age
Number of observations: 14
Number of clusters: 6
R2: 0.943
Term       Coefficient  SE     t     p      95% CI
Education  0.4297837    0.262  1.64  0.162  -0.243 to 1.103
Gender
t-test by gender
n       X male (SD)     X female (SD)  M-F difference  t      p
9M, 8F  10.222 (0.535)  9.959 (0.591)  0.263           0.331  0.373
Table A19m.4. Results of the Meta-Analysis and Predicted Scores for the Rey AVLT, Recognition (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis: 4
Years of publication: 1988-1992
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 14
Total number of participants: 453
Variable           n*  X†     SD†    Range
Sample size
  Mean             14  22.22  21.86  10-126
Age
  Mean             14  52.64  22.23  17.5-78.7
  SD               14  1.64   0.79   0.[?]-2.7
Education
  Mean             10  13.95  0.40   13.3-14.6
  SD               7   2.31   1.17   1-3.6
IQ
  Mean             0
  SD               0
Percent male       14  36.11  42.61  0-100
Test score means
  Combined mean    14  13.19  0.81   12.3-14.4
  Combined SD      14  1.65   0.56   0.5-2.4
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Predicted number of words recognized and SDs, per age group* (Rey AVLT, Recognition)†
Age range  Predicted score  95% CI lower band  95% CI upper band
20-24      14.23            14.07              14.40
25-29      14.06            13.91              14.22
30-34      13.89            13.73              14.04
35-39      13.72            13.54              13.89
40-44      13.54            13.34              13.75
45-49      13.37            13.13              13.61
50-54      13.20            12.92              13.48
55-59      13.03            12.70              13.35
60-64      12.85            12.49              13.22
65-69      12.68            12.27              13.10
70-74      12.51            12.05              12.97
75-79      12.34            11.82              12.85
Standard deviation for all age groups is 1.65.
*Based on the equation:
Predicted test score = 15.00957 - 0.0344756 * age
†The predicted scores are relevant for the following administration sequence: five acquisition trials, interference trial, recall after interference, recognition (immediate or after a short delay).
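Because the Recognition model is linear, each step from one 5-year band to the next lowers the prediction by the same amount. The sketch below computes that step and the two end-point predictions. Evaluating each band at its midpoint (22.5 for 20-24, 77.5 for 75-79) is an assumption inferred from the numbers rather than stated in the appendix, and the names are hypothetical. The end points round to 14.23 and 12.34, matching the Recognition column of Table A19m.6.

```python
# Coefficients copied from the Recognition prediction equation above.
INTERCEPT = 15.00957
AGE_COEF = -0.0344756


def predicted_recognition(age: float) -> float:
    """Predicted Rey AVLT Recognition score for a given age in years."""
    return INTERCEPT + AGE_COEF * age


# With a linear model, moving up one 5-year band changes the prediction
# by the same amount everywhere: 5 * AGE_COEF.
step_per_band = 5 * AGE_COEF

# Assumption: bands are evaluated at their midpoints.
youngest = predicted_recognition(22.5)  # 20-24 band
oldest = predicted_recognition(77.5)    # 75-79 band

print(f"step per 5-year band: {step_per_band:.3f}")
print(f"20-24: {youngest:.2f}, 75-79: {oldest:.2f}")
```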
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 14
Number of clusters: 4
R2: 0.892
F(df), p: F(1,3) = 43.69, p < 0.007
Term      Coefficient  SE     t      p      95% CI
Age       -0.0344756   0.005  -6.61  0.007  -0.051 to -0.018
Constant  15.00957     0.173  86.83  0.000  14.46 to 15.56
Prediction
Predicted age range  Mean predicted score  SEe   95% CI
20-79 years          13.29 (0.62)          0.15  13.00-13.57
[Scatterplot omitted; x-axis: age.]
Figure A19m.4. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey AVLT Recognition. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 13.868
Pooled estimates for random effect: 13.494
Q(df), p: Q(13) = 120.53, p < 0.000
Moment-based estimate of between-study variance: 0.384
Tests for model fit: addition of a quadratic term
Model      R2     Adjusted R2  BIC      BIC'
Linear     0.892  0.883        -30.011  -28.546
Quadratic  0.896  0.877        -27.914  -26.448
BIC' difference of 2.097 provides positive support for the linear model.
Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.871, p = 0.043
Homoscedasticity (White's general test): 8.732, p = 0.068
Significance tests for regression with the SDs
A regression of SDs on age yielded an R2 of 0.471 (F(1,3) = 9.71, p = 0.053). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
Regression of test means on education and age
Number of observations: 10
Number of clusters: 3
R2: 0.788
Term       Coefficient  SE     t     p      95% CI
Education  0.2712587    0.265  1.02  0.414  -0.870 to 1.413
Gender
t-test by gender
n       X male (SD)     X female (SD)   M-F difference  t      p
5M, 5F  14.100 (0.077)  13.152 (0.437)  0.948           2.136  0.033
Table A19m.5. Results of the Meta-Analysis and Predicted Scores for the Rey AVLT, Total Recall, Trials I-V (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Number of studies included in the analysis: 6
Years of publication: 1988-2003
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 20
Total number of participants: 1,699
Variable           n*  X†     SD†    Range
Sample size
  Mean             20  50.37  66.28  12-417
Age
  Mean             20  59.69  17.63  19.0-82.0
  SD               20  1.59   0.78   1.0-[?].7
Education
  Mean             14  14.10  0.73   13.0-16.0
  SD               11  1.94   1.16   0.5-3.6
IQ
  Mean             0
  SD               0
Percent male       13  40.33  38.53  0-100
Test score means
  Combined mean    20  47.47  5.51   36.2-56.3
  Combined SD      12  8.85   1.06   7.4-10.3
  Estimated SD‡    20  8.85   0.97   7.0-10.3
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
‡Estimated SDs for the total score means were estimated as follows: 2(SD trial I) + 2(SD trial V) + 0.9
Predicted number of words recalled and SDs, per age group* (Rey AVLT, Total Recall)
Age range  Predicted score  95% CI lower band  95% CI upper band
20-24      54.55            53.87              55.23
25-29      55.03            54.36              55.70
30-34      55.15            54.23              56.06
35-39      54.91            53.77              56.06
40-44      54.33            53.04              55.62
45-49      53.38            52.06              54.70
50-54      52.08            50.83              53.32
55-59      50.42            49.37              51.47
60-64      48.40            47.64              49.17
65-69      46.03            45.55              46.52
70-74      43.31            42.67              43.94
75-79      40.22            38.99              41.45
Standard deviation for all age groups is 8.85.
*Based on the equation:
Predicted test score = 47.98303 + 0.4520533 * age - 0.007125 * age^2
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 20
Number of clusters: 6
R2: 0.948
F(df), p: F(2,5) = 319.79, p < 0.000
Term      Coefficient  SE     t      p       95% CI
Age       0.4520533    0.120  3.77*  0.013*  0.144 to 0.760
Age^2     -0.007125    0.001  -5.58  0.003   -0.010 to -0.004
Constant  47.98303     2.209  21.72  0.000   42.31 to 53.66
*Significance test for age centered (sample means - aggregate mean): t = -11.69, p = 0.000.
Prediction
Predicted age range  Mean predicted score  SEe   95% CI
20-79 years          50.65 (5.09)          0.49  49.70-51.60
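The quadratic Total Recall model given above rises slightly in early adulthood before declining. As an illustration (not part of the original analysis), the sketch below finds the age at which the fitted curve peaks by setting its derivative to zero. The coefficients are those reported for the Total Recall regression; the function and variable names are hypothetical.

```python
# Coefficients copied from the Total Recall regression above.
INTERCEPT = 47.98303
AGE_COEF = 0.4520533
AGE_SQ_COEF = -0.007125


def predicted_total_recall(age: float) -> float:
    """Predicted Rey AVLT Total Recall (Trials I-V) for a given age in years."""
    return INTERCEPT + AGE_COEF * age + AGE_SQ_COEF * age ** 2


# A downward-opening parabola peaks where its derivative is zero:
# AGE_COEF + 2 * AGE_SQ_COEF * age = 0.
peak_age = -AGE_COEF / (2 * AGE_SQ_COEF)
peak_score = predicted_total_recall(peak_age)

print(f"peak at about {peak_age:.1f} years, predicted score {peak_score:.2f}")
```

The peak falls inside the 30-34 band, which is also the band with the highest tabulated predicted score.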
[Scatterplot omitted; x-axis: age.]
Figure A19m.5. A scatterplot illustrating the dispersion of the data points around the regression line for the Rey AVLT Total Recall. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 50.531
Pooled estimates for random effect: 47.881
Q(df), p: Q(19) = 611.27, p < 0.000
Moment-based estimate of between-study variance: 27.166
Tests for model fit: addition of a quadratic term
Model      R2     Adjusted R2  BIC     BIC'
Linear     0.785  0.773        39.348  -27.744
Quadratic  0.948  0.942        13.910  -53.182
BIC' difference of 25.438 provides very strong support for the quadratic model.
Tests for parameter specifications
Normality of the residuals (Shapiro-Wilk W test): W = 0.983, p = 0.969
Homoscedasticity (White's general test): 5.517, p = 0.238
Significance tests for regression with the SDs
A regression of SDs on age yielded an R2 of 0.010 (F(1,5) = 0.17, p = 0.699). Therefore, the SD for the aggregate sample is suggested for use with all age groups.
Effects of demographic variables
Education
Est. tau2 without education: 31.17
Est. tau2 with education: 22.81
Regression of test means on education and age
Number of observations: 14
Number of clusters: 5
R2: 0.957
Term       Coefficient  SE     t      p      95% CI
Education  -0.8448568   0.546  -1.55  0.197  -2.361 to 0.671
Gender
t-test by gender
n        X male (SD)    X female (SD)   M-F difference  t      p
11M, 9F  50.117 (1.63)  49.578 (1.520)  0.539           0.238  0.408
Table A19m.6. Summary Table of Predicted Scores for the Rey AVLT
Age range       Trial I      Trial V       Recall after Interference  Recognition*  Total Recall†  Forgetting§
20-24           7.10         12.85         11.88                      14.23         54.55          0.99
25-29           7.10         12.96         11.88                      14.06         55.03          1.11
30-34           7.08         12.99         11.79                      13.89         55.15          1.23
35-39           7.01         12.96         11.64                      13.72         54.92          1.36
40-44           6.90         12.85         11.41                      13.54         54.33          1.48
45-49           6.76         12.66         11.11                      13.37         53.38          1.60
50-54           6.58         12.41         10.73                      13.20         52.08          1.72
55-59           6.36         12.08         10.28                      13.03         50.42          1.84
60-64           6.10         11.67         9.76                       12.85         48.40          1.96
65-69           5.81         11.20         9.16                       12.68         46.03          2.08
70-74           5.47         10.64         8.49                       12.51         43.31          2.21
75-79           5.10         10.02         7.75                       12.34         40.22          2.33
SD              1.71         2.36          2.93                       1.65          8.85
Weighted means  6.31 (0.82)  11.89 (1.24)  10.21 (1.68)               13.20 (0.67)  49.58 (6.21)   1.72 (0.47)
Learning‡: 6.00, 5.60, 4.60 (original data at three points along the age continuum)
*The predicted scores are relevant for the following administration sequence: five acquisition trials, interference trial, recall after interference, and recognition (immediate or after a short delay).
†Trials I-V.
‡Learning = Trial V - Trial I. Data in the aggregate sample were inconsistent and did not allow predictions. Original data corresponding to 3 points along the age continuum are presented in order to demonstrate the slope of the age-related decline in learning capacity.
§Forgetting = Trial V - Recall after Interference. Based on a linear regression (R2 = 0.743, F(1,5) = 52.38, p < 0.0004).
Appendix 20: Locator and Data Tables for the Benton Visual Retention Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 20.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A20.1. Locator Table for the Benton Visual Retention Test (BVRT) Study
Age•
n
Sample Composition
600
Hospital inpatients or outpatients; data are partitioned by age groups and IQ levels; expected number of correct responses and errors are presented (some means and SDs)
IQ:
BVRT.1 BentonSivan, 1992 page 402 Data are not reproduced in this book
15-69
BVRT.2 Benton, 1962 page 403 Table A20.2
41 16-60
100
Patients on medical and neurological wards with no evidence of cerebral disease; number of errors reported
BVRT.3 Klonoff
80-92
172
18-102
73.1
Location Iowa
Education: 10
Form C, E Adm. C, B
Iowa
War veterans; data presented in 6 age groupings; number correct and errors reported
7.04
Form C Adm. A
Canada
857
Male volunteers, mostly Caucasian; sample is divided by 7 age groups and date of testing; number of errors reported
Well-educated sample
Form C Adm. A
Baltimore/ Washington DC
53
Healthy volunteers, 25 M, 28 F; total correct and errors reported
12.0
Form C Adm. A
Iowa
Healthy volunteers; 17 M, 61 F; number of errors reported
12.2
Form C Adm. A
Florida
Form C Adm. A, C Form D Adm. D
Mississippi
page 403 Table A20.3
BVRT.5 Eslinger et al., 1985
Form/Administration Form C, D, E Adm. A, B, C, D and 8-item abbreviated version
769 120
& Kennedy, 1965
BVRT.4 Arenberg, 1978 page 404 Table A20.4
IQ/Education*
60-88
page 404 Table A20.5 BVRT.6 Larrabee et al., 1986 page 404 Table A20.6
72.9 (6.9)
78
BVRT.7 Randall et al., 1988 page 405 Table A20.7
Mean for IQ groups: 23.87-26.20
120
Volunteers; 51 M, 69 F; data partitioned into 6 IQ groups; total correct and errors reported
≥110 to ≤59
(3.3)
IQ: 60 to ≥120
BVRT.8 Robertson-Tchabo & Arenberg, 1989 page 405 Table A20.8
20-29 30-39 40-49
1,643
50-59
Volunteers; data partitioned into gender, 2 education groups, and 7 age groups; total errors reported
Form C Adm. A
Baltimore/ Washington DC
60-69 70-79
BVRT.10 Prakash & Bhogle, 1992 page 406 Table A20.10
25-34 35-44
80-89
BVRT.9 Alder et al., 1990 page 406 Table A20.9
277
Male volunteers; sample is divided into 5 age groups; number of errors reported
Highly educated
Form C Adm. A
Baltimore/ Washington DC
90 86 84 56 53 62 73 55 62 39
331 M, 329 F; exclusion criteria "evident physical or psychological disorders;" sample is stratified into 10 age groups; number correct reported
Higher secondary education
Form C, D, E (collapsed) Adm. A
India
Healthy elderly; number correct, errors, and error types
13.61 (3.4)
Form C Adm. A Form D Adm. C
Missouri
Korea
45-54
55-64 65+ 15-19 20-24 25-29 30-34 35-39 40-44
45-49 50-54
55-59 60-64
BVRT.11 Robinson-Whelan, 1992 page 407 Table A20.11
72.23 (9.0)
122
BVRT.12 Lee & Lee, 1993 page 407 Table A20.12
34.7 (8.16)
81
Male manual workers, guards, clerks, and technicians; number correct for Recognition trial reported
12.9 (2.5)
NCTB battery, Recognition
BVRT.13 Youngjohn et al., 1993 page 407 Table A20.13
18-39 40-49 50-59 60-69 70-84
1,128
Healthy volunteers; 464 M, 664 F; data are stratified into 5 age groups by 3 education groups; number correct and errors reported
12-14 15-17 18-25
Adm.A
trial USA
Table A20.1. (Contd.) Study
ICD Age•
n
BVRT.14 Palmer et al., 1994 page 408 Table A20.14
71.4 (4.69)
BVRT.15 Escalona et al., 1995 page 408 Table A20.15
Sample Composition
IQ/Education*
Form/Administration
Location
1,149
Community-dwelling older males; number correct reported
Well educated
Form C Adm. A
Western USA
30 16-45
67
Control participants; 56 M 11 F; number correct for Recognition trial reported
8
NCTB battery, Recognition trial
Venezuela
BVRT.16 Giambra et al., 1995 page 408 Table A20.16
28-33 34-39 40-45 46-51 52-57 58-63 64-69 70-75 76-81 82-87
1,721
Volunteers; 1,163 M, 558 F; mostly Caucasian; sample is partitioned into gender x 10 age groups; number of errors reported
Highly educated
Form C Adm. A
Baltimore/ Washington DC
BVRT.17 Resnick et al., 1995 page 409 Table A20.17
20-29 30-39 40-49 50-59 60-69 70-79 80-102
2,000
Volunteers; 1,365 M, 635 F; mostly Caucasian; sample is partitioned into gender x 7 age groups; data for 7 types of errors reported
Highly educated
Form C Adm. A
Baltimore/ Washington DC
BVRT.18 Steenhuis & Ostbye, 1995 page 409 Table A20.18
78.5 (6.7)
591
Elderly participants with no cognitive impairment; 61 % F; number correct reported
9.8 (4.0)
Multiple-choice administration
Canada
BVRT.19 Dealberto et al., 1996 page 410 Table A20.19
60-64 65-70
1,389
Elderly participants; 574 M, 815 F; no exclusion criteria; sample partitioned by gender, 2 age groups, and 3 educational levels; number correct reported
<6 6-12 > 12
Recognition
France
trial
BVRT.20 Jacobs et al., 1997 page 410 Table A20.20
BVRT.21 Carmelli et al., 1999
75.07 (5.90) 74.91 (5.71)
5~
118 118
589
page 411 Table A20.21
Older English speakers and Spanish speakers; 75% and 72% F, respectively; cognitive impairment or dementia among exclusion criteria; number correct reported
8.85 (3.78) 8.41 (3.98)
White male WWII veterans; no exclusion criteria; sample partitioned by 4 smoking groups and 4 alcohol intake groups; number correct reported
Education:
Recognition for matching form C; Memory form D
New York
≤12
(n=341) >12 (n=254)
Version not specified
Massachusetts, Indiana, California
77.7 (7.89)
156
Healthy participants; mostly Caucasian; 31M, 125 F; sample is partitioned into 9 age groups x 11 educational levels; expected number correct reported
12.67 (3.46)
Adm.A
Iowa
76.2 (6.1) 74.8 (5.7)
43
Nondemented participants in the Columbia Aging Project (>65 years old); Medicare recipients; 74% F; sample is stratified into 2 literacy groups; number correct reported
0-3
Recognition trial, matching
New York
BVRT.24 Mathiesen et al., 1999 page 412 Table A20.24
45.5 (10.8)
52
Male control participants; number correct and errors reported
9.5 (1.8)
Form C Adm. A
Norway
BVRT.25 Amir, 2001 page 413 Table A20.25
15-44
260
124M, 136 F; no exclusion criteria; sample partitioned into 4 IQ groups by gender and 2 education groups
11.6 M (2.71)
Form D Adm. A Form E Adm. A
United Arab Emirates
BVRT.22 Coman et al., 1999 page 412 Table A20.22
BVRT.23 Manly et al., 1999 page 412 Table A21.23
43
12.0 F (2.80)
Table A20.1. (Contd.) Study
Age•
n
BVRT.26 Touradji et al., 2001 page 413 Table A21.26
75.7 (7.2) 77.9 (7.3)
106
BVRT.27 Coman et al., 2002 page 414 Table A20.27
55-64 65-74 75-84 85+ 73.9 (5.8) 74.6 (5.9)
BVRT.28 Manly et al., 2002 page 414 Table A21.28
IQ/Education•
Form/Administration
Nondemented participants in the Columbia Aging Project (>65 years old); Medicare recipients; sample is stratified into U.S.-born vs. foreign-born groups; number correct reported
12.9 (3.5) 12.0 (3.7)
Recognition
156
Same as BVRT.22; number correct reported
12.67 (3.46)
Adm.A
Iowa
192
Nondemented participants in the Columbia Aging Project (>65 years old); Medicare recipients; 68% F; data are presented for African-American and white participants separately; number correct reported
12.8 (2.8) 13.0 (3.0)
Recognition trial, matching
New York
Male control participants; number correct reported
2::12
Version not specified
Egypt
Volunteers; 1,004 M, 421 F; sample is partitioned into 6 age groupings; number of errors reported
Highly
Form C Adm. A
Baltimore/ Washington DC
87
192
BVRT.29 Farabat et al., 2003 page 414 Table A20.29
42.28 (5.54)
50
BVRT.30 Kawas et al., 2003 page 415 Table A20.30
<50 50-59
1,425
60-69
70-79 80-89 90+
Sample Composition
educated
Location New York
trial, matching
BVRT.31 Reinprecht et al., 2003 page 415 Table A20.31
81
BVRT.32 Ruggieri et al., 2003 page 416 Table A20.32
30.08 (8.37)
50
BVRT.33 Witjes-Ane 2003 page 416 Table A20.33
42 18-64
88
141
Adm.A
Male participants born in 1914; sample is partitioned into 3 groups based on hypertension status; number correct reported Control participants; 24M, 26 F; number correct reported
Volunteers; 40 M, 48 F; number correct reported
Sweden
11.76 (3.04)
Mostly ~high
school
Version not specified
Italy
Version not specified
Netherlands
*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever is provided by the authors.
Table A20.2. [BVRT.2] Benton, 1962: Data for 100 Inpatients (mean age = 41, mean years of education = 10) in Iowa with Medical or Neurological Disorders but without Cerebral Disease or Injury
BVRT administration                     Errors
Copy (Administration C, Form C or E)    0.8
Memory (Administration B, Form C or E)  6.6
Table A20.3. [BVRT.3] Klonoff and Kennedy, 1965: Data for Community-Dwelling Canadian Veterans Tested in 1963-1964
n    Age range  Mean education  BVRT* correct           BVRT* errors
172  80-92      7.04            3.97 (1.70), range 0-8  11.71 (4.64), range 1-26
*Mean number correct and errors for administration A, form C.
Table A20.4. [BVRT.4] Arenberg, 1978: Data* for 857 Male, Mostly Caucasian, Well-Educated, High Socioeconomic Status Residents of the Baltimore-Washington DC Area: Sample Is Divided According to Date of Testing
Age group  Tested 1960-1964  Tested 1965-1968 (1st half)  Tested 1968-1973 (2nd half)
<30        1.25 (n=8)        2.75 (n=12)                  2.55 (n=42)
30s        2.61 (n=67)       2.73 (n=15)                  3.13 (n=61)
40s        2.88 (n=98)       3.15 (n=40)                  3.76 (n=41)
50s        3.50 (n=100)      4.51 (n=37)                  4.48 (n=48)
60s        4.58 (n=66)       5.09 (n=35)                  5.31 (n=39)
70s        6.33 (n=55)       6.05 (n=20)                  6.64 (n=50)
≥80        11.75 (n=8)       12.00 (n=3)                  8.33 (n=12)
*Mean number of errors for administration A, form C.
Table A20.5. [BVRT.5] Eslinger et al., 1985: Data for Normal Elderly Volunteers Recruited in Iowa
n   Age           Education  M/F ratio  BVRT* correct  BVRT* errors
53  73.1 (60-88)  12.0       25/28      5.6 (1.6)      7.4 (3.3)
*Total correct and total errors for administration A, form C.
Table A20.6. [BVRT.6] Larrabee et al., 1986: Data on Healthy Elderly Participants (73 Caucasian, 5 African American) Recruited in Florida
n   Age         Education   M/F ratio  VIQ         PIQ         BVRT errors*
78  72.9 (6.9)  12.2 (3.3)  17/61      112 (12.8)  114 (11.5)  7.8 (3.2)
*Number of errors for administration A, form C.
Table A20.7. [BVRT.7] Randall et al., 1988: Data* for 120 Participants in Mississippi (69 Females, 51 Males) in Six IQ Ranges (n = 20 per Group) with Mean Ages across Groups of 23.87-26.20
Administration D
Administration A
Administration C
Correct
Errors
Correct
Errors
Correct
Errors
60-69
2.50 (1.64)
15.50 (5.05)
1.80 (1.32)
19.10 (6.62)
6.10 (3.03)
5.55 (5.63)
70-79
4.30 (1.26)
10.15 (2.00)
4.50 (2.21)
9.30 (6.15)
8.00 (2.20)
2.90 (3.45)
80-89
6.50 (1.90)
5.40 (2.99)
6.50 (2.01)
4.80 (3.16)
8.80 (1.32)
1.90 (1.92)
90-109
8.30 (1.34)
2.10 (1.83)
8.75 (0.85)
1.35 (0.93)
9.85 (0.37)
0.15 (0.36)
110-119
8.05 (1.19)
2.50 (1.79)
8.50 (1.10)
1.60 (1.31)
9.75 (0.44)
0.25 (0.44)
120+
8.25 (0.79)
2.00 (1.07)
8.40 (0.94)
1.90 (1.21)
9.70 (0.47)
0.30 (0.47)
IQ Group
"Number correct and errors for administrations A (form C), D (form D), and C (form C).
Table A20.8. [BVRT.8] Robertson-Tchabo and Arenberg, 1989: Data* for 1,643 Participants in Baltimore-Washington DC Divided by Gender, Education, and Seven Age Groupings Age
20-29
30-39
40-49
3.56 (3.04) n=57
3.75 (2.46) n=51
5.07 (2.79)
6.25 (2.41)
n=42
n=92
2.49 (1.87) n=181
2.94 (1.89) n=155
Females, no college degree
2.90 (2.08) n=19
3.04 (2.21) n=27
Females, college degree
2.70 (1.82) n=37
2.67 (2.11) n=70
Males, no college degree
2.02 (1.53)
n=45 Males, college degree
2.47 (1.87)
"Number errors for administration A (form C).
70-79
80-89
n=28
7.72 (3.92) n=47
11.13 (4.29) n=15
3.67 (2.57) n=156
4.66 (2.81) n=123
6.03 (3.20) n=139
8.15 (4.25)
3.87 (1.82) n=23
5.11 (1.95)
n=35
6.02 (2.58) n=42
7.53 (3.37) n=32
9.44 (3.84) n=9
2.61 (2.17)
3.68 (1.99)
4.57 (2.17)
6.79 (2.97)
n=28
n=38
n=56
n=43
7.62 (4.43) n=13
50-59
60-69
n=40
Table A20.9. [BVRT.9] Alder et al., 1990: Data for 277 Males in Baltimore-Washington DC Presented in Five Age Groupings
Table A20.10. [BVRT.10] Prakash and Bhogle, 1992: Data for a Sample of 660 Indian Participants (331 Male, 329 Female) with Higher Secondary Education, Divided into 10 Age Groupings
BVRT
WAIS Vocabulary
Age
n
Errors•
Ra" Score
25-34
27
2.59 (2.36)
~.26
35-44
74
2.77 (2.06)
115.68
45-54
55-64 65+
101
(9.42)
67.01
Correct•
15-19
90
7.94 (1.32)
20-.24
86
7.48 (1.88)
25-29
84
7.77 (1.69)
30-34
56
7.88 (1.72)
35-39
53
7.70 (1.77)
40-44
62
7.65 (1.49)
45-49
73
7.73 (1.67)
S0-54
55
6.93 (2.09)
55-59
62
60-64
39
7.34 (1.70) 6.36 (1.80)
~.89) ~.62
4.19 (2.96)
33
n
(10.48)
3.36 (2.47)
42
BVRT
Age Grouping
(\)..05)
~.91
5.12 (3.32)
,.96)
•Mean errors for administration A, form C.
•Mean correct for administration A, Forms C, D, and E (apparently collapsed).
Table A20.11. [BVRT.11] Robinson-Whelan, 1992: Data* for 122 Caucasian Participants Recruited in Missouri with a Mean Age of 72.23 (9.0) and Mean Education of 13.61 (3.4) Total Form
c D
Correct
Errors
Omission
Distortion
5.55 (1.69)
7.38 (3.67)
0.86 (1.36)
9.38 (1.20)
0.65 (1.31)
0.05 (0..22)
2.87 (1.990 0.24 (0.,
Left
Perseveration
Rotation
Misplacement
Size
Right Side
Side
1.07
1.48 (1.34)
0.80
(1..21)
(1.05)
0.30 (0.79)
4.08 (2.27)
3.00 (1.72)
0.01 (0.09)
0.06 (0..23)
0.18 (0.44)
0.12 (0.51)
0.24 (0.53)
0.31 (0.86)
*Number correct, number of errors, and error types for administration A, form C followed by administration C, form D.
Table A20.12. [BVRT.12] Lee and Lee, 1993: Data for Male Korean Manual Workers, Guards, Clerks, and Technicians
n   Age         Education   BVRT correct*
81  34.7 (8.6)  12.9 (2.5)  8.1 (1.5)
*Mean correct for the Recognition trial.
Table A20.13. [BVRT.13] Youngjohn et al., 1993: Data• for a Sample of 1,128 Volunteers (464 Male, 664 Female), Divided into Five Age Groups and Three Educational Levels Age Groupings 18-39 Education
12-14
15-17
18-25
Correct
40-49
Errors
Correct
Errors
50-59
Correct
60-69
Errors
Correct
70-84
Errors
Correct
Errors
7.59 3.38 (1.52) (2.37) n=29 8.04 2.52 (1.19) (1.70) n=27
7.11 4.22 (1.53) (2.62) n=18 7.78 3.48 (1.54) (2.78) n=23
6.66 4.90 (1.47) (2.42) n=130 7.08 4.21 (1.70) (2.85) n=146
6.18 5.55 (1.67) (2. 74) n=129 6.70 4.99 (1.47) (2. 78) n=159
5.62 7.28 (1.73) (3.55) n=53 6.06 6.74 (1.84) (4.34) n=54
8.11
2.67
(1.28)
(1.78)
7.42 3.74 (1.22) (2.47) n=19
7.55 3.64 (1.53) (2.76) n=133
6.80 4.93 (1.55) (2.87) n=134
6.22 6.33 (1.57) (3.63) n=49
n=18
"Mean correct and errors for administration A (form not specified).
Table A20.14. [BVRT.14] Palmer et al., 1994: Data for Primarily Caucasian, Well-Educated Males in High-Level Occupations, Who Were Tested between 1986 and 1989
n      Age          BVRT correct*
1,149  71.4 (4.69)  6.61 (1.51)
*Mean correct for administration A, form C.
Table A20.15. [BVRT.15] Escalona et al., 1995: Data for Venezuelan Control Participants
n   Age         M/F ratio  Education  BVRT correct*
67  30 (16-45)  56/11      8          6.2 (2)
*Mean correct for Recognition trial.
Table A20.16. [BVRT.16] Giambra et al., 1995: Data for a Sample of 1,721 Mostly Caucasian, Well-Educated Participants (1,163 Males, 558 Females), Divided into Gender x Age Groupings
BVRT Errors*
Age Grouping
Men
Women
28-33
2.22 (1.93)
34-39
2.56 (1.94)
2.96 (2.10)
40-45
3.05 (2.56)
3.11 (2.28)
46-51
2.89 (2.44)
2.43 (2.23)
52-57
3.24 (2.38)
3.48 (2.02)
58-63
3.76 (2.61)
4.47 (2.80)
64-69
4.52 (2.82)
5.39 (3.31)
7~75
5.93 (2.96)
6.59 (3.05)
76-81
7.55 (3.62)
7.50 (3.50)
82-87
8.49 (4.36)
"Mean errors for administration A, form C.
Table A20.17. [BVRT.17] Resnick et al., 1995: Data for a Sample of 2,000 Mostly Caucasian, Well-Educated Participants (1,365 Men, 635 Women), Recruited in the Baltimore-Washington DC Area, Divided into Seven Age Groups x Gender
Error Type•
Age Group/Gender
n
Omission
Addition
Distortion
Perseveration
Rotation
Misplacement
Size
166
0.02 (0.1)
0.00 (0.9)
1.07 (1.0)
0.39 (0.7)
0.40 (0.7)
0.38 (0.7)
0.05 (0.2)
1.00 (1.2)
0.43 (0.7)
0.38 (0.6)
0.51 (0.8)
0.05 (0.3)
~29
Men
I
Women
102
0.04 (0.2)
Women
(0.~) I
~9
Men
O.fH
243 110
0.~
0.01 (0.1)
(0.,)
1.07 (1.1)
0.53 (0.8)
0.53 (0.8)
0.47 (0.9)
0.06 (0.3)
0.09 (0.4)
O.Oo (O.f)
1.26 (1.2)
0.43 (0.8)
0.65 (0.9)
0.39 (0.6)
0.02 (0.1)
t
Table A20.17. (Contd.)

Error Type*
Age Group/Gender   n     Omission     Addition     Distortion   Perseveration   Rotation     Misplacement   Size
40-49
  Men              214   0.07 (0.4)   0.00 (0.0)   1.46 (1.3)   0.56 (0.7)      0.50 (0.7)   0.48 (0.7)     0.06 (0.3)
  Women            56    0.09 (0.3)   0.00 (0.0)   1.54 (1.3)   0.46 (0.7)      0.46 (0.7)   0.36 (0.7)     0.04 (0.2)
50-59
  Men              211   0.13 (0.5)   0.01 (0.1)   1.73 (1.6)   0.64 (0.8)      0.64 (0.8)   0.63 (0.9)     0.08 (0.3)
  Women            95    0.33 (0.8)   0.00 (0.0)   1.98 (1.3)   0.71 (0.9)      0.76 (0.9)   0.52 (0.7)     0.06 (0.3)
60-69
  Men              222   0.21 (0.6)   0.01 (0.1)   2.22 (1.6)   0.79 (1.0)      0.73 (0.9)   0.70 (0.9)     0.14 (0.4)
  Women            112   0.54 (1.0)   0.00 (0.0)   2.08 (1.4)   0.90 (1.1)      1.01 (0.9)   0.56 (0.8)     0.12 (0.4)
70-79
  Men              226   0.54 (1.0)   0.02 (0.1)   2.72 (1.8)   1.14 (1.2)      0.91 (1.1)   0.79 (1.0)     0.19 (0.5)
  Women            114   0.58 (1.5)   0.00 (0.0)   2.57 (1.7)   1.28 (1.2)      1.21 (1.1)   0.88 (1.0)     0.16 (0.5)
80-102
  Men              83    0.82 (1.4)   0.05 (0.3)   3.81 (2.4)   1.00 (1.2)      1.30 (1.0)   1.05 (1.1)     0.39 (0.8)
  Women            46    1.09 (1.8)   0.11 (0.3)   3.46 (1.9)   1.33 (1.3)      1.28 (1.0)   1.00 (1.2)     0.39 (0.8)

*Means for seven types of error from administration A, form C.
Table A20.18. [BVRT.18] Steenhuis and Ostbye, 1995: Data for Canadian Participants with no Cognitive Impairment

n     Age          Education   % Female   BVRT Correct*
591   78.5 (6.7)   9.8 (4.0)   61         11.84 (2.34)

*Mean correct for multiple-choice administration.
Table A20.19. [BVRT.19] Dealberto et al., 1996: Data for a Sample of 1,389 French Participants (574 Male, 815 Female), Aged 60-70 Years, Partitioned by Gender, Two Age Groups, and Three Educational Levels: No Exclusion Criteria Were Used

Grouping            Level     Correct*
Gender              Male      11.7 (2.0)
                    Female    11.4 (2.0)
Age                 60-64     11.6 (1.9)
                    65-70     11.4 (2.0)
Education (Years)   <6        11.1 (2.0)
                    6-12      11.8 (1.9)
                    >12       12.3 (1.6)

*Mean for number correct (apparently for Recognition trial).
Table A20.20. [BVRT.20] Jacobs et al., 1997: Data* for 118 English Speakers and 118 Spanish Speakers Recruited in New York

Group     Age            Education     Gender       Matching      Memory
English   75.07 (5.90)   8.85 (3.78)   75% female   8.30 (1.60)   6.79 (1.95)
Spanish   74.91 (5.71)   8.41 (3.98)   72% female   7.58 (1.93)   5.74 (1.99)

*A recognition trial for visual perceptual assessment (matching, form C) and for visual memory (form D) was administered. Means are for total correct.
Table A20.21. [BVRT.21] Carmelli et al., 1999: Data for 589 Caucasian Male Veterans Aged 59-69 Years, Tested in 1985-1986, who Were Recruited from Massachusetts, Indiana, and California: Sample Is Divided into Four Smoking Groups and Four Alcohol Intake Groups

Grouping                  Level                     n     BVRT Correct*
Smoking Status            Never Smoked              199   6.7
                          Former (Quit ≥10 Years)   222   6.5
                          Former (Quit <10 Years)   72    6.2
                          Current Smoker            102   6.3
Daily Intake of Alcohol   No Drinks                 158   6.1
                          ≤1                        204   6.6
                          1-3                       150   6.4
                          ≥3                        83    6.5

*Mean for correct (test version not specified).
Table A20.22. [BVRT.22] Coman et al., 1999: Data* for 156 Primarily Caucasian Participants (31 Male, 125 Female) Recruited in Iowa, who Averaged 77.7 (7.89) Years of Age, with a Range of 61-97, and 12.67 (3.46) Years of Education, with a Range of 4-20: Sample Is Partitioned into Nine Age Groups by 11 Educational Levels†

Years of                                 Age
Education   55     60     65     70     75     80     85     90     95
8           6.54   6.09   5.65   5.21   4.77   4.32   3.88   3.44   3.00
9           6.72   6.28   5.83   5.39   4.95   4.50   4.06   3.61   3.17
10          6.90   6.46   6.01   5.57   5.12   4.68   4.23   3.79   3.34
11          7.09   6.64   6.20   5.75   5.30   4.86   4.41   3.96   3.52
12          7.27   6.82   6.38   5.93   5.48   5.03   4.59   4.14   3.69
13          7.46   7.01   6.56   6.11   5.66   5.21   4.76   4.31   3.87
14          7.64   7.19   6.74   6.29   5.84   5.39   4.94   4.49   4.04
15          7.82   7.37   6.92   6.47   6.02   5.57   5.12   4.67   4.21
16          8.01   7.56   7.10   6.65   6.20   5.75   5.29   4.84   4.39
18          8.38   7.92   7.47   7.01   6.56   6.10   5.65   5.19   4.74
20          8.74   8.29   7.83   7.37   6.91   6.46   6.00   5.54   5.08

*Expected number correct scores.
†Mean correct for administration A (form not specified) was 5.37 (1.92), with a range of 0-10.
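As an illustration of how the expected scores in Table A20.22 might be consulted, the following Python sketch looks up the expected number correct for a tabled age and education level and compares it with an observed score. Only three of the 11 education rows are transcribed here. The function name `expected_correct` and the example examinee are hypothetical, and the exact-match lookup (no interpolation between tabled values) is an assumption of this sketch, not a procedure described by Coman et al.

```python
# Excerpt of Table A20.22 (Coman et al., 1999): expected BVRT number correct.
# Columns are the tabled ages; only education rows 8, 12, and 16 are transcribed.
AGES = [55, 60, 65, 70, 75, 80, 85, 90, 95]
EXPECTED = {
    8: [6.54, 6.09, 5.65, 5.21, 4.77, 4.32, 3.88, 3.44, 3.00],
    12: [7.27, 6.82, 6.38, 5.93, 5.48, 5.03, 4.59, 4.14, 3.69],
    16: [8.01, 7.56, 7.10, 6.65, 6.20, 5.75, 5.29, 4.84, 4.39],
}


def expected_correct(age: int, education: int) -> float:
    """Return the tabled expected score; raise KeyError if the cell is not transcribed."""
    if education not in EXPECTED or age not in AGES:
        raise KeyError(f"no tabled value for age={age}, education={education}")
    return EXPECTED[education][AGES.index(age)]


# Hypothetical examinee: 70 years old, 12 years of education, 5 designs correct.
observed = 5
expected = expected_correct(70, 12)
print(f"expected={expected:.2f}, observed-expected={observed - expected:+.2f}")
```

A negative difference means the observed score falls below the expectation for that age and education cell; how large a shortfall is clinically meaningful is not something this sketch addresses.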
Table A20.23. [BVRT.23] Manly et al., 1999: Data for Older Participants (74% Female) with 0-3 Years of Education, Recruited in New York

                                                       BVRT Correct
Group        n    Age          % Spanish Speakers   Recognition Trial   Matching
Literate     43   76.2 (6.1)   72                   5.12 (2.35)         6.58 (2.16)
Illiterate   43   74.8 (5.7)   86                   3.75 (1.75)         5.35 (2.25)
Table A20.24. [BVRT.24] Mathiesen et al., 1999: Data for Male Participants, Younger than 65 Years, Recruited in Norway

                                                                 BVRT*
n    Age           Education   WAIS-R Vocabulary Scaled Score   Correct     Errors
52   45.5 (10.8)   9.5 (1.8)   8.7 (1.2)                        7.5 (1.4)   4.0 (2.3)

*Means for number correct and number of errors for administration A, form C.
Table A20.25. [BVRT.25] Amir, 2001: Data* for 260 Participants Recruited in United Arab Emirates

                                                          Form D                      Form E
Group                 n     Age           Education     Correct       Errors        Correct       Errors
Gender
  Males               124   21.7 (5.89)   11.6 (2.71)   7.65 (1.97)   2.48 (2.58)   7.57 (1.78)   2.80 (2.46)
  Females             136   21.6 (4.66)   12.0 (2.80)   7.33 (1.87)   3.33 (3.15)   7.91 (1.64)   2.31 (2.13)
IQ
  Superior            17                                8.53 (2.26)   1.18 (1.59)   8.88 (0.99)   1.06 (1.08)
  Above average       51                                8.20 (1.20)   2.51 (3.19)   8.12 (1.32)   2.06 (1.56)
  Average             175                               7.41 (1.73)   3.05 (2.55)   7.75 (1.62)   2.48 (2.00)
  Below average       17                                5.06 (1.74)   7.24 (2.84)   5.53 (2.34)   6.12 (4.09)
Education
  Low (≤9 years)      48                                6.98 (2.17)   3.58 (3.15)   7.58 (1.67)   2.73 (2.24)
  High (≥university)  44                                7.98 (1.42)   2.48 (2.13)   8.09 (1.61)   2.20 (2.12)

*Mean number correct and errors for administration A (form D followed by form E 2 weeks later).
Table A20.26. [BVRT.26] Touradji et al., 2001: Data for Older Non-Hispanic, Fluent English-Speaking, White Participants (74% Female) Recruited in New York
BVRT Correct
U.S.-born Foreign-born
n
Age
Education
Recognition Trial
Matching Trial
106
75.7 (7.2)
~I:)
7.67 (1.73)
9.07 (1.26)
87
77.9 (7.3)
(7)
7.81 (1.58)
8.71 (1.52)
Table A20.27. [BVRT.27] Coman et al., 2002: Data for a Sample of 156 Primarily Caucasian Participants, Recruited in Iowa, Partitioned by Four Age Groups
Age Group   n    BVRT Correct*
55-64       6    6.83 (1.17)
65-74       54   6.30 (1.59)
75-84       67   4.90 (1.75)
85+         29   4.45 (2.13)

*Mean correct for administration A (form not specified).
Table A20.28. [BVRT.28] Manly et al., 2002: Data for Older Participants (68% Female) Recruited in New York

                                                                        Benton Correct
Ethnicity          n     Age          Education    WRAT-3 Reading*   Recognition Trial   Matching Trial
African American   192   73.9 (5.8)   12.8 (2.8)   44.2 (7.2)        7.4 (1.8)           8.9 (1.4)
White              192   74.6 (5.9)   13.0 (3.0)   49.3 (4.1)        8.1 (1.5)           9.4 (1.2)

*WRAT, Wide Range Achievement Test.
Table A20.29. [BVRT.29] Farahat et al., 2003: Data for Male Control Participants Recruited in Egypt: 42 with Secondary Education and Eight with University Degrees

n    Age            BVRT Correct*
50   42.28 (5.54)   5.48

*Mean for number correct (administration and form not specified).
Table A20.30. [BVRT.30] Kawas et al., 2003: Data for a Sample of 1,425 Participants (1,004 Men, 421 Women), 72% of whom had College Degrees, Recruited in Baltimore-Washington DC, Partitioned by Six Age Groups

            BVRT                WAIS Vocabulary
Age Group   n     Errors*       n     Raw Score
<50         298   3.11 (2.28)   300   64.08 (10.18)
50-59       546   3.72 (2.60)   546   64.97 (9.54)
60-69       815   4.67 (3.05)   681   65.00 (10.04)
70-79       760   6.78 (4.01)   608   65.61 (8.80)
80-89       380   9.09 (4.50)   267   63.26 (10.83)
90+         40    9.73 (5.72)   19    58.00 (15.39)

*Mean for total number of errors for administration A (presumably form C).
Table A20.31. [BVRT.31] Reinprecht et al., 2003: Data for a Sample of 141 81-Year-Old Men Recruited in Sweden, Tested in 1995-1996, Classified into Three Groups According to Level of Hypertension

Group                               n     BVRT Correct*
No Hypertension at Age 68 and 81    22    4.7 (1.6)
Hypertension at 81 but not 68       11    4.5 (1.9)
Hypertension at 68 and 81           108   4.3 (1.6)

*Mean for number correct for administration A (form not specified).
Table A20.32. [BVRT.32] Ruggieri et al., 2003: Data for Italian Control Participants

n    Age            Education      M/F Ratio   BVRT Correct*
50   30.08 (8.37)   11.76 (3.04)   24/26       7.76 (1.89)

*Mean correct (test administration and form not specified).
Table A20.33. [BVRT.33] Witjes-Ane et al., 2003: Data for a Sample of Control Participants Recruited in the Netherlands, Tested in 1993-1998

n    Age          M/F Ratio   BVRT Correct*
88   42 (18-64)   40/48       7.5 (1.6)

*Mean for number correct (administration and form not specified).
Appendix 21 : Locator and Data Tables for the Finger Tapping Test (FTT)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 21.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A21.1. Locator Table for the Finger Tapping Test (FTT)

Study: FT.1 Vega & Parsons, 1967; page 424; Table A21.2
Age*: 40.8 (13.1)
n: 50
Sample Composition: Control group of patients with no CNS dysfunction; 37 M, 13 F
IQ/Education*: Education 11.1 (3.2)
Location: Oklahoma

Study: FT.2 Goldstein & Shelly, 1972; page 424; Table A21.3
n: 156
Sample Composition: Control group of general medical and psychiatric V.A. patients, mostly males
Location: Topeka, KS

Study: FT.3 Goldstein & Braun, 1974; page 425; Table A21.4
Age*: 20-29, 30-39, 40-49, 50-59, 60-69, 70-79
n: 29, 75, 47, 30, 16, 12
Sample Composition: Sample of neurologically intact participants, divided into 6 age groups

Study: FT.4 Finlayson et al., 1977; page 425; Table A21.5
Age*: 34.12 (8.72), 34.37 (7.36), 35.22 (7.78)
n: 17, 17, 17
Sample Composition: Control group of healthy adults and hospitalized medical and psychiatric patients, all males; data are presented in T scores
IQ/Education*: Grade school, High school, University
(continued)
Table A21.1. (Contd.) Study
Age•
n
Sample Composition
IQ/ Education•
Location
FT.5 Wiens & Matarazzo, 1977 page 426 Table A21.6
FT.6 Dodrill, 1978a page 426 Table A21.7
FT.7 Dodrill, 1978b page 426 Table A21.8
FT.8 Morrison et al., 1979 page 427 Table A21.9
23.6 24.8
24 24
Normal young men divided into 2 groups
FSIQ: 117.5 118.3
Oregon
27.3 (8.4)
50
Healthy control group; 30M, 20F
12.0 (2.0)
Washington
41.1
25
Healthy control group; 20M, 5 F
10.7
Washington
Mode=19
60
College students
Idaho
FT.9 Anthony et al., 1980 page 427 Table A21.10
FT.10 Bak & Greene, 1980 page 427 Table A21.11
FT.11 Eckardt & Matarazzo, 1981 page 428 Table A21.12
38.9 (15.8)
100
Students from introductory psychology courses; 30 M, 30 F; data are presented for dominant hand, averaged over test-retest and interexaminer trials
Healthy control group
FSIQ: 113.5 (10.8)
Colorado
50-62
15
6M,9F
Texas
67-86
15
45.6 (11.1)
20
Mean early 60s
60
5 M, 10 F; healthy participants
Control group of V.A. medical inpatients; data for dominant hand are provided for test and retest over 12-22 days
Healthy control group, mostly males
13.7 (1.91) 14.9 (2.99) 60% had some college education
Mean 11-12 years
USA
24.9
29
CETA workers, 59% M
11.2
Massachusetts
14.8 14.5
99 47
Delinquent adolescents, nondelinquent adolescents
FSIQ: 95.3 117.1
Alberta, Canada
20.0 (1.9)
30
Student volunteers; 21M, 9 F
(1.4)
59.6 (9.0)
25
Control group of healthy adults; 84% M
10.5 (3.3)
FT.12 Pirozzolo et al., 1982 page 428 Table A21.13
FT.13 Rounsaville et al., 1982 page 428 Table A21.14
FT.14 Yeudall et al., 1982 page 428 Table A21.15
FT.15 O'Donnell et al., 1983 page 429 Table A21.16
FT.16 Prigatano et al., 1983 page 429 Table A21.17
California
13.7
Oklahoma City and Canada
Table A21.1. (Contd.) Study
Age•
n
FT.17 Fromm-Auch & Yeudall, 1983 page 429 Table A21.18
15-17 18-23 24-32 33-40 41-64
193
FT.18 Bornstein, 1985
20-39 40-59 60-69
365
<25 45-54
Sample Composition
IQ/ Education"
Location
Normal volunteers; 111 M, 82 F; data are partitioned by 5 age groups x gender
14.8 (3.0)
Alberta
178M, 187 F; paid volunteers free of neurological or psychiatric illness; data are presented by 3 age x 2 education x gender groups
12.3 (2.7)
Western
10 10 10 10
Healthy volunteers; data are partitioned into 4 age groups
8-13
Catania,
32.7 (13.5)
100
79 M, 21 F; controls with no neurological illness, head trauma, or substance abuse
14.5 (2.84)
Denver
FT.21 Kane et al., 1985 page 431 Table A21.23
38.9 (11.3)
46
Control sample of medical and nonschizophrenic V.A. psychiatric patients; data are reported in T scores
12.3 (2.6)
Oklahoma City, Pittsburgh
FT.22 Heaton et al., 1986 page 431 Table A21.24
15-81 39.3 (17.5) <40 40-59 ≥60
553
356 M, 197 F; normal with no history of neurological illness, head trauma, or substance abuse; data are presented in 3 age and 3 education groups
~20
Colorado, California, Wisconsin
FT.23 Bornstein,
18-39 40-59
365
178M, 187 F; paid volunteers free of neurological or psychiatric illness; data are stratified by 3 age x 2 education x gender groups; proportion of participants classified as impaired is presented
12.3 (2.7)
Western Canada
18-24
120
Undergraduate students; 60 M, 60 F; data are partitioned by firm and mixed handedness groups
College students
Ohio
18-32 33-47 48-62 63-91
713
1987 page 432 Table A21.28
Healthy subjects; 382 M, 331 F; data are presented by age x gender
FT.26 Yeudall et al., 1987 page 433 Table A21.29
15-20 21-25 26-30 31-40
62 73 48 42
Normal adults; 127 M, 98 F; data are stratified by 4 age groups x gender
FSIQ: 111.75 109.79 113.95 116.09
Alberta, Canada
page 430 Tables A21.19, A21.20
FT.19 Villardita et al., 1985 page 430 Table A21.21
FT.20 Heaton et al., 1985 page 431 Table A21.22
1986a page 432 Tables A21.25, A21.26
FT.24 Polubinski &
55-64 65-74
~9
Melamed, 1986 page 432 Table A21.27
FT.25 Trahan et al.,
years
13.3 (3.4) <12 12-15 ≥16
Canada
Canada
Italy
(continued)
Table A21.1. (Contd.) Study
Age"
FT.27 Alekoumbides et al., 1987 page 433 Table A21.30
46.9 (17.2)
FT.28 Bornstein et al., 1987b page 434 Tables A21.31, A21.32
62.7 (4.3)
FT.29 Bornstein et al., 1987a page 434 Table A21.33
FT.30 Russell, 1987 page 435 Table A21.34
FT.31 Thompson et al., 1987 page 435 Table A21.35
FT.32 van den Burg et al., 1987 page 436 Table A21.36
FT.33 Bornstein & Suga, 1988 page 436 Table A21.37
32.3 (10.3)
FT.34 Ardila & Rosselli, 1989 page 436 Table A21.38
FT.35 Heaton et al., 1991, 2004 page 437 Data are not reproduced in this book
46.19 (12.86)
40.59 (18.27)
37.4 (11.9)
55-70 62.7 (4.3)
~55
42.1 (16.8) Groups: 20-34
35-39 40-44
45-49 50-54 55-59 60-64
65-69 70-74
75-80
n Sample Composition
123 Mostly inpatients of a large general hospital without a history of neurological disorder, mostly males
134 Healthy adults with no history of neurological or psychiatric illness; 49 M, 85 F; matched with neurological patients; classification rates based on conventional and optimal cutoff scores are presented
23 Volunteers from a university community; 9 M, 14 F; test-retest data over 3-week period are presented
155 V.A. patients suspected of having neurological condition but with negative neurological findings; 148 M, 7 F
426 279 M, 147 F; normal subjects; percent falling in lateralized dysfunction range is presented
40 Control group of healthy subjects; 16 M, 24 F
IQ/ Education* 11.4 (3.2)
11.7 (2.9)
VIQ=105.8 (10.8) PIQ=105.0 (10.5) 12.29 (3.0)
Miami, Cincinnati
13.15 (3.49)
Northern Holland
134 49 M, 85 F; paid volunteers screened for a history of neurological or psychiatric disorders; divided into 3 education groups: 17 M, 29 F; 16 M, 28 F; 16 M, 28 F
346 Healthy older adults with MMSE scores ≥23; data are stratified by 5 age x 3 education groups
11.7 (2.9)
486 Urban and rural volunteers; data collected over 15 years through multicenter collaborative efforts; strict exclusion criteria; 65% M; data are presented in T-score equivalents for M and F separately in 10 age groupings by 6 education groupings; in the 2004 edition, age range is expanded to 85 years and the data are presented for African-American and Caucasian participants separately
13.6 (3.5) FSIQ: 113.8 (12.3)
Location
Western Canada
Range, mean: 5-10, 8.5; 11-12, 11.7; >12, 15.0
0-5, 6-12, >12
Bogota, Colombia
6-8
9-11 12 13-15 16-17 ≥18
California, Washington, Texas, Oklahoma, Wisconsin,
Illinois, Michigan, New York, Virginia, Massachusetts, Canada
Table A21.1. (Contd.) Age·
Study FT.36 Ruff & Parker, 1993 page 437 Tables A21.39, A21.40
16-70
n 358
16-24 25-39 40-54 55-70
FT.37 Russell & Starkey, 1993 page 438 Table A21.41
45.1 (13.0)
176M
40.7 (15.3)
24F
FT.38 Dikmen et al., 1999 page 439 Table A21.42
34.2 (16.7)
384
FT.39 McCurry et al., 2001 page 439 Table A21.43
74.6 (2.7) 87.0 (5.1)
120
FT.40 Sackellares &
1~
Sackellares, 2001 page 440 Table A21.44
X=33.2
FT.41 Prigatano & Borgaro, 2003 page 441 Table A21.45
33.6 (9.61)
IQ/ Education•
Sample Composition
81
Location
Normal volunteers screened for psychiatric hospitalization, chronic poly-drug abuse, or neurological disorders; data are stratified by 4 age x gender groups; data for left hand-dominant sample are also presented; 179 M, 179 F
Norms are collected from standardization sample for the HRNES manual; controls are V.A. patients without CNS pathology
Normal and neurologically stable adults; some had neurological conditions; 66% M; data on test-retest reliabilities and practice effect are provided
7-22 years
12.1 (2.6)
Washington, Colorado, California
Nondemented Japanese American elderly; 43.8% M; weighted test scores are stratified into 2 age groups
11.7 (2.9) 10.0 (2.7)
Seattle
40
Control group; performance rates for both hands and asymmetry indices are provided for right-handers and left-handers separately
15
Control group; 46.7% M
California, Michigan, Eastern
seaboard
12.5 (2.8)
Cincinnati, Miami
14.5 (2.6)
Florida
12.86 (1.19)
Phoenix, AZ
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A21.2. [FT.1] Vega and Parsons, 1967: Data for the Control Group, which Included 43 Patients Hospitalized for Causes Other than Central Nervous System Dysfunction and Seven Nonhospitalized Participants*

n    Gender       Age           Education    WAIS FSIQ     Dominant Hand
50   37 M, 13 F   40.8 (13.1)   11.1 (3.2)   99.4 (12.9)   44.6 (9.2)

*Data are available only for the dominant hand.
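Tables such as A21.2 report a normative mean and standard deviation, and one common way to use such a pair is to express an individual's raw score as a standard (z) score, z = (raw - mean) / SD. The Python sketch below illustrates the arithmetic with the dominant-hand values from Table A21.2. The function name `z_score` and the example raw score of 35.4 taps are hypothetical, and treating the normative distribution as approximately normal is an assumption of this sketch, not a claim made by the source.

```python
def z_score(raw: float, mean: float, sd: float) -> float:
    """Standard score of a raw value relative to a normative mean and SD."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# Dominant-hand mean and SD from Table A21.2 (Vega and Parsons, 1967).
NORM_MEAN, NORM_SD = 44.6, 9.2

# Hypothetical examinee averaging 35.4 taps with the dominant hand.
z = z_score(35.4, NORM_MEAN, NORM_SD)
print(f"z = {z:.2f}")
```

The same function applies to any mean (SD) pair in these appendices, provided the normative sample's age, education, and gender composition is a reasonable match for the individual being assessed.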
Table A21.3. [FT.2] Goldstein and Shelly, 1972: Data for the Control Sample of General Medical and Psychiatric Veterans Administration Patients, Mostly Males

n     Dominant Hand   Nondominant Hand
156   46.8 (10.4)     41.2 (8.8)
Table A21.4. [FT.3] Goldstein and Braun, 1974: Data for a Healthy Sample Consisting of 201 Men and Eight Women Divided into Six Age Groups
Age Group
Mean Age
n
20-29
24.5
Preferred Hand
Nonpreferred Hand
Percent Reversal•
29
54.1 (4.4)
49.0 (4.6)
17
34.9
75
53.4 (5.4)
47.9 (5.2)
11
40-49
44.7
47
53.3 (4.8)
47.4 (4.5)
9
50-59
53.5
30
50.5 (5.4)
45.2 (4.4)
10
60-69
64.2
16
47.4 (8.6)
43.0
19
44.5
41.1
(7.4)
(7.5)
70-79
72.2
12
(7.0)
25
*Percent of subjects who were tapping faster with the nonpreferred hand than with the preferred hand.
Table A21.5. [FT.4] Finlayson et al., 1977: Data Presented in T Scores for the Control Group, which Included Healthy Individuals and Hospitalized Medical and Psychiatric Patients, All Males
T Score Dominant Hand
Nondominant Hand
101.88 (10.23)
51.06
55.77
Educational
Level
n
Age
Education
FSIQ
Grade school
17
34.12 (8.72)
7.H (I.t1J
High school
17
34.37 (7.36)
12.00 0
112.71 (12.21)
54.00
55.81
University
17
35.22 (7.78)
17.35 (1.116)
129.53 (7.58)
55.00
56.10
Table A21.6. [FT.5] Wiens and Matarazzo, 1977: Data for Male Applicants to a Patrolman Program

Group   n    WAIS FSIQ   Age    Education   Preferred Hand   Nonpreferred Hand
1       24   117.5       23.6   13.7        54.0 (4.6)       48.4 (4.4)
2       24   118.3       24.8   14.0        54.5 (4.0)       50.0 (4.1)
Table A21.7. [FT.6] Dodrill, 1978a: Data for the Control Group with no Evidence of Neurological Disorder

Group          n    Age          Education    % Male   Race        Preferred Hand   Nonpreferred Hand
Total sample   50   27.3 (8.4)   12.0 (2.0)   60       98% White
Males          30                                                  56.0 (5.6)       51.43 (5.7)
Females        20                                                  51.2 (4.0)       48.0 (3.7)
Table A21.8. [FT.7] Dodrill, 1978b: Data for the Control Group with no Evidence of Neurological Disorder

n    Age    Education   Gender      % Right-Handed   Right Hand   Left Hand
25   41.1   10.7        20 M, 5 F   100              53.4 (6.2)   49.6 (5.4)

Table A21.9. [FT.8] Morrison et al., 1979: Data for the Dominant Hand on Test-Retest and Interexaminer Trials Averaged over Two Probes for Students from Introductory Psychology Courses with a Modal Age of 19

          Test-Retest          Interexaminer
          n    Tapping         n    Tapping
Males     30   51.37 (4.45)    30   54.79 (3.71)
Females   30   48.52 (4.97)    30   50.37 (4.63)

Table A21.10. [FT.9] Anthony et al., 1980: Data for the Control Group of Healthy Adults

n     Age           Education    WAIS FSIQ      Dominant Hand   Nondominant Hand
100   38.9 (15.8)   13.3 (2.6)   113.5 (10.8)   52.6 (9.1)      48.2 (7.6)

Table A21.11. [FT.10] Bak and Greene, 1980: Data for Healthy Right-Handed Older Adults

Age Group   n    Gender       Age           Education     Right Hand     Left Hand
50-62       15   6 M, 9 F     55.6 (4.44)   13.7 (1.91)   44.53 (6.71)   40.80 (4.77)
67-86       15   5 M, 10 F    74.9 (6.04)   14.9 (2.99)   38.73 (4.13)   36.33 (5.93)

Table A21.12. [FT.11] Eckardt and Matarazzo, 1981: Test-Retest Data for the Dominant Hand for the Control Group of Medical Inpatients from the Veterans Administration Hospital

                               Dominant Hand
n    Age           % Male   Test         Retest*
20   45.6 (11.1)   100      42.1 (9.4)   43.6 (10.4)

*Test-retest interval was 12-22 days.

Table A21.13. [FT.12] Pirozzolo et al., 1982: Data for the Control Group of Healthy Elderly (Mean Age Early 60s, Mean Education 11-12 Years, Mostly Males)

n    Right Hand     Left Hand
60   50.07 (8.75)   46.73 (8.43)
Table A21.14. [FT.13] Rounsaville et al., 1982: Data for a Sample of Comprehensive Employment Training Act Workers

n    % Male   Education   % Right-Handed   Age    Dominant Hand   Nondominant Hand
29   59       11.2        90               24.9   48.48           42.5

Table A21.15. [FT.14] Yeudall et al., 1982: Data for Delinquent and Nondelinquent Canadian Adolescents

Group           n    Age    WAIS FSIQ   Number of Males   Number of Females   Preferred Hand   Nonpreferred Hand
Delinquent      99   14.8   95.3        64                35                  40.0 (6.7)       37.0 (5.8)
Nondelinquent   47   14.5   117.1       29                18                  42.9 (7.7)       40.2 (6.9)

Table A21.16. [FT.15] O'Donnell et al., 1983: Data for a Control Group of Student Volunteers

n    Age          Education    Gender      % Right-Handed   WAIS FSIQ      Dominant Hand
30   20.0 (1.9)   13.7 (1.4)   21 M, 9 F   90               117.1 (10.3)   51.3 (5.5)

Table A21.17. [FT.16] Prigatano et al., 1983: Data for the Control Group

n    Age          Education    % Male   % Right-Handed   WAIS FSIQ      Dominant Hand   Nondominant Hand
25   59.6 (9.0)   10.5 (3.3)   84       96               112.0 (11.0)   48.4 (7.7)      42.6 (6.0)
Table A21.18. [FT.17] Fromm-Auch and Yeudall, 1983: Data for a Sample of Healthy Canadian Adults Stratified by Age and Gender*

Age       n    Preferred Hand (SD)   Nonpreferred Hand (SD)
Males
  15-17   17   47.6 (5.8)            43.6 (4.9)
  18-23   44   49.5 (6.9)            45.4 (6.9)
  24-32   31   50.6 (6.6)            46.0 (6.1)
  33-40   12   53.4 (5.9)            49.8 (4.7)
  41-64   4    44.4 (5.8)            41.4 (3.5)
Females
  15-17   15   42.7 (7.9)            41.1 (6.2)
  18-23   30   43.6 (7.5)            41.2 (6.5)
  24-32   25   45.2 (6.7)            40.9 (5.7)
  33-40   6    45.8 (5.5)            44.3 (4.6)
  41-64   6    40.4 (4.8)            38.6 (4.8)

*Mean education 14.8 years, mean full-scale IQ 119.1.
Table A21.19. [FT.18a] Bornstein, 1985: Data for a Sample of Healthy Canadian Adults Stratified by Age, Education, and Gender

Group          Age           Education    Number of Males   Number of Females   Preferred Hand   Nonpreferred Hand
By age
  20-39                      13.0 (2.3)   107               64                  47.2 (6.5)       43.5 (5.4)
  40-59                      11.9 (2.8)   31                66                  40.3 (7.5)       37.8 (5.8)
  60-69                      11.8 (2.9)   40                57                  35.4 (7.7)       33.8 (6.4)
By education
  <HS*         48.5 (16.6)                51                57                  39.8 (9.2)       37.6 (7.8)
  ≥HS          41.1 (16.8)                127               130                 43.3 (8.2)       40.3 (7.2)
By gender
  Males        39.2 (17.2)   12.4 (2.9)   178                                   46.1 (7.0)       42.6 (6.9)
  Females      47.3 (16.1)   12.2 (2.5)                     187                 38.6 (8.4)       36.5 (6.8)

*HS, high school.
Table A21.20. [FT.18b] Bornstein, 1985: Data Presented in Age x Education x Gender Groupings

                    Male                                Female
              <HS             ≥HS               <HS             ≥HS
Age Group     n    M (SD)     n    M (SD)       n    M (SD)     n    M (SD)
Preferred hand
  20-39       21   49.7 (6.0) 86   48.5 (6.5)   13   45.2 (6.0) 49   44.3 (5.8)
  40-59       13   42.3 (5.2) 17   43.4 (7.9)   22   36.3 (7.8) 43   40.5 (7.1)
  60-69       16   39.1 (5.7) 23   43.0 (4.7)   22   29.7 (6.2) 34   32.2 (6.0)
Nonpreferred hand
  20-39            47.0 (5.5)      44.8 (6.4)        40.7 (5.0)      40.6 (5.6)
  40-59            39.8 (3.6)      39.5 (5.8)        35.2 (5.8)      37.8 (6.0)
  60-69            35.2 (5.2)      39.3 (6.2)        29.8 (5.6)      32.0 (4.9)
Table A21.21. [FT.19] Villardita et al., 1985: Data* for Healthy Italian Adults with 8-13 Years of Schooling Partitioned into Four Age Groups
Preferred Hand
Nonpreferred Hand
10
93.4 (7.9)
:90.1 (8.0)
45-54
10
82.8 (11.0)
80.4 :(9.2)
55-64
10
70.3 (11.9)
69.5 (9.4)
65-74
10
67.9 (9.9)
~1.3)
Age Group
n
<25
69.3
*Score is the total number of taps recorded for each hand on two trials.
Table A21.22. [FT.20] Heaton et al., 1985: Data for the Neurologically Healthy Control Group

n     Age           Education      M/F Ratio   Total for Both Hands
100   32.7 (13.5)   14.15 (2.84)   79/21       104.27 (12.03)

Table A21.23. [FT.21] Kane et al., 1985: Data* for the Control Group Consisting of Medical and Nonschizophrenic Veterans Administration Psychiatric Patients

n    Age           Education    Dominant Hand   Nondominant Hand
46   38.9 (11.3)   12.3 (2.6)   51.5 (10.1)     51.5 (8.5)

*Data are reported in T scores.

Table A21.24. [FT.22] Heaton et al., 1986: Data for a Sample of Healthy Adults Stratified by Age and Education

Group              n     Dominant Hand   Nondominant Hand
Age groups
  <40              319   53.6            48.8
  40-59            134   52.8            47.5
  ≥60              100   47.9            43.5
Education groups
  <12              132   48.7            44.0
  12-15            249   53.3            48.1
  ≥16              172   53.7            49.0

Table A21.25. [FT.23a] Bornstein, 1986a: Proportion of Participants Classified as Impaired in a Sample of Healthy Canadian Adults (n = 365)

                           Mean   Median   Mode   % Classified as Impaired
Preferred hand (taps)      42.3   43       47     79.9
Nonpreferred hand (taps)   39.5   39.5     39     70.2
Table A21.26. [FT.23b] Bornstein, 1986a: Proportion of Participants Classified as Impaired in a Sample Stratified by Age x Education x Gender
Preferred Hand Males (<50)*
Nonpreferred Hand
Females (<46)
Age
2:HS
2:HS
18-39
35 (7120)
58.6 (51187)
61.5 (8113)
100 (13113)
75 (12116)
95.5 (21122)
79.5 (35144)
100 (16/16)
91.3 (21123)
100 (17122)
(33133)
40-59 60-69
Males (<44)
Females ( <40)
2:HS
2:HS
50
20
(23/46)
(4120)
36.8 (32/87)
69.2 (9.13)
(20146)
84.6 (ll/13)
81.3 (13116)
77.3 (17/22)
56.8 (25144)
100 (16/16)
73.9 (17123)
95.4 (21122)
(32133)
100
*Cutoff criteria for impairment, which differed for males and females, are presented in parentheses. HS, high school.
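The cutoff criteria shown in parentheses in Table A21.26 (<50 and <44 taps for males, <46 and <40 taps for females, for the preferred and nonpreferred hands, respectively) amount to a simple threshold rule. The Python sketch below applies that rule to a mean tapping score. The dictionary and function names are hypothetical and the example scores are illustrative; this shows how the cutoffs operate, not a recommended diagnostic procedure.

```python
# Conventional cutoffs (mean taps) taken from the column headings of Table A21.26.
# A score strictly below the cutoff is classified as "impaired" under this rule.
CONVENTIONAL_CUTOFFS = {
    ("male", "preferred"): 50,
    ("male", "nonpreferred"): 44,
    ("female", "preferred"): 46,
    ("female", "nonpreferred"): 40,
}


def below_cutoff(gender: str, hand: str, taps: float) -> bool:
    """Return True if the tapping score falls below the conventional cutoff."""
    return taps < CONVENTIONAL_CUTOFFS[(gender, hand)]


# Illustrative scores: the whole-sample preferred-hand mean from Table A21.25
# (42.3 taps) is below both the male and the female preferred-hand cutoffs.
print(below_cutoff("male", "preferred", 42.3))
print(below_cutoff("female", "preferred", 42.3))
print(below_cutoff("female", "nonpreferred", 41.0))
```

That a healthy sample's mean falls under these thresholds is consistent with the high proportions classified as impaired in Tables A21.25 and A21.26.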
43.4
9619
Table A21.27. [FT.24] Polubinski and Melamed, 1986: Data* for Undergraduate Students Partitioned into Firm and Mixed Right-Handedness

Group     n    Age          Education    Right Hand    Left Hand
Men
  Firm    30   19.7 (1.4)   13.3 (0.7)   93.6 (10.1)   77.6 (10.0)
  Mixed   30   20.1 (1.6)   13.6 (0.9)   93.5 (13.1)   79.6 (12.1)
Women
  Firm    38   19.8 (1.2)   13.4 (0.7)   91.1 (12.4)   76.0 (13.8)
  Mixed   22   19.4 (0.8)   13.3 (0.6)   88.8 (9.9)    74.5 (11.6)

*Scores are presented for 15-second trials.
Table A21.28. [FT.25] Trahan et al., 1987: Data* for a Sample of Neurologically Intact Adults

Age/Gender   n     Preferred Hand   Nonpreferred Hand
18-32
  Males      175   54.58 (5.19)     50.31 (5.38)
  Females    221   48.71 (5.52)     44.75 (5.60)
33-47
  Males      145   53.28 (6.29)     48.52 (6.64)
  Females    56    49.12 (5.91)     45.07 (5.12)
48-62
  Males      29    52.80 (5.24)     47.98 (5.78)
  Females    35    46.84 (6.05)     43.59 (5.92)
63-91
  Males      33    49.23 (8.94)     45.31 (8.43)
  Females    19    43.13 (6.69)     41.03 (7.38)

*Score is the mean of three trials for each hand.
Table A21.29. [FT.26] Yeudall et al., 1987: Data for Healthy Canadian Adults Stratified by Age for the Entire Sample and for Males and Females Separately

Age Group   n     Age            Education      WAIS-R FSIQ      % Right-Handed   Preferred Hand   Nonpreferred Hand
Entire sample (n = 225)
  15-20     62    17.76 (1.96)   12.16 (1.75)   111.75 (10.16)   79.03            46.59 (6.60)     42.51 (5.81)
  21-25     73    22.70 (1.40)   14.82 (1.88)   109.79 (9.97)    86.30            47.28 (8.13)     44.38 (7.20)
  26-30     48    28.06 (1.52)   15.50 (2.65)   113.95 (10.61)   89.58            50.47 (7.28)     45.64 (6.68)
  31-40     42    34.38 (2.46)   16.50 (3.11)   116.09 (9.51)    90.48            50.52 (8.37)     46.54 (6.75)
  15-40     225   24.66 (6.16)   14.55 (2.78)   112.25 (10.25)   85.78            48.38 (7.76)     44.54 (6.76)
Females (n = 98)
  15-20     30    17.73 (1.84)   12.10 (1.52)   110.32 (10.64)   73.33            44.77 (7.37)     41.63 (5.69)
  21-25     36    22.83 (1.54)   14.53 (1.99)   107.28 (9.14)    83.33            44.36 (7.48)     41.62 (6.82)
  26-30     16    28.69 (1.25)   14.94 (2.32)   113.10 (11.37)   93.75            48.14 (6.99)     43.24 (4.44)
  31-40     16    33.88 (2.53)   16.19 (2.29)   114.27 (11.32)   87.50            44.35 (5.64)     41.98 (4.65)
  15-40     98    24.03 (5.95)   14.12 (2.43)   110.19 (10.46)   82.65            45.10 (7.12)     41.95 (5.76)
Males (n = 127)
  15-20     32    17.78 (2.09)   12.22 (1.96)   113.00 (9.72)    84.38            48.36 (5.31)     43.36 (5.89)
  21-25     37    22.57 (1.26)   15.11 (1.74)   112.30 (10.27)   89.19            50.20 (7.78)     47.14 (6.55)
  26-30     32    27.75 (1.57)   15.78 (2.79)   114.38 (10.43)   87.50            51.68 (7.24)     46.88 (7.34)
  31-40     26    34.69 (2.41)   16.69 (3.55)   117.31 (8.21)    92.31            54.32 (7.52)     49.35 (6.34)
  15-40     127   25.15 (6.29)   14.87 (2.99)   113.87 (9.83)    88.19            50.97 (7.26)     46.60 (6.81)
Table A21.30. [FT.27] Alekoumbides et al., 1987: Data for a Sample of Medical and Psychiatric Veterans Administration Patients without a History of Neurological Disorder, Mostly Males and Mostly Inpatients

n     Age           Education    Preferred Hand   Nonpreferred Hand
123   46.9 (17.2)   11.4 (3.2)   43.4 (10.2)      38.5 (9.1)
Table A21.31. [FT.28a] Bornstein et al., 1987b: Data for a Sample of Healthy Canadian Adults (Subsample of Bornstein, 1985)

Group          n     Age          Education    Preferred Hand   Nonpreferred Hand
Total sample   134   62.7 (4.3)   11.7 (2.9)
Males          49                              41.9 (5.8)       37.3 (5.4)
Females        85                              33.0 (6.9)       32.7 (5.9)
Table A21.32. [FT.28b] Bornstein et al., 1987b: Classification Rates Based on Conventional and Optimal Cutoff Scores for Two Hands for Males and Females
Percent Correctly Classified Sample
Cutoff Score
Males preferred Males nonpreferred Females preferred Females nonpreferred
Conventional <50 <44 <46 <40
Control
Brain-Damaged
8 10 12
95 79 95 93
92 86 95 88
29 45 26 48
2
Optimal Males preferred Males nonpreferred Females preferred Females nonpreferred
≤32 ≤31 ≤21 ≤26
Table A21.33. [FT.29] Bornstein et al., 1987a: Data on Test-Retest Performance after a 3-Week Interval for a Sample of 23 Healthy Adults (9 Males, 14 Females)

Test   Dominant     Nondominant
1      44.8 (6.3)   42.5 (6.7)
2      43.5 (7.1)   43.0 (7.1)

Table A21.34. [FT.30] Russell, 1987: Data for Patients Seen in Veterans Administration Medical Centers who Were Suspected of Having a Neurological Condition but who had Negative Neurological Findings

n     Gender       Age             Education      Race*                WAIS FSIQ   Average Number of Taps for Both Hands
155   148 M, 7 F   46.19 (12.86)   12.29 (3.00)   147 C, 8 A, 0 N      111.9       48.48 (5.70)

*C, Caucasian; A, African American; N, other.
Table A21.35. [FT.31] Thompson et al., 1987: Percent of Normal Participants* Scoring in the Lateralized Dysfunction Range

Groups        n     Dominant Hemisphere Dysfunction   Nondominant Hemisphere Dysfunction   Intermanual Percent Difference Scores
All right     167   10.18                             17.96                                10.8 (9.4)
Mixed right   226   12.05                             10.27                                8.3 (8.3)
Left          33    18.18                             0.00                                 5.2 (5.9)
Total         426   11.79                             12.50                                9.0 (8.7)

*Mean age = 40.59 (18.27), mean education = 13.15 (3.49) years.

Table A21.36. [FT.32] van den Burg et al., 1987: Data for Control Group Collected in Northern Holland

n    Gender        Age           Total Score*
40   16 M, 24 F    37.4 (11.9)   339 (38)

*Total score represents number of taps over three 10-second trials for both hands.

Table A21.37. [FT.33] Bornstein and Suga, 1988: Data for a Sample of Healthy Older Canadian Volunteers Stratified by Education

Education (years)   Age*   Number of Males   Number of Females   Preferred Hand   Nonpreferred Hand
5-10                62.3   17                29                  34.1 (7.1)       32.8 (5.9)
11-12               62.9   16                28                  36.9 (7.9)       35.2 (5.2)
>12                 63.0   16                28                  37.4 (8.1)       35.4 (7.0)

*Age range for the sample is 55-70 years.
Table A21.38. [FT.34] Ardila and Rosselli, 1989: Data for Healthy Older Adults Collected in Bogota, Colombia

Age Group   Education   Preferred Hand   Nonpreferred Hand
55-60       0-5         40.9             37.2
            6-12        44.4             39.2
            >12         48.1             46.3
61-65       0-5         39.7             36.2
            6-12        43.3             39.9
            >12         41.7             39.6
66-70       0-5         32.7             31.6
            6-12        39.8             37.0
            >12         40.0             37.5
71-75       0-5         32.4             29.7
            6-12        36.2             35.2
            >12         39.4             36.0
>75         0-5         26.2             ~.6
            6-12        30.0             7.7
            >12         33.5             ~1.3

Table A21.40. [FT.36b] Ruff and Parker, 1993: Data for a Sample of Healthy Adults* Stratified by Age and Gender

Gender/Age Group   n     Dominant Hand M (SD)   Nondominant Hand M (SD)
Women
  16-24            45    49.5 (5.1)             45.6 (5.1)
  25-39            45    49.0 (4.1)             44.6 (4.6)
  40-54            44    47.0 (5.6)             43.5 (5.2)
  55-70            45    45.7 (5.5)             40.4 (5.2)
  Total            179   47.8 (5.3)             43.5 (5.4)
Men
  16-24            45    52.9 (5.1)             48.2 (4.4)
  25-39            44    52.7 (6.8)             48.7 (5.7)
  40-54            45    54.3 (5.7)             48.9 (5.8)
  55-70            45    53.5 (6.4)             48.3 (5.0)
  Total            179   53.4 (6.0)             48.5 (5.2)
Whole sample
  Total            358   50.6 (6.3)             46.0 (5.9)

*Education range 7-22 years.
Table A21.39. [FT.36a] Ruff and Parke,, 1993: Data for the Left Hand-Dominant Satfple of Healthy Participants · Men
!Women
17 37.9 (18.0) 13.7 (2.8)
18 38.7 . (16.1) 14.1 ' (2.6)
Dominant hand
53.1 (5.8)
44.7 (4.5)
Nondominant hand
46.7 (13.2)
: 41.9 (6.2)
n Age Education
Table A21.41. [FT.37] Russell and Starkey, 1993: Data for a Sample of Veterans Administration Patients Without Central Nervous System Pathology, Stratified by Gender
Gender | n | Age | Education | Race* | Dominant Hand | Nondominant Hand
Male | 176 | 45.1 (13.0) | 12.5 (2.8) | 165 W, 11 B | 45.2 (8.9) | 39.5 (7.9)
Female | 24 | 40.7 (15.3) | 14.5 (2.6) | 23 W, 1 B | 44.2 (9.2) | 38.8 (8.3)
*W, white; B, black. Material from the manual for the Halstead-Russell Neuropsychological Evaluation System-Revised (HRNES-R) copyright © 1993, 2001 by Western Psychological Services. Reprinted by permission of the publisher, Western Psychological Services, 12031 Wilshire Boulevard, Los Angeles, California, 90025, U.S.A. Not to be reprinted in whole or in part for any additional purpose without the expressed, written permission of the publisher. All rights reserved.
Table A21.42. [FT.38] Dikmen et al., 1999: Test-Retest Data for Normal and Neurologically Stable Adults*
n | Age | Education | M/F Ratio | WAIS FSIQ† | Test-Retest Interval (Months)
384 | 34.2 (16.7) | 12.1 (2.6) | 66/34% | 108.8 (12.3) | 9.1 (3.0)
Hand | Time 1 | Time 2
Dominant Hand | 50.88 (6.59) | 51.36 (6.46)
Nondominant Hand | 47.02 (6.39) | 47.87 (6.47)
*A number of participants had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury.
†WAIS FSIQ (Wechsler, 1955).
Table A21.43. [FT.39] McCurry et al., 2001: Data* for a Sample of Nondemented Japanese American Elderly
Age Group | n | Mean Age | Education | % Male | % Right-Handed | Dominant Hand | Nondominant Hand
70-79 | 120 | 74.6 (2.7) | 11.7 (2.9) | 54.2 | 94.4 | 42.9 (8.1) | 39.4 (6.1)
80-101 | 81 | 87.0 (5.1) | 10.0 (2.7) | 28.4 | 93.4 | 36.4 (9.2) | 34.2 (7.7)
*Test scores are weighted.
Table A21.44. [FT.40] Sackellares and Sackellares, 2001: Data for the Control Group*
Handedness | n | Dominant Hand | Nondominant Hand | Asymmetry Index
Right | 28 | 53.83 (7.55) | 47.67 (5.56) | 10.70 (8.97)
Left | 12 | 53.43 (4.94) | 48.71 (6.31) | 8.76 (9.45)
*Mean age = 33.2, range 18-50 years.
Table A21.45. [FT.41] Prigatano and Borgaro, 2003: Data for the Control Group
n | Age | % Male | Education | % Right-Handed | Dominant Hand | Nondominant Hand
15 | 33.6 (9.61) | 46.7 | 12.86 (1.19) | 100 | 51.49 (4.33) | 47.32 (3.75)
Appendix 21m: Meta-Analysis Tables for the Finger Tapping Test (FTT)
Table A21m.1. Results of the Meta-Analysis and Predicted Scores for the FTT: Males, Dominant Hand (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 8
Years of publication: 1974-1993
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 20
Total number of participants: 963
Variable | n* | x̄† | SD† | Range
Sample size: Mean | 20 | 37.76 | 30.69 | 12-175
Age: Mean | 20 | 49.33 | 19.55 | 19-77
Age: SD | 20 | 3.99 | 2.58 | 0.5-9
Education: Mean | 6 | 12.08 | 1.48 | 10.5-14
Education: SD | 4 | 2.48 | 1.14 | 0.5-3.3
Test score means: Combined mean | 20 | 51.09 | 3.47 | 44.5-56.0
Test score means: Combined SD | 20 | 6.50 | 1.68 | 3.7-8.9
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
(continued)
APPENDIX 21M
Table A21m.1. (Contd.)
Predicted number of taps averaged over five trials and SDs, per age group (FTT, males, dominant hand)*
Age Range | Predicted Score | 95% CI Lower Band | 95% CI Upper Band | Predicted SD | 95% CI Lower Band | 95% CI Upper Band
20-24 | 54.41 | 53.20 | 55.[?] | 4.60 | 3.98 | 5.23
25-29 | 54.17 | 53.55 | 54.1[?] | 4.96 | 4.41 | 5.50
30-34 | 53.83 | 53.16 | 54.[?] | 5.31 | 4.84 | 5.78
35-39 | 53.38 | 52.41 | 54.[?] | 5.66 | 5.26 | 6.07
40-44 | 52.81 | 51.61 | 54.[?] | 6.01 | 5.67 | 6.36
45-49 | 52.15 | 50.79 | 53.5[?] | 6.37 | 6.06 | 6.68
50-54 | 51.37 | 49.88 | 52.8[?] | 6.71 | 6.43 | 7.01
55-59 | 50.48 | 48.76 | 52.2[?] | 7.07 | 6.77 | 7.38
60-64 | 49.49 | 47.32 | 51.6[?] | 7.43 | 7.08 | 7.77
65-69 | 48.38 | 45.50 | 51.2[?] | 7.78 | 7.38 | 8.18
70-74 | 47.17 | 43.29 | 51.0[?] | 8.13 | 7.66 | 8.60
*Based on the equations:
Predicted test score = 54.11232 + 0.0621745 × age - 0.0021787 × age²
Predicted SD = 3.013817 + 0.0706054 × age
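The two prediction equations above can be evaluated directly. The following is a minimal sketch, not part of the original tables: the function name is hypothetical, and the choice to evaluate each 5-year band at its midpoint (for example 22.5 for ages 20-24) is an assumption, adopted because it appears to reproduce the tabled values to two decimals.

```python
def predicted_ftt_male_dominant(age):
    """Return (predicted mean taps, predicted SD) for FTT, males, dominant hand.

    Coefficients are copied from the equations printed under Table A21m.1.
    """
    mean = 54.11232 + 0.0621745 * age - 0.0021787 * age ** 2
    sd = 3.013817 + 0.0706054 * age
    return mean, sd


# Assumption: each age band is evaluated at its midpoint (lower bound + 2.5).
for lower in range(20, 75, 5):
    midpoint = lower + 2.5
    mean, sd = predicted_ftt_male_dominant(midpoint)
    print(f"{lower}-{lower + 4}: predicted score {mean:.2f}, predicted SD {sd:.2f}")
```

Evaluating at the band midpoint rather than at an individual's exact age mirrors how the table is laid out; for an individual case, the exact age could be passed instead.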
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 20
Number of clusters: 8
R²: 0.686
F: F(2,7) = 6.15, p < 0.023
Term | Coefficient | SE | t | p | 95% CI
Age | 0.0621745 | 0.186 | 0.33* | 0.748* | -0.378 to 0.503
Age² | -0.0021787 | 0.002 | -1.02 | 0.341 | -0.007 to 0.003
Constant | 54.11232 | 3.654 | 14.81 | 0.000 | 45.47 to 62.75
*Significance test for age centered (sample means - aggregate mean): t = -3.41, p = 0.011.
Prediction
Predicted age range: 20-74 years
Mean predicted score: 51.61 (2.45)
SEe: 0.84
95% CI: 49.95-53.26
Table A21m.1. (Contd.)
Figure A21m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the FTT, Male, Dominant Hand (horizontal axis: age). The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 53.478
Pooled estimates for random effect: 52.789
Q(df), p: Q(19) = 85.19, p < 0.000
Moment-based estimate of between-study variance: 2.319
Tests for model fit: addition of a quadratic term
Model | R² | Adjusted R² | BIC | BIC'
Linear | 0.656 | 0.637 | 30.215 | -18.353
Quadratic | 0.686 | 0.649 | 31.374 | -17.194
BIC' difference of 1.159 provides weak support for the linear model.
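The model comparison above can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, and the cut-offs used to label the strength of support (below 2 weak, 2 to 6 positive, 6 to 10 strong, above 10 very strong) are an assumption based on commonly cited conventions for BIC differences; the table itself reports only the difference and its verbal label.

```python
def bic_support(bic_prime_linear, bic_prime_quadratic):
    """Compare two BIC' values; the smaller (more negative) value is preferred.

    Returns (preferred model, absolute difference, verbal grade).
    The grade thresholds are an assumed convention, not stated in the table.
    """
    diff = abs(bic_prime_linear - bic_prime_quadratic)
    preferred = "linear" if bic_prime_linear < bic_prime_quadratic else "quadratic"
    if diff < 2:
        grade = "weak"
    elif diff < 6:
        grade = "positive"
    elif diff < 10:
        grade = "strong"
    else:
        grade = "very strong"
    return preferred, diff, grade


# BIC' values from the table above (FTT, males, dominant hand).
preferred, diff, grade = bic_support(-18.353, -17.194)
print(f"{grade} support for the {preferred} model (difference {diff:.3f})")
```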
Tests for parameter specifications
Normality of the residuals, Shapiro-Wilk W test: W = 0.964, p = 0.629
Homoscedasticity, White's general test: 9.223, p = 0.056
(continued)
Table A21m.1. (Contd.)
Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (linear)
Number of observations: 20
Number of clusters: 8
R²: 0.677
F: F(1,7) = 56.39, p < 0.0001
Term | Coefficient | SE | t | p | 95% CI
Age | 0.0706054 | 0.009 | 7.51 | 0.000 | 0.048 to 0.093
Constant | 3.013817 | 0.516 | 5.84 | 0.001 | 1.794 to 4.233
Prediction
Mean predicted SD: 6.37 (1.17)
SEe: 0.21
95% CI: 5.96-6.78
Effects of demographic variables
Education: Amount of data on education available in the literature was not sufficient for the analyses.
Table A21m.2. Results of the Meta-Analysis and Predicted Scores for the FTT: Males, Nondominant Hand (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 7
Years of publication: 1974-1993
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 19
Total number of participants: 933
Variable | n* | x̄† | SD† | Range
Sample size: Mean | 19 | 38.74 | 32.60 | 12-175
Age: Mean | 19 | 49.92 | 19.25 | 20-77
Age: SD | 19 | 4.07 | 2.51 | 1.5-9
Education: Mean | 5 | 12.06 | 1.55 | 10.5-14
Education: SD | 3 | 2.80 | 0.67 | 2.0-3.3
Test score means: Combined mean | 19 | 46.44 | 3.17 | 41.1-51.4
Test score means: Combined SD | 19 | 6.08 | 1.42 | 4.1-8.4
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Table A21m.2. (Contd.)
Predicted number of taps averaged over five trials and SDs, per age group (FTT, males, nondominant hand)*
Age Range | Predicted Score | 95% CI Lower Band | 95% CI Upper Band | Predicted SD | 95% CI Lower Band | 95% CI Upper Band
20-24 | 50.00 | 49.12 | 50.88 | 4.84 | 4.20 | 5.48
25-29 | 49.35 | 48.61 | 50.10 | 4.86 | 4.35 | 5.37
30-34 | 48.71 | 47.97 | 49.44 | 4.94 | 4.44 | 5.45
35-39 | 48.06 | 47.19 | 48.93 | 5.08 | 4.52 | 5.65
40-44 | 47.41 | 46.32 | 48.50 | 5.28 | 4.65 | 5.91
45-49 | 46.76 | 45.40 | 48.11 | 5.53 | 4.86 | 6.20
50-54 | 46.11 | 44.47 | 47.75 | 5.85 | 5.17 | 6.52
55-59 | 45.46 | 43.52 | 47.40 | 6.22 | 5.57 | 6.86
60-64 | 44.81 | 42.56 | 47.06 | 6.65 | 6.07 | 7.22
65-69 | 44.16 | 41.60 | 46.73 | 7.13 | 6.65 | 7.62
70-74 | 43.51 | 40.63 | 46.39 | 7.68 | 7.27 | 8.09
*Based on the equations:
Predicted test score = 52.92403 - 0.1298114 × age
Predicted SD = 5.45396 - 0.0534967 × age + 0.0011612 × age²
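One way such predicted values might be used is to express an observed score relative to the age-predicted mean and SD. The sketch below is an illustration under stated assumptions: standardizing as a z score is a common convention rather than a procedure prescribed by the table, the function name is hypothetical, and the observed score of 40 taps is an invented example value.

```python
def z_score_ftt_male_nondominant(observed_taps, age):
    """Standardize an observed FTT score (males, nondominant hand).

    Uses the predicted mean and SD equations printed under Table A21m.2.
    """
    predicted_mean = 52.92403 - 0.1298114 * age
    predicted_sd = 5.45396 - 0.0534967 * age + 0.0011612 * age ** 2
    return (observed_taps - predicted_mean) / predicted_sd


# Hypothetical case: 40 taps, evaluated at the midpoint of the 55-59 band.
z = z_score_ftt_male_nondominant(observed_taps=40, age=57.5)
print(f"z = {z:.2f}")
```

A negative z indicates a score below the age-predicted mean; how far below counts as clinically meaningful is a judgment the tables do not settle.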
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 19
Number of clusters: 7
R²: 0.622
F: F(1,6) = 14.95, p < 0.008
Term | Coefficient | SE | t | p | 95% CI
Age | -0.1298114 | 0.034 | -3.87 | 0.008 | -0.212 to -0.048
Constant | 52.92403 | 1.076 | 49.18 | 0.000 | 50.29 to 55.56
Prediction
Predicted age range: 20-74 years
Mean predicted score: 46.76 (2.15)
SEe: 0.79
95% CI: 45.22-48.30
(continued)
Table A21m.2. (Contd.)
Figure A21m.2. A scatterplot illustrating the dispersion of the data points around the regression line for the FTT, Male, Nondominant Hand (horizontal axis: age). The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 48.370
Pooled estimates for random effect: 47.679
Q(df), p: Q(18) = 105.34, p < 0.000
Moment-based estimate of between-study variance: 3.067
Tests for model fit: addition of a quadratic term
Model | R² | Adjusted R² | BIC | BIC'
Linear | 0.622 | 0.600 | 28.159 | -15.550
Quadratic | 0.630 | 0.584 | 30.687 | -13.022
BIC' difference of 2.528 provides positive support for the linear model.
Tests for parameter specifications
Normality of the residuals, Shapiro-Wilk W test: W = 0.[?], p = 0.538
Homoscedasticity, White's general test: 6.616, p = 0.037
Table A21m.2. (Contd.)
Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations: 19
Number of clusters: 7
R²: 0.665
F(df), p: F(2,6) = 47.21, p < 0.0002
Term | Coefficient | SE | t | p | 95% CI
Age | -0.0534967 | 0.055 | -0.96* | 0.375* | -0.190 to 0.083
Age² | 0.0011612 | 0.001 | 2.17 | 0.073 | -0.000 to 0.002
Constant | 5.45396 | 1.204 | 4.53 | 0.004 | 2.506 to 8.402
*Significance test for age centered (sample means - aggregate mean): t = 8.58, p < 0.000.
Prediction
Mean predicted SD: 5.82 (0.98)
SEe: 0.29
95% CI: 5.25-6.40
Effects of demographic variables
Education: Amount of data on education available in the literature was not sufficient for the analyses.
Table A21m.3. Results of the Meta-Analysis and Predicted Scores for the FTT: Females, Dominant Hand (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 4
Years of publication: 1978-1993
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 10
Total number of participants: 560
Variable | n* | x̄† | SD† | Range
Sample size: Mean | 10 | 43.59 | 41.98 | 19-221
Age: Mean | 10 | 44.91 | 21.48 | 19-77
Age: SD | 10 | 4.24 | 2.43 | 0.5-8.4
Education: Mean | 2 | 12.49 | 0.71 | 12-13
Education: SD | 2 | 1.27 | 1.06 | 0.5-2
Test score means: Combined mean | 10 | 47.57 | 2.77 | 43.1-51.2
Test score means: Combined SD | 10 | 5.45 | 0.93 | 4.0-6.7
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Table A21m.3. (Contd.)
Predicted number of taps averaged over five trials and SDs, per age group (FTT, females, dominant hand)*
Age Range | Predicted Score | 95% CI Lower Band | 95% CI Upper Band | Predicted SD | 95% CI Lower Band | 95% CI Upper Band
20-24 | 50.07 | 49.06 | 51.08 | 4.66 | 4.12 | 5.20
25-29 | 49.74 | 48.79 | 50.68 | 4.84 | 4.33 | 5.34
30-34 | 49.34 | 48.45 | 50.22 | 5.01 | 4.54 | 5.49
35-39 | 48.88 | 48.05 | 49.71 | 5.19 | 4.74 | 5.64
40-44 | 48.36 | 47.58 | 49.14 | 5.36 | 4.93 | 5.80
45-49 | 47.78 | 47.07 | 48.49 | 5.54 | 5.11 | 5.97
50-54 | 47.13 | 46.51 | 47.76 | 5.71 | 5.29 | 6.14
55-59 | 46.43 | 45.90 | 46.96 | 5.89 | 5.46 | 6.33
60-64 | 45.66 | 45.25 | 46.0[?] | 6.07 | 5.62 | 6.52
65-69 | 44.83 | 44.56 | 45.11 | 6.24 | 5.77 | 6.71
70-74 | 43.94 | 43.79 | 44.09 | 6.41 | 5.92 | 6.92
*Based on the equations:
Predicted test score = 50.80925 - 0.0051309 × age - 0.0012358 × age²
Predicted SD = 3.869681 + 0.0351682 × age
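Because this model is quadratic in age, it can be useful to check where the fitted curve turns. The sketch below is an illustration only, with hypothetical function names; it locates the vertex of the parabola and checks the sign of the slope across the tabled age range, using the coefficients printed above.

```python
def vertex_age(linear_coef, quadratic_coef):
    """Age at which a + b*age + c*age**2 reaches its turning point: -b / (2c)."""
    return -linear_coef / (2 * quadratic_coef)


def slope_at(age, linear_coef, quadratic_coef):
    """First derivative of the quadratic with respect to age: b + 2*c*age."""
    return linear_coef + 2 * quadratic_coef * age


# Coefficients for FTT, females, dominant hand (Table A21m.3).
b, c = -0.0051309, -0.0012358
turning_point = vertex_age(b, c)
print(f"turning point at age {turning_point:.1f}")
print(f"slope at 20: {slope_at(20, b, c):.4f}, slope at 74: {slope_at(74, b, c):.4f}")
```

With these coefficients the turning point falls below the youngest tabled age and the slope is negative at both ends of the range, which is consistent with the predicted scores in the table decreasing in every successive age band.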
Significance tests for the regression on test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 10
Number of clusters: 4
R²: 0.937
F: F(2,3) = [?].78, p < 0.003
Term | Coefficient | SE | t | p | 95% CI
Age | -0.0051309 | 0.031 | -0.17* | 0.878* | -0.103 to 0.093
Age² | -0.0012358 | 0.000 | -3.98 | 0.028 | -0.002 to -0.000
Constant | 50.80925 | 0.846 | 60.03 | 0.000 | 48.12 to 53.50
*Significance test for age centered (sample means - aggregate mean): t = -10.64, p < 0.002.
Prediction
Predicted age range: 20-74 years
Mean predicted score: 47.47 (2.05)
SEe: 0.33
95% CI: 46.82-48.11
Table A21m.3. (Contd.)
Figure A21m.3. A scatterplot illustrating the dispersion of the data points around the regression line for the FTT, Female, Dominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 48.571
Pooled estimates for random effect: 48.269
Q(df), p: Q(9) = 46.94, p < 0.000
Moment-based estimate of between-study variance: 2.298
Tests for model fit: addition of a quadratic term
Model | R² | Adjusted R² | BIC | BIC'
Linear | 0.912 | 0.901 | 4.998 | -22.009
Quadratic | 0.937 | 0.919 | 3.934 | -23.072
BIC' difference of 1.064 provides weak support for the quadratic model.
Tests for parameter specifications
Normality of the residuals, Shapiro-Wilk W test: W = 0.943, p = 0.590
Homoscedasticity, White's general test: 4.980, p = 0.289
(continued)
Table A21m.3. (Contd.)
Significance tests for the regression on SDs
Ordinary least-squares regression of SDs on age (linear)
Number of observations: 10
Number of clusters: 4
R²: 0.663
F: F(1,3) = 34.74, p < 0.009
Term | Coefficient | SE | t | p | 95% CI
Age | 0.0351682 | 0.006 | 5.89 | 0.010 | 0.016 to 0.054
Constant | 3.869681 | 0.371 | 10.42 | 0.002 | 2.688 to 5.052
Prediction
Mean predicted SD: 5.54 (0.58)
SEe: 0.24
95% CI: 5.08-6.00
Effects of demographic variables
Education: Amount of data on education available in the literature was not sufficient for the analyses.
Table A21m.4. Results of the Meta-Analysis and Predicted Scores for the FTT: Females, Nondominant Hand (Relevant values are weighted on the standard error for the test mean)
Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 3
Years of publication: 1978-1993
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 9
Total number of participants: 530
Variable | n* | x̄† | SD† | Range
Sample size: Mean | 9 | 44.78 | 44.76 | 19-221
Age: Mean | 9 | [?].40 | 21.11 | 20-77
Age: SD | 9 | [?].69 | 2.17 | 2-8.4
Education: Mean | 1 | 1[?].00 | |
Education: SD | 1 | [?].00 | |
Test score means: Combined mean | 9 | 43.65 | 2.46 | 40.4-48.0
Test score means: Combined SD | 9 | 5.57 | 1.20 | 3.7-7.4
*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Table A21m.4. (Contd.)
Predicted number of taps averaged over five trials and SDs, per age group (FTT, females, nondominant hand)*
Age Range | Predicted Score | 95% CI Lower Band | 95% CI Upper Band | Predicted SD | 95% CI Lower Band | 95% CI Upper Band
20-24 | 46.31 | 44.42 | 48.21 | 4.74 | 3.72 | 5.76
25-29 | 45.80 | 44.07 | 47.53 | 4.69 | 3.86 | 5.52
30-34 | 45.28 | 43.71 | 46.86 | 4.70 | 3.98 | 5.43
35-39 | 44.77 | 43.33 | 46.21 | 4.77 | 4.07 | 5.46
40-44 | 44.26 | 42.94 | 45.57 | 4.89 | 4.18 | 5.60
45-49 | 43.74 | 42.53 | 44.96 | 5.07 | 4.33 | 5.81
50-54 | 43.22 | 42.08 | 44.37 | 5.31 | 4.56 | 6.05
55-59 | 42.71 | 41.60 | 43.82 | 5.61 | 4.88 | 6.33
60-64 | 42.20 | 41.09 | 43.31 | 5.96 | 5.29 | 6.63
65-69 | 41.68 | 40.53 | 42.83 | 6.37 | 5.79 | 6.95
70-74 | 41.17 | 39.95 | 42.39 | 6.84 | 6.40 | 7.28
*Based on the equations:
Predicted test score = 48.62897 - 0.1029087 × age
Predicted SD = 5.68279 - 0.0677816 × age + 0.0011554 × age²
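With the fourth set of equations in place, the predicted FTT means for all four gender-by-hand combinations in Appendix 21m can be collected in one lookup. The sketch below is illustrative only: the dictionary layout and function name are hypothetical, the coefficients are copied from the equations printed under Tables A21m.1 through A21m.4, and evaluating at a band midpoint (42.5 for ages 40-44) is an assumption.

```python
# (intercept, age coefficient, age-squared coefficient) for the predicted mean.
FTT_MEAN_MODELS = {
    ("male", "dominant"): (54.11232, 0.0621745, -0.0021787),
    ("male", "nondominant"): (52.92403, -0.1298114, 0.0),
    ("female", "dominant"): (50.80925, -0.0051309, -0.0012358),
    ("female", "nondominant"): (48.62897, -0.1029087, 0.0),
}


def predicted_ftt_mean(gender, hand, age):
    """Predicted mean number of taps for the given gender, hand, and age."""
    intercept, linear, quadratic = FTT_MEAN_MODELS[(gender, hand)]
    return intercept + linear * age + quadratic * age ** 2


# Predicted dominant-minus-nondominant difference at the 40-44 band midpoint.
for gender in ("male", "female"):
    dominant = predicted_ftt_mean(gender, "dominant", 42.5)
    nondominant = predicted_ftt_mean(gender, "nondominant", 42.5)
    print(f"{gender}: {dominant:.2f} vs {nondominant:.2f} "
          f"(difference {dominant - nondominant:.2f})")
```

Keeping the coefficients in a single table makes it easy to see that the dominant-hand models are quadratic in age while the nondominant-hand models are linear.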
Significance tests for the regression on test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 9
Number of clusters: 3
R²: 0.779
F: F(1,2) = 23.84, p < 0.039
Term | Coefficient | SE | t | p | 95% CI
Age | -0.1029087 | 0.021 | -4.88 | 0.039 | -0.194 to -0.012
Constant | 48.62897 | 1.380 | 35.23 | 0.001 | 42.69 to 54.57
Prediction
Predicted age range: 20-74 years
Mean predicted score: 43.74 (1.71)
SEe: 0.69
95% CI: 42.39-45.10
(continued)
Table A21m.4. (Contd.)
Figure A21m.4. A scatterplot illustrating the dispersion of the data points around the regression line for the FTT, Female, Nondominant Hand (horizontal axis: age). The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 44.481
Pooled estimates for random effect: 44.207
Q(df), p: Q(8) = 55.77, p < 0.000
Moment-based estimate of between-study variance: 3.090
Tests for model fit: addition of a quadratic term
Model | R² | Adjusted R² | BIC | BIC'
Linear | 0.779 | 0.748 | 11.709 | -11.401
Quadratic | 0.779 | 0.706 | 13.905 | -11.402
BIC' difference of 0.001 provides weak support for the quadratic model.
Tests for parameter specifications
Normality of the residuals, Shapiro-Wilk W test: W = 0.959, p = 0.793
Homoscedasticity, White's general test: 0.587, p = 0.746
Table A21m.4. (Contd.)
Significance tests for the regression on SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations: 9
Number of clusters: 3
R²: 0.800
F: [?]
Term | Coefficient | SE | t | p | 95% CI
Age | -0.0677816 | 0.056 | -1.21* | 0.351* | -0.309 to 0.174
Age² | 0.0011554 | 0.000 | 2.39 | 0.139 | -0.001 to 0.003
Constant | 5.68279 | 1.416861 | 4.01 | 0.057 | -0.413 to 11.78
*Significance test for age centered (sample means - aggregate mean): t = 3.78, p = 0.063.
Prediction
Mean predicted SD: 5.36 (0.75)
SEe: 0.37
95% CI: 4.64-6.08
Effects of demographic variables
Education: Amount of data on education available in the literature was not sufficient for the analyses.
Appendix 22: Locator and Data Tables for the Grip Strength Test (Hand Dynamometer)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 22.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A22.1. Locator Table for the Hand Dynamometer
Study | Age* | n | Sample Composition | IQ/Education* | Location
D.1 Matarazzo et al., 1974; page 447; Table A22.2 | 21-28; X = 24 | 29 | Normal young men; patrolmen applicants | 12-16; FSIQ: 118 | Oregon
D.2 Wiens & Matarazzo, 1977; page 447; Table A22.3 | 23.6, 24.8 | 24, 24 | Normal young men divided into 2 groups | FSIQ: 117.5, 118.3 | Oregon
D.3 Dodrill, 1978b; page 447; Table A22.4 | X = 41.1 | 25 | Control group; 20 M, 5 F | X = 10.7 | Washington
D.4 Dodrill, 1979; page 448; Table A22.5 | 27.51, 27.49 | 47, 47 | Nonneurological sample: M, F | 12.47, 12.36 | Washington
D.5 Rounsaville et al., 1982; page 448; Table A22.6 | 24.9 | 29 | CETA workers, 59% M | 11.2 | Massachusetts
D.6 Yeudall et al., 1982; page 448; Table A22.7 | 14.8, 14.5 | 99, 47 | Delinquent and nondelinquent adolescents | FSIQ: 95.3, 117.1 | Alberta, Canada
D.7 Prigatano et al., 1983; page 449; Table A22.8 | 59.6 (9.0) | 25 | Control group of healthy adults; 84% M | 10.5 (3.3) | Oklahoma City and Canada
D.8 Fromm-Auch & Yeudall, 1983; page 449; Table A22.9 | 15-17, 18-23, 24-32, 33-40, 41-64 | 193 | Normal volunteers; 111 M, 82 F; data are partitioned by 5 age groups x gender | Education: 14.8 (3.0); WAIS-R FSIQ: 119.1 (8.8) | Alberta, Canada
D.9 Bornstein, 1985; page 450; Tables A22.10, A22.11 | 20-39, 40-59, 60-69 | 365 | 178 M, 187 F; paid volunteers free of neurological or psychiatric illness; data are presented by age x education x gender | 12.3 (2.7) | Western Canada
D.10 Koffler & Zehler, 1985; page 450; Table A22.12 | 20-29, 30-39, 40-49, 50-59, 60-77 | 206 | Normal sample; 100 M, 106 F; stratified by 5 age groups and gender | |
D.11 Heaton et al., 1985; page 450; Table A22.13 | 32.7 (13.5) | 100 | 79 M, 21 F; controls with no neurological illness, head trauma, or substance abuse | 14.5 (2.84) | Denver
D.12 Kane et al., 1985; page 451; Table A22.14 | 38.9 (11.3) | 46 | Control sample of medical and nonschizophrenic V.A. psychiatric patients; data are reported in T scores | 12.3 (2.6) | Oklahoma City, Pittsburgh
D.13 Heaton et al., 1986; page 451; Table A22.15 | 15-81; 39.3 (17.5); <40, 40-59, ≥60 | 553 | 356 M, 197 F; normal subjects with no history of neurological illness, head trauma, or substance abuse; 7.2% left-handed; data are presented in 3 age and 3 education groups | 0-20; 13.3 (3.4); <12, 12-15, ≥16 | Colorado, California, Wisconsin
D.14 Yeudall et al., 1987; page 451; Table A22.16 | 15-20, 21-25, 26-30, 31-40 | 62, 73, 48, 42 | Normal adults; 127 M, 98 F; data are stratified by 4 age groups x gender | FSIQ: 111.75, 109.79, 113.95, 116.09 | Alberta, Canada
D.15 Thompson et al., 1987; page 452; Table A22.17 | 40.59 (18.27) | 426 | 279 M, 147 F; normal subjects; percent falling in lateralized dysfunction range is presented | 13.15 (3.49) |
D.16 Ernst, 1988; page 453; Table A22.18 | 65-75 | 85 | Normal elderly; 39 M, 46 F | 10.4 (3.1) | Queensland, Australia
D.17 Heaton et al., 1991, 2004; page 453; data are not reproduced in this book | 42.1 (16.8); groups: 20-34, 35-39, 40-44, 45-49, 50-54, 55-59, 60-64, 65-69, 70-74, 75-80 | 486 | Urban and rural volunteers; data collected over 15 years through multicenter collaborative efforts; strict exclusion criteria; 65% M; data are presented in T-score equivalents for M and F separately in 10 age groupings by 6 education groupings; in 2004 edition, age range is expanded to 85 years and data are presented for African American and Caucasian participants separately | 13.6 (3.5); FSIQ: 113.8 (12.3); 6-8, 9-11, 12, 13-15, 16-17, ≥18 | California, Washington, Texas, Oklahoma, Wisconsin, Illinois, Michigan, New York, Virginia, Massachusetts, Canada
D.18 Russell & Starkey, 1993; page 454; Table A22.19 | 45.0 (12.9), 40.7 (15.3) | 175 M, 24 F | Norms are collected from standardization sample for HRNES manual; controls are V.A. patients without CNS pathology | 12.5 (2.8), 14.5 (2.6) | Cincinnati, Miami
D.19 Tremont et al., 1998; page 455; Table A22.20 | 16-74 | 157 | Patients referred for evaluation but determined to be neurologically normal; 71 M, 86 F; data for dominant hand partitioned by FSIQ (3 levels) are presented | 11.53 (2.76), 12.62 (2.76), 15.63 (3.37) for 3 IQ levels | Oklahoma
D.20 Dikmen et al., 1999; page 455; Table A22.21 | 34.2 (16.7) | 384 | Normal and neurologically stable adults; some had neurological conditions; 66% M; data on test-retest reliabilities and practice effect are provided | 12.1 (2.6) | Washington, Colorado, California
D.21 Triggs et al., 2000; page 456; Table A22.22 | 37 (9); 21-57 | 60 | 30 right-handed, 30 left-handed healthy volunteers; data are presented for 2 hands by 2 handedness groups | At least HS |
D.22 Christensen et al., 2001; page 456; Table A22.23 | 70-74, 75-79, 80-84, ≥85 | 199, 120, 41, 14 | Healthy elderly, 37%-52% M; data are presented as the mean for 2 hands for 4 age groups | 11.78, 11.34, 11.00, 10.79 | Australia
*Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
APPENDIX 22
Table A22.2. [D.1] Matarazzo et al., 1974: Data for Men Who Met Selection Criteria for the Portland Police Department
n | Age | Education | WAIS FSIQ | Preferred Hand: Test | Preferred Hand: Retest | Nonpreferred Hand: Test | Nonpreferred Hand: Retest
29 | 24 | 14 | 118 | 56.84 (7.66) | 55.16 (8.48) | 53.59 (6.14) | 51.74 (7.20)
Table A22.3. [D.2] Wiens and Matarazzo, 1977: Data for Male Applicants to a Patrolman Program
Group | n | WAIS FSIQ | Age | Education | Preferred Hand | Nonpreferred Hand
1 | 24 | 117.5 | 23.6 | 13.7 | 58.1 (7.3) | 53.4 (5.5)
2 | 24 | 118.3 | 24.8 | 14.0 | 57.5 (6.3) | 53.9 (6.2)
Table A22.4. [D.3] Dodrill, 1978b: Data for Control Participants
n | Age | Education | Gender | % Right-Handed | Right Hand | Left Hand
25 | 41.1 | 10.7 | 20 M, 5 F | 100 | 48.1 (13.4) | 44.9 (12.2)

Table A22.5. [D.4] Dodrill, 1979: Data for Control Participants
Gender | n | Age | Education | SES* | Dominant Hand
Males | 47 | 27.51 | 12.47 | 49.45 | 54.13 (9.95)
Females | 47 | 27.49 | 12.36 | 47.41 | 34.00 (5.96)
*SES, socioeconomic status.
Table A22.6. [D.5] Rounsaville et al., 1982: Data for a Sample of Comprehensive Employment Training Act (CETA) Workers
n | % Male | Education | % Right-Handed | Age | Dominant Hand | Nondominant Hand
29 | 59 | 11.2 | 90 | 24.9 | 41.05 | 38.18
Table A22.7. [D.6] Yeudall et al., 1982: Data for Delinquent and Nondelinquent Canadian Adolescents
Group | n | Age | WAIS FSIQ | Number of Males | Number of Females | Preferred Hand | Nonpreferred Hand
Delinquent | 99 | 14.8 | 95.3 | 64 | 35 | 37.0 (11.0) | 35.5 (10.0)
Nondelinquent | 47 | 14.5 | 117.1 | 29 | 18 | 33.1 (7.8) | 30.3 (6.9)
Table A22.8. [D.7] Prigatano et al., 1983: Data for the Control Group
n | Age | Education | % Male | % Right-Handed | WAIS FSIQ | Dominant Hand
25 | 59.6 (9.0) | 10.5 (3.3) | 84 | 96 | 112.0 (11.0) | 45.1 (11.4)
Table A22.9. [D.8] Fromm-Auch and Yeudall, 1983: Data for a Sample of Healthy Canadian Adults Stratified by Age x Gender*
Gender/Age Group | n | Preferred Hand (SD) | Nonpreferred Hand (SD)
Males 15-17 | 17 | 38.0 (8.4) | 35.8 (9.6)
Males 18-23 | 43 | 49.7 (9.7) | 46.6 (9.9)
Males 24-32 | 31 | 51.8 (8.1) | 49.6 (7.2)
Males 33-40 | 12 | 52.9 (8.3) | 51.2 (7.9)
Males 41-64 | 4 | 44.5 (10.9) | 47.9 (11.9)
Females 15-17 | 15 | 28.1 (5.0) | 26.3 (5.2)
Females 18-23 | 29 | 28.8 (7.8) | 26.4 (6.2)
Females 24-32 | 24 | 34.4 (9.2) | 30.2 (6.8)
Females 33-40 | 6 | 27.7 (3.2) | 28.6 (3.1)
Females 41-64 | 6 | 28.0 (6.2) | 24.1 (6.8)
*Mean education 14.8 years, mean full-scale IQ 119.1.
Table A22.10. [D.9a] Bornstein, 1985: Data for a Sample of Healthy Canadian Adults Stratified by Age, Education, and Gender
By age
Age Group | Education | Number of Males | Number of Females | Preferred Hand | Nonpreferred Hand
20-39 | 13.0 (2.3) | 107 | 64 | 43.1 (12.1) | 40.1 (11.4)
40-59 | 11.9 (2.8) | 31 | 66 | 34.0 (9.7) | 31.5 (10.4)
60-69 | 11.8 (2.9) | 40 | 57 | 32.0 (10.0) | 29.4 (9.2)
By education
Education | Age | Number of Males | Number of Females | Preferred Hand | Nonpreferred Hand
<HS | 48.5 (16.6) | 51 | 57 | 35.7 (11.9) | 33.7 (11.9)
≥HS | 41.1 (16.8) | 127 | 130 | 38.8 (12.1) | 35.8 (11.5)
By gender
Gender | n | Age | Education | Preferred Hand | Nonpreferred Hand
Males | 178 | 39.2 (17.2) | 12.4 (2.9) | 47.5 (9.0) | 44.3 (8.7)
Females | 187 | 47.3 (16.1) | 12.2 (2.5) | 28.8 (6.2) | 26.5 (6.2)
Table A22.11. [D.9b] Bornstein, 1985: Data Presented by Age x Education x Gender
Preferred hand
Age Group | Male <HS: n | Male <HS: M (SD) | Male ≥HS: n | Male ≥HS: M (SD) | Female <HS: n | Female <HS: M (SD) | Female ≥HS: n | Female ≥HS: M (SD)
20-39 | 21 | 50.8 (11.5) | 86 | 49.9 (8.4) | 13 | 32.7 (8.7) | 50 | 31.0 (5.4)
40-59 | 13 | 39.8 (6.0) | 16 | 48.2 (7.3) | 22 | 27.7 (5.9) | 43 | 29.8 (5.8)
60-69 | 17 | 38.7 (5.9) | 22 | 44.5 (5.6) | 22 | 25.6 (5.3) | 34 | 25.0 (4.9)
Nonpreferred hand
Age Group | Male <HS: M (SD) | Male ≥HS: M (SD) | Female <HS: M (SD) | Female ≥HS: M (SD)
20-39 | 47.7 (11.7) | 46.4 (7.6) | 31.2 (8.0) | 28.7 (5.0)
40-59 | 38.2 (6.5) | 46.4 (9.1) | 24.9 (6.7) | 26.9 (5.4)
60-69 | 37.2 (5.4) | 39.3 (5.5) | 24.0 (6.0) | 22.8 (4.8)

Table A22.12. [D.10] Koffler and Zehler, 1985: Data for Healthy Adults Stratified by Age x Gender
Age/Gender | n | Dominant Hand | Nondominant Hand
20-29 Men | 41 | 53.8 (7.8) | 50.3 (7.4)
20-29 Women | 39 | 33.3 (4.7) | 30.5 (4.4)
30-39 Men | 23 | 55.4 (7.1) | 53.3 (7.4)
30-39 Women | 25 | 33.7 (6.2) | 31.1 (5.6)
40-49 Men | 13 | 50.2 (5.3) | 49.2 (7.8)
40-49 Women | 14 | 30.7 (5.5) | 28.7 (4.3)
50-59 Men | 12 | 44.3 (5.4) | 44.8 (5.8)
50-59 Women | 13 | 28.8 (3.6) | 25.3 (3.8)
60-77 Men | 11 | 45.5 (5.4) | 41.3 (6.7)
60-77 Women | 15 | 28.3 (6.3) | 23.5 (5.2)
Table A22.13. [D.11] Heaton et al., 1985: Data for a Control Sample
n | Age | Education | M/F Ratio | Kilograms for Both Hands
100 | 32.7 (13.5) | 14.15 (2.84) | 79/21 | 88.99 (21.42)
Table A22.14. [D.12] Kane et al., 1985: Data* for the Control Group Consisting of Medical and Nonschizophrenic Veterans Administration Psychiatric Patients
n | Age | Education | Dominant Hand | Nondominant Hand
46 | 38.9 (11.3) | 12.3 (2.6) | 53.7 (6.5) | 55.2 (7.1)
*Data are reported in T scores.
Table A22.15. [D.13] Heaton et al., 1986: Data for a Sample of Healthy Adults Stratified by Age and Education
Group | n | Dominant Hand | Nondominant Hand
Age groups
<40 | 319 | 51.4 | 47.8
40-59 | 134 | 51.7 | 46.8
≥60 | 100 | 44.3 | 40.5
Education groups
<12 | 132 | 47.1 | 43.4
12-15 | 249 | 51.1 | 47.1
≥16 | 172 | 51.5 | 47.1
Table A22.16. [D.14] Yeudall et al., 1987: Data for Healthy Canadian Adults Stratified by Age for the Entire Sample and for Males and Females Separately
Age Group | n | Age | Education | WAIS-R FSIQ | % Right-Handed | Preferred Hand | Nonpreferred Hand
Entire sample (n = 225)
15-20 | 62 | 17.76 (1.96) | 12.16 (1.75) | 111.75 (10.16) | 79.03 | 37.22 (10.68) | 34.57 (10.20)
21-25 | 73 | 22.70 (1.40) | 14.82 (1.88) | 109.79 (9.97) | 86.30 | 40.33 (14.33) | 37.74 (13.81)
26-30 | 48 | 28.06 (1.52) | 15.50 (2.65) | 113.95 (10.61) | 89.58 | 45.20 (11.59) | 42.19 (11.91)
31-40 | 42 | 34.38 (2.46) | 16.50 (3.11) | 116.09 (9.51) | 90.48 | 45.22 (13.47) | 42.81 (11.60)
15-40 | 225 | 24.66 (6.16) | 14.55 (2.78) | 112.25 (10.25) | 85.78 | 41.42 (13.01) | 38.76 (12.61)
Females (n = 98)
15-20 | 30 | 17.73 (1.84) | 12.10 (1.52) | 110.32 (10.64) | 73.33 | 30.20 (5.56) | 28.07 (4.52)
21-25 | 36 | 22.83 (1.54) | 14.53 (1.99) | 107.28 (9.14) | 83.33 | 29.79 (6.91) | 27.22 (6.[?])
26-30 | 16 | 28.69 (1.25) | 14.94 (2.32) | 113.10 (11.37) | 93.75 | 33.88 (7.65) | 30.25 (6.35)
31-40 | 16 | 33.88 (2.53) | 16.19 (2.29) | 114.27 (11.32) | 87.50 | 32.98 (10.10) | 30.10 (6.62)
15-40 | 98 | 24.03 (5.95) | 14.12 (2.43) | 110.19 (10.46) | 82.65 | 31.09 (7.33) | 28.[?] (5.81)
Males (n = 127)
15-20 | 32 | 17.78 (2.09) | 12.22 (1.96) | 113.00 (9.72) | 84.38 | 43.58 (10.25) | 40.46 (10.37)
21-25 | 37 | 22.57 (1.26) | 15.11 (1.74) | 112.30 (10.27) | 89.19 | 51.49 (11.36) | 48.56 (10.83)
26-30 | 32 | 27.75 (1.57) | 15.78 (2.79) | 114.38 (10.43) | 87.50 | 51.05 (8.55) | 48.35 (9.07)
31-40 | 26 | 34.69 (2.41) | 16.69 (3.55) | 117.31 (8.21) | 92.31 | 52.28 (9.57) | 50.15 (8.74)
15-40 | 127 | 25.15 (6.29) | 14.87 (2.99) | 113.87 (9.83) | 88.19 | 49.49 (10.53) | 46.75 (10.46)
Table A22.17. [D.15] Thompson et al., 1987: Percent of Healthy Participants Scoring in the Lateralized Dysfunction Range
Groups | n | Dominant Hemisphere Dysfunction | Nondominant Hemisphere Dysfunction | Intermanual Percent Difference Scores
All right | 167 | 19.16 | 8.98 | 7.7 (9.8)
Mixed right | 226 | 17.70 | 19.47 | 9.1 (12.5)
Left | 33 | 48.48 | 6.06 | -0.2 (12.0)
Total | 426 | 20.66 | 14.32 | 7.8 (11.7)
Table A22.18. [D.16] Ernst, 1988: Data for Healthy Elderly Australian Volunteers
Group | n | Age | Education | % Right-Handed | Dominant Hand | Nondominant Hand | Dominant/Nondominant Ratio
Total | 85 | 70.0 (2.6) | 10.4 (3.1) | 99 | | | 1.1 (0.2)
Males | 39 | | | | 41.7 (6.2) | 38.5 (5.1) |
Females | 46 | | | | 26.9 (4.0) | 23.1 (4.9) |
Table A22.19. [D.18] Russell and Starkey, 1993: Data for a Sample of Veterans Administration Patients Without Central Nervous System Pathology, Stratified by Gender
Gender | n | Age | Education | Race* | Dominant Hand | Nondominant Hand
Males | 175 | 45.0 (12.9) | 12.5 (2.8) | 164 W, 11 B, 0 N | 45.4 (10.6) | 41.9 (10.3)
Females | 24 | 40.7 (15.3) | 14.5 (2.6) | 23 W, 1 B, 0 N | 27.1 (8.1) | 25.5 (8.1)
*W, white; B, black; N, other. Material from the manual for the Halstead-Russell Neuropsychological Evaluation System-Revised (HRNES-R) copyright © 1993, 2001 by Western Psychological Services. Reprinted by permission of the publisher, Western Psychological Services, 12031 Wilshire Boulevard, Los Angeles, California, 90025, U.S.A. Not to be reprinted in whole or in part for any additional purpose without the expressed, written permission of the publisher. All rights reserved.
Table A22.20. [D.19] Tremont et al., 1998: Data for Dominant Hand in Patients Referred for Neuropsychological Evaluation Who Were Determined to Be Neurologically Normal
Performance Group | n | Age | Education | FSIQ | Grip Strength: Male | Grip Strength: Female
Below average | 35 | 34.03 (13.8) | 11.53 (2.76) | 84.89 (4.84) | 44.54 (11.45) | 27.49 (7.03)
Average | 84 | 40.55 (16.73) | 12.62 (2.76) | 99.15 (8.05) | 43.92 (10.51) | 26.37 (7.64)
Above average | 38 | 41.71 (14.65) | 15.63 (3.37) | 119.92 (7.55) | 50.31 (10.10) | 26.65 (5.29)
Table A22.21. [D.20] Dikmen et al., 1999: Test-Retest Data for Normal and Neurologically Stable Adults*
n | Age | Education | M/F Ratio | WAIS FSIQ† | Test-Retest Interval
384 | 34.2 (16.7) | 12.1 (2.6) | 66/34% | 108.8 (12.3) | 9.1 (3.0)
Hand | Time 1 | Time 2
Dominant Hand | 43.34 (13.33) | 42.41 (13.44)
Nondominant Hand | 40.60 (12.89) | 39.65 (13.19)
*A number of participants had preexisting conditions that might affect test performance, the most significant being alcohol abuse and a significant traumatic brain injury.
†WAIS FSIQ, Wechsler Adult Intelligence Scale full-scale IQ (Wechsler, 1955).
Table A22.22. [D.21] Triggs et al., 2000: Data for Left-Handed and Right-Handed Healthy Volunteers*

Left-Handers (n = 30)     Right-Handers (n = 30)
Left Hand   Right Hand    Left Hand   Right Hand
23 (6)      22 (6)        22 (6)      23 (6)

*Mean age 37 (9) years, equal number of males and females in two groups.
Table A22.23. [D.22] Christensen et al., 2001: Data for a Sample of Healthy Australian Elderly, Expressed as Mean of the Scores for Two Hands Averaged over Four Trials for Four Age Groups for Males and Females Separately

                                               Grip Strength
Age Group  Age    n    % Males  Education  Male        Female
70-74      79.82  199  52       11.78      29.7 (6.3)  15.3 (5.2)
75-79      84.46  120  43       11.34      25.5 (7.3)  13.9 (5.0)
80-84      88.04  41   37       11.00      21.5 (8.2)  12.6 (4.0)
≥85        93.71  14   50       10.79      21.5 (4.4)  9.7 (3.1)
Appendix 22m: Meta-Analysis Tables for the Grip Strength Test (Hand Dynamometer)
Table A22m.1. Results of the Meta-Analysis and Predicted Scores for the Hand Dynamometer Test: Males, Dominant Hand (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 9
Years of publication: 1974-1993
Number of data points used in the analyses (a data point denotes a study or a cell in education/gender-stratified data): 15
Total number of participants: 713

Variable           n*  x̄†       SD†    Range
Sample size
  Mean             15  36.59    39.35  11-178
Age
  Mean             15  41.2[?]  15.62  22.57-70
  SD               13  4.08     4.10   1.3-17.2
Education
  Mean             10  13.03    2.43   10.4-16.7
  SD               8   2.66     0.91   1-3.6
Test score means
  Combined mean    15  49.[?]   4.34   41.7-56.8
  Combined SD      15  8.95     2.72   5.3-13.4

*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
APPENDIX 22M
Table A22m.1. (Contd.)

Predicted grip strength in kilograms averaged over trials and SDs per age group* (Dynamometer, males, dominant hand)

                            95% CI
Age Range  Predicted Score  Lower Band  Upper Band
25-29      52.90            51.21       54.60
30-34      51.70            50.22       53.19
35-39      50.50            49.20       51.81
40-44      49.30            48.13       50.48
45-49      48.10            46.99       49.22
50-54      46.90            45.78       48.03
55-59      45.70            44.49       46.92
60-64      44.51            43.14       45.87
65-69      43.31            41.74       44.87

Standard deviation for all age groups is 8.95.
*Based on the equation: Predicted test score = 59.50402 - 0.2399832 * age
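As an illustration of how the footnoted equation generates the predicted scores, the short Python sketch below evaluates it at the midpoint of each age band. Using midpoints such as 27.5 for the 25-29 band is an assumption inferred from the tabulated values, not something the table states, and the function names are hypothetical. The z-score helper simply applies the aggregate SD of 8.95 given for all age groups.

```python
# Coefficients taken from the footnote to Table A22m.1 (males, dominant hand).
INTERCEPT = 59.50402
AGE_SLOPE = -0.2399832
AGGREGATE_SD = 8.95  # SD suggested for use with all age groups


def predicted_grip_kg(age: float) -> float:
    """Predicted mean grip strength (kg) at a given age in years."""
    return INTERCEPT + AGE_SLOPE * age


def z_score(observed_kg: float, age: float) -> float:
    """Standardized distance of an observed score from the predicted mean."""
    return (observed_kg - predicted_grip_kg(age)) / AGGREGATE_SD


if __name__ == "__main__":
    # Assumed band midpoints: 27.5 for 25-29, 32.5 for 30-34, and so on.
    for lower in range(25, 70, 5):
        midpoint = lower + 2.5
        print(f"{lower}-{lower + 4}: {predicted_grip_kg(midpoint):.2f}")
```

A negative z-score indicates grip strength below the predicted mean for that age.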
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 15
Number of clusters: 9
R²: 0.747

Term      Coefficient  SE     t      p      95% CI
Age       -0.2399832   0.030  -7.88  0.000  -0.310 to -0.170
Constant  59.50402     1.597  37.25  0.000  55.82 to 63.19

Prediction
Predicted age range: 25-69 years
Mean predicted score: 48.10 (3.29)
SEe: 0.68
95% CI: 46.77-49.44
Table A22m.1. (Contd.)

Figure A22m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Dynamometer, Male, Dominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 48.484
Pooled estimates for random effect: 49.515
Q(df), p: Q(14) = 177.86, p < 0.000
Moment-based estimate of between-study variance: 19.586

Tests for model fit: addition of a quadratic term
Model      R²     Adjusted R²  BIC     BIC'
Linear     0.747  0.728        29.699  -17.924
Quadratic  0.752  0.711        32.125  -15.498
BIC' difference of 2.427 provides positive support for the linear model.
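The verbal grades attached to BIC' differences in these tables ("weak", "positive") are consistent with the commonly used Raftery cut points of 0-2, 2-6, 6-10, and above 10. The appendix does not state its cut points, so the thresholds in the sketch below are an assumption, and the function name is hypothetical.

```python
def bic_support(bic_prime_linear: float, bic_prime_quadratic: float):
    """Return (difference, grade, preferred model) for two BIC' values.

    The model with the lower (more negative) BIC' is preferred.
    Grade thresholds are assumed, not taken from the appendix.
    """
    diff = abs(bic_prime_linear - bic_prime_quadratic)
    if diff < 2:
        grade = "weak"
    elif diff < 6:
        grade = "positive"
    elif diff < 10:
        grade = "strong"
    else:
        grade = "very strong"
    preferred = "linear" if bic_prime_linear < bic_prime_quadratic else "quadratic"
    return diff, grade, preferred


if __name__ == "__main__":
    # BIC' values from the model-fit table for males, dominant hand.
    print(bic_support(-17.924, -15.498))
```

Differences computed from the rounded BIC' values can differ from the printed difference in the last decimal place.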
Tests for parameter specifications
Normality of the residuals: Shapiro-Wilk W test, W = 0.953, p = 0.574
Homoscedasticity: White's general test, 0.819, p = 0.664

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.076 (F(1,8) = 1.36, p = 0.278). Therefore, the SD for the aggregate sample is suggested for use with all age groups.

Effects of demographic variables
Education
Education did not contribute to grip strength beyond its inverse association with age. There was a negligible improvement of 0.0008 in the R² with addition of the education term to the above regression model (for 10 data points that report education). Similarly, significance tests for education yielded t = 0.30, p = 0.77. Therefore, further analyses were not performed.
Table A22m.2. Results of the Meta-Analysis and Predicted Scores for the Hand Dynamometer Test: Males, Nondominant Hand (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 7
Years of publication: 1974-1993
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 13
Total number of participants: 641

Variable           n*  x̄†     SD†    Range
Sample size
  Mean             13  35.59  42.45  11-178
Age
  Mean             13  40.91  15.33  22.6-70
  SD               12  3.38   3.78   1.3-17.2
Education
  Mean             8   13.66  2.48   10.4-16.7
  SD               7   2.52   0.91   1.0-3.6
Test score means
  Combined mean    13  47.16  4.25   38.5-53.6
  Combined SD      13  8.39   2.17   5.1-12.2

*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Predicted grip strength in kilograms averaged over trials and SDs per age group* (Dynamometer, males, nondominant hand)

                            95% CI
Age Range  Predicted Score  Lower Band  Upper Band
25-29      50.12            47.86       52.37
30-34      49.02            46.93       51.11
35-39      47.91            45.94       49.89
40-44      46.81            44.90       48.72
45-49      45.71            43.81       47.61
50-54      44.61            42.65       46.56
55-59      43.50            41.43       45.57
60-64      42.40            40.17       44.63
65-69      41.30            38.87       43.72

Standard deviation for all age groups is 8.39.
*Based on the equation: Predicted test score = 56.18508 - 0.2205636 * age
Table A22m.2. (Contd.)

Figure A22m.2. A scatterplot illustrating the dispersion of the data points around the regression line for the Dynamometer, Male, Nondominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error, smaller weight.
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations: 13
Number of clusters: 7
R²: 0.630
F, p: F = [?].15, p < 0.0007

Term      Coefficient  SE     t      p      95% CI
Age       -0.2205636   0.034  -6.34  0.001  -0.306 to -0.135
Constant  56.18508     1.852  30.33  0.000  51.65 to 60.72

Prediction
Predicted age range: 25-69 years
Mean predicted score: 45.71 (3.[?])
SEe: 1.07
95% CI: 43.62-47.80
Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 45.173
Pooled estimates for random effect: 46.825
Q(df), p: Q(12) = 211.33, p < 0.000
Moment-based estimate of between-study variance: 24.454

Tests for model fit: addition of a quadratic term
Model      R²     Adjusted R²  BIC     BIC'
Linear     0.630  0.597        32.373  -10.370
Quadratic  0.642  0.571        34.505  -8.237

BIC' difference of 2.132 provides positive support for the linear model.
Table A22m.2. (Contd.)

Tests for parameter specifications
Normality of the residuals: Shapiro-Wilk W test, W = 0.962, p = 0.787
Homoscedasticity: White's general test, 2.401, p = 0.301

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.180 (F(1,6) = 7.18, p = 0.037). Therefore, the SD for the aggregate sample is suggested for use with all age groups.

Effects of demographic variables
Education
Education did not contribute to grip strength beyond its inverse association with age. There was a small improvement of 0.038 in the R² with addition of the education term to the above regression model (for eight data points that report information on education). Similarly, significance tests for education yielded t = 1.32, p = 0.24. Therefore, further analyses were not performed.
Table A22m.3. Results of the Meta-Analysis and Predicted Scores for the Hand Dynamometer Test: Females, Dominant Hand (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Only those studies reporting data for males and females separately were included in the analyses.
Number of studies included in the analysis: 5
Years of publication: 1979-1988
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 11
Total number of participants: 454

Variable           n*  x̄†     SD†    Range
Sample size
  Mean             11  28.22  33.14  13-187
Age
  Mean             11  40.16  15.96  22.83-70
  SD               10  2.85   2.83   1.3-16.1
Education
  Mean             6   14.48  1.95   10.4-16.2
  SD               5   2.33   0.31   2.0-3.1
Test score means
  Combined mean    11  31.47  2.45   26.9-34.0
  Combined SD      11  6.74   2.03   3.6-10.1

*Number of data points differs for different analyses due to missing data.
†Weighted means and SDs.
Table A22m.3. (Contd.)

Predicted grip strength in kilograms averaged over trials and SDs per age group* (Dynamometer, females, dominant hand)

                            95% CI
Age Range  Predicted Score  Lower Band  Upper Band
25-29      32.81            31.76       33.86
30-34      32.57            32.15       32.99
35-39      32.22            31.01       33.44
40-44      31.76            29.99       33.53
45-49      31.19            29.18       33.19
50-54      30.50            28.60       32.40
55-59      29.70            28.23       31.17
60-64      28.79            28.03       29.54
65-69      27.76            27.09       28.43

Standard deviation for all age groups is 6.74.
*Based on the equation: Predicted test score = 32.0932 + 0.087928 * age - 0.0022535 * age²
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 11
Number of clusters: 5
R²: 0.673
F, p: F(2,4) = 151.82, p < 0.0002

Term         Coefficient  SE     t      p       95% CI
Age          0.087928     0.336  0.26a  0.807a  -0.845 to 1.021
Age squared  -0.0022535   0.003  -0.65  0.552   -0.012 to 0.007
Constant     32.0932      7.128  4.50   0.011   12.30 to 51.88

aSignificance test for age centered (sample means - aggregate mean): t = -1.61, p = 0.182.

Prediction
Predicted age range: 25-69 years
Mean predicted score: 30.81 (1.76)
SEe: 0.64
95% CI: 29.56-32.06

Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 29.905
Pooled estimates for random effect: 30.866
Q = 92.11, p < 0.000
Moment-based estimate of between-study variance: 6.959
Table A22m.3. (Contd.)

Figure A22m.3. A scatterplot illustrating the dispersion of the data points around the regression line for the Dynamometer, Female, Dominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error, smaller weight.
Model      R²     Adjusted R²  BIC     BIC'
Linear     0.639  0.599        17.097  -8.820
Quadratic  0.673  0.591        18.413  -7.504

BIC' difference of 1.317 provides weak support for the linear model.

Tests for parameter specifications
Normality of the residuals: Shapiro-Wilk W test, W = 0.888, p = 0.131
Homoscedasticity: White's general test, 6.1915, p = 0.185

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.166 (F(1,4) = 4.30, p = 0.107). Therefore, the SD for the aggregate sample is suggested for use with all age groups.

Education
Education did not contribute to grip strength beyond its inverse association with age. There was a negligible improvement of 0.0007 in the R² with addition of the education term to the above regression model (for six data points that report education). Similarly, significance tests for education yielded t = 0.06, p = 0.96. Therefore, further analyses were not performed.
Table A22m.4. Results of the Meta-.AnJ).ysis and Predicted Scores for the
Hand Dynamomter Test: Females, Nondominant Hand (Relevant values are weighted on the s!fDdard error for the test mean)
Description of the aggregate sample Only those studies reporting data for males
~d
females separately were included in the analyses.
Number of studies included in the aaalysis Years of publication Number of data points used in the all!dysis (a data point denotes a study or a cell in education/gender-stratified data) Total number of participants Variable
4 1985-1988 10
407
n•
~
sot
Range
10
2&86
36.58
13-187
10 10
4t97 2;93
16.77 3.04
22.8-70 1.3-16.1
5 5
14;41 9;37
2.12 0.36
10.4-16.2 2.0-3.1
10 10
2~92
2.96 1.00
23.1-31.1 3.8-6.6
SGtnplelhe Mean
Age Mean SD
EductJtion Mean SD
r•acoreCombined mean Combined SD
~44
•Number of data points differs for different knalyses due to missing data. tweighted means and SDs.
ldlosr:m1
Predicted grip strength in averaged over trials and SDs per age group• (Dyllamometer, fedaies, nondominant band) I
95%CI
Age Bange
Predieted
l..a6rer
Seore
B~d
J5-J9 30-34 35-39 40-44 4S-49 50-S4 55-59 60-64 65-69
!9.69 !9.65 !9.41 28.95 28.28 27.39 26.!9 !4.97 !3.44
29.!0 28lls 27:14 26.80 25,00 25,()3 24,17 22$)7
28.35
Upper Band 31.02 30.11 30.13 30.16 29.75 28.88 27.54 25.77 23.92
•Based on the equation:
PretJict.d ,_, acore = 26.0349 + 0.2504831• age - 0.00428 • age2
Standard deviation for all age groups is 5.44.
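Because the equation for females, nondominant hand, includes an age-squared term, the predicted means are not equally spaced across age bands. The Python sketch below evaluates the footnoted equation at each band midpoint. The use of midpoints such as 27.5 for 25-29 is an assumption inferred from the legible predicted scores, and the function name is hypothetical.

```python
# Coefficients from the equation footnoted under the predicted-score table
# (Dynamometer, females, nondominant hand).
CONSTANT = 26.0349
AGE_COEF = 0.2504831
AGE_SQ_COEF = -0.00428


def predicted_grip_female_nondominant_kg(age: float) -> float:
    """Predicted mean grip strength (kg) from the quadratic age model."""
    return CONSTANT + AGE_COEF * age + AGE_SQ_COEF * age ** 2


if __name__ == "__main__":
    # Assumed band midpoints: 27.5 for 25-29, 32.5 for 30-34, and so on.
    for lower in range(25, 70, 5):
        midpoint = lower + 2.5
        value = predicted_grip_female_nondominant_kg(midpoint)
        print(f"{lower}-{lower + 4}: {value:.2f}")
```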
Table A22m.4. (Contd.)

Figure A22m.4. A scatterplot illustrating the dispersion of the data points around the regression line for the Dynamometer, Female, Nondominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error, smaller weight.
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (quadratic)
Number of observations: 10
Number of clusters: 4
R²: 0.833
F, p: F = 157.44, p < 0.0009

Term      Coefficient  SE     t      p       95% CI
Age       0.2504831    0.270  0.93a  0.422a  -0.609 to 1.109
Age²      -0.00428     0.003  -1.59  0.209   -0.013 to 0.004
Constant  26.0349      6.046  4.31   0.023   6.79 to 45.28

aSignificance test for age centered (sample means - aggregate mean): t = -2.38, p = 0.098.

Prediction
Predicted age range: 25-69 years
Mean predicted score: 27.56 (2.24)
SEe: 0.52
95% CI: 26.54-28.59

Tests for assumptions and model fit
Tests for heterogeneity in the final data set
Pooled estimates for fixed effect: 27.064
Pooled estimates for random effect: 27.544
Q(df), p: Q(9) = 87.71, p < 0.000
Moment-based estimate of between-study variance: 7.174
Table A22m.4. (Contd.)

Tests for model fit: addition of a quadratic term
Model      R²     Adjusted R²  BIC     BIC'
Linear     0.739  0.706        17.141  -11.132
Quadratic  0.833  0.785        14.993  -13.281
BIC' difference of 2.148 provides positive support for the quadratic model.

Tests for parameter specifications
Normality of the residuals: Shapiro-Wilk W test, W = 0.[?], p = 0.835
Homoscedasticity: White's general test, 5.197, p = 0.268

Significance tests for regression with the SDs
A regression of SDs on age yielded an R² of 0.229 (F(1,3) = 4.23, p = 0.132). Therefore, the SD for the aggregate sample is suggested for use with all age groups.

Effects of demographic variables
Education
Education did not contribute to grip strength beyond its inverse association with age. There was a small improvement of 0.48[?] in the R² with addition of the education term to the above regression model (for five data points that report information on education). Similarly, significance tests for education yielded t = 1.40, p = 0.30. Therefore, further analyses were not performed.
Appendix 23: Locator and Data Tables for the Grooved Pegboard Test (GPT)
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 23.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A23.1. Locator Table for the Grooved Pegboard Test (GPT) Study
Age·
n
GPI'.l Rounsaville et al.• 1982 page 462 TableA23.2
24.9
29
GPI'.S Bornstein, 1985 page 462 Tables A23.3, A23.4
20-39 40-59 60-69
GPI'.3 Heaton et al., 1985 page 463 TableA23.5
32.7 (13.5)
100
GPI'.4 Heaton et al., 1986 page 463 TableA23.6
15-81 39.3 (17.5) <40
553
40-59 ~60
365
Sample Composition
IQ/ Education•
I..ncation
CETA workers, SK male
11.2
Massachusetts
178 M, 187 F; paid volunteers free of neurological or psycbiatric illness; data are presented by age x education x gender
12.3 (2.7)
Western Canada
79 M, 21 F; controls with no neurological illness, head trauma, or substance abuse 356M, 197 F; normal subjects with no history of neurological illness, head trauma, or substance abuse; 7.2% left-handed; data are presented in 3 age and 3 education groups
~HS
14.15
Denver, CO
(2.84)
0-20 13.3 (3.4) <12 12-15
Colorado, California, Wisconsin
~16
(continued)
APPENDIX 23
Table A23.1. (Contd.) Study GPr.5 Bomstein, 1986a page 463 Tables A2:3.7, A2:3.8
GPr.6 Polubinski &: Melamed, 1986 page464 Table A2:3.9 GPr.7 Ryan et al., 1987 page 464 Table A2:3.10 GPr.8 Bomstein et al., 1987a page465 Table M:l.ll GPr.9 Thompson et al., 1987 page 465 Table A2:3.12 GPr.IO Bomstein &: Suga, 1988 page 466 Table A2:3.13
Age•
n
Sample Composition
18-39 365 178 M, 187 F; paid volunteers free 40-59 of neurological or psychiatric 60-69 illn~; 91.5% right-handed; data are stratiled by age x education x gender; proportion of participants classified as ~ is presented 18-24 120 Underg,aduate students (60 M, 60 F); data tre partitioned by firm and mixe4-handedness groups 55 Blue-
21-30 31-40 41-50 51-59
17M.~F
GPr.ll Miller et al., 21-72 1990 37.20 page 466 Tables A2:3.14, A2:3.15 (7.52) 35.66 (6.47) 36.90 (7.04) GPr.l2 Heaton et al., 1991,2004 page 467 Data are not reproduced in this book
42.1 (16.8) groups: 20-34 ~9
40-44
45-49 50-S4 55-S9 60-64 ~
70-74 7~
16M, S8 F 16M, i8 F Homosexual/bisexual men 769 HIV-1-seronegative 727 HIV-1-ieropositive, asymptomatic 84 HIV-1-seropositive, symptomatic 486 Urban tnd rural volunteers; data colleCted over 15 years through multicenter collaborative efforts; strict exclusion criteria; 65% M; data are presented in T-score equNalents for M and F separately in 10 age groupings by 6 edu<:Jtion groupings; in 2004 edition, age range is expaaded to 85 years and data are presented for African-American and Caucasian participants sepaStely
IQ/ Education•
Location
12.3 (2.7)
Western Canada
College students
Ohio
12.3 (1.4) 11.9 (2.2) 11.3 (1.8) 11.0 (1.8)
Eastern Pennsylvania
VIQ: 105.8 (10.8) PIQ: 105.0 (10.5) 13.15 (3.49)
11.7 (2.9)
Western Canada
Range ~10, M8.5 Range 11-12, M11.7 Range> 12, M15.0
16.36 (2.34) 15.70 (2.44) 16.06 (2.50) 13.6 (3.5) FSIQ: 113.8 (12.3) Groups: 6-8
9-11 12 13-15 16-17 ;::::18
MACS centers at Baltimore, Chicago, Los Angeles, &: Pittsburgh
California, Washington, Texas, Oklahoma, Wisconsin,
Illinois, Michigan, New York. V'uginia. Massachusetts, Canada
APPENDIX 23 Table A23.1. (Contd.) Age·
Study
n
IQ/ Education•
Sample Composition
GPT.l3 Seines et al., 25-34 733 HOIIlCISt!lalall men, HIV-11991 35-H seronegative. stratified by age. education, and age x education page 467 45-54 Tables A23.16-A23.18
College
GPT.l4 Ruff~ Parker, 1993
7-22
16-70 360 Normal volunteers screened for psychiatric hospitalization, 16-39 chronic poly-drug abuse, or page 468 Tables A23.19, A23.20 40-54 neurological disotders; ~70 data are stratified by age x education X gender; data for left hand-dominant sample are
:S12
Location MACScenters at Baltimore, Chicago, Los Angeles, ~ Pittsburgh California, Michigan, eastern seaboard
~13
also presented GPT.IS Russelllk Starkey, 1993
page 468 Table A23.21 GPT.l8 Dikmen et al., 1999 page469 Table A23.22
GPT.l7 Strenge et al., 2002 page 469 Table A23.23
45.5
113 Norms are collected from standardization sample for the HRNES manual; controls are V.A. patients without CNS pathology (95 M, 18 F) 43.6 121 Healthy adults; 68,. M; data on (19.6) test-retest reliabilities and practice effect are provided (14.1)
19-30 49
12.8 (2.9)
Cincinnati,
12.0 (3.3)
Washington, Colorado, California
Medical students (23 M, 26 F)
Miami
Germany
•Age column and IQ/educatlon column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A23.2. [GPT.1] Rounsaville et al., 1982: Data for a Sample of Comprehensive Employment Training Act (CETA) Workers

n   Age   % Male  Education  % Right-Handed  Dominant Hand  Nondominant Hand
29  24.9  59      11.2       90              70.52          75.59
Table A23.3. [GPT.2a] Bornstein, 1985: Data for a Sample of Healthy Canadian Adults Stratified by Age, Education, and Gender Number of Females
Preferred
of Males
Hand
Nonpreferred Hand
107
64
31
66
40
57
60.9 (16.2) 68.6 (15.0) 75.5 (14.6)
66.2 (17.1) 74.2 (15.7) 83.1 (15.5)
51
57
2:HS
127
130
72.4 (17.7) 64.2 (15.5)
78.1 (19.1) 70.2 (16.6)
68.7 (20.8) 64.6 (10.8)
74.5 (21.3) 70.8 (13.3)
Number Age
Education"
S,age 20-39 40-59 60-69
13.0 (2.3) 11.9 (2.8) 11.8 (2.9)
By education 48.5 (16.6) 41.1 (16.8)
By gend.r 39.2 (17.2) 47.3 (16.1)
12.4 (2.9) 12.2 (2.5)
178 Males 187 Females
"HS, high school.
Table A23.4. [GPT.2b] Bornstein, 1985: Data Presented in Age x Education (
Female
~s
2:HS
n
M (SD)
n
M (SD)
n
M (SD)
n
M (SD)
20-39
21
85
60-69
16
60.4 (6.4) 66.5 (7.0) 75.4 (12.2)
49
13
62.1 (20.8) 69.5 (11.0) 75.7 (14.4)
13
40-59
65.3 (8.5) 86.8 (30.1) 84.8 (22.3)
57.2 (9.6) 63.9 (7.1) 70.9 (9.2)
Age Groups
Preferred lumd 17
23
22 22
43 34
Nonptefo•• ed lttmtl 20-39 40-59 60-69
71.3 (12.2) 91.2 (30.0) 93.6 (21.7)
67.6 (20.9) 74.2 (12.6) 81.9 (13.3)
64.1 (9.2) 71.1 (8.5) 82.0 (14.0)
62.2 (11.8) 70.6 (9.5) 79.6 (12.6)
Table A23.5. [GPT.3] Heaton et al., 1985: Data for a Control Sample

n    Age          Education     M/F Ratio  Time for Both Hands
100  32.7 (13.5)  14.15 (2.84)  79/21      131.40 (18.76)
Table A23.6. [GPI'.4] Heaton et al., 1986: Data for a Sample of Healthy Adults Stratified by Age and Education• DomiDant Hand
Nondominant
n
319 134 100
61.1 68.1 85.1
65.7 74.7 90.0
132 249 172
74.6 66.0 62.3
79.3 71.3 67.6
Hand
Age groupe
15-40 40-59 60-81
Etlucadola groupe ~12
12-15 U~20
•Mean education 13.3 (3.4).
Table A23.7. [GPT.5a] Bornstein, 1986a: Proportion of Participants Classified as Impaired (Exceeding a Criterion of 66 Seconds) in a Sample of Healthy Canadian Adults (n = 365)

                              Mean  Median  Mode  % Classified as Impaired
Preferred hand (seconds)      66.6  65      65    40.8
Nonpreferred hand (seconds)   72.6  70      75    61.5
Table A23.8. [GPI'.Sb] Bornstein, 1986a: Proportion of Participants Classi6ed as Impaired in a Sample Stratified by Age, Education, and Gender Preferred Hand
Males
Nonpreferred Hand Females
Males
Females
Age
?;HS
?;HS
?;HS
?;HS
18-39
25 (5120)
19.5 (17187)
23.1 (3113)
13 (6146)
60 (W20)
47.1 (41187)
38.5 (5113)
(1~46)
76.9 (10113)
75
55.5
27.3
(12122)
(1~44)
84.6 (11113)
75
(W16)
(1~16)
59.1 (13122)
65.9 (29144)
81.2 (13116)
82.6 (19123)
77.3 (17122)
57.6 (19.133)
100 (16116)
(20.'23)
95.5 (21.122)
(28133)
~
60-69
•as. high school.
87
26.1
84.8
962 Table A23.9. [GPT.6] Polubinski. and ~elamed, 1986: Data for Undergraduate Students ijlrtitiooed into Firm and Mixed Right-handedness Croup Handedness
n
Education
Age
Rip)t Haad
Left
Mixed
(1.4)
13.3 (0.7)
60.1 (7.6)
67.1 (9.8)
30
20.1 (1.6)
13.6 (0.9)
55.3 (5.9)
60.7 (5.9)
38
19.8 (1.2)
13.4 (0.7)
54.~
(6.2)
60.2 (6.2)
19.4 (0.8)
13.3 (0.6)
54.9 (6.6)
60.1 (7.3)
30
19.7
Dominant Hand
Nondominant Hand
Test 1
56.6 (5.9)
59.3 (6.6)
Test 2 (3 weeks later)
58.8 (8.9)
58.8 (6.6)
Hand
Men
Firm
Table A23.11. [GPT.8] Bornstein et al., 1987a: Test-Retest Data for 23 Healthy Participants (9 Men, 14 Women) Between 17 and 52 Years of Age
wFirm Mixed
22
Table A23.10. [GPT.7] Ryan et al., 1987: Data for Blue-Collar Workers (All Males) without t History of Exposure to Industrial Toxins Age Dominant Noncbninant Hind Hand Group Age Education n 21-30 26.1 (2.3)
12.3 (1.4)
55
69.7 (11.5)
7..5 (10.9)
31-40 36.8 (2.7)
11.9 (2.2)
45
67.2 (10.2)
72.8 (119)
41-50 45.7 (2.9)
11.3 (1.8)
44
76.1 (11.9)
71.1 (Q.7)
51...59 54.8 (2.8)
11.0 (1.8)
38
78.7 (13.0)
8L2 (12.9)
Table A23.12. [GPT.9] Thompson et al., ~987: Percent of Healthy Participants Scoring in the LateraJized Dysfunction ~ge Nondominant Hemisphere Dysfunction
Groups
n
Dominant Hemisphere Dysfunction
All right
167
20.96
14.97
Mixed right
226
25.89
18.75
Left
33
36.36
6.06
Total
426
24.76
16.27
lntermanual Percent
Difference Scores
-8.2 (11.3) -7.9 (12.8) -4.4 (10.8) -7.8 (12.1)
Table A23.13. [GPT.10] Bornstein and Suga, 1988: Data for Healthy Canadian Volunteers Between 55 and 70 Years of Age

Education  Age   Number of Males  Number of Females  Preferred Hand  Nonpreferred Hand
≤10        62.3  17               29                 78.5 (19.9)     83.1 (17.2)
11-12      62.9  16               28                 74.2 (16.0)     84.2 (20.8)
>12        63.0  16               28                 71.9 (14.2)     77.7 (13.7)
Table A23.14. [GPT.lla] Miller et al., 1990: Demographic Characteristics for the Sample of Homosexual! bisexual Males Participating in the Multi-Center AIDS Cohort Study
Race
%Handedness
CES Depression Scale
CD4
2
9.08 (9.03)
970,42 (332.46)
6
2
9.44 (9.27)
561.90 (277.98)
5
3
15.21 (11.19)
(269.45)
Black
Hispanic
Other
92
2
4
14
91
2
8
90
2
Left White
n
Right
Ambidextrous
Seronegative
769
87
0
13
Asymptomatic, seropositive
727
86
0
Symptomatic, seropositive
84
90
1
Table A23.15. [GPT.11b] Miller et al., 1990: Data for the Multi-Center AIDS Cohort Study (All Males)

Group                       n    Age           Education     Dominant Hand  Nondominant Hand
Seronegative                769  37.20 (7.52)  16.36 (2.34)  64.28 (9.10)   69.28 (9.91)
Asymptomatic, seropositive  727  35.66 (6.47)  15.70 (2.44)  63.39 (9.17)   68.90 (12.52)
Symptomatic, seropositive   84   36.90 (7.04)  16.06 (2.50)  66.57 (11.42)  73.27 (16.39)
277.22
Table A23.16. [GPT.l3a] Seines et al.,!l991: Demographic Characteristics for the Seronegative Homosexual!BiJext;d Males Participating in the MACS Smdy : I
Handedness ;
Race(%)
African
i
Left
Caucasian
American
0.3 1.1 2.1
14.9 11.7 11.3
96.4 96.6 95.9
3.6 3.4 4.1
0.4 0.0 1.3
11.8 14.8 12.0
94.8 96.0 96.7
5.2 4.0 3.3
Right
Ambidextrous
84.8 87.2 86.6
87.8 85.2 85.8
Br1 age 25-34 35-44 45-54
Brl.,._,_ CoDege
I
Table A23.17. [GPT.l3b] Seines et al., 1991: Data Stratified by Age and Education Nondomiuant Hand
Dominant
Hand
Percentile
Percentile Mean Age
Educa~
Mean (SD)
5th
lOth
Mean (SD)
5th
lOth
a
31.0 (2.6)
16.1 (2.2)"
62.0 (7.8)
76
72.5
67.0 (9.3)
85
80.5
35-44
290
39.3 (2.9)
64.4 (8.1)
78
75
69.2 (9.1)
85
82
45-54
9T
48.5 (2.6)
16.4 : (2.3) 1 16.7 I (2.6)
67.9 (9.0)
85
80
73.7 (11.1)
90
86
64.1 (8.5)
77
74
69.6 (10.3)
89
84
Age
By age 25-34
II
i I
Brl educ:adcm College
229 201
301
36.1 (7.4)
13.7
35.6
16.0 (0.0)
i
64.0 (8.7)
79
75
68.4 (10.3)
87
83
(7.2)
38.4 (7.8)
18.6 I (1.3)!
63.4 (8.3)
80
75
69.0 (9.1)
85
81
I
(1.2) :
1
Table A23.18. [GPT.13c] Seines et al., 1991: Data Stratified by Age x Education (Personal Communication) Percentile
n
Dominant Hand
25--34
107
35--44
93
45-60
42
62.4 (7.7) 65.3 (8.4) 65.7 (10.0)
10%
5%
Percentile Nondominant Hand
5%
10%
College 25--34
76
73
78
75
81
77
69.1 (10.4) 68.5 (8.2) 73.0 (12.2)
89
83
84
82
93
90
104
62.4 (8.4)
76
75
65.6 (8.1)
83
74
35--44
77
65.2 (8.2)
80
77
70.4 (10.5)
92
87
45-60
35
67.1 (10.0)
85
80
73.4 (13.5)
93
86
>College 25--34
111
61.1 (7.6)
76
71
66.5 (8.7)
84
80
35--44
150
63.4 (8.4)
78
74
68.8 (9.1)
84
80
45-60
64
66.8 (8.7)
84
79.5
72.3 (8.3)
85
81
Table A23.19. [GPT.14a] Ruff and Parker, 1993: Data for the Left Hand-Dominant Sample of Healthy Participants Men
n
17
Women
18
37.9 (18.0)
38.7 (16.1)
13.7 (2.8)
14.1 (2.6)
Dominant hand
70.7 (13.5)
65.6 (11.6)
Nondominant hand
70.3 (15.7)
73.0 (18.6)
Age Education
Table A23.20. [GPT.14b] Ruff and P!p'ker, 1993: Data for a Sample of Healthy Adults Stratified by Gender x Education x Age Females
Males
Combined
n
:M (SD)
~12
29
67.8 (9.2)
30
62.8 (8.9)
59
65.3 (9.3)
~13
60
ti4.7 '(10.9)
60
57.8 (6.2)
120
61.2 (9.5)
All
89
65.7 <10.4)
90
59.5 (7.5)
179
62.5 (9.6)
~12
15
~1.9 p5.1)
14
63.1 (4.4)
29
67.7 (12.0)
~13
30
J0.4
30
63.3 (7.4)
60
66.8 (9.9)
~0.9
44
63.2 (6.5)
89
67.1 (10.6)
~.7
15
78.6 (11.7)
30
81.1 (11.1)
~4.1
29
75.3 (11.3)
59
74.7 (12.1)
45
'77.3 i12.8)
44
76.5 (11.4)
89
76.9 (12.1)
~12
59
72.9 (12.8)
59
66.9 (11.2)
118
69.9 (12.3)
~13
120
~.5
119
63.4 (10.7)
239
66.0 (11.6)
Age/Education
n
M (SD)
n
M (SD)
.DolllinGnt ,.,. 16-39
40-S4
~10.9)
All
45
~12.3)
55-10 ~12
15
k10.2) ~13
30
,13.0)
All All age,..,.,.
{12.0) 179
69.9 (12.5)
178
64.6 (10.9)
357
67.3 (12.0)
~12
29
74.5 (10.9)
29
66.8 (10.7)
58
70.7 (11.4)
~13
59
67.8 (10.8)
60
65.2 (10.3)
119
66.5 (10.6)
All
88
'70.0 (11.2)
89
65.7 (10.4)
177
67.9 (11.0)
15
~9.1
14
69.6 (6.5)
29
74.5 (12.4)
All
Nondomitlcane ,_, 16-39
40-S4 ~12
J14.9) ~13
30
73.7 l9.9)
30
70.8 (8.9)
60
72.3 (9.4)
All
45
'15.1 J11.9)
44
70.4 (6.5)
89
73.0 (10.5)
APPENDIX 23 Table A23.20. (Contd.)
M (SD)
n
Age/Education
Combined
Females
Males n
M (SD)
n
M (SD)
ISIS-10 ~12
15
91.0 (12.7)
13
84.3 (15.3)
28
87.9 (14.1)
~13
28
83.5 (13.4)
29
82.0 (12.5)
59
82.8 (12.9)
All
43
86.1 (13.5)
42
82.8 (13.3)
85
84.5 (13.4)
~12
59
79.9 (14.0)
56
71.6 (13.1)
115
75.8 (14.1)
~13
117
73.1 (12.9)
119
70.7 (12.5)
238
71.9 (12.7)
All
176
75.4 (13.6)
175 (12.7)
71.0
351
73.2 (13.3)
All age lewJ.
Table A23.21. [GPT.15] Russell and Starkey, 1993: Data for a Sample of Veterans Administration Patients without Central Nervous System Pathology

n    Age          Education   Race*     Gender     Dominant Hand  Nondominant Hand
113  45.5 (14.1)  12.8 (2.9)  106W, 7B  95M, 18F   74.4 (24.4)    78.4 (25.9)

*W, white; B, black. Material from the manual for the Halstead-Russell Neuropsychological Evaluation System-Revised (HRNES-R) copyright © 1993, 2001 by Western Psychological Services. Reprinted by permission of the publisher, Western Psychological Services, 12031 Wilshire Boulevard, Los Angeles, California, 90025, U.S.A. Not to be reprinted in whole or in part for any additional purpose without the expressed, written permission of the publisher. All rights reserved.
Table A23.22. [GPT.16] Dikmen et al., 1999: Test-Retest Data for Healthy Adults*

n    Age          Education   M/F Ratio  WAIS FSIQ†    Test-Retest Interval
121  43.6 (19.6)  12.0 (3.3)  68/32%     108.8 (12.3)  5.4 (2.5)

                  Time 1         Time 2
Dominant Hand     69.66 (19.27)  68.68 (21.04)
Nondominant Hand  75.80 (21.56)  73.70 (19.69)

*Demographic information is provided for a larger sample of 125 participants.
†The mean Wechsler Adult Intelligence Scale full-scale IQ (Wechsler, 1955) is reported for three groups used in this study combined.
Table A23.23. [GPT.17] Strenge et al., 2002: Data for a Sample of Medical Students*

Age          n   M/F Ratio  % Right-Handed  Dominant Hand  Nondominant Hand
24.5 (2.75)  49  23/26      100             54.2 (6.9)     57.9 (5.6)

*Study was conducted in Germany.
Appendix 23m: Meta-Analysis Tables for the Grooved Pegboard Test (GPT)
Table A23m.1. Results of the Meta-Analysis and Predicted Scores for the GPT, Dominant Hand (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis: 6
Years of publication: 1985-1999
Number of data points used in the analysis (a data point denotes a study or a cell in education/gender-stratified data): 15
Total number of participants: 2,382

Variable           n   x̄*      SD*     Range
Sample size
  Mean             15  111.69  116.30  22-727
Age
  Mean             15  38.62   16.55   19.4-65
  SD               15  4.81    5.56    0.8-19.6
Education
  Mean             15  13.54   1.49    11.8-16.7
  SD               15  2.33    1.20    0.6-3.8
Percent male       15  57.28   34.64   0-100
Test score means
  Combined mean    15  64.94   7.42    54.8-76.9
  Combined SD      15  11.36   4.50    5.9-19.3

*Weighted means and SDs.
APPENDIX 23M
Table A23m.1. (Contd.)
Predicted number of seconds to completion and SDs per age group* (GPT, dominant hand)

                            95% CI                               95% CI
Age Range  Predicted Score  Lower Band  Upper Band  Predicted SD  Lower Band  Upper Band
20-24      57.95            57.32       58.58       8.32          6.78        9.85
25-29      60.12            59.45       60.79       10.31         7.50        13.12
30-34      62.29            61.54       63.04       11.91         7.97        15.85
35-39      64.46            63.59       65.32       13.13         8.48        17.78
40-44      66.63            65.63       67.62       13.95         9.05        18.86
45-49      68.79            67.65       69.94       14.39         9.68        19.10
50-54      70.96            69.66       72.26       14.44         10.36       18.52
55-59      73.13            71.67       74.59       14.10         10.99       17.21
60-64      75.30            73.67       76.93       13.38         11.22       15.54

*Based on the equations:
Predicted test score = 48.18889 + 0.4337963 * age
Predicted SD = -5.442114 + 0.7862791 * age - 0.0077628 * age²
Ordinary least-squares regression of test means on age
Number of observations    15
Number of clusters        6
R2                        0.936
F                         F(1,5) = 544.34, p < 0.000

Term        Coefficient   SE      t       p       95% CI
Age         0.4337963     0.019   23.33   0.000   0.386 to 0.482
Constant    48.18889      0.509   94.58   0.000   46.88 to 49.50

Prediction
Predicted age range     20-65 years
Mean predicted score    66.63 (5.94)
SEe                     0.54
95% CI                  65.58-67.68
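The prediction equations reported with Table A23m.1 can be evaluated directly. The following is a minimal sketch, not the authors' procedure: it assumes that each five-year age band is represented by its midpoint (for example, 22.5 for ages 20-24), and the function names are illustrative only.

```python
# Coefficients copied from the equations reported with Table A23m.1.
SCORE_INTERCEPT, SCORE_SLOPE = 48.18889, 0.4337963
SD_INTERCEPT, SD_AGE, SD_AGE_SQ = -5.442114, 0.7862791, -0.0077628


def predicted_score(age):
    """Predicted seconds to completion (GPT, dominant hand), linear in age."""
    return SCORE_INTERCEPT + SCORE_SLOPE * age


def predicted_sd(age):
    """Predicted SD of completion time, quadratic in age."""
    return SD_INTERCEPT + SD_AGE * age + SD_AGE_SQ * age ** 2


# Assumption: a band such as 20-24 is evaluated at its midpoint, 22.5.
for low in range(20, 65, 5):
    mid = low + 2.5
    print(f"{low}-{low + 4}: score {predicted_score(mid):.2f}, SD {predicted_sd(mid):.2f}")
```

Under the midpoint assumption, the printed values can be compared row by row with the predicted scores and SDs tabulated for the dominant hand.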
Figure A23m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Grooved Pegboard Test, Dominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit

Tests for heterogeneity in the final data set
Pooled estimates for fixed effect                    63.338
Pooled estimates for random effect                   64.175
Q(df), p                                             Q(14) = 423.67, p < 0.000
Moment-based estimate of between-study variance      18.070

Tests for model fit: addition of a quadratic term
Model        R2      Adjusted R2   BIC      BIC'
Linear       0.936   0.931         25.327   -38.426
Quadratic    0.963   0.925         27.968   -35.785

BIC' difference of 2.641 provides positive support for the linear model.
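The model comparison above reduces to one subtraction and a verbal grade. The sketch below assumes that the phrase "positive support" follows Raftery's conventional grades for BIC differences (0-2 weak, 2-6 positive, 6-10 strong, above 10 very strong); the appendix does not state its thresholds, so the cut points and the function name are assumptions.

```python
def bic_support(bic_difference):
    """Return a verbal grade for an absolute BIC difference.

    Assumption: Raftery's conventional cut points, which the appendix
    does not state explicitly.
    """
    d = abs(bic_difference)
    if d < 2:
        return "weak"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"


# BIC' values reported for the GPT dominant-hand models.
linear_bic_prime = -38.426
quadratic_bic_prime = -35.785

difference = quadratic_bic_prime - linear_bic_prime
print(round(difference, 3), bic_support(difference))
```

The model with the lower (more negative) BIC' is preferred, which here is the linear model.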
Tests for parameter specifications
Normality of the residuals    Shapiro-Wilk W test     W = 0.864, p = 0.028
Homoscedasticity              White's general test    2.758, p = 0.252

Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations    15
Number of clusters        6
R2                        0.489
F, p                      F(2,5) = 22.61, p < 0.003
Term        Coefficient    SE      t       p        95% CI
Age         0.7862791      0.423   1.86    0.122*   -0.300 to 1.873
Age²        -0.0077628     0.005   -1.56   0.179    -0.021 to 0.005
Constant    -5.442114      6.641   -0.82   0.450    -22.51 to 11.6

*Significance test for age centered (sample means - aggregate mean): t = 4.33, p = 0.007.

Prediction
Mean predicted SD    12.66 (2.11)
SEe                  1.81
95% CI               9.11-16.20
Effects of demographic variables

Education
(Analysis of the effect of education on test scores was performed on a separate data set, which contained data broken by education groups.)
Regression of test means on education and age*
Number of observations    18
Number of clusters        8
R2                        0.825
Term
Coefficient
Education
-0.685
SE
0.252
-12.71
p
95% CI
0.000
-1.28 to -0.09
*Regression with education was run on a data set comprising data stratified by education rather than by age, when available.

Gender
t-test by gender
n         x̄ male    x̄ female   M-F Difference   t       p
4M, 4F    63.500    59.725     3.775            0.841   0.216
Table A23m.2. Results of the Meta-Analysis and Predicted Scores for the GPT, Nondominant Hand (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis      6
Years of publication                            1985-1999
Number of data points used in the analysis      15
  (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants                    2,382
Variable               n     x̄†       SD†      Range
Sample size
  Mean                 15    115.13   125.04   22-727
Age
  Mean                 15    38.43    16.48    19.4-65
  SD                   15    4.83     5.59     0.8-19.6
Education
  Mean                 15    13.58    1.52     11.8-16.7
  SD                   15    2.32     1.20     0.6-3.8
Percent male           15    58.78    34.86    0-100
Test score means
  Combined mean        15    70.98    7.98     60.1-84.5
  Combined SD          15    12.50    4.78     5.9-21.6
†Weighted means and SDs.

Predicted number of seconds to completion and SDs per age group* (GPT, nondominant hand)

             Predicted   95% CI       95% CI       Predicted   95% CI       95% CI
Age Range    Score       Lower Band   Upper Band   SD          Lower Band   Upper Band
20-24        63.64       62.99        64.30        9.40        7.96         10.84
25-29        65.95       65.20        66.70        11.53       8.59         14.47
30-34        68.25       67.38        69.12        13.23       9.00         17.45
35-39        70.56       69.55        71.56        14.49       9.47         19.52
40-44        72.86       71.71        74.01        15.33       10.02        20.64
45-49        75.16       73.86        76.47        15.74       10.67        20.81
50-54        77.47       76.01        78.93        15.72       11.39        20.04
55-59        79.77       78.15        81.39        15.26       12.11        18.42
60-64        82.08       80.30        83.86        14.38       12.40        16.36

*Based on the equations:
Predicted test score = 53.27121 + 0.460912 * age
Predicted SD = -5.48594 + 0.8551187 * age - 0.0085961 * age²
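One way the nondominant-hand equations might be applied is to express an individual's completion time as a z-score relative to the predicted mean and SD for that age. This is a sketch under stated assumptions: the examinee (age 42.5, 90 seconds) is hypothetical, the function names are illustrative, and the appendix itself does not prescribe this calculation.

```python
def predicted_score(age):
    """Predicted seconds to completion (GPT, nondominant hand)."""
    return 53.27121 + 0.460912 * age


def predicted_sd(age):
    """Predicted SD of completion time (GPT, nondominant hand)."""
    return -5.48594 + 0.8551187 * age - 0.0085961 * age ** 2


def z_score(raw_seconds, age):
    """Distance of a raw time from the predicted mean, in predicted-SD units.

    Positive values mean a slower (longer) time than predicted.
    """
    return (raw_seconds - predicted_score(age)) / predicted_sd(age)


# Hypothetical examinee, used only to illustrate the arithmetic.
example_age, example_seconds = 42.5, 90.0
print(f"z = {z_score(example_seconds, example_age):.2f}")
```

Because the test is timed, a positive z-score here indicates slower performance than predicted for that age.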
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations    15
Number of clusters        6
R2                        0.907
F(df), p                  F(1,5) = 680.98, p < 0.000

Term        Coefficient   SE      t        p       95% CI
Age         0.460912      0.018   26.10    0.000   0.416 to 0.506
Constant    53.27121      0.384   138.59   0.000   52.28 to 54.26
Prediction
Predicted age range     20-65 years
Mean predicted score    72.86 (6.31)
SEe                     0.60
95% CI                  71.68-74.84

Figure A23m.2. A scatterplot illustrating the dispersion of the data points around the regression line for the Grooved Pegboard Test, Nondominant Hand. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit

Tests for heterogeneity in the final data set
Pooled estimates for fixed effect                    68.665
Pooled estimates for random effect                   70.012
Q(df), p                                             Q(14) = 435.41, p < 0.000
Moment-based estimate of between-study variance      25.144

Tests for model fit: addition of a quadratic term
Model        R2      Adjusted R2   BIC      BIC'
Linear       0.907   0.900         32.957   -32.960
Quadratic    0.910   0.896         35.141   -30.775

BIC' difference of 2.184 provides positive support for the linear model.

Tests for parameter specifications
Normality of the residuals    Shapiro-Wilk W test     W = 0.813, p = 0.005
Homoscedasticity              White's general test    2.136, p = 0.344
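The heterogeneity block reports a fixed-effect pooled estimate, a random-effects pooled estimate, Cochran's Q, and a moment-based between-study variance. The sketch below shows how these four quantities relate, assuming that "moment-based" refers to the DerSimonian-Laird estimator and that each data point is weighted by the inverse of its squared standard error; the appendix names neither, and the input values are hypothetical rather than taken from the underlying studies.

```python
def pooled_estimates(means, std_errors):
    """Return (fixed, random, Q, tau2) for a set of study means.

    Assumptions: inverse-variance weights and the DerSimonian-Laird
    method-of-moments estimate of between-study variance (tau2).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    fixed = sum(w * m for w, m in zip(weights, means)) / total_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (m - fixed) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1

    # Method-of-moments between-study variance, truncated at zero.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    random = sum(w * m for w, m in zip(re_weights, means)) / sum(re_weights)
    return fixed, random, q, tau2


# Hypothetical data points (mean seconds, standard error), for illustration only.
fixed, random, q, tau2 = pooled_estimates([60.0, 70.0], [1.0, 1.0])
print(fixed, random, q, tau2)
```

A Q value far above its degrees of freedom, as in the tables for both hands, is what drives the between-study variance above zero and separates the random-effects estimate from the fixed-effect estimate.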
Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (quadratic)
Number of observations    15
Number of clusters        6
R2                        0.468
F(df), p                  F(2,5) = 25.20, p < 0.002

Term        Coefficient    SE      t       p        95% CI
Age         0.8551187      0.475   1.80    0.132*   -0.366 to 2.076
Age²        -0.0085961     0.006   -1.54   0.185    -0.023 to 0.006
Constant    -5.48594       7.506   -0.73   0.498    -24.78 to 13.8

*Significance test for age centered (sample means - aggregate mean): t = 3.99, p = 0.010.

Prediction
Mean predicted SD    13.90 (2.16)
SEe                  1.90
95% CI               10.18-17.62
Effects of demographic variables

Education
Regression of test means on education and age*
Number of observations    18
Number of clusters        8
R2                        0.836

Term         Coefficient   SE      t       p       95% CI
Education    -0.628        0.159   -3.96   0.005   -1.00 to -0.25

*Regression with education was run on a data set comprising data stratified by education rather than by age, when available.

Gender
t-test by gender
n         x̄ male    x̄ female   M-F Difference   t       p
4M, 4F    69.425    65.525     3.900            0.840   0.217
Appendix 24: Locator and Data Tables for the Category Test (CT)

Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 24. Locator table also provides a reference for each study to a corresponding data table in this appendix.

Table A24.1. Locator Table for the Category Test
Age•
n
Sample Composition
IQ Education•
cr.I Halstead, 1947 page 483 TableA24.2
15-50
28
14 subjects without psychiatric diagnosis, or history of brain injury; 14 with psychiatric
Cf.2 Reitan, 1955b, 1959 page484 Table A24.3 Cf.3 Reitan &: Wolfson, 1985 page 484 Data are not reproduced in this book
32.36 (10.78)
50
35 M, 15 F volunteers hospitalized with paraplegia and neurosis were included
Education: 7-18 IQ: 70-140 Education: 11.58 (2.85) FSIQ: 112.6 (14.3)
cr.4 Klove &:
31.6 32.1
Study
diagnosis
Lochen (cited in Klove, 1974) page 485 TableA24.4
976
22 22
No information is provided regarding the normative sample; cutoffs for "severity ranges" (perfectly normal, normal, mildly normal, impaired, and seriously impaired) are presented American and Norwegian controls
Location
Chicago
Indiana
USA
Education: 11.1 12.2 FSIQ: 109.3 111.9
Wisconsin,
Norway
977
APPENDIX 24
Table A24.1. (Contd.) Study Cf.5Wien& Matarazzo, 1977 page485 Table A24.5
Cf.6Mack& Carlson, 1978 page 486 Table A24.6
Age·
n
Sample Composition
IQ Education•
23.6 24.8
48 All males, neurologically normal; divided into 2 groups; random sample of 29 retested 14-24 weeks later
60-80
41 Older subjects: 3M, 38 F; no history of Education (Older): 14.05 (3.39) neurological impairment; younger subjects: 9 M, 31 F; no screening for FSIQ: 119.90 (15.14) neurological impairment was 40 Education (Younger): conducted computerized 15.43 (2.65) administration was used FSIQ: 113.76 (4.89)
69.76 (4.87) 20-37 25.03 (3.70)
Education: 13.7 14.0 FSIQ: 117.5 118.3
location Portland, OR
cr.7 Anthony et al., 1980 page486 Table A24.7
38.88 (15.80)
100 Normal volunteers, no history of medical or psychiatric problems, head injury, brain disease, or substance abuse
Education: 13.33 (2.56) FSIQ: 113.5 (10.8)
Colorado
cr.s Harley, et al., 1980 page487 Table A24.8
55-79
193 V.A-hospitalized patients; 56 T-score equivalents are reported 45 35 37 20
Education: 8.8 IQ>80
Wisconsin
363 Volunteers Huent in English; 152 M, 211 F; subjects had no physical disability, sensory deficit, current medical illness, brain disorder, or alcoholism; data are presented in age x IQ cells
WAIS IQ: 89-102 103-112 113-122 123-143
Toronto
Education: 8-26 14.8 (3.0) FSIQ: 119.1 (8.8)
Canada
Education: 0-20 13.3 (3.4) <12 (132) 12-15 (249)
Colorado, California, W'JSCOnsin
55-59 6()....64
65-69 70-74
75-79 Cf.9 Pauker, 1980 page 487 Table A24.9
19-71 19-34 35-52 53-71
Cf.IO Fromm-Auch & Yeudall, 1983 page 488 Table A24.10
15-64 193 111 M, 82 F; participants described 25.4 (8.2) as nonpsychiat:ric and non15-17 32 neurological; 83% are right-handed; 18-23 75 5 age groupings 24-32 57 33-40 18 41-64 10
cr.u Heaton, et al., 15-81 39.3 1986 (17.5) page 488 Table A24.11 <40 40-59 ~60
553 356 M, 197 F; exclusion criteria included history of neurological illness, significant head trauma, and substance abuse; sample was 319 divided into 3 age groups and 3 134 education groups; % classification 100 as normal is provided 60 M, 60 F volunteers; data for various intelligence levels are presented
Cf.l2 Dodrill, 1987 page 489 Table A24.12
27.73 (11.04)
120
Cf.l3 Yeudall et al., 1987 page490 Table A24.13
15-40 15-20 21-25 26-30 31-40
225 Volunteers: 127 M, 98 F; classified 62 in 4 age groupings; 88% 73 right-handed 48 42
~16(172)
Education: 12.28 (2.18) FSIQ: 100 (14.35)
Washington
Education: 14.55 (2.78) FSIQ: 112.25 (10.25)
Canada
(conHnued)
978
APPENDIX 24
Table A24.1. (Contd.) Study
cr.I4 Ernst, 1987
Age·
n
Sample Composition
IQ Education•
Location
65-75 69.6 (2.7)
110 51,M, 59 F volunteers
Education: 10.3
Brisbane, Australia
cr.ts Alekoumbides
1~
112
46.85 (17.17)
Education: 1-20 11.43 (3.20) FSIQ: 105.9 (13.5)
S. California
et al., 1987
page490 Table A24.14
page 491 Table A24.15
Medical and psychiatric V.A. r;::nts without cerebral lesions or ries of alcoholism or cerebral Oontusions; all subjects except for one ~remale
Cf.16 Bornstein et al., 1987a page 491 Table A24.16
17-52 32.3 (10.3)
23 Volunteers: 9 M, 14 F; no history tf neurological or psychiatric ilness; test-retest data are 'rovided
VIQ: 88--128 105.8 (10.8) PIQ: ~121
105.0 (10.5)
Cf.17 El-Sheikh et al., 1987 page492 Table A24.17
Cf.18 Russell, 1987 page 492 Table A24.18
32
46.19 (12.86)
155
Patents in V.A. hospitals; 148 M, 7 F; Ppected of neuro-logical disorders but negative findings
Education: 12.29 (3.00) FSIQ: 111.9
Cincinnati, Miami
138
Healthy participants; data are partitioned into 3 age f<mps
Education: 15.4 15.7 14.9
Maine
Education: 13.6 (3.5) Groups: 6-8 9-11 12
California, Washington, Texas, Oldahoma, WISCODSin,
Cf.19 Elias et al., 1990 page 493 Table A24.19
20-31 37-49 55-67
Cf.IO Heaton et al.,
42.1 (16.8) Groups: 20-34
1991,2004 page 493 Data are not reproduced in this book
35--39 40-44
45-49 50--54 55-59 60-64
65-69
Cf.Jl Elias et al.,
486 Volmteers: urban and rural; data ciollected over 15 years through ~ulticenter collaborative efforts; lirict exclusion criteria; 65% M; data are presented in T-score ttquivalents for M and F separately in10 age groups by 6 education fOups; in 2004 edition, age range ii expanded to 85 years and the data are presented for AfricanAmerican and Caucasian participants
1~15
16-17 ~18
FSIQ: 113.8 (12.3)
Iillnois, Michigan, New York, V'uginia, Massachusetts,
Canada
~tely
427 15-24 25-34
Cairo, Egypt
students; no history of brain damage; test-retest data are provided
70-74 75-80 1993 page 494 Table A24.20
• Urtlergraduate and graduate
17-24 20.6 (1.4)
Healthy participants; data are partitioned into 6 age groups
Education: 12-19
Maine
~gender
35-44 45-54
55-64 ~65
cr.u Barrett et al., 2001 page 494 Table A24.21
43.9 (7.6)
1,052 Air Force veteran controls; J?resumably all male; SDs not provided
High school and college
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
979
Table A24.2. [CT.1] Halstead, 1947: Data for the Control Group (Included Patients with Psychiatric Diagnoses): Mean Number of Errors for the Total Group and for Three Subgroups

                 n     Number of Errors
Total            28    36.72
Civilian         14    26.8 (10-46)
Military         8     50.8 (29-75)
Miscellaneous    6     34.8 (16-93)
Table A24.3. [CT.2] Reitan, 1955b, 1959: Data for Individuals Referred for Neuropsychological Evaluation with Negative Neurological Findings

n     Age             Education      VIQ              PIQ              FSIQ             Number of Errors
50    32.36 (10.78)   11.58 (2.85)   110.82 (14.46)   112.18 (14.23)   112.64 (14.28)   32.38 (12.62)
Table A24.4. [CT.4] Klove and Lochen (cited in Klove, 1974): Data for American and Norwegian Controls

              n     Age     Education   IQ       Category Errors
Americans     22    31.6    11.1        109.3    34.6
Norwegians    22    32.1    12.2        111.9    45.5
Table A24.5. [CT.5] Wiens and Matarazzo, 1977: Data for Male Applicants to Patrolman Program: Mean Number of Errors and SDs for Two Equal Subject Groups*

                                     WAIS
n     Age            Education      FSIQ          VIQ           PIQ            Category Errors
24    23.6 (21-27)   13.7 (12-16)   117.5 (8.3)   117.4 (8.4)   115.4 (10.5)   23.5 (21.3)
24    24.8 (21-28)   14.0 (12-16)   118.3 (6.8)   116.4 (6.9)   118.2 (8.6)    22.8 (11.8)

*Test-retest data are also provided for 29 subjects who were assessed twice.

                                 WAIS                  Category Errors
n     Age          Education     FSIQ   VIQ   PIQ      Test            Retest
29    24 (21-28)   14 (12-16)    118    116   118      22.83 (19.15)   11.21 (9.32)
Table A24.6. [CT.6] Mack and Carlson, 1978: Data for Two Age Groups of Healthy Participants: Mean Number of Errors and SDs for the Whole Test and Subtests III-VII

                                                    Subtest
n     Age            Education      IQ               III             IV             V              VI             VII           Total
40    25.03 (3.70)   15.43 (2.65)   113.76 (14.89)   14.95 (11.93)   10.00 (9.54)   12.55 (6.55)   7.00 (5.39)    3.72 (2.92)   4 .82 (27.93)
41    69.76 (4.87)   14.05 (3.39)   119.90 (15.14)   25.07 (10.74)   23.46 (8.57)   19.22 (6.34)   15.32 (7.51)   7.93 (2.62)   91.73 (26.26)
Table A24.7. [CT.7] Anthony et a!. , 1980: Data for Normal WAlS
n
Age
100
Education
FSIQ
VIQ
PIQ
Category Errors
13.33 (2.56)
113.54 (10.83)
113.24 (11.59)
112.26 (10. )
32.59 (21. 0)
Table A24.8. [CT.8] Harley et a!., 1980: Data for
SD , and Ranges fo r the Alcohol-Equated Sample
eterans Administration- Ho pitalized Patients: Means, umber of Errors per Five Age Intervals for the Who! amp! and for th WAlS
n
Age
FS IQ
VIQ
PIQ
Education
Errors
55-59
98.57 (11.43) 80-129
99.39 (12.92) 77-131
97.00 (10.65) 72-129
10.1
64.13 (2 .47) 19- 115
98.58 (9.93) 80-121
101.27 (11.42) 78-123
95.00 (9.82) 7 116
9.
59.7 (19.6 ) 30-110
To tal sample 56
45
35
~9
97.51 (11.18) 80-130
100.37 (12.51) 80-135
93.66 (10.20) 68-120
37
70-74
100.41 (9.92) 82-125
102.95 (11.81) 80-133
97.24 (10.08) 75-114
8.
85.60 (36.27) 21- 162
20
75-79
101.75 (10.18) 81-119
101.40 (11.40) 77-117
102.15 (9.95) 83-119
6.5
69.60 (26. 9) 19- 110
47
55-59
99.00 (11.73) 80-129
100.00 (13.02) 77-131
9 .00 (11.13) 72-129
10.1
65.43 (2 .51) 20-115
33
60-64
96.00 (9.43) 80-117
99.00 (11.33) 78-123
93.00 (9.30) 78-112
9.3
63.42 (19.24) 34-110
.7
72.65 (2 .96) 22-141
Alcohol-equated sample
APPENDIX 24
981
Table A24.8. (Contd.) WAIS
n
Age
FSIQ
VIQ
PIQ
Education
Errors
23
65-69
99.00 (12.06) 80-130
102.00 (13.06) 80-135
95.00 (11.52) 68-120
8.8
71.68 (31.14) 22.-141
37
7~74
10.00 (9.92) 82.-125
103.00 (11.81) 80-133
97.00 (10.08)
8.8
~114
85.60 (36.27) 21-162
20
~79
102.00 (10.18) 81-119
101.00 (11.40) 77-117
102.00 (9.95) 83-119
6.5
69.60 (26.89) 19-110
Table A24.9. [CT.9] Pauker, 1980: Data for Canadian Volunteers: Means and SDs for Total Errors for the Whole Sample and for Three Age Groups by Four WAIS IQ Levels WAIS IQ Age
89-102
1~112
113-122
123-143
89-143
19-34
n=21 61.67 (18.95)
n=53 40.08 (17.47)
n=60 29.82 (14.26)
n=28 23.64 (11.93)
n=162 36.24 (19.33)
35-52
n=20
n=34 59.06 (17.03)
n=56 42.77 (15.25)
n=25 37.52 (18.79)
n=135 50.79 (21.98)
n=4 90.00 (15.25)
n=15
n=20 47.60 (20.96)
n=66
(14.99)
n=27 58.85 (18.17)
n=45 70.47 (22.69)
n=102 49.89 (19.76)
n=143 40.37 (18.69)
n=73 34.96 (19.58)
n=363 45.69 (22.37)
75.80 (24.12) 53-71
19-71
63.80
Table A24.10. [CT.10] Fromm-Auch and Yeudall, 1983: Data for Canadian Volunteers: Mean Number of Errors, SDs, and Ranges for Each Age Grouping n
Age
Category Errors
32
15-17
35.8 (16.2) 16-68
74
18-23
35.9 (21.2)
9-106 56
30.5 (13.6) 1~
18
10
36.3 (14.3) 11-67 41-64
53.0 (21.0) 29-96
58.45 (20.59)
982
APPENDIX 24
Table A24.11. [CT.ll] Heaton et al., 1986: Data for a Sample of Normal Controls• WAIS n
319 134 100 132 249 172
Age
Education
Category Errors
% Classified
Mean sst 11.9 11.2 9.7 9.5 11.2 12.9
29.3 42.6 66.4 53.8 38.6 28.9
89.0 70.2 31.0 49.2 76.7 89.0
<40 40-59 ~60
<12 12-15 ~16
Normal
"Mean number of errors for the six subgroups as well as percent of subjects classi6ed as normal using Russell et al.'s (1970) criteria. tMean scaled scores for the Wechsler Adult Intelligence Scale subtests are reported.
Table A24.12. [Cf.12] Dodrill, 1987: Data for a Sample of Volunteers: Mean Number of Errors and SDs for the Whole Sample and for Various Levels of Intelligence Age
Education
FSIQ
VIQ
PIQ
Category Errors
120
27.73 (11.04)
12.28 (2.18)
100.00 (14.35)
100.92 (14.73)
98.25 (13.39)
35.74 (22.76)
n
FSIQ
Category Errors
n
FSIQ
Category Errors
n
7 18 34 64
93 101 75
130 125 120 115 110 105 100
29
60 48 33 19 10
23 21 22 26 30 33
95 90 85 80 75 70
36 41 47
54 70
77
Table A24.13. [CT.13] Yeudall et al., 1987: Data for Canadian Volunteers: Mean Number of Errors and SDs for the Whole Sample and for Each Age Group

n      Age      Education      % Right-Handed   FSIQ             VIQ              PIQ              Category Errors
62     15-20    12.16 (1.75)   79.09            111.75 (10.16)   111.18 (10.92)   108.30 (10.47)   33.88 (18.25)
73     21-25    14.82 (1.88)   86.03            109.79 (9.97)    110.48 (10.43)   105.88 (11.20)   35.10 (19.82)
48     26-30    15.50 (2.65)   89.58            113.95 (10.61)   114.40 (11.45)   110.28 (8.72)    30.52 (14.00)
42     31-40    16.50 (3.11)   90.48            116.09 (9.51)    117.76 (9.32)    109.72 (11.45)   36.28 (13.66)
225    15-40    14.55 (2.78)   85.78            112.25 (10.25)   112.60 (10.86)   108.13 (10.63)   33.97 (17.20)
APPENDIX 24
983
Table A24.14. [CT.14] Ernst, 1987: Data for Australian Volunteers: Mean Number of Errors and SDs for Each of the Seven Subtests as Well as Total Errors for Each Gender Separately

                                          Category Test Subtest
n     Gender    I           II          III           IV            V            VI           VII         Total
51    Male      0.1 (0.6)   0.5 (0.7)   19.9 (10.0)   13.7 (11.1)   16.5 (7.5)   10.6 (7.1)   5.8 (2.6)   66.7 (27.3)
59    Female    0.2 (0.6)   0.5 (0.8)   25.4 (7.3)    20.5 (10.6)   17.8 (6.5)   12.4 (5.7)   6.7 (2.3)   83.3 (21.6)
Table A24.15. [CT.15] Alekoumbides et al., 1987: Data for Veterans Administration Inpatients

                                        WAIS
n      Age             Education      FSIQ             VIQ              PIQ              Category Errors
112    46.85 (17.17)   11.43 (3.20)   105.89 (13.47)   107.03 (14.38)   103.31 (13.02)   62.04 (28.16)
Table A24.16. [CT.16] Bornstein et al., 1987a: Data for Healthy Volunteers: Means, SDs, and Ranges for Total Number of Errors for Both Testing Sessions*

n     Age           VIQ            PIQ            Test                  Retest
23    32.3 (10.3)   105.8 (10.8)   105.0 (10.5)   46.7 (25.3) 16-112    23.8 (19.0) 4-56

Raw Score Change    Median Raw Score Change    Mean % of Change
23.5                22                         46

*Raw score change, median raw score change, and mean percent of change from the test to the retest are also reported.
Table A24.17. [CT.17] El-Sheikh et al., 1987: Data for Egyptian Students: Mean Number of Errors and SDs for the Total Sample for the Two Testing Probes

n     Age          Test           Retest
32    20.6 (1.4)   29.5 (18.78)   9.84 (6.37)

Table A24.18. [CT.18] Russell, 1987: Data for Veterans Administration Patients Referred for Neuropsychological Evaluation with Negative Neurological Findings

                                        WAIS
n      Age             Education      FSIQ     VIQ      PIQ      Category Errors
155    46.19 (12.86)   12.29 (3.00)   111.9    112.3    109.9    52.11 (26.31)
APPENDIX 24
984
Table A24.19. [CT.19] Elias et al., 1990: Data for 183 Healthy Volunteers Partitioned into Three Age Groups

             n                            WAIS
Age Group    Male   Female   Education    VIQ    PIQ    Category Errors
20-31        41     47       15.7         119    116    26.60 (19.10)
37-49        23     38       15.4         122    122    33.90 (22.60)
55-67        12     22       14.9         124    121    56.80 (32.60)
Table A24.20. [CT.21] Elias et al., 1993: Data for 427 Healthy Volunteers Partitioned into Six Age Groups by Gender

             n                  Category Errors
Age Group    Male   Female      Male            Female
15-24        37     24          27.51 (3.43)    25.96 (3.05)
25-34        40     56          37.70 (4.23)    30.11 (2.40)
35-44        36     56          37.58 (3.59)    41.86 (3.64)
45-54        25     46          39.88 (5.37)    51.91 (3.62)
55-64        25     35          57.72 (5.68)    58.14 (4.90)
≥65          24     23          63.04 (4.79)    74.57 (5.60)

*Education range for the sample is 12-19 years.
Table A24.21. [CT.22] Barrett et al., 2001: Data for Air Force Veteran Controls*

n        Age          Category Errors
1,052    43.9 (7.6)   37.29

*SDs are not provided for the test scores.
Appendix 24m: Meta-Analysis Tables for the Category Test (CT)

Table A24m.1. Results of the Meta-Analysis and Predicted Scores for the Category Test (Relevant values are weighted on the standard error for the test mean)

Description of the aggregate sample
Number of studies included in the analysis      11
Years of publication                            1955-1987
Number of data points used in the analysis      25
  (a data point denotes a study or a cell in education/gender-stratified data)
Total number of participants                    1,579
n•
xt
sot
Mean
25
50.98
39.41
1~162
Age Mean SD
25 25
44.65 4.35
19.66 4.75
16.~77.0
16
11.11
2.92
6.5-16.5
11
2.17
0.86
Uh'3.2
16 15
106.06 10.53
7.39 1.80
97.5-118.3 6.8-14.4
1
70.00
70.00
70.0
25 25
49.84 22.57
18.43 6.44
22.8-85.6 11.8-36.3
Variable
Range
s-p~..-
&luc:cdion Mean SD
1.~17.2
IQ
Mean SD Perwa~
-'e
r..,_,.._ Combined mean Combined SO
•Number of data points differs for different malyses due to missing data. 'weighted means and SDs. (continued)
985
APPENDIX 24M
986 Table A24m.1. (Contd.)
Predicted number of errors and SDs per age group• (Category Test)
95%CI
95%CI Age Bt.mge
Predicted Score
l~l9
26.96
JS-!9
30.83
30-34
S0-54
35.12 39.41 43.70 47.99 52.29
55-59
56.58
~
60.88 65.17 69.46 73.75
35-39 40-44
45-49
6S-69
10-14 15-19
Lower Band
Upper Band
Predicted
21.81 26.24 31.12 35.96 40.73 45.38 49.88 54.16 58.25 62.17 65.99 69.73
32.11 35.42 39.12 42.86 46.68 50.61 54.71 59.00 63.50 68.17 72.94 77.78
16.59 17.60 18.72 19.84 20.96
SD
U.09 !3.21 24.33
25.45 26.57 27.69 28.82
Lower Band
Upper Band
14.48 15.61 16.87 18.07 19.24 20.35 21.41 22.44 23.42 24.38 25.31 26.22
18.71 19.58 20.57 21.61 22.69 23.82 25.00 26.22 27.48 28.77
*Based on the equations:
Predicted test score = 11.50841 + 0.8585716 * age
Predicted SD = 12.55292 + 0.2243168 * age
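The two Category Test equations can be combined to place a raw error count relative to the age-predicted distribution. The sketch below is illustrative only: it assumes that errors are approximately normally distributed around the predicted mean, which the appendix does not claim for individual scores, and the example age (22.5) and error count (48) are hypothetical.

```python
import math


def predicted_errors(age):
    """Predicted number of Category Test errors at a given age."""
    return 11.50841 + 0.8585716 * age


def predicted_sd(age):
    """Predicted SD of Category Test errors at a given age."""
    return 12.55292 + 0.2243168 * age


def proportion_at_or_below(raw_errors, age):
    """Proportion of the predicted distribution with this many errors or fewer.

    Assumption: a normal distribution with the predicted mean and SD.
    """
    z = (raw_errors - predicted_errors(age)) / predicted_sd(age)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical examinee, used only to illustrate the calculation.
print(f"{proportion_at_or_below(48, 22.5):.3f}")
```

Because the score is an error count, a proportion near 1 indicates more errors than most of the predicted distribution, that is, poorer performance.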
Significance tests for regression with the test scores
Ordinary least-squares regression of test means on age (linear)
Number of observations    25
Number of clusters        11
R2                        0.839
F(df), p                  F(1,10) = 138.24, p < 0.000

Term        Coefficient   SE      t       p       95% CI
Age         0.8585716     0.073   11.76   0.000   0.659 to 1.021
Constant    11.50841      3.841   3.00    0.013   2.949 to 20.067

Prediction
Predicted age range     16-97 years
Mean predicted score    50.18 (15.42)
SEe                     1.73
95% CI                  46.79-53.57
30.08 31.41
Figure A24m.1. A scatterplot illustrating the dispersion of the data points around the regression line for the Category Test. The size of the bubbles reflects the weight of the data point, with larger bubbles indicating larger standard error and smaller weight.
Tests for assumptions and model fit

Tests for heterogeneity in the final data set
Pooled estimates for fixed effect                    39.920
Pooled estimates for random effect                   44.644
Q(df), p                                             Q(24) = 580.80, p < 0.000
Moment-based estimate of between-study variance      144.747

Tests for model fit: addition of a quadratic term
Model        R2      Adjusted R2   BIC      BIC'
Linear       0.839   0.832         95.944   -42.431
Quadratic    0.843   0.829         98.474   -39.901

BIC' difference of 2.529 provides positive support for the linear model.

Tests for parameter specifications
Normality of the residuals    Shapiro-Wilk W test     W = 0.947, p = 0.213
Homoscedasticity              White's general test    2.116, p = 0.341
Significance tests for regression with the SDs
Ordinary least-squares regression of SDs on age (linear)
Number of observations    25
Number of clusters        11
R2                        0.469
F(df), p                  F(1,10) = 57.54, p < 0.000

Term        Coefficient   SE      t      p       95% CI
Age         0.2243168     0.030   7.59   0.000   0.158 to 0.290
Constant    12.55292      1.453   8.64   0.000   9.315 to 15.791

Prediction
Mean predicted SD    22.66 (4.03)
SEe                  1.02
95% CI               20.65-24.66
Effects of demographic variables

Education
Est. tau2 without education    354.80
Est. tau2 with education       116.70
Regression of test means on education and age
Number of observations    16
Number of clusters        8
R2                        0.874

Term         Coefficient   SE      t       p       95% CI
Education    -0.079        2.079   -0.04   0.970   -4.99 to 4.84

IQ
Est. tau2 without IQ    354.80
Est. tau2 with IQ       106.90
Regression of test means on IQ and age
Number of observations    16
Number of clusters        8
R2                        0.903

Term    Coefficient   SE      t       p       95% CI
IQ      -0.8418495    0.299   -2.81   0.026   -1.549 to -0.134
Appendix 25: Locator and Data Tables for the Wisconsin Card Sorting Test
Study numbers and page numbers provided in these tables refer to study numbers and descriptions of studies in the text of Chapter 25.
Locator table also provides a reference for each study to a corresponding data table in this appendix.
Table A25.1. Locator Table for the Wisconsin Card Sorting Test (WCST) Study
WCST.l Heaton, et al., 1993 page 513 Data are not reproduced in
this book
Age•
n
6-19 20-29 30-39 -ID-49 50-59 60-64
899
Sample Composition 551 M, 348 F; nonneurological or psychiatric
IQ/Education• $8 9-11 12 13-15 16-17
Location Southwestern USA, Colorado, Texas, Detroit, Washington DC
~18
65-69
70-74 75-79 80-84
WCST.! Beatty, 1993 page 514 Table A25.2
18-34 35-59
WCST.3 Boone et al., 1993 page 514 TableA25.3
45-49 60-69 70-83
65
31 M, 34 F; nonneurological or psychiatric
91
35 M, 56 F; fluent English speakers; 14.5 {2.5} nonneurological or psychiatric; 71 were white, 10 FSIQ: were African American, 115.89 {12.97} 5 were Asian, and 5 were Hispanic
60+
15.0 {2.2} 15.8 {2.5} 15.5 {4.2}
California
Los Angeles, California
{continued}
989
990
APPENDIX 25
Table A25.1. (Contd.) Age•
n
WCST.4 Stratta et al., 1993 page 515 Table A25.4
31.93 (5.95)
61
WCST.S Kramer et al., 1994 page 516 Table A25.5 WCST.6 Spencer & Raz, 1994 page 516 Table A25.6 WCST. 7 Paolo et al., 1995 page 517 TableA25.7 WCST.8 Artiola i Fortuny& Heaton, 1996 page 517 Table A25.8 WCST.9 Hoff et al., 1996 page 517 Table A25.9
18-28 60-74
Study
Sample Composition
Is; nonneurological or ~c ~
I
32 30
26 ~· 36 F; healthy volunteers
IQ/ Education• 0-8 9-13
Location
L'Aquila, Italy
~14
12.65 (4.30) 16.4 (1.1)
16.3 (1.8)
¥·
18-35 65-80
32 32
23 41 F; undergraduate students; nctnneurological or psychiatric; color blindness
13.5 15.3
69.74 (6.96)
187
14.91 (2.57)
Kansas City, KS
27.32 (9.11)
119
69 •• 118 F; nonneurological or !Ehiatric; 97% were White, % were Black, and 1.1% were panic 51 M, 68 F; Spanish-speaking; nonneurological or psychiatric
14.35
Madrid, Spain
32.1 (9.7)
54
All-male sample; nonneurological o. psychiatric; 48 -re White, 4 were African APterican, and 2 were Hispanic
WCST.lO Paolo et al., 1996 page 518 Table A25.10
68.79 (6.21)
87
WCST.ll Rosselli & Ardila, 1996 page 518 Table A25.11 WCST.l2 Salthouse et al., 1996 page 519 Table A25.12 WCST.l3 Compton et al., 1997 page 519 Table A25.13 WCST.l4 Fristoe et al., 1997 page520 Table A25.14
25.61 (7.54)
63
25M, 62 F; nonneurological or pfycluatric; 95% were White, 29& were African American, aad 2% were Hispanic; ~tial testing and retesting irl1year , All-Jilale sample; no exclusion criteria ate provided
54.1 (18.4)
259
47.74
52
(11.77)
18-35 65-80
48 49
n;
(2.25)
15.4 (2.4) VIQ: 115.1 (12.6) 14.80 (2.42)
10.51 (4.58)
27~
15.0
30 J,l, 22 F; participants were non~ology faculty members of the Georgia College and State University; no exclusion criteria are provided P~cipants were divided into younger and c{der groups; younger group consisted 25% male, older group of 35% Itaaie; no exclusion criteria are pded
18.44 (1.69)
M, 63% F; healthy participants r;cruited; no exclusion criteria life provided
cf
13.3 (1.3) 13.9 (2.0)
Bogota, Colombia
APPENDIX 25
991
Table A25.1. (Contd.) Study
IQ/ Education•
Location
Age"
n
WCST.l5 Artiola i Fortuny et al .• 1998 page520 Table A25.15
39.25 (14.8)
390
138 M, 252 F; 205 &om Madrid, Spain, 185 from US-Mexico border; nonneurological or psychiatric
11.15 (5.25)
Madrid, Spain, Mexico; Arizona
WCST.I6 Boone, 1998 page 521 Table A25.16
<65 >65 63.07 (9.29)
155
53 M, 102 F; nonneurological or
14.57 (2.55) FSIQ: 115.41 (14.11)
California
WCST.l7 Mejia et al .• 1998 page 521 Table A25.17 WCST.I8 Basso et al .• 1999 page522 Table A25.18
55-70 71-a5
60
21 M, 39 F; Spanish-speaking; nonneurological or psychiatric
2-5
Medellin, Colombia
32.50 (9.27)
82
82 M participants were recruited; 50 were retested in 1 year; 48 were Caucasian, 1 African American, and 1 Hispanic; nonneurological or psychiatric
14.98 (1.93)
WCST.19 Gooding et al., 1999 page522 Table A25.19
18.72 (0.86)
104
43 M, 61 F; undergraduate students; nonneurological or psychiabic
CoUege
WCST.SO Merriam et al .• 1999 page 522 Table A25.20
26.08 (7.67)
Sample Composition
psychiatric
6-11
IQ: 116.26 (12.56) 61
Healthy volunteers; nonneurological or psychiatric
14.66 (2.39) IQ: 103.90 (9.22)
WCST.Jl Rey et al., 33.45 (19.75) 1999 page 523 Table A25.21
75
12-15 >15
WCST.B Snitz et al .• 1999 page523 Table A25.22
36.0 (13.4)
54
19 M, 56 F; primarily Spanish-speaking; 53 from Cuba, 3 &om Peru, 1 &om Venezuela, 6 &om Puerto Rico, 1 &om Panama, 6 from Colombia, 1 &om Honduras, 8 from Nicaragua. and 19 &om "other" nationalities 19 M, 35 F; nonneurological or psychiatric
19.11
63
WCST.J3 Tallent &: Gooding. 1999 page5J4 Table A25.23
Madlson, WI
(1.03)
WCST.J4 Compton et al.• 2000 page5J4 Table A25.J4
30-39 40-49 50-59 60+
WCST.JS Ismail et al.• 2000 page525 Table A25.25
35.9
102
75
22M, 41 F; undergraduate students; primaly language was English; nonneurologlcal or psychiatric
14.53 (3.25)
15.0 (1.7) IQ: 109.7 (13.4)
Minneapolis, MN
CoUege students IQ: 114.63 (11.93)
WJSCODsin
53 M, 49 F; healthy subjects; English was primaly language
59 M, 16 F; nonneurological or psychiatric
Dade County, FL
Atlanta, GA
13.2 (3.8)
(continued)
APPENDIX 25
992 Table A25.1. (Contd.) Study
Age"
n
WCST.26 Laiacona et al., 2000 page 525 Table A25.26
1~29
205
30-30 40-49 50-59
Sample Composition
IQ/ Education•
100M, 105 F; nonneurological or psychiatric
~7
30 M, 34 F; employees and relatives/acquaintances of hospital staff; nonneurological or psychiatric 33M, 71 F; nonneurological or psychiatric
14.69 (2.99)
261
35% M; no specific exclusion criteria are provided; data are stratified into 3 age groupings
15.9 (2.8)
897
435 M, 462 F; nonneurological or psychiatric
8-12 13-16 17-24
Location
Costa Masapa. Italy
60-69
WCST.27 Rossi et al., 2000 page526 Table A25.27 WCST.28 Razani et al., 2001 page526 Table A25.28 WCST.29 Salthouse et al., 2003 page 527 Table A25.29 WCST.30 Kongs, et al., 2000 page 527 Data are not reproduced in this book
70-85 26.4 (5.44)
64
60.36
104
(9.64)
18-39 40-59 60-84 48.2 (17.2) 6-19 ~29
30-39 40-49 50-59 60-64
L'Agul]a. Italy
14.82 (3.31) FSIQ: 116.81 (14.06)
~8
9-11 12 13-15 16-17 2;18
Southwestern USA. • Colorado, Teas, Detroit, Washmgtm DC
~
70-74 7~79
80-84
WCST.31 Axelrod et al., 1993 page528 Table A25.30
20 20 20 20 20 20 20
55 M, 85 F; undergraduate students from Wayne State University or newspaper ads; no exclusion criteria are provided for participants younger than 50; nonneurological or psychiatric conditions in older participants
15.4 (1.6) 15.2
71.34 (5.73)
35
22 M, 13 F; nonneurological or psychiatric conditions
13.11 (2.03)
18-29
115
All male; Spanish speakers; participants were tested in Spanish; nonneurological or psychiatric conditions
0-6
~29
30-39 40-49 50-59 60-69
70-79 80-89
WCST.32 Paolo et al., 1996b page 529 Table A25.31 WCST.33 LopezCarlos et al., 2003 page 529 Tables A25.32, A25.33
30-49
28.89 (8.37)
15.6
Detroit, MI
(1.2)
(2.2)
15.4 (2.5) 14.4 (3.0) 14.5 (4.2) 14.5 (4.1)
7-10 5.82 (2.49)
Los Angeles, CA; Jalisco, Mexico
Table A25.1. (Contd.) IQ/ Education•
Location
Study
Age"
n
WCST.34 Bondi et al., 1993 page 530 Table A25.34
71.1 (7.6)
75
27 M, 48 F; nonneurological or psychiatric conditions
13.7 (2.6)
San Diego, CA
77
19 M, 58 F; Caucasian participants; nonneurological or psychiatric conditions
IQ: 109.5 (13.2)
United Kingdom
35 72
52 M, 55 F; control healthy participants; no other exclusion criteria provided
12.05 (3.03) 8.54 (1.18)
France
29 84 89 27
97 M, 132 F; normal older subjects; 78% white, 21% Mexican American or European Spanish American, 1% African American, and 1% Cuban American; nonneurological or psychiatric conditions
1-6 7-12 13-16 17-20
San Diego, CA
WCST.35 van den Broek et al., 1993 page 530 Table A25.35
35.2 (12.8)
WCST.36 Isingrini & Vazou, 1997 page 530 Table A25.36
18-35 65-80
WCST.37 Lineweaver et al., 1999 page 531 Table A25.37
45-59 60-69 70-79
80-91
Sample Composition
•Age column and IQ/education column contain information regarding range and/or mean and standard deviation for the whole sample and/or separate groups, whichever information is provided by the authors.
Table A25.2. [WCST.2] Beatty, 1993 (128-Card Administration Version): Data for a Control Sample Stratified into Three Age Groups
n: 65
M/F Ratio: 31/34
                            Age Groups
Variables•                  18-34          35-59          >60
Age                         25.5 (5.7)     40.6 (6.1)     70.9 (6.5)
Education                   15.0 (2.2)     15.8 (2.5)     15.5 (4.2)
CAT                         5.8 (0.7)      5.8 (0.8)      4.6 (1.9)
PR                          11.3 (7.5)     11.0 (9.8)     24.2 (27.6)
PE                          9.9 (5.8)      9.9 (8.5)      20.5 (21.5)
NPR                         10.1 (7.6)     7.7 (8.1)      13.0 (9.4)
Trials to First Category    13.3 (5.2)     13.5 (6.6)     18.8 (25.6)
Set Failure                 0.6 (1.1)      0.5 (0.7)      1.1 (1.4)
•CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors; NPR, Nonperseverative Responses.
Table A25.3. [WCST.3] Boone et al., 1993 (128-Card Administration Version): Data for a Control Sample Stratified into Three Age Groups, Gender, and Three Educational Levels
Age Groups
Variables
45-49
Age
n M/F ratio Education
38 13/25
60-69
Gender Male
Female
≤12
-
62.29 (9.67)
61.45 (9.49)
62.15 (8.75)
31
22
35
56
26
13/18
9/13
-
70-83
Education (Years)
11/15
13-16 61.83 (9.58)
>16 61.00 (10.90)
48
17
14/34
10/7
14.55 (2.69)
14.29 (2.12)
14.41 (2.86)
14.94 (2.89)
14.11 (2.24)
114.53 (14.44)
114.48 (12.76)
117.8& (12.53)
.l1Q.U
11i.Q3.
110.35
. -.. ----..ll5.2a
123JIO
(13.41)
(12.89)
(12.26)
(12.85)
(13.49)
4.61 (1.90)
5.13 (1.43)
4.14 (1.96)
4.03 (1.89)
5.07 (1.62)
4.42 (1.68)
4.63 (1.96)
5.18 (1.38)
PR
19.81 (16.79)
18.23 (14.57)
27.23 (20.54)
27.34 (19.82)
17.09 (14.23)
26.81 (21.22)
20.55 (15.36)
13.77 (12.80)
Errors
31.24 (20.30)
29.48 (19.21)
44.68 (20.45)
41.00 (21.82)
27.42 (18.78)
39.62 (21.57)
34.49 (19.59)
23.65 (19.64)
%PR
15.15 (9.68)
14.27 (8.37)
19.54 (11.06)
19.43 (11.28)
13.76 (8.02)
19.34 (12.05)
15.62 (8.67)
11.75 (7.11)
62.87 (18.67)
64.82 (17.81)
51.16 (18.91)
54.41 (20.06)
64.67 (17.39)
55.28 (18.77)
60.11 (18.90)
70.52 (16.97)
Trials to First Category
13.60 (5.45)
13.81 (5.74)
19.68 (23.64)
17.52 (18.78)
13.66 (6.20)
18.54 (21.89)
14.45 (6.34)
11.94 (2.05)
Set Failure
0.92 (1.23)
0.87 (1.09)
1.05 (1.13)
1.20 (1.30)
0.76 (1.02)
0.92 (1.09)
0.94 (1.09)
0.94 (1.44)
Other Responses
1.35 (2.06)
1.94 (3.35)
2.36 (3.33)
1.67 (2.98)
1.89 (2.83)
1.85 (3.03)
2.17 (3.07)
0.71 (1.72)
FSIQ-
-
WCST
CAT
% Conceptual
•For the majority of subjects, the Satz-Mogel format (Adams et al., 1984) was used to obtain age-corrected FSIQ, but for nine subjects over the age of 74 the Ryan et al. (1990) tables were used. CAT, Categories Completed; PR, Perseverative Responses.
Table A25.4. [WCST.4] Stratta et al., 1993 (128-Card Administration Version): Data for a Control Sample Stratified into Three Education Groups
n: 61
Age: 31.93 (5.95)
Education: 12.65 (4.30)
                     Education Group
WCST Measure•        0-8 (n=18)       9-13 (n=23)      ≥14 (n=20)
CAT                  3.00 (2.02)      5.08 (1.41)      4.15 (2.23)
PE                   26.05 (12.33)    20.34 (10.30)    15.95 (14.73)
Total Errors         46.55 (17.65)    30.39 (12.54)    27.35 (17.96)
Other Responses      7.38 (6.40)      2.69 (3.93)      2.70 (4.49)
•CAT, Categories Completed; PE, Perseverative Errors.
Table A25.5. [WCST.5] Kramer et al., 1994 (128-Card Administration Version): Data for a Control Sample Stratified into Two Age Groups
                                Age Group
Variables                       18-28            60-74
n                               32               30
Age•                            20.6             67.8
M/F ratio                       12/20            14/16
Education                       16.4 (1.1)       16.3 (1.8)
IQ†                             117.8 (8.5)      117.6 (8.4)
CAT                             5.9 (0.5)        4.20 (2.1)
Total Errors                    14.8 (11.2)      36.7 (26.4)
PR                              11.1 (7.9)       26.5 (24.1)
PE                              8.3 (6.1)        20.0 (18.9)
Trials to First Category        12.2 (3.9)       14.6 (8.1)
% Conceptual Level Responses    63.4 (7.0)       57.2 (21.9)
•SDs were not provided. †IQ was based on the Kaufman Brief Intelligence Test. CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors.

Table A25.6. [WCST.6] Spencer and Raz, 1994 (128-Card Administration Version): Data for a Control Sample Stratified into Two Age Groups
                  Age Group
Variables         18-35             65-80
n                 32                32
M/F ratio         12/20             11/21
Age•              23.8              69.5
Education•        13.5              15.3
CAT               5.06 (1.70)       3.14 (2.13)
Total Errors      29.47 (16.30)     46.53 (20.77)
PE                25.75 (16.22)     41.63 (19.98)
•SDs were not provided. CAT, Categories Completed; PE, Perseverative Errors.
Table A25.7. [WCST.7] Paolo et al., 1995 (128-Card Administration Version): Data for a Control Sample Variables
Values
n M/F ratio
187
Age Education
69/118 69.74 (6.96) 14.91 (2.57)
Table A25.8. [WCST.8] Artiola i Fortuny and Heaton, 1996 (128-Card Administration Version): Data for a Healthy Sample from Madrid, Spain, for Standard and Computerized Administrations Administration Variables
n
60
Age Education
WCST
CAT
4.84 (1.72)
Standard Computerized 59
27.32 (10.82)
27.32 (7.06)
14.13 (2.33)
14.58 (2.16)
96.65 (22.74)
(21.15)
WCST
Total Trials
99.68
Total Errors
30.97 (20.73)
Total Correct
PE
17.06 (12.86)
71.60 (11.32)
75.54 (9.58)
Total Errors
PR
19.55 (15.92)
24.05 (19.52)
24.14 (13.28)
PR
NPE
13.87 (9.93)
15.23 (14.58)
12.47 (6.95)
PE
Set Failure
0.91 (1.12)
13.45 (11.86)
11.47 (6.10)
NPE
Trials to First Category
13.32 (7.61)
10.60 (9.77)
12.66 (8.29)
%Conceptual Level Responses
71.50
%Conceptual Level Responses
63.93
(18.79)
72.01 (12.04)
5.33 (1.39)
5.39 (1.21)
14.80 (11.69)
20.91 (19.55)
0.75 (1.03)
1.20 (1.74)
-1.38
-1.13 (5.58)
(21.48) Learning to Learn
-2.43 (4.94)
CAT Trials to First Category Set Failure Learning to Learn
(3.23)
PR, Perseverative Responses; PE, Perseverative Errors; NPE, Nonperseverative Errors; CAT, Categories Completed.
Table A25.9. [WCST.9] Hoff et al., 1996 (128-Card Administration Version): Data for an All-Male Control Sample
Table A25.10. [WCST.10] Paolo et al., 1996 (128-Card Administration Version): Data for a Control Sample
Variables
n
54
Age
32.1 (9.7)
Education Verbal IQ (prorated)
Initial
Values
15.4 (2.4) 115.1 (12.6)
5.4 (0.9)
Total Errors
17.9 (14.1)
PR
9.0 (10.0)
68.79 (6.21)
Education
14.80 (2.42) 4.84 (1.76)
4.86 (1.89)
17.76 (22.79)
18.44 (23.32)
1.16 (1.39)
0.69 (1.20)
-2.66 (5.26)
-1.71 (4.61)
Normalized WCST•
Variables
Values
n
63
Age
25.61 (7.54) 10.51 (4.58)
WCST 69.70 (7.60) 5.80
(0.50) Total Errors
25/62
Age
Learning to Learn
Version): Data for an All-Male Control Sample
CAT
87
Set Failure
Table A25.11. [WCST.11] Rosselli and Ardila, 1996 (128-Card Administration
Total Correct
n M/F ratio
Trials to First Category
CAT, Categories Completed; PR, Perseverative Responses.
Education
Testing
RawWCST CAT
WCST CAT
Variables
Retest (1.1-Year Follow-Up)
21.00 (12.40)
PE
9.90 (6.80)
PR
11.70 (8.00)
NPE
11.20 (7.70)
CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses; NPE, Nonperseverative Errors.
Total Errors
105.76 (17.69)
111.41 (19.58)
PR
104.86 (15.92)
111.94 (17.90)
PE
105.57 (16.91)
112.17 (18.03)
NPE
105.17 (19.42)
108.63 (22.02)
%Conceptual
106.21 (18.08)
111.17 (19.74)
Level Responses
•Normalized age- and education-corrected standard scores (mean=100, SD=15). CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors; NPE, Nonperseverative Errors.
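The standard-score metric noted in the footnote (mean = 100, SD = 15) can be related to z-scores and percentile ranks as in the minimal sketch below. It assumes the normalized scores follow a normal distribution; the function names are hypothetical illustrations and are not taken from the cited study.

```python
from statistics import NormalDist

# Metric stated in the table footnote: mean = 100, SD = 15.
STANDARD_MEAN = 100.0
STANDARD_SD = 15.0


def standard_score_to_z(score: float) -> float:
    """Convert a standard score (mean 100, SD 15) to a z-score."""
    return (score - STANDARD_MEAN) / STANDARD_SD


def standard_score_to_percentile(score: float) -> float:
    """Percentile rank of a standard score, assuming a normal distribution."""
    return 100.0 * NormalDist(STANDARD_MEAN, STANDARD_SD).cdf(score)


if __name__ == "__main__":
    for score in (85.0, 100.0, 115.0):
        z = standard_score_to_z(score)
        pct = standard_score_to_percentile(score)
        print(f"standard score {score:.0f}: z = {z:+.2f}, percentile = {pct:.1f}")
```

Under these assumptions a standard score of 115 lies one SD above the normative mean, at roughly the 84th percentile.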
Table A25.12. [WCST.12] Salthouse et al., 1996 (128-Card Administration Version): Data for a Healthy Sample Variables
Values
n
259
%Male Age
27 54.1 (18.4) 15.0
Education• WCST Number of Trials CAT
Variables
Values
Age
47.74 (11.77)
Education
18.44 (1.69)
WCST CAT
4.96 (1.59) 27.6 (15.2)
%PR
17.7 (14.8)
%PE
15.6 (11.4)
4.02 (2.23)
Trials to First Category
19.73 (17.18)
Total Trials
111.14 (21.11)
CAT, Categories Completed.
11.9 (6.8)
% Conceptual Level Responses
65.7 (20.5)
Trials to First Category†
15.1 (11.5)
Set Failure
Faculty
100.7 (23.4)
%Errors
%NPE
Table A25.13. [WCST.13] Compton et al., 1997 (128-Card Administration Version): Data for a Sample of University
0.69
Table A25.14. [WCST.14] Fristoe et al., 1997 (128-Card Administration Version): Data for a Healthy Sample Using the Computerized Version Stratified into Two Age Groups
(1.04)
Learning to Learn‡
Age Group
-3.64 (6.90)
•Educational level is approximated from the available data, and the SD is not available. †n=256. ‡n=254. CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Variables
18-35
65-80
n
48
%Male
25
49 35
Age
26.7 (5.7)
70.1 (7.!)
Education
13.3 (1.3)
13.9 (lLO)
CAT
4.8 (1.9)
3.1 (2.0)
Conceptual Level Responses
65.5 (17.5)
55.2 (20.3)
% Conceptual Level Responses
61.3 (18.7)
47.2 (20.7)
PE
16.4 (10.0)
25.2 (llL1)
14.1 (7.1)
20.1 (8.8)
%PE
CAT, Categories Completed; PE, Perseverative Errors.
Table A25.15. [WCST.15] Artiola i Fortuny et al., 1998 (WCST 128-Card Version): Data for Healthy Spanish-Speaking US-Mexico and Madrid, Spain, Groups
Variables       US-Mexico        Madrid, Spain
n               185              205
M/F ratio       47/138           91/114
Age             42.2 (13.5)      36.3 (16.1)
Education       9.6 (6.1)        12.7 (4.4)
CAT             3.4 (1.6)        4.6 (1.6)
PR              33.6 (19.2)      21.8 (19.1)
CAT, Categories Completed; PR, Perseverative Responses.
Table A25.16. [WCST.16] Boone, 1998 (128-Card Administration Version): Data for a Healthy Sample Stratified into Two Vascular Status Groups by Two Age Groups and Three IQ Groups Age Group Values
WCST Measure
Vascular Status
<65 years
Demographic
n
155 53/102 63.07 (9.29) 14.57 (2.55) 115.41 (14.11)
CAT
Healthy
4.95 (1.60)"
Vascular
4.58 (2.00)
M/F ratio Age Education FSIQ•
PE
Healthy
17.24 (14.~)
%Conceptual Level
Trials to First Category
High
years Average
Average
Superior
5.10 (1.30)
4.71 (1.70) n=34
4.88 (1.45) n=25
5.27 (1.45) n=44
3.68 (2.30)
3.53 (2.53) n=17
3.77 (2.01) n=13
4.76 (1.89) n=17
21.94 (16.41)
19.00 (12.63)
15.07 (14.19)
20.70 . - - ___ (~.70) ..
....
·;=-~·-
-- . --·
·,;-;,25
n=44
Vascular
25.11 (27.50)
32.52 (32.40)
39.31 (42.68) n=16
27.46 (25.65) n=13
21.71 (15.00) n=17
Healthy
64.95 (18.60)
62.84 (18.60)
59.87 (18.01) n=33
61.10 (17.17) n=25
69.51 (18.79) n=44
Vascular
58.02 (23.40)
47.40 (24.60)
45.32 (29.84) n=17
47.19 (20.79) n=13
61.52 (18.49) n=17
Healthy
29.51 (20.60)
33.33 (21.50)
36.21 (20.55) n=33
33.72 (19.27) n=25
24.71 (20.82) n=44
Vascular
35.89 (24.50)
45.41 (24.80)
46.81 (30.51) n=16
46.77 (21.02) n=13
32.41 (19.73) n=17
Healthy
14.12 (6.00)
13.30 (3.50)
14.53 (5.22) n=34
14.16 (6.40) n=25
13.23 (4.94) n=44
Vascular
19.16 (26.80)
30.46 (40.40)
40.76 (50.14) n=16
24.15 (30.61) n=13
12.53 (2.55) n=17
Responses
Total Errors
IQ Group ≥65
•The Satz-Mogel format (Adams et al., 1984) was used to obtain age-corrected FSIQ. CAT, Categories Completed; PE, Perseverative Errors.
Table A25.17. [WCST.17] Mejia et al., 1998 (128-Card Administration Version): Data for a Healthy Colombian Sample Stratified into Two Age and Two Education Groups
Education Group
Age Group Variables
Entire Sample
n
60
M/F ratio
21/39
Age
69.66
71-85
55-70
2-5 Years
6-11 Years
28
32
42
18
6/22
15/17
16/26
5/13
(7.09) 2.60 (1.16)
2.25 (1.48)
2.65 (1.29)
1.89 (1.32)
PR
42.60 (19.63)
51.84 (30.85)
44.73 (22.39)
53.57 (33.43)
PE
40.14 (16.75)
43.68 (21.65)
38.92 (16.10)
48.73 (24.35)
NPE
26.92 (12.34)
25.25 (12.33)
27.07 (11.72)
23.78 (13.38)
1.10 (1.37)
1.00 (1.19)
1.00 (1.16)
1.15 (1.50)
CAT
Set Failure
CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Table A25.18. [WCST.18] Basso et al., 1999 (128-Card Administration Version): Data for a Healthy Sample Using the Computerized Version Stratified into Two Age Groups Testing Session Variables
Baseline
Table A25.18. (Contd.) Testing Session Variables
16.02 (12.82)
9.34 (7.70)
% Conceptual Level Responses
70.23 (17.94)
76.10 (18.74)
-3.14 (5.76)
-0.72 (3.90)
1.16 (1.67)
0.80 (1.16)
12 Months
50
Age
32.50 (9.27)
Learning to Learn
Education
14.98 (1.93)
Set Failure
5.16
5.42
(1.38)
(1.515)
Total Trials
101.12 (22.87)
84.74 (18.59)
%Correct
76.48 (12.39)
80.99 (12.45)
Total Errors
26.12 (18.04)
16.68 (11.88)
PE
14.20 (10.53)
8.44 (6.16)
%PE
12.79 (7.52)
9.84 (7.45) (continued)
12 Months
PR
n
CAT
Baseline
CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses.
Table A25.19. [WCST.19] Gooding et al., 1999 (128-Card Administration Version): Data for a Control Sample of College Students
Variables
Values
n
104
M/F ratio
43/61
Age
18.72 (0.86)
WAIS-R IQ"
Table A25.20. (Contd.) Variables
Values
NPE
11.16 (8.78)
Set Failure
0.75 (1.15)
Trials to First Category
13.61 (5.74)
% Conceptual Level Responses
116.26 (12.56)
Learning to Learn
WCST CAT
6.00 (0.00)
PE
7.58 (3.20)
NPE
8.29 (4.20)
Set Failure
0.33 (0.50)
Trials to First Category
13.08 (4.27)
Conceptual Level Responses
74.31 (13.90)
37.21 (29.80)
0.17 (3.10)
CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses; NPE, Nonperseverative Errors.
Table A25.21. [WCST.21] Rey et al., 1999 (128-Card Administration Version): Data for a Healthy Spanish-Speaking Sample
Education Group
"Prorated from the Block Design and Vocabulary subtests of the WAIS-R. CAT, Categories Completed; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Variables n
Age
Table A25.20. [WCST.20] Merriam et al., 1999 (128-Card Administration Version): Data for a Control Sample Variables
Values
Entire Sample 75
12-15 Years
>15 Years
25
30
33.45 (19.75)
Education
14.53 (3.25)
M/F ratio CAT
19/56 5.5 (1.2)
5.4 (1.0)
5.8 (0.8)
PE
10.7 (8.7)
11.6 (9.4)
8.9 (6.6)
n
61
Age
26.08 (7.67)
PR
11.6 (9.7)
12.7 (10.6)
9.6 (7.3)
Education
14.66 (2.39)
Trials to First Category Set Failure
12.9 (8.5)
12.4 (3.3)
11.7 (2.0)
0.6 (0.8)
0.5 (0.6)
0.6 (1.0)
WAIS-R IQ
103.90 (9.22)
WCST CAT
5.64 (1.02)
PE
10.26 (7.05)
PR
11.03 (8.26) (continued)
CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses.
Table A25.22. [WCST.22] Snitz et al., 1999 (128-Card Administration Version): Data for a Control Sample
Variables        Values
n                54
M/F ratio        19/35
Age              36.0 (13.4)
Education        15.0 (1.7)
WAIS-R IQ        109.7 (13.4)
WCST
CAT              4.6 (2.0)
PE               17.2 (14.9)
NPE              21.8 (9.1)
Set Failure      1.5 (1.3)
CAT, Categories Completed; PE, Perseverative Errors; NPE, Nonperseverative Errors.

Table A25.23. [WCST.23] Tallent and Gooding, 1999 (128-Card Administration Version): Data for a Control Sample of College Students
Variables                      Values
n                              63
M/F ratio                      22/41
Age                            19.11 (1.03)
WAIS-R IQ•                     114.63 (11.93)
WCST
CAT                            6.00 (0.00)
PE                             7.00 (2.30)
NPE                            7.71 (5.40)
Set Failure                    0.27 (0.50)
Trials to First Category       13.49 (6.53)
Conceptual Level Responses     65.81 (5.06)
•Prorated from the Block Design and Vocabulary subtests of the WAIS-R. CAT, Categories Completed; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Table A25.24. [WCST.24] Compton et al., 2000 (128-Card Administration Version): Data for a Highly Educated Sample Stratified into Four Age Groups
                                                 Age Group
Variables                       Entire Sample    30-39            40-49            50-59            60+
n                               102              30               27               25               20
M/F ratio                       53/49            13/17            12/15            13/12            14/6
Age                                              34.09 (3.71)     46.19 (2.80)     54.03 (3.20)     65.49 (5.72)
Education•                                       18.95            19.18            19.92            19.47
CAT                                              5.00 (1.59)      4.93 (1.75)      4.04 (2.44)      3.25 (2.57)
PE                                               8.05 (8.71)      8.41 (12.22)     11.22 (11.01)    16.43 (14.06)
% Conceptual Level Responses                     77.67 (15.62)    77.46 (15.48)    69.55 (21.68)    60.16 (24.26)
•SDs not reported. CAT, Categories Completed; PE, Perseverative Errors.
Table A25.25. [WCST.25] Ismail et al., 2000 (128-Card Administration Version): Data for a Control Sample
Variables     Values
n             75
M/F ratio     59/16
Age•          35.9
WCST
CAT           5.72 (1.32)
PE            9.21 (6.63)
•SD not reported. CAT, Categories Completed; PE, Perseverative Errors.
Table A25.26. [WCST.26] Laiacona et al., 2000 (128-Card Administration Version): Data for an Italian Sample
WCST Measures Education (Years) ≤7
Global• Age 40-49
50-59
60-69
70-85
8-12
15-29
Gender
40-49
Set
PR
Males
NPR
Failure
13.6 (9.6)
Females
69.3 (19.6)
29.2 (17.8)
20.8 (2.3)
0.3 (0.8)
Males
47.8 (35.9)
18.4 (17.3)
17.4 (3.2)
1.1 (2.1)
Females
51.0 (27.0)
24.7 (14.2)
12.3 (8.2)
0.9 (1.9)
Males
77.6 (15.9)
40.2 (9.9)
12.5 (14.8)
0.4 (0.5)
Females
75.4 (22.8)
36.0 (7.5)
20.4 (6.8)
0.8 (0.8)
Males
4.0† (48.1)
20.0 (22.6)
14.7 (8.3)
0.5 (0.7)
Females
79.5 (18.9)
47.3 (9.5)
16.3 (8.8)
0.3 (0.5)
Males
21.2 (13.4)
7.5 (5.0)
7.9 (4.8)
0.1 (0.3)
28.2 (23.1)
11.8 (9.3)
8.1 (5.9)
0.2 (0.6)
Males
37.8 (25.3)
16.3 (14.4)
12.1 (5.4)
0.3 (0.5)
Females
41.1 (14.3)
16.0 (6.0)
11.4 (6.7)
0.6 (0.8)
Males
49.6 (32.3)
19.6 (13.4)
13.8 (7.5)
0.6 (1.6)
Females 30-39
Scores
Table A25.26. (Contd.) WCST Measures Global• Scores
PR
NPR
Failure
Females
39.4 (27.8)
19.1 (13.8)
11.4 (8.4)
0.1 (0.4)
Males
44.3 (28.6)
17.8 (8.5)
14.3 (11.9)
0.0 (0.0)
Females
60.3 (34.7)
21.3 (9.9)
15.7 (9.6)
1.0 (1.3)
Males
58.3 (33.3)
27.2 (20.2)
18.5 (9.5)
0.0 (0.0)
Females
59.1 (37.2)
24.4 (16.4)
17.3 (8.7)
0.4 (0.8)
38.0 (0.0)
20.0 (0.0)
12.0 (0.0)
0.0 (0.0)
Females
80.0 (25.9)
52.8 (20.2)
12.8 (7.0)
0.0 (0.0)
Males
33.5 (26.0)
17.3 (14.6)
7.5 (4.1)
0.5 (0.6)
Females
30.5 (21.6)
11.6 (7.6)
8.8 (6.9)
0.3 (0.5)
Males
36.8 (21.0)
11.0 (5.4)
10.5 (3.3)
1.0 (1.4)
Females
42.5 (27.4)
17.3 (13.5)
12.5 (9.8)
1.0 (1.2)
Males
17.0 (0.0)
6.0 (0.0)
8.0 (0.0)
0.0 (0.0)
Females
18.3 (4.1)
9.0 (2.4)
5.3 (1.9)
0.3 (0.5)
Males
34.0 (27.6)
13.5 (8.4)
11.8 (12.0)
0.2 (0.4)
Females
25.3 (20.5)
14.3 (15.4)
5.0 (1.0)
0.0 (0.0)
Males
33.8 (15.9)
14.0 (7.3)
12.8 (6.8)
0.5 (0.6)
Females
31.8 (16.4)
15.3 (10.1)
6.8 (3.9)
0.3 (0.5)
Males
57.5 (57.3)
19.5 (19.1)
21.0 (21.2)
0.0 (0.0)
13.8 (3.5)
5.0 (2.0)
4.0 (1.9)
0.4 (0.5)
Females
38.3 (20.6)
17.3 (9.6)
14.0 (5.6)
0.3 (0.6)
Males
23.4 (18.0)
6.2 (4.0)
8.6 (5.9)
0.8 (1.8)
Females
21.8 (22.8)
6.2 (3.2)
9.5 (11.0)
0.5 (1.0)
Education (Years)
Age
50-59
60-69
70-85
13-16
15-29
30-39
40-49
50-59
60-69
70-85
Gender
Males
Set
Females 17-24
15-29
30-39
Males
(continued)
Table A25.26. (Contd.) WCST Measures Education (Years)
Age 40-49
50-59
60-69
70-85
Global• Scores
PR
NPR
Failure
Males
20.8 (9.4)
6.8 (2.5)
7.5 (3.1)
0.3 (0.5)
Females
30.0 (16.1)
14.0 (10.1)
7.7 (1.5)
0.0 (0.0)
24.0 (2.8)
6.5 (0.7)
7.0 (2.8)
0.5 (0.7)
Females
27.0 (16.6)
10.3 (7.1)
11.0 (7.4)
0.0 (0.0)
Males
38.6 (28.3)
17.8 (12.4)
11.6 (9.8)
0.4 (0.5)
Females
19.7 (9.7)
6.3 (1.9)
6.3 (2.6)
0.3 (0.5)
Males
23.0 (0.0)
12.0 (0.0)
9.0 (0.0)
0.0 (0.0)
Females
88.0 (0.0)
32.0 (0.0)
27.0 (0.0)
0.0 (0.0)
Gender
Males
Set
•Global Score = n trials administered - (n categories completed x 10). †These are the reported values; however, the means and SDs may have been printed in reverse order in the original article.
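The Global Score defined in the footnote to Table A25.26 (number of trials administered minus ten times the number of categories completed) can be computed as in the short sketch below. The function name and the input checks are illustrative assumptions, not part of the original article.

```python
def global_score(trials_administered: int, categories_completed: int) -> int:
    """Global Score = trials administered - (categories completed x 10).

    Lower values indicate more efficient sorting. The argument names are
    hypothetical; the formula is the one given in the table footnote.
    """
    if trials_administered < 0 or categories_completed < 0:
        raise ValueError("inputs must be non-negative")
    return trials_administered - 10 * categories_completed


if __name__ == "__main__":
    # All 128 cards used, 6 categories completed.
    print(global_score(128, 6))
    # All 128 cards used, 2 categories completed.
    print(global_score(128, 2))
```

For example, an examinee who uses all 128 cards and completes six categories obtains 128 - 60 = 68, whereas one who completes only two categories obtains 128 - 20 = 108.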
Table A25.28. [WCST.28] Razani et al., 2001 (128-Card Administration Version): Data for a Control Sample
Table A25.27. [WCST.27] Rossi et al., 2000 (128-Card Administration Version): Data for a Control Sample Variables
Values
n
64
Variables n
104
M/F ratio
Age
33/71
60.36 (9.64)
M/F ratio
30/34
Education
Age
26.4 (5.44)
WAIS-R FSIQ"
Education
14.69 (2.99)
WCST
Values
WCST CAT
14.82 (3.31) 116.81 (14.06)
4.99 (1.54)
CAT
5.33 (1.49)
PE
PE
8.81 (8.69)
18.25 (14.76)
Total Errors
17.81 (14.17)
30.64 (20.84)
Set Failure
0.81 (1.08)
Total Errors Unique (Other) Errors
1.34 (2.43)
CAT, Categories Completed; PE, Perseverative Errors.
% Conceptual Level Responses
64.33 (18.55)
"The Satz-Mogel format (Adams et al., 1984) was used to obtain age-corrected FSIQ. CAT, Categories Completed; PE, Perseverative Errors.
Table A25.29. [WCST.29] Salthouse et al., 2003 (128-Card Administration Version): Data for a Highly Educated Sample Stratified into Three Age Groups
Age Group
Variables n %Male
Age
Education
Entire Sample
18-39
40-59
60-84
261
79
112
70
35
43
29
37
48.2 (17.2)
27.7 (6.4)
49.0 (5.0)
70.3 (6.2)
15.9 (2.8)
15.5 (3.3)
16.0 (2.4)
16.4 (2.9)
12.3 (2.9)
12.0 (3.5)
12.2 (2.7)
12.8 (2.4)
73.9 (12.9) 4.7 (1.8) 36.7 (22.5)
71.7 (14.7) 4.1 (2.2) 41.2 (23.7)
67.9 (16.6) 3.2 (2.0) 53.7 (22.6)
17.4 (12.2) 19.7 (15.4)
20.8 (16.5) 23.8 (21.7)
25.3 (17.3) 37.4 (50.9)
19.9 (13.8) 22.1 (20.5) 61.5 (20.8) 1.2 (1.2) -0.8 (5.4)
22.9 (24.3) 20.6 (14.2) 58.9 (21.6)
26.6 (15.1) 25.1 (20.3) 53.1 (22.2) 1.6 (1.8) -3.2 (6.1)
WAIS-III
Vocabulary WCST Correct responses
CAT Total Errors PE PR NPR Trials to First Category Conceptual Level Responses Set Failure
Learning to Learn
1.4 (1.5)
-2.4 (6.0)
CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses; NPR, Nonperseverative Responses.
Table A25.30. [WCST.31] Axelrod et al., 1993 (64-Card Administration Version): Data for a Healthy Sample Stratified into Seven Age Groups
Age Group
Variables
n
20-29 20
30-39 20
40-49
50-59
20
20
60-69
20
M/F ratio
4/16
6/14
8/12
9/11
10/10
Race (black/white)
9/11
uw•
1000
1/19
4/16
80-89
70-79 20
20
10/10
8/12
4/16
1/19
Age
24.4 (3.2)
34.2 (2.6)
45.0 (3.0)
55.3 (2.5)
65.2 (2.6)
74.3 (2.9)
83.4 (3.0)
Education
15.6 (1.2)
15.4 (1.6)
15.2 (2.2)
15.4 (2.5)
14.4 (3.0)
14.5 (4.2)
14.5 (4.1)
4.1 (0.8)
3.~
3.6 (1.1)
3.4 (1.3)
2.6 (1.7)
2.6
(U
(1.1)
2.2 (1.6)
14.5 (5.1)
20.~
(9."'
18.4 (7.8)
16.6 (7.6)
21.3 (11.6)
22.8 (6.0)
25.7 (10.8)
PR
9.2 (4.3)
1U (6.
12.2 (5.5)
10.2 (5.3)
13.5 (8.5)
14.8 (5.4)
20.4 (13.2)
PE
8.2 (3.6)
1U (5."'
11.7 (7.0)
8.4 (4.2)
12.0 (7.6)
12.7 (4.3)
16.2 (10.1)
NPE
6.2 (2.8)
9.C (5.~
7.7 (3.7)
7.7 (4.4)
9.4 (6.3)
10.7 (5.0)
9.0 (4.3)
WCST CAT Total Errors
•One Hispanic participant was included in this group. CAT, Categories Completed; PR, Perseverative Responses; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Table A25.31. [WCST.32] Paolo et al., 1996b (64-Card Administration Version): Data for a Control Sample
Table A25.31. (Contd.) Variables
Values
Variables
Values
NPE
n
35
Conceptual Level Responses
M/F ratio
22/13
64.26 (15.27)
Age
71.34 (5.73)
Set Failure
0.46 (0.74)
Education
13.11 (2.03)
Trials to First Category
13.91 (4.59)
DRS"
13t37 (3.36)
WCST CAT
3.1)3 (1.22)
Total Errors
1&54 (7.50)
PE
9.86 (5..()6)
PR
11.29 (6.84) (continued)
8.89 (4.41)
"Dementia Rating Scale (Mattis, 1976). CAT, Categories Completed; PE, Perseverative Errors; PR, Perseverative Responses; NPE, Nonperseverative Errors.
Table A25.32. [WCST.33a] Lopez-Carlos et al., 2003: Monolingual Spanish-Speaking Males with ≤10 Years of Education Stratified by Education Group
Education Group
n
M~
Similarities•
Correct
PR
PE
CAT
0-6
38
7.29 (3.77)
10.09 (4.85)
32.05 (10.09)
16.74 (10.96)
14.79 (8.12)
1.53 (1.22)
7-10
21
7.47 (3.77)
11.20 (4.97)
40.38 (9.07)
13.86 (7.25)
11.71 (5.80)
2.24 (1.00)
•WAIS-III raw scores (Mexican version). PR, Perseverative Responses; PE, Perseverative Errors; CAT, Categories Completed.
Table A25.33. [WCST.33b] Lopez-Carlos et al., 2003: Monolingual Spanish-Speaking Males with ≤10 Years of Education Stratified by Age and Education Group
n
M~
Similarities•
Correct
PR
PE
CAT
0-6
18
5.70 (2.10)
9.65 (4.78)
31.61 (8.00)
16.78 (10.21)
14.72 (7.33)
1.44 (0.98)
7-10
12
9.40 (4.48)
10.67 (5.04)
40.75 (10.47)
12.75 (8.67)
10.83 (6.79)
2.33 (1.07)
30-49 0-6
20
7.00 (3.29)
10.52 (5.27)
32.45 (11.85)
16.70 (11.85)
14.85 (8.96)
1.60 (1.43)
7-10
9
8.56 (4.75)
12.78 (4.02)
39.89 (7.37)
15.33 (4.87)
12.89 (4.23)
2.11 (0.93)
Age x Ed Group 18-29
•WAIS-III raw scores (Mexican version). PR, Perseverative Responses; PE, Perseverative Errors; CAT, Categories Completed.

Table A25.34. [WCST.34] Bondi et al., 1993 (MCST Administration Version): Data for a Control Sample
Variables     Values
n             75
M/F ratio     27/48
Age           71.1 (7.6)
MMSE•         28.9 (1.2)
MCST
CAT           5.16 (1.33)
PE            1.40 (2.81)
%PE           9.40 (15.29)
NPE           8.20 (6.01)
•Mini-Mental State Examination (Folstein et al., 1975). CAT, Categories Completed; PE, Perseverative Errors; NPE, Nonperseverative Errors.
Table A25.35. [WCST.35] van den Broek et al., 1993 (MCST Administration Version): Data for a Control Sample
Variables        Values
n                77
M/F ratio        19/58
Age              35.2 (12.8)
FSIQ             109.5 (13.1)
MCST
CAT              4.9 (1.7)
Total Errors     10.7 (10.1)
PE               2.8 (3.7)
%PE              26.5 (12.2)
NPE              7.8 (7.0)
Unique•          2.0 (3.2)
Runs†            0.3 (0.6)
•Errors that were neither color, form, nor number. †Three or more sequential correct sorts but less than the six required to achieve a category. CAT, Categories Completed; PE, Perseverative Errors; NPE, Nonperseverative Errors.

Table A25.36. [WCST.36] Isingrini and Vazou, 1997 (MCST Administration Version): Data for a Healthy Sample Stratified into Two Age Groups
                 Age Group
Variables        25-46            70-99
n                35               72
M/F ratio        15/20            37/35
Age              35.54 (7.58)     0.59 (8.58)
Education        12.05 (3.03)     8.54 (1.18)
CAT              5.25 (0.85)      3.51 (1.78)
Total Errors     9.45 (3.62)      20.47 (8.95)
PE               3.57 (2.37)      11.68 ( .2 )
CAT, Categories Completed; PE, Perseverative Errors.
Table A25.37. [WCST.37] Lineweaver et al., 1999 (MCST Administration Version): Data for a Healthy Sample Stratified into Four Age Groups, Four Education Groups, and by Gender Education Groups
Age Groups
Gender
7-12
13-16
17-20
Male
Female
18
68
86
57
132
97
82.19 (2.60)
64.83 (7.79)
68.75 (9.65)
68.88 (7.95)
71.04 (8.00)
68.04 (8.67)
70.44 (8.30)
14.20 (4.39)
14.44 (2.99)
2.39 (2.30)
10.79 (1.62)
14.90 (1.05)
18.52 (1.20)
12.83 (4.37)
14.64 (4.65)
5.29 (1.31)
5.03 (1.51)
4.44 (1.76)
3.83 (1.89)
4.90 (1.66)
5.28 (1.26)
5.46 (1.10)
5.09 (1.35)
5.10 (1.55)
5.34 (5.22)
6.81 (6.01)
8.42 (6.94)
10.89 (7.52)
9.89 (8.12)
8.54 (6.83)
6.97 (6.10)
7.23 (6.56)
8.45 (6.23)
7.20 (6.88)
2.17 (7.70)
2.73 (7.38)
2.22 (4.41)
3.00 (4.65)
9.78 (13.96)
2.94 (5.96)
1.50 (4.07)
1.16 (1.95)
1.84 (3.16)
2.98 (7.54)
Variables
45-59
60-69
70-79
80-91
n
29
84
89
27
Age
54.31 (4.20)
64.77 (3.03)
73.93 (2.88)
Education
12.59 (4.82)
13.04 (5.00)
5.34 (1.32)
NPE PE
MCST CAT
-
1-6
"CAT, Categories Completed; NPE, Nonperseverative Errors; PE, Perseverative Errors.
Copyright Acknowledgments
Tables reproduced in this book were adapted from the sources discussed in the chapters where the tables are presented. Adapted with permission from Elsevier Science: Appendix 19, Tables A19.36-A19.39 adapted from: Geffen, G. M., Butterworth, P., & Geffen, L. B. (1994). Test-retest reliability of a new form of the Auditory Verbal Learning Test (AVLT). Archives of Clinical Neuropsychology, 9, 303-316. Appendix 11, Table A11.20 adapted from: Ruff, R. M., Light, R. H., Parker, S. B., & Levin, H. S. (1996). Benton Controlled Oral Word Association Test: Reliability and updated norms. Archives of Clinical Neuropsychology, 11, 329-338.
Tombaugh, T. N., & Hubley, A. M. (1997). The 60-item Boston Naming Test: Norms for cognitively intact adults aged 25 to 88 years.
Journal of Clinical and Experimental Neuropsychology, 19, 922-932. Appendix 19, Tables A19.15-A19.17 adapted from: Geffen, G. M., Moar, K. J., O'Hanlon, A. P., Clark, C. R., & Geffen, L. B. (1990). Performance measures of 16- to 86-year old males and females on the Auditory Verbal Learning Test. Clinical Neuropsychologist, 4, 45-63. Appendix 19, Tables A19.8-A19.10 adapted from: Wiens, A. N., McMinn, M. R., & Crossen, J. R. (1988). Rey Auditory-Verbal Learning Test: Development of norms for healthy young adults. Clinical Neuropsychologist, 2, 67-87.
Appendix 19, Table A19.22 adapted from: Appendix 19, Table A19.42 adapted from: Shapiro, D. M., & Harrison, D. W. (1990). Alternate forms of the AVLT: A procedure and test of form equivalency. Archives of Clinical Neuropsychology, 5, 405-410. Adapted with permission from Swets & Zeitlinger:
Friedman, M. A., Schinka, J. A., Mortimer, J. A., & Borenstein Graves, A. (2002). Hopkins Verbal Learning Test-Revised: Norms for elderly African Americans. Clinical Neuropsychologist, 16, 356-372.
Appendix 10, Tables A10.17 and A10.18 adapted from:
Adapted with permission from the Educational Publishing Foundation (American Psychological Association):
Chapter 10, Table 10.2 adapted from: LaBarge, E., Balota, D., Storandt, M., & Smith, D. S. (1992). An analysis of confrontation naming errors in senile dementia of the Alzheimer type. Neuropsychology, 6, 77-95. Adapted with permission from Cambridge University Press:
Appendix 11, Tables A11.44-A11.47 adapted from: Acevedo, A., Loewenstein, D. A., Barker, W. W., Harwood, D. G., Luis, C., Bravo, M., et al. (2000). Category Fluency Test: Normative data for English- and Spanish-speaking elderly. Journal of the International Neuropsychological Society, 6, 760-769.
Index
Absolute zero point, 34 Abstract reasoning, 12, 475, 496 Acculturation, effect on test performance, 29, 31, 70, 181, 315, 481. See also Marin Marin Acculturation Scale Activities of Daily Living (ADLs), 14, 59, 314, 476,
500, 506 Adaptations of neuropsychological tests, linguistic and cultural, 28, 178-179, 287 Administration. See also Variability in test administration and scoring procedures, considerations in selection of normative data, 18-19 standard procedures, 9-10, 15 of a test/test battery, 3, 9, 11, 15, 17, 38 Affective Auditory Verbal Learning Test, 371 Affective state, 12, 16 Affective symptoms, 12 African American Norms Project, 227 Age-related decline in Benton Visual Retention Test performance,
395, 398 in finger tapping speed, 442 in grip strength, 445, 458 in Grooved Pegboard performance, 460, 471 in Judgment of Line Orientation performance, 286 in memory, verbal and visual, 344 in naming ability, 176-177, 199 in Rey-Osterrieth Complex Figure recall, 250 in verbal fluency, 207 in Wisconsin Card Sorting Test performance, 508, 528, 532 in word-list learning, 372 Aggie Figures Learning Test (AFLT), 372 Alternate forms. See also Equivalent forms; Multiple forms; Versions of the test Auditory Consonant Trigrams, 137 Benton Visual Retention Test, 394 California Verbal Learning Test, 363, 368 Category Test, 477-480 Color Trails, 101 Complex Figure Tests, 241-242 Judgment of Line Orientation Test, 284
Rey Auditory-Verbal Learning Test, 361, 383,
385, 393 Stroop, 116, 131 Trailmaking Test, 64-65 Verbal Fluency Tests, 201 Wisconsin Card Sorting Test, 503-505 American Academy of Neurology, report, 13 Ammons Quick Test, 75 Annett Handedness Questionnaire, 429, 449, 456, 470 Appearance, as aspect of mental status, 12 Army Individual Test Battery, 59, 63 Attention as an aspect of mental status, 12 as assessed with Auditory Consonant Trigrams, 134, 143, 158 Category Test, 475 Color Trails Test, 100, 106 Digit Vigilance Test, 162-163, 170 Paced Auditory Serial Addition Test, 141-143, 158 Ruff 2&7, 160-161, 170 Trailmaking Test, 60-61, 98 Wechsler Memory Scale, 337 auditory, 202 as cognitive domain, 9 divided, 108, 134, 141-142 effect on test performance, 5 selective, 160, 170 sensitivity to brain damage, 9-10 sustained, 141, 160, 162, 170 Attitude, test-taking, 3, 5. See also Effort, Motivation Attribution Identification Test, 506 Auditory Consonant Trigrams (ACT), 108, 134-140, 508 Baltimore Longitudinal Study of Aging (BLSA), 87, 401, 404, 405, 406, 408, 409, 415 Barrow Neurological Institute Screen for Higher Cerebral Functions, 441 Base rates, 18, 22, 24, 42-43. See also Incidence of a disorder; Prevalence rates reporting, 107 Bateria Neuropsychologica en Espanol, 29
Bateria Woodcock-Muiioz-R, Pruebas de Habilldad Cognitiva-R, 128, 233, 529 i Battig and Montague categol}' exemplar collecti.... 368 Bayesian Information Criterion (BIC), 50 Beck Anxiety Inventory (BAI), 128, 233 Beck Depression Inventol}' (BDI and BDI-U), :128, 194,232,233,268,378 ~ Behavioral Dyscontrol Scale, 67 Bender-Gestalt Test, 398 Benton Visual Retention Test (BVRT), 394-41 Benton's approach to neuropsychological , assessment, 18 Blessed Dementia Scale (BDS), 213, 228 Boston Naming Test (BNT), 39, 17~199, 202 ' Ponton-Satz BNT, 178, 189 j Revised Children's BNT, 175 1 Spanish BNT, 178-179 ; Boston Process Approach to neuropsychologicali assessment, 18 Boston Qualitative Scoring System for the Rey-Osterrieth Complex Figure (BQS$, 247 Brain damage I claims in forensic proceedings, 10 effects on cognitive functioning. 9-10, 69, 72 j Brain dysfunction, assessment of, 3, 5, 10, 17, 3J4, I 420,444 general,273 ( lateralized, 273 Brain injury (traumatic) 3, 13, 61, 100, 365-366 445, 500, 50i-503. See also Head injury 1 outcomes, 59, 161, 476 severity, 365-366 mild, 61, 98, 101, 142, 161 moderate, 475 Brain reserve capacity, 101 Brief Repeatable Battel}' of Neuropsychologic tests, 142, 153, 158-159 ' Brown-Peterson Consonant Trigram Memol}' Task, 134
Calculations ability, as a component of mental status, 12 California Card Sorting Test, 514 California Verbal Learning Test (CVLT, CVLT-II), 29, 341, 362-368, 369, 370 Canadian Study of Health and Aging, 220 Cancellation tests, 1~170 Cardiovascular Health Study (CHS), 195 Category Test, 329-330, 442, 475-495, 479, 500, 506 Cattell's Matrices, 511 Ceiling effect, 340-341, 346, 369 CERAD List-Learning test, 370. See also Consortium to Establish a Registry for Alzheimer's Disease
Charlotte County Healthy Aging Study, 92 Children/adolescents, references to normative data sources Auditory Consonant Trigrams, 137 Benton Visual Retention Test, 402 Boston Naming Test, 182 Cancellation Tests, 164
Category Test, 483 Design Fluency Tests, 303 Finger Tapping Test, 423 Grip Strength Test, 447 Grooved Pegboard Test, 462 Hooper Visual Organization Test, 275 Judgment of Line Orientation Test, 288 Paced Auditory Serial Addition Test, 146 Rey Auditory-Verbal Learning Test, 375 Rey-Osterrieth Complex Figure Test, 255 Tactual Performance Test, 318 Trailmaking Test, 72 Verbal Fluency Tests, 209 Wisconsin Card Sorting Test, 513 Children's versions, for Boston Naming Test, 175 California Verbal Learning Test, 367~ Color Trails Test, 102 Trailmaking Test, 59 Circumlocution, 174, 177 Clinical comparison data, 9 Cochrane Collaboration, 45 Coefficient of determination, 42 Cognitive abilities, 13, 14, 15, 17, 33 domains, 3, 10, 13, 16, 21 dysfunction, 13, 18, 24 flexibility, 106, 202 functioning, baseline level, 14 inhibition, 108 profile, 15, 18 slippage, 101 status, 12, 1~14 strengths and weaknesses, ~. 15, 33 Cognitive/information-processing mechanisms contributing to test performance Auditory Consonant Trigrams, 134, 141 Benton Visual Retention Test, 394 Boston Naming Test, 174, 176 Cancellation Tests, 160, 170 Category Test, 475 Color Trails Test, 99-100, 106 Design Fluency Test, 298-299 Digit Vigilance Test, 162 Finger Tapping Test, 420 Grip Strength Test, 444 Grooved Pegboard Test, 460 Hooper Visual Organization Test, 27~274, 277 Judgment of Line Orientation, 284 Stroop Test, 108, 113 Paced Auditory Serial Addition Test, 141, 143, 158 Rey Auditory-Verbal Learning Test, 359-361 Rey-Osterrieth Complex Figure, 241, 249-251 Ruff 2&7 Selective Attention Test, 161 Tactual Performance Test, 312 Trailmaking, 60, 66, 88, 98 Verbal Fluency, 202-203, 204 Visual Form Discrimination Test, 278-279 Wechsler Memory Scale (WMS, WMS-R, WMS-III, WMS-IIIA), 337, 338-342 Wisconsin Card Sorting Test, 496, 499, 506, 507
Cognitive set, 496 Cohort effect, 180, 195 Color-Form Sorting Test, 496 Color Trails Test (CTT), 65, 67, 99-107 Color vision, 397 Complex Figure Test (CFT), 241 Comprehensive Employment Training Act (CETA), 428, 448, 462 Comprehensive Norms for an Expanded Halstead-Reitan Neuropsychological Battery (1991 manual and its 2004 revision), 29 and Boston Naming Test, 177, 183 and Cancellation Tests, 164, 166 and Category Test, 481, 493-494 and Finger Tapping Test, 420, 437 and Grip Strength Test, 444, 453 and Grooved Pegboard Test, 467 and List-Learning Tests, 362, 363 norming approach, 23
and Tactual Performance Test, 316, 332 and Trailmaking Test, 82-83 and Verbal Fluency Tests, 200 and Wisconsin Card Sorting Test, 514 Concentration, 10, 98, 106, 143, 158, 475 Concept formation, 475, 496, 506 Concussion, 161 Confidence interval, 41, 48, 50, 51 Connecticut Pictorial Learning Test (COPLT), 368 Connections test, 66 Consortium to Establish a Registry for Alzheimer's Disease (CERAD), 29, 177-178, 179 Content of thought, 12 Continuous Performance Test (CPT), 256, 398 Controlled Oral Word Association (COWA), 19, 162 Coordination, assessment of, 12 Cornell Medical Index, 84 Corsi Cube Test, 339 Cost-benefit ratio, 38, 44 Cranial nerves, assessment of functioning, 12 Criterion measure, external, 17, 24, 44 Cross-cultural assessment, 28-29, 67 with Color Trails Test, 99-100, 106-107 tests, guidelines for development, 28 Cross-cultural Neuropsychological Battery, 221 Crovitz-Zener Test of hand dominance, 432, 464 Culture- and ethnicity-specific test versions and/or normative data sets (those based on USA and Canadian samples are not included), 28-29 African-American, 179, 369, 370, 375, 389-390 African-Caribbean, 65, 179 Arabic, including Egyptian, 65-67, 144, 401, 413, 414, 492 Australian, 179, 370, 381, 423, 453, 456, 490 Brazilian, 206, 362, 370, 371 Chinese, 54, 102-103, 105, 110, 206, 251, 362 Colombian, 423, 436-437 Czechoslovakian, 110 Danish, 65, 380 Flemish, 65, 179, 206, 362 French, 367, 401, 410 German, 110
Greek, 65 Hebrew, 62, 66, 110, 123, 206, 371 Hispanic/Spanish, 102, 104, 105, 110, 127, 128, 178, 205-206, 251, 282, 287, 370, 371, 375, 511, 523, 529 Holland, 423, 436 Indian, 274, 371, 401, 406 Italian, 55, 110, 251, 401, 416, 423 Jamaican, 370 Japanese, 110 Japanese-American, 440 Korean, 179, 367, 401, 407 Latin American, 371 Native American, 179 Netherlands, 401, 416 New Zealanders, 65, 179, 367 Norwegian, 206, 401, 412, 485 Swedish, 110, 179, 401, 415 Swiss, 375 Turkish, 135, 140 Venezuelan, 401, 408 Vietnamese, 110, 126 Culture, effect on test performance, 8, 11, 15, 28-30, 31, 69-70, 159, 199, 399, 511. See also Acculturation; Language Cultural bias, 369 Culturally fair items, 179 test, 100-101 Cutoff criterion classification accuracy, 23-24, 44-45. See also Diagnostic accuracy; Error, false positive misclassification rates Category Test, 476-477, 478, 484-485, 488, 489, 491, 495 Finger Tapping Test, 420-421, 422, 424, 434, 443 Grip Strength Test, 445 Grooved Pegboard Test, 460, 471 Stroop Test, 130 Tactual Performance Test, 313, 333 Trailmaking Test, 62, 63, 79, 84, 87, 98 Verbal Fluency, 216 for dementia, on MMSE, 42 for impairment, 22, 23, 72, 73, 94, 117, 272, 423, 464, 484
for performance levels, 21, 23, 281-282, 424 selection of, 44-45. See also Receiver Operating Characteristic curve for suboptimal effort, 367 utility of, 23, 63, 64 Data-driven approach to neuropsychological assessment, 17 Decision-making process, 17 in activities of daily living, 14, 30 and clinical judgment, 14, 17, 18, 30, 40, 42, 44 as cognitive domain, 14 non-psychometric factors in, 15, 16 in Rohling's Interpretive Method, 25-26 and statistical properties of tests, 33, 40, 41, 43, 50
Decision theory, outcomes, 42-43, 44 Delay interval, length of, 242, 358 Delis-Kaplan Executive Function System (D-KEFS), 66, 110-111, 201, 299 Delusions, 12 Dementia assessment, 13, 14, 17, 62, 87, 250, 368, 371, 395. See also Diagnosis, differential, of dementia fluency deficits in, 205 frontotemporal, 108, 499-500 Lewy body, 285 vascular, 279, 366, 368 Dementia of Alzheimer's Type/Alzheimer's disease, 13, 22, 42. See also Consortium to Establish a Registry for Alzheimer's Disease and Benton Visual Retention Test, 395 and Boston Naming and Verbal Fluency tests, 174, 176 calculation of base rates, 42-43 and California Verbal Learning Test, 366, 368 and CERAD List-Learning Test, 370 and Design Fluency Tests, 299 and Finger Tapping Test, 421 and Hopkins Verbal Learning Test, 369 and Judgment of Line Orientation Test, 285 and Rey Auditory-Verbal Learning Test, 360 and Stroop Test, 109 and Trailmaking Test, 63, 84, 92 and Verbal Fluency Tests, 205, 213 and Visual Form Discrimination Test, 279 and Wisconsin Card Sorting Test, 499-500, 530 Dementia Rating Scale (DRS; Mattis, 1988), 185, 192, 214, 216, 277, 293, 347, 349, 517 Demographic factors, effect on test performance 3, 7-9, 16, 18-20, 22, 24, 28-29, 30-31, 32, 38, 46, 50, 54, 55.
See also Culture; Occupation; Reporting of test results age, 60, 62, 98, 132, 235, 269, 270, 354, 391, 393, 430, 441-442, 457, 470, 495 with Auditory Consonant Trigrams, 135 with Benton Visual Retention Test, 398-399 with Boston Naming Test, 178, 180-182, 197-199 with Cancellation Tests, 162, 163, 170 with Category Test, 476, 480-481, 489, 493 with Color Trails Test, 101-102 with Design Fluency Tests, 301, 310-311 education, 60, 98, 132, 199, 236-237, 269, 270, 354, 391, 470-471, 495, 501 ethnicity, 28-30, 31, 208, 287, 511, 532 with Finger Tapping Test, 421-422, 431, 442 gender, 72, 103-104, 237, 304, 392, 422, 442, 445, 448, 450, 458, 464, 471 with Grip Strength Test, 445, 451 with Grooved Pegboard Test, 460, 463, 467 with Hooper Visual Organization Test, 272, 274 intellectual level, 236-237, 426, 447, 455, 495, 531 with Judgment of Line Orientation Test, 286 with List-Learning Tests, 372-374 with Paced Auditory Serial Addition Test, 143-145, 159 with Rey Auditory-Verbal Learning Test, 362
with Rey-Osterrieth Complex Figure, 251-253 with Stroop Test, 112-114, 132-133 with Tactual Performance Test, 313-315, 333 with Trailmaking Test, 63, 67-70, 96-98 with Verbal Fluency Tests, 206-208 with Visual Form Discrimination Test, 280 with Wechsler Memory Scale (WMS-R, WMS-III, WMS-IIIA), 344-345, 355-356 with Wisconsin Card Sorting Test, 508-511, 532 Denman Neuropsychology Memory Scale, 244, 340 Depression. See also Disorders and conditions affecting cognition, electroconvulsive therapy and Benton Visual Retention Test, 396 calculation of selection ratio, 43 and Category Test, 476 in the context of mental status examination, 12 effect on test performance, 62, 70, 161 and Grip Strength Test, 445 and Judgment of Line Orientation, 285, 297 and Stroop Test, 108 and Verbal Fluency Tests, 208 and Wisconsin Card Sorting Test, 501, 522-523 Design Fluency Tests, 298-311 Diagnosis, 3, 10, 17, 24 differential, 17, 33 of dementia, 13, 18, 27, 42, 98, 206, 361, 395. See also Dementia; Dementia of Alzheimer's Type of depression, 13. See also Depression of malingering, 22, 27, 476, 502-503 of vascular dementia, 13 Diagnostic accuracy. See also Cutoff criterion, classification accuracy; Discriminant function analysis Cancellation Tests, 161 Color Trails Test, 99, 101 Wisconsin Card Sorting Test, 499-500, 502-503 Diagnostic use of Neuropsychology, 3, 13-14 Digit Cancellation Test, 160 Digit repetition task, 29 Digit Span test, cognitive processes, 28, 134 Digit Symbol Test (DST), 108 Digit Vigilance Test (DVT), 160, 162-164, 166-170 Direct Assessment of Functional Status (DAFS) scale, 28 Disability, evaluation for, 14 Discriminant function analysis, classification accuracy, 112, 178, 506, 526 Disorders and conditions (medical and psychiatric) affecting cognition, 13. See also Brain injury; Dementia; Dementia of Alzheimer's Type; Depression; Head injury; Postconcussion
syndrome apolipoprotein E phenotype, 70, 395 alcohol exposure in utero, 108, 134, 300 alexia, 279 amnesia, 249-250, 360, 361, 371. See also Disorders and conditions affecting cognition, Korsakoff
syndrome transient global, 108, 299 amyotrophic lateral sclerosis, 299-300
aneurism, anterior communicating artery, 360 anomia, 273 antisocial personality disorder, 476 anxiety, 38, 422 aphasia, 109, 279 batteries, 182 Broca's, 177 fluent vs. nonfluent, naming deficit in, 177 Wernicke's, 177 attention deficit hyperactivity disorder, 108, 134, 142, 249, 299, 396 autism, 299 bipolar affective disorder, 396, 476, 506, 526 borderline personality disorder, 300 brain tumors, 371 cardiac surgery, 13 cardiopulmonary bypass, 397 cardiovascular disease, 208, 502 cerebral atrophy/ventricular enlargement, 68 cerebrovascular accident (stroke), 13, 250 brain pathology, 109 risk factors, 135, 208 cholinergic system dysfunction, 285 chronic fatigue syndrome, 142, 475 chronic obstructive pulmonary disease (COPD), 162, 163, 397, 449, 502 conversion disorder, 28 ~. 70, 142, 397, 476 Down syndrome, 499 dyslexia, 300 electroconvulsive therapy, 366, 371, 396 endogenous cholesterol synthesis, 109 end-stage pulmonary disease, 371 epilepsy surgery, 13 epilepsy/seizures, 299, 396 frontal lobe, 108 temporal lobes, 250, 371, 396 temporal lobe, right, 396 factitious disorder, 15 fragile X syndrome, 396 frontal lobe syndrome, 361 health status, 113, 481 HIV infection/AIDS, 13, 67, 100, 101, 142, 161, 360, 370, 396 hormone replacement therapy, 95, 163, 397 Huntington's disease, 396, 499-500 hypertension, 70, 208, 397, 476 hypnosis, 300 Klinefelter's syndrome, 108, 134 Korsakoff syndrome, 14, 134, 249 late-life psychosis, 108 learning disabilities, 14, 18, 67, 314, 396, 460 leukemia, acute childhood, 250 liver transplant, 397 lung cancer treatment, 397 medical condition, 70 myotonic dystrophy, 109, 134 multiple sclerosis, 13, 142, 158, 371, 396, 475, 502 Obsessive-Compulsive Disorder, 250, 299, 314, 501
organic memory impairment, 250 osteoporosis, 70 chronic pain/cingulatomy, 142, 300 Parkinson's disease/pallidotomy, 13, 134, 177, 279, 285, 299, 360, 371, 395, 499, 500 Post-Traumatic Stress Disorder, 396 psychopathic traits, 501-502 schizoaffective disorder, 300 schizophrenia, 108, 134, 161, 163, 250, 285, 299, 359, 396, 475, 498, 500, 502, 506, 523-524, 525, 526 Schizotypal Personality Disorder/traits, 142, 501, 522, 524 sex hormone levels, 70 sleep apnea, 162 sleep disruption, 142 somatization tendency, 15 somatoform symptoms, 27-28 spinal cord injury, 361 stroke. See Cerebrovascular accident substance use/abuse, 60, 63 alcoholism, 314, 396, 397, 411, 476, 502 cannabis, 142, 396 chronic caffeine use, 109 cocaine, 396 heroin, 396 smoking, 142, 397, 411 suicidality, 299 tardive dyskinesia, 396 temporal lobectomy, 134 toxic exposure, 70, 250 lead, 397 mercury, 397 sarin, 397 solvents, 142, 397 workplace chemical exposure, 396-397 Turner's syndrome, 396 unilateral neglect, 250 white matter lesions, 135, 502 Distribution heterogeneous/homogeneous, 40, 46-47 kurtosis, 179, 199 negatively/positively skewed, 20, 23, 25, 38-39 in Boston Naming Test, 179, 199 in California Verbal Learning Test, 363 in Hopkins Verbal Learning Test, 369 in Rey Auditory-Verbal Learning Test, 392 in Rey-Osterrieth Complex Figure, 270 in Trailmaking Test, 64, 82, 98 in Wechsler Memory Scale, 346 normal, 20, 36-37 not meeting assumption of normality, 38-39, 52, 197, 270 sampling, 19 standardized, 38 Drug Abuse Treatment Outcome Study (DATOS), 60 Dual-baseline approach to repeated testing, 367 Dual-code cognitive neuropsychological model, 250 Edinburgh Handedness Inventory, 294, 456 Effect size, 26, 46
Effort in test-taking, 5, 16. See also Attitude; Diagnosis, differential, of malingering; Motivation assessment of, 5, 27-28, 62, 63, 250, 279, 367 below chance performance, 5, 279 probability theory-based tests, 5 Emotional dysfunction, 4. See also Depression; Mania; Rapid cycling Emotional factors in test-taking, 38 Emotional status, 4, 15 Equipercentile equating, 363 Epilepsy. See Disorders and medical conditions affecting cognition; Seizures Equivalent forms, 101, 107, 178, 368, 368. See also Alternate forms; Multiple forms; Versions of the test Error altered sequence, 138 analysis method, Benton Visual Retention Test, 394, 395, 4de Boston Naming Test, 174-176, 186 Rey Auditory-Verbal Learning Test, 360 Trailmaking Test, 62-63, 93, 98 in calculation of statistics, 38 in clinical decision-making, 40, 42, 44 color-sequence, 101 commission, 398 cost of, 40 in data recording, 38, 46, 49 distortion, 394, 395, 398 execution, 38 false positive misclassification rates for Boston Naming Test, 199 for Category Test, 476, 489, 495 for Finger Tapping Test, 420-421, 433, 44:l for Grip Strength Test, 445, 450, 458 for Grooved Pegboard Test, 460, 471 for Rey Auditory-Verbal Learning Test, 371f for Tactual Performance Test, 313 for Trailmaking Test, 62, 94 for Wisconsin Card Sorting Test, 502-503 false positive vs. false negative, 38. See also Decision theory, outcomes inter- vs. intraquadrant, 285 intrusion, 360, 364, 365 letter sequence, 136 of measurement, ~1 misperception, 173, 177 misplacement, 394 near-miss, 101, 107 nonperseverative, 497 number-sequence, 101 omission, 394, 395, 398, 399 peripheral, 279 perseverative, 62, 63, 298-300, 307, 310, 365, ~, 395, 497, 499-502 publication bias, 46 rate, 218 ratio, 300 repetition, 360 rotation, 394, 398, 399 scores, 116
sequential, 62, 63 sequencing, 62 shifting, 62 size, 394 sources of, 15, 53-54 systematic, in data reporting, 45, 46 in test construction, 15 in test interpretation, 15 in test performance, 59, 111, 298-299 vertical vs. horizontal, 285 Ethical concerns in test administration and scoring, 6, 10 Evidence-based clinical decisions, 46 Executive dysfunction, 98, 133, 140, 237, 250, 298, 359, 501, 531 ~. 3, 14, 27, 30, 60, 106, 108, 202, 389, 476, 496, 506, 507. See also Nervous system, function/dysfunction of, frontal lobes system, 142 Expectancy tables, 34 Experimental tests, ~11 Factor analytic studies/factor structure of tests, 41 Auditory Consonant Trigrams, 134 Benton Visual Retention Test, 397 Boston Naming Test, 189 California Verbal Learning Test, 362, 365, 367 Cancellation Tests, 162, 163 Category Test, 476 Design Fluency Tests, 300 Finger Tapping Test, 420 Rey Auditory-Verbal Learning Test, 359 Stroop Test, 108 Trailmaking Test, 60 Verbal Fluency Test, 202, 219 Visual Form Discrimination Test, 397 WHO-UCLA Auditory Verbal Learning Test, 370 Wisconsin Card Sorting Test, 501, 506-508, 517, 531 Federal Aviation Administration/Equal Employment Opportunity Commission (FAA/EEOC), 103, 121 File-drawer problem, in meta-analysis, 46 Finger Oscillation Test, 419 Finger Tapping Test (FTT), 162, 41~. 460 Five Point Test, 298, 299 Fixed vs. flexible battery approach, 17-18, 25, 316, 481 Flight of ideas, 12 Flynn effect, 19 Forensic setting, use of neuropsychological evaluations, 5-6, ~10, 15, 16, 17, 27, 61 Forest plot, 49 Fragmentation, 273 Fuld Object-Memory Evaluation, 201 Functional Assessment Scale, 215 Fund of information, as a component of mental status, 12 Gates-MacGinite Reading Vocabulary Test, 181 General Processing Tree Approach, 176 Generation tasks, 140
Genetics, 12 Geographic recruitment region for the samples used in this book (USA and Canada locations are not included) Australia, 79, 88, 130, 131, 152, 166, 186, 230, 329, 353, 381, 388, 453, 456, 491-492
China, 105 Colombia, 258, 436, 518-519, 521 Denmark, 380 Egypt, 327, 414, 492
France, 410, 530-531 Germany, 469 Holland, 436 Korea, 407 India, 406 Israel, 123 Italy, 416, 430, 515, 525 Mexico, 233, 266, 520 New Zealand, 146 Netherlands, 153, 416 Norway, 320, 412, 485 Spain, 282, 295, 309, 504, 517, 520 Sweden, 70, 220, 415 Switzerland, 375 Turkey, 140 United Arab Emirates, 413 United Kingdom, 72, 150, 153, 213, 255, 256, 293, 295, 309-310, 379, 530 Venezuela, 408 Geriatric Depression Scale (GDS), 88, 91, 96, 185, 192, 215, 228, 235, 276 Global Neuropsychological Deficit Scale, 25 Grade equivalent, 34 Grip Strength Test, 421, 444-458, 460 Grooved Pegboard Test (GPT), 5, 20, 421, 459-471 Hachinski Ischemia Rating Scale, 190 Hallucinations, 12 Halstead Impairment Index, 25, 419, 424, 476, 485 Halstead-Reitan Average Impairment Rating, 25 Halstead-Reitan Battery (HRB) and Category Test, 475-476 in clinical use, 17, 23 and Finger Tapping Test, 419 and Grip Strength Test, 444 and Tactual Performance Test, 312, 314 and Trailmaking Test, 59 Halstead-Russell Neuropsychological Evaluation System (HRNES and HRNES-R) and Boston Naming Test, 183 and Category Test, 481 and Finger Tapping Test, 420, 438 and Grip Strength Test, 454 and Grooved Pegboard Test, 468 norming approach, 22 and Tactual Performance Test, 316 and Trailmaking Test, 71 Hamilton Depression Rating Scale, 229 Hand Dynamometer. See Grip Strength Test Handedness, effect on test performance, 64-65 Category Test, 481
Design Fluency Tests, 301
Finger Tapping Test, 421, 443 Grip Strength Test, 445, 458 Grooved Pegboard Test, 460, 464, 471 Rey-Osterrieth Complex Figure, 253 Tactual Performance Test, 315 Head Injury. See also Brain injury assessment of, 14, 18, 62, 63, 64, 108, 134, 135, 141, 250, 279, 300, 361, 369, 371, 500, 506 prediction of driving ability, 314 prediction of return to work, 142, 161 recovery rate, 141 severity, cognitive correlates, 62, 141-142, 161, 204, 299 Hillsborough Elder African American Life Study (HEALS), 389 History, 10, 56 educational, occupational, 16 family, 16 of life events, 3 medical, psychiatric, 4, 5, 16, 19 vocational, avocational, 4 Hit rate (in decision theory), 43, 44 Homoscedasticity, test for, 52 Hooper Visual Organization Test (HVOT), 27~277
Multiple Choice HVOT (MC-HVOT), 273 Hopkins Verbal Learning Test (HVLT, HVLT-R), 368-369, 389-390 Hypothesis-driven approach to neuropsychological assessment, 18 Impairment Index. See Halstead Impairment Index Incidence of a disorder, 42, 272 Independence assumption, 53-54 Independent functioning in ADLs, 14, 17, 31, 33,38 Inhibition of responses, 496 Initiation
deficit, 298 measures, 300, 310
Insight, as a component of mental status, 12 Interference factor (in Stroop Test), 120 index (in Color Trails Test), 100, 107 proactive vs. retroactive, 134, 359, 360, 365-366, 381, 386, 389 release from, 366 score (in Stroop Test), 123, 126 Interhemispheric neural transfer, 425 Intermanual differences Asymmetry Index, 440 in Finger Tapping Test, 421, 433, 435, 442, 443 in Grip Strength Test, 444-445, 452, 453, 458 in Grooved Pegboard Test, 460, 465, 471 reversal in intermanual differences, 425, 433, 443 in Tactual Performance Test, 313, 319 Interpretation of test performance, 3, 5-6, 7, 10, 15, 98 accuracy, limitations, 10, 199, 237 content-referenced, 34
criterion-referenced, 34 domain-referenced, 34 norm-referenced, 26, 34, 98 of raw scores, 11-12 Interquartile range test, 49 Interval scale of measurement, 34 Interview, with patient and significant others, 16 Intraindividual Measure of Association (UMA), 25 Item analysis, 175, 182, 195, 273-274, 477
difficulty, 15, 26, 179, 273 discrimination, 15, 26 Item Characteristic Curve (ICC), 26 Item Response Theory (IRT), 26-27 Johns Hopkins Teaching Nursing Home Study of Normal Aging, 211
Judgment clinical, 4, 10, 17 limitations of, 17 as a component of mental status, 12 Judgment of Line Orientation Test (JLO), 284-297 Kame project on aging and dementia, 440 Kaplan's approach to neuropsychological assessment, 15, 18 Kernel Density Estimate, "K-density plot", 52, 392 Kinesthetic abilities, 420 Knox Cube Imitation Test, 339 Kurtzke's Disability Status Scale (DSS) Lagrange multiplier statistic, 52 Language as a cognitive domain, 12 bilingualism, 28-29, 113-114, 159, 199, 208, 221
effect of native language on test performance, 28-29, 99, 107, 113-114, 199, 208, 221
English as a second language, 28-29, 99, 107, 113-114, 199, 208, 221
limited mastery of, 28-29, 99, 107, 199, 208, 221 test translations/adaptations, 28-29
Lateral Dominance Examination, 444 Lateralization of brain function/dysfunction, 3, 72, 312, 343, 421, 443, 444, 448, 458, 459, 460, 498. See also Nervous system, interhemispheric differences Learning, 475. See also Memory curve, 359-360 incidental, 361
list vs. item, 371 nonverbal, 3, 339
single trial vs. over trials, 342 strategy, 360, 365 verbal, 3, 29, 359-360 Letter and Symbol Cancellation Task, 160 Letter fluency, 29 Level of consciousness, as a component of mental status examination, 12 Levels of data integration, 1~17
Lexical
access, 176 deficit hypothesis, 176 network, 176 retrieval, 173, 174, 176
storage, 202
Likelihood ratio, 23, 44-45 in meta-analysis, 46 List-learning tests, 29, 357-393 Loosening of associations, 12 Luria-Nebraska Neuropsychological Battery (LNNB), 18 Luria's approach to neuropsychological assessment,
15, 18
Macquarie University Neuropsychological Normative Study (MUNNS), 353 Malingering, detection of, 15, 22, '1:1, 286 Mania, 12
Marin Marin Acculturation Scale, 128, 189, 219, 233, 265, 390, 529. See also Acculturation, effect on test performance Mathematical
ability/skills, 143, 144
Mayo Cognitive Factor Scales, 23 Mayo Older Americans Normative Studies (MOANS), 85, 125, 187, 216, 292, 349, 386 overlapping interval technique, 24 scoring system, 23 McCarthy Scales for Children's Abilities, 201 Measurement. See also Error, of measurement
method of, 34 scales of, 34 theory of, 34 units of, 33
Medical-legal evaluations. See Forensic setting, use of neuropsychological evaluations Memory and Aging Study (MAS), 91, 195 Memory, 12. See also Learning control, 365 as information-processing domain, 9, 475 lateralization, 343 metamemory, 361 source/context, 360, 364, 398
Memory compartments immediate, 337, 340, 359, 394 long-term, 203 short-term, 134, 202
working, 60, 108, 134, 140, 141, 143, 341, 501 Memory mechanisms auditory process, 342 consolidation, 360, 365 encoding/acquisition, 339, 343, 346, 359-360, 363 forgetting, 9, 345, 347, 352, 360 recall, 360
cued, 390 recognition, 279, 339, 360, 364, 366 recognition discrimination index, 390 retrieval, 339, 342, 343, 346, 359-360, 363, 365 from short-term vs. long-term storage, 371 storage/retention, 342, 346, 359-360, 365
Memory modalities auditory, 134, 337, 341, 343 multimodality, 343 spatial, 312, 338 tactile, 312 verbal, 134, 343, 359 visual, 279, 337, 339, 340, 394, 475 Memory Quotient (MQ), 339, 376 Mental status, 16 assessment, 12-13 components, 12 Meta-analysis advantages and limitations, 46-47, 55-56 cluster option, 48, 50, 54 comparative (case-control) vs. noncomparative (descriptive), 46 fixed vs. random effects, 48, 50 heterogeneity in study results, 46-47 history, 45 Quality of Reporting of Meta-Analyses (QUOROM) statement, 45 weighting data, 48, 50 Midpoint age interval technique, 24. See also Overlapping Interval Technique Boston Naming Test, 187 Judgment of Line Orientation Test, 292 Rey Auditory-Verbal Learning Test, 386 Stroop, 126 Trailmaking Test, 85 Verbal Fluency, 216 Wechsler Memory Scale Miles ABC Test of Ocular Dominance, 435, 452, 465 Mini-Mental State Exam (MMSE), 42-43 and Boston Naming Test, 184, 186, 189, 191, 194, 196 and Benton Visual Retention Test, 410 and California Verbal Learning Test, 366 and Finger Tapping Test, 430, 436 and Hooper Visual Organization Test, 276 and Judgment of Line Orientation Test, 296 modified MMSE (3MS), 220 and Rey Auditory-Verbal Learning Test, 378, 384 and Rey-Osterrieth Complex Figure, 261, 266 and Stroop Test, 127, 131 and Trailmaking Test, 67, 96 use in assessment of mental status, 12 and Verbal Fluency, 214, 215, 229, 231, 232 and Wechsler Memory Scale, 349 and Wisconsin Card Sorting Test, 528, 530 Minnesota Multiphasic Personality Inventory (MMPI, MMPI-2), 36, 125, 320, 476, 487 MacAndrew Alcoholism Scale, 502 Modified t-test method for small samples, 25 Monongahela Valley Independent Elders Survey (MoVIES), 92, 230 Mood, as a component of mental status, 12. See also Affective state Motivation, 3, 12, 16
in test-taking, 5, 16, 27-28, 38. See also Attitude; Diagnosis, differential, of malingering; Effort in test-taking Motor ability, 444 activity, 12 control, 420, 460 functioning, assessment of, 5, 20, 162 systems, 12 Multicenter AIDS Cohort Study (MACS), 83, 95, 122, 212, 234, 268, 382, 385, 389, 466, 467 Multilingual Aphasia Examination Battery, 200, 206 Multiple forms of tests, 99. See also Alternate forms; Equivalent forms; Versions of the test Multitrait-multimethod matrix, 17 Multicultural neuropsychological assessment, 28-29. See also Culture, effect on test performance; Culturally fair test Naming ability, 176 confrontation, 173, 174, 182, 198 effect on Hooper VOT performance, 273, 277 National Multiple Sclerosis Society, 142, 153, 158-159 Nervous system, function/dysfunction of amygdala, 395 anterior brain areas, 161 right, 249 anterior vs. posterior brain areas, 395 basal ganglia, 204, 395 central nervous system, 3, 4 cerebellum, 205 corpus callosum, 460 diffuse brain dysfunction, 204, 218, 395, 475 frontal-extrastriate system, 499 frontal-limbic-reticular activating system, 135 frontal lobes, 10, 61, 62, 63, 93, 106, 108, 134, 177, 299, 310, 312, 359, 364, 389, 498, 500, 502 basal-orbital region, 61, 62 cingulate gyrus, 108, 141, 204, 475 anterior, 499 dorsolateral convexities, 61, 62, 475, 498 inferior cortex, 204 left, 108, 204, 249 left cortex, 204 medial, 61, 62 motor strip, 420, 444 orbitofrontal cortex, 364 orbitomedial cortex, 475 prefrontal cortex, 202, 204, 475, 498-499 left, 218 left dorsolateral, 204, 364, 498 left inferior, 364 right, 204, 365, 499 frontostriatal system, 204, 250 frontotemporal region, 299, 365 hippocampal region, 177, 205, 365 left, 249 right, 395 interhemispheric differences, 161, 421
dominant, 273 left, 249, 273, 285, 286, 299 nondominant, 177, 249, 273 right, 249, 273, 279, 284-285, 286, 299, 312, 395 limbic system, 12 multisystem mechanisms, 249 occipital region, left superior, 499 parietal lobes, 420 inferior, 499 supramarginal gyrus, 499 nondominant, 273 right, 249 superior, 177 peripheral nervous system, 4 posterior brain areas, 312 right anterior quadrant, 273 subcortical sensory and motor tracts, 420, ~ temporal cortex, 204 temporal lobes anterior, 177, 204 dominant, 177 left, 134, 204-205 medial, left, 205, 365 medial, right, 365 posterior inferior, 499 right, 134, 204-205, 299 temporoparietal region left, 204 right, 249, 285 thalamus, 395 white matter, 140 Neural network model, 65, 203 Neurobehavioral Core Test Battery (NCTB), 39ft Neurobehavioral Cognitive Status Examination, t17 Neuroimaging techniques, 12, 13, 19 cognitive correlates, 17, 60, 203 EEG/evoked potentials, 499 functional imaging, 108, 141, 203-204, 285, 299, 499 fMRI, 365, 498 PET, 364, 395, 498 rCBF, 498-499 SPECT, 299, 498
structural imaging MRI/white-matter hyperintensities, 68, 108, 134, 142, 180, 196, 207, 249, 395 transcranial Doppler ultrasonography, 499 Neurological examination, 12 NEUROPSI, 29 Neuropsychiatry, as a clinical discipline, 12 Neuropsychological evaluation, goals and methods, 3-4, 13, 16 Neuropsychological Screening Battery for Hispanics (NeSBHIS), 104, 107, 189, 219, 265, ~ Neuropsychology applications, 13-15 clinical, 3, 10 Neurosensory Center Examination for Aphasia, 200, 206 Nociferous cortex hypothesis, 204
Normative Aging Study, 231 Normative data availability of, 7 cultural factors, 28 multiple sets of, 9 in research, 7 selection of, 7-9, 18-22, 55 in test results interpretation, 3, 18-27 Normative Studies Research Project (NSRP), 191 Norming approaches comparative vs. diagnostic norms, 22 co-norming, 24, 342, 351, 354, 356 continuous norming, 22 regression-based norms, 22-23 single-case approach, 24 Occupation, effect on test performance, 399
Odds ratio, 23, 45 Order of test administration, 421, 506 Organization semantic, 363 in verbal learning, 360 Organizational abilities, 496 spatial, 249-250, 273-274 deficit, 298 strategy, 241 perceptual clustering, 250 Orientation as assessed with Wechsler Memory Scale, 337, 338 as a component of mental status, 12 Overlapping age strata technique, 368 Overlapping Interval Technique, 24. See also Midpoint age interval technique Paced Auditory Serial Addition Test (PASAT), 141-159 adjusting-PSAT, 143 computerized versions, 143 Levin version, 142-143 PASAT-200, PASAT-100, PASAT-50, 143 visual version (PVSAT), 157 Paivio, imagery values, 361 Parallel distributed processing models, 203 Patient-centered model, 18 Pattern analysis, 21 Percentile rank, interpretation of, 20-21, 34-35 Pearson's methods for research synthesis, 45, 46 Perception as a component of mental status, 12 tactile, 312
Perceptual ~entation, 177 organization, 273 Performance levels. See Cutoff criterion level vs. pattern, 365-366 Perseverations, 136, 138, 218, 298. See also Errors, perseverative Personality characteristics, 4, 16, 422 Phonological encoding, 177
loop, 203
Physical Performance Test, modified, 91, 227-228 Pictorial Verbal Learning Test (PVLT), 372 Pilots age-related cognitive changes, 14 normative data for, 423 as subjects, 103, 121
PIN test, 20 Pittsburgh Occupational Exposures Test Battery (POET), 464 Positive and Negative Syndrome Scale (PANSS), 396 Postconcussion syndrome, 14-15, 141, 142, 300 Posttest probability, 44 Practice effect, 5, 38. See also Test-retest stability; Dual-baseline approach to repeated testing Auditory Consonant Trigrams, 136 California Verbal Learning Test, 366-367 Design Fluency Tests, 300, 305, 306, 311 Digit Vigilance Test, 163, 168 Finger Tapping Test, 443 Grip Strength Test, 456, 458 Grooved Pegboard Test, 469 Hopkins Verbal Learning Test, 369 Judgment of Line Orientation Test, 286 Paced Auditory Serial Addition Test, 159 Rey Auditory-Verbal Learning Test, 361-362, 383, 384
Rey-Osterrieth Complex Figure, 242 Ruff 2&7 Selective Attention Test, 162 Stroop, 124
Tactual Performance Test Trailmaking Test, 63, 64, 80, 89, 90 Verbal Fluency, 202, 223, 226 Visual Form Discrimination Test, 279 Wisconsin Card Sorting Test, 505, 508, 510, 518, 522, 531
Pretest probability, 44 Predictive value, 23 in decision theory, 44 Premorbid functional level, 31 Prevalence rates, 22, 42 Primacy-recency effect. See Serial position effect Problem solving, 475, 476 Process of thought, 12 Professional communication between clinicians, 9 Prognostic predictions, 33 Project Safety, 86, 351 Propriospatial ability, 312 Psychiatric evaluation, 12 Psychological assessment vs. testing, 16 Psychopharmacology, 12 Psychometric approach to neuropsychological assessment, 13 Publication bias, in meta-analysis, 46 Quality of life, 14, 142 Qualitative test performance interpretation, 4, 15, 249, 270, 273 Race, considerations in interpretation of test performance, 31. See also Demographic factors, effect on test performance, ethnicity
Rapid cycling, 12. See also Affective state Rasch model, 27 Raven Progressive Matrices, 295, 498 advanced, 39, 104 Raw scores, 33 inclusion in reports, 32 interpretation-age/grade equivalent, 34 number of correct responses, 33 number of errors, 33 percent of correct responses, 34 quality of drawings, 33 time to completion, 33 Reactivity effect, 40 Reasoning, 475 Receiver Operating Characteristic (ROC) curve in "diagnostic norms" approach, 22 in differential diagnosis of dementia, 87, 92, 499-500, 530 in meta-analysis, 46 in signal detection theory, ~24 Reflexes, 12 Regression approach as used in meta-analysis, 27 Regression-based norming techniques, 22-23, 55 Regression analyses on demographic variables and performance indices for Auditory Consonant Trigrams, 139 for Benton Visual Retention Test, 408, 412 for Boston Naming Test, 192 for Category Test, 477-478 for Finger Tapping Test, 434 for Grooved Pegboard Test, 460, 464 for List-Learning Tests, 373, 374, 378 for Rey-Osterrieth Complex Figure, 252, 257, 263 for Tactual Performance Test, 326-327 for Trailmaking Test, 69, 78 for Wisconsin Card Sorting Test, 502-503 Regression to the mean, 23 Rehabilitation/remediation, 13 cognitive, 14, 33, 161, 360 setting, 16 strategies in, 15, 41 Reitan-Klove Lateral Dominance Exam, 435, 452, 465
Relative risk analysis, 23 Reliability, 39-41 of meta-analysis, 47 of scoring, 298 Reliability, alternate form/interform, 39-40 Benton Visual Retention Test, 398 Boston Naming Test, 178 Category Test, 477-480 Color Trails, 107 Hopkins Verbal Learning Test, 368 Rey Auditory-Verbal Learning Test, 362, 377 Wisconsin Card Sorting Test, 504-505 Reliability, internal consistency, 40, 135, 143, 178, 202, 218, 248 and California Verbal Learning Test, pictorial format, 368 and Category Test, 478 and Hooper Visual Organization Test, 274
Reliability (continued) method, 40 and Tactual Performance Test, 314 and Visual Form Discrimination Test, 279, 280, 286 Reliability, interrater, 15 Benton Visual Retention Test, 398 Design Fluency Tests, 300-301, 309, 310 Hooper Visual Organization Test, 274 Rey-Osterrieth Complex Figure, 248 Tactual Performance Test, 314 Verbal Fluency, 202 Wechsler Memory Scale, 340 Wisconsin Card Sorting Test, 505 Reliability, split-half, 40 Judgment of Line Orientation Test, 286 Paced Auditory Serial Addition Test, 143 Reliability, test-retest, 39-40. See also Practice effect Benton Visual Retention Test, 398 Boston Naming Test, 179-184 CERAD List-Learning Test, 370 Design Fluency Tests, 300, 309, 310 Digit Vigilance Test, 163, 168 Finger Tapping Test, 421, 438, 439 Grip Strength Test, 445, 456, 458 Grooved Pegboard Test, 460, 465, 468, 469 Hopkins Verbal Learning Test, 369 Judgment of Line Orientation Test, 286 Paced Auditory Serial Addition Test, 143 Rey Auditory-Verbal Learning Test, 388 Rey-Osterrieth Complex Figure, 248 Ruff 2 & 7 Selective Attention Test,
Rey Auditory-Verbal Learning Test (Rey AVLT), 7-9, 11, 35, 357~. 372-389, 391-393, 341
AVLTX, 361 Rey Auditory and Verbal Learning Test: A handbook, by Schmidt, 359, 392 Rey-Osterrieth Complex Figure, 20, 39, 241-271 Fastenau's Extended Complex Figure Test (ECFT), 247-248
Handbook of ROCF usage, by Knight and Kaplan, 241, 242, 251
Hubley et al.'s simplified versions, 245 Mack Complex Figure Test, 243 Myers & Myers' Rey Complex Figure Test and Recognition Trial, 246 Risk factors affecting test performance, 3 Rohling's Interpretive Method (RIM), 25-26 Ruff Figural Fluency Test, 162, 298-306 Ruff-Light Trail Learning Test, 162 Ruff 2 & 7 Selective Attention Test, 160-162, 163-166, 170
Sample biased, 23, 38
normative, 7-9, 19 random, 42
161-162
419, 484
procedures for Tactual Performance Test, 325 Rennick-Lafayette Repeatable Battery, 168 Repeatable Battery for the Assessment of Neuropsychological Status (RBANS),
Repeatable Cognitive-Perceptual-Motor Battery, 154, 162, 163
Reporting of test results, recommendations demographically-corrected test results, 30 inclusion of test scores, 31 source of normative data, 9 Research database, 18 in test development, 11 Residual-vs.-fitted (rvf) plot, 53 Response style/bias, 15, 38
19, 23, 24, 46-48
263, 405
Trail-Making Test, 90 Verbal Fluency, 201, 206, 218, 223 Visual Form Discrimination Test, 279 Wisconsin Card Sorting Test, 505 Reliable change indices, 226 Rennick Average Impairment Rating, 419 index, 435 method for HRB administration and scoring,
201, 371
San Diego African American Norms Project, 29 San Diego Neuropsychological Test Battery, 160 Satz-Mogel short form of the WAIS, 209, 212, 225,
Scale of Competence in Independent Living Skills (SCILS), 476 Scales of measurement, 34 Schedule for Affective Disorders and Schizophrenia Interview, 233 Scoring method. See also Midpoint age interval technique; Test scoring; Variability in test administration and scoring Boston Naming Test, 177 Color Trails Test, 101 Paced Auditory Serial Addition Test, dyad method, 142 Rey-Osterrieth Complex Figure, 243-248, 271 Boston Qualitative Scoring System (BQSS), 247 Denman Itemized Scoring System, 244-245 Developmental Scoring System (DSS-ROCF), 245 Myers & Myers scoring system, 246 other systems, 243-248 Stroop Test, 110, 111 process-oriented scoring system, 111 Verbal Fluency qualitative scoring system (clustering/switching), 202-203
Wisconsin Card Sorting Test, 497-498 Screening tests, cognitive, 12-13 Secondary gain, considerations in test results interpretation, 5 Selection ratio, 43
Selective Reminding Test (SRT), 370-371 Free and Cued SRT (FCSRT), 371 Verbal SRT (VSRT), 371 Semantic clustering, 29, 363-364, 369 clustering index, 364 cueing, 364, 366 organization, 363 storage, 202 Semantic deficit hypothesis, 176 Sensory systems, 12 Seguin-Goddard Formboard, 312 Serial position effect (primacy-recency), 360, 361, 366, 371, 381, 384 Set shifting, 108 Set Test, 201 Sexual orientation, effect on test performance, 286 Shapiro-Wilk W test for normality of residuals, 52 Short Blessed Test (Orientation-Memory-Concentration Test), 228 Short forms of tests Benton Visual Retention Test, 398 Boston Naming Test, 177-178 California Verbal Learning Test, 363 Category Test, 477-479 Judgment of Line Orientation Test, 286, 288, 296 Tactual Performance Test, 313-314, 333 Visual Form Discrimination Test, 279-280 Wechsler Memory Scale-IIIA, 342-344 Signal detection theory (SDT), 23 Single-case approach, 24-25 Socioeconomic status, effect on test performance, 28, 181, 481 Sources of information, used in neuropsychological evaluation, 4, 16 Spanish language test administration, 28-29 Spanish-speaking samples, 28-29 Speed of color naming and color-word reading, 109 of information processing, 106, 108, 133, 140, 141, 143, 159, 364 motor, 60, 301, 420, 421 psychomotor, 98, 162, 163, 170, 460 of test performance, 161 Speeded mental processing, 202 Speed vs. accuracy in test performance, 161 Standard error of measurement, 40-41, 50, 52 in Item Response Theory, 26-27 Standard scores, 20, 35-37 Standardization, 6 of raw scores, 35-36 of test administration, 9, 313 Standards for Educational and Psychological Testing, American Psychological Association, 39 Star Cancellation Test, 160 Stata, statistical package, 48 Strategy
cognitive, 496
of design production, 298 of ROCF reproduction, 241, 245, 247, 248, 249 Stroop effect, 108
language-neutral Stroop, 110 test, 29, 108-133, 134, 140, 202, 508 Switching (category vs. semantic), 201 Symptoms, role in data integration, 4, 16 Systematic review, 45, 47 Tactual Performance Test (TPT), 312-334, 491 Tangentiality, 12 Tau estimate of residual heterogeneity, 54 Taylor Complex Figure, 242 modified, 243 Taylor-Russell Tables, 43 Test construction, 15 Testing conditions, optimal vs. standard, 16 Test items, 28, 38, 39, 40. See also Item analysis; Item difficulty; Item discrimination; Item Response Theory Test of Everyday Attention, 152, 166 Test scoring, 33, 38. See also Scoring method criteria, 15 partial credit, 27 procedures, 19-20 standard, 9-10, 20 Test selection, 11, 17-18, 39-41 Test usage survey, 18, 59-60, 314, 333, 442 Third-party observers, 6. See also Validity of neuropsychological test performance Thorndike-Lorge tables, 361 Thought disturbance, 12 Thought process/content, 12-13 Trailmaking Test (TMT), 54, 59-98, 99, 101, 106, 140, 143, 163, 491 alphanumeric sequencing, 67 comprehensive, 66
expanded, 61Hi7 mental alternation test, 67
midrange expanded, 67 oral, 67 symbol, 67 Translated versions of neuropsychological tests, 28 Treatment effectiveness of, 45 planning, 3, 33, 42, 44 response to, 14 strategies, 28, 31 Validation studies, 7 Validity, 178, 278, 314, 334, 343 of clinical judgments, 17 concurrent, 42, 67, 279, 368, 505 construct, 41, 143, 273-274, 365, 370, 475 content, 41 convergent, 41, 163 criterion-related, 41-43, 67, 365, 367 discriminant, 41, 369 ecological, 506 of hypothesis testing, 52
Validity (continued) incremental, 43 of meta-analysis, 47 predictive, 42, 506 of symptoms, 279. See also Effort in test-taking; Motivation of test performance, 62. See also Effort in test-taking; Motivation of tests, 17, 41-45
Variability across studies in meta-analysis, 48-50, 55 with Auditory Consonant Trigrams, 135 with Boston Naming Test, 180, 182-183 with Category Test, 476 in the criterion, 42 with Finger Tapping Test, 419, 422-423 with Grip Strength Test, 444, 446 with Grooved Pegboard Test, 459-460 intraindividual, 22 with Judgment of Line Orientation Test, 288 with Paced Auditory Serial Addition Test, 145-146 of physical parameters, 46 in recording strategy of reproduction, 241, 25f with Rey Auditory-Verbal Learning Test, 357-359 with Rey-Osterrieth Complex Figure, 241-24i, 248, 254, 270 with Stroop Test, 110, 115 with Tactual Performance Test, 313, 334 in test administration and scoring, 9, 15, 38 of test scores, 21-22, 26, 29, 35-36, 3thl9, 40 of traits and abilities, 38 with Verbal Fluency, 209, 237 with Wechsler Memory Scale, 338 with Wisconsin Card Sorting Test, 496 Verbal fluency, 180, 200-237, 508 category naming, 201 Controlled Oral Word Association (COWA), 2110 Controlled Verbal Fluency Task (CVFT), 200 FAS version, 49-54, 108, 134, 200, 299 Fuld Verbal Fluency Test, 206 relative to nonverbal/design fluency, 298-299, 301 Thurstone Word Fluency Test, 200 Versions of the test. See also Alternate forms; Culture- and ethnicity-specific test versions and normative data sets; Equivalent forms; Multiple forms of tests; Short forms of tests Benton Visual Retention Test, 394 Boston Naming Test, 173 California Verbal Learning Test, 367-368 Cancellation tests, 160 Design Fluency, 298-299 Hooper Visual Organization Test, 273 Paced Auditory Serial Addition Test, 142-143, 145-146 Rey Auditory-Verbal Learning Test, recognition list, 358 Rey-Osterrieth Complex Figure, 241-243 Stroop, 110-112, 130, 133 Trailmaking Test, 61, 66-67, 88
Verbal Fluency, 200-201, 205-206 Visual Form Discrimination Test, 279 Wechsler Memory Scale, 337-344 Vietnam Experience Study (VES), 149, 222 Vigilance, 160, 162, 170 Visual. See also Visuospatial memory, 241, 249 organization, 277 perception, 273, 277, 278, 284, 394, 397 processing, 397 scanning, 60, 396 synthesis, 273 tracking, 60, 162, 170 Visual Form Discrimination Test (VFDT), 278-283, 394 Visual-motor abilities, 420 coordination, 460 Visual Search and Attention Test, 143, 160 Visuospatial. See also Visual abilities/skills, 60, 284, 475 constructional ability, 3, 5, 241, 249, 274, 312, 394 deficit, 273 functioning, 3, 296. See also Organizational abilities intelligence, 274 Visuospatial sketchpad, 203 Vocabulary fund, effect on test performance, 181, 193, 198, 199 storage, 202 WAIS-R norming, 22 relation to neuropsychological test performance, 144 WAIS-R-NI, 15, 341 WAIS-III, 107 co-norming with Wechsler Memory Scale, 342, 351, 354, 356 IQ, 20, 41 Mexican version, 128, 529 Warrington Recognition Memory for Faces Test, 5, 340 Washington Heights-Inwood-Columbia Aging Project, 29, 401, 412, 410, 413, 414 Wechsler-Bellevue Scale, 484 Wechsler Memory Scale (WMS), 337-356 WMS, 337-338 WMS-Revised, 338-339, 355 WMS-III, 339-342, 344, 356 WMS-IIIA, 342-344, 356 White's general test for heteroscedasticity, 52 WHO-UCLA Auditory Verbal Learning Test, 369-370, 390-391 Wide Range Achievement Test-Revised, 208 Wide Range Achievement Test-III, 414 Wisconsin Card Sorting Test, 29, 496-532, 108, 134, 162, 398, 476, 496-532 MCST, 503, 530-531 64-card version, 503-504, 527-529
Woodcock-Johnson-III, Tests of Achievement, 128, 233, 529
World Health Organization (WHO), 67, 99, 101, 369-370, 396
Word
frequency, 361 production set, 202
Z-scores, 20, 25, 35-37 Zung Depression Scale, 190, 405